WO2017070611A1 - Cysteine reactive probes and uses thereof - Google Patents

Cysteine reactive probes and uses thereof

Info

Publication number
WO2017070611A1
Authority
WO
WIPO (PCT)
Prior art keywords
cysteine
protein
containing protein
small molecule
moiety
Prior art date
Application number
PCT/US2016/058308
Other languages
French (fr)
Inventor
Benjamin F. Cravatt
Keriann M. Backus
Bruno E. Correia
Megan M. Blewett
John R. Teijaro
Original Assignee
The Scripps Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Scripps Research Institute
Priority to JP2018516113A (JP6953400B2)
Priority to EP16858391.2A (EP3365686A4)
Priority to CA3001847A (CA3001847A1)
Publication of WO2017070611A1


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6845 Methods of identifying protein-protein interactions in protein mixtures
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6842 Proteomic analysis of subsets of protein mixtures with reduced complexity, e.g. membrane proteins, phosphoproteins, organelle proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0006 Oxidoreductases (1.) acting on CH-OH groups as donors (1.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1003 Transferases (2.) transferring one-carbon groups (2.1)
    • C12N9/1007 Methyltransferases (general) (2.1.1.)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/64 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue
    • C12N9/6421 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from animal tissue from mammals
    • C12N9/6472 Cysteine endopeptidases (3.4.22)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y101/00 Oxidoreductases acting on the CH-OH group of donors (1.1)
    • C12Y101/01 Oxidoreductases acting on the CH-OH group of donors (1.1) with NAD+ or NADP+ as acceptor (1.1.1)
    • C12Y101/01042 Isocitrate dehydrogenase (NADP+) (1.1.1.42)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y201/00 Transferases transferring one-carbon groups (2.1)
    • C12Y201/01 Methyltransferases (2.1.1)
    • C12Y201/01023 Protein-arginine N-methyltransferase (2.1.1.23)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/22 Cysteine endopeptidases (3.4.22)
    • C12Y304/22061 Caspase-8 (3.4.22.61)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/22 Cysteine endopeptidases (3.4.22)
    • C12Y304/22063 Caspase-10 (3.4.22.63)
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5005 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5005 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
    • G01N33/5008 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing or evaluating the effect of chemical or biological compounds, e.g. drugs, cosmetics

Definitions

  • Protein function assignment has benefited from genetic methods, such as target gene disruption, RNA interference, and genome editing technologies, which selectively disrupt the expression of proteins in native biological systems.
  • Chemical probes offer a complementary way to perturb proteins, with the advantage of producing graded (dose-dependent) gain-of-function (agonism) or loss-of-function (antagonism) effects that are introduced acutely and reversibly in cells and organisms.
  • Small molecules present an alternative method to selectively modulate proteins and to serve as leads for the development of novel therapeutics.
  • a method of identifying a cysteine containing protein as a binding target for a small molecule fragment comprising: (a) obtaining a set of cysteine-reactive probe-protein complexes from a sample treated with a cysteine-reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; (c) based on step b), identifying a cysteine containing protein as the binding target for the small molecule fragment.
  • the method further comprises assigning a value to each cysteine containing protein from the set of cysteine-reactive probe-protein complexes for identifying a cysteine containing protein as the binding target for the small molecule fragment, wherein the value is determined based on the proteomic analysis means of step b).
  • the sample comprises a first cell solution and a second cell solution.
  • the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
  • the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
  • the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine-reactive probe-protein complexes.
  • cells from the second cell solution are grown in a media (e.g., an isotopically enriched media).
  • cells from the first cell solution are grown in a media (e.g., an isotopically enriched media).
  • cells from both the first cell solution and the second cell solution are grown in two different isotopically enriched media so that cells from the first cell solution are distinguishable from cells obtained from the second cell solution.
  • the method further comprises contacting the first cell solution with a first set of small molecule fragments and a complementing set of cysteine-reactive probes wherein each small molecule fragment competes with its complementing cysteine-reactive probe for binding with a cysteine residue, and wherein each small molecule fragment and each complementing cysteine-reactive probe are different within each respective set.
  • the method further comprises contacting the second cell solution with a second set of cysteine-reactive probes wherein the second set of cysteine-reactive probes is the same as the complementing set of cysteine-reactive probes, and wherein each cysteine-reactive probe is different within the set.
  • the first set of cysteine-reactive probes generates a third group of cysteine-reactive probe-protein complexes and the second set of cysteine-reactive probes generates a fourth group of cysteine-reactive probe-protein complexes.
  • the cysteine containing protein comprises a biologically active cysteine residue.
  • the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some embodiments, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some embodiments, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. In some embodiments, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine.
  • the biologically active cysteine site is a non-active site cysteine.
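The distance criterion above can be sketched in a few lines. This is only an illustration, assuming that atomic coordinates (in angstroms) for the cysteine sulfur and for the active-site ligand atoms are already available from a structure; the function names and the exact 10 Å cutoff constant are hypothetical and are not taken from the patent.

```python
import math

# Assumed cutoff: the text distinguishes "about 10 angstroms or less" from
# "greater than 10 angstroms"; the exact constant here is illustrative.
ACTIVE_SITE_CUTOFF_ANGSTROM = 10.0


def nearest_distance(cys_sulfur_xyz, ligand_atoms_xyz):
    """Shortest Euclidean distance from the cysteine sulfur to any ligand atom."""
    return min(math.dist(cys_sulfur_xyz, atom) for atom in ligand_atoms_xyz)


def classify_cysteine(cys_sulfur_xyz, ligand_atoms_xyz,
                      cutoff=ACTIVE_SITE_CUTOFF_ANGSTROM):
    """Label a cysteine as active-site or non-active-site by distance alone."""
    nearest = nearest_distance(cys_sulfur_xyz, ligand_atoms_xyz)
    label = "active-site" if nearest <= cutoff else "non-active-site"
    return label, nearest


# Made-up coordinates, used only to exercise the function.
print(classify_cysteine((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]))
print(classify_cysteine((0.0, 0.0, 0.0), [(0.0, 0.0, 12.0)]))
```

A real analysis would read coordinates from a structure file and might measure from a different atom; the sketch only shows the thresholding step.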
  • the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein.
  • the cysteine containing protein exists in an active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein.
  • the cysteine containing protein exists in a pro-active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein.
  • the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue.
  • the structural environment is a hydrophobic environment or a hydrophilic environment.
  • the structural environment is a charged environment.
  • the structural environment is a nucleophilic environment.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein.
  • the enzyme comprises kinases, proteases, or deubiquitinating enzymes.
  • the protease is a cysteine protease.
  • the cysteine protease comprises caspases.
  • the signaling protein comprises vascular endothelial growth factor. In some embodiments, the signaling protein comprises a redox signaling protein. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9.
  • the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • the small molecule fragment is a small molecule fragment of Formula (I): Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
  • the small molecule fragment is a specific inhibitor or a pan inhibitor.
  • the cysteine-reactive probe is a cysteine-
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue
  • AHM is an affinity handle moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • the affinity handle is a bioorthogonal affinity handle.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle comprises an alkyne or an azide group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
  • the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
  • the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzo
  • the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • the second cell solution further comprises a control.
  • the control is dimethyl sulfoxide (DMSO).
  • the proteomic analysis means comprises a mass spectroscopy method.
  • the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method.
  • the method further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification.
  • the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification.
  • the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot.
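To make the idea of combining mass spectrometry results with a protein sequence database concrete, here is a deliberately simplified sketch of peptide mass matching. It is not how ProLuCID, SEQUEST, or Mascot work internally (those score fragmentation spectra); it only assumes an in silico tryptic digest and a parts-per-million tolerance on intact peptide masses. All function names, the toy sequences, and the 10 ppm default are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER


def identify(observed_masses, database, ppm=10.0):
    """Map each protein to the digest peptides matching an observed mass."""
    hits = {}
    for protein, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            theoretical = peptide_mass(peptide)
            tolerance = theoretical * ppm / 1e6
            if any(abs(m - theoretical) <= tolerance for m in observed_masses):
                hits.setdefault(protein, []).append(peptide)
    return hits


# Toy database and a single "observed" mass, both invented for illustration.
toy_database = {"prot1": "AKCRPGK", "prot2": "GGGK"}
print(identify([peptide_mass("AK")], toy_database))
```

Probe-labeled cysteines would carry an additional mass shift that a real search has to account for; that is left out here to keep the sketch short.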
  • the mass spectroscopy method is a MALDI-TOF based method.
  • the value assigned to each cysteine containing protein is obtained from the mass spectroscopy analysis.
  • the value assigned to each cysteine containing protein is the area-under-the-curve from a plot of signal intensity as a function of mass-to-charge ratio.
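As a rough illustration of the area-under-the-curve value, the sketch below integrates signal intensity over mass-to-charge ratio with the trapezoidal rule. The choice of the trapezoidal rule, the function name, and the sample points are assumptions made for this example; the text does not specify an integration method.

```python
def area_under_curve(mz, intensity):
    """Trapezoidal area under an intensity-versus-m/z trace.

    mz must be sorted in increasing order and have the same length as intensity.
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    area = 0.0
    for i in range(1, len(mz)):
        width = mz[i] - mz[i - 1]
        area += 0.5 * (intensity[i] + intensity[i - 1]) * width
    return area


# Invented triangular peak: base 1.0 m/z wide, height 10, so the area is 5.0.
print(area_under_curve([500.0, 500.5, 501.0], [0.0, 10.0, 0.0]))
```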
  • the identifying in step c) further comprises (i) locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complexes and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complexes; and (ii) calculating a ratio between the two values assigned to the same cysteine containing protein.
  • the ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
  • the ratio of greater than 3 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
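The ratio step can be sketched as follows. The text does not state which value is the numerator, so this example assumes the ratio is the second-group (control) value divided by the first-group (fragment-treated) value, so that a larger ratio means the fragment blocked more probe labeling. The function names and the numbers are hypothetical.

```python
def competition_ratio(fragment_treated_value, control_value):
    """Assumed direction: control value divided by fragment-treated value."""
    if fragment_treated_value < 0 or control_value < 0:
        raise ValueError("values must be non-negative")
    if fragment_treated_value == 0:
        return float("inf")  # probe signal fully absent in the treated sample
    return control_value / fragment_treated_value


def candidate_targets(values_by_protein, threshold=2.0):
    """Return proteins whose ratio exceeds the threshold, sorted by name."""
    candidates = []
    for protein, (treated, control) in values_by_protein.items():
        if competition_ratio(treated, control) > threshold:
            candidates.append(protein)
    return sorted(candidates)


# Invented (fragment-treated, control) values for three placeholder proteins.
example_values = {
    "protein_A": (10.0, 50.0),  # ratio 5.0
    "protein_B": (20.0, 50.0),  # ratio 2.5
    "protein_C": (40.0, 44.0),  # ratio 1.1
}
print(candidate_targets(example_values, threshold=2.0))
print(candidate_targets(example_values, threshold=3.0))
```

Raising the threshold from 2 to 3 trades sensitivity for stringency, which is why both cutoffs appear as separate embodiments.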
  • the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein.
  • the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at
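The percentage of inhibition is not given as a formula in this excerpt. One common reading, assumed here, is the fractional loss of probe signal in the fragment-treated sample relative to the control, expressed as a percentage. The function names, the default 50% cutoff, and the sample values are illustrative only.

```python
def percent_inhibition(fragment_treated_value, control_value):
    """Assumed definition: 100 * (1 - treated / control)."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (1.0 - fragment_treated_value / control_value)


def exceeds_cutoff(fragment_treated_value, control_value, cutoff_percent=50.0):
    """True when the assumed percent inhibition is greater than the cutoff."""
    return percent_inhibition(fragment_treated_value, control_value) > cutoff_percent


# Invented values: probe signal drops from 100 to 25 after fragment treatment.
print(percent_inhibition(25.0, 100.0))
print(exceeds_cutoff(25.0, 100.0, cutoff_percent=90.0))
```

Under this definition a control-to-treated ratio of 2 corresponds to 50% inhibition, so the ratio and percentage criteria are two views of the same measurement.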
  • the cell is obtained from a tumor cell line. In some embodiments, the cell is obtained from an MDA-MB-231, Ramos, or Jurkat cell line. In some embodiments, the cell is obtained from a tumor sample. In some embodiments, the sample is a tissue sample. In some embodiments, the method is an in situ method. In some embodiments, the cysteine-reactive probe is not 4-hydroxynonenal or 15-deoxy-Δ12,14-prostaglandin J2.
  • a method of screening a small molecule fragment for interaction with a cysteine containing protein comprising: (a) harvesting a set of cysteine-reactive probe-protein complexes from a sample treated with a cysteine-reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), identifying the small molecule fragment as interacting with the cysteine containing protein.
  • the method further comprises assigning a value to each cysteine containing protein from the set of cysteine-reactive probe-protein complexes prior to identifying the small molecule fragment as interacting with the cysteine containing protein, wherein the value is determined based on the proteomic analysis means of step b).
  • the sample comprises a first cell solution and a second cell solution.
  • the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
  • the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
  • the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine-reactive probe-protein complexes.
  • cells from the second cell solution are grown in a media (e.g., an isotopically enriched media).
  • cells from the first cell solution are grown in a media (e.g., an isotopically enriched media).
  • cells from both the first cell solution and the second cell solution are grown in two different isotopically enriched media so that cells from the first cell solution are distinguishable from cells obtained from the second cell solution.
  • cells from only one of the cell solutions e.g., either the first cell solution or the second cell solution
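  • the method further comprises contacting the first cell solution with a first set of small molecule fragments and a complementing set of cysteine-reactive probes wherein each small molecule fragment competes with its complementing cysteine-reactive probe for binding with a cysteine residue, and wherein each small molecule fragment and each complementing cysteine-reactive probe are different within each respective set.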
  • the method further comprises contacting the second cell solution with a second set of cysteine-reactive probes wherein the second set of cysteine-reactive probes is the same as the complementing set of cysteine-reactive probes, and wherein each cysteine-reactive probe is different within the set.
  • the first set of cysteine-reactive probes generates a third group of cysteine-reactive probe-protein complexes and the second set of cysteine-reactive probes generates a fourth group of cysteine-reactive probe-protein complexes.
  • the cysteine containing protein comprises a biologically active cysteine residue.
  • the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue.
  • the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine.
  • the biologically active cysteine site is an active site cysteine.
  • the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue.
  • the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine.
  • the biologically active cysteine site is a non-active site cysteine.
  • the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein.
  • the cysteine containing protein exists in an active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein.
  • the cysteine containing protein exists in a pro-active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein.
  • the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue.
  • the structural environment is a hydrophobic environment or a hydrophilic environment.
  • the structural environment is a charged environment.
  • the structural environment is a nucleophilic environment.
  • the cysteine containing protein is selected from an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein is selected from an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein.
  • the enzyme comprises kinases, proteases, or deubiquitinating enzymes.
  • the protease is a cysteine protease.
  • the cysteine protease comprises caspases.
  • the signaling protein comprises vascular endothelial growth factor.
  • the signaling protein comprises a redox signaling protein.
  • the cysteine containing protein is selected from Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2.
  • the cysteine containing protein is a protein illustrated in Table 3.
  • the cysteine containing protein comprises a cysteine residue denoted in Table 3.
  • the cysteine containing protein is a protein illustrated in Table 8.
  • the cysteine containing protein is a protein illustrated in Table 9.
  • the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • the cysteine containing protein is TIGAR, IMPDH2, IDH1, IDH2, BTK, ZAK, TGM2, Map2k7, XPO1, Casp5, Casp8, ERCC3, Park 7 (Toxoplasma DM), GSTO1, ALDH2, CTSZ, STAT1, STAT3, SMAD2, RBPJ, FOXK1, IRF4, IRF8, GTF3C1, or TCERG1.
  • the small molecule fragment is a small molecule fragment of Formula (I): Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
  • the small molecule fragment is a specific inhibitor or a pan inhibitor.
  • the cysteine-reactive probe is a cysteine-
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue
  • AHM is an affinity handle moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • the affinity handle is a bioorthogonal affinity handle.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle comprises an alkyne or an azide group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
  • the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
  • the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol;
  • the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • the second cell solution further comprises a control.
  • the control is dimethyl sulfoxide (DMSO).
  • the proteomic analysis means comprises a mass spectroscopy method.
  • the mass spectroscopy method is a MALDI-TOF based method.
  • the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method.
  • the method further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification.
  • the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification.
  • the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot.
  • the value assigned to each of the cysteine containing proteins is obtained from the mass spectroscopy analysis. In some embodiments, the value assigned to each of the cysteine containing proteins is the area under the curve from a plot of signal intensity as a function of mass-to-charge ratio. In some embodiments, the identifying in step c) further comprises (i) locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complex and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complex; and (ii) calculating a ratio between the two values assigned to the same cysteine containing protein.
  • the ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the ratio of greater than 3 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein.
  • the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or of 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
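
The ratio and percentage-of-inhibition criteria above can be illustrated with a minimal Python sketch. The text does not fix which value is the numerator or how the percentage of inhibition is derived, so the sketch assumes that the ratio is the control (e.g., DMSO) area under the curve divided by the fragment-treated area under the curve, and that the percentage of inhibition is (1 - 1/ratio) x 100. The function names and the sample values are hypothetical.

```python
def competition_ratio(control_auc: float, treated_auc: float) -> float:
    """Ratio of the two values assigned to the same cysteine containing protein.

    Assumption: control value over fragment-treated value, so a higher
    ratio means the fragment blocked more probe labeling.
    """
    if control_auc <= 0:
        raise ValueError("control_auc must be positive")
    if treated_auc <= 0:
        # No probe signal left after fragment treatment: complete competition.
        return float("inf")
    return control_auc / treated_auc


def percent_inhibition(ratio: float) -> float:
    """Assumed conversion: a ratio of 2 corresponds to 50% inhibition."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 - 1.0 / ratio) * 100.0


def is_candidate(ratio: float, threshold: float = 2.0) -> bool:
    """Apply the 'ratio of greater than 2' (or 3) criterion."""
    return ratio > threshold


# Hypothetical area-under-the-curve values for one protein.
ratio = competition_ratio(control_auc=400.0, treated_auc=100.0)
inhibition = percent_inhibition(ratio)
```

Under these assumptions, a protein with control and treated values of 400 and 100 has a ratio of 4 and a percentage of inhibition of 75%, and would be flagged under either the ratio threshold of 2 or of 3.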
  • the cell is obtained from a tumor cell line.
  • the cell is obtained from an MDA-MB-231, Ramos, or Jurkat cell line.
  • the cell is obtained from a tumor sample.
  • the sample is a tissue sample.
  • the method is an in situ method.
  • a method of mapping a biologically active cysteine site on a protein comprising (a) harvesting a set of cysteine-reactive probe-protein complexes from a sample treated with a cysteine-reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), mapping the biologically active cysteine site on the protein.
  • the sample comprises a first cell solution and a second cell solution.
  • the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
  • the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
  • the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some embodiments, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some embodiments, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. In some embodiments, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine.
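
The distance criterion above can be sketched in Python as follows. This is only an illustration: it assumes that a single representative coordinate (in Å) is available for the cysteine residue and for the active-site ligand or residue, and it treats "about 10 Å" as an exact cutoff. The function name and the coordinates are hypothetical.

```python
import math


def classify_cysteine(cys_xyz, site_xyz, cutoff_angstrom=10.0):
    """Label a cysteine by its distance to an active-site ligand or residue.

    Coordinates are (x, y, z) tuples in angstroms. A distance at or below
    the cutoff is labeled an active site cysteine; a greater distance is
    labeled a non-active site cysteine.
    """
    distance = math.dist(cys_xyz, site_xyz)
    if distance <= cutoff_angstrom:
        return "active site cysteine"
    return "non-active site cysteine"


# Hypothetical coordinates: 5 A apart and 12 A apart.
near = classify_cysteine((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
far = classify_cysteine((0.0, 0.0, 0.0), (0.0, 0.0, 12.0))
```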
  • the biologically active cysteine site is a non-active site cysteine.
  • the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein.
  • the cysteine containing protein exists in an active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein.
  • the cysteine containing protein exists in a pro-active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein.
  • the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue.
  • the structural environment is a hydrophobic environment or a hydrophilic environment.
  • the structural environment is a charged environment.
  • the structural environment is a nucleophilic environment.
  • the protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein.
  • the enzyme comprises kinases, proteases, or deubiquitinating enzymes.
  • the protease is a cysteine protease.
  • the cysteine protease comprises caspases.
  • the signaling protein comprises vascular endothelial growth factor.
  • the signaling protein comprises a redox signaling protein.
  • the protein is a protein illustrated in Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2.
  • the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
  • the small molecule fragment is a specific inhibitor or a pan inhibitor.
  • the cysteine-reactive probe is a cysteine-reactive probe of Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • the affinity handle is a bioorthogonal affinity handle.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle comprises an alkyne or an azide group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
  • the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
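  • the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol.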
  • the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • the second cell solution further comprises a control.
  • the control is dimethyl sulfoxide (DMSO).
  • the proteomic analysis means comprises a mass spectroscopy method.
  • the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method.
  • the method further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification.
  • the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification.
  • the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot.
  • the mass spectroscopy method is a MALDI-TOF based method.
  • the cell is obtained from a tumor cell line.
  • the cell is obtained from an MDA-MB-231, Ramos, or Jurkat cell line.
  • the cell is obtained from a tumor sample.
  • the sample is a tissue sample.
  • the method is an in situ method.
  • a composition comprising: a small molecule fragment of Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety; and a cysteine containing protein wherein the cysteine containing protein is covalently bound to the small molecule fragment.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • a composition comprising: a cysteine-reactive probe of Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety; and a cysteine containing protein wherein the cysteine containing protein is covalently bound to the cysteine-reactive probe.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the affinity handle is a bioorthogonal affinity handle.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • compositions comprising: an isolated sample wherein the isolated sample is an isolated cell or a tissue sample; and a cysteine-reactive probe to be assayed for its ability to interact with a cysteine containing protein expressed in the isolated sample.
  • the composition further comprises contacting the isolated sample with a small molecule fragment for an extended period of time prior to incubating the isolated sample with the cysteine-reactive probe to generate a cysteine-reactive probe-protein complex.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • an isolated treated cell comprising a cysteine-reactive probe covalently attached to a cysteine containing protein.
  • the isolated treated cell further comprises a set of cysteine-reactive probes wherein each of the cysteine-reactive probes is covalently attached to a cysteine containing protein.
  • an isolated treated cell comprising a small molecule fragment covalently attached to a cysteine containing protein.
  • the isolated treated cell further comprises a set of small molecule fragments wherein each of the small molecule fragments is covalently attached to a cysteine containing protein.
  • the isolated treated cell further comprises a cysteine-reactive probe.
  • the isolated treated cell further comprises a set of cysteine-reactive probes.
  • an isolated treated population of cells comprising a set of cysteine-reactive probes covalently attached to cysteine containing proteins. Also disclosed herein, in certain embodiments, is an isolated treated population of cells comprising a set of small molecule fragments covalently attached to cysteine containing proteins. In some embodiments, the isolated treated population of cells further comprises a set of cysteine-reactive probes.
  • an isolated and purified polypeptide comprising at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide comprising at least 95% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide comprising 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide consisting 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide is at most 50 amino acids in length. A polypeptide probe for screening a small molecule fragment comprising an isolated and purified polypeptide described herein.
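
The "sequence identity to at least seven contiguous amino acids" criterion can be sketched in Python as shown below. The text does not specify an alignment procedure, so the sketch assumes an ungapped, position-by-position comparison of a candidate peptide against every equal-length window of a reference sequence. The function name and both sequences are hypothetical and are not taken from Tables 1-3 or 8-9.

```python
def best_window_identity(query: str, reference: str, min_length: int = 7) -> float:
    """Highest fractional identity of `query` to any contiguous, equal-length
    stretch of `reference` (ungapped comparison; an assumption of this sketch).
    """
    n = len(query)
    if n < min_length:
        raise ValueError(f"query must be at least {min_length} residues long")
    if n > len(reference):
        raise ValueError("query is longer than the reference sequence")
    best = 0.0
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        matches = sum(a == b for a, b in zip(query, window))
        best = max(best, matches / n)
    return best


# Hypothetical reference sequence and candidate peptides.
reference = "MKTAYIAKQRCGLSE"
exact = best_window_identity("AKQRCGL", reference)        # identical 7-mer
one_off = best_window_identity("AYIAKQRAGL", reference)   # one substitution in 10
```

With these hypothetical inputs, the identical 7-mer scores 100% identity and the 10-mer with one substitution scores 90%, so both would meet an "at least 90%" threshold under the stated assumptions.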
  • nucleic acid encoding a polypeptide comprising at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encoding a polypeptide comprising at least 95% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encoding a polypeptide comprising 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encoding a polypeptide consisting 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment is a small molecule fragment of Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the cysteine containing protein is a protein illustrated in Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2.
  • the cysteine containing protein is a protein illustrated in Table 3.
  • the cysteine containing protein comprises a cysteine residue denoted in Table 3.
  • the cysteine containing protein is a protein illustrated in Table 8.
  • the cysteine containing protein is a protein illustrated in Table 9.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment has a molecular weight of about 150 Dalton or higher.
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
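
The molecular-weight convention above (counting carbon and hydrogen, optionally nitrogen, oxygen and/or sulfur, and excluding halogens and transition metals) can be sketched in Python as follows. The sketch assumes the fragment is supplied as a mapping of element symbols to atom counts and uses rounded standard average atomic masses. The function name is hypothetical, and the example composition (C8H8ClNO) is chosen only for illustration.

```python
# Rounded standard average atomic masses (g/mol) for the counted elements.
COUNTED_ELEMENT_MASSES = {
    "C": 12.011,
    "H": 1.008,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
}


def fragment_molecular_weight(composition: dict) -> float:
    """Sum the masses of C, H, N, O and S atoms only.

    Any other element in `composition` (for example a halogen such as Cl
    or a transition metal) is ignored, per the convention sketched here.
    """
    return sum(
        COUNTED_ELEMENT_MASSES[element] * count
        for element, count in composition.items()
        if element in COUNTED_ELEMENT_MASSES
    )


# Illustrative composition C8H8ClNO; the chlorine atom is not counted.
weight = fragment_molecular_weight({"C": 8, "H": 8, "Cl": 1, "N": 1, "O": 1})
meets_150_dalton = weight >= 150.0
```

For this composition the counted weight is about 134.2 Dalton, which illustrates how excluding the halogen can change whether a fragment meets a threshold such as "about 150 Dalton or higher".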
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the small molecule fragment is a small molecule fragment of Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the small molecule fragment of Formula (I) has a molecular weight of about 150, 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue
  • F is a small molecule fragment moiety; and wherein the contacting time is between about 5 minutes and about 2 hours.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment of Formula (I) has a molecular weight of about 150, 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the cysteine containing protein is a protein illustrated in Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2.
  • the cysteine containing protein is a protein illustrated in Table 3.
  • the cysteine containing protein comprises a cysteine residue denoted in Table 3.
  • the cysteine containing protein is a protein illustrated in Table 8.
  • the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the small molecule fragment binds irreversibly to the cysteine containing protein. In some embodiments, the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a cysteine-reactive probe having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the cysteine-reactive probe is a cysteine-reactive probe of Formula (II).
  • the cysteine containing protein is a protein illustrated in Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2.
  • the cysteine containing protein is a protein illustrated in Table 8.
  • the cysteine containing protein is a protein illustrated in Table 9.
  • the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the affinity handle is a bioorthogonal affinity handle.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the cysteine-reactive probe binds irreversibly to the cysteine containing protein. In some embodiments, the cysteine-reactive probe binds reversibly to the cysteine containing protein.
  • a cysteine-reactive probe of Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety.
  • the cysteine-reactive probe covalently binds to a cysteine residue on a cysteine containing protein.
  • the cysteine containing protein is a protein illustrated in Table 1.
  • the cysteine containing protein is a protein illustrated in Table 2.
  • the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the cysteine-reactive probe binds irreversibly to the cysteine containing protein. In some embodiments, the cysteine-reactive probe binds reversibly to the cysteine containing protein.
  • a compound capable of covalently binding to a cysteine containing protein identified using the method comprising: (a) obtaining a set of cysteine-reactive probe-protein complexes from a sample wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; (c) based on step b), identifying a cysteine containing protein as the binding target for the compound.
  • the compound is a small molecule fragment.
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
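  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.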
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
  • the small molecule fragment is a specific inhibitor or a pan inhibitor.
  • the cysteine containing protein comprises a biologically active cysteine residue.
  • the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue.
  • the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine.
  • the biologically active cysteine site is an active site cysteine.
  • the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue.
  • the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine.
  • the biologically active cysteine site is a non-active site cysteine.
  • the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein.
  • the cysteine containing protein exists in an active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein.
  • the cysteine containing protein exists in a pro-active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein.
  • the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue.
  • the structural environment is a hydrophobic environment or a hydrophilic environment.
  • the structural environment is a charged environment.
  • the structural environment is a nucleophilic environment.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein.
  • the enzyme comprises kinases, proteases, or deubiquitinating enzymes.
  • the protease is a cysteine protease.
  • the cysteine protease comprises caspases.
  • the signaling protein comprises vascular endothelial growth factor. In some embodiments, the signaling protein comprises a redox signaling protein. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • R is selected from: (a) ; (b)
  • R1 is H, C1-C3 alkyl, or aryl; and F is a small molecule fragment moiety.
  • F has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • the cysteine containing protein is a cysteine containing protein described herein. In some embodiments, the cysteine containing protein is a protein illustrated in Tables 1, 2, 3, 8 or 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9.
  • R is selected from: (a) (b)
  • R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a)
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a)
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a) ; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a)
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a)
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a) ; wherein R1 is H, C1-C3 alkyl, or aryl; and F is a small molecule fragment moiety.
  • F has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a) ; (b)
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600,
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a) (b)
  • F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
  • R is selected from: (a) ; (b) ; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety.
  • F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600,
  • the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • F' is a small molecule fragment moiety illustrated in Fig. 3.
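The molecular weight convention repeated in the embodiments above (carbon and hydrogen, optionally nitrogen, oxygen and sulfur, with halogens and transition metals left out) can be sketched as follows. This is only an illustrative reading of that convention, not a procedure given in the source; the function name `fragment_molecular_weight`, the dictionary-based input, and the example composition are assumptions made for the sketch.

```python
# Standard atomic weights (g/mol) for the elements that the text counts
# toward the molecular weight of the small molecule fragment.
COUNTED_ATOMIC_WEIGHTS = {
    "C": 12.011,
    "H": 1.008,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
}


def fragment_molecular_weight(composition):
    """Sum the weights of C, H, N, O and S atoms only.

    `composition` maps element symbols to atom counts. Any element that is
    not in COUNTED_ATOMIC_WEIGHTS (for example a halogen or a transition
    metal) is skipped, which is one way to read "does not include the
    molecular weight of a halogen, a transition metal or a combination
    thereof". This helper is hypothetical and not part of the source.
    """
    return sum(
        COUNTED_ATOMIC_WEIGHTS[element] * count
        for element, count in composition.items()
        if element in COUNTED_ATOMIC_WEIGHTS
    )


# Hypothetical chloroacetamide-type fragment with formula C8H8ClNO.
example = {"C": 8, "H": 8, "Cl": 1, "N": 1, "O": 1}
print(f"{fragment_molecular_weight(example):.3f} Da (chlorine not counted)")
```

Under this reading, the chlorine atom of a chloroacetamide fragment does not count toward thresholds such as "about 175 Dalton", so the example fragment is evaluated on its C, H, N and O content alone.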
  • a method of identifying a cysteine containing protein as a binding target for a small molecule fragment comprising: (a) obtaining a set of cysteine-reactive probe-protein complexes from a sample comprising a first cell solution treated with a small molecule fragment and a cysteine reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), identifying a cysteine containing protein as the binding target for the small molecule fragment.
  • the method further comprises determining a value of each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes for identifying a cysteine containing protein as the binding target for the small molecule fragment, wherein the value is determined based on the proteomic analysis means of step b).
  • the sample further comprises a second cell solution.
  • the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
  • the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
  • the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine-reactive probe-protein complexes.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • the small molecule fragment is a small molecule fragment of Formula (I): Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment -Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • the cysteine-reactive probe is a cysteine-
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue
  • AHM is an affinity handle moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
  • the chromophore comprises non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
  • the labeling group is a biotin moiety, a streptavidin moiety, bead, resin, a solid support, or a combination thereof.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • the proteomic analysis means comprises a mass spectroscopy method.
  • the identifying in step c) further comprises (i) locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complex and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complex; and (ii) calculating a ratio between the two values assigned to the same cysteine containing protein. In some embodiments, the ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
  • the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein. In some embodiments, the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
  • the method is an in situ method. In some embodiments, the cysteine-reactive probe is not 4-hydroxynonenal or 15-deoxy-Δ12,14-prostaglandin J2.
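The two scoring criteria described for step c), a ratio between the values from the two groups and a percentage of inhibition, can be illustrated with a short sketch. The source does not spell out either formula, so the following are assumptions: the ratio is taken as control value over fragment-treated value (following the DMSO versus fragment convention in the Fig. 1 description), and percentage of inhibition is taken as the fractional loss of probe signal relative to control. All function names are hypothetical.

```python
def competition_ratio(control_value, treated_value):
    """Ratio of the control-group value to the fragment-treated value.

    Assumption: the control (no fragment) value is the numerator, so a
    larger ratio means the fragment blocked more probe labeling.
    """
    if treated_value <= 0:
        # Probe labeling fully blocked: treat the ratio as unbounded.
        return float("inf")
    return control_value / treated_value


def percent_inhibition(control_value, treated_value):
    """Percentage of probe labeling lost in the fragment-treated sample."""
    if control_value <= 0:
        raise ValueError("control_value must be positive")
    return 100.0 * (control_value - treated_value) / control_value


def is_candidate(control_value, treated_value, ratio_cutoff=2.0):
    """Flag a cysteine containing protein whose ratio exceeds the cutoff."""
    return competition_ratio(control_value, treated_value) > ratio_cutoff


# Hypothetical values for one cysteine containing protein.
control, treated = 10.0, 2.0
print(competition_ratio(control, treated))
print(percent_inhibition(control, treated))
print(is_candidate(control, treated))
```

With these assumed definitions the two criteria are equivalent at the lowest thresholds: a ratio greater than 2 corresponds to a percentage of inhibition greater than 50%.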
  • modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment has a molecular weight of about 150 Dalton or higher.
  • the cysteine containing protein comprises a cysteine residue site denoted in Table 3.
  • the cysteine containing protein comprises a protein sequence illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
  • R 1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety.
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the modified cysteine containing protein is selected from IDH2, caspase-8, caspase-10 or PRMT1.
  • IDH2 is modified at cysteine position 308.
  • caspase-8 is modified at cysteine position 360.
  • caspase-10 exists in the proform and is modified at cysteine position 401.
  • PRMT1 is modified at cysteine position 109.
  • the small molecule fragment is a small molecule fragment of Formula (I): Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a method of screening a small molecule fragment for interaction with a cysteine containing protein comprising: (a) harvesting a set of cysteine-reactive probe-protein complexes from a sample comprising a first cell solution treated with a small molecule fragment and a cysteine reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), identifying the small molecule fragment as interacting with the cysteine containing protein.
  • the method further comprises determining a value of each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes prior to identifying the small molecule fragment as interacting with the cysteine containing protein, wherein the value is determined based on the proteomic analysis means of step b).
  • the cysteine containing protein is a protein illustrated in Table 3.
  • the cysteine containing protein is a protein illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
  • Fig. 1 illustrates proteome-wide screening of covalent fragments.
  • A General protocol for competitive isoTOP-ABPP. Cell lysate or intact cells are pre-treated with a fragment electrophile or DMSO and then reacted with an IA-alkyne probe 1.
  • the fragment- and DMSO-treated samples are then conjugated to isotopically-differentiated TEV protease-cleavable biotin tags [light (red) and heavy (blue), respectively] by copper-mediated azide-alkyne cycloaddition (CuAAC or click) chemistry, mixed, and IA-labeled proteins enriched by streptavidin-conjugated beads and digested stepwise on-bead with trypsin and TEV to yield IA-labeled peptides for MS analysis.
  • Competition ratios, or R values, are measured by dividing the MS1 ion peaks for IA-labeled peptides in DMSO-treated (heavy or blue) versus fragment-treated (light or red) samples.
  • B Representative members of the electrophilic fragment library, where the reactive (electrophilic) and binding groups are colored green and black, respectively.
  • C Initial analysis of the proteomic reactivity of fragments using an IA-rhodamine probe 16. Soluble proteome from Ramos cells was treated with the indicated fragments (500 µM each) for 1 h, followed by labeling with IA-rhodamine (1 µM, 1 h) and analysis by SDS-PAGE and in-gel fluorescence scanning. Several proteins were identified that show impaired reactivity with IA-rhodamine in the presence of one or more fragments (asterisks). Fluorescent gel shown in grayscale.
  • Proteomic reactivity values, or liganded cysteine rates, for fragments were calculated as the percentage of total cysteines with R values > 4 in DMSO/fragment (heavy/light) comparisons.
  • E Concentration-dependent labeling of MDA-MB-231 soluble proteomes with acrylamide 18 and chloroacetamide 19 click probes detected by CuAAC with a rhodamine-azide tag and analysis by SDS-PAGE and in-gel fluorescence scanning.
  • F Representative MS1 peptide ion
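The R value and proteomic reactivity definitions in the Fig. 1 description can be sketched in a few lines: R is the DMSO (heavy) MS1 peak divided by the fragment (light) MS1 peak for each IA-labeled peptide, and the proteomic reactivity of a fragment is the percentage of quantified cysteines with R > 4. The data layout, the cysteine identifiers, and the handling of a zero light peak are assumptions made for illustration and are not taken from the source.

```python
def r_value(heavy_dmso_peak, light_fragment_peak):
    """Competition ratio R = heavy (DMSO) peak / light (fragment) peak."""
    if light_fragment_peak <= 0:
        # Assumption: a missing light signal is treated as complete blockade.
        return float("inf")
    return heavy_dmso_peak / light_fragment_peak


def proteomic_reactivity(peaks_by_cysteine, r_cutoff=4.0):
    """Percentage of quantified cysteines with R above the cutoff.

    `peaks_by_cysteine` maps a cysteine identifier to a
    (heavy_dmso_peak, light_fragment_peak) tuple of MS1 peak intensities.
    """
    if not peaks_by_cysteine:
        raise ValueError("no quantified cysteines")
    liganded = sum(
        1
        for heavy, light in peaks_by_cysteine.values()
        if r_value(heavy, light) > r_cutoff
    )
    return 100.0 * liganded / len(peaks_by_cysteine)


# Hypothetical MS1 peak intensities for four quantified cysteines.
peaks = {
    "protein_a_C100": (10.0, 1.0),  # R = 10, liganded
    "protein_b_C55": (4.0, 1.0),    # R = 4, not above the cutoff
    "protein_c_C12": (3.0, 3.0),    # R = 1, unaffected by the fragment
    "protein_d_C301": (20.0, 2.0),  # R = 10, liganded
}
print(proteomic_reactivity(peaks))
```

The cutoff is a strict inequality here because the text states "R values > 4"; a cysteine with R exactly equal to 4 is therefore not counted as liganded.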
  • FIG. 2 illustrates a conceptual schematic of an exemplary computer server to be used for processing a method described herein.
  • Fig. 3 shows composition of fragment electrophile library and structures of additional tool compounds, click probes, and fragments.
  • Fig. 4 illustrates analysis of proteomic reactivities of fragment electrophiles
  • A Frequency of quantification of all cysteines across the complete set of competitive isoTOP-ABPP experiments performed with fragment electrophiles. Note that cysteines were required to have been quantified in at least three isoTOP-ABPP data sets for interpretation.
  • B Rank order of proteomic reactivity values (or liganded cysteine rates) of fragments calculated as the percentage of all quantified cysteines with R values > 4 for each fragment. The majority of fragments were evaluated in 2-4 replicate experiments in MDA-MB-231 and/or Ramos cell lysates, and their proteomic reactivity values are reported as mean ± SEM values for the replicates.
  • C Comparison of the proteomic reactivities of representative fragments screened at 500 versus 25 µM in cell lysates.
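Reporting a fragment's proteomic reactivity as a mean with its standard error across replicate experiments, as described for Fig. 4B, might look like the sketch below. It assumes the standard error of the mean is computed from the sample standard deviation (n - 1 denominator) divided by the square root of the number of replicates; the source does not state which estimator was used, and the replicate values shown are invented for illustration.

```python
import math
import statistics


def mean_and_sem(replicate_values):
    """Return (mean, SEM) for a list of replicate measurements.

    SEM is taken as sample standard deviation / sqrt(n), which needs at
    least two replicates; this choice of estimator is an assumption.
    """
    n = len(replicate_values)
    if n < 2:
        raise ValueError("at least two replicates are required for SEM")
    mean = statistics.mean(replicate_values)
    sem = statistics.stdev(replicate_values) / math.sqrt(n)
    return mean, sem


# Hypothetical proteomic reactivity values (percent of cysteines with
# R > 4) for one fragment measured in three replicate experiments.
replicates = [2.0, 4.0, 6.0]
mean, sem = mean_and_sem(replicates)
print(f"{mean:.2f} +/- {sem:.2f} %")
```

Because the text notes that most fragments were evaluated in 2-4 replicates, the sketch rejects single measurements rather than reporting a standard error of zero.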
  • Fig. 5 illustrates analysis of cysteines and proteins liganded by fragment electrophiles.
  • A Fraction of total quantified cysteines and proteins that were liganded by fragment
  • cysteines are plotted on the x-axis and were sorted by reactivity, which is shown on the left y-axis. A moving average with a step-size of 50 is shown in blue for the percentage of liganded cysteines within each reactivity bin (percent values shown on the right y-axis).
  • F Number of liganded and quantified cysteines per protein measured by isoTOP-ABPP. Respective average values of one and three for liganded and quantified cysteines per protein were measured by isoTOP-ABPP.
  • G R values for six cysteines in XPO1 quantified by isoTOP-ABPP, identifying C528 as the most liganded cysteine in this protein. Each point represents a distinct fragment-cysteine interaction quantified by isoTOP-ABPP.
  • FIG. 6 illustrates analysis of fragment-cysteine interactions.
  • A Heatmap showing R values for representative cysteines and fragments organized by proteomic reactivity values (high to low, left to right) and percentage of fragment hits for individual cysteines (high to low, top to bottom). R values > 4 designate fragment hits (colored medium and dark blue). White color designates fragment-cysteine interactions that were not detected (ND).
  • B, C Histograms depicting the percentage of fragments that are hits (R > 4) for all 768 liganded cysteines (B) or for liganded cysteines found in enzymes for which X-ray and/or NMR structures have been reported (or reported for a close homologue of the enzyme) (C).
  • D Percentage of liganded cysteines targeted only by group A (red) or B (blue) fragments or both group A and B fragments (black). Shown for all liganded cysteines, liganded cysteines in enzyme active and non-active sites, and liganded cysteines in transcription factors/ regulators.
  • For C and D, active-site cysteines were defined as those that reside within 10 Å of established active-site residues and/or bound substrates/inhibitors in enzyme structures.
  • E Representative example of reactive docking predictions shown for XPO1 (PDB ID: 3GB8). All accessible cysteines were identified and reactive docking was conducted with all fragments from the library within a 25 Å docking cube centered on each accessible cysteine. Categories of XPO1 cysteines based on combined docking and isoTOP-ABPP results are shown.
  • F Success rate of reactive docking predictions for liganded cysteines identified by isoTOP-ABPP in 29 representative proteins.
  • Fig. 7 illustrates analysis of cysteines liganded by fragment electrophiles in
  • A Representative MS1 ion chromatograms for peptides containing C481 of BTK and C131 of MAP2K7, two cysteines known to be targeted by the anti-cancer drug ibrutinib.
  • Ramos cells were treated with ibrutinib (1 µM, 1 h, red trace) or DMSO (blue trace) and evaluated by isoTOP-ABPP.
  • B Total number of liganded cysteines found in the active sites and non-active sites of enzymes for which X-ray and/or NMR structures have been reported (or reported for a close homologue of the enzyme).
  • C R values for eight cysteines in PHGDH quantified by isoTOP-ABPP, identifying a single liganded cysteine C369 that is targeted by several fragment electrophiles. Each point represents a distinct fragment-cysteine interaction quantified by isoTOP-ABPP.
  • D Heatmap showing representative fragment interactions for liganded cysteines found in the active sites and non-active sites of kinases.
  • E Histogram showing the fragment hit rate for active- and non-active site cysteines in kinases.
  • F The percentage of liganded cysteines in kinases that were targeted by only group A, only group B, or both group A and B compounds.
  • G Heatmap showing representative fragment interactions for liganded cysteines found in transcription factors/regulators.
  • H The fraction of cysteines predicted to be ligandable or not ligandable by reactive docking that were quantified in isoTOP-ABPP experiments.
  • Fig. 8 illustrates confirmation and functional analysis of fragment-cysteine
  • A Representative MS1 chromatograms for the indicated Cys-containing peptides from PRMT1 quantified in competitive isoTOP-ABPP experiments of MDA-MB-231 cell lysates, showing blockade of IA-alkyne 1 labeling of C109 by fragment 11, but not control fragment 3.
  • B 11, but not 3, blocked IA-rhodamine (2 µM) labeling of recombinant, purified
  • F IC50 curve for blockade of ibrutinib probe-labeling of MLTK by 60.
  • G 60, but not control fragment 3 (100 µM of each fragment), inhibited the kinase activity of WT-, but not C22A-MLTK.
  • H Click probe 18 (25 µM) labeled WT-IMPDH2 and C331S-IMPDH2, but not C140S-IMPDH2 (or C140S/C331S-IMPDH2). Labeling was detected by CuAAC conjugation to a rhodamine-azide reporter tag and analysis by SDS-PAGE and in-gel fluorescence scanning. Recombinant IMPDH2 WT and mutants were expressed and purified from E.
  • Fig. 9 illustrates confirmation and functional analysis of fragment-cysteine
  • A Representative MS1 ion chromatograms for the MLTK tryptic peptide containing liganded cysteine C22 quantified by isoTOP-ABPP in MDA-MB-231 lysates treated with fragment 4 or control fragment 3 (500 µM each).
  • B Lysates from HEK293T cells expressing WT- or C22A-MLTK treated with the indicated fragments and then an ibrutinib-derived activity probe 59 at 10 µM.
  • MLTK labeling by 59 was detected by CuAAC conjugation to a rhodamine-azide tag and analysis by SDS-PAGE and in-gel fluorescence scanning.
  • E Fragment reactivity with recombinant, purified IMPDH2 added to Jurkat lysates to a final concentration of 1 µM protein, where reactivity was detected in competition assays using the click probe 18 (25 µM; see Fig. 8H for structure of 18). Note that 18 reacted with WT- and C331S-IMPDH2, but not C140S or C140S/C331S-IMPDH2.
  • F Nucleotide competition of 18 (25 µM) labeling of WT-IMPDH2 added to cell lysates to a final concentration of 1 µM protein.
  • G Representative MS1 chromatograms for TIGAR tryptic peptides containing C114 and C161 quantified by isoTOP-ABPP in cell lysates treated with the indicated fragments (500 µM each).
  • H Crystal structure of TIGAR (PDB ID: 3DCY) showing C114 (red spheres), C161 (yellow spheres), and inorganic phosphate (blue).
  • I Labeling of recombinant, purified TIGAR and mutant proteins by the IA-rhodamine (2 µM) probe. TIGAR proteins were added to cell lysates, to a final concentration of 2 µM protein.
  • J Concentration-dependent inhibition of WT-TIGAR by 5. Note that the C140S-TIGAR mutant was not inhibited by 5. Data represent mean values ± SEM for 4 replicate experiments at each concentration.
  • FIG. 10 illustrates in situ activity of fragment electrophiles.
  • A X-ray crystal structure of IDH1 (PDB ID: 3MAS) showing the position of C269 and the frequently mutated residue in cancer, R132.
  • B, C Reactivity of 20 and control fragment 2 with recombinant, purified WT-IDH1 (B) or R132H-IDH1 (C) added to cell lysates to a final concentration of 2 or 4 µM protein, respectively. Fragment reactivity was detected in competition assays using the IA-rhodamine probe (2 µM); note that the C269S-IDH1 mutant did not react with IA-rhodamine.
  • FIG. 11 illustrates in situ activity of fragment electrophiles.
  • A Blockade of 16 labeling of WT-IDH1 by representative fragment electrophiles. Recombinant, purified WT-IDH1 was added to MDA-MB-231 lysates at a final concentration of 2 µM, treated with fragments at the indicated concentrations, followed by IA-rhodamine probe 16 (2 µM) and analysis by SDS-
  • IA-rhodamine 16 IC50 curve for blockade of IA-rhodamine-labeling of IDH1 by 20. Note that the control fragment 2 showed much lower activity.
  • C 20, but not 2, inhibited IDH1-catalyzed oxidation of isocitrate to α-ketoglutarate (α-KG) as measured by an increase in NADPH production (340 nm absorbance). 20 did not inhibit the C269S-IDH1 mutant.
  • D 20 inhibited oncometabolite 2-hydroxyglutarate (2-HG) production by R132H-IDH1. MUM2C cells stably overexpressing the oncogenic R132H-IDH1 mutant or control GFP-expressing MUM2C cells were treated with the indicated fragments (2 h, in situ).
  • FIG. 12 illustrates fragment electrophiles that target pro-CASP8.
  • A Representative MS1 chromatograms for CASP8 tryptic peptide containing the catalytic cysteine C360 quantified by isoTOP-ABPP in cell lysates or cells treated with fragment 4 (250 µM, in vitro; 100 µM, in situ) and control fragment 21 (500 µM, in vitro; 200 µM, in situ).
  • B Fragment reactivity with recombinant, purified active CASP8 added to cell lysates, where reactivity was detected in competition assays using the caspase activity probe Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) (2 µM, 1 h).
  • C Western blot of proteomes from MDA-MB-231, Jurkat, and CASP8-null Jurkat proteomes showing that CASP8 was only found in the pro-enzyme form in these cells.
  • D Fragment reactivity with recombinant, purified pro-CASP8 (D374A, D384A, C409S) added to cell lysates to a final concentration of 1 µM protein, where reactivity was detected in competition assays with the IA-rhodamine probe (2 µM). Note that mutation of both cysteine-360 and cysteine-409 to serine prevented labeling of pro-CASP8 by IA-rhodamine.
  • Fig. 13 illustrates fragment electrophiles that target pro-CASP8.
  • A 7 blocked IA-rhodamine 16 labeling of pro-CASP8.
  • Experiments were performed with recombinant, purified pro-CASP8 (bearing a C409S mutation to eliminate IA-rhodamine labeling at this site) added to Ramos cell lysate at a final concentration of 1 µM and treated with the indicated concentrations of 7 followed by IA-rhodamine (2 µM). Note that a C360S/C409S-mutant of pro-CASP8 did not label with IA-rhodamine.
  • B IC50 curve for blockade of IA-rhodamine labeling of pro-CASP8 (C409S) by 7.
  • C 7 (50 µM) fully competed IA-alkyne-labeling of C360 of endogenous CASP8 in cell lysates as measured by isoTOP-ABPP.
  • Representative MS1 chromatograms are shown for the C360-containing peptide of CASP8.
  • D 7 selectively blocked probe labeling of pro-CASP8 compared to active CASP8.
  • Recombinant pro- and active-CASP8 (added to Ramos cell lysates at a final concentration of 1 µM each) were treated with 7 (50 µM) or the established caspase inhibitor, Ac-DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) (20 µM), for 1 h followed by labeling with the click probe 61 (25 µM) for pro-CASP8 and the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) (2 µM) for active-CASP8.
  • Recombinant pro- and active-CASP8 were added to Ramos lysates at 1 µM and then treated with 7 (30 µM) followed by isoTOP-ABPP.
  • G Substitution of a naphthylamine for the aniline portion of 7 furnishes a control fragment 62 that did not compete with IA-rhodamine labeling of C360 of pro-CASP8.
  • H 7, but not control fragment 62, blocked extrinsic, but not intrinsic apoptosis.
  • Jurkat cells (1.5 million cells/mL) were incubated with 7 or 62 (30 µM) or the pan-caspase inhibitor VAD-FMK (100 µM) for 30 min prior to addition of staurosporine (2 µM) or SuperFasLigand™ (100 ng/mL). Cells were incubated for 6 hours and viability was quantified with CellTiter-Glo®. RLU, relative light unit.
  • I For cells treated as described in H, cleavage of PARP (89 kDa), CASP8 (p43/p41), and CASP3 (p19/p17) was visualized by western blot.
  • For panels B, E, and H, data represent mean values ± SEM for at least three independent experiments.
  • Fig. 14 shows electrophile compounds that target pro-CASP8 and pro-CASP10.
  • Fig. 15 illustrates a fraction of liganded (62%; 341 of 553 quantified cysteines) and unliganded (20%; 561 of 2870 quantified cysteines) cysteines that are sensitive to heat denaturation measured by IA-alkyne labeling (R > 3 native/heat denatured).
  • Fig. 16 shows a percentage of proteins identified by isoTOP-ABPP as liganded by fragments 3 and 14 and enriched by their corresponding click probes 19 and 18 that are sensitive to heat denaturation (64% (85 of 133 quantified protein targets) and 73% (19 of 26 quantified protein targets), respectively). Protein enrichment by 18 and 19 was measured by whole protein capture of isotopically-SILAC labeled MDA-MB-231 cells using quantitative (SILAC) proteomics.
  • Fig. 17A-B illustrate exemplary fractions of cysteines predicted based on isoTOP-ABPP method or IA-alkyne probe.
  • Fig. 17A shows the fraction of cysteines predicted to be ligandable or unligandable by reactive docking that were quantified in isoTOP-ABPP experiments.
  • Fig. 17B shows the fraction of cysteines predicted to be ligandable or unligandable by reactive docking that show heat-sensitive labeling by the IA-alkyne probe (R > 3 native/heat denatured).
  • Fig. 18 shows lysates from HEK293T cells expressing WT or C22A-MLTK treated with the indicated fragments and then an ibrutinib-derived activity probe 59 at 10 µM.
  • MLTK labeling by 59 was detected by CuAAC conjugation to a rhodamine-azide tag and analysis by SDS-PAGE and in-gel fluorescence scanning.
  • Fig. 19 shows click probe 18 (25 µM) labeled WT-IMPDH2 and C331S-IMPDH2, but not C140S-IMPDH2 (or C140S/C331S-IMPDH2). Labeling was detected by CuAAC
  • Recombinant IMPDH2 WT and mutants were expressed and purified from E. coli and added to Jurkat lysates to a final concentration of 1 µM protein.
  • Fig. 20 shows the apparent IC50 curve for blockade of IA-rhodamine labeling of R132H-IDH1 by 20.
  • Fig. 21A-C show the activity of compounds 7 and 62 with respect to different recombinant caspases.
  • Fig. 21A shows that 7 does not inhibit active caspases.
  • Recombinant, active caspases were added to MDA-MB-231 lysate to a final concentration of 200 nM (CASP2, 3, 6, 7) or 1 µM (CASP8, 10), treated with z-VAD-FMK (25 µM) or 7 (50 µM), followed by labeling with the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) (2 µM).
  • FIG. 21B shows a western blot of the cleavage of PARP (96 kDa), CASP8 (p43/p41, p18), and CASP3 (p17).
  • Fig. 21C shows that 7 protects Jurkat cells from extrinsic, but not intrinsic apoptosis.
  • Cleavage of PARP, CASP8, and CASP3 detected by western blotting as shown in Fig. 21B was quantified for three (STS) or two (FasL) independent experiments.
  • Cleavage products (PARP (96 kDa), CASP8 (p43/p41), CASP3 (p17)) were quantified for compound treatment and the % cleavage relative to DMSO treated samples was calculated.
  • STS data represent mean values ± SEM for three independent experiments, and FasL data represent mean values ± SD for two independent experiments.
  • Statistical significance was calculated with unpaired Student's t-tests comparing active compounds (VAD-FMK and 7) to control compound 62; **, p < 0.01, ***, p < 0.001, ****, p < 0.0001.
  • Fig. 22 shows that CASP10 is involved in intrinsic apoptosis in primary human T cells.
  • A Representative MS1 peptide signals showing R values for caspases detected by quantitative proteomics using probe 61.
  • For ABPP-SILAC experiments, Jurkat cells (10 million cells) were treated with either DMSO (heavy cells) or the indicated compounds (light cells) for 2 h followed by probe 61 (10 µM, 1 h).
  • B 7 competed 61-labeling of pro-CASP8 and CASP10, whereas 63-R selectively blocked probe labeling of pro-CASP8.
  • C 7, but not 63-R, blocks probe labeling of pro-CASP10.
  • Recombinant pro-CASP10 was added to MDA-MB-231 lysates to a final concentration of 300 nM, treated with the indicated compounds, and labeled with probe 61.
  • F Apparent IC50 curve for blockade of 61 labeling of pro-CASP8 and pro-CASP10 by 63-R.
  • G 63-R shows increased potency against pro-CASP8.
  • Recombinant pro-CASP8 was added to MDA-MB-231 lysates to a final concentration of 300 nM, treated with the indicated compounds and labeled with probe 61.
  • H Apparent IC50 curve for blockade of 61 labeling of pro-CASP8 by 63-R compared with 63-S. The structure of 63-S is shown.
  • I CASP10 is more highly expressed in primary human T cells compared to Jurkat cells.
  • cycloheximide (CHX, 2.5 ng/mL). Cells were incubated for 6 h and viability was quantified with CTG. M For T cells treated as in Fig. 14B, cleavage of CASP10 (p22), CASP8 (p18), CASP3 (p17) and RIPK (33 kDa) was visualized by western blotting.
  • D-F, H, and J-K data represent mean values ± SEM for at least three independent experiments. Statistical significance was calculated with unpaired Student's t-tests comparing DMSO- to fragment-treated samples; **, p < 0.01, ****, p < 0.0001.
  • Fig. 23A-F exemplify DMF inhibits the activation of primary human T cells.
  • Fig. 23A illustrates the chemical structures of DMF, MMF, and DMS.
  • Fig. 23B - Fig. 23E illustrate bar graphs that exemplify IL-2 release (Fig. 23B), CD25 expression (Fig. 23C and Fig. 23D), and CD69 expression (Fig. 23E) in primary human T cells, either unstimulated (Unstim) or stimulated (Stim) with anti-CD3 + anti-CD28 in the presence of DMSO or the indicated concentrations of DMF, MMF, and DMS for 8 hours.
  • Fig. 23F illustrates a bar graph that exemplifies time course of DMF effects.
  • Fig. 24 illustrates a bar graph that exemplifies DMF does not affect T cell viability.
  • Primary human T cells were stimulated with anti-CD3 and anti-CD28 antibodies as indicated and treated concomitantly with compound for 8 h. Cells were then stained with LIVE/DEAD fixable blue stain and analyzed by flow cytometry. Shown are data gated on CD4+ cells. Data represent mean ± SE for four experiments per group.
  • Fig. 25A-B illustrate bar graphs that exemplify DMF, but not MMF, inhibits the activation of primary mouse T cells.
  • Splenic T cells were harvested from C57BL/6 mice and left either unstimulated (Unstim) or stimulated (Stim) with anti-CD3 + anti-CD28 in the presence of DMSO or the indicated concentrations of DMF, MMF, and DMS for 8 h.
  • Activation was assessed by measuring CD25 (Fig. 25A) and CD69 (Fig. 25B) expression. Data represent mean ± SE for four experiments per group. ***p < 0.001 in comparison to DMSO group.
  • Fig. 26A-D illustrate bar graphs that exemplify inhibitory effects of DMF are equivalent in Nrf2(+/+) and (-/-) T cells and not caused by reductions in cellular GSH.
  • Fig. 26A exemplifies CD25 expression in anti-CD3 + anti-CD28-stimulated Nrf2(+/+) and (-/-) T cells.
  • Splenic T cells were harvested from Nrf2(+/+) and (-/-) mice, then stimulated in the presence of indicated compounds for 24h.
  • Fig. 26B and Fig. 26C exemplify treatment with DMF or BSO causes significant reductions in GSH content of human T cells.
  • Primary human T cells were stimulated with anti-CD3 + anti-CD28 antibodies and treated with DMF (50 µM, 2 hours) or BSO (2.5 mM, 4 hours), after which intracellular GSH levels were measured.
  • Fig. 26D exemplifies that BSO does not alter T cell activation.
  • Primary human T cells were treated with DMSO, DMF (50 µM), or BSO (2.5 mM) and stimulated as indicated for 8 h, after which CD25 expression was measured.
  • Data represent mean ± SE for two biological replicates, with 3-4 technical replicates per biological replicate. *p < 0.05, **p < 0.01, ***p < 0.001 in comparison to DMSO groups.
  • Fig. 27A-F exemplify isoTOP-ABPP of DMF-treated primary human T cells.
  • Fig. 27A illustrates a graph that exemplifies isoTOP-ABPP ratios, or R values, for > 2400 Cys residues in primary human T cells treated with DMSO or DMF or MMF (50 µM, 4 h).
  • Fig. 27B illustrates a graph that exemplifies expanded profile for DMF-sensitive Cys residues (R values > 4 for DMSO/DMF).
  • Data represent aggregate quantified Cys residues from five biological replicates. For Cys residues quantified in more than one replicate, average ratios are reported.
  • Dashed line designates R values > 4, which was used to define DMF-sensitive Cys residues (> 4-fold reductions in IA-alkyne reactivity in DMF-treated T cells).
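  • The aggregation described for Fig. 27B (pooling Cys residues across replicates, averaging ratios for residues seen more than once, then applying the R > 4 cutoff) can be sketched as follows. This is only an illustrative sketch: the site names and ratio values are hypothetical, the function names are assumptions, and the cutoff is applied as a strict inequality as printed above.

```python
from collections import defaultdict
from statistics import mean

# Cutoff taken from the figure description (R > 4 for DMSO/DMF).
DMF_SENSITIVE_R = 4.0


def aggregate_ratios(replicates):
    """Average R values per cysteine over the replicates in which it was quantified."""
    pooled = defaultdict(list)
    for replicate in replicates:
        for site, ratio in replicate.items():
            pooled[site].append(ratio)
    return {site: mean(values) for site, values in pooled.items()}


def sensitive_sites(avg_ratios, threshold=DMF_SENSITIVE_R):
    """Return the sites whose averaged ratio exceeds the threshold, sorted by name."""
    return sorted(site for site, r in avg_ratios.items() if r > threshold)


# Hypothetical data: two replicates, one site quantified in both.
replicates = [
    {"siteA": 5.0, "siteB": 1.1},
    {"siteA": 7.0, "siteC": 4.5},
]
averaged = aggregate_ratios(replicates)
hits = sensitive_sites(averaged)
```

  • A residue quantified in a single replicate simply keeps that replicate's ratio, which matches the statement that averages are reported only where more than one replicate is available.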
  • Fig. 27C and Fig. 27D illustrate graphs that exemplify concentration- and time-dependent profiles for DMF-sensitive Cys residues in T cells, respectively. For additional concentrations (10 and 25 µM) and time points (1 and 2 h), data represent aggregate quantified Cys residues from one to three isoTOP-ABPP experiments per group.
  • FIG. 27E illustrates a chart which exemplifies fraction of proteins for which both a DMF-sensitive Cys residue and at least one additional Cys residue was quantified (Left) and fraction of these proteins where the additional Cys residue was clearly unchanged (Right) (R value < 2.0 for DMSO/DMF).
  • Fig. 27F illustrates representative MS1 profiles for quantified Cys residues in PRKDC, one of which (C4045) shows sensitivity to DMF.
  • Fig. 28A-B illustrate bar graphs that exemplify the total number of unique quantified peptides (Fig. 28A) and proteins (Fig. 28B) begin to plateau after five biological replicates of the isoTOP-ABPP experiment in primary human T cells (treated with 50 µM DMF for 4 h).
  • Fig. 29 illustrates a graph that exemplifies isoTOP-ABPP of BSO-treated primary human T cells. Cells were treated with 2.5 mM BSO for 4 hours. Data represent aggregate quantified Cys residues from two isoTOP-ABPP experiments per group.
  • Fig. 30A-C exemplify conservation and functional analysis of DMF-sensitive cysteines.
  • Fig. 30A exemplifies fraction of DMF-sensitive cysteines in the human T cell proteome that are conserved in mice.
  • Fig. 30B exemplifies fraction of conserved DMF-sensitive Cys residues in human T cells that were quantified and also sensitive to DMF in mouse T cells.
  • Fig. 30C exemplifies distribution of proteins harboring DMF-sensitive Cys residues by functional class.
  • Fig. 31A-C exemplify DMF inhibits p65 translocation to the nucleus in primary human T cells.
  • Fig. 31A exemplifies human T cells that were either left unstimulated or stimulated with anti-CD3 and anti-CD28 antibodies and treated with DMSO or DMF (50 µM) for 1 h.
  • Fig. 31B illustrates a bar graph that exemplifies ratio of nuclear to cytoplasmic localization of p65 for samples shown in Fig. 31A, as well as samples treated with MMF (50 µM) or DMS (50 µM).
  • Fig. 31C exemplifies p65 levels in whole cell lysate.
  • Fig. 32A-G exemplify DMF-sensitive C14/C17 residues in PKCθ are important for
  • FIG. 32A illustrates representative MS1 profiles for
  • Fig. 32B exemplifies sequence conservation analysis of human and mouse PKCθ, human PKCδ, and human PKCε
  • Fig. 32C illustrates location of DMF-sensitive C14 and C17 residues in the C2 domain of PKCθ
  • Fig. 32D exemplifies DMF, but not MMF, treatment blocks the association of PKCθ with CD28.
  • Peripheral CD4+ T cells from C57BL/6 mice were pre-incubated with DMSO, DMF (50 µM), or MMF (50 µM), either left unstimulated or stimulated with anti-CD3 + anti-CD28 for 5 min, then washed and lysed.
  • Immunoprecipitations (IPs) were performed in the cell lysates with anti-CD28 or control IgG antibodies and IPs blotted for CD28 or PKCθ.
  • Fig. 32E illustrates Co-IP of WT PKCθ and the C14S/C17S (2CS) PKCθ mutant with
  • PKCθ(-/-) T cells were reconstituted with empty vector (EV), WT PKCθ, or the 2CS
  • Fig. 32F and Fig. 32G illustrate PKCθ(-/-) T cells reconstituted with WT or 2CS PKCθ were assayed for activation potential by measuring CD25 expression (Fig. 32F) and IL-2 (Fig. 32G).
  • Fig. 32E - Fig. 32G PKCθ(-/-) T cell cultures were pre-activated with plate-coated anti-CD3 + anti-CD28 for 24 h before retroviral transduction with empty vector, WT PKCθ, or the 2CS PKCθ mutant. Cells were rested in culture medium without stimulation for 48 h, then re-stimulated with or without 1 µg/mL plate-coated anti-CD3(+28) overnight (Fig.
  • Fig. 33A-D exemplify DMF sensitivity of C14/C17 in PKCθ.
  • Fig. 33A illustrates representative MS1 profile of C14/C17 of mouse PKCθ shows sensitivity to DMF (50 µM, 4 h) in isoTOP-ABPP experiments.
  • Fig. 33B and Fig. 33C exemplify time- and concentration-dependence of DMF sensitivity of C14/C17 in human PKCθ, respectively, as determined by isoTOP-ABPP experiments.
  • Fig. 33D exemplifies C14/C17 of human PKCθ are insensitive to MMF treatment (50 µM MMF, 4 h).
  • Fig. 34A-B exemplify DMF-sensitive Cys residue in ADA.
  • Fig. 34A illustrates the DMF-sensitive Cys, C75 (magenta), is ~25 angstroms from the ADA active site (orange).
  • Fig. 34B illustrates mutations in both residues neighboring C75 (G74 and R76 (blue)) have been associated with the severe combined immunodeficiency known as ADA-SCID (OMIM: 608958).
  • Cysteine containing proteins encompass a large repertoire of proteins that participate in numerous cellular functions such as mitogenesis, proliferation, apoptosis, gene regulation, and proteolysis. These proteins include enzymes, transporters, receptors, channel proteins, adaptor proteins, chaperones, signaling proteins, plasma proteins, transcription related proteins, translation related proteins, mitochondrial proteins, or cytoskeleton related proteins.
  • Dysregulated expression of a cysteine containing protein, in many cases, is associated with or modulates a disease, such as an inflammatory related disease, a neurodegenerative disease, or cancer.
  • identification of a potential agonist/antagonist to a cysteine containing protein aids in improving the disease condition in a patient.
  • small molecule fragments are employed in some instances to serve as launching point for structure-guided elaboration of an initial interaction into a high- affinity drug.
  • one method of identifying a small molecule fragment that interacts with a cysteine containing protein is through monitoring their interaction under an in vitro environment.
  • the in vitro environment does not mimic the native condition of the cysteine containing protein.
  • the in vitro environment lacks additional helper proteins to facilitate interaction with the small molecule fragment. Further still, in some instances, difficulties arise during the expression and/or purification stage of the cysteine-containing protein.
  • Described herein is another method of identifying small molecule fragments for interaction with a cysteine containing protein.
  • this method allows for mapping of small molecule fragments for interaction with a cysteine containing protein under native conditions, thereby allowing for an accurate mapping of interaction with potential small molecule fragments.
  • this method also allows for identification of novel cysteine containing protein targets as this method eliminates the need of recombinant expression and purification.
  • compositions, cells, cell populations, assays, probes, and service related to the method of identifying a small molecule fragment for interaction with a cysteine containing protein.
  • the methods described herein utilize a small molecule fragment and a cysteine-reactive probe for competitive interaction with a cysteine-containing protein.
  • the method is as described in Fig. 1 A.
  • Fig. 1A illustrates contacting a first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • the small molecule fragment competes with the first cysteine-reactive probe for interaction with a protein target.
  • the small molecule fragment or the cysteine-reactive probe forms a covalent bond via a Michael reaction with a cysteine residue of the cysteine containing protein.
  • Fig. 1 A further illustrates contacting a second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
  • the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
  • cells from the second cell solution are grown in an enriched media (e.g., an isotopically enriched media).
  • cells from the first cell solution are grown in an enriched media (e.g., an isotopically enriched media).
  • cells from both the first cell solution and the second cell solution are grown in two different enriched media (e.g., two different isotopically enriched media) so that a protein obtained from cells grown in the first cell solution is distinguishable from a protein obtained from cells grown in the second cell solution.
  • cells from only one of the cell solutions (e.g., either the first cell solution or the second cell solution) are grown in an enriched media (e.g., isotopically enriched media).
  • a protein obtained from the enriched cells is distinguishable from a protein obtained from cells that have not been enriched (e.g., isotopically enriched).
  • the second cell solution is not treated with a small molecule fragment. In such cases, the second cell solution acts as a control.
  • cells from the second cell solution are further treated with a buffer.
  • the buffer is DMSO.
  • cells from the second cell solution are not treated with a small molecule fragment and the second cell solution acts as a control.
  • a first group of cysteine-reactive probe-protein complexes and a second group of cysteine-reactive probe-protein complexes are harvested separately and combined to generate a set of cysteine-reactive probe-protein complexes which is further processed by a proteomic analysis means.
  • either the first group of cysteine- reactive probe-protein complexes or the second group of cysteine-reactive probe-protein complexes contain labeled proteins obtained from cells grown in an enriched media (e.g., isotopically enriched media).
  • both groups of cysteine-reactive probe-protein complexes contain labeled proteins obtained from cells grown in two different enriched media (e.g., two different isotopically enriched media).
  • either the first group of cysteine-reactive probe-protein complexes, the second group of cysteine-reactive probe-protein complexes, or both groups of cysteine-reactive probe-protein complexes contain labeled proteins in which the proteins have been labeled after harvesting from a cell.
  • a first group of cysteine-reactive probe-protein complexes and a second group of cysteine-reactive probe-protein complexes are harvested separately and the proteins from one of the two groups of cysteine-reactive probe-protein complexes are subsequently labeled (e.g., by methylation).
  • the first group of cysteine-reactive probe-protein complexes and the second group of cysteine-reactive probe-protein complexes are then combined and subjected to proteomic analysis means.
  • a first group of cysteine-reactive probe-protein complexes and a second group of cysteine-reactive probe-protein complexes are harvested separately and both groups are subjected to proteomic analysis means.
  • data obtained from a proteomic analysis means is then combined for further analysis.
  • the proteomic analysis means comprises a mass spectroscopy method.
  • the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method.
  • the proteomic analysis means further comprise analyzing the results from the mass spectroscopy method by an algorithm for protein
  • the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification.
  • the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot.
  • the mass spectroscopy method is a MALDI-TOF based method.
  • a value is assigned to each of the cysteine binding proteins from the cysteine-reactive probe-protein complexes after proteomic analysis, in which the value is determined from the proteomic analysis.
  • the value assigned to each of the cysteine containing protein is obtained from a mass spectroscopy analysis.
  • the value is an area-under-the-curve from a plot of signal intensity as a function of mass-to-charge ratio.
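  • As an illustration of how an area-under-the-curve value could be obtained from a signal-intensity trace, the following minimal sketch integrates intensity over mass-to-charge ratio with the trapezoidal rule. The choice of trapezoidal integration and the function name are assumptions made for illustration; the text does not specify an integration scheme.

```python
def area_under_curve(mz, intensity):
    """Trapezoidal area under an intensity-vs-m/z trace.

    `mz` and `intensity` are equal-length sequences; `mz` must be strictly increasing.
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    if len(mz) < 2:
        raise ValueError("at least two points are required")
    area = 0.0
    for i in range(1, len(mz)):
        width = mz[i] - mz[i - 1]
        if width <= 0:
            raise ValueError("mz values must be strictly increasing")
        # Area of one trapezoid between neighbouring sample points.
        area += 0.5 * (intensity[i] + intensity[i - 1]) * width
    return area


# Hypothetical triangular peak used only to exercise the function.
peak_area = area_under_curve([500.0, 500.5, 501.0], [0.0, 4.0, 0.0])
```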
  • a first value is assigned to a cysteine binding protein from the first group of cysteine-reactive probe-protein complexes of the first cell solution, and a second value is assigned to the same cysteine binding protein from the second group of cysteine-reactive probe-protein complexes of the second cell solution.
  • a ratio is then calculated between the two values, the first value and the second value, and assigned to the same cysteine binding protein.
  • a ratio of greater than 2 indicates that the cysteine binding protein is a candidate for interacting with the small molecule fragment.
  • the ratio is greater than 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, or 10. In some cases, the ratio is at most 20.
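  • The ratio test described above can be sketched as follows. This is a hedged illustration, not the claimed implementation: it assumes the ratio is taken as the control (second) value divided by the fragment-treated (first) value, so that reduced probe labeling gives a ratio above 1, and it reads "at most 20" as a cap applied to the reported ratio. The function names and the input values are hypothetical.

```python
import math


def competition_ratio(fragment_value, control_value, cap=20.0):
    """Ratio of control to fragment-treated signal, capped at `cap`."""
    if fragment_value < 0 or control_value < 0:
        raise ValueError("signal values must be non-negative")
    if fragment_value == 0:
        # Probe labeling fully blocked: report the cap; no signal at all is undefined.
        return cap if control_value > 0 else math.nan
    return min(control_value / fragment_value, cap)


def candidate_proteins(values, threshold=2.0):
    """Select proteins whose ratio exceeds `threshold`.

    `values` maps a protein name to (fragment_value, control_value).
    """
    hits = {}
    for name, (fragment_value, control_value) in values.items():
        ratio = competition_ratio(fragment_value, control_value)
        if ratio > threshold:  # NaN compares False and is skipped
            hits[name] = ratio
    return hits


# Hypothetical area-under-the-curve values for three proteins.
example = {"P1": (10.0, 50.0), "P2": (40.0, 50.0), "P3": (0.0, 5.0)}
hits = candidate_proteins(example)
```

  • Raising `threshold` to one of the higher listed values (for example 4) narrows the candidate list to more strongly competed proteins.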
  • the same small molecule fragment interacts with a number of cysteine binding proteins in the presence of a cysteine-reactive probe. In some instances, the small molecule modulates the interaction of a cysteine-reactive probe with its cysteine binding protein partners.
  • the spectrum of ratios for a small molecule fragment with its interacting protein partners in the presence of a cysteine-reactive probe indicates the specificity of the small molecule fragment toward the protein. In some instances, the spectrum of ratio indicates whether the small molecule fragment is a specific inhibitor to a protein or a pan inhibitor.
  • the cysteine containing protein identified by the above method comprises a biologically active cysteine residue.
  • the biologically active cysteine site is a cysteine residue that is located about ⁇ or less to an active-site ligand or residue.
  • the cysteine residue that is located about ⁇ or less to the active-site ligand or residue is an active site cysteine.
  • the biologically active cysteine site is an active site cysteine.
  • the biologically active cysteine site is a cysteine residue that is located greater than ⁇ from an active-site ligand or residue.
  • the cysteine residue that is located greater than ⁇ from the active-site ligand or residue is a non- active site cysteine.
  • the biologically active cysteine site is a non-active site cysteine.
  • the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein.
  • the cysteine containing protein exists in an active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein.
  • the cysteine containing protein exists in a pro-active form.
  • the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein.
  • the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue.
  • the structural environment is a hydrophobic environment or a hydrophilic environment.
  • the structural environment is a charged environment.
  • the structural environment is a nucleophilic environment.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein.
  • the cysteine containing protein is a protein illustrated in Tables 1, 2, 3, 8 or 9.
  • the cysteine residue of the cysteine-containing proteins illustrated in Tables 1, 2, 3, 8 or 9 is denoted by (*) in Tables 1, 2, 3, 8 or 9.
  • a set of cysteine-reactive probes are added to the cell solutions. For example, a first set of cysteine-reactive probes are added to the first cell solution and a second set of cysteine-reactive probes are added to the second cell solution. In some cases, each cysteine-reactive probe is different within the set. In some instances, the first set of cysteine- reactive probes is the same as the second set of cysteine-reactive probes. In some cases, the first set of cysteine-reactive probes generate a third group of cysteine-reactive probe-protein complexes and the second set of cysteine-reactive probes generate a fourth group of cysteine- reactive probe-protein complexes. In some instances, the set of cysteine-reactive probes further facilitates identification of cysteine containing proteins.
  • the sample is a cell sample. In other instances, the sample is a tissue sample.
  • the method is an in-situ method.
  • the small molecule fragments described herein comprise non-naturally occurring molecules.
  • the non-naturally occurring molecules do not include natural and/or non-natural peptide fragments, or small molecules that are produced naturally within the body of a mammal.
  • the small molecule fragments described herein comprise a molecular weight of about 100 Dalton or higher. In some embodiments, the small molecule fragments comprise a molecular weight of about 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragments is between about 150 and about 500, about 150 and about 450, about 150 and about 440, about 150 and about 430, about 150 and about 400, about 150 and about 350, about 150 and about 300, about 150 and about 250, about 170 and about 500, about 180 and about 450, about 190 and about 400, about 200 and about 350, about 130 and about 300, or about 120 and about 250 Dalton.
  • the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with one or more elements selected from a halogen, a nonmetal, a transition metal, or a combination thereof. In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with a halogen. In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with a nonmetal. In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen. In some cases, the molecular weight of the small molecule fragment does not include the molecular weight of a transition metal.
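  • To illustrate a molecular weight that is calculated from carbon, hydrogen, nitrogen, oxygen and sulfur while leaving out halogens, the sketch below parses a flat molecular formula and sums atomic masses with an option to skip halogen atoms. The parser, the function names, and the example formula C8H8ClNO are assumptions chosen for illustration; only simple formulas without parentheses are handled.

```python
import re

# Rounded standard atomic masses (g/mol) for the elements handled here.
ATOMIC_MASS = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06,
    "F": 18.998, "Cl": 35.45, "Br": 79.904, "I": 126.904,
}
HALOGENS = frozenset({"F", "Cl", "Br", "I"})


def parse_formula(formula):
    """Parse a flat formula such as 'C8H8ClNO' into {element: count}."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    if not formula or "".join(sym + num for sym, num in tokens) != formula:
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for sym, num in tokens:
        if sym not in ATOMIC_MASS:
            raise ValueError(f"unknown element: {sym}")
        counts[sym] = counts.get(sym, 0) + (int(num) if num else 1)
    return counts


def molecular_weight(formula, exclude_halogens=False):
    """Sum atomic masses, optionally leaving halogen atoms out of the total."""
    total = 0.0
    for sym, count in parse_formula(formula).items():
        if exclude_halogens and sym in HALOGENS:
            continue
        total += ATOMIC_MASS[sym] * count
    return total


full_mw = molecular_weight("C8H8ClNO")
mw_without_halogen = molecular_weight("C8H8ClNO", exclude_halogens=True)
```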
  • the small molecule fragments described herein comprise micromolar or millimolar binding affinity. In some instances, the small molecule fragments comprise a binding affinity of about 1 µM, 10 µM, 100 µM, 500 µM, 1 mM, 10 mM, or higher.
  • In some embodiments, the small molecule fragments described herein have a high ligand efficiency (LE).
  • Ligand efficiency is the measurement of the binding energy per atom of a ligand to its binding partner. In some instances, the ligand efficiency is defined as the ratio of the Gibbs free energy (ΔG) to the number of non-hydrogen atoms of the compound (N): LE = (ΔG)/N
  • LE is also arranged as:
  • the LE score is about 0.3 kcal mol⁻¹ HA⁻¹, about 0.35 kcal mol⁻¹ HA⁻¹, about 0.4 kcal mol⁻¹ HA⁻¹, or higher.
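  • A worked illustration of a ligand efficiency calculation is sketched below. It assumes that the binding free energy is estimated from a dissociation constant as ΔG = RT ln(Kd) and that LE is reported as a positive number, i.e. the magnitude of ΔG per non-hydrogen atom; the temperature, the function names, and the example inputs are likewise assumptions made for illustration.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Estimate the binding free energy (kcal/mol) as RT ln(Kd); negative for Kd < 1 M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Ligand efficiency in kcal mol^-1 per heavy (non-hydrogen) atom."""
    if heavy_atoms <= 0:
        raise ValueError("heavy atom count must be positive")
    return -binding_free_energy(kd_molar, temperature_k) / heavy_atoms


# Hypothetical fragment: Kd of 100 µM and 12 non-hydrogen atoms.
le = ligand_efficiency(100e-6, 12)
meets_threshold = le >= 0.3
```

  • Under these assumptions a weakly binding but small fragment can still exceed the 0.3 kcal mol⁻¹ HA⁻¹ level, which is why ligand efficiency rather than raw affinity is used to rank fragments.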
  • the small molecule fragments described herein are designed based on the Rule of 3.
  • the Rule of 3 comprises a non-polar solvent-polar solvent (e.g., octanol-water) partition coefficient log P of about 3 or less, a molecular mass of about 300 Daltons or less, about 3 hydrogen bond donors or less, about 3 hydrogen bond acceptors or less, and about 3 rotatable bonds or less.
  • the small molecule fragments described herein comprise three cyclic rings or less.
  • the small molecule fragments described herein bind to a cysteine residue of a polypeptide that is about 20 amino acid residues in length or more. In some instances, the small molecule fragments described herein bind to a cysteine residue of a polypeptide that is about 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
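  • The Rule of 3 criteria and the ring limit stated above can be checked mechanically once the descriptors of a fragment are known. The sketch below assumes the descriptors have already been computed elsewhere (for example by a cheminformatics toolkit); the class name, field names, and example values are hypothetical, and the "about" qualifiers in the text are treated as exact limits for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentDescriptors:
    log_p: float            # octanol-water partition coefficient (log P)
    molecular_mass: float   # Daltons
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int
    rings: int


def rule_of_three_violations(d, max_rings=3):
    """Return the names of the criteria that the fragment fails (empty if none)."""
    checks = [
        ("log_p", d.log_p <= 3),
        ("molecular_mass", d.molecular_mass <= 300),
        ("h_bond_donors", d.h_bond_donors <= 3),
        ("h_bond_acceptors", d.h_bond_acceptors <= 3),
        ("rotatable_bonds", d.rotatable_bonds <= 3),
        ("rings", d.rings <= max_rings),
    ]
    return [name for name, passed in checks if not passed]


# Hypothetical descriptor sets used only to exercise the check.
compliant = FragmentDescriptors(1.8, 169.6, 1, 1, 2, 1)
too_heavy = FragmentDescriptors(2.5, 350.0, 1, 2, 3, 2)
```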
  • the small molecule fragments described herein further comprise pharmacokinetic parameters that are unsuitable as a therapeutic agent for
  • the pharmacokinetic parameters that are suitable as a therapeutic agent comprise parameters in accordance with FDA guideline, or in accordance with a guideline from an equivalent Food and Drug Administration outside of the United States.
  • the pharmacokinetic parameters comprise the peak plasma concentration (Cmax), the lowest concentration of a therapeutic agent (Cmin), volume of distribution, time to reach Cmax, elimination half-life, clearance, and the like.
  • the pharmacokinetic parameters of the small molecule fragments are outside of the parameters set by the FDA guideline, or by an equivalent Food and Drug Administration outside of the United States.
  • the small molecule fragments described herein comprise a reactive moiety which forms a covalent interaction with the thiol group of a cysteine residue of a cysteine containing protein, and an affinity handle moiety.
  • a small molecule fragment described herein is a small molecule fragment of Formula (I):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • the small molecule fragment of Formula (I) does not contain a second binding site. In some instances, the small molecule fragment moiety does not bind to the protein. In some cases, the small molecule fragment moiety does not covalently bind to the protein. In some instances, the small molecule fragment moiety does not interact with a secondary binding site on the protein. In some instances, the secondary binding site is an active site such as an ATP binding site. In some cases, the active site is at least about 10, 15, 20, 25, 35, 40 Å, or more away from the biologically active cysteine residue. In some instances, the small molecule fragment moiety does not interact with an active site such as an ATP binding site.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
  • F is a small molecule fragment moiety selected from: N-(4-bromophenyl)-N-phenylacrylamide, N-(1-benzoylpiperidin-4-yl)-2-chloro-N-phenylacetamide, 1-(4-benzylpiperidin-1-yl)-2-chloroethan-1-one, N-(2-(1H-indol-3-yl)ethyl)-2-chloroacetamide, N-(3,5-bis(trifluoromethyl)phenyl)acrylamide, N-(4-phenoxy-3-(trifluoromethyl)phenyl)-N-(pyridin-3-ylmethyl)acrylamide, N-(3,5-bis(trifluoromethyl)phenyl)acetamide, 2-chloro-1-(4-(hydroxydiphenylmethyl)piperidin-1-yl)ethan-1-one, (E)-3-
  • the small molecule fragment of Formula (I) comprises a molecular weight of about 100, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240,
  • the molecular weight of the small molecule fragment of Formula (I) is between about
  • Formula (I) is the molecular weight prior to enrichment with one or more elements selected from a halogen, a nonmetal, a transition metal, or a combination thereof.
  • the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with a halogen.
  • the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with a nonmetal.
  • the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with a transition metal.
  • the molecular weight of the small molecule fragment of Formula (I) does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, the molecular weight of the small molecule fragment of Formula (I) does not include the molecular weight of a halogen. In some embodiments,
  • the molecular weight of the small molecule fragment of Formula (I) does not include the molecular weight of a transition metal.
  • the small molecule fragment of Formula (I) comprises micromolar or millimolar binding affinity. In some instances, the small molecule fragment of Formula (I) comprises a binding affinity of about 1 μM, 10 μM, 100 μM, 500 μM, 1 mM, 10 mM, or higher.
  • the small molecule fragment of Formula (I) has a LE score about 0.3 kcal mol⁻¹ HA⁻¹, about 0.35 kcal mol⁻¹ HA⁻¹, about 0.4 kcal mol⁻¹ HA⁻¹, or higher.
  • the small molecule fragment of Formula (I) follows the design parameters of Rule of 3.
  • the small molecule fragment of Formula (I) has a non-polar solvent-polar solvent (e.g., octanol-water) partition coefficient log P of about 3 or less, a molecular mass of about 300 Daltons or less, about 3 hydrogen bond donors or less, about 3 hydrogen bond acceptors or less, and about 3 rotatable bonds or less.
  • the small molecule fragment of Formula (I) comprises three cyclic rings or less.
  • the small molecule fragment of Formula (I) binds to a cysteine residue of a polypeptide (e.g., a cysteine containing protein) that is about 20 amino acid residues in length or more.
  • the small molecule fragments described herein bind to a cysteine residue of a polypeptide (e.g., a cysteine containing protein) that is about 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the small molecule fragment of Formula (I) has pharmacokinetic parameters outside of the parameters set by the FDA guideline, or by an equivalent Food and Drug Administration outside of the United States. In some instances, a skilled artisan understands in view of the pharmacokinetic parameters of the small molecule fragment of Formula (I) described herein that this small molecule fragment is unsuited as a therapeutic agent without further optimization.
  • the small molecule fragment is a specific inhibitor or a pan inhibitor.
  • a cysteine-reactive probe comprises a reactive moiety which forms a covalent interaction with the thiol group of a cysteine residue of a cysteine containing protein, and an affinity handle moiety.
  • a cysteine-reactive probe is a cysteine-reactive probe of Formula (II):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle utilizes bioorthogonal chemistry. As used herein, bioorthogonal chemistry refers to any chemical reaction that occurs inside of a living system (e.g., a cell) without interfering with native biochemical processes.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle comprises an alkyne or an azide group.
  • the affinity handle is an alkyne group.
  • alkyne group refers to a group with a chemical formula of H-C≡C-R, HC₂R, R₁-C≡C-R₂, or R₁C₂R₂
  • R, R₁, and R₂ are independently a cysteine-reactive probe portion described herein, a linker, or a combination thereof.
  • the alkyne group is capable of being covalently linked in a chemical reaction with a molecule containing an azide.
  • the affinity handle is an azide group.
  • the affinity handle (e.g., alkyne group or azide group) serves as a non-native and non-perturbing bioorthogonal chemical handle.
  • the affinity handle (e.g. alkyne group or azide group) is further derivatized through chemical reactions such as click chemistry.
  • the click chemistry is a copper(I)-catalyzed [3+2]-Huisgen 1,3-dipolar cyclo-addition of alkynes and azides leading to 1,2,3-triazoles.
  • the click chemistry is a copper free variant of the above reaction.
  • the affinity handle further comprises a linker.
  • the linker bridges the affinity handle to the reactive moiety.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
  • the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
  • the chromophore comprises non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
  • the chromophore comprises a fluorophore.
  • the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole,
  • the labeling group is a biotin moiety, a streptavidin moiety, bead, resin, a solid support, or a combination thereof.
  • a biotin moiety described herein comprises biotin and biotin derivatives. Exemplary biotin derivatives include, but are not limited to, desthiobiotin, biotin alkyne or biotin azide. In some instances, a biotin moiety described herein is desthiobiotin. In some cases, a biotin moiety described herein is d-desthiobiotin.
  • the labeling group is a biotin moiety.
  • the biotin moiety further comprises a linker, such as a linker that is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more residues in length.
  • the linker further comprises a cleavage site, such as a protease cleavage site.
  • the biotin moiety interacts with a streptavidin moiety.
  • the biotin moiety is further attached to a bead, such as a streptavidin-coupled bead.
  • the biotin moiety is further attached to a resin or a solid support, such as a streptavidin-coupled resin or a streptavidin-coupled solid support.
  • the solid support is a plate, a platform, a cover slide, a microfluidic channel, and the like.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • the cysteine-reactive probe is a cysteine-reactive probe selected from: N-(hex-5-yn-1-yl)-2-iodoacetamide, Iodoacetamide-rhodamine, 3-acrylamido-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide, 3-acrylamido-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide, or 2-chloro-N-(1-(3-ethynylbenzoyl)piperidin-4-yl)-N-phenylacetamide.
  • the cysteine containing protein is a soluble protein or a membrane protein.
  • the cysteine containing protein is involved in one or more of a biological process such as protein transport, lipid metabolism, apoptosis, transcription, electron transport, mRNA processing, or host-virus interaction.
  • the cysteine containing protein is associated with one or more of diseases such as cancer or one or more disorders or conditions such as immune, metabolic, developmental, reproductive, neurological, psychiatric, renal, cardiovascular, or hematological disorders or conditions.
  • the cysteine containing protein comprises a biologically active cysteine residue.
  • the cysteine containing protein comprises one or more cysteines in which at least one cysteine is a biologically active cysteine residue.
  • the biologically active cysteine site is a cysteine residue that is located about 10 Å or less from an active-site ligand or residue.
  • the cysteine residue that is located about 10 Å or less from the active-site ligand or residue is an active site cysteine.
  • the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue.
  • the cysteine residue is located greater than 12 Å, 15 Å, 20 Å, 25 Å, 30 Å, 35 Å, 40 Å, 45 Å, or greater than 50 Å from an active-site ligand or residue.
  • the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine.
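The distance-based distinction between active site and non-active site cysteines described above can be sketched as follows. This is an illustration under stated assumptions, not a method recited in the text: distances are measured from a single cysteine atom (for example the sulfur) to the nearest atom of the active-site ligand or residue, coordinates are in Å, and the 10 Å cutoff is applied as a hard threshold. The names `min_distance` and `classify_cysteine` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]  # x, y, z in angstroms


def min_distance(point: Coord, others: Sequence[Coord]) -> float:
    """Shortest Euclidean distance from `point` to any coordinate in `others`."""
    if not others:
        raise ValueError("at least one active-site coordinate is required")
    return min(math.dist(point, other) for other in others)


def classify_cysteine(cys_atom: Coord, active_site_atoms: Sequence[Coord],
                      cutoff: float = 10.0) -> str:
    """Label a cysteine by its distance to the active-site ligand or residue."""
    if min_distance(cys_atom, active_site_atoms) <= cutoff:
        return "active-site cysteine"
    return "non-active-site cysteine"


# Hypothetical coordinates, chosen only to exercise both branches.
print(classify_cysteine((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0)]))
print(classify_cysteine((0.0, 0.0, 0.0), [(0.0, 0.0, 12.0)]))
```

In practice the coordinates would come from a structure file, and the choice of reference atom would affect which residues fall on either side of the cutoff.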
  • the cysteine containing protein exists in an active form, or in a pro-active form.
  • the cysteine containing protein comprises one or more functions of an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
  • the cysteine containing protein has an uncategorized function.
  • the cysteine containing protein is an enzyme.
  • An enzyme is a protein molecule that accelerates or catalyzes a chemical reaction.
  • non-limiting examples of enzymes include kinases, proteases, or deubiquitinating enzymes.
  • exemplary kinases include tyrosine kinases such as the TEC family of kinases such as Tec, Bruton's tyrosine kinase (Btk), interleukin-2-inducible T-cell kinase (Itk) (or Emt/Tsk), Bmx, and Txk/Rlk; spleen tyrosine kinase (Syk) family such as SYK and Zeta-chain-associated protein kinase 70 (ZAP-70); Src kinases such as Src, Yes, Fyn, Fgr, Lck, Hck, Blk, Lyn, and Frk; JAK kinases such as Janus kinase 1 (JAK1), Janus kinase 2 (JAK2), Janus kinase 3 (JAK3), and Tyrosine kinase 2 (TYK2); or Erasine kinases
  • the cysteine containing protein is a protease.
  • the protease is a cysteine protease.
  • the cysteine protease is a caspase.
  • the caspase is an initiator (apical) caspase.
  • the caspase is an effector (executioner) caspase.
  • Exemplary caspases include CASP2, CASP8, CASP9, CASP10, CASP3, CASP6, CASP7, CASP4, and CASP5.
  • the cysteine protease is a cathepsin.
  • Exemplary cathepsins include Cathepsin B, Cathepsin C, Cathepsin F, Cathepsin H, Cathepsin K, Cathepsin L1, Cathepsin L2, Cathepsin O, Cathepsin S, Cathepsin W, or Cathepsin Z.
  • the cysteine containing protein is a deubiquitinating enzyme (DUB).
  • exemplary deubiquitinating enzymes include cysteine protease DUBs or metalloproteases.
  • Exemplary cysteine protease DUBs include ubiquitin-specific protease (USP/UBP) such as USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2,
  • Exemplary metalloproteases include the Jab1/Mov34/Mpr1 Pad
  • exemplary cysteine containing proteins as enzymes include, but are not limited to, Glyceraldehyde-3-phosphate dehydrogenase (GAPDH), Protein arginine N-methyltransferase 1 (PRMT1), Peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 (PIN1), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), Glutathione S-transferase P (GSTP1), Elongation factor 2 (EEF2), Glutathione S-transferase omega-1 (GSTO1), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), Protein disulfide-isomerase A4 (PDIA4),
  • GAPDH Glyceraldehyde-3-phosphate dehydrogenase
  • PRMT1 Protein arginine N-methyltransferase 1
  • Prostaglandin E synthase 3 PTGES3
  • Adenosine kinase (ADK)
  • EEF2 Elongation factor 2
  • IAH1 Isoamyl acetate-hydrolyzing esterase 1 homolog
  • PRDX5 Peroxiredoxin-5 (mitochondrial)
  • PRDX5 Inosine-5 -monophosphate dehydrogenase 2
  • IMPDH2 3-hydroxyacyl-CoA dehydrogenase type-2
  • HSD17B10 Omega-amidase NIT2
  • AKR1B1 Monofunctional C1-tetrahydrofolate synthase (mitochondrial) (MTHFD1L), Protein disulfide-isomerase A6 (PDIA6), Pyruvate kinase isozymes M1/M2 (PKM), 6-phosphogluconolactonase (PGLS), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), ERO1-like protein alpha (ERO1L), Thioredoxin domain-containing protein 17 (TXNDC17), Protein disulfide-isomerase A4 (PDIA4), Protein disulfide-isomerase A3 (PDIA3), 3-ketoacyl-CoA thiolase (mitochondrial) (ACAA2), Dynamin-2 (DNM2), DNA replication licensing factor MCM3 (MCM3), Serine-tRNA ligase (cytoplasmic) (SARS), Fatty acid synthase (FASN), Acety
  • PGP DNA replication licensing factor MCM6
  • TIGAR Cleavage and polyadenylation specificity factor subunit
  • UBE2L3 Ubiquitin-conjugating enzyme E2 L3
  • AARS cytoplasmic
  • GMPPA 1 -phosphate guanyltransferase alpha
  • cytoplasmic C-l-tetrahydrofolate synthase
  • MTHFD1 Dynamin-1-like protein
  • DNM1L Dynamin-1-like protein
  • PDIA3 Protein disulfide-isomerase A3
  • DPEP Aspartyl aminopeptidase
  • ACAT2 Acetyl-CoA acetyltransferase (cytosolic)
  • TXNDC5 Thioredoxin domain-containing protein 5
  • TK1 Thymidine kinase (cytosolic)
  • Inosine-5'-monophosphate dehydrogenase 2 (IMPDH2), Ubiquitin carboxyl-terminal hydrolase isozyme L3 (UCHL3), Integrin-linked protein kinase (ILK), Cyclin-dependent kinase 2 (CDK2),
  • ECI2 mitochondrial (mitochondrial)
  • MTHFD1 C-l-tetrahydrofolate synthase
  • DCK Deoxycytidine kinase
  • UBA6 Ubiquitin-like modifier-activating enzyme 6
  • PCMT1 Protein-L-isoaspartate(D-aspartate) O-methyltransferase
  • PCMT1 Monofunctional C1-tetrahydrofolate synthase
  • ETHE1 Arginine-tRNA ligase (cytoplasmic) (RARS), NEDD8-activating enzyme E1 catalytic subunit (UBA3), Dual specificity mitogen-activated protein kinase (MAP2K3),
  • UBE2S Ubiquitin-conjugating enzyme E2S
  • PPAT Amidophosphoribosyltransferase
  • PCK2 Phosphoenolpyruvate carboxykinase
  • PFKP 6-phosphofructokinase type C
  • ACSF2 Acyl-CoA synthetase family member 2 (mitochondrial)
  • PAICS Multifunctional protein ADE2
  • Desumoylating isopeptidase 1 (DESI1), 6-phosphofructokinase type C (PFKP), V-type proton
  • ATP6V1A ATPase catalytic subunit A
  • ACAA1 3-ketoacyl-CoA thiolase (peroxisomal)
  • GALK1 Galactokinase
  • TK1 Thymidine kinase (cytosolic)
  • WRNIP1 ATPase WRNIP1
  • PFAS Phosphoribosylformylglycinamidine synthase
  • A (ATP6V1A), Thioredoxin domain-containing protein 5 (TXNDC5), 4-trimethylaminobutyraldehyde dehydrogenase (ALDH9A1), Dual specificity mitogen-activated protein kinase (MAP2K4), Calcineurin-like phosphoesterase domain-containing (CPPED1),
  • Dual specificity protein phosphatase 12 (DUSP12), Phosphoribosylformylglycinamidine synthase (PFAS), Diphosphomevalonate decarboxylase (MVD), D-3-phosphoglycerate dehydrogenase (PHGDH), Cell cycle checkpoint control protein RAD9A (RAD9A),
  • Peroxiredoxin-1 PRDX1
  • Sorbitol dehydrogenase SORD
  • Peroxiredoxin-4 PRDX4
  • AMPD2 AMP deaminase 2
  • Isocitrate dehydrogenase IDH1
  • Pyruvate carboxylase mitochondrial
  • PC Integrin-linked kinase-associated serine/threonine
  • ILKAP Methylmalonate-semialdehyde dehydrogenase
  • ALDH6A1 Methylmalonate-semialdehyde dehydrogenase
  • PSMD14 26S proteasome non-ATPase regulatory subunit 14
  • DTYMK Thymidylate kinase
  • PFKFB2 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata
  • PRDX5 Peroxiredoxin-5 (mitochondrial)
  • PRDX5 Peroxiredoxin-5 (mitochondrial)
  • CTSB Cathepsin B
  • TMPRSS12 Transmembrane protease serine 12
  • UGDH UDP-glucose 6-dehydrogenase
  • HINT1 E3 ubiquitin-protein ligase UBR5
  • SAMHDl S
  • ADH2 mitochondrial
  • PMPCB Mitochondrial-processing peptidase subunit beta
  • ACP6 Lysophosphatidic acid phosphatase type 6
  • UBE2L6 Ubiquitin/ISG15-conjugating enzyme E2 L6
  • Caspase-8 CASP8
  • PDE12 2',5'-phosphodiesterase 12
  • TXNDC12 Thioredoxin domain-containing protein 12
  • NIT1 NIT1
  • ERO1-like protein alpha ERO1L
  • SAE1 Leucine-tRNA ligase (cytoplasmic) (LARS)
  • TGM2 Protein-glutamine gamma-glutamyltransferase 2
  • TGM2 Probable DNA dC- dU-editing enzyme
  • RNA-specific adenosine deaminase ADAR
  • Isocitrate dehydrogenase IDH2
  • Methylcrotonoyl-CoA carboxylase beta chain mitochondrial
  • MCCC2 Uridine phosphorylase 1 (UPP1), Glycogen phosphorylase (brain form) (PYGB), E3 ubiquitin-protein ligase UBR5 (UBR5), Procollagen-lysine,2-oxoglutarate 5-dioxygenase 1 (PLOD1), Ubiquitin carboxyl-terminal hydrolase 48 (USP48), Aconitate hydratase
  • URR2 Cysteine protease ATG4B (ATG4B), Serine/threonine-protein kinase Nek9 (NEK9), Lysine-specific demethylase 4B (KDM4B), Insulin-degrading enzyme (IDE), Dipeptidyl peptidase 9 (DPP9), Decaprenyl-diphosphate synthase subunit 2 (PDSS2), TFIIH basal transcription factor complex helicase (ERCC3), Methionine-R-sulfoxide reductase B2
  • MSRB2 mitochondrial
  • E3 ubiquitin-protein ligase BRE1B RNF40
  • Thymidylate synthase TYMS
  • Cyclin-dependent kinase 5 CDK5
  • PAPSS2 Bifunctional 3'-phosphoadenosine 5'-phosphosulfate
  • ACADSB Short/branched chain specific acyl-CoA dehydrogenase
  • CTSD Cathepsin D
  • E3 ubiquitin-protein ligase HUWE1 HUWE1
  • Calpain-2 catalytic subunit CAPN2
  • MAP2K7 Mitogen-activated protein kinase kinase kinase MLT
  • BLMH Bleomycin hydrolase
  • DDX59 Probable ATP-dependent RNA helicase DDX59
  • CTH Cystathionine gamma-lyase
  • PLOD3 Nucleoside diphosphate-linked moiety X motif 8 (mitochondrial) (NUDT8), E3 ubiquitin-protein ligase HUWE1 (HUWE1), Methylated-DNA-protein-cysteine
  • MGMT Nitrilase homolog 1
  • IRF2BP1 Interferon regulatory factor 2-binding protein 1
  • USP16 Ubiquitin carboxyl-terminal hydrolase 16
  • NMT2 Glycylpeptide N-tetradecanoyltransferase 2
  • CDKN3 Cyclin-dependent kinase inhibitor 3
  • HSDL2 Hydroxysteroid dehydrogenase-like protein 2
  • VRK1 Serine/threonine-protein kinase VRK1
  • ARAF Serine/threonine-protein kinase A-Raf
  • ACLY ATP-citrate synthase
  • ZC3H12D Peripheral plasma membrane protein CASK
  • CASK DNA polymerase epsilon subunit 3
  • POLE3 Aldehyde dehydrogenase X
  • ALDH1B1 UDP-N-acetylglucosamine transferase subunit ALG13 (ALG13), Protein disulfide-isomerase A4 (PDIA4), DNA polymerase alpha catalytic subunit (POLA1), Ethylmalonyl-CoA decarboxylase (ECHDC1), Protein-tyrosine kinase 2-beta (PTK2B), E3 SUMO-protein ligase RanBP2 (RANBP2), Legumain (LGMN), Non-specific lipid-transfer protein (SCP2), Long-chain-fatty-acid-CoA ligase 4 (ACSL4), Dual specificity protein phosphatase 12 (DUSP12), Oxidoreductase HTATIP2 (HTATIP2), Serine/threonine-protein kinase MRCK beta (CDC42BPB), Histone-lysine N-methyltransferase EZH2 (DUSP12), Oxidor
  • MAP2K7 Ubiquitin carboxyl-terminal hydrolase 28
  • USP28 6-phosphofructokinase (liver type)
  • PMARCADl SWI/SNF-related matrix-associated actin-dependent
  • PPME1 Protein phosphatase methylesterase 1
  • MCM5 DNA replication licensing factor
  • PFKFB4 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata
  • DHRS11 Dehydrogenase/reductase SDR family member 11
  • PGPEP1 Pyroglutamyl-peptidase 1
  • MYCBP2 Probable E3 ubiquitin-protein ligase
  • DFFB DNA fragmentation factor subunit beta
  • VCIP135 VPIP1
  • Putative transferase CAF17 mitochondrial
  • TOA57 Calpain-7
  • CAN7 GDP-L-fucose synthas
  • ALDH1B1 mitochondrial
  • BTK Tyrosine-protein kinase
  • RAD50 DNA repair protein
  • ATPBD4 ATP-binding domain-containing protein 4
  • NME3 Interleukin-1 receptor-associated kinase 1
  • IRAK1 Ribonuclease P/MRP protein subunit POP5
  • Peptide-N(4)-(N-acetyl-beta-glucosaminyl)asparagin NGLY1
  • Caspase-2 CASP2
  • RPS6KA3 Ribosomal protein S6 kinase alpha-3
  • E3 ubiquitin-protein ligase UBR1 UBR1
  • CHEK2 Serine/threonine-protein kinase Chk2
  • Phosphatidylinositol 3,4,5- trisphosphate 5-phospha INPPL1
  • EP300 Histone acetyltransferase p300
  • EP300 Creatine kinase U-type (mitochondrial)
  • CKMT1B E3 ubiquitin-protein ligase TRIM33
  • MAP2K7 Myotubularin-related protein 1 (MTMR1), Calcium-dependent phospholipase A2 (PLA2G5), Mitotic checkpoint serine/threonine-protein kinase (BUB1B), Putative transferase CAF17 (mitochondrial) (IBA57), Tyrosine-protein kinase ZAP-70 (ZAP70), E3 ubiquitin-protein ligase pellino homolog 1 (PELI1), Neuropathy target esterase (PNPLA6), Ribosomal protein S6 kinase alpha-3 (RPS6KA3), N6-adenosine-methyltransferase 70 kDa subunit
  • the cysteine containing protein is a signaling protein.
  • exemplary signaling proteins include vascular endothelial growth factor (VEGF) proteins or proteins involved in redox signaling.
  • VEGF proteins include VEGF-A, VEGF-B, VEGF-C, VEGF-D, and PGF.
  • Exemplary proteins involved in redox signaling include redox-regulatory protein FAM213A.
  • the cysteine containing protein is a transcription factor or regulator.
  • Exemplary cysteine containing proteins as transcription factors and regulators include, but are not limited to, 40S ribosomal protein S3 (RPS3), Basic leucine zipper and W2 domain-containing protein (BZW1), Poly(rC)-binding protein 1 (PCBP1), 40S ribosomal protein S11 (RPS11), 40S ribosomal protein S4, X isoform (RPS4X), Signal recognition particle 9 kDa protein (SRP9), Non-POU domain-containing octamer-binding protein (NONO), N-alpha-acetyltransferase 15, NatA auxiliary subunit (NAA15), Cleavage stimulation factor subunit 2 (CSTF2), Lamina-associated polypeptide 2, isoform alpha (TMPO), Heterogeneous nuclear ribonucleoprotein R (HNRNPR), MMS19 nucleot
  • RPS3
  • GTF2A2 Chromatin accessibility complex protein 1 (CHRAC1), CDKN2A-interacting protein (CDKN2AIP), Zinc finger protein 217 (ZNF217), Signal transducer and activator of transcription 3 (STAT3), WD repeat and HMG-box DNA-binding protein 1 (WDHD1), Lamina-associated polypeptide 2 (isoform alpha) (TMPO), Lamina-associated polypeptide 2 (isoforms beta/gam) (TMPO), Interferon regulatory factor 4 (IRF4), Protein flightless-1 homolog (FLII), Heterogeneous nuclear ribonucleoprotein F (HNRNPF), Nucleus accumbens-associated protein 1 (NACC1), Transcription elongation regulator 1 (TCERG1), Protein HEXIM1 (HEXIM1), Enhancer of mRNA-decapping protein (EDC3), Zinc finger protein Aiolos (IKZF3),
  • WDHD1 Lamina- associated polypeptide 2 (
  • Transcription elongation factor SPT5 (SUPT5H), Forkhead box protein K1 (FOXK1), LIM domain-containing protein 1 (LIMD1), MMS19 nucleotide excision repair protein homolog (MMS19), Elongator complex protein 4 (ELP4), Ankyrin repeat and KH domain-containing protein 1 (ANKHD1), PML, Nuclear factor NF-kappa-B p100 subunit (NFKB2), Heterogeneous nuclear ribonucleoprotein L-like (HNRPLL), CCR4-NOT transcription complex subunit 3 (CNOT3), Constitutive coactivator of PPAR-gamma-like protein (FAM120A), Mediator of RNA polymerase II transcription subunit (MED15), 60S ribosomal protein L7 (RPL7),
  • Interferon regulatory factor 8 IRF8
  • COUP transcription factor 2 NR2F2
  • Mediator of RNA polymerase II transcription subunit MED1
  • TRMT2A tRNA (uracil-5-)-methyltransferase homolog A
  • RELA Transcription factor p65
  • Exosome complex component RRP42
  • EXOSC7 General transcription factor 3C polypeptide 1 (GTF3C1), Mothers against decapentaplegic homolog 2 (SMAD2), Ankyrin repeat domain-containing protein 17
  • ANKRD17 MMS19 nucleotide excision repair protein homolog (MMS19), Death domain-associated protein 6 (DAXX), Zinc finger protein 318 (ZNF318), Thioredoxin-interacting protein (TXNIP), Glucocorticoid receptor (NR3C1), Iron-responsive element-binding protein 2 (IREB2), Zinc finger protein 295 (ZNF295), Polycomb protein SUZ12 (SUZ12), Cleavage stimulation factor subunit 2 tau variant (CSTF2T), C-myc promoter-binding protein
  • EDC3 A-kinase anchor protein 1 (mitochondrial) (AKAP1), Transcription factor RelB
  • KIF4A Chromosome-associated kinesin KIF4A
  • MED12 Mediator of RNA polymerase II transcription subunit
  • NPAT Protein NPAT
  • LRPPRC mitochondrial
  • AHDC1 AT-hook DNA-binding motif-containing protein 1
  • RNA polymerase II transcription subunit MED12
  • Bromodomain-containing protein 8 BRD8
  • Trinucleotide repeat-containing gene 6B protein TNRC6B
  • Aryl hydrocarbon receptor nuclear translocator ARNT
  • Activating transcription factor 7-interacting protein ATF7IP
  • Glucocorticoid receptor N-romosome transmission fidelity protein
  • CHTF18 C-myc promoter-binding protein
  • DENND4A C-myc promoter-binding protein
  • the cysteine containing protein is a channel, transporter or receptor.
  • exemplary cysteine containing proteins as channels, transporters, or receptors include, but are not limited to, Chloride intracellular channel protein 4 (CLIC4), Exportin-1 (XPO1),
  • TXN Thioredoxin
  • SEC13 Protein SEC13 homolog
  • SEC13 Chloride intracellular channel protein 1
  • CLIC1 Guanine nucleotide-binding protein subunit beta-2 (GNB2L1), Sorting nexin-6
  • SNX6 conserved oligomeric Golgi complex subunit 3 (COG3), Nuclear cap-binding protein subunit 1 (NCBP1), Cytoplasmic dynein 1 light intermediate chain 1 (DYNC1LI1), MOB-like protein phocein (MOB4), Programmed cell death 6-interacting protein (PDCD6IP),
  • Glutaredoxin-1 (GLRX), ATP synthase subunit alpha (mitochondrial) (ATP5A1), Treacle protein (TCOF1), Dynactin subunit 1 (DCTN1), Importin-7 (IPO7), Exportin-2 (CSE1L), ATP synthase subunit gamma (mitochondrial) (ATP5C1), Trafficking protein particle complex subunit 5 (TRAPPC5), Thioredoxin mitochondrial (TXN2), THO complex subunit 6 homolog
  • TXNL1 Nuclear pore complex protein Nup214 (NUP214), Protein lin-7 homolog C (LIN7C),
  • GGA2 ADP-ribosylation factor-binding protein GGA2 (GGA2), Trafficking protein particle complex subunit 4 (TRAPPC4), Protein quaking (QKI), Perilipin-3 (PLIN3), Copper transport protein
  • ATOX1 ATOX1
  • Unconventional myosin-Ic MYO1C
  • Nucleoporin NUP53 NUP35
  • Vacuolar protein sorting-associated protein 18 homolog (VPS18), Dedicator of cytokinesis protein 7 (DOCK7), Nucleoporin p54 (NUP54), Ras-related GTP-binding protein C (RRAGC),
  • Arf-GAP with Rho-GAP domain (ANK repeat and PH domain) (ARAP1), Exportin-5 (XPO5),
  • Kinectin KTN1
  • CLIC6 Chloride intracellular channel protein 6
  • KCNAB2 Voltage-gated potassium channel subunit beta-2
  • Exportin-5 XPO5
  • Ras-related GTP-binding protein C RRAGC
  • Ribosome-binding protein 1 RRBP1
  • Acyl-CoA-binding domain-containing protein 6 ACBD6
  • Chloride intracellular channel protein 5 CLIC5
  • Pleckstrin homology domain-containing family A member PLEKHA2
  • ADP-ribosylation factor-like protein 3 ARL3
  • SEC24C Protein transport protein Sec24C
  • VDAC3 Voltage-dependent anion-selective channel protein
  • PDCD6IP Chloride intracellular channel protein 3 (CLIC3)
  • Multivesicular body subunit 12A FAM125A
  • EIF4ENIF1 Eukaryotic translation initiation factor 4E transporter
  • NmrA-like family domain-containing protein 1 MRAL1
  • Nuclear factor 5 R
  • TRIPPCl Guanine nucleotide-binding protein-like 3 (GNL3), or Importin-13 (IP013).
  • the cysteine containing protein is a chaperone.
  • Exemplary cysteine containing proteins as chaperones include, but are not limited to, 60 kDa heat shock protein (mitochondrial) (HSPD1), T-complex protein 1 subunit eta (CCT7), T-complex protein 1 subunit epsilon (CCT5), Heat shock 70 kDa protein 4 (HSPA4), GrpE protein homolog 1 (mitochondrial) (GRPEL1), Tubulin-specific chaperone E (TBCE), Protein unc-45 homolog A (UNC45A), Serpin H1 (SERPINH1), Tubulin-specific chaperone D (TBCD), Peroxisomal biogenesis factor 19 (PEX19), BAG family molecular chaperone regulator 5 (BAG5), T-complex protein 1 subunit theta (CCT8), Protein canopy homolog 3 (CNPY3), or DnaJ homolog subfamily C member 10 (DNAJC10).
  • the cysteine containing protein is an adapter, scaffolding or modulator protein.
  • exemplary cysteine containing proteins as adapter, scaffolding, or modulator proteins include, but are not limited to, Proteasome activator complex subunit 1 (PSME1), TIP41-like protein (TIPRL), Crk-like protein (CRKL), CFL1, Condensin complex subunit 1 (NCAPD2), Translational activator GCN1 (GCN1L1), Serine/threonine-protein phosphatase 2A 56 kDa regulatory (PPP2R5D), UPF0539 protein C7orf59 (C7orf59), Protein diaphanous homolog 1 (DIAPH1), Protein asunder homolog (Asun), Ras GTPase-activating-like protein IQGAP1 (IQGAP1), Sister chromatid cohesion protein PDS5 homolog A (PDS5A), RTN4, Proteasome activator complex subunit 4 (PSME4), Condensin complex subunit 2 (NCAPH), cAMP-dependent protein kinase type I-alpha regulatory (PRKAR1A), Host cell factor 1 (HCFC1), Serine/threonine-protein phosphatase 4 regulatory (PPP4R2), Apoptotic chromatin condensation inducer in the nucleus (ACIN1), BRISC and BRCA1-A complex member 1 (BABAM1), Interferon-induced protein with tetratricopeptide (IFIT3), Ras association domain-containing protein 2 (RASSF2), Hsp70-binding protein 1, TBC1 domain family member 15, Dynamin-binding protein (DNMBP), Beta-2-syntrophin (SNTB2), Disks large homolog 1 (DLG1), TBC1 domain family member 13, Formin-binding protein 1-like (FNBP1L), GRB2-related adapter protein (GRAP), G2/mitotic-specific cyclin-B1 (CCNB1), Myotubularin-related protein 12 (MTMR12), Protein FADD (FADD), Wings apart-like protein homolog (WAPAL), cAMP-dependent protein kinase type II-beta regulatory (PRKAR2B), Malcavernin (CCM2), 55 kDa erythrocyte membrane protein (MPP1), Actin filament-associated protein 1 (AFAP1), Tensin-3 (TNS3), tRNA methyltransferase 112 homolog (TRMT112), Symplekin (SYMPK), TBC1 domain family member 2A (TBC1D2), ATR-interacting protein (ATRIP), Ataxin-10 (ATXN10), Succinate dehydrogenase assembly factor 2 (mitochondrial) (SDHAF2), Formin-binding protein 1 (FNBP1), Protein CBFA2T2 (CBFA2T2), Neutrophil cytosol factor 1 (NCF1), or Protein syndesmos (NUDT16L).
  • a cysteine containing protein comprises a protein illustrated in Table 1. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 1. In some instances, a cysteine containing protein comprises a protein illustrated in Table 2. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 2. In some instances, a cysteine containing protein comprises a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some instances, a cysteine containing protein comprises a cysteine residue denoted in Table 4. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 4.
  • a cysteine containing protein comprises a protein illustrated in Table 5. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 5. In some instances, a cysteine containing protein comprises a protein illustrated in Table 7. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 7. In some instances, a cysteine containing protein comprises a protein illustrated in Table 8. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 8. In some instances, a cysteine containing protein comprises a protein illustrated in Table 9. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 9.
  • the cysteine containing protein is a modified protein, in which the protein is modified at a cysteine residue site by a small molecule fragment described herein, such as for example, by a small molecule fragment of Formula (I) described herein, a cysteine-reactive probe of Formula (II) described herein, or by a small molecule fragment illustrated in Fig. 3.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein.
  • the cysteine containing protein is selected from Table 3.
  • cysteine residues of each respective cysteine containing protein are denoted in Table 3.
  • a cysteine containing protein selected from Table 3 is modified by a small molecule fragment at at least one cysteine site denoted in Table 3 to generate a modified cysteine containing protein.
  • the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, KDM4B, NR3C1, GSTP1, TNFAIP3, ACAT1,
  • the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, GSTP1, ACAT1, IRAK1, IRF4, ZC3HAV1, USP7,
  • the cysteine containing protein is selected from KDM4B, NR3C1, TNFAIP3, USP7 or USP22. In some cases, the cysteine containing protein is selected from GNB2L1 or USP34. In some cases, the cysteine containing protein is DCUN1D1. In some cases, the cysteine containing protein is selected from PES1, IKBKB, GSTP1, ACAT1, IRAK1, ZC3HAV1 or RRAGC. In some cases, the cysteine containing protein is selected from XPO1, GNB2L1, USP34, UBE2O, MLTK or
  • the cysteine containing protein is selected from KDM4B or NR3C1. In some cases, the cysteine containing protein is selected from TNFAIP3, USP7, USP28, KDM3A or USP16. In some cases, the cysteine containing protein is selected from IRF4, PELI1,
  • cysteine containing protein is AIP. In some cases, the cysteine containing protein is an enzyme and the enzyme is selected from IKBKB, KDM4B,
  • cysteine containing protein is a transcription factor or regulator and the transcription factor or regulator is selected from NR3C1, IRF4 or ZC3HAV1.
  • the cysteine containing protein is a channel, a transporter, or a receptor and the channel, transporter, or receptor is selected from GNB2L1 or RRAGC. In some cases, the cysteine containing protein is selected from AIP, PES1, XPO1 or DCUN1D1. In some cases, the cysteine containing protein is selected from PES1, CYR61, UBE2L6, XPO1, ADA, NR3C1,
  • cysteine containing protein is selected from PES1, CYR61, NR3C1, UCHL3, ERCC3, ACAT1, STAT3, CASP2, LRBA, UBE2L3, RELB, PDIA6, PCK2,
  • cysteine containing protein is selected from
  • the cysteine containing protein is selected from CYR61 or XPO1. In some cases, the cysteine containing protein is selected from ADA, MGMT, IDH2, IRF8 or SAMHD1. In some cases, the cysteine containing protein is selected from PES1, CYR61, XPO1, NR3C1 or SMARCC2. In some cases, the cysteine containing protein is selected from CYR61, UBE2L6, MGMT, ERCC3,
  • cysteine containing protein is selected from ADA, RELB or USP34. In some cases, the cysteine containing protein is selected from UCHL3, CASP2,
  • the cysteine containing protein is selected from MGMT, ACAT1, UBA7, UBE2L3 or IRF8. In some cases, the cysteine containing protein is selected from PFKFB4, ACAT1 or STAT3. In some cases, the cysteine containing protein is selected from POU2F2, PDIA6 or SAMHD1. In some cases, the cysteine containing protein is an enzyme and the enzyme is selected from UBE2L6, ADA, UCHL3,
  • cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator is selected from
  • the cysteine containing protein is selected from ZAP70, PRKCQ or PRMT1. In some cases, the cysteine containing protein is selected from ZAP70 or PRKCQ. In some cases, the cysteine containing protein is selected from CYR61, ZNF217, NCF1, IREB2, LRBA, CDK5, EP300, EZH2, UBE2S,
  • the cysteine containing protein is selected from
  • cysteine containing protein is selected from NCF1, LRBA or CDK5. In some cases, the cysteine containing protein is EZH2. In some cases, the cysteine containing protein is selected from
  • the cysteine containing protein is selected from CYR61, IREB2, LRBA or UBE2S. In some cases, the cysteine containing protein is selected from EZH2, VCPIP1 or RRAGC. In some cases, the cysteine containing protein is an enzyme and the enzyme is selected from CDK5, EP300, EZH2, UBE2S, VCPIP1 or IRAK4. In some cases, the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator is selected from ZNF217 or IREB2.
  • the cysteine containing protein is an adapter, a scaffolding protein or a modulator protein and the adapter, scaffolding protein or the modulator protein is selected from NCF1.
  • the cysteine containing protein is a channel, a transporter or a receptor and the channel, transporter, or receptor is selected from RRAGC.
  • the cysteine containing protein is selected from CYR61 or LRBA.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from: [chemical structures omitted], wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety.
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
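  • The molecular-weight convention described above (counting carbon and hydrogen, optionally nitrogen, oxygen and sulfur, and leaving out halogens and transition metals) can be illustrated with a minimal Python sketch. The function name, the flat-formula parsing and the example formula are illustrative assumptions and are not part of the disclosure; the atomic masses are standard average values.

```python
import re

# Standard average atomic masses (g/mol) for the atoms that are counted.
# Any other element (e.g. halogens, transition metals) is deliberately
# left out of the sum.
COUNTED_MASSES = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def fragment_mw(formula: str) -> float:
    """Sum the masses of C, H, N, O and S in a flat formula such as 'C9H8ClNO'.

    Hypothetical helper: it handles only an element symbol followed by an
    optional integer count, with no parentheses, charges or isotopes.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol in COUNTED_MASSES:
            total += COUNTED_MASSES[symbol] * (int(count) if count else 1)
    return total


# Illustrative formula only: the chlorine atom does not contribute.
print(round(fragment_mw("C9H8ClNO"), 3))
```

  • Under this sketch, a halogenated fragment and its non-halogenated counterpart give the same value, which is the point of the stated convention.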
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10A, enzymes.
  • the cysteine containing protein is selected from Table 10A, enzymes.
  • one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10A.
  • a cysteine containing protein selected from Table 10A is modified by a small molecule fragment at at least one cysteine site denoted in Table 10A to generate a modified cysteine containing protein.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • the small molecule fragment is a small molecule fragment of Formula (I):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10B, transcription factors and regulators.
  • the cysteine containing protein is selected from Table 10B, transcription factors and regulators.
  • one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10B.
  • a cysteine containing protein selected from Table 10B is modified by a small molecule fragment at at least one cysteine site denoted in Table 10B to generate a modified cysteine containing protein.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10C, channels, transporters or receptors.
  • the cysteine containing protein is selected from Table 10C, channels, transporters or receptors.
  • one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10C.
  • a cysteine containing protein selected from Table 10C is modified by a small molecule fragment at at least one cysteine site denoted in Table 10C to generate a modified cysteine containing protein.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10D, adapter, scaffolding, or modulator proteins.
  • the cysteine containing protein is selected from Table 10D, adapter, scaffolding, or modulator proteins.
  • one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10D.
  • a cysteine containing protein selected from Table 10D is modified by a small molecule fragment at at least one cysteine site denoted in Table 10D to generate a modified cysteine containing protein.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10E.
  • the cysteine containing protein is selected from Table 10E.
  • one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10E.
  • a cysteine containing protein selected from Table 10E is modified by a small molecule fragment at at least one cysteine site denoted in Table 10E to generate a modified cysteine containing protein.
  • the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from: [chemical structures omitted], wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety.
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
  • the small molecule fragment is a small molecule fragment of Formula (I), wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to the cysteine containing protein.
  • the small molecule fragment binds reversibly to the cysteine containing protein.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, KDM4B, NR3C1, GSTP1, TNFAIP3, ACAT1, IRAK1, GNB2L1, IRF4, USP34, ZC3HAV1, USP7, PELI1, DCUN1D1, USP28, UBE2O, RRAGC, MLTK, USP22, KDM3A, or USP16.
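  • As an illustration of the XpC*Z motif (a polar residue, then the modified cysteine, then any amino acid), the following minimal Python sketch scans a sequence for candidate sites. The polar set used here (S, T, N, Q, Y, C) is an assumption inferred from the SC*Z, NC*Z, YC*Z, TC*Z, QC*Z and CC*Z motifs recited in this document; the function name and the example sequence are hypothetical.

```python
from typing import List

# Assumed polar residues, inferred from the single-residue motifs in the text.
POLAR = set("STNQYC")


def find_polar_cys_sites(sequence: str) -> List[int]:
    """Return 1-based positions of cysteines that fit the Xp-C*-Z pattern.

    A cysteine matches when the preceding residue is in POLAR and at least
    one residue (Z, any amino acid) follows it.
    """
    seq = sequence.upper()
    sites = []
    # Start at index 1 (a residue must precede) and stop before the last
    # index (a residue must follow).
    for i in range(1, len(seq) - 1):
        if seq[i] == "C" and seq[i - 1] in POLAR:
            sites.append(i + 1)
    return sites


# Made-up sequence for demonstration; not taken from any table in the text.
print(find_polar_cys_sites("MASCKLNCDAC"))
```

  • The sketch only identifies sequence positions that fit the pattern; whether a given cysteine is actually modified by a fragment is an experimental question that the pattern alone does not answer.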
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xn, wherein Xp is a polar residue, C* denotes the site of modification, and Xn is a nonpolar residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, GSTP1, ACAT1, IRAK1, IRF4, ZC3HAV1, USP7, PELI1, USP28, UBE2O, RRAGC, MLTK, USP22, KDM3A, or USP16.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xp, wherein Xp is a polar residue and C* denotes the site of modification.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from KDM4B, NR3C1, TNFAIP3, USP7 or USP22.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xb, wherein Xp is a polar residue, C* denotes the site of modification, and Xb is a basic residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from GNB2L1 or USP34.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xa, wherein Xp is a polar residue, C* denotes the site of modification, and Xa is an acidic residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is DCUN1D1.
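  • The four sub-motifs above differ only in the class of the residue that follows the modified cysteine (nonpolar, polar, basic or acidic). A minimal Python sketch of that classification is given below. The residue classes are an assumed conventional grouping (nonpolar: G, A, V, L, I, M, F, W, P; polar: S, T, N, Q, Y, C; basic: K, R, H; acidic: D, E) and may differ from the grouping intended in the source; the function name and the example sequence are hypothetical.

```python
from typing import Optional

# Assumed residue classes (a conventional grouping, not taken from the source):
# n = nonpolar, p = polar, b = basic, a = acidic.
RESIDUE_CLASS = {}
for residues, label in (("GAVLIMFWP", "n"), ("STNQYC", "p"), ("KRH", "b"), ("DE", "a")):
    for residue in residues:
        RESIDUE_CLASS[residue] = label


def classify_site(sequence: str, cys_position: int) -> Optional[str]:
    """Name the sub-motif at a 1-based cysteine position, or return None.

    None is returned when the position is not a cysteine, lacks a neighbor
    on either side, or is not preceded by a polar residue.
    """
    seq = sequence.upper()
    i = cys_position - 1
    if i < 1 or i + 1 >= len(seq) or seq[i] != "C":
        return None
    if RESIDUE_CLASS.get(seq[i - 1]) != "p":
        return None
    following = RESIDUE_CLASS.get(seq[i + 1])
    return f"XpC*X{following}" if following else None


# Made-up sequence for demonstration only.
print(classify_site("MASCKLNCDAC", 4))
print(classify_site("MASCKLNCDAC", 8))
```

  • A lookup table is used rather than separate functions per motif so that changing the assumed residue grouping requires editing only one place.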
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif SC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from PES1, IKBKB, GSTP1, ACAT1, IRAK1, ZC3HAV1 or RRAGC.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif NC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from XPO1, GNB2L1, USP34, UBE2O, MLTK or USP22.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif YC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from KDM4B or NR3C1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif TC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from TNFAIP3, USP7, USP28, KDM3A or USP16.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif QC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from IRF4, PELI1, DCUN1D1 or USP22.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif CC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is AIP.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an enzyme and the enzyme comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the enzyme is selected from IKBKB, KDM4B, GSTP1, TNFAIP3, ACAT1, IRAK1, USP34, USP7, PELI1, USP28, UBE2O, MLTK, USP22, KDM3A, or USP16.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the transcription factor or regulator is selected from NR3C1, IRF4 or ZC3HAV1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a channel, transporter or a receptor and the channel, transporter or receptor comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the channel, transporter, or receptor is selected from GNB2L1 or RRAGC.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X p C*Z, wherein X p is a polar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from AIP, PES1, XPO1 or DCUN1D1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X n C*Z, wherein X n is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from PES1, CYR61, UBE2L6, XPO1, ADA, NR3C1, POU2F2, UCHL3, MGMT, ERCC3, ACAT1, STAT3, UBA7, CASP2, IDH2, LRBA, UBE2L3, RELB, IRF8, CASP8, PDIA6, PCK2, PFKFB4, PDE12, USP34, USP48, SMARCC2 or SAMHD1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X n C*X n , wherein X n is a nonpolar residue and C* denotes the site of modification.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from PES1, CYR61, NR3C1, UCHL3, ERCC3, ACAT1, STAT3, CASP2, LRBA, UBE2L3, RELB, PDIA6, PCK2, PFKFB4, USP48 or SMARCC2.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X n C*X p , wherein X n is a nonpolar residue, C* denotes the site of modification, and X p is a polar residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from UBE2L6, POU2F2, MGMT, ACAT1, UBA7, CASP8, PDE12 or USP34.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X n C*X a , wherein X n is a nonpolar residue, C* denotes the site of modification, and X a is an acidic residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from CYR61 or XPO1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X n C*X b , wherein X n is a nonpolar residue, C* denotes the site of modification, and X b is a basic residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from ADA, MGMT, IDH2, IRF8 or SAMHD1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif LC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from PES1, CYR61, XPO1, NR3C1 or SMARCC2.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif PC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from CYR61, UBE2L6, MGMT, ERCC3, ACAT1 or USP48.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif GC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from ADA, RELB or USP34.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif AC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from UCHL3, CASP2, IDH2, LRBA, CASP8, PCK2 or PDE12.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif VC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from MGMT, ACAT1, UBA7, UBE2L3 or IRF8.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif IC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from PFKFB4, ACAT1 or STAT3.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X r C*Z, wherein X r denotes an aromatic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from POU2F2, PDIA6 or SAMHD1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an enzyme and the enzyme comprises the motif X n C*Z, wherein X n is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the enzyme is selected from UBE2L6, ADA, UCHL3, MGMT, ERCC3, ACAT1, UBA7, CASP2, IDH2, UBE2L3, CASP8, PDIA6, PCK2, PFKFB4, PDE12, USP34, USP48 or SAMHD1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator comprises the motif X n C*Z, wherein X n is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the transcription factor or regulator is selected from NR3C1, POU2F2, STAT3, RELB, IRF8 or SMARCC2.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X n C*Z, wherein X n is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from PES1, CYR61, XPO1 or LRBA.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X a C*Z, wherein X a is an acidic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from ZAP70, PRKCQ or PRMT1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif EC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from ZAP70 or PRKCQ.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X b C*Z, wherein X b is a basic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from CYR61, ZNF217, NCF1, IREB2, LRBA, CDK5, EP300,
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X b C*X n , wherein X b is a basic residue, C* denotes the site of modification, and X n is a nonpolar residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from CYR61, ZNF217, IREB2, EP300, UBE2S, VCPIP1, RRAGC or IRAK4.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X b C*X p , wherein X b is a basic residue, C* denotes the site of modification, and X p is a polar residue.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from NCF1, LRBA or CDK5.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X b C*X , wherein X b is a basic residue and C* denotes the site of modification.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is EZH2.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif RC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from ZNF217, NCF1, CDK5, EP300 or IRAK4.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif KC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from CYR61, IREB2, LRBA or UBE2S.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif HC*Z, wherein C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from EZH2, VCPIP1 or RRAGC.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an enzyme and the enzyme comprises the motif X b C*Z, wherein X b is a basic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the enzyme is selected from CDK5, EP300, EZH2, UBE2S, VCPIP1 or IRAK4.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator comprises the motif X b C*Z, wherein X b is a basic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the transcription factor or regulator is selected from ZNF217 or IREB2.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an adapter, a scaffolding protein, or a modulator protein and the adapter, scaffolding protein or modulator protein comprises the motif X b C*Z, wherein X b is a basic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the adapter, scaffolding protein or the modulator protein is selected from NCF1.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a channel, a transporter, or a receptor and the channel, transporter, or receptor comprises the motif X b C*Z, wherein X b is a basic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the channel, transporter, or receptor is selected from RRAGC.
  • a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif X b C*Z, wherein X b is a basic residue, C* denotes the site of modification, and Z is any amino acid.
  • the cysteine containing protein is selected from Table 3.
  • the cysteine containing protein is selected from CYR61 or LRBA.
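The motif notation used in the embodiments above (for example X b C*X n or KC*Z) can be checked mechanically against a peptide in which the modified cysteine is marked "C*". The following is a minimal Python sketch and is not part of the disclosure. The nonpolar, polar, acidic and basic sets follow the amino acid groupings given elsewhere herein; the aromatic set (F, W, Y) is an assumption, and the function names and the example peptide are hypothetical.

```python
# Residue classes keyed by the motif symbols used in the text.
# The aromatic set is an assumption; it is not spelled out in the text.
RESIDUE_CLASSES = {
    "Xn": set("GAPVLIFMW"),  # nonpolar (hydrophobic)
    "Xp": set("STNQCY"),     # polar
    "Xa": set("DE"),         # acidic
    "Xb": set("KRH"),        # basic
    "Xr": set("FWY"),        # aromatic (assumed)
}


def _token_matches(token, residue):
    """Return True if a motif token (class, 'Z', or one-letter code) fits a residue."""
    if residue is None:          # no residue at that position
        return False
    if token == "Z":             # Z is any amino acid
        return True
    if token in RESIDUE_CLASSES:
        return residue in RESIDUE_CLASSES[token]
    return token == residue      # literal one-letter residue, e.g. 'K'


def matches_motif(peptide, motif):
    """Check the residues flanking 'C*' in `peptide` against a motif such as 'XbC*Xn'."""
    before_tok, after_tok = motif.replace(" ", "").split("C*")
    idx = peptide.find("C*")
    if idx < 0:
        raise ValueError("peptide must mark the modified cysteine as 'C*'")
    before = peptide[idx - 1] if idx > 0 else None
    after = peptide[idx + 2] if idx + 2 < len(peptide) else None
    return _token_matches(before_tok, before) and _token_matches(after_tok, after)


# Hypothetical peptide: lysine before the modified cysteine, leucine after it.
example = "AKC*LT"
print(matches_motif(example, "X b C*X n"))  # basic, then nonpolar
print(matches_motif(example, "KC*Z"))
print(matches_motif(example, "XaC*Z"))
```

Because a residue can belong to more than one class under these assumed sets (tyrosine is both polar and aromatic here), one site may satisfy several motifs at once, which is consistent with the same protein appearing under more than one motif above.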
  • a cysteine containing protein described above comprises about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
  • cysteine residue of a modified cysteine containing protein
  • R is selected from:
  • the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
  • the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
  • the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
  • the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof.
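The weight convention described above (carbon and hydrogen, optionally nitrogen, oxygen and sulfur, with halogens and transition metals left out) can be illustrated with a short calculation. This is a minimal Python sketch under stated assumptions: the atomic weights are standard rounded values, the function name is hypothetical, and the example composition is illustrative only and is not taken from the disclosure.

```python
# Standard atomic weights (g/mol), rounded, for the elements the text counts.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def fragment_weight(composition):
    """Sum atomic weights over C, H, N, O and S only.

    Any other element in `composition` (for example a halogen or a
    transition metal) is skipped, mirroring the convention in the text.
    """
    return sum(
        ATOMIC_WEIGHTS[element] * count
        for element, count in composition.items()
        if element in ATOMIC_WEIGHTS
    )


# Hypothetical composition C2H4ClNO; the chlorine atom is not counted.
composition = {"C": 2, "H": 4, "Cl": 1, "N": 1, "O": 1}
print(round(fragment_weight(composition), 3))
```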
  • the small molecule fragment is a small molecule fragment of Formula (I):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment binds irreversibly to a cysteine containing protein described above.
  • the small molecule fragment binds reversibly to a cysteine containing protein described above.
  • Compositions, Cells, and Cell Populations
  • Also disclosed herein are compositions of a small molecule fragment conjugated to a cysteine containing protein, a cysteine-reactive probe conjugated to a cysteine containing protein, and treated sample compositions.
  • a composition described herein comprises a small molecule fragment of Formula (I):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety; and
  • cysteine containing protein wherein the cysteine containing protein is covalently bonded to the small molecule fragment.
  • composition that comprises a cysteine-reactive probe of Formula (II):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety; and
  • cysteine containing protein wherein the cysteine containing protein is covalently bonded to the cysteine-reactive probe.
  • compositions that comprise an isolated sample, wherein the isolated sample is an isolated cell or a tissue sample; and a cysteine-reactive probe to be assayed for its ability to interact with a cysteine containing protein expressed in the isolated sample.
  • Further disclosed herein are isolated treated cells and cell populations.
  • described herein is an isolated treated cell that comprises a cysteine-reactive probe covalently attached to a cysteine containing protein.
  • the isolated treated cell further comprises a set of cysteine-reactive probes wherein each of the cysteine-reactive probes is covalently attached to a cysteine containing protein.
  • an isolated treated cell that comprises a small molecule fragment covalently attached to a cysteine containing protein.
  • the isolated treated cell further comprises a set of small molecule fragments wherein each of the small molecule fragments is covalently attached to a cysteine containing protein.
  • the isolated treated cell further comprises a cysteine-reactive probe.
  • the isolated treated cell further comprises a set of cysteine-reactive probes.
  • an isolated treated population of cells that comprises a set of cysteine-reactive probes covalently attached to cysteine containing proteins.
  • an isolated treated population of cells that comprises a set of small molecule fragments covalently attached to cysteine containing proteins.
  • the isolated treated population of cells further comprises a set of cysteine-reactive probes.
  • the small molecule fragment is a small molecule fragment of Formula (I):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • F is obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from
  • F is a small molecule fragment moiety illustrated in Fig. 3.
  • F further comprises a linker moiety that connects F to the carbonyl moiety.
  • the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
  • cysteine-reactive probe is a cysteine-reactive probe of Formula (II):
  • RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety.
  • the Michael acceptor moiety comprises an alkene or an alkyne moiety.
  • the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
  • the binding moiety is a small molecule fragment obtained from a compound library.
  • the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
  • the affinity handle is a bioorthogonal affinity handle.
  • the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
  • the affinity handle comprises an alkyne or an azide group.
  • the affinity handle is further conjugated to an affinity ligand.
  • the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
  • the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
  • the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof.
  • the affinity handle moiety further comprises a chromophore.
  • the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
  • the cell or cell population is obtained from any mammal, such as human or non-human primates.
  • the cell or cell population is an epithelial cell, connective tissue cell, hormone secreting cell, a nerve cell, a skeletal muscle cell, a blood cell, or an immune system cell.
  • the cell or cell population is cancerous or is obtained from a tumor site.
  • Polypeptides comprising a cysteine interacting site
  • polypeptides that comprise one or more of the cysteine interacting sites identified by a method described herein.
  • described herein is an isolated and purified polypeptide that comprises at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9.
  • the isolated and purified polypeptide comprises at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9.
  • the isolated and purified polypeptide comprises 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some instances, the isolated and purified polypeptide consists of 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9. In some instances, the isolated and purified polypeptide is at most 50 amino acids in length.
  • nucleic acid encoding a polypeptide that comprises at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9.
  • nucleic acid encoding a polypeptide that comprises at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9.
  • nucleic acid encoding a polypeptide that comprises 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9.
  • nucleic acid encoding a polypeptide that consists of 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9.
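The percent identity thresholds above are stated over a window of at least seven contiguous amino acids. The following minimal Python sketch shows one way such a comparison could be scored. It assumes an ungapped, position-by-position comparison of equal-length windows, which is a simplification (identity is often computed from an alignment); the function name and both example sequences are hypothetical.

```python
def best_window_identity(candidate, reference, window=7):
    """Best fractional identity between any `window`-length stretch of
    `reference` and any `window`-length stretch of `candidate`.

    Comparison is ungapped: residues are compared position by position.
    Returns 0.0 if either sequence is shorter than `window`.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_win = candidate[j:j + window]
            matches = sum(a == b for a, b in zip(ref_win, cand_win))
            best = max(best, matches / window)
    return best


# Hypothetical sequences: one exact 7-residue match, one with a single mismatch.
print(best_window_identity("MKTAYIAKQR", "TAYIAKQ"))
print(best_window_identity("MKTAYIAKQR", "TAYLAKQ") >= 0.90)
```

With a window of exactly seven residues, a single mismatch gives 6/7 (about 86%), so a 90% threshold is met only by an exact match at that window length; longer windows leave room for mismatches.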
  • further disclosed herein is a method of mapping a biologically active cysteine site on a protein, which comprises harvesting a set of cysteine-reactive probe-protein complexes from a sample wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and based on the previous step, mapping the biologically active cysteine site on the protein.
  • the analyzing further comprises treating the set of cysteine-reactive probe-protein complexes with a protease to generate a set of protein fragments.
  • the protease is a serine protease, a threonine protease, a cysteine protease, an aspartate protease, a glutamic acid protease, or a metalloprotease.
  • the protease is a serine protease.
  • the protease is trypsin.
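The protease treatment step can be illustrated with an in silico digestion. This is a minimal Python sketch, not the disclosed procedure: it assumes the commonly used rule that trypsin cleaves on the C-terminal side of lysine or arginine except when the next residue is proline, it ignores missed cleavages, and the function names and the example sequence are hypothetical. It relies on `re.split` splitting on zero-width matches, which requires Python 3.7 or later.

```python
import re


def trypsin_digest(sequence):
    """Split a protein sequence after K or R, unless followed by P (assumed rule)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [piece for piece in pieces if piece]  # drop empty trailing piece


def cysteine_peptides(sequence):
    """Return only the tryptic peptides that contain at least one cysteine."""
    return [peptide for peptide in trypsin_digest(sequence) if "C" in peptide]


# Hypothetical sequence; the R-P bond is left uncleaved under the assumed rule.
protein = "AKCLRPGKM"
print(trypsin_digest(protein))
print(cysteine_peptides(protein))
```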
  • cysteine-reactive probe-protein complex is further attached to a labeling group such as a biotin moiety.
  • the labeling group such as a biotin moiety further comprises a linker.
  • the linker is a peptide.
  • the peptide linker is about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more amino acid residues in length.
  • the peptide linker contains a cleavage site.
  • a non-limiting list of cleavage sites includes Tobacco Etch Virus (TEV), thrombin (Thr), enterokinase (EKT), activated Factor X (Xa), or human Rhinovirus 3C protease (3C/PreScission).
  • the peptide linker contains a TEV protease cleavage site.
  • the TEV protease cleavage site comprises the following sequence Gly-Gln-Phe-Tyr-Leu-Asn-Glu (SEQ ID NO: 860).
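For sequence handling it can help to convert the three-letter listing above into the one-letter abbreviations used elsewhere herein. The following minimal Python sketch does this for the twenty naturally occurring amino acids; the function name is hypothetical, and the sketch makes no assumption about the reading direction of the listed sequence.

```python
# Three-letter to one-letter codes for the twenty naturally occurring amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a hyphen-separated three-letter sequence to one-letter codes."""
    codes = [code.strip() for code in three_letter_seq.split("-")]
    return "".join(THREE_TO_ONE[code] for code in codes if code)


print(to_one_letter("Gly-Gln-Phe-Tyr-Leu-Asn-Glu"))
```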
  • the biotin moiety is further coupled to a bead (e.g. a streptavidin-coupled bead).
  • the protein from the cysteine-reactive probe-protein complex attached to the bead is digested with trypsin, and the immobilized peptide or protein fragment is further separated and collected.
  • the collected peptide or protein fragment is then digested by a protease (e.g. TEV protease), and the treated protein fragment is then separated, and collected for analysis.
  • the analysis is a proteomic analysis as described above and elsewhere herein.
  • the sequence of the protein fragment is further determined.
  • the protein fragment correlates to a small molecule fragment binding site on the cysteine containing protein.
  • the sequence of the protein fragment correlates to a sequence as illustrated in Tables 1-3 or 8-9.
  • the sequence as shown in Tables 1-3 or 8-9 correlates to a site on the full length protein as a drug binding site.
  • the sequence as shown in Tables 1-3 or 8-9 correlates to a drug binding site.
  • polypeptides comprising one or more of the sequences as shown in Tables 1-3 or 8-9 serve as probes for small molecule fragment screening.
  • the polypeptide is subjected to one or more rounds of purification steps to remove impurities.
  • the purification step is a chromatographic step utilizing separation methods such as affinity-based, size-exclusion based, ion-exchange based, or the like.
  • the polypeptide is at most
  • polypeptide is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities.
  • the polypeptide is at least 30%, 40%, 50%, 60%, 70%,
  • nucleic acid encoding a polypeptide that is derived from a cysteine containing protein is subjected to one or more rounds of purification steps to remove impurities.
  • the purification step is a chromatographic step utilizing separation methods such as affinity-based, size-exclusion based, ion-exchange based, or the like.
  • the nucleic acid is at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities.
  • the nucleic acid is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities.
  • a polypeptide includes natural amino acids, unnatural amino acids, or a combination thereof.
  • an amino acid residue refers to a molecule containing both an amino group and a carboxyl group.
  • Suitable amino acids include, without limitation, both the D- and L-isomers of the naturally-occurring amino acids, as well as non-naturally occurring amino acids prepared by organic synthesis or other metabolic routes.
  • amino acid as used herein, includes, without limitation, α-amino acids, natural amino acids, non-natural amino acids, and amino acid analogs.
  • α-amino acid refers to a molecule containing both an amino group and a carboxyl group bound to a carbon which is designated the α-carbon.
  • β-amino acid refers to a molecule containing both an amino group and a carboxyl group in a β configuration.
  • “Naturally occurring amino acid” refers to any one of the twenty amino acids commonly found in peptides synthesized in nature, and known by the one letter abbreviations A, R, N, C, D, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V.
  • Hydrophobic amino acids include small hydrophobic amino acids and large hydrophobic amino acids.
  • Small hydrophobic amino acids are glycine, alanine, proline, and analogs thereof.
  • Large hydrophobic amino acids are valine, leucine, isoleucine, phenylalanine, methionine, tryptophan, and analogs thereof.
  • Polar amino acids are serine, threonine, asparagine, glutamine, cysteine, tyrosine, and analogs thereof.
  • Charged amino acids are lysine, arginine, histidine, aspartate, glutamate, and analogs thereof. In some cases, aspartic acid and glutamic acid are referred to as acidic amino acids. In other cases, lysine, arginine and histidine are referred to as basic amino acids.
  • amino acid analog refers to a molecule which is structurally similar to an amino acid and which is substituted for an amino acid in the formation of a peptidomimetic macrocycle.
  • Amino acid analogs include, without limitation, β-amino acids and amino acids where the amino or carboxy group is substituted by a similarly reactive group (e.g., substitution of the primary amine with a secondary or tertiary amine, or substitution of the carboxy group with an ester).
  • non-natural amino acid refers to an amino acid which is not one of the twenty amino acids commonly found in peptides synthesized in nature, and known by the one letter abbreviations A, R, N, C, D, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V.
  • amino acid analogs include β-amino acid analogs.
  • β-amino acid analogs include, but are not limited to, the following: cyclic β-amino acid analogs; β-alanine; (R)-β-phenylalanine; (R)-1,2,3,4-tetrahydro-isoquinoline-3-acetic acid; (R)-3-amino-4-(1-naphthyl)-butyric acid; (R)-3-amino-4-(2,4-dichlorophenyl)butyric acid; (R)-3-amino-4-(2-chlorophenyl)-butyric acid; (R)-3-amino-4-(2-cyanophenyl)-butyric acid; (R)-3-amino-4-(2-fluorophenyl)-butyric acid; (R)-3-amino-4-(2-furyl)-butyric acid; (R)-3-amino-4-(2-
  • amino acid analogs include analogs of alanine, valine, glycine or leucine.
  • Examples of amino acid analogs of alanine, valine, glycine, and leucine include, but are not limited to, the following: α-methoxyglycine; α-allyl-L-alanine; α-aminoisobutyric acid; α-methyl-leucine; β-(1-naphthyl)-D-alanine; β-(1-naphthyl)-L-alanine; β-(2-naphthyl)-D-alanine; β-(2-naphthyl)-L-alanine; β-(2-pyridyl)-D-alanine; β-(2-pyridyl)-L-alanine; β-(2-thienyl)-D-alanine; β-(2-thienyl)-L-alanine; β-(
  • amino acid analogs include analogs of arginine or lysine.
  • amino acid analogs of arginine and lysine include, but are not limited to, the following: citrulline; L-2-amino-3-guanidinopropionic acid; L-2-amino-3-ureidopropionic acid; L-citrulline; Lys(Me)2-OH; Lys(N3)-OH; Nδ-benzyloxycarbonyl-L-ornithine; Nω-nitro-D-arginine; Nω-nitro-L-arginine; α-methyl-ornithine; 2,6-diaminoheptanedioic acid; L-ornithine; (N5-1-(4,4-dimethyl-2,6-dioxo-cyclohex-1-ylidene)ethyl)-D-ornithine; (N5-1-(4,4-dimethyl-2,6-dioxo-cyclohex-1-yliden
  • amino acid analogs include analogs of aspartic or glutamic acids.
  • amino acid analogs of aspartic and glutamic acids include, but are not limited to, the following: α-methyl-D-aspartic acid; α-methyl-glutamic acid; α-methyl-L-aspartic acid; γ-methylene-glutamic acid; (N-γ-ethyl)-L-glutamine; [N-α-(4-aminobenzoyl)]-L-glutamic acid; 2,6-diaminopimelic acid; L-α-aminosuberic acid; D-2-aminoadipic acid; D-α-aminosuberic acid; α-aminopimelic acid; iminodiacetic acid; L-2-aminoadipic acid; threo-β-methyl-aspartic acid; γ-carboxy-D-glutamic acid γ,γ-di-t-butyl ester
  • amino acid analogs include analogs of cysteine and methionine.
  • amino acid analogs of cysteine and methionine include, but are not limited to,
  • amino acid analogs include analogs of phenylalanine and tyrosine.
  • amino acid analogs of phenylalanine and tyrosine include β-methyl-phenylalanine, β-hydroxyphenylalanine, α-methyl-3-methoxy-DL-phenylalanine, α-methyl-D-phenylalanine, α-methyl-L-phenylalanine, 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid, 2,4-dichloro-phenylalanine, 2-(trifluoromethyl)-D-phenylalanine, 2-(trifluoromethyl)-L-phenylalanine, 2-bromo-D-phenylalanine, 2-bromo-L-phenylalanine, 2-chloro-D-phenylalanine, 2-chloro-L-phenylalanine, 2-cyano-D-phenylalanine, 2-
  • amino acid analogs include analogs of proline.
  • amino acid analogs of proline include, but are not limited to, 3,4-dehydro-proline, 4-fluoro-proline, cis-
  • amino acid analogs include analogs of serine and threonine.
  • amino acid analogs of serine and threonine include, but are not limited to, 3-amino-2-hydroxy-5-methylhexanoic acid, 2-amino-3-hydroxy-4-methylpentanoic acid, 2-amino-3-ethoxybutanoic acid, 2-amino-3-methoxybutanoic acid, 4-amino-3-hydroxy-6-methylheptanoic acid, 2-amino-3-benzyloxypropionic acid, 2-amino-3-ethoxypropionic acid, 4-amino-3-hydroxybutanoic acid, and α-methylserine.
  • amino acid analogs include analogs of tryptophan.
  • amino acid analogs of tryptophan include, but are not limited to, the following: α-methyl-tryptophan; β-(3-benzothienyl)-D-alanine; β-(3-benzothienyl)-L-alanine; 1-methyl-tryptophan; 4-methyl-tryptophan; 5-benzyloxy-tryptophan; 5-bromo-tryptophan; 5-chloro-tryptophan; 5-fluoro-tryptophan; 5-hydroxy-tryptophan; 5-hydroxy-L-tryptophan; 5-methoxy-tryptophan; 5-methoxy-L-tryptophan; 5-methyl-tryptophan; 6-bromo-tryptophan; 6-chloro-D-tryptophan; 6-chloro-tryptophan; 6-fluoro-tryptophan; 6-methyl-tryp
  • amino acid analogs are racemic.
  • the D isomer of the amino acid analog is used.
  • the L isomer of the amino acid analog is used.
  • the amino acid analog comprises chiral centers that are in the R or S configuration.
  • the amino group(s) of a β-amino acid analog is substituted with a protecting group, e.g., tert-butyloxycarbonyl (BOC group), 9-fluorenylmethyloxycarbonyl (FMOC), tosyl, and the like.
  • the carboxylic acid functional group of a β-amino acid analog is protected, e.g., as its ester derivative.
  • the salt of the amino acid analog is used.
  • nucleic acid molecules refer to at least two nucleotides covalently linked together.
  • a nucleic acid described herein contains phosphodiester bonds, although in some cases, as outlined below (for example in the
  • nucleic acid analogs are included that have alternate backbones, comprising, for example, phosphoramide (Beaucage et al.,
  • nucleic acids include those with bicyclic structures including locked nucleic acids (also referred to herein as "LNA"; Koshkin et al., J. Am. Chem. Soc. 120:13252-3 (1998)); positive backbones (Dempcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Pat. Nos.
  • LNAs are a class of nucleic acid analogues in which the ribose ring is "locked" by a methylene bridge connecting the 2'-O atom with the 4'-C atom. All of these references are hereby expressly incorporated by reference. In some instances, these
  • the target nucleic acids are single stranded or double stranded, as specified, or contain portions of both double stranded or single stranded sequence.
  • the nucleic acids are DNA (including, e.g., genomic DNA, mitochondrial DNA, and cDNA), RNA (including, e.g., mRNA and rRNA) or a hybrid, where the nucleic acid contains any combination of deoxyribo- and ribonucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc.
  • one or more of the methods disclosed herein comprise a sample.
  • the sample is a cell sample or a tissue sample.
  • the sample is a cell sample.
  • the sample for use with the methods described herein is obtained from cells of an animal.
  • the animal cell includes a cell from a marine invertebrate, fish, insect, amphibian, reptile, or mammal.
  • the mammalian cell is from a primate, ape, equine, bovine, porcine, canine, feline, or rodent.
  • the mammal is a primate, ape, dog, cat, rabbit, ferret, or the like.
  • the rodent is a mouse, rat, hamster, gerbil, chinchilla, or guinea pig.
  • the bird cell is from a canary, parakeet or parrot.
  • the reptile cell is from a turtle, lizard or snake.
  • the fish cell is from a tropical fish.
  • the fish cell is from a zebrafish (e.g. Danio rerio).
  • the worm cell is from a nematode (e.g. C. elegans).
  • the amphibian cell is from a frog.
  • the arthropod cell is from a tarantula or hermit crab.
  • the sample for use with the methods described herein is obtained from a mammalian cell.
  • the mammalian cell is an epithelial cell, connective tissue cell, hormone secreting cell, a nerve cell, a skeletal muscle cell, a blood cell, or an immune system cell.
  • Exemplary mammalian cells include, but are not limited to, 293A cell line, 293FT cell line, 293F cells, 293H cells, HEK 293 cells, CHO DG44 cells, CHO-S cells, CHO-K1 cells, Expi293F™ cells, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cells, FreeStyle™ CHO-S cells, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cells, T-REx™ Jurkat cell line, Per.C6 cells, T-REx™-293 cell line, T-REx™-CHO cell line, T-REx™-HeLa cell line, NC-HE
  • the sample for use with the methods described herein is obtained from cells of a tumor cell line.
  • the sample is obtained from cells of a solid tumor cell line.
  • the solid tumor cell line is a sarcoma cell line.
  • the solid tumor cell line is a carcinoma cell line.
  • the sarcoma cell line is obtained from a cell line of alveolar rhabdomyosarcoma, alveolar soft part sarcoma, ameloblastoma, angiosarcoma, chondrosarcoma, chordoma, clear cell sarcoma of soft tissue, dedifferentiated liposarcoma, desmoid, desmoplastic small round cell tumor, embryonal rhabdomyosarcoma, epithelioid fibrosarcoma, epithelioid hemangioendothelioma, epithelioid sarcoma, esthesioneuroblastoma, Ewing sarcoma, extrarenal rhabdoid tumor, extraskeletal myxoid chondrosarcoma, extraskeletal osteosarcoma, fibrosarcoma, giant cell tumor, hemangiopericytoma, infantile fibrosarcoma, inflammatory myofibroblastic tumor
  • rhabdomyosarcoma round cell liposarcoma, small cell osteosarcoma, solitary fibrous tumor, synovial sarcoma, telangiectatic osteosarcoma.
  • the carcinoma cell line is obtained from a cell line of adenocarcinoma, squamous cell carcinoma, adenosquamous carcinoma, anaplastic carcinoma, large cell carcinoma, small cell carcinoma, anal cancer, appendix cancer, bile duct cancer (i.e., cholangiocarcinoma), bladder cancer, brain tumor, breast cancer, cervical cancer, colon cancer, cancer of Unknown Primary (CUP), esophageal cancer, eye cancer, fallopian tube cancer, gastroenterological cancer, kidney cancer, liver cancer, lung cancer, medulloblastoma, melanoma, oral cancer, ovarian cancer, pancreatic cancer, parathyroid disease, penile cancer, pituitary tumor, prostate cancer, rectal cancer, skin cancer, stomach cancer, testicular cancer, throat cancer, thyroid cancer, uterine cancer, vaginal cancer, or vulvar cancer.
  • the sample is obtained from cells of a hematologic malignant cell line.
  • the hematologic malignant cell line is a T-cell cell line.
  • the hematologic malignant cell line is obtained from a T-cell cell line of: peripheral T-cell lymphoma not otherwise specified (PTCL-NOS), anaplastic large cell lymphoma, angioimmunoblastic lymphoma, cutaneous T-cell lymphoma, adult T-cell leukemia/lymphoma (ATLL), blastic NK-cell lymphoma, enteropathy-type T-cell lymphoma, hepatosplenic gamma-delta T-cell lymphoma, lymphoblastic lymphoma, nasal NK/T-cell lymphomas, or treatment-related T-cell lymphomas.
  • the hematologic malignant cell line is obtained from a B-cell cell line of: acute lymphoblastic leukemia (ALL), acute myelogenous leukemia (AML), chronic myelogenous leukemia (CML), acute monocytic leukemia (AMoL), chronic lymphocytic leukemia (CLL), high-risk chronic lymphocytic leukemia (CLL), small lymphocytic lymphoma (SLL), high-risk small lymphocytic lymphoma (SLL), follicular lymphoma (FL), mantle cell lymphoma (MCL), Waldenstrom's macroglobulinemia, multiple myeloma, extranodal marginal zone B cell lymphoma, nodal marginal zone B cell lymphoma, Burkitt's lymphoma, non-Burkitt high grade B cell lymphoma, primary mediastinal B-cell lymphoma (PMBL), immunoblastic large cell lymph
  • the sample for use with the methods described herein is obtained from a tumor cell line.
  • exemplary tumor cell lines include, but are not limited to, 600MPE, AU565, BT-20, BT-474, BT-483, BT-549, Evsa-T, Hs578T, MCF-7, MDA-MB-231, SkBr3, T-47D, HeLa, DU145, PC3, LNCaP, A549, H1299, NCI-H460, A2780, SKOV-3/Luc, Neuro2a, RKO, RKO-AS45-1, HT-29, SW1417, SW948, DLD-1, SW480, Capan-1, MC/9, B72.3, B25.2, B6.2, B38.1, DMS 153, SU.86.86, SNU-182, SNU-423, SNU-449, SNU-475, SNU-387, Hs 817.T, LMH, LMH/2A, SNU-398,
  • the sample for use in the methods is from any tissue or fluid from an individual.
  • Samples include, but are not limited to, tissue (e.g. connective tissue, muscle tissue, nervous tissue, or epithelial tissue), whole blood, dissociated bone marrow, bone marrow aspirate, pleural fluid, peritoneal fluid, central spinal fluid, abdominal fluid, pancreatic fluid, cerebrospinal fluid, brain fluid, ascites, pericardial fluid, urine, saliva, bronchial lavage, sweat, tears, ear flow, sputum, hydrocele fluid, semen, vaginal flow, milk, amniotic fluid, and secretions of respiratory, intestinal or genitourinary tract.
  • the sample is a tissue sample, such as a sample obtained from a biopsy or a tumor tissue sample.
  • the sample is a blood serum sample.
  • the sample is a blood cell sample containing one or more peripheral blood mononuclear cells (PBMCs).
  • the sample contains one or more circulating tumor cells (CTCs).
  • the sample contains one or more disseminated tumor cells (DTC, e.g., in a bone marrow aspirate sample).
  • the samples are obtained from the individual by any suitable means of obtaining the sample using well-known and routine clinical methods.
  • Procedures for obtaining tissue samples from an individual are well known. For example, procedures for drawing and processing a tissue sample, such as from a needle aspiration biopsy, are well known and are employed to obtain a sample for use in the methods provided.
  • Typically, for collection of such a tissue sample, a thin hollow needle is inserted into a mass such as a tumor mass for sampling of cells that, after being stained, will be examined under a microscope.
  • the sample is a sample solution.
  • the sample solution comprises a solution such as a buffer (e.g. phosphate buffered saline) or a media.
  • the media is an isotopically labeled media.
  • the sample solution is a cell solution.
  • the sample (e.g., cells or a cell solution) is incubated with a cysteine-reactive probe for analysis of protein cysteine-reactive probe interactions.
  • the sample is further incubated in the presence of a small molecule fragment prior to addition of the cysteine-reactive probe.
  • the sample is compared with a control.
  • the control comprises the cysteine-reactive probe but not the small molecule fragment.
  • a difference is observed between a set of cysteine-reactive probe protein interactions between the sample and the control.
  • the difference correlates to the interaction between the small molecule fragment and the cysteine containing proteins.
  • the sample is further labeled for analysis of cysteine-reactive probe protein interactions.
  • the sample is labeled with an enriched media.
  • the sample (e.g. cells or a cell solution) is labeled with isotope-labeled amino acids, such as ¹³C or ¹⁵N-labeled amino acids.
  • the labeled sample is further compared with a non-labeled sample to detect differences in cysteine-reactive probe protein interactions between the two samples.
  • this difference is a difference of a cysteine containing protein and its interaction with a small molecule fragment in the labeled sample versus the non-labeled sample. In some instances, the difference is an increase, decrease or a lack of protein cysteine-reactive probe interaction in the two samples.
  • the isotope-labeled method is termed SILAC, stable isotope labeling using amino acids in cell culture.
  • the sample is divided into a first cell solution and a second cell solution.
  • the first cell solution is incubated with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
  • the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
  • the second cell solution comprises a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
  • the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
  • cells from the second cell solution are further treated with a buffer, such as a control buffer, in which the buffer does not contain a small molecule fragment.
  • the control buffer comprises dimethyl sulfoxide (DMSO).
  • the cysteine-reactive probe-protein complex is further conjugated to a chromophore, such as a fluorophore.
  • the cysteine-reactive probe-protein complex is separated and visualized utilizing an electrophoresis system, such as through a gel electrophoresis, or a capillary electrophoresis.
  • Exemplary gel electrophoresis includes agarose based gels, polyacrylamide based gels, or starch based gels.
  • the cysteine-reactive probe-protein is subjected to a native electrophoresis condition.
  • the cysteine-reactive probe-protein is subjected to a denaturing electrophoresis condition.
  • the cysteine-reactive probe-protein after harvesting is further fragmented to generate protein fragments.
  • fragmentation is generated through mechanical stress, pressure, or chemical means.
  • the protein from the cysteine-reactive probe-protein complexes is fragmented by a chemical means.
  • the chemical means is a protease.
  • proteases include, but are not limited to, serine proteases such as chymotrypsin A, penicillin G acylase precursor, dipeptidase E, DmpA aminopeptidase, subtilisin, prolyl oligopeptidase, D-Ala-D-Ala peptidase C, signal peptidase I, cytomegalovirus assemblin, Lon-A peptidase, peptidase Clp, Escherichia coli phage K1F endosialidase CIMCD self-cleaving protein, nucleoporin 145, lactoferrin, murein tetrapeptidase LD-carboxypeptidase, or rhomboid-1; threonine proteases such as ornithine acetyltransferase; cysteine proteases such as TEV protease, amidophosphoribosyltransferase precursor,
  • aminopeptidase, papain, bromelain, cathepsin K, calpain, caspase-1, separase, adenain, pyroglutamyl-peptidase I, sortase A, hepatitis C virus peptidase 2, Sindbis virus-type nsP2 peptidase, dipeptidyl-peptidase VI, or DeSI-1 peptidase; aspartate proteases such as beta-secretase 1 (BACE1), beta-secretase 2 (BACE2), cathepsin D, cathepsin E, chymosin, napsin-A, nepenthesin, pepsin, plasmepsin, presenilin, or renin; glutamic acid proteases such as AfuGprA; and metalloproteases such as peptidase_M48.
  • the fragmentation is a random fragmentation. In some instances, the fragmentation generates specific lengths of protein fragments, or the shearing occurs at particular sequence of amino acid regions.
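Fragmentation at a particular amino acid sequence can be illustrated in silico. The sketch below is a hedged example only: it assumes a trypsin-like rule (cleave after lysine or arginine unless the next residue is proline), which is not named in the text above, and the function name and example sequence are hypothetical.

```python
from typing import List


def cleave_after(sequence: str, residues: str = "KR", blocked_by: str = "P") -> List[str]:
    """Split a protein sequence after any residue in `residues`,
    unless the following residue equals `blocked_by` (assumed rule)."""
    fragments: List[str] = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1:i + 2]  # empty string at the C-terminus
        if residue in residues and next_residue != blocked_by:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # trailing fragment without a cleavage site
    return fragments


# Hypothetical sequence, for illustration only.
peptides = cleave_after("AKCRPDKE")
# Keep only fragments that contain a cysteine, the residue targeted by the probe.
cysteine_peptides = [p for p in peptides if "C" in p]
```

Filtering for cysteine-containing fragments mirrors the focus on cysteine-reactive probe-protein complexes; a real workflow would rely on the protease and search software actually used.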
  • the protein fragments are further analyzed by a proteomic method such as by liquid chromatography (LC) (e.g. high performance liquid chromatography), liquid chromatography-mass spectrometry (LC-MS), matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF), gas chromatography-mass spectrometry (GC-MS), capillary electrophoresis-mass spectrometry (CE-MS), or nuclear magnetic resonance imaging (MRI).
  • the LC method is any suitable LC method well known in the art for separation of a sample into its individual parts. This separation occurs based on the interaction of the sample with the mobile and stationary phases. Since there are many stationary/mobile phase combinations that are employed when separating a mixture, there are several different types of chromatography that are classified based on the physical states of those phases. In some embodiments, the LC is further classified as normal-phase chromatography, reverse-phase chromatography, size-exclusion chromatography, ion-exchange chromatography, affinity chromatography, displacement chromatography, partition chromatography, flash chromatography, chiral chromatography, or aqueous normal-phase chromatography.
  • the LC method is a high performance liquid chromatography (HPLC) method.
  • the HPLC method is further categorized as normal-phase chromatography, reverse-phase chromatography, size-exclusion chromatography, ion-exchange chromatography, affinity chromatography, displacement chromatography, partition chromatography, chiral chromatography, or aqueous normal-phase chromatography.
  • the HPLC method of the present disclosure is performed by any standard techniques well known in the art.
  • Exemplary HPLC methods include hydrophilic interaction liquid chromatography (HILIC), electrostatic repulsion-hydrophilic interaction liquid chromatography (ERLIC) and reverse phase liquid chromatography (RPLC).
  • the LC is coupled to mass spectrometry as an LC-MS method.
  • the LC-MS method includes ultra-performance liquid chromatography-electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC-ESI-QTOF-MS), ultra-performance liquid chromatography-electrospray ionization tandem mass spectrometry (UPLC-ESI-MS/MS), reverse phase liquid chromatography-mass spectrometry (RPLC-MS), hydrophilic interaction liquid chromatography-mass spectrometry (HILIC-MS), hydrophilic interaction liquid chromatography-triple quadrupole tandem mass spectrometry (HILIC-QQQ), electrostatic repulsion-hydrophilic interaction liquid chromatography-mass spectrometry (ERLIC-MS), liquid chromatography time-of-flight mass spectrometry (LC-QTOF-MS), liquid chromatography-tandem
  • the GC is coupled to mass spectrometry as a GC-MS method.
  • the GC-MS method includes two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS), gas chromatography time-of-flight mass spectrometry (GC-QTOF-MS) and gas chromatography-tandem mass spectrometry (GC-MS/MS).
  • CE is coupled to mass spectrometry as a CE-MS method.
  • the CE-MS method includes capillary electrophoresis-negative electrospray ionization-mass spectrometry (CE-ESI-MS), capillary electrophoresis-negative electrospray ionization-quadrupole time of flight-mass spectrometry (CE-ESI-QTOF-MS) and capillary electrophoresis-quadrupole time of flight-mass spectrometry (CE-QTOF-MS).
  • the nuclear magnetic resonance (NMR) method is any suitable method well known in the art for the detection of one or more cysteine binding proteins or protein fragments disclosed herein.
  • the NMR method includes one dimensional (1D) NMR methods, two dimensional (2D) NMR methods, solid state NMR methods and NMR chromatography.
  • 1D NMR methods include ¹Hydrogen,
  • Exemplary solid state NMR methods include solid state ¹³Carbon NMR, high resolution magic angle spinning (HR-MAS) and cross polarization magic angle spinning (CP-MAS) NMR methods.
  • Exemplary NMR techniques include diffusion ordered spectroscopy (DOSY), DOSY-TOCSY and DOSY-HSQC.
  • the protein fragments are analyzed by a method as described in Weerapana et al., "Quantitative reactivity profiling predicts functional cysteines in proteomes," Nature, 468:790-795 (2010).
  • the results from the mass spectroscopy method are analyzed by an algorithm for protein identification.
  • the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification.
  • the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot.
  • a value is assigned to each protein from the cysteine-reactive probe-protein complex.
  • the value assigned to each protein from the cysteine-reactive probe-protein complex is obtained from the mass spectroscopy analysis. In some instances, the value is the area-under-the-curve from a plot of signal intensity as a function of mass-to-charge ratio.
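As a minimal sketch of the area-under-the-curve value described above, the following Python function integrates a signal-intensity trace over mass-to-charge ratio. The trapezoidal rule, the function name, and the example numbers are illustrative assumptions; the text does not specify an integration scheme.

```python
from typing import Sequence


def area_under_curve(mz: Sequence[float], intensity: Sequence[float]) -> float:
    """Trapezoidal area under an intensity-versus-m/z trace (hypothetical helper)."""
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    area = 0.0
    for i in range(1, len(mz)):
        width = mz[i] - mz[i - 1]
        if width < 0:
            raise ValueError("mz values must be sorted in ascending order")
        # Area of the trapezoid between two neighboring data points.
        area += 0.5 * (intensity[i] + intensity[i - 1]) * width
    return area


# Made-up triangular peak, for illustration only.
example_area = area_under_curve([500.0, 500.5, 501.0], [0.0, 10.0, 0.0])
```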
  • a first value is assigned to the protein obtained from the first cell solution and a second value is assigned to the same protein obtained from the second cell solution.
  • a ratio is calculated between the two values.
  • a ratio of greater than 2 indicates that the protein is a candidate for interacting with a drug or that the protein is a cysteine binding protein.
  • the ratio is greater than 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20. In some cases, the ratio is at most 20.
  • the ratio is calculated based on averaged values.
  • the averaged value is an average of at least two, three, or four values of the protein from each cell solution, or that the protein is observed at least two, three, or four times in each cell solution and a value is assigned to each observed time.
  • the ratio further has a standard deviation of less than 12, 10, or 8.
  • a value is not an averaged value.
  • the ratio is calculated based on value of a protein observed only once in a cell population. In some instances, the ratio is assigned with a value of 20.
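The ratio calculation can be sketched as follows. This is a hedged example, not the method of the disclosure: it assumes the ratio is the control value divided by the fragment-treated value (the text does not state the direction), averages replicate values, and caps the result at 20 as mentioned above. All names are hypothetical.

```python
from statistics import mean
from typing import Sequence

MAX_RATIO = 20.0  # upper bound mentioned in the text ("the ratio is at most 20")


def competition_ratio(control_values: Sequence[float],
                      treated_values: Sequence[float],
                      min_observations: int = 2) -> float:
    """Ratio of averaged control signal to averaged fragment-treated signal.

    Assumption: ratio = control / treated, so a larger ratio means the
    fragment blocked more of the probe labeling.
    """
    if min(len(control_values), len(treated_values)) < min_observations:
        raise ValueError("not enough observations to average")
    control = mean(control_values)
    treated = mean(treated_values)
    if control <= 0:
        raise ValueError("control signal must be positive")
    if treated <= 0:
        return MAX_RATIO  # no signal left in the treated sample: assign the cap
    return min(control / treated, MAX_RATIO)


def is_candidate(ratio: float, threshold: float = 2.0) -> bool:
    """A ratio above the threshold flags the protein as a candidate."""
    return ratio > threshold


ratio = competition_ratio([100.0, 120.0], [20.0, 24.0])
```

Setting `min_observations=1` corresponds to the case where a protein is observed only once and the value is not averaged.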
  • a first ratio is obtained from two cell solutions in which both cell solutions have been incubated with a cysteine-reactive probe and the first cell solution is further incubated with a small molecule fragment.
  • the first ratio is further compared to a second ratio in which both cell solutions have been treated by cysteine-reactive probes in the absence of a small molecule fragment.
  • the first ratio is greater than 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20.
  • the second ratio is greater than 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20. In some instances, if the first ratio is greater than 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 and the second ratio is from about 0.5 to about 2, the two ratios indicate that a protein is a drug binding target.
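The two-ratio comparison can be expressed as a simple predicate. In this sketch the default threshold of 2 and the 0.5 to 2 window come from the text above, while the function and parameter names are hypothetical and "about" is treated as an inclusive bound.

```python
from typing import Tuple


def is_drug_binding_target(first_ratio: float,
                           second_ratio: float,
                           first_threshold: float = 2.0,
                           second_range: Tuple[float, float] = (0.5, 2.0)) -> bool:
    """Flag a protein when the fragment-treated comparison (first ratio)
    exceeds the threshold while the probe-only comparison (second ratio)
    stays within the expected no-change window."""
    low, high = second_range
    return first_ratio > first_threshold and low <= second_ratio <= high


# Illustrative values only.
flagged = is_drug_binding_target(first_ratio=5.0, second_ratio=1.0)
```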
  • the value further enables calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein.
  • a percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or of 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
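One way to relate a ratio to a percentage of inhibition is sketched below. The formula is an assumption that is not stated in the text: it takes the ratio R to be control signal divided by fragment-treated signal, so the fraction of probe labeling that remains is 1/R and the inhibition is 100 × (1 − 1/R). Names are hypothetical.

```python
def percent_inhibition(ratio: float) -> float:
    """Percent inhibition of probe labeling, assuming ratio = control / treated."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Ratios below 1 (more labeling in the treated sample) are reported as 0 %.
    return max(0.0, 100.0 * (1.0 - 1.0 / ratio))


def is_inhibition_candidate(ratio: float, cutoff_percent: float = 50.0) -> bool:
    """Candidate if inhibition exceeds the chosen cutoff (50 % by default)."""
    return percent_inhibition(ratio) > cutoff_percent


inhibition = percent_inhibition(4.0)
```

Under this assumption a ratio of 2 corresponds to 50% inhibition, which lines up with the ratio threshold of 2 used elsewhere in the text.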
  • kits and articles of manufacture for use with one or more methods described herein.
  • described herein is a kit for identifying a cysteine containing protein as a small molecule fragment binding target.
  • a kit for mapping binding sites on a cysteine containing protein is also described herein.
  • a kit for identifying cysteine binding proteins is also described herein.
  • such kit includes cysteine-reactive probes such as the cysteine-reactive probes described herein, test compounds such as small molecule fragments or libraries and/or controls, and reagents suitable for carrying out one or more of the methods described herein.
  • the kit further comprises samples, such as a cell sample, and suitable solutions such as buffers or media.
  • the kit further comprises recombinant proteins for use in one or more of the methods described herein.
  • additional components of the kit comprise a carrier, package, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in a method described herein.
  • Suitable containers include, for example, bottles, vials, plates, syringes, and test tubes.
  • the containers are formed from a variety of materials such as glass or plastic.
  • the articles of manufacture provided herein contain packaging materials.
  • packaging materials include, but are not limited to, bottles, tubes, bags, containers, and any packaging material suitable for a selected formulation and intended mode of use.
  • the container(s) include cysteine-reactive probes, test compounds, and one or more reagents for use in a method disclosed herein.
  • Such kits optionally include an identifying description or label or instructions relating to its use in the methods described herein.
  • a kit typically includes labels listing contents and/or instructions for use, and package inserts with instructions for use. A set of instructions will also typically be included.
  • a label is on or associated with the container.
  • a label is on a container when letters, numbers or other characters forming the label are attached, molded or etched into the container itself; a label is associated with a container when it is present within a receptacle or carrier that also holds the container, e.g., as a package insert.
  • a label is used to indicate that the contents are to be used for a specific therapeutic application.
  • the label also indicates directions for use of the contents, such as in the methods described herein.
Services
  • the methods provided herein are also performed as a service.
  • a service provider obtains from the customer a plurality of small molecule fragment candidates for analysis with one or more of the cysteine-reactive probes for screening.
  • the service provider screens the small molecule fragment candidates using one or more of the methods described herein, and then provides the results to the customer.
  • the service provider provides the appropriate reagents to the customer for analysis utilizing one or more of the cysteine-reactive probes and one or more of the methods described herein.
  • the customer performs one or more of the methods described herein and then provides the results to the service provider for analysis.
  • the service provider then analyzes the results and provides the results to the customer.
  • the customer further analyzes the results by interacting with software installed locally (at the customer's location) or remotely (e.g., on a server reachable through a network).
  • Exemplary customers include pharmaceutical companies, clinical laboratories, physicians, patients, and the like.
  • a customer is any suitable customer or party with a need or desire to use the methods, systems, compositions, and kits described herein.
  • the methods described herein include a digital processing device, or use of the same.
  • the digital processing device includes one or more hardware central processing units (CPU) that carry out the device's functions.
  • the digital processing device further comprises an operating system configured to perform executable instructions.
  • the digital processing device is optionally connected to a computer network.
  • the digital processing device is optionally connected to the Internet such that it accesses the World Wide Web.
  • the digital processing device is optionally connected to a cloud computing infrastructure.
  • the digital processing device is optionally connected to an intranet.
  • the digital processing device is optionally connected to a data storage device.
  • suitable digital processing devices include, but are not limited to, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles.
  • Suitable tablet computers include those with booklet, slate, or convertible configurations.
  • the digital processing device includes an operating system configured to perform executable instructions.
  • the operating system is, for example, software, including programs and data, which manages the device's hardware and provides services for execution of applications.
  • Suitable server operating systems include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®.
  • Suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®.
  • the operating system is provided by cloud computing.
  • Suitable mobile smart phone operating systems include, by way of non-limiting examples, Nokia ® Symbian ® OS, Apple ® iOS ® , Research In Motion ® BlackBerry OS ® , Google ® Android ® , Microsoft ® Windows Phone ® OS, Microsoft ® Windows Mobile ® OS, Linux ® , and Palm ® WebOS ® .
  • Suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV ® , Roku ® , Boxee ® , Google TV ® , Google Chromecast ® , Amazon Fire ® , and Samsung ® HomeSync ® .
  • Suitable video game console operating systems include, by way of non-limiting examples, Sony ® PS3 ® , Sony ® PS4 ® , Microsoft ® Xbox 360 ® , Microsoft Xbox One, Nintendo ® Wii ® , Nintendo ® Wii U ® , and Ouya ® .
  • the device includes a storage and/or memory device.
  • the storage and/or memory device is one or more physical apparatuses used to store data or programs on a temporary or permanent basis.
  • the device is volatile memory and requires power to maintain stored information.
  • the device is non-volatile memory and retains stored information when the digital processing device is not powered.
  • the non-volatile memory comprises flash memory.
  • the volatile memory comprises dynamic random-access memory (DRAM).
  • the non-volatile memory comprises ferroelectric random access memory (FRAM).
  • the non-volatile memory comprises phase-change random access memory (PRAM).
  • the device is a storage device including, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, magnetic disk drives, magnetic tape drives, optical disk drives, and cloud computing based storage.
  • the storage and/or memory device is a combination of devices such as those disclosed herein.
  • the digital processing device includes a display to send visual information to a user.
  • the display includes a cathode ray tube (CRT), a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT-LCD), an organic light emitting diode (OLED) display, a plasma display, a video projector, or a combination thereof.
  • the digital processing device includes an input device to receive information from a user.
  • the input device is a keyboard.
  • the input device is a pointing device including, by way of non-limiting examples, a mouse, trackball, track pad, joystick, game controller, or stylus.
  • the input device is a touch screen or a multi-touch screen. In other embodiments, the input device is a microphone to capture voice or other sound input. In other embodiments, the input device is a video camera or other sensor to capture motion or visual input. In further embodiments, the input device is a Kinect™, Leap Motion™, or the like. In still further embodiments, the input device is a combination of devices such as those disclosed herein.
  • the systems and methods disclosed herein include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked digital processing device.
  • a computer readable storage medium is a tangible component of a digital processing device.
  • a computer readable storage medium is optionally removable from a digital processing device.
  • a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like.
  • the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
  • the systems and methods disclosed herein include at least one computer program, or use of the same.
  • a computer program includes a sequence of instructions, executable in the digital processing device's CPU, written to perform a specified task.
  • computer readable instructions are implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types.
  • a computer program comprises one sequence of instructions. In some embodiments, a computer program comprises a plurality of sequences of instructions. In some embodiments, a computer program is provided from one location. In other embodiments, a computer program is provided from a plurality of locations. In various embodiments, a computer program includes one or more software modules. In various embodiments, a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
  • a computer program includes a web application.
  • a web application, in various embodiments, utilizes one or more software frameworks and one or more database systems.
  • a web application is created upon a software framework such as Microsoft ® .NET or Ruby on Rails (RoR).
  • a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, and XML database systems.
  • suitable relational database systems include, by way of non-limiting examples, Microsoft ® SQL Server, MySQL™, and Oracle ® .
  • a web application, in various embodiments, is written in one or more versions of one or more languages.
  • a web application is written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof.
  • a web application is written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or eXtensible Markup Language (XML).
  • a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS).
  • a web application is written to some extent in a client-side scripting language such as Asynchronous Javascript and XML (AJAX), Flash ® Actionscript, Javascript, or Silverlight ® .
  • a web application is written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion ® , Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA ® , or Groovy.
  • a web application is written to some extent in a database query language such as Structured Query Language (SQL).
  • a web application integrates enterprise server products such as IBM ® Lotus Domino ® .
  • a web application includes a media player element.
  • a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe ® Flash ® , HTML 5, Apple ® QuickTime ® , Microsoft ® Silverlight ® , Java™, and Unity ® .
  • a computer program includes a mobile application provided to a mobile digital processing device.
  • the mobile application is provided to a mobile digital processing device at the time it is manufactured.
  • the mobile application is provided to a mobile digital processing device via the computer network described herein.
  • a mobile application is created by techniques using hardware, languages, and development environments. Suitable programming languages include, by way of non-limiting examples, C, C++, C#, Objective-C, Java™, Javascript, Pascal, Object Pascal, Python™, Ruby, VB.NET, WML, and XHTML/HTML with or without CSS, or combinations thereof.
  • Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting examples, AirplaySDK, alcheMo, Appcelerator ® , Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting examples, Lazarus, MobiFlex,
  • mobile device manufacturers distribute software developer kits including, by way of non-limiting examples, iPhone and iPad (iOS) SDK, Android™ SDK, BlackBerry ® SDK, BREW SDK, Palm ® OS SDK, Symbian SDK, webOS SDK, and Windows ® Mobile SDK.
  • commercial forums for distribution of mobile applications include, by way of non-limiting examples, Apple ® App Store, Android™ Market, BlackBerry ® App World, App Store for Palm devices, App Catalog for webOS, Windows ® Marketplace for Mobile, Ovi Store for Nokia ® devices, Samsung ® Apps, and Nintendo ® DSi Shop.
  • a computer program includes a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in.
  • standalone applications are compiled.
  • a compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB.NET, or combinations thereof.
  • Compilation is often performed, at least in part, to create an executable program.
  • a computer program includes one or more executable compiled applications.
  • the computer program includes a web browser plug-in.
  • a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. In some instances, web browser plug-ins include Adobe ® Flash ® Player, Microsoft ® Silverlight ® , and Apple ® QuickTime ® .
  • the toolbar comprises one or more web browser extensions, add-ins, or add-ons. In some embodiments, the toolbar comprises one or more explorer bars, tool bands, or desk bands.
  • plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB.NET, or combinations thereof.
  • Web browsers are software applications, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft ® Internet Explorer ® , Mozilla ® Firefox ® , Google ® Chrome, Apple ® Safari ® , Opera Software ® Opera ® , and KDE Konqueror. In some embodiments, the web browser is a mobile web browser.
  • Mobile web browsers are designed for use on mobile digital processing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems.
  • Suitable mobile web browsers include, by way of non-limiting examples, Google ® Android ® browser, RIM BlackBerry ® Browser, Apple ® Safari ® , Palm ® Blazer, Palm ® WebOS ® Browser, Mozilla ® Firefox ® for mobile, Microsoft ® Internet Explorer ® Mobile, Amazon ® Kindle ® Basic Web, Nokia ® Browser, Opera Software ® Opera ® Mobile, and Sony ® PSP™ browser.
  • a software module comprises a file, a section of code, a programming object, a programming structure, or combinations thereof.
  • a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof.
  • the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, and a standalone application.
  • software modules are in one computer program or application.
  • software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on cloud computing platforms. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
  • the methods and systems disclosed herein include one or more databases, or use of the same.
  • databases are suitable for storage and retrieval of analytical information described elsewhere herein.
  • suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases.
  • a database is internet-based.
  • a database is web-based.
  • a database is cloud computing-based.
  • a database is based on one or more local computer storage devices.
  • the methods provided herein are processed on a server or a computer server (Fig. 2).
  • the server 401 includes a central processing unit (CPU, also "processor") 405 which is a single core processor, a multi core processor, or plurality of processors for parallel processing.
  • a processor used as part of a control assembly is a microprocessor.
  • the server 401 also includes memory 410 (e.g. random access memory, read-only memory, flash memory); electronic storage unit 415 (e.g. hard disk); communications interface 420 (e.g. network adaptor) for
  • peripheral devices 425 which includes cache, other memory, data storage, and/or electronic display adaptors.
  • the memory 410, storage unit 415, interface 420, and peripheral devices 425 are in communication with the processor 405 through a communications bus (solid lines), such as a motherboard.
  • the storage unit 415 is a data storage unit for storing data.
  • the server 401 is operatively coupled to a computer network ("network") 430 with the aid of the communications interface 420.
  • a processor with the aid of additional hardware is also operatively coupled to a network.
  • the network 430 is the Internet, an intranet and/or an extranet, an intranet and/or extranet that is in communication with the Internet, a telecommunication or data network. In some embodiments, the network 430 with the aid of the server 401,
  • the server is capable of transmitting and receiving computer-readable instructions (e.g., device/system operation protocols or parameters) or data (e.g., sensor measurements, raw data obtained from detecting metabolites, analysis of raw data obtained from detecting metabolites, interpretation of raw data obtained from detecting metabolites, etc.) via electronic signals transported through the network 430.
  • a network is used, for example, to transmit or receive data across an international border.
  • the server 401 is in communication with one or more output devices 435 such as a display or printer, and/or with one or more input devices 440 such as, for example, a keyboard, mouse, or joystick.
  • the display is a touch screen display, in which case it functions as both a display device and an input device.
  • different and/or additional input devices are present, such as an enunciator, a speaker, or a microphone.
  • the server uses any one of a variety of operating systems, such as for example, any one of several versions of Windows®, or of MacOS®, or of Unix®, or of Linux®.
  • the storage unit 415 stores files or data associated with the operation of a device, systems or methods described herein.
  • the server communicates with one or more remote computer systems through the network 430.
  • the one or more remote computer systems include, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants.
  • a control assembly includes a single server 401.
  • the system includes multiple servers in communication with one another through an intranet, extranet and/or the Internet.
  • the server 401 is adapted to store device operation parameters, protocols, methods described herein, and other information of potential relevance. In some embodiments, such information is stored on the storage unit 415 or the server 401 and such data is transmitted through a network.
  • ranges and amounts can be expressed as “about” a particular value or range. About also includes the exact amount. Hence “about 5 μL” means “about 5 μL” and also “5 μL.” Generally, the term “about” includes an amount that would be expected to be within experimental error.
  • protein encompasses a full-length cysteine containing protein, a full-length functional cysteine containing protein, a cysteine containing protein fragment, or a functionally active cysteine containing protein fragment.
  • a protein described herein is also referred to as an "isolated protein", or a protein that by virtue of its origin or source of derivation is not associated with naturally associated components that accompany it in its native state; is substantially free of other proteins from the same species; is expressed by a cell from a different species; or does not occur in nature.
  • polypeptide refers to any polymeric chain of amino acids.
  • polypeptide encompasses native or modified cysteine containing protein, cysteine containing protein fragments, or polypeptide analogs comprising non-native amino acid residues.
  • a polypeptide is monomeric.
  • a polypeptide is polymeric.
  • a polypeptide described herein is also referred to as an "isolated polypeptide", or a polypeptide that by virtue of its origin or source of derivation is not associated with naturally associated components that accompany it in its native state; is substantially free of other proteins from the same species; is expressed by a cell from a different species; or does not occur in nature.
  • the terms "individual(s)", “subject(s)” and “patient(s)” mean any mammal.
  • the mammal is a human.
  • the mammal is a non-human. None of the terms require or are limited to situations characterized by the supervision (e.g. constant or intermittent) of a health care worker (e.g. a doctor, a registered nurse, a nurse practitioner, a physician's assistant, an orderly or a hospice worker).
  • alkyl as used herein is a branched or unbranched saturated hydrocarbon group of 1 to 24 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, s-butyl, t-butyl, n-pentyl, isopentyl, s-pentyl, neopentyl, hexyl, heptyl, octyl, nonyl, decyl, dodecyl, tetradecyl, hexadecyl, eicosyl, tetracosyl, and the like.
  • the alkyl group is acyclic. In some instances, the alkyl group is branched or unbranched. In some instances, the alkyl group is also substituted or unsubstituted. For example, the alkyl group is substituted with one or more groups including, but not limited to, alkyl, cycloalkyl, alkoxy, amino, ether, halide, hydroxy, nitro, silyl, sulfo-oxo, or thiol.
  • a "lower alkyl” group is an alkyl group containing from one to six (e.g., from one to four) carbon atoms.
  • alkyl group is also a C1 alkyl, C1-C2 alkyl, C1-C3 alkyl, C1-C4 alkyl, C1-C5 alkyl, C1-C6 alkyl, C1-C7 alkyl, C1-C8 alkyl, C1-C9 alkyl, C1-C10 alkyl, and the like up to and including a C1-C24 alkyl.
  • aryl as used herein is a group that contains any carbon-based aromatic group including, but not limited to, benzene, naphthalene, phenyl, biphenyl, anthracene, and the like.
  • the aryl group can be substituted or unsubstituted.
  • the aryl group is substituted with one or more groups including, but not limited to, alkyl, cycloalkyl, alkoxy, alkenyl, cycloalkenyl, alkynyl, cycloalkynyl, aryl, heteroaryl, aldehyde, -NH2, carboxylic acid, ester, ether, halide, hydroxy, ketone, azide, nitro, silyl, sulfo-oxo, or thiol.
  • biaryl is a specific type of aryl group and is included in the definition of "aryl."
  • the aryl group is optionally a single ring structure or comprises multiple ring structures that are either fused ring structures or attached via one or more bridging groups such as a carbon-carbon bond.
  • biaryl refers to two aryl groups that are bound together via a fused ring structure, as in naphthalene, or are attached via one or more carbon-carbon bonds, as in biphenyl.
  • MDA-MB-231 cells and HEK-293T cells were grown in DMEM supplemented with 10% fetal bovine serum, penicillin, streptomycin and glutamine.
  • Jurkat, Ramos and MUM2C cells were grown in RPMI-1640 medium supplemented with 10% fetal bovine serum, penicillin and streptomycin.
  • cells were grown to 100% confluence for MDA-MB-231 cells or until cell density reached 1.5 million cells/mL for Ramos and Jurkat cells.
  • Cells were washed with cold PBS, scraped with cold PBS and cell pellets were isolated by centrifugation (1,400 g, 3 min, 4 °C), and stored at -80 °C until use. Cell pellets were lysed by sonication and fractionated (100,000 g, 45 min) to yield soluble and membrane fractions, which were then adjusted to a final protein concentration of 1.5 mg/mL for proteomics experiments and 1 mg/mL for gel-based ABPP experiments. The soluble lysate was prepared fresh from frozen pellets directly before each experiment. Protein concentration was determined using the Bio-Rad DC™ protein assay kit.
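  • The concentration adjustment described above is a plain dilution (C1·V1 = C2·V2). A minimal sketch, assuming the lysate is diluted with buffer and never concentrated; the function name and the microliter units are illustrative and do not come from the source:

```python
def buffer_to_add(volume_ul, measured_mg_ml, target_mg_ml):
    """Volume of buffer (uL) to add so a lysate reaches the target concentration.

    Hypothetical helper: applies C1 * V1 = C2 * V2 and returns V2 - V1.
    """
    if measured_mg_ml <= 0 or target_mg_ml <= 0:
        raise ValueError("concentrations must be positive")
    if target_mg_ml > measured_mg_ml:
        raise ValueError("lysate is already below the target; dilution cannot help")
    final_volume_ul = volume_ul * measured_mg_ml / target_mg_ml
    return final_volume_ul - volume_ul


# Example: 100 uL of lysate measured at 3.0 mg/mL, adjusted to 1.5 mg/mL.
print(buffer_to_add(100.0, 3.0, 1.5))
```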
  • DEVD-AMK disclosed as SEQ ID NO: 8557
  • IDH1 labeling by IA-rhodamine is relatively better in MDA-MB-231 soluble proteome when compared with Ramos and Jurkat soluble proteome.
  • DEVD Rho-DEVD-AOMK
  • soluble proteome (1 mg/mL) was labeled with the indicated concentration of 18 or 19 (1 μL of 25× stock solution in DMSO) for 1 h at ambient temperature followed by copper-mediated azide-alkyne cycloaddition (CuAAC) conjugation to rhodamine-azide.
  • MLTK and IMPDH2 were subjected to CuAAC conjugation to rhodamine-azide as detailed above.
  • the percentage of labeling was determined by quantifying the integrated optical intensity of the bands, using ImageJ software.
  • Nonlinear regression analysis was used to determine the IC50 values from a dose-response curve generated using GraphPad Prism 6.
  • MDA-MB-231 cells were grown to 95% confluence and Ramos cells were grown to 1 million cells/mL. The media in all samples was replaced with fresh media containing 200 μM of the indicated fragments, and the cells were incubated at 37 °C for 2 h, washed with cold PBS, scraped into cold PBS and harvested by centrifugation (see prior section on "Preparation of human cancer cell line proteomes").
  • Fragments 2, 3, 8, 9, 10, 12, 13, 14, 21, 27, 28, 29, 31, 33, 38, 45, 51 and 56 were screened at 200 μM in situ. Fragments 4 and 11 were screened at 100 μM in situ. Fragments 2, 3, 8, and 20 were tested at 50 μM in situ.
  • streptavidin-agarose beads slurry (Pierce) was washed in 10 mL PBS and then resuspended in 5 mL PBS.
  • the SDS-solubilized proteins were added to the suspension of streptavidin-agarose beads and the bead mixture was rotated for 3 h at ambient temperature. After incubation, the beads were pelleted by centrifugation (1,400 g, 3 min) and were washed (2 × 10 mL PBS and 2 × 10 mL water).
  • the washed beads were washed once further in 140 μL TEV buffer (50 mM Tris, pH 8, 0.5 mM EDTA, 1 mM DTT) and then resuspended in 140 μL TEV buffer. 5 μL TEV protease (80 μM) was added and the reactions were rotated overnight at 29 °C.
  • TEV digest was separated from the beads with Micro Bio-Spin columns by centrifugation (1,400 g, 3 min) and the beads were washed once with water (100 μL). The samples were then acidified to a final concentration of 5% (v/v) formic acid and stored at -80 °C prior to analysis.
  • TEV digests were pressure loaded onto a 250 μm (inner diameter) fused silica capillary column packed with C18 resin (Aqua 5 μm, Phenomenex).
  • the samples were analyzed by multidimensional liquid chromatography tandem mass spectrometry (MudPIT), using an LTQ-Velos Orbitrap mass spectrometer (Thermo Scientific) coupled to an Agilent 1200- series quaternary pump.
  • the peptides were eluted onto a biphasic column with a 5 μm tip (100 μm fused silica, packed with C18 (10 cm) and bulk strong cation exchange resin (3 cm, SCX, Phenomenex)) in a 5-step MudPIT experiment, using 0%, 30%, 60%, 90%, and 100% salt bumps of 500 mM aqueous ammonium acetate and using a gradient of 5-100% buffer B in buffer A (buffer A: 95% water, 5% acetonitrile, 0.1% formic acid; buffer B: 5% water, 95% acetonitrile, 0.1% formic acid) as has been described in Weerapana et al.
  • MS2 spectra data were searched using the ProLuCID algorithm (publicly available at http://fields.scripps.edu/downloads.php) using a reverse concatenated, nonredundant variant of the Human UniProt database (release-
  • isoTOP-ABPP ratios (R values) were quantified with in-house CIMAGE software, using default parameters (3 MS1s per peak and signal-to-noise threshold 2.5). Site-specific engagement of electrophilic fragments was assessed by blockade of IA-alkyne probe labeling. For peptides that showed a > 95% reduction in MS1 peak area from the fragment treated proteome (light TEV tag) when compared to the DMSO treated proteome (heavy TEV tag), a maximal ratio of 20 was assigned. Ratios for unique peptide entries were calculated for each experiment; overlapping peptides with the same modified cysteine (e.g.
  • Proteome reactivity values for individual fragments were computed as the percentage of the total quantified cysteine-containing peptides with R values >4 (defined as liganded cysteines) for each replicate experiment and the final proteome reactivity value was calculated as the mean for all replicate experiments for each fragment from both MDA-MB-231 and Ramos cellular proteomes.
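  • The ratio cap and the proteome reactivity calculation described above can be sketched as follows. This is an illustrative reading rather than the in-house software: the function names and input layout are assumptions, and R is taken as the heavy (DMSO) over light (fragment) MS1 peak-area ratio.

```python
from statistics import mean

MAX_RATIO = 20.0        # maximal ratio assigned when the light signal drops by >= 95%
LIGANDED_CUTOFF = 4.0   # R > 4 defines a liganded cysteine in the text


def r_value(heavy_area, light_area):
    """Heavy/light (DMSO/fragment) MS1 peak-area ratio, capped at MAX_RATIO."""
    if heavy_area <= 0:
        raise ValueError("heavy (DMSO) peak area must be positive")
    # A >= 95% reduction means light <= 5% of heavy, i.e. a raw ratio >= 20.
    if light_area <= 0.05 * heavy_area:
        return MAX_RATIO
    return heavy_area / light_area


def proteome_reactivity(replicates):
    """Mean over replicates of the percentage of quantified peptides with R > 4.

    `replicates` is a list of lists of R values, one inner list per replicate.
    Replicates with no quantified peptides are skipped.
    """
    percentages = []
    for ratios in replicates:
        if not ratios:
            continue
        hits = sum(1 for r in ratios if r > LIGANDED_CUTOFF)
        percentages.append(100.0 * hits / len(ratios))
    return mean(percentages)


# Example with made-up ratios for two replicates of one fragment.
print(proteome_reactivity([[1.0, 5.0, 20.0, 2.0], [1.0, 1.0, 8.0, 1.0]]))
```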
  • R values were calculated and individual datasets were filtered as described above (R value calculation and processing).
  • Two categories of hits in situ were defined: 1) cysteines liganded in situ that were also observed as hits in vitro and 2) cysteines that were detected in vitro but were liganded only in situ.
  • R values for the same cysteine containing peptide from in vitro and in situ experiments were compared and if both had ratios R > 4, the cysteine was considered ligandable in situ.
  • two ratios R > 4 for replicates of two different fragments were required to be detected in situ and at least one of these fragments must be quantified as a non-hit with R < 2 in vitro.
  • another cysteine from the same protein was required to be unliganded in situ (R < 2) by the same fragment to control for the possibility that changes in R values arose from changes in protein expression upon fragment treatment rather than from fragment competition.
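  • The two hit categories above can be expressed as simple predicates. The sketch below is one possible reading of the criteria, not the authors' script; the dictionary layout (fragment identifier mapped to R value) and all function names are assumptions made for illustration.

```python
HIT = 4.0       # R > 4: liganded
NON_HIT = 2.0   # R < 2: not liganded


def shared_hit(r_in_vitro, r_in_situ):
    """Category 1: the same cysteine is liganded both in vitro and in situ."""
    return r_in_vitro > HIT and r_in_situ > HIT


def in_situ_only_hit(in_situ, in_vitro, other_cysteines_in_situ):
    """Category 2: liganded in situ only.

    in_situ / in_vitro: dict mapping fragment id -> R value for this cysteine.
    other_cysteines_in_situ: dict mapping fragment id -> list of in situ R values
    for other cysteines of the same protein (expression control).
    """
    hit_fragments = [frag for frag, r in in_situ.items() if r > HIT]
    if len(hit_fragments) < 2:
        return False  # hits from two different fragments are required
    for frag in hit_fragments:
        non_hit_in_vitro = frag in in_vitro and in_vitro[frag] < NON_HIT
        control_unchanged = any(
            r < NON_HIT for r in other_cysteines_in_situ.get(frag, [])
        )
        if non_hit_in_vitro and control_unchanged:
            return True
    return False


# Example with made-up values: fragment "A" is a hit only in situ and a second
# cysteine on the same protein is unaffected by "A".
print(in_situ_only_hit({"A": 8.0, "B": 5.0}, {"A": 1.2, "B": 4.5}, {"A": [1.1, 3.0]}))
```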
  • Custom Python scripts were used to compile functional annotations available in the UniProtKB/Swiss-Prot Protein Knowledge database (release-2012_11). Relevant UniProt entries were mined for available functional annotations at the residue level, specifically for annotations regarding enzyme catalytic residues (active sites), disulfides (redox active and structural) and metal binding sites. Liganded proteins were queried against the DrugBank database (Version 4.2) and fractionated into DrugBank and non-DrugBank proteins. Functional keywords assigned at the protein level were collected from the UniProt database and the DrugBank and non-DrugBank categories were further classified into protein functional classes. Cysteine reactivity data was re-processed using ProLuCID as detailed above (Peptide and protein identification).
  • Cysteines found in both the reactivity and ligandability datasets were sorted based on their reactivity values (lower ratio indicates higher reactivity). The moving average of the percentage of total liganded cysteines within each reactivity bin (step-size 50) was taken.
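  • A minimal sketch of the sorted moving average described above, assuming that "step-size 50" means a sliding window of 50 cysteines advanced one cysteine at a time (the source does not spell this out); the function name and the tuple layout are illustrative:

```python
def liganded_moving_average(cysteines, window=50):
    """Percentage of liganded cysteines in a sliding window over reactivity rank.

    `cysteines` is an iterable of (reactivity_ratio, is_liganded) pairs.
    Entries are sorted by ratio, lowest first (lower ratio = higher reactivity).
    Returns one percentage per window position.
    """
    ordered = sorted(cysteines, key=lambda c: c[0])
    flags = [1 if liganded else 0 for _, liganded in ordered]
    if window <= 0 or len(flags) < window:
        return []
    running = sum(flags[:window])
    averages = [100.0 * running / window]
    for i in range(window, len(flags)):
        # Slide the window by one: add the entering flag, drop the leaving one.
        running += flags[i] - flags[i - window]
        averages.append(100.0 * running / window)
    return averages


# Tiny example with a window of 2 instead of 50.
data = [(3.0, False), (0.5, True), (2.0, False), (1.0, True)]
print(liganded_moving_average(data, window=2))
```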
  • Custom Python scripts were developed to collect relevant NMR and X-ray structures from the RCSB Protein Data Bank (PDB).
  • sequence alignments performed with BLAST to proteins deposited in the PDB, were used to identify structural homologues.
  • enzymes with structures in the PDB were manually inspected to evaluate the location of the cysteine. Cysteines were considered to reside in enzyme active sites if they were within 10 Å of active-site ligand or residue(s). Cysteines outside of the 10 Å range were deemed non-active-site residues.
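  • The source describes manual inspection; the 10 Å criterion itself could be checked programmatically along these lines. This is a sketch under the assumption that the distance is the minimum atom-to-atom distance between the cysteine and the active-site ligand or residue(s); coordinates are plain (x, y, z) tuples in Å and the function names are hypothetical.

```python
import math

ACTIVE_SITE_CUTOFF = 10.0  # angstroms, from the text


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom in A and any atom in B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def is_active_site_cysteine(cys_atoms, site_atoms, cutoff=ACTIVE_SITE_CUTOFF):
    """True if the cysteine lies within `cutoff` of the active-site atoms."""
    return min_distance(cys_atoms, site_atoms) <= cutoff


# Example with made-up coordinates: the closest pair of atoms is 5 A apart.
cysteine = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
ligand = [(0.0, 3.0, 4.0), (20.0, 20.0, 20.0)]
print(is_active_site_cysteine(cysteine, ligand))
```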
  • Glutathione was diluted to a final concentration of 125 μM in assay buffer (100 mM Tris, pH 8.8, 10% ethanol as co-solvent).
  • the reaction mixture was incubated at room temperature for 1 h.
  • the concentration of GSH remaining was calculated from a standard curve.
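  • Reading a concentration off a standard curve can be sketched as below, assuming a linear relationship between signal and GSH concentration (the source does not state the curve shape or the readout); the standards, signal values and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to recover a concentration from a signal."""
    return (signal - intercept) / slope


# Hypothetical standards (uM GSH) and their assay signals.
standards_um = [0.0, 25.0, 50.0, 100.0, 125.0]
signals = [0.02, 0.12, 0.22, 0.42, 0.52]
slope, intercept = fit_line(standards_um, signals)

remaining_um = concentration_from_signal(0.27, slope, intercept)
percent_remaining = 100.0 * remaining_um / 125.0  # relative to the 125 uM start
print(round(remaining_um, 2), round(percent_remaining, 1))
```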
  • the UniProtKB ID was used to filter the PDB. Structures determined by X-ray crystallography were selected, privileging higher sequence coverage and structure resolution (see Table 5 for selected PDB IDs). When no human structures were available, the closest homologous organism available was selected (e.g. PRMT1: R. norvegicus). Protein structures were prepared following the standard AutoDock protocol. Waters, salts, and crystallographic additives were removed; AutoDockTools was used to add hydrogens, calculate Gasteiger-Marsili charges and generate PDBQT files.
  • MSMS reduced surface method was used to identify accessible cysteines.
  • the protein volume was scanned using a probe radius of 1.5 Å; residues were considered accessible if they had at least one atom in contact with either external surfaces or internal cavities.
  • the fragment library was docked independently on each accessible cysteine using AutoDock 4.2.
  • a grid box of 24.4 × 24.4 × 24.4 Å was centered on the geometric center of the residue; thiol hydrogen was removed from the side-chain, which was modeled as flexible during the docking; the rest of the structure was kept rigid.
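  • The grid placement above amounts to averaging the residue's atom coordinates and extending half the box edge in each direction. A minimal sketch, assuming the geometric center is the unweighted mean of atom positions; the function names are illustrative and this does not reproduce AutoDock's own grid parameter files.

```python
def geometric_center(coords):
    """Unweighted mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def grid_box(coords, edge=24.4):
    """Center and lower/upper corners of a cubic box (edge in angstroms)."""
    center = geometric_center(coords)
    half = edge / 2.0
    lower = tuple(c - half for c in center)
    upper = tuple(c + half for c in center)
    return center, lower, upper


# Example with made-up atom coordinates for one residue.
center, lower, upper = grid_box([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)])
print(center, lower, upper)
```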
  • a custom 13-7 interaction potential was defined between the nucleophile sulfur and the reactive carbon in the ligands.
  • the equilibrium distance (r_eq) was set to the length of the C-S covalent bond (1.8 Å); the potential well depth (ε_eq) varied between 1.0 and 0.175 to model the reactivity of the different ligands.
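  • The source names a custom 13-7 potential but does not give its functional form. The sketch below assumes the generalized Lennard-Jones-style m-n form used for AutoDock pairwise terms, with m = 13 and n = 7, which has its minimum at r_eq with depth equal to the well-depth parameter; the function name and defaults are illustrative.

```python
def potential_13_7(r, r_eq=1.8, eps=1.0):
    """Assumed 13-7 pair potential with a minimum of -eps at r = r_eq.

    V(r) = eps * [ n/(m-n) * (r_eq/r)**m  -  m/(m-n) * (r_eq/r)**n ],
    with m = 13 (repulsive exponent) and n = 7 (attractive exponent).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    m, n = 13, 7
    ratio = r_eq / r
    return eps * (n / (m - n) * ratio ** m - m / (m - n) * ratio ** n)


# The well depth scales the minimum: a less reactive ligand (eps = 0.175)
# gets a shallower well at the same S-C distance of 1.8 A.
for eps in (1.0, 0.175):
    print(eps, potential_13_7(1.8, eps=eps))
```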
  • IMPDH2 structure, including the Bateman domain, was modeled using I-TASSER.
  • Subcloning and mutagenesis
  • TIGAR, IDH1, PRMT1 and IMPDH2 were expressed in BL21(DE3) Chemically Competent Cells (NEB), grown on Terrific Broth supplemented with the desired antibiotic (50 μg/mL Kanamycin or 50 μg/mL Carbenicillin) to OD600 of 0.8 and induced with 0.5 mM IPTG for 16 h at 18 °C.
  • the resin was then washed with 100 mL buffer A containing 20 mM imidazole and eluted with 10 mL buffer A containing 200 mM imidazole.
  • the eluant was concentrated to 2.5 mL (Amicon-Ultra-15, 10 kDa MW cutoff), buffer exchanged using PD10 columns (GE Amersham) into the storage buffer (50 mM HEPES, pH 7.4, 150 mM NaCl, 10% glycerol, 1 mM BME) and further concentrated (Amicon-Ultra-4, 10 kDa MW cutoff) to a final concentration of approximately 100 μM protein.
  • CASP3, CASP8, pro-CASP8 (D374A, D384A) and an N-terminal MBP fusion-His6-TEV-Arg6 protease construct pRK793 ("His6" disclosed as SEQ ID NO: 861 and "Arg6" disclosed as SEQ ID NO: 862) were expressed in E. coli BL21(DE3)pLysS cells (Stratagene). Cells were grown in 2xYT medium supplemented with 200 µg/ml ampicillin and 50 µg/ml chloramphenicol at 37 °C to an OD600 of 0.8-1.0.
  • Overexpression of caspase was induced with 0.2 mM IPTG at 30 °C for 4 h (CASP3) or at 12 °C overnight (CASP8) or with 0.5 mM IPTG at 30 °C for 4 h (TEV protease).
  • Cells were immediately harvested and resuspended in ice cold buffer A (caspases: 100 mM Tris, pH 8.0, 100 mM NaCl; TEV protease: PBS) and subjected to 3 cycles of lysis by microfluidization (Microfluidics).
  • the cell lysate was clarified by centrifugation (45,000 g, 30 min, 4 °C) and soluble fractions were loaded onto a 1 mL HisTrap HP Ni-NTA affinity column (GE Amersham) pre-equilibrated with buffer A and eluted with buffer A containing 200 mM imidazole.
  • the eluted protein was immediately diluted two-fold with buffer B (20 mM Tris, pH 8.0) and purified by anion-exchange chromatography (HiTrap Q HP, GE Amersham) with a 30-column volume gradient to 50% of buffer B containing 1 M NaCl.
  • the caspases were injected over a Superdex 200 16/60 gel filtration column (GE
  • Retrovirus was prepared by taking 1.5 µg of each pCLNCX vector and 1.5 µg pCMV-VSV-G and 20 µL of Roche X-tremeGeneHP DNA transfection reagent to transfect HEK-293RTV cells. The medium was replaced after 1 day of transfection and the following day the culture supernatant was collected and filtered through a 0.5 µm filter.
  • MUM2C cells stably overexpressing IDH1 R132H were seeded at 1.5 × 10^6 cells/150 mm dish. The following day the indicated compounds (50 mM stock solutions in DMSO) or
  • DMSO fetal sulfate
  • the cells were washed in ice-cold PBS and collected by scraping in ice-cold PBS and centrifugation (1,400 g, 3 min, 4 °C).
  • the cell pellets were then resuspended in 100 µL ice-cold PBS followed by sonication and centrifugation at 16,000 g for 10 min. Lysates were then buffer exchanged into
  • IDH1 buffer (40 mM Tris, pH 7.4, 2 mM MgCl2) with 0.5 mL ZEBA spin desalting columns (Thermo Fisher, 89882). The protein concentrations were adjusted to 3.5 mg/mL and 25 µL of the lysate was mixed with 25 µL of the reaction mixture (2.5 mM NADPH and 2.5 mM α-ketoglutarate in IDH1 buffer) and the reaction was allowed to proceed for 4 h, at which point the reaction mixtures were quenched with 50 µL cold methanol, followed by a centrifugation
  • the injection volume was 25 µL.
  • MS analysis was performed on an Agilent G6410B tandem mass spectrometer with ESI source.
  • the dwell time for 2-HG was set to 100 ms, and the collision energy for 2-HG was set to 5.
  • the capillary was set to 4 kV, and the fragmentor was set to 100 V.
  • the drying gas temperature was 350 °C, the drying gas flow rate was 11 L/min and the nebulizer pressure was 35 psi.
  • the mass spectrometer was run in MRM mode, monitoring the transition of m/z from 146.7 to 129 for 2-HG (negative ionization mode). Treatments were conducted in triplicate. Background 2-HG production, calculated from the 'mock' GFP-overexpressing cells, was subtracted from the total 2-HG production.
  • TIGAR activity assay was conducted as described in Gerin et al. The Biochemical Journal 458:439-448 (2014). Formation of 3PG (3-phosphoglycerate) from 23BPG (2,3-bisphosphoglycerate) was measured spectrophotometrically on a TECAN plate reader, measuring the decrease in absorbance at 340 nm in a clear, flat-bottom 96-well microplate (Corning® Costar®). 2 µL of recombinant TIGAR (10 mg/mL) was diluted into 1 mL dilution buffer (25 mM HEPES, pH 7.1, 25 mM KCl, 1 mM MgCl2).
  • 25 µL of diluted protein was incubated for 1 h with the indicated concentration of compound (1 µL, 25× stock in DMSO). Then 75 µL of assay mixture comprised of 25 mM HEPES (pH 7.1), 25 mM KCl, 1 mM MgCl2, 0.5 mM NADH, 1 mM DTT, 1 mM 23BPG, 1 mM ATP-Mg, and the equivalent of 1 µL each of rabbit muscle GAPDH (4000 units/mL, Sigma, G5537) and yeast PG kinase (6300 units/mL, Sigma, P7634) was added to the protein and the decrease in absorbance was monitored at 340 nm. The background, calculated from samples lacking TIGAR, was subtracted from samples containing TIGAR. Experiments were performed in quadruplicate.
  • PRMT1 assays were conducted as described in Weerapana et al. Nature 468:790-795 (2010).
  • Recombinant human PRMT1 (0.85 µM, wild type or C101S mutant) in 25 µL methylation buffer (20 mM Tris, pH 8.0, 200 mM NaCl, 0.4 mM EDTA) was pre-incubated with indicated fragments for 1 h and methylation activity was monitored after addition of 1 mg of recombinant histone 4 (NEB, M2504S) and 3H-SAM (2 ⁇ ). Reactions were further incubated for 60 min at ambient temperature and stopped with 4× SDS sample buffer. SDS-PAGE gels were fixed with 10% acetic acid/10% methanol (v/v), washed, and incubated with Amplify reagent (Amersham) before exposing to film at -80 °C for 3 days.
  • kinase activity assay protocol was conducted as described in Wang et al. ACS Chemical Biology 9:2194-2198 (2014).
  • Kinase assay buffers, myelin basic protein (MBP) substrate and ATP stock solution were purchased from SignalChem.
  • Radio-labeled [γ-33P]ATP was purchased from PerkinElmer. 250 µL of HEK-293T soluble lysates (8 mg/mL), stably overexpressing WT, C22A or K45M MLTK, were labeled for 1 h with 100 µM fragment or DMSO. The samples were then individually immunoprecipitated with 20 µL flag resin slurry per sample and then eluted with 15 µL
  • the background was determined from the K45M inactive mutant MLTK activity level, which was subtracted from the WT and C22A samples. Relative activities for WT and C22A were normalized to their respective DMSO treated samples. Experiments were performed in triplicate.
  • Caspase 3 and 8 assays were conducted with CASP8 activity assay kit (BioVision, K112-100) and Caspase 3 activity assay kit (Invitrogen, EnzChek® Caspase-3 Assay Kit), following the manufacturer's instructions. Briefly, recombinant Caspase 3 (10 µM) was added to soluble Ramos lysates (1 mg/mL) to a 100 nM final concentration of protease. Caspase 8 (30 µM) was added to soluble Ramos lysates to a 1 µM final concentration of protease.
  • CASP8 CASP3 and PARP
  • cell pellets were resuspended in cell lysis buffer (BioVision, 1067-100) with 1× cOmplete protease inhibitor (Roche) and allowed to incubate on ice for 30 min prior to centrifugation (10 min, 16,000 g).
  • cell pellets were resuspended in PBS and lysed with sonication prior to centrifugation (10 min, 16,000 g). The proteins were then resolved by SDS-PAGE and transferred to nitrocellulose membranes, blocked with 5% BSA in TBST and probed with the indicated antibodies.
  • the primary antibodies and the dilutions used are as follows: anti-parp (Cell Signaling, 9532, 1:1000), anti-casp3 (Cell Signaling, 9662, 1:500), anti-casp8 (Cell Signaling, 9746, 1:500), anti-IDH1 (Cell Signaling, 3997s, 1:500), anti-actin (Cell Signaling, 3700, 1:3000), anti-gapdh (Santa Cruz, sc-32233, 1:2000), anti-flag (Sigma Aldrich, F1804, 1:3000).
  • Blots were incubated with primary antibodies overnight at 4 °C with rocking and were then washed (3 × 5 min, TBST) and incubated with secondary antibodies (LICOR, IRDye® 800CW or IRDye® 800LT, 1:10,000) for 1 h at ambient temperature. Blots were further washed (3 × 5 min, TBST) and visualized on a LICOR Odyssey Scanner.
  • Prediction failures were due to the approximations of the rigid model used with highly flexible/solvent exposed loop regions (STAT1:C255, PDB ID: 1YVL; HAT1:C101, PDB ID: 2P0W; ZAP70:C117, PDB ID: 4K2R), or with partially buried residues (SARS:C438, PDB ID: 4187; PAICS:C374, PDB ID: 2H31).
  • the simulation of some degree of flexibility improves the success rate.
  • the method was limited by availability and quality of crystallographic structures, when sequences were not fully resolved in available models (XPO1:C34, C1070, PDB ID: 3GB8,
  • General Procedure A was used for the synthesis of one or more of the small molecule fragments and/or cysteine-reactive probes described herein.
  • the amine was dissolved in anhydrous CH2Cl2 (0.2 M) and cooled to 0 °C.
  • anhydrous pyridine 1.5 equiv.
  • chloroacetyl chloride 1.5 equiv.
  • General Procedure A1 is similar to General Procedure A except triethylamine (3 equiv.) was used instead of pyridine.
  • General Procedure A2 is similar to General Procedure A except N-methylmorpholine (3 equiv.) was used instead of pyridine.
  • General Procedure B was used for the synthesis of one or more of the small molecule fragments and/or cysteine-reactive probes described herein.
  • the amine was dissolved in anhydrous CH2Cl2 (0.2 M) and cooled to 0 °C.
  • triethylamine TEA, 1.5 equiv.
  • acryloyl chloride 1.5 equiv.
  • reaction was quenched with H2O (1 mL), diluted with CH2Cl2 (20 mL), and washed twice with saturated NaHCO3 (100 mL).
  • the organic layer was passed through a plug of silica, after which, the eluant was concentrated in vacuo and purified by preparatory thin layer or flash column chromatography to afford the desired product.
  • General Procedure C was used for the synthesis of one or more of the small molecule fragments and/or cysteine-reactive probes described herein.
  • Fmoc-Lys(N3)-OH (Anaspec) (500 mg, 1.26 mmol, 1.26 equiv.) was coupled to the resin overnight at room temperature with DIEA (113 µL) and 2-(6-chloro-1H-benzotriazole-1-yl)-1,1,3,3-tetramethylaminium hexafluorophosphate (HCTU; 1.3 mL of 0.5 M stock in DMF) followed by a second overnight coupling with Fmoc-Lys(N3)-OH (500 mg, 1.26 mmol, 1.26 equiv.), DIEA (113 µL), O-(7-azabenzotriazol-1-yl)-N,N,N',N'-tetramethyluronium hexafluorophosphate (HATU)
  • Unmodified resin was then capped (2 × 30 min) with Ac2O (400 µL) and DIEA (700 µL) in DMF, after which the resin was washed with DMF (2 × 1 min).
  • Deprotection with 4-methylpiperidine in DMF (50% v/v, 2 × 5 mL, 1 min) and coupling cycles (4 equiv. Fmoc-protected amino acid (EMD Biosciences) in DMF) with HCTU (2 mL, 0.5 M in DMF) and DIEA (347.7 µL) were then repeated for the remaining amino acids.
  • the product was purified by silica gel chromatography, utilizing a gradient of 5 to 10 to 15 to 20% ethyl acetate in hexanes to yield the desired product (24 mg, 44%).
  • the reaction is performed with 2.5 equiv. of sodium iodide, in which case re-subjection is not necessary, and purification by PTLC is accomplished in 30% EtOAc/hexanes as eluent.
  • SI-2 was prepared according to Thoma et al, J. Med. Chem. 47: 1939-1955 (2004).
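By way of non-limiting illustration, the custom 13-7 interaction potential listed above for the covalent docking (nucleophile sulfur to reactive ligand carbon, equilibrium distance 1.8 Å, well depth between 1.0 and 0.175) can be sketched as follows. The sketch assumes the generalized n-m Lennard-Jones form, in which the coefficients are chosen so that the energy minimum equals the negative well depth at the equilibrium distance; the function name, argument names, and the choice of this exact functional form are assumptions for illustration and are not taken from the disclosure.

```python
def pair_potential(r, r_eq=1.8, eps=1.0, n=13, m=7):
    """Generalized n-m Lennard-Jones-type pair potential (illustrative).

    The coefficients are chosen so that the minimum of the curve sits at
    r = r_eq and has the value -eps. With n=13 and m=7 this gives a
    "13-7" potential. r and r_eq are distances in angstroms; eps is the
    well depth in whatever energy unit the caller uses.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    if n <= m:
        raise ValueError("repulsive exponent n must exceed attractive exponent m")
    c_rep = (m / (n - m)) * eps * r_eq ** n  # repulsive coefficient
    c_att = (n / (n - m)) * eps * r_eq ** m  # attractive coefficient
    return c_rep / r ** n - c_att / r ** m


if __name__ == "__main__":
    # Compare the deepest and shallowest wells named in the text.
    for well_depth in (1.0, 0.175):
        energies = [pair_potential(r, eps=well_depth) for r in (1.6, 1.8, 2.0, 2.5)]
        print(well_depth, [round(e, 3) for e in energies])
```

A deeper well rewards poses that bring the reactive carbon close to the sulfur more strongly, which is one way to encode a more reactive ligand; a shallower well weakens that bias.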

Abstract

Disclosed herein are methods, compositions, probes, polypeptides, assays, and kits for identifying a cysteine containing protein as a binding target for a small molecule fragment. Also disclosed herein are methods, compositions, and probes for mapping a biologically active cysteine site on a protein and screening a small molecule fragment for interaction with a cysteine containing protein.

Description

CYSTEINE REACTIVE PROBES AND USES THEREOF
CROSS-REFERENCE
[0001] This application claims the benefit of U.S. Provisional Application No. 62/345,710, filed on June 3, 2016, and U.S. Provisional Application No. 62/244,881, filed on October 22, 2015, each of which is incorporated herein by reference in its entirety.
STATEMENT AS TO FEDERALLY SPONSORED RESEARCH
[0002] The invention disclosed herein was made, at least in part, with U.S. government support under Grant Nos. CA087660, GM090294, GM108208, and GM069832 by the National Institutes of Health. Accordingly, the U.S. Government has certain rights in this invention.
SEQUENCE LISTING
[0003] The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on October 19, 2016, is named 48054-702_601_SL.txt and is 372,838 bytes in size.
BACKGROUND OF THE INVENTION
[0004] Protein function assignment has benefited from genetic methods, such as target gene disruption, RNA interference, and genome editing technologies, which selectively disrupt the expression of proteins in native biological systems. Chemical probes offer a complementary way to perturb proteins that has the advantage of producing graded (dose-dependent) gain- (agonism) or loss- (antagonism) of-function effects that are introduced acutely and reversibly in cells and organisms. Small molecules present an alternative method to selectively modulate proteins and to serve as leads for the development of novel therapeutics.
SUMMARY OF THE INVENTION
[0005] Disclosed herein, in certain embodiments, is a method of identifying a cysteine containing protein as a binding target for a small molecule fragment, comprising: (a) obtaining a set of cysteine-reactive probe-protein complexes from a sample treated with a cysteine-reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; (c) based on step b), identifying a cysteine containing protein as the binding target for the small molecule fragment. In some embodiments, the method further comprises assigning a value to each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes for identifying a cysteine containing protein as the binding target for the small molecule fragment, wherein the value is determined based on the proteomic analysis means of step b). In some embodiments, the sample comprises a first cell solution and a second cell solution. In some embodiments, the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes. In some embodiments, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer. In some embodiments, the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes. In some embodiments, the first cysteine-reactive probe and the second cysteine-reactive probe are the same. In some embodiments, the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine- reactive probe-protein complexes. 
In some embodiments, cells from the second cell solution are grown in a media (e.g., an isotopically enriched media). In some embodiments, cells from the first cell solution are grown in a media (e.g., an isotopically enriched media). In some embodiments, cells from both the first cell solution and the second cell solution are grown in two different isotopically enriched media so that cells from the first cell solution is distinguishable from cells obtained from the second cell solution. In other embodiments, cells from only one of the cell solutions (e.g., either the first cell solution or the second cell solution) are grown in an isotopically enriched media. In some embodiments, the method further comprises contacting the first cell solution with a first set of small molecule fragments and a complementing set of cysteine-reactive probes wherein each small molecule fragment competes with its complementing cysteine-reactive probe for binding with a cysteine residue, and wherein each small molecule fragment and each complementing cysteine-reactive probe are different within each respective set. In some embodiments, the method further comprises contacting the second cell solution with a second set of cysteine-reactive probes wherein the second set of cysteine-reactive probes is the same as the complementing set of cysteine-reactive probes, and wherein each cysteine-reactive probe is different within the set. In some embodiments, the first set of cysteine-reactive probes generates a third group of cysteine-reactive probe-protein complexes and the second set of cysteine-reactive probes generates a fourth group of cysteine-reactive probe-protein complexes. In some embodiments, the cysteine containing protein comprises a biologically active cysteine residue. In some embodiments, the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some embodiments, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some embodiments, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. 
In some embodiments, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine. In some embodiments, the biologically active cysteine site is a non-active site cysteine. In some embodiments, the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in an active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in a pro-active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein. In some embodiments, the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue. In some embodiments, the structural environment is a hydrophobic environment or a hydrophilic environment. In some embodiments, the structural environment is a charged environment. In some embodiments, the structural environment is a nucleophilic environment. In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein. In some embodiments, the enzyme comprises kinases, proteases, or deubiquitinating enzymes. In some embodiments, the protease is a cysteine protease. 
In some embodiments, the cysteine protease comprises caspases. In some embodiments, the signaling protein comprises vascular endothelial growth factor. In some embodiments, the signaling protein comprises a redox signaling protein. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000006_0001
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment is a small molecule fragment illustrated in Fig. 3. In some embodiments, the small molecule fragment is a specific inhibitor or a pan inhibitor. In some embodiments, the cysteine-reactive probe is a cysteine-
reactive probe of Formula (II):
Figure imgf000006_0002
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some embodiments, the binding moiety is a small molecule fragment obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some embodiments, the affinity handle comprises an alkyne or an azide group. In some embodiments, the affinity handle is further conjugated to an affinity ligand. In some embodiments, the affinity ligand comprises a chromophore, a labeling group, or a combination thereof. In some embodiments, the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate. In some embodiments, the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzoxadiazole, benzoxadiazole, pyrene derivatives, cascade blue, oxazine derivatives, Nile red, Nile blue, cresyl violet, oxazine 170, acridine derivatives, proflavin, acridine orange, acridine yellow, arylmethine derivatives, auramine, crystal violet, malachite green, tetrapyrrole derivatives, porphin, phthalocyanine, bilirubin, 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, 3-phenyl-7-isocyanatocoumarin, N-(p-(2-benzoxazolyl)phenyl)maleimide, stilbenes, pyrenes, 6-FAM (Fluorescein), 6-FAM (NHS Ester), 5(6)-FAM, 5-FAM, Fluorescein dT, 5-TAMRA-cadaverine, 2-aminoacridone, HEX, JOE (NHS Ester), MAX, TET, ROX, TAMRA, TAMRA™ (NHS Ester), TEX 615, ATTO™ 488, ATTO™ 532, ATTO™ 550, ATTO™ 565, ATTO™ Rho101, ATTO™ 590, ATTO™ 633, ATTO™ 647N, TYE™ 563, TYE™ 665, or TYE™ 705. In some embodiments, the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof. In some embodiments, the affinity handle moiety further comprises a chromophore. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the second cell solution further comprises a control. 
In some embodiments, the control is dimethyl sulfoxide (DMSO). In some embodiments, the proteomic analysis means comprises a mass spectroscopy method. In some embodiments, the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method. In some embodiments, the method further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification. In some embodiments, the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification. In some embodiments, the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot. In some embodiments, the mass spectroscopy method is a MALDI-TOF based method. In some embodiments, the value assigned to each of the cysteine containing protein is obtained from the mass spectroscopy analysis. In some embodiments, the value assigned to each of the cysteine containing protein is the area-under-the curve from a plot of signal intensity as a function of mass-to-charge ratio. In some embodiments, the identifying in step c) further comprises (i) locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complex and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complex; and (ii) calculating a ratio between the two values assigned to the same cysteine containing protein. In some embodiments, the ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the ratio of greater than 3 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein. In some embodiments, the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the cell is obtained from a tumor cell line. In some embodiments, the cell is obtained from a MDA-MB-231, Ramos, or Jurkat cell line. In some embodiments, the cell is obtained from a tumor sample. In some embodiments, the sample is a tissue sample. In some embodiments, the method is an in situ method. In some embodiments, the cysteine-reactive probe is not 4-hydroxynonenal or 15-deoxy-Δ12,14-prostaglandin J2.
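By way of non-limiting illustration, the ratio and percentage-of-inhibition comparisons described above can be sketched as follows. The sketch assumes that the ratio is taken as the value from the probe-only (control, e.g., DMSO) group divided by the value from the fragment-pretreated group, so that a larger ratio means the fragment blocked more probe labeling; the text does not fix the direction of the ratio, and the class name, function names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeSignal:
    """Area-under-the-curve values for one cysteine containing protein."""
    protein: str
    control_auc: float  # probe-only sample (assumed numerator)
    treated_auc: float  # fragment-pretreated sample (assumed denominator)


def competition_ratio(signal: ProbeSignal) -> float:
    """Ratio of control to treated signal; infinite if labeling was fully blocked."""
    if signal.treated_auc <= 0:
        return float("inf")
    return signal.control_auc / signal.treated_auc


def percent_inhibition(signal: ProbeSignal) -> float:
    """Percentage loss of probe labeling relative to the control sample."""
    if signal.control_auc <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - signal.treated_auc / signal.control_auc)


def candidates(signals, ratio_cutoff=2.0):
    """Names of proteins whose ratio exceeds the cutoff (e.g., 2 or 3)."""
    return [s.protein for s in signals if competition_ratio(s) > ratio_cutoff]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = [
        ProbeSignal("protein_A", control_auc=100.0, treated_auc=25.0),
        ProbeSignal("protein_B", control_auc=100.0, treated_auc=50.0),
        ProbeSignal("protein_C", control_auc=100.0, treated_auc=90.0),
    ]
    print(candidates(example, ratio_cutoff=2.0))
    print([percent_inhibition(s) for s in example])
```

Under this assumed convention a ratio of 2 corresponds to 50% inhibition and a ratio of 4 to 75% inhibition, so the ratio cutoffs and the percentage cutoffs express the same comparison on two scales.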
[0006] Disclosed herein, in certain embodiments, is a method of screening a small molecule fragment for interaction with a cysteine containing protein, comprising: (a) harvesting a set of cysteine-reactive probe-protein complexes from a sample treated with a cysteine-reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), identifying the small molecule fragment as interacting with the cysteine containing protein.
In some embodiments, the method further comprises assigning a value to each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes prior to identifying the small molecule fragment as interacting with the cysteine containing protein, wherein the value is determined based on the proteomic analysis means of step b). In some embodiments, the sample comprises a first cell solution and a second cell solution. In some embodiments, the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes. In some embodiments, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer. In some embodiments, the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe- protein complexes. In some embodiments, the first cysteine-reactive probe and the second cysteine-reactive probe are the same. In some embodiments, the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine-reactive probe- protein complexes. In some embodiments, cells from the second cell solution are grown in a media (e.g., an isotopically enriched media). In some embodiments, cells from the first cell solution are grown in a media (e.g., an isotopically enriched media). In some embodiments, cells from both the first cell solution and the second cell solution are grown in two different isotopically enriched media so that cells from the first cell solution is distinguishable from cells obtained from the second cell solution. In other embodiments, cells from only one of the cell solutions (e.g., either the first cell solution or the second cell solution) are grown in an isotopically enriched media. 
In some embodiments, the method further comprises contacting the first cell solution with a first set of small molecule fragments and a complementing set of cysteine-reactive probes wherein each small molecule fragment competes with its complementing cysteine-reactive probe for binding with a cysteine residue, and wherein each small molecule fragment and each complementing cysteine-reactive probe are different within each respective set. In some embodiments, the method further comprises contacting the second cell solution with a second set of cysteine-reactive probes wherein the second set of cysteine-reactive probes is the same as the complementing set of cysteine-reactive probes, and wherein each cysteine-reactive probe is different within the set. In some embodiments, the first set of cysteine-reactive probes generates a third group of cysteine-reactive probe-protein complexes and the second set of cysteine-reactive probes generates a fourth group of cysteine-reactive probe-protein complexes. In some embodiments, the cysteine containing protein comprises a biologically active cysteine residue. In some embodiments, the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some embodiments, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some embodiments, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. In some embodiments, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine. In some embodiments, the biologically active cysteine site is a non-active site cysteine. In some embodiments, the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in an active form. 
In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in a pro-active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein. In some embodiments, the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue. In some embodiments, the structural environment is a hydrophobic environment or a hydrophilic environment. In some embodiments, the structural environment is a charged environment. In some embodiments, the structural environment is a nucleophilic environment. In some embodiments, the cysteine containing protein is selected from an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some embodiments, the cysteine containing protein is selected from an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein. In some embodiments, the enzyme comprises kinases, proteases, or deubiquitinating enzymes. In some embodiments, the protease is a cysteine protease. In some embodiments, the cysteine protease comprises caspases. In some embodiments, the signaling protein comprises vascular endothelial growth factor. In some embodiments, the signaling protein comprises a redox signaling protein. In some embodiments, the cysteine containing protein is selected from Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. 
In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the cysteine containing protein is TIGAR, IMPDH2, IDH1, IDH2, BTK, ZAK, TGM2, Map2k7, XPO1, Casp5, Casp8, ERCC3, Park 7 (Toxoplasma DJ-1), GSTO1, ALDH2, CTSZ, STAT1, STAT3, SMAD2, RBPJ, FOXK1, IRF4, IRF8, GTF3C1, or TCERG1. In some embodiments, the small molecule fragment is a small molecule fragment of Formula (I):
[Chemical structure image: Formula (I)]
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment is a small molecule fragment illustrated in Fig. 3. In some embodiments, the small molecule fragment is a specific inhibitor or a pan inhibitor. In some embodiments, the cysteine-reactive probe is a cysteine-
reactive probe of Formula (II):
[Chemical structure image: Formula (II)]
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some embodiments, the binding moiety is a small molecule fragment obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle comprises a carbodiimide, N- hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some embodiments, the affinity handle comprises an alkyne or an azide group. In some embodiments, the affinity handle is further conjugated to an affinity ligand. In some embodiments, the affinity ligand comprises a chromophore, a labeling group, or a combination thereof. In some embodiments, the
chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate. In some embodiments, the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzoxadiazole, benzoxadiazole, pyrene derivatives, cascade blue, oxazine derivatives, Nile red, Nile blue, cresyl violet, oxazine 170, acridine derivatives, proflavin, acridine orange, acridine yellow, arylmethine derivatives, auramine, crystal violet, malachite green, tetrapyrrole derivatives, porphin, phthalocyanine, bilirubin, 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, 3-phenyl-7-isocyanatocoumarin, N-(p-(2-benzoxazolyl)phenyl)maleimide, stilbenes, pyrenes, 6-FAM (Fluorescein), 6-FAM (NHS Ester), 5(6)-FAM, 5-FAM, Fluorescein dT, 5-TAMRA-cadaverine, 2-aminoacridone, HEX, JOE (NHS Ester), MAX, TET, ROX, TAMRA, TAMRA™ (NHS Ester), TEX 615, ATTO™ 488, ATTO™ 532, ATTO™ 550, ATTO™ 565, ATTO™ Rho101, ATTO™ 590, ATTO™ 633, ATTO™ 647N, TYE™ 563, TYE™ 665, or TYE™ 705. In some embodiments, the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof. In some embodiments, the affinity handle moiety further comprises a chromophore. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the second cell solution further comprises a control. In some embodiments, the control is dimethyl sulfoxide (DMSO). In some embodiments, the proteomic analysis means comprises a mass spectroscopy method. In some embodiments, the mass spectroscopy method is a MALDI-TOF based method. In some embodiments, the mass spectroscopy method is a liquid-chromatography-mass spectrometry
(LC-MS) method. In some embodiments, the method further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification. In some embodiments, the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification. In some embodiments, the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot. In some
embodiments, the value assigned to each of the cysteine containing proteins is obtained from the mass spectroscopy analysis. In some embodiments, the value assigned to each of the cysteine containing proteins is the area under the curve from a plot of signal intensity as a function of mass-to-charge ratio. In some embodiments, the identifying in step c) further comprises (i) locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complexes and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complexes; and (ii) calculating a ratio between the two values assigned to the same cysteine containing protein. In some embodiments, the ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the ratio of greater than 3 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein. In some embodiments, the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the cell is obtained from a tumor cell line. In some embodiments, the cell is obtained from a MDA-MB-231, Ramos, or Jurkat cell line. In some embodiments, the cell is obtained from a tumor sample. In some embodiments, the sample is a tissue sample. In some embodiments, the method is an in situ method.
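The ratio and percentage-of-inhibition criteria described above can be illustrated with a minimal Python sketch. The text does not fix the direction of the ratio or give a formula for percentage of inhibition, so the sketch assumes the ratio is the control (probe-only, second group) value divided by the fragment-treated (first group) value, and that percentage of inhibition is the fractional loss of probe signal relative to the control. The function names and the example values are hypothetical and are not taken from the disclosure.

```python
def competition_ratio(treated_value: float, control_value: float) -> float:
    """Ratio of control signal to fragment-treated signal (assumed direction)."""
    if treated_value <= 0:
        # Probe labeling fully blocked: treat the ratio as unbounded.
        return float("inf")
    return control_value / treated_value


def percent_inhibition(treated_value: float, control_value: float) -> float:
    """Assumed definition: percent loss of probe signal relative to the control."""
    if control_value <= 0:
        raise ValueError("control_value must be positive")
    return max(0.0, 100.0 * (1.0 - treated_value / control_value))


def is_candidate(treated_value: float, control_value: float,
                 ratio_threshold: float = 2.0) -> bool:
    """Flag a cysteine containing protein whose ratio exceeds the threshold."""
    return competition_ratio(treated_value, control_value) > ratio_threshold


# Hypothetical area-under-the-curve values for one protein.
treated, control = 25.0, 100.0
print(competition_ratio(treated, control))
print(percent_inhibition(treated, control))
print(is_candidate(treated, control, ratio_threshold=3.0))
```

Under these assumptions the two criteria are linked: a ratio R corresponds to a percentage of inhibition of 100 × (1 − 1/R), so a ratio of 2 corresponds to 50% inhibition.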
[0007] Disclosed herein, in certain embodiments, is a method of mapping a biologically active cysteine site on a protein, comprising (a) harvesting a set of cysteine-reactive probe-protein complexes from a sample treated with a cysteine-reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), mapping the biologically active cysteine site on the protein. In some embodiments, the sample comprises a first cell solution and a second cell solution. In some embodiments, the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes. In some embodiments, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer. In some embodiments, the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes. In some embodiments, the first cysteine-reactive probe and the second cysteine-reactive probe are the same. In some embodiments, the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some embodiments, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some embodiments, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue.
In some embodiments, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine. In some embodiments, the biologically active cysteine site is a non-active site cysteine. In some embodiments, the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in an active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in a pro-active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein. In some embodiments, the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue. In some embodiments, the structural environment is a hydrophobic environment or a hydrophilic environment. In some embodiments, the structural environment is a charged environment. In some embodiments, the structural environment is a nucleophilic environment. In some embodiments, the protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some embodiments, the protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein. In some embodiments, the enzyme comprises kinases, proteases, or deubiquitinating enzymes. In some embodiments, the protease is a cysteine protease. In some embodiments, the cysteine protease comprises caspases.
In some embodiments, the signaling protein comprises vascular endothelial growth factor. In some embodiments, the signaling protein comprises a redox signaling protein. In some embodiments, the protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the small molecule fragment is a small molecule fragment of Formula (I):
[Chemical structure image: Formula (I)]
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment is a small molecule fragment illustrated in Fig. 3. In some embodiments, the small molecule fragment is a specific inhibitor or a pan inhibitor. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe of
Formula (II):
[Chemical structure image: Formula (II)]
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some embodiments, the binding moiety is a small molecule fragment obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some embodiments, the affinity handle comprises an alkyne or an azide group. In some embodiments, the affinity handle is further conjugated to an affinity ligand. In some embodiments, the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
In some embodiments, the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate. In some embodiments, the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzoxadiazole, benzoxadiazole, pyrene derivatives, cascade blue, oxazine derivatives, Nile red, Nile blue, cresyl violet, oxazine 170, acridine derivatives, proflavin, acridine orange, acridine yellow, arylmethine derivatives, auramine, crystal violet, malachite green, tetrapyrrole derivatives, porphin, phthalocyanine, bilirubin, 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, 3-phenyl-7-isocyanatocoumarin, N-(p-(2-benzoxazolyl)phenyl)maleimide, stilbenes, pyrenes, 6-FAM (Fluorescein), 6-FAM (NHS Ester), 5(6)-FAM, 5-FAM, Fluorescein dT, 5-TAMRA-cadaverine, 2-aminoacridone, HEX, JOE (NHS Ester), MAX, TET, ROX, TAMRA, TAMRA™ (NHS Ester), TEX 615, ATTO™ 488, ATTO™ 532, ATTO™ 550, ATTO™ 565, ATTO™ Rho101, ATTO™ 590, ATTO™ 633, ATTO™ 647N, TYE™ 563, TYE™ 665, or TYE™ 705. In some embodiments, the labeling group is biotin moiety, streptavidin moiety, bead, resin, a solid support, or a combination thereof. In some embodiments, the affinity handle moiety further comprises a chromophore. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the second cell solution further comprises a control.
In some embodiments, the control is dimethyl sulfoxide (DMSO). In some embodiments, the proteomic analysis means comprises a mass spectroscopy method. In some embodiments, the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method. In some embodiments, the method further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification. In some embodiments, the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification. In some embodiments, the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot. In some embodiments, the mass
spectroscopy method is a MALDI-TOF based method. In some embodiments, the cell is obtained from a tumor cell line. In some embodiments, the cell is obtained from a MDA-MB-231, Ramos, or Jurkat cell line. In some embodiments, the cell is obtained from a tumor sample. In some embodiments, the sample is a tissue sample. In some embodiments, the method is an in situ method.
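The 10 Å proximity criterion used above to distinguish an active site cysteine from a non-active site cysteine can be sketched as a simple distance check. The text does not state which atoms the distance is measured between; this Python sketch assumes the distance is taken from the cysteine sulfur atom to the nearest atom of the active-site ligand or residue, with coordinates in ångströms. The function names and the coordinates are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def min_distance(atom: Point, others: Iterable[Point]) -> float:
    """Smallest Euclidean distance (in Å) from one atom to a set of atoms."""
    return min(math.dist(atom, other) for other in others)


def classify_cysteine(sulfur: Point, ligand_atoms: Iterable[Point],
                      cutoff: float = 10.0) -> str:
    """Label a cysteine by its distance to an active-site ligand or residue.

    Assumption: the distance is measured from the cysteine sulfur atom to
    the nearest ligand (or active-site residue) atom.
    """
    if min_distance(sulfur, ligand_atoms) <= cutoff:
        return "active-site"
    return "non-active-site"


# Hypothetical coordinates in Å.
sulfur_atom = (0.0, 0.0, 0.0)
print(classify_cysteine(sulfur_atom, [(3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]))
print(classify_cysteine(sulfur_atom, [(0.0, 0.0, 12.0)]))
```

In practice the coordinates would come from a solved or modeled structure; that parsing step is omitted here.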
[0008] Disclosed herein, in certain embodiments, is a composition comprising: a small
molecule fragment of Formula (I):
[Chemical structure image: Formula (I)]
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety; and a cysteine containing protein wherein the cysteine containing protein is covalently bonded to the small molecule fragment. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety.
[0009] Disclosed herein, in certain embodiments, is a composition comprising: a cysteine-
reactive probe of Formula (II):
[Chemical structure image: Formula (II)]
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety; and a cysteine containing protein wherein the cysteine containing protein is covalently bonded to the cysteine-reactive probe. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some embodiments, the binding moiety is a small molecule fragment obtained from a compound library. In some embodiments, the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some embodiments, the affinity handle is further conjugated to an affinity ligand. In some embodiments, the affinity handle moiety further comprises a chromophore. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
[0010] Disclosed herein, in certain embodiments, is a composition comprising: an isolated sample wherein the isolated sample is an isolated cell or a tissue sample; and a cysteine-reactive probe to be assayed for its ability to interact with a cysteine containing protein expressed in the isolated sample. In some embodiments, the composition further comprises contacting the isolated sample with a small molecule fragment for an extended period of time prior to incubating the isolated sample with the cysteine-reactive probe to generate a cysteine-reactive probe-protein complex. In some embodiments, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
[0011] Disclosed herein, in certain embodiments, is an isolated treated cell comprising a cysteine-reactive probe covalently attached to a cysteine containing protein. In some
embodiments, the isolated treated cell further comprises a set of cysteine-reactive probes wherein each of the cysteine-reactive probes is covalently attached to a cysteine containing protein.
[0012] Disclosed herein, in certain embodiments, is an isolated treated cell comprising a small molecule fragment covalently attached to a cysteine containing protein. In some embodiments, the isolated treated cell further comprises a set of small molecule fragments wherein each of the small molecule fragments is covalently attached to a cysteine containing protein. In some embodiments, the isolated treated cell further comprises a cysteine-reactive probe. In some embodiments, the isolated treated cell further comprises a set of cysteine-reactive probes.
[0013] Disclosed herein, in certain embodiments, is an isolated treated population of cells comprising a set of cysteine-reactive probes covalently attached to cysteine containing proteins. Also disclosed herein, in certain embodiments, is an isolated treated population of cells comprising a set of small molecule fragments covalently attached to cysteine containing proteins. In some embodiments, the isolated treated population of cells further comprises a set of cysteine-reactive probes.
[0014] Disclosed herein, in certain embodiments, is an isolated and purified polypeptide comprising at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide comprises at least 95% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide comprises 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide has 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide is at most 50 amino acids in length. Also disclosed herein is a polypeptide probe for screening a small molecule fragment, comprising an isolated and purified polypeptide described herein.
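The sequence identity criterion above can be illustrated with a minimal Python sketch. The text does not specify how identity is computed, so the sketch assumes a simple ungapped, position-by-position comparison over every pair of equal-length contiguous windows; an alignment tool would normally be used in practice. The function names and the peptide sequences are hypothetical and do not come from Tables 1-3 or 8-9.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int = 7) -> float:
    """Highest ungapped identity between any contiguous windows of the given
    length in the query and the reference (assumed comparison scheme)."""
    if window < 1 or len(query) < window or len(reference) < window:
        raise ValueError("both sequences must be at least `window` residues long")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_window = reference[i:i + window]
        for j in range(len(query) - window + 1):
            best = max(best, percent_identity(query[j:j + window], ref_window))
    return best


# Hypothetical sequences sharing the seven-residue stretch "CDEFGHI".
print(best_window_identity("AAACDEFGHIK", "XXCDEFGHIYY", window=7))
```

With a window of exactly seven residues, a single mismatch gives about 85.7% identity, so the 90% and 95% thresholds only admit mismatches when the compared stretch is longer than seven residues.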
[0015] Further disclosed herein, in certain embodiments, is a nucleic acid encoding a polypeptide comprising at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encodes a polypeptide comprising at least 95% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encodes a polypeptide comprising 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encodes a polypeptide having 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9.
[0016] Disclosed herein, in certain embodiments, is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment is a small molecule fragment of
Formula (I):
[Chemical structure image: Formula (I)]
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment binds irreversibly to the cysteine containing protein. In some embodiments, the small molecule fragment binds reversibly to the cysteine containing protein.
[0017] Disclosed herein, in certain embodiments, is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment has a molecular weight of about 150 Dalton or higher. In some embodiments, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal.
In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the small molecule fragment is a small molecule
fragment of Formula (I):
[Chemical structure image: Formula (I)]
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the small molecule fragment of Formula (I) has a molecular weight of about 150, 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment binds irreversibly to the cysteine containing protein. In some embodiments, the small molecule fragment binds reversibly to the cysteine containing protein.
[0018] Disclosed herein, in certain embodiments, is a cysteine containing protein-small molecule fragment complex produced by a process comprising contacting a cell solution with a
small molecule fragment of Formula (I):
Figure imgf000021_0001
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety; and wherein the contacting time is between about 5 minutes and about 2 hours. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment of Formula (I) has a molecular weight of about 150, 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the small molecule fragment binds irreversibly to the cysteine containing protein. In some embodiments, the small molecule fragment binds reversibly to the cysteine containing protein.
[0019] Disclosed herein, in certain embodiments, is a modified cysteine containing protein comprising a cysteine-reactive probe having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the cysteine-reactive probe is a cysteine-reactive probe of Formula
(II):
Figure imgf000021_0002
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some embodiments, the binding moiety is a small molecule fragment obtained from a compound library. In some embodiments, the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some
embodiments, the affinity handle is further conjugated to an affinity ligand. In some
embodiments, the affinity handle moiety further comprises a chromophore. In some
embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the cysteine-reactive probe binds irreversibly to the cysteine containing protein. In some embodiments, the cysteine-reactive probe binds reversibly to the cysteine containing protein.
[0020] Disclosed herein, in certain embodiments, is a cysteine-reactive probe of Formula (II):
Figure imgf000022_0001
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety. In some embodiments, the cysteine-reactive probe covalently binds to a cysteine residue on a cysteine containing protein. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the cysteine-reactive probe binds irreversibly to the cysteine containing protein. In some embodiments, the cysteine-reactive probe binds reversibly to the cysteine containing protein.
[0021] Disclosed herein, in certain embodiments, is a compound capable of covalently binding to a cysteine containing protein identified, using the method comprising: (a) obtaining a set of cysteine-reactive probe-protein complexes from a sample wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; (c) based on step b), identifying a cysteine containing protein as the binding target for the compound. In some embodiments, the compound is a small molecule fragment. In some embodiments, the small molecule fragment is a small molecule
fragment of Formula (I):
Figure imgf000023_0001
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from
AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment is a small molecule fragment illustrated in Fig. 3. In some embodiments, the small molecule fragment is a specific inhibitor or a pan inhibitor. In some embodiments, the cysteine containing protein comprises a biologically active cysteine residue. In some embodiments, the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some embodiments, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some embodiments, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. In some embodiments, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine. In some embodiments, the biologically active cysteine site is a non-active site cysteine. In some embodiments, the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in an active form. 
In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein. In some embodiments, the cysteine containing protein exists in a pro-active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein. In some embodiments, the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue. In some embodiments, the structural environment is a hydrophobic environment or a hydrophilic environment. In some embodiments, the structural environment is a charged environment. In some embodiments, the structural environment is a nucleophilic environment. In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein. In some embodiments, the enzyme comprises kinases, proteases, or deubiquitinating enzymes. In some embodiments, the protease is a cysteine protease. In some embodiments, the cysteine protease comprises caspases. In some
embodiments, the signaling protein comprises vascular endothelial growth factor. In some embodiments, the signaling protein comprises a redox signaling protein. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
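The distance criterion described above can be illustrated with a short sketch. It assumes that atomic coordinates in ångströms are already available (for example, from a solved structure) and it treats "about 10 Å" as a hard cutoff, which is a simplification. The function name, the labels, and the coordinates are hypothetical and are not part of the disclosure.

```python
import math

# "About 10 Å" in the text; treated here as a hard cutoff (assumption).
CUTOFF_ANGSTROM = 10.0


def classify_cysteine(cys_xyz, site_xyz, cutoff=CUTOFF_ANGSTROM):
    """Label a cysteine by its distance to an active-site ligand or residue.

    cys_xyz and site_xyz are (x, y, z) coordinates in ångströms.
    Returns the label and the computed distance.
    """
    distance = math.dist(cys_xyz, site_xyz)
    label = "active-site" if distance <= cutoff else "non-active-site"
    return label, distance


# Hypothetical coordinates, for illustration only.
label, distance = classify_cysteine((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
print(label, distance)
```

In practice the relevant distance could be measured between specific atoms (for example, the cysteine sulfur and the nearest ligand atom); the text does not specify this, so the choice is left to the caller.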
[0022] Disclosed herein, in certain embodiments, is a derivative of a cysteine-containing protein having the structure of Formula (I),
Figure imgf000024_0001
(I), wherein the derivation occurs at a cysteine residue; R is selected from: (a); (b)
Figure imgf000025_0001
; wherein R1 is H, C1-C3 alkyl, or aryl; and F is a small molecule fragment moiety. In some embodiments, F has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some
embodiments, the cysteine containing protein is a cysteine containing protein described herein. In some embodiments, the cysteine containing protein is a protein illustrated in Tables 1, 2, 3, 8 or 9. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1. In some embodiments, the cysteine containing protein is a protein illustrated in Table 2. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 8. In some embodiments, the cysteine containing protein is a protein illustrated in Table 9.
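The molecular weight convention described above (counting carbon and hydrogen, optionally nitrogen, oxygen and sulfur, and leaving out halogens) can be sketched as follows. This is a minimal illustration under stated assumptions: the composition is supplied as an element-count mapping, the atomic weights are rounded standard values, and the function name and the `excluded` parameter are hypothetical. Transition metals are not enumerated here; a caller would add them to `excluded`.

```python
# Rounded standard atomic weights (g/mol) for the elements the text counts.
COUNTED = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Elements left out of the total by default (halogens only, in this sketch).
HALOGENS = frozenset({"F", "Cl", "Br", "I"})


def fragment_molecular_weight(composition, excluded=HALOGENS):
    """Sum the weights of C, H, N, O and S atoms, skipping excluded elements.

    composition maps element symbols to atom counts, e.g. {"C": 2, "H": 4}.
    Unknown elements raise an error so that nothing is dropped silently.
    """
    total = 0.0
    for element, count in composition.items():
        if element in COUNTED:
            total += COUNTED[element] * count
        elif element in excluded:
            continue  # e.g. a halogen: not included in the reported weight
        else:
            raise ValueError(f"element not handled in this sketch: {element}")
    return total


# Chloroacetamide, C2H4ClNO, used only as an illustrative composition.
chloroacetamide = {"C": 2, "H": 4, "Cl": 1, "N": 1, "O": 1}
print(round(fragment_molecular_weight(chloroacetamide), 3))
```

With chlorine left out, the reported weight is lower than the full molecular weight of the compound, which is the effect the exclusion language describes.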
[0023] Disclosed herein, in certain embodiments, is a derivative of IDH1 protein having the structure of Formula (I),
Figure imgf000025_0002
wherein the derivation occurs at IDH1 cysteine residue position 269 based on SEQ ID NO: 1; R is selected from: (a); (b)
Figure imgf000025_0003
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0024] Disclosed herein, in certain embodiments, is a derivative of IDH2 protein having the
structure of Formula (I),
Figure imgf000026_0001
wherein the derivation occurs at IDH2
cysteine residue position 308 based on SEQ ID NO: 2; R is selected from: (a)
Figure imgf000026_0002
;(b)
Figure imgf000026_0003
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0025] Disclosed herein, in certain embodiments, is a derivative of caspase-8 protein having the structure of Formula (I),
Figure imgf000026_0005
(I), wherein the derivation occurs at caspase-8 cysteine residue position 360 based on SEQ ID NO: 3; R is selected from: (a)
Figure imgf000026_0004
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0026] Disclosed herein, in certain embodiments, is a derivative of caspase-10 protein having
the structure of Formula (I),
Figure imgf000027_0001
(I), wherein the derivation occurs at caspase-10 cysteine residue position 401 based on SEQ ID NO: 4; R is selected from: (a)
Figure imgf000027_0002
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0027] Disclosed herein, in certain embodiments, is a derivative of PRMT-1 protein having
the structure of Formula (I),
Figure imgf000027_0003
(I), wherein the derivation occurs at
PRMT-1 cysteine residue position 109 based on SEQ ID NO: 5; R is selected from: (a)
Figure imgf000027_0004
Figure imgf000028_0001
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0028] Disclosed herein, in certain embodiments, is a derivative of ZAK protein having the structure of Formula (I),
Figure imgf000028_0005
(I), wherein the derivation occurs at ZAK cysteine residue position 22 based on SEQ ID NO: 6; R is selected from: (a)
Figure imgf000028_0002
; (b)
Figure imgf000028_0003
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0029] Disclosed herein, in certain embodiments, is a derivative of IMPDH2 protein having the structure of Formula (I),
Figure imgf000028_0004
wherein the derivation occurs at IMPDH2 cysteine residue position 140 based on SEQ ID NO: 7; R is selected from: (a)
Figure imgf000029_0001
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0030] Disclosed herein, in certain embodiments, is a derivative of IMPDH2 protein having the structure of Formula (I),
Figure imgf000029_0002
wherein the derivation occurs at IMPDH2 cysteine residue position 331 based on SEQ ID NO: 7; R is selected from: (a)
Figure imgf000029_0003
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0031] Disclosed herein, in certain embodiments, is a derivative of TIGAR protein having the
structure of Formula (I),
Figure imgf000030_0001
(I), wherein the derivation occurs at TIGAR
cysteine residue position 114 based on SEQ ID NO: 8; R is selected from: (a)
Figure imgf000030_0002
Figure imgf000030_0003
; wherein R1 is H, C1-C3 alkyl, or aryl; and F is a small molecule fragment moiety. In some embodiments, F has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3.
[0032] Disclosed herein, in certain embodiments, is a derivative of TIGAR protein having the structure of Formula (I),
Figure imgf000030_0006
(I), wherein the derivation occurs at TIGAR cysteine residue position 161 based on SEQ ID NO: 8; R is selected from: (a)
Figure imgf000030_0004
; (b)
Figure imgf000030_0005
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0033] Disclosed herein, in certain embodiments, is a derivative of PKCθ protein having the structure of Formula (I),
Figure imgf000031_0001
wherein the derivation occurs at PKCθ cysteine residue position 14 based on SEQ ID NO: 9; R is selected from: (a)
Figure imgf000031_0002
; (b)
Figure imgf000031_0003
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0034] Disclosed herein, in certain embodiments, is a derivative of PKCθ protein having the structure of Formula (I),
Figure imgf000031_0006
(I), wherein the derivation occurs at PKCθ cysteine residue position 17 based on SEQ ID NO: 9; R is selected from: (a)
Figure imgf000031_0004
; (b)
Figure imgf000031_0005
; wherein R1 is H, C1-C3 alkyl, or aryl; and F' is a small molecule fragment moiety. In some embodiments, F' has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of F' is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, F' is a small molecule fragment moiety illustrated in Fig. 3.
[0035] Disclosed herein, in certain embodiments, is a method of identifying a cysteine containing protein as a binding target for a small molecule fragment, comprising: (a) obtaining a set of cysteine-reactive probe-protein complexes from a sample comprising a first cell solution treated with a small molecule fragment and a cysteine-reactive probe, wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), identifying a cysteine containing protein as the binding target for the small molecule fragment. In some embodiments, the method further comprises determining a value for each of the cysteine containing proteins from the set of cysteine-reactive probe-protein complexes for identifying a cysteine containing protein as the binding target for the small molecule fragment, wherein the value is determined based on the proteomic analysis means of step b). In some embodiments, the sample further comprises a second cell solution. In some embodiments, the method further comprises contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes. In some embodiments, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer. In some embodiments, the method further comprises contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes. In some embodiments, the first cysteine-reactive probe and the second cysteine-reactive probe are the same. 
In some embodiments, the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine-reactive probe-protein complexes. In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000033_0001
Formula (I), wherein: RM is a reactive moiety selected from a
Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe of Formula (II):
Figure imgf000033_0002
Formula (II), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some embodiments, the binding moiety is a small molecule fragment obtained from a compound library. In some embodiments, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some
embodiments, the affinity handle is further conjugated to an affinity ligand. In some
embodiments, the affinity ligand comprises a chromophore, a labeling group, or a combination thereof. In some embodiments, the chromophore comprises non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate. In some embodiments, the labeling group is a biotin moiety, a streptavidin moiety, bead, resin, a solid support, or a combination thereof. In some
embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the proteomic analysis means comprises a mass spectroscopy method. In some embodiments, the identifying in step c) further comprises (i) locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complexes and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complexes; and (ii) calculating a ratio between the two values assigned to the same cysteine containing protein. In some embodiments, a ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein. In some embodiments, a percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment. In some embodiments, the method is an in situ method. In some embodiments, the cysteine-reactive probe is not 4-hydroxynonenal or 15-deoxy-Δ12,14-prostaglandin J2.
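The ratio and percentage-of-inhibition criteria in the paragraph above can be sketched as follows. The text does not state which value is the numerator, nor how the percentage is computed; this sketch assumes the ratio is the value from the second (probe-only) group divided by the value from the first (fragment-treated) group, and that percentage of inhibition is the relative loss of probe signal in the treated group. The function names and example values are hypothetical.

```python
def competition_ratio(control_value, treated_value):
    """Ratio of the probe-only value to the fragment-treated value (assumed direction)."""
    if control_value < 0 or treated_value < 0:
        raise ValueError("values must be non-negative")
    if treated_value == 0:
        return float("inf")  # probe labeling fully blocked by the fragment
    return control_value / treated_value


def percent_inhibition(control_value, treated_value):
    """Relative loss of probe labeling in the treated sample, in percent (assumed definition)."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control_value - treated_value) / control_value


def is_candidate(control_value, treated_value, ratio_cutoff=2.0):
    """Apply the 'ratio of greater than 2' criterion from the text."""
    return competition_ratio(control_value, treated_value) > ratio_cutoff


# Hypothetical values for one cysteine containing protein.
control, treated = 10.0, 2.0
print(competition_ratio(control, treated))
print(percent_inhibition(control, treated))
print(is_candidate(control, treated))
```

Under these assumed definitions the two measures are linked: a ratio above 2 corresponds to a percentage of inhibition above 50%.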
[0036] Disclosed herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment has a molecular weight of about 150 Dalton or higher. In some embodiments, the cysteine containing protein comprises a cysteine residue site denoted in Table 3. In some embodiments, the cysteine containing protein comprises a protein sequence illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E. In some embodiments, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some embodiments, the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
Figure imgf000034_0001
wherein
R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some embodiments, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the modified cysteine containing protein is selected from IDH2, caspase-8, caspase-10 or PRMT1. In some embodiments, IDH2 is modified at cysteine position 308. In some embodiments, caspase-8 is modified at cysteine position 360. In some embodiments, caspase-10 exists in the proform and is modified at cysteine position 401. In some embodiments, PRMT1 is modified at cysteine position 109. In some embodiments, the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000035_0001
Formula (I), wherein: RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some embodiments, F is obtained from a compound library. In some embodiments, F is a small molecule fragment moiety illustrated in Fig. 3. In some embodiments, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment binds irreversibly to the cysteine containing protein. In some embodiments, the small molecule fragment binds reversibly to the cysteine containing protein.
[0037] Disclosed herein, in certain embodiments, is a method of screening a small molecule fragment for interaction with a cysteine containing protein, comprising: (a) harvesting a set of cysteine-reactive probe-protein complexes from a sample comprising a first cell solution treated with a small molecule fragment and a cysteine reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein; (b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and (c) based on step b), identifying the small molecule fragment as interacting with the cysteine containing protein. In some embodiments, the method further comprises determining a value of each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes prior to identifying the small molecule fragment as interacting with the cysteine containing protein, wherein the value is determined based on the proteomic analysis means of step b). In some embodiments, the cysteine containing protein is a protein illustrated in Table 3. In some embodiments, the cysteine containing protein is a protein illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
BRIEF DESCRIPTION OF THE DRAWINGS
[0038] Various aspects of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:
[0039] Fig. 1 illustrates proteome-wide screening of covalent fragments. A, General protocol for competitive isoTOP-ABPP. Cell lysate or intact cells are pre-treated with a fragment electrophile or DMSO and then reacted with an IA-alkyne probe 1. The fragment- and DMSO-treated samples are then conjugated to isotopically-differentiated TEV protease-cleavable biotin tags [light (red) and heavy (blue), respectively] by copper-mediated azide-alkyne cycloaddition (CuAAC or click) chemistry, mixed, and IA-labeled proteins enriched by streptavidin-conjugated beads and digested stepwise on-bead with trypsin and TEV to yield IA-labeled peptides for MS analysis. Competition ratios, or R values, are measured by dividing the MS1 ion peaks for IA-labeled peptides in DMSO-treated (heavy or blue) versus fragment-treated (light or red) samples. B, Representative members of the electrophilic fragment library, where the reactive (electrophilic) and binding groups are colored green and black, respectively. C, Initial analysis of the proteomic reactivity of fragments using an IA-rhodamine probe 16. Soluble proteome from Ramos cells was treated with the indicated fragments (500 μM each) for 1 h, followed by labeling with IA-rhodamine (1 μM, 1 h) and analysis by SDS-PAGE and in-gel fluorescence scanning. Several proteins were identified that show impaired reactivity with IA-rhodamine in the presence of one or more fragments (asterisks). Fluorescent gel shown in grayscale. D, Competitive isoTOP-ABPP analysis of fragment-cysteine interactions in the soluble proteome of MDA-MB-231 cells pre-treated with the following fragments (500 μM each): 3,5-di(trifluoromethyl)aniline chloroacetamide 3, acrylamide 14, and acetamide 17. Proteomic reactivity values, or liganded cysteine rates, for fragments were calculated as the percentage of total cysteines with R values > 4 in DMSO/fragment (heavy/light) comparisons. E, Concentration-dependent labeling of MDA-MB-231 soluble proteomes with acrylamide 18 and chloroacetamide 19 click probes detected by CuAAC with a rhodamine-azide tag and analysis by SDS-PAGE and in-gel fluorescence scanning. F, Representative MS1 peptide ion chromatograms from competitive isoTOP-ABPP experiments performed with fragments 3, 4, and 23 marking liganded cysteines selectively targeted by one of three fragments (or, in the case of PHGDH C369, by all three fragments).
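The proteomic reactivity value (liganded cysteine rate) described for Fig. 1D can be sketched as follows. This is an illustrative sketch only: the mapping from cysteine identifiers to R values, the identifiers themselves, and the handling of cysteines that were not quantified (entries of `None` are skipped) are assumptions made for the example, and the R values shown in any usage are invented placeholders rather than measured data.

```python
from typing import Dict, Optional


def liganded_cysteine_rate(r_values: Dict[str, Optional[float]],
                           threshold: float = 4.0) -> float:
    """Percentage of quantified cysteines whose R value (DMSO/fragment) exceeds the threshold.

    r_values maps a hypothetical cysteine identifier to its R value,
    or to None when the cysteine was not quantified.
    """
    quantified = [r for r in r_values.values() if r is not None]
    if not quantified:
        return 0.0
    liganded = sum(1 for r in quantified if r > threshold)
    return 100.0 * liganded / len(quantified)
```

For example, a fragment for which two of four quantified cysteines show R values above 4 would have a proteomic reactivity value of 50% under this sketch.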
[0040] Fig. 2 illustrates a conceptual schematic of an exemplary computer server to be used for processing a method described herein.
[0041] Fig. 3 shows composition of fragment electrophile library and structures of additional tool compounds, click probes, and fragments.
[0042] Fig. 4 illustrates analysis of proteomic reactivities of fragment electrophiles determined by competitive isoTOP-ABPP in human cell lysates. A, Frequency of quantification of all cysteines across the complete set of competitive isoTOP-ABPP experiments performed with fragment electrophiles. Note that cysteines were required to have been quantified in at least three isoTOP-ABPP data sets for interpretation. B, Rank order of proteomic reactivity values (or liganded cysteine rates) of fragments calculated as the percentage of all quantified cysteines with R values > 4 for each fragment. The majority of fragments were evaluated in 2-4 replicate experiments in MDA-MB-231 and/or Ramos cell lysates, and their proteomic reactivity values are reported as mean ± SEM values for the replicates. C, Comparison of the proteomic reactivities of representative fragments screened at 500 versus 25 μM in cell lysates. D, Comparison of proteomic reactivity values for fragments tested in both Ramos and MDA-MB-231 lysates. E, Mean ± SEM data for proteomic reactivity values of representative fragments tested in at least three independent replicates. F, Relative GSH reactivity for representative fragment electrophiles. Consumption of GSH (125 μM) was measured using Ellman's reagent (5 mM) after 1 h incubation with the indicated fragments (500 μM). G, Proteomic reactivity values for fragment electrophiles (500 μM) possessing different electrophilic groups attached to a common binding element.
[0043] Fig. 5 illustrates analysis of cysteines and proteins liganded by fragment electrophiles. A, Fraction of total quantified cysteines and proteins that were liganded by fragment electrophiles in competitive isoTOP-ABPP experiments. B, Fraction of liganded proteins found in DrugBank. C, Functional classes of DrugBank and non-DrugBank proteins containing liganded cysteines. D, Functional categorization of liganded and unliganded cysteines based on residue annotations from the Uniprot database. E, Comparison of the ligandability of cysteines as a function of their intrinsic reactivity with the IA-alkyne probe. Cysteine reactivity values were taken from Weerapana, et al. Nature 468, 790-795 (2010), where lower ratios correspond to higher cysteine reactivity. Individual cysteines are plotted on the x-axis and were sorted by reactivity, which is shown on the left y-axis. A moving average with a step-size of 50 is shown in blue for the percentage of liganded cysteines within each reactivity bin (percent values shown on the right y-axis). F, Number of liganded and quantified cysteines per protein measured by isoTOP-ABPP. Respective average values of one and three for liganded and quantified cysteines per protein were measured by isoTOP-ABPP. G, R values for six cysteines in XPO1 quantified by isoTOP-ABPP, identifying C528 as the most liganded cysteine in this protein. Each point represents a distinct fragment-cysteine interaction quantified by isoTOP-ABPP.
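The moving average described for Fig. 5E can be illustrated with a short sketch. It assumes, as one possible reading, that the "step-size of 50" denotes a sliding window of 50 consecutive cysteines after sorting by reactivity ratio (lowest ratio, i.e. most reactive, first); the function name and the input layout are hypothetical.

```python
from typing import List, Tuple


def moving_percent_liganded(cysteines: List[Tuple[float, bool]],
                            window: int = 50) -> List[float]:
    """Percentage of liganded cysteines in each sliding window of sorted cysteines.

    cysteines: (reactivity_ratio, is_liganded) pairs in any order.
    Returns one percentage per window position; empty if fewer than `window` items.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    # Sort by reactivity ratio so that neighbouring cysteines have similar reactivity.
    ordered = sorted(cysteines, key=lambda c: c[0])
    flags = [1 if liganded else 0 for _, liganded in ordered]
    if len(flags) < window:
        return []
    running = sum(flags[:window])
    percentages = [100.0 * running / window]
    for i in range(window, len(flags)):
        # Slide the window by one cysteine: add the new flag, drop the oldest.
        running += flags[i] - flags[i - window]
        percentages.append(100.0 * running / window)
    return percentages
```

A different windowing convention (for example, non-overlapping bins of 50 cysteines) would give a coarser curve; the sliding form is used here only because it matches the usual meaning of a moving average.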
[0044] Fig. 6 illustrates analysis of fragment-cysteine interactions. A, Heatmap showing R values for representative cysteines and fragments organized by proteomic reactivity values (high to low, left to right) and percentage of fragment hits for individual cysteines (high to low, top to bottom). R values > 4 designate fragment hits (colored medium and dark blue). White color designates fragment-cysteine interactions that were not detected (ND). B, C, Histograms depicting the percentage of fragments that are hits (R > 4) for all 768 liganded cysteines (B) or for liganded cysteines found in enzymes for which X-ray and/or NMR structures have been reported (or reported for a close homologue of the enzyme) (C). D, Percentage of liganded cysteines targeted only by group A (red) or B (blue) fragments or both group A and B fragments (black). Shown for all liganded cysteines, liganded cysteines in enzyme active and non-active sites, and liganded cysteines in transcription factors/regulators. For C, D, active-site cysteines were defined as those that reside < 10 Å from established active-site residues and/or bound substrates/inhibitors in enzyme structures. E, Representative example of reactive docking predictions shown for XPO1 (PDB ID: 3GB8). All accessible cysteines were identified and reactive docking was conducted with all fragments from the library within a 25 Å docking cube centered on each accessible cysteine. Categories of XPO1 cysteines based on combined docking and isoTOP-ABPP results are shown. F, Success rate of reactive docking predictions for liganded cysteines identified by isoTOP-ABPP in 29 representative proteins.
[0045] Fig. 7 illustrates analysis of cysteines liganded by fragment electrophiles in competitive isoTOP-ABPP experiments. A, Representative MS1 ion chromatograms for peptides containing C481 of BTK and C131 of MAP2K7, two cysteines known to be targeted by the anti-cancer drug ibrutinib. Ramos cells were treated with ibrutinib (1 μM, 1 h, red trace) or DMSO (blue trace) and evaluated by isoTOP-ABPP. B, Total number of liganded cysteines found in the active sites and non-active sites of enzymes for which X-ray and/or NMR structures have been reported (or reported for a close homologue of the enzyme). C, R values for eight cysteines in PHGDH quantified by isoTOP-ABPP, identifying a single liganded cysteine C369 that is targeted by several fragment electrophiles. Each point represents a distinct fragment-cysteine interaction quantified by isoTOP-ABPP. D, Heatmap showing representative fragment interactions for liganded cysteines found in the active sites and non-active sites of kinases. E, Histogram showing the fragment hit rate for active- and non-active site cysteines in kinases. F, The percentage of liganded cysteines in kinases that were targeted by only group A, only group B, or both group A and B compounds. G, Heatmap showing representative fragment interactions for liganded cysteines found in transcription factors/regulators. H, The fraction of cysteines predicted to be ligandable or not ligandable by reactive docking that were quantified in isoTOP-ABPP experiments.
[0046] Fig. 8 illustrates confirmation and functional analysis of fragment-cysteine interactions. A, Representative MS1 chromatograms for the indicated Cys-containing peptides from PRMT1 quantified in competitive isoTOP-ABPP experiments of MDA-MB-231 cell lysates, showing blockade of IA-alkyne 1 labeling of C109 by fragment 11, but not control fragment 3. B, 11, but not 3 blocked IA-rhodamine (2 μM) labeling of recombinant, purified WT-PRMT1 (1 μM protein doped into HEK293T cell lysates). Note that a C109S-PRMT1 mutant did not react with IA-rhodamine. C, IC50 curve for blockade of 16 labeling of PRMT1 by 11. CI, 95% confidence intervals. D, Effect of 11 and control fragment 3 on methylation of recombinant histone 4 by recombinant PRMT1. Shown is one representative experiment of three independent experiments that yielded similar results. E, 60, but not control fragment 3 (50 μM of each fragment) blocked labeling of recombinant MLTK (or ZAK) kinase by a previously reported ibrutinib-derived activity probe 59 (upper panel). A C22A-MLTK mutant did not react with the ibrutinib probe. Anti-FLAG blotting confirmed similar expression of WT- and C22A-MLTK proteins, which were expressed as FLAG-fusion proteins in HEK293T cells (lower panel). F, IC50 curve for blockade of ibrutinib probe-labeling of MLTK by 60. G, 60, but not control fragment 3 (100 μM of each fragment) inhibited the kinase activity of WT-, but not C22A-MLTK. H, Click probe 18 (25 μM) labeled WT-IMPDH2 and C331S-IMPDH2, but not C140S-IMPDH2 (or C140S/C331S-IMPDH2). Labeling was detected by CuAAC conjugation to a rhodamine-azide reporter tag and analysis by SDS-PAGE and in-gel fluorescence scanning. Recombinant IMPDH2 WT and mutants were expressed and purified from E. coli and added to Jurkat lysates to a final concentration of 1 μM protein. I, Nucleotide competition profile for 18-labeling of recombinant WT-IMPDH2 (500 μM of each nucleotide). J, IC50 curve for blockade of 18 labeling of IMPDH2 by ATP. K, 5, but not control fragment 3 blocked IA-rhodamine (2 μM) labeling of recombinant, purified C161S-TIGAR (2 μM protein doped into Ramos cell lysates). L, IC50 curve for blockade of IA-rhodamine labeling of C161S-TIGAR by 5. M, 5, but not control fragment 3 (100 μM of each fragment) inhibited the catalytic activity of WT-TIGAR and C161S-TIGAR, but not C114S-TIGAR or C114S/C161S-TIGAR. For panels C, F, G, I, J, L, and M, data represent mean values ± SEM for at least three independent experiments. Statistical significance was calculated with unpaired Student's t-tests comparing DMSO- to fragment-treated samples; **, p < 0.01, ****, p < 0.0001.
[0047] Fig. 9 illustrates confirmation and functional analysis of fragment-cysteine interactions. A, Representative MS1 ion chromatograms for the MLTK tryptic peptide containing liganded cysteine C22 quantified by isoTOP-ABPP in MDA-MB-231 lysates treated with fragment 4 or control fragment 3 (500 μM each). B, Lysates from HEK293T cells expressing WT- or C22A-MLTK treated with the indicated fragments and then an ibrutinib-derived activity probe 59 at 10 μM. MLTK labeling by 59 was detected by CuAAC conjugation to a rhodamine-azide tag and analysis by SDS-PAGE and in-gel fluorescence scanning. C, Representative MS1 ion chromatograms for IMPDH2 tryptic peptides containing the catalytic cysteine, C331, and Bateman domain cysteine, C140, quantified by isoTOP-ABPP in cell lysates treated with the indicated fragments (500 μM each). D, Structure of human IMPDH2 (PDB ID: 1NF7) (light grey) and its structurally unresolved Bateman domain modeled by ITASSER (dark grey) showing the positions of C331 (red spheres), Ribavirin Monophosphate and C2-Mycophenolic Adenine Dinucleotide (blue), and C140 (yellow spheres). E, Fragment reactivity with recombinant, purified IMPDH2 added to Jurkat lysates to a final concentration of 1 μM protein, where reactivity was detected in competition assays using the click probe 18 (25 μM; see Fig. 8H for structure of 18). Note that 18 reacted with WT- and C331S-IMPDH2, but not C140S or C140S/C331S-IMPDH2. F, Nucleotide competition of 18 (25 μM) labeling of WT-IMPDH2 added to cell lysates to a final concentration of 1 μM protein. G, Representative MS1 chromatograms for TIGAR tryptic peptides containing C114 and C161 quantified by isoTOP-ABPP in cell lysates treated with the indicated fragments (500 μM each). H, Crystal structure of TIGAR (PDB ID: 3DCY) showing C114 (red spheres), C161 (yellow spheres), and inorganic phosphate (blue). I, Labeling of recombinant, purified TIGAR and mutant proteins by the IA-rhodamine (2 μM) probe. TIGAR proteins were added to cell lysates, to a final concentration of 2 μM protein. J, Concentration-dependent inhibition of WT-TIGAR by 5. Note that the C140S-TIGAR mutant was not inhibited by 5. Data represent mean values ± SEM for 4 replicate experiments at each concentration.
[0048] Fig. 10 illustrates in situ activity of fragment electrophiles. A, X-ray crystal structure of IDH1 (PDB ID: 3MAS) showing the position of C269 and the frequently mutated residue in cancer, R132. B, C, Reactivity of 20 and control fragment 2 with recombinant, purified WT-IDH1 (B) or R132H-IDH1 (C) added to cell lysates to a final concentration of 2 or 4 μM protein, respectively. Fragment reactivity was detected in competition assays using the IA-rhodamine probe (2 μM); note that the C269S-IDH1 mutant did not react with IA-rhodamine. D, Representative MS1 ion chromatograms for the IDH1 tryptic peptides containing liganded cysteine C269 and an unliganded cysteine C379 quantified by isoTOP-ABPP in MDA-MB-231 lysates treated with fragment 20 (25 μM). E, Western blot of MUM2C cells stably overexpressing GFP (mock) or R132H-IDH1 proteins. F, Representative MS1 chromatograms for the IDH1 tryptic peptides containing liganded cysteine C269 and an unliganded cysteine C379 quantified by isoTOP-ABPP in R132H-IDH1-expressing MUM2C lysates treated with 20 or control fragment 2 (50 μM, 2 h, in situ).
[0049] Fig. 11 illustrates in situ activity of fragment electrophiles. A, Blockade of 16 labeling of WT-IDH1 by representative fragment electrophiles. Recombinant, purified WT-IDH1 was added to MDA-MB-231 lysates at a final concentration of 2 μM, treated with fragments at the indicated concentrations, followed by IA-rhodamine probe 16 (2 μM) and analysis by SDS-PAGE and in-gel fluorescence scanning. Note that a C269S mutant of IDH1 did not label with IA-rhodamine 16. B, IC50 curve for blockade of IA-rhodamine-labeling of IDH1 by 20. Note that the control fragment 2 showed much lower activity. C, 20, but not 2, inhibited IDH1-catalyzed oxidation of isocitrate to α-ketoglutarate (α-KG) as measured by an increase in NADPH production (340 nm absorbance). 20 did not inhibit the C269S-IDH1 mutant. D, 20 inhibited oncometabolite 2-hydroxyglutarate (2-HG) production by R132H-IDH1. MUM2C cells stably overexpressing the oncogenic R132H-IDH1 mutant or control GFP-expressing MUM2C cells were treated with the indicated fragments (2 h, in situ). Cells were harvested and lysed, and IDH1-dependent production of 2-HG from α-KG and NADPH was measured by LC-MS, from which 2-HG production of GFP-expressing MUM2C cells was subtracted (GFP-expressing MUM2C cells produced < 10% of the 2-HG generated by R132H-IDH1-expressing MUM2C cells). E, Proteomic reactivity values for individual fragments are comparable in vitro and in situ. One fragment (11) marked in red showed notably lower reactivity in situ versus in vitro. Reactivity values were calculated as in Fig. 1D. Dashed lines mark 90% prediction intervals for the comparison of in vitro and in situ proteomic reactivity values for fragment electrophiles. Blue and red circles mark fragments that fall above (or just at) or below these prediction intervals, respectively. F, Fraction of cysteines liganded in vitro that is also liganded in situ. Shown are liganded cysteine numbers for individual fragments determined in vitro and the fraction of these cysteines that were liganded by the corresponding fragments in situ. G, Representative cysteines that were selectively targeted by fragments in situ, but not in vitro. For in situ-restricted fragment-cysteine interactions, a second cysteine in the parent protein was detected with an unchanging ratio (R ~ 1), thus controlling for potential fragment-induced changes in protein expression. For panels B-D, data represent mean values ± SEM for at least three independent experiments. Statistical significance was calculated with unpaired Student's t-tests comparing DMSO- to fragment-treated samples; ****, p < 0.0001.
[0050] Fig. 12 illustrates fragment electrophiles that target pro-CASP8. A, Representative MS1 chromatograms for CASP8 tryptic peptide containing the catalytic cysteine C360 quantified by isoTOP-ABPP in cell lysates or cells treated with fragment 4 (250 μM, in vitro; 100 μM, in situ) and control fragment 21 (500 μM, in vitro; 200 μM, in situ). B, Fragment reactivity with recombinant, purified active CASP8 added to cell lysates, where reactivity was detected in competition assays using the caspase activity probe Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) (2 μM, 1 h). C, Western blot of MDA-MB-231, Jurkat, and CASP8-null Jurkat proteomes showing that CASP8 was only found in the pro-enzyme form in these cells. D, Fragment reactivity with recombinant, purified pro-CASP8 (D374A, D384A, C409S) added to cell lysates to a final concentration of 1 μM protein, where reactivity was detected in competition assays with the IA-rhodamine probe (2 μM). Note that mutation of both cysteine-360 and cysteine-409 to serine prevented labeling of pro-CASP8 by IA-rhodamine. E, Concentration-dependent reactivity of click probe 61, with recombinant, purified pro-CASP8 (D374A, D384A) added to cell lysates to a final concentration of 1 μM protein. Note that pre-treatment with 7 blocked 61 reactivity with pro-CASP8 and mutation of C360 to serine prevented labeling of pro-CASP8 by 61 (25 μM). F, Fragments 7 and 62 did not block labeling by Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) (2 μM) of recombinant, purified active-CASP8 and active-CASP3 added to MDA-MB-231 cell lysates to a final concentration of 1 μM protein. G, Representative MS1 chromatograms for tryptic peptides containing the catalytic cysteines of CASP8 (C360), CASP2 (C320), and CASP7 (C186) quantified by isoTOP-ABPP in Jurkat cell lysates treated with 7 or 62 (50 μM, 1 h). H, Representative MS1 chromatograms for CASP8 tryptic peptide containing C360 quantified by isoTOP-ABPP in cell lysates treated with 10 versus 100 μM of 61. Structure of CASP8 C360 tryptic peptide adduct (blue) modified by 61 (black) and conjugated to TEV cleavable tag (red), where underline indicates site of isotopic modification. Figure discloses SEQ ID NO: 864.
[0051] Fig. 13 illustrates fragment electrophiles that target pro-CASP8. A, 7 blocked IA-rhodamine 16 labeling of pro-CASP8. Experiments were performed with recombinant, purified pro-CASP8 (bearing a C409S mutation to eliminate IA-rhodamine labeling at this site) added to Ramos cell lysate at a final concentration of 1 μM and treated with the indicated concentrations of 7 followed by IA-rhodamine (2 μM). Note that a C360S/C409S-mutant of pro-CASP8 did not label with IA-rhodamine. B, IC50 curve for blockade of IA-rhodamine labeling of pro-CASP8 (C409S) by 7. C, 7 (50 μM) fully competed IA-alkyne-labeling of C360 of endogenous CASP8 in cell lysates as measured by isoTOP-ABPP. Representative MS1 chromatograms are shown for the C360-containing peptide of CASP8. D, 7 selectively blocked probe labeling of pro-CASP8 compared to active CASP8. Recombinant pro- and active-CASP8 (added to Ramos cell lysates at a final concentration of 1 μM each) were treated with 7 (50 μM) or the established caspase inhibitor, Ac-DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) (20 μM), for 1 h followed by labeling with the click probe 61 (25 μM) for pro-CASP8 and the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) (2 μM) for active-CASP8. SDS-PAGE and in-gel fluorescence scanning revealed that 7 competes 61-labeling of pro-CASP8, but not Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) of active-CASP8, and, conversely, DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) competed Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) labeling of active-CASP8, but not 61-labeling of pro-CASP8. E, Neither 7 nor control fragment 62 (100 μM each) inhibited the activity of recombinant, purified active CASP8 and CASP3, which were assayed following addition to Ramos cell lysate using DEVD-AMC and IETD-AFC substrates, respectively. DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) (20 μM) inhibited the activity of both CASP8 and CASP3. F, 7 (30 μM) blocked IA-alkyne labeling of C360 of pro-CASP8, but not active-CASP8 as measured by isoTOP-ABPP. Recombinant pro- and active-CASP8 were added to Ramos lysates at 1 μM and then treated with 7 (30 μM) followed by isoTOP-ABPP. G, Substitution of a naphthylamine for the aniline portion of 7 furnishes a control fragment 62 that did not compete with IA-rhodamine labeling of C360 of pro-CASP8. H, 7, but not control fragment 62, blocked extrinsic, but not intrinsic apoptosis. Jurkat cells (1.5 million cells/mL) were incubated with 7 or 62 (30 μM) or the pan-caspase inhibitor VAD-FMK (100 μM) for 30 min prior to addition of staurosporine (2 μM) or SuperFasLigand™ (100 ng/mL). Cells were incubated for 6 hours and viability was quantified with CellTiter-Glo®. RLU, relative light unit. I, For cells treated as described in H, cleavage of PARP (89 kDa), CASP8 (p43/p41), and CASP3 (p19/p17) was visualized by western blot. For panels B, E, and H, data represent mean values ± SEM for at least three independent experiments.
[0052] Fig. 14 shows electrophile compounds that target pro-CASP8 and pro-CASP10. Heatmap showing R values for caspases measured by quantitative proteomics in Jurkat cells treated with 7, 63-R, or 62 followed by probe 61 (10 μM, 1 h) (A). Comparison of effects of 7 and 63-R on FasL-induced apoptosis in Jurkat cells or anti-CD3, anti-CD28-activated primary human T cells (B). For B, data represent mean values ± SEM for at least three independent experiments, and results are representative of multiple experiments performed with T cells from different human subjects. Statistical significance was calculated with unpaired Student's t-tests comparing DMSO- to fragment-treated samples; ****, p < 0.0001 and comparing Jurkat to T cells; ####, p < 0.0001.
[0053] Fig. 15 illustrates a fraction of liganded (62%; 341 of 553 quantified cysteines) and unliganded (20%; 561 of 2870 quantified cysteines) cysteines that are sensitive to heat denaturation measured by IA-alkyne labeling (R > 3 native/heat denatured).
[0054] Fig. 16 shows a percentage of proteins identified by isoTOP-ABPP as liganded by fragments 3 and 14 and enriched by their corresponding click probes 19 and 18 that are sensitive to heat denaturation (64% (85 of 133 quantified protein targets) and 73% (19 of 26 quantified protein targets), respectively). Protein enrichment by 18 and 19 was measured by whole protein capture of isotopically-SILAC labeled MDA-MB-231 cells using quantitative (SILAC) proteomics.
[0055] Fig. 17A-B illustrate exemplary fractions of cysteines predicted based on isoTOP-ABPP method or IA-alkyne probe. Fig. 17A shows the fraction of cysteines predicted to be ligandable or unligandable by reactive docking that were quantified in isoTOP-ABPP experiments. Fig. 17B shows the fraction of cysteines predicted to be ligandable or unligandable by reactive docking that show heat-sensitive labeling by the IA-alkyne probe (R > 3 native/heat denatured).
[0056] Fig. 18 shows lysates from HEK293T cells expressing WT or C22A-MLTK treated with the indicated fragments and then an ibrutinib-derived activity probe 59 at 10 μM. MLTK labeling by 59 was detected by CuAAC conjugation to a rhodamine-azide tag and analysis by SDS-PAGE and in-gel fluorescence scanning.
[0057] Fig. 19 shows click probe 18 (25 μM) labeled WT-IMPDH2 and C331S-IMPDH2, but not C140S-IMPDH2 (or C140S/C331S-IMPDH2). Labeling was detected by CuAAC conjugation to a rhodamine-azide reporter tag and analysis by SDS-PAGE and in-gel fluorescence scanning. Recombinant IMPDH2 WT and mutants were expressed and purified from E. coli and added to Jurkat lysates to a final concentration of 1 μM protein.
[0058] Fig. 20 shows the apparent IC50 curve for blockade of IA rhodamine-labeling of R132H-IDH1 by 20.
[0059] Fig. 21A-C show the activity of compounds 7 and 62 with respect to different recombinant caspases. Fig. 21A shows that 7 does not inhibit active caspases. Recombinant, active caspases were added to MDA-MB-231 lysate to a final concentration of 200 nM (CASP2, 3, 6, 7) or 1 μM (CASP8, 10), treated with z-VAD-FMK (25 μM) or 7 (50 μM), followed by labeling with the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) (2 μM). Fig. 21B shows a western blot of the cleavage of PARP (96 kDa), CASP8 (p43/p41, p18), and CASP3 (p17). Fig. 21C shows that 7 protects Jurkat cells from extrinsic, but not intrinsic apoptosis. Cleavage of PARP, CASP8, and CASP3 detected by western blotting as shown in Fig. 21B was quantified for three (STS) or two (FasL) independent experiments. Cleavage products (PARP (96 kDa), CASP8 (p43/p41), CASP3 (p17)) were quantified for compound treatment and the % cleavage relative to DMSO-treated samples was calculated. For Fig. 21C, STS data represent mean values ± SEM for three independent experiments, and FasL data represent mean values ± SD for two independent experiments. Statistical significance was calculated with unpaired Student's t-tests comparing active compounds (VAD-FMK and 7) to control compound 62; **, p < 0.01, ***, p < 0.001, ****, p < 0.0001.
[0060] Fig. 22 shows that CASPIO is involved in intrinsic apoptosis in primary human T cells.
A, Representative MSI peptide signals showing R values for caspases detected by quantitative proteomics using probe 61. ABPP-SILAC experiments. Jurkat cells (10 million cells) were treated with either DMSO (heavy cells) or the indicated compounds (light cells) for 2 h followed by probe 61 (10 μΜ, 1 h). B, 7 competed 61-labeling of pro-CASP8 and CASPIO, whereas 63-R selectively blocked probe labeling of pro-CASP8. C, 7, but not 63-R block probe labeling of pro-CASPlO. Recombinant pro-CASPIO was added to MDA-MB-231 lysates to a final concentration of 300 nM, treated with the indicated compounds, and labeled with probe 61.
Mutation of the catalytic cysteine C401A fully prevented labeling by 61. D, Apparent IC50 curve for blockade of 61-labeling of pro-CASP10 by 7, 63-R or 63-S. E, Neither 7 nor 63 (25 μM each) inhibited the activity of recombinant, purified active CASP10 (500 nM), which was assayed following addition of the protein to MDA-MB-231 lysate using fluorometric AEVD-AMC substrate ("AEVD" disclosed as SEQ ID NO: 859). DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) (20 μM) inhibited the activity of CASP10. F, Apparent IC50 curve for blockade of 61 labeling of pro-CASP8 and pro-CASP10 by 63-R. G, 63-R shows increased potency against pro-CASP8. Recombinant pro-CASP8 was added to MDA-MB-231 lysates to a final concentration of 300 nM, treated with the indicated compounds and labeled with probe 61. H, Apparent IC50 curve for blockade of 61 labeling of pro-CASP8 by 63-R compared with 63-S. The structure of 63-S is shown. I, CASP10 is more highly expressed in primary human T cells compared to Jurkat cells. Western blot analysis of full-length CASP10, CASP8 and GAPDH expression levels in Jurkat and T-cell lysates (2 mg/mL). J, Jurkat cells (150,000 cells/well) were incubated with 7 or 63-R at the indicated concentrations for 30 min prior to addition of staurosporine (2 μM) or SuperFasLigand™ (100 ng/mL). Cells were incubated for 4 h and viability was quantified with CellTiter-Glo®. K, Jurkat cells treated as in J, but with 63-R or 63-S. L, HeLa cells (20,000 cells/well) were seeded and 24 h later treated with the indicated compounds for 30 minutes prior to addition of SuperFasLigand™ (100 ng/mL) and cycloheximide (CHX, 2.5 ng/mL). Cells were incubated for 6 h and viability quantified with CTG. M, For T cells treated as in Fig. 14B, cleavage of CASP10 (p22), CASP8 (p18), CASP3 (p17) and RIPK (33 kDa) was visualized by western blotting. For panels D-F, H, and J-K, data represent mean values ± SEM for at least three independent experiments. Statistical significance was calculated with unpaired Student's t-tests comparing DMSO- to fragment-treated samples; **, p < 0.01, ****, p < 0.0001.
[0061] Fig. 23A-F exemplify that DMF inhibits the activation of primary human T cells. Fig. 23A illustrates the chemical structures of DMF, MMF, and DMS. Fig. 23B - Fig. 23E illustrate bar graphs that exemplify IL-2 release (Fig. 23B), CD25 expression (Fig. 23C and Fig. 23D), and CD69 expression (Fig. 23E) in primary human T cells, either unstimulated (Unstim) or stimulated (Stim) with anti-CD3 + anti-CD28 in the presence of DMSO or the indicated concentrations of DMF, MMF, and DMS for 8 hours. Fig. 23F illustrates a bar graph that exemplifies the time course of DMF effects. T cells were stimulated with anti-CD3 + anti-CD28 for the indicated periods of time before beginning DMF treatment. Cells were harvested 24 h after beginning T cell stimulation. Shown are data gated on CD4+ cells. Data represent mean ± SE; n = 4-6 experiments/group. *p < 0.05, **p < 0.01, ***p < 0.001 in comparison to DMSO group.
[0062] Fig. 24 illustrates a bar graph that exemplifies DMF does not affect T cell viability. Primary human T cells were stimulated with anti-CD3 and anti-CD28 antibodies as indicated and treated concomitantly with compound for 8 h. Cells were then stained with LIVE/DEAD fixable blue stain, and analyzed by flow cytometry. Shown are data gated on CD4+ cells. Data represent mean ± SE for four experiments per group.
[0063] Fig. 25A-B illustrate bar graphs that exemplify DMF, but not MMF, inhibits the activation of primary mouse T cells. Splenic T cells were harvested from C57BL/6 mice and left either unstimulated (Unstim) or stimulated (Stim) with anti-CD3 + anti-CD28 in the presence of DMSO or the indicated concentrations of DMF, MMF, and DMS for 8 h. Activation was assessed by measuring CD25 (Fig. 25A) and CD69 (Fig. 25B) expression. Data represent mean ± SE for four experiments per group. ***p < 0.001 in comparison to DMSO group.
[0064] Fig. 26A-D illustrate bar graphs that exemplify inhibitory effects of DMF are equivalent in Nrf2(+/+) and (-/-) T cells and not caused by reductions in cellular GSH. Fig. 26A exemplifies CD25 expression in anti-CD3 + anti-CD28-stimulated Nrf2(+/+) and (-/-) T cells. Splenic T cells were harvested from Nrf2(+/+) and (-/-) mice, then stimulated in the presence of indicated compounds for 24 h. Fig. 26B and Fig. 26C exemplify treatment with DMF or BSO causes significant reductions in GSH content of human T cells. Primary human T cells were stimulated with anti-CD3 + anti-CD28 antibodies and treated with DMF (50 μM, 2 hours) or BSO (2.5 mM, 4 hours), after which intracellular GSH levels were measured. Fig. 26D exemplifies that BSO does not alter T cell activation. Primary human T cells were treated with DMSO, DMF (50 μM), or BSO (2.5 mM) and stimulated as indicated for 8 h, after which CD25 expression was measured. Data represent mean ± SE for two biological replicates, with 3-4 technical replicates per biological replicate. *p < 0.05, **p < 0.01, ***p < 0.001 in comparison to DMSO groups.
[0065] Fig. 27A-F exemplify isoTOP-ABPP of DMF-treated primary human T cells. Fig. 27A illustrates a graph that exemplifies isoTOP-ABPP ratios, or R values, for > 2400 Cys residues in primary human T cells treated with DMSO or DMF or MMF (50 μM, 4 h). Fig. 27B illustrates a graph that exemplifies an expanded profile for DMF-sensitive Cys residues (R values > 4 for DMSO/DMF). For Fig. 27A and Fig. 27B, data represent aggregate quantified Cys residues from five biological replicates. For Cys residues quantified in more than one replicate, average ratios are reported. Dashed line designates R values > 4, which was used to define DMF-sensitive Cys residues (> 4-fold reductions in IA-alkyne reactivity in DMF-treated T cells). Fig. 27C and Fig. 27D illustrate graphs that exemplify concentration- and time-dependent profiles for DMF-sensitive Cys residues in T cells, respectively. For additional concentrations (10 and 25 μM) and time points (1 and 2 h), data represent aggregate quantified Cys residues from one to three isoTOP-ABPP experiments per group. Fig. 27E illustrates a chart which exemplifies the fraction of proteins for which both a DMF-sensitive Cys residue and at least one additional Cys residue was quantified (Left) and the fraction of these proteins where the additional Cys residue was clearly unchanged (Right) (R value < 2.0 for DMSO/DMF). Unclear calls mark proteins with DMF-sensitive Cys residues where the R value for the second Cys showed marginal evidence of potential change (R values between 2.0 and 3.9). Fig. 27F illustrates representative MS1 profiles for quantified Cys residues in PRKDC, one of which (C4045) shows sensitivity to DMF.
[0066] Fig. 28A-B illustrate bar graphs that exemplify that the total numbers of unique quantified peptides (Fig. 28A) and proteins (Fig. 28B) begin to plateau after five biological replicates of the isoTOP-ABPP experiment in primary human T cells (treated with 50 μM DMF for 4 h).
[0067] Fig. 29 illustrates a graph that exemplifies isoTOP-ABPP of BSO-treated primary human T cells. Cells were treated with 2.5 mM BSO for 4 hours. Data represent aggregate quantified Cys residues from two isoTOP-ABPP experiments per group.
[0068] Fig. 30A-C exemplify conservation and functional analysis of DMF-sensitive cysteines. Fig. 30A exemplifies the fraction of DMF-sensitive cysteines in the human T cell proteome that are conserved in mice. Fig. 30B exemplifies the fraction of conserved DMF-sensitive Cys residues in human T cells that were quantified and also sensitive to DMF in mouse T cells. Fig. 30C exemplifies the distribution of proteins harboring DMF-sensitive Cys residues by functional class.
[0069] Fig. 31A-C exemplify that DMF inhibits p65 translocation to the nucleus in primary human T cells. Fig. 31A exemplifies human T cells that were either left unstimulated or stimulated with anti-CD3 and anti-CD28 antibodies and treated with DMSO or DMF (50 μM) for 1 h. Fig. 31B illustrates a bar graph that exemplifies the ratio of nuclear to cytoplasmic localization of p65 for samples shown in Fig. 31A, as well as samples treated with MMF (50 μM) or DMS (50 μM). Fig. 31C exemplifies p65 levels in whole cell lysate.
[0070] Fig. 32A-G exemplify that DMF-sensitive C14/C17 residues in PKCθ are important for CD28 interactions and T cell activation. Fig. 32A illustrates representative MS1 profiles for DMF-sensitive (C14/C17) and -insensitive (C322) Cys residues in PKCθ. Fig. 32B exemplifies sequence conservation analysis of human and mouse PKCθ, human PKCδ, and human PKCε (SEQ ID NOS 865-868, respectively, in order of appearance). Shown in red are C14 and C17. Fig. 32C illustrates the location of DMF-sensitive C14 and C17 residues in the C2 domain of PKCθ (PDB accession number 2ENJ). Fig. 32D exemplifies that DMF, but not MMF, treatment blocks the association of PKCθ with CD28. Peripheral CD4+ T cells from C57BL/6 mice were pre-incubated with DMSO, DMF (50 μM), or MMF (50 μM), either left unstimulated or stimulated with anti-CD3 + anti-CD28 for 5 min, then washed and lysed. Immunoprecipitations (IPs) were performed in the cell lysates with anti-CD28 or control IgG antibodies and IPs blotted for CD28 or PKCθ. Fig. 32E illustrates Co-IP of WT PKCθ and the C14S/C17S (2CS) PKCθ mutant with CD28. PKCθ(-/-) T cells were reconstituted with empty vector (EV), WT PKCθ, or the 2CS PKCθ mutant. Fig. 32F and Fig. 32G illustrate that PKCθ(-/-) T cells reconstituted with WT or 2CS PKCθ were assayed for activation potential by measuring CD25 expression (Fig. 32F) and IL-2 (Fig. 32G). For Fig. 32E - Fig. 32G, PKCθ(-/-) T cell cultures were pre-activated with plate-coated anti-CD3 + anti-CD28 for 24 h before retroviral transduction with empty vector, WT PKCθ, or the 2CS PKCθ mutant. Cells were rested in culture medium without stimulation for 48 h, then re-stimulated with or without 1 μg/mL plate-coated anti-CD3(+28) overnight (Fig. 32F), for 48 h (Fig. 32G), or with soluble 10 μg/mL anti-CD3 + anti-CD28 for 5 min prior to IP (Fig. 32D). For Fig. 32D and Fig. 32E, data are from a single experiment representative of three biological replicates. For Fig. 32F and Fig. 32G, data represent mean ± SE for three biological replicates. ***p < 0.001 in comparison to WT PKCθ group.
[0071] Fig. 33A-D exemplify DMF sensitivity of C14/C17 in PKCθ. Fig. 33A illustrates a representative MS1 profile of C14/C17 of mouse PKCθ showing sensitivity to DMF (50 μM, 4 h) in isoTOP-ABPP experiments. Fig. 33B and Fig. 33C exemplify time- and concentration-dependence of DMF sensitivity of C14/C17 in human PKCθ, respectively, as determined by isoTOP-ABPP experiments. Fig. 33D exemplifies that C14/C17 of human PKCθ are insensitive to MMF treatment (50 μM MMF, 4 h).
[0072] Fig. 34A-B exemplify a DMF-sensitive Cys residue in ADA. Fig. 34A illustrates that the DMF-sensitive Cys, C75 (magenta), is ~25 angstroms from the ADA active site (orange). Fig. 34B illustrates that mutations in both residues neighboring C75 (G74 and R76 (blue)) have been associated with the severe combined immunodeficiency known as ADA-SCID (OMIM: 608958). PDB accession number: 3IAR.
DETAILED DESCRIPTION OF THE INVENTION
[0073] Cysteine containing proteins encompass a large repertoire of proteins that participate in numerous cellular functions such as mitogenesis, proliferation, apoptosis, gene regulation, and proteolysis. These proteins include enzymes, transporters, receptors, channel proteins, adaptor proteins, chaperones, signaling proteins, plasma proteins, transcription related proteins, translation related proteins, mitochondrial proteins, or cytoskeleton related proteins.
Dysregulated expression of a cysteine containing protein, in many cases, is associated with or modulates a disease, such as an inflammatory related disease, a neurodegenerative disease, or cancer. As such, identification of a potential agonist/antagonist to a cysteine containing protein aids in improving the disease condition in a patient.
[0074] In some instances, potential constraints exist in drug screening due to the structural complexity of the compounds and the inability of some structurally complex compounds to interact with the protein. As such, small molecule fragments are employed in some instances to serve as a launching point for structure-guided elaboration of an initial interaction into a high-affinity drug. In some instances, one method of identifying a small molecule fragment that interacts with a cysteine containing protein is through monitoring their interaction under an in vitro environment. However, in some cases, the in vitro environment does not mimic the native condition of the cysteine containing protein. In other cases, the in vitro environment lacks additional helper proteins to facilitate interaction with the small molecule fragment. Further still, in some instances, difficulties arise during the expression and/or purification stage of the cysteine-containing protein.
[0075] Described herein is another method of identifying small molecule fragments for interaction with a cysteine containing protein. In some instances, this method allows for mapping of small molecule fragments for interaction with a cysteine containing protein under native conditions, thereby allowing for an accurate mapping of interaction with potential small molecule fragments. In some instances, this method also allows for identification of novel cysteine containing protein targets, as this method eliminates the need for recombinant expression and purification.
[0076] In some embodiments, also described herein are compositions, cells, cell populations, assays, probes, and services related to the method of identifying a small molecule fragment for interaction with a cysteine containing protein.
[0077] General Methodology
[0078] In some embodiments, the methods described herein utilize a small molecule fragment and a cysteine-reactive probe for competitive interaction with a cysteine-containing protein. In some embodiments, the method is as described in Fig. 1A. Fig. 1A illustrates contacting a first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes. In some embodiments, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer. In some instances, the small molecule fragment competes with the first cysteine-reactive probe for interaction with a protein target. In some instances, the small molecule fragment or the cysteine-reactive probe forms a covalent bond via a Michael reaction with a cysteine residue of the cysteine containing protein. Fig. 1A further illustrates contacting a second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes. In some instances, the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
[0079] In some embodiments, cells from the second cell solution are grown in an enriched media (e.g., an isotopically enriched media). In some cases, cells from the first cell solution are grown in an enriched media (e.g., an isotopically enriched media). In some instances, cells from both the first cell solution and the second cell solution are grown in two different enriched media (e.g., two different isotopically enriched media) so that a protein obtained from cells grown in the first cell solution is distinguishable from a protein obtained from cells grown in the second cell solution. In other embodiments, cells from only one of the cell solutions (e.g., either the first cell solution or the second cell solution) are grown in an enriched media (e.g., isotopically enriched media). In such cases, a protein obtained from the enriched cells (e.g., isotopically enriched cells) is distinguishable from a protein obtained from cells that have not been enriched (e.g., isotopically enriched).
[0080] As illustrated in Fig. 1A, in some instances the second cell solution is not treated with a small molecule fragment. In such cases, the second cell solution acts as a control.
[0081] In some instances, cells from the second cell solution are further treated with a buffer. In some cases, the buffer is DMSO. In some cases, cells from the second cell solution are not treated with a small molecule fragment and the second cell solution acts as a control.
[0082] In some instances, a first group of cysteine-reactive probe-protein complexes and a second group of cysteine-reactive probe-protein complexes are harvested separately and combined to generate a set of cysteine-reactive probe-protein complexes which is further processed by a proteomic analysis means. In some cases, either the first group of cysteine-reactive probe-protein complexes or the second group of cysteine-reactive probe-protein complexes contains labeled proteins obtained from cells grown in an enriched media (e.g., isotopically enriched media). In some cases, both groups of cysteine-reactive probe-protein complexes contain labeled proteins obtained from cells grown in two different enriched media (e.g., two different isotopically enriched media). In other cases, either the first group of cysteine-reactive probe-protein complexes, the second group of cysteine-reactive probe-protein complexes, or both groups of cysteine-reactive probe-protein complexes contain labeled proteins in which the proteins have been labeled after harvesting from a cell.
[0083] In some instances, a first group of cysteine-reactive probe-protein complexes and a second group of cysteine-reactive probe-protein complexes are harvested separately and the proteins from one of the two groups of cysteine-reactive probe-protein complexes are subsequently labeled (e.g., by methylation). In some cases, the first group of cysteine-reactive probe-protein complexes and the second group of cysteine-reactive probe-protein complexes are then combined and subjected to proteomic analysis means.
[0084] In other instances, a first group of cysteine-reactive probe-protein complexes and a second group of cysteine-reactive probe-protein complexes are harvested separately and both groups are subjected to proteomic analysis means. In some cases, data obtained from the proteomic analysis means are then combined for further analysis.
[0085] In some embodiments, the proteomic analysis means comprises a mass spectroscopy method. In some instances, the mass spectroscopy method is a liquid-chromatography-mass spectrometry (LC-MS) method. In some cases, the proteomic analysis means further comprises analyzing the results from the mass spectroscopy method by an algorithm for protein identification. In some cases, the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification. In some cases, the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot. In some cases, the mass spectroscopy method is a MALDI-TOF based method.
[0086] In some embodiments, a value is assigned to each of the cysteine binding proteins from the cysteine-reactive probe-protein complexes after proteomic analysis, in which the value is determined from the proteomic analysis. In some cases, the value assigned to each of the cysteine containing proteins is obtained from a mass spectroscopy analysis. In some instances, the value is an area-under-the-curve from a plot of signal intensity as a function of mass-to-charge ratio. In some embodiments, a first value is assigned to a cysteine binding protein from the first group of cysteine-reactive probe-protein complexes of the first cell solution and a second value is assigned to the same cysteine binding protein from the second group of cysteine-reactive probe-protein complexes of the second cell solution. In some instances, a ratio is then calculated between the two values, the first value and the second value, and assigned to the same cysteine binding protein. In some instances, a ratio of greater than 2 indicates that the cysteine binding protein is a candidate for interacting with the small molecule fragment. In some instances, the ratio is greater than 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, or 10. In some cases, the ratio is at most 20. In some instances, the same small molecule fragment interacts with a number of cysteine binding proteins in the presence of a cysteine-reactive probe. In some instances, the small molecule modulates the interaction of a cysteine-reactive probe with its cysteine binding protein partners. In some instances, the spectrum of ratios for a small molecule fragment with its interacting protein partners in the presence of a cysteine-reactive probe indicates the specificity of the small molecule fragment toward the protein. In some instances, the spectrum of ratios indicates whether the small molecule fragment is a specific inhibitor to a protein or a pan inhibitor.
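By way of illustration only, the ratio assignment described in the preceding paragraph can be sketched in a few lines of Python. The sketch assumes that the ratio is taken as the control value divided by the fragment-treated value (consistent with the DMSO/DMF R values referred to in the figure descriptions), that ratios are capped at 20, and that a threshold of 2 is used; the record type, field names, and site labels are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CysteineSignal:
    """Hypothetical record for one quantified cysteine-containing peptide."""
    site: str            # illustrative label, e.g. "PROT_A_C123"
    control_area: float  # area-under-the-curve, control (e.g., DMSO) sample
    treated_area: float  # area-under-the-curve, fragment-treated sample


def competition_ratio(signal: CysteineSignal, cap: float = 20.0) -> float:
    """Return the control/treated ratio, capped at `cap` (assumed convention)."""
    if signal.treated_area <= 0:
        # No probe signal left in the treated sample: report the capped value.
        return cap
    return min(signal.control_area / signal.treated_area, cap)


def candidate_sites(signals: Iterable[CysteineSignal],
                    threshold: float = 2.0) -> List[str]:
    """List sites whose ratio exceeds the threshold (candidate interactions)."""
    return [s.site for s in signals if competition_ratio(s) > threshold]


example = [
    CysteineSignal("PROT_A_C123", control_area=100.0, treated_area=20.0),
    CysteineSignal("PROT_B_C45", control_area=100.0, treated_area=90.0),
    CysteineSignal("PROT_C_C7", control_area=50.0, treated_area=0.0),
]
print(candidate_sites(example))
```

A stricter threshold (for example, 4) can be passed as `threshold` where a narrower set of candidate sites is desired.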
[0087] In some embodiments, the cysteine containing protein identified by the above method comprises a biologically active cysteine residue. In some instances, the biologically active cysteine site is a cysteine residue that is located about 10 Å or less to an active-site ligand or residue. In some cases, the cysteine residue that is located about 10 Å or less to the active-site ligand or residue is an active site cysteine. In some cases, the biologically active cysteine site is an active site cysteine. In some embodiments, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. In some cases, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine. In some instances, the biologically active cysteine site is a non-active site cysteine.
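As a non-limiting sketch, the distance-based distinction described in the preceding paragraph can be expressed as follows, assuming that atomic coordinates (in angstroms) are available from a structure file, that the distance is measured from a single cysteine atom (for example, the side-chain sulfur) to the nearest atom of the active-site ligand or residue, and that a cutoff of 10 angstroms is applied. The function names and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def min_distance(cys_atom: Coordinate,
                 active_site_atoms: Sequence[Coordinate]) -> float:
    """Smallest Euclidean distance from the cysteine atom to any active-site atom."""
    return min(math.dist(cys_atom, atom) for atom in active_site_atoms)


def classify_cysteine(cys_atom: Coordinate,
                      active_site_atoms: Sequence[Coordinate],
                      cutoff: float = 10.0) -> str:
    """Label a cysteine by proximity to the active site (assumed hard cutoff)."""
    if min_distance(cys_atom, active_site_atoms) <= cutoff:
        return "active-site"
    return "non-active-site"


# Illustrative coordinates only; not taken from any deposited structure.
print(classify_cysteine((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0)]))
print(classify_cysteine((0.0, 0.0, 0.0), [(0.0, 0.0, 25.0)]))
```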
[0088] In some embodiments, the small molecule fragment that covalently interacts with the biologically active cysteine impairs and/or inhibits activity of the cysteine containing protein. In some instances, the cysteine containing protein exists in an active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the active form of the cysteine containing protein. In some instances, the cysteine containing protein exists in a pro-active form. In some embodiments, the small molecule fragment and/or the cysteine-reactive probe interact with the pro-active form of the cysteine containing protein.
[0089] In some embodiments, the structural environment of the biologically active cysteine residue modulates the reactivity of the cysteine residue. In some embodiments, the structural environment is a hydrophobic environment or a hydrophilic environment. In some embodiments, the structural environment is a charged environment. In some embodiments, the structural environment is a nucleophilic environment.
[0090] In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some instances, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, transcription related protein, or translation related protein. In some embodiments, the cysteine containing protein is a protein illustrated in Tables 1, 2, 3, 8 or 9. In some instances, the cysteine residue of the cysteine-containing proteins illustrated in Tables 1, 2, 3, 8 or 9 is denoted by (*) in Tables 1, 2, 3, 8 or 9.
[0091] In some instances, a set of cysteine-reactive probes is added to the cell solutions. For example, a first set of cysteine-reactive probes is added to the first cell solution and a second set of cysteine-reactive probes is added to the second cell solution. In some cases, each cysteine-reactive probe is different within the set. In some instances, the first set of cysteine-reactive probes is the same as the second set of cysteine-reactive probes. In some cases, the first set of cysteine-reactive probes generates a third group of cysteine-reactive probe-protein complexes and the second set of cysteine-reactive probes generates a fourth group of cysteine-reactive probe-protein complexes. In some instances, the set of cysteine-reactive probes further facilitates identification of cysteine containing proteins.
[0092] In some embodiments, the sample is a cell sample. In other instances, the sample is a tissue sample.
[0093] In some instances, the method is an in-situ method.
Small Molecule Fragments
[0094] In some embodiments, the small molecule fragments described herein comprise non-naturally occurring molecules. In some instances, the non-naturally occurring molecules do not include natural and/or non-natural peptide fragments, or small molecules that are produced naturally within the body of a mammal.
[0095] In some embodiments, the small molecule fragments described herein comprise a molecular weight of about 100 Dalton or higher. In some embodiments, the small molecule fragments comprise a molecular weight of about 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some instances, the molecular weight of the small molecule fragments is between about 150 and about 500, about 150 and about 450, about 150 and about 440, about 150 and about 430, about 150 and about 400, about 150 and about 350, about 150 and about 300, about 150 and about 250, about 170 and about 500, about 180 and about 450, about 190 and about 400, about 200 and about 350, about 130 and about 300, or about 120 and about 250 Dalton.
[0096] In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with one or more elements selected from a halogen, a nonmetal, a transition metal, or a combination thereof. In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with a halogen. In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with a nonmetal. In some embodiments, the molecular weight of the small molecule fragments described herein is the molecular weight prior to enrichment with a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
[0097] In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen. In some cases, the molecular weight of the small molecule fragment does not include the molecular weight of a transition metal.
[0098] In some embodiments, the small molecule fragments described herein comprise micromolar or millimolar binding affinity. In some instances, the small molecule fragments comprise a binding affinity of about 1 μM, 10 μM, 100 μM, 500 μM, 1 mM, 10 mM, or higher.
[0099] In some embodiments, the small molecule fragments described herein have a high ligand efficiency (LE). Ligand efficiency is the measurement of the binding energy per atom of a ligand to its binding partner. In some instances, the ligand efficiency is defined as the ratio of the Gibbs free energy (ΔG) to the number of non-hydrogen atoms of the compound (N):
LE = (ΔG)/N.
[00100] In some cases, LE is also arranged as:
LE = 1.4 (-logIC50)/N.
[00101] In some instances, the LE score is about 0.3 kcal mol⁻¹ HA⁻¹, about 0.35 kcal mol⁻¹ HA⁻¹, about 0.4 kcal mol⁻¹ HA⁻¹, or higher.
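For illustration, the second form of the ligand efficiency expression can be evaluated as sketched below. The sketch assumes that the IC50 is expressed in molar units and that the factor 1.4 (in kcal/mol) approximates 2.303·RT near room temperature, so that the result is reported as a positive value in kcal/mol per heavy atom; the function name and the example values are hypothetical.

```python
import math


def ligand_efficiency(ic50_molar: float, heavy_atoms: int) -> float:
    """Approximate LE = 1.4 * (-log10(IC50)) / N, in kcal/mol per heavy atom.

    Assumes IC50 is given in mol/L and N counts non-hydrogen atoms.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    if heavy_atoms <= 0:
        raise ValueError("heavy atom count must be positive")
    return 1.4 * (-math.log10(ic50_molar)) / heavy_atoms


# Hypothetical fragment: IC50 of 10 micromolar, 15 heavy atoms.
le = ligand_efficiency(10e-6, 15)
print(round(le, 2))
```

Under these assumptions, a value of about 0.3 or higher would fall within the LE scores recited above.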
[00102] In some embodiments, the small molecule fragments described herein are designed based on the Rule of 3. In some embodiments, the Rule of 3 comprises a non-polar solvent-polar solvent (e.g., octanol-water) partition coefficient log P of about 3 or less, a molecular mass of about 300 Daltons or less, about 3 hydrogen bond donors or less, about 3 hydrogen bond acceptors or less, and about 3 rotatable bonds or less.
[00103] In some embodiments, the small molecule fragments described herein comprise three cyclic rings or less.
[00104] In some embodiments, the small molecule fragments described herein bind to a cysteine residue of a polypeptide that is about 20 amino acid residues in length or more. In some instances, the small molecule fragments described herein bind to a cysteine residue of a polypeptide that is about 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
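As a minimal sketch, the Rule of 3 criteria recited above can be checked programmatically once the relevant properties of a fragment have been computed by any suitable cheminformatics tool. The sketch treats each "about 3 or less" criterion as a hard cutoff, which is a simplifying assumption; the record type, field names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FragmentProperties:
    """Hypothetical container for precomputed fragment descriptors."""
    log_p: float            # octanol-water partition coefficient
    molecular_mass: float   # Daltons
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int


def satisfies_rule_of_three(p: FragmentProperties) -> bool:
    """Return True if every Rule of 3 criterion is met (hard cutoffs assumed)."""
    return (
        p.log_p <= 3
        and p.molecular_mass <= 300
        and p.h_bond_donors <= 3
        and p.h_bond_acceptors <= 3
        and p.rotatable_bonds <= 3
    )


# Illustrative values only; not measured properties of any disclosed fragment.
print(satisfies_rule_of_three(FragmentProperties(2.1, 245.0, 1, 2, 3)))
print(satisfies_rule_of_three(FragmentProperties(3.8, 245.0, 1, 2, 3)))
```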
[00105] In some embodiments, the small molecule fragments described herein further comprise pharmacokinetic parameters that are unsuitable as a therapeutic agent for administration without further optimization of the small molecule fragments. In some instances, the pharmacokinetic parameters that are suitable as a therapeutic agent comprise parameters in accordance with an FDA guideline, or in accordance with a guideline from an equivalent Food and Drug Administration outside of the United States. In some instances, the pharmacokinetic parameters comprise the peak plasma concentration (Cmax), the lowest concentration of a therapeutic agent (Cmin), volume of distribution, time to reach Cmax, elimination half-life, clearance, and the like. In some embodiments, the pharmacokinetic parameters of the small molecule fragments are outside of the parameters set by the FDA guideline, or by an equivalent Food and Drug Administration outside of the United States. In some instances, a skilled artisan understands in view of the pharmacokinetic parameters of the small molecule fragments described herein that these small molecule fragments are unsuited as therapeutic agents without further optimization.
[00106] In some embodiments, the small molecule fragments described herein comprise a reactive moiety which forms a covalent interaction with the thiol group of a cysteine residue of a cysteine containing protein, and an affinity handle moiety.
[00107] In some instances, a small molecule fragment described herein is a small molecule fragment of Formula (I):
Figure imgf000055_0001
Formula (I)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
[00108] In some instances, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
[00109] In some embodiments, the small molecule fragment of Formula (I) does not contain a second binding site. In some instances, the small molecule fragment moiety does not bind to the protein. In some cases, the small molecule fragment moiety does not covalently bind to the protein. In some instances, the small molecule fragment moiety does not interact with a secondary binding site on the protein. In some instances, the secondary binding site is an active site such as an ATP binding site. In some cases, the active site is at least about 10, 15, 20, 25, 35, 40A, or more away from the biologically active cysteine residue. In some instances, the small molecule fragment moiety does not interact with an active site such as an ATP binding site.
[00110] In some instances, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
[00111] In some instances, F is a small molecule fragment moiety selected from: N-(4-bromophenyl)-N-phenylacrylamide, N-(1-benzoylpiperidin-4-yl)-2-chloro-N-phenylacetamide, 1-(4-benzylpiperidin-1-yl)-2-chloroethan-1-one, N-(2-(1H-indol-3-yl)ethyl)-2-chloroacetamide, N-(3,5-bis(trifluoromethyl)phenyl)acrylamide, N-(4-phenoxy-3-(trifluoromethyl)phenyl)-N-(pyridin-3-ylmethyl)acrylamide, N-(3,5-bis(trifluoromethyl)phenyl)acetamide, 2-chloro-1-(4-(hydroxydiphenylmethyl)piperidin-1-yl)ethan-1-one, (E)-3-(3,5-bis(trifluoromethyl)phenyl)-2-cyanoacrylamide, N-(3,5-bis(trifluoromethyl)phenyl)-2-bromopropanamide, N-(3,5-bis(trifluoromethyl)phenyl)-2-chloropropanamide, N-(3,5-bis(trifluoromethyl)phenyl)-N-(pyridin-3-ylmethyl)acrylamide, 3-(2-chloroacetamido)-5-(trifluoromethyl)benzoic acid, 1-(4-(5-fluorobenzisoxazol-3-yl)piperidin-1-yl)prop-2-en-1-one, tert-butyl 4-(4-acrylamido-2,6-difluorophenyl)piperazine-1-carboxylate, N-(4-bromo-2,5-dimethylphenyl)acrylamide, 2-chloroacetamido-2-deoxy-α/β-D-glucopyranose, 2-chloro-1-(2-methyl-3,4-dihydroquinolin-1(2H)-yl)ethan-1-one, N-cyclohexyl-N-phenylacrylamide, 1-(5-bromoindolin-1-yl)prop-2-en-1-one, N-(1-benzylpiperidin-4-yl)-N-phenylacrylamide, 2-chloro-N-(2-methyl-5-(trifluoromethyl)phenyl)acetamide, 1-(5-bromoindolin-1-yl)-2-chloroethan-1-one, 2-chloro-N-(quinolin-5-yl)acetamide, 1-(4-benzylpiperidin-1-yl)prop-2-en-1-one, 2-chloro-N-((3-hydroxy-5-(hydroxymethyl)-2-methylpyridin-4-yl)methyl)acetamide, or 1-(6,7-dimethoxy-3,4-dihydroisoquinolin-2(1H)-yl)prop-2-en-1-one.
[00112] In some embodiments, the small molecule fragment of Formula (I) has a molecular weight of about 100, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some instances, the molecular weight of the small molecule fragment of Formula (I) is between about 150 and about 500, about 150 and about 450, about 150 and about 440, about 150 and about 430, about 150 and about 400, about 150 and about 350, about 150 and about 300, about 150 and about 250, about 170 and about 500, about 180 and about 450, about 190 and about 400, about 200 and about 350, about 130 and about 300, or about 120 and about 250 Dalton.
[00113] In some embodiments, the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with one or more elements selected from a halogen, a nonmetal, a transition metal, or a combination thereof. In some embodiments, the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with a halogen. In some embodiments, the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with a nonmetal. In some embodiments, the molecular weight of the small molecule fragment of Formula (I) is the molecular weight prior to enrichment with a transition metal.
[00114] In some embodiments, the molecular weight of the small molecule fragment of Formula (I) does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some embodiments, the molecular weight of the small molecule fragment of Formula (I) does not include the molecular weight of a halogen. In some embodiments, the molecular weight of the small molecule fragment of Formula (I) does not include the molecular weight of a transition metal.
[00115] In some instances, the small molecule fragment of Formula (I) comprises micromolar or millimolar binding affinity. In some instances, the small molecule fragment of Formula (I) comprises a binding affinity of about 1 μM, 10 μM, 100 μM, 500 μM, 1 mM, 10 mM, or higher.
[00116] In some cases, the small molecule fragment of Formula (I) has an LE score of about 0.3 kcal mol⁻¹ HA⁻¹, about 0.35 kcal mol⁻¹ HA⁻¹, about 0.4 kcal mol⁻¹ HA⁻¹, or higher.
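As an illustration of how an LE score in units of kcal mol⁻¹ HA⁻¹ is commonly computed, the following minimal Python sketch converts a dissociation constant into a binding free energy and divides it by the number of heavy (non-hydrogen) atoms. The function name, the use of a dissociation constant as the affinity measure, and the default temperature of 298.15 K are assumptions made for this example and are not specified in the present disclosure.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL_PER_MOL_K = 1.987204e-3


def ligand_efficiency(kd_molar: float, heavy_atoms: int,
                      temperature_k: float = 298.15) -> float:
    """Return an LE estimate in kcal mol^-1 per heavy atom.

    Assumes LE = -dG / HA with dG = R * T * ln(Kd); the Kd-based form and
    the default temperature are illustrative assumptions.
    """
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    delta_g = R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)
    return -delta_g / heavy_atoms


# Hypothetical fragment: 100 uM affinity, 15 heavy atoms.
le_example = ligand_efficiency(100e-6, 15)
```

Under these assumptions, a hypothetical fragment with 100 μM affinity and 15 heavy atoms gives an LE of roughly 0.36 kcal mol⁻¹ HA⁻¹, which would fall between the 0.35 and 0.4 values recited above.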
[00117] In some embodiments, the small molecule fragment of Formula (I) follows the design parameters of Rule of 3. In some instances, the small molecule fragment of Formula (I) has a non-polar solvent-polar solvent (e.g. octanol-water) partition coefficient log P of about 3 or less, a molecular mass of about 300 Daltons or less, about 3 hydrogen bond donors or less, about 3 hydrogen bond acceptors or less, and about 3 rotatable bonds or less.
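The Rule of 3 parameters recited above can be expressed as a simple filter. The following Python sketch is illustrative only: the class name, field names, and example descriptor values are hypothetical, the descriptors are assumed to have been computed beforehand by a cheminformatics tool of the user's choice, and the word "about" in the text is approximated here by strict thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentDescriptors:
    """Precomputed descriptors for one fragment (hypothetical container)."""
    log_p: float               # octanol-water partition coefficient
    molecular_mass_da: float   # molecular mass in Daltons
    h_bond_donors: int
    h_bond_acceptors: int
    rotatable_bonds: int


def follows_rule_of_three(d: FragmentDescriptors) -> bool:
    """Return True when every descriptor is within the Rule of 3 limits."""
    return (
        d.log_p <= 3
        and d.molecular_mass_da <= 300
        and d.h_bond_donors <= 3
        and d.h_bond_acceptors <= 3
        and d.rotatable_bonds <= 3
    )


# Hypothetical descriptor values chosen only to exercise the filter.
small_fragment = FragmentDescriptors(2.1, 215.0, 1, 2, 2)
large_compound = FragmentDescriptors(4.5, 480.0, 2, 6, 7)
```

A screening workflow might apply such a filter to a candidate library before selecting fragments; relaxing any threshold only requires changing the corresponding comparison.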
[00118] In some embodiments, the small molecule fragment of Formula (I) comprises three cyclic rings or less.
[00119] In some embodiments, the small molecule fragment of Formula (I) binds to a cysteine residue of a polypeptide (e.g., a cysteine containing protein) that is about 20 amino acid residues in length or more. In some instances, the small molecule fragments described herein bind to a cysteine residue of a polypeptide (e.g., a cysteine containing protein) that is about 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
[00120] In some instances, the small molecule fragment of Formula (I) has pharmacokinetic parameters outside of the parameters set by the FDA guideline, or by an equivalent Food and Drug Administration outside of the United States. In some instances, a skilled artisan understands in view of the pharmacokinetic parameters of the small molecule fragment of Formula (I) described herein that this small molecule fragment is unsuited as a therapeutic agent without further optimization.
[00121] In some embodiments, the small molecule fragment is a specific inhibitor or a pan inhibitor.
Cysteine-reactive Probes
[00122] In some embodiments, a cysteine-reactive probe comprises a reactive moiety which forms a covalent interaction with the thiol group of a cysteine residue of a cysteine containing protein, and an affinity handle moiety.
[00123] In some embodiments, a cysteine-reactive probe is a cysteine-reactive probe of Formula (II):
Figure imgf000058_0001
Formula (II)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety.
[00124] In some instances, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some cases, the binding moiety is a small molecule fragment obtained from a compound library. In some instances, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
[00125] In some embodiments, the affinity handle is a bioorthogonal affinity handle. In some embodiments, the affinity handle utilizes bioorthogonal chemistry. As used herein, bioorthogonal chemistry refers to any chemical reaction that occurs inside of a living system (e.g. a cell) without interfering with native biochemical processes.
[00126] In some cases, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some cases, the affinity handle comprises an alkyne or an azide group.
[00127] In some instances, the affinity handle is an alkyne group. The term "alkyne group" as used in the context of an affinity handle refers to a group with a chemical formula of H-C≡C-R, HC2R, R1-C≡C-R2, or R1C2R2. In the context of the present chemical formula, R, R1, and R2 are independently a cysteine-reactive probe portion described herein, a linker, or a combination thereof. In some cases, the alkyne group is capable of being covalently linked in a chemical reaction with a molecule containing an azide. In some instances, the affinity handle is an azide group.
[00128] In some instances, the affinity handle (e.g. alkyne group or azide group) serves as a non-native and non-perturbing bioorthogonal chemical handle. In some instances, the affinity handle (e.g. alkyne group or azide group) is further derivatized through chemical reactions such as click chemistry. In some instances, the click chemistry is a copper(I)-catalyzed [3+2]-Huisgen 1,3-dipolar cycloaddition of alkynes and azides leading to 1,2,3-triazoles. In other instances, the click chemistry is a copper-free variant of the above reaction.
[00129] In some instances, the affinity handle further comprises a linker. In some instances, the linker bridges the affinity handle to the reactive moiety.
[00130] In some instances, the affinity handle is further conjugated to an affinity ligand. In some cases, the affinity ligand comprises a chromophore, a labeling group, or a combination thereof. In some embodiments, the chromophore comprises fluorochrome, non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate. In some cases, the chromophore comprises non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate. In other cases, the chromophore comprises a fluorophore.
[00131] In some embodiments, the fluorophore comprises rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzoxadiazole, benzoxadiazole, pyrene derivatives, cascade blue, oxazine derivatives, Nile red, Nile blue, cresyl violet, oxazine 170, acridine derivatives, proflavin, acridine orange, acridine yellow, arylmethine derivatives, auramine, crystal violet, malachite green, tetrapyrrole derivatives, porphin, phthalocyanine, bilirubin, 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, 3-phenyl-7-isocyanatocoumarin, N-(p-(2-benzoxazolyl)phenyl)maleimide, stilbenes, pyrenes, 6-FAM (Fluorescein), 6-FAM (NHS Ester), 5(6)-FAM, 5-FAM, Fluorescein dT, 5-TAMRA-cadaverine, 2-aminoacridone, HEX, JOE (NHS Ester), MAX, TET, ROX, TAMRA, TAMRA™ (NHS Ester), TEX 615, ATTO™ 488, ATTO™ 532, ATTO™ 550, ATTO™ 565, ATTO™ Rho101, ATTO™ 590, ATTO™ 633, ATTO™ 647N, TYE™ 563, TYE™ 665, or TYE™ 705.
[00132] In some embodiments, the labeling group is a biotin moiety, a streptavidin moiety, bead, resin, a solid support, or a combination thereof. As used herein, a biotin moiety described herein comprises biotin and biotin derivatives. Exemplary biotin derivatives include, but are not limited to, desthiobiotin, biotin alkyne or biotin azide. In some instances, a biotin moiety described herein is desthiobiotin. In some cases, a biotin moiety described herein is d-desthiobiotin.
[00133] In some instances, the labeling group is a biotin moiety. In some instances, the biotin moiety further comprises a linker, such as a linker that is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more residues in length. In some instances, the linker further comprises a cleavage site, such as a protease cleavage site. In some cases, the biotin moiety interacts with a streptavidin moiety. In some instances, the biotin moiety is further attached to a bead, such as a streptavidin-coupled bead. In some instances, the biotin moiety is further attached to a resin or a solid support, such as a streptavidin-coupled resin or a streptavidin-coupled solid support. In some instances, the solid support is a plate, a platform, a cover slide, a microfluidic channel, or the like.
[00134] In some embodiments, the affinity handle moiety further comprises a chromophore.
[00135] In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe selected from: N-(hex-5-yn-1-yl)-2-iodoacetamide, Iodoacetamide-rhodamine, 3-acrylamido-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide, 3-acrylamido-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide, or 2-chloro-N-(1-(3-ethynylbenzoyl)piperidin-4-yl)-N-phenylacetamide.
Cysteine Containing Proteins
[00136] In some instances, the cysteine containing protein is a soluble protein or a membrane protein. In some instances, the cysteine containing protein is involved in one or more biological processes such as protein transport, lipid metabolism, apoptosis, transcription, electron transport, mRNA processing, or host-virus interaction. In some instances, the cysteine containing protein is associated with one or more diseases such as cancer or one or more disorders or conditions such as immune, metabolic, developmental, reproductive, neurological, psychiatric, renal, cardiovascular, or hematological disorders or conditions.
[00137] In some embodiments, the cysteine containing protein comprises a biologically active cysteine residue. In some embodiments, the cysteine containing protein comprises one or more cysteines in which at least one cysteine is a biologically active cysteine residue. In some cases, the biologically active cysteine site is a cysteine residue that is located about 10 Å or less from an active-site ligand or residue. In some cases, the cysteine residue that is located about 10 Å or less from the active-site ligand or residue is an active site cysteine. In other cases, the biologically active cysteine site is a cysteine residue that is located greater than 10 Å from an active-site ligand or residue. In some instances, the cysteine residue is located greater than 12 Å, 15 Å, 20 Å, 25 Å, 30 Å, 35 Å, 40 Å, 45 Å, or greater than 50 Å from an active-site ligand or residue. In some cases, the cysteine residue that is located greater than 10 Å from the active-site ligand or residue is a non-active site cysteine. In additional cases, the cysteine containing protein exists in an active form, or in a pro-active form.
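The distance-based distinction between an active site cysteine and a non-active site cysteine described above can be sketched as follows. This Python example is a minimal illustration under stated assumptions: atom positions are given as (x, y, z) tuples in Å, the distance is taken as the shortest distance from a single cysteine atom (for example the sulfur) to any atom of the active-site ligand or residue, and the function names and coordinates are hypothetical. The present disclosure does not specify how the distance is measured.

```python
import math

# Cutoff taken from the "about 10 Å" value recited in the text.
ACTIVE_SITE_CUTOFF_ANGSTROM = 10.0


def min_distance(cys_atom, site_atoms):
    """Shortest Euclidean distance (Å) from one atom to a set of atoms."""
    if not site_atoms:
        raise ValueError("site_atoms must contain at least one coordinate")
    return min(math.dist(cys_atom, atom) for atom in site_atoms)


def classify_cysteine(cys_atom, site_atoms,
                      cutoff=ACTIVE_SITE_CUTOFF_ANGSTROM):
    """Label a cysteine by its distance to the active-site ligand or residue."""
    if min_distance(cys_atom, site_atoms) <= cutoff:
        return "active site cysteine"
    return "non-active site cysteine"


# Hypothetical coordinates, for illustration only.
cys_sulfur = (0.0, 0.0, 0.0)
ligand_near = [(3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]   # nearest atom is 5 Å away
ligand_far = [(0.0, 12.0, 0.0)]                      # nearest atom is 12 Å away
```

With these hypothetical coordinates, the first ligand places the cysteine within the cutoff and the second places it beyond it; the larger distances recited above (12 Å to 50 Å) could be explored by passing a different `cutoff` value.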
[00138] In some embodiments, the cysteine containing protein comprises one or more functions of an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some embodiments, the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein. In some instances, the cysteine containing protein has an uncategorized function.
[00139] In some embodiments, the cysteine containing protein is an enzyme. An enzyme is a protein molecule that accelerates or catalyzes a chemical reaction. In some embodiments, non-limiting examples of enzymes include kinases, proteases, or deubiquitinating enzymes.
[00140] In some instances, exemplary kinases include tyrosine kinases such as the TEC family of kinases such as Tec, Bruton's tyrosine kinase (Btk), interleukin-2-inducible T-cell kinase (Itk) (or Emt/Tsk), Bmx, and Txk/Rlk; spleen tyrosine kinase (Syk) family such as SYK and Zeta-chain-associated protein kinase 70 (ZAP-70); Src kinases such as Src, Yes, Fyn, Fgr, Lck, Hck, Blk, Lyn, and Frk; JAK kinases such as Janus kinase 1 (JAK1), Janus kinase 2 (JAK2), Janus kinase 3 (JAK3), and Tyrosine kinase 2 (TYK2); or ErbB family of kinases such as Her1 (EGFR, ErbB1), Her2 (Neu, ErbB2), Her3 (ErbB3), and Her4 (ErbB4).
[00141] In some embodiments, the cysteine containing protein is a protease. In some embodiments, the protease is a cysteine protease. In some cases, the cysteine protease is a caspase. In some instances, the caspase is an initiator (apical) caspase. In some instances, the caspase is an effector (executioner) caspase. Exemplary caspases include CASP2, CASP8, CASP9, CASP10, CASP3, CASP6, CASP7, CASP4, and CASP5. In some instances, the cysteine protease is a cathepsin. Exemplary cathepsins include Cathepsin B, Cathepsin C, Cathepsin F, Cathepsin H, Cathepsin K, Cathepsin L1, Cathepsin L2, Cathepsin O, Cathepsin S, Cathepsin W, or Cathepsin Z.
[00142] In some embodiments, the cysteine containing protein is a deubiquitinating enzyme (DUB). In some embodiments, exemplary deubiquitinating enzymes include cysteine protease DUBs or metalloproteases. Exemplary cysteine protease DUBs include ubiquitin-specific protease (USP/UBP) such as USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, or USP46; ovarian tumor (OTU) proteases such as OTUB1 and OTUB2; Machado-Josephin domain (MJD) proteases such as ATXN3 and ATXN3L; and ubiquitin C-terminal hydrolase (UCH) proteases such as BAP1, UCHL1, UCHL3, and UCHL5. Exemplary metalloproteases include the Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MPN+) (JAMM) domain proteases.
[00143] In some embodiments, exemplary cysteine containing proteins as enzymes include, but are not limited to, Glyceraldehyde-3-phosphate dehydrogenase (GAPDH), Protein arginine N-methyltransferase 1 (PRMT1), Peptidyl-prolyl cis-trans isomerase NIMA-interaction (PIN1), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), Glutathione S-transferase P (GSTP1), Elongation factor 2 (EEF2), Glutathione S-transferase omega-1 (GSTO1), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), Protein disulfide-isomerase A4 (PDIA4), Prostaglandin E synthase 3 (PTGES3), Adenosine kinase (ADK), Elongation factor 2 (EEF2), Isoamyl acetate-hydrolyzing esterase 1 homolog (IAH1), Peroxiredoxin-5 (mitochondrial) (PRDX5), Inosine-5'-monophosphate dehydrogenase 2 (IMPDH2), 3-hydroxyacyl-CoA dehydrogenase type-2 (HSD17B10), Omega-amidase NIT2 (NIT2), Aldose reductase (AKR1B1), Monofunctional C1-tetrahydrofolate synthase (mitochondrial) (MTHFD1L), Protein disulfide-isomerase A6 (PDIA6), Pyruvate kinase isozymes M1/M2 (PKM), 6-phosphogluconolactonase (PGLS), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), ERO1-like protein alpha (ERO1L), Thioredoxin domain-containing protein 17 (TXNDC17), Protein disulfide-isomerase A4 (PDIA4), Protein disulfide-isomerase A3 (PDIA3), 3-ketoacyl-CoA thiolase (mitochondrial) (ACAA2), Dynamin-2 (DNM2), DNA replication licensing factor MCM3 (MCM3), Serine-tRNA ligase (cytoplasmic) (SARS), Fatty acid synthase (FASN), Acetyl-CoA acetyltransferase (mitochondrial) (ACAT1), Protein disulfide-isomerase (P4HB), Deoxycytidine kinase (DCK), Eukaryotic translation initiation factor 3 subunit (EIF3F), Protein disulfide-isomerase A6 (PDIA6), UDP-N-acetylglucosamine-peptide N-acetylglucosamine (OGT), Ketosamine-3-kinase (FN3KRP), Protein DJ-1 (PARK7), Phosphoglycolate phosphatase (PGP), DNA replication licensing factor MCM6 (MCM6), Fructose-2,6-bisphosphatase TIGAR (TIGAR), Cleavage and polyadenylation specificity factor subunit (CPSF3), Ubiquitin-conjugating enzyme E2 L3 (UBE2L3), Alanine-tRNA ligase (cytoplasmic) (AARS), Mannose-1-phosphate guanyltransferase alpha (GMPPA), C-1-tetrahydrofolate synthase (cytoplasmic) (MTHFD1), Dynamin-1-like protein (DNM1L), Protein disulfide-isomerase A3 (PDIA3), Aspartyl aminopeptidase (DNPEP), Acetyl-CoA acetyltransferase (cytosolic) (ACAT2), Thioredoxin domain-containing protein 5 (TXNDC5), Thymidine kinase (cytosolic) (TK1),
Inosine-5'-monophosphate dehydrogenase 2 (IMPDH2), Ubiquitin carboxyl-terminal hydrolase isozyme L3 (UCHL3), Integrin-linked protein kinase (ILK), Cyclin-dependent kinase 2 (CDK2), Histone acetyltransferase type B catalytic subunit (HAT1), Enoyl-CoA delta isomerase 2 (mitochondrial) (ECI2), C-1-tetrahydrofolate synthase (cytoplasmic) (MTHFD1), Deoxycytidine kinase (DCK), Ubiquitin-like modifier-activating enzyme 6 (UBA6), Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT1), Monofunctional C1-tetrahydrofolate synthase (mitochondrial) (MTHFD1L), Thymidylate kinase (DTYMK), Protein ETHE1 (mitochondrial) (ETHE1), Arginine-tRNA ligase (cytoplasmic) (RARS), NEDD8-activating enzyme E1 catalytic subunit (UBA3), Dual specificity mitogen-activated protein kinase (MAP2K3), Ubiquitin-conjugating enzyme E2 S (UBE2S), Amidophosphoribosyltransferase (PPAT), Succinate-semialdehyde dehydrogenase (mitochondrial) (ALDH5A1), CAD, Phosphoenolpyruvate carboxykinase (PCK2), 6-phosphofructokinase type C (PFKP), Acyl-CoA synthetase family member 2 (mitochondrial) (ACSF2), Multifunctional protein ADE2 (PAICS), Desumoylating isopeptidase 1 (DESI1), 6-phosphofructokinase type C (PFKP), V-type proton ATPase catalytic subunit A (ATP6V1A), 3-ketoacyl-CoA thiolase (peroxisomal) (ACAA1), Galactokinase (GALK1), Thymidine kinase (cytosolic) (TK1), ATPase WRNIP1 (WRNIP1), Phosphoribosylformylglycinamidine synthase (PFAS), V-type proton ATPase catalytic subunit A (ATP6V1A), Thioredoxin domain-containing protein 5 (TXNDC5), 4-trimethylaminobutyraldehyde dehydrogenase (ALDH9A1), Dual specificity mitogen-activated protein kinase (MAP2K4), Calcineurin-like phosphoesterase domain-containing (CPPED1), Dual specificity protein phosphatase 12 (DUSP12), Phosphoribosylformylglycinamidine synthase (PFAS), Diphosphomevalonate decarboxylase (MVD), D-3-phosphoglycerate dehydrogenase (PHGDH), Cell cycle checkpoint control protein RAD9A (RAD9A),
Peroxiredoxin-1 (PRDX1), Sorbitol dehydrogenase (SORD), Peroxiredoxin-4 (PRDX4), AMP deaminase 2 (AMPD2), Isocitrate dehydrogenase (IDH1), Pyruvate carboxylase (mitochondrial) (PC), Integrin-linked kinase-associated serine/threonine (ILKAP), Methylmalonate-semialdehyde dehydrogenase (ALDH6A1), 26S proteasome non-ATPase regulatory subunit 14 (PSMD14), Thymidylate kinase (DTYMK), 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata (PFKFB2), Peroxiredoxin-5 (mitochondrial) (PRDX5), PDP1, Cathepsin B (CTSB), Transmembrane protease serine 12 (TMPRSS12), UDP-glucose 6-dehydrogenase (UGDH), Histidine triad nucleotide-binding protein 1 (HINT1), E3 ubiquitin-protein ligase UBR5 (UBR5), SAM domain and HD domain-containing protein 1 (SAMHD1), Probable tRNA threonylcarbamoyladenosine biosynthesis (OSGEP), Methylated-DNA-protein-cysteine methyltransferase (MGMT), Fatty acid synthase (FASN), Adenosine deaminase (ADA), Cyclin-dependent kinase 19 (CDK19), Serine/threonine-protein kinase 38 (STK38), Mitogen-activated protein kinase 9 (MAPK9), tRNA (adenine(58)-N(1))-methyltransferase catalytic (TRMT61A), Glyoxylate reductase/hydroxypyruvate reductase (GRHPR), Aldehyde dehydrogenase (mitochondrial) (ALDH2), Mitochondrial-processing peptidase subunit beta (PMPCB), 3-ketoacyl-CoA thiolase (peroxisomal) (ACAA1), Lysophosphatidic acid phosphatase type 6 (ACP6), Ubiquitin/ISG15-conjugating enzyme E2 L6 (UBE2L6), Caspase-8 (CASP8), 2',5'-phosphodiesterase 12 (PDE12), Thioredoxin domain-containing protein 12 (TXNDC12), Nitrilase homolog 1 (NIT1), ERO1-like protein alpha (ERO1L), SUMO-activating enzyme subunit 1 (SAE1), Leucine-tRNA ligase (cytoplasmic) (LARS), Protein-glutamine gamma-glutamyltransferase 2 (TGM2), Probable DNA dC->dU-editing enzyme APOBEC-3C (APOBEC3C), Double-stranded RNA-specific adenosine deaminase (ADAR), Isocitrate dehydrogenase (IDH2), Methylcrotonoyl-CoA carboxylase beta chain (mitochondrial) (MCCC2), Uridine phosphorylase 1 (UPP1), Glycogen phosphorylase (brain form) (PYGB), E3 ubiquitin-protein ligase UBR5 (UBR5), Procollagen-lysine,2-oxoglutarate 5-dioxygenase 1 (PLOD1), Ubiquitin carboxyl-terminal hydrolase 48 (USP48), Aconitate hydratase (mitochondrial) (ACO2), GMP reductase 2 (GMPR2), Pyrroline-5-carboxylate reductase 1 (mitochondrial) (PYCR1), Cathepsin Z (CTSZ), E3 ubiquitin-protein ligase UBR2 (UBR2), Cysteine protease ATG4B (ATG4B), Serine/threonine-protein kinase Nek9 (NEK9), Lysine-specific demethylase 4B (KDM4B), Insulin-degrading enzyme (IDE), Dipeptidyl peptidase 9 (DPP9), Decaprenyl-diphosphate synthase subunit 2 (PDSS2), TFIIH basal transcription factor complex helicase (ERCC3), Methionine-R-sulfoxide reductase B2 (mitochondrial) (MSRB2), E3 ubiquitin-protein ligase BRE1B (RNF40), Thymidylate synthase (TYMS), Cyclin-dependent kinase 5 (CDK5), Bifunctional 3'-phosphoadenosine 5'-phosphosulfate (PAPSS2), Short/branched chain specific acyl-CoA dehydrogenase (ACADSB), Cathepsin D (CTSD), E3 ubiquitin-protein ligase HUWE1 (HUWE1), Calpain-2 catalytic subunit (CAPN2), Dual specificity mitogen-activated protein kinase (MAP2K7), Mitogen-activated protein kinase kinase kinase MLT (MLTK), Bleomycin hydrolase (BLMH), Probable ATP-dependent RNA helicase DDX59 (DDX59), Cystathionine gamma-lyase (CTH), S-adenosylmethionine synthase isoform type-2 (MAT2A), 6-phosphofructokinase type C (PFKP), Cytidine deaminase (CDA), DNA-directed RNA polymerase II subunit RPB2 (POLR2B), Protein disulfide-isomerase (P4HB), Procollagen-lysine,2-oxoglutarate 5-dioxygenase 3 (PLOD3), Nucleoside diphosphate-linked moiety X motif 8 (mitochondrial) (NUDT8), E3 ubiquitin-protein ligase HUWE1 (HUWE1), Methylated-DNA-protein-cysteine methyltransferase (MGMT), Nitrilase homolog 1 (NIT1), Interferon regulatory factor 2-binding protein 1 (IRF2BP1), Ubiquitin carboxyl-terminal hydrolase 16 (USP16), Glycylpeptide N-tetradecanoyltransferase 2 (NMT2), Cyclin-dependent kinase inhibitor 3 (CDKN3),
Hydroxysteroid dehydrogenase-like protein 2 (HSDL2), Serine/threonine-protein kinase VRK1 (VRK1), Serine/threonine-protein kinase A-Raf (ARAF), ATP-citrate synthase (ACLY), Probable ribonuclease ZC3H12D (ZC3H12D), Peripheral plasma membrane protein CASK (CASK), DNA polymerase epsilon subunit 3 (POLE3), Aldehyde dehydrogenase X (mitochondrial) (ALDH1B1), UDP-N-acetylglucosamine transferase subunit ALG13 (ALG13), Protein disulfide-isomerase A4 (PDIA4), DNA polymerase alpha catalytic subunit (POLA1), Ethylmalonyl-CoA decarboxylase (ECHDC1), Protein-tyrosine kinase 2-beta (PTK2B), E3 SUMO-protein ligase RanBP2 (RANBP2), Legumain (LGMN), Non-specific lipid-transfer protein (SCP2), Long-chain-fatty-acid-CoA ligase 4 (ACSL4), Dual specificity protein phosphatase 12 (DUSP12), Oxidoreductase HTATIP2 (HTATIP2), Serine/threonine-protein kinase MRCK beta (CDC42BPB), Histone-lysine N-methyltransferase EZH2 (EZH2), Non-specific lipid-transfer protein (SCP2), Dual specificity mitogen-activated protein kinase (MAP2K7), Ubiquitin carboxyl-terminal hydrolase 28 (USP28), 6-phosphofructokinase (liver type) (PFKL), SWI/SNF-related matrix-associated actin-dependent (SMARCAD1), Protein phosphatase methylesterase 1 (PPME1), DNA replication licensing factor MCM5 (MCM5), 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata (PFKFB4), Dehydrogenase/reductase SDR family member 11 (DHRS11), Pyroglutamyl-peptidase 1 (PGPEP1), Probable E3 ubiquitin-protein ligase (MYCBP2), DNA fragmentation factor subunit beta (DFFB), Deubiquitinating protein VCIP135 (VCPIP1), Putative transferase CAF17 (mitochondrial) (IBA57), Calpain-7 (CAPN7), GDP-L-fucose synthase (TSTA3), Protein disulfide-isomerase A4 (PDIA4), Probable ATP-dependent RNA helicase (DDX59), RNA exonuclease 4 (REXO4), PDK1, E3 SUMO-protein ligase (PIAS4), DNA (cytosine-5)-methyltransferase 1 (DNMT1), Alpha-aminoadipic semialdehyde dehydrogenase (ALDH7A1), Hydroxymethylglutaryl-CoA synthase (cytoplasmic) (HMGCS1), E3 ubiquitin-protein ligase (SMURF2), Aldehyde dehydrogenase X (mitochondrial) (ALDH1B1), Tyrosine-protein kinase (BTK), DNA repair protein RAD50 (RAD50), ATP-binding domain-containing protein 4 (ATPBD4), Nucleoside diphosphate kinase 3 (NME3), Interleukin-1 receptor-associated kinase 1 (IRAK1), Ribonuclease P/MRP protein subunit POP5 (POP5), Peptide-N(4)-(N-acetyl-beta-glucosaminyl)asparagin (NGLY1), Caspase-2 (CASP2), Ribosomal protein S6 kinase alpha-3 (RPS6KA3), E3 ubiquitin-protein ligase UBR1 (UBR1), Serine/threonine-protein kinase Chk2 (CHEK2), Phosphatidylinositol 3,4,5-trisphosphate 5-phospha (INPPL1), Histone acetyltransferase p300 (EP300), Creatine kinase U-type (mitochondrial) (CKMT1B), E3 ubiquitin-protein ligase TRIM33 (TRIM33), Cancer-related nucleoside-triphosphatase (NTPCR), Aconitate hydratase (mitochondrial) (ACO2), Ubiquitin carboxyl-terminal hydrolase 34 (USP34), Probable E3 ubiquitin-protein ligase HERC4 (HERC4), E3 ubiquitin-protein ligase HECTD1 (HECTD1), Peroxisomal 2,4-dienoyl-CoA reductase (DECR2), Helicase ARIP4 (RAD54L2), Ubiquitin-like modifier-activating enzyme 7 (UBA7), ER degradation-enhancing alpha-mannosidase-like 3 (EDEM3), Ubiquitin-conjugating enzyme E2 O (UBE2O), Dual specificity mitogen-activated protein kinase (MAP2K7), Myotubularin-related protein 1 (MTMR1), Calcium-dependent phospholipase A2 (PLA2G5), Mitotic checkpoint serine/threonine-protein kinase (BUB1B), Putative transferase CAF17 (mitochondrial) (IBA57), Tyrosine-protein kinase ZAP-70 (ZAP70), E3 ubiquitin-protein ligase pellino homolog 1 (PELI1), Neuropathy target esterase (PNPLA6), Ribosomal protein S6 kinase alpha-3 (RPS6KA3), N6-adenosine-methyltransferase 70 kDa subunit (METTL3), Fructosamine-3-kinase (FN3K), Ubiquitin carboxyl-terminal hydrolase 22 (USP22), Rab3 GTPase-activating protein catalytic subunit (RAB3GAP1), Caspase-5 (CASP5), L-2-hydroxyglutarate dehydrogenase (mitochondrial) (L2HGDH), Saccharopine dehydrogenase-like oxidoreductase (SCCPDH), FLAD1 FAD synthase, Lysine-specific demethylase 3A (KDM3A), or Ubiquitin carboxyl-terminal hydrolase 34 (USP34).
[00144] In some embodiments, the cysteine containing protein is a signaling protein. In some instances, exemplary signaling proteins include vascular endothelial growth factor (VEGF) proteins or proteins involved in redox signaling. Exemplary VEGF proteins include VEGF-A, VEGF-B, VEGF-C, VEGF-D, and PGF. Exemplary proteins involved in redox signaling include redox-regulatory protein FAM213A.
[00145] In some embodiments, the cysteine containing protein is a transcription factor or regulator. Exemplary cysteine containing proteins as transcription factors and regulators include, but are not limited to, 40S ribosomal protein S3 (RPS3), Basic leucine zipper and W2 domain-containing protein (BZW1), Poly(rC)-binding protein 1 (PCBP1), 40S ribosomal protein S11 (RPS11), 40S ribosomal protein S4, X isoform (RPS4X), Signal recognition particle 9 kDa protein (SRP9), Non-POU domain-containing octamer-binding protein (NONO), N-alpha-acetyltransferase 15, NatA auxiliary subunit (NAA15), Cleavage stimulation factor subunit 2 (CSTF2), Lamina-associated polypeptide 2, isoform alpha (TMPO), Heterogeneous nuclear ribonucleoprotein R (HNRNPR), MMS19 nucleotide excision repair protein homolog (MMS19), SWI/SNF complex subunit SMARCC2 (SMARCC2), Enhancer of mRNA-decapping protein 3 (EDC3), H/ACA ribonucleoprotein complex subunit 2 (NHP2), WW domain-containing adapter protein with coiled-c (WAC), N-alpha-acetyltransferase 15 NatA auxiliary subunit (NAA15), 40S ribosomal protein S11 (RPS11), Signal transducer and activator of transcription 1 (STAT1), Mediator of RNA polymerase II transcription subunit (MED15), Lamina-associated polypeptide 2 (isoform alpha) (TMPO), MMS19 nucleotide excision repair protein homolog (MMS19), DNA mismatch repair protein Msh2 (MSH2), Recombining binding protein suppressor of hairless (RBPJ), Mediator of RNA polymerase II transcription subunit (MED17), Heterogeneous nuclear ribonucleoprotein U (HNRNPU), Transcription initiation factor IIA subunit 2
(GTF2A2), Chromatin accessibility complex protein 1 (CHRAC1), CDKN2A-interacting protein (CDKN2AIP), Zinc finger protein 217 (ZNF217), Signal transducer and activator of transcription 3 (STAT3), WD repeat and HMG-box DNA-binding protein 1 (WDHD1), Lamina-associated polypeptide 2 (isoform alpha) (TMPO), Lamina-associated polypeptide 2 (isoforms beta/gam) (TMPO), Interferon regulatory factor 4 (IRF4), Protein flightless-1 homolog (FLII), Heterogeneous nuclear ribonucleoprotein F (HNRNPF), Nucleus accumbens-associated protein 1 (NACC1), Transcription elongation regulator 1 (TCERG1), Protein HEXIM1 (HEXIM1), Enhancer of mRNA-decapping protein (EDC3), Zinc finger protein Aiolos (IKZF3),
Transcription elongation factor SPT5 (SUPT5H), Forkhead box protein K1 (FOXK1), LIM domain-containing protein 1 (LIMD1), MMS19 nucleotide excision repair protein homolog (MMS19), Elongator complex protein 4 (ELP4), Ankyrin repeat and KH domain-containing protein 1 (ANKHD1), PML, Nuclear factor NF-kappa-B p100 subunit (NFKB2), Heterogeneous nuclear ribonucleoprotein L-like (HNRPLL), CCR4-NOT transcription complex subunit 3 (CNOT3), Constitutive coactivator of PPAR-gamma-like protein (FAM120A), Mediator of RNA polymerase II transcription subunit (MED15), 60S ribosomal protein L7 (RPL7),
Interferon regulatory factor 8 (IRF8), COUP transcription factor 2 (NR2F2), Mediator of RNA polymerase II transcription subunit (MED1), tRNA (uracil-5-)-methyltransferase homolog A (TRMT2A), Transcription factor p65 (RELA), Exosome complex component RRP42
(EXOSC7), General transcription factor 3C polypeptide 1 (GTF3C1), Mothers against decapentaplegic homolog 2 (SMAD2), Ankyrin repeat domain-containing protein 17
(ANKRD17), MMS19 nucleotide excision repair protein homolog (MMS19), Death domain-associated protein 6 (DAXX), Zinc finger protein 318 (ZNF318), Thioredoxin-interacting protein (TXNIP), Glucocorticoid receptor (NR3C1), Iron-responsive element-binding protein 2 (IREB2), Zinc finger protein 295 (ZNF295), Polycomb protein SUZ12 (SUZ12), Cleavage stimulation factor subunit 2 tau variant (CSTF2T), C-myc promoter-binding protein
(DENND4A), Pinin (PNN), Mediator of RNA polymerase II transcription subunit (MED9), POU domain, class 2, transcription factor 2 (POU2F2), Enhancer of mRNA-decapping protein 3
(EDC3), A-kinase anchor protein 1 (mitochondrial) (AKAP1), Transcription factor RelB
(RELB), RNA polymerase II-associated protein 1 (RPAP1), Zinc finger protein 346 (ZNF346),
Chromosome-associated kinesin KIF4A (KIF4A), Mediator of RNA polymerase II transcription subunit (MED12), Protein NPAT (NPAT), Leucine-rich PPR motif-containing protein
(mitochondrial) (LRPPRC), AT-hook DNA-binding motif-containing protein 1 (AHDC1),
Mediator of RNA polymerase II transcription subunit (MED12), Bromodomain-containing protein 8 (BRD8), Trinucleotide repeat-containing gene 6B protein (TNRC6B), Aryl hydrocarbon receptor nuclear translocator (ARNT), Activating transcription factor 7-interacting protein (ATF7IP), Glucocorticoid receptor (NR3C1), Chromosome transmission fidelity protein
18 homolog (CHTF18), or C-myc promoter-binding protein (DENND4A).
[00146] In some embodiments, the cysteine containing protein is a channel, transporter or receptor. Exemplary cysteine containing proteins as channels, transporters, or receptors include, but are not limited to, Chloride intracellular channel protein 4 (CLIC4), Exportin-1 (XPO1),
Thioredoxin (TXN), Protein SEC13 homolog (SEC13), Chloride intracellular channel protein 1
(CLIC1), Guanine nucleotide-binding protein subunit beta-2 (GNB2L1), Sorting nexin-6
(SNX6), Conserved oligomeric Golgi complex subunit 3 (COG3), Nuclear cap-binding protein subunit 1 (NCBP1), Cytoplasmic dynein 1 light intermediate chain 1 (DYNC1LI1), MOB-like protein phocein (MOB4), Programmed cell death 6-interacting protein (PDCD6IP),
Glutaredoxin-1 (GLRX), ATP synthase subunit alpha (mitochondrial) (ATP5A1), Treacle protein (TCOF1), Dynactin subunit 1 (DCTN1), Importin-7 (IPO7), Exportin-2 (CSE1L), ATP synthase subunit gamma (mitochondrial) (ATP5C1), Trafficking protein particle complex subunit 5 (TRAPPC5), Thioredoxin mitochondrial (TXN2), THO complex subunit 6 homolog
(THOC6), Exportin-1 (XPO1), Nuclear pore complex protein Nup50 (NUP50), Treacle protein
(TCOF1), Nuclear pore complex protein Nup93 (NUP93), Nuclear pore glycoprotein p62
(NUP62), Cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), Thioredoxin-like protein 1
(TXNL1), Nuclear pore complex protein Nup214 (NUP214), Protein lin-7 homolog C (LIN7C),
ADP-ribosylation factor-binding protein GGA2 (GGA2), Trafficking protein particle complex subunit 4 (TRAPPC4), Protein quaking (QKI), Perilipin-3 (PLIN3), Copper transport protein
ATOX1 (ATOX1), Unconventional myosin-Ic (MYO1C), Nucleoporin NUP53 (NUP35),
Vacuolar protein sorting-associated protein 18 homolog (VPS18), Dedicator of cytokinesis protein 7 (DOCK7), Nucleoporin p54 (NUP54), Ras-related GTP-binding protein C (RRAGC),
Arf-GAP with Rho-GAP domain (ANK repeat and PH domain) (ARAP1), Exportin-5 (XPO5),
Kinectin (KTN1), Chloride intracellular channel protein 6 (CLIC6), Voltage-gated potassium channel subunit beta-2 (KCNAB2), Exportin-5 (XPO5), Ras-related GTP-binding protein C (RRAGC), Ribosome-binding protein 1 (RRBP1), Acyl-CoA-binding domain-containing protein 6 (ACBD6), Chloride intracellular channel protein 5 (CLIC5), Pleckstrin homology domain-containing family A member (PLEKHA2), ADP-ribosylation factor-like protein 3 (ARL3), Protein transport protein Sec24C (SEC24C), Voltage-dependent anion-selective channel protein (VDAC3), Programmed cell death 6-interacting protein (PDCD6IP), Chloride intracellular channel protein 3 (CLIC3), Multivesicular body subunit 12A (FAM125A), Eukaryotic translation initiation factor 4E transporter (EIF4ENIF1), NmrA-like family domain-containing protein 1 (NMRAL1), Nuclear pore complex protein Nup98-Nup96 (NUP98), Conserved oligomeric Golgi complex subunit 1 (COG1), Importin-4 (IPO4), Pleckstrin homology domain-containing family A member (PLEKHA2), Cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), DENN domain-containing protein 1C (DENND1C), Cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), Protein ELYS (AHCTF1), Trafficking protein particle complex subunit 1
(TRAPPC1), Guanine nucleotide-binding protein-like 3 (GNL3), or Importin-13 (IPO13).
[00147] In some embodiments, the cysteine containing protein is a chaperone. Exemplary cysteine containing proteins as chaperones include, but are not limited to, 60 kDa heat shock protein (mitochondrial) (HSPD1), T-complex protein 1 subunit eta (CCT7), T-complex protein 1 subunit epsilon (CCT5), Heat shock 70 kDa protein 4 (HSPA4), GrpE protein homolog 1 (mitochondrial) (GRPEL1), Tubulin-specific chaperone E (TBCE), Protein unc-45 homolog A (UNC45A), Serpin H1 (SERPINH1), Tubulin-specific chaperone D (TBCD), Peroxisomal biogenesis factor 19 (PEX19), BAG family molecular chaperone regulator 5 (BAG5), T-complex protein 1 subunit theta (CCT8), Protein canopy homolog 3 (CNPY3), DnaJ homolog subfamily C member 10 (DNAJC10), ATP-dependent Clp protease ATP-binding subunit clp (CLPX), or Midasin (MDN1).
[00148] In some embodiments, the cysteine containing protein is an adapter, scaffolding or modulator protein. Exemplary cysteine containing proteins as adapter, scaffolding, or modulator proteins include, but are not limited to, Proteasome activator complex subunit 1 (PSME1),
TIP41-like protein (TIPRL), Crk-like protein (CRKL), Cofilin-1 (CFL1), Condensin complex subunit 1 (NCAPD2), Translational activator GCN1 (GCN1L1), Serine/threonine-protein phosphatase 2A 56 kDa regulatory (PPP2R5D), UPF0539 protein C7orf59 (C7orf59), Protein diaphanous homolog 1 (DIAPH1), Protein asunder homolog (Asun), Ras GTPase-activating-like protein IQGAP1 (IQGAP1), Sister chromatid cohesion protein PDS5 homolog A (PDS5A),
Reticulon-4 (RTN4), Proteasome activator complex subunit 4 (PSME4), Condensin complex subunit 2 (NCAPH), Sister chromatid cohesion protein PDS5 homolog A (PDS5A), cAMP-dependent protein kinase type I-alpha regulatory (PRKAR1A), Host cell factor 1 (HCFC1),
Serine/threonine-protein phosphatase 4 regulatory (PPP4R2), Apoptotic chromatin condensation inducer in the nucleus (ACIN1), BRISC and BRCA1-A complex member 1 (BABAM1), Interferon-induced protein with tetratricopeptide (IFIT3), Ras association domain-containing protein 2 (RASSF2), Hsp70-binding protein 1 (HSPBP1), TBC1 domain family member 15 (TBC1D15), Dynamin-binding protein (DNMBP), Condensin complex subunit 1 (NCAPD2), Beta-2-syntrophin (SNTB2), Disks large homolog 1 (DLG1), TBC1 domain family member 13 (TBC1D13), Formin-binding protein 1-like (FNBP1L), Translational activator GCN1
(GCN1L1), GRB2-related adapter protein (GRAP), G2/mitotic-specific cyclin-B1 (CCNB1), Myotubularin-related protein 12 (MTMR12), Protein FADD (FADD), Translational activator GCN1 (GCN1L1), Wings apart-like protein homolog (WAPAL), cAMP-dependent protein kinase type II-beta regulatory (PRKAR2B), Malcavernin (CCM2), MPP1 55 kDa erythrocyte membrane protein, Actin filament-associated protein 1 (AFAP1), Tensin-3 (TNS3), tRNA methyltransferase 112 homolog (TRMT112), Symplekin (SYMPK), TBC1 domain family member 2A (TBC1D2), ATR-interacting protein (ATRIP), Ataxin-10 (ATXN10), Succinate dehydrogenase assembly factor 2 (mitochondrial) (SDHAF2), Formin-binding protein 1
(FNBP1), Myotubularin-related protein 12 (MTMR12), Interferon-induced protein with tetratricopeptide (IFIT3), Protein CBFA2T2 (CBFA2T2), Neutrophil cytosol factor 1 (NCF1), or Protein syndesmos (NUDT16L1).
[00149] In some embodiments, a cysteine containing protein comprises a protein illustrated in
Tables 1-5 or Tables 7-9. In some instances, a cysteine containing protein comprises a protein illustrated in Table 1. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 1. In some instances, a cysteine containing protein comprises a protein illustrated in Table 2. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 2. In some instances, a cysteine containing protein comprises a protein illustrated in Table 3. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 3. In some instances, a cysteine containing protein comprises a protein illustrated in Table 4. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 4. In some instances, a cysteine containing protein comprises a protein illustrated in Table 5. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 5. In some instances, a cysteine containing protein comprises a protein illustrated in Table 7. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 7. In some instances, a cysteine containing protein comprises a protein illustrated in Table 8. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 8. In some instances, a cysteine containing protein comprises a protein illustrated in Table 9. In some embodiments, the cysteine containing protein comprises a cysteine residue denoted in Table 9. In some instances, the cysteine containing protein is a modified protein, in which the protein is modified at a cysteine residue site by a small molecule fragment described herein, such as for example, by a small molecule fragment of Formula (I) described herein, a cysteine-reactive probe of Formula (II) described herein, or by a small molecule fragment illustrated in Fig. 3.
[00150] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein. In some instances, the cysteine containing protein is selected from Table 3.
In some cases, one or more cysteine residues of each respective cysteine containing protein are denoted in Table 3. In some cases, a cysteine containing protein selected from Table 3 is modified by a small molecule fragment at at least one cysteine site denoted in Table 3 to generate a modified cysteine containing protein. In some cases, the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, KDM4B, NR3C1, GSTP1, TNFAIP3, ACAT1,
IRAK1, GNB2L1, IRF4, USP34, ZC3HAV1, USP7, PELI1, DCUN1D1, USP28, UBE2O,
RRAGC, MLTK, USP22, KDM3A, or USP16. In some cases, the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, GSTP1, ACAT1, IRAK1, IRF4, ZC3HAV1, USP7,
PELI1, USP28, UBE2O, RRAGC, MLTK, USP22, KDM3A, or USP16. In some cases, the cysteine containing protein is selected from KDM4B, NR3C1, TNFAIP3, USP7 or USP22. In some cases, the cysteine containing protein is selected from GNB2L1 or USP34. In some cases, the cysteine containing protein is DCUN1D1. In some cases, the cysteine containing protein is selected from PES1, IKBKB, GSTP1, ACAT1, IRAK1, ZC3HAV1 or RRAGC. In some cases, the cysteine containing protein is selected from XPO1, GNB2L1, USP34, UBE2O, MLTK or
USP22. In some cases, the cysteine containing protein is selected from KDM4B or NR3C1. In some cases, the cysteine containing protein is selected from TNFAIP3, USP7, USP28, KDM3A or USP16. In some cases, the cysteine containing protein is selected from IRF4, PELI1,
DCUN1D1 or USP22. In some cases, the cysteine containing protein is AIP. In some cases, the cysteine containing protein is an enzyme and the enzyme is selected from IKBKB, KDM4B,
GSTP1, TNFAIP3, ACAT1, IRAK1, USP34, USP7, PELI1, USP28, UBE2O, MLTK, USP22,
KDM3A, or USP16. In some cases, the cysteine containing protein is a transcription factor or regulator and the transcription factor or regulator is selected from NR3C1, IRF4 or ZC3HAV1.
In some cases, the cysteine containing protein is a channel, a transporter, or a receptor and the channel, transporter, or receptor is selected from GNB2L1 or RRAGC. In some cases, the cysteine containing protein is selected from AIP, PES1, XPO1 or DCUN1D1. In some cases, the cysteine containing protein is selected from PES1, CYR61, UBE2L6, XPO1, ADA, NR3C1,
POU2F2, UCHL3, MGMT, ERCC3, ACAT1, STAT3, UBA7, CASP2, IDH2, LRBA, UBE2L3,
RELB, IRF8, CASP8, PDIA6, PCK2, PFKFB4, PDE12, USP34, USP48, SMARCC2 or SAMHD1. In some cases, the cysteine containing protein is selected from PES1, CYR61, NR3C1, UCHL3, ERCC3, ACAT1, STAT3, CASP2, LRBA, UBE2L3, RELB, PDIA6, PCK2,
PFKFB4, USP48 or SMARCC2. In some cases, the cysteine containing protein is selected from
UBE2L6, POU2F2, MGMT, ACAT1, UBA7, CASP8, PDE12 or USP34. In some cases, the cysteine containing protein is selected from CYR61 or XPO1. In some cases, the cysteine containing protein is selected from ADA, MGMT, IDH2, IRF8 or SAMHD1. In some cases, the cysteine containing protein is selected from PES1, CYR61, XPO1, NR3C1 or SMARCC2. In some cases, the cysteine containing protein is selected from CYR61, UBE2L6, MGMT, ERCC3,
ACAT1 or USP48. In some cases, the cysteine containing protein is selected from ADA, RELB or USP34. In some cases, the cysteine containing protein is selected from UCHL3, CASP2,
IDH2, LRBA, CASP8, PCK2 or PDE12. In some cases, the cysteine containing protein is selected from MGMT, ACAT1, UBA7, UBE2L3 or IRF8. In some cases, the cysteine containing protein is selected from PFKFB4, ACAT1 or STAT3. In some cases, the cysteine containing protein is selected from POU2F2, PDIA6 or SAMHD1. In some cases, the cysteine containing protein is an enzyme and the enzyme is selected from UBE2L6, ADA, UCHL3,
MGMT, ERCC3, ACAT1, UBA7, CASP2, IDH2, UBE2L3, CASP8, PDIA6, PCK2, PFKFB4,
PDE12, USP34, USP48 or SAMHD1. In some cases, the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator is selected from
NR3C1, POU2F2, STAT3, RELB, IRF8 or SMARCC2. In some cases, the cysteine containing protein is selected from ZAP70, PRKCQ or PRMT1. In some cases, the cysteine containing protein is selected from ZAP70 or PRKCQ. In some cases, the cysteine containing protein is selected from CYR61, ZNF217, NCF1, IREB2, LRBA, CDK5, EP300, EZH2, UBE2S,
VCPIP1, RRAGC or IRAK4. In some cases, the cysteine containing protein is selected from
CYR61, ZNF217, IREB2, EP300, UBE2S, VCPIP1, RRAGC or IRAK4. In some cases, the cysteine containing protein is selected from NCF1, LRBA or CDK5. In some cases, the cysteine containing protein is EZH2. In some cases, the cysteine containing protein is selected from
ZNF217, NCF1, CDK5, EP300 or IRAK4. In some cases, the cysteine containing protein is selected from CYR61, IREB2, LRBA or UBE2S. In some cases, the cysteine containing protein is selected from EZH2, VCPIP1 or RRAGC. In some cases, the cysteine containing protein is an enzyme and the enzyme is selected from CDK5, EP300, EZH2, UBE2S, VCPIP1 or IRAK4. In some cases, the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator is selected from ZNF217 or IREB2. In some cases, the cysteine containing protein is an adapter, a scaffolding protein or a modulator protein and the adapter, scaffolding protein or the modulator protein is selected from NCF1. In some cases, the cysteine containing protein is a channel, a transporter or a receptor and the channel, transporter, or receptor is selected from RRAGC. In some cases, the cysteine containing protein is selected from CYR61 or LRBA. In some cases, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some cases, the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
Figure imgf000073_0001
, wherein
R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the small molecule fragment is a small
molecule fragment of Formula (I):
Figure imgf000073_0002
wherein RM is a reactive moiety selected from a
Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to the cysteine containing protein. In some cases, the small molecule fragment binds reversibly to the cysteine containing protein.
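The molecular weight convention stated above (counting carbon and hydrogen, optionally nitrogen, oxygen and sulfur, and excluding any halogen or transition metal) can be illustrated with a minimal Python sketch. The function name `fragment_mass`, the dictionary-based composition format, and the example composition C8H8ClNO are assumptions made for illustration only and are not taken from the specification; the atomic masses are standard average values.

```python
# Standard average atomic masses (g/mol) for the elements the text says are counted.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def fragment_mass(composition):
    """Sum the mass contributed by C, H, N, O and S atoms only.

    `composition` maps element symbols to atom counts (hypothetical input format).
    Any element not listed in ATOMIC_MASS, such as a halogen or a transition
    metal, is skipped, mirroring the exclusion described in the text.
    """
    return sum(
        ATOMIC_MASS[element] * count
        for element, count in composition.items()
        if element in ATOMIC_MASS
    )


# Hypothetical fragment with composition C8H8ClNO; the chlorine atom is ignored.
example = {"C": 8, "H": 8, "Cl": 1, "N": 1, "O": 1}
print(round(fragment_mass(example), 3))
```

Under this convention the chlorine atom of the example contributes nothing, so the reported value is lower than the full molecular weight of the same composition.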
[00151] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10A, enzymes. In some cases, one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10A. In some cases, a cysteine containing protein selected from Table 10A is modified by a small molecule fragment at at least one cysteine site denoted in Table 10A to generate a modified cysteine containing protein. In some cases, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some cases, the cysteine residue of the modified cysteine containing protein has the
structure SR, wherein R is selected from:
Figure imgf000074_0001
Figure imgf000074_0002
, wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some
cases, the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000074_0003
wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to the cysteine containing protein. In some cases, the small molecule fragment binds reversibly to the cysteine containing protein.
[00152] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10B, transcription factors and regulators. In some cases, one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10B. In some cases, a cysteine containing protein selected from Table 10B is modified by a small molecule fragment at at least one cysteine site denoted in Table 10B to generate a modified cysteine containing protein. In some cases, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some cases, the cysteine residue of the modified cysteine
containing protein has the structure SR, wherein R is selected from:
Figure imgf000075_0001
Figure imgf000075_0002
, wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the small molecule fragment is a small molecule
fragment of Formula (I):
Figure imgf000075_0003
wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to the cysteine containing protein. In some cases, the small molecule fragment binds reversibly to the cysteine containing protein.
[00153] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10C, channels, transporters or receptors. In some cases, one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10C. In some cases, a cysteine containing protein selected from Table 10C is modified by a small molecule fragment at at least one cysteine site denoted in Table 10C to generate a modified cysteine containing protein. In some cases, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some cases, the cysteine residue of the modified cysteine
containing protein has the structure SR, wherein R is selected from:
Figure imgf000076_0001
Figure imgf000076_0002
, wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the small molecule fragment is a small molecule
fragment of Formula (I):
Figure imgf000076_0003
wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to the cysteine containing protein. In some cases, the small molecule fragment binds reversibly to the cysteine containing protein.
[00154] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10D, adapter, scaffolding, or modulator proteins. In some cases, one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10D. In some cases, a cysteine containing protein selected from Table 10D is modified by a small molecule fragment at at least one cysteine site denoted in Table 10D to generate a modified cysteine containing protein. In some cases, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some cases, the cysteine residue of the modified cysteine
containing protein has the structure SR, wherein R is selected from:
Figure imgf000077_0001
Figure imgf000077_0002
, wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the small molecule fragment is a small molecule
fragment of Formula (I):
Figure imgf000077_0003
wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to the cysteine containing protein. In some cases, the small molecule fragment binds reversibly to the cysteine containing protein.
[00155] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is selected from Table 10E. In some cases, one or more cysteine residues of each respective cysteine containing protein are denoted in Table 10E. In some cases, a cysteine containing protein selected from Table 10E is modified by a small molecule fragment at at least one cysteine site denoted in Table 10E to generate a modified cysteine containing protein. In some cases, the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more. In some cases, the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected
from: [Figure: chemical structures for R],
wherein R1 is H, C1-C3 alkyl, or aryl; and F' is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some cases, the small molecule fragment is a small
molecule fragment of Formula (I): [Figure: structure of Formula (I)] wherein RM is a reactive moiety selected from a
Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to the cysteine containing protein. In some cases, the small molecule fragment binds reversibly to the cysteine containing protein.
[00156] In some embodiments, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, KDM4B, NR3C1, GSTP1, TNFAIP3, ACAT1, IRAK1, GNB2L1, IRF4, USP34, ZC3HAV1, USP7, PELI1, DCUN1D1, USP28, UBE2O, RRAGC, MLTK, USP22, KDM3A, or USP16.
[00157] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xn, wherein Xp is a polar residue, C* denotes the site of modification, and Xn is a nonpolar residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from AIP, PES1, IKBKB, XPO1, GSTP1, ACAT1, IRAK1, IRF4, ZC3HAV1, USP7, PELI1, USP28, UBE2O, RRAGC, MLTK, USP22, KDM3A, or USP16.
[00158] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xp, wherein Xp is a polar residue and C* denotes the site of modification. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from KDM4B, NR3C1, TNFAIP3, USP7 or USP22.
[00159] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xb, wherein Xp is a polar residue, C* denotes the site of modification, and Xb is a basic residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from GNB2L1 or USP34.
[00160] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Xa, wherein Xp is a polar residue, C* denotes the site of modification, and Xa is an acidic residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is DCUN1D1.
[00161] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif SC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from PES1, IKBKB, GSTP1, ACAT1, IRAK1, ZC3HAV1 or RRAGC.
[00162] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif NC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from XPO1, GNB2L1, USP34, UBE2O, MLTK or USP22.
[00163] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif YC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from KDM4B or NR3C1.
[00164] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif TC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from TNFAIP3, USP7, USP28, KDM3A or USP16.
[00165] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif QC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from IRF4, PELI1, DCUN1D1 or USP22.
[00166] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif CC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is AIP.
[00167] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an enzyme and the enzyme comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the enzyme is selected from IKBKB, KDM4B, GSTP1, TNFAIP3, ACAT1, IRAK1, USP34, USP7, PELI1, USP28, UBE2O, MLTK, USP22, KDM3A, or USP16.
[00168] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the transcription factor or regulator is selected from NR3C1, IRF4 or ZC3HAV1.
[00169] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a channel, transporter or a receptor and the channel, transporter or receptor comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the channel, transporter, or receptor is selected from GNB2L1 or RRAGC.
[00170] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XpC*Z, wherein Xp is a polar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from AIP, PES1, XPO1 or DCUN1D1.
[00171] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XnC*Z, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from PES1, CYR61, UBE2L6, XPO1, ADA, NR3C1, POU2F2, UCHL3, MGMT, ERCC3, ACAT1, STAT3, UBA7, CASP2, IDH2, LRBA, UBE2L3, RELB, IRF8, CASP8, PDIA6, PCK2, PFKFB4, PDE12, USP34, USP48, SMARCC2 or SAMHD1.
[00172] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XnC*Xn, wherein Xn is a nonpolar residue and C* denotes the site of modification. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from PES1, CYR61, NR3C1, UCHL3, ERCC3, ACAT1, STAT3, CASP2, LRBA, UBE2L3, RELB, PDIA6, PCK2, PFKFB4, USP48 or SMARCC2.
[00173] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XnC*Xp, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Xp is a polar residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from UBE2L6, POU2F2, MGMT, ACAT1, UBA7, CASP8, PDE12 or USP34.
[00174] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XnC*Xa, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Xa is an acidic residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from CYR61 or XPO1.
[00175] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XnC*Xb, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Xb is a basic residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from ADA, MGMT, IDH2, IRF8 or SAMHD1.
[00176] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif LC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from PES1, CYR61, XPO1, NR3C1 or SMARCC2.
[00177] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif PC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from CYR61, UBE2L6, MGMT, ERCC3, ACAT1 or USP48.
[00178] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif GC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from ADA, RELB or USP34.
[00179] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif AC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from UCHL3, CASP2, IDH2, LRBA, CASP8, PCK2 or PDE12.
[00180] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif VC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from MGMT, ACAT1, UBA7, UBE2L3 or IRF8.
[00181] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif IC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from PFKFB4, ACAT1 or STAT3.
[00182] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XrC*Z, wherein Xr denotes an aromatic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from POU2F2, PDIA6 or SAMHD1.
[00183] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an enzyme and the enzyme comprises the motif XnC*Z, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the enzyme is selected from UBE2L6, ADA, UCHL3, MGMT, ERCC3, ACAT1, UBA7, CASP2, IDH2, UBE2L3, CASP8, PDIA6, PCK2, PFKFB4, PDE12, USP34, USP48 or SAMHD1.
[00184] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator comprises the motif XnC*Z, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the transcription factor or regulator is selected from NR3C1, POU2F2, STAT3, RELB, IRF8 or SMARCC2.
[00185] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XnC*Z, wherein Xn is a nonpolar residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from PES1, CYR61, XPO1 or LRBA.
[00186] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XaC*Z, wherein Xa is an acidic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from ZAP70, PRKCQ or PRMT1.
[00187] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif EC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from ZAP70 or PRKCQ.
[00188] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XbC*Z, wherein Xb is a basic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from CYR61, ZNF217, NCF1, IREB2, LRBA, CDK5, EP300, EZH2, UBE2S, VCPIP1, RRAGC or IRAK4.
[00189] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XbC*Xn, wherein Xb is a basic residue, C* denotes the site of modification, and Xn is a nonpolar residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from CYR61, ZNF217, IREB2, EP300, UBE2S, VCPIP1, RRAGC or IRAK4.
[00190] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XbC*Xp, wherein Xb is a basic residue, C* denotes the site of modification, and Xp is a polar residue. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from NCF1, LRBA or CDK5.
[00191] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XbC*Xb, wherein Xb is a basic residue and C* denotes the site of modification. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is EZH2.
[00192] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif RC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from ZNF217, NCF1, CDK5, EP300 or IRAK4.
[00193] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif KC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from CYR61, IREB2, LRBA or UBE2S.
[00194] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif HC*Z, wherein C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from EZH2, VCPIP1 or RRAGC.
[00195] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an enzyme and the enzyme comprises the motif XbC*Z, wherein Xb is a basic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the enzyme is selected from CDK5, EP300, EZH2, UBE2S, VCPIP1 or IRAK4.
[00196] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a transcription factor or a regulator and the transcription factor or regulator comprises the motif XbC*Z, wherein Xb is a basic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the transcription factor or regulator is selected from ZNF217 or IREB2.
[00197] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is an adapter, a scaffolding protein, or a modulator protein and the adapter, scaffolding protein or the modulator protein comprises the motif XbC*Z, wherein Xb is a basic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the adapter, scaffolding protein or the modulator protein is selected from NCF1.
[00198] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein is a channel, a transporter, or a receptor and the channel, transporter, or receptor comprises the motif XbC*Z, wherein Xb is a basic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the channel, transporter, or receptor is selected from RRAGC.
[00199] In some instances, described herein is a modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, in which the cysteine containing protein comprises the motif XbC*Z, wherein Xb is a basic residue, C* denotes the site of modification, and Z is any amino acid. In some cases, the cysteine containing protein is selected from Table 3. In some cases, the cysteine containing protein is selected from CYR61 or LRBA.
[00200] In some cases, a cysteine containing protein described above comprises about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
[00201] In some cases, the cysteine residue of a modified cysteine containing protein described above has the structure SR, wherein R is selected from:
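The motif notation used in the preceding paragraphs (for example XpC*Xn) can be illustrated with a minimal sketch. The residue classes below are read off the single-letter motifs listed above (S, N, Y, T, Q, C as polar; L, P, G, A, V, I as nonpolar; E as acidic; R, K, H as basic). The placement of F, W, M and D is an assumption, since the text does not list them, and aromatic residues are grouped with nonpolar residues here only because the Xr proteins also appear in the XnC*Z list. The function name and the example peptides are hypothetical.

```python
# Residue classes inferred from the motif paragraphs; F, W, M and D are
# assumptions because the text does not assign them explicitly.
RESIDUE_CLASS = {
    **dict.fromkeys("SNYTQC", "p"),   # polar (Xp)
    **dict.fromkeys("LPGAVIM", "n"),  # nonpolar (Xn); M is assumed
    **dict.fromkeys("FW", "n"),       # aromatic, grouped with nonpolar (assumed)
    **dict.fromkeys("DE", "a"),       # acidic (Xa); D is assumed
    **dict.fromkeys("RKH", "b"),      # basic (Xb)
}


def cysteine_motif(sequence: str, cys_index: int) -> str:
    """Return a motif label such as 'XpC*Xn' for the cysteine at a 0-based index."""
    if sequence[cys_index] != "C":
        raise ValueError("cys_index does not point at a cysteine")
    if cys_index == 0 or cys_index == len(sequence) - 1:
        raise ValueError("cysteine needs a residue on both sides")
    before = RESIDUE_CLASS[sequence[cys_index - 1]]
    after = RESIDUE_CLASS[sequence[cys_index + 1]]
    return f"X{before}C*X{after}"


# Hypothetical peptides, not sequences from the tables.
print(cysteine_motif("ASCLK", 2))
print(cysteine_motif("GKCRA", 2))
```

The first call labels a cysteine preceded by serine and followed by leucine, and the second a cysteine flanked by lysine and arginine. A motif written with Z (any amino acid) corresponds to ignoring the second class.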
Figure imgf000087_0001
Figure imgf000087_0002
, wherein R is H, C1-C3 alkyl, or aryl; and F is the small molecule fragment moiety. In some cases, the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher. In some cases, the molecular weight of the small molecule fragment is prior to enrichment with a halogen, a nonmetal, or a transition metal. In some embodiments, the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms. In some embodiments, the molecular weight of the small molecule fragment does not include the molecular weight of a halogen, a transition metal or a combination thereof. In some
cases, the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000087_0003
wherein RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety. In some cases, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some cases, the small molecule fragment binds irreversibly to a cysteine containing protein described above. In some cases, the small molecule fragment binds reversibly to a cysteine containing protein described above.
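One possible reading of the molecular weight convention above (weight based on carbon and hydrogen, optionally nitrogen, oxygen and sulfur, and excluding halogens and transition metals) is sketched below. This is an illustrative interpretation only. The function name is hypothetical, the parser handles only simple formulas without parentheses, and the masses are standard average atomic masses rather than values given in the text.

```python
import re

# Average atomic masses (g/mol) for the elements the text says are counted.
COUNTED_MASSES = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def fragment_weight(formula: str) -> float:
    """Sum the masses of C, H, N, O and S in a simple molecular formula.

    Any other element (for example a halogen or a transition metal) is
    skipped, mirroring the exclusion described in the text.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol in COUNTED_MASSES:
            total += COUNTED_MASSES[symbol] * (int(count) if count else 1)
    return total


# 2-Chloroacetamide, C2H4ClNO, used here purely as an example formula.
print(round(fragment_weight("C2H4ClNO"), 2))
```

For the example formula, the chlorine atom is left out of the sum, so the result is about 58.06 rather than the full molecular weight of the compound.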
Compositions, Cells, and Cell Populations
[00202] Also disclosed herein are compositions of a small molecule fragment conjugated to a cysteine containing protein, a cysteine-reactive probe conjugated to a cysteine containing protein, and treated sample compositions. In some embodiments, a composition described herein comprises a small molecule fragment of Formula (I):
Figure imgf000088_0001
Formula (I)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety; and
a cysteine containing protein wherein the cysteine containing protein is covalently bonded to the small molecule fragment.
[00203] In some embodiments, also described herein is a composition that comprises a cysteine-reactive probe of Formula (II):
Figure imgf000088_0002
Formula (II)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety; and
a cysteine containing protein wherein the cysteine containing protein is covalently bonded to the cysteine-reactive probe.
[00204] In some embodiments, also described herein is a composition that comprises an isolated sample wherein the isolated sample is an isolated cell or a tissue sample; and a cysteine-reactive probe to be assayed for its ability to interact with a cysteine containing protein expressed in the isolated sample.
[00205] Further disclosed herein are isolated treated cells and cell populations. In some embodiments, described herein is an isolated treated cell that comprises a cysteine-reactive probe covalently attached to a cysteine containing protein. In some instances, the isolated treated cell further comprises a set of cysteine-reactive probes wherein each of the cysteine-reactive probes is covalently attached to a cysteine containing protein.
[00206] In some embodiments, described herein is an isolated treated cell that comprises a small molecule fragment covalently attached to a cysteine containing protein. In some instances, the isolated treated cell further comprises a set of small molecule fragments wherein each of the small molecule fragments is covalently attached to a cysteine containing protein. In some instances, the isolated treated cell further comprises a cysteine-reactive probe. In some instances, the isolated treated cell further comprises a set of cysteine-reactive probes.
[00207] In some embodiments, also described herein is an isolated treated population of cells that comprises a set of cysteine-reactive probes covalently attached to cysteine containing proteins.
[00208] In some embodiments, further described herein is an isolated treated population of cells that comprises a set of small molecule fragments covalently attached to cysteine containing proteins. In some instances, the isolated treated population of cells further comprises a set of cysteine-reactive probes.
[00209] As disclosed elsewhere herein, the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000089_0001
Formula (I)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and F is a small molecule fragment moiety.
[00210] In some instances, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some cases, F is obtained from a compound library. In some embodiments, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library. In some cases, F is a small molecule fragment moiety illustrated in Fig. 3. In some cases, F further comprises a linker moiety that connects F to the carbonyl moiety. In some embodiments, the small molecule fragment is a small molecule fragment illustrated in Fig. 3.
[00211] Also described elsewhere herein, the cysteine-reactive probe is a cysteine-reactive probe of Formula (II):
Figure imgf000090_0001
Formula (II)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and AHM is an affinity handle moiety.
[00212] In some embodiments, the Michael acceptor moiety comprises an alkene or an alkyne moiety. In some instances, the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein. In some cases, the binding moiety is a small molecule fragment obtained from a compound library. In some cases, the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
[00213] In some instances, the affinity handle is a bioorthogonal affinity handle. In some cases, the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group. In some cases, the affinity handle comprises an alkyne or an azide group. In some instances, the affinity handle is further conjugated to an affinity ligand. In some instances, the affinity ligand comprises a chromophore, a labeling group, or a combination thereof. In some cases, the chromophore comprises a fluorochrome, a non-fluorochrome chromophore, a quencher, an absorption chromophore, a fluorophore, an organic dye, an inorganic dye, a metal chelate, or a fluorescent enzyme substrate. In some cases, the labeling group is a biotin moiety, a streptavidin moiety, a bead, a resin, a solid support, or a combination thereof. In some instances, the affinity handle moiety further comprises a chromophore. In some embodiments, the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
[00214] Further described elsewhere herein, the cell or cell population is obtained from any mammal, such as human or non-human primates. In some embodiments, the cell or cell population is an epithelial cell, connective tissue cell, hormone secreting cell, a nerve cell, a skeletal muscle cell, a blood cell, or an immune system cell. In additional embodiments, the cell or cell population is cancerous or is obtained from a tumor site.
Polypeptides comprising a cysteine interacting site
[00215] Further disclosed herein are polypeptides that comprise one or more of the cysteine interacting sites identified by a method described herein. In some embodiments, described herein is an isolated and purified polypeptide that comprises at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide comprises at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the isolated and purified polypeptide comprises 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some instances, the isolated and purified polypeptide consists of 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9. In some instances, the isolated and purified polypeptide is at most 50 amino acids in length.
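By way of a non-limiting illustration, one way to evaluate sequence identity over a stretch of contiguous amino acids is sketched below in Python. This is a minimal sketch under stated assumptions and not the method used to generate Tables 1-3 or 8-9: the function name best_window_identity is hypothetical, identity is computed as an ungapped, position-by-position match fraction, and the window length defaults to seven residues.

```python
def best_window_identity(candidate, reference, window=7):
    """Return the highest fractional identity between any `window`-residue
    stretch of `reference` and any equally long stretch of `candidate`.

    Hypothetical helper: ungapped comparison only, no substitution matrix.
    """
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_win = candidate[j:j + window]
            # Count positions at which the two windows carry the same residue.
            matches = sum(a == b for a, b in zip(ref_win, cand_win))
            best = max(best, matches / window)
    return best
```

Under this simple definition, a single mismatch in a window of exactly seven residues gives 6/7 (about 85.7%) identity, so a threshold of 90% or more is met over seven residues only by an exact match; longer windows would be needed for a mismatch to be tolerated.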
[00216] In some embodiments, additionally described herein is a nucleic acid encoding a polypeptide that comprises at least 90% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encodes a polypeptide that comprises at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encodes a polypeptide that comprises 100% sequence identity to at least seven contiguous amino acids of an amino acid sequence selected from Tables 1-3 or 8-9. In some embodiments, the nucleic acid encodes a polypeptide that consists of 100% sequence identity to the full length of an amino acid sequence selected from Tables 1-3 or 8-9.
[00217] In some embodiments, further disclosed herein is a method of mapping a biologically active cysteine site on a protein, which comprises harvesting a set of cysteine-reactive probe-protein complexes from a sample, wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine-containing protein; analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and based on the previous step, mapping the biologically active cysteine site on the protein.
[00218] In some embodiments, the analyzing further comprises treating the set of cysteine-reactive probe-protein complexes with a protease to generate a set of protein fragments. The protease is a serine protease, a threonine protease, a cysteine protease, an aspartate protease, a glutamic acid protease, or a metalloprotease. In some instances, the protease is a serine protease. In some instances, the protease is trypsin. In some instances, the cysteine-reactive probe-protein complex is further attached to a labeling group such as a biotin moiety. In some instances, the labeling group such as a biotin moiety further comprises a linker. In some instances, the linker is a peptide. In some instances, the peptide linker is about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more amino acid residues in length. In some instances, the peptide linker contains a cleavage site. A non-limiting list of cleavage sites includes those of Tobacco Etch Virus (TEV) protease, thrombin (Thr), enterokinase (EKT), activated Factor X (Xa), or human Rhinovirus 3C protease (3C/PreScission). In some instances, the peptide linker contains a TEV protease cleavage site. In some instances, the TEV protease cleavage site comprises the following sequence: Gly-Gln-Phe-Tyr-Leu-Asn-Glu (SEQ ID NO: 860). In some instances, the biotin moiety is further coupled to a bead (e.g. a streptavidin-coupled bead).
[00219] In some instances, the protein from the cysteine-reactive probe-protein complex attached to the bead (via a biotin moiety comprising a linker and attached to a streptavidin-coupled bead) is digested with trypsin, and the immobilized peptide or protein fragment is further separated and collected. In some instances, the collected peptide or protein fragment is then digested by a protease (e.g. TEV protease), and the treated protein fragment is then separated, and collected for analysis. In some instances, the analysis is a proteomic analysis as described above and elsewhere herein. In some instances, the sequence of the protein fragment is further determined. In some instances, the protein fragment correlates to a small molecule fragment binding site on the cysteine-containing protein.
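By way of a non-limiting illustration, the tryptic digestion step can be modeled in silico to predict which peptides of a protein carry a cysteine residue. The Python sketch below is illustrative only and rests on stated assumptions: it applies the commonly used rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) except when the next residue is proline (P), it ignores missed cleavages and probe modification, and the function names and the example sequence are hypothetical.

```python
def trypsin_digest(sequence):
    """Split a one-letter protein sequence after K or R, unless followed by P.

    Simplified model (assumption): complete digestion, no missed cleavages.
    """
    seq = sequence.upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append(seq[start:])
    return peptides


def cysteine_peptides(sequence):
    """Return the predicted tryptic peptides that contain at least one cysteine."""
    return [p for p in trypsin_digest(sequence) if "C" in p]


# Hypothetical example sequence, not taken from Tables 1-3 or 8-9.
print(cysteine_peptides("MAKCDERPLKGCR"))
```

In this example the arginine followed by proline is not cleaved, so the peptide CDERPLK remains intact.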
[00220] In some embodiments, the sequence of the protein fragment correlates to a sequence as illustrated in Tables 1-3 or 8-9. In some instances, the sequence as shown in Tables 1-3 or 8-9 correlates to a site on the full length protein that is a drug binding site. In some instances, the sequence as shown in Tables 1-3 or 8-9 correlates to a drug binding site. In some instances, polypeptides comprising one or more of the sequences as shown in Tables 1-3 or 8-9 serve as probes for small molecule fragment screening.
[00221] In some instances after the generation of a polypeptide, the polypeptide is subjected to one or more rounds of purification steps to remove impurities. In some instances, the purification step is a chromatographic step utilizing separation methods such as affinity-based, size-exclusion based, ion-exchange based, or the like. In some cases, the polypeptide is at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities. In some cases, the polypeptide is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities.
[00222] As described above, nucleic acid encoding a polypeptide that is derived from a cysteine-containing protein is subjected to one or more rounds of purification steps to remove impurities. In some instances, the purification step is a chromatographic step utilizing separation methods such as affinity-based, size-exclusion based, ion-exchange based, or the like. In some cases, the nucleic acid is at most 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities. In some cases, the nucleic acid is at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 99.9%, or 100% pure or without the presence of impurities.
[00223] As used herein, a polypeptide includes natural amino acids, unnatural amino acids, or a combination thereof. In some instances, an amino acid residue refers to a molecule containing both an amino group and a carboxyl group. Suitable amino acids include, without limitation, both the D- and L-isomers of the naturally-occurring amino acids, as well as non-naturally occurring amino acids prepared by organic synthesis or other metabolic routes. The term amino acid, as used herein, includes, without limitation, α-amino acids, natural amino acids, non-natural amino acids, and amino acid analogs.
[00224] The term "α-amino acid" refers to a molecule containing both an amino group and a carboxyl group bound to a carbon which is designated the α-carbon.
[00225] The term "β-amino acid" refers to a molecule containing both an amino group and a carboxyl group in a β configuration.
[00226] "Naturally occurring amino acid" refers to any one of the twenty amino acids commonly found in peptides synthesized in nature, and known by the one letter abbreviations A, R, N, C, D, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V.
[00227] The following table shows a summary of the properties of natural amino acids:
Amino Acid      3-Letter Code   1-Letter Code   Side-chain Polarity   Side-chain charge (pH 7.4)      Hydropathy Index
Alanine         Ala             A               nonpolar              neutral                         1.8
Arginine        Arg             R               polar                 positive                        -4.5
Asparagine      Asn             N               polar                 neutral                         -3.5
Aspartic acid   Asp             D               polar                 negative                        -3.5
Cysteine        Cys             C               polar                 neutral                         2.5
Glutamic acid   Glu             E               polar                 negative                        -3.5
Glutamine       Gln             Q               polar                 neutral                         -3.5
Glycine         Gly             G               nonpolar              neutral                         -0.4
Histidine       His             H               polar                 positive (10%), neutral (90%)   -3.2
Isoleucine      Ile             I               nonpolar              neutral                         4.5
Leucine         Leu             L               nonpolar              neutral                         3.8
Lysine          Lys             K               polar                 positive                        -3.9
Methionine      Met             M               nonpolar              neutral                         1.9
Phenylalanine   Phe             F               nonpolar              neutral                         2.8
Proline         Pro             P               nonpolar              neutral                         -1.6
Serine          Ser             S               polar                 neutral                         -0.8
Threonine       Thr             T               polar                 neutral                         -0.7
Tryptophan      Trp             W               nonpolar              neutral                         -0.9
Tyrosine        Tyr             Y               polar                 neutral                         -1.3
Valine          Val             V               nonpolar              neutral                         4.2
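By way of a non-limiting illustration, a hydropathy index of this kind can be averaged over a peptide to estimate its overall hydrophobicity. The Python sketch below is illustrative only: it uses the published Kyte-Doolittle scale, which is assumed here to be the scale the table reports, and the names KYTE_DOOLITTLE and mean_hydropathy are hypothetical.

```python
# Kyte-Doolittle hydropathy values keyed by one-letter code
# (assumption: this is the scale behind the table's "Hydropathy Index" column).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "E": -3.5, "Q": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide; positive values indicate a more
    hydrophobic peptide. Raises KeyError for residues outside the 20
    naturally occurring amino acids."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)
```

Non-natural amino acids and analogs have no entry in this scale, so a peptide containing them would need additional, separately sourced values.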
[00228] "Hydrophobic amino acids" include small hydrophobic amino acids and large hydrophobic amino acids. "Small hydrophobic amino acids" are glycine, alanine, proline, and analogs thereof. "Large hydrophobic amino acids" are valine, leucine, isoleucine, phenylalanine, methionine, tryptophan, and analogs thereof. "Polar amino acids" are serine, threonine, asparagine, glutamine, cysteine, tyrosine, and analogs thereof. "Charged amino acids" are lysine, arginine, histidine, aspartate, glutamate, and analogs thereof. In some cases, aspartic acid and glutamic acid are referred to as acidic amino acids. In other cases, lysine, arginine and histidine are referred to as basic amino acids.
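By way of a non-limiting illustration, the four groupings defined in this paragraph can be applied to a peptide sequence to summarize its composition. The Python sketch below is illustrative only: it covers the twenty naturally occurring amino acids and not their analogs, and the names RESIDUE_CLASSES, CLASS_OF and class_composition are hypothetical.

```python
from collections import Counter

# Groupings as defined in the paragraph above, by one-letter code.
RESIDUE_CLASSES = {
    "small hydrophobic": "GAP",
    "large hydrophobic": "VLIFMW",
    "polar": "STNQCY",
    "charged": "KRHDE",
}

# Invert the mapping so each residue points at its class name.
CLASS_OF = {
    residue: name
    for name, residues in RESIDUE_CLASSES.items()
    for residue in residues
}


def class_composition(sequence):
    """Count residues per class; anything unrecognized is tallied as 'other'."""
    return Counter(CLASS_OF.get(r, "other") for r in sequence.upper())
```

For example, class_composition("GAVCK") tallies two small hydrophobic residues and one each of the large hydrophobic, polar, and charged classes.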
[00229] The term "amino acid analog" refers to a molecule which is structurally similar to an amino acid and which is substituted for an amino acid in the formation of a peptidomimetic macrocycle. Amino acid analogs include, without limitation, β-amino acids and amino acids where the amino or carboxy group is substituted by a similarly reactive group (e.g., substitution of the primary amine with a secondary or tertiary amine, or substitution of the carboxy group with an ester).
[00230] The term "non-natural amino acid" refers to an amino acid which is not one of the twenty amino acids commonly found in peptides synthesized in nature, and known by the one letter abbreviations A, R, N, C, D, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V.
[00231] In some instances, amino acid analogs include β-amino acid analogs. Examples of β-amino acid analogs include, but are not limited to, the following: cyclic β-amino acid analogs; β-alanine; (R)-β-phenylalanine; (R)-1,2,3,4-tetrahydro-isoquinoline-3-acetic acid; (R)-3-amino-4-(1-naphthyl)-butyric acid; (R)-3-amino-4-(2,4-dichlorophenyl)butyric acid; (R)-3-amino-4-(2-chlorophenyl)-butyric acid; (R)-3-amino-4-(2-cyanophenyl)-butyric acid; (R)-3-amino-4-(2-fluorophenyl)-butyric acid; (R)-3-amino-4-(2-furyl)-butyric acid; (R)-3-amino-4-(2-methylphenyl)-butyric acid; (R)-3-amino-4-(2-naphthyl)-butyric acid; (R)-3-amino-4-(2-thienyl)-butyric acid; (R)-3-amino-4-(2-trifluoromethylphenyl)-butyric acid; (R)-3-amino-4-(3,4-dichlorophenyl)butyric acid; (R)-3-amino-4-(3,4-difluorophenyl)butyric acid; (R)-3-amino-4-(3-benzothienyl)-butyric acid; (R)-3-amino-4-(3-chlorophenyl)-butyric acid; (R)-3-amino-4-(3-cyanophenyl)-butyric acid; (R)-3-amino-4-(3-fluorophenyl)-butyric acid; (R)-3-amino-4-(3-methylphenyl)-butyric acid; (R)-3-amino-4-(3-pyridyl)-butyric acid; (R)-3-amino-4-(3-thienyl)-butyric acid; (R)-3-amino-4-(3-trifluoromethylphenyl)-butyric acid; (R)-3-amino-4-(4-bromophenyl)-butyric acid; (R)-3-amino-4-(4-chlorophenyl)-butyric acid; (R)-3-amino-4-(4-cyanophenyl)-butyric acid; (R)-3-amino-4-(4-fluorophenyl)-butyric acid; (R)-3-amino-4-(4-iodophenyl)-butyric acid; (R)-3-amino-4-(4-methylphenyl)-butyric acid; (R)-3-amino-4-(4-nitrophenyl)-butyric acid; (R)-3-amino-4-(4-pyridyl)-butyric acid; (R)-3-amino-4-(4-trifluoromethylphenyl)-butyric acid; (R)-3-amino-4-pentafluoro-phenylbutyric acid; (R)-3-amino-5-hexenoic acid; (R)-3-amino-5-hexynoic acid; (R)-3-amino-5-phenylpentanoic acid; (R)-3-amino-6-phenyl-5-hexenoic acid; (S)-1,2,3,4-tetrahydro-isoquinoline-3-acetic acid; (S)-3-amino-4-(1-naphthyl)-butyric acid; (S)-3-amino-4-(2,4-dichlorophenyl)butyric acid; (S)-3-amino-4-(2-chlorophenyl)-butyric acid; (S)-3-amino-4-(2-cyanophenyl)-butyric acid; (S)-3-amino-4-(2-fluorophenyl)-butyric acid; (S)-3-amino-4-(2-furyl)-butyric acid; (S)-3-amino-4-(2-methylphenyl)-butyric acid; (S)-3-amino-4-(2-naphthyl)-butyric acid; (S)-3-amino-4-(2-thienyl)-butyric acid; (S)-3-amino-4-(2-trifluoromethylphenyl)-butyric acid; (S)-3-amino-4-(3,4-dichlorophenyl)butyric acid; (S)-3-amino-4-(3,4-difluorophenyl)butyric acid; (S)-3-amino-4-(3-benzothienyl)-butyric acid; (S)-3-amino-4-(3-chlorophenyl)-butyric acid; (S)-3-amino-4-(3-cyanophenyl)-butyric acid; (S)-3-amino-4-(3-fluorophenyl)-butyric acid; (S)-3-amino-4-(3-methylphenyl)-butyric acid; (S)-3-amino-4-(3-pyridyl)-butyric acid; (S)-3-amino-4-(3-thienyl)-butyric acid; (S)-3-amino-4-(3-trifluoromethylphenyl)-butyric acid; (S)-3-amino-4-(4-bromophenyl)-butyric acid; (S)-3-amino-4-(4-chlorophenyl)butyric acid; (S)-3-amino-4-(4-cyanophenyl)-butyric acid; (S)-3-amino-4-(4-fluorophenyl)butyric acid; (S)-3-amino-4-(4-iodophenyl)-butyric acid; (S)-3-amino-4-(4-methylphenyl)-butyric acid; (S)-3-amino-4-(4-nitrophenyl)-butyric acid; (S)-3-amino-4-(4-pyridyl)-butyric acid; (S)-3-amino-4-(4-trifluoromethylphenyl)-butyric acid; (S)-3-amino-4-pentafluoro-phenylbutyric acid; (S)-3-amino-5-hexenoic acid; (S)-3-amino-5-hexynoic acid; (S)-3-amino-5-phenylpentanoic acid; (S)-3-amino-6-phenyl-5-hexenoic acid; 1,2,5,6-tetrahydropyridine-3-carboxylic acid; 1,2,5,6-tetrahydropyridine-4-carboxylic acid; 3-amino-3-(2-chlorophenyl)-propionic acid; 3-amino-3-(2-thienyl)-propionic acid; 3-amino-3-(3-bromophenyl)-propionic acid; 3-amino-3-(4-chlorophenyl)-propionic acid; 3-amino-3-(4-methoxyphenyl)-propionic acid; 3-amino-4,4,4-trifluoro-butyric acid; 3-aminoadipic acid; D-β-phenylalanine; β-leucine; L-β-homoalanine; L-β-homoaspartic acid γ-benzyl ester; L-β-homoglutamic acid δ-benzyl ester; L-β-homoisoleucine; L-β-homoleucine; L-β-homomethionine; L-β-homophenylalanine; L-β-homoproline; L-β-homotryptophan; L-β-homovaline; L-Nω-benzyloxycarbonyl-β-homolysine; Nω-L-β-homoarginine; O-benzyl-L-β-homohydroxyproline; O-benzyl-L-β-homoserine; O-benzyl-L-β-homothreonine; O-benzyl-L-β-homotyrosine; γ-trityl-L-β-homoasparagine; (R)-β-phenylalanine; L-β-homoaspartic acid γ-t-butyl ester; L-β-homoglutamic acid δ-t-butyl ester; L-Nω-β-homolysine; Nδ-trityl-L-β-homoglutamine; Nω-2,2,4,6,7-pentamethyl-dihydrobenzofuran-5-sulfonyl-L-β-homoarginine; O-t-butyl-L-β-homohydroxy-proline; O-t-butyl-L-β-homoserine; O-t-butyl-L-β-homothreonine; O-t-butyl-L-β-homotyrosine; 2-aminocyclopentane carboxylic acid; and 2-aminocyclohexane carboxylic acid.
[00232] In some instances, amino acid analogs include analogs of alanine, valine, glycine or leucine. Examples of amino acid analogs of alanine, valine, glycine, and leucine include, but are not limited to, the following: α-methoxyglycine; α-allyl-L-alanine; α-aminoisobutyric acid; α-methyl-leucine; β-(1-naphthyl)-D-alanine; β-(1-naphthyl)-L-alanine; β-(2-naphthyl)-D-alanine; β-(2-naphthyl)-L-alanine; β-(2-pyridyl)-D-alanine; β-(2-pyridyl)-L-alanine; β-(2-thienyl)-D-alanine; β-(2-thienyl)-L-alanine; β-(3-benzothienyl)-D-alanine; β-(3-benzothienyl)-L-alanine; β-(3-pyridyl)-D-alanine; β-(3-pyridyl)-L-alanine; β-(4-pyridyl)-D-alanine; β-(4-pyridyl)-L-alanine; β-chloro-L-alanine; β-cyano-L-alanine; β-cyclohexyl-D-alanine; β-cyclohexyl-L-alanine; β-cyclopenten-1-yl-alanine; β-cyclopentyl-alanine; β-cyclopropyl-L-Ala-OH.dicyclohexylammonium salt; β-t-butyl-D-alanine; β-t-butyl-L-alanine; γ-aminobutyric acid; L-α,β-diaminopropionic acid; 2,4-dinitro-phenylglycine; 2,5-dihydro-D-phenylglycine; 2-amino-4,4,4-trifluorobutyric acid; 2-fluoro-phenylglycine; 3-amino-4,4,4-trifluoro-butyric acid; 3-fluoro-valine; 4,4,4-trifluoro-valine; 4,5-dehydro-L-leu-OH.dicyclohexylammonium salt; 4-fluoro-D-phenylglycine; 4-fluoro-L-phenylglycine; 4-hydroxy-D-phenylglycine; 5,5,5-trifluoro-leucine; 6-aminohexanoic acid; cyclopentyl-D-Gly-OH.dicyclohexylammonium salt; cyclopentyl-Gly-OH.dicyclohexylammonium salt; D-α,β-diaminopropionic acid; D-α-aminobutyric acid; D-α-t-butyl glycine; D-(2-thienyl)glycine; D-(3-thienyl)glycine; D-2-aminocaproic acid; D-2-indanylglycine; D-allylglycine-dicyclohexylammonium salt; D-cyclohexylglycine; D-norvaline; D-phenylglycine; β-aminobutyric acid; β-aminoisobutyric acid; (2-bromophenyl)glycine; (2-methoxyphenyl)glycine; (2-methylphenyl)glycine; (2-thiazoyl)glycine; (2-thienyl)glycine; 2-amino-3-(dimethylamino)-propionic acid; L-α,β-diaminopropionic acid; L-α-aminobutyric acid; L-α-t-butyl glycine; L-(3-thienyl)glycine; L-2-amino-3-(dimethylamino)-propionic acid; L-2-aminocaproic acid dicyclohexyl-ammonium salt; L-2-indanyl glycine; L-allylglycine.dicyclohexylammonium salt; L-cyclohexylglycine; L-phenylglycine; L-propargylglycine; L-norvaline; N-α-aminomethyl-L-alanine; D-α,γ-diaminobutyric acid; L-α,γ-diaminobutyric acid; β-cyclopropyl-L-alanine; (N-β-(2,4-dinitrophenyl))-L-α,β-diaminopropionic acid; (N-β-1-(4,4-dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl)-D-α,β-diaminopropionic acid; (N-β-1-(4,4-dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl)-L-α,β-diaminopropionic acid; (N-β-4-methyltrityl)-L-α,β-diaminopropionic acid; (N-β-allyloxycarbonyl)-L-α,β-diaminopropionic acid; (N-γ-1-(4,4-dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl)-D-α,γ-diaminobutyric acid; (N-γ-1-(4,4-dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl)-L-α,γ-diaminobutyric acid; (N-γ-4-methyltrityl)-D-α,γ-diaminobutyric acid; (N-γ-4-methyltrityl)-L-α,γ-diaminobutyric acid; (N-γ-allyloxycarbonyl)-L-α,γ-diaminobutyric acid; D-α,γ-diaminobutyric acid; 4,5-dehydro-L-leucine; cyclopentyl-D-Gly-OH; cyclopentyl-Gly-OH; D-allyl glycine; D-homocyclohexylalanine; L-1-pyrenylalanine; L-2-aminocaproic acid; L-allylglycine; L-homocyclohexylalanine; and N-(2-hydroxy-4-methoxy-Bzl)-Gly-OH.
[00233] In some instances, amino acid analogs include analogs of arginine or lysine. Examples of amino acid analogs of arginine and lysine include, but are not limited to, the following: citrulline; L-2-amino-3-guanidinopropionic acid; L-2-amino-3-ureidopropionic acid; L-citrulline; Lys(Me)2-OH; Lys(N3)-OH; Nδ-benzyloxycarbonyl-L-ornithine; Nω-nitro-D-arginine; Nω-nitro-L-arginine; α-methyl-ornithine; 2,6-diaminoheptanedioic acid; L-ornithine; (Nδ-1-(4,4-dimethyl-2,6-dioxo-cyclohex-1-ylidene)ethyl)-D-ornithine; (Nδ-1-(4,4-dimethyl-2,6-dioxo-cyclohex-1-ylidene)ethyl)-L-ornithine; (Nδ-4-methyltrityl)-D-ornithine; (Nδ-4-methyltrityl)-L-ornithine; D-ornithine; L-ornithine; Arg(Me)(Pbf)-OH; Arg(Me)2-OH (asymmetrical); Arg(Me)2-OH (symmetrical); Lys(ivDde)-OH; Lys(Me)2-OH.HCl; Lys(Me3)-OH chloride; Nω-nitro-D-arginine; and Nω-nitro-L-arginine.
[00234] In some instances, amino acid analogs include analogs of aspartic or glutamic acids. Examples of amino acid analogs of aspartic and glutamic acids include, but are not limited to, the following: α-methyl-D-aspartic acid; α-methyl-glutamic acid; α-methyl-L-aspartic acid; γ-methylene-glutamic acid; (N-γ-ethyl)-L-glutamine; [N-α-(4-aminobenzoyl)]-L-glutamic acid; 2,6-diaminopimelic acid; L-α-aminosuberic acid; D-2-aminoadipic acid; D-α-aminosuberic acid; α-aminopimelic acid; iminodiacetic acid; L-2-aminoadipic acid; threo-β-methyl-aspartic acid; γ-carboxy-D-glutamic acid γ,γ-di-t-butyl ester; γ-carboxy-L-glutamic acid γ,γ-di-t-butyl ester; Glu(OAll)-OH; L-Asu(OtBu)-OH; and pyroglutamic acid.
[00235] In some instances, amino acid analogs include analogs of cysteine and methionine. Examples of amino acid analogs of cysteine and methionine include, but are not limited to, Cys(farnesyl)-OH, Cys(farnesyl)-OMe, α-methyl-methionine, Cys(2-hydroxyethyl)-OH, Cys(3-aminopropyl)-OH, 2-amino-4-(ethylthio)butyric acid, buthionine, buthioninesulfoximine, ethionine, methionine methyl sulfonium chloride, selenomethionine, cysteic acid, [2-(4-pyridyl)ethyl]-DL-penicillamine, [2-(4-pyridyl)ethyl]-L-cysteine, 4-methoxybenzyl-D-penicillamine, 4-methoxybenzyl-L-penicillamine, 4-methylbenzyl-D-penicillamine, 4-methylbenzyl-L-penicillamine, benzyl-D-cysteine, benzyl-L-cysteine, benzyl-DL-homocysteine, carbamoyl-L-cysteine, carboxyethyl-L-cysteine, carboxymethyl-L-cysteine, diphenylmethyl-L-cysteine, ethyl-L-cysteine, methyl-L-cysteine, t-butyl-D-cysteine, trityl-L-homocysteine, trityl-D-penicillamine, cystathionine, homocystine, L-homocystine, (2-aminoethyl)-L-cysteine, seleno-L-cystine, cystathionine, Cys(StBu)-OH, and acetamidomethyl-D-penicillamine.
[00236] In some instances, amino acid analogs include analogs of phenylalanine and tyrosine. Examples of amino acid analogs of phenylalanine and tyrosine include β-methyl-phenylalanine, β-hydroxyphenylalanine, α-methyl-3-methoxy-DL-phenylalanine, α-methyl-D-phenylalanine, α-methyl-L-phenylalanine, 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid, 2,4-dichloro-phenylalanine, 2-(trifluoromethyl)-D-phenylalanine, 2-(trifluoromethyl)-L-phenylalanine, 2-bromo-D-phenylalanine, 2-bromo-L-phenylalanine, 2-chloro-D-phenylalanine, 2-chloro-L-phenylalanine, 2-cyano-D-phenylalanine, 2-cyano-L-phenylalanine, 2-fluoro-D-phenylalanine, 2-fluoro-L-phenylalanine, 2-methyl-D-phenylalanine, 2-methyl-L-phenylalanine, 2-nitro-D-phenylalanine, 2-nitro-L-phenylalanine, 2,4,5-trihydroxy-phenylalanine, 3,4,5-trifluoro-D-phenylalanine, 3,4,5-trifluoro-L-phenylalanine, 3,4-dichloro-D-phenylalanine, 3,4-dichloro-L-phenylalanine, 3,4-difluoro-D-phenylalanine, 3,4-difluoro-L-phenylalanine, 3,4-dihydroxy-L-phenylalanine, 3,4-dimethoxy-L-phenylalanine, 3,5,3'-triiodo-L-thyronine, 3,5-diiodo-D-tyrosine, 3,5-diiodo-L-tyrosine, 3,5-diiodo-L-thyronine, 3-(trifluoromethyl)-D-phenylalanine, 3-(trifluoromethyl)-L-phenylalanine, 3-amino-L-tyrosine, 3-bromo-D-phenylalanine, 3-bromo-L-phenylalanine, 3-chloro-D-phenylalanine, 3-chloro-L-phenylalanine, 3-chloro-L-tyrosine, 3-cyano-D-phenylalanine, 3-cyano-L-phenylalanine, 3-fluoro-D-phenylalanine, 3-fluoro-L-phenylalanine, 3-fluoro-tyrosine, 3-iodo-D-phenylalanine, 3-iodo-L-phenylalanine, 3-iodo-L-tyrosine, 3-methoxy-L-tyrosine, 3-methyl-D-phenylalanine, 3-methyl-L-phenylalanine, 3-nitro-D-phenylalanine, 3-nitro-L-phenylalanine, 3-nitro-L-tyrosine, 4-(trifluoromethyl)-D-phenylalanine, 4-(trifluoromethyl)-L-phenylalanine, 4-amino-D-phenylalanine, 4-amino-L-phenylalanine, 4-benzoyl-D-phenylalanine, 4-benzoyl-L-phenylalanine, 4-bis(2-chloroethyl)amino-L-phenylalanine, 4-bromo-D-phenylalanine, 4-bromo-L-phenylalanine, 4-chloro-D-phenylalanine, 4-chloro-L-phenylalanine, 4-cyano-D-phenylalanine, 4-cyano-L-phenylalanine, 4-fluoro-D-phenylalanine, 4-fluoro-L-phenylalanine, 4-iodo-D-phenylalanine, 4-iodo-L-phenylalanine, homophenylalanine, thyroxine, 3,3-diphenylalanine, thyronine, ethyl-tyrosine, and methyl-tyrosine.
[00237] In some instances, amino acid analogs include analogs of proline. Examples of amino acid analogs of proline include, but are not limited to, 3,4-dehydro-proline, 4-fluoro-proline, cis-4-hydroxy-proline, thiazolidine-2-carboxylic acid, and trans-4-fluoro-proline.
[00238] In some instances, amino acid analogs include analogs of serine and threonine. Examples of amino acid analogs of serine and threonine include, but are not limited to, 3-amino-2-hydroxy-5-methylhexanoic acid, 2-amino-3-hydroxy-4-methylpentanoic acid, 2-amino-3-ethoxybutanoic acid, 2-amino-3-methoxybutanoic acid, 4-amino-3-hydroxy-6-methylheptanoic acid, 2-amino-3-benzyloxypropionic acid, 2-amino-3-benzyloxypropionic acid, 2-amino-3-ethoxypropionic acid, 4-amino-3-hydroxybutanoic acid, and α-methylserine.
[00239] In some instances, amino acid analogs include analogs of tryptophan. Examples of amino acid analogs of tryptophan include, but are not limited to, the following: α-methyl-tryptophan; β-(3-benzothienyl)-D-alanine; β-(3-benzothienyl)-L-alanine; 1-methyl-tryptophan; 4-methyl-tryptophan; 5-benzyloxy-tryptophan; 5-bromo-tryptophan; 5-chloro-tryptophan; 5-fluoro-tryptophan; 5-hydroxy-tryptophan; 5-hydroxy-L-tryptophan; 5-methoxy-tryptophan; 5-methoxy-L-tryptophan; 5-methyl-tryptophan; 6-bromo-tryptophan; 6-chloro-D-tryptophan; 6-chloro-tryptophan; 6-fluoro-tryptophan; 6-methyl-tryptophan; 7-benzyloxy-tryptophan; 7-bromo-tryptophan; 7-methyl-tryptophan; D-1,2,3,4-tetrahydro-norharman-3-carboxylic acid; 6-methoxy-1,2,3,4-tetrahydronorharman-1-carboxylic acid; 7-azatryptophan; L-1,2,3,4-tetrahydro-norharman-3-carboxylic acid; 5-methoxy-2-methyl-tryptophan; and 6-chloro-L-tryptophan.
[00240] In some instances, amino acid analogs are racemic. In some instances, the D isomer of the amino acid analog is used. In some cases, the L isomer of the amino acid analog is used. In some instances, the amino acid analog comprises chiral centers that are in the R or S configuration. Sometimes, the amino group(s) of a β-amino acid analog is substituted with a protecting group, e.g., tert-butyloxycarbonyl (BOC group), 9-fluorenylmethyloxycarbonyl (FMOC), tosyl, and the like. Sometimes, the carboxylic acid functional group of a β-amino acid analog is protected, e.g., as its ester derivative. In some cases, the salt of the amino acid analog is used.
[00241] In some embodiments, nucleic acid molecules refer to at least two nucleotides covalently linked together. In some instances, a nucleic acid described herein contains phosphodiester bonds, although in some cases, as outlined below (for example in the construction of primers and probes such as label probes), nucleic acid analogs are included that have alternate backbones, comprising, for example, phosphoramide (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991); and U.S. Pat. No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321 (1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press), and peptide nucleic acid (also referred to herein as "PNA") backbones and linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl. 31:1008 (1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which are incorporated by reference). Other analog nucleic acids include those with bicyclic structures including locked nucleic acids (also referred to herein as "LNA") (Koshkin et al., J. Am. Chem. Soc. 120:13252-3 (1998)); positive backbones (Dempcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones (U.S. Pat. Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowski et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and 3, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Y. S. Sanghvi and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett. 37:743 (1996)) and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed. Y. S. Sanghvi and P. Dan Cook. Nucleic acids containing one or more carbocyclic sugars are also included within the definition of nucleic acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp. 169-176). Several nucleic acid analogs are described in Rawls, C & E News Jun. 2, 1997 page 35. "Locked nucleic acids" are also included within the definition of nucleic acid analogs. LNAs are a class of nucleic acid analogues in which the ribose ring is "locked" by a methylene bridge connecting the 2'-O atom with the 4'-C atom. All of these references are hereby expressly incorporated by reference. In some instances, these modifications of the ribose-phosphate backbone are done to increase the stability and half-life of such molecules in physiological environments. For example, PNA:DNA and LNA:DNA hybrids exhibit higher stability and thus are used in some embodiments. The target nucleic acids are single stranded or double stranded, as specified, or contain portions of both double stranded or single stranded sequence. Depending on the application, the nucleic acids are DNA (including, e.g., genomic DNA, mitochondrial DNA, and cDNA), RNA (including, e.g., mRNA and rRNA) or a hybrid, where the nucleic acid contains any combination of deoxyribo- and ribonucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc.
Samples, Analytical Techniques, and Instrumentation
[00242] In certain embodiments, one or more of the methods disclosed herein comprise a sample. In some embodiments, the sample is a cell sample or a tissue sample. In some instances, the sample is a cell sample. In some embodiments, the sample for use with the methods described herein is obtained from cells of an animal. In some instances, the animal cell includes a cell from a marine invertebrate, fish, insect, amphibian, reptile, or mammal. In some instances, the mammalian cell is from a primate, ape, equine, bovine, porcine, canine, feline, or rodent. In some instances, the mammal is a primate, ape, dog, cat, rabbit, ferret, or the like. In some cases, the rodent is a mouse, rat, hamster, gerbil, chinchilla, or guinea pig. In some embodiments, the bird cell is from a canary, parakeet or parrot. In some embodiments, the reptile cell is from a turtle, lizard or snake. In some cases, the fish cell is from a tropical fish. In some cases, the fish cell is from a zebrafish (e.g. Danio rerio). In some cases, the worm cell is from a nematode (e.g. C. elegans). In some cases, the amphibian cell is from a frog. In some embodiments, the arthropod cell is from a tarantula or hermit crab.
[00243] In some embodiments, the sample for use with the methods described herein is obtained from a mammalian cell. In some instances, the mammalian cell is an epithelial cell, connective tissue cell, hormone secreting cell, a nerve cell, a skeletal muscle cell, a blood cell, or an immune system cell.
[00244] Exemplary mammalian cells include, but are not limited to, 293A cell line, 293FT cell line, 293F cells, 293H cells, HEK 293 cells, CHO DG44 cells, CHO-S cells, CHO-K1 cells, Expi293F™ cells, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cells, FreeStyle™ CHO-S cells, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cells, T-REx™ Jurkat cell line, Per.C6 cells, T-REx™-293 cell line, T-REx™-CHO cell line, T-REx™-HeLa cell line, NC-HIMT cell line, and PC12 cell line.
[00245] In some instances, the sample for use with the methods described herein is obtained from cells of a tumor cell line. In some instances, the sample is obtained from cells of a solid tumor cell line. In some instances, the solid tumor cell line is a sarcoma cell line. In some instances, the solid tumor cell line is a carcinoma cell line. In some embodiments, the sarcoma cell line is obtained from a cell line of alveolar rhabdomyosarcoma, alveolar soft part sarcoma, ameloblastoma, angiosarcoma, chondrosarcoma, chordoma, clear cell sarcoma of soft tissue, dedifferentiated liposarcoma, desmoid, desmoplastic small round cell tumor, embryonal rhabdomyosarcoma, epithelioid fibrosarcoma, epithelioid hemangioendothelioma, epithelioid sarcoma, esthesioneuroblastoma, Ewing sarcoma, extrarenal rhabdoid tumor, extraskeletal myxoid chondrosarcoma, extraskeletal osteosarcoma, fibrosarcoma, giant cell tumor, hemangiopericytoma, infantile fibrosarcoma, inflammatory myofibroblastic tumor, Kaposi sarcoma, leiomyosarcoma of bone, liposarcoma, liposarcoma of bone, malignant fibrous histiocytoma (MFH), malignant fibrous histiocytoma (MFH) of bone, malignant mesenchymoma, malignant peripheral nerve sheath tumor, mesenchymal chondrosarcoma, myxofibrosarcoma, myxoid liposarcoma, myxoinflammatory fibroblastic sarcoma, neoplasms with perivascular epithelioid cell differentiation, osteosarcoma, parosteal osteosarcoma, periosteal osteosarcoma, pleomorphic liposarcoma, pleomorphic rhabdomyosarcoma, PNET/extraskeletal Ewing tumor, rhabdomyosarcoma, round cell liposarcoma, small cell osteosarcoma, solitary fibrous tumor, synovial sarcoma, or telangiectatic osteosarcoma.
[00246] In some embodiments, the carcinoma cell line is obtained from a cell line of adenocarcinoma, squamous cell carcinoma, adenosquamous carcinoma, anaplastic carcinoma, large cell carcinoma, small cell carcinoma, anal cancer, appendix cancer, bile duct cancer (i.e., cholangiocarcinoma), bladder cancer, brain tumor, breast cancer, cervical cancer, colon cancer, cancer of unknown primary (CUP), esophageal cancer, eye cancer, fallopian tube cancer, gastroenterological cancer, kidney cancer, liver cancer, lung cancer, medulloblastoma, melanoma, oral cancer, ovarian cancer, pancreatic cancer, parathyroid disease, penile cancer, pituitary tumor, prostate cancer, rectal cancer, skin cancer, stomach cancer, testicular cancer, throat cancer, thyroid cancer, uterine cancer, vaginal cancer, or vulvar cancer.
[00247] In some instances, the sample is obtained from cells of a hematologic malignant cell line. In some instances, the hematologic malignant cell line is a T-cell cell line. In some instances, the hematologic malignant cell line is a B-cell cell line. In some instances, the hematologic malignant cell line is obtained from a T-cell cell line of: peripheral T-cell lymphoma not otherwise specified (PTCL-NOS), anaplastic large cell lymphoma, angioimmunoblastic lymphoma, cutaneous T-cell lymphoma, adult T-cell leukemia/lymphoma (ATLL), blastic NK-cell lymphoma, enteropathy-type T-cell lymphoma, hepatosplenic gamma-delta T-cell lymphoma, lymphoblastic lymphoma, nasal NK/T-cell lymphomas, or treatment-related T-cell lymphomas.
[00248] In some instances, the hematologic malignant cell line is obtained from a B-cell cell line of: acute lymphoblastic leukemia (ALL), acute myelogenous leukemia (AML), chronic myelogenous leukemia (CML), acute monocytic leukemia (AMoL), chronic lymphocytic leukemia (CLL), high-risk chronic lymphocytic leukemia (CLL), small lymphocytic lymphoma (SLL), high-risk small lymphocytic lymphoma (SLL), follicular lymphoma (FL), mantle cell lymphoma (MCL), Waldenstrom's macroglobulinemia, multiple myeloma, extranodal marginal zone B cell lymphoma, nodal marginal zone B cell lymphoma, Burkitt's lymphoma, non-Burkitt high grade B cell lymphoma, primary mediastinal B-cell lymphoma (PMBL), immunoblastic large cell lymphoma, precursor B-lymphoblastic lymphoma, B cell prolymphocytic leukemia, lymphoplasmacytic lymphoma, splenic marginal zone lymphoma, plasma cell myeloma, plasmacytoma, mediastinal (thymic) large B cell lymphoma, intravascular large B cell lymphoma, primary effusion lymphoma, or lymphomatoid granulomatosis.
[00249] In some embodiments, the sample for use with the methods described herein is obtained from a tumor cell line. Exemplary tumor cell lines include, but are not limited to, 600MPE, AU565, BT-20, BT-474, BT-483, BT-549, Evsa-T, Hs578T, MCF-7, MDA-MB-231, SkBr3, T-47D, HeLa, DU145, PC3, LNCaP, A549, H1299, NCI-H460, A2780, SKOV-3/Luc, Neuro2a, RKO, RKO-AS45-1, HT-29, SW1417, SW948, DLD-1, SW480, Capan-1, MC/9, B72.3, B25.2, B6.2, B38.1, DMS 153, SU.86.86, SNU-182, SNU-423, SNU-449, SNU-475, SNU-387, Hs 817.T, LMH, LMH/2A, SNU-398, PLHC-1, HepG2/SF, OCI-Ly1, OCI-Ly2, OCI-Ly3, OCI-Ly4, OCI-Ly6, OCI-Ly7, OCI-Ly10, OCI-Ly18, OCI-Ly19, U2932, DB, HBL-1, RIVA, SUDHL2, TMD8, MEC1, MEC2, 8E5, CCRF-CEM, MOLT-3, TALL-104, AML-193, THP-1, BDCM, HL-60, Jurkat, RPMI 8226, MOLT-4, RS4, K-562, KASUMI-1, Daudi, GA-10, Raji, JeKo-1, NK-92, and Mino.
[00250] In some embodiments, the sample for use in the methods is from any tissue or fluid from an individual. Samples include, but are not limited to, tissue (e.g. connective tissue, muscle tissue, nervous tissue, or epithelial tissue), whole blood, dissociated bone marrow, bone marrow aspirate, pleural fluid, peritoneal fluid, central spinal fluid, abdominal fluid, pancreatic fluid, cerebrospinal fluid, brain fluid, ascites, pericardial fluid, urine, saliva, bronchial lavage, sweat, tears, ear flow, sputum, hydrocele fluid, semen, vaginal flow, milk, amniotic fluid, and secretions of respiratory, intestinal or genitourinary tract. In some embodiments, the sample is a tissue sample, such as a sample obtained from a biopsy or a tumor tissue sample. In some embodiments, the sample is a blood serum sample. In some embodiments, the sample is a blood cell sample containing one or more peripheral blood mononuclear cells (PBMCs). In some embodiments, the sample contains one or more circulating tumor cells (CTCs). In some embodiments, the sample contains one or more disseminated tumor cells (DTC, e.g., in a bone marrow aspirate sample).
[00251] In some embodiments, the samples are obtained from the individual by any suitable means of obtaining the sample using well-known and routine clinical methods. Procedures for obtaining tissue samples from an individual are well known. For example, procedures for drawing and processing a tissue sample, such as from a needle aspiration biopsy, are well known and are employed to obtain a sample for use in the methods provided. Typically, for collection of such a tissue sample, a thin hollow needle is inserted into a mass such as a tumor mass for sampling of cells that, after being stained, will be examined under a microscope.
Sample Preparation and Analysis
[00252] In some embodiments, the sample is a sample solution. In some instances, the sample solution comprises a solution such as a buffer (e.g. phosphate buffered saline) or a media. In some embodiments, the media is an isotopically labeled media. In some instances, the sample solution is a cell solution.
[00253] In some embodiments, the sample (e.g., cells or a cell solution) is incubated with a cysteine-reactive probe for analysis of protein cysteine-reactive probe interactions. In some instances, the sample (e.g., cells or a cell solution) is further incubated in the presence of a small molecule fragment prior to addition of the cysteine-reactive probe. In some instances, the sample is compared with a control. In some instances, the control comprises the cysteine-reactive probe but not the small molecule fragment. In some instances, a difference is observed between a set of cysteine-reactive probe protein interactions between the sample and the control. In some instances, the difference correlates to the interaction between the small molecule fragment and the cysteine containing proteins.
[00254] In some embodiments, the sample (e.g. cells or a cell solution) is further labeled for analysis of cysteine-reactive probe protein interactions. In some instances, the sample (e.g. cells or a cell solution) is labeled with an enriched media. In some cases, the sample (e.g. cells or a cell solution) is labeled with isotope-labeled amino acids, such as 13C or 15N-labeled amino acids. In some cases, the labeled sample is further compared with a non-labeled sample to detect differences in cysteine-reactive probe protein interactions between the two samples. In some instances, this difference is a difference of a cysteine containing protein and its interaction with a small molecule fragment in the labeled sample versus the non-labeled sample. In some instances, the difference is an increase, decrease or a lack of protein cysteine-reactive probe interaction in the two samples. In some instances, the isotope-labeled method is termed SILAC, stable isotope labeling using amino acids in cell culture.
[00255] In some instances, the sample is divided into a first cell solution and a second cell solution. In some cases, the first cell solution is incubated with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes. In some instances, the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer. In some instances, the second cell solution comprises a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes. In some instances, the first cysteine-reactive probe and the second cysteine-reactive probe are the same. In some embodiments, cells from the second cell solution are further treated with a buffer, such as a control buffer, in which the buffer does not contain a small molecule fragment. In some embodiments, the control buffer comprises dimethyl sulfoxide (DMSO).
[00256] In some embodiments, the cysteine-reactive probe-protein complex is further conjugated to a chromophore, such as a fluorophore. In some instances, the cysteine-reactive probe-protein complex is separated and visualized utilizing an electrophoresis system, such as through a gel electrophoresis, or a capillary electrophoresis. Exemplary gel electrophoresis includes agarose based gels, polyacrylamide based gels, or starch based gels. In some instances, the cysteine-reactive probe-protein is subjected to a native electrophoresis condition. In some instances, the cysteine-reactive probe-protein is subjected to a denaturing electrophoresis condition.
[00257] In some instances, the cysteine-reactive probe-protein complex, after harvesting, is further fragmented to generate protein fragments. In some instances, fragmentation is generated through mechanical stress, pressure, or chemical means. In some instances, the protein from the cysteine-reactive probe-protein complexes is fragmented by a chemical means. In some embodiments, the chemical means is a protease. Exemplary proteases include, but are not limited to, serine proteases such as chymotrypsin A, penicillin G acylase precursor, dipeptidase E, DmpA aminopeptidase, subtilisin, prolyl oligopeptidase, D-Ala-D-Ala peptidase C, signal peptidase I, cytomegalovirus assemblin, Lon-A peptidase, peptidase Clp, Escherichia coli phage K1F endosialidase CIMCD self-cleaving protein, nucleoporin 145, lactoferrin, murein tetrapeptidase LD-carboxypeptidase, or rhomboid-1; threonine proteases such as ornithine acetyltransferase; cysteine proteases such as TEV protease, amidophosphoribosyltransferase precursor, gamma-glutamyl hydrolase (Rattus norvegicus), hedgehog protein, DmpA aminopeptidase, papain, bromelain, cathepsin K, calpain, caspase-1, separase, adenain, pyroglutamyl-peptidase I, sortase A, hepatitis C virus peptidase 2, sindbis virus-type nsP2 peptidase, dipeptidyl-peptidase VI, or DeSI-1 peptidase; aspartate proteases such as beta-secretase 1 (BACE1), beta-secretase 2 (BACE2), cathepsin D, cathepsin E, chymosin, napsin-A, nepenthesin, pepsin, plasmepsin, presenilin, or renin; glutamic acid proteases such as AfuGprA; and metalloproteases such as peptidase_M48.
[00258] In some instances, the fragmentation is a random fragmentation. In some instances, the fragmentation generates specific lengths of protein fragments, or the shearing occurs at particular sequence of amino acid regions.
[00259] In some instances, the protein fragments are further analyzed by a proteomic method such as by liquid chromatography (LC) (e.g. high performance liquid chromatography), liquid chromatography-mass spectrometry (LC-MS), matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF), gas chromatography-mass spectrometry (GC-MS), capillary electrophoresis-mass spectrometry (CE-MS), or nuclear magnetic resonance (NMR).
[00260] In some embodiments, the LC method is any suitable LC method well known in the art for separation of a sample into its individual parts. This separation occurs based on the interaction of the sample with the mobile and stationary phases. Since there are many stationary/mobile phase combinations that are employed when separating a mixture, there are several different types of chromatography that are classified based on the physical states of those phases. In some embodiments, the LC is further classified as normal-phase chromatography, reverse-phase chromatography, size-exclusion chromatography, ion-exchange chromatography, affinity chromatography, displacement chromatography, partition chromatography, flash chromatography, chiral chromatography, or aqueous normal-phase chromatography.
[00261] In some embodiments, the LC method is a high performance liquid chromatography (HPLC) method. In some embodiments, the HPLC method is further categorized as normal-phase chromatography, reverse-phase chromatography, size-exclusion chromatography, ion-exchange chromatography, affinity chromatography, displacement chromatography, partition chromatography, chiral chromatography, or aqueous normal-phase chromatography.
[00262] In some embodiments, the HPLC method of the present disclosure is performed by any standard techniques well known in the art. Exemplary HPLC methods include hydrophilic interaction liquid chromatography (HILIC), electrostatic repulsion-hydrophilic interaction liquid chromatography (ERLIC) and reverse phase liquid chromatography (RPLC).
[00263] In some embodiments, the LC is coupled to mass spectrometry as an LC-MS method. In some embodiments, the LC-MS method includes ultra-performance liquid chromatography-electrospray ionization quadrupole time-of-flight mass spectrometry (UPLC-ESI-QTOF-MS), ultra-performance liquid chromatography-electrospray ionization tandem mass spectrometry (UPLC-ESI-MS/MS), reverse phase liquid chromatography-mass spectrometry (RPLC-MS), hydrophilic interaction liquid chromatography-mass spectrometry (HILIC-MS), hydrophilic interaction liquid chromatography-triple quadrupole tandem mass spectrometry (HILIC-QQQ), electrostatic repulsion-hydrophilic interaction liquid chromatography-mass spectrometry (ERLIC-MS), liquid chromatography time-of-flight mass spectrometry (LC-QTOF-MS), liquid chromatography-tandem mass spectrometry (LC-MS/MS), and multidimensional liquid chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS). In some instances, the LC-MS method is LC/LC-MS/MS. In some embodiments, the LC-MS methods of the present disclosure are performed by standard techniques well known in the art.
[00264] In some embodiments, the GC is coupled to mass spectrometry as a GC-MS method. In some embodiments, the GC-MS method includes two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS), gas chromatography time-of-flight mass spectrometry (GC-QTOF-MS) and gas chromatography-tandem mass spectrometry (GC-MS/MS).
[00265] In some embodiments, CE is coupled to mass spectrometry as a CE-MS method. In some embodiments, the CE-MS method includes capillary electrophoresis-negative electrospray ionization-mass spectrometry (CE-ESI-MS), capillary electrophoresis-negative electrospray ionization-quadrupole time of flight-mass spectrometry (CE-ESI-QTOF-MS) and capillary electrophoresis-quadrupole time of flight-mass spectrometry (CE-QTOF-MS).
[00266] In some embodiments, the nuclear magnetic resonance (NMR) method is any suitable method well known in the art for the detection of one or more cysteine binding proteins or protein fragments disclosed herein. In some embodiments, the NMR method includes one dimensional (1D) NMR methods, two dimensional (2D) NMR methods, solid state NMR methods and NMR chromatography. Exemplary 1D NMR methods include 1Hydrogen, 13Carbon, 15Nitrogen, 17Oxygen, 19Fluorine, 31Phosphorus, 39Potassium, 23Sodium, 33Sulfur, 87Strontium, 27Aluminium, 43Calcium, 35Chlorine, 37Chlorine, 63Copper, 65Copper, 57Iron, 25Magnesium, 199Mercury or 67Zinc NMR method, distortionless enhancement by polarization transfer (DEPT) method, attached proton test (APT) method and 1D-incredible natural abundance double quantum transition experiment (INADEQUATE) method. Exemplary 2D NMR methods include correlation spectroscopy (COSY), total correlation spectroscopy (TOCSY), 2D-INADEQUATE, 2D-adequate double quantum transfer experiment (ADEQUATE), nuclear Overhauser effect spectroscopy (NOESY), rotating-frame NOE spectroscopy (ROESY), heteronuclear multiple-quantum correlation spectroscopy (HMQC), heteronuclear single quantum coherence spectroscopy (HSQC), short range coupling and long range coupling methods. Exemplary solid state NMR methods include solid state 13Carbon NMR, high resolution magic angle spinning (HR-MAS) and cross polarization magic angle spinning (CP-MAS) NMR methods. Exemplary NMR techniques include diffusion ordered spectroscopy (DOSY), DOSY-TOCSY and DOSY-HSQC.
[00267] In some embodiments, the protein fragments are analyzed by a method as described in Weerapana et al., "Quantitative reactivity profiling predicts functional cysteines in proteomes," Nature, 468:790-795 (2010).
[00268] In some embodiments, the results from the mass spectroscopy method are analyzed by an algorithm for protein identification. In some embodiments, the algorithm combines the results from the mass spectroscopy method with a protein sequence database for protein identification. In some embodiments, the algorithm comprises ProLuCID algorithm, Probity, Scaffold, SEQUEST, or Mascot.
[00269] In some embodiments, a value is assigned to each protein from the cysteine-reactive probe-protein complexes. In some embodiments, the value assigned to each protein from the cysteine-reactive probe-protein complexes is obtained from the mass spectroscopy analysis. In some instances, the value is the area under the curve from a plot of signal intensity as a function of mass-to-charge ratio. In some embodiments, a first value is assigned to the protein obtained from the first cell solution and a second value is assigned to the same protein obtained from the second cell solution. In some instances, a ratio is calculated between the two values. In some instances, a ratio of greater than 2 indicates that the protein is a candidate for interacting with a drug or that the protein is a cysteine binding protein. In some instances, the ratio is greater than 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20. In some cases, the ratio is at most 20.
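By way of non-limiting illustration, the value assignment and ratio described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names are hypothetical, the area under the curve is approximated with the trapezoidal rule, and the ratio is assumed to be taken as the value from the second (fragment-free) cell solution over the value from the first (fragment-treated) cell solution, a direction the text does not specify.

```python
def area_under_curve(mz, intensity):
    """Trapezoidal-rule area under a signal-intensity vs. m/z trace.

    Assumption: the trapezoidal rule is one reasonable integrator;
    the disclosure does not name a specific integration method.
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    area = 0.0
    for i in range(1, len(mz)):
        width = mz[i] - mz[i - 1]
        area += 0.5 * (intensity[i] + intensity[i - 1]) * width
    return area


def value_ratio(control_value, treated_value):
    """Ratio between the two values assigned to the same protein.

    Assumption: control (no fragment) over treated (with fragment).
    """
    if treated_value <= 0:
        raise ValueError("treated_value must be positive")
    return control_value / treated_value


def is_candidate(ratio, threshold=2.0):
    """True when the ratio exceeds the chosen threshold (e.g. 2)."""
    return ratio > threshold


# Hypothetical traces for one protein in the two cell solutions.
treated = area_under_curve([0.0, 1.0, 2.0], [0.0, 2.0, 0.0])
control = area_under_curve([0.0, 1.0, 2.0], [0.0, 6.0, 0.0])
print(is_candidate(value_ratio(control, treated)))
```

The threshold is left as a parameter because the text recites a range of cut-offs (2, 2.5, 3, and so on up to 20).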
[00270] In some instances, the ratio is calculated based on averaged values. In some instances, the averaged value is an average of at least two, three, or four values of the protein from each cell solution, or that the protein is observed at least two, three, or four times in each cell solution and a value is assigned to each observed time. In some instances, the ratio further has a standard deviation of less than 12, 10, or 8.
[00271] In some instances, a value is not an averaged value. In some instances, the ratio is calculated based on value of a protein observed only once in a cell population. In some instances, the ratio is assigned with a value of 20.
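As a hedged illustration of the averaged and non-averaged cases above, the sketch below computes a ratio from averaged values, reports the spread of the per-observation ratios, and flags a protein observed only once. The names, the pairing of observations, the use of the sample standard deviation of per-observation ratios, and the default cut-offs are all assumptions made for the example; the ceiling of 20 follows the statement that the ratio is at most 20.

```python
from statistics import mean, stdev

RATIO_CAP = 20.0  # the text states the ratio is at most 20 in some cases


def averaged_ratio(control_values, treated_values, min_obs=2, max_sd=10.0):
    """Summarize paired observations of one protein from two cell solutions.

    Assumptions (not specified in the source): observations are paired,
    the ratio is control over treated, and the standard deviation is the
    sample standard deviation of the capped per-observation ratios.
    """
    if len(control_values) != len(treated_values) or not control_values:
        raise ValueError("need equal, non-zero numbers of observations")
    if any(v <= 0 for v in treated_values):
        raise ValueError("treated values must be positive")

    n = len(control_values)
    ratio = min(mean(control_values) / mean(treated_values), RATIO_CAP)

    if n < 2:
        # Protein observed only once: the ratio is not an averaged value
        # and no standard deviation can be computed.
        return {"ratio": ratio, "sd": None, "averaged": False,
                "passes": min_obs <= 1}

    per_obs = [min(c / t, RATIO_CAP)
               for c, t in zip(control_values, treated_values)]
    sd = stdev(per_obs)
    return {"ratio": ratio, "sd": sd, "averaged": True,
            "passes": n >= min_obs and sd < max_sd}


print(averaged_ratio([10.0, 12.0], [2.0, 3.0]))
print(averaged_ratio([100.0], [1.0]))
```

Keeping `min_obs` and `max_sd` as parameters mirrors the alternatives recited above (at least two, three, or four observations; a standard deviation below 12, 10, or 8).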
[00272] In some embodiments, in the context of identifying a cysteine containing protein as a small molecule fragment binding target, a first ratio is obtained from two cell solutions in which both cell solutions have been incubated with a cysteine-reactive probe and the first cell solution is further incubated with a small molecule fragment. In some instances, the first ratio is further compared to a second ratio in which both cell solutions have been treated by cysteine-reactive probes in the absence of a small molecule fragment. In some instances, the first ratio is greater than 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20. In some instances, the second ratio is greater than 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20. In some instances, if the first ratio is greater than 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 and the second ratio is from about 0.5 to about 2, the two ratios indicate that a protein is a drug binding target.
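The two-ratio comparison above can be sketched, for illustration only, as a simple decision rule. The function and parameter names are hypothetical, and "about 0.5 to about 2" is treated as an inclusive interval, which is an assumption.

```python
def is_binding_target(first_ratio, second_ratio,
                      first_threshold=2.0,
                      control_low=0.5, control_high=2.0):
    """Flag a protein as a candidate binding target.

    first_ratio:  ratio from the experiment in which the first cell
                  solution was incubated with a small molecule fragment.
    second_ratio: ratio from the probe-only (no fragment) comparison.
    Assumption: the control window is inclusive at both ends.
    """
    competed = first_ratio > first_threshold
    control_unchanged = control_low <= second_ratio <= control_high
    return competed and control_unchanged


# Hypothetical ratios for three proteins.
print(is_binding_target(5.0, 1.0))  # competed, control near 1
print(is_binding_target(5.0, 3.0))  # control ratio outside the window
print(is_binding_target(1.5, 1.0))  # not competed
```

Requiring the probe-only ratio to stay near 1 guards against proteins whose signal differs between the two cell solutions for reasons unrelated to the fragment.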
[00273] In some embodiments, the value further enables calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein. In some embodiments, the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
Kits/Article of Manufacture
[00274] Disclosed herein, in certain embodiments, are kits and articles of manufacture for use with one or more methods described herein. In some embodiments, described herein is a kit for identifying a cysteine containing protein as a small molecule fragment binding target. In some instances, also described herein is a kit for mapping binding sites on a cysteine containing protein. In some cases, described herein is a kit for identifying cysteine binding proteins. In some embodiments, also described herein is a kit for a high throughput screening of a small molecule fragment for interaction with a cysteine containing protein.
[00275] In some embodiments, such a kit includes cysteine-reactive probes such as the cysteine-reactive probes described herein, test compounds such as small molecule fragments or libraries and/or controls, and reagents suitable for carrying out one or more of the methods described herein. In some instances, the kit further comprises samples, such as a cell sample, and suitable solutions such as buffers or media. In some embodiments, the kit further comprises recombinant proteins for use in one or more of the methods described herein. In some embodiments, additional components of the kit comprise a carrier, package, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in a method described herein. Suitable containers include, for example, bottles, vials, plates, syringes, and test tubes. In one embodiment, the containers are formed from a variety of materials such as glass or plastic.
[00276] The articles of manufacture provided herein contain packaging materials. Examples of pharmaceutical packaging materials include, but are not limited to, bottles, tubes, bags, containers, and any packaging material suitable for a selected formulation and intended mode of use.
[00277] For example, the container(s) include cysteine-reactive probes, test compounds, and one or more reagents for use in a method disclosed herein. Such kits optionally include an identifying description or label or instructions relating to its use in the methods described herein.
[00278] A kit typically includes labels listing contents and/or instructions for use, and package inserts with instructions for use. A set of instructions will also typically be included.
[00279] In one embodiment, a label is on or associated with the container. In one embodiment, a label is on a container when letters, numbers or other characters forming the label are attached, molded or etched into the container itself; a label is associated with a container when it is present within a receptacle or carrier that also holds the container, e.g., as a package insert. In one embodiment, a label is used to indicate that the contents are to be used for a specific therapeutic application. The label also indicates directions for use of the contents, such as in the methods described herein.
Services
[00280] In some embodiments, the methods provided herein are also performed as a service. In some instances, a service provider obtains from the customer a plurality of small molecule fragment candidates for analysis with one or more of the cysteine-reactive probes for screening. In some embodiments, the service provider screens the small molecule fragment candidates using one or more of the methods described herein, and then provides the results to the customer. In some instances, the service provider provides the appropriate reagents to the customer for analysis utilizing one or more of the cysteine-reactive probes and one or more of the methods described herein. In some cases, the customer performs one or more of the methods described herein and then provides the results to the service provider for analysis. In some embodiments, the service provider then analyzes the results and provides the results to the customer. In some cases, the customer further analyzes the results by interacting with software installed locally (at the customer's location) or remotely (e.g., on a server reachable through a network). Exemplary customers include pharmaceutical companies, clinical laboratories, physicians, patients, and the like. In some instances, a customer is any suitable customer or party with a need or desire to use the methods, systems, compositions, and kits described herein.
Digital Processing Device
[00281] In some embodiments, the methods described herein include a digital processing device, or use of the same. In further embodiments, the digital processing device includes one or more hardware central processing units (CPU) that carry out the device's functions. In still further embodiments, the digital processing device further comprises an operating system configured to perform executable instructions. In some embodiments, the digital processing device is optionally connected to a computer network. In further embodiments, the digital processing device is optionally connected to the Internet such that it accesses the World Wide Web. In still further embodiments, the digital processing device is optionally connected to a cloud computing infrastructure. In other embodiments, the digital processing device is optionally connected to an intranet. In other embodiments, the digital processing device is optionally connected to a data storage device.
[00282] In accordance with the description herein, suitable digital processing devices include, but are not limited to, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles. Suitable tablet computers include those with booklet, slate, or convertible configurations.
[00283] In some embodiments, the digital processing device includes an operating system configured to perform executable instructions. The operating system is, for example, software, including programs and data, which manages the device's hardware and provides services for execution of applications. Suitable server operating systems include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®. Suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®. In some embodiments, the operating system is provided by cloud computing. Suitable mobile smart phone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research In Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®. Suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV®, Roku®, Boxee®, Google TV®, Google Chromecast®, Amazon Fire®, and Samsung® HomeSync®. Suitable video game console operating systems include, by way of non-limiting examples, Sony® PS3®, Sony® PS4®, Microsoft® Xbox 360®, Microsoft Xbox One, Nintendo® Wii®, Nintendo® Wii U®, and Ouya®.
[00284] In some embodiments, the device includes a storage and/or memory device. The storage and/or memory device is one or more physical apparatuses used to store data or programs on a temporary or permanent basis. In some embodiments, the device is volatile memory and requires power to maintain stored information. In some embodiments, the volatile memory comprises dynamic random-access memory (DRAM). In some embodiments, the device is non-volatile memory and retains stored information when the digital processing device is not powered. In further embodiments, the non-volatile memory comprises flash memory. In some embodiments, the non-volatile memory comprises ferroelectric random access memory (FRAM). In some embodiments, the non-volatile memory comprises phase-change random access memory (PRAM). In other embodiments, the device is a storage device including, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, magnetic disk drives, magnetic tape drives, optical disk drives, and cloud computing based storage. In further embodiments, the storage and/or memory device is a combination of devices such as those disclosed herein.
[00285] In some embodiments, the digital processing device includes a display to send visual information to a user. In some embodiments, the display includes a cathode ray tube (CRT), a liquid crystal display (LCD), a thin film transistor liquid crystal display (TFT-LCD), an organic light emitting diode (OLED) display, a plasma display, a video projector, or a combination thereof.
[00286] In some embodiments, the digital processing device includes an input device to receive information from a user. In some embodiments, the input device is a keyboard. In some embodiments, the input device is a pointing device including, by way of non-limiting examples, a mouse, trackball, track pad, joystick, game controller, or stylus. In some embodiments, the input device is a touch screen or a multi-touch screen. In other embodiments, the input device is a microphone to capture voice or other sound input. In other embodiments, the input device is a video camera or other sensor to capture motion or visual input. In further embodiments, the input device is a Kinect™, Leap Motion™, or the like. In still further embodiments, the input device is a combination of devices such as those disclosed herein.
[00287] In some embodiments, the systems and methods disclosed herein include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked digital processing device. In further embodiments, a computer readable storage medium is a tangible component of a digital processing device. In still further embodiments, a computer readable storage medium is optionally removable from a digital processing device. In some embodiments, a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like. In some cases, the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
[00288] In some embodiments, the systems and methods disclosed herein include at least one computer program, or use of the same. A computer program includes a sequence of instructions, executable in the digital processing device's CPU, written to perform a specified task. In some embodiments, computer readable instructions are implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types.
[00289] In some embodiments, the functionality of the computer readable instructions is combined or distributed as desired in various environments. In some embodiments, a computer program comprises one sequence of instructions. In some embodiments, a computer program comprises a plurality of sequences of instructions. In some embodiments, a computer program is provided from one location. In other embodiments, a computer program is provided from a plurality of locations. In various embodiments, a computer program includes one or more software modules. In various embodiments, a computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
[00290] In some embodiments, a computer program includes a web application. A web application, in various embodiments, utilizes one or more software frameworks and one or more database systems. In some embodiments, a web application is created upon a software framework such as Microsoft® .NET or Ruby on Rails (RoR). In some embodiments, a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, and XML database systems. In further embodiments, suitable relational database systems include, by way of non-limiting examples, Microsoft® SQL Server, MySQL™, and Oracle®. A web application, in various embodiments, is written in one or more versions of one or more languages. In some embodiments, a web application is written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof. In some embodiments, a web application is written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or Extensible Markup Language (XML). In some embodiments, a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS). In some embodiments, a web application is written to some extent in a client-side scripting language such as Asynchronous JavaScript and XML (AJAX), Flash® ActionScript, JavaScript, or Silverlight®. In some embodiments, a web application is written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion®, Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA®, or Groovy. In some embodiments, a web application is written to some extent in a database query language such as Structured Query Language (SQL). In some embodiments, a web application integrates enterprise server products such as IBM® Lotus Domino®. In some embodiments, a web application includes a media player element. In various further embodiments, a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe® Flash®, HTML 5, Apple® QuickTime®, Microsoft® Silverlight®, Java™, and Unity®.
[00291] In some embodiments, a computer program includes a mobile application provided to a mobile digital processing device. In some embodiments, the mobile application is provided to a mobile digital processing device at the time it is manufactured. In other embodiments, the mobile application is provided to a mobile digital processing device via the computer network described herein.
[00292] In view of the disclosure provided herein, a mobile application is created by techniques using hardware, languages, and development environments. Suitable programming languages include, by way of non-limiting examples, C, C++, C#, Objective-C, Java™, JavaScript, Pascal, Object Pascal, Python™, Ruby, VB.NET, WML, and XHTML/HTML with or without CSS, or combinations thereof.
[00293] Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting examples, AirplaySDK, alcheMo, Appcelerator®, Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting examples, Lazarus, MobiFlex, MoSync, and Phonegap. Also, mobile device manufacturers distribute software developer kits including, by way of non-limiting examples, iPhone and iPad (iOS) SDK, Android™ SDK, BlackBerry® SDK, BREW SDK, Palm® OS SDK, Symbian SDK, webOS SDK, and Windows® Mobile SDK.
[00294] In some embodiments, commercial forums for distribution of mobile applications include, by way of non-limiting examples, Apple® App Store, Android™ Market, BlackBerry® App World, App Store for Palm devices, App Catalog for webOS, Windows® Marketplace for Mobile, Ovi Store for Nokia® devices, Samsung® Apps, and Nintendo® DSi Shop.
[00295] In some embodiments, a computer program includes a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in. In some instances, standalone applications are compiled. A compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB.NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program. In some embodiments, a computer program includes one or more executable compiled applications.
[00296] In some embodiments, the computer program includes a web browser plug-in. In computing, a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. In some instances, web browser plug-ins include Adobe® Flash® Player, Microsoft® Silverlight®, and Apple® QuickTime®. In some embodiments, the toolbar comprises one or more web browser extensions, add-ins, or add-ons. In some embodiments, the toolbar comprises one or more explorer bars, tool bands, or desk bands.
[00297] In view of the disclosure provided herein, plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB.NET, or combinations thereof.
[00298] Web browsers (also called Internet browsers) are software applications, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft® Internet Explorer®, Mozilla® Firefox®, Google® Chrome, Apple® Safari®, Opera Software® Opera®, and KDE Konqueror. In some embodiments, the web browser is a mobile web browser. Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) are designed for use on mobile digital processing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems. Suitable mobile web browsers include, by way of non-limiting examples, Google® Android® browser, RIM BlackBerry® Browser, Apple® Safari®, Palm® Blazer, Palm® WebOS® Browser, Mozilla® Firefox® for mobile, Microsoft® Internet Explorer® Mobile, Amazon® Kindle® Basic Web, Nokia® Browser, Opera Software® Opera® Mobile, and Sony® PSP™ browser.
[00299] In some embodiments, the systems and methods disclosed herein include software, server, and/or database modules, or use of the same. In view of the disclosure provided herein, software modules are created and implemented in a multitude of ways. In various embodiments, a software module comprises a file, a section of code, a programming object, a programming structure, or combinations thereof. In further various embodiments, a software module comprises a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof. In various embodiments, the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, and a standalone application. In some embodiments, software modules are in one computer program or application. In other embodiments, software modules are in more than one computer program or application. In some embodiments, software modules are hosted on one machine. In other embodiments, software modules are hosted on more than one machine. In further embodiments, software modules are hosted on cloud computing platforms. In some embodiments, software modules are hosted on one or more machines in one location. In other embodiments, software modules are hosted on one or more machines in more than one location.
[00300] In some embodiments, the methods and systems disclosed herein include one or more databases, or use of the same. In view of the disclosure provided herein, databases are suitable for storage and retrieval of analytical information described elsewhere herein. In various embodiments, suitable databases include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases. In some embodiments, a database is internet-based. In further embodiments, a database is web-based. In still further embodiments, a database is cloud computing-based. In other embodiments, a database is based on one or more local computer storage devices.
Server
[00301] In some embodiments, the methods provided herein are processed on a server or a computer server (Fig. 2). In some embodiments, the server 401 includes a central processing unit (CPU, also "processor") 405 which is a single core processor, a multi core processor, or plurality of processors for parallel processing. In some embodiments, a processor used as part of a control assembly is a microprocessor. In some embodiments, the server 401 also includes memory 410 (e.g. random access memory, read-only memory, flash memory); electronic storage unit 415 (e.g. hard disk); communications interface 420 (e.g. network adaptor) for communicating with one or more other systems; and peripheral devices 425 which include cache, other memory, data storage, and/or electronic display adaptors. The memory 410, storage unit 415, interface 420, and peripheral devices 425 are in communication with the processor 405 through a communications bus (solid lines), such as a motherboard. In some embodiments, the storage unit 415 is a data storage unit for storing data. The server 401 is operatively coupled to a computer network ("network") 430 with the aid of the communications interface 420. In some embodiments, a processor with the aid of additional hardware is also operatively coupled to a network. In some embodiments, the network 430 is the Internet, an intranet and/or an extranet, an intranet and/or extranet that is in communication with the Internet, a telecommunication or data network. In some embodiments, the network 430, with the aid of the server 401, implements a peer-to-peer network, which enables devices coupled to the server 401 to behave as a client or a server. In some embodiments, the server is capable of transmitting and receiving computer-readable instructions (e.g., device/system operation protocols or parameters) or data (e.g., sensor measurements, raw data obtained from detecting metabolites, analysis of raw data obtained from detecting metabolites, interpretation of raw data obtained from detecting metabolites, etc.) via electronic signals transported through the network 430. Moreover, in some embodiments, a network is used, for example, to transmit or receive data across an international border.
[00302] In some embodiments, the server 401 is in communication with one or more output devices 435 such as a display or printer, and/or with one or more input devices 440 such as, for example, a keyboard, mouse, or joystick. In some embodiments, the display is a touch screen display, in which case it functions as both a display device and an input device. In some embodiments, different and/or additional input devices are present, such as an enunciator, a speaker, or a microphone. In some embodiments, the server uses any one of a variety of operating systems, such as, for example, any one of several versions of Windows®, or of MacOS®, or of Unix®, or of Linux®.
[00303] In some embodiments, the storage unit 415 stores files or data associated with the operation of a device, systems or methods described herein.
[00304] In some embodiments, the server communicates with one or more remote computer systems through the network 430. In some embodiments, the one or more remote computer systems include, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants.
[00305] In some embodiments, a control assembly includes a single server 401. In other situations, the system includes multiple servers in communication with one another through an intranet, extranet and/or the Internet.
[00306] In some embodiments, the server 401 is adapted to store device operation parameters, protocols, methods described herein, and other information of potential relevance. In some embodiments, such information is stored on the storage unit 415 or the server 401 and such data is transmitted through a network.
Certain Terminology
[00307] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which the claimed subject matter belongs. It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of any subject matter claimed. In this application, the use of the singular includes the plural unless specifically stated otherwise. It must be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise. In this application, the use of "or" means "and/or" unless stated otherwise. Furthermore, use of the term "including" as well as other forms, such as "include", "includes," and "included," is not limiting.
[00308] As used herein, ranges and amounts can be expressed as "about" a particular value or range. About also includes the exact amount. Hence "about 5 μL" means "about 5 μL" and also "5 μL." Generally, the term "about" includes an amount that would be expected to be within experimental error.
[00309] The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
[00310] The term "protein", as used herein, encompasses a full-length cysteine containing protein, a full-length functional cysteine containing protein, a cysteine containing protein fragment, or a functionally active cysteine containing protein fragment. In some instances, a protein described herein is also referred to as an "isolated protein", or a protein that by virtue of its origin or source of derivation is not associated with naturally associated components that accompany it in its native state; is substantially free of other proteins from the same species; is expressed by a cell from a different species; or does not occur in nature.
[00311] The term "polypeptide", as used herein, refers to any polymeric chain of amino acids. The term "polypeptide" encompasses native or modified cysteine containing protein, cysteine containing protein fragments, or polypeptide analogs comprising non-native amino acid residues. In some instances, a polypeptide is monomeric. In other instances, a polypeptide is polymeric. In some instances, a polypeptide described herein is also referred to as an "isolated polypeptide", or a polypeptide that by virtue of its origin or source of derivation is not associated with naturally associated components that accompany it in its native state; is substantially free of other proteins from the same species; is expressed by a cell from a different species; or does not occur in nature.
[00312] As used herein, the terms "individual(s)", "subject(s)" and "patient(s)" mean any mammal. In some embodiments, the mammal is a human. In some embodiments, the mammal is a non-human. None of the terms require or are limited to situations characterized by the supervision (e.g. constant or intermittent) of a health care worker (e.g. a doctor, a registered nurse, a nurse practitioner, a physician's assistant, an orderly or a hospice worker).
[00313] The term "alkyl" as used herein is a branched or unbranched saturated hydrocarbon group of 1 to 24 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, s-butyl, t-butyl, n-pentyl, isopentyl, s-pentyl, neopentyl, hexyl, heptyl, octyl, nonyl, decyl, dodecyl, tetradecyl, hexadecyl, eicosyl, tetracosyl, and the like. It is understood that the alkyl group is acyclic. In some instances, the alkyl group is branched or unbranched. In some instances, the alkyl group is also substituted or unsubstituted. For example, the alkyl group is substituted with one or more groups including, but not limited to, alkyl, cycloalkyl, alkoxy, amino, ether, halide, hydroxy, nitro, silyl, sulfo-oxo, or thiol. A "lower alkyl" group is an alkyl group containing from one to six (e.g., from one to four) carbon atoms. In some instances, the term alkyl group is also a C1 alkyl, C1-C2 alkyl, C1-C3 alkyl, C1-C4 alkyl, C1-C5 alkyl, C1-C6 alkyl, C1-C7 alkyl, C1-C8 alkyl, C1-C9 alkyl, C1-C10 alkyl, and the like up to and including a C1-C24 alkyl.
[00314] The term "aryl" as used herein is a group that contains any carbon-based aromatic group including, but not limited to, benzene, naphthalene, phenyl, biphenyl, anthracene, and the like. The aryl group can be substituted or unsubstituted. In some instances, the aryl group is substituted with one or more groups including, but not limited to, alkyl, cycloalkyl, alkoxy, alkenyl, cycloalkenyl, alkynyl, cycloalkynyl, aryl, heteroaryl, aldehyde, -NH2, carboxylic acid, ester, ether, halide, hydroxy, ketone, azide, nitro, silyl, sulfo-oxo, or thiol. The term "biaryl" is a specific type of aryl group and is included in the definition of "aryl." In addition, the aryl group is optionally a single ring structure or comprises multiple ring structures that are either fused ring structures or attached via one or more bridging groups such as a carbon-carbon bond. For example, biaryl refers to two aryl groups that are bound together via a fused ring structure, as in naphthalene, or are attached via one or more carbon-carbon bonds, as in biphenyl.
EXAMPLES
[00315] These examples are provided for illustrative purposes only and not to limit the scope of the claims provided herein.
Example 1
Biological Methods
Preparation of human cancer cell line proteomes
[00316] All cell lines were obtained from ATCC, were used with a low passage number and were grown at 37 °C with 5% CO2. MDA-MB-231 cells and HEK-293T cells were grown in DMEM supplemented with 10% fetal bovine serum, penicillin, streptomycin and glutamine. Jurkat, Ramos and MUM2C cells were grown in RPMI-1640 medium supplemented with 10% fetal bovine serum, penicillin and streptomycin. For in vitro labeling, cells were grown to 100% confluence for MDA-MB-231 cells or until cell density reached 1.5 million cells/mL for Ramos and Jurkat cells. Cells were washed with cold PBS, scraped with cold PBS and cell pellets were isolated by centrifugation (1,400 g, 3 min, 4 °C), and stored at -80 °C until use. Cell pellets were lysed by sonication and fractionated (100,000 g, 45 min) to yield soluble and membrane fractions, which were then adjusted to a final protein concentration of 1.5 mg/mL for proteomics experiments and 1 mg/mL for gel-based ABPP experiments. The soluble lysate was prepared fresh from frozen pellets directly before each experiment. Protein concentration was determined using the Bio-Rad DC™ protein assay kit.
Screening of fragment electrophile library by gel-based ABPP with IA-rhodamine and Ac-Rho-DEVD-AMK ("DEVD" disclosed as SEQ ID NO: 857)
[00317] 25 μL of soluble proteome (1 mg/mL) was treated with fragment electrophiles (1 μL of 25x stock solution in DMSO) at ambient temperature for 1 h. IA-rhodamine (1 μL of 25 μM, final concentration = 1 μM) was then added and allowed to react for an additional 1 h. The reactions were quenched with 8 μL of 4x SDS-PAGE loading buffer and the quenched samples analyzed by SDS-PAGE (10% polyacrylamide; 15 μL of sample/lane) and visualized by in-gel fluorescence using a flatbed fluorescent scanner (BioRad ChemiDoc™ MP or Hitachi FMBio IIe). To measure labeling of recombinant proteins expressed in E. coli, purified protein was added to soluble proteome to a final concentration of 1 μM (CASP8, PRMT1, IMPDH2), 2 μM (TIGAR, IDH1) or 4 μM (IDH1 R132H) and the proteomes were treated as detailed above. IDH1 labeling by IA-rhodamine is stronger in MDA-MB-231 soluble proteome than in Ramos and Jurkat soluble proteomes. Recombinant, active CASP8 in soluble proteome was labeled with Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) (1 μL of 50 μM, final concentration = 2 μM), quenched and analyzed by SDS-PAGE on 14% polyacrylamide gels.
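The stock-to-final concentrations quoted above follow from simple dilution arithmetic. The following is a minimal Python sketch of that calculation; the function name and the `nominal` flag are illustrative assumptions and are not part of the disclosed methods. With `nominal=True` the added stock volume is ignored, which matches the "25x stock" convention in the text; with `nominal=False` the total volume is used.

```python
def final_concentration(stock_conc, stock_vol, sample_vol, nominal=True):
    """Concentration after adding stock_vol of a stock to sample_vol.

    Units only need to be consistent (e.g. uM and uL).
    nominal=True  -> divide by the starting sample volume (25x stock convention).
    nominal=False -> divide by the total volume after addition.
    """
    volume = sample_vol if nominal else sample_vol + stock_vol
    return stock_conc * stock_vol / volume


# 1 uL of 25 uM IA-rhodamine added to 25 uL of proteome
nominal_um = final_concentration(25.0, 1.0, 25.0)
exact_um = final_concentration(25.0, 1.0, 25.0, nominal=False)
print(f"nominal: {nominal_um:.2f} uM, volume-corrected: {exact_um:.2f} uM")
```

Under the nominal convention, 1 μL of a 50 μM stock added to 25 μL likewise gives the stated 2 μM.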
Gel-based ABPP with alkyne-containing click probes
[00318] 25 μL of soluble proteome (1 mg/mL) was labeled with the indicated concentration of 18 or 19 (1 μL of 25x stock solution in DMSO) for 1 h at ambient temperature followed by copper-mediated azide-alkyne cycloaddition (CuAAC) conjugation to rhodamine-azide. CuAAC was performed with 20 μM rhodamine-azide (50x stock in DMSO), 1 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP; fresh 50x stock in water, final concentration = 1 mM), ligand (17x stock in DMSO:t-butanol 1:4, final concentration = 100 μM) and 1 mM CuSO4 (50x stock in water, final concentration = 1 mM). Samples were allowed to react for 1 h at ambient temperature before quenching with 8 μL of 4x SDS-PAGE loading buffer. Quenched reactions were analyzed by SDS-PAGE and visualized by in-gel fluorescence. For CASP8 and IMPDH2, 25 μL of soluble proteomes containing IMPDH2 or Pro-CASP8 (1 μM each) were treated with the indicated fragment for 1 h prior to incubation for 1 h with 18 (1 μL of 625 μM, final concentration = 25 μM) for IMPDH2 or 61 (1 μL of 625 μM, final concentration = 25 μM) for CASP8. For MLTK, HEK 293T cells stably overexpressing MTLK2 were treated with the indicated fragment electrophiles for 1 h, followed by labeling with 59 (1 μL of 125 μM, final concentration = 5 μM) for 1 h. These were followed by CuAAC conjugation to rhodamine-azide and evaluation by SDS-PAGE as described above.
Determination of in vitro IC50 values
[00319] 25 μL of proteomes containing the indicated protein were treated with fragment electrophiles for 1 h at ambient temperature, labeled with the probes detailed above for 1 h, quenched, and analyzed by SDS-PAGE and in-gel fluorescence visualization (n = 3). IA-rhodamine was used as the probe for C161S-TIGAR, C409S-CASP8 and PRMT1. 59 was used as a probe for MLTK. The soluble proteome containing IMPDH2 was treated with ATP for 15 min prior to incubation with 18 (1 μL of 625 μM, final concentration = 25 μM) for 1 h. MLTK and IMPDH2 were subjected to CuAAC conjugation to rhodamine-azide as detailed above. The percentage of labeling was determined by quantifying the integrated optical intensity of the bands, using ImageJ software. Nonlinear regression analysis was used to determine the IC50 values from a dose-response curve generated using GraphPad Prism 6.
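As an illustration of the dose-response analysis described above, the following Python sketch estimates an IC50 from percent-labeling values. It is not the GraphPad Prism procedure: it substitutes a plain least-squares grid search over a logistic model, and it assumes the curve runs from 100% (DMSO control) to 0%. All function names, the Hill-slope grid, and the example data are hypothetical.

```python
import math


def percent_labeling(band_intensity, dmso_intensity):
    """Band intensity expressed as a percentage of the DMSO control band."""
    return 100.0 * band_intensity / dmso_intensity


def model(conc, ic50, hill):
    """Logistic inhibition curve with top fixed at 100% and bottom at 0% (assumption)."""
    return 100.0 / (1.0 + (conc / ic50) ** hill)


def fit_ic50(concs, labeling, hill_grid=(0.5, 0.75, 1.0, 1.5, 2.0), steps=400):
    """Grid search for the (IC50, Hill slope) pair with the lowest squared error.

    IC50 candidates are log-spaced from one decade below the lowest tested
    concentration to one decade above the highest. Concentrations must be > 0.
    """
    lo = math.log10(min(concs)) - 1.0
    hi = math.log10(max(concs)) + 1.0
    best = None
    for i in range(steps + 1):
        ic50 = 10.0 ** (lo + (hi - lo) * i / steps)
        for hill in hill_grid:
            sse = sum((y - model(c, ic50, hill)) ** 2
                      for c, y in zip(concs, labeling))
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]


# Synthetic example: data generated from a curve with IC50 = 10 (arbitrary units)
concs = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0]
labeling = [model(c, 10.0, 1.0) for c in concs]
ic50, hill = fit_ic50(concs, labeling)
print(f"estimated IC50 = {ic50:.1f}, Hill slope = {hill}")
```

A grid search is used here only to keep the sketch dependency-free; a gradient-based nonlinear least-squares routine would normally be preferred for real data.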
isoTOP-ABPP sample preparation
[00320] For in situ labeling, MDA-MB-231 cells were grown to 95% confluence and Ramos cells were grown to 1 million cells/mL. The media in all samples was replaced with fresh media, containing 200 μΜ of the indicated fragments and the cells were incubated at 37 °C for 2 h, washed with cold PBS, scraped into cold PBS and harvested by centrifugation (see prior section on "Preparation of human cancer cell line proteomes").
[00321] Fragments 2, 3, 8, 9, 10, 12, 13, 14, 21, 27, 28, 29, 31, 33, 38, 45, 51 and 56 were screened at 200 μΜ in situ. Fragments 4 and 11 were screened at 100 μΜ in situ. Fragments 2, 3, 8, and 20 were tested at 50 μΜ in situ.
[00322] After in vitro or in situ fragment treatment, the samples were labeled for 1 h at ambient temperature with 100 μM iodoacetamide alkyne (IA-alkyne, 5 μL of 10 mM stock in DMSO). For direct labeling with 61, 61 (5 μL of 1 or 10 mM stocks in DMSO, final concentration = 10 or 100 μM) was substituted for IA-alkyne. Samples were conjugated by CuAAC to either the light (fragment treated) or heavy (DMSO treated) TEV tags (10 μL of 5 mM stocks in DMSO, final concentration = 100 μM) using TCEP, TBTA ligand and CuSO4 as detailed above. The samples were allowed to react for 1 h, at which point the samples were centrifuged (16,000 g, 5 min, 4 °C). The resulting pellets were sonicated in ice-cold methanol (500 μL) and the resuspended light- and heavy-labeled samples were then combined and centrifuged (16,000 g, 5 min, 4 °C). The pellets were solubilized in PBS containing 1.2% SDS (1 mL) with sonication and heating (5 min, 95 °C) and any insoluble material was removed by an additional centrifugation step at ambient temperature (14,000 g, 1 min).
[00323] For each sample, 100 μL of streptavidin-agarose bead slurry (Pierce) was washed in 10 mL PBS and then resuspended in 5 mL PBS. The SDS-solubilized proteins were added to the suspension of streptavidin-agarose beads and the bead mixture was rotated for 3 h at ambient temperature. After incubation, the beads were pelleted by centrifugation (1,400 g, 3 min) and were washed (2 x 10 mL PBS and 2 x 10 mL water).
[00324] The beads were transferred to Eppendorf tubes with 1 mL PBS, centrifuged (1,400 g, 3 min), and resuspended in PBS containing 6 M urea (500 μL). To this was added 10 mM DTT (25 μL of a 200 mM stock in water) and the beads were incubated at 65 °C for 15 min. 20 mM iodoacetamide (25 μL of a 400 mM stock in water) was then added and allowed to react at 37 °C for 30 min with shaking. The bead mixture was diluted with 900 μL PBS, pelleted by centrifugation (1,400 g, 3 min), and resuspended in 200 μL PBS. To this was added 1 mM CaCl2 (2 μL of a 200 mM stock in water) and trypsin (2 μg, Promega, sequencing grade) and the digestion was allowed to proceed overnight at 37 °C with shaking. The beads were separated from the digest with Micro Bio-Spin columns (Bio-Rad) by centrifugation (1,000 g, 1 min), washed (2 x 1 mL PBS and 2 x 1 mL water) and then transferred to fresh Eppendorf tubes with 1 mL water. The washed beads were washed once further in 140 μL TEV buffer (50 mM Tris, pH 8, 0.5 mM EDTA, 1 mM DTT) and then resuspended in 140 μL TEV buffer. 5 μL TEV protease (80 μM) was added and the reactions were rotated overnight at 29 °C. The TEV digest was separated from the beads with Micro Bio-Spin columns by centrifugation (1,400 g, 3 min) and the beads were washed once with water (100 μL). The samples were then acidified to a final concentration of 5% (v/v) formic acid and stored at -80 °C prior to analysis.
Liquid-chromatography-mass-spectrometry (LC-MS) analysis of isoTOP-ABPP samples
[00325] TEV digests were pressure loaded onto a 250 μm (inner diameter) fused silica capillary column packed with C18 resin (Aqua 5 μm, Phenomenex). The samples were analyzed by multidimensional liquid chromatography tandem mass spectrometry (MudPIT), using an LTQ-Velos Orbitrap mass spectrometer (Thermo Scientific) coupled to an Agilent 1200-series quaternary pump. The peptides were eluted onto a biphasic column with a 5 μm tip (100 μm fused silica, packed with C18 (10 cm) and bulk strong cation exchange resin (3 cm, SCX, Phenomenex)) in a 5-step MudPIT experiment, using 0%, 30%, 60%, 90%, and 100% salt bumps of 500 mM aqueous ammonium acetate and using a gradient of 5-100% buffer B in buffer A (buffer A: 95% water, 5% acetonitrile, 0.1% formic acid; buffer B: 5% water, 95% acetonitrile, 0.1% formic acid) as has been described in Weerapana et al. Nat Protoc 2: 1414-1425 (2007). Data were collected in data-dependent acquisition mode with dynamic exclusion enabled (20 s, repeat of 2). One full MS (MS1) scan (400-1800 m/z) was followed by 30 MS2 scans (ITMS) of the nth most abundant ions.
Peptide and protein identification
[00326] The MS2 spectra data were extracted from the raw file using RAW Xtractor (version 1.9.9.2; available at http://fields.scripps.edu/downloads.php). MS2 spectra data were searched using the ProLuCID algorithm (publicly available at http://fields.scripps.edu/downloads.php) using a reverse concatenated, nonredundant variant of the Human UniProt database (release-2012_11). Cysteine residues were searched with a static modification for carboxyamidomethylation (+57.02146) and up to one differential modification for either the light or heavy TEV tags (+464.28595 or +470.29976 respectively). Peptides were required to have at least one tryptic terminus and to contain the TEV modification. ProLuCID data was filtered through DTASelect (version 2.0) to achieve a peptide false-positive rate below 1%.
R value calculation and processing
[00327] The ratios of heavy/light for each unique peptide (DMSO/compound treated; isoTOP-ABPP ratios, R values) were quantified with in-house CIMAGE software, using default parameters (3 MS1s per peak and signal to noise threshold 2.5). Site-specific engagement of electrophilic fragments was assessed by blockade of IA-alkyne probe labeling. For peptides that showed a > 95% reduction in MS1 peak area from the fragment treated proteome (light TEV tag) when compared to the DMSO treated proteome (heavy TEV tag), a maximal ratio of 20 was assigned. Ratios for unique peptide entries were calculated for each experiment; overlapping peptides with the same modified cysteine (e.g. different charge states, MudPIT chromatographic steps or tryptic termini) were grouped together and the median ratio was reported as the final ratio (R). The peptide ratios reported by CIMAGE were further filtered to ensure the removal or correction of low quality ratios in each individual dataset. The quality filters applied were the following: removal of half tryptic peptides; for ratios with high standard deviations from the median (90% of the median or above) the lowest ratio was taken instead of the median; removal of peptides with R=20 and only a single MS2 event triggered during the elution of the parent ion; manual annotation of all the peptides with ratios of 20, removing any peptides with low-quality elution profiles that remained after the previous curation steps. Proteome reactivity values for individual fragments were computed as the percentage of the total quantified cysteine-containing peptides with R values >4 (defined as liganded cysteines) for each replicate experiment and the final proteome reactivity value was calculated as the mean for all replicate experiments for each fragment from both MDA-MB-231 and Ramos cellular proteomes.
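The ratio rules in this paragraph can be sketched in a few lines of Python. This is an illustrative reading of the text, not the CIMAGE implementation: the function names are hypothetical, the ">95% reduction" rule is expressed as a cap on the heavy/light ratio at 20, and the "standard deviation from the median" test is assumed to compare the sample standard deviation of the grouped ratios with 90% of their median.

```python
from statistics import median, stdev

MAX_RATIO = 20.0  # ratio assigned when the light (fragment-treated) signal is nearly absent


def peptide_ratio(heavy_area, light_area):
    """Heavy (DMSO) / light (fragment) MS1 peak-area ratio, capped at 20."""
    if light_area <= 0 or heavy_area / light_area >= MAX_RATIO:
        return MAX_RATIO
    return heavy_area / light_area


def final_ratio(ratios):
    """Final R for one cysteine from its grouped peptide ratios.

    Returns the median, unless the spread is high (stdev >= 90% of the
    median, an assumed reading), in which case the lowest ratio is used.
    """
    med = median(ratios)
    if len(ratios) > 1 and stdev(ratios) >= 0.9 * med:
        return min(ratios)
    return med


def proteome_reactivity(final_ratios, threshold=4.0):
    """Percentage of quantified cysteines with R above the liganded threshold."""
    liganded = sum(1 for r in final_ratios if r > threshold)
    return 100.0 * liganded / len(final_ratios)


# Hypothetical peak areas for three peptide entries covering the same cysteine
grouped = [peptide_ratio(1000.0, 180.0), peptide_ratio(900.0, 150.0),
           peptide_ratio(1100.0, 210.0)]
print(round(final_ratio(grouped), 2))
```

Taking the lowest ratio when replicates disagree is the conservative choice: it avoids calling a cysteine liganded on the strength of a single outlying measurement.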
Cross-data processing
[00328] Biological replicates of the same compound and cell line were averaged if the standard deviation was below 60% of the mean; otherwise the lowest value of the ratio set was taken. For peptides with multiple modified cysteines, the cysteine with the highest number of quantification events was kept and the remaining, redundant peptides were discarded. Peptides included in the aggregate dataset (those used for further bioinformatics and statistical analyses) were required to have been quantified in 3 experiments. Cysteines were categorized as liganded if they had at least two ratios R > 4 (hit fragments) and one ratio between 0.5 and 2 (control fragments). Although the majority of fragments (>75%) were profiled in at least two biological replicates, some data from single-replicate MS experiments were included. Averaged filtered data for all fragments and representative individual filtered datasets are found in Tables 1-3.
In situ data processing
[00329] R values were calculated and individual datasets were filtered as described above (R value calculation and processing). Two categories of hits in situ were defined: 1) cysteines liganded in situ that were also observed as hits in vitro and 2) cysteines that were detected in vitro but were only liganded in situ. For the first category, R values for the same cysteine-containing peptide from in vitro and in situ experiments were compared and, if both had ratios R > 4, the cysteine was considered ligandable in situ. To qualify for the second category, two ratios R > 4 for replicates of two different fragments were required to be detected in situ, and at least one of these fragments had to be quantified as a non-hit with R < 2 in vitro. Additionally, another cysteine from the same protein was required to be unliganded in situ (R < 2) by the same fragment, to control for the possibility that changes in R values arose from changes in protein expression upon fragment treatment rather than from fragment competition.
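The replicate-merging rule from the cross-data processing step and the two in situ hit categories can be sketched as follows. This is one possible reading of the text, not the authors' scripts: all function names are hypothetical, "one ratio between 0.5 and 2" is interpreted as "at least one", and each cysteine is represented as a dictionary mapping a fragment identifier to its merged R value.

```python
from statistics import mean, stdev


def merge_replicates(ratios, sd_fraction=0.6):
    """Average replicate R values unless they disagree strongly."""
    if len(ratios) == 1:
        return ratios[0]
    avg = mean(ratios)
    return avg if stdev(ratios) < sd_fraction * avg else min(ratios)


def is_liganded(ratio_by_fragment):
    """At least two hit fragments (R > 4) and one control fragment (0.5-2)."""
    values = list(ratio_by_fragment.values())
    hits = sum(1 for r in values if r > 4)
    controls = sum(1 for r in values if 0.5 <= r <= 2)
    return hits >= 2 and controls >= 1


def in_situ_category(in_vitro, in_situ, sibling_in_situ):
    """Return 1, 2 or None for one cysteine.

    in_vitro / in_situ: fragment -> R for the cysteine of interest.
    sibling_in_situ: fragment -> R for another cysteine on the same protein.
    """
    situ_hits = [f for f, r in in_situ.items() if r > 4]
    # Category 1: the same fragment is a hit both in vitro and in situ.
    if any(in_vitro.get(f, 0.0) > 4 for f in situ_hits):
        return 1
    # Category 2: two in situ hit fragments, one of them a non-hit in vitro,
    # plus an unliganded sibling cysteine as a protein-expression control.
    vitro_nonhit = any(f in in_vitro and in_vitro[f] < 2 for f in situ_hits)
    expression_ctrl = any(
        f in sibling_in_situ and sibling_in_situ[f] < 2 for f in situ_hits
    )
    if len(situ_hits) >= 2 and vitro_nonhit and expression_ctrl:
        return 2
    return None
```

The sibling-cysteine check is what separates true fragment competition from a drop in protein abundance, since a change in expression would shift R values for every cysteine on the protein.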
Functional annotation of liganded cysteines
[00330] Custom Python scripts were used to compile functional annotations available in the UniProtKB/Swiss-Prot Protein Knowledge database (release 2012_11). Relevant UniProt entries were mined for available functional annotations at the residue level, specifically for annotations regarding enzyme catalytic residues (active sites), disulfides (redox-active and structural) and metal-binding sites. Liganded proteins were queried against the DrugBank database (Version 4.2) and fractionated into DrugBank and non-DrugBank proteins. Functional keywords assigned at the protein level were collected from the UniProt database, and the DrugBank and non-DrugBank categories were further classified into protein functional classes. Cysteine reactivity data was re-processed using ProLuCID as detailed above (Peptide and protein identification). Cysteines found in both the reactivity and ligandability datasets were sorted based on their reactivity values (a lower ratio indicates higher reactivity). The moving average of the percentage of total liganded cysteines within each reactivity bin (step-size 50) was taken. Custom Python scripts were developed to collect relevant NMR and X-ray structures from the RCSB Protein Data Bank (PDB). For proteins without available PDB structures, sequence alignments, performed with BLAST against proteins deposited in the PDB, were used to identify structural homologues. For annotation of active-site and non-active-site cysteines, enzymes with structures in the PDB were manually inspected to evaluate the location of the cysteine. Cysteines were considered to reside in enzyme active sites if they were within 10 Å of an active-site ligand or residue(s). Cysteines outside of the 10 Å range were deemed non-active-site residues.
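The moving-average step for relating reactivity to ligandability could look like the sketch below. The interpretation is an assumption: cysteines are sorted by reactivity ratio, and the percentage of liganded cysteines is computed in a sliding window of 50 consecutive cysteines. The function name and input layout are hypothetical.

```python
def liganded_moving_average(cysteines, window=50):
    """Sliding-window percentage of liganded cysteines.

    cysteines: iterable of (reactivity_ratio, is_liganded) pairs.
    Returns one percentage per window position, most reactive first.
    """
    ordered = sorted(cysteines, key=lambda c: c[0])  # low ratio = high reactivity
    flags = [1 if liganded else 0 for _, liganded in ordered]
    averages = []
    for start in range(len(flags) - window + 1):
        chunk = flags[start:start + window]
        averages.append(100.0 * sum(chunk) / window)
    return averages
```

A small window is used in testing only to keep the example short; the text specifies 50.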
Histograms of fragment hit-rates across high-coverage, ligandable cysteines, active-site and non-active-site cysteines were calculated from the subset of ligandable cysteines quantified in 10 or more separate experiments. The fragment hit rate is reported as the percentage of the total quantification events with R > 4. For analyses of trends within the whole data, including histograms and heatmaps, a cell-line merged dataset was used where data from the MDA-MB-231 experiments was taken first and the Ramos data was used if there was no data from MDA-MB-231 experiments for a particular fragment and cysteine. Heatmaps were generated in R (version 3.1.3) using the heatmap.2 algorithm. Protein structures were rendered using PyMOL.
GSH reactivity
[00331] Glutathione (GSH) was diluted to a final concentration of 125 μM in assay buffer (100 mM Tris, pH 8.8, 10% ethanol as co-solvent). In triplicate, to 100 μL of the GSH mixture in a clear 96-well plate (Costar® Corning®), the indicated electrophile (2 μL of a 50 mM stock solution in DMSO, final concentration = 500 μM) was added and the reaction mixture was incubated at room temperature for 1 h. Then 5 μL of Ellman's reagent (100 mM stock in 1 M NaOH, final concentration = 5 mM) was added and the absorbance was measured at 440 nm on a plate reader (Tecan Infinite F500). The concentration of GSH remaining was calculated from a standard curve.
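As an illustration of the standard-curve step, the sketch below fits a straight line to absorbance readings of GSH standards and inverts it for an unknown well. A linear curve is an assumption (the text does not describe the fit), and the function names and any standard values supplied by a caller are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gsh_remaining(absorbance, slope, intercept):
    """Invert the standard curve: absorbance -> GSH concentration (uM)."""
    return (absorbance - intercept) / slope


def percent_consumed(remaining_um, initial_um=125.0):
    """Fraction of the starting 125 uM GSH consumed by the electrophile."""
    return 100.0 * (1.0 - remaining_um / initial_um)
```

In use, `fit_line` would be called once per plate on the standards, and `gsh_remaining` applied to the mean of each triplicate.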
Reactive cysteine docking
[00332] An in silico fragment library containing all chloroacetamide and acrylamide fragments from Fig. 3 was prepared using the Open Babel library with custom Python scripts. Fragments were modeled in their reactive form (i.e., with explicit chloroacetamide and acrylamide warheads). 3D coordinates were generated from SMILES strings, protonation states were calculated at pH 7.4, and the structures were then minimized using the MMFF94s force field (50K iterations of steepest descent; 90K of conjugate gradient); for chiral molecules with undefined configuration, all enantiomers were generated, resulting in 53 total fragments.
[00333] For each protein, the UniProtKB ID was used to filter the PDB. Structures determined by X-ray crystallography were selected, prioritizing higher sequence coverage and structure resolution (see Table 5 for selected PDB IDs). When no human structures were available, the closest homologous organism available was selected (e.g., PRMT1: R. norvegicus). Protein structures were prepared following the standard AutoDock protocol. Waters, salts, and crystallographic additives were removed; AutoDockTools was used to add hydrogens, calculate Gasteiger-Marsili charges and generate PDBQT files.
[00334] The MSMS reduced surface method was used to identify accessible cysteines. The protein volume was scanned using a probe radius of 1.5 Å; residues were considered accessible if they had at least one atom in contact with either external surfaces or internal cavities.
[00335] The fragment library was docked independently on each accessible cysteine using AutoDock 4.2. A grid box of 24.4 × 24.4 × 24.4 Å was centered on the geometric center of the residue; the thiol hydrogen was removed from the side chain, which was modeled as flexible during the docking; the rest of the structure was kept rigid. A custom 13-7 interaction potential was defined between the nucleophilic sulfur and the reactive carbon in the ligands. The equilibrium distance (req) was set to the length of the C-S covalent bond (1.8 Å); the potential well depth (εeq) varied between 1.0 and 0.175 to model the reactivity of the different ligands. For each fragment, the potential well depth was determined by dividing its proteomic reactivity percentage by 20, and the value for iodoacetamide was approximated as the maximum (2.5) for reference. The potential was implemented by modifying the force field table of AutoDock. Fragments were docked with no constraints, generating 100 poses using the default GA settings. For each fragment, the pose with the best docking score was analyzed: if the distance between the nucleophilic sulfur and the reactive carbon was <2.0 Å, the cysteine was considered covalently modified. If a residue was alkylated by at least one ligand, it was considered labeled. The docking score (i.e., negative binding energy) was calculated based on the estimated interaction energy of each fragment in its docked pose. The docking score of the best alkylating fragment defined the labeling score. The residue with the best labeling score was considered the most probable to be labeled.
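The scoring logic of this paragraph can be sketched in Python. The well-depth rule, the 1.8 Å equilibrium distance and the 2.0 Å cutoff come from the text; the functional form of the 13-7 term is an assumption, written here as the generic n-m pairwise potential whose minimum is -ε at r = req. The function names and the pose dictionary layout are hypothetical, and this is not the modified AutoDock force field table itself.

```python
def well_depth(proteome_reactivity_percent):
    """Potential well depth: proteomic reactivity percentage divided by 20."""
    return proteome_reactivity_percent / 20.0


def potential_13_7(r, eps, r_eq=1.8, n=13, m=7):
    """Assumed generic n-m potential with minimum -eps at r = r_eq (in Å)."""
    ratio = r_eq / r
    return eps * (m / (n - m) * ratio ** n - n / (n - m) * ratio ** m)


def labeling_result(best_poses, cutoff=2.0):
    """Pick the best alkylating fragment for one cysteine.

    best_poses: fragment -> (S-C distance in Å, docking score) of its best pose.
    Docking score is the negative binding energy, so larger is better.
    Returns (fragment, labeling_score), or None if no fragment alkylates.
    """
    alkylating = {
        frag: score
        for frag, (distance, score) in best_poses.items()
        if distance < cutoff
    }
    if not alkylating:
        return None
    best = max(alkylating, key=alkylating.get)
    return best, alkylating[best]
```

Ranking residues by the returned labeling score then gives the cysteine predicted as most likely to be labeled.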
Structural modeling
[00336] The IMPDH2 structure, including the Bateman domain, was modeled using I-TASSER.
Subcloning and mutagenesis
[00337] Full-length cDNAs encoding IDH1 (Open Biosystems, Clone ID: 3880331) and IMPDH2 (Open Biosystems, Clone ID: 3447994) were subcloned into pET22b (+) (Novagen) with a C-terminal His6-affinity tag (SEQ ID NO: 861). Full-length cDNA encoding TIGAR (Origene, Sc320794) was subcloned into pET28a (+) (Novagen) with an N-terminal His6-affinity tag (SEQ ID NO: 861). Full-length PRMT1 subcloned into pET45b (+) (Novagen) was previously generated by the Cravatt lab. Full-length human CASP3 (residues 1-277) and a truncated CASP8 (residues 217-479) without the CARD domain were subcloned into pET23b (Novagen) with C-terminal His6-affinity tags (SEQ ID NO: 861). Cysteine mutants were generated using QuikChange site-directed mutagenesis, using primers containing the desired mutations and their respective complements.
Recombinant over expression of TIGAR, IDH1, PRMT1 and IMPDH2
[00338] TIGAR, IDH1, PRMT1 and IMPDH2 were expressed in BL21(DE3) Chemically Competent Cells (NEB), grown in Terrific Broth supplemented with the desired antibiotic (50 μg/mL kanamycin or 50 μg/mL carbenicillin) to an OD600 of 0.8 and induced with 0.5 mM IPTG for 16 h at 18 °C. Cells were immediately harvested and resuspended in 30 mL cold buffer A (25 mM Tris, pH 7.4, 200 mM NaCl, 10% glycerol, 1 mM BME), supplemented with lysozyme (Sigma), DNase (NEB) and cOmplete protease inhibitor tablets (Roche), sonicated, and centrifuged (45,000 g, 30 min, 4 °C). The soluble fractions were collected and rotated for 1 h with 1 mL Ni-NTA slurry (Qiagen) at 4 °C. The slurry was then transferred to a 50 mL fritted column and collected by gravity flow. The resin was then washed with 100 mL buffer A containing 20 mM imidazole and eluted with 10 mL buffer A containing 200 mM imidazole. The eluate was concentrated to 2.5 mL (Amicon-Ultra-15, 10 kDa MW cutoff), buffer exchanged using PD10 columns (GE Amersham) into the storage buffer (50 mM HEPES, pH 7.4, 150 mM NaCl, 10% glycerol, 1 mM BME) and further concentrated (Amicon-Ultra-4, 10 kDa MW cutoff) to a final concentration of approximately 100 μM protein. Protein concentration was determined using the Bio-Rad DC™ protein assay kit. Protein purity was assessed by SDS-PAGE under reducing conditions; all proteins were >95% pure.
Recombinant CASP3, CASP8 and TEV protease expression
[00339] CASP3, CASP8, pro-CASP8 (D374A, D384A) and an N-terminal MBP fusion-His6-TEV-Arg6 protease construct pRK793 ("His6" disclosed as SEQ ID NO: 861 and "Arg6" disclosed as SEQ ID NO: 862) were expressed in E. coli BL21(DE3)pLysS cells (Stratagene). Cells were grown in 2xYT medium supplemented with 200 μg/mL ampicillin and 50 μg/mL chloramphenicol at 37 °C to an OD600 of 0.8-1.0. Overexpression of caspase was induced with 0.2 mM IPTG at 30 °C for 4 h (CASP3) or at 12 °C overnight (CASP8), or with 0.5 mM IPTG at 30 °C for 4 h (TEV protease). Cells were immediately harvested and resuspended in ice-cold buffer A (caspases: 100 mM Tris, pH 8.0, 100 mM NaCl; TEV protease: PBS) and subjected to 3 cycles of lysis by microfluidization (Microfluidics). The cell lysate was clarified by centrifugation (45,000 g, 30 min, 4 °C) and soluble fractions were loaded onto a 1 mL HisTrap HP Ni-NTA affinity column (GE Amersham) pre-equilibrated with buffer A and eluted with buffer A containing 200 mM imidazole. The eluted protein was immediately diluted two-fold with buffer B (20 mM Tris, pH 8.0) and purified by anion-exchange chromatography (HiTrap Q HP, GE Amersham) with a 30-column volume gradient to 50% of buffer B containing 1 M NaCl. The caspases were injected over a Superdex 200 16/60 gel filtration column (GE Amersham) and TEV protease over a Superdex 75 gel filtration column (GE Amersham) in buffer C (caspases: 20 mM Tris, pH 8.0, 50 mM NaCl; TEV protease: PBS, 10 mM DTT) to buffer exchange and to remove any remaining contaminants. Fractions containing the desired protein were pooled and concentrated to approximately 1 mg/mL (Millipore Ultrafree-15, 10 kDa MW cutoff). The purified proteins were immediately frozen and stored at -80 °C. Protein concentrations were measured using both the Bio-Rad colorimetric assay and A280 absorbance in denaturing conditions. Protein purity was assessed by SDS-PAGE under reducing conditions; all proteins were >98% pure.
Retroviral over expression of flag-tagged IDH1 proteins
[00340] R132H-IDH1, including an additional K345K silent mutation to remove an unwanted restriction site, and GFP were subcloned into a modified pCLNCX retroviral vector. Retrovirus was prepared by taking 1.5 μg of each pCLNCX vector, 1.5 μg pCMV-VSV-G and 20 μL of Roche X-tremeGeneHP DNA transfection reagent to transfect HEK-293RTV cells. The medium was replaced after 1 day of transfection and, the following day, the culture supernatant was collected and filtered through a 0.5 μm filter. 10 mL of the filtrate, containing the desired virus, was used to infect MUM2C cells in the presence of polybrene (8 μg/mL) for 48 h, at which point the infected cells were selected in medium containing 100 μg/mL hygromycin for 7-10 days. Surviving cells were expanded and cultured in complete RPMI-1640 medium containing hygromycin.
IDH1 NADP assay
[00341] Recombinant IDH1 and C269S-IDH1 (100 μM in storage buffer) were diluted 1:200 in MDA-MB-231 cellular proteome (1 mg/mL). To 25 μL of this mixture was added 1 μL of the indicated compound (25× stock solution in DMSO) and the lysates were incubated for 1 h at room temperature in clear 96-well plates (Corning® Costar®). 75 μL per well of a stock solution of NADP (13.3 mM) and isocitrate (13.3 mM) in IDH1 buffer (40 mM Tris, pH 7.4, 2 mM MgCl2, 0.01% pluronic) was added immediately before measuring UV absorbance at 340 nm on a 96-well UV absorbance plate reader (TECAN). Absorbance was measured for 45 minutes and the relative activities were calculated from the change in absorbance for the linear portion of the curve.
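The relative-activity calculation can be illustrated as a ratio of initial rates. The sketch assumes that the caller has already restricted each trace to its linear portion and that activity is expressed as a percentage of a DMSO control; neither the fitting method nor the reference condition is spelled out in the text, and the function names are hypothetical.

```python
def diluted_concentration(stock_um, dilution_factor):
    """Concentration after a 1:dilution_factor dilution (e.g. 1:200)."""
    return stock_um / dilution_factor


def rate(times_min, absorbances):
    """Least-squares slope of A340 versus time over the linear portion."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    stt = sum((t - mean_t) ** 2 for t in times_min)
    sta = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    return sta / stt


def relative_activity(times_min, compound_trace, dmso_trace):
    """Compound-treated rate as a percentage of the DMSO control rate."""
    return 100.0 * rate(times_min, compound_trace) / rate(times_min, dmso_trace)
```

With the quantities in this paragraph, `diluted_concentration(100, 200)` gives the 0.5 μM enzyme concentration present during compound treatment.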
IDH1 2-hydroxyglutarate (2-HG) formation assay
[00342] MUM2C cells stably overexpressing IDH1 R132H were seeded at 1.5 × 10^6 cells per 150 mm dish. The following day the indicated compounds (50 mM stock solutions in DMSO) or DMSO were added to the cells to the final concentrations indicated and were allowed to incubate for 2 h. Control cells overexpressing GFP were treated in parallel. The cells were washed in ice-cold PBS and collected by scraping in ice-cold PBS and centrifugation (1,400 g, 3 min, 4 °C). The cell pellets were then resuspended in 100 μL ice-cold PBS, followed by sonication and centrifugation at 16,000 g for 10 min. Lysates were then buffer exchanged into IDH1 buffer (40 mM Tris, pH 7.4, 2 mM MgCl2) with 0.5 mL ZEBA spin desalting columns (Thermo Fisher, 89882). The protein concentrations were adjusted to 3.5 mg/mL, 25 μL of the lysate was mixed with 25 μL of the reaction mixture (2.5 mM NADPH and 2.5 mM α-ketoglutarate in IDH1 buffer) and the reaction was allowed to proceed for 4 h, at which point the reaction mixtures were quenched with 50 μL cold methanol, followed by centrifugation (16,000 g, 10 min, 4 °C). Formation of 2-HG was followed by targeted LC/MS analysis. The reaction mixture was separated with a Luna-NH2 column (5 μm, 100 Å, 50 × 4.6 mm, Phenomenex) with a precolumn (NH2, 4 × 3.0 mm) using a gradient of mobile phases A and B (mobile phase A: 100% CH3CN, 0.1% formic acid; mobile phase B: 95:5 (v/v) H2O:CH3CN, 50 mM NH4OAc, 0.2% NH4OH). The flow rate started at 0.1 mL/min, and the gradient consisted of 5 min at 0% B, a linear increase to 100% B over 20 min at a flow rate of 0.4 mL/min, followed by an isocratic step of 100% B for 2 min at 0.5 mL/min before equilibrating for 3 min at 0% B at 0.4 mL/min (30 min total). For each run, the injection volume was 25 μL. MS analysis was performed on an Agilent G6410B tandem mass spectrometer with an ESI source. The dwell time for 2-HG was set to 100 ms, and the collision energy for 2-HG was set to 5. The capillary was set to 4 kV, and the fragmentor was set to 100 V. The drying gas temperature was 350 °C, the drying gas flow rate was 11 L/min and the nebulizer pressure was 35 psi. The mass spectrometer was run in MRM mode, monitoring the transition of m/z from 146.7 to 129 for 2-HG (negative ionization mode). Treatments were conducted in triplicate. Background 2-HG production, calculated from the 'mock' GFP-overexpressing cells, was subtracted from the total 2-HG production.
TIGAR activity assay
[00343] The TIGAR activity assay was conducted as described in Gerin et al. The Biochemical Journal 458:439-448 (2014). Formation of 3PG (3-phosphoglycerate) from 23BPG (2,3-bisphosphoglycerate) was measured spectrophotometrically on a TECAN plate reader, measuring the decrease in absorbance at 340 nm in a clear, flat-bottom 96-well microplate (Corning® Costar®). 2 μL of recombinant TIGAR (10 mg/mL) was diluted into 1 mL dilution buffer (25 mM HEPES, pH 7.1, 25 mM KCl, 1 mM MgCl2). 25 μL of diluted protein was incubated for 1 h with the indicated concentration of compound (1 μL, 25× stock in DMSO). Then 75 μL of assay mixture comprising 25 mM HEPES (pH 7.1), 25 mM KCl, 1 mM MgCl2, 0.5 mM NADH, 1 mM DTT, 1 mM 23BPG, 1 mM ATP-Mg, and the equivalent of 1 μL each of rabbit muscle GAPDH (4000 units/mL, Sigma, G5537) and yeast PG kinase (6300 units/mL, Sigma, P7634) was added to the protein and the decrease in absorbance was monitored at 340 nm. The background, calculated from samples lacking TIGAR, was subtracted from samples containing TIGAR. Experiments were performed in quadruplicate.
PRMT1 in vitro methylation assays
[00344] PRMT1 assays were conducted as described in Weerapana et al. Nature 468:790-795 (2010). Recombinant human PRMT1 (0.85 μM, wild type or C101S mutant) in 25 μL methylation buffer (20 mM Tris, pH 8.0, 200 mM NaCl, 0.4 mM EDTA) was pre-incubated with the indicated fragments for 1 h and methylation activity was monitored after addition of 1 mg of recombinant histone 4 (NEB, M2504S) and 3H-SAM (2 μCi). Reactions were further incubated for 60 min at ambient temperature and stopped with 4× SDS sample buffer. SDS-PAGE gels were fixed with 10% acetic acid/10% methanol (v/v), washed, and incubated with Amplify reagent (Amersham) before exposing to film at -80 °C for 3 days.
MLTK in vitro kinase activity assay
[00345] The kinase activity assay protocol was conducted as described in Wang et al. ACS Chemical Biology 9:2194-2198 (2014). Kinase assay buffers, myelin basic protein (MBP) substrate and ATP stock solution were purchased from SignalChem. Radio-labeled [γ-33P]ATP was purchased from PerkinElmer. 250 μL of HEK-293T soluble lysates (8 mg/mL), stably overexpressing WT, C22A or K45M MLTK, were labeled for 1 h with 100 μM fragment or DMSO. The samples were then individually immunoprecipitated with 20 μL flag resin slurry per sample and then eluted with 15 μL 3×Flag-peptide. To each sample was added 5 μL of MBP, and then 5 μL of [γ-33P]ATP assay cocktail (250 μM, 167 μCi/mL) was added to initiate the kinase reaction. Each reaction mixture was incubated at ambient temperature for 30 min, and the reactions were terminated by spotting 25 μL of the reaction mixture onto individual precut phosphocellulose P81 paper. The spotted P81 strips were washed with 10 mL of 1% phosphoric acid (3 × 10 min). MLTK activity was measured by counting the radioactivity on the P81 paper in the presence of scintillation fluid in a scintillation counter. The background was determined from the K45M-inactive mutant MLTK activity level, which was subtracted from the WT and C22A samples. Relative activities for WT and C22A were normalized to their respective DMSO-treated samples. Experiments were performed in triplicate.
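The background subtraction and normalization described above amount to a short calculation, sketched here. The function name is hypothetical, and averaging the replicate background and DMSO counts before normalizing is an assumption; the text states only which values are subtracted and what the results are normalized to.

```python
from statistics import mean


def relative_kinase_activity(treated_counts, dmso_counts, background_counts):
    """Percent activity of fragment-treated samples for one MLTK variant.

    treated_counts: scintillation counts for fragment-treated replicates.
    dmso_counts: counts for DMSO-treated replicates of the same variant.
    background_counts: counts for the kinase-inactive K45M mutant.
    """
    background = mean(background_counts)
    dmso_signal = mean(dmso_counts) - background
    return [100.0 * (c - background) / dmso_signal for c in treated_counts]
```

Calling the function separately with the WT and the C22A counts reproduces the "respective DMSO-treated samples" normalization, so that a cysteine-dependent loss of activity appears as a low percentage for WT but not for C22A.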
CASP3 and CASP8 in vitro activity assays
[00346] Caspase 3 and 8 assays were conducted with the CASP8 activity assay kit (BioVision, K112-100) and the Caspase 3 activity assay kit (Invitrogen, EnzChek® Caspase-3 Assay Kit), following the manufacturer's instructions. Briefly, recombinant Caspase 3 (10 μM) was added to soluble Ramos lysates (1 mg/mL) to a 100 nM final concentration of protease. Caspase 8 (30 μM) was added to soluble Ramos lysates to a 1 μM final concentration of protease. In triplicate, 50 μL lysate was treated with either DMSO, DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) (20 μM) or the indicated compounds (100 μM) for 1 h, following which 50 μL of 2× reaction buffer containing 10 mM DTT and 5 μL substrate (4 mM stock in DMSO of IETD-AFC ("IETD" disclosed as SEQ ID NO: 858) for CASP8; 10 mM stock in DMSO of DEVD-AMC ("DEVD" disclosed as SEQ ID NO: 857) for CASP3) was added to each well and the samples were incubated at ambient temperature for 2 h. Caspase activity was measured from the increase in fluorescence (excitation 380 nm, emission 460 nm). Experiments were performed in triplicate. Background was calculated from samples lacking the recombinant caspase.
Apoptosis assays with Caspase 8 inhibitors
[00347] 4 mL of Jurkat cells in RPMI (1.5 million cells/mL) were treated with the indicated compound at 30 μM for 30 min (50 mM stock solution in DMSO). Z-VAD-FMK (EMD Millipore Biosciences, 627610) was used at a final concentration of 100 μM. After preincubation, FASL (4 μL of 100 μg/mL stock solution of SuperFasLigand™ in water, final concentration = 100 ng/mL, Enzo Life Sciences) or staurosporine (8 μL of 1 mM stock solution in DMSO, final concentration = 2 μM, Fisher Scientific, 50664333) was added. After 6 hours, cells were harvested by centrifugation, washed and lysed in cell lysis buffer (BioVision, 1067-100), and 40 μg of each sample was separated by SDS-PAGE on 14% polyacrylamide gels. The gels were transferred to nitrocellulose membranes and were immunoblotted overnight with the indicated antibodies. For measurements of cell viability, in quadruplicate for each condition, 150,000 cells (100 μL of 1.5 million cells/mL) were plated in Nunc™ MicroWell™ 96-Well Optical-Bottom Plates with Polymer Base (Fisher Scientific). Compounds, FASL and STS were used at the same concentrations indicated above with a 30 minute pre-incubation with compound, followed by 6 hours with either STS, FASL or DMSO. Cell viability was measured with the CellTiter-Glo® Luminescent Cell Viability Assay (Promega) and was read on a BioTek Synergy 4 plate reader.
Western blotting
[00348] For CASP8, CASP3 and PARP, cell pellets were resuspended in cell lysis buffer (BioVision, 1067-100) with 1× cOmplete protease inhibitor (Roche) and allowed to incubate on ice for 30 min prior to centrifugation (10 min, 16,000 g). For all other proteins, cell pellets were resuspended in PBS and lysed with sonication prior to centrifugation (10 min, 16,000 g). The proteins were then resolved by SDS-PAGE and transferred to nitrocellulose membranes, blocked with 5% BSA in TBST and probed with the indicated antibodies. The primary antibodies and the dilutions used are as follows: anti-PARP (Cell Signaling, 9532, 1:1000), anti-CASP3 (Cell Signaling, 9662, 1:500), anti-CASP8 (Cell Signaling, 9746, 1:500), anti-IDH1 (Cell Signaling, 3997s, 1:500), anti-actin (Cell Signaling, 3700, 1:3000), anti-GAPDH (Santa Cruz, sc-32233, 1:2000), anti-flag (Sigma Aldrich, F1804, 1:3000). Blots were incubated with primary antibodies overnight at 4 °C with rocking and were then washed (3 × 5 min, TBST) and incubated with secondary antibodies (LICOR, IRDye® 800CW or IRDye® 800LT, 1:10,000) for 1 h at ambient temperature. Blots were further washed (3 × 5 min, TBST) and visualized on a LICOR Odyssey Scanner.
Statistical analysis
[00349] Data are shown as mean ± SEM. P values were calculated using an unpaired, two-tailed Student's t-test. P values of <0.05 were considered significant.
Prediction Failures in Reactive Docking
[00350] Prediction failures were due to the approximations of the rigid model used with highly flexible/solvent-exposed loop regions (STAT1:C255, PDB ID: 1YVL; HAT1:C101, PDB ID: 2P0W; ZAP70:C117, PDB ID: 4K2R), or with partially buried residues (SARS:C438, PDB ID: 4L87; PAICS:C374, PDB ID: 2H31). In some embodiments, the simulation of some degree of flexibility (such as flexible side chains) improves the success rate. In some embodiments, the method was limited by the availability and quality of crystallographic structures, when sequences were not fully resolved in available models (XPO1:C34, C1070, PDB ID: 3GB8; FNBP1:C511, C555, C609, PDB ID: 2EFL; IMPDH2:C140, PDB ID: 1NF7), or when only orthologue sequences were available (PRMT1: R. norvegicus, PDB ID: 1ORI).
General Synthetic Methods
[00351] Chemicals and reagents were purchased from a variety of vendors, including Sigma Aldrich, Acros, Fisher, Fluka, Santa Cruz, CombiBlocks, BioBlocks, and Matrix Scientific, and were used without further purification, unless noted otherwise. Anhydrous solvents were obtained as commercially available pre-dried, oxygen-free formulations. Flash chromatography was carried out using 230-400 mesh silica gel. Preparative thin layer chromatography (PTLC) was carried out using glass-backed PTLC plates of 500-2000 μm thickness (Analtech). All reactions were monitored by thin layer chromatography carried out on 0.25 mm E. Merck silica gel plates (60F-254) and visualized with UV light, or by ninhydrin, ethanolic phosphomolybdic acid, iodine, p-anisaldehyde or potassium permanganate stain. NMR spectra were recorded on Varian INOVA-400, Bruker DRX-600 or Bruker DRX-500 spectrometers in the indicated solvent. Multiplicities are reported with the following abbreviations: s singlet; d doublet; t triplet; q quartet; p pentet; m multiplet; br broad. Chemical shifts were reported in ppm relative to TMS and J values were reported in Hz. Mass spectrometry data were collected on an HP 1100 single-quadrupole instrument (ESI; low resolution) or an Agilent ESI-TOF instrument (HRMS).
[00352] In some embodiments, General Procedure A was used for the synthesis of one or more of the small molecule fragments and/or cysteine-reactive probes described herein. The amine was dissolved in anhydrous CH2Cl2 (0.2 M) and cooled to 0 °C. To this, anhydrous pyridine (1.5 equiv.) was added in one portion, then chloroacetyl chloride (1.5 equiv.) dropwise, and the reaction was monitored by TLC until complete disappearance of starting material and conversion to product was detected (typically 1 h). If the reaction did not proceed to completion, additional aliquots of pyridine (0.5 equiv.) and chloroacetyl chloride (0.5 equiv.) were added. The reaction was quenched with H2O (1 mL), diluted with CH2Cl2 (20 mL), and washed twice with saturated NaHCO3 (100 mL). The organic layer was concentrated in vacuo and purified by preparatory thin layer or flash column chromatography to afford the desired product. In some embodiments, General Procedure A1 is similar to General Procedure A except triethylamine (3 equiv.) was used instead of pyridine. In some embodiments, General Procedure A2 is similar to General Procedure A except N-methylmorpholine (3 equiv.) was used instead of pyridine.
[00353] In some embodiments, General Procedure B was used for the synthesis of one or more of the small molecule fragments and/or cysteine-reactive probes described herein. The amine was dissolved in anhydrous CH2Cl2 (0.2 M) and cooled to 0 °C. To this, triethylamine (TEA, 1.5 equiv.) was added in one portion, then acryloyl chloride (1.5 equiv.) dropwise, and the reaction was monitored by TLC until complete disappearance of starting material and conversion to product was detected (typically 1 h). If the reaction did not proceed to completion, additional aliquots of TEA (0.5 equiv.) and acryloyl chloride (0.5 equiv.) were added. The reaction was quenched with H2O (1 mL), diluted with CH2Cl2 (20 mL), and washed twice with saturated NaHCO3 (100 mL). The organic layer was passed through a plug of silica, after which the eluate was concentrated in vacuo and purified by preparatory thin layer or flash column chromatography to afford the desired product.
[00354] In some embodiments, General Procedure C was used for the synthesis of one or more of the small molecule fragments and/or cysteine-reactive probes described herein. Acryloyl chloride (80.4 μL, 1.0 mmol, 2 equiv.) was dissolved in anhydrous CH2Cl2 (4 mL) and cooled to 0 °C. A solution of the amine (0.5 mmol, 1 equiv.) and N-methylmorpholine (0.16 mL, 1.5 mmol, 3 equiv.) in CH2Cl2 (2 mL) was then added dropwise. The reaction was stirred for 1 h at 0 °C and then allowed to warm to room temperature slowly. After TLC analysis showed disappearance of starting material, or 6 h, whichever was sooner, the reaction was quenched with saturated aqueous NaHCO3 (5 mL) and extracted with CH2Cl2 (3 × 10 mL). The combined organic layers were dried over anhydrous Na2SO4, concentrated in vacuo, and the residue obtained was purified by preparatory thin layer chromatography to afford the desired product.
Synthesis of probes and fragments
Purchased fragments
[00355] The following electrophilic fragments were purchased from the indicated vendors. 2 (Santa Cruz Biotechnology sc-345083), 3 (Key Organics JS-092C), 4 (Sigma Aldrich T142433-10mg), 6 (Toronto Research Chemicals M320600), 8 (Alfa Aesar H33763), 10 (Santa Cruz Biotechnology sc-345060), 11 (Santa Cruz Biotechnology sc-354895), 12 (Santa Cruz Biotechnology sc-354966), 21 (Santa Cruz Biotechnology sc-279681), 22 (Sigma Aldrich 699357-5G), 26 (Sigma Aldrich T109959), 27 (Santa Cruz Biotechnology sc-342184), 28 (Santa Cruz Biotechnology sc-335173), 29 (Santa Cruz Biotechnology sc-348978), 30 (Santa Cruz Biotechnology sc-355362), 32 (Santa Cruz Biotechnology sc-354613), 33 (Sigma Aldrich R996505), 34 (Santa Cruz Biotechnology sc-355477), 35 (Santa Cruz Biotechnology sc-328985), 41 (Sigma Aldrich L469769), 42 (Sigma Aldrich R901946), 43 (Santa Cruz Biotechnology sc-307626), 52 (Enamine EN300-08075), 55 (Santa Cruz Biotechnology sc-354880), 57 (VWR 100268-442), 58 (Enzo Life Sciences ALX-430-142-M005), 62 (WuXi Apptec).
Synthesis of isotopically-labeled TEV-tags:
[Figure: structure of the TEV tag, with the TEV recognition sequence labeled]
[00356] Isotopically-labeled heavy and light tags were synthesized with minor modifications to the procedure reported in Weerapana et al. Nat Protoc 2:1414-1425 (2007) and Weerapana et al. Nature 468:790-795 (2010). Fmoc-Rink-Amide-MBHA resin (EMD Biosciences; 0.5 M, 830 mg, 0.6 mmol/g loading) was deprotected with 4-methylpiperidine in DMF (50% v/v, 2 × 5 mL, 1 min). Fmoc-Lys(N3)-OH (Anaspec) (500 mg, 1.26 mmol, 1.26 equiv.) was coupled to the resin overnight at room temperature with DIEA (113 μL) and 2-(6-chloro-1H-benzotriazole-1-yl)-1,1,3,3-tetramethylaminium hexafluorophosphate (HCTU; 1.3 mL of 0.5 M stock in DMF), followed by a second overnight coupling with Fmoc-Lys(N3)-OH (500 mg, 1.26 mmol, 1.26 equiv.), DIEA (113 μL) and O-(7-azabenzotriazol-1-yl)-N,N,N',N'-tetramethyluronium hexafluorophosphate (HATU; 1.3 mL of 0.5 M stock in DMF). Unmodified resin was then capped (2 × 30 min) with Ac2O (400 μL) and DIEA (700 μL) in DMF, after which the resin was washed with DMF (2 × 1 min). Deprotection with 4-methylpiperidine in DMF (50% v/v, 2 × 5 mL, 1 min) and coupling cycles (4 equiv. Fmoc-protected amino acid (EMD Biosciences) in DMF) with HCTU (2 mL, 0.5 M in DMF) and DIEA (347.7 μL) were then repeated for the remaining amino acids. For the heavy TEV-tag, Fmoc-Valine-OH ((13C)5C15H21(15N)O4; 13C5, 97-99%; 15N, 97-99%; Cambridge Isotope Laboratories, Inc.) was used. Reactions were monitored by ninhydrin stain and dual couplings were used for all steps that did not go to completion. Biotin (0.24 g, 2 equiv.) was coupled for two days at room temperature with NHS (0.1 g, 2 equiv.), DIC (0.16 g, 2 equiv.) and DIEA (0.175 g, 2 equiv.). The resin was then washed with DMF (5 mL, 2 × 1 min) followed by 1:1 CH2Cl2:MeOH (5 mL, 2 × 1 min), dried under a stream of nitrogen and transferred to a round-bottom flask. The peptides were cleaved from the resin for 90 minutes by treatment with 95:2.5:2.5 trifluoroacetic acid:water:triisopropylsilane. The resin was removed by filtration and the remaining solution was triturated with cold ether to provide either the light or heavy TEV-tag as a white solid. HPLC-MS revealed only minor impurities and the compounds were used without further purification. HRMS-ESI (m/z): calculated for C83H128N23O23S [M+H] (Light-TEV-Tag): 1846.9268; found: 1846.9187; calculated for C78(13C)5H128N22(15N)O23S [M+H] (Heavy-TEV-Tag): 1852.9237; found: 1852.9309.
Synthesis of probes and fragments
Synthesis of 1
Figure imgf000135_0001
Scheme yields: SI-1, 66%; 1, 44%
N-(hex-5-yn-1-yl)-2-chloroacetamide (SI-1)
Figure imgf000135_0002
[00357] To a solution of 5-hexynylamine (63 mg, 0.65 mmol, 1.0 equiv.) in CH2Cl2 (3.2 mL, 0.2 M) at 0 °C was added N-methylmorpholine (215 μL, 3 equiv.) followed by chloroacetic anhydride portionwise (222 mg, 2 equiv.). The reaction was allowed to come to room temperature and then stirred overnight. The reaction was then diluted with ether (50 mL) and washed with 1 M HCl, 1 M NaOH, then brine (20 mL each). The combined organic layers were dried over magnesium sulfate and concentrated to yield chloroacetamide SI-1 (74 mg, 66%). 1H NMR (400 MHz, Chloroform-d) δ 6.79 (s, 1H), 4.09 (d, J = 1.1 Hz, 2H), 3.34 (q, J = 6.8 Hz, 2H), 2.23 (td, J = 6.9, 2.7 Hz, 2H), 1.98 (t, J = 2.7 Hz, 1H), 1.75 - 1.62 (m, 4H), 1.62 - 1.51 (m, 2H).
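As an illustrative check only (not part of the reported procedure), the stated amounts for SI-1 can be reconciled with the 66% yield. The sketch below assumes average molecular weights computed from standard atomic weights (97.16 g/mol for 5-hexynylamine, C6H11N; 173.64 g/mol for SI-1, C8H12ClNO); the function names are hypothetical.

```python
# Assumed average molecular weights (g/mol), computed from standard atomic weights.
AMINE_MW = 97.16   # 5-hexynylamine, C6H11N
SI1_MW = 173.64    # chloroacetamide SI-1, C8H12ClNO


def mmol(mass_mg: float, mw: float) -> float:
    """Convert a mass in mg to mmol for a compound of molecular weight mw (g/mol)."""
    return mass_mg / mw


def percent_yield(product_mg: float, product_mw: float, limiting_mmol: float) -> float:
    """Isolated yield in percent, relative to the limiting reagent."""
    return 100.0 * mmol(product_mg, product_mw) / limiting_mmol


if __name__ == "__main__":
    print(f"amine: {mmol(63.0, AMINE_MW):.2f} mmol")
    print(f"SI-1 yield: {percent_yield(74.0, SI1_MW, 0.65):.0f}%")
```

Under these assumptions, 63 mg of amine corresponds to about 0.65 mmol and 74 mg of SI-1 to about 66%, consistent with the values given above.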
N-(hex-5-yn-1-yl)-2-iodoacetamide (1)
Figure imgf000135_0003
[00358] To a solution of chloroacetamide SI-1 (36.1 mg, 0.2 mmol) in acetone (1 mL, 0.2 M) was added sodium iodide (47 mg, 1.5 equiv.) and the reaction was stirred overnight. The next day the reaction was filtered through a plug of silica eluting with 20% ethyl acetate in hexanes, and the filtrate was concentrated to yield a 10:1 mixture of the desired iodoacetamide 1 and starting material. This mixture was re-subjected to the reaction conditions for one further day, at which point complete conversion was observed. The product was purified by silica gel chromatography, utilizing a gradient of 5 to 10 to 15 to 20% ethyl acetate in hexanes to yield the desired product (24 mg, 44%). In some embodiments, the reaction is performed with 2.5 equiv. of sodium iodide, in which case re-subjection is not necessary, and purification by PTLC is accomplished with 30% EtOAc/hexanes as eluent. 1H NMR (500 MHz, Chloroform-d) δ 6.16 (s, 1H), 3.69 (s, 2H), 3.30 (q, J = 6.8 Hz, 2H), 2.23 (td, J = 6.8, 2.6 Hz, 2H), 1.97 (t, J = 2.6 Hz, 1H), 1.75 - 1.61 (m, 2H), 1.61 - 1.52 (m, 2H).
N-(4-bromophenyl)-N-phenylacrylamide (5)
Figure imgf000136_0001
[00359] The title compound was synthesized according to General Procedure C from 4-bromophenylaniline (18.9 mg, 0.0762 mmol, 1 equiv.). Purification of the crude product by prep. TLC (30% EtOAc / hexanes) provided the title compound as a white solid (12.5 mg, 54%). 1H NMR (500 MHz, Chloroform-d) δ 7.47 (d, J = 8.2 Hz, 2H), 7.39 (t, J = 7.6 Hz, 2H), 7.32 (d, J = 7.4 Hz, 1H), 7.21 (d, J = 7.7 Hz, 2H), 7.12 (d, J = 8.2 Hz, 2H), 6.48 (d, J = 16.7 Hz, 1H), 6.17 (dd, J = 16.8, 10.3 Hz, 1H), 5.65 (d, J = 10.3 Hz, 1H); HRMS-ESI (m/z) calculated for C15H13BrNO [M+H]: 302.0175; found: 302.0176.
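For illustration only, the calculated [M+H] value reported for compound 5 can be reproduced from the ion formula. This is a minimal sketch that assumes the listed formula (C15H13BrNO) already includes the added proton, uses the lightest isotope of each element (79Br here), and subtracts one electron mass for the positive charge. The isotope mass table and the helper names are assumptions introduced for this example, not part of the reported method.

```python
import re

# Assumed monoisotopic masses (u) of the lightest stable isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "F": 18.99840316,
    "Cl": 34.96885268,
    "Br": 78.9183376,
}
ELECTRON_MASS = 0.00054858  # u


def cation_mz(ion_formula: str) -> float:
    """m/z of a singly charged cation whose formula already includes the extra H."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", ion_formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass - ELECTRON_MASS


def ppm_error(found: float, calculated: float) -> float:
    """Mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


if __name__ == "__main__":
    mz = cation_mz("C15H13BrNO")
    print(f"calculated m/z: {mz:.4f}")
    print(f"error vs. found 302.0176: {ppm_error(302.0176, 302.0175):.2f} ppm")
```

The same helper can be applied to the other ion formulas in this section as a quick consistency check between each formula and its reported mass.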
Synthesis of 7
Figure imgf000136_0002
72%
tert-butyl 4-(phenylamino)piperidine-1-carboxylate (SI-2)
Figure imgf000136_0003
SI-2 was prepared according to Thoma et al, J. Med. Chem. 47:1939-1955 (2004). 1H NMR (400 MHz, Chloroform-d) δ 7.24 - 7.12 (m, 2H), 6.75 - 6.68 (m, 1H), 6.66 - 6.58 (m, 2H), 3.88 - 3.81 (m, 1H), 3.44 (tt, J = 10.4, 3.9 Hz, 2H), 3.00 - 2.88 (m, 2H), 2.10 - 1.99 (m, 2H), 1.48 (bs, 9H), 1.41 - 1.27 (m, 2H).
tert-butyl 4-(2-chloro-N-phenylacetamido)piperidine-1-carboxylate (SI-3)
Figure imgf000137_0001
[00360] To a solution of aniline SI-2 (65 mg, 0.24 mmol) at 0 °C in CH2Cl2 (0.6 mL) was added pyridine (38 μL, 2 equiv.) followed by chloroacetyl chloride (37.4 μL, 2.0 equiv.) in CH2Cl2 (0.6 mL). The resulting solution was allowed to warm to room temperature and stirred overnight. The solution was then quenched with saturated aqueous sodium bicarbonate and extracted with Et2O (3 x 10 mL). The combined organic layers were dried over magnesium sulfate, filtered and concentrated to give an off-white solid, which was used without further purification (47 mg, 57%). 1H NMR (400 MHz, Chloroform-d) δ 7.47 - 7.38 (m, 3H), 7.18 - 7.03 (m, 2H), 4.75 - 4.63 (m, 1H), 4.07 (s, 2H), 3.68 (s, 2H), 2.76 (s, 2H), 1.84 - 1.69 (m, 2H), 1.35 (s, 9H), 1.27 - 1.12 (m, 2H).
N-(1-benzoylpiperidin-4-yl)-2-chloro-N-phenylacetamide (7)
Figure imgf000137_0002
[00361] To neat SI-3 (47 mg, 0.128 mmol) was added trifluoroacetic acid (0.7 mL, final 0.2 M). The resulting solution was concentrated under a stream of nitrogen until no further evaporation was observed, providing the deprotected amine as its trifluoroacetate salt. This viscous gum was then treated with triethylamine in ethyl acetate (10% v/v, 2 mL; solution smokes upon addition). The resulting solution was concentrated to afford the free base, which contained only triethylammonium trifluoroacetate and the free amine by proton NMR. A stock solution was prepared by dissolving the resulting gum in CH2Cl2 (1.2 mL, ~0.1 M final).
[00362] The deprotected amine (0.3 mL of stock solution, 0.0319 mmol) was treated with Hunig's base (17.5 μL, 3 equiv.) and benzoyl chloride (7.6 μL, 2.0 equiv.). This solution was stirred overnight, quenched with saturated aqueous sodium bicarbonate, and extracted with Et2O (3 x 10 mL). The resulting solution was dried over magnesium sulfate, filtered and concentrated. The resulting oil was purified by silica gel chromatography (20% EtOAc/hexanes) to afford chloroacetamide 7 as a white solid (8.6 mg, 75%). 1H NMR (500 MHz, Chloroform-d) δ 7.55 (dd, J = 5.5, 3.0 Hz, 3H), 7.50 - 7.32 (m, 5H), 7.21 (s, 2H), 4.92 (tt, J = 12.3, 4.0 Hz, 1H), 4.87 (s, 1H), 3.87 (s, 1H), 3.78 (s, 2H), 3.21 (s, 1H), 2.97 - 2.90 (m, 1H), 2.01 (s, 1H), 1.90 (s, 1H), 1.45 (s, 1H), 1.36 - 1.26 (m, 1H); HRMS-ESI (m/z) calculated for C20H22ClN2O2 [M+H]: 357.1364; found: 357.1362.
1-(4-benzylpiperidin-1-yl)-2-chloroethan-1-one (9)
Figure imgf000138_0001
[00363] Following General Procedure A, starting from 4-benzylpiperidine (840 mg, 5.2 mmol, 1 equiv.), the desired compound was obtained after column chromatography as a yellow oil (1 g, 81%). Spectroscopic data match those previously reported in Papadopoulou et al. J. Med. Chem. 55:5554-5565 (2012). 1H NMR (500 MHz, Chloroform-d) δ 7.42 - 7.14 (m, 5H), 4.61 (d, J = 13.4 Hz, 1H), 4.14 (q, J = 21.9, 11.5 Hz, 2H), 3.89 (d, J = 13.5, 1H), 3.11 (td, J = 13.1, 2.7 Hz, 1H), 2.69 - 2.57 (m, 3H), 1.92 - 1.75 (m, 3H), 1.40 - 1.21 (m, 2H); HRMS-ESI (m/z) calculated for C14H19ClNO [M+H]: 252.115; found: 252.115.
N-(2-(1H-indol-3-yl)ethyl)-2-chloroacetamide (13)
Figure imgf000138_0002
[00364] Following General Procedure A, starting from tryptamine (400 mg, 2.5 mmol, 1 equiv.), the desired compound was obtained after column chromatography as a brownish solid (460 mg, 77%). 1H NMR (500 MHz, Chloroform-d) δ 8.55 (s, 1H), 7.70 (d, J = 7.9 Hz, 1H), 7.45 (d, J = 8.1 Hz, 1H), 7.30 (t, J = 7.5 Hz, 1H), 7.23 (t, J = 7.4 Hz, 1H), 7.10 (s, 1H), 6.84 (s, 1H), 4.08 (s, 2H), 3.72 (q, J = 6.4 Hz, 2H), 3.10 (t, J = 6.8 Hz, 2H); HRMS-ESI (m/z) calculated for C12H14ClN2O2 [M+H]: 237.0789; found: 237.0791.
N-(3,5-bis(trifluoromethyl)phenyl)acrylamide (14)
Figure imgf000138_0003
[00365] Following General Procedure B, starting from 3,5-bis(trifluoromethyl)aniline (1.16 g, 5 mmol, 1 equiv.), the desired compound was obtained after column chromatography as a white solid (1.05 g, 74%). 1H NMR (500 MHz, Chloroform-d) δ 8.33 (s, 1H), 8.18 (s, 2H), 7.68 (s, 1H), 6.57 (d, J = 17.5 Hz, 1H), 6.38 (dd, J = 16.9, 10.3 Hz, 1H), 5.93 (d, J = 12.5 Hz, 1H); HRMS-ESI (m/z) calculated for C11H8F6NO2 [M+H]: 284.0505; found: 284.0504.
N-(4-phenoxy-3-(trifluoromethyl)phenyl)-N-(pyridin-3-ylmethyl)acrylamide (15)
Figure imgf000139_0001
[00366] 4-phenoxy-3-(trifluoromethyl)aniline (260 mg, 1 mmol, 1 equiv.) (Combi-Blocks) was dissolved in TFA (5 mL). Following the reductive amination protocol reported by Boros et al. J. Org. Chem 74:3587-3590 (2009), the reaction mixture was cooled to 0 °C and to this sodium triacetoxyborohydride (STAB) (270 mg, 1.3 mmol, 1.3 equiv.) was added. 3-Pyridinecarboxaldehyde (200 mg, 2 mmol, 2 equiv.) was dissolved in CH2Cl2 (5 mL) and slowly added to the reaction mixture. Upon complete conversion to product, the reaction was diluted with CH2Cl2 (20 mL) and washed with saturated sodium bicarbonate solution (3 x 20 mL) and the organic layer was dried then concentrated under reduced pressure. Without further purification the crude material was dissolved in anhydrous CH2Cl2 and subjected to General Procedure B. The resulting crude was purified by prep. TLC to give a white solid (31 mg, 10%). 1H NMR (500 MHz, Chloroform-d) δ 8.52 (d, J = 3.5 Hz, 1H), 8.39 (s, 1H), 7.68 (d, J = 7.8 Hz, 1H), 7.40 (t, J = 7.7 Hz, 2H), 7.34 (s, 1H), 7.28 - 7.18 (m, 2H), 7.07 (d, J = 8.2 Hz, 2H), 6.98 (d, J = 7.5 Hz, 1H), 6.82 (d, J = 8.8 Hz, 1H), 6.46 (d, J = 16.8 Hz, 1H), 6.01 (dd, J = 16.2, 10.7 Hz, 1H), 5.64 (d, J = 10.3 Hz, 1H), 4.96 (s, 2H). HRMS-ESI (m/z) calculated for C22H18F3N2O2 [M+H]: 399.1315; found: 399.1315.
Iodoacetamide-rhodamine (16)
Figure imgf000139_0002
[00367] 5-(and-6)-((N-(5-aminopentyl)amino)carbonyl)tetramethylrhodamine (tetramethylrhodamine cadaverine) mixed isomers (60 mg, 0.12 mmol, 1 equiv.) were dissolved in anhydrous DMF (500 μL) with sonication. To this was added DIPEA (60 μL, 0.34 mmol, 3 equiv.) and chloroacetyl chloride (10 μL, 0.13 mmol, 1 equiv., diluted 1:10 in DMF) and the reaction was stirred at room temperature for 20 min until complete conversion to the product was detected by TLC. The DMF was removed under a stream of nitrogen and the reaction mixture was separated by PTLC in MeOH:CH2Cl2:TEA (15:85:0.001). The chloroacetamide rhodamine was then eluted in MeOH:CH2Cl2 (15:85), concentrated under reduced pressure and redissolved in acetone (500 μL). NaI (150 mg, 1 mmol, 10 equiv.) was added to this and the reaction was stirred for 20 min at 50 °C until complete conversion to product was detected. The crude reaction mixture was purified by reverse phase HPLC on a C18 column and concentrated to yield the title compound as a purple solid that is a mixture of 5 and 6 carboxamide tetramethylrhodamine isomers (ratio ~6:1) (10 mg, 12%). 1H NMR (600 MHz, Methanol-d4) δ 8.87 (t, J = 4.8 Hz, 0.14H), 8.80 - 8.71 (m, 1H), 8.41 (dd, J = 8.2, 1.1 Hz, 0.86H), 8.35 (br s, 1H), 8.27 (dt, J = 7.9, 1.5 Hz, 0.164H), 8.20 (dt, J = 8.2, 1.5 Hz, 0.86H), 7.81 (s, 0.86H), 7.53 (d, J = 7.8 Hz, 0.14H), 7.18 - 7.11 (m, 2H), 7.07 (d, J = 9.5 Hz, 2H), 7.00 (s, 2H), 3.68 - 3.62 (m, 2H), 3.46 - 3.37 (m, 2H), 3.31 (s, 12H, obscured by solvent), 3.21 - 3.12 (m, 2H), 1.81 - 1.21 (m, 6H); HRMS-ESI (m/z) calculated for C32H36IN4O5 [M+H]: 683.1725; found: 683.1716.
N-(3,5-bis(trifluoromethyl)phenyl)acetamide (17)
Figure imgf000140_0001
[00368] Following General Procedure A, starting with 3,5-bis(trifluoromethyl)aniline (327 mg, 1.42 mmol, 1 equiv.) and acetic anhydride (200 μL, 3 mmol, 2 equiv.), the title compound was obtained after PTLC as a white solid (302 mg, 78%). 1H NMR (500 MHz, Chloroform-d) δ 8.10 (s, 2H), 7.72 (s, 1H), 7.68 (s, 1H), 2.32 (d, J = 0.9 Hz, 3H). HRMS-ESI (m/z) calculated for C11H8F6NO2 [M+H]: 284.0505; found: 284.0504.
Synthesis of 18 and 19
Figure imgf000141_0001
Scheme labels: Chloroacetamide; Acrylamide 18
3-amino-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide (SI-5)
Figure imgf000141_0002
[00369] To a solution of 3-amino-5-(trifluoromethyl)benzoic acid (74 mg, 0.36 mmol) in acetonitrile (3.6 mL, 0.1 M final) was added EDCI (83 mg, 1.2 equiv.) followed by hex-5-ynamine (35 mg, 1.0 equiv.) followed by 1-hydroxybenzotriazole hydrate (HOBt, 66.3 mg, 1.2 equiv.) and the resulting solution was stirred overnight. The reaction was diluted with ethyl acetate, washed with 1 M HCl twice and then brine. The organic layer was dried over magnesium sulfate and concentrated to yield aniline SI-5 (97.4 mg, 95%) as a white solid. 1H NMR (400 MHz, Chloroform-d) δ 7.29 - 7.22 (m, 2H), 6.98 (t, J = 1.8 Hz, 1H), 6.38 (t, J = 5.5 Hz, 1H), 4.08 (s, 2H), 3.46 (td, J = 7.1, 5.7 Hz, 2H), 2.25 (td, J = 6.9, 2.6 Hz, 2H), 1.99 (t, J = 2.7 Hz, 1H), 1.81 - 1.55 (m, 4H).
3-acrylamido-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide (18)
Figure imgf000141_0003
[00370] Following General Procedure B, starting with SI-5 (42 mg, 0.15 mmol, 1 equiv.), the title compound was obtained after column chromatography as a white solid (34 mg, 70%). 1H NMR (500 MHz, Chloroform-d) δ 8.94 (s, 1H), 8.24 (d, J = 11.9 Hz, 2H), 7.71 (s, 1H), 6.87 (t, J = 5.7 Hz, 1H), 6.55 (dd, J = 17.4, 0.7 Hz, 1H), 6.43 (dd, J = 16.9, 10.1 Hz, 1H), 5.88 (dd, J = 10.1, 1.3 Hz, 1H), 3.56 (q, J = 6.7 Hz, 2H), 2.33 (td, J = 6.9, 2.7 Hz, 2H), 2.06 (t, J = 2.7 Hz, 1H), 1.87 (p, J = 7.3 Hz, 2H), 1.69 (p, J = 7.8 Hz, 2H); HRMS-ESI (m/z) calculated for C17H18F3N2O2 [M+H]: 339.1314; found: 339.1313.
3-acrylamido-N-(hex-5-yn-1-yl)-5-(trifluoromethyl)benzamide (19)
Figure imgf000142_0001
[00371] Synthesized according to General Procedure A2, starting from SI-5. 1H NMR (600 MHz, Chloroform-d) δ 8.57 (s, 1H), 8.16 (t, J = 1.8 Hz, 1H), 8.05 (t, J = 1.8 Hz, 1H), 7.79 (d, J = 2.0 Hz, 1H), 6.38 (d, J = 6.1 Hz, 1H), 4.23 (s, 2H), 3.51 (td, J = 7.1, 5.7 Hz, 2H), 2.27 (td, J = 6.9, 2.7 Hz, 2H), 2.00 (t, J = 2.6 Hz, 1H), 1.82 - 1.74 (m, 2H), 1.71 - 1.59 (m, 2H); HRMS-ESI (m/z) calculated for C16H17ClF3N2O2 [M+H]: 361.0925; found: 361.0925.
2-chloro-1-(4-(hydroxydiphenylmethyl)piperidin-1-yl)ethan-1-one (20)
Figure imgf000142_0002
[00372] Following General Procedure A, starting with α,α-diphenyl-4-piperidinomethanol (800 mg, 3 mmol, 1 equiv.), the title compound was obtained after column chromatography as a white solid (637 mg, 61%). 1H NMR (500 MHz, Chloroform-d) δ 7.56 (d, J = 7.6 Hz, 4H), 7.39 (q, J = 7.1 Hz, 4H), 7.28 (q, J = 6.8 Hz, 2H), 4.66 (d, J = 13.3 Hz, 1H), 4.07 (dd, J = 12.2, 4.2 Hz, 2H), 3.91 (d, J = 13.4 Hz, 1H), 3.18 (t, J = 12.9 Hz, 1H), 2.77 - 2.62 (m, 3H), 1.67 (t, J = 12.5 Hz, 2H), 1.56 (q, J = 11.8 Hz, 1H), 1.44 (q, J = 12.4, 11.8 Hz, 1H); HRMS-ESI (m/z) calculated for C20H23ClNO2 [M+H]: 344.1412; found: 344.1412.
(E)-3-(3,5-bis(trifluoromethyl)phenyl)-2-cyanoacrylamide (23)
Figure imgf000143_0001
[00373] 3,5-bis(trifluoromethyl)benzaldehyde (880 mg, 3.6 mmol, 1 equiv.) and 2-cyanoacetamide (460 mg, 5.5 mmol, 1.5 equiv.) were dissolved in MeOH (10 mL). To this was added piperidine (214 mg, 0.7 equiv.) and the reaction was stirred at room temperature for 30 minutes, at which point starting material was consumed. After addition of an equivalent volume of water (10 mL), the precipitate was collected by filtration and washed with water/methanol (1:1) to yield the title compound as a white solid (534 mg, 47%). 1H NMR (400 MHz, Acetone-d6) δ 8.78 (s, 2H), 8.61 (s, 1H), 8.41 (s, 1H), 7.57 (s, 1H), 7.42 (s, 1H); HRMS-ESI (m/z) calculated for C12H7F6N2O2 [M+H]: 309.0457; found: 309.0459.
N-(3,5-bis(trifluoromethyl)phenyl)-2-bromopropanamide (24)
Figure imgf000143_0002
[00374] Following General Procedure A1, starting with 3,5-bis(trifluoromethyl)aniline (250 mg, 1.1 mmol, 1 equiv.) and 2-bromopropionyl chloride (200 μL, 2 mmol, 1.8 equiv.), the title compound was obtained by PTLC as a white solid (130 mg, 35%). 1H NMR (500 MHz, Chloroform-d) δ 8.34 (s, 1H), 8.06 (s, 2H), 7.66 (s, 1H), 4.58 (q, J = 7.0 Hz, 1H), 1.98 (d, J = 7.0 Hz, 3H); HRMS-ESI (m/z) calculated for C11H7BrF6NO [M-H]: 361.9621; found: 361.9623.
N-(3,5-bis(trifluoromethyl)phenyl)-2-chloropropanamide (25)
Figure imgf000143_0003
[00375] Following General Procedure A1, starting with 3,5-bis(trifluoromethyl)aniline (327 mg, 1.42 mmol, 1 equiv.) and 2-chloropropionyl chloride (200 μL, 2 mmol, 1.8 equiv.), the title compound was obtained by PTLC as a white solid (250 mg, 55%). 1H NMR (500 MHz, Chloroform-d) δ 8.61 (s, 1H), 8.16 (s, 2H), 7.75 (s, 1H), 4.67 (q, J = 7.1 Hz, 1H), 1.93 (d, J = 7.1 Hz, 3H). HRMS-ESI (m/z) calculated for C11H7ClF6NO [M-H]: 318.0126; found: 318.0126.
N-(3,5-bis(trifluoromethyl)phenyl)-N-(pyridin-3-ylmethyl)acrylamide (31)
Figure imgf000144_0001
[00376] 3,5-bis(trifluoromethyl)aniline (350 mg, 1.6 mmol, 1 equiv.) was dissolved in TFA (5 mL). The reaction mixture was cooled to 0 °C and to this sodium triacetoxyborohydride (STAB) (400 mg, 2 mmol, 1.3 equiv.) was added. 3-Pyridinecarboxaldehyde (244 mg, 1.5 mmol, 1 equiv.) was dissolved in CH2Cl2 (5 mL) and slowly added to the reaction mixture dropwise over 10 minutes. Upon complete conversion to product, the reaction mixture was diluted with CH2Cl2 (20 mL) and washed with saturated sodium bicarbonate solution (3 x 20 mL) and the organic layer was dried then concentrated under reduced pressure. Without further purification the crude material was dissolved in anhydrous CH2Cl2 and subjected to General Procedure B. The resulting crude was purified by PTLC to give a white solid (10 mg, 2%). 1H NMR (500 MHz, Chloroform-d) δ 8.63 (d, J = 3.8 Hz, 1H), 8.49 (s, 1H), 7.93 (s, 1H), 7.70 (d, J = 7.7 Hz, 1H), 7.55 (s, 2H), 7.35 (dd, J = 7.6, 5.3 Hz, 1H), 6.60 (dd, J = 16.6, 1.6 Hz, 1H), 6.02 (dd, J = 16.9, 10.2 Hz, 1H), 5.79 (dd, J = 10.3, 1.6 Hz, 1H), 5.11 (s, 2H). HRMS-ESI (m/z) calculated for C17H13F6N2O [M+H]: 375.0927; found: 375.0928.
3-(2-chloroacetamido)-5-(trifluoromethyl)benzoic acid (36)
Figure imgf000144_0002
[00377] To a solution of 3-amino-5-(trifluoromethyl)benzoic acid (500 mg, 2.44 mmol) in 1.5 mL of dimethylacetamide (1.6 M) at 0 °C was added chloroacetyl chloride (214 μL, 2.69 mmol, 1.1 equiv.). The resulting solution was warmed to ambient temperature and stirred for 20 minutes, at which point ethyl acetate (40 mL) and water (30 mL) were added. The pH of the aqueous layer was adjusted to pH 10 via addition of 1 N NaOH, and the phases were separated. The aqueous layer was washed with 40 mL of ethyl acetate, then acidified by adding 1 N HCl. The product was extracted with ethyl acetate (40 mL), and the organic layer was washed with 1 M HCl (2 x 40 mL) and brine (40 mL), dried over magnesium sulfate and concentrated to provide the desired product (456 mg, 66%). 1H NMR (500 MHz, Chloroform-d) δ 8.31 (s, 1H), 8.27 (s, 1H), 8.14 (s, 1H), 4.13 (s, 2H); HRMS-ESI (m/z) calculated for C10H8ClF3NO3 [M+H]: 282.0139; found: 282.0141.
1-(4-(5-fluorobenzisoxazol-3-yl)piperidin-1-yl)prop-2-en-1-one (37)
Figure imgf000145_0001
[00378] The title compound was obtained starting from 6-fluoro-3-(4-piperidinyl)-1,2-benzisoxazole hydrochloride (53 mg, 0.2 mmol, 1 equiv.) according to General Procedure C as a colorless oil (49.1 mg, 87%). 1H NMR (400 MHz, Chloroform-d) δ 7.64 (dd, J = 8.7, 5.1 Hz, 1H), 7.27 (dd, J = 8.4, 2.3 Hz, 1H), 7.08 (td, J = 8.9, 2.1 Hz, 1H), 6.64 (dd, J = 16.8, 10.6 Hz, 1H), 6.32 (dd, J = 16.9, 1.9 Hz, 1H), 5.73 (dd, J = 10.6, 1.9 Hz, 1H), 4.70 (d, J = 13.4 Hz, 1H), 4.15 (d, J = 12.4 Hz, 1H), 3.53 - 3.13 (m, 2H), 2.99 (t, J = 13.1 Hz, 1H), 2.25 - 2.07 (m, 2H), 2.00 (ddd, J = 23.1, 14.2, 7.8 Hz, 2H); HRMS-ESI (m/z) calculated for C15H16FN2O [M+H]: 275.119; found: 275.119.
tert-butyl 4-(4-acrylamido-2,6-difluorophenyl)piperazine-1-carboxylate (38)
Figure imgf000145_0002
[00379] The title compound was obtained starting from tert-butyl 4-(4-amino-2,6-difluorophenyl)piperazine-1-carboxylate according to General Procedure B. 1H NMR (400 MHz, Chloroform-d) δ 8.12 (s, 1H), 7.13 (d, J = 10.4 Hz, 2H), 6.36 (d, J = 16.9 Hz, 1H), 6.19 (dd, J = 16.8, 10.2 Hz, 1H), 5.70 (d, J = 10.2 Hz, 1H), 3.45 (t, J = 4.7 Hz, 4H), 3.00 (t, J = 3.7 Hz, 4H), 1.41 (s, 9H); HRMS-ESI (m/z) calculated for C18H24F2N3O3 [M+H]: 368.178; found: 368.178.
N-(4-bromo-2,5-dimethylphenyl)acrylamide (40)
Figure imgf000145_0003
[00380] Following General Procedure B, starting from 4-bromo-2,5-dimethylaniline (900 mg, 4.5 mmol, 1 equiv.), the title compound was obtained after column chromatography and recrystallization from cold CH2Cl2 as a white solid (611 mg, 40%). 1H NMR (500 MHz, Chloroform-d) δ 7.87 (s, 1H), 7.43 (s, 1H), 7.16 (s, 1H), 6.50 (d, J = 16.7 Hz, 1H), 6.35 (dd, J = 16.4, 10.3 Hz, 1H), 5.86 (d, J = 10.3 Hz, 1H), 2.42 (s, 3H), 2.28 (s, 3H); HRMS-ESI (m/z) calculated for C11H13BrNO [M+H]: 254.0175; found: 254.0175.
2-Chloroacetamido-2-deoxy-α/β-D-glucopyranose (44)
Figure imgf000146_0001
[00381] To a stirred solution of hexosamine hydrochloride (590 mg, 3.39 mmol, 1 equiv.) in anhydrous MeOH (200 mL) at room temperature was added sodium metal (60 mg, 2.6 mmol, 0.78 equiv.) and TEA (400 μL, 5.7 mmol, 1.8 equiv.). Chloroacetic anhydride (1 g, 5.9 mmol, 1 equiv.) was then added and the mixture was stirred for 6 h, with reaction progress monitored by TLC. The reaction mixture was then concentrated in vacuo. The crude product was purified by two rounds of column chromatography to afford the pure title product as a white solid (610 mg, 72%). 1H NMR (500 MHz, Methanol-d4) δ 5.20 (d, J = 3.7 Hz, 1Hα), 4.75 (d, J = 8.3 Hz, 1Hβ), 4.19 (dd, J = 20.2, 13.9 Hz, 2H), 4.19 (d, J = 12.6 Hz, 1H), 3.95 (dd, J = 10.6, 3.5 Hz, 1Hα), 3.83 (m, 3Hα, 3Hβ), 3.74 (d, J = 5.1 Hz, 1Hβ), 3.70 (dd, J = 11.4, 8.9 Hz, 1Hβ), 3.60 (dd, J = 10.7, 9.5 Hz, 1Hβ), 3.46 (t, J = 9.3 Hz, 1H), 3.42 (t, J = 10.0 Hz, 1Hβ); HRMS-ESI (m/z) calculated for C8H15ClNO6 [M+H]: 256.0582; found: 256.0582.
2-chloro-1-(2-methyl-3,4-dihydroquinolin-1(2H)-yl)ethan-1-one (45)
Figure imgf000146_0002
[00382] Chloroacetyl chloride (80.4 μL, 0.9 mmol, 1.7 equiv.) was dissolved in anhydrous CH2Cl2 (3 mL) and cooled to 0 °C. A solution of 2-methyl-1,2,3,4-tetrahydroquinoline (80.1 mg, 0.544 mmol, 1 equiv.) and N-methylmorpholine (0.11 mL, 1.0 mmol, 1.8 equiv.) in CH2Cl2 (2 mL) was then added dropwise. After 6 h, the reaction was quenched with saturated aqueous NaHCO3 (5 mL) and extracted with CH2Cl2 (3 x 10 mL). The combined organic layers were dried over anhydrous Na2SO4 and concentrated under reduced pressure. The resultant residue was purified by prep. TLC (30% EtOAc / hexanes), providing the title compound as an off-white solid (108.8 mg, 89%). 1H NMR (400 MHz, chloroform-d) δ 7.30 - 7.13 (m, 4H), 4.86 - 4.75 (m, 1H), 4.20 (d, J = 12.5 Hz, 1H), 4.09 (d, J = 12.5 Hz, 1H), 2.69 - 2.58 (m, 1H), 2.59 - 2.46 (m, 1H), 2.46 - 2.31 (m, 1H), 1.36 - 1.29 (m, 1H), 1.15 (d, J = 6.5 Hz, 3H); HRMS-ESI (m/z) calculated for C12H15ClNO [M+H]: 224.0837; found: 224.0836.
N-cyclohexyl-N-phenylacrylamide (46)
Figure imgf000147_0001
[00383] The title compound was synthesized according to General Procedure C from N-cyclohexylaniline (89.5 mg, 0.511 mmol, 1 equiv.). Purification of the crude product by flash column chromatography (10-20% EtOAc / hexanes) then prep. TLC (30% EtOAc / hexanes) provided the title compound as an off-white solid (53.1 mg, 45%). 1H NMR (400 MHz, chloroform-d) δ 7.42 - 7.33 (m, 3H), 7.10 - 7.06 (m, 2H), 6.31 (dd, J = 16.7, 2.1 Hz, 1H), 5.77 (dd, J = 16.7, 10.3 Hz, 1H), 5.41 (dd, J = 10.4, 2.1 Hz, 1H), 4.65 (tt, J = 12.2, 3.7 Hz, 1H), 1.85 (dt, J = 11.2, 1.8 Hz, 2H), 1.75 - 1.68 (m, 2H), 1.61 - 1.53 (m, 1H), 1.40 (qt, J = 13.3, 3.6 Hz, 2H), 1.07 (qd, J = 12.4, 3.6 Hz, 2H), 0.91 (qt, J = 13.1, 3.8 Hz, 1H); HRMS-ESI (m/z) calculated for C15H20NO [M+H]: 230.1539; found: 230.1539.
1-(5-bromoindolin-1-yl)prop-2-en-1-one (47)
Figure imgf000147_0002
[00384] The title compound was synthesized according to General Procedure C from 5-bromoindoline (41.7 mg, 0.211 mmol, 1 equiv.), acryloyl chloride (32 μL, 0.40 mmol, 1.9 equiv.), and changing the base to pyridine (32 μL, 0.40 mmol, 1.9 equiv.). Purification of the crude product by re-precipitation from EtOAc provided the title compound as a white solid (67.8 mg, 64%). 1H NMR (400 MHz, chloroform-d) δ 8.16 (d, J = 8.6 Hz, 1H), 7.33 - 7.25 (m, 2H), 6.60 - 6.42 (m, 2H), 5.84 - 5.76 (m, 1H), 4.15 (t, J = 8.6 Hz, 2H), 3.17 (t, J = 8.6 Hz, 2H); HRMS-ESI (m/z) calculated for C11H11BrNO [M+H]: 252.0018; found: 252.0017.
N-(1-benzylpiperidin-4-yl)-N-phenylacrylamide (48)
Figure imgf000147_0003
[00385] The title compound was synthesized according to General Procedure C from 1-benzyl-N-phenylpiperidin-4-amine (30.0 mg, 0.113 mmol, 1 equiv.), acryloyl chloride (17 μL, 0.21 mmol, 1.9 equiv.), and changing the base to pyridine (17 μL, 0.21 mmol, 1.9 equiv.). Purification of the crude product by prep. TLC provided the title compound as a white solid (22.5 mg, 64%). 1H NMR (400 MHz, chloroform-d) δ 7.62 - 7.56 (m, 2H), 7.43 - 7.36 (m, 6H), 7.05 (d, J = 6.2 Hz, 2H), 6.29 (dd, J = 16.8, 2.1 Hz, 1H), 5.79 (dd, J = 16.8, 10.3 Hz, 1H), 5.46 (dd, J = 10.3, 2.1 Hz, 1H), 4.81 - 4.70 (m, 1H), 4.09 (s, 2H), 3.41 (d, J = 12.0 Hz, 2H), 2.82 (q, J = 11.5 Hz, 2H), 2.21 (q, J = 11.9 Hz, 2H), 1.94 (d, J = 14.2 Hz, 2H); HRMS-ESI (m/z) calculated for C21H25N2O [M+H]: 321.1961; found: 321.1962.
2-chloro-N-(2-methyl-5-(trifluoromethyl)phenyl)acetamide (49)
Figure imgf000148_0001
[00386] The title compound was synthesized according to General Procedure A1 from 2-methyl-5-(trifluoromethyl)aniline (35.0 mg, 0.2 mmol, 1 equiv.). Purification of the crude product by prep. TLC (20% EtOAc / hexanes) provided the title compound as a white solid (48.2 mg, 95%). 1H NMR (600 MHz, chloroform-d) δ 8.31 (s, 1H), 8.25 (d, J = 1.9 Hz, 1H), 7.37 (dd, J = 7.9, 1.8 Hz, 1H), 7.32 (d, J = 7.9 Hz, 1H), 4.25 (s, 2H), 2.36 (s, 3H); HRMS-ESI (m/z) calculated for C10H10ClF3NO [M+H]: 252.0397; found: 252.0397.
1-(5-bromoindolin-1-yl)-2-chloroethan-1-one (50)
Figure imgf000148_0002
[00387] The title compound was synthesized according to General Procedure A1 from 5-bromoindoline (39.6 mg, 0.2 mmol, 1 equiv.). Purification of the crude product by prep. TLC (25% EtOAc / hexanes) provided the title compound as an off-white solid (48.6 mg, 89%). 1H NMR (600 MHz, CDCl3) δ 8.07 (d, J = 8.4 Hz, 1H), 7.32 (d, J = 8.8 Hz, 2H), 4.17 (t, J = 8.6 Hz, 2H), 4.14 (s, 2H), 3.22 (t, J = 8.4 Hz, 2H); HRMS-ESI (m/z) calculated for C10H10BrClNO [M+H]: 273.9629; found: 273.9629.
2-chloro-N-(quinolin-5-yl)acetamide (51)
Figure imgf000148_0003
[00388] To a stirring suspension of 5-aminoquinoline (28.8 mg, 0.2 mmol, 1 equiv.) and potassium carbonate (82.9 mg, 0.6 mmol, 3 equiv.) in anhydrous CH2Cl2 (3 mL) at 0 °C was added chloroacetyl chloride (24 μL, 1.5 equiv.). The reaction was allowed to slowly warm up to room temperature. After 3 hours, the mixture was filtered and washed with EtOAc (10 mL) and CH2Cl2 (10 mL). The solid cake was then eluted with MeOH (20 mL) and the filtrate concentrated in vacuo. The residue was taken up in 10% MeOH / CH2Cl2 and passed through a pad of silica to provide the title compound as an off-white solid (42.6 mg, 82%). 1H NMR (500 MHz, CDCl3) δ 8.96 (d, J = 2.5 Hz, 1H), 8.71 (s, 1H), 8.20 (d, J = 8.6 Hz, 1H), 8.04 (d, J = 8.5 Hz, 1H), 7.94 (d, J = 7.5 Hz, 1H), 7.74 (t, J = 8.0 Hz, 1H), 7.48 (dd, J = 8.5, 4.2 Hz, 1H), 4.35 (s, 2H); HRMS-ESI (m/z) calculated for C11H9ClN2O [M+H]: 221.0476; found: 221.0477.
1-(4-benzylpiperidin-1-yl)prop-2-en-1-one (53)
Figure imgf000149_0001
[00389] Following General Procedure B, starting from 4-benzylpiperidine (1 g, 5.7 mmol, 1 equiv.), the title compound was obtained after column chromatography as a yellow oil (748 mg, 57%). 1H NMR (500 MHz, Chloroform-d) δ 7.36 (t, J = 7.4 Hz, 2H), 7.28 (t, J = 7.4 Hz, 1H), 7.20 (d, J = 7.1 Hz, 2H), 6.64 (dd, J = 16.8, 10.6 Hz, 1H), 6.32 (dd, J = 16.8, 1.9 Hz, 1H), 5.72 (dd, J = 10.6, 1.9 Hz, 1H), 4.72 (d, J = 12.7 Hz, 1H), 4.03 (d, J = 13.0 Hz, 1H), 3.05 (t, J = 12.7 Hz, 1H), 2.70 - 2.59 (m, 3H), 1.86 (ddp, J = 14.6, 7.2, 3.5 Hz, 1H), 1.77 (m, 2H), 1.37 - 1.18 (m, 2H); HRMS-ESI (m/z) calculated for C15H20ClNO [M+H]: 230.1539; found: 230.1539.
2-chloro-N- 3-hydroxy-5-(hydroxymethyl)-2-methylpyridin-4-yl)m (54)
Figure imgf000149_0002
[00390] To a stirred solution of pyridoxamine hydrochloride (150 mg, 0.64 mmol, 1 equiv.) in anhydrous MeOH (20 mL) at room temperature was added sodium metal (30 mg, 1.5 mmol, 2.3 equiv.) and TEA (100 μL, 1 mmol, 1.6 equiv.). Chloroacetic anhydride (390 mg, 2.29 mmol, 3.5 equiv.) was added and the mixture was stirred for 6 h, with reaction progress monitored by TLC. The reaction mixture was then concentrated in vacuo. The crude product was purified by prep. TLC to afford the title compound as a white solid (46 mg, 30%). 1H NMR (500 MHz, Methanol-d4) δ 7.97 (s, 1H), 4.81 (s, 2H), 4.61 (s, 2H), 4.17 (s, 3H), 4.06 (s, 1H), 3.35 (s, 1H), 2.52 (s, 3H); HRMS-ESI (m/z) calculated for C10H14ClN2O3 [M+H]: 245.0687; found: 245.0688.
1-(6,7-dimethoxy-3,4-dihydroisoquinolin-2(1H)-yl)prop-2-en-1-one (56)
Figure imgf000149_0003
[00391] To a stirring suspension of the 6,7-dimethoxy-3,4-dihydroisoquinoline (1 g, 5.2 mmol, 1 equiv.) and TEA (1800 μL, 12.6 mmol, 2.5 equiv.) in anhydrous THF (10 mL) at 0 °C was added acryloyl chloride (1320 μL, 13.2 mmol, 2.6 equiv.) and the reaction was allowed to slowly warm up to room temperature. After 2 hours, the mixture was diluted with CH2Cl2 (2 x 50 mL) and washed with saturated brine (2 x 50 mL) and the combined organics were concentrated in vacuo. The residue was taken up in 10% MeOH / CH2Cl2 and purified by column chromatography to afford the title compound as a white solid (700 mg, 54%, mixture of E/Z isomers). 1H NMR (500 MHz, Chloroform-d) δ 6.63 (m, 3H), 6.29 (d, J = 16.8 Hz, 1H), 5.69 (dd, J = 10.6, 1.8 Hz, 1H), 4.69 (s, 1H [major]), 4.63 (s, 0.8H [minor]), 3.82 (s, 7H), 3.73 (t, J = 5.6 Hz, 1H), 2.84 - 2.77 (m, 2H); HRMS-ESI (m/z) calculated for C14H18NO3 [M+H]: 248.128; found: 248.1281.
2-chloro-N-(1-(3-ethynylbenzoyl)piperidin-4-yl)-N-phenylacetamide (61)
Figure imgf000150_0001
[00392] To an excess of neat SI-3 was added 0.7 mL of trifluoroacetic acid (0.2 M). The resulting solution was concentrated under a stream of nitrogen until no further evaporation was observed, providing the deprotected amine as its trifluoroacetate salt. The trifluoroacetate amine salt (90.6 mg, 0.25 mmol) was taken up in DMF (0.5 mL, 0.5 M) and the resulting solution was cooled to 0 °C. 3-Ethynylbenzoic acid (44 mg, 1.2 equiv.), HATU (113 mg, 1.2 equiv.), and Hunig's base (86 μL, 2 equiv.) were sequentially added. The reaction was stirred for 2 hours at 0 °C, diluted with Et2O, and then washed with 1 M HCl. The organic layer was dried over magnesium sulfate, concentrated, and purified by flash chromatography (gradient from 40 to 70% ethyl acetate in hexanes) to provide the title compound (87 mg, 92%). 1H NMR (400 MHz, Chloroform-d) δ 7.51 (dd, J = 9.5, 5.4 Hz, 4H), 7.43 (d, J = 1.9 Hz, 1H), 7.39 - 7.25 (m, 2H), 7.14 (d, J = 10.4 Hz, 2H), 4.86 (tt, J = 15.1, 5.3 Hz, 2H), 3.72 (s, 3H), 3.19 (d, J = 14.0 Hz, 1H), 3.11 (s, 1H), 2.86 (s, 1H), 1.90 (d, J = 36.6 Hz, 2H), 1.38 (s, 1H), 1.24 (d, J = 19.9 Hz, 1H); HRMS-ESI (m/z) calculated for C22H22ClN2O2 [M+H]: 381.1364; found: 381.1363.
Global profiling of cysteine-reactive fragments in native populations
[00393] Cysteine is unique among protein-coding amino acids owing to its high nucleophilicity and sensitivity to oxidative modification. Cysteine residues perform catalytic functions in diverse enzyme classes and represent sites for post-translational regulation of proteins through disulfide bonding, iron-sulfur cluster formation, conversion to sulfinic and sulfonic acid, nitrosylation, S-glutathionylation and lipid modification. Using a quantitative chemical proteomic method termed isoTOP-ABPP (isotopic Tandem Orthogonal Proteolysis-Activity-Based Protein Profiling), global measurements of the intrinsic reactivity of cysteine residues were carried out and their sensitivity to modification by lipid-derived electrophiles was assessed. In order to determine whether isoTOP-ABPP could be adapted to perform covalent FBLD in native biological systems, a cell preparation (lysate or intact cells) was pre-treated with DMSO or one member of a library of electrophilic small-molecule fragments and then exposed to a broad-spectrum cysteine-reactive probe, iodoacetamide (IA)-alkyne 1 (Fig. 1A). Proteins harboring IA-alkyne-labeled cysteine residues from DMSO- and fragment-treated samples were conjugated by copper-mediated azide-alkyne cycloaddition (CuAAC or click) chemistry to isotopically differentiated azide-biotin tags (heavy and light, respectively), combined, enriched by streptavidin, and proteolytically digested on-bead to yield isotopic peptide pairs that were analyzed by LC-MS. Quantification of MS1 chromatographic peak ratios for peptide pairs identified fragment-competed Cys residues as those displaying high competition ratios, or R values, in DMSO/fragment comparisons.
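The competition ratio described above can be sketched as a simple calculation, for illustration only. The sketch assumes that R is the ratio of the heavy (DMSO) to light (fragment) MS1 peak areas for a peptide pair, as implied by the tag assignment above. The cap applied when the light signal is absent, the example threshold, the peptide identifiers, and the function names are all hypothetical choices made for this example and are not taken from the text.

```python
def competition_ratio(dmso_heavy_area: float, fragment_light_area: float,
                      cap: float = 20.0) -> float:
    """R = heavy (DMSO) / light (fragment) MS1 peak area, capped at `cap`.

    The cap is an assumed convention for peptides whose light signal is
    absent or very small; it avoids division by zero and unbounded ratios.
    """
    if fragment_light_area <= 0:
        return cap
    return min(dmso_heavy_area / fragment_light_area, cap)


def competed_cysteines(peak_areas: dict, threshold: float = 4.0) -> dict:
    """Return cysteines whose R meets a (hypothetical) competition threshold.

    peak_areas maps a cysteine identifier to (heavy_area, light_area).
    """
    ratios = {cys: competition_ratio(h, l) for cys, (h, l) in peak_areas.items()}
    return {cys: r for cys, r in ratios.items() if r >= threshold}


if __name__ == "__main__":
    # Toy peak areas (arbitrary units), not experimental data.
    example = {
        "PROT1_C100": (1000.0, 250.0),  # strongly competed
        "PROT2_C45": (900.0, 880.0),    # essentially unchanged
        "PROT3_C12": (500.0, 0.0),      # light signal absent
    }
    print(competed_cysteines(example))
```

A cysteine that is fully blocked by the fragment gives little or no light signal and therefore a high R, whereas an unaffected cysteine gives R close to 1.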
[00394] A 50+ member fragment library was constructed with most compounds containing either a chloroacetamide or acrylamide electrophile (Fig. 1B and Fig. 3), which are well-characterized cysteine-reactive groups found in many chemical probes and some clinically approved drugs. These electrophiles were appended to structurally diverse small-molecule fragments (< 300 Da) intended to serve as recognition elements that promote interactions with different subsets of the human proteome. The library also contained some additional electrophiles, such as cyanoacrylamides and vinylsulfonamides, and known bioactive electrophilic compounds (e.g., the anti-cancer agent piperlongumine and anti-migratory agent locostatin) (Fig. 1B and Fig. 3). The electrophile library was screened at a high concentration (500 μM) comparable to the ligand concentrations used in typical FBLD experiments. A subset of the fragment library was initially assayed by competitive profiling in a human MDA-MB-231 breast cancer cell line proteome using an IA-rhodamine probe 16, which permitted facile SDS-PAGE detection of cysteine reactivity events. This experiment identified several proteins that showed reductions in IA-rhodamine labeling in the presence of one or more fragments (Fig. 1C, asterisks). Interestingly, the proteins exhibited distinct SARs across the test fragment set, indicating that the library recognition elements exert a strong influence over specific fragment-protein reactivity events.
[00395] Competitive isoTOP-ABPP was used to globally map human proteins and the cysteine residues within these proteins that were targeted by fragment electrophiles. Each fragment was tested, in general, against two distinct human cancer cell proteomes (MDA-MB-231 and Ramos cells) and most fragments were screened in duplicate against at least one of these proteomes. On average, 927 cysteines were quantified per data set, and it was required that individual cysteines be quantified in at least three data sets for interpretation. Based on these criteria, more than 6157 cysteines from 2885 proteins were quantified in aggregate across all data sets with an average quantification frequency of 22 data sets per cysteine (Fig. 4A). Fragment-competed cysteine residues, or "liganded" cysteines, were defined as those showing > 75% reductions in IA-alkyne labeling (R values > 4 for DMSO/fragment). To minimize the potential for false-positives, only cysteines that showed R values > 4 in two or more data sets and met additional criteria for data quality control were considered as targets of the fragment electrophiles. The proteomic reactivity values, or liganded cysteine rates, of individual fragments were then calculated as the percentage of liganded/total quantified cysteines in isoTOP-ABPP experiments performed on that fragment.
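The bookkeeping described in this paragraph can be illustrated with a minimal sketch, assuming each data set is represented as a mapping from a cysteine identifier to its R value. The function names, the data layout, and the toy values are hypothetical; treating R = 4 itself as passing the threshold is an assumption, and the additional data quality control criteria mentioned above are not modeled.

```python
from collections import defaultdict

R_THRESHOLD = 4.0             # R of 4 corresponds to a 75% loss of IA-alkyne labeling
MIN_DATASETS_QUANTIFIED = 3   # a cysteine must be quantified in at least three data sets
MIN_DATASETS_LIGANDED = 2     # and reach the threshold in two or more of them


def liganded_cysteines(datasets):
    """Return the cysteines that meet the liganded criteria.

    datasets: list of dicts mapping a cysteine id (hypothetical format) to its R value.
    """
    counts = defaultdict(lambda: [0, 0])  # cysteine -> [times quantified, times competed]
    for dataset in datasets:
        for cysteine, r_value in dataset.items():
            counts[cysteine][0] += 1
            if r_value >= R_THRESHOLD:  # boundary handling is an assumption
                counts[cysteine][1] += 1
    return {
        cysteine
        for cysteine, (quantified, competed) in counts.items()
        if quantified >= MIN_DATASETS_QUANTIFIED and competed >= MIN_DATASETS_LIGANDED
    }


def liganded_rate(fragment_datasets, liganded):
    """Percentage of liganded / total quantified cysteines for one fragment."""
    quantified, hits = set(), set()
    for dataset in fragment_datasets:
        for cysteine, r_value in dataset.items():
            quantified.add(cysteine)
            if r_value >= R_THRESHOLD and cysteine in liganded:
                hits.add(cysteine)
    if not quantified:
        return 0.0
    return 100.0 * len(hits) / len(quantified)


# Toy values for illustration only; these are not data from the experiments.
toy_datasets = [
    {"C1": 5.0, "C2": 1.0, "C3": 1.2},
    {"C1": 8.0, "C2": 1.1, "C3": 4.5},
    {"C1": 1.5, "C2": 0.9},
]
toy_liganded = liganded_cysteines(toy_datasets)
toy_rate = liganded_rate(toy_datasets[:1], toy_liganded)
```

In the toy data, C3 exceeds the threshold once but is quantified in only two data sets, so it is not called liganded; only C1 satisfies both criteria.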
[00396] Most fragment electrophiles showed a tempered reactivity across the human proteome, with a median liganded cysteine rate of 3.8% for the library (Fig. 4B). Substantial differences in reactivity were, however, observed, with individual electrophiles showing liganded cysteine rates of < 0.1% and others displaying rates > 15% (Fig. 4B). That piperlongumine and locostatin fell into the latter category indicated the intrinsic proteomic reactivity of the fragment electrophiles did not, in general, exceed that of previously described electrophilic probes. A subset of fragments was also screened at lower concentrations (25-50 μM), which confirmed that their proteomic reactivities were concentration-dependent (Fig. 4C). The relative reactivity of fragment electrophiles was similar in MDA-MB-231 and Ramos cell proteomes (Fig. 4D), indicating that this parameter is an intrinsic property of the compounds. Fragments also showed consistent reactivity profiles when assayed in biological replicate experiments (Fig. 4E). Interestingly, it was found that the proteomic reactivity of fragment electrophiles was only marginally correlated with their glutathione adduction potential, which is a commonly used surrogate assay for measurements of proteinaceous cysteine reactivity (Fig. 4F). These differences are attributed to the impact of the recognition element of fragment electrophiles on their interactions and, ultimately, reactivity with proteins.
[00397] A comparison of fragments 3, 14, 17, and 23-26 provided insights into the relative proteomic reactivity of different electrophilic groups coupled to a common recognition element (3,5-di(trifluoromethyl)phenyl group). Chloroacetamide 3 exhibited greater reactivity than acrylamide 14 (15% versus 3.4% liganded cysteines, respectively; Fig. 1D), with cyanoacrylamide 23 exhibiting similar reactivity to acrylamide 14 and other, more sterically congested electrophiles (24-26) showing reduced proteomic reactivity (Fig. 4G). Importantly, the non-electrophilic acetamide control fragment 17 showed negligible activity in competitive isoTOP-ABPP experiments (liganded cysteine rate < 0.2%) (Fig. 1D), indicating that the vast majority of detected fragment-cysteine interactions reflected covalent reactions versus non-covalent binding events. Also in support of this conclusion, "clickable" alkyne analogues of 3 and 14 (compounds 19 and 18, respectively) exhibited different concentration-dependent proteome labeling profiles (19 > 18; Fig. 1E) that mirrored the respective liganded cysteine rates displayed by 3 and 14 in competitive isoTOP-ABPP experiments (3 > 14; Fig. 1D). Despite the greater overall proteomic reactivity of chloroacetamide 3 relative to acrylamide 14 and cyanoacrylamide 23, clear examples of cysteines were found that were preferentially liganded by the latter fragments (Fig. 1F).
[00398] In some instances, these findings demonstrate that the isoTOP-ABPP platform is one method that can be used to competitively profile fragment electrophiles against thousands of cysteine residues in native proteomes.
Cysteines targeted by fragment electrophiles in native proteomes
[00399] Across all isoTOP-ABPP data sets combined, 758 liganded cysteines were identified on 637 distinct proteins, which corresponded to ~12 and 22% of the total quantified cysteines and proteins, respectively (Fig. 5A and Tables 1-3). Only a modest fraction of the proteins harboring liganded cysteines were found in the DrugBank database (15%; Fig. 5B), indicating the fragment electrophiles targeted many proteins that lack small-molecule probes. Among protein targets with known covalent ligands, the fragment electrophiles frequently targeted the same cysteine residues as these known ligands (Table 4); examples include the protein kinase BTK, in which electrophilic fragments targeted an active-site cysteine that also reacts with the cancer drug ibrutinib, and XPO1 and ERCC3, in which electrophilic fragments targeted conserved cysteines that are modified by bioactive natural products and candidate anti-cancer agents. In the case of BTK, it was confirmed that the interaction of ibrutinib with this kinase was detected by isoTOP-ABPP, which also identified a known ibrutinib off-target - MAP2K7 - in Ramos cell lysates (Fig. 7A).
[00400] DrugBank proteins with liganded cysteines mostly originated from classes that are regarded as "druggable", including enzymes, channels, and transporters (Fig. 5C). Non-DrugBank proteins with liganded cysteines, on the other hand, showed a broader class distribution that included proteins, such as transcription factors and adaptor/scaffolding proteins, that are considered challenging to target with small-molecule ligands (Fig. 5C). Even among the enzymes targeted by fragment electrophiles, many examples were noted where the liganded cysteine was a non-active site residue (Fig. 7B). These data indicated that the cysteines modified by fragment electrophiles were not restricted to classical ligand-binding pockets on proteins. Also consistent with this premise, only ~6% of all of the liganded cysteines were functionally annotated as active-site residues (Fig. 5D). Active-site cysteines, as well as redox-active cysteines, were still, however, substantially enriched among the liganded cysteine group compared to unliganded cysteines quantified by isoTOP-ABPP (Fig. 5D). It had been previously found that active-site and redox-active cysteines also show, in general, greater intrinsic reactivity (as measured with the IA-alkyne probe) compared to other cysteines. While this heightened reactivity is a likely contributory factor to the ligandability of cysteines, as reflected in the high proportion of hyperreactive cysteines that were detected as targets of fragment electrophiles (Fig. 5E), liganded cysteines were also well-represented across a broad range of intrinsic reactivities (Fig. 5E). Finally, most proteins were found to harbor a single liganded cysteine among the several cysteines that were, on average, quantified per protein by isoTOP-ABPP (Fig. 5F). The nuclear export factor XPO1 and metabolic enzyme PHGDH provide compelling examples of the selectivity displayed by fragment electrophiles for individual cysteines within proteins (Fig. 5G and Fig. 7C). Among the six different XPO1 cysteine residues quantified by isoTOP-ABPP, a single cysteine, C528, was frequently targeted by fragment electrophiles (Fig. 5G), and this residue is also modified by electrophilic drugs in clinical development for cancer. Similarly, among eight quantified cysteines in PHGDH, only C369, a non-active site residue, was targeted by electrophilic fragments (Fig. 7C).
[00401] Liganded cysteines displayed strikingly distinct SARs with the fragment electrophile library (Fig. 6A and Tables 1-3). While a handful of cysteines were targeted by a large number of fragments (> 50%), most cysteines exhibited more restricted reactivity (Fig. 6A, B and Tables 1-3). The operational grouping of fragment electrophiles based on their relative proteomic reactivity values (group A, > 10%; group B, < 10%) revealed SAR features that emphasized both the recognition and reactivity components of cysteine-electrophile interactions. Certain cysteines, for instance, preferentially interacted with the less reactive (group B) fragments (e.g., GLRX5; MSTO1; SRP9; UCHL3; Fig. 6A), while others were mainly liganded by the most reactive (group A) fragments (e.g., ATXN7L3B; CRKL; C2ORF49; Fig. 6A), although, even in these cases, the interactions differed substantially across group A fragments. Liganded cysteines located in the active sites of proteins tended to show broader reactivity with the fragment electrophiles compared to other cysteines (Fig. 6C), possibly reflecting their greater ligandability, but clear SARs were observed for many non-active site cysteines and these residues were not disproportionately targeted by group A fragments (Fig. 6D). These principles applied across different protein classes and were well-exemplified in kinases, for which > 20 liganded cysteines were identified that distributed near-evenly between active- and non-active-site residues (Fig. 7D-F). Even cysteines found in proteins considered challenging to drug, such as transcription factors/regulators, showed distinct SARs indicative of specific interactions involving both binding and reactivity (Fig. 6D and Fig. 9G). In addition, for greater than about 60% of liganded cysteines, electrophile (IA-alkyne or fragment) reactivity was blocked by heat denaturation of the proteome, while a fraction of unliganded cysteines (about 20%) showed decreased IA-alkyne labeling following heat denaturation (Figs. 15 and 16). In some instances, these results showed that the ligand-cysteine interactions are specific in that they depend on both the binding groups of ligands and structured sites in proteins.
[00402] The availability of three-dimensional structures for a subset of proteins with liganded cysteines provided an opportunity to test whether docking predicts sites of fragment electrophile reactivity. Covalent docking programs have recently been introduced to discover ligands that target pre-specified cysteines in proteins; here, however, the aim was to computationally assess the relative ligandability of all cysteines within a protein and match these outputs to the data acquired in isoTOP-ABPP experiments. First, 29 representative protein targets were scanned and 99 solvent-accessible cysteines were identified. Then, the fragment electrophile library was docked on each residue independently using a modified potential to simulate non-covalent interactions preceding the alkylation event. In cases where the fragment electrophile bound favorably near a cysteine and the reactive group was within covalent bond distance of the cysteine, the cysteine was considered to be modified by the fragment. Docking scores were then calculated based on the estimated interaction energy of each fragment in its docked pose, and the ranking of these predictions matched the experimental data in 19 out of the 29 systems (i.e., cases where the top predicted ligandable cysteine matched the liganded cysteine determined by isoTOP-ABPP) (Fig. 6E, F and Table 5). In six out of the remaining 10 systems, the liganded cysteines were ranked second by reactive docking. In the remaining four systems, reactive docking failed to predict the liganded cysteine due to limitations in the docking scoring function or structural issues in the models used. Notably, across the entire 29 proteins evaluated by reactive docking, it was found that cysteines predicted to be ligandable were much more likely to have been detected by isoTOP-ABPP compared to cysteines not predicted to be ligandable (Fig. 6E and Fig. 7H).
It was also found that cysteines predicted to be ligandable were more likely to have been detected by isoTOP-ABPP and exhibited heat-sensitive IA-alkyne reactivity (Fig. 17A and Fig. 17B). These results indicate that reactive docking provides a good overall prediction of the ligandability of proteinaceous cysteines and suggest that IA-alkyne reactivity itself provides an independent experimental parameter useful for designating potentially ligandable cysteines in proteins.
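The comparison between docking rank and experimentally liganded cysteines can be summarized as a top-k agreement. The following is a minimal sketch, assuming that a lower (more negative) docking score denotes a more favorable predicted interaction; the function names and the example scores are hypothetical, and only the counts 19, 6, and 4 out of 29 are taken from the paragraph above.

```python
def rank_of_liganded(docking_scores, liganded_cysteine):
    """1-based rank of the experimentally liganded cysteine within one protein.

    docking_scores: dict mapping cysteine id -> docking score.
    Assumption: lower scores are more favorable.
    """
    ranked = sorted(docking_scores, key=docking_scores.get)
    return ranked.index(liganded_cysteine) + 1


def top_k_agreement(ranks, k):
    """Fraction of systems in which the liganded cysteine is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical scores for one protein, for illustration only.
example_scores = {"C528": -7.2, "C199": -5.0, "C34": -3.1}
example_rank = rank_of_liganded(example_scores, "C528")

# Reported outcome: 19 systems ranked first, 6 ranked second, 4 not predicted.
# The value 3 is only a placeholder meaning "neither first nor second".
reported_ranks = [1] * 19 + [2] * 6 + [3] * 4
top1 = top_k_agreement(reported_ranks, 1)
top2 = top_k_agreement(reported_ranks, 2)
```

With the reported counts, the top-ranked prediction agrees with experiment in 19 of 29 systems (about 66%), and the liganded cysteine falls within the top two predictions in 25 of 29 systems (about 86%).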
Functional analysis of ligand-cysteine interactions
[00403] The next step was to confirm and determine the functional impact of ligand-cysteine interactions mapped by isoTOP-ABPP using recombinant proteins. Two proteins were selected for which the functional significance of the liganded cysteines had been previously demonstrated. The protein methyltransferase PRMT1 possesses a non-catalytic active-site cysteine (C109) that, when modified by electrophilic small molecules like 4-hydroxynonenal (HNE), results in the inhibition of PRMT1 activity. Competitive isoTOP-ABPP revealed a very selective SAR for ligand engagement of C109 of PRMT1, with only three fragments (2, 11, and 51) blocking IA-alkyne labeling of this residue (Fig. 6A and Fig. 8A and Tables 1-3). Even though several additional cysteines in PRMT1 were quantified in isoTOP-ABPP experiments (none of which showed sensitivity to the tested fragment electrophiles; Fig. 8A and Tables 1-3), it was found that IA-rhodamine labeling of recombinant PRMT1 was blocked by mutating C109 to serine (Fig. 8B). These data are consistent with past studies indicating that C109 is the most reactive cysteine in PRMT1 and is selectively labeled by low concentrations of electrophilic probes. Using a convenient SDS-PAGE readout, it was confirmed that fragment 11 blocked IA-rhodamine labeling of PRMT1 with an IC50 value of 36 μM, whereas control fragment 3 was inactive (Fig. 8B, C), despite displaying similar overall proteome reactivity to 11 (Fig. 4B). Pre-treatment with 11, but not 3, also inhibited PRMT1-catalyzed methylation of histone 4 in a C109-dependent manner (Fig. 8D). These data indicate that electrophilic ligands targeting C109 act as PRMT1 inhibitors.
[00404] MLTK, or ZAK, which is a MAP3 kinase that possesses an active site-proximal cysteine residue C22 that is modified by HNE to feedback-inhibit JNK pathways under conditions of oxidative stress, was then examined. MLTK has also recently been implicated as an oncogenic driver in gastric cancer and is an off-target for ibrutinib, which reacts with C22 of MLTK. Competitive isoTOP-ABPP experiments identified a subset of fragment electrophiles that blocked IA-alkyne labeling of C22 in MLTK (Fig. 9A and Tables 1-3). The SAR provided by isoTOP-ABPP was verified and extended by testing fragments for blockade of labeling of recombinant MLTK using an ibrutinib-derived activity probe (Fig. 8E and Fig. 9B), which identified the benzofuran fragment 60 as having good potency for inhibiting MLTK (IC50 value of 2.6 μM) and 3 as an inactive control probe (Fig. 8E, F and Fig. 9A, B). Fragment 60, but not 3, also blocked the catalytic activity of MLTK using a substrate phosphorylation assay, and this inhibitory effect was not observed with a C22A-MLTK mutant (Fig. 8G and Fig. 18).
[00405] Next, proteins were evaluated that possessed previously uncharacterized liganded cysteines. IMPDH2, which is the rate-limiting enzyme in de novo synthesis of guanine nucleotides and regulates immune cell proliferation and cancer, contained two liganded cysteines - C140 and C331 - that showed overlapping, but distinct SARs in competitive isoTOP-ABPP experiments (Fig. 9C, D; Fig. 19 and Tables 1-3). C331 serves as a catalytic nucleophile and active site-directed inhibitors of IMPDH2 have been described. C140, on the other hand, is found in a separate Bateman domain of IMPDH2, which serves as a module for allosteric regulation by sensing nucleotides (Fig. 9D) and has not been shown to react with electrophilic small molecules. Therefore, focus was placed on the characterization of C140. It was first confirmed that fragment 14 directly labeled C140 of recombinant IMPDH2 by MS methods (Table 6). An alkyne analogue of 14 (18; Fig. 8H) was then synthesized, which provided a means to directly monitor ligand interactions at C140 by click chemistry conjugation to a rhodamine-azide tag and SDS-PAGE analysis. Click probe 18 labeled WT-IMPDH2 and a C331S-IMPDH2 mutant, but not the C140S or C140S/C331S mutants of this enzyme (Fig. 8H). Using this assay, it was confirmed that 14, but not control fragment 8, inhibited the labeling of IMPDH2 by 18 (Fig. 9E). IMPDH2 labeling by 18 was also inhibited by nucleotides ATP, AMP, and GTP, but not UTP or IMP (Fig. 8I and Fig. 9F). ATP blocked 18 labeling of IMPDH2 with an IC50 value of 45 μM (Fig. 8J). Thus, covalent ligands targeting the Bateman domain of IMPDH2 serve not only as inhibitors, but also as probes of nucleotide binding to this enzyme.
[00406] Two liganded cysteines - C114 and C161 - were also identified in the p53-induced phosphatase TIGAR (Fig. 9G, H). In some instances, TIGAR acts as both a fructose-2,6-bisphosphatase and 2,3-bisphosphoglycerate phosphatase to shape the metabolic state of cancer cells and protect them from ROS-induced apoptosis. Inhibitors of TIGAR have not been described. C114 is found on the lid of the TIGAR active site, ~15 Å from the phosphate substrate binding site (Fig. 9H). C161 resides on the opposite side of the protein. Focus was placed on the characterization of fragment labeling of C114 given its proximity to the TIGAR active site. It was first confirmed that both C114 and C161 of recombinant TIGAR were labeled by the IA-rhodamine probe and that this labeling was partly diminished in C114S and C161S single mutants and fully blocked in a C114S/C161S double mutant of TIGAR (Fig. 9I). Interactions of hit fragment 5 with C114 of TIGAR were also verified by LC-MS analysis (Table 6) and by showing that the fragment blocked IA-rhodamine labeling of a C161S-TIGAR mutant with an IC50 value of 16 μM (Fig. 8K, L); in contrast, the control fragment 3 showed much lower potency (Fig. 8K, L). 5 also blocked the catalytic activity of WT- and C161S-, but not C114S- or C114S/C161S-TIGAR using a substrate assay (Fig. 8M). Control fragment 3 did not affect TIGAR catalytic activity (Fig. 8L). Inhibition of TIGAR substrate turnover by 5 plateaued at 70% (Fig. 9J), which indicates that the covalent ligand acts by an allosteric mechanism or does not extend fully into the active site of TIGAR to produce complete inhibition.
Electrophilic ligands that inhibit IDH1 activity in cancer cells
[00407] Isocitrate dehydrogenase 1 (IDH1) and 2 (IDH2) are mutated in a number of human cancers to produce enzyme variants with a neomorphic catalytic activity that converts isocitrate to 2-hydroxyglutarate (2-HG). Increases in 2-HG inhibit α-ketoglutarate-dependent dioxygenases that function as tumor suppressors, in particular, by methylating DNA and proteins. Competitive isoTOP-ABPP experiments identified distinct subsets of ligands that targeted a conserved cysteine in IDH1 and IDH2 (C269 and C308, respectively; Tables 1-3). This cysteine is an active site-proximal residue that is 13 Å from the NADP+ molecule in a crystal structure of IDH1 (Fig. 10A); glutathionylation of C308 has previously been shown to block IDH2 activity, but, to our knowledge, irreversible inhibitors of IDH enzymes have not been characterized.
[00408] The functional significance of ligand interactions with IDH enzymes was explored by recombinantly expressing wild type (WT) and a C269S mutant of IDH1. WT-, but not C269S-IDH1 reacted with the IA-rhodamine probe as detected by SDS-PAGE, and fragment electrophiles blocked this reaction with an SAR that mirrored that observed for endogenous IDH1 in competitive isoTOP-ABPP experiments (Fig. 11A and Tables 1-3). Fragment 20 inhibited IA-rhodamine labeling of WT-IDH1 with an IC50 value of 2.9 μM (Fig. 11B and Fig. 10B) and showed similar activity with the R132H oncogenic mutant of IDH1 (Fig. 10C and Fig. 20). It was also confirmed by isoTOP-ABPP that 20 (25 μM) completely blocked IA-alkyne labeling of endogenous IDH1 in MDA-MB-231 proteomes (R value = 20; Fig. 10D) and, by MS analysis, that 20 directly modifies C269 of IDH1 (Table 6). Fragment 2 showed much less activity against C269 of IDH1 (IC50 > 50 μM; Fig. 11B and Fig. 10B) and was therefore selected as a control probe. It was found that 20 blocked in a concentration-dependent manner the catalytic activity of WT-IDH1 (as measured by the reduction of NADP+ to NADPH in the presence of isocitrate), but did not inhibit the activity of the C269S-IDH1 mutant (Fig. 11C). The in situ activity of 20 was also tested by generating a human cancer cell line that stably overexpressed R132H-IDH1 (Fig. 10E). The R132H-IDH1 cells were treated with fragments 20 and 2 for 2 h, lysed, and assayed ex situ for 2-HG production. 20 (50 μM) near-completely blocked 2-HG production by R132H cell lysates, while 2 (50 μM) only caused a slight decrease in this activity (Fig. 11D). Parallel competitive isoTOP-ABPP experiments confirmed that fragment 20, but not fragment 2, inhibited IA-alkyne labeling of C269 of IDH1 in situ (Fig. 10F).
Global profiling of cysteine-reactive fragments in cells
[00409] Encouraged by the cellular activity of the IDH1 ligand 20, the capacity of fragment electrophiles to modify proteinaceous cysteines in situ was more broadly assessed. MDA-MB-231 and Ramos cells were treated with representative members of the fragment library (23 compounds tested in total; each compound tested at 200 μM, 2 h in situ treatment), and the cells were then harvested, lysed, and analyzed by isoTOP-ABPP. A handful of fragments were cytotoxic to cells and were re-tested at lower (50 or 100 μM) concentrations. The tested fragments showed a broad range of in situ reactivities that generally matched their respective reactivities in vitro (Fig. 11E and Tables 1-3). Some fragments, however, showed somewhat greater reactivity in cells, while fragment 11 was notably devoid of activity in situ (Fig. 11E). These differences reflect the impact of transport and/or metabolic pathways on the cellular concentrations of fragment electrophiles. A substantial fraction (64%) of the liganded cysteines identified in cell lysates were also sensitive to the same electrophilic fragments in cells (Fig. 11F). A handful of fragment-cysteine interactions were also observed selectively in situ, but not in lysates, including C182 of p53 (TP53), a redox-regulated residue at the dimerization interface of the DNA binding domain (Fig. 11G). In some instances, these liganded cysteines require an intact cellular environment to preserve their interactions with fragment electrophiles. Taken together, these findings indicate that the ligandability of cysteine residues is generally similar in lysates and cells, although exceptional cases underscore the importance of having the capability to perform ligand discovery experiments in situ.
Electrophilic ligands that target pro-caspase-8 and block extrinsic apoptosis
[00410] Several fragments targeted the catalytic cysteine nucleophile C360 of the protease caspase-8 (CASP8) in isoTOP-ABPP experiments performed in vitro and in situ (Fig. 12A and Tables 1-3). CASP8 plays important roles in apoptosis, immune cell proliferation, and embryonic development, but selective, non-peptidic, and cell-active inhibitors for this protease are lacking. Representative fragment hits were screened against recombinant, active CASP8 using substrate and activity-based probe (Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857)) assays, and marginal to no inhibition was observed with most fragments (Fig. 12B). Given this initially puzzling outcome, it was hypothesized that fragment labeling of CASP8 in isoTOP-ABPP experiments might reflect reaction with the inactive zymogen (pro-) rather than active form of this protease. Western blots confirmed that most, if not all, of the CASP8 in MDA-MB-231 cell lysates existed in the pro-form (Fig. 12C). Next, a recombinant form of pro-CASP8 was expressed with mutated cleavage sites (D374A and D384A) to prevent processing and activation. A non-catalytic cysteine of pro-CASP8 was also mutated (C409S), which enabled detection of C360 labeling with IA-rhodamine by SDS-PAGE analysis (Fig. 13A). Several hit fragments detected in isoTOP-ABPP experiments completely blocked IA-rhodamine labeling of pro-CASP8 (Fig. 12D). Fragment 7 displayed the highest potency, with an IC50 value of ~5 μM (Fig. 13A, B), which, when combined with the low overall proteome reactivity of this fragment (3%), designated it as a suitable tool compound for further studies.
[00411] Fragment 7 (50 μM) fully blocked IA-alkyne labeling of C360 of CASP8 in isoTOP-ABPP experiments performed in both Ramos and Jurkat cell lysates (Fig. 13C). Next, a clickable analogue of 7 (61) was synthesized and it was found that this probe (25 μM) strongly labeled pro-CASP8, but not a C360S-pro-CASP8 mutant (Fig. 13D and Fig. 12E). 7 (50 μM) blocked labeling of pro-CASP8 by 61, but did not inhibit labeling of active CASP8 by the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) developed to target active caspases (Fig. 13D and Fig. 12F). Conversely, the general caspase inhibitor Ac-DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857) (20 μM) blocked Rho-DEVD-AOMK ("DEVD" disclosed as SEQ ID NO: 857) labeling of active CASP8, but not 61 labeling of pro-CASP8 (Fig. 13D, Fig. 12F, and Fig. 21A). Similar results were obtained in substrate assays, where DEVD-CHO ("DEVD" disclosed as SEQ ID NO: 857), but not 7, blocked CASP8 activity (Fig. 13E). Cross-reactivity of 7 with other caspases was not observed, including recombinant, active CASP3 assayed with a substrate (Fig. 13E) or the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) (Fig. 12F) or CASP2 and CASP7 in cell lysates measured by isoTOP-ABPP (Fig. 12G). Finally, to further verify that 7 preferentially reacts with pro-CASP8 over active CASP8 in complex biological systems, recombinant forms of these proteins were doped into MDA-MB-231 cell lysates followed by treatment with 7 (30 μM, 1 h) or DMSO and analysis by isoTOP-ABPP. 7 produced a near-complete blockade of IA-alkyne labeling of C360 for pro-CASP8 (R = 10), but had little effect on IA-alkyne reaction with C360 of active CASP8 (R = 1.9) (Fig. 13F).
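As a worked illustration of how the two R values at the end of this paragraph translate into the extent of labeling blockade, the following minimal sketch assumes that R is the ratio of IA-alkyne signal in the DMSO sample to that in the compound-treated sample, so that the fraction of labeling lost equals 1 - 1/R. The function name is hypothetical and the calculation is an arithmetic aid, not part of the described method.

```python
def fraction_blocked(r_value):
    """Fraction of IA-alkyne labeling lost, given R = DMSO signal / compound signal."""
    if r_value <= 0:
        raise ValueError("R must be positive")
    return 1.0 - 1.0 / r_value


# R values reported for C360 after treatment with 7 (30 uM, 1 h).
for label, r_value in [("pro-CASP8", 10.0), ("active CASP8", 1.9)]:
    percent = 100.0 * fraction_blocked(r_value)
    print(f"{label}: R = {r_value} corresponds to {percent:.0f}% of labeling blocked")
```

Under this assumption, R = 10 corresponds to loss of about 90% of IA-alkyne labeling for pro-CASP8, whereas R = 1.9 corresponds to loss of about 47% for active CASP8.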
[00412] Treatment of Jurkat cell lysates with 10 or 100 μM of 61, followed by analysis of the combined samples by isoTOP-ABPP, confirmed direct labeling of C360 of CASP8 by 61 (Fig. 12H). The low R value observed for C360 in this analysis (R = 2) indicated near complete labeling of this cysteine by 61 at 10 μM in cell lysates, consistent with the low μM IC50 value displayed by the parent fragment 7 for inhibiting IA-rhodamine labeling of C360 of CASP8 (Fig. 13B). The effect of pro-CASP8 inhibition in cellular apoptosis assays was next evaluated. Because C360 is the catalytic nucleophile of CASP8, it was not possible to mutate this residue to create a control protein for evaluating the pharmacological effects of 7 in cells. Instead, a structurally related inactive probe was developed for this purpose. It was found that bulky substituents placed on the aniline ring of 7 furnished compounds such as 62 that did not inhibit pro-CASP8 labeling by IA-rhodamine (Fig. 13B, G). It was confirmed that 62 also did not inhibit active CASP3 or CASP8 using substrate (Fig. 13E) and activity-probe (Fig. 12F) assays and was inactive against endogenous CASP8, CASP2, or CASP7 in Jurkat lysates as determined by isoTOP-ABPP (Fig. 12G). Based on these data, 62 was designated as a suitable inactive control probe for studying the inhibition of pro-CASP8 by 7. Jurkat cells were treated with 7 or 62 (30 μM, 30 min) prior to addition of FASL or staurosporine (STS) to induce extrinsic and intrinsic apoptosis, respectively. 7, but not 62, completely blocked FASL-induced apoptosis (Fig. 13H and Fig. 21B-C), as well as the proteolytic processing of CASP3, CASP8, and the apoptosis marker PARP (Fig. 13I). In contrast, 7 did not block STS-induced intrinsic apoptosis (Fig. 13H) or the cleavage of PARP and CASP3, although the compound did substantially inhibit cleavage of CASP8 in these cells (Fig. 13I). The non-selective caspase inhibitor VAD-FMK prevented both FASL- and STS-induced apoptosis and associated proteolytic processing events (Fig. 13H, I). Chemical proteomic experiments revealed that 7 fully inhibited CASP8, as well as the related initiator caspase CASP10 (but not other caspases, including CASP2, 3, 6, and 9) in Jurkat cells (Fig. 14A and Fig. 22A). It was confirmed that 7 blocked labeling of pro-CASP10 by 61 with an apparent IC50 value of 4.5 μM (Fig. 22B-D), but did not inhibit active CASP10 as measured by labeling with the Rho-DEVD-AOMK probe ("DEVD" disclosed as SEQ ID NO: 857) (Fig. 21A) or a substrate assay (Fig. 22E). As such, in some instances, 7 blocking CASP8 processing in both FASL- and STS-treated cells supports a model where CASP8 activation mainly occurs through auto-processing in either extrinsic or intrinsic apoptosis, but is only required for the former type of programmed cell death.
[00413] In some instances, the respective functions of CASP8 and CASP10 in extrinsic apoptosis and other cellular processes remain poorly understood in large part due to a lack of selective, non-peptidic, and cell-active inhibitors for these enzymes and the absence of animal models for CASP10 (which is not expressed in rodents). In some cases, the potency and selectivity of 7 were improved to address this issue. Conversion of the 4-piperidino moiety to a 3-piperidino group and addition of a p-morpholino substituent to the benzoyl ring of 7 furnished compound 63, which was separated by chiral chromatography into its two purified enantiomers, 63-R (Fig. 4c) and 63-S, the former of which showed substantially improved activity against CASP8 (apparent IC50 value of 0.7 μM (95% CI, 0.5 - 0.8); Fig. 22F-H) and negligible cross-reactivity with CASP10 (IC50 value > 100 μM; Fig. 22C, D, F). 63-S was much less active against CASP8 (apparent IC50 value of 15 μM; Fig. 22G, H) and also inactive against CASP10 (Fig. 14A). With dual CASP8/10 (7) and CASP8-selective (63-R) ligands in hand, the biological functions of these proteases were next investigated.
[00414] The effects of caspase ligands were evaluated in human T cells, where both CASP8 and CASP10 are highly expressed (Fig. 22I), starting with Jurkat cells, which are a commonly studied immortalized human T cell line. It was found that 63-R fully blocked FasL-induced apoptosis in Jurkat cells and did so with greater potency than 7 (Fig. 14B and Fig. 22J) or 63-S (Fig. 22K). Similar results were obtained in HeLa cells, which express CASP8, but not CASP10 (Fig. 22L). In contrast to these cell line results, FasL-induced apoptosis in primary human T cells showed substantial resistance to 63-R at all tested concentrations and instead was completely inhibited by the dual CASP8/10 ligand 7 (Fig. 14B). It was confirmed by chemical proteomics with probe 61 that 7 blocked both CASP8 and CASP10, while 63-R inhibited CASP8, but not CASP10, in primary human T cells and Jurkat cells (Fig. 14A). Consistent with these cell death results, 7, but not 63-R, prevented proteolytic processing of CASP3 and CASP10 in primary human T cells (Fig. 22M). In some instances, the processing of both CASP8 and the initiator caspase substrate RIP kinase was also preferentially inhibited by 7 versus 63-R (Fig. 22M), indicating that CASP10 also contributes to these proteolytic events in T cells, as has been suggested by biochemical studies.
Example 2
[00415] Dimethyl fumarate (DMF) is a drug used to treat autoimmune conditions, including multiple sclerosis and psoriasis. In some instances, the mechanism of action of DMF is unclear, but is proposed to involve covalent modification of proteins and/or serving as a pro-drug that is converted to monomethyl fumarate (MMF). Using an isoTOP-ABPP approach, the mechanism of action of DMF is examined.
Chemical reagents
[00416] Assays were performed with the following reagents: dimethyl fumarate (DMF; 242926; Sigma Aldrich), monomethyl fumarate (MMF; 651419; Sigma Aldrich), dimethyl succinate (DMS; W239607; Sigma Aldrich), and buthionine sulfoximine (BSO; 14484; Cayman Chemical).
Isolation of primary human T cells
[00417] All studies using samples from human volunteers followed protocols approved by the TSRI institutional review board. Blood from healthy donors (females aged 30-49) was obtained after informed consent. Peripheral blood mononuclear cells (PBMCs) were purified over Histopaque-1077 gradients (10771; Sigma) following the manufacturer's instructions. Briefly, blood (20 x 25 mL aliquots) was layered over Histopaque-1077 (12.5 mL) and the samples were then fractionated by centrifugation (2000 rpm, 20 min, 20 °C, no brake). PBMCs were harvested from the Histopaque-plasma interface and washed twice with PBS. The T cells were then isolated using an EasySep™ Human T Cell Isolation Kit (17951; STEMCELL) per the manufacturer's instructions.
Mice
[00418] C57BL/6J and Nrf2-/- mice (Stock No: 017009; Nfe2l2tm1Ywk; Jackson Labs) were bred and maintained in a closed breeding facility at The Scripps Research Institute and were 6-8 weeks old when used in experiments. All mice were used in accordance with guidelines from the Institutional Animal Care and Use Committee of The Scripps Research Institute.
[00419] For the PKCθ studies, C57BL/6 mice and Prkcq-/- mice were housed under specific pathogen-free conditions and used in accordance with a protocol approved by the La Jolla Institute for Allergy and Immunology Animal Care Committee.
Isolation of primary mouse T cells
[00420] Spleens were harvested from female mice, perfused with collagenase, and incubated at 37 °C with 5% CO2 for 30 min. After this time, the spleens were homogenized. Cells that passed through a 100 μm cell strainer were collected and washed with RPMI. T cells were isolated from the splenocytes using the EasySep™ Mouse T Cell Isolation Kit (19851; STEMCELL) according to the manufacturer's instructions.
[00421] For the PKCθ studies, CD4+ T cells were isolated by anti-mouse CD4 magnetic particles (L3T4; BD IMag) and were cultured in RPMI-1640 medium (Gibco) supplemented with 10% (vol/vol) heat-inactivated FBS, 2 mM glutamine, 1 mM sodium pyruvate, 1 mM MEM nonessential amino acids, 100 U/mL each of penicillin G and streptomycin (Life Technologies) and recombinant IL-2 (100 U/mL, BioLegend).
T cell stimulation
[00422] 96-well plates were coated with anti-CD3 (1:200; BioXcell) and anti-CD28 (1:500; 302933; BioLegend) in PBS (100 μL/well) overnight at 4 °C. The plates were then washed twice with PBS and to each well was added 500,000 primary T cells in 100 μL of RPMI supplemented with 10% FBS, glutamine, and Pen-Strep. Cells were then treated with 100 μL of media containing compound at the indicated concentrations (final well volume of 200 μL). Cells were left at 37 °C in a 5% CO2 incubator for the indicated periods of time and harvested by centrifugation (500g, 8 min, 4 °C), followed by washing with PBS.
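Because each well already holds 100 μL of cell suspension before 100 μL of compound-containing media is added, the added media would need to carry the compound at twice the intended final concentration. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the 2x preparation is inferred from the stated volumes rather than described explicitly in the protocol.

```python
def working_concentration(final_conc_um, cell_volume_ul=100.0, added_volume_ul=100.0):
    """Concentration (uM) required in the added media so that the well
    reaches final_conc_um after mixing with the cell suspension."""
    if added_volume_ul <= 0:
        raise ValueError("added volume must be positive")
    total_volume_ul = cell_volume_ul + added_volume_ul
    return final_conc_um * total_volume_ul / added_volume_ul


# Final DMF concentrations discussed in the text: 10, 25 and 50 uM.
for final in (10, 25, 50):
    stock = working_concentration(final)
    print(f"{final} uM final -> prepare {stock:g} uM in the added media")
```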
Cellular analysis and sorting by flow cytometry
[00423] Cells were transferred to a round bottom 96-well plate (0720095; Fisher Scientific), harvested by centrifugation (500g, 3 min, 4 °C), washed with PBS, and stained with LIVE/DEAD fixable cell stain (L23105; ThermoFisher) according to the manufacturer's instructions. Briefly, one vial of LIVE/DEAD stain was resuspended in 50 μL of DMSO and added to 20 mL of PBS. To each well of the 96-well plate was added 200 μL of the stain, and the cells were incubated on ice for 30 min in the dark. After this time, cells were pelleted and washed once with PBS, then stained for cell surface antigens.
[00424] Flow cytometry analysis of cell surface antigens was performed with the following antibodies: Pacific Blue-conjugated anti-CD8 (1:25 dilution; clone RPA-T8; BD Biosciences), APC-conjugated anti-CD4 (1:25 dilution; clone RPA-T4; eBioscience), phycoerythrin-conjugated anti-CD25 (1:25 dilution; clone BC96; eBioscience or PC61; BioLegend (PKCθ studies)), FITC-conjugated anti-CD69 (1:25 dilution; clone FN50; eBioscience). All antibodies were diluted in 1% FBS in PBS, and 50 μL of the stain solution was added to each well. Cells were stained for 15 min on ice in the dark, after which cells were harvested by centrifugation (500g, 3 min, 4 °C), washed with 1% FBS in PBS, and resuspended in 200 μL/well of 4% PFA in PBS. Flow cytometry acquisition was performed with a BD FACSDiva™-driven BD™ LSR II flow cytometer (Becton, Dickinson and Company). Data were then analyzed with FlowJo software (Treestar Inc.). Data represent mean ± SE for four to five experiments per group.
Quantification of secreted cytokines by enzyme-linked immunosorbent assay (ELISA)
[00425] T cells were harvested and stimulated as described above. At the indicated time points, cell culture supernatants were collected and IL-2 levels were measured in clear microplates (991427; R&D Systems) according to the manufacturer's instructions (Human IL-2 DuoSet ELISA; DY202; R&D Systems). Plates were read in a Gemini SpectraMax 250 microplate reader set to 450 nm. Data represent mean ± SE for four experiments per group.
[00426] For the PKCθ studies, aliquots of transduced Prkcq-/- CD4+ T cells (1 x 10^6) were stimulated for 48 h with anti-CD3 alone or anti-CD3 plus anti-CD28, and the concentration of IL-2 in culture supernatants was determined by enzyme-linked immunosorbent assay according to the manufacturer's instructions (BioLegend). Briefly, a 96-well plate (Corning Costar) was coated overnight at 4 °C with mAb to IL-2. Triplicates of IL-2 standards and supernatants from cultured cells were then added to the plate, followed by 2 h incubation at room temperature. A biotinylated polyclonal antibody to IL-2 was added to the plate, followed by incubation for 1 h at room temperature, and then Avidin-HRP was added, followed by incubation for 30 min at room temperature. The amount of bound avidin was then assessed with TMB peroxidase that was acidified by 2 N H2SO4. The absorbance of each well at 450 nm was then measured with a spectrophotometric plate reader (BioTek).
Quantification of cellular glutathione (GSH) levels
[00427] Primary human T cells (2.5 million cells/mL, 20 mL per condition) were treated as indicated, harvested by centrifugation (500g, 8 min, 4 °C), and washed twice with PBS. To the cell pellet was added 75 μL of lysis buffer. After vortexing, the samples were incubated on ice for 15 min, then harvested by centrifugation (16,000g, 10 min, 4 °C). Protein concentrations were adjusted to at least 5 mg/mL and the assay performed according to manufacturer's instructions (Sigma-Aldrich, CS1020). Data represent mean ± SE for two biological replicates.
Protein labeling and click chemistry
[00428] Cells were lysed by sonication and diluted to a concentration of 2 mg protein/mL. Protein concentrations were measured with the Bio-Rad DC™ protein assay reagents A and B (5000113, 5000114; Bio-Rad). 500 μL of proteome sample was treated with 100 μM of IA-alkyne probe using 10 μL of a 10 mM DMSO stock. The labeling reactions were incubated at room temperature for 1 h, after which the samples were conjugated to isotopically-labeled TEV-cleavable tags (TEV tags) by copper-catalyzed azide-alkyne cycloaddition (CuAAC or 'click chemistry'). 60 μL of heavy click chemistry reaction mixture was added to the DMSO-treated control sample and 60 μL of the light reaction mixture was added to the compound-treated sample. The click reaction mixture comprised TEV tags (10 μL of a 5 mM stock, light (fragment treated) or heavy (DMSO treated)), CuSO4 (10 μL of a 50 mM stock in water), and TBTA (30 μL of a 1.7 mM stock in 4:1 tBuOH:DMSO). To this was added TCEP (10 μL of a 50 mM stock). The reaction was performed for 1 h at room temperature.
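The component volumes listed for the click reaction mixture can be checked against the stated 60 μL total, and approximate final concentrations in the sample can be derived from them. The sketch below is illustrative only: the function names are hypothetical, and it assumes the mixture is added to the 500 μL proteome sample while ignoring the small volume of probe stock added earlier.

```python
# (component, volume in uL, stock concentration in mM), as listed in the text
CLICK_MIX = [
    ("TEV tag", 10.0, 5.0),
    ("CuSO4", 10.0, 50.0),
    ("TBTA", 30.0, 1.7),
    ("TCEP", 10.0, 50.0),
]


def mix_volume_ul(mix):
    """Total volume (uL) of the click reaction mixture."""
    return sum(volume for _, volume, _ in mix)


def final_concentrations_um(mix, sample_volume_ul=500.0):
    """Approximate final concentration (uM) of each component after the
    mixture is added to the sample. Assumption: probe volume is ignored."""
    total_ul = sample_volume_ul + mix_volume_ul(mix)
    return {
        name: volume * stock_mm * 1000.0 / total_ul
        for name, volume, stock_mm in mix
    }


final = final_concentrations_um(CLICK_MIX)
print(f"mixture volume: {mix_volume_ul(CLICK_MIX):g} uL")
for name, conc in final.items():
    print(f"{name}: ~{conc:.0f} uM")
```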
[00429] The light- and heavy-labeled samples were then centrifuged (16,000g, 5 min, 4 °C) to harvest the precipitated proteins. The resulting pellets were resuspended in 500 μL of cold methanol by sonication and the heavy and light samples combined pairwise. Combined pellets were then washed with cold MeOH, after which the pellet was solubilized in PBS containing 1.2% SDS by sonication. The samples were heated at 90 °C for 5 min and subjected to streptavidin enrichment of probe-labeled proteins, sequential on-bead trypsin and TEV digestion, and liquid chromatography-tandem mass spectrometry (LC-MS/MS) according to the published isoTOP-ABPP protocols.
Peptide and protein identification
[00430] RAW Xtractor (version 1.9.9.2) was used to extract the MS2 spectra data from the raw files. MS2 data were searched against a reverse concatenated, nonredundant variant of the Human UniProt database (release-2012_11) using the ProLuCID algorithm. Cysteine residues were searched with a static modification for carboxyamidomethylation (+57.02146) and up to one differential modification for either the light or heavy TEV tags (+464.28595 or +470.29976, respectively). Peptides were required to have at least one tryptic terminus and to contain the TEV modification. ProLuCID data were filtered through DTASelect (version 2.0) to achieve a peptide false-positive rate below 1%.
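The two differential modification masses imply a fixed mass offset between the heavy and light forms of each tagged peptide, which is what allows the pair to be matched at the MS1 level. The sketch below computes that offset and the corresponding m/z spacing for a given charge state; the function name is hypothetical, and the calculation assumes exactly one tag per peptide, consistent with the search settings above.

```python
LIGHT_TAG = 464.28595  # light TEV tag modification mass (from the text)
HEAVY_TAG = 470.29976  # heavy TEV tag modification mass (from the text)


def pair_spacing_mz(charge):
    """m/z spacing between the heavy and light forms of a peptide carrying
    a single TEV tag, for an ion of the given positive charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (HEAVY_TAG - LIGHT_TAG) / charge


for z in (1, 2, 3):
    print(f"charge {z}+: heavy/light spacing = {pair_spacing_mz(z):.4f} m/z")
```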
R value calculation and processing
[00431] The quantification of heavy/light ratios (isoTOP-ABPP ratios, R values) was performed by in-house CIMAGE software using default parameters (3 MS1's per peak and signal to noise threshold of 2.5). Site-specific engagement of electrophilic compounds was assessed by blockade of IA-alkyne probe labeling. For peptides that showed a ≥95% reduction in MS1 peak area from the compound-treated proteome (light TEV tag) when compared to the DMSO-treated proteome (heavy TEV tag), a maximal ratio of 20 was assigned. Overlapping peptides with the same labeled cysteine (for example, same local sequence around the labeled cysteines but different charge states, MudPIT segment numbers, or tryptic termini) were grouped together, and the median ratio from each group was recorded as the R value of the peptide for that run.
Analysis of cysteine conservation
[00432] For each human protein containing a DMF-sensitive cysteine, the mouse homolog was identified and the human and mouse sequences aligned using the Align tool on UniProt.
Immunofluorescent analysis of NF-κB translocation
[00433] Primary human T cells were harvested and stimulated as described above (500,000 cells/well), with concomitant treatment with DMSO or DMF for 60 min. Cells were pelleted (500g, 3 min, 4 °C), then each well was resuspended in 50 μL PBS and added to Poly-D-lysine coated coverslips (12 mm; 354087; Corning® BioCoat™). Cells were allowed to adhere to the coverslips for 30-60 min at 4 °C. Coverslips were transferred to a 6-well plate and fixed with 4% PFA (157-4-100; Electron Microscopy Sciences) at room temperature for 10 min. After washing three times with PBS, cells were permeabilized with 0.1% Triton X-100 in PBS at room temperature for 10 min. Cells were washed three times with PBS, then placed cell-side-up on Parafilm. To each coverslip was added 150 μL of blocking buffer (2% BSA in PBS), and the slides were blocked for 30 min at room temperature.
[00434] The blocking buffer was aspirated, coverslips placed face down in 40 μL of antibody buffer (anti-human p65; p65Ab; FivePhoton Biochemicals; 1:500 dilution in blocking buffer), and allowed to stain overnight at 4 °C in a wet chamber. Coverslips were washed three times with PBS, then incubated with 150 μL of secondary antibody (anti-rabbit Alexa Fluor 488; A21441; Life Technologies; 1:200 dilution in PBS) for 2 h at room temperature. After washing three times with PBS, 150 μL of Hoechst counter stain was added (5 μg/mL in PBS) and coverslips were left at room temperature for 30-60 min. Cells were again washed with PBS three times, then stained with Alexa Fluor 555 Phalloidin red (8953S; Cell Signaling; 1:20 dilution in PBS). The coverslips were washed with PBS a final three times, then transferred to SuperFrost Plus slides (12-550-15, Fisherbrand) spotted with 10 μL of Prolong® Gold Antifade Mountant (P36934, ThermoFisher). The circumference of each coverslip was sealed with clear nail polish (72180; Electron Microscopy Sciences).
[00435] Images were acquired using a Zeiss 780 laser scanning confocal microscope with a 63x objective (0.3 μm image step size) and the automated stitching module to merge (10% overlap) and create a three-dimensional multi-paneled mega-image composite. The composite image was gathered as a z-series of at least 9 individual image panels that were auto-merged using ZEN software. The mega-image composite was projected into a maximum intensity projection in the ZEN software, then analyzed using the colocalization module in ZEN (Zeiss Inc.) and Image Pro Premier (Media Cybernetics). The Manders' correlation coefficients (MCC), specifically M1 and M2, between the various combinations of fluorescent labels (Rhodamine Phalloidin vs NFκB-p65 and Hoechst vs NFκB-p65) were calculated in ZEN (Zeiss Inc.) per cell and displayed as a percent. Each cell was outlined using the region of interest module, and the software then calculated the M1 and M2 correlation coefficients between the two fluorophores and tabulated the results. The fluorescent signal dynamic range and threshold cutoff of real signal were defined by multiple background and secondary controls. Correlation coefficient values were compared using Image Pro Premier (IPP) (Media Cybernetics), where images were imported as raw calibrated czi files and analyzed using a similar module in IPP. Similar results were obtained with both platforms (not shown). Data represent mean ± SE for two to three biological replicates.
Subcloning and mutagenesis
[00436] QuikChange site-directed mutagenesis was performed on a pEF4 His A plasmid containing the full length human PKCθ (residues 1-707). The PKCθ insert was excised using BamHI and XhoI, then ligated into a pMIG vector.
PKCθ retroviral transduction and stimulation
[00437] Platinum-E packaging cells were plated in a six-well plate in 2 mL RPMI-1640 medium plus 10% FBS. After 24 h, cells were transfected with empty pMIG vector or the appropriate PKCθ-expressing vector DNA (3 μg) with TransIT-LT1 transfection reagent (Mirus Bio). After overnight incubation, the medium was replaced and cultures were maintained for another 24 h. Retroviral supernatants were then collected and filtered, supplemented with 8 μg/mL of polybrene and used to infect CD4+ T cells that had been pre-activated for 24 h with plate-bound monoclonal antibody to CD3 (8 μg/mL) and CD28 (8 μg/mL). After centrifuging plates for 1.5-2 h at 2,000 r.p.m., cell supernatants were replaced by fresh RPMI-1640 supplemented with 10% FBS and recombinant IL-2 (100 U/mL). Cells were incubated for another 24 h at 37 °C. On day 3, cells were washed, moved to new plates and cultured in RPMI-1640 medium containing 10% FBS and recombinant IL-2 (100 U/mL) without stimulation for 2 additional days before restimulation with mAb to CD3 alone or plus mAb to CD28.
PKCθ immunoprecipitation and immunoblot analysis
[00438] Cells were lysed in 1% (wt/vol) digitonin (D141, Sigma) lysis buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5 mM EDTA) supplemented with protease inhibitors (10 μg/mL aprotinin, 10 μg/mL leupeptin and 1 mM PMSF) and phosphatase inhibitors (5 mM sodium pyrophosphate and 1 mM Na3VO4). Supernatants were incubated for 2 h with 1 μg anti-CD28 mAb, and proteins were immunoprecipitated overnight at 4 °C with protein G-Sepharose beads (GE Healthcare). The immunoprecipitated proteins were resolved by SDS-PAGE, transferred onto a PVDF membrane and probed overnight at 4 °C with primary antibodies, followed by incubation for 1 h at room temperature with horseradish peroxidase (HRP)-conjugated secondary antibodies. Signals were visualized by enhanced chemiluminescence (ECL; GE Healthcare) and were exposed to X-ray film. Densitometry analysis was performed with ImageJ software. Immunoblotting antibodies to CD28 (C-20) and PKCθ (C-19) were obtained from Santa Cruz Biotechnology.
DMF, but not MMF, inhibits T cell activation
[00439] Multiple sclerosis is an autoimmune disease with a prominent T cell component; as such, it was reasoned that DMF may, in some cases, impact primary T cell activation. Consistent with this, previous reports have shown that DMF inhibits cytokine release from mouse splenocytes and promotes a Th2 phenotype via induction of IL-10-producing type II dendritic cells. The effects of DMF and MMF (Fig. 23A) were tested on cytokine release from primary human T cells activated with anti-CD3 and anti-CD28 antibodies. Secretion of IL-2 was strongly inhibited by DMF, but not MMF (Fig. 23B). DMF, but not MMF or the non-electrophilic analogue dimethyl succinate (DMS, Fig. 23A), also blocked the expression of the early activation markers CD25 (Fig. 23C, D) and CD69 (Fig. 23E) in anti-CD3, anti-CD28-stimulated T cells. The blockade of T cell activation by DMF was concentration-dependent, with 10, 25 and 50 μM of the drug producing marginal/negligible, partial, and near-complete inhibition, respectively (Fig. 23B, D, E). In some instances, the effects of DMF on cytokine release and activation markers occurred at concentrations of the drug that did not impair T cell viability (Fig. 24). Similar results were obtained with primary splenic T cells from C57BL/6 mice, the activation of which was also suppressed by DMF, but not MMF or DMS (Fig. 25). Of note, the inhibitory effects of DMF were reduced if the drug was added two hours after anti-CD3, anti-CD28 stimulation and completely ablated if the drug was added six hours after stimulation (Fig. 23F), suggesting that DMF inhibits an early event(s) in the T cell activation pathway.
DMF effects on T cell activation are independent of Nrf2 and GSH
[00440] DMF is thought to produce neuroprotective effects through activating the Nrf2-Keap1 pathway, but whether this pathway contributes to the immunomodulatory effects of DMF is unclear. A recent study showed that DMF inhibits pro-inflammatory cytokine release from primary mouse splenocytes and this effect was comparable in wild type and Nrf2(-/-) splenocytes (Gillard, et al., "DMF, but not other fumarates, inhibits NF-kappaB activity in vitro in an Nrf2-independent manner," J. Neuroimmunol. 283, 74-85 (2015)). Consistent with this, it was found that the activation of Nrf2(+/+) and (-/-) T cells was similarly sensitive to inhibition by DMF (Fig. 26A). In some instances, DMF may also impair T cell activation through depleting glutathione (GSH), and, indeed, DMF-treated primary human T cells showed a significant decrease in cellular GSH content (Fig. 26B). Significant reductions in GSH were, however, also observed with the GSH synthesis inhibitor buthionine sulfoximine (BSO), which had no effect on T cell activation (Fig. 26C, D). In some cases, these data indicate that the blockade of T cell activation by DMF involves processes other than Nrf2 activation or GSH depletion.
Chemical proteomic discovery of DMF-sensitive Cys residues in T cells
[00441] The inhibition of T cell activation by DMF, but not the non-electrophilic analogues MMF and DMS, pointed to a mechanism that involves covalent reactivity with one or more proteins important for T cell function. As such, a global inventory of DMF-sensitive Cys residues in primary human and mouse T cells was examined using the quantitative chemical proteomic platform isoTOP-ABPP. In this method, DMF is evaluated for its ability to block the reactivity of proteinaceous Cys residues with the general electrophilic probe iodoacetamide-alkyne (IA-alkyne). Using isotopically differentiated azide-biotin tags (containing a TEV protease-cleavable linker), Cys residues are identified and comparatively quantified for their IA-reactivity in cells treated with DMF versus DMSO control. Primary advantages of the isoTOP-ABPP platform include: 1) the competing electrophile does not itself need to be chemically altered for target identification, which is particularly beneficial when studying very small compounds like DMF; and 2) isotopic labeling occurs late in the sample processing, which facilitates the quantitative analysis of primary cells and tissues that are not readily amenable to metabolic labeling.
[00442] The isoTOP-ABPP method was performed on primary human T cells treated with DMSO or DMF (50 μM, 4 h). Five independent replicates were performed, and the total aggregate number of unique quantified peptides and proteins began to plateau by the fourth and fifth replicate (Fig. 28), indicating that we approached maximal proteomic coverage of IA-reactive Cys residues in human T cells under the conditions employed. Of the more than 2400 quantified Cys residues, a small fraction (~40) showed substantial reductions (> four-fold; isoTOP-ABPP ratio (R value) > 4) in IA-alkyne labeling in DMF-treated T cells (Fig. 27A, and Tables 7-9). Similar isoTOP-ABPP analyses revealed that none of the ~40 DMF-sensitive Cys residues were altered by MMF (50 μM, 4 h) or BSO (2.5 mM, 4 h) treatment, which, in general, affected the reactivity of very few Cys residues across the T cell proteome (Fig. 27A, B and Fig. 29, respectively). The Cys residues targeted by DMF exhibited concentration- (Fig. 27C and Tables 8-9) and time- (Fig. 27D) dependent increases in DMF sensitivity, as revealed by isoTOP-ABPP experiments performed with human T cells treated with lower concentrations of DMF (10 and 25 μM, 4 h) or for shorter periods of time (50 μM DMF, 1 or 2 h). Of note, very few DMF-sensitive Cys residues were detected in T cells treated with 10 μM DMF, a concentration of the drug that also had limited impact on T cell activation (Fig. 23B, D, E). These concentration- and time-dependent studies uncovered another ~10 DMF-sensitive Cys residues that were not detected in the original 50 μM/4 h isoTOP-ABPP experiments, likely reflecting the stochastic nature of peptide discovery in data-dependent MS experiments.
[00443] The possibility that some of the alterations in Cys reactivity following DMF treatment could reflect changes in protein expression was considered; however, multiple Cys residues were quantified by isoTOP-ABPP for the majority of proteins harboring DMF-sensitive Cys residues, and, in most of these cases, the additional quantified Cys residues were clearly unaffected by DMF treatment (Fig. 27E). The DNA-activated protein kinase PRKDC is shown as one representative example, for which IA-alkyne reactivity was quantified for several Cys residues, only one of which (C4045) was blocked by DMF (Fig. 27F). These results indicate that DMF directly impaired the IA-alkyne reactivity of specific Cys residues rather than indirectly affecting protein expression in human T cells.
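The R value rules given in the methods (heavy/light ratio, a cap of 20, and the median over overlapping peptides), together with the four-fold sensitivity cutoff and the within-protein control described here, can be sketched as follows. This is an illustration under stated assumptions and not the CIMAGE implementation: the function names, the `UNAFFECTED_CUTOFF` value, the residue label `C_other`, and all peak areas are hypothetical.

```python
from statistics import median

MAX_RATIO = 20.0          # cap assigned for >= 95% loss of light signal (text)
SENSITIVE_CUTOFF = 4.0    # DMF-sensitive if R value > 4 (text)
UNAFFECTED_CUTOFF = 2.0   # assumption: "clearly unaffected" means R <= 2


def peptide_ratio(heavy_area, light_area):
    """Heavy (DMSO) / light (compound) MS1 peak-area ratio, capped at 20.
    A >= 95% reduction in light signal corresponds to a ratio >= 20."""
    if heavy_area <= 0:
        raise ValueError("heavy peak area must be positive")
    if light_area <= 0:
        return MAX_RATIO
    return min(heavy_area / light_area, MAX_RATIO)


def site_r_values(measurements):
    """Group peptides by (protein, cysteine) and take the median ratio."""
    grouped = {}
    for protein, cys, heavy, light in measurements:
        grouped.setdefault((protein, cys), []).append(peptide_ratio(heavy, light))
    return {site: median(ratios) for site, ratios in grouped.items()}


def site_specific_hits(r_values):
    """Sensitive sites whose protein also has an unaffected cysteine, which
    argues against a change in protein expression."""
    by_protein = {}
    for (protein, cys), r in r_values.items():
        by_protein.setdefault(protein, {})[cys] = r
    hits = []
    for protein, sites in by_protein.items():
        has_control = any(r <= UNAFFECTED_CUTOFF for r in sites.values())
        for cys, r in sites.items():
            if r > SENSITIVE_CUTOFF and has_control:
                hits.append((protein, cys))
    return sorted(hits)


# Made-up peak areas for illustration only.
example = [
    ("PRKDC", "C4045", 1000.0, 40.0),    # ratio 25, capped to 20
    ("PRKDC", "C4045", 1000.0, 100.0),   # ratio 10
    ("PRKDC", "C_other", 1000.0, 950.0), # hypothetical unaffected site
]
r_values = site_r_values(example)
hits = site_specific_hits(r_values)
print(r_values)
print(hits)
```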
Conservation of DMF-sensitive Cys residues in human and mouse T cells
[00444] Considering that DMF impaired the activation of both human and mouse T cells, it was surmised that at least a subset of Cys residues potentially important for mediating DMF action were conserved in humans and mice. Consistent with this, approximately two-thirds of the DMF-sensitive Cys residues discovered in human T cells are conserved in mice (Fig. 30A and Table 7). The isoTOP-ABPP experiments were performed on mouse T cells treated with DMF (50 μM, 4 h), and it was found that the vast majority (> 80%) of the conserved, quantified Cys residues sensitive to DMF in human T cells were also blocked (R values > 4) by this drug in mouse T cells (Fig. 30B and Tables 8-9). These results indicate that DMF targets a similar array of Cys residues in human and mouse T cells, pointing to a specific set of proteins as candidate sites of action for this electrophilic drug.
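Whether a human cysteine is conserved in the mouse homolog can be read off a pairwise alignment by walking the gapped human sequence to the residue of interest and inspecting the aligned mouse character. The sketch below assumes the alignment has already been produced (for example, exported from an alignment tool) as two equal-length gapped strings; the function names and the toy sequences are hypothetical.

```python
def aligned_residue(human_aln, mouse_aln, human_position):
    """Return the mouse character aligned to a 1-based human residue position.
    Both inputs are equal-length gapped strings in which '-' marks a gap."""
    if len(human_aln) != len(mouse_aln):
        raise ValueError("aligned sequences must have equal length")
    count = 0
    for human_char, mouse_char in zip(human_aln, mouse_aln):
        if human_char != "-":
            count += 1
            if count == human_position:
                return mouse_char
    raise ValueError("position lies beyond the end of the human sequence")


def is_conserved_cys(human_aln, mouse_aln, human_position):
    """True if the human residue is Cys and the aligned mouse residue is Cys."""
    human_char = aligned_residue(human_aln, human_aln, human_position)
    mouse_char = aligned_residue(human_aln, mouse_aln, human_position)
    return human_char == "C" and mouse_char == "C"


# Toy alignment for illustration only (not real protein sequences).
human = "MKC-LTCA"
mouse = "MKCALT-A"
print(is_conserved_cys(human, mouse, 3))  # human C3 aligned to mouse C
print(is_conserved_cys(human, mouse, 6))  # human C6 aligned to a gap
```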
[00445] The proteins containing DMF-sensitive Cys residues, as a whole, originated from several functional classes, including enzymes, channels, transporters, scaffolding proteins, and transcriptional regulators (Fig. 30C). Among these proteins were several with important immune functions (Table 7). DMF-sensitive Cys residues were found, for instance, in multiple proteins that are either components or regulators of the NF-κB signaling pathway, including IκB kinase β (IKKβ or IKBKB), protein kinase C-θ (PKCθ or PRKCQ), and TNFAIP3 (Table 7). Consistent with these sites of DMF action and potentially others within the NF-κB pathway, it was found that DMF treatment blocked p65 nuclear translocation (Fig. 31), as has been shown in other cell types. DMF-sensitive Cys residues were also found in: 1) the adenosine deaminase enzyme ADA, deleterious mutations in which cause severe combined immunodeficiency in humans, 2) the transcription factors interferon regulatory factors-4 (IRF4) and -8 (IRF8), and 3) the immunomodulatory cytokine IL-16 (Table 7).
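The p65 nuclear translocation readout relies on the Manders coefficients M1 and M2 described in the imaging methods, for example between the p65 and Hoechst channels within a cell outline. A minimal sketch of the standard definition is given below; it operates on flat lists of per-pixel intensities for one region of interest, the function name and thresholds are hypothetical, and it is not the ZEN or Image Pro Premier implementation.

```python
def manders(channel_a, channel_b, threshold_a=0.0, threshold_b=0.0):
    """Manders coefficients for two equal-length lists of pixel intensities.
    M1: fraction of channel A intensity in pixels where B exceeds its threshold.
    M2: fraction of channel B intensity in pixels where A exceeds its threshold."""
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must cover the same pixels")
    total_a = sum(channel_a)
    total_b = sum(channel_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each channel needs non-zero total intensity")
    pixels = list(zip(channel_a, channel_b))
    m1 = sum(a for a, b in pixels if b > threshold_b) / total_a
    m2 = sum(b for a, b in pixels if a > threshold_a) / total_b
    return m1, m2


# Toy intensities for four pixels in one cell (illustration only).
p65 = [10.0, 20.0, 0.0, 30.0]
hoechst = [0.0, 5.0, 5.0, 10.0]
m1, m2 = manders(p65, hoechst)
print(f"M1 = {100 * m1:.1f}% of p65 signal overlaps Hoechst")
print(f"M2 = {100 * m2:.1f}% of Hoechst signal overlaps p65")
```

Reported as a percent per cell, a higher M1 for p65 versus Hoechst would correspond to more p65 signal residing in the nucleus.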
DMF perturbs a CXXC motif critical for PKCθ-CD28 interactions and T cell activation
[00446] PKCθ is a key kinase involved in T cell signaling at the immunological synapse, where engagement of the T cell receptor and CD28 co-receptor initiates activation of multiple downstream pathways, including NF-κB. T cells from PKCθ(-/-) mice are defective in early activation. The isoTOP-ABPP analysis identified two DMF-sensitive Cys residues - C14 and C17 - in human (Fig. 32A) and mouse (Fig. 33A) T cells, and these Cys residues showed time- and concentration-dependent increases in DMF sensitivity (Fig. 33B, C), but were not affected by MMF treatment (Fig. 33D). Because C14 and C17 are found on the same tryptic peptide, it was difficult to distinguish whether one or both residues were sensitive to DMF treatment, but, in certain isoTOP-ABPP experiments, this tryptic peptide appeared to migrate as two adjacent peaks, both of which showed DMF sensitivity (Fig. 32A), suggesting that the IA-alkyne reactivity of both C14 and C17 is blocked by DMF treatment. The isoTOP-ABPP experiments also identified a third Cys in PKCθ (C322) that was unaffected by DMF treatment (Fig. 32A), indicating that DMF caused reductions in C14/17 reactivity rather than changes in PKCθ expression. C14 and C17 form a CXXC motif found in the C2 domain of PKCθ, but not other PKC isoforms (Fig. 32B, C). The C2 domain of PKCθ was recently shown to bind phosphotyrosine-containing peptides and has been postulated to stabilize plasma membrane association of PKCθ at the immunological synapse. Upon TCR/CD28 stimulation, PKCθ is recruited to the immunological synapse, where it interacts with the CD28 co-receptor by associating with the CD28 cytoplasmic tail. It was found that DMF, but not MMF, blocked the interaction between PKCθ and CD28 in mouse T cells (Fig. 32D). Retroviral transduction was used to reconstitute PKCθ(-/-) T cells with either WT PKCθ or a C14S/C17S-PKCθ mutant, and it was found that the mutant protein failed to associate with CD28 (Fig. 32E). PKCθ(-/-) T cells reconstituted with the C14S/C17S-PKCθ mutant also showed impaired expression of CD25 (Fig. 32F) and IL-2 release (Fig. 32G) compared to cells reconstituted with WT PKCθ following anti-CD3, anti-CD28 treatment. Taken together, these data indicate that the C14/C17 motif within the C2 domain of PKCθ regulates localization of this kinase to the immunological synapse, and disruption of this motif by DMF or genetic mutation impairs T cell activation.
Sensitive cysteine residue sites in DMF toward probe ADA
[00447] The DMF-sensitive Cys residue C75 of ADA is located between two amino acids - G74 and R76 - that, when mutated in humans, contribute to an immunosuppressive phenotype. The amino acid 74-76 region of ADA is over 25 angstroms from the active site of the enzyme (Fig. 34), suggesting that it performs a non-catalytic function possibly perturbed by DMF reactivity. The DMF-sensitive Cys in IKBKB is located in the leucine-zipper domain and is distinct from another electrophile-sensitive Cys residue, C179, found in the active site of this kinase.
[00448] Table 1 illustrates a list of liganded cysteines and their reactivity profiles with the fragment electrophile library from isoTOP-ABPP experiments performed in cell lysates (in vitro). Table 1 further shows the accession number (or the protein identifier) of the protein.
Table 1A
Identifier | Protein | SEQ ID NO: | 2_500uM_invitro_231 | 2_500uM_invitro_ramos | 3_500uM_invitro_231 | 3_500uM_invitro_ramos | 4_250uM_invitro_231 | 4_250uM_invitro_ramos
Q99873_C109 | PRMT1 Protein arginine N-methyltransferase 1 | 17 | 3.1 | 12.6 | 1.3 | 1.8 | 1.1 | 0.2
P24752_C119 | ACAT1 Acetyl-CoA acetyltransferase, mitochondrial | 22 | 3.9 | 3.2 | 2.1 | 5.3 | 1.8 | 0.2
P09211_C48 | GSTP1 Glutathione S-transferase P | 25 | 3.1 | 1.6 | 2.9 | 1.7 | 3.2 | 0.5
O14980_C34 | XPO1 Exportin-1 | 28 | 2.8 | 4.6 | 1.4 | 1.6 | 1.0 | 0.2
P24752_C196 | ACAT1 Acetyl-CoA acetyltransferase, mitochondrial | 33 | 12.4 | 9.3 | 1.7 | 3.1 | 1.9 | 0.5
Q15084_C55 | PDIA6 Protein disulfide-isomerase A6 | 51 | 6.3 | 7.1 | 17.9 | 14.9 | 1.3 | 1.2
P24752_C413 | ACAT1 Acetyl-CoA acetyltransferase, mitochondrial | 56 | 18.1 | ~ | 15.3 | 20.0 | 3.3 | ~
P63244_C182 | GNB2L1 Guanine nucleotide-binding protein subunit beta-2- | 85 | 1.2 | 1.2 | 20.0 | 4.7 | 0.9 | ~
P24752_C126 | ACAT1 Acetyl-CoA acetyltransferase, mitochondrial | 89 | 20.0 | ~ | 2.6 | 2.4 | 2.1 | 1.2
Q15084_C190 | PDIA6 Protein disulfide-isomerase A6 | 96 | ~ | 15.4 | 19.9 | 13.1 | 1.8 | 1.2
Q8TAQ2_C145 | SMARCC2 SWI/SNF complex subunit SMARCC2 | 119 | 8.5 | 14.0 | 9.7 | 6.6 | 1.3 | ~
P68036_C86 | UBE2L3 Ubiquitin-conjugating enzyme E2 L3 | 120 | 2.8 | 2.5 | 1.2 | 3.0 | 1.2 | ~
P15374_C95 | UCHL3 Ubiquitin carboxyl-terminal hydrolase isozyme L3 | 146 | ~ | 1.8 | 1.7 | 1.4 | ~ | 0.9
Q16763_C118 | UBE2S Ubiquitin-conjugating enzyme E2 S | 187 | 4.2 | 6.9 | 1.3 | 1.5 | 1.2 | ~
Q16822_C306 | PCK2 Phosphoenolpyruvate carboxykinase | 192 | ~ | ~ | 10.6 | 0.9 | 2.2 | ~
O14980_C528 | XPO1 Exportin-1 DLLGLCEQK K.DLLGLC*EQKR.G | 218 | 4.9 | 3.1 | 20.0 | 20.0 | 0.9 | 1.1
O00170_C122 | AIP AH receptor-interacting protein | 240 | 12.5 | 6.7 | 7.0 | 3.1 | 2.5 | 0.3
O75874_C269 | IDH1 Isocitrate dehydrogenase | 260 | 20.0 | ~ | 20.0 | 1.0 | 1.6 | 0.6
O75362_C286 | ZNF217 Zinc finger protein 217 | 268 | 20.0 | 20.0 | 1.4 | ~ | 1.7 | ~
P40763_C259 | STAT3 Signal transducer and activator of transcription 3 | 283 | ~ | ~ | 2.5 | 2.9 | 2.0 | 1.9
Q9Y3Z3_C522 | SAMHD1 SAM domain and HD domain-containing protein 1 | 288 | ~ | 4.0 | 2.6 | ~ | 1.5 | ~
P16455_C150 | MGMT Methylated-DNA--protein-cysteine methyltransferase | 291 | ~ | 20.0 | ~ | 17.1 | ~ | 6.9
Q96GG9_C115 | DCUN1D1 DCN1-like protein 1 | 293 | ~ | 20.0 | ~ | 5.5 | ~ | ~
P00813_C75 | ADA Adenosine deaminase | 296 | ~ | 7.3 | ~ | 1.5 | ~ | 0.1
Q14790_C360 | CASP8 Caspase-8 | 335 | 9.8 | ~ | 3.3 | 2.3 | 12.3 | ~
Q15306_C194 | IRF4 Interferon regulatory factor 4 | 338 | ~ | ~ | ~ | 20.0 | ~ | ~
Q6L8Q7_C108 | PDE12 2',5'-phosphodiesterase 12 | 339 | ~ | 5.1 | 3.6 | 1.8 | ~ | ~
P48735_C308 | IDH2 Isocitrate dehydrogenase | 360 | 1.4 | ~ | 2.2 | ~ | 0.7 | ~
Q86UV5_C39 | USP48 Ubiquitin carboxyl-terminal hydrolase 48 | 381 | ~ | 7.1 | 1.8 | 1.5 | 1.4 | 1.4
P50851_C1704 | LRBA Lipopolysaccharide-responsive and beige-like ancho | 388 | ~ | 3.4 | ~ | 3.6 | ~ | 1.5
O94953_C694 | KDM4B Lysine-specific demethylase 4B | 395 | 20.0 | 5.1 | 20.0 | 7.4 | 20.0 | ~
P19447_C342 | ERCC3 TFIIH basal transcription factor complex helicase | 402 | 20.0 | ~ | 12.1 | ~ | 1.6 | 1.1
Q00535_C157 | CDK5 Cyclin-dependent kinase 5 | 407 | ~ | 3.2 | ~ | 20.0 | ~ | 1.3
Q9UPT9_C171 | USP22 Ubiquitin carboxyl-terminal hydrolase 22 | 413 | ~ | 7.1 | ~ | ~ | ~ | 4.0
Q9HB90_C377 | RRAGC Ras-related GTP-binding protein C | 417 | 20.0 | ~ | 3.7 | ~ | 3.5 | ~
P50851_C2675 | LRBA Lipopolysaccharide-responsive and beige-like ancho | 426 | ~ | 3.0 | ~ | 5.1 | 1.1 | ~
Q9NYL2_C22 | MLTK Mitogen-activated protein kinase kinase kinase MLT | 430 | 20.0 | ~ | 1.8 | ~ | 20.0 | ~
Q5T1V6_C414 | DDX59 Probable ATP-dependent RNA helicase DDX59 | 439 | 20.0 | 20.0 | ~ | 6.0 | ~ | ~
Q9HB90_C358 | RRAGC Ras-related GTP-binding protein C | 452 | 20.0 | 20.0 | 1.2 | ~ | 1.5 | ~
P16455_C145 | MGMT Methylated-DNA--protein-cysteine methyltransferase | 470 | ~ | 20.0 | ~ | 20.0 | ~ | 20.0
Q9Y5T5_C205 | USP16 Ubiquitin carboxyl-terminal hydrolase 16 | 474 | 20.0 | ~ | ~ | 8.8 | ~ | 20.0
Q02556_C306 | IRF8 Interferon regulatory factor 8 | 513 | ~ | ~ | ~ | 5.6 | ~ | ~
Q15910_C503 | EZH2 Histone-lysine N-methyltransferase EZH2 | 557 | ~ | 2.8 | ~ | 2.0 | ~ | 1.5
Q96RU2_C171 | USP28 Ubiquitin carboxyl-terminal hydrolase 28 | 569 | ~ | 1.1 | ~ | 1.7 | ~ | ~
Q16877_C159 | PFKFB4 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata | 582 | ~ | ~ | ~ | 12.9 | 1.4 | ~
P04150_C302 | NR3C1 Glucocorticoid receptor | 600 | 2.3 | ~ | 7.6 | ~ | ~ | ~
Q96JH7_C219 | VCPIP1 Deubiquitinating protein VCIP135 | 601 | ~ | 1.1 | 20.0 | 15.1 | ~ | ~
P48200_C137 | IREB2 Iron-responsive element-binding protein 2 | 603 | ~ | 10.5 | ~ | ~ | ~ | ~
O00622_C39 | CYR61 Protein CYR61 | 612 | ~ | ~ | 4.0 | ~ | 2.8 | ~
Figure imgf000174_0001
Table 1B
Identifier 5_500uM_invitro_231 5_500uM_invitro_ramos 6_500uM_invitro_231 7_500uM_invitro_231 7_500uM_invitro_ramos 8_500uM_invitro_231 8_500uM_invitro_ramos 9_500uM_invitro_231 9_500uM_invitro_ramos 10_500uM_invitro_231
Q99873_C109 0.9 1.3 1.1 1.1 1.1 1.2 1.2 1.0 1.5 2.3
P24752_C119 1.8 2.4 1.1 2.3 2.0 1.8 0.7 2.2 3.1 1.5
P09211_C48 2.2 2.4 1.2 1.8 0.9 2.6 2.0 2.3 1.4 1.4
O14980_C34 1.1 1.2 1.2 1.7 1.8 2.4 1.5 1.2 1.1 1.2
P24752_C196 1.5 2.2 1.0 1.9 ~ 2.7 1.5 3.1 2.1 2.5
Q15084_C55 0.9 1.1 1.0 1.0 ~ 4.7 2.2 1.4 1.0 20.0
P24752_C413 1.3 ~ 1.1 4.2 ~ 1.4 2.1 15.3 ~ 13.0
P63244_C182 1.5 1.4 1.5 1.5 ~ 0.9 1.2 0.9 0.8 1.1
P24752_C126 0.9 1.3 0.9 2.2 ~ 2.1 12.6 1.4 ~ 3.2
Q15084_C190 ~ 1.7 1.1 1.0 1.2 20.0 2.7 2.0 1.3 ~
Q8TAQ2_C145 ~ 2.8 ~ 1.0 ~ 3.1 1.7 1.0 1.6 4.2
P68036_C86 ~ 4.8 1.8 0.9 ~ 1.4 1.0 0.8 1.5 1.1
P15374_C95 1.5 1.1 1.1 0.8 ~ 20.0 1.2 1.0 1.2 2.0
Q16763_C118 2.8 ~ ~ 1.5 ~ ~ 1.1 0.8 1.0 1.7
Q16822_C306 1.7 ~ ~ 1.4 ~ 2.2 1.0 1.3 ~ 1.9
O14980_C528 20.0 ~ 0.7 0.7 ~ 1.8 ~ 0.8 0.7 0.8
O00170_C122 ~ ~ ~ ~ ~ ~ 1.3 ~ 1.1 ~
O75874_C269 ~ 0.8 ~ 1.2 ~ 12.4 ~ ~ 1.4 20.0
O75362_C286 ~ ~ 0.9 1.3 ~ 1.8 ~ 1.2 ~ 1.1
P40763_C259 ~ ~ 0.6 1.3 1.8 1.8 1.7 1.5 1.2 3.4
Q9Y3Z3_C522 ~ 5.4 ~ ~ ~ 1.8 1.1 1.0 ~ 1.8
P16455_C150 ~ 9.6 ~ ~ 20.0 ~ 15.8 ~ 4.0 ~
Q96GG9_C115 ~ ~ ~ ~ 1.6 ~ 1.4 ~ 1.3 1.3
P00813_C75 ~ 2.5 ~ ~ 1.4 ~ 1.3 1.3 1.1 ~
Q14790_C360 1.0 1.3 ~ ~ ~ ~ 1.4 ~ ~ 2.8
Q15306_C194 ~ 2.2 ~ ~ ~ ~ 2.1 ~ 1.1 ~
Q6L8Q7_C108 ~ ~ ~ ~ ~ ~ ~ 1.3 ~ ~
P48735_C308 0.8 ~ ~ 0.9 ~ 1.3 ~ 1.4 ~ 1.1
Q86UV5_C39 ~ 1.2 ~ ~ ~ ~ ~ 0.8 0.9 2.5
P50851_C1704 ~ 3.6 ~ 1.8 1.6 ~ 1.2 1.0 1.2 ~
O94953_C694 ~ ~ 1.1 ~ ~ 20.0 1.1 1.2 1.0 ~
P19447_C342 ~ 4.1 1.1 1.3 2.0 5.9 ~ 2.5 ~ 3.0
Q00535_C157 ~ 20.0 ~ ~ ~ ~ 1.3 1.2 0.8 ~
Q9UPT9_C171 ~ 20.0 ~ ~ 3.2 ~ 2.3 ~ ~ ~
Q9HB90_C377 ~ ~ ~ ~ ~ 5.1 ~ 1.9 ~ ~
P50851_C2675 ~ 20.0 ~ ~ ~ ~ 1.1 1.5 ~ ~
Q9NYL2_C22 20.0 ~ ~ 3.1 ~ ~ ~ 20.0 ~ ~
Q5T1V6_C414 ~ ~ ~ ~ ~ ~ 2.0 ~ ~ ~
Q9HB90_C358 ~ ~ 1.2 ~ ~ ~ ~ 1.9 ~ ~
P16455_C145 ~ 2.0 ~ ~ ~ ~ 20.0 ~ 20.0 ~
Q9Y5T5_C205 ~ 1.1 ~ ~ ~ ~ 20.0 ~ ~ ~
O00541_C272 ~ 12.3 ~ ~ ~ ~ 1.6 ~ ~ ~
Q02556_C306 ~ 2.6 ~ ~ 20.0 ~ 2.2 ~ 1.0 ~
Q15910_C503 ~ 20.0 1.8 ~ ~ ~ ~ 1.1 ~ ~
Q96RU2_C171 ~ ~ 0.8 ~ ~ 7.5 ~ 1.1 ~ 1.4
Q16877_C159 ~ 20.0 ~ ~ ~ ~ 1.0 1.5 ~ 1.2
P04150_C302 ~ ~ ~ ~ ~ ~ ~ ~ ~ 4.1
Q96JH7_C219 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
P48200_C137 ~ ~ ~ ~ 2.2 ~ 1.6 ~ ~ ~
O00622_C39 ~ ~ ~ ~ ~ ~ ~ 2.0 ~ ~
Q5T1V6_C453 ~ ~ ~ ~ ~ 20.0 ~ 2.1 ~ ~
P51617_C608 ~ 10.1 ~ ~ ~ ~ ~ ~ ~ ~
P42575_C370 ~ ~ ~ ~ ~ ~ 2.0 ~ 1.2 ~
P09086_C346 ~ 4.6 ~ ~ ~ ~ ~ ~ ~ ~
Q09472_C1738 ~ 20.0 ~ ~ ~ ~ ~ 1.5 ~ ~
Q01201_C109 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q70CQ2_C741 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
P41226_C599 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
P14598_C378 ~ 4.0 ~ ~ ~ ~ 1.3 ~ ~ ~
Q9C0C9_C375 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
O00622_C134 ~ ~ ~ ~ ~ ~ ~ 1.7 ~ 20.0
O00541_C361 ~ ~ ~ ~ ~ ~ ~ 20.0 ~ ~
P43403_C117 ~ ~ ~ ~ ~ 20.0 ~ ~ ~ 20.0
Q96FA3_C282 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q9UPT9_C44 ~ 20.0 ~ ~ ~ ~ 1.5 ~ ~ ~
Q9Y4C1_C251 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q70CQ2_C1090 ~ 20.0 ~ ~ ~ ~ ~ ~ ~ ~
O00622_C70 ~ ~ ~ ~ ~ 20.0 ~ ~ ~ ~
P04150_C622 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Table 1C
Identifier 10_500uM_invitro_ramos 11_500uM_invitro_231 11_500uM_invitro_ramos 12_500uM_invitro_231 12_500uM_invitro_ramos 13_500uM_invitro_231 13_500uM_invitro_ramos 14_500uM_invitro_231 14_500uM_invitro_ramos 15_500uM_invitro_231
Q99873_C109 1.3 20.0 20.0 1.4 1.3 1.0 0.9 0.8 1.4 0.9
P24752_C119 1.4 1.2 0.9 1.0 0.7 1.2 0.6 1.2 1.8 1.0
P09211_C48 0.8 1.7 1.1 2.9 1.1 1.5 1.2 3.2 2.1 2.5
O14980_C34 0.7 1.2 1.0 1.2 0.7 0.9 0.7 1.1 1.5 ~
P24752_C196 1.7 1.5 1.3 1.2 0.9 1.6 1.7 1.1 1.4 0.9
Q15084_C55 ~ 20.0 20.0 3.7 3.1 1.9 1.1 1.0 1.4 0.8
P24752_C413 1.6 1.9 1.7 1.5 1.1 5.5 7.3 1.3 1.6 1.2
P63244_C182 1.1 1.0 1.1 0.9 0.8 1.4 0.8 1.2 1.5 ~
P24752_C126 1.5 1.5 ~ 1.0 1.1 9.6 20.0 1.1 1.4 0.8
Q15084_C190 3.3 20.0 20.0 ~ 20.0 1.5 1.3 1.1 1.8 0.9
Q8TAQ2_C145 ~ 20.0 ~ 2.8 1.6 ~ 0.8 1.1 2.3 ~
P68036_C86 0.4 ~ 1.0 ~ ~ 1.1 0.8 1.0 1.1 ~
P15374_C95 ~ 0.8 0.8 ~ 0.5 ~ 0.8 1.2 1.0 ~
Q16763_C118 ~ ~ ~ ~ 0.6 ~ 0.7 0.7 1.2 ~
Q16822_C306 1.1 3.2 1.2 1.5 0.7 0.9 1.0 ~ 20.0 ~
O14980_C528 ~ 1.0 0.7 1.1 ~ ~ 0.5 20.0 20.0 0.8
O00170_C122 1.7 ~ 2.8 ~ 1.3 ~ 0.9 ~ 2.5 ~
O75874_C269 ~ 1.1 0.7 ~ ~ 2.7 ~ 0.9 1.3 ~
O75362_C286 ~ 1.2 1.2 1.0 ~ 0.9 0.9 0.9 ~ ~
P40763_C259 ~ 20.0 20.0 ~ ~ ~ 1.2 0.8 3.1 ~
Q9Y3Z3_C522 ~ 2.1 ~ ~ ~ 0.8 ~ 1.7 1.5 ~
P16455_C150 20.0 ~ 20.0 ~ 20.0 ~ 4.2 5.5 5.1 ~
Q96GG9_C115 1.0 ~ 0.9 ~ 0.6 ~ ~ 1.1 1.6 ~
P00813_C75 1.0 ~ 7.5 ~ 0.7 ~ 0.9 1.5 1.7 ~
O14933_C98 ~ ~ 3.4 ~ 1.3 ~ 1.0 ~ 2.8 ~
Q14790_C360 1.5 20.0 20.0 ~ ~ ~ ~ 1.4 1.9 ~
Q15306_C194 2.5 ~ 4.3 ~ 3.7 ~ 1.1 1.3 2.2 ~
Q6L8Q7_C108 ~ 4.2 ~ ~ 1.5 ~ 1.0 0.5 2.2 1.1
P48735_C308 ~ 1.1 ~ 1.7 ~ 1.1 ~ 4.7 ~ ~
Q86UV5_C39 ~ ~ 1.0 ~ ~ 0.8 0.6 0.8 ~ ~
P50851_C1704 0.8 ~ 1.4 ~ ~ 0.7 0.8 1.9 1.7 ~
O94953_C694 1.4 ~ 1.9 ~ ~ 1.4 1.1 ~ 2.0 ~
P19447_C342 ~ 2.9 1.5 3.0 ~ 1.0 ~ 4.7 ~ ~
Q00535_C157 ~ ~ 1.3 ~ 1.0 ~ ~ 6.1 2.3 ~
Q9UPT9_C171 3.8 ~ 8.2 ~ 4.1 ~ 0.8 4.0 4.0 ~
Q9HB90_C377 ~ ~ 1.6 ~ ~ 2.1 ~ ~ 20.0 ~
P50851_C2675 1.3 ~ 1.3 ~ 1.5 ~ ~ 1.9 2.9 ~
Q9NYL2_C22 ~ ~ ~ ~ ~ 5.2 ~ 1.6 ~ ~
Q5T1V6_C414 ~ ~ ~ ~ ~ ~ 1.2 ~ 4.0 ~
Q9HB90_C358 ~ ~ ~ ~ ~ 1.5 ~ 0.9 2.1 1.1
P16455_C145 ~ ~ 20.0 ~ ~ ~ 20.0 1.5 2.0 ~
Q9Y5T5_C205 ~ 20.0 ~ 2.2 ~ ~ 3.3 ~ ~ ~
O00541_C272 2.2 ~ 4.2 ~ ~ 1.7 ~ 2.6 ~ ~
Q02556_C306 3.6 ~ 3.1 ~ ~ ~ 0.7 2.4 2.3 ~
Q15910_C503 ~ ~ 1.3 ~ ~ ~ 0.8 0.6 1.9 0.8
Q96RU2_C171 ~ 20.0 ~ ~ ~ ~ 2.4 1.0 ~ ~
Q16877_C159 ~ ~ 0.6 ~ ~ 0.9 ~ 2.3 ~ ~
P04150_C302 ~ ~ 1.5 ~ 1.3 ~ ~ ~ 2.2 ~
Q96JH7_C219 ~ ~ ~ ~ ~ ~ ~ 0.6 1.5 ~
P48200_C137 2.1 ~ 3.7 ~ ~ ~ ~ 5.2 ~ ~
O00622_C39 ~ 9.6 ~ ~ ~ 1.2 ~ 3.6 ~ ~
Q5T1V6_C453 ~ 20.0 ~ ~ ~ ~ 1.4 2.2 ~ ~
Figure imgf000178_0001
Table 1D
Identifier 15_500uM_invitro_ramos 27_500uM_invitro_231 20_500uM_invitro_231 20_500uM_invitro_ramos 21_500uM_invitro_231 21_500uM_invitro_ramos 22_500uM_invitro_231 22_500uM_invitro_ramos 23_500uM_invitro_231 23_500uM_invitro_ramos
Q99873_C109 1.2 0.7 1.9 2.2 1.1 1.5 1.0 1.0 0.8 1.0
P24752_C119 1.2 0.9 3.6 2.6 0.9 1.1 1.0 0.8 1.7 2.0
P09211_C48 1.3 1.3 2.5 1.1 1.1 1.3 0.9 0.7 0.8 0.9
O14980_C34 1.2 0.9 1.0 1.2 0.9 1.2 1.1 ~ 3.4 5.3
P24752_C196 1.6 0.9 3.7 2.9 1.0 1.1 1.4 0.7 ~ 1.8
Q15084_C55 1.0 0.8 1.7 2.0 1.1 2.1 2.1 1.6 0.8 1.0
P24752_C413 0.9 0.9 20.0 14.0 1.4 1.4 1.1 1.1 1.1 1.2
P63244_C182 1.3 1.0 1.4 1.1 0.9 0.9 1.6 ~ 4.6 ~
P24752_C126 ~ 0.9 10.6 20.0 1.2 1.4 2.4 1.7 0.8 0.8
Q15084_C190 1.2 0.9 2.1 2.1 1.1 2.3 2.2 4.2 0.8 1.1
Q8TAQ2_C145 1.2 0.9 2.9 3.9 1.2 ~ ~ 1.2 1.3 ~
P68036_C86 1.3 1.2 1.1 1.2 1.2 2.0 ~ 0.8 7.4 13.2
P15374_C95 0.7 0.4 1.0 0.9 1.1 ~ ~ 0.7 0.8 1.1
Q16763_C118 1.2 0.8 ~ 2.9 0.9 ~ ~ 0.9 1.6 2.4
Q16822_C306 ~ 1.4 2.5 1.6 1.2 2.0 ~ ~ 3.0 ~
O14980_C528 1.0 1.2 1.4 4.0 0.9 0.8 ~ ~ ~ 2.9
O00170_C122 1.4 0.4 ~ 2.2 1.4 ~ 1.3 1.1 0.7 1.4
O75874_C269 0.6 0.4 ~ 0.6 1.5 ~ 1.0 ~ 0.7 1.1
O75362_C286 ~ 20.0 1.3 ~ 0.9 ~ ~ ~ 20.0 ~
P40763_C259 1.9 0.7 ~ 3.0 ~ 2.9 2.0 ~ 1.1 0.8
Q9Y3Z3_C522 1.6 0.8 2.1 1.3 ~ 1.4 ~ 0.8 ~ ~
P16455_C150 1.9 1.0 ~ 18.3 ~ 17.2 ~ 2.9 ~ 1.0
Q96GG9_C115 0.8 1.1 1.2 1.2 1.0 1.0 1.0 0.8 ~ 20.0
P00813_C75 1.1 1.1 ~ 1.5 ~ 1.3 ~ 1.3 ~ 1.1
O14933_C98 1.3 1.2 ~ 3.9 ~ ~ ~ 2.0 0.7 1.2
Q14790_C360 ~ 0.8 1.9 1.5 1.9 1.9 ~ ~ 1.4 1.4
Q15306_C194 2.2 ~ ~ 3.0 ~ 1.7 ~ 1.1 ~ 1.9
Q6L8Q7_C108 ~ 0.5 1.6 ~ ~ ~ ~ ~ 1.0 ~
P48735_C308 ~ 1.9 0.5 ~ 0.6 ~ ~ ~ ~ ~
Q86UV5_C39 0.7 0.8 1.1 1.8 20.0 1.2 ~ ~ ~ 1.3
P50851_C1704 1.9 ~ 2.0 1.7 ~ 1.6 ~ ~ ~ 3.4
O94953_C694 1.1 1.4 2.8 2.0 ~ 1.8 ~ ~ ~ 3.2
P19447_C342 ~ 0.8 2.6 ~ 6.1 ~ ~ ~ 1.6 ~
Q00535_C157 ~ ~ ~ 2.0 ~ ~ ~ 0.9 ~ 2.8
Q9UPT9_C171 ~ ~ ~ 20.0 ~ ~ ~ 1.1 ~ 5.1
Q9HB90 C37
~ 1.4
7 - 1.0 ~ - 1.7 0.9 1.5 ~
P50851_C2675 1.6 ~ ~ ~ 0.9 1.5 1.1 1.1 ~ 2.6
Q9NYL2_C22 ~ 1.1 20.0 ~ 2.0 ~ 1.0 ~ 1.4 ~
Q5T1V6_C414 1.0 0.8 ~ 3.1 1.2 ~ ~ 1.3 ~ 1.4
Q9HB90_C358 ~ 0.3 ~ ~ 1.2 ~ 1.7 ~ ~ ~
P16455_C145 1.3 ~ ~ 20.0 ~ ~ ~ ~ ~ 1.0
Q9Y5T5_C205 ~ 1.6 ~ ~ 1.6 ~ ~ 1.6 ~ 1.2
O00541_C272 ~ 1.1 ~ ~ ~ 5.5 1.6 ~ ~ 2.0
Q02556_C306 2.1 ~ ~ 2.1 ~ ~ ~ ~ ~ ~
Q15910_C503 ~ ~ ~ ~ ~ ~ ~ ~ ~ 2.0
Q96RU2_C171 ~ 0.9 ~ 2.4 1.1 ~ ~ ~ ~ 1.2
Q16877_C159 ~ ~ ~ 1.0 ~ 1.2 ~ ~ ~ 1.0
P04150_C302 0.9 1.3 ~ ~ ~ ~ ~ 0.9 ~ 4.7
Q96JH7_C219 1.3 1.7 ~ 1.2 ~ ~ ~ ~ ~ ~
P48200_C137 2.0 1.4 ~ 5.2 ~ ~ ~ ~ ~ 1.6
O00622_C39 ~ 1.0 ~ ~ ~ ~ ~ ~ 4.4 ~
Q5T1V6_C453 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
P51617_C608 ~ ~ ~ ~ ~ ~ ~ ~ ~ 1.5
P42575_C370 ~ 0.9 ~ ~ ~ ~ ~ ~ ~ 2.1
P09086_C346 0.8 ~ ~ 2.3 0.9 ~ ~ ~ ~ 1.1
Q09472 CI 73
~ ~ ~ 3.2 ~ ~ ~
8 - ~ ~
Q01201_C109 ~ 0.8 ~ ~ ~ ~ ~ ~ ~ 1.3
Q70CQ2_C741 ~ ~ 2.8 1.7 ~ 1.7 ~ ~ ~ ~
P41226_C599 ~ 0.7 1.1 12.2 ~ 2.5 ~ ~ ~ 3.2
P14598 C378 ~ ~ ~ ~ — ~ ~ ~ ~
Q9C0C9 C37
0.7 ~ ~ ~ 2.4 ~ ~ ~ 1.4 5
000622 C134 1.5 20.0 ~ ~ — ~ — — ~
000541 C361 0.6 20.0 20.0 — — — ~ — 1.1
P43403_C117 20.0 ~ ~ 20.0 ~ ~ ~ ~ ~ ~
Q96FA3_C282 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q9UPT9_C44 ~ 20.0 ~ ~ ~ ~ ~ ~ ~ ~
Q9Y4C1 C25
~ ~ ~ ~ 0.7
1 - ~ ~ ~ ~
Q70CQ2_C1090 ~ ~ ~ 2.7 ~ 0.9 ~ ~ ~ ~
O00622_C70 ~ 1.1 ~ ~ ~ ~ ~ ~ ~ ~
P04150_C622 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Table 1E
Identifier 24_500uM_invitro_ramos 25_500uM_invitro_231 26_500uM_invitro_ramos 27_500uM_invitro_231 27_500uM_invitro_ramos 28_500uM_invitro_231 28_500uM_invitro_ramos 29_500uM_invitro_231 29_500uM_invitro_ramos 30_500uM_invitro_231
Q99873 C109 1.0 1.0 0.9 0.9 1.0 0.8 1.1 1.0 0.9 1.1
P24752 C119 1.3 1.0 1.0 1.0 1.0 1.1 1.1 1.3 0.6 0.7
P09211 C48 1.6 1.8 0.9 1.0 1.0 1.3 1.5 1.0 0.4 1.4
O14980 C34 1.1 0.9 0.9 1.0 1.0 0.8 0.9 1.3 0.9 0.7
P24752 C196 1.1 0.9 1.0 1.0 1.1 1.0 1.1 1.3 1.0 0.8
Q15084_C55 1.3 1.0 0.9 1.1 1.2 1.1 1.5 1.0 1.3 2.2
P24752 C413 1.2 1.0 0.9 1.1 1.3 1.2 1.4 1.3 0.7 1.1
P63244 C182 2.6 1.2 1.0 0.8 1.1 1.0 1.1 1.3 1.2 1.2
P24752 C126 1.1 0.9 0.9 1.0 1.3 1.7 1.6 1.1 0.8 1.1
Q15084 C190 1.7 1.0 0.9 — 1.1 1.1 1.6 1.2 ~ 2.3
Q8TAQ2 C145 5.4 0.9 0.9 — 1.2 1.1 1.2 1.0 — 1.1
P68036 C86 — 1.5 0.7 0.9 — 1.4 ~ 0.9 1.5 —
P15374 C95 1.1 0.8 0.8 — 1.0 1.1 0.9 0.9 1.0 0.7
Q16763 C118 1.2 1.0 0.9 — 0.9 1.0 0.8 — ~ 0.7
Q16822 C306 — 0.8 ~ 1.5 1.1 1.1 ~ 1.1 0.5 1.2
O14980 C528 20.0 ~ 0.9 0.9 0.8 ~ ~ 1.0 ~ ~
O00170 C122 1.3 ~ 0.9 ~ 1.0 0.9 1.2 ~ 1.3 1.1
O75874 C269 0.9 1.0 ~ 1.0 ~ 1.1 ~ 0.8 ~ 0.7
O75362 C286 ~ 20.0 ~ 0.9 1.0 1.0 1.0 0.9 ~ ~
P40763 C259 1.7 0.9 ~ — ~ 1.0 1.1 1.5 ~ ~
Q9Y3Z3 C522 1.1 ~ 0.9 1.0 0.9 1.0 — 1.5 ~ 0.7
P16455 C150 4.0 ~ 0.9 — 2.3 ~ 1.7 — 1.3 ~
Q96GG9 C115 1.7 ~ 0.9 ~ 1.0 ~ ~ ~ ~ 0.7
P00813 C75 1.2 ~ 0.8 ~ 1.0 0.9 ~ ~ ~
O14933 C98 ~ ~ 1.0 ~ 1.0 ~ 1.2 ~ ~ 1.3
Q14790 C360 1.4 1.0 — — 1.2 1.1 ~ — ~ 2.0
Q15306 C194 1.9 ~ 1.1 — 1.0 ~ 1.1 — 1.2 ~
Q6L8Q7 C108 1.4 0.8 1.1 — 1.3 1.0 1.2 — ~ 1.1
P48735 C308 ~ 14.5 — 1.0 ~ 0.9 ~ 1.0 - 0.7
Figure imgf000181_0001
Table 1F
Figure imgf000181_0002
Figure imgf000182_0001
Figure imgf000183_0001
Table 1G
Identifier 35_500uM_invitro_ramos 36_500uM_invitro_231 37_500uM_invitro_ramos 38_500uM_invitro_231 38_500uM_invitro_ramos 39_500uM_invitro_231 40_500uM_invitro_231 40_500uM_invitro_ramos 41_500uM_invitro_231 41_500uM_invitro_ramos
Q99873 C109 1.0 1.0 1.0 0.9 1.4 1.5 0.8 1.3 0.9 0.9
P24752 C119 1.0 1.2 0.8 1.0 1.7 1.7 0.8 0.9 1.0 0.8
P09211 C48 1.3 1.6 1.2 1.9 ~ 2.0 1.6 1.6 1.3 0.9
O14980 C34 1.3 1.3 0.9 1.0 1.0 1.3 0.8 1.1 3.1 2.0
P24752 C196 1.4 1.5 1.0 1.0 1.3 3.0 0.8 ~ 1.3 1.0
Q15084 C55 ~ ~ 1.0 1.0 1.3 3.9 0.8 ~ 0.9 0.8
P24752 C413 1.3 2.7 0.9 1.1 20.0 2.7 0.9 ~ 0.8 0.9
P63244 C182 0.9 1.3 0.9 1.1 ~ 1.0 1.0 0.6 ~ 4.7
P24752 C126 10.4 ~ 0.9 1.0 1.4 ~ 0.9 ~ 1.0 0.9
Q15084 C190 ~ ~ 1.1 1.1 ~ 7.0 0.9 1.7 1.0 ~
Q8TAQ2 C145 ~ ~ ~ 1.1 1.9 1.2 0.9 1.1 1.8 2.8
P68036 C86 ~ ~ ~ ~ 1.5 1.9 0.8 0.9 9.8 7.9
P15374 C95 ~ ~ ~ 2.7 12.5 ~ ~ 0.9 1.0 0.8
Q16763 C118 ~ ~ ~ 1.0 1.0 1.4 0.9 0.8 ~ 3.4
Q16822 C306 1.0 ~ 1.0 1.0 ~ ~ ~ ~ 1.2 1.0
O14980 C528 0.8 ~ 0.9 1.0 ~ ~ ~ ~ 0.6 ~
O00170 C122 ~ ~ ~ 1.5 1.5 ~ ~ ~ 1.1 1.0
O75874 C269 ~ 1.7 ~ 0.7 ~ ~ ~ 0.8 0.8 ~
O75362 C286 ~ ~ 0.9 0.9 1.2 20.0 ~ ~ ~ 0.6
P40763 C259 1.1 ~ 0.9 ~ 1.5 ~ ~ 1.4 0.9 ~
Q9Y3Z3 C522 — ~ 0.7 — ~ 1.1 0.8 ~ 0.9 ~
P16455 C150 1.3 ~ 1.0 ~ 1.8 ~ ~ 1.9 ~ 0.9
Q96GG9 C115 0.9 ~ 0.8 1.2 1.0 ~ ~ 0.9 ~ ~
P00813 C75 1.1 ~ 0.9 ~ 2.3 ~ ~ 0.9 ~ 1.1
O14933 C98 ~ ~ ~ 1.6 1.7 ~ ~ 1.3 1.2 1.1
Q14790 C360 1.3 ~ 1.0 1.0 4.1 ~ ~ ~ 1.0 0.8
Q15306 C194 1.0 ~ 0.9 ~ 2.1 ~ ~ 1.1 ~ 1.2
Q6L8Q7 C108 ~ ~ ~ 0.9 6.8 3.2 0.8 ~ 1.3 ~
P48735 C308 ~ 1.3 ~ 1.2 ~ ~ ~ ~ ~ ~
Q86UV5 C39 1.0 ~ 0.7 3.7 ~ ~ ~ 1.6 ~ ~
P50851 C1704 1.1 ~ 1.0 ~ 2.5 ~ ~ ~ ~ ~
O94953 C694 1.1 ~ 1.1 ~ 2.4 ~ ~ ~ ~ ~
P19447 C342 ~ ~ ~ 1.3 ~ ~ ~ ~ 1.1 ~
Figure imgf000184_0001
Table 1H
Identifier 42_500uM_invitro_ramos 43_500uM_invitro_231 43_500uM_invitro_ramos 44_500uM_invitro_231 45_500uM_invitro_231 46_500uM_invitro_231 47_500uM_invitro_231 48_500uM_invitro_231 49_500uM_invitro_231 50_500uM_invitro_231
Q99873 C109 0.9 1.4 1.2 0.8 2.2 0.8 0.9 1.0 2.0 1.0
P24752 C119 0.8 1.8 1.1 0.8 12.4 0.8 1.0 1.3 4.9 2.1
P09211 C48 0.7 4.1 2.2 0.9 19.7 1.0 2.5 1.1 ~ 2.3
O14980 C34 0.9 1.1 ~ 0.8 1.8 1.0 1.0 1.1 1.9 1.2
P24752 C196 0.8 2.4 2.3 0.8 20.0 0.9 1.1 1.3 4.7 3.9
Q15084 C55 0.9 20.0 20.0 1.0 ~ 0.8 0.9 1.0 12.8 1.3
P24752 C413 0.7 15.9 ~ 0.8 20.0 1.1 0.9 1.1 2.6 9.2
P63244 C182 0.8 0.9 0.9 ~ 0.9 0.8 1.1 1.1 ~ 0.8
P24752 C126 0.8 ~ 20.0 0.8 20.0 0.9 1.0 ~ 20.0 20.0
Figure imgf000185_0001
O00541 C361 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
P43403 C117 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q96FA3 C282 1.2 ~ ~ ~ ~ ~ ~ ~ ~ ~
Q9UPT9 C44 ~ ~ 2.1 ~ ~ ~ ~ ~ ~ ~
Q9Y4C1 C251 — ~ ~ — ~ 1.0 — ~ — ~
Q70CQ2 C1090 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
O00622 C70 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
P04150 C622 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Table 1I
Identifier 51_500uM_invitro_231 51_500uM_invitro_ramos 52_500uM_invitro_231 52_500uM_invitro_ramos 53_500uM_invitro_231 53_500uM_invitro_ramos 54_500uM_invitro_231 55_500uM_invitro_231 56_500uM_invitro_231 56_500uM_invitro_ramos
Q99873 C109 0.9 9.5 0.9 1.3 0.9 1.1 1.2 1.0 1.0 1.1
P24752 C119 0.8 0.9 1.4 0.8 0.9 0.7 1.0 0.9 1.0 1.2
P09211 C48 1.1 1.0 1.0 1.9 2.3 1.2 1.3 1.4 1.9 1.2
O14980 C34 0.9 1.0 1.0 1.1 0.9 1.0 1.1 0.9 0.9 0.9
P24752 C196 1.1 ~ 1.3 1.5 1.2 0.9 1.1 1.0 1.1 1.0
Q15084 C55 2.5 2.7 1.1 1.8 0.9 1.0 1.2 0.9 1.0 1.1
P24752 C413 1.0 1.0 1.9 1.5 1.0 ~ 1.1 1.0 1.0 0.9
P63244 C182 0.7 0.9 1.0 ~ 0.9 0.8 ~ 1.0 1.1 ~
P24752 C126 1.0 1.2 1.3 ~ ~ ~ 1.1 0.9 0.9 ~
Q15084 C190 ~ 5.6 1.0 1.9 1.0 1.0 1.2 ~ 0.9 1.1
Q8TAQ2 C145 ~ 1.1 0.9 1.8 ~ 1.2 ~ ~ 1.4 1.2
P68036 C86 ~ 1.0 0.8 1.3 1.0 1.0 ~ 1.0 1.5 1.3
P15374 C95 ~ 0.9 1.2 ~ ~ 0.7 ~ ~ 1.0 1.1
Q16763 C118 ~ 0.9 1.2 ~ 1.0 0.8 ~ ~ 1.2 1.1
Q16822 C306 ~ 0.8 1.3 1.6 1.3 ~ ~ ~ 1.4 ~
O14980 C528 ~ 0.9 1.3 ~ ~ 0.8 ~ ~ 20.0 ~
O00170 C122 1.4 ~ ~ 1.4 ~ 0.9 ~ ~ 0.8 0.9
O75874 C269 ~ ~ ~ ~ ~ ~ 1.1 ~ 0.9 ~
O75362 C286 0.9 20.0 1.0 ~ 1.1 ~ ~ ~ 1.0 ~
P40763 C259 ~ ~ 2.2 ~ ~ 1.9 1.3 ~ ~ ~
Q9Y3Z3 C522 — — 1.2 ~ ~ 0.7 — — 1.0 1.0
P16455 C150 ~ ~ ~ 2.2 ~ 1.1 ~ ~ ~ 1.4
Q96GG9 C115 ~ 0.8 ~ 1.0 1.0 0.9 1.1 ~ ~ 1.0
P00813 C75 ~ 1.0 ~ 1.2 ~ ~ ~ ~ ~ 0.9
O14933 C98 ~ ~ 1.4 1.4 ~ 1.0 ~ ~ ~ 1.1
Q14790 C360 ~ ~ ~ ~ ~ ~ 1.2 ~ 1.0 ~
Q15306 C194 ~ 1.4 ~ 1.4 ~ 1.0 ~ ~ ~ 1.5
Q6L8Q7 C108 ~ ~ 1.4 6.8 ~ 1.1 ~ ~ ~ ~
P48735 C308 ~ ~ 1.1 ~ 0.5 ~ ~ ~ 1.0 ~
Q86UV5 C39 ~ 1.8 0.9 ~ ~ ~ ~ ~ 1.0 ~
P50851 C1704 ~ 1.1 ~ 1.1 ~ ~ ~ ~ ~ ~
O94953 C694 ~ 1.4 ~ 1.4 ~ ~ ~ ~ ~ ~
P19447 C342 1.4 ~ 1.0 ~ ~ ~ ~ ~ 1.2 ~
Q00535 C157 ~ 1.0 ~ ~ 1.2 0.8 ~ ~ 1.1 ~
Q9UPT9 C171 ~ ~ ~ ~ ~ ~ ~ ~ ~ 1.9
Q9HB90 C377 0.6 ~ ~ ~ ~ 1.1 ~ ~ 1.2 ~
P50851 C2675 ~ 0.9 ~ ~ ~ ~ ~ ~ ~ ~
Q9NYL2 C22 ~ ~ ~ ~ ~ ~ 1.1 1.2 1.1 ~
Figure imgf000187_0001
[00449] Table 2 illustrates a list of liganded cysteines and their reactivity profiles with the fragment electrophile library from isoTOP-ABPP experiments performed in situ. Table 2 further shows the accession number (or the protein identifier) of the protein.
Table 2A
Identifier | Protein | SEQ ID NO: | 2_200uM_insitu_231 | 4_100uM_insitu_231 | 8_200uM_insitu_231 | 9_200uM_insitu_231
P04406 GAPDH Glyceraldehyde-3 -phosphate
16 20.0 1.6 1.1 1.7 C 152 dehydrogenase
P61978 HNR PK Heterogeneous nuclear
768 5.0 3.8 4.2 1.4 C132 ribonucleoprotein K
Q13526 PIN1 Peptidyl-prolyl cis-trans isomerase
19 20.0 3.4 2.6 3.3 C113 NIMA-interacting 1
P24752 ACAT1 Acetyl-CoA acetyltransf erase,
22 2.6 3.0 5.5 3.8 C119 mitochondrial
P24752 ACAT1 Acetyl-CoA acetyltransferase,
56 20.0 4.5 20.0 19.5 C413 mitochondrial
Q9NUY
101 4.1 2.1 4.9 2.1 8 C283 TBC1D23 TBCl domain family member 23
P13667
36 2.6 6.8 15.2 1.1 C206 PDIA4 Protein disulfide-isomerase A4
P12268 IMPDH2 Inosine-5 -monophosphate
45 1.3 1.0 0.9 1.2 C140 dehydrogenase 2
Q15365
29 5.1 1.4 1.4 10.1 C194 PCBP1 Poly(rC)-binding protein 1
Q9NVC MED 17 Mediator of RNA polymerase II
211 7.7 1.7 3.2 2.2 6 C649 transcription subunit 17
P42166 TMPO Lamina-associated polypeptide 2,
88 8.3 20.0 4.6 3.7 C561 isoform alpha
Q9Y696 CLIC4 Chloride intracellular channel protein
21 1.7 20.0 4.7 2.6 C35 4
P10599
34 7.9 3.3 5.9 20.0 C32 TXN Thioredoxin
P31943 HNR PH1 Heterogeneous nuclear
769 3.8 6.0 5.3 2.1 C267 ribonucleoprotein H
Q86SX6 GLRX5 Glutaredoxin -related protein 5,
26 1.1 1.4 1.0 5.1 C67 mitochondrial
P15121
48 0.9 1.0 1.7 1.4
C299 AKRIBI Aldose reductase
P52597 HNR PF Heterogeneous nuclear
108 4.2 20.0 ~ 3.2 C267 ribonucleoprotein F
Q9ULV
770 3.0 1.3 4.6 2.0 4 C420 COROIC Coronin-lC
P62888
100 1.0 2.7 1.5 1.3 C92 RPL30 60S ribosomal protein L30
Q9NQR
47 13.1 0.7 20.0 20.0 4 C153 NIT2 Omega-amidase NIT2
P42765 ACAA2 3-ketoacyl-CoA thiolase,
71 20.0 3.7 20.0 4.3 C92 mitochondrial
Q15084
51 3.0 4.9 4.4 2.1 C55 PDIA6 Protein disulfide-isomerase A6
Q96HE7
61 2.4 3.0 3.7 ~ C241 EROIL EROl-like protein alpha
Q99439
41 ~ 1.5 1.1 4.7 C164 CN 2 Calponin-2
P25205 MCM3 DNA replication licensing factor
74 3.5 4.3 3.5 1.9 C119 MCM3
Q9NS86
203 ~ 1.6 6.1 2.5 C187 LANCL2 LanC-like protein 2
Q15233 NONO Non-POU domain-containing
72 ~ 1.6 12.0 1.4 C145 octamer-binding protein
Q9BRA TXNDC17 Thioredoxin domain-containing
62 15.5 3.5 20.0 20.0 2 C43 protein 17
P35611
771 5.8 4.3 2.4 4.3 C68 ADD1 Alpha-adducin
075521 ECI2 Enoyl-CoA delta isomerase 2,
155 0.6 1.6 0.8 0.7 C380 mitochondrial
Q9BXW CECR5 Cat eye syndrome critical region
772 3.6 2.2 3.2 1.5 7 C392 protein 5
P30101
773 3.2 9.0 20.0 1.4 C406 PDIA3 Protein disulfide-isomerase A3
Q96AB3 IS0C2 Isochorismatase domain-containing
159 1.6 0.8 4.7 ~ C114 protein 2, mitochondria
P13667
774 3.2 9.0 20.0 1.4 C555 PDIA4 Protein disulfide-isomerase A4
Q09161 NCBPl Nuclear cap-binding protein subunit
102 2.1 1.8 2.0 ~ C44 1
P78417
32 ~ 19.9 20.0 ~ C32 GSTOl Glutathione S-transf erase omega- 1
Q9ULW
437 20.0 20.0 20.0 5.7 0 C536 TPX2 Targeting protein for Xklp2
Q9NRG CHRAC1 Chromatin accessibility complex
252 20.0 20.0 7.4 2.9 0 C55 protein 1
Q96T76 MMS19 MMS19 nucleotide excision repair
114 3.9 20.0 1.7 20.0 C848 protein homolog
Q8TAQ SMARCC2 SWI/SNF complex subunit
775 4.6 3.1 ~ ~ 2 C145 SMARCC2
Q9BVC
168 2.7 3.3 4.5 ~ 5 CIO C2orf49 Ashwin
Q7Z2W ZC3HAV1 Zinc finger CCCH-type antiviral
776 ~ 4.4 5.1 3.0 4 C645 protein 1
Q9BQ69 MACROD1 O-acetyl-ADP-ribose
777 4.8 1.8 5.0 1.2 C186 deacetylase MACROD1
Q16831
364 1.2 1.0 20.0 7.7 C162 UPP1 Uridine phosphorylase 1
P30101
133 2.2 20.0 20.0 ~ C57 PDIA3 Protein disulfide-isomerase A3
P12268 IMPDH2 Inosine-5 -monophosphate
144 12.1 4.6 ~ 1.3 C331 dehydrogenase 2
095571
176 3.8 2.3 6.8 2.0 C170 ETHE1 Protein ETHE1, mitochondrial
000299 CLICl Chloride intracellular channel protein
69 9.5 20.0 10.3 3.0 C24 1
014879 IFIT3 Interferon-induced protein with
308 7.6 6.6 2.1 17.2 C343 tetratricopeptide
Q96CM ACSF2 Acyl-CoA synthetase family
194 20.0 20.0 20.0 ~ 8 C64 member 2, mitochondrial
P51946
778 1.9 5.3 ~ ~ C244 CCNH Cyclin-H
P49588
122 ~ 1.3 1.5 1.5 C773 AARS Alanine—tRNA ligase, cytoplasmic
Q96RN5 MED 15 Mediator of RNA polymerase II
171 6.4 4.5 20.0 ~ C618 transcription subuni
015294 OGT UDP-N-acetylglucosamine~peptide N-
104 2.6 6.4 2.4 2.1 C758 acetylglucosami
P46734 MAP2K3 Dual specificity mitogen-activated
184 0.8 0.6 0.8 3.3 C207 protein kinase
Q96S55
215 2.2 3.7 1.2 4.5 C272 WRNIPl ATPase WRNIPl
095229
779 3.4 12.4 3.1 3.7 C54 ZWINT ZW10 interacted
060610
780 2.2 1.6 ~ 1.5 CI 227 DIAPH1 Protein diaphanous homolog 1
Q13428
150 3.3 2.8 2.0 ~ C38 TCOF1 Treacle protein
Q9Y277 VDAC3 Voltage-dependent anion-selective
512 3.3 4.7 6.2 ~ C65 channel protein
P57764
110 20.0 20.0 8.4 ~ C268 GSDMD Gasdermin-D
Q9Y3A3
121 ~ 1.2 1.6 20.0 C134 MOB4 MOB-like protein phocein
Q02252 ALDH6 A 1 Methylmalonate-semialdehyde
265 10.3 14.4 ~ ~ C317 dehydrogenase
Q9NYL
130 2.2 0.8 0.9 7.6 9 C132 TMOD3 Tropomodulin-3
P83731
147 2.4 0.7 ~ 4.3 C6 RPL24 60S ribosomal protein L24
095336
55 14.9 ~ 20.0 6.7 C32 PGLS 6-phosphogluconolactonase
Q13155 AIMP2 Aminoacyl tRNA synthase complex-
83 20.0 20.0 2.4 1.9 C291 interacting multif
Q13418
149 1.4 0.8 ~ 2.7 C346 ILK Integrin-linked protein kinase
A6NDU
298 4.2 3.0 ~ 20.0 8 C179 C5orf51 UPF0600 protein C5orf51
Q9UKF6 CPSF3 Cleavage and polyadenylation
118 6.2 3.5 20.0 ~ C498 specificity factor su
Q96F86 EDC3 Enhancer of mRNA-decapping
141 20.0 20.0 ~ ~ C413 protein 3
P42224 STAT1 Signal transducer and activator of
228 16.9 ~ ~ 3.1 C492 transcription 1
P11216 366 ~ 1.7 20.0 2.6
C326 PYGB Glycogen phosphorylase, brain fonn
P21980 TGM2 Protein-glutamine gamma-
356 0.7 0.5 0.7 2.3 C277 glutamyltransferase 2
Q9HAV GRPEL1 GrpE protein homolog 1.
206 3.2 1.7 2.2 4.7 7 C124 mitochondrial
P24752 ACAT1 Acetyl-CoA acetyltransferase.
89 9.1 2.8 4.5 14.1 C126 mitochondrial
Q9NQ88
117 6.0 ~ ~ 2.6 C161 TIGAR Fructose-2,6-bisphosphatase TIGAR
Q13155 AIMP2 Aminoacyl tRNA synthase complex-
132 ~ 2.8 2.7 ~ C23 interacting multifunctional protein 2
Q9NQW
281 ~ 4.1 14.1 3.1 6 C712 ANLN Actin-binding protein anillin
P51649 ALDH5A1 Succinate-semialdehyde
189 1.9 1.0 2.7 ~ C340 dehydrogenase, mitochondria
Q15021
139 ~ 2.2 0.6 6.1 C439 NCAPD2 Condensin complex subunit 1
Q5T0N5
418 20.0 20.0 3.6 ~ C69 FNBP1L Formin -binding protein 1 -like
P38606 ATP6V1A V-type proton ATPase catalytic
201 ~ 20.0 20.0 20.0 C138 subunit A
Q9HCC MCCC2 Methylcrotonoyl-CoA carboxylase
363 4.6 5.9 3.9 ~ 0 C216 beta chain, mitoch
Q9NQC
247 12.0 20.0 ~ 20.0 3 C1101 RTN4 Reticulon-4
P35754
142 ~ 20.0 ~ ~ C23 GLRX Glutaredoxin-1
Q99757
208 10.1 3.6 4.5 20.0 C90 TXN2 Thioredoxin, mitochondrial
Q9Y3D0 FAM96B Mitotic spindle-associated MMXD
179 20.0 20.0 ~ ~ C93 complex subunit MIP18
Q9UMS NFU1 NFU1 iron-sulfur cluster scaffold
394 7.4 4.4 ~ ~ 0 C213 homolog, mitochondrial
Q9NXV
255 20.0 ~ ~ 6.6 6 C516 CDKN2AIP CDKN2A-interacting protein
Q96RS6 NUDCD1 NudC domain -containing protein
79 ~ 1.6 ~ 2.2 C376 1
Q14997 PSME4 Proteasome activator complex
257 2.8 20.0 20.0 ~ CI 840 subunit 4
P50570
73 4.6 2.2 ~ ~ C27 DNM2 Dynamin-2
Q86YH6 PDSS2 Decaprenyl-diphosphate synthase
401 20.0 20.0 ~ ~ C71 subunit 2
Q99497
109 ~ 0.9 ~ ~ C106 PARK7 Protein DJ-1
Q9UJW
103 4.0 ~ 20.0 ~ 0 C258 DCTN4 Dynactin subunit 4
Q9BUH
348 ~ 1.2 6.6 2.8 6 C180 C9orfl42 Uncharacterized protein C9orfl42
P24752 ACAT1 Acetyl-CoA acetyltransf erase,
33 20.0 3.5 ~ 5.8 C196 mitochondrial
Q13162
781 4.6 1.2 1.4 3.3 C51 PRDX4 Peroxiredoxin-4
Q9BTA9 WAC WW domain-containing adapter
153 17.6 17.8 9.6 ~ C553 protein with coiled-coil
P48643
126 ~ 0.8 0.8 ~ C253 CCT5 T-complex protein 1 subunit epsilon
075362
268 8.9 ~ ~ ~ C286 ZNF217 Zinc finger protein 217
060825 PFKFB2 6-phosphofructo-2-kinase/fructose-
272 0.7 ~ ~ ~ C158 2,6-bisphosphata
Q8NBS9 TXNDC5 Thioredoxin domain-containing
136 ~ 2.4 ~ 1.7 C350 protein 5
Q9NYL MLTK Mitogen-activated protein kinase
430 5.5 20.0 1.4 10.3 2 C22 kinase kinase MLTK
P27707
93 ~ 1.4 ~ 4.9 C9 DCK Deoxycytidine kinase
Q93009 USP7 Ubiquitin carboxyl-terminal hydrolase
782 ~ 6.8 20.0 ~ C223 7
014929 HATl Histone acetyltransf erase type B
154 ~ ~ ~ 9.0 ClOl catalytic subunit
Q9UPQ0 LIMCH1 LIM and calponin homology
783 20.0 ~ 20.0 ~ C140 domains-containing protein
Q96NY7 CLIC6 Chloride intracellular channel protein
447 ~ 20.0 5.1 3.6 C487 6
Q9NQ88
143 2.8 ~ ~ 1.9 C114 TIGAR Fructose-2,6-bisphosphatase TIGAR
Q14790
335 20.0 20.0 ~ ~ C360 CASP8 Caspase-8
P04183
784 1.9 0.8 ~ 10.0 C230 TK1 Thymidine kinase, cytosolic
Figure imgf000192_0001
Figure imgf000193_0001
P42166 TMPO Lamina-associated polypeptide 2,
312 ~ ~ 4.6 ~ C684 isoform alpha
Q96EY5
540 4.1 1.7 ~ 1.9 C231 FAM125A Multivesicular body subunit 12A
P14635
448 ~ ~ ~ 3.7 C238 CCNB1 G2/mitotic-specific cyclin-Bl
Q8NDH
791 20.0 20.0 20.0 ~ 3 C81 NPEPL1 Probable aminopeptidase NPEPL1
Q9P0J1
276 20.0 ~ ~ ~ C149 PDP1
Q96P48 ARAP1 Arf-GAP with Rho-GAP domain,
433 5.5 ~ ~ ~ C900 ANK repeat and PH domain
Q96HE7
347 ~ 5.4 ~ 20.0 C37 EROIL EROl-like protein alpha
Q07065
733 ~ ~ ~ 15.4 ClOO CKAP4 Cytoskeleton-associated protein 4
Q9BRJ7
432 20.0 20.0 ~ ~ C88 NUDT16L1 Protein syndesmos
075439 PMPCB Mitochondrial-processing peptidase
320 ~ ~ ~ ~ C265 subunit beta
043175 PHGDH D-3-phosphoglycerate
248 20.0 ~ ~ ~ C369 dehydrogenase
Q9UNI6 DUSP12 Dual specificity protein
241 ~ 0.8 ~ ~ C265 phosphatase 12
Q06203
188 ~ 1.7 ~ ~ ClOO PPAT Amidophosphoribosyltransferase
AOAVT UBA6 Ubiquitin-like modifier-activating
158 ~ 20.0 ~ 3.3 1 C347 enzyme 6
Q86X76
471 ~ 0.7 ~ ~ C203 NIT1 Nitrilase homolog 1
Q6XZF7
353 ~ 1.2 ~ 3.4 C691 DNMBP Dynamin-binding protein
Q15398
167 20.0 ~ ~ ~ C129 DLGAP5 Disks large-associated protein 5
075717 WDHD1 WD repeat and HMG-box DNA-
289 ~ ~ 4.2 ~ C773 binding protein 1
Q01433
259 4.4 2.3 6.2 3.1 C107 AMPD2 AMP deaminase 2
Q8WVV HNRPLL Heterogeneous nuclear
487 ~ ~ ~ ~ 9 C464 ribonucleoprotein L-like
014733 MAP2K7 Dual specificity mitogen-activated
427 ~ ~ ~ ~ C131 protein kinase
Q14137
535 20.0 1.2 ~ ~ C404 BOP1 Ribosome biogenesis protein BOP1
Q96RU2 USP28 Ubiquitin carboxyl-terminal
569 ~ 20.0 1.2 ~ C171 hydrolase 28
Q9Y679
564 ~ 20.0 20.0 ~ C391 AUP1 Ancient ubiquitous protein 1
P51610
270 4.1 ~ ~ ~ CI 872 HCFCl Host cell factor 1
P22307
541 20.0 20.0 20.0 ~ C307 SCP2 Non-specific lipid-transfer protein
Q9BTE3 MCMBP Mini-chromosome maintenance
792 ~ ~ ~ ~ C325 complex-binding protein
Figure imgf000195_0001
Figure imgf000196_0001
Q9NUI1 DECR2 Peroxisomal 2,4-dienoyl-CoA
698 ~ ~ ~ ~ C22 reductase
Q02556
513 ~ ~
C306 IRF8 Interferon regulatory factor 8 - -
Q9UPT9 USP22 Ubiquitin carboxyl-terminal
802 ~ ~ ~ ~ C171 hydrolase 22
Q8N999
484 ~ C302 C12orf29 Uncharacterized protein C12orf29 - - ~
Q8IU81 IRF2BP1 Interferon regulatory factor 2-
803 ~ ~ ~ ~ C363 binding protein 1
Q9C0I1
671 ~ ~ ~ ~ C152 MTMR12 Myotubularin-related protein 12
Q9P2X3
678 ~ 20.0 ~
C 195 IMPACT Protein IMPACT -
Q6QNY BLOC1S3 Biogenesis of lysosome-related
411 ~ ~ ~ ~ 0 C168 organelles complex
Q15796 SMAD2 Mothers against decapentaplegic
561 20.0 ~ ~ ~ C81 homolog 2
Q9NZB2 FAM120A Constitutive coactivator of
492 ~ ~ ~ ~ C531 PPAR-gamma-like protei
Q9HB90
417 3.3 ~ ~ 4.3 C377 RRAGC Ras-related GTP-binding protein C
Q9BR61 ACBD6 Acyl-CoA-binding domain-
472 ~ ~ ~
C267 containing protein 6 -
P16455 MGMT Methylated-DNA~protein-cysteine
470 ~ ~ CI 45 methyltransferase - -
Q86UV5 USP48 Ubiquitin carboxyl-terminal
381 20.0 ~ ~ ~ C39 hydrolase 48
A2A288
515 ~ ~ ~ ~ C367 ZC3H12D Probable ribonuclease ZC3H12D
Q8NEC7 GSTCD Glutathione S-transferase C-
602 ~ ~ ~ ~ C140 terminal domain-containing protein
Q6PJG6
695 ~ ~ ~
C673 BRATl BRCAl -associated ATM activator 1 -
Q13232
653 ~ ~ ~ 2.7 C158 NME3 Nucleoside diphosphate kinase 3
Q86X76
345 ~ 0.9 ~ ~ C 165 NIT1 Nitrilase homolog 1
P42695
573 ~ ~ ~ ~ C541 NCAPD3 Condensin-2 complex subunit D3
P41226 UBA7 Ubiquitin-like modifier-activating
702 ~ ~ ~ ~ C599 enzyme 7
Q99986 VRK1 Serine/threonine-protein kinase
497 ~ - - ~ C50 VRK1
Q8WU PDCD6IP Programmed cell death 6-
527 ~ ~ ~ ~ M4 C90 interacting protein
P29590
477 ~ ~ ~
C213 PML Protein PML -
Q9P0K7
638 ~ 8.7 ~ ~ C973 RAI14 Ankycorbin
P53992
498 ~ ~ 5.4 ~ C78 SEC24C Protein transport protein Sec24C
Q13867
431 ~ ~ ~ 3.9 C73 BLMH Bleomycin hydrolase
Q8ND24 C655 RNF214 RING finger protein 214 451 6.5 ~ ~ 4.0
Q96EK4 C48 THAP11 THAP domain-containing protein 11 538 7.9 ~ ~ ~
Q96IV0 C309 NGLY1 Peptide-N(4)-(N-acetyl-beta-glucosaminyl)asparagin 660 ~ ~ ~ ~
Q5T1V6 C414 DDX59 Probable ATP-dependent RNA helicase DDX59 439 ~ ~ ~ ~
Q9UHQ1 C99 NARF Nuclear prelamin A recognition factor 740 ~ ~ ~ ~
O43396 C34 TXNL1 Thioredoxin-like protein 1 310 ~ ~ ~ ~
Q8IV53 C174 DENND1C DENN domain-containing protein 1C 804 ~ ~ ~ ~
Q8N9T8 C673 KRI1 Protein KRI1 homolog 563 ~ ~ 5.3 ~
Table 2B
Figure imgf000198_0001
P30101 C406 1.6 3.3 1.4 ~ 20.0 ~ 2.0 1.1 ~ ~
Q96AB3 C114 3.7 20.0 ~ 0.9 1.6 2.4 ~ 1.0 ~ 4.9
P13667 C555 1.6 3.3 1.4 ~ 20.0 ~ 2.0 1.1 ~ ~
Q09161 C44 1.4 12.4 ~ 1.1 4.7 1.0 ~ 2.0 ~ 1.3
P78417 C32 20.0 20.0 20.0 ~ 20.0 20.0 20.0 1.7 ~ ~
Q9ULW0 C536 1.6 17.6 ~ 1.4 20.0 1.5 ~ 1.5 ~ 0.9
Q9NRG0 C55 2.5 2.7 ~ 1.1 20.0 1.2 ~ 1.7 ~ ~
Q96T76 C848 1.8 2.5 ~ ~ 20.0 0.6 1.5 5.0 20.0 0.7
Q8TAQ2 C145 2.4 ~ 20.0 7.6 ~ 1.1 ~ 1.3 ~ 1.6
Q9BVC5 C10 1.4 3.1 ~ 1.1 2.9 1.6 ~ 1.2 ~ 1.5
Q7Z2W4 C645 1.3 3.8 — 1.0 3.3 1.3 0.9 1.4 ~ 0.9
Q9BQ69 C186 ~ 2.4 ~ 1.0 2.7 1.4 ~ 1.1 ~ 1.4
Q16831 C162 ~ 7.3 ~ 0.8 0.9 1.0 ~ 0.6 ~ 1.6
P30101 C57 1.5 3.3 ~ 2.2 20.0 ~ 1.6 1.0 ~ 1.0
P12268 C331 2.6 ~ 2.1 1.0 ~ ~ 1.4 1.7 ~ 0.8
O95571 C170 1.8 9.0 ~ 1.2 6.3 1.2 1.9 1.0 ~ 1.7
O00299 C24 1.4 5.0 ~ 2.5 20.0 ~ 1.8 0.8 ~ 0.7
O14879 C343 ~ 4.9 ~ 0.4 3.5 1.0 ~ 1.5 ~ 0.8
Q96CM8 C64 20.0 20.0 ~ 1.5 17.9 1.4 ~ 1.5 ~ 2.0
P51946 C244 2.0 1.7 ~ 1.3 1.3 1.0 1.2 1.6 ~ 1.4
P49588 C773 2.0 2.1 1.7 0.9 0.9 0.8 1.1 1.0 ~ 1.1
Q96R 5 C618 ~ ~ ~ 1.0 20.0 1.0 1.4 1.6 ~ 1.0
O15294 C758 ~ 2.3 ~ 1.0 20.0 1.0 ~ 1.6 2.9 1.1
P46734 C207 1.8 0.8 ~ 0.6 0.7 0.8 ~ 13.8 ~ 0.9
Q96S55 C272 2.5 1.2 ~ ~ 2.4 0.7 ~ 1.4 ~ 1.3
O95229 C54 ~ 2.3 ~ 1.2 20.0 0.8 1.4 5.0 20.0 0.8
O60610 C1227 ~ 1.5 1.4 0.9 0.8 0.7 0.9 ~ ~ 20.0
Q13428 C38 1.7 4.2 ~ 1.1 4.7 1.6 ~ 3.6 ~ 1.5
Q9Y277 C65 0.8 3.3 2.5 1.4 3.9 1.0 ~ 1.7 ~ 2.9
P57764 C268 4.9 ~ 2.2 0.7 ~ ~ 1.5 1.6 ~ 0.7
Q9Y3A3 C134 20.0 1.9 ~ 1.8 1.4 ~ 1.9 1.1 ~ ~
Q02252 C317 2.6 1.5 0.7 0.9 3.2 1.3 ~ 1.1 ~ 1.5
Q9NYL9 C132 ~ ~ ~ 0.5 0.6 ~ ~ 0.5 ~ 1.4
P83731 C6 1.3 0.4 2.0 0.5 0.3 0.7 1.0 — ~ 1.0
O95336 C32 2.6 ~ ~ 0.9 20.0 1.6 2.2 3.2 ~ 0.9
Q13155 C291 ~ 1.7 ~ 0.9 1.1 ~ ~ 1.6 ~ 0.8
Q13418 C346 ~ 1.1 8.3 0.6 0.6 0.7 2.1 0.8 ~ 0.5
A6NDU8 C179 3.7 1.3 ~ 0.7 ~ 0.8 ~ 1.9 ~ 0.9
Q9UKF6 C498 1.7 20.0 2.8 1.3 4.1 ~ 1.4 1.9 ~ 1.4
Q96F86 C413 4.0 20.0 20.0 20.0 ~ 1.6 2.0 1.0 ~ 0.8
P42224 C492 ~ 20.0 ~ 0.7 1.0 1.0 ~ 20.0 ~ 0.7
P11216 C326 ~ 8.6 ~ 0.8 3.9 1.3 ~ 0.8 ~ 0.9
P21980 C277 ~ 0.6 ~ 0.6 0.3 0.8 ~ 20.0 ~ 0.7
Q9HAV7 C124 3.8 1.0 ~ 0.9 1.1 ~ ~ 1.3 ~ 1.7
P24752 C126 20.0 5.9 ~ 1.2 5.8 ~ ~ 0.9 ~ 1.4
Q9NQ88 C161 2.0 4.6 2.1 0.9 ~ 1.1 1.5 20.0 ~ 1.0
Q13155 C23 1.6 2.3 ~ 1.2 1.6 0.9 ~ 1.1 ~ 0.9
Q9NQW6 C712 ~ 11.2 ~ 0.9 16.7 1.3 ~ 2.1 ~ ~
P51649 C340 1.5 13.4 20.0 ~ 20.0 1.1 1.3 0.7 ~ 1.2
Q15021 C439 ~ 5.2 ~ 0.9 6.7 ~ ~ 5.9 ~ 0.7
Q5T0N5 C69 ~ 2.0 ~ 1.3 20.0 1.0 ~ 1.5 ~ 0.8
P38606 C138 ~ 20.0 ~ 9.2 20.0 20.0 ~ 1.9 ~ 20.0
Q9HCC0 C216 3.3 2.3 ~ 1.9 20.0 ~ 1.8 1.0 ~ 2.2
Q9NQC3 C1101 20.0 20.0 ~ 1.1 20.0 ~ ~ 20.0 ~ 2.1
P35754 C23 5.7 ~ 13.0 20.0 20.0 ~ ~ 0.7 ~ ~
Q99757 C90 ~ ~ ~ 2.7 5.0 ~ ~ 3.2 ~ 4.3
Q9Y3D0 C93 2.1 2.1 ~ 0.9 ~ ~ 1.4 2.7 ~ 0.5
Q9UMS0 C213 ~ 20.0 ~ ~ 20.0 1.6 ~ 3.5 ~ ~
Q9NXV6 C516 13.8 7.7 ~ 1.0 3.3 2.4 1.6 1.4 ~ 0.8
Q96RS6 C376 1.7 5.7 ~ 0.8 1.1 0.7 1.2 0.9 1.8 1.4
Q14997 C1840 ~ 20.0 ~ ~ 20.0 0.9 1.6 ~ 20.0 1.0
P50570 C27 1.4 3.1 ~ 0.7 0.7 ~ 1.0 1.0 ~ 0.8
Q86YH6 C71 6.6 20.0 ~ 1.0 20.0 1.2 1.8 20.0 ~ 2.1
Q99497 C106 2.6 1.7 ~ 1.3 3.0 ~ 3.3 0.8 ~ ~
Q9UJW0 C258 20.0 20.0 20.0 20.0 20.0 ~ 1.9 3.0 ~ ~
Q9BUH6 C180 ~ 3.8 ~ ~ 1.4 1.1 ~ 1.3 ~ ~
P24752 C196 6.8 5.1 2.3 1.1 4.6 ~ ~ 1.0 ~ 1.5
Q13162 C51 — 2.0 — 0.9 1.0 ~ ~ 0.9 ~ 1.7
Q9BTA9 C553 2.1 20.0 ~ 1.7 ~ 1.4 ~ 1.7 ~ 2.8
P48643 C253 1.1 ~ 0.9 0.6 5.1 ~ ~ 0.8 ~ ~
O75362 C286 1.8 2.6 ~ 1.0 20.0 1.1 20.0 1.4 ~ 1.1
O60825 C158 1.8 0.7 1.2 ~ 0.8 ~ 0.9 1.5 14.2 1.4
Q8NBS9 C350 1.8 — 1.4 3.4 9.4 1.3 ~ 1.0 ~ ~
Q9NYL2 C22 ~ 1.2 ~ 0.7 0.7 1.3 ~ 0.9 ~ ~
P27707 C9 1.5 1.9 1.7 0.5 1.4 1.0 1.3 ~ ~ 1.3
Q93009 C223 20.0 2.5 ~ 1.0 3.1 ~ 1.3 1.1 ~ 1.8
O14929 C101 ~ 20.0 ~ 0.8 20.0 0.9 1.2 1.6 1.9 1.2
Q9UPQ0 C140 ~ 3.9 ~ 1.0 1.9 1.1 ~ 4.6 ~ 0.8
Q96NY7 C487 ~ 3.2 ~ 3.1 20.0 1.9 ~ 0.8 ~ 1.2
Q9NQ88 C114 1.4 2.1 ~ 0.8 ~ 1.1 1.2 2.2 5.1 0.9
Q14790 C360 3.2 ~ 5.8 ~ ~ 1.2 ~ ~ ~ 0.9
P04183 C230 2.0 1.3 ~ ~ ~ 0.7 ~ ~ ~ 0.7
P68366 C54 1.8 7.9 2.8 0.3 ~ ~ 1.3 ~ 5.1 0.8
Q13428 C1298 2.5 5.9 ~ 0.8 20.0 ~ ~ 3.0 ~ ~
Q5MNZ6 C63 11.2 3.6 ~ 0.9 20.0 1.1 ~ 1.2 ~ 0.8
O14980 C528 1.8 1.3 0.7 0.7 ~ 1.1 1.1 20.0 ~ 0.8
Q86W42 C35 2.7 ~ ~ 1.3 ~ ~ ~ 0.9 ~ ~
Q9Y6G9 C51 ~ 4.9 ~ ~ 6.5 ~ 1.5 2.1 4.3 0.6
Q9NY27 C22 1.8 ~ 2.1 2.3 20.0 ~ 1.5 1.3 ~ ~
Q8NFH5 C255 1.9 3.5 ~ 1.3 12.4 1.5 ~ 1.0 ~ 1.8
Q9Y676 C128 1.3 2.9 — 0.8 1.4 ~ ~ 1.2 ~ 1.5
P35658 C728 4.2 20.0 ~ 0.8 ~ 1.0 ~ 1.2 ~ 1.3
Q9NTX5 C133 ~ 1.6 ~ 1.0 1.2 1.1 ~ 1.0 ~ 1.1
Q15118 C71 ~ 18.0 2.7 0.6 ~ 1.6 ~ 4.4 ~ 1.7
Q00765 C18 ~ 4.2 ~ 0.8 20.0 1.2 ~ 20.0 ~ 0.7
P22307 C71 — 8.3 — 2.0 3.7 4.9 ~ 1.1 ~ 8.3
O75521 C312 ~ 5.0 ~ ~ 4.2 0.7 ~ 1.4 ~ 1.6
P49189 C288 20.0 20.0 ~ ~ 1.8 ~ 12.0 0.9 ~ 1.0
Q5T440 C170 2.4 5.7 ~ ~ ~ 1.2 ~ 1.2 ~ 1.6
Q15084 C190 1.6 3.7 ~ ~ 20.0 ~ ~ 1.1 ~ ~
Q96C19 C172 ~ 2.4 ~ 0.9 0.9 0.7 ~ 1.0 ~ ~
P22061 C102 1.3 1.6 ~ 0.8 5.8 0.9 ~ 1.0 ~ ~
Q9NP73 C86 ~ ~ ~ 0.7 ~ 0.9 1.4 1.1 ~ ~
Q9BRF8 C54 1.9 3.8 ~ 1.0 1.2 1.4 ~ 0.9 ~ ~
Q6ICB0 C108 ~ ~ ~ 0.7 ~ ~ 4.0 ~ ~ 0.6
P29590 C389 ~ 7.9 ~ 0.5 ~ 1.3 ~ 1.2 ~ 20.0
P07858 C211 ~ 7.4 ~ 1.1 ~ 1.6 ~ 1.0 ~ 2.6
Q9NX18 C83 ~ 4.4 ~ 1.1 20.0 ~ ~ 2.2 ~ 1.3
P46109 C249 4.0 20.0 ~ ~ 20.0 ~ 1.8 1.3 ~ 0.9
P45984 C177 3.8 20.0 ~ 0.9 2.1 ~ ~ 1.5 ~ ~
P19447 C342 2.7 20.0 — 0.7 6.7 1.1 ~ 2.5 ~ 2.0
P42166 C341 1.5 ~ ~ 0.8 ~ 1.3 1.4 2.3 ~ 1.7
Q8N1F7 C522 3.7 ~ ~ ~ 20.0 ~ 2.2 3.2 20.0 ~
Q86UY8 C276 ~ 5.8 ~ 1.4 20.0 1.5 ~ 2.8 ~ 1.8
Q8WWI1 C228 1.4 17.2 ~ 1.3 20.0 2.0 1.0 ~ ~ 1.1
Q9NWA0 C139 ~ 2.9 ~ ~ 20.0 1.1 ~ ~ ~ 1.5
P09110 C381 0.9 ~ ~ 0.8 4.5 ~ ~ 1.0 ~ ~
Q2NL82 C126 ~ 4.9 ~ ~ 20.0 ~ ~ 0.8 ~ 1.5
Q5JPI3 C308 4.0 ~ ~ 1.0 ~ 1.1 1.7 3.2 ~ 0.9
P23919 C163 2.4 ~ ~ 0.2 ~ ~ ~ 0.9 ~ ~
Q96EB1 C218 3.7 1.1 ~ ~ ~ 1.2 1.7 ~ ~ 0.7
Q96FX7 C209 16.1 ~ ~ ~ 20.0 2.1 6.8 2.7 ~ 2.0
O14933 C98 2.8 20.0 3.2 ~ ~ ~ ~ 2.5 ~ 0.7
Q29RF7 C242 ~ ~ ~ 0.9 12.9 ~ ~ 1.8 ~ 0.9
Q96T76 C819 4.0 — 3.0 0.8 ~ ~ ~ 3.4 ~ ~
P23919 C117 1.4 0.4 ~ ~ 0.2 0.6 ~ 1.2 ~ 0.9
Q15149 C4574 ~ 2.2 ~ 1.1 ~ 1.4 ~ 1.2 ~ 1.0
Q96RP9 C153 2.8 14.8 ~ ~ 20.0 ~ 0.8 1.2 ~ 1.4
P04818 C199 11.7 1.5 ~ ~ ~ ~ ~ ~ ~ ~
P27708 C73 2.2 — — ~ ~ ~ 5.0 20.0 3.7 0.6
P55265 C1224 ~ 10.7 ~ ~ ~ 1.9 ~ 4.2 ~ 2.2
Q9Y3D2 C105 ~ 20.0 ~ 1.3 20.0 ~ ~ 2.7 ~ ~
O00244 C12 1.6 2.1 ~ 0.7 ~ 1.5 ~ 0.7 ~ 0.7
Q8WV74 C207 20.0 ~ 20.0 ~ ~ 13.1 ~ 20.0 ~ ~
Q9NRW3 C130 6.2 ~ ~ ~ ~ ~ ~ ~ 1.4 1.3
P24468 C326 ~ 20.0 ~ 0.9 ~ ~ ~ 4.9 ~ 1.1
P42166 C684 ~ 5.2 ~ 0.7 ~ 1.2 ~ 3.1 ~ ~
Q96EY5 C231 2.1 ~ ~ 0.7 ~ ~ ~ ~ ~ 0.7
P14635 C238 2.1 6.7 ~ 0.7 ~ 1.3 1.5 2.6 ~ ~
Q8NDH3 C81 ~ 5.7 ~ ~ 20.0 ~ ~ 1.4 ~ 1.6
Q9P0J1 C149 ~ 18.0 ~ 1.4 18.5 ~ ~ 1.3 ~ ~
Q96P48 C900 ~ 1.8 ~ ~ 1.9 1.2 ~ ~ ~ 0.7
Q96HE7 C37 ~ ~ ~ ~ ~ ~ 1.4 1.4 ~ ~
Q07065 C100 ~ ~ ~ 1.0 20.0 1.6 ~ 11.2 ~ 2.2
Q9BRJ7 C88 4.0 20.0 ~ ~ ~ ~ ~ ~ ~ ~
O75439 C265 1.7 ~ 2.1 1.0 ~ ~ ~ 20.0 ~ ~
O43175 C369 2.0 ~ 2.4 ~ ~ ~ 2.0 ~ 20.0 1.0
Q9UNI6 C265 ~ 1.2 ~ ~ 1.2 ~ ~ 0.7 ~ ~
Q06203 C100 2.9 ~ 20.0 ~ ~ ~ 2.0 ~ ~ ~
A0AVT1 C347 2.6 ~ 20.0 1.6 ~ 1.3 ~ ~ ~ ~
Q86X76 C203 20.0 ~ ~ 0.8 3.9 ~ ~ 0.7 ~ 0.8
Q6XZF7 C691 10.0 20.0 20.0 ~ ~ 1.6 ~ ~ ~ 0.8
Q15398 C129 — — — 0.6 ~ ~ ~ — ~ ~
O75717 C773 ~ 2.2 ~ 1.2 ~ 1.1 ~ 1.7 ~ ~
Q01433 C107 1.4 ~ ~ 0.6 ~ ~ ~ ~ ~ ~
Q8WVV9 C464 ~ ~ ~ 0.9 20.0 ~ ~ 2.2 ~ 1.4
O14733 C131 ~ 1.8 ~ ~ 1.7 ~ 1.6 ~ 20.0 0.7
Q14137 C404 2.1 20.0 ~ ~ ~ 1.1 ~ ~ ~ 1.1
Q96RU2 C171 ~ ~ ~ 2.2 20.0 ~ ~ 1.3 ~ ~
Q9Y679 C391 ~ ~ ~ 3.0 ~ ~ ~ ~ ~ 1.5
P51610 C1872 1.0 1.5 ~ 0.5 ~ 1.0 ~ 1.1 ~ ~
P22307 C307 ~ ~ ~ 2.9 20.0 ~ ~ 0.9 ~ 20.0
Q9BTE3 C325 5.4 ~ ~ ~ ~ ~ ~ 5.7 ~ ~
Q9HA64 C24 1.6 ~ ~ ~ ~ ~ ~ 1.7 ~ 0.6
Q5TFE4 C119 20.0 ~ 20.0 ~ ~ ~ 1.7 3.8 ~ 1.2
Q96N67 C2125 ~ ~ ~ ~ ~ ~ ~ 1.8 ~ ~
P52948 C1312 ~ 20.0 ~ ~ 20.0 1.2 ~ 1.7 ~ 1.1
Q5UIP0 C2298 — 20.0 — 1.3 ~ ~ ~ 20.0 ~ 0.9
P51812 C436 1.8 ~ ~ 0.5 1.4 1.0 ~ 0.6 ~ ~
Q92616 C1692 ~ ~ ~ ~ ~ ~ 0.9 20.0 ~ ~
Q15345 C297 ~ ~ ~ 1.1 ~ 1.6 ~ 5.8 ~ 1.1
Q9NPH0 C267 2.6 13.4 ~ ~ ~ ~ ~ 1.2 ~ ~
P04183 C66 1.8 ~ ~ 0.6 ~ 1.0 ~ ~ ~ ~
P42166 C629 ~ 2.5 ~ 0.6 ~ 1.0 ~ 20.0 ~ 1.6
Q15013 C124 ~ ~ ~ 1.0 ~ ~ ~ 1.4 ~ ~
Q9Y5Y2 C72 1.3 1.3 ~ 0.5 ~ ~ 1.5 ~ ~ ~
O15446 C86 ~ 4.7 ~ ~ 3.3 1.3 ~ 1.0 ~ 1.7
Q13630 C116 1.6 20.0 ~ ~ ~ ~ 1.2 ~ ~ 0.8
Q8IYQ7 C324 ~ 5.3 ~ ~ ~ 1.3 ~ 1.6 ~ 2.0
P05091 C319 20.0 ~ ~ ~ 20.0 ~ ~ 3.3 ~ ~
Q29RF7 C532 ~ 20.0 ~ ~ ~ ~ 2.0 2.5 ~ ~
Q9Y570 C381 — 4.7 — ~ ~ 0.9 ~ — ~ ~
Q14980 C961 ~ ~ ~ ~ 20.0 2.2 ~ ~ ~ 3.7
P53384 C235 ~ ~ ~ 0.5 ~ ~ ~ ~ ~ ~
Q15003 C418 2.5 ~ ~ 1.6 ~ ~ ~ ~ ~ ~
P53634 C258 ~ 5.5 ~ ~ 13.4 1.6 ~ 1.9 ~ 2.8
Q8NFF5 C499 12.1 — — 1.6 ~ ~ ~ 20.0 ~ ~
Q9ULA0 C144 1.3 5.2 ~ ~ ~ ~ ~ ~ ~ ~
P22307 C94 ~ ~ ~ 3.3 20.0 ~ ~ 1.0 ~ ~
O15294 C620 ~ ~ 12.0 ~ ~ ~ ~ 1.0 ~ ~
Q9Y5S2 C1517 ~ ~ ~ 1.2 ~ ~ ~ 0.9 ~ ~
Q8TD19 C623 20.0 ~ ~ ~ ~ ~ 20.0 ~ ~ 0.8
Q8N2W9 C326 2.6 4.5 ~ ~ ~ ~ 20.0 ~ ~ 1.0
Q13158 C98 ~ ~ ~ ~ ~ 1.6 ~ ~ ~ ~
Q9UKX7 C151 1.8 ~ ~ 1.0 6.7 ~ ~ 1.8 ~ ~
Q6PCB5 C280 2.5 ~ ~ ~ 20.0 ~ 0.8 20.0 ~ ~
P10398 C597 1.3 1.4 ~ ~ ~ ~ 1.0 ~ ~ 0.7
Q9UL40 C68 20.0 ~ ~ ~ 20.0 ~ 20.0 20.0 ~ ~
P46013 C903 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q16667 C39 4.5 2.3 ~ ~ ~ ~ 1.2 ~ ~ 0.8
O75150 C890 ~ 4.2 ~ 1.1 ~ ~ ~ 1.1 ~ ~
Q00610 C870 ~ ~ ~ 0.9 1.7 ~ ~ 20.0 ~ ~
Q9Y5T5 C205 ~ ~ ~ ~ ~ ~ 20.0 ~ ~ 0.9
O95881 C66 20.0 ~ ~ ~ 20.0 20.0 ~ 1.3 ~ ~
Q7Z5K2 C160 — 20.0 — 1.0 ~ ~ ~ 2.4 ~ ~
P42166 C518 1.3 4.2 ~ 0.8 ~ ~ ~ ~ ~ 1.4
Q9Y2S7 C143 ~ 13.9 ~ ~ ~ ~ ~ 1.3 ~ ~
E2QRD5 C183 ~ ~ ~ 0.9 ~ ~ ~ ~ ~ 0.8
O95833 C22 ~ ~ ~ ~ ~ ~ ~ 0.8 ~ 1.0
O94953 C694 4.2 ~ 3.9 ~ ~ ~ 1.5 ~ ~ ~
O00541 C272 ~ ~ ~ 1.0 ~ 1.4 ~ 3.8 ~ ~
Q9NXJ5 C149 20.0 ~ 11.0 ~ 3.9 ~ 20.0 ~ ~ ~
Q8N5L8 C131 ~ ~ ~ ~ ~ 1.2 ~ 1.3 ~ 0.8
Q8IZ73 C246 6.5 ~ ~ ~ ~ ~ ~ ~ ~ 1.8
Q99798 C385 ~ 0.7 ~ ~ ~ ~ ~ ~ ~ 2.0
Q9GZR2 C382 1.0 ~ ~ ~ 20.0 ~ 0.9 1.2 ~ ~
Q13613 C117 4.3 ~ ~ ~ ~ ~ ~ 1.0 1.7 0.9
Q9NUI1 C22 ~ ~ ~ 2.1 ~ ~ ~ 1.1 ~ ~
Q02556 C306 5.5 ~ ~ ~ ~ ~ 1.5 ~ ~ ~
Q9UPT9 C171 4.4 ~ 9.3 0.8 ~ ~ 3.2 ~ ~ ~
Q8N999 C302 20.0 ~ ~ ~ ~ ~ ~ ~ ~ ~
Q8IU81 C363 ~ 5.9 ~ ~ ~ ~ ~ ~ ~ ~
Q9C0I1 C152 ~ ~ ~ ~ 20.0 ~ 6.5 ~ ~ 0.7
Q9P2X3 C195 ~ ~ ~ ~ ~ ~ ~ 1.6 ~ ~
Q6QNY0 C168 4.6 — — ~ ~ ~ 1.2 — ~ ~
Q15796 C81 5.6 ~ ~ ~ ~ ~ ~ ~ ~ 1.3
Q9NZB2 C531 ~ ~ ~ ~ 20.0 ~ ~ ~ ~ 1.6
Q9HB90 C377 20.0 ~ ~ ~ ~ ~ ~ 1.1 ~ ~
Figure imgf000203_0001
Table 2C
Identifier 27_200uM_insitu_231 28_200uM_insitu_231 29_200uM_insitu_ramos 31_200uM_insitu_231 31_200uM_insitu_ramos 33_200uM_insitu_231 38_200uM_insitu_231 41_200uM_insitu_231 45_200uM_insitu_231 51_200uM_insitu_231 56_200uM_insitu_231
P04406 C152 0.5 0.8 1.2 0.8 1.0 1.1 0.9 0.6 20.0 1.0 0.9
P61978 C132 0.9 1.0 1.2 0.9 1.1 1.3 1.0 0.9 9.3 1.2 0.8
Q13526 C113 0.5 0.7 1.0 0.6 1.1 1.2 1.0 0.6 10.0 1.4 ~
P24752 C119 1.6 1.0 1.2 0.6 0.7 2.3 1.2 1.9 12.4 0.7 0.8
P24752 C413 1.9 1.0 1.2 1.0 ~ 20.0 1.1 0.9 20.0 1.1 0.7
Q9NUY8 C283 0.6 0.7 1.3 0.6 ~ 1.6 1.0 0.7 2.7 2.1 0.9
P13667 C206 1.0 1.3 1.4 0.8 1.1 2.7 0.9 1.0 1.9 5.3 0.8
P12268 C140 0.5 0.6 1.6 0.6 1.0 1.4 1.2 0.7 1.9 1.0 1.1
Q15365 C194 0.5 ~ 1.0 0.7 1.6 2.2 2.5 0.3 1.5 0.6 1.2
Q9NVC6 C649 1.3 1.1 1.6 0.8 ~ 1.2 1.2 0.8 20.0 ~ 0.8
P42166 C561 0.8 1.1 0.9 1.3 2.5 2.9 20.0 0.7 20.0 1.3 1.1
Q9Y696 C35 1.0 2.7 1.0 0.6 ~ 8.9 1.0 0.6 1.4 2.6 0.9
P10599 C32 2.5 20.0 1.9 1.2 1.7 20.0 2.1 0.4 5.7 13.9 1.5
P31943 C267 1.1 ~ 1.4 0.7 0.9 1.3 1.2 1.0 3.6 1.3 0.8
Q86SX6 C67 1.5 ~ 1.2 ~ 0.7 1.0 1.0 1.0 9.1 1.0 0.8
P15121 C299 0.5 0.8 1.1 0.8 1.3 0.7 1.0 0.9 1.0 0.9 0.9
P52597 C267 1.0 ~ 1.2 1.2 1.3 1.6 1.8 0.8 4.1 1.3 0.9
Q9ULV4 C420 0.8 0.7 — 0.8 1.1 1.6 1.4 0.8 3.2 1.1 1.0
P62888 C92 0.6 0.9 2.0 ~ ~ 1.1 3.2 0.8 2.2 0.9 0.9
Q9NQR4 C153 1.3 ~ ~ 0.6 1.4 1.0 1.1 0.7 2.0 0.6 0.8
P42765 C92 1.2 ~ 0.9 0.8 1.2 1.3 0.9 1.1 15.1 1.1 ~
Q15084 C55 1.2 1.4 1.0 0.7 ~ 3.7 0.8 0.9 2.0 5.0 0.8
Q96HE7 C241 0.9 1.1 1.3 0.7 — 2.3 1.6 0.7 2.1 1.7 0.9
Q99439 C164 ~ 0.9 1.0 0.7 1.3 1.5 1.5 0.4 1.8 ~ 1.2
P25205 C119 1.0 ~ 1.3 0.5 ~ 1.2 0.9 0.9 6.2 2.6 1.0
Q9NS86 C187 0.4 ~ 1.7 0.5 ~ 1.5 2.0 0.6 3.1 2.0 1.2
Q15233 C145 1.2 1.1 ~ 0.7 0.8 1.3 0.9 1.1 3.7 1.6 0.8
Q9BRA2 C43 20.0 ~ 2.3 0.6 ~ 20.0 1.1 0.6 8.0 ~ 3.9
P35611 C68 0.9 1.7 ~ ~ ~ 1.6 1.4 0.9 20.0 1.0 0.8
O75521 C380 1.6 0.8 1.2 ~ ~ 0.7 0.9 1.1 1.2 1.0 1.1
Q9BXW7 C392 1.3 1.0 1.5 ~ — 1.1 0.8 1.0 4.9 1.1 0.8
P30101 C406 1.2 ~ 1.5 0.8 1.1 ~ 0.8 1.0 2.6 6.0 0.7
Q96AB3 C114 1.0 0.8 1.1 0.7 ~ 0.9 1.2 0.7 3.1 0.8 ~
P13667 C555 1.2 ~ 1.5 0.8 1.1 ~ 0.8 1.0 2.6 6.0 0.7
Q09161 C44 0.8 ~ 1.1 0.6 ~ 1.1 1.2 0.8 2.5 1.2 1.0
P78417 C32 20.0 20.0 20.0 0.8 2.6 20.0 1.0 0.3 1.9 ~ 1.3
Q9ULW0 C536 1.1 1.0 1.8 ~ ~ 2.2 1.2 1.2 20.0 1.4 ~
Q9NRG0 C55 0.9 ~ 1.5 ~ 1.6 1.1 1.0 0.9 20.0 1.1 1.0
Q96T76 C848 0.5 0.8 1.0 20.0 ~ ~ ~ 0.7 20.0 ~ 1.4
Q8TAQ2 C145 2.1 0.9 2.7 0.8 1.1 1.3 1.3 1.2 2.9 1.2 0.8
Q9BVC5 C10 1.1 0.9 1.8 ~ ~ 1.3 0.9 1.0 3.4 1.0 0.8
Q7Z2W4 C645 0.8 ~ 1.6 ~ ~ 2.1 0.9 0.9 4.6 1.6 0.9
Q9BQ69 C186 2.4 ~ 1.5 0.5 ~ 0.9 3.6 1.2 3.3 1.2 0.8
Q16831 C162 0.5 0.7 ~ ~ ~ 1.3 1.4 0.8 1.3 1.1 1.1
P30101 C57 1.1 1.6 1.2 0.9 ~ ~ 0.9 1.3 2.6 ~ 0.7
P12268 C331 0.7 0.8 1.3 1.0 1.1 1.5 1.3 0.8 4.1 ~ ~
O95571 C170 1.1 ~ 1.5 ~ ~ 1.2 ~ 1.1 3.1 ~ 0.8
O00299 C24 0.7 ~ 1.1 0.6 ~ 8.4 1.0 0.6 ~ ~ 0.9
O14879 C343 0.4 0.7 ~ 0.9 ~ 2.4 ~ 0.3 4.0 1.3 1.2
Q96CM8 C64 1.8 ~ 2.3 0.8 ~ 2.3 1.1 0.8 20.0 ~ 0.8
P51946 C244 1.1 ~ 1.1 ~ ~ 0.9 1.0 1.2 2.4 0.7 1.0
P49588 C773 0.6 ~ 1.1 ~ 1.2 ~ 0.7 0.7 10.1 ~ ~
Q96R 5 C618 0.8 0.7 1.3 0.9 ~ 2.0 1.4 ~ 20.0 2.1 0.8
O15294 C758 1.0 1.0 ~ ~ ~ 1.2 ~ 0.9 2.9 1.1 1.0
P46734 C207 0.8 ~ ~ ~ ~ 1.2 1.5 0.7 1.5 0.7 1.3
Q96S55 C272 0.9 0.5 ~ ~ ~ 1.2 1.3 0.8 2.5 0.9 0.9
O95229 C54 0.5 0.7 1.2 ~ ~ ~ ~ 0.6 ~ 1.0 1.0
O60610 C1227 0.5 ~ 1.6 0.6 1.2 ~ 0.3 0.6 3.0 ~ 1.2
Q13428 C38 1.3 1.0 1.5 ~ — ~ 1.1 1.1 2.8 1.1 0.8
Q9Y277 C65 1.6 1.2 ~ ~ 1.0 ~ 20.0 1.4 ~ ~ 0.9
P57764 C268 0.6 ~ 1.1 ~ 3.1 ~ 1.8 0.6 20.0 1.8 0.8
Q9Y3A3 C134 ~ ~ 1.2 1.0 1.8 1.2 1.0 0.9 2.3 ~ 0.9
Q02252 C317 2.0 1.1 1.2 ~ ~ 2.3 2.4 0.8 2.6 ~ ~
Q9NYL9 C132 0.6 0.7 0.9 0.7 0.7 ~ 2.6 0.3 0.9 0.8 ~
P83731 C6 0.4 ~ 1.6 ~ ~ 1.6 2.1 0.3 1.2 ~ ~
O95336 C32 0.5 ~ 3.0 ~ 1.7 2.3 2.8 0.6 ~ ~ 0.9
Q13155 C291 0.7 0.9 ~ 0.8 ~ ~ 1.7 0.7 7.3 1.1 1.1
Q13418 C346 0.6 ~ ~ 0.6 ~ 1.4 1.0 0.6 ~ 0.8 ~
A6NDU8 C179 0.7 0.8 1.0 ~ ~ 7.4 1.4 0.6 20.0 ~ 1.2
Q9UKF6 C498 1.5 ~ 2.3 ~ ~ 1.6 ~ 1.0 20.0 ~ 1.0
Q96F86 C413 0.7 ~ 1.9 ~ 1.3 3.0 ~ ~ 20.0 20.0 1.2
P42224 C492 ~ 0.8 3.8 0.8 1.9 1.3 ~ 0.6 20.0 2.4 1.2
P11216 C326 0.7 ~ ~ 0.5 ~ 1.6 1.2 0.7 5.3 0.8 ~
P21980 C277 0.6 0.4 ~ ~ ~ ~ 1.4 0.6 1.1 1.0 ~
Q9HAV7 C124 0.6 ~ 1.0 ~ ~ ~ 1.0 0.6 2.1 ~ 0.7
P24752 C126 — 1.2 1.2 0.7 — 20.0 ~ 0.8 ~ — 0.8
Q9NQ88 C161 0.6 ~ 1.3 1.0 2.6 ~ ~ ~ 20.0 ~ 1.4
Q13155 C23 0.7 ~ 1.6 ~ ~ 1.4 1.4 ~ 4.4 1.2 0.9
Q9NQW6 C712 1.0 0.9 ~ 0.7 ~ 1.5 1.5 0.7 5.9 ~ 1.0
P51649 C340 0.9 ~ 1.0 ~ 1.0 0.9 ~ ~ 2.4 ~ ~
Q15021 C439 0.6 0.9 1.4 20.0 ~ ~ 1.1 0.8 ~ 1.1 1.2
Q5T0N5 C69 0.6 ~ ~ ~ ~ 2.6 1.5 0.6 5.4 1.7 1.0
P38606 C138 4.6 ~ ~ ~ 1.7 20.0 2.1 0.8 ~ 10.6 1.1
Q9HCC0 C216 1.0 — — ~ — 1.0 ~ 1.0 7.0 0.9 0.7
Q9NQC3 C1101 1.4 ~ ~ 12.8 ~ 2.1 20.0 1.0 ~ 1.2 0.8
P35754 C23 0.7 1.4 ~ 0.6 1.1 20.0 0.9 0.5 16.3 ~ 0.9
Q99757 C90 6.3 ~ 1.6 1.4 ~ ~ 3.6 0.5 6.7 5.0 ~
Q9Y3D0 C93 0.7 0.8 1.0 ~ ~ ~ 2.4 0.6 20.0 20.0 ~
Q9UMS0 C213 1.3 ~ 2.0 1.5 ~ 1.9 3.5 0.9 7.1 1.8 0.7
Q9NXV6 C516 1.4 ~ 1.1 ~ ~ 1.4 1.8 0.9 ~ ~ ~
Q96RS6 C376 ~ ~ 1.3 0.8 ~ ~ 1.3 0.7 ~ ~ ~
Q14997 C1840 20.0 0.9 ~ ~ 1.2 ~ 20.0 0.9 ~ 20.0 ~
P50570 C27 0.5 ~ 1.0 ~ ~ ~ 0.9 0.5 15.4 1.1 ~
Q86YH6 C71 1.3 ~ 1.1 ~ ~ ~ ~ 0.9 20.0 ~ 1.0
Q99497 C106 0.9 1.0 ~ 0.8 1.0 ~ 1.1 0.6 20.0 1.4 ~
Q9UJW0 C258 0.8 ~ 20.0 ~ ~ ~ ~ 0.7 2.7 1.1 1.3
Q9BUH6 C180 ~ 1.1 1.5 0.8 ~ 1.4 1.4 0.6 5.5 ~ 1.3
P24752 C196 ~ ~ 1.1 ~ ~ 3.4 1.0 0.9 ~ ~ 0.8
Q13162 C51 0.9 ~ ~ ~ 1.3 1.7 2.4 ~ 2.3 ~ 1.0
Q9BTA9 C553 ~ ~ 1.1 ~ ~ ~ 1.1 1.0 19.1 1.2 0.9
P48643 C253 ~ 0.8 ~ 0.5 1.0 ~ 1.1 0.8 1.2 0.8 0.9
O75362 C286 1.0 0.9 1.0 ~ ~ ~ ~ 1.0 20.0 ~ 0.9
O60825 C158 ~ ~ 0.9 ~ 1.3 ~ ~ 0.7 1.3 0.9 1.2
Q8NBS9 C350 1.1 1.1 ~ 0.8 ~ 2.6 ~ 0.9 2.1 ~ 0.8
Q9NYL2 C22 0.7 ~ 1.2 ~ ~ 1.5 ~ 0.6 2.3 ~ 1.0
P27707 C9 ~ ~ 1.2 ~ 1.2 1.9 1.8 0.4 ~ ~ ~
Q93009 C223 1.1 1.0 1.0 ~ — ~ 0.8 ~ 20.0 1.0 ~
O14929 C101 ~ ~ 1.0 ~ ~ 1.3 1.1 0.6 20.0 ~ ~
Q9UPQ0 C140 0.7 ~ ~ ~ ~ 2.5 2.5 0.5 20.0 ~ 1.1
Q96NY7 C487 ~ ~ ~ 0.5 ~ 14.9 ~ 0.4 1.8 ~ 1.3
Q9NQ88 C114 0.4 ~ 0.9 ~ ~ ~ ~ 0.6 4.6 ~ ~
Q14790 C360 0.4 ~ 1.6 0.5 ~ 3.2 1.0 0.5 20.0 ~ 1.2
P04183 C230 ~ ~ 1.2 ~ 2.1 1.6 2.8 0.6 2.0 0.7 ~
P68366 C54 0.3 ~ 1.0 ~ 1.1 ~ 2.0 ~ ~ ~ ~
Q13428 C1298 1.0 0.9 ~ ~ 2.2 ~ 2.0 1.0 4.7 ~ ~
Q5MNZ6 C63 ~ 1.7 ~ ~ ~ 1.6 ~ 0.8 2.6 ~ 0.7
O14980 C528 ~ ~ 1.1 ~ ~ ~ 2.1 0.9 ~ 1.1 ~
Q86W42 C35 1.3 1.3 ~ 0.5 ~ 1.6 0.8 1.1 20.0 1.4 0.7
Q9Y6G9 C51 0.6 0.7 ~ ~ 1.6 1.8 ~ 0.7 ~ ~ ~
Q9NY27 C22 ~ ~ 1.7 ~ ~ ~ 1.0 0.8 20.0 2.2 ~
Q8NFH5 C255 ~ ~ 1.6 ~ ~ ~ 1.1 0.8 5.5 ~ ~
Q9Y676 C128 1.0 ~ ~ ~ ~ ~ 1.0 ~ 2.0 1.0 ~
P35658 C728 ~ ~ 1.7 0.7 ~ 6.4 ~ 1.1 3.1 ~ ~
Q9NTX5 C133 1.5 ~ ~ ~ ~ 1.0 ~ ~ 4.2 ~ 0.9
Q15118 C71 1.5 ~ 1.8 ~ ~ ~ 1.3 ~ 20.0 ~ ~
Q00765 C18 1.1 ~ 1.4 ~ ~ 1.1 ~ ~ 20.0 ~ 0.9
P22307 C71 20.0 ~ ~ ~ ~ ~ ~ 1.2 16.8 5.6 ~
O75521 C312 1.1 ~ ~ ~ ~ ~ 1.1 ~ 0.7 ~ 0.9
P49189 C288 — — — ~ — 20.0 0.7 0.8 20.0 — 0.9
Q5T440 C170 ~ ~ 1.3 ~ ~ 2.5 ~ 0.7 3.5 ~ 1.0
Q15084 C190 1.0 ~ 1.0 1.0 ~ ~ 1.0 0.9 2.8 ~ ~
Q96C19 C172 ~ ~ 1.2 1.6 ~ ~ 0.7 0.5 ~ 1.2 ~
P22061 C102 2.2 ~ 1.2 ~ ~ ~ ~ 0.7 ~ ~ ~
Q9NP73 C86 0.4 ~ 2.0 ~ ~ ~ 1.3 0.5 20.0 1.3 ~
Q9BRF8 C54 ~ ~ 1.5 ~ ~ 1.8 ~ 0.7 20.0 ~ ~
Q6ICB0 C108 1.9 ~ 1.0 1.2 1.3 ~ 1.1 ~ 3.1 1.1 ~
P29590 C389 0.5 — — 1.1 — ~ 2.2 0.4 3.4 0.9 ~
P07858 C211 0.8 ~ ~ ~ ~ 2.2 1.8 0.4 2.4 ~ ~
Q9NX18 C83 1.5 ~ 1.9 ~ ~ ~ ~ ~ 3.7 1.0 0.7
P46109 C249 ~ ~ 4.2 ~ ~ ~ 1.6 ~ 20.0 ~ 0.8
P45984 C177 0.5 ~ 1.7 ~ ~ 2.0 ~ 0.5 ~ ~ 1.1
P19447 C342 ~ ~ ~ ~ ~ ~ 1.0 0.7 5.9 ~ 0.7
P42166 C341 0.6 ~ ~ ~ 2.0 ~ ~ 0.5 ~ 1.2 ~
Q8N1F7_C522 ~ ~ 1.7 ~ 3.6 1.7 1.7 0.8 20.0 ~ ~
Q86UY8 C276 1.7 ~ ~ ~ ~ 0.8 ~ 1.7 2.1 1.2 ~
Q8WWI1 C228 0.8 ~ 1.0 ~ ~ ~ ~ 0.8 20.0 ~ ~
Q9NWA0 C139 1.3 ~ 0.7 ~ ~ 1.6 ~ 0.8 ~ ~ 1.1
P09110 C381 ~ 0.8 ~ 0.7 ~ ~ 0.8 0.9 1.2 ~ 0.8
Q2NL82 C126 0.4 ~ ~ ~ ~ ~ 0.9 0.8 20.0 1.9 ~
Q5JPI3 C308 0.7 ~ 1.3 ~ ~ ~ ~ 0.7 ~ ~ 0.7
P23919 C163 0.2 0.4 ~ ~ 1.0 ~ 5.7 0.2 ~ ~ 0.9
Q96EB1 C218 0.6 ~ 1.1 ~ ~ ~ 1.3 ~ 20.0 ~ ~
Q96FX7 C209 ~ ~ ~ ~ ~ ~ 1.0 ~ 20.0 ~ 0.9
O14933 C98 ~ ~ 2.4 ~ 1.9 ~ 1.6 0.5 20.0 ~ ~
Q29RF7 C242 0.9 1.0 ~ ~ ~ ~ 1.0 ~ ~ 1.8 1.0
Q96T76 C819 0.5 ~ ~ 20.0 20.0 ~ 4.6 0.6 20.0 ~ 0.9
P23919 C117 ~ ~ 1.6 ~ ~ 2.5 4.4 ~ ~ ~ ~
Q15149 C4574 1.3 ~ ~ ~ ~ 1.4 ~ 0.9 ~ ~ 1.0
Q96RP9 C153 1.6 ~ 1.2 ~ ~ ~ ~ 1.1 ~ ~ ~
P04818 C199 0.5 — 2.5 ~ — ~ 20.0 0.7 20.0 20.0 ~
P27708 C73 0.5 ~ 1.8 ~ ~ ~ ~ 0.7 ~ ~ ~
P55265 C1224 1.4 ~ 1.8 ~ ~ 1.6 ~ ~ 20.0 1.3 ~
Q9Y3D2 C105 1.5 ~ ~ ~ ~ ~ 1.6 1.2 20.0 ~ ~
O00244 C12 0.4 ~ 1.4 ~ ~ 1.8 ~ ~ ~ ~ ~
Q8WV74 C207 2.2 — — ~ — ~ ~ 0.9 20.0 — ~
Q9NRW3 C130 1.3 ~ 13.3 ~ ~ ~ ~ 0.9 20.0 ~ 0.9
P24468 C326 1.2 ~ ~ ~ ~ ~ ~ 0.8 20.0 ~ ~
P42166 C684 ~ ~ ~ ~ 3.3 ~ 3.0 ~ 6.8 1.2 1.1
Q96EY5 C231 0.6 ~ 1.1 ~ ~ 1.2 1.8 ~ ~ ~ ~
P14635 C238 ~ ~ ~ ~ ~ 1.2 1.8 0.5 ~ ~ ~
Q8NDH3 C81 ~ ~ ~ 1.0 ~ ~ 2.0 ~ ~ ~ 1.0
Q9P0J1 C149 ~ ~ ~ 1.0 ~ ~ ~ 0.9 20.0 ~ 0.8
Q96P48 C900 ~ ~ 1.1 1.1 ~ 1.7 ~ 0.5 ~ ~ ~
Q96HE7 C37 ~ ~ 2.1 0.6 ~ ~ 3.4 0.4 20.0 ~ ~
Q07065 C100 ~ ~ ~ ~ ~ 1.7 ~ 1.0 20.0 ~ ~
Q9BRJ7 C88 ~ ~ 1.7 ~ 2.9 ~ ~ 0.4 ~ 1.6 1.0
O75439 C265 1.6 0.6 ~ 1.0 1.4 ~ ~ 1.3 ~ ~ ~
O43175 C369 ~ ~ 1.1 20.0 20.0 ~ ~ ~ ~ ~ ~
Q9UNI6 C265 0.6 0.8 ~ ~ ~ ~ 1.0 1.0 7.5 ~ ~
Q06203 C100 0.8 ~ 1.6 0.6 1.2 ~ 1.2 ~ ~ ~ ~
A0AVT1 C347 0.8 ~ ~ ~ 1.9 ~ ~ 0.6 ~ ~ ~
Q86X76 C203 — — — ~ — ~ ~ 1.3 20.0 — 0.8
Q6XZF7 C691 ~ ~ ~ 2.3 ~ 2.5 ~ ~ ~ ~ ~
Q15398 C129 0.5 ~ 1.2 ~ 2.4 1.3 1.4 ~ ~ 7.4 1.3
O75717 C773 ~ ~ ~ ~ ~ ~ 2.3 1.0 3.6 ~ 0.9
Q01433 C107 0.4 ~ ~ 0.7 ~ ~ ~ 0.5 ~ ~ ~
Q8WVV9 C464 ~ ~ ~ ~ ~ 2.2 1.5 ~ 20.0 3.2 ~
O14733 C131 ~ ~ 1.0 ~ ~ ~ 20.0 0.8 ~ ~ ~
Q14137 C404 0.6 ~ 1.5 ~ ~ ~ ~ ~ ~ ~ ~
Q96RU2 C171 1.1 1.1 ~ ~ ~ ~ 0.8 ~ ~ ~ ~
Q9Y679 C391 1.2 ~ ~ ~ ~ ~ ~ ~ 20.0 20.0 0.7
P51610 C1872 ~ ~ ~ ~ ~ ~ 1.3 0.5 ~ ~ ~
P22307 C307 ~ ~ ~ ~ ~ ~ ~ 1.1 ~ ~ ~
Q9BTE3 C325 ~ 1.0 ~ 1.9 3.5 ~ 2.2 0.6 20.0 ~ ~
Q9HA64 C24 ~ ~ ~ ~ ~ ~ 1.7 ~ ~ ~ 1.0
Q5TFE4 C119 ~ ~ ~ 3.1 ~ ~ ~ 0.5 ~ ~ ~
Q96N67 C2125 0.9 ~ ~ 0.8 ~ ~ 1.5 0.7 ~ ~ 0.9
P52948 C1312 1.3 ~ ~ ~ ~ 1.8 1.3 ~ ~ ~ ~
Q5UIP0 C2298 ~ ~ ~ ~ ~ 3.4 ~ 1.2 20.0 ~ ~
P51812 C436 ~ ~ 1.3 ~ ~ ~ ~ 4.5 ~ ~ ~
Q92616 C1692 0.5 ~ ~ 0.6 2.0 1.2 ~ 0.7 ~ ~ ~
Q15345 C297 ~ ~ ~ 0.9 ~ 1.9 1.4 0.8 ~ ~ ~
Q9NPH0 C267 1.3 ~ 2.1 ~ ~ ~ ~ ~ 20.0 ~ ~
P04183 C66 ~ ~ 1.6 0.7 ~ ~ ~ 0.4 ~ ~ ~
P42166 C629 ~ ~ 1.6 ~ ~ ~ ~ ~ ~ ~ ~
Q15013 C124 1.0 1.0 ~ ~ ~ ~ ~ ~ 20.0 1.2 ~
Q9Y5Y2 C72 ~ ~ 1.1 ~ ~ ~ 1.7 ~ ~ ~ ~
O15446 C86 ~ ~ ~ ~ ~ 1.6 ~ ~ 4.6 ~ ~
Q13630 C116 ~ ~ ~ ~ ~ ~ ~ ~ 20.0 ~ ~
Q8IYQ7 C324 ~ ~ ~ ~ ~ 1.4 ~ ~ ~ ~ ~
P05091 C319 ~ ~ 10.5 ~ ~ ~ ~ 0.9 20.0 ~ ~
Q29RF7 C532 ~ ~ ~ 5.1 8.1 ~ ~ 0.8 ~ ~ ~
Q9Y570 C381 — — 1.2 ~ — ~ ~ 0.2 ~ — ~
Q14980 C961 2.4 ~ ~ 1.1 ~ 4.8 2.6 ~ ~ ~ ~
P53384 C235 ~ ~ ~ 0.7 ~ ~ 2.0 0.2 ~ ~ ~
Q15003 C418 ~ ~ 1.4 ~ ~ ~ ~ ~ 20.0 ~ 1.3
P53634 C258 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Q8NFF5 C499 — — 3.4 ~ — ~ ~ ~ ~ — 1.3
Q9ULA0 C144 ~ ~ 1.3 ~ ~ ~ ~ 0.8 ~ ~ 0.9
P22307 C94 ~ 20.0 ~ ~ ~ ~ ~ ~ 20.0 ~ ~
O15294 C620 ~ ~ 2.6 1.1 ~ ~ ~ 1.0 ~ ~ ~
Q9Y5S2 C1517 ~ ~ ~ ~ ~ ~ ~ 0.4 20.0 ~ ~
Q8TD19 C623 ~ ~ ~ 0.7 1.4 ~ ~ 0.6 ~ ~ ~
Q8N2W9 C326 ~ ~ ~ ~ ~ ~ 0.9 0.8 ~ ~ ~
Q13158 C98 0.6 ~ 1.3 0.9 ~ ~ 1.5 ~ ~ ~ ~
Q9UKX7 C151 ~ ~ ~ ~ ~ ~ ~ 0.8 ~ ~ ~
Q6PCB5 C280 1.0 ~ 1.4 ~ ~ ~ ~ ~ ~ ~ ~
P10398 C597 ~ ~ 1.2 ~ ~ ~ ~ ~ ~ ~ ~
Q9UL40 C68 1.9 ~ 2.4 ~ ~ ~ ~ ~ ~ ~ ~
P46013 C903 ~ ~ 1.6 ~ ~ ~ 1.1 1.0 20.0 1.1 ~
Q16667 C39 ~ ~ 1.7 ~ ~ ~ ~ ~ ~ ~ ~
O75150 C890 1.2 ~ ~ 0.7 ~ ~ ~ 1.3 ~ ~ ~
Q00610 C870 ~ ~ ~ ~ ~ ~ 20.0 ~ ~ ~ ~
Q9Y5T5 C205 1.2 ~ ~ ~ ~ ~ 1.5 ~ ~ ~ ~
O95881 C66 ~ ~ ~ ~ ~ ~ 1.0 1.0 ~ ~ ~
Q7Z5K2 C160 ~ ~ 0.7 ~ ~ ~ ~ 1.0 ~ ~ ~
P42166 C518 ~ ~ ~ ~ ~ ~ 2.2 ~ ~ ~ ~
Q9Y2S7 C143 ~ ~ ~ ~ ~ ~ ~ 1.2 ~ ~ 0.8
Figure imgf000208_0001
[00450] Table 3 illustrates a list of cysteine-containing proteins and their potential cysteine sites of conjugation.
Figure imgf000208_0002
Identifier Protein Name Cysteine Location Protein Class
O00541 PES1 Pescadillo homolog C272; C361 Uncategorized
O00622 CYR61 Protein CYR61 C39; C70; C134 Uncategorized
O14920 IKBKB Inhibitor of nuclear factor kappa-B kinase subunit C464 Enzyme
O14933 UBE2L6 Ubiquitin/ISG15-conjugating enzyme E2 L6 PCTK C98 Enzyme
O14980 XPO1 Exportin-1 C34; C528; C1070 Uncategorized
O75362 ZNF217 Zinc finger protein 217 C286 Transcription factors and regulators
O94953 KDM4B Lysine-specific demethylase 4B C694 Enzyme
P00813 ADA Adenosine deaminase C75 Enzyme
P04150 NR3C1 Glucocorticoid receptor C302; C622 Transcription factors and regulators
P09086 POU2F2 POU domain, class 2, transcription factor 2 C346 Transcription factors and regulators
P09211 GSTP1 Glutathione S-transferase P C48 Enzyme
P14598 NCF1 Neutrophil cytosol factor 1 C378 Adapter, scaffolding, modulator proteins
P15374 UCHL3 Ubiquitin carboxyl-terminal hydrolase isozyme L3 C95 Enzyme
P16455 MGMT Methylated-DNA--protein-cysteine methyltransferase C145; C150 Enzyme
P17812 CTP synthase 1 C491 Enzyme
P19447 ERCC3 TFIIH basal transcription factor complex helicase C342 Enzyme
P21580 TNFAIP3 Tumor necrosis factor alpha-induced protein 3 C54 Enzyme
P24752 ACAT1 Acetyl-CoA acetyltransferase, mitochondrial C119; C126; C196; C413 Enzyme
P40261 Nicotinamide N-methyltransferase C165 Enzyme
P40763 STAT3 Signal transducer and activator of transcription 3 C259 Transcription factors and regulators
P41226 UBA7 Ubiquitin-like modifier-activating enzyme 7 C599 Enzyme
P42575 CASP2 Caspase-2 C370 Enzyme
P43403 ZAP70 Tyrosine-protein kinase ZAP-70 C117 Enzyme
P48200 IREB2 Iron-responsive element-binding protein 2 C137 Transcription factors and regulators
P48735 IDH2 Isocitrate dehydrogenase C308 Enzyme
P50851 LRBA Lipopolysaccharide-responsive and beige-like anchor protein C1704; C2675 Uncategorized
P51617 IRAK1 Interleukin-1 receptor-associated kinase 1 C608 Enzyme
P61081 NEDD8-conjugating enzyme Ubc12 C47 Enzyme
P61088 Ubiquitin-conjugating enzyme E2 N C87 Enzyme
P63244 GNB2L1 Guanine nucleotide-binding protein subunit beta-2-like 1 C182 Channels, Transporters, Receptors
P68036 UBE2L3 Ubiquitin-conjugating enzyme E2 L3 C86 Enzyme
Q00535 CDK5 Cyclin-dependent kinase 5 C157 Enzyme
Q01201 RELB Transcription factor RelB C109 Transcription factors and regulators
Q02556 IRF8 Interferon regulatory factor 8 C306 Transcription factors and regulators
Q04759 PRKCQ Protein kinase C theta type C14; C17 Enzyme
Q06124 Tyrosine-protein phosphatase non-receptor type 11 C573 Enzyme
Q09472 EP300 Histone acetyltransferase p300 C1738 Enzyme
Q14790 CASP8 Caspase-8 C360 Enzyme
Q15084 PDIA6 Protein disulfide-isomerase A6 C55; C58; C190; C193 Enzyme
Q15306 IRF4 Interferon regulatory factor 4 C194 Transcription factors and regulators
Q15910 EZH2 Histone-lysine N-methyltransferase EZH2 C503 Enzyme
Q16186 Proteasomal ubiquitin receptor ADRM1 C88 Channels, Transporters, Receptors
Q16763 UBE2S Ubiquitin-conjugating enzyme E2 S C118 Enzyme
Q16822 PCK2 Phosphoenolpyruvate carboxykinase C306 Enzyme
Q16875 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase 3 C155 Enzyme
Q16877 PFKFB4 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata C159 Enzyme
Q6L8Q7 PDE12 2',5'-phosphodiesterase 12 C108 Enzyme
Q70CQ2 USP34 Ubiquitin carboxyl-terminal hydrolase 34 C741; C1090 Enzyme
Q7Z2W4 ZC3HAV1 Zinc finger CCCH-type antiviral protein 1 C645 Transcription factors and regulators
Q86UV5 USP48 Ubiquitin carboxyl-terminal hydrolase 48 C39 Enzyme
Q8TAQ2 SMARCC2 SWI/SNF complex subunit SMARCC2 C145 Transcription factors and regulators
Q92851 Caspase-10 C401 Enzyme
Q93009 USP7 Ubiquitin carboxyl-terminal hydrolase 7 C223; C315 Enzyme
Q96FA3 PELI1 E3 ubiquitin-protein ligase pellino homolog 1 C282 Enzyme
Q96GG9 DCUN1D1 DCN1-like protein 1 C115 Uncategorized
Q96JH7 VCPIP1 Deubiquitinating protein VCIP135 C219 Enzyme
Q96RU2 USP28 Ubiquitin carboxyl-terminal hydrolase 28 C171; C733 Enzyme
Q99873 PRMT1 Protein arginine N-methyltransferase 1 C109 Enzyme
Q9C0C9 UBE2O Ubiquitin-conjugating enzyme E2 O C375 Enzyme
Q9HB90 RRAGC Ras-related GTP-binding protein C C358; C377 Channels, transporters, and receptors
Q9NRW4 Dual specificity protein phosphatase 22 C124 Enzyme
Q9NWZ3 IRAK4 Interleukin-1 receptor-associated kinase 4 C13 Enzyme
Q9NYL2 MLTK Mitogen-activated protein kinase kinase kinase MLT C22 Enzyme
Q9UPT9 USP22 Ubiquitin carboxyl-terminal hydrolase 22 C44; C171 Enzyme
Q9Y3Z3 SAMHD1 SAM domain and HD domain-containing protein 1 C522 Enzyme
Q9Y4C1 KDM3A Lysine-specific demethylase 3A C251 Enzyme
Q9Y5T5 USP16 Ubiquitin carboxyl-terminal hydrolase 16 C205 Enzyme
[00451] Table 4 shows representative cysteines with known covalent ligands targeted by fragment electrophiles in isoTOP-ABPP experiments.
Figure imgf000211_0001
[00452] Table 5 shows reactive docking results for liganded cysteines.
Figure imgf000212_0001
[00453] Table 6 shows the sites of fragment labeling for recombinant proteins. The underlined portion indicates the fragment-modified cysteines.
Figure imgf000212_0002
[00454] Table 7 illustrates a list of DMF-sensitive Cys residues in human T cells, defined as Cys residues that showed R values (DMSO/DMF) > 4 in isoTOP-ABPP experiments comparing DMSO- versus DMF-treated T cells.
Figure imgf000213_0001
MTCH2 Mitochondrial carrier homolog 2 Induces mitochondrial depolarization C296 yes Unknown
PGP Phosphoglycolate phosphatase Phosphatase C297 yes Unknown
PML Protein Promyelocytic leukemia RNA/DNA binding C479 yes Modulates TGF-beta signaling, induced by interferon to promote antiviral responses
PRKCQ Protein kinase C theta type Serine/threonine protein kinase C14 yes Promotes TCR signaling through activation of NF-κB and other transcription factors
PYGB Glycogen phosphorylase, brain form Phosphorylase C326 yes Unknown
RARS Arginine-tRNA ligase, cytoplasmic tRNA binding C32 yes Unknown
SON Protein SON RNA/DNA binding C92 yes Unknown
SYNE2 Nesprin-2 Actin binding C553 yes Unknown
TDRKH Tudor and KH domain-containing protein RNA binding C109 yes Unknown
THNSL1 Threonine synthase-like 1 Threonine synthase C324 yes Unknown
THOC1 THO complex subunit 1 RNA/DNA binding C49 yes Unknown
TNFAIP3 Tumor necrosis factor alpha-induced protein 3 Ubiquitin-specific protease C54 yes Inhibits NF-κB signaling upon TCR-mediated T cell activation
UBR4 E3 ubiquitin-protein ligase Ubiquitin ligase C2554 yes Unknown
USP7 Ubiquitin carboxyl-terminal hydrolase 7 Ubiquitin-specific protease C315 yes Deubiquitinates FOXP3, increasing Treg suppressive capacity
VDAC3 Voltage-dependent anion-selective channel protein Mitochondrial outer membrane channel C65 yes Unknown
VDAC3 Voltage-dependent anion-selective channel protein Voltage-gated anion channel C36 yes Unknown
ZC3HAV1 Zinc finger CCCH-type antiviral protein 1 Poly(A) RNA binding C645 yes Inhibits viral replication
ZNF346 Zinc finger protein 346 RNA binding C68 yes Unknown
AARS Alanine-tRNA ligase, cytoplasmic Alanine-tRNA ligase C773 no Unknown
APOBEC3C Probable DNA dC->dU-editing enzyme Cytidine deaminase C130 no Inhibits retrovirus replication
BCL2A1 Bcl-2-related protein A1 Scaffolding protein C55 no Expression induced by inflammatory cytokines
BCL2A1 Bcl-2-related protein A1 Scaffolding protein C19 no Unknown
CHRAC1 Chromatin accessibility complex protein 1 Chromatin remodeling C55 no Unknown
DCXR L-xylulose reductase Xylulose reductase C244 no Unknown
GHDC GH3 domain-containing protein Uncharacterized C502 no Unknown
IRAK4 Interleukin-1 receptor-associated kinase 4 Serine/threonine protein kinase C13 no Helps initiate innate immune response by promoting ubiquitination of IRAK1 upon TLR activation. Also implicated in T cell activation
NADSYN1 Glutamine-dependent NAD(+) synthetase NAD(+) synthase C428 no Unknown
PGLS 6-phosphogluconolactonase Hydrolysis of 6-phosphogluconolactone C32 no Unknown
PRKDC DNA-dependent protein kinase catalytic subunit Serine/threonine protein kinase C4045 no Regulates DNA damage response, involved in V(D)J recombination
PUSL1 tRNA pseudouridine synthase-like 1 Pseudouridine synthase C292 no Unknown
RIN3 Ras and Rab interactor 3 GTPase activator C942 no Unknown
SCLY Selenocysteine lyase Selenocysteine lyase C22 no Unknown
SPCS2 Signal peptidase complex subunit 2 Peptidase C17 no Unknown
TRNT1 CCA tRNA nucleotidyltransferase 1, mitochondrial tRNA binding C373 no Mutations lead to B-cell immunodeficiency as well as progressive reductions in T and NK cells (OMIM number 616084)
TUBGCP3 Gamma-tubulin complex component 3 Gamma-tubulin binding C194 no Unknown
UBE2L6 Ubiquitin/ISG15-conjugating enzyme E2 L6 Ubiquitin-conjugating enzyme C98 no Acts as an E2 enzyme for an IFN-induced ubiquitin-like protein
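The selection rule stated for Table 7 (a Cys residue counts as DMF-sensitive when its R value, the DMSO/DMF ratio, exceeds 4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record type `CysteineRatio`, the function `dmf_sensitive`, and the placeholder entries `PROT_A` to `PROT_D` with their values are hypothetical and are not taken from the tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CysteineRatio:
    """One quantified cysteine (hypothetical record layout)."""
    protein: str    # gene name or accession; placeholders are used below
    residue: str    # cysteine position, e.g. "C13"
    r_value: float  # isoTOP-ABPP ratio R = DMSO signal / DMF signal


def dmf_sensitive(records, threshold=4.0):
    """Return the records whose R value is strictly greater than threshold."""
    return [rec for rec in records if rec.r_value > threshold]


# Illustrative placeholder measurements, not values from the document.
example = [
    CysteineRatio("PROT_A", "C10", 20.0),
    CysteineRatio("PROT_B", "C25", 4.0),   # equal to the cutoff: excluded
    CysteineRatio("PROT_C", "C7", 1.2),
    CysteineRatio("PROT_D", "C101", 5.3),
]

sensitive = dmf_sensitive(example)
for rec in sensitive:
    print(rec.protein, rec.residue, rec.r_value)
```

The comparison is strict, so a residue with R exactly equal to 4 is not reported, which matches the "> 4" wording of the definition.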
[00455] Table 8 illustrates an exemplary list of DMF-sensitive cysteine-containing proteins in human T cells. Table 8 further shows the accession number (or the protein identifier) of each protein.
Identifier Protein Name SEQ ID NO: DMF 50uM_4h DMF 50uM_2h DMF 50uM_1h DMF 25uM_4h DMF 10uM_4h MMF 50uM_4h
Q9NRW3 COO APOBEC3C Probable DNA dC->dU-editing enzyme APOBEC-3C 805 20 20 8.6 1.92 1.48
Q9NWZ3 C13 IRAK4 Interleukin-1 receptor-associated kinase 4 806 20 8.3 1.48
Q9Y2W6 C109 TDRKH Tudor and KH domain-containing protein 807 20 20 4 2.34 1.36
Q6IA69 C428 NADSYN1 Glutamine-dependent NAD(+) synthetase 808 20 2.31 1.81 1.43 1.33
O14920 C464 IKBKB Inhibitor of nuclear factor kappa-B kinase subunit 809 20 10.12 3.96 2.59
P00813 C75 ADA Adenosine deaminase 810 20 5.08 2.51 2.29
Q9Y277 C65 VDAC3 Voltage-dependent anion-selective channel protein 811 15.94 7.53 3.35 5.64 1.73 1.39
P49588 C773 AARS Alanine-tRNA ligase, cytoplasmic 812 12.75 10.16 9.34 2.84 1.24
O14933 C98 UBE2L6 Ubiquitin/ISG15-conjugating enzyme E2 L6 813 12.55 2.92 2.44 1.49 1.7
O95336 C32 PGLS 6-phosphogluconolactonase 814 11.51 9.49 3.42 5.32 1.9 1.26
A6NDG6 C297 PGP Phosphoglycolate phosphatase 815 10.77 4.21 3.06 1.52
Q7Z6Z7 C3372 HUWE1 E3 ubiquitin-protein ligase HUWE1 816 10.48 4.43 2.28 1.58 1.2
Q16548 C55 BCL2A1 Bcl-2-related protein A1 817 7.18 0.97
P11216 C326 PYGB Glycogen phosphorylase, brain form 818 6.76 3.73 2.47 3.53 1.65 1.29
O95081 C39 AGFG2 Arf-GAP domain and FG repeat-containing protein 2 819 6.39 3.85 1.42 1.24
Q7Z2W4 C645 ZC3HAV1 Zinc finger CCCH-type antiviral protein 1 820 6.28 3.13 2.36 2.52 1.46 1.3
O00170 C122 AIP AH receptor-interacting protein 821 6.14 3.05 1.24
Q96Q11 C373 TRNT1 CCA tRNA nucleotidyltransferase 1, mitochondrial 822 5.83 2.66 1.97 1.29
Q8TB24 C942 RIN3 Ras and Rab interactor 3 823 5.7 3 1.23
Q9Y4W2 C456 LAS1L Ribosomal biogenesis protein LAS1L 824 5.61 3.42 1.8 1.29 1.14
Q02556 C306 IRF8 Interferon regulatory factor 8 825 5.32 1.66 1.9
Q96GW9 C425 MARS2 Methionine-tRNA ligase, mitochondrial 826 5.3 4.16 2.23 2.86 1.84 1.3
Q15306 C194 IRF4 Interferon regulatory factor 4 827 5.25 3.13 1.32 1.78 1.69 1.33
Q15005 C17 SPCS2 Signal peptidase complex subunit 2 828 5.09 3.86 2.25 2.41 1.42 1.32
P54136 C32 RARS Arginine-tRNA ligase, cytoplasmic 829 5.02 3.58 2.58 4.03 0.62 1.78
Q96CW5 C194 TUBGCP3 Gamma-tubulin complex component 3 830 4.94 2.44
P46109 C249 CRKL Crk-like protein 831 4.86 3.21 2.21 1.38 1.27
Q8N0Z8 C292 PUSL1 tRNA pseudouridine synthase-like 1 832 4.68 1.36
Q5T4S7 C2554 UBR4 E3 ubiquitin-protein ligase UBR4 833 4.63 2.1 1.6 1.52 1.2
Q9UL40 C68 ZNF346 Zinc finger protein 346 834 4.6 3.91 2.5 1.98 1.26
Q13045 C46 FLII Protein flightless-1 homolog 835 4.5 3.57 2.05 1.55 1.27
Q86YS7 C993 KIAA0528 Uncharacterized protein KIAA0528 836 4.38 1.4 1.4
Q9Y6C9 C296 MTCH2 Mitochondrial carrier homolog 2 837 4.3 2.44 1.81 1.68 1.35
Q7Z4W1 C244 DCXR L-xylulose reductase 838 4.24 2.76 1.29 2.3
Q04759 C14 PRKCQ Protein kinase C theta type 839 4.21 2.92 3.29 1.62 1.14
P18583 C92 SON Protein SON 840 4.17 6.31 2.5 1.31
P31153 C56 MAT2A S-adenosylmethionine synthase isoform type-2 841 4.17
Q16548 C19 BCL2A1 Bcl-2-related protein A1 842 4.16 2.09 2.19 1.15 1.28
Q14005 C1004 IL16 Pro-interleukin-16 843 4.13 3.32 1.95 1.37 1.31
P31153 C104 MAT2A S-adenosylmethionine synthase isoform type-2 844 4.11 1.5 1.31
Q9Y277 C36 VDAC3 Voltage-dependent anion-selective channel protein 845 4.11 3.98 3.21 1.18
Q8WXH0 C553 SYNE2 Nesprin-2 846 4.05 3.29 1.61
Q96I15 C22 SCLY Selenocysteine lyase 847 4.04 2.16 1.9 2.16 1.31 1.27
P29590 C479 PML Protein PML 848 4.57 2.17 2.1 1.27
Q8IYQ7 C324 THNSL1 Threonine synthase-like 1 849 19.36 15.93 1.4
Q93009 C315 USP7 Ubiquitin carboxyl-terminal hydrolase 7 850 14.06 5.33 1.9 1.4
P21580 C54 TNFAIP3 Tumor necrosis factor alpha-induced protein 3 851 5.34 1.58
O14976 C87 GAK Cyclin-G-associated kinase 852 4.79 1.36
Q96FV9 C49 THOC1 THO complex subunit 1 853 5.7 3.93 0.97
P78527 C4045 PRKDC DNA-dependent protein kinase catalytic subunit 854 10.53 4.14 1.23
Q9NRG0 C55 CHRAC1 Chromatin accessibility complex protein 1 855 11.72 12.59 5.07 1.27
Q8N2G8 C502 GHDC GH3 domain-containing protein 856 20 4.23
[00456] Table 9 illustrates the full protein sequence of exemplary cysteine-containing proteins described herein. The cysteine residue of interest is denoted with (*).
Figure imgf000218_0001
IKSNLDRALG RQ
Q14790 CASP8 C360 MDFSRNLYDI GEQLDSEDLA 3
SLKFLSLDYI PQRKQEPIKD ALMLFQRLQE KRMLEESNLS FLKELLFRIN RLDLLITYLN TRKEEMEREL QTPGRAQISA YRVMLYQISE EVSRSELRSF KFLLQEEISK CKLDDDMNLL DIFIEMEKRV
ILGEGKLDIL KRVCAQINKS LLKIINDYEE FSKERSSSLE GSPDEFSNGE ELCGVMTISD SPREQDSESQ TLDKVYQMKS KPRGYCLIIN HNF AKAREK VPKLHSIRDR NGTHLDAGAL TTTFEELHFE IKPHDDCTVE QIYEILKIYQ
LMDHS MDCF ICCILSHGDK GIIYGTDGQE APIYELTSQF TGLKCPSLAG KPKVFFIQAC*
QGDNYQKGIP VETDSEEQPY LEMDLSSPQT RYIPDEADFL LGMATVNNCV SYRNPAEGTW YIQSLCQSLR ERCPRGDDIL TILTEVNYEV SNKDDKKNMG KQMPQPTFTL RKKLVFPSD
Q92851 CASP10 C401 MKSQGQHWYS SSDKNCKVSF 4
REKLLIIDSN LGVQDVENLK FLCIGLVPNKKLEKSSSASD VFEHLLAEDL LSEEDPFFLA ELLYIIRQKK LLQHLNCTKE EVERLLPTRQ RVSLFRNLLY ELSEGIDSEN LKDMIFLLKD SLPKTEMTSL
SFLAFLEKQG KIDEDNLTCL EDLCKTVVPK LLRNIEKYKR EKAIQIVTPP VDKEAESYQG EEELVSQTDV KTFLEALPQE SWQNKHAGSN GNRATNGAPS LVSRGMQGAS ANTLNSETST KRAAVYRMNR NHRGLC VIVN NHSFTSLKDR
QGTHKDAEIL SHVFQWLGFT VHIHNNVTKV EMEMVLQKQK CNPAHADGDC FVFCILTHGR FGAVYSSDEA LIPIREFMSH FTALQCPRLA EKPKLFFIQA C*
QGEEIQPSV SIEADALNPE QAPTSLQDSI PAEADFLLGL ATVPGYVSFR HVEEGSWYIQ SLCNHLKKLV PRMLKFLEKT MEIRGRKRTV
WGAKQISATS LPTAISAQTP RPPMRRWSSV S
Q99873 PRMT1 C109 ME FVATLAN GMSLQPPLEE 5
VSCGQAESSE KPNAEDMTSK DYYFDSYAHF GIHEEMLKDE VRTLTYRNSM FHNRHLFKDK VVLDVGSGTG ILCMFAAKAG ARKVIGIEC* S SISDYAVKIV
KA KLDHVVT IIKGKVEEVE LPVEKVDIII
SEWMGYCLFY ESMLNTVLYA RDKWLAPDGL IFPDRATLYV TAIEDRQYKD YKIHWWENVY GFDMSCIKDV AIKEPLVDVV DPKQLVTNAC LIKEVDIYTV KVEDLTFTSP FCLQVKRNDY VHALVAYFNI EFTRCHKRTG FSTSPESPYT
HWKQTVFYME DYLTVKTGEE IFGTIGMRPN AKN RDLDFT IDLDFKGQLC ELSCSTDYRM R
Q9NYL2 MAP3 kinase MLTK (or ZAK) C22 MSSLGASFVQ IKFDDLQFFE 6
NC*GGGSFGSV YRAKWISQDK EVAVKKLLKI EKEAEILSVL
SHRNIIQFYG VILEPPNYGI VTEYASLGSL YDYINSNRSE EMDMDHIMTW ATDVAKGMHY LHMEAPVKVI HRDLKSRNVV IAADGVLKIC
DFGASRFHNH TTHMSLVGTF PWMAPEVIQS LPVSETCDTY SYGVVLWEML TREVPFKGLE GLQVAWLVVE KNERLTIPSS CPRSFAELLH QCWEADAKKR PSFKQIISIL ESMSNDTSLP DKCNSFLHNK AEWRCEIEAT LERLKKLERD
LSFKEQELKE RERRLKMWEQ KLTEQSNTPL LPSFEIGAWT EDDVYCWVQQ LVRKGDSSAE MS VYASLFKE NNITGKRLLL LEEEDLKDMG IVSKGHIIHF KSAIEKLTHD YINLFHFPPL IKDSGGEPEE EEKIV LEL VFGFHLKPGT
GPQDCKWKMY MEMDGDEIAI TYIKDVTFNT LPDAEILKM TKPPFVMEKW IVGIAKSQTV ECTVTYESDV RTPKSTKHVH SIQWSRTKPQ DEVKAVQLAI
QTLFTNSDGN PGSRSDSSAD CQWLDTLRMR QIASNTSLQR SQS PILGSP
FFSHFDGQDS YAAAVRRPQV PIKYQQITPV NQSRSSSPTQ YGLTK FSSL HLNSRDSGFS SGNTDTSSER GRYSDRSR K YGRGSISLNS SPRGRYSGKS QHSTPSRGRY PGKFYRVSQS AL PHQSPDF KRSPRDLHQP NTIPGMPLHP ETDSRASEED SKVSEGGWTK VEYRKKPHRP SPAKTNKERA RGDHRGWRNF
P12268 IMPDH2 C140, C331 MADYLISGGT SYVPDDGLTA 7
QQLFNCGDGL TYNDFLILPG
YIDFTADQVD LTSALTKKIT LKTPLVSSPM DTVTEAGMAI AMALTGGIGF IHHNCTPEFQ A EVRKVKKY EQGFITDPVV LSPKDRVRDV FEAKARHGFC*
GIPITDTGRM GSRLVGIISS RDIDFLKEEE HDCFLEEFMT KREDLVVAPA GITLKEA EI LQRSKKGKLP IV EDDELVA IIARTDLKKN RDYPLASKDA KKQLLCGAAI GTHEDDKYRL DLLAQAGVDV VVLDSSQGNS IFQINMIKYI KDKYP LQVI
GGNVVTAAQ A KNLID AGVD A LRVGMGSGSI C*ITQEVLACG
RPQATAVYKV SEYARRFGVP VIADGGIQNV GHIAKALALG ASTVMMGSLL AATTEAPGEY FFSDGIRLKK YRGMGSLDAM DKHLSSQNRY FSEADKIKVA QGVSGAVQDK GSfflKFVPYL IAGIQHSCQD
IGAKSLTQVR AMMY S GELKF EKRTSSAQVE GGVHSLHSYE KRLF
Q9NQ88 TIGAR C114, C161 MARF ALTVVR HGETRFNKEK 8
IIQGQGVDEP LSETGFKQAA
AAGIFLNNVK FTHAFSSDLM RTKQTMHGIL ERSKFCKDMT VKYDSRLRER KYGVVEGKAL SELRAMAKAA REEC*PVFTPP
GGETLDQVKM RGIDFFEFLC QLILKEADQK
EQFSQGSPSN C*LETSLAEIF PLGK HSSKV NSDSGIPGLA
ASVLVVSHGA YMRSLFDYFL TDLKCSLPAT LSRSELMSVT PNTGMSLFII NFEEGREVKP TVQCICMNLQ DHLNGLTETR
Q04759 PKCθ C14, C17 MSPFLRIGLS NFDC*GSC*QSC 9
QGEAV PYCA VLVKEYVESE
NGQMYIQKKP TMYPPWDSTF DAHINKGRVM QIIVKGKNVD LISETTVELY SLAERCRKNN GKTEIWLELK PQGRMLMNAR YFLEMSDTKD MNEFETEGFF ALHQRRGAIK
QAKVHHVKCH EFTATFFPQP TFCSVCHEFV WGLNKQGYQC RQCNAAIHKK CIDKVIAKCT GSAINSRETM FHKERFKIDM PHRFKVYNYK SPTFCEHCGT LLWGLARQGL KCDACGMNVH HRCQTKVANL CGINQKLMAE ALAMIESTQO ARCLRDTEQI FREGPVEIGL PCSIKNEARP PCLPTPGKRE PQGISWESPL DEVDKMCHLP EPEL KERPS LQIKLKIEDF ILHKMLGKGS FGKVFLAEFK KTNQFFAIKA LKKDVVLMDD DVECTMVEKR VLSLAWEHPF LTFIMFCTFQT
KENLFFVMEY LNGGDLMYHI QSCHKFDLSR ATFYAAEIIL GLQFLHSKGI VYRDLKLDNI LLDKDGHIKI ADF GMCKENM LGDAKTNTFC GTPDYIAPEI LLGQKYNHSV DWWSFGVLLY EMLIGQSPFH GQDEEELFHS IRMDNPFYPR
WLEKEAKDLL VKLFVREPEK RLGVRGDIRQ HPLFREINWE ELERKEIDPP FRPKVKSPFD CS FDKEFLN EKPRLSFADR ALINSMDQNM FR FSFMNPG MERLIS
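As a rough illustration of the (*) notation used in Table 9, the sketch below recovers the residue number of each starred cysteine from an annotated sequence string. The function name `starred_cysteine_positions` is hypothetical and is not part of the application. The sketch assumes that whitespace is only a layout artifact and that each `*` immediately follows the cysteine it marks; gaps introduced by text extraction would shift the numbering.

```python
def starred_cysteine_positions(annotated_sequence: str) -> list[int]:
    """Return 1-based residue numbers of cysteines followed by '*'.

    Hypothetical helper: whitespace is ignored, and every '*' is assumed
    to mark the residue immediately before it.
    """
    positions = []
    residue_count = 0
    previous_residue = ""
    for char in annotated_sequence:
        if char.isspace():
            continue  # spacing between 10-residue blocks carries no meaning
        if char == "*":
            if previous_residue != "C":
                raise ValueError("'*' must follow a cysteine (C)")
            positions.append(residue_count)
            continue
        residue_count += 1
        previous_residue = char
    return positions


# Opening residues of the PKC theta entry as printed in Table 9.
pkc_theta_start = "MSPFLRIGLS NFDC*GSC*QSC"
print(starred_cysteine_positions(pkc_theta_start))
```

For the fragment shown, the starred residues fall at positions 14 and 17, which agrees with the C14 and C17 sites listed for this entry.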
[00457] Table 10A - Table 10E illustrate a list of cysteine containing proteins and potential cysteine sites of conjugation, separated by protein class. Table 10A illustrates cysteine containing enzymes and potential cysteine conjugation sites. Table 10B shows a list of cysteine containing transcription factors and regulators. Table 10C shows an exemplary list of cysteine containing channels, transporters and receptors. Table 10D illustrates an exemplary cysteine containing adapter, scaffolding, and modulator protein. Table 10E provides an exemplary list of uncategorized cysteine containing proteins.
Table 10A
Identifier Protein Name Cysteine Location Protein Class
O14920 IKBKB Inhibitor of nuclear factor kappa-B kinase subunit C464 Enzyme
O14933 UBE2L6 Ubiquitin/ISG15-conjugating enzyme E2 L6 PCTK C98 Enzyme
O94953 KDM4B Lysine-specific demethylase 4B C694 Enzyme
P00813 ADA Adenosine deaminase C75 Enzyme
P09211 GSTP1 Glutathione S-transferase P C48 Enzyme
P15374 UCHL3 Ubiquitin carboxyl-terminal hydrolase isozyme L3 C95 Enzyme
P16455 MGMT Methylated-DNA--protein-cysteine methyltransferase C145; C150 Enzyme
P17812 CTP synthase 1 C491 Enzyme
P19447 ERCC3 TFIIH basal transcription factor complex helicase C342 Enzyme
P21580 TNFAIP3 Tumor necrosis factor alpha-induced protein 3 C54 Enzyme
P24752 ACAT1 Acetyl-CoA acetyltransferase, mitochondrial C119; C126; C196; C413 Enzyme
P40261 Nicotinamide N-methyltransferase C165 Enzyme
P41226 UBA7 Ubiquitin-like modifier-activating enzyme 7 C599 Enzyme
P42575 CASP2 Caspase-2 C370 Enzyme
P43403 ZAP70 Tyrosine-protein kinase ZAP-70 C117 Enzyme
P48735 IDH2 Isocitrate dehydrogenase C308 Enzyme
P51617 IRAK1 Interleukin-1 receptor-associated kinase 1 C608 Enzyme
P61081 NEDD8-conjugating enzyme Ubc12 C47 Enzyme
P61088 Ubiquitin-conjugating enzyme E2 N C87 Enzyme
P68036 UBE2L3 Ubiquitin-conjugating enzyme E2 L3 C86 Enzyme
Q00535 CDK5 Cyclin-dependent kinase 5 C157 Enzyme
Q04759 PRKCQ Protein kinase C theta type C14; C17 Enzyme
Q06124 Tyrosine-protein phosphatase non-receptor type 11 C573 Enzyme
Q09472 EP300 Histone acetyltransferase p300 C1738 Enzyme
Q14790 CASP8 Caspase-8 C360 Enzyme
Q15084 PDIA6 Protein disulfide-isomerase A6 C55; C58; C190; C193 Enzyme
Q15910 EZH2 Histone-lysine N-methyltransferase EZH2 C503 Enzyme
Q16763 UBE2S Ubiquitin-conjugating enzyme E2 S C118 Enzyme
Q16822 PCK2 Phosphoenolpyruvate carboxykinase C306 Enzyme
Q16875 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase 3 C155 Enzyme
Q16877 PFKFB4 6-phosphofructo-2-kinase/fructose-2,6-bisphosphata C159 Enzyme
Q6L8Q7 PDE12 2,5-phosphodiesterase 12 C108 Enzyme
Q70CQ2 USP34 Ubiquitin carboxyl-terminal hydrolase 34 C741; C1090 Enzyme
Q86UV5 USP48 Ubiquitin carboxyl-terminal hydrolase 48 C39 Enzyme
Q92851 Caspase-10 C401 Enzyme
Q93009 USP7 Ubiquitin carboxyl-terminal hydrolase 7 C223; C315 Enzyme
Q96FA3 PELI1 E3 ubiquitin-protein ligase pellino homolog 1 C282 Enzyme
Q96JH7 VCPIP1 Deubiquitinating protein VCIP135 C219 Enzyme
Q96RU2 USP28 Ubiquitin carboxyl-terminal hydrolase 28 C171; C733 Enzyme
Q99873 PRMT1 Protein arginine N-methyltransferase 1 C109 Enzyme
Q9C0C9 UBE2O Ubiquitin-conjugating enzyme E2 O C375 Enzyme
Q9NRW4 Dual specificity protein phosphatase 22 C124 Enzyme
Q9NWZ3 IRAK4 Interleukin-1 receptor-associated kinase 4 C13 Enzyme
Q9NYL2 MLTK Mitogen-activated protein kinase kinase kinase MLT C22 Enzyme
Q9UPT9 USP22 Ubiquitin carboxyl-terminal hydrolase 22 C44; C171 Enzyme
Q9Y3Z3 SAMHD1 SAM domain and HD domain-containing protein 1 C522 Enzyme
Q9Y4C1 KDM3A Lysine-specific demethylase 3A C251 Enzyme
Q9Y5T5 USP16 Ubiquitin carboxyl-terminal hydrolase 16 C205 Enzyme
Table 10B
Figure imgf000224_0001
Identifier Protein Name Cysteine Location Protein Class
Q7Z2W4 ZC3HAV1 Zinc finger CCCH-type antiviral protein 1 C645 Transcription factors and regulators
Q8TAQ2 SMARCC2 SWI/SNF complex subunit SMARCC2 C145 Transcription factors and regulators
Table 10C
Identifier Protein Name Cysteine Location Protein Class
P63244 GNB2L1 Guanine nucleotide-binding protein subunit beta-2-like 1 C182 Channels, Transporters, Receptors
Q16186 Proteasomal ubiquitin receptor ADRM1 C88 Channels, Transporters, Receptors
Q9HB90 RRAGC Ras-related GTP-binding protein C C358; C377 Channels, transporters, and receptors
Table 10D
Figure imgf000225_0001
Table 10E
Figure imgf000225_0002
[00458] The examples and embodiments described herein are for illustrative purposes only and various modifications or changes suggested to persons skilled in the art are to be included within the spirit and purview of this application and scope of the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A method of identifying a cysteine containing protein as a binding target for a small molecule fragment, comprising:
a) obtaining a set of cysteine-reactive probe-protein complexes from a sample comprising a first cell solution treated with a small molecule fragment and a cysteine reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein;
b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and
c) based on step b), identifying a cysteine containing protein as the binding target for the small molecule fragment.
2. The method of claim 1, further comprising determining a value of each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes for identifying a cysteine containing protein as the binding target for the small molecule fragment, wherein the value is determined based on the proteomic analysis means of step b).
3. The method of claim 1, wherein the sample further comprises a second cell solution.
4. The method of claim 1, further comprising contacting the first cell solution with a small molecule fragment for an extended period of time prior to incubating the first cell solution with a first cysteine-reactive probe to generate a first group of cysteine-reactive probe-protein complexes.
5. The method of claim 4, wherein the extended period of time is about 5, 10, 15, 20, 30, 60, 90, 120 minutes or longer.
6. The method of claim 3, further comprising contacting the second cell solution with a second cysteine-reactive probe to generate a second group of cysteine-reactive probe-protein complexes.
7. The method of any one of the claims 3-6, wherein the first cysteine-reactive probe and the second cysteine-reactive probe are the same.
8. The method of any one of the claims 3-7, wherein the first group and the second group of cysteine-reactive probe-protein complexes comprise the set of cysteine-reactive probe-protein complexes.
9. The method of claim 1, wherein the cysteine containing protein is an enzyme, a transporter, a receptor, a channel protein, an adaptor protein, a chaperone, a signaling protein, a plasma protein, transcription related protein, translation related protein, mitochondrial protein, or cytoskeleton related protein.
10. The method of claim 1, wherein the cysteine containing protein is a protein illustrated in Table 3.
11. The method of claim 1, wherein the cysteine containing protein is a protein illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
12. The method of claim 1, wherein the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000227_0001
Formula (I)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and
F is a small molecule fragment moiety.
13. The method of claim 12, wherein the Michael acceptor moiety comprises an alkene or an alkyne moiety.
14. The method of claim 12, wherein F is obtained from a compound library.
15. The method of claim 14, wherein the compound library comprises ChemBridge fragment library, Pyramid Platform Fragment-Based Drug Discovery, Maybridge fragment library, FRGx from AnalytiCon, TCI-Frag from AnCoreX, Bio Building Blocks from ASINEX, BioFocus 3D from Charles River, Fragments of Life (FOL) from Emerald Bio, Enamine Fragment Library, IOTA Diverse 1500, BIONET fragments library, Life Chemicals Fragments Collection, OTAVA fragment library, Prestwick fragment library, Selcia fragment library, TimTec fragment-based library, Allium from Vitas-M Laboratory, or Zenobia fragment library.
16. The method of any one of the claims 12-15, wherein F is a small molecule fragment moiety illustrated in Fig. 3.
17. The method of claim 1, wherein the cysteine-reactive probe is a cysteine-reactive probe of Formula (II):
Figure imgf000228_0001
Formula (II)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond to the thiol group of a cysteine residue; and
AHM is an affinity handle moiety.
18. The method of claim 17, wherein the Michael acceptor moiety comprises an alkene or an alkyne moiety.
19. The method of claim 17, wherein the affinity handle moiety comprises an affinity handle and a binding moiety that facilitates covalent interaction of the cysteine-reactive probe to a cysteine residue of a cysteine-containing protein.
20. The method of claim 19, wherein the binding moiety is a small molecule fragment obtained from a compound library.
21. The method of claim 17 or 19, wherein the affinity handle comprises a carbodiimide, N-hydroxysuccinimide (NHS) ester, imidoester, pentafluorophenyl ester, hydroxymethyl phosphine, maleimide, haloacetyl, pyridyl disulfide, thiosulfonate, vinylsulfone, hydrazide, alkoxyamine, alkyne, azide, or isocyanate group.
22. The method of any one of the claims 17, or 19-21, wherein the affinity handle is further conjugated to an affinity ligand.
23. The method of claim 22, wherein the affinity ligand comprises a chromophore, a labeling group, or a combination thereof.
24. The method of claim 23, wherein the chromophore comprises non-fluorochrome chromophore, quencher, an absorption chromophore, fluorophore, organic dye, inorganic dye, metal chelate, or a fluorescent enzyme substrate.
25. The method of claim 23, wherein the labeling group is a biotin moiety, a streptavidin moiety, bead, resin, a solid support, or a combination thereof.
26. The method of claim 17, wherein the cysteine-reactive probe is a cysteine-reactive probe illustrated in Fig. 3.
27. The method of claim 1, wherein the proteomic analysis means comprises a mass spectroscopy method.
28. The method of claim 1, wherein the identifying in step c) further comprises
i. locating a first value assigned to a cysteine containing protein from the first group of cysteine-reactive probe-protein complex and a second value of the same cysteine containing protein from the second group of cysteine-reactive probe-protein complex; and
ii. calculating a ratio between the two values assigned to the same cysteine containing protein.
29. The method of claim 28, wherein the ratio of greater than 2 indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
30. The method of claim 28, wherein the identifying in step c) further comprises calculating a percentage of inhibition of the cysteine-reactive probe to the cysteine containing protein.
31. The method of claim 30, wherein the percentage of inhibition of greater than 50%, 60%, 70%, 80%, 90%, or at 100% indicates that the cysteine containing protein is a candidate for interacting with the small molecule fragment.
32. The method of any one of the claims 1-31, wherein the method is an in situ method.
33. The method of any one of the claims 1-32, wherein the cysteine-reactive probe is not 4-hydroxynonenal or 15-deoxy-Δ12,14-prostaglandin J2.
34. A modified cysteine containing protein comprising a small molecule fragment having a covalent bond to a cysteine residue of a cysteine containing protein, wherein the small molecule fragment has a molecular weight of about 150 Dalton or higher.
35. The modified cysteine containing protein of claim 34, wherein the cysteine containing protein comprises a cysteine residue site denoted in Table 3.
36. The modified cysteine containing protein of claim 34, wherein the cysteine containing protein comprises a protein sequence illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
37. The modified cysteine containing protein of claim 34, wherein the cysteine containing protein is about 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000 amino acid residues in length or more.
38. The modified cysteine containing protein of claim 34, wherein the cysteine residue of the modified cysteine containing protein has the structure SR, wherein R is selected from:
Figure imgf000230_0001
aryl; and
F' is the small molecule fragment moiety.
39. The modified cysteine containing protein of claim 34 or 38, wherein the small molecule fragment has a molecular weight of about 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000 Dalton, or higher.
40. The modified cysteine containing protein of any one of the claims 34, 38 or 39, wherein the molecular weight of the small molecule fragment is calculated based on carbon and hydrogen atoms and optionally further based on nitrogen, oxygen and/or sulfur atoms.
41. The modified cysteine containing protein of claim 34, wherein the modified cysteine containing protein is selected from IDH2, caspase-8, caspase-10 or PRMT1.
42. The modified cysteine containing protein of claim 34, wherein IDH2 is modified at cysteine position 308.
43. The modified cysteine containing protein of claim 34, wherein caspase-8 is modified at cysteine position 360.
44. The modified cysteine containing protein of claim 34, wherein caspase-10 exists in the proform and is modified at cysteine position 401.
45. The modified cysteine containing protein of claim 34, wherein PRMT1 is modified at cysteine position 109.
46. The modified cysteine containing protein of claim 34, wherein the small molecule fragment is a small molecule fragment of Formula (I):
Figure imgf000231_0001
Formula (I)
wherein:
RM is a reactive moiety selected from a Michael acceptor moiety, a leaving group moiety, or a moiety capable of forming a covalent bond with the thiol group of a cysteine residue; and
F is a small molecule fragment moiety.
47. The modified cysteine containing protein of claim 46, wherein the Michael acceptor moiety comprises an alkene or an alkyne moiety.
48. The modified cysteine containing protein of claim 46, wherein F is obtained from a compound library.
49. The modified cysteine containing protein of claim 46, wherein F is a small molecule fragment moiety illustrated in Fig. 3.
50. The modified cysteine containing protein of claim 46, wherein F further comprises a linker moiety that connects F to the carbonyl moiety.
51. The modified cysteine containing protein of any one of the claims 34-50, wherein the small molecule fragment binds irreversibly to the cysteine containing protein.
52. The modified cysteine containing protein of any one of the claims 34-50, wherein the small molecule fragment binds reversibly to the cysteine containing protein.
53. A method of screening a small molecule fragment for interaction with a cysteine containing protein, comprising:
a) harvesting a set of cysteine-reactive probe-protein complexes from a sample comprising a first cell solution treated with a small molecule fragment and a cysteine reactive probe wherein the cysteine-reactive probe comprises a reactive moiety capable of forming a covalent bond with a cysteine residue located on the cysteine containing protein;
b) analyzing the set of cysteine-reactive probe-protein complexes by a proteomic analysis means; and
c) based on step b), identifying the small molecule fragment as interacting with the cysteine containing protein.
54. The method of claim 53, further comprising determining a value of each of the cysteine containing protein from the set of cysteine-reactive probe-protein complexes prior to identifying the small molecule fragment as interacting with the cysteine containing protein, wherein the value is determined based on the proteomic analysis means of step b).
55. The method of claim 53, wherein the cysteine containing protein is a protein illustrated in Table 3.
56. The method of claim 53, wherein the cysteine containing protein is a protein illustrated in Table 1, Table 2, Table 8, Table 9, Table 10A, Table 10B, Table 10C, Table 10D or Table 10E.
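
The ratio and percentage-of-inhibition criteria recited in claims 28-31 can be sketched numerically. The example below is illustrative only and rests on assumptions the claims do not state: the ratio is taken as the control (second group) value divided by the fragment-treated (first group) value, and a ratio R is converted to percent inhibition as (1 - 1/R) x 100. All function names are hypothetical.

```python
def competition_ratio(control_value: float, treated_value: float) -> float:
    """Ratio between the two values assigned to the same cysteine site.

    Assumption: control (second group) over fragment-treated (first group).
    """
    if control_value < 0 or treated_value <= 0:
        raise ValueError("values must be non-negative and treated_value > 0")
    return control_value / treated_value


def percent_inhibition(ratio: float) -> float:
    """Convert a ratio R into percent inhibition, assuming (1 - 1/R) * 100."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 - 1.0 / ratio) * 100.0


def is_candidate(ratio: float, ratio_cutoff: float = 2.0) -> bool:
    """Claim 29 style test: a ratio greater than the cutoff flags a candidate."""
    return ratio > ratio_cutoff


# Made-up signal intensities, for illustration only.
ratio = competition_ratio(control_value=10.0, treated_value=2.0)
print(ratio, percent_inhibition(ratio), is_candidate(ratio))
```

Under these assumptions, a ratio of exactly 2 corresponds to 50% inhibition, so the ratio cutoff of claim 29 and the lowest percentage recited in claim 31 describe the same boundary.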
PCT/US2016/058308 2015-10-22 2016-10-21 Cysteine reactive probes and uses thereof WO2017070611A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2018516113A JP6953400B2 (en) 2015-10-22 2016-10-21 Cysteine-reactive probe and its use
EP16858391.2A EP3365686A4 (en) 2015-10-22 2016-10-21 Cysteine reactive probes and uses thereof
CA3001847A CA3001847A1 (en) 2015-10-22 2016-10-21 Cysteine reactive probes and uses thereof

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201562244881P 2015-10-22 2015-10-22
US62/244,881 2015-10-22
US201662345710P 2016-06-03 2016-06-03
US62/345,710 2016-06-03

Publications (1)

Publication Number Publication Date
WO2017070611A1 true WO2017070611A1 (en) 2017-04-27

Family

ID=58558160

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/058308 WO2017070611A1 (en) 2015-10-22 2016-10-21 Cysteine reactive probes and uses thereof

Country Status (5)

Country Link
US (2) US10670605B2 (en)
EP (1) EP3365686A4 (en)
JP (1) JP6953400B2 (en)
CA (1) CA3001847A1 (en)
WO (1) WO2017070611A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018144869A1 (en) * 2017-02-03 2018-08-09 The Regents Of The University Of California Compositons and methods for modulating uba5
WO2018144871A1 (en) * 2017-02-03 2018-08-09 The Regents Of The University Of California Compositions and methods for modulating ppp2r1a
WO2018144870A1 (en) * 2017-02-03 2018-08-09 The Regents Of The University Of California Compositions and methods for inhibiting reticulon 4
CN111063389A (en) * 2019-12-04 2020-04-24 浙江工业大学 Ligand binding residue prediction method based on deep convolutional neural network
US10670605B2 (en) 2015-10-22 2020-06-02 The Scripps Research Institute Cysteine reactive probes and uses thereof
US10782295B2 (en) 2013-08-13 2020-09-22 The Scripps Research Institute Cysteine-reactive ligand discovery in proteomes
EP3688472A4 (en) * 2017-09-27 2021-06-23 The Scripps Research Institute Conjugated proteins and uses thereof
EP3688012A4 (en) * 2017-09-27 2021-06-30 Vividion Therapeutics, Inc. Compounds and methods of modulating protein degradation
US11535597B2 (en) 2017-01-18 2022-12-27 The Scripps Research Institute Photoreactive ligands and uses thereof
EP4045920A4 (en) * 2019-10-15 2023-10-11 Jnana Therapeutics Inc. Reactive affinity probe-interaction discovery platform
EP4045043A4 (en) * 2019-10-16 2023-11-08 Vividion Therapeutics, Inc. Compounds and methods for modulating immune-related proteins

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102444509B1 (en) 2016-05-18 2022-09-19 미라티 테라퓨틱스, 인크. KRAS G12C inhibitor
WO2017210600A1 (en) * 2016-06-03 2017-12-07 The Scripps Research Institute Compositions and methods of modulating immune response
US10807951B2 (en) 2017-10-13 2020-10-20 The Regents Of The University Of California mTORC1 modulators
US10689377B2 (en) 2017-11-15 2020-06-23 Mirati Therapeutics, Inc. KRas G12C inhibitors
US10647715B2 (en) 2017-11-15 2020-05-12 Mirati Therapeutics, Inc. KRas G12C inhibitors
US20210008012A1 (en) * 2017-12-11 2021-01-14 Geneheal Biotechnology Co., Ltd. Novel applications of disulfiram and derivatives thereof
WO2019217307A1 (en) 2018-05-07 2019-11-14 Mirati Therapeutics, Inc. Kras g12c inhibitors
JP2022517222A (en) 2019-01-10 2022-03-07 ミラティ セラピューティクス, インコーポレイテッド KRAS G12C inhibitor
JP2022546043A (en) 2019-08-29 2022-11-02 ミラティ セラピューティクス, インコーポレイテッド KRAS G12D inhibitor
KR20220091480A (en) 2019-09-24 2022-06-30 미라티 테라퓨틱스, 인크. combination therapy
EP4045050A4 (en) * 2019-10-16 2023-11-08 The Scripps Research Institute An activity-guided map of electrophile-cysteine interactions in primary human immune cells
BR112022012106A2 (en) 2019-12-20 2022-09-20 Mirati Therapeutics Inc SOS1 INHIBITORS
CN112255396A (en) * 2020-10-15 2021-01-22 南开大学 Single-molecule mechanical method for measuring small-molecule drug inhibition protein nucleic acid interaction
WO2022240966A1 (en) * 2021-05-11 2022-11-17 Opna Immuno-Oncology Sa Compounds and methods for yap/tead modulation and indications therefor
WO2023092133A1 (en) * 2021-11-22 2023-05-25 The Scripps Research Institute Stereoselective covalent ligands for oncogenic and immunological proteins
CN116790698A (en) * 2023-06-21 2023-09-22 南京大学 Method for synthesizing thioaldehyde based on oxidative decarboxylase and application thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015023724A1 (en) * 2013-08-13 2015-02-19 The Scripps Research Institute Cysteine-reactive ligand discovery in proteomes

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6344330B1 (en) 1998-03-27 2002-02-05 The Regents Of The University Of California Pharmacophore recombination for the identification of small molecule drug lead compounds
WO2000077184A1 (en) * 1999-06-10 2000-12-21 Pharmacia & Upjohn Company Caspase-8 crystals, models and methods
MXPA03004435A (en) * 2000-11-21 2005-01-25 Sunesis Pharmaceuticals Inc An extended tethering approach for rapid identification of ligands.
WO2005118833A2 (en) 2004-06-01 2005-12-15 Bayer Healthcare Ag Diagnostics and therapeutics for diseases associated with sterile-alpha motif and leucine zipper containing kinase (zak)
US7935479B2 (en) 2004-07-19 2011-05-03 Cell Biosciences, Inc. Methods and devices for analyte detection
US20100179118A1 (en) * 2006-09-08 2010-07-15 Dainippon Sumitomo Pharma Co., Ltd. Cyclic aminoalkylcarboxamide derivative
US20090068107A1 (en) 2006-10-02 2009-03-12 The Scripps Research Institute Enzyme regulating ether lipid signaling pathways
CN101219219B (en) * 2007-01-10 2013-02-13 北京普罗吉生物科技发展有限公司 Complex containing vascellum chalone or fragment, preparation method and application thereof
EP1947193A1 (en) 2007-01-17 2008-07-23 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Screening method for anti-diabetic compounds
CA2679643A1 (en) * 2007-03-09 2008-09-18 The University Of British Columbia Procaspase 8-mediated disease targeting
US9907828B2 (en) * 2012-06-22 2018-03-06 The University Of Vermont And State Agricultural College Treatments of oxidative stress conditions
US20140357512A1 (en) 2013-06-03 2014-12-04 Acetylon Pharmaceuticals, Inc. Histone deacetylase (hdac) biomarkers in multiple myeloma
EP3182971A4 (en) 2014-08-21 2018-04-25 SRX Cardio, LLC Composition and methods of use of small molecules as binding ligands for the modulation of proprotein convertase subtilisin/kexin type 9(pcsk9) protein activity
EP3365686A4 (en) 2015-10-22 2019-03-27 The Scripps Research Institute Cysteine reactive probes and uses thereof
AU2018211066B2 (en) 2017-01-18 2024-03-21 The Scripps Research Institute Photoreactive ligands and uses thereof

Non-Patent Citations (5)

Title
ABEGG ET AL.: "Proteome-Wide Profiling of Targets of Cysteine reactive Small Molecules by Using Ethynyl Benziodoxolone Reagents", ANGEWANDTE CHEMIE INTERNATIONAL EDITION, vol. 127, no. 37, 24 July 2015 (2015-07-24), pages 11002 - 11007, XP055540308, DOI: 10.1002/anie.201505641 *
FREI ET AL.: "Fast and Highly Chemoselective Alkynylation of Thiols with Hypervalent Iodine Reagents Enabled through a Low Energy Barrier Concerted Mechanism", JOURNAL OF THE AMERICAN CHEMICAL SOCIETY, vol. 136, no. 47, 1 November 2014 (2014-11-01), pages 16563 - 16573, XP 055376809 *
GIRON ET AL.: "Cysteine Tagging for MS-based Proteomics", MASS SPECTROMETRY REVIEWS, vol. 30, no. 3, 1 May 2011 (2011-05-01), pages 366 - 395, XP 055538942, DOI: 10.1002/mas.20285 *
See also references of EP3365686A4 *
WEERAPANA ET AL.: "Quantitative Reactivity Profiling Predicts Functional Cysteines in Proteomes", NATURE, vol. 468, no. 7325, 2010, pages 790 - 795, XP 055315560 *

Cited By (12)

Publication number Priority date Publication date Assignee Title
US10782295B2 (en) 2013-08-13 2020-09-22 The Scripps Research Institute Cysteine-reactive ligand discovery in proteomes
US10670605B2 (en) 2015-10-22 2020-06-02 The Scripps Research Institute Cysteine reactive probes and uses thereof
US11535597B2 (en) 2017-01-18 2022-12-27 The Scripps Research Institute Photoreactive ligands and uses thereof
WO2018144869A1 (en) * 2017-02-03 2018-08-09 The Regents Of The University Of California Compositions and methods for modulating uba5
WO2018144871A1 (en) * 2017-02-03 2018-08-09 The Regents Of The University Of California Compositions and methods for modulating ppp2r1a
WO2018144870A1 (en) * 2017-02-03 2018-08-09 The Regents Of The University Of California Compositions and methods for inhibiting reticulon 4
EP3688472A4 (en) * 2017-09-27 2021-06-23 The Scripps Research Institute Conjugated proteins and uses thereof
EP3688012A4 (en) * 2017-09-27 2021-06-30 Vividion Therapeutics, Inc. Compounds and methods of modulating protein degradation
EP4045920A4 (en) * 2019-10-15 2023-10-11 Jnana Therapeutics Inc. Reactive affinity probe-interaction discovery platform
EP4045043A4 (en) * 2019-10-16 2023-11-08 Vividion Therapeutics, Inc. Compounds and methods for modulating immune-related proteins
CN111063389A (en) * 2019-12-04 2020-04-24 浙江工业大学 Ligand binding residue prediction method based on deep convolutional neural network
CN111063389B (en) * 2019-12-04 2021-10-29 浙江工业大学 Ligand binding residue prediction method based on deep convolutional neural network

Also Published As

Publication number Publication date
EP3365686A1 (en) 2018-08-29
CA3001847A1 (en) 2017-04-27
JP2019501363A (en) 2019-01-17
US20170115303A1 (en) 2017-04-27
US10670605B2 (en) 2020-06-02
US20200292555A1 (en) 2020-09-17
JP6953400B2 (en) 2021-10-27
EP3365686A4 (en) 2019-03-27

Similar Documents

Publication Publication Date Title
US20200292555A1 (en) Cysteine reactive probes and uses thereof
Vinogradova et al. An activity-guided map of electrophile-cysteine interactions in primary human T cells
Ward et al. Covalent ligand screening uncovers a RNF4 E3 ligase recruiter for targeted protein degradation applications
Zambaldo et al. 2-Sulfonylpyridines as tunable, cysteine-reactive electrophiles
US10859585B2 (en) Lipid probes and uses thereof
Ulrich et al. Ubiquitin signalling in DNA replication and repair
Prieto et al. Large-scale differential proteome analysis in Plasmodium falciparum under drug treatment
Thinon et al. N-myristoyltransferase inhibition induces ER-stress, cell cycle arrest, and apoptosis in cancer cells
US11535597B2 (en) Photoreactive ligands and uses thereof
US20210255193A1 (en) Lysine reactive probes and uses thereof
Feldman et al. Selective inhibitors of SARM1 targeting an allosteric cysteine in the autoregulatory ARM domain
Meng et al. Drug design targeting active posttranslational modification protein isoforms
CN113853372A (en) Thio-heterocyclic exchange chemistry and uses thereof
Gilbert et al. Profiling sulfur (VI) fluorides as reactive functionalities for chemical biology tools and expansion of the ligandable proteome
Scarpino et al. WIDOCK: a reactive docking protocol for virtual screening of covalent inhibitors
Ahuja et al. Inhibition of protein synthesis by didemnin B: How EF-1α mediates inhibition of translocation
Sameshima et al. Discovery of an irreversible and cell-active BCL6 inhibitor selectively targeting Cys53 located at the protein–protein interaction interface
Xu et al. A comparison of two stability proteomics methods for drug target identification in OnePot 2D format
Fiolek et al. Fluspirilene analogs activate the 20S proteasome and overcome proteasome impairment by intrinsically disordered protein oligomers
Cheng et al. Shared and distinctive neighborhoods of emerin and lamin B receptor revealed by proximity labeling and quantitative proteomics
Xie et al. Development of Potent and Selective Coactivator-Associated Arginine Methyltransferase 1 (CARM1) Degraders
US20200278355A1 (en) Conjugated proteins and uses thereof
US20240123078A1 (en) Compounds and methods for modulating immune-related proteins
Gabizon et al. Best Practices for Design and Characterization of Covalent Chemical Probes
Bagci et al. The hGID GID4 E3 ubiquitin ligase complex targets ARHGAP11A to regulate cell migration

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16858391; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2018516113; Country of ref document: JP)
ENP Entry into the national phase (Ref document number: 3001847; Country of ref document: CA)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 2016858391; Country of ref document: EP)