CA2811403A1 - Bioproduction of aromatic chemicals from lignin-derived compounds - Google Patents

Bioproduction of aromatic chemicals from lignin-derived compounds

Info

Publication number
CA2811403A1
CA2811403A1 CA2811403A CA2811403A CA2811403A1 CA 2811403 A1 CA2811403 A1 CA 2811403A1 CA 2811403 A CA2811403 A CA 2811403A CA 2811403 A CA2811403 A CA 2811403A CA 2811403 A1 CA2811403 A1 CA 2811403A1
Authority
CA
Canada
Prior art keywords
lignin
polypeptide
daltons
beta
amino acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA2811403A
Other languages
French (fr)
Inventor
Ranjini Chatterjee
Kenneth Zahn
Kenneth Mitchell
Gary Y. Liu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aligna Technologies Inc
Original Assignee
Aligna Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aligna Technologies Inc
Publication of CA2811403A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y205/00 Transferases transferring alkyl or aryl groups, other than methyl groups (2.5)
    • C12Y205/01 Transferases transferring alkyl or aryl groups, other than methyl groups (2.5) transferring alkyl or aryl groups, other than methyl groups (2.5.1)
    • C12Y205/01018 Glutathione transferase (2.5.1.18)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1085 Transferases (2.) transferring alkyl or aryl groups other than methyl groups (2.5)
    • C12N9/1088 Glutathione transferase (2.5.1.18)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/02 Preparation of oxygen-containing organic compounds containing a hydroxy group
    • C12P7/22 Preparation of oxygen-containing organic compounds containing a hydroxy group aromatic
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/24 Preparation of oxygen-containing organic compounds containing a carbonyl group
    • C12P7/26 Ketones
    • C12P7/28 Acetone-containing products

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Compounds Of Unknown Constitution (AREA)

Abstract

The teachings provided herein are generally directed to a method of converting lignin-derived compounds to valuable aromatic chemicals using an enzymatic, bioconversion process. The teachings provide a selection of (i) host cells that are tolerant to the toxic compounds present in lignin fractions; (ii) polypeptides that can be used as enzymes in the bioconversion of the lignin fractions to the aromatic chemical products; (iii) polynucleotides that can be used to transform the host cells to express the selection of polypeptides as enzymes in the bioconversion of the lignin fractions; and (iv) the transformants that express the enzymes.

Description

BIOPRODUCTION OF AROMATIC CHEMICALS FROM LIGNIN-DERIVED COMPOUNDS
RANJINI CHATTERJEE
KENNETH ZAHN
KENNETH MITCHELL
GARY LIU
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application Nos. 61/403,440, filed 9/15/2010; and 61/455,709, filed 10/25/2010; each application of which is hereby incorporated herein by reference in its entirety.
SEQUENCE LISTING
[0002] The instant application is filed with an ASCII compliant text file of a Sequence Listing. The name of the attached file is ALIGPOO4US01 SEQLIST AS-FILED.txt, and the file was created August 29, 2011, is 813 KB in size, and is hereby incorporated herein by reference in its entirety. Because the ASCII compliant text file serves as both the paper copy required by 1.821(c) and the CRF required by 1.821(e), the statement indicating that the paper copy and CRF copy of the sequence listing are identical is no longer necessary under 37 C.F.R. 1.821(f), as per Federal Register /Vol. 74, No. 206 /Tuesday, October 27, 2009, Section I.
BACKGROUND OF THE INVENTION
Field of the Invention
[0003] The teachings provided herein are generally directed to a method of converting lignin-derived compounds to valuable aromatic chemicals using an enzymatic, bioconversion process.
Description of the Related Art
[0004] Currently, there is a global dependence on petroleum as a depletable feedstock for the manufacture of fuels and chemicals. The problems of using petroleum are so well known and documented that they have become nearly a cliché to the world population. In short, petroleum-based processes are dirty and hazardous. Environmental effects associated with the use of petroleum are known to include, for example, air pollution, global warming, damage from extraction, oil spills, tarballs, and health hazards to humans, domestic animals, and wildlife.
[0005] Oil refineries, for example, are petroleum-based processes that primarily produce gasoline. However, they are also used extensively to produce valuable and less well-known chemical products used in the manufacture of pharmaceuticals, agrochemicals, food ingredients, and plastics. A clean, green alternative to this market area would be appreciated worldwide.
[0006] Bioprocesses can present a clean, green alternative to the petroleum-based processes, a bioprocess being one that uses organisms, cells, organelles, or enzymes to carry out a commercial process. Biorefineries, for example, can produce chemicals, heat and power, as well as food, feed, fuel and industrial chemical products. Examples of biorefineries can include wet and dry corn mills, pulp and paper mills, and the biofuels industry. In leather tanning, hides are softened and hair is removed using proteases. In brewing, amylases are used in germinating barley. In cheese-making, rennin is used to coagulate the proteins in milk. The biofuels industry, for example, has been a point of focus recently, naturally focusing on fuel products to replace petroleum-based fuels and, as a result, has not developed other valuable chemical products that also rely on petroleum-based processes.
[0007] As such, biorefineries use enzymes to convert natural products to useful chemicals. A natural product, such as the wood that is used in a pulp and paper mill, contains cellulose, hemicelluloses, and lignin. A typical range of compositions for a hardwood may be about 40-44% cellulose, about 15-35% hemicelluloses, and about 18-25% lignin. Likewise, a typical range of compositions for a softwood may be about 40-44% cellulose, about 20-32% hemicelluloses, and about 25-35% lignin. Since all biofuels come from cellulosic biorefineries, where the key raw material is glucose, derived from cellulose, lignin remains underutilized. Lignin is the single most abundant source of aromatic compounds in nature, and the use of lignin is currently limited to low value applications, such as combustion to generate process heat and energy for the biorefinery facilities. In the alternative, lignin is sold as a natural component of animal feeds or fertilizers. Interestingly, however, lignin is the only plant biomass component based on aromatic core structures, and such core structures are valuable in the production of industrial chemicals. One of skill will appreciate that, unfortunately, a major obstacle to such a use of lignin remains: the aromatic compounds present in the lignin fraction of a biorefinery include toxic compounds that inhibit the growth and survival of industrial microbes. For at least these reasons, processes for converting lignin fractions to industrial products using industrial microbes have not been successful.
[0008] In view of the above, one of skill will appreciate (i) a clean, green replacement for petroleum-based processes in the production of valuable chemical products that include major markets such as, for example, pharmaceuticals, agrochemicals, food ingredients, and plastics; (ii) a profitable use of the abundant and renewable natural resource available in lignin, which is currently an industrial waste stream that is underutilized as an industrial feedstock; (iii) a selection of host cells that are tolerant to the toxic compounds present in lignin fractions in the feedstock; (iv) a selection of polypeptides that can be used as enzymes in the bioconversion of the lignin fractions to the valuable chemical products; (v) a selection of polynucleotides that can be used to transform host cells to express the selection of polypeptides in the bioconversion of the lignin fractions to the valuable chemical products;
(vi) systems that include transformants that express the enzymes, where the transformants can be used to (a) express the enzymes while in direct contact with the lignin fractions or (b) express the enzymes for extraction from the cells, after which the extracted enzymes are used directly in contact with the lignin fractions; and (vii) a clean-and-green method of producing valuable chemical products at higher profits than petroleum-based processes.
SUMMARY
[0009] This invention is generally directed to a recombinant method of producing enzymes for use in the bioconversion of lignin-derived compounds to valuable aromatic chemicals. In some embodiments, the teachings are directed to an isolated recombinant polypeptide, comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101. The sequence can conserve residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, G54, K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195.
[0010] In some embodiments, the teachings are directed to an isolated recombinant polypeptide, comprising SEQ ID NO:101; or conservative substitutions thereof outside of the conserved residues. The conserved residues can include T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, G54; K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195.
[0011] In some embodiments, the teachings are directed to an isolated recombinant glutathione S-transferase enzyme, comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101. The amino acid sequence can conserve residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, G54; K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
[0012] In some embodiments, the teachings are directed to an isolated recombinant glutathione S-transferase enzyme, comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
[0013] In some embodiments, the teachings are directed to an isolated recombinant polypeptide, comprising (i) a length ranging from about 279 to about 281 amino acids; (ii) a first amino acid region consisting of residues 19-54 from SEQ ID NO:101, or conservative substitutions thereof outside of conserved residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, and G54; wherein, the first amino acid region can be located in the recombinant polypeptide from about residue 14 to about residue 59; and, (iii) a second amino acid region consisting of residues 98-221 from SEQ ID NO:101, or conservative substitutions thereof outside of conserved residues K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195; wherein, the second amino acid region is located in the recombinant polypeptide from about residue 93 to about residue 226.
[0014] In some embodiments, the teachings are directed to an isolated recombinant glutathione S-transferase enzyme, comprising (i) a length ranging from about 279 to about 281 amino acids; (ii) a first amino acid region having at least 95% identity to residues 19-54 from SEQ ID NO:101 while conserving residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, and G54; wherein, the first amino acid region is located in the recombinant polypeptide from about residue 14 to about residue 59; and, (iii) a second amino acid region having at least 95% identity to residues 98-221 from SEQ ID NO:101 while conserving residues K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195; wherein, the second amino acid region can be located in the recombinant polypeptide from about residue 93 to about residue 226; and, the recombinant glutathione S-transferase enzyme can function to cleave a beta-aryl ether.
[0015] In some embodiments, the teachings are directed to an isolated recombinant glutathione S-transferase enzyme, comprising an amino acid sequence having at least 95% identity to SEQ ID NO:541; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
[0016] In some embodiments, the teachings are directed to an isolated recombinant polypeptide, comprising (i) a length ranging from about 256 to about 260 amino acids; (ii) a first amino acid region consisting of residues 47-57 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues A47, I48, N49, P50, G52, V54, P55, V56, L57; wherein, the first amino acid region is located in the recombinant polypeptide from about residue 45 to about residue 57; (iii) a second amino acid region consisting of 63-76 from SEQ ID NO:541; and, (iv) a third amino acid region consisting of residues 99-230 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues R100, Y101, K104, D107, M111, N112, S115, M116, K176, L194, I197, N198, S201, H202, and M206; wherein, the second amino acid region is located in the recombinant polypeptide from about residue 94 to about residue 235.
[0017] In some embodiments, the teachings are directed to an isolated recombinant glutathione S-transferase enzyme, comprising (i) a length ranging from about 279 to about 281 amino acids; (ii) a first amino acid region having at least 95% identity to 47-57 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues A47, I48, N49, P50, G52, V54, P55, V56, L57; wherein, the first amino acid region can be located in the recombinant polypeptide from about residue 45 to about residue 57; (iii) a second amino acid region consisting of 63-76 from SEQ ID NO:541; and, (iv) a third amino acid region having at least 95% identity to residues 99-230 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues R100, Y101, K104, D107, M111, N112, S115, M116, K176, L194, I197, N198, S201, H202, and M206; wherein, the second amino acid region can be located in the recombinant polypeptide from about residue 94 to about residue 235; wherein, the recombinant glutathione S-transferase enzyme functions to cleave a beta-aryl ether.
[0018] In some embodiments, an amino acid substitution outside of the conserved residues can be a conservative substitution. And, in many embodiments, the amino acid sequence can function to cleave a beta-aryl ether.
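By way of a non-limiting illustration, the "at least 95% identity" and "conserved residue" criteria recited above might be checked computationally along the following lines. This is a minimal sketch under stated assumptions: the two sequences are assumed to be already aligned (gaps written as "-"), identity is computed over reference positions only (one of several possible conventions, not one specified in the text), and the sequences and conserved positions shown are toy values, not SEQ ID NO:101 or SEQ ID NO:541. The function names are hypothetical.

```python
from typing import Dict, List


def percent_identity(ref_aln: str, cand_aln: str) -> float:
    """Percent identity over alignment columns where the reference has a residue."""
    if len(ref_aln) != len(cand_aln):
        raise ValueError("aligned sequences must have equal length")
    columns = [(r, c) for r, c in zip(ref_aln, cand_aln) if r != "-"]
    if not columns:
        raise ValueError("reference contains no residues")
    matches = sum(1 for r, c in columns if r == c)
    return 100.0 * matches / len(columns)


def changed_conserved(ref_aln: str, cand_aln: str,
                      conserved: Dict[int, str]) -> List[int]:
    """1-based reference positions where a conserved residue is not retained."""
    changed = []
    position = 0
    for r, c in zip(ref_aln, cand_aln):
        if r == "-":
            continue  # insertion relative to the reference; no reference position
        position += 1
        if position in conserved and c != conserved[position]:
            changed.append(position)
    return changed


# Toy data only (assumption): these strings are NOT SEQ ID NO:101.
reference = "MKTISPAVWTKYAHKGFDIV"
candidate = "MKTISPAVWTKYAHKGFDIL"  # one substitution, at position 20
toy_conserved = {3: "T", 5: "S", 9: "W"}

identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identity; conserved positions changed: "
      f"{changed_conserved(reference, candidate, toy_conserved)}")
```

In this sketch, a candidate would meet both criteria when the identity is at least 95 and the list of changed conserved positions is empty.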
[0019] The teachings are also directed to a method of cleaving a beta-aryl ether bond, the method comprising contacting a polypeptide taught herein with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons; wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
[0020] In some embodiments, the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons. In some embodiments, the solvent environment comprises water. And, in some embodiments, the solvent environment comprises a polar organic solvent.
[0021] The teachings are also directed to a system for bioprocessing lignin-derived compounds, the system comprising a polypeptide taught herein, a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble; wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
[0022] The teachings are also directed to a recombinant polynucleotide comprising a nucleotide sequence that encodes a polypeptide taught herein. Likewise, the teachings are also directed to a vector or plasmid comprising the polynucleotide, as well as a host cell transformed by the vector or plasmid to express the polypeptide.
[0023] The teachings are also directed to a method of cleaving a beta-aryl ether bond, the method comprising (i) culturing a host cell taught herein under conditions suitable to produce a polypeptide taught herein; (ii) recovering the polypeptide from the host cell culture; and, (iii) contacting the polypeptide of claim 1 with a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
[0024] In some embodiments, the host cell can be E. coli or an Azotobacter strain, such as Azotobacter vinelandii. And, in some embodiments, the lignin-derived compound can have a molecular weight of about 180 Daltons to about 1000 Daltons.
[0025] The teachings are also directed to a system for bioprocessing lignin-derived compounds, the system comprising (i) a transformed host cell taught herein; (ii) a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, (iii) a solvent in which the lignin-derived compound is soluble; wherein, the system functions to cleave the beta-aryl ether bond by contacting a polypeptide taught herein with the lignin-derived compound in the solvent.
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] FIGs. 1A and 1B illustrate general concepts of the biorefinery and discovery processes discussed herein, according to some embodiments.
[0027] FIG. 2 illustrates the structures of some building block chemicals that can be produced using bioconversions, according to some embodiments.
[0028] FIG. 3 is an example of a beta-etherase catalyzed hydrolysis of a model lignin dimer, α-O-(β-methylumbelliferyl) acetovanillone (MUAV), according to some embodiments.
[0029] FIG. 4 illustrates unexpected results from biochemical activity assays for beta-etherase function for the S. paucimobilis positive control polypeptides, and the N. aromaticivorans putative beta-etherase polypeptide, according to some embodiments.
[0030] FIG. 5 illustrates beta-aryl-ether compounds to be tested as substrates representing native lignin structures, according to some embodiments.
[0031] FIG. 6 illustrates pathways of guaiacylglycerol-β-guaiacyl ether (GGE) metabolism by S. paucimobilis, according to some embodiments.
[0032] FIG. 7 illustrates an example of a biochemical process for the production of catechol from lignin oligomers, according to some embodiments.
[0033] FIG. 8 illustrates an example of a biochemical process for the production of vanillin from lignin oligomers, according to some embodiments.
[0034] FIG. 9 illustrates an example of a biochemical process for the production of 2,4-diaminotoluene from lignin oligomers, according to some embodiments.
[0035] FIG. 10 illustrates process schemes for additional product targets that include ortho-cresol, salicylic acid, and aminosalicylic acid, for the production of valuable chemicals from lignin oligomers, according to some embodiments.
DETAILED DESCRIPTION OF THE INVENTION
[0036] This invention is generally directed to a recombinant method of producing enzymes for use in the bioconversion of lignin-derived compounds to valuable aromatic chemicals. Currently, the art is limited in its ability to control the degradation of lignin to produce useful products, as it is limited in its knowledge of enzymes that are capable of selectively converting lignin into desired aromatic compounds. Generally, the art knows two basic things: (1) lignin is complex; and (2) bacterial lignin degradation systems are therefore at least as complex as lignin itself. Accordingly, and for at least these reasons, the teachings provided herein offer a valuable, unexpected, and surprising set of systems, methods, and compositions of matter that will be useful in the production of industrially useful aromatic chemicals.
[0037] FIGs. 1A and 1B illustrate general concepts of the biorefinery and discovery processes discussed herein, according to some embodiments. FIG. 1A shows a generalized example of a use of recombinant microbial strains in biotransformations for the production of aromatic chemicals from lignin-derived compounds. Biorefinery process 100 converts a soluble biorefinery lignin 105 through a series of biotransformations using a transformed host cell.
The biorefinery lignin 105 is a feedstock comprising a lignin-derived compound which can be, for example, a combination of lignin-derived monomers and oligomers.
"Biotransformation 1" 107 can be used to selectively cleave a bond on or between monomers to create additional lignin monomers 110. "Biotransformation 2" 112 can be used to selectively cleave an additional bond on or between monomers to create mono-aromatic commercial products 115. FIG. 1B shows a discovery process 120, which includes selecting a host cell strain that is tolerant to toxic lignin-derived compounds. The strain acquisition 125 includes growth of the strain, sample preparation, and storage. A set of bacterial strains are obtained for testing strain tolerance to soluble biorefinery lignin samples.
[0038] In some embodiments, the strains can be selected for (i) having well-characterized aromatic and xenobiotic metabolisms; (ii) having annotated genome sequences; and (iii) having prior use in fermentation processes at pilot or larger scales. Examples of strains can include, but are not limited to, Azotobacter vinelandii (ATCC BAA-1303 DJ), Azotobacter chroococcum (ATCC 4412 (EB Fred) X-50), Pseudomonas putida (ATCC BAA-477 Pf-5), Pseudomonas fluorescens (ATCC 29837 NCTC 1100). Strains can be streaked on relevant rich media plates as described by the accompanying ATCC literature for revival. Individual colonies (5 each) can be picked and cultured on relevant liquid media to saturation. Culture samples prepared in a final glycerol concentration of 12.5% can be flash-frozen and stored at -80 °C.
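As a worked illustration of the 12.5% final glycerol concentration mentioned above, the volume of a glycerol stock to mix with a culture sample follows from a simple dilution balance. The sketch below assumes a 50% (v/v) glycerol stock and a 750 µl culture aliquot; both values, and the function name, are illustrative assumptions and are not taken from the text.

```python
def glycerol_stock_volume(culture_ul: float, stock_pct: float,
                          final_pct: float = 12.5) -> float:
    """Volume of glycerol stock (in µl) to add to a culture to reach final_pct.

    Solves stock_pct * v / (culture_ul + v) = final_pct for v.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must be positive and below the stock")
    return final_pct * culture_ul / (stock_pct - final_pct)


# Assumed values: 750 µl of culture and a 50% (v/v) glycerol stock.
volume = glycerol_stock_volume(culture_ul=750.0, stock_pct=50.0)
print(f"add {volume:.0f} µl of 50% glycerol to 750 µl of culture")
```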
[0039] The model substrate synthesis 150 for use in the biochemical screening for selective activity can be outsourced through a contract research organization (CRO). The enzyme discovery effort can initially be focused on identifying potential beta-etherase candidate genes identified through bioinformatic methods. The identification of candidates having beta-etherase activity is the first step towards generating lignin monomers from lignin oligomers present in soluble lignin streams. The fluorescent substrate α-O-(β-methylumbelliferyl) acetovanillone (MUAV), for example, can be used in in vitro assays to identify beta-etherase function (Acme Biosciences, Mt. View, CA). The formation of 4-methylumbelliferone (4MU) upon hydrolysis of the aryl ether bond can be monitored by fluorescence, for example, at λex = 365 nm and λem = 450 nm (or 460 nm).
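For illustration only, a fluorescence time course from such an assay might be reduced to a single activity value by fitting a straight line to the signal and subtracting the rate seen in a no-enzyme blank. The following sketch assumes evenly or unevenly spaced time points with fluorescence in arbitrary units; the readings, the blank-subtraction step, and the function names are assumptions and are not described in the text.

```python
from typing import Sequence


def slope(times: Sequence[float], signals: Sequence[float]) -> float:
    """Ordinary least-squares slope of signal versus time."""
    n = len(times)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two paired time/signal readings")
    mean_t = sum(times) / n
    mean_s = sum(signals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
    return sxy / sxx


def net_rate(times: Sequence[float], sample: Sequence[float],
             blank: Sequence[float]) -> float:
    """Rate of fluorescence increase in the sample minus that of the blank."""
    return slope(times, sample) - slope(times, blank)


# Made-up readings (fluorescence units) at 0, 1, 2 and 3 minutes.
minutes = [0.0, 1.0, 2.0, 3.0]
extract = [100.0, 150.0, 200.0, 250.0]
no_enzyme = [100.0, 102.0, 104.0, 106.0]
print(f"net rate: {net_rate(minutes, extract, no_enzyme):.1f} units/min")
```

Converting the net rate to an amount of 4MU released would additionally require a standard curve prepared with known 4MU concentrations.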
[0040] The gene synthesis, cloning, and transformation step 145 can include combining bioinformatic methods with known information about enzymes showing a desired, selective enzyme activity. For example, bioinformatics can produce a putative beta-etherase sequence that shares a significant homology to the S. paucimobilis ligE and ligF beta-etherase sequences. See Masai, E., et al. Journal of Bacteriology (3):1768-1775(2003)("Masai"), which is hereby incorporated herein in its entirety by reference. The S. paucimobilis sequences can be used as positive controls for biochemical assays to show relative activities in an enzyme discovery strategy.
[0041] The gene synthesis, cloning, and transformation step 145 can be performed using any method known to one of skill. For example, all genes can be synthesized directly as open reading frames (ORFs) from oligonucleotides by using standard PCR-based assembly methods, and using the E. coli codon bias. The end sequences can contain adaptors (BamHI and HindIII) for restriction digestion and cloning into the E. coli expression vector pET24a (Novagen). Internal BamHI and HindIII sites can be excluded from the ORF sequences during design of the oligonucleotides. Assembled genes can be cloned into the proprietary cloning vector (pG0V4), transformed into E. coli CH3 chemically competent cells, and DNA sequences determined (Tocore Inc.) from purified plasmid DNA. After sequence verification, restriction digestion can be used to excise each ORF fragment from the cloning vector, and the sequence can be sub-cloned into pET24a. The entire set of ligE and ligF bearing plasmids can then be transformed into E. coli BL21 (DE3) which can serve as the host strain for beta-etherase expression and biochemical testing.
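The exclusion of internal BamHI and HindIII sites during ORF design can be illustrated with a short check. The sketch below scans a designed ORF for the recognition sequences GGATCC (BamHI) and AAGCTT (HindIII); because both sites are palindromic, scanning one strand is sufficient. The helper names are hypothetical, and the adaptor step is deliberately simplified: it appends only the bare recognition sequences and ignores spacer bases, reading frame, and other vector-specific details.

```python
from typing import Dict, List

# Recognition sequences of the two adaptor enzymes named in the text.
SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_internal_sites(orf: str) -> Dict[str, List[int]]:
    """0-based start positions of each recognition site within the ORF."""
    seq = orf.upper()
    hits: Dict[str, List[int]] = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


def add_adaptors(orf: str) -> str:
    """Flank a site-free ORF with a 5' BamHI site and a 3' HindIII site."""
    if any(find_internal_sites(orf).values()):
        raise ValueError("ORF contains an internal BamHI or HindIII site")
    return SITES["BamHI"] + orf.upper() + SITES["HindIII"]


# Toy ORFs (assumptions), not sequences from the Sequence Listing.
print(find_internal_sites("ATGGGATCCAAATAA"))
print(add_adaptors("ATGAAATAA"))
```

An ORF that fails the check would typically be redesigned with a synonymous codon change at the offending position before oligonucleotide synthesis.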
[0042] The enzyme screening 155 is done to identify novel etherases 160. The fluorescent substrate MUAV can be used to screen for and identify beta-etherase activity from the recombinant E. coli clones. Expression of the beta-etherase genes can be done in 5 ml or 25 ml samples of the recombinant E. coli strains in LB medium using induction with IPTG. Following induction, and cell harvest, cell pellets can be lysed using the BPER (Invitrogen) cell lysis system. Cell extracts can be tested in the in vitro biochemical assay for beta-etherase activity on the fluorescent substrate MUAV. The formation of 4-methylumbelliferone (4MU) upon hydrolysis of the aryl ether bond in MUAV can be monitored by fluorescence at λex = 365 nm and λem = 460 nm, and can provide quantitative measurement of beta-etherase function. Cell extracts of E. coli transformed with the S. paucimobilis ligE and ligF genes can be the assay positive controls. Test or unknown samples can include, for example, E. coli strains expressing putative beta-etherase genes from N. aromaticivorans.
[0043] The lignin stream acquisition 130 includes a waste lignin stream from a biorefinery for testing. A preliminary characterization of one source of such lignin has shown an aromatic monomer concentration of less than 1 g/L and an oligomer concentration of ~10 g/L. Oligomers appear to be associated with carbohydrates in a 10:1 ratio for sugar:phenolics. Some information exists on compounds in the liquid stream, including benzoic acid, vanillin, syringic acid and ferulics, which are routinely quantified in soluble samples. An average molecular weight of ~280 has been established for the monomers; and the oligomeric components remain to be characterized.
[0044] In the strain tolerance testing 135, strain tolerance will be determined by cell growth upon exposure to biorefinery lignin. Tolerance to the phenolic compounds in the biorefinery lignin waste stream will be critically important to the bioprocess efficiency and high level production of aromatic chemicals by microbial systems. Cell growth will be quantified as a function of respiration by the reduction of soluble tetrazolium salts. XTT (2,3-Bis(2-methoxy-4-nitro-5-sulfophenyl)-2H-tetrazolium-5-carboxanilide inner salt, Sigma) is reduced to a soluble purple formazan compound by respiring cells. The formazan product will be detected and quantified by absorbance at 450 nm.
[0045] Strain tolerance testing 135 on soluble lignin can be done in liquid format in 48-well plates, for example. Each strain can be tested in replicates of 8, for example, and E. coli can be used as a negative control strain. Strains can first be grown in rich medium to saturation, washed, and the OD600 of the cultures determined. Equal numbers of bacteria can be inoculated into wells of the 48-well growth plate containing minimal medium excluding a carbon source. Increasing concentrations of soluble lignin fractions, in addition to a minus-lignin positive control, can be added to the wells containing each species to a final volume of 0.8 ml. A benzoic acid content analysis of the lignin fractions can be used as an internal indicator of the phenolic content of lignin wastes of different origin. Following incubation for 24-48 hours with shaking at 30 °C, the cultures can be tested for growth upon exposure to the lignin fraction using an XTT assay kit. Culture samples can be removed from the 48-well growth plate and diluted appropriately in 96-well assay plates to which the XTT reagent can be added. The soluble formazan produced will be quantified by absorbance at 450 nm. Bacterial strains exhibiting the highest level of growth, and therefore tolerance, can be candidates for further development as host strains for lignin conversions.
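One possible way to turn the absorbance readings from such a plate into a strain ranking is sketched below. It assumes that, for each strain, replicate A450 values are available at each lignin concentration including a no-lignin well (concentration 0.0), and it scores tolerance as mean absorbance relative to that no-lignin control. The scoring rule, the data layout, the strain labels, and the numbers are illustrative assumptions, not a procedure specified in the text.

```python
from statistics import mean
from typing import Dict, List

# strain -> lignin concentration -> replicate A450 readings
Plate = Dict[str, Dict[float, List[float]]]


def relative_growth(readings: Dict[float, List[float]]) -> Dict[float, float]:
    """Mean A450 at each concentration divided by the mean no-lignin A450."""
    control = mean(readings[0.0])
    if control <= 0:
        raise ValueError("no-lignin control must have positive absorbance")
    return {conc: mean(values) / control for conc, values in readings.items()}


def rank_strains(plate: Plate, conc: float) -> List[str]:
    """Strains ordered from most to least tolerant at the given concentration."""
    scores = {strain: relative_growth(r)[conc] for strain, r in plate.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up readings for two hypothetical strains (two replicates shown for brevity).
toy_plate: Plate = {
    "strain_A": {0.0: [1.0, 1.0], 5.0: [0.8, 0.8]},
    "strain_B": {0.0: [1.0, 1.0], 5.0: [0.2, 0.2]},
}
print(rank_strains(toy_plate, conc=5.0))
```

Normalizing to each strain's own no-lignin control keeps strains with different baseline respiration rates comparable.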
[0046] The strain demonstrated to have the best tolerance characteristics can be transformed with the beta-etherase gene identified as showing the highest biochemical activity.
Restriction digestion can be used to excise the ORF fragment from the cloning vector, and the sequence can be sub-cloned into the shuttle vector pMMB206. Constructs cloned in the shuttle vector can be transformed into Azotobacter or Pseudomonas strains by electroporation, or chemical transformation. The recombinant, lignin tolerant host strain can be re-tested for beta-etherase expression and activity using any methods known to one of skill, such as those described herein, adapted to the particular host strain being used.
Feedstock from biorefinery processes
[0047] An example of a starting material might be pretreated lignocellulosic biomass. In some embodiments, the lignocellulose biomass material might include grasses, corn stover, rice hull, agricultural residues, softwoods and hardwoods. In some embodiments, the lignin-derived compounds might be derived from hardwood species such as poplar from the Upper Peninsula region of Michigan, or hardwoods such as poplar, loblolly pine, and eucalyptus from Virginia and Georgia areas, or mixed hardwoods including maple and oak species from upstate New York.
[0048] In some embodiments, the pretreatment methods might encompass a range of physical, chemical and biological based processes. Examples of pretreatment methods used to generate the feedstock for Aligna processes might include physical pretreatment, solvent fractionation, chemical pretreatment, biological pretreatment, ionic liquids pretreatment, supercritical fluids pretreatment, or a combination thereof, for example, which can be applied in stages.
[0049] Physical pretreatment methods used to reduce the lignocellulose biomass particle size might utilize mechanical stress methods of dry, wet vibratory and compression-based ball milling procedures. Solvent fractionation methods include organosolve processes, phosphoric acid fractionation processes, and methods using ionic liquids to pretreat the lignocellulose biomass to differentially solubilize and partition various components of the biomass. In some embodiments, organosolve methods might be performed using alcohol, including ethanol, with an acid catalyst at temperature ranges from about 90 to about 20 C, and from about 155 to about 220 C with residence time of about 25 minutes to about 100 minutes. Catalyst concentrations can vary from about 0.83% to about 1.67% and alcohol concentrations can vary from about 25% to about 74% (v/v).
In some embodiments, phosphoric acid fractionations of lignocellulose biomass might be performed using a series of different extractions using phosphoric acid, acetone, and water at temperature of around 50 C. In some embodiments, ionic liquid pretreatment of lignocellulose biomass might include use of ionic liquids containing anions like chloride, formate, acetate, or alkylphosphonate, with biomass:ionic liquids ratios of approximately 1:10 (w/w). The pretreatment might be performed at temperatures ranging from about 100 C to about 150 C. Other ionic liquid compounds that might be used include 1-butyl-3-methyl-imidazolium chloride and 1-ethyl-3-methylimidazolium chloride.
[0050] Chemical pretreatments of lignocellulose biomass material might be performed using technologies that include acidic, alkaline and oxidative treatments. In some embodiments, acidic pretreatment methods of lignocellulose biomass such as those described below might be applied. Dilute acid pretreatments using sulfuric acid at concentrations in the approximate range of about 0.05% to about 5%, and temperatures in the range of about 160 C to about 220 C. Steam explosion, with or without the use of catalysts such as sulfuric acid, nitric acid, carbonic acid, succinic acid, fumaric acid, maleic acid, citric acid, sulfur dioxide, sodium hydroxide, ammonia, before steam explosion, at temperatures between about 160 C to about 290 C. Liquid hot water treatment at pressure >5MPa at temperatures ranging from about 160 C to about 230 C, and pH range between about 4 and about 7.
And, in some embodiments, alkaline pretreatment methods using catalysts such as calcium oxide, ammonia, and sodium hydroxide might be used. The ammonia fiber expansion (AFEX) method might be applied in which concentrated ammonia at about 0.3kg to about 2kg of ammonia per kg of dry weight biomass is used at about 60 C to about 140 C in a high pressure reactor, and cooked for 5-45 minutes before rapid pressure release.
The ammonia recycle percolation (ARP) method might be used in flow through mode by percolating ammoniacal solutions at 5-15% concentrations at high temperatures and pressures.
Oxidative pretreatment methods such as alkaline wet oxidation might be used with sodium carbonate at a temperature ranging from about 170 C to about 220 C in a high pressure reactor using pressurized air/oxygen mixtures or hydrogen peroxide as the oxidants.
[0051] Biological pretreatment methods using white rot basidomycetes and certain actinomycetes might be applied. One type of product stream from such pretreatment methods might be soluble lignin, and might contain lignin-derived monomers and oligomers in the range of about 1g/L to about 10g/L, and xylans. The lignin-derived monomers might include compounds such as gallic acid, hydroxybenzoate, ferulic acid, hydroxymethyl furfural, hydroxymethyl furfural alcohol, vanillin, homovanillin, syringic acid, syringaldehyde, and furfural alcohol.
[0052] Supercritical fluid pretreatment methods might be used to process the biomass.
Examples of supercritical fluids for use in processing biomass include ethanol, acetone, water, and carbon dioxide at a temperature and pressures above the critical points for ethanol and carbon dioxide but at a temperature and/or pressure below that of the critical point for water.
[0053] Combinations of steam pretreatment and biological pretreatment methods might be applied. For example, a biomass stream can be pretreated at 195 C for 10 min at controlled pH, followed by enzymatic treatment using commercial cellulases and xylanases at dosings of 100 mg protein/g total solid, and with incubation at 50 C at pH 5.0 with agitation of 500 rpm.
[0054] In some embodiments, combinations of hydrothermal, organosolve, and biological pretreatment methods might be used. One example of such a combination is a 3 stage process:

Stage 1. Use heat in an aqueous medium at a predetermined pH, temperature and pressure for the hydrothermal process;
Stage 2. Use at least one organic solvent from those described in 6-6c in water for the organosolve step;
Stage 3. Use yeast, white rot basidomycetes, actinomycetes, and cellulases and xylanases in native or recombinant forms for the biological pretreatment step.
[0055] Soluble lignin fractions derived using organosolve methods might produce soluble lignins in the molecular weight range of 188-1000, soluble in various polar solvents.
Without intending to be bound by any theory or mechanism of action, organosolve processes are generally believed to maintain the lignin beta-aryl ether linkage.
[0056] Lignin streams from steam exploded lignocellulosic biomass might be used. Steam explosion might be performed, for example, using high pressure steam in the range of about 200 psi to about 500 psi, and at temperatures ranging from about 180 C to about 230 C for about 1 minute to about 20 minutes in batch or continuous reactors. The lignin might be extracted from the steam-exploded material with alkali washing or extraction using organic solvents. Steam exploded lignins can exhibit properties similar to those described for organosolve lignins, retaining native bond structures and containing about 3 to about 12 aromatic units per oligomer unit.
[0057] Supercritical fluid pretreatment can produce soluble lignin fractions that can be used with the teachings provided herein. Such processes typically yield monomers and lignin oligomers having a molecular weight of about <1000 Daltons.
[0058] Biological pretreatment can produce soluble lignin fractions that can be used with the teachings provided herein. Such lignin streams might contain lignin monomers and oligomers in the range of about 1 g/L to about 10 g/L and have a molecular weight of about <1000 Daltons, and xylans. The lignin-derived monomers might include compounds such as gallic acid, hydroxybenzoate, ferulic acid, hydroxymethyl furfural, hydroxymethyl furfural alcohol, vanillin, homovanillin, syringic acid, syringaldehyde, and furfural alcohol.
Feedstock from wood pulping processes
[0059] Wood pulping processes produce a variety of lignin types, the type of lignin dependent on the type of process used. Chemical pulping processes include, for example, Kraft and sulfite pulping.
[0060] In some embodiments, the lignin-derived compound can be derived from a spent pulping liquor or "black liquor" from Kraft pulping processes. Kraft lignin might be derived from batch or continuous processes using, for example, reaction temperatures in the range of about 150 C to about 200 C and reaction times of approximately 2 hours. Any range of molecular weights of lignin may be obtained, and the useful fraction may range, in some embodiments, from about 200 Daltons to about 4000 Daltons. A Kraft lignin having a molecular weight ranging from about 1000 Daltons to about 3000 Daltons might be used in a bioconversion.
[0061] In some embodiments, lignin from a sulfite pulping process might be used. A sulfite pulping process can include, for example, a chemical sulfonation using aqueous sulfur dioxide, bisulfite and monosulfite at a pH ranging from about 2 to about 12.
The sulfonated lignin might be recovered by precipitation with excess lime as lignosulfonates. Alternatively, formaldehyde-based methylation of the lignin aromatics followed by sulfonation might be performed. Any range of molecular weights of lignin may be obtained, and the useful fraction may range, in some embodiments, from about 200 Daltons to about 4000 Daltons.
A sulfite lignin having a molecular weight ranging from about 1000 Daltons to about 3000 Daltons might be used in a bioconversion.
Characterization of lignin-derived compounds for use in bioconversion
[0062] Optimization of a system for a particular feedstock should include an understanding of the composition of the particular feedstock. For example, one of skill will appreciate that the composition of a native lignin can be significantly different than the composition of the lignin-derived compounds in a given lignin fraction that is used for a feedstock.
Accordingly, an understanding of the composition of the feedstock will assist in optimizing the conversion of the lignin-derived compounds to the valuable aromatic compounds. Any method known to one of skill can be used to characterize the compositions of the feedstock.
For example, one of skill may use wet chemistry techniques, such as thioacidolysis and nitrobenzene oxidation, coupled with gas chromatography, which have been used traditionally, or spectroscopic techniques such as NMR and FTIR. Thioacidolysis, for example, cleaves the β-O-4 linkages in lignin, giving rise to monomers and dimers which are then used to calculate the S and G content. Similar information can be obtained using nitrobenzene oxidation, but the ratios are thought to be less accurate. In some embodiments, the content of S, G, and H, as well as their relative ratios can be used to characterize feedstock compositions for purposes of determining a bioconversion system design.
[0063] It is widely accepted that the biosynthesis of lignin stems from the polymerization of three types of phenylpropane units, also referred to as monolignols. These units are coniferyl, sinapyl, and p-coumaryl alcohol. The three structures are as follows:
[Chemical structures of the three monolignols: p-coumaryl alcohol (H); coniferyl alcohol (G); and, sinapyl alcohol (S).]
[0064] Tables 1A and 1B summarize distributions of p-coumaryl alcohol or p-hydroxyphenyl (H), coniferyl alcohol or guaiacyl (G), and sinapyl alcohol or syringyl (S) lignin in several sources of biomass. Table 1A compares percent lignin in the biomass to the G:S:H.
[0065] Table 1A.
Biomass                        Lignin (%)     G              S              H
Wheat Straw                    16-21          45             46             9
Rice Straw                     6              45             40             15
Rye Straw                      18             43             53             1
Hemp                           8-13           51             40             9
Tall Fescue:
  Stems                        7-10           55             42             [illegible]
  Internodes                   11             48             50             [illegible]
Flax                           21-34          67             29             4
Jute                           15-26          36             62             [illegible]
Sisal                          7-14           22             76             [illegible]
Curaua Leaf Fiber              7              29             41             30
Banana Plant Leaf              [illegible]    41             50             7
Piassava Fiber (Palm Tree)     45             4.0            9              51
Abaca                          7-9            19             55             26
Loblolly Pine                  [illegible]    86             2              12
  Compression                  (legible values 60 and 40; column positions not recoverable)
Spruce (Picea abies)           28             94             1              5
Eucalyptus globulus            22             14             84             2
Eucalyptus grandis             27             [illegible]    69             4
Birch pendula                  [illegible]    19             69             1
Beech                          26             56             40             4
Acacia                         18             48             49             1

Table 1B compares location of a sample in the biomass, species, and environmental stress to the G:S:H.
Table 1B.
Sample                                   G              S
White Birch
  Fiber, S2 layer                        12             88
  Vessel, S2 layer                       [illegible]    [illegible]
  Ray parenchyma, [illegible] layer      49             51
  Middle lamella (four entries; qualifiers and column positions illegible; legible values: 91, 20, 100, 12)
Lignin samples
  Carpinus betulus MWL                   (legible values: 19, 1)
  [illegible] MWL                        (legible values: 35, 6)
  [illegible] MWL                        (no legible values)
  [illegible] kraft lignin               (legible values: 72, 3)
  Eucalyptus kraft lignin                (legible values: 2.2, 73, 6)
Loblolly Pine
  Normal                                 (legible value: 95)
  Wind, opposite                         (legible values: 96, 4)
  Wind, compression                      (legible value: 89)
  Bent, opposite                         (legible values: 96, 4)
  Bent, compression                      (legible value: 12)
[0066] In general, the relative amounts of G, S, and H in lignin can be a good indicator of its overall composition and response to a treatment, such as the bioconversions taught herein.
In poplar species, for example, differences can be seen based on the measurement technique as well as species, but in general the S/G ratio ranges from 1.3 to 2.2. This is similar to the hardwood eucalyptus, but higher than herbaceous biomass switchgrass and Miscanthus. This is to be expected given the higher H contents in grass lignin. An optimized nitrobenzene oxidation method applied to 13 poplar samples from two different sites gave S/G ratios ranging from 1.01 to 1.68. Further, a linear correlation (R2 = 0.85) has been found in poplar between decreasing lignin content and increasing S/G ratios. The correlation was stronger (R2 = 0.93) in samples from a single site, suggesting a dependency on geographic location.
[0067] Higher throughput methods can be used for rapid screening of feedstocks. Examples of such methods can include, but are not limited to, near-infrared (NIR), reflectance spectroscopy, pyrolysis molecular beam mass spectrometry (pyMBMS), Fourier transform infrared spectroscopy, a modified thioacidolysis technique, and whole cell NMR after dissolution in ionic liquids. Information on some structural characteristics of lignin, such as S/G ratios, can be rapidly obtained using these methods. The average S:G:H ratio of 104 poplar lignin samples, for example, was determined using the modified thioacidolysis technique, and was found to be 68:32:0.02. In some embodiments, the S, G, and H components in the ratio can be expressed as mass percent. In some embodiments, the S, G, and H components in the ratio can be expressed as any relative unit, or unitless. Any comparison can be used, if the amount of each component directly correlates with the other respective components in the composition. The ratios can be expressed in relative whole numbers or fractions as S:G:H, or any other order or combination of components, S/G, G/S, and the like. In some embodiments, the S/G ratio is used. In some embodiments, the S/G ratio can range from about 0.20 to about 20.0, from about 0.3 to about 18.0, from about 0.4 to about 15.0, from about 0.5 to about 15.0, from about 0.6 to about 12.0, from about 0.7 to about 10.0, from about 0.8 to about 8.0, from about 0.9 to about 9.0, from about 1.0 to about 7.0, or any range therein. In some embodiments, the S/G ratio can be about 0.2, about 0.4, about 0.6, about 0.8, about 1.0, about 1.2, about 1.4, about 1.6, about 1.8, about 2.0, about 2.2, about 2.4, about 2.6, about 2.8, about 3.0, about 3.2, about 3.4, about 3.6, about 3.8, about 4.0, about 4.2, about 4.4, about 4.6, about 4.8, about 5.0, about 5.2, about 5.4, about 5.6, about 5.8, about 6.0, about 6.2, about 6.4, about 6.6, about 6.8, about 7.0, about 7.2, about 7.4, about 7.6, about 7.8, about 8.0, about 8.2, about 8.4, about 8.6, about 8.8, about 9.0, about 9.2, about 9.4, about 9.6, about 9.8, about 10.0, and any ratio in-between on 0.1 increments, and any range of ratios therein.
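As an illustration of the ratio arithmetic described above, the following minimal sketch computes an S/G ratio from S:G:H components and checks it against one of the recited ranges. The function names are hypothetical and introduced only for this example; the 68:32:0.02 input is the average poplar ratio stated above, and the bounds "about 0.20 to about 20.0" are treated as hard limits purely for simplicity.

```python
def sg_ratio(s, g):
    """S/G ratio from S and G components in any common relative unit."""
    if g <= 0:
        raise ValueError("G component must be positive")
    return s / g

def normalize_sgh(s, g, h):
    """Rescale S:G:H components so that they sum to 100 (percent)."""
    total = s + g + h
    if total <= 0:
        raise ValueError("components must sum to a positive value")
    return tuple(100.0 * x / total for x in (s, g, h))

def within(value, low, high):
    """True when value lies in the closed interval [low, high]."""
    return low <= value <= high

# Average poplar S:G:H ratio stated in the text: 68:32:0.02.
ratio = sg_ratio(68, 32)                 # 2.125
in_broadest_range = within(ratio, 0.20, 20.0)
```

Because the ratio is unitless, the same functions apply whether the components are expressed as mass percent or as any other relative unit.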
Fractionation of lignin-derived compounds for use in bioconversion
[0068] Soluble lignin streams derived from biorefinery or Kraft processes might be used directly in microbial conversions without additional purification, or they might be further purified by one or more separation or fractionation techniques prior to microbial conversions.
[0069] In some embodiments, membrane filtration might be applied to achieve a starting concentration of lignin monomers and oligomers in the 1-60% (w/v) concentration range, and molecular weights ranging from about 180 Daltons to about 2000 Daltons, from about 200 Daltons to about 4000 Daltons, from about 250 Daltons to about 2500 Daltons, from about 180 Daltons to about 3500 Daltons, from about 300 Daltons to about 3000 Daltons, or any range therein.
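The concentration and molecular weight targets above might be handled as in the following minimal sketch. It assumes the usual meaning of % (w/v) as grams per 100 ml, and it treats the "about" bounds of each recited window as hard cutoffs, which is a simplification made only for this illustration. The function and constant names are hypothetical.

```python
# Molecular weight windows recited above, in Daltons (low, high).
MW_WINDOWS_DA = [
    (180, 2000),
    (200, 4000),
    (250, 2500),
    (180, 3500),
    (300, 3000),
]

def percent_wv_to_g_per_l(percent_wv):
    """Convert % (w/v), i.e. g per 100 ml, to g/L."""
    return percent_wv * 10.0

def windows_containing(mw_da):
    """Return the recited windows that contain a given molecular weight."""
    return [(low, high) for low, high in MW_WINDOWS_DA if low <= mw_da <= high]

# The 1-60% (w/v) range corresponds to 10-600 g/L.
low_g_per_l = percent_wv_to_g_per_l(1)
high_g_per_l = percent_wv_to_g_per_l(60)
```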
[0070] In some embodiments, soluble lignin streams might be partially purified by chromatography using, for example, HP-20 resin. The lignin monomers and oligomers can bind to the resin while highly polar impurities or inorganics that might be toxic to microorganisms can remain un-bound. Subsequent elution, for example, with a methanol-water solvent system, can provide fractions of higher purity that are enriched in lignin monomers and oligomers.
Chemical products
[0071] A purpose of the present teaching includes the discovery of novel biochemical conversions that create valuable commercial products from various lignin core structures.
Such commercial products include monomeric aromatic chemicals that can serve as building block chemicals. One of skill will appreciate that a vast number of aromatic chemicals can be produced using the principles provided by the teachings set-forth herein, and that a comprehensive teaching of every possible chemical that can be produced would be beyond the scope and purpose of this teaching.
[0072] FIGs. 2A and 2B illustrate (i) the structures of some building block chemicals that can be produced using bioconversions, and (ii) an example enzyme system from a Sphingomonas paucimobilis gene cluster, according to some embodiments. FIG. 2A shows that examples of some monomeric aromatic structures that can serve as building block chemicals derived from lignin include, but are not limited to, guaiacol, β-hydroxypropiovanillone, 4-hydroxy-3-methoxymandelic acid, coniferaldehyde, ferulic acid, eugenol, propylguaiacol, and 4-acetylguaiacol. It should be appreciated that each of these structures can be produced using the teachings provided herein. FIG. 2B(i) shows the organization of the LigDFEG gene cluster in a Sphingomonas paucimobilis strain. FIG. 2B(ii) shows deduced functions of the gene products believed to be involved in a β-aryl ether bond cleavage in a model lignin structure, guaiacylglycerol-β-guaiacyl ether (GGE). The vertical bars above the restriction map indicate the positions of the gene insertions of LigD, LigF, LigE, and LigG. LigD showed Cα-dehydrogenase activity, LigF and LigE showed β-etherase activity, and LigG showed glutathione lyase activity. FIG. 2 LEGEND (Abbreviations): restriction enzymes Ap (ApaI), Bs (BstXI), E (EcoRI), Ec (Eco47III), Ml (MluI), P (PstI), RV (EcoRV), S (SalI), Sc (SacI), ScII (SacII), St (StuI), Sm (SmaI), Tt (Tth111I), and X (XhoI); chemicals GGE (guaiacylglycerol-β-guaiacyl ether), GSH (glutathione), GSSG (glutathione disulfide), and asterisks are asymmetric carbons.
[0073] Commercial products that can be obtained from a bioconversion of lignin-derived compounds, as taught herein, include mono-aromatic chemicals. Examples of such chemicals include, but are not limited to, caprolactam, cumene, styrene, mononitro- and dinitrotoluenes and their derivatives, 2,4-diaminotoluene, 2,4-dinitrotoluene, terephthalic acid, catechol, vanillin, salicylic acid, aminosalicylic acid, cresol and isomers, alkylphenols, chlorinated phenols, nitrophenols, polyhydric phenols, nitrobenzene, aniline and secondary and tertiary aniline bases, benzothiazole and derivatives, alkylbenzene and alkylbenzene sulfonates, 4,4-diphenylmethane diisocyanate (MDI), chlorobenzenes and dichlorobenzenes, nitrochlorobenzenes, sulfonic acid derivatives of toluene, pseudocumene, mesitylene, nitrocumene, cumenesulfonic acid.
Enzyme discovery
[0074] The teachings herein are also directed to the discovery of novel enzymes. In some embodiments, the enzymes are beta-etherase enzymes.
[0075] Lignin is the only plant biomass constituent based on aromatic core structures, and is comprised of branched phenylpropenyl (C9) units. The guaiacol and syringol building blocks of lignin are linked through carbon-carbon (C-C) and carbon-oxygen (C-O, ether) bonds. The native structure of lignin suggests its key application as a chemical feedstock for aromatic chemicals. The production of such chemical structures necessitates depolymerization and rupture of C-C and C-O bonds. An abundant chemical linkage in lignin is the beta-aryl ether linkage, which comprises 50% to 70% of the bond type in lignin.
The efficient scission of the beta-aryl ether bond would generate the monomeric building blocks of lignin, and provide the chemical feedstock for subsequent conversion to a range of industrial products.
[0076] The beta-etherase enzyme system has multiple advantages for conversions of lignin oligomers to monomers over the laccase enzyme systems. The beta-etherase enzyme system would achieve highly selective reductive bond scission catalysis for efficient and high yield conversions of lignin oligomers to monomers without the formation of side products, degradation of the aromatic core structures of lignin, or the use of electron transfer mediators required with use of the oxidative and radical chemistry-based laccase enzyme systems.
[0077] FIG. 3 is an example of a beta-etherase catalyzed hydrolysis of a model lignin dimer, α-O-(β-methylumbelliferyl) acetovanillone (MUAV), according to some embodiments.
The scission of the beta-aryl ether bond in model compounds of lignin by beta-etherases from the microbe Sphingomonas paucimobilis has been described. However, the available information is limited, and there is no precedent in the literature for the use of S. paucimobilis as an industrial microbe for commercial scale processes. The discovery of new beta-etherase enzymes, and the heterologous expression of these new enzymes in Azotobacter strains, will provide the art with valuable industrial strains that are particularly well suited for lignin conversion processes.
[0078] One of skill will recognize the chemical nomenclature used herein as standard to the art.
For example, the amino acids used herein can be identified by at least the following conventional three-letter abbreviations in Table 2:
[0079] Table 2.
Alanine         A   Ala        Leucine         L   Leu
Arginine        R   Arg        Lysine          K   Lys
Asparagine      N   Asn        Methionine      M   Met
Aspartic acid   D   Asp        Phenylalanine   F   Phe
Cysteine        C   Cys        Proline         P   Pro
Glutamic acid   E   Glu        Serine          S   Ser
Glutamine       Q   Gln        Threonine       T   Thr
Glycine         G   Gly        Tryptophan      W   Trp
Histidine       H   His        Tyrosine        Y   Tyr
Isoleucine      I   Ile        Valine          V   Val
Ornithine       O   Orn        Other               Xaa
[0080] The single letter identifier is provided for ease of reference, but any format can be used.
The three-letter abbreviations are generally accepted in the peptide art, recommended by the IUPAC-IUB commission in biochemical nomenclature, and are provided to comply with WIPO Standard ST.25. Furthermore, the peptide sequences are taught according to the generally accepted convention of placing the N-terminus on the left and the C-terminus on the right of the sequence listing to again comply with WIPO Standard ST.25.
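For ease of reference, the mapping in Table 2 might be applied programmatically as in the following minimal sketch, which converts a sequence written in three-letter abbreviations, N-terminus on the left, to single-letter identifiers. The function name and the example sequences are hypothetical. Table 2 gives no single-letter identifier for "Other" (Xaa); the use of "X" for it here is an assumption based on common practice.

```python
import re

# Three-letter to single-letter identifiers, as listed in Table 2.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Orn": "O", "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F",
    "Pro": "P", "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y",
    "Val": "V",
}

def to_one_letter(seq3):
    """Convert a space- or hyphen-separated three-letter sequence to
    single-letter identifiers. Unrecognized tokens, including Xaa, are
    written as 'X' (an assumption, not stated in Table 2)."""
    tokens = [t for t in re.split(r"[\s-]+", seq3.strip()) if t]
    return "".join(THREE_TO_ONE.get(t.capitalize(), "X") for t in tokens)

example = to_one_letter("Met Ala Orn")   # "MAO"
```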
The Recombinant Polypeptides
[0081] The teachings herein are based on discovery of novel and non-obvious proteins, DNAs, and host cell systems that can function in the conversion of lignin-derived compounds into valuable aromatic compounds. The systems can include natural, wild-type components or recombinant components, the recombinant components being isolatable from what occurs in nature.
[0082] The term "isolated" means altered "by the hand of man" from its natural state; i.e., if it occurs in nature, it has been changed or removed from its original environment, or both. For example, a naturally occurring polynucleotide or a polypeptide naturally present in a living animal in its natural state is not "isolated," but the same polynucleotide or polypeptide separated from the coexisting materials of its natural state is "isolated", as the term is used herein. For example, with respect to polynucleotides, the term isolated means that it is separated from the chromosome and cell in which it naturally occurs. However, a nucleic acid molecule contained in a clone that is a member of a mixed clone library (e.g., a genomic or cDNA library) and that has not been isolated from other clones of the library (e.g., in the form of a homogeneous solution containing the clone without other members of the library) or a chromosome isolated or removed from a cell or a cell lysate (e.g., a "chromosome spread", as in a karyotype), is not "isolated" for the purposes of the teachings herein. Moreover, a lone nucleic acid molecule contained in a preparation of mechanically or enzymatically cleaved genomic DNA, where the isolation of the nucleic acid molecule was not the goal, is also not "isolated" for the purposes of the teachings herein. As part of, or following, an intentional isolation, polynucleotides can be joined to other polynucleotides, for mutagenesis, to form fusion proteins, and for propagation or expression in a host, for instance. Isolated polynucleotides, alone or joined to other polynucleotides such as vectors, can be introduced into host cells, in culture or in whole organisms, after which such DNAs still would be isolated, as the term is used herein, because they would not be in their naturally occurring form or environment.
Similarly, the isolated polynucleotides and polypeptides may occur in a composition, such as a media formulation, solutions for introduction of polynucleotides or polypeptides, for example, into cells, compositions or solutions for chemical or enzymatic reactions, for instance, which are not naturally occurring compositions, and, therein remain "isolated" polynucleotides or polypeptides within the meaning of that term as it is used herein.
[0083] A "vector," such as an expression vector, is used to transfer or transmit the DNA of interest into a prokaryotic or eukaryotic host cell, such as a bacterium, yeast, or a higher eukaryotic cell. Vectors can be recombinantly designed to contain a polynucleotide encoding a desired polypeptide. These vectors can include a tag, a cleavage site, or a combination of these elements to facilitate, for example, the process of producing, isolating, and purifying a polypeptide. The DNA of interest can be inserted as the expression component of a vector. Examples of vectors include plasmids, cosmids, viruses, and bacteriophages. If the vector is a virus or bacteriophage, the term vector can include the viral/bacteriophage coat. The term "expression vector" is usually used to describe a DNA construct containing a gene encoding an expression product of interest, usually a protein, that is expressed by the machinery of the host cell. This type of vector is frequently a plasmid, but the other forms of expression vectors, such as bacteriophage vectors and viral vectors (e.g., adenoviruses, replication defective retroviruses, and adeno-associated viruses), can be used.
[0084] In some embodiments, the polypeptides taught herein can be natural or wildtype, isolated and/or recombinant. In some embodiments, the polynucleotides can be natural or wildtype, isolated and/or recombinant. In some embodiments, the teachings are directed to a vector that can include such a polynucleotide or a host cell transformed by such a vector.
[0085] In some embodiments, the polypeptide can be an isolated recombinant polypeptide, comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101. The sequence can conserve residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, G54, K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195.
[0086] In some embodiments, the polypeptide can be an isolated recombinant polypeptide, comprising SEQ ID NO:101; or conservative substitutions thereof outside of the conserved residues. The conserved residues can include T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, G54; K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195.
[0087] In some embodiments, the polypeptide can be an isolated recombinant glutathione S-transferase enzyme, comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101. The amino acid sequence can conserve residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, G54; K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
[0088] In some embodiments, the polypeptide can be an isolated recombinant glutathione S-transferase enzyme, comprising an amino acid sequence having at least 95%
identity to SEQ ID NO:101; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
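The two sequence criteria recited above, percent identity to a reference and retention of a list of conserved residues, might be checked as in the following minimal sketch. It is an illustration under stated assumptions: the sequences are assumed to be pre-aligned and of equal length with the same residue numbering, identity is computed as matches divided by aligned length, and the toy reference, candidates, and residue labels are hypothetical placeholders rather than SEQ ID NO:101 or its conserved residues.

```python
import re

def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences.
    Gap characters ('-') never count as matches."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)

def parse_residue(label):
    """Split a label such as 'T19' into ('T', 19)."""
    m = re.fullmatch(r"([A-Z])(\d+)", label)
    if m is None:
        raise ValueError(f"unrecognized residue label: {label}")
    return m.group(1), int(m.group(2))

def missing_conserved(seq, labels):
    """Return the labels whose residue is not present at the stated
    1-based position of seq (same numbering as the reference assumed)."""
    missing = []
    for label in labels:
        aa, pos = parse_residue(label)
        if pos > len(seq) or seq[pos - 1] != aa:
            missing.append(label)
    return missing

# Toy data, hypothetical and for illustration only.
reference = "MKTISPGVWA"
conserved = ["T3", "I4", "S5", "P6"]
candidate = "MRTISPGVWA"
identity = percent_identity(reference, candidate)   # 90.0
lost = missing_conserved(candidate, conserved)      # []
```

In practice the alignment itself would come from a sequence alignment tool, and the identity threshold of at least 95% would be applied to the result.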
[0089] In some embodiments, the polypeptide can be an isolated recombinant polypeptide, comprising (i) a length ranging from about 279 to about 281 amino acids; (ii) a first amino acid region consisting of residues 19-54 from SEQ ID NO:101, or conservative substitutions thereof outside of conserved residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, and G54; wherein, the first amino acid region can be located in the recombinant polypeptide from about residue 14 to about residue 59; and, (iii) a second amino acid region consisting of residues 98-221 from SEQ ID NO:101, or conservative substitutions thereof outside of conserved residues K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195; wherein, the second amino acid region is located in the recombinant polypeptide from about residue 93 to about residue 226.
[0090] In some embodiments, the polypeptide can be an isolated recombinant glutathione S-transferase enzyme, comprising (i) a length ranging from about 279 to about 281 amino acids; (ii) a first amino acid region having at least 95% identity to residues 19-54 from SEQ ID NO:101 while conserving residues T19, I20, S21, P22, V24, W25, T27, K28, Y29, A30, H33, K34, G35, F36, D39, I40, V41, P42, G43, G44, F45, G47, I48, E50, R51, T52, G53, and G54; wherein, the first amino acid region is located in the recombinant polypeptide from about residue 14 to about residue 59; and, (iii) a second amino acid region having at least 95% identity to residues 98-221 from SEQ ID NO:101 while conserving residues K100, A101, N104, V111, G112, M115, F116, P166, W107, Y184, Y187, R188, G191, G192, and F195; wherein, the second amino acid region can be located in the recombinant polypeptide from about residue 93 to about residue 226; and, the recombinant glutathione S-transferase enzyme can function to cleave a beta-aryl ether.
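The length and region-location criteria recited above might be checked as in the following minimal sketch. It is a simplification: the region is located by exact substring match only, so conservative substitutions and the "at least 95% identity" alternative are not handled, and the "about" bounds are treated as hard limits. The function names and the toy sequence are hypothetical and are not taken from SEQ ID NO:101.

```python
def length_ok(seq, low, high):
    """True when the polypeptide length lies within [low, high]."""
    return low <= len(seq) <= high

def locate_region(seq, region):
    """Return the 1-based (start, end) of the first exact occurrence of
    region in seq, or None when the region is absent."""
    i = seq.find(region)
    if i < 0:
        return None
    return (i + 1, i + len(region))

def region_within(seq, region, window_start, window_end):
    """True when the region occurs and lies entirely inside the
    1-based residue window [window_start, window_end]."""
    loc = locate_region(seq, region)
    return loc is not None and window_start <= loc[0] and loc[1] <= window_end

# Toy data, hypothetical and for illustration only.
toy_seq = "AAAA" + "TISPGVW" + "CCCC"
toy_region = "TISPGVW"
location = locate_region(toy_seq, toy_region)           # (5, 11)
inside = region_within(toy_seq, toy_region, 3, 13)      # True
```

For the embodiments above, the same checks would be run with a length window of 279 to 281 and residue windows of 14 to 59 and 93 to 226.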
[0091] In some embodiments, the polypeptide can be an isolated recombinant glutathione S-transferase enzyme, comprising an amino acid sequence having at least 95%
identity to SEQ ID NO:541; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
[0092] In some embodiments, the polypeptide can be an isolated recombinant polypeptide, comprising (i) a length ranging from about 256 to about 260 amino acids; (ii) a first amino acid region consisting of residues 47-57 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues A47, I48, N49, P50, G52, V54, P55, V56, L57; wherein, the first amino acid region is located in the recombinant polypeptide from about residue 45 to about residue 57; (iii) a second amino acid region consisting of 63-76 from SEQ ID NO:541; and, (iv) a third amino acid region consisting of residues 99-230 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues R100, Y101, K104, D107, M111, N112, S115, M116, K176, L194, I197, N198, S201, H202, and M206; wherein, the second amino acid region is located in the recombinant polypeptide from about residue 94 to about residue 235.
[0093] In some embodiments, the polypeptide can be an isolated recombinant glutathione S-transferase enzyme, comprising (i) a length ranging from about 279 to about 281 amino acids; (ii) a first amino acid region having at least 95% identity to 47-57 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues A47, I48, N49, P50, G52, V54, P55, V56, L57; wherein, the first amino acid region can be located in the recombinant polypeptide from about residue 45 to about residue 57; (iii) a second amino acid region consisting of 63-76 from SEQ ID NO:541; and, (iv) a third amino acid region having at least 95% identity to residues 99-230 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues R100, Y101, K104, D107, M111, N112, S115, M116, K176, L194, I197, N198, S201, H202, and M206; wherein, the second amino acid region can be located in the recombinant polypeptide from about residue 94 to about residue 235; wherein, the recombinant glutathione S-transferase enzyme functions to cleave a beta-aryl ether.
[0094] In some embodiments, an amino acid substitution outside of the conserved residues can be a conservative substitution. And, in many embodiments, the amino acid sequence can function to cleave a beta-aryl ether.
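By way of non-limiting illustration, the conserved-residue constraint recited in the preceding embodiments could be checked computationally along the following lines. This Python sketch parses residue labels such as T19 and reports any position at which a candidate sequence does not carry the stated amino acid. The function names and the toy sequence are hypothetical; the sequences of SEQ ID NO:101 and SEQ ID NO:541 are not reproduced here, and the sketch assumes the candidate already follows the reference numbering.

```python
import re

def parse_residue(label):
    """Split a label such as 'T19' into (amino_acid, 1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))

def violated_residues(sequence, labels):
    """Return the labels whose amino acid is absent at the stated position."""
    violations = []
    for label in labels:
        amino_acid, position = parse_residue(label)
        if position > len(sequence) or sequence[position - 1] != amino_acid:
            violations.append(label)
    return violations

# Toy data only (hypothetical, not taken from any SEQ ID NO).
toy_sequence = "MKTISPAVW"
toy_conserved = ["T3", "I4", "S5", "P6", "V8", "W9"]

print(violated_residues(toy_sequence, toy_conserved))   # all conserved residues kept
print(violated_residues("MKTLSPAVW", toy_conserved))    # I4 replaced by L
```

A substitution at a position outside the conserved list (position 7 in the toy data) would pass this check and could then be assessed separately for whether it is conservative.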
Methods of Preparing the Recombinant SDF-1 Polynucleotide and Polypeptides
[0095] The teachings include a method of preparing the polypeptides described herein, comprising culturing a host cell under conditions suitable to produce the desired polypeptide; and recovering the polypeptide from the host cell culture; wherein, the host cell comprises an exogenously-derived polynucleotide encoding the desired polypeptide. In some embodiments, the host cell is E. coli. In some embodiments, the host cell can be an Azotobacter strain such as, for example, Azotobacter vinelandii.
[0096] Initially, a double-stranded DNA fragment encoding the primary amino acid sequence of the recombinant polypeptide can be designed. This DNA fragment can be manipulated to facilitate synthesis, cloning, expression or biochemical manipulation of the expression products. The synthetic gene can be ligated to a suitable cloning vector, and the nucleotide sequence of the cloned gene can then be determined and confirmed. The gene can then be amplified using designed primers having specific restriction enzyme sequences introduced at both sides of the inserted gene, and the gene can be subcloned into a suitable subclone/expression vector. The expression vector bearing the synthetic gene for the mutant can be inserted into a suitable expression host. Thereafter the expression host can be maintained under conditions suitable for production of the gene product and, in some embodiments, the protein can be (i) isolated and purified from the cells expressing the gene or (ii) used directly in a reaction environment that includes the host cell.
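By way of non-limiting illustration, the amplification step using primers that introduce restriction enzyme sequences at both sides of the insert can be sketched as follows. The Python example builds a forward and a reverse primer that each carry a restriction site upstream of a gene-specific annealing region. The choice of BamHI (GGATCC) and EcoRI (GAATTC), the two-base GC clamp, the 12-nucleotide annealing length, and the toy gene are all assumptions made for the example and are not taken from the teachings herein. The sketch also assumes palindromic recognition sites.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def flanking_primers(gene, site_5prime, site_3prime, anneal_len=12, clamp="GC"):
    """Build primers that add one restriction site to each end of a gene.

    Assumes palindromic recognition sites, so each site reads the same
    on either strand.
    """
    for site in (site_5prime, site_3prime):
        if site != reverse_complement(site):
            raise ValueError(f"site {site} is not palindromic")
    if len(gene) < anneal_len:
        raise ValueError("gene is shorter than the annealing length")
    forward = clamp + site_5prime + gene[:anneal_len]
    reverse = clamp + site_3prime + reverse_complement(gene[-anneal_len:])
    return forward, reverse

BAMHI = "GGATCC"  # assumed 5' cloning site
ECORI = "GAATTC"  # assumed 3' cloning site
toy_gene = "ATGAAAACCATTAGCCCGGCGGTGTGGTAA"  # hypothetical 30-nt insert

forward, reverse = flanking_primers(toy_gene, BAMHI, ECORI)
print(forward)
print(reverse)
```

In practice the sites would be chosen so that neither occurs inside the gene and both are present in the multiple cloning site of the selected vector.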
[0097] The nucleic acid (e.g., cDNA or genomic DNA) may be inserted into a replicable vector for cloning (amplification of the DNA) or for expression. Various vectors are publicly available. In general, DNA can be inserted into an appropriate restriction endonuclease site(s) using techniques known in the art. Vector components generally include, but are not limited to, one or more of a signal sequence, an origin of replication, one or more marker genes, an enhancer element, a promoter, and a transcription termination sequence.
[0098] The signal sequence may be a prokaryotic signal sequence selected, for example, from the group of the alkaline phosphatase, penicillinase, lpp, or heat-stable enterotoxin II leaders. For yeast secretion the signal sequence may be, e.g., the yeast invertase leader, alpha factor leader (including Saccharomyces and Kluyveromyces alpha-factor leaders, the latter described in U.S. Pat. No. 5,010,182), or acid phosphatase leader, the C. albicans glucoamylase leader (EP 362,179), or the signal described in WO 90/13646, for example. In mammalian cell expression, mammalian signal sequences may be used to direct secretion of the protein, such as signal sequences from secreted polypeptides of the same or related species, as well as viral secretory leaders.
[0099] Both expression and cloning vectors contain a nucleic acid sequence that enables the vector to replicate in one or more selected host cells. Such sequences are well known for a variety of bacteria, yeast, and viruses. The origin of replication from the plasmid pBR322, for example, is suitable for most Gram-negative bacteria, the 2μ plasmid origin is suitable for yeast, and various viral origins (SV40, polyoma, adenovirus, VSV or BPV) are useful for cloning vectors in mammalian cells.
[00100] Expression and cloning vectors will typically contain a selection gene, also termed a selectable marker. Typical selection genes encode proteins that (a) confer resistance to antibiotics or other toxins, e.g., ampicillin, neomycin, methotrexate, or tetracycline, (b) complement auxotrophic deficiencies, or (c) supply critical nutrients not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli.
[00101] Examples of suitable selectable markers for mammalian cells are those that enable the identification of cells competent to take up the encoding nucleic acid, such as DHFR or thymidine kinase. An appropriate host cell when wild-type DHFR is employed is the CHO cell line deficient in DHFR activity, prepared and propagated as described by Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216 (1980). A suitable selection gene for use in yeast is the trp1 gene present in the yeast plasmid YRp7 (Stinchcomb et al., Nature, 282:39 (1979); Kingsman et al., Gene, 7:141 (1979); Tschemper et al., Gene, 10:157 (1980)). The trp1 gene provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example, ATCC No. 44076 or PEP4-1 (Jones, Genetics, 85:12 (1977)).
[00102] Expression and cloning vectors usually contain a promoter operably linked to the encoding nucleic acid sequence to direct mRNA synthesis. Promoters recognized by a variety of potential host cells are well known. Promoters suitable for use with prokaryotic hosts include the beta-lactamase and lactose promoter systems (Chang et al., Nature, 275:615 (1978); Goeddel et al., Nature, 281:544 (1979)), alkaline phosphatase, a tryptophan (trp) promoter system (Goeddel, Nucleic Acids Res., 8:4057 (1980); EP 36,776), and hybrid promoters such as the tac promoter (deBoer et al., Proc. Natl. Acad. Sci. USA, 80:21-25 (1983)). Promoters for use in bacterial systems also will contain a Shine-Dalgarno sequence operably linked to the encoding DNA.
[00103] Other yeast promoters, which are inducible promoters having the additional advantage of transcription controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, metallothionein, glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization. Suitable vectors and promoters for use in yeast expression are known in the art; see, e.g., EP 73,657 for a further discussion.
[00104] Transcription from vectors in mammalian host cells is controlled, for example, by promoters obtained from the genomes of viruses such as polyoma virus, fowlpox virus (UK 2,211,504), adenovirus (such as Adenovirus 2), bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a retrovirus, hepatitis-B virus and Simian Virus 40 (SV40), from heterologous mammalian promoters, e.g., the actin promoter or an immunoglobulin promoter, and from heat-shock promoters, provided such promoters are compatible with the host cell systems.
[00105] Transcription of the encoding DNA by higher eukaryotes may be increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Many enhancer sequences are now known from mammalian genes (globin, elastase, albumin, alpha-fetoprotein, and insulin). Typically, however, one will use an enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the replication origin, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers. The enhancer may be spliced into the vector at a position 5' or 3' to the coding sequence but is preferably located at a site 5' from the promoter.
[00106] Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human, or nucleated cells from other multicellular organisms) will also contain sequences necessary for the termination of transcription and for stabilizing the mRNA.
Such sequences are commonly available from the 5' and, occasionally 3', untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA encoding the mutants.
[00107] In some embodiments, the expression control sequence can be selected from a group consisting of a lac system, T7 expression system, major operator and promoter regions of pBR322 origin, and other prokaryotic control regions. Still other methods, vectors, and host cells suitable for adaptation to the synthesis of the mutants in recombinant vertebrate cell culture are described in Gething et al., Nature, 293:620-625 (1981); Mantei et al., Nature, 281:40-46 (1979); EP 117,060; and EP 117,058.
[00108] Mutants can be expressed as a fusion protein. In some embodiments, the methods involve adding a number of amino acids to the protein, and in some embodiments, to the amino terminus of the protein. Extra amino acids can serve as affinity tags or cleavage sites, for example. Fusion proteins can be designed to: (1) assist in purification by acting as a temporary ligand for affinity purification, (2) produce a precise recombinant by removing extra amino acids using a cleavage site between the target gene and affinity tag, (3) increase the solubility of the product, and/or (4) increase expression of the product. A proteolytic cleavage site can be included at the junction of the fusion region and the protein of interest to enable further purification of the product, that is, separation of the recombinant protein from the fusion protein following affinity purification of the fusion protein. Such enzymes, and their cognate recognition sequences, can include Factor Xa, thrombin and enterokinase, cyanogen bromide, trypsin, or chymotrypsin, for example. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D. B. and Johnson, K. S. Gene 67:31-40 (1988)), pMAL (New England Biolabs, Beverly, Mass.), pRIT5 (Pharmacia, Piscataway, N.J.), and pET (Stratagene), which can fuse glutathione S-transferase (GST), maltose E binding protein, protein A, or a six-histidine sequence, respectively, to a target recombinant protein.
[00109] Synthetic DNAs containing the sequences of nucleotides, tags and cleavage sites can be designed and provided as a modified coding for recombinant polypeptide mutants. In some embodiments, a polypeptide can be a fusion polypeptide having an affinity tag, and the recovering step includes (1) capturing and purifying the fusion polypeptide, and (2) removing the affinity tag for high yield production of the desired polypeptide or an amino acid sequence that is at least 95% homologous to a desired polypeptide. DNA encoding the mutants may be obtained from a cDNA library prepared from tissue possessing the mRNA for the mutants. As such, the DNA can be conveniently obtained from a cDNA library. The encoding gene for the mutants may also be obtained from a genomic library or by known synthetic procedures (e.g., automated nucleic acid synthesis).
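By way of non-limiting illustration, the design of a fusion polypeptide having an amino-terminal affinity tag and a cleavage site, followed by removal of the tag, can be sketched at the sequence level as follows. The Python example uses a six-histidine tag and the Factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR), after which Factor Xa cleaves. The function names and the toy target sequence are hypothetical, and the model ignores secondary cleavage sites and incomplete digestion.

```python
HIS6_TAG = "HHHHHH"
FACTOR_XA_SITE = "IEGR"  # Factor Xa cleaves after the Arg of Ile-Glu-Gly-Arg

def build_fusion(target, tag=HIS6_TAG, site=FACTOR_XA_SITE):
    """Assemble Met + affinity tag + cleavage site + target protein."""
    return "M" + tag + site + target

def cleave(fusion, site=FACTOR_XA_SITE):
    """Split the fusion immediately after the first cleavage site."""
    index = fusion.find(site)
    if index < 0:
        raise ValueError("cleavage site not found in fusion protein")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]

toy_target = "TISPAVW"  # hypothetical target polypeptide
fusion = build_fusion(toy_target)
tag_fragment, released = cleave(fusion)

print(fusion)
print(tag_fragment, released)
```

Placing the cleavage site directly before the first residue of the target is what allows the released product to carry no extra amino acids from the tag.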
[00110] Libraries can be screened with probes designed to identify the gene of interest or the protein encoded by it. Screening the cDNA or genomic library with the selected probe may be conducted using standard hybridization procedures, such as described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), which is herein incorporated by reference. An alternative means to isolate the gene encoding recombinant polypeptide mutants is to use PCR methodology [Sambrook et al., supra; Dieffenbach et al., PCR Primer: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 1995)].

[00111] Nucleic acids having a desired protein coding sequence may be obtained by screening selected cDNA or genomic libraries using a deduced amino acid sequence and, if necessary, a conventional primer extension procedure as described in Sambrook et al., supra, to detect precursors and processing intermediates of mRNA that may not have been reverse-transcribed into cDNA.
[00112] The selection of expression vectors, control sequences, transformation methods, and the like, is dependent on the type of host cell used to express the gene. Following entry into a cell, all or part of the vector DNA, including the insert DNA, may be incorporated into the host cell chromosome, or the vector may be maintained extrachromosomally. Those vectors that are maintained extrachromosomally are frequently capable of autonomous replication in the host cell. Other vectors are integrated into the genome of a host cell upon introduction into the host cell and are replicated along with the host genome.
[00113] Host cells are transfected or transformed with the expression or cloning vectors described herein to produce the mutants. The cells are cultured in conventional nutrient media modified as appropriate for inducing promoters, selecting transformants, or amplifying the genes encoding the desired sequences. The culture conditions, such as media, temperature, pH and the like, can be selected by the skilled artisan without undue experimentation. In general, principles, protocols, and practical techniques for maximizing the productivity of cell cultures can be found in Mammalian Cell Biotechnology: a Practical Approach, M. Butler, ed. (IRL Press, 1991) and Sambrook et al., supra, each of which is incorporated by reference.
[00114] The host cells can be prokaryotic or eukaryotic and, suitable host cells for cloning or expressing the DNA in the vectors herein can include prokaryote, yeast, or higher eukaryote cells. Methods of eukaryotic cell transfection and prokaryotic cell transformation are known to the ordinarily skilled artisan, for example, CaCl2, CaPO4, liposome-mediated and electroporation. Depending on the host cell used, transformation is performed using standard techniques appropriate to such cells. The calcium treatment employing calcium chloride, as described in Sambrook et al., supra, or electroporation is generally used for prokaryotes. Infection with Agrobacterium tumefaciens is used for transformation of certain plant cells, as described by Shaw et al., Gene, 23:315 (1983) and WO 89/05859 published 29 Jun. 1989. For mammalian cells without such cell walls, the calcium phosphate precipitation method of Graham and van der Eb, Virology, 52:456-457 (1978) can be employed. General aspects of mammalian cell host system transfections have been described in U.S. Pat. No. 4,399,216. Transformations into yeast are typically carried out according to the method of Van Solingen et al., J. Bact., 130:946 (1977) and Hsiao et al., Proc. Natl. Acad. Sci. (USA), 76:3829 (1979). However, other methods for introducing DNA into cells, such as by nuclear microinjection, electroporation, bacterial protoplast fusion with intact cells, or polycations, e.g., polybrene, polyornithine, may also be used. For various techniques for transforming mammalian cells, see Keown et al., Methods in Enzymology, 185:527-537 (1990) and Mansour et al., Nature, 336:348-352 (1988).
[00115] Suitable host cells for cloning or expressing the DNA in the vectors herein include prokaryote, yeast, or higher eukaryote cells. Suitable prokaryotes include, but are not limited to, eubacteria, such as Gram-negative or Gram-positive organisms, for example, Enterobacteriaceae such as E. coli. Various E. coli strains are publicly available, such as E. coli K12 strain MM294 (ATCC 31,446); E. coli X1776 (ATCC 31,537); E. coli strain W3110 (ATCC 27,325) and K5 772 (ATCC 53,635). Other suitable prokaryotic host cells include Enterobacteriaceae such as Escherichia, e.g., E. coli, Enterobacter, Erwinia, Klebsiella, Proteus, Salmonella, e.g., Salmonella typhimurium, Serratia, e.g., Serratia marcescens, and Shigella, as well as Bacilli such as B. subtilis and B. licheniformis (e.g., B. licheniformis 41P disclosed in DD 266,710 published 12 Apr. 1989), Pseudomonas such as P. aeruginosa, and Streptomyces. These examples are illustrative rather than limiting, and merely supplement the remainder of the teachings herein. Strain W3110 is one particularly preferred host or parent host because it is a common host strain for recombinant DNA product fermentations. Preferably, the host cell secretes minimal amounts of proteolytic enzymes. For example, strain W3110 may be modified to effect a genetic mutation in the genes encoding proteins endogenous to the host, with examples of such hosts including E. coli W3110 strain 1A2, which has the complete genotype tonA; E. coli W3110 strain 9E4, which has the complete genotype tonA ptr3; E. coli W3110 strain 27C7 (ATCC 55,244), which has the complete genotype tonA ptr3 phoA E15 (argF-lac)169 degP ompT kanr; E. coli W3110 strain 37D6, which has the complete genotype tonA ptr3 phoA E15 (argF-lac)169 degP ompT rbs7 ilvC kanr; E. coli W3110 strain 40B4, which is 37D6 with a non-kanamycin resistant degP deletion mutation; and an E. coli strain having mutant periplasmic protease as disclosed in U.S. Pat. No. 4,946,783. Alternatively, in vitro methods of cloning, e.g., PCR or other nucleic acid polymerase reactions, are suitable.
[00116] In addition to prokaryotes, eukaryotic microbes such as filamentous fungi or yeast are suitable cloning or expression hosts for the mutants. Saccharomyces cerevisiae is a commonly used lower eukaryotic host microorganism. Others include Schizosaccharomyces pombe (Beach and Nurse, Nature, 290:140 (1981); EP 139,383 published 2 May 1985); Kluyveromyces hosts (U.S. Pat. No. 4,943,529; Fleer et al., Bio/Technology, 9:968-975 (1991)) such as, e.g., K. lactis (MW98-8C, CBS683, CBS4574; Louvencourt et al., J. Bacteriol., 154(2):737-742 (1983)), K. fragilis (ATCC 12,424), K. bulgaricus (ATCC 16,045), K. wickeramii (ATCC 24,178), K. waltii (ATCC 56,500), K. drosophilarum (ATCC 36,906; Van den Berg et al., Bio/Technology, 8:135 (1990)), K. thermotolerans, and K. marxianus; Yarrowia (EP 402,226); Pichia pastoris (EP 183,070; Sreekrishna et al., J. Basic Microbiol., 28:265-278 (1988)); Candida; Trichoderma reesei (EP 244,234); Neurospora crassa (Case et al., Proc. Natl. Acad. Sci. USA, 76:5259 (1979)); Schwanniomyces such as Schwanniomyces occidentalis (EP 394,538); and filamentous fungi such as, e.g., Neurospora, Penicillium, Tolypocladium (WO 91/00357), and Aspergillus hosts such as A. nidulans (Ballance et al., Biochem. Biophys. Res. Commun., 112:284-289 (1983); Tilburn et al., Gene, 26:205-221 (1983); Yelton et al., Proc. Natl. Acad. Sci. USA, 81:1470-1474 (1984)) and A. niger (Kelly and Hynes, EMBO J., 4:475-479 (1985)). Methylotrophic yeasts are suitable herein and include, but are not limited to, yeast capable of growth on methanol selected from the genera consisting of Hansenula, Candida, Kloeckera, Pichia, Saccharomyces, Torulopsis, and Rhodotorula. A list of specific species that are exemplary of this class of yeasts may be found in C. Anthony, The Biochemistry of Methylotrophs, 269 (1982).
[00117] Suitable host cells for the expression of glycosylated mutants can be derived from multicellular organisms. Invertebrate cells include insect cells such as Drosophila S2 and Spodoptera Sf9, as well as plant cells. Useful mammalian host cell lines include Chinese hamster ovary (CHO) and COS cells. More specific examples include monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture, Graham et al., J. Gen Virol., 36:59 (1977)); Chinese hamster ovary cells/-DHFR (CHO, Urlaub and Chasin, Proc. Natl. Acad. Sci. USA, 77:4216 (1980)); mouse sertoli cells (TM4, Mather, Biol. Reprod., 23:243-251 (1980)); human lung cells (WI38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); and mouse mammary tumor (MMT 060562, ATCC CCL51). One of skill can readily choose the appropriate host cell, at least for extracellular protein harvesting embodiments, without undue experimentation.

[00118] In some embodiments, a nucleotide sequence will be hybridizable, under moderately stringent conditions, to a nucleic acid having a nucleotide sequence comprising or complementary to the desired nucleotide sequences. In some embodiments, an isolated nucleotide sequence will be hybridizable, under stringent conditions, to a nucleic acid having a nucleotide sequence comprising or complementary to the desired nucleotide sequences. A nucleic acid molecule can be "hybridizable" to another nucleic acid molecule when a single-stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. "Hybridization" requires that two nucleic acids contain complementary sequences. However, depending on the stringency of the hybridization, mismatches between bases may occur. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation. Such variables are well known in the art. More specifically, the greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra). For hybridization with shorter nucleic acids, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra).
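By way of non-limiting illustration, the dependence of Tm on length, base composition, and ionic strength can be sketched with two commonly cited approximations. The Python example below uses the Wallace rule for short oligonucleotides (2 degrees C per A or T, 4 degrees C per G or C) and, for hybrids longer than about 100 nucleotides, one widely quoted salt-adjusted form, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 675/N. These particular equations and the default sodium concentration are assumptions chosen for the example, not formulas recited in the teachings herein, and published variants differ in their constants.

```python
import math

def gc_count(seq):
    """Count G and C bases in a DNA string (case-insensitive)."""
    s = seq.upper()
    return s.count("G") + s.count("C")

def tm_wallace(oligo):
    """Wallace rule estimate in degrees C for short oligonucleotides."""
    gc = gc_count(oligo)
    at = len(oligo) - gc
    return 2 * at + 4 * gc

def tm_long_hybrid(seq, sodium_molar=0.1):
    """Salt-adjusted estimate in degrees C for hybrids over ~100 nt."""
    n = len(seq)
    gc_percent = 100.0 * gc_count(seq) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 675.0 / n

short_oligo = "ATGCATGCATGCATGC"   # hypothetical 16-mer, 50% GC
long_hybrid = "ATGC" * 50          # hypothetical 200-nt hybrid, 50% GC

print(tm_wallace(short_oligo))
print(round(tm_long_hybrid(long_hybrid), 3))
```

Under either estimate, raising the GC content or the length raises the predicted Tm, which is why hybridization and wash temperatures are set relative to Tm when choosing stringency.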
[00119] In some embodiments, the polynucleotides and polypeptides have at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99 percent homology to a desired polynucleotide or polypeptide. In some embodiments, the polynucleotides and polypeptides have at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99 percent identity to a desired polynucleotide or polypeptide. And, in some embodiments, the polynucleotides and polypeptides have at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99 percent similarity to a desired polynucleotide or polypeptide. As described above, degenerate forms of the desired polynucleotide are also acceptable. In some embodiments, a polypeptide can be 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% homologous, identical, or similar to a desired polypeptide as long as it shares the same function as the desired polypeptide, and the extent of the function can be less or more than that of the desired polypeptide. In some embodiments, for example, a polypeptide can have a function that is 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or any 0.1% increment in-between, that of the desired polypeptide. And, in some embodiments, for example, a polypeptide can have a function that is 110%, 120%, 130%, 140%, 150%, 160%, 170%, 180%, 190%, 200%, 300%, 400%, 500%, or more, or any 1% increment in-between, that of the desired polypeptide. In some embodiments the "function" is an enzymatic activity, measurable by any method known to one of skill such as, for example, a method used in the teachings herein. The "desired polypeptide" or "desired polynucleotide" can be referred to as a "reference polypeptide" or "reference polynucleotide", or the like, in some embodiments as a control for comparison of a polypeptide of interest, which may be considered a "test polypeptide" or "test polynucleotide" or the like. In any event, the comparison is that of one set of bases or amino acids against another set for purposes of measuring homology, identity, or similarity. The ability to hybridize is, of course, another way of comparing nucleotide sequences.
[00120] The terms "homology" and "homologous" can be used interchangeably in some embodiments. The terms can refer to nucleic acid sequence matching and the degree to which changes in the nucleotide bases between polynucleotide sequences affect the gene expression. These terms also refer to modifications, such as deletion or insertion of one or more nucleotides, and the effects of those modifications on the functional properties of the resulting polynucleotide relative to the unmodified polynucleotide. Likewise the terms refer to polypeptide sequence matching and the degree to which changes in the polypeptide sequences, such as those seen when comparing the modified polypeptides to the unmodified polypeptide, affect the function of the polypeptide. It should be appreciated by one of skill that the polypeptides, such as the mutants taught herein, can be produced from two non-homologous polynucleotide sequences within the limits of degeneracy.
[00121] The terms "similarity" and "identity" are known in the art. The term "identity" can be used to refer to a sequence comparison based on identical matches between correspondingly identical positions in the sequences being compared. The term "similarity" can be used to refer to a comparison between amino acid sequences, and takes into account not only identical amino acids in corresponding positions, but also functionally similar amino acids in corresponding positions. Thus similarity between polypeptide sequences indicates functional similarity, in addition to sequence similarity. Levels of identity between gene sequences and levels of identity or similarity between amino acid sequences can be calculated using known methods. For example, publicly available computer based methods for determining identity and similarity include the BLASTP, BLASTN and FASTA (Altschul et al., J. Molec. Biol., 1990; 215:403-410), the BLASTX program available from NCBI, and the Gap program from Genetics Computer Group, Madison Wis. In some embodiments, the Gap program, with a Gap penalty of 12 and a Gap length penalty of 4 can be used for determining the amino acid sequence comparisons, and a Gap penalty of 50 and a Gap length penalty of 3 for the polynucleotide sequence comparisons. In some embodiments, the sequences can be aligned so that the highest order match is obtained. The match can be calculated using published techniques that include, for example, Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991, each of which is incorporated by reference herein.
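By way of non-limiting illustration, the idea of aligning two sequences so that the highest order match is obtained can be sketched with a Needleman-Wunsch global alignment score. The Python example below is a deliberate simplification: it uses a single linear gap penalty and simple match and mismatch scores, whereas the Gap program referred to above applies separate gap and gap length penalties together with a scoring matrix. The function name, the scoring values, and the toy sequences are assumptions made for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of a and b under a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]

print(global_alignment_score("GATTACA", "GATTACA"))  # identical sequences
print(global_alignment_score("GATTACA", "GATACA"))   # one residue deleted
```

Raising the gap penalty relative to the mismatch score makes the optimal alignment prefer substitutions over insertions or deletions, which is the role the gap parameters play in the comparisons described above.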
[00122] As such, the term "similarity" is similar to "identity", but in contrast to identity, similarity can be used to refer to both identical matches and conservative substitution matches. For example, if two polypeptide sequences have 10/20 identical amino acids, and the remainder are all non-conservative substitutions, then the percent identity and similarity would both be 50%. On the other hand, if there are five more positions where there are conservative substitutions, then the percent identity is 50%, whereas the percent similarity is 75%.
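By way of non-limiting illustration, the arithmetic of the preceding example can be written out as follows. The Python function name and its parameters are hypothetical; the counts are those given in the example, namely 20 aligned positions, 10 identical matches, and either zero or five conservative substitutions.

```python
def percent_identity_and_similarity(identical, conservative, aligned_length):
    """Return (percent identity, percent similarity) for an alignment.

    Identity counts identical positions only; similarity counts identical
    positions plus conservative substitutions.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    identity = 100.0 * identical / aligned_length
    similarity = 100.0 * (identical + conservative) / aligned_length
    return identity, similarity

# 10 of 20 identical, remainder non-conservative.
print(percent_identity_and_similarity(10, 0, 20))
# 10 of 20 identical plus five conservative substitutions.
print(percent_identity_and_similarity(10, 5, 20))
```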
[00123] In some embodiments, the term "substantial sequence identity" can refer to an optimal alignment, such as by the programs GAP or BESTFIT using default gap penalties, having at least 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99 percent sequence identity. The difference in what is "substantial" regarding identity can often vary according to a corresponding percent similarity, since the factor of primary importance is often the function of the sequence in a system. The term "substantial percent identity" can be used to refer to a DNA sequence that is sufficiently similar to a reference sequence at the nucleotide level to code for the same protein, or a protein having substantially the same function, in which the comparison can allow for allelic differences in the coding region. Likewise, the term can be used to refer to a comparison of sequences of two polypeptides optimally aligned.
[00124] In some embodiments, sequence comparisons can be made to a reference sequence over a "comparison window" of amino acids or bases that includes any number of amino acids or bases that is useful in the particular comparison. For example, the reference sequence may be a subset of a larger sequence. In some embodiments, the comparison window can include at least 10 residue or base positions, and sometimes at least 15-20 amino acids or bases. The reference or test sequence may represent, for example, a polypeptide or polynucleotide having one or more deletions, substitutions or additions.
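By way of non-limiting illustration, a comparison over a "comparison window" can be sketched as a sliding-window calculation of percent identity. The Python example assumes the reference and test sequences are already aligned and of equal length, and uses a window of 10 positions as mentioned above. The function name and the toy sequences are hypothetical.

```python
def window_identities(reference, test, window=10):
    """Percent identity for each window position along two aligned sequences."""
    if len(reference) != len(test):
        raise ValueError("sequences must be aligned to the same length")
    if window < 1 or window > len(reference):
        raise ValueError("window must fit within the sequences")
    results = []
    for start in range(len(reference) - window + 1):
        ref_part = reference[start:start + window]
        test_part = test[start:start + window]
        matches = sum(1 for r, t in zip(ref_part, test_part) if r == t)
        results.append(100.0 * matches / window)
    return results

toy_reference = "AAAAAAAAAAAA"  # hypothetical 12-position reference
toy_test = "AAAAAAAAAATT"       # differs at the last two positions
print(window_identities(toy_reference, toy_test))
```

Reporting the identity per window, rather than one figure for the full length, shows where along the reference a test sequence departs from it.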
[00125] The term "variant" refers to modifications to a peptide that allow the peptide to retain its binding properties, and such modifications include, but are not limited to, conservative substitutions in which one or more amino acids are substituted for other amino acids; deletion or addition of amino acids that have minimal influence on the binding properties or secondary structure; conjugation of a linker; and post-translational modifications such as, for example, the addition of functional groups. Examples of such post-translational modifications can include, but are not limited to, the addition of modifying groups described below through processes such as, for example, glycosylation, acetylation, phosphorylation, modifications with fatty acids, formation of disulfide bonds between peptides, biotinylation, PEGylation, and combinations thereof. In fact, in most embodiments, the polypeptides can be modified with any of the various modifying groups known to one of skill.
[00126] The terms "conservatively modified variant," "conservatively modified substitution," and "conservative substitution" can be used interchangeably in some embodiments. These terms can be used to refer to a conservative amino acid substitution, which is an amino acid substituted by an amino acid of similar charge density, hydrophilicity/hydrophobicity, size, and/or configuration such as, for example, substituting valine for isoleucine. In comparison, a "non-conservatively modified variant" refers to a non-conservative amino acid substitution, which is an amino acid substituted by an amino acid of differing charge density, hydrophilicity/hydrophobicity, size, and/or configuration such as, for example, substituting valine for phenylalanine. One of skill will appreciate that there are a plurality of ways to define conservative substitutions, and any of these methods may be used with the teachings provided herein. In some embodiments, for example, a substitution can be considered conservative if an amino acid falling into one of the following groups is substituted by an amino acid falling in the same group: hydrophilic (Ala, Pro, Gly, Glu, Asp, Gln, Asn, Ser, Thr), aliphatic (Val, Ile, Leu, Met), basic (Lys, Arg, His), aromatic (Phe, Tyr, Trp), and sulphydryl (Cys). See Dayhoff, M.O. et al., National Biomedical Research Foundation, Georgetown University, Washington DC:89-99 (1972), which is incorporated herein. In some embodiments, the substitution of amino acids can be considered conservative where the side chain of the substitution has similar biochemical properties to the side chain of the substituted amino acid.
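By way of non-limiting illustration, the group-based definition of a conservative substitution given above can be expressed as a lookup. The Python example encodes the five groups using one-letter amino acid codes and classifies each position of two aligned, gap-free sequences as identical, conservative, or non-conservative. The function names and the toy sequences are hypothetical, and the grouping is only the one recited above; other definitions of conservative substitution would give different results.

```python
GROUPS = {
    "hydrophilic": "APGEDQNST",  # Ala, Pro, Gly, Glu, Asp, Gln, Asn, Ser, Thr
    "aliphatic": "VILM",         # Val, Ile, Leu, Met
    "basic": "KRH",              # Lys, Arg, His
    "aromatic": "FYW",           # Phe, Tyr, Trp
    "sulphydryl": "C",           # Cys
}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}

def is_conservative(original, replacement):
    """True if two different amino acids fall in the same group."""
    return original != replacement and GROUP_OF[original] == GROUP_OF[replacement]

def classify_positions(seq_a, seq_b):
    """Count (identical, conservative, non_conservative) aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    identical = conservative = non_conservative = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            identical += 1
        elif is_conservative(a, b):
            conservative += 1
        else:
            non_conservative += 1
    return identical, conservative, non_conservative

print(is_conservative("V", "I"))   # valine and isoleucine, both aliphatic
print(is_conservative("V", "F"))   # valine and phenylalanine, different groups
print(classify_positions("VKFAC", "IKWLC"))
```

The counts returned by classify_positions are the inputs needed to compute percent identity and percent similarity for the pair.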
Microbial systems - antimicrobial lignin-derived compounds
[00127] The antimicrobial activity of lignin-derived compounds is a major problem addressed by the systems taught herein. For example, typical industrial fermentation processes might utilize the microbes Escherichia coli K12 or Escherichia coli B, or the yeast Saccharomyces cerevisiae, and recombinant versions of these microbes, which are well characterized industrial strains. The problem is that aromatic compounds have antimicrobial activity against such industrial microbes and are toxic to them, which precludes the use of these microbes in biotransformations of lignin-derived compounds.
[00128] The phenolic streams or soluble lignin streams derived from pretreated lignocellulosic biomass, for example, might contain aromatic and nonaromatic compounds, such as gallic acid, hydroxymethylfurfural alcohol, hydroxymethylfurfural, furfural alcohol, 3,5-dihydroxybenzoate, furoic acid, 3,4-dihydroxybenzaldehyde, hydroxybenzoate, homovanillin, syringic acid, vanillin, and syringaldehyde. There are several lignin-derived compounds that are antimicrobials. For example, furfural, 4-hydroxybenzaldehyde, syringaldehyde, 5-hydroxymethylfurfural, and vanillin are each known to have antimicrobial activity against Escherichia coli, and might have an additive antimicrobial activity against Escherichia coli when present in combination. Moreover, veratraldehyde, cinnamic acid and the respective benzoic acid derivatives of vanillic acid, vanillylacetone, and the cinnamic acid derivatives o-coumaric acid, m-coumaric acid, and p-coumaric acid might be components of the phenolic streams from pretreated lignocellulosic biomass.
Veratraldehyde, cinnamic acid and the respective benzoic acid derivatives of vanillic acid, vanillylacetone, and cinnamic acid derivatives o-coumaric acid, m-coumaric acid, and p-coumaric acid, each have significant antifungal activities against the yeast Saccharomyces cerevisiae, and might have an additive antifungal activity against the yeast Saccharomyces cerevisiae when present in combination.
[00129] One or more of the following benzaldehyde derivatives might be present in the phenolic streams from pretreated lignocellulosic biomass: 2,4,6-trihydroxybenzaldehyde, 2,5-dihydroxybenzaldehyde, 2,3,4-trihydroxybenzaldehyde, 2-hydroxy-5-methoxybenzaldehyde, 2,3-dihydroxybenzaldehyde, 2-hydroxy-3-methoxybenzaldehyde, 4-hydroxy-2,6-dimethoxybenzaldehyde, 2,5-dihydroxybenzaldehyde, 2,4-dihydroxybenzaldehyde, and 2-hydroxybenzaldehyde. Likewise, 2,4,6-trihydroxybenzaldehyde, 2,5-dihydroxybenzaldehyde, 2,3,4-trihydroxybenzaldehyde, 2-hydroxy-5-methoxybenzaldehyde, 2,3-dihydroxybenzaldehyde, 2-hydroxy-3-methoxybenzaldehyde, 4-hydroxy-2,6-dimethoxybenzaldehyde, 2,5-dihydroxybenzaldehyde, 2,4-dihydroxybenzaldehyde, and 2-hydroxybenzaldehyde have each demonstrated antibacterial activity against Escherichia coli, and might have an additive antibacterial activity against Escherichia coli when present in combination.
Microbial systems - suitable microbes
[00130] The antimicrobial activity of lignin-derived compounds creates a need for a strain of microbe that is tolerant to such activity in the reaction environment. The teachings include the identification of recombinant or non-recombinant microbial species that are naturally capable of metabolizing aromatic compounds for the biotransformations of lignin-derived compounds to commercial products.
[00131] Some examples of microbial species particularly suited for biotransformations of phenolic streams from pretreated lignocellulosic biomass include, but are not limited to, Azotobacter chroococcum, Azotobacter vinelandii, Novosphingobium aromaticivorans, Pseudomonas aeruginosa, Pseudomonas putida, Pseudomonas fluorescens, Pseudomonas stutzeri, Pseudomonas diminuta, Pseudomonas pseudoalcaligenes, Rhodopseudomonas palustris, Sphingomonas sp. A1, Sphingomonas paucimobilis SYK-6, Sphingomonas japonicum, Sphingomonas alaskensis, Sphingomonas wittichii, Streptomyces viridosporus, Delftia acidivorans, and Rhodococcus equi. Both bio-informatic and experimental data from the literature reveal the presence of extensive metabolic activity towards aromatic compounds in these strains, making them relevant species for the discovery of enzymes that hydrolyze lignin-derived oligomers, and for biotransformations of lignin core structures. Without intending to be bound by any theory or mechanism of action, these species exhibit, for example, metabolism of aromatic compounds such as benzoate; amino-, fluoro-, and chloro-benzoates; biphenyl; toluene and nitrotoluenes; xylenes; alkylbenzenes; styrene; atrazine; caprolactam; and polycyclic aromatic hydrocarbons.
[00132] The microbes can be grown in a fermentor, for example, using methods known to one of skill. The enzymes used in the bioprocessing are obtained from the microbes, and they can be intracellular, extracellular, or a combination thereof. As such, the enzymes can be recovered from the host cells using methods known to one of skill in the art that include, for example, filtering or centrifuging, evaporation, and purification. In some embodiments, the method can include breaking open the host cells using ultrasound or a mechanical device, removing debris, and extracting the protein, after which the protein can be purified using, for example, electrophoresis. In some embodiments, however, the teachings include the use of a microbe, recombinant or non-recombinant, that has tolerance to lignin-derived compounds. A microbe that is tolerant to lignin-derived compounds can be used industrially, for example, to express any enzyme, recombinant or non-recombinant, having a desired enzyme activity while directly in association with the lignin-derived compounds. Such activities include, for example, beta etherase activity, C-alpha-dehydrogenase activity, glutathione lyase activity, or any other enzyme activity that would be useful in the biotransformation of lignin-derived compounds. The activities can be wild-type or produced through methods known to one of skill, such as transfection or transformation, for example.
Microbial systems - Azotobacter strains
[00133] The teachings herein are also directed to the discovery and use of recombinant Azotobacter strains heterologously expressing novel beta-etherase enzymes for the hydrolysis of lignin oligomers.
[00134] Research directed to the discovery of a suitable microbe has shown that Azotobacter vinelandii may possess the industrially relevant strain criteria desired for the teachings provided herein. In some embodiments, the criteria include (i) growth on inexpensive and defined medium, (ii) resistance to inhibitors in hydrolysates of lignocellulose, (iii) tolerance to acidic pH and higher temperatures, (iv) the co-fermentation of pentose and hexose sugars, (v) genetic tractability and availability of gene expression tools, (vi) rapid generation times, and (vii) successful growth performance in pilot scale fermentations. Additionally, key physiological traits that contribute to the potential suitability of A. vinelandii to the conversion of lignin-streams include an ability to metabolize aromatic compounds and xenobiotics. Moreover, it has been shown to have a tolerance to phenolic compounds in industrial waste streams. The annotated genome sequence of A. vinelandii, and the availability of genetic tools for its transformation and for the heterologous expression of enzymes, contribute to the potential of this microbe to function, in its native form or as a transformant, for example, in a high-yield production of industrial chemicals from lignin streams.

[00135] The teachings are also directed to a method of cleaving a beta-aryl ether bond, the method comprising contacting a polypeptide taught herein with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons; wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble. The term "contacting" refers to placing an agent, such as a compound taught herein, with a target compound, and this placing can occur in situ or in vitro, for example.
[00136] The teachings are also directed to a method of cleaving a beta-aryl ether bond, the method comprising contacting a polypeptide taught herein with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons; wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble. In some embodiments, the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons. In some embodiments, the solvent environment comprises water. And, in some embodiments, the solvent environment comprises a polar organic solvent.
[00137] The teachings are also directed to a system for bioprocessing lignin-derived compounds, the system comprising a polypeptide taught herein, a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble; wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
[00138] The teachings are also directed to a recombinant polynucleotide comprising a nucleotide sequence that encodes a polypeptide taught herein. Likewise, the teachings are also directed to a vector or plasmid comprising the polynucleotide, as well as a host cell transformed by the vector or plasmid to express the polypeptide.
[00139] The teachings are also directed to a method of cleaving a beta-aryl ether bond, the method comprising (i) culturing a host cell taught herein under conditions suitable to produce a polypeptide taught herein; (ii) recovering the polypeptide from the host cell culture; and, (iii) contacting the polypeptide of claim 1 with a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.

[00140] In some embodiments, the host cell can be E. coli or an Azotobacter strain, such as Azotobacter vinelandii. And, in some embodiments, the lignin-derived compound can have a molecular weight of about 180 Daltons to about 1000 Daltons.
[00141] The teachings are also directed to a system for bioprocessing lignin-derived compounds, the system comprising (i) a transformed host cell taught herein; (ii) a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, (iii) a solvent in which the lignin-derived compound is soluble; wherein, the system functions to cleave the beta-aryl ether bond by contacting a polypeptide taught herein with the lignin-derived compound in the solvent.
EXAMPLES
[00142] The following examples illustrate, but do not limit, the present invention.

[00143] Microbial growth and metabolism studies on soluble lignin samples are performed to test the tolerance of microbes to lignin-derived compounds. A set of aromatic and nonaromatic compounds known to inhibit growth of E. coli and S. cerevisiae strains might be used to characterize the growth, tolerance and metabolic capability of Azotobacter vinelandii strain BAA1303, and A. chroococcum strain 4412 (EB Fred) X-50. Metabolism of various aromatic and nonaromatic compounds by microbial strains might be determined as a function of cellular respiration by the reduction of soluble tetrazolium salts by actively metabolizing cells. XTT (2,3-Bis(2-methoxy-4-nitro-5-sulfophenyl)-2H-tetrazolium-5-carboxanilide inner salt, Sigma) is reduced to a soluble purple formazan compound by respiring cells. E. coli might be used as the negative control strain in this study. Strains might be grown in rich medium to saturation, washed, and the OD600nm of the cultures determined. Equal numbers of bacteria will be inoculated into wells of the 48-well growth plate. Increasing concentrations of aromatic and non-aromatic compounds, in the range of 0-500 mM, will be added to the wells to a final volume of 0.8 ml. Following incubation for 24-48 hours with shaking at 25-37 °C, the cultures will be tested for growth upon exposure to the test compounds using the XTT assay kit (Sigma). Culture samples will be removed from the 48-well growth plate, and diluted appropriately in 96-well assay plates to which the XTT reagent will be added. Soluble formazan formed will be quantified by absorbance at 450 nm. Increased absorbance at 450 nm will be indicative of growth or survival, or metabolism of a particular test compound by the strains. Table 3 lists some example compounds that can be used to test the tolerance of microbes to lignin-derived compounds.
[00144] Table 3.
No.  Test Compound
1    Syringic acid
2    Syringaldehyde
3    Gallic acid
4    Furfural
5    5-Hydroxymethylfurfural
6    4-hydroxybenzaldehyde
7    Hydroxybenzoate
8    Vanillin
9    Vanillic acid
10   Cinnamic acid
11   o-, m-, and p-Coumaric acids
12   2-hydroxy-3-methoxybenzaldehyde
13   2,4,6-trihydroxybenzaldehyde
14   4-hydroxy-2,6-dimethoxybenzaldehyde
[00145] The set of lignin compounds to be tested might be expanded to any of the teachings provided herein. And, the microbial growth and metabolism studies on soluble lignin samples can also be performed on actual industrial samples such as, for example, kraft lignins and biorefinery lignins.
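The step of inoculating equal numbers of bacteria in the XTT growth study can be sketched as a small calculation. This is only an illustration under a stated assumption: that OD600 is proportional to cell number for every strain, which the text does not claim. The function name, the strain labels, the OD600 readings, and the target of 0.01 OD units per well are all hypothetical.

```python
def inoculum_volumes_ul(od600_by_strain, target_od_units=0.01):
    """Volume (in µl) of each washed culture that carries the same OD600 units.

    One OD unit is taken as 1 ml of culture at OD600 = 1.0 (assumption).
    """
    volumes = {}
    for strain, od in od600_by_strain.items():
        if od <= 0:
            raise ValueError(f"OD600 must be positive for {strain}")
        # volume in ml = target units / OD; multiply by 1000 for µl
        volumes[strain] = 1000.0 * target_od_units / od
    return volumes


# Hypothetical OD600 readings of saturated, washed cultures.
readings = {"A. vinelandii": 2.0, "A. chroococcum": 4.0, "E. coli": 5.0}
for strain, volume in inoculum_volumes_ul(readings).items():
    print(f"{strain}: {volume:.1f} µl per well")
```

A denser culture contributes a proportionally smaller volume, so each well starts from a comparable inoculum before the test compounds are added.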

[00146] This example illustrates how prospective enzymes were identified for use with the teachings provided herein. Although never successfully expressed heterologously as an industrial microbe in a commercial scale process, Sphingomonas paucimobilis has been shown to produce enzymes that have some activity in cleaving the beta aryl ether bond in lignin. See Masai, E., et al. Accordingly, the enzyme discovery effort started with running BLAST searches against the two enzymes identified by Masai as having beta etherase activity, "ligE" and "ligF". See Id. at Abstract. Table 4 lists genes identified in the BLAST searches for initial screening.

[00147] Table 4.
No.  Gene    Species                          Activity                GenBank Accession #  Identity/Similarity (%)
1    ligE    Sphingomonas paucimobilis        Beta-etherase           BAA02032.1
2    ligE-1  Novosphingobium aromaticivorans  Putative Beta-etherase  ABD26841.1           (62%) (75%)
3    ligF    Sphingomonas paucimobilis        Beta-etherase           BAA02031.1
4    ligF-1  Novosphingobium aromaticivorans  Putative Beta-etherase  ABD26530.1           (60%) (77%)
5    ligF-2  Novosphingobium aromaticivorans  Putative Beta-etherase  ABD27301.1
6    ligF-3  Novosphingobium aromaticivorans  Putative Beta-etherase  ABD27309.1
[00148] The nucleotide and amino acid sequences in Table 4 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.

[00149] This example describes a method for preparing recombinant host cells for the heterologous expression of known and putative beta-etherase encoding gene sequences in Escherichia coli (E. coli). E. coli is used in this example as a surrogate enzyme production host organism for the enzyme discovery. The construction of a novel industrial host microbe, A. vinelandii, is described below.
[00150] The gene sequences with accession numbers in Table 4 were synthesized directly as open reading frames (ORFs) from oligonucleotides by using standard PCR-based assembly methods, and using the E. coli codon bias with 10% threshold. The end sequences contained adaptors (NdeI and XhoI) for restriction digestion and cloning into the E. coli expression vector pET24b (Novagen). Internal NdeI and XhoI sites were excluded from the ORF sequences during design of the oligonucleotides. Assembled genes were cloned into a cloning vector (pGOV4), transformed into E. coli CH3 chemically competent cells, and DNA sequences determined from purified plasmid DNA. After sequence verification, restriction digestion was used to excise each ORF fragment from the cloning vector, and the sequence sub-cloned into pET24b. The entire set of ligE and ligF bearing plasmids was then transformed into E. coli BL21 (DE3), which served as the host strain for beta-etherase expression and biochemical activity testing.
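The exclusion of internal restriction sites from the ORF designs can be checked with a short scan. The sketch below is illustrative only and is not the design tool used in the example: the function name `internal_sites` and the test sequence are hypothetical. The recognition sequences CATATG (NdeI) and CTCGAG (XhoI) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the two cloning enzymes named in the text.
SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}


def internal_sites(orf, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site found in the ORF."""
    orf = orf.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = orf.find(motif)
        while start != -1:
            positions.append(start)
            # Step forward by one base so overlapping sites are also reported.
            start = orf.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Hypothetical ORF fragment carrying one internal site for each enzyme.
print(internal_sites("ATGCATATGAAACTCGAGTAA"))
# A design with no internal sites returns an empty dict.
print(internal_sites("ATGAAAGGTTAA"))
```

An ORF that returns an empty result could be flanked by the NdeI and XhoI adaptors without the digestion cutting inside the coding sequence.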

[00151] LigE, from Accession No. BAA02032.1, is listed herein as SEQ ID NO:1 for the protein and SEQ ID NO:2 for the gene. An "optimized" nucleic acid sequence was created to facilitate the transformation in E. coli and is listed herein as SEQ ID NO:977.
[00152] LigE-1, from Accession No. ABD26841.1, is listed herein as SEQ ID NO:101 for the protein and SEQ ID NO:102 for the gene. An "optimized" nucleic acid sequence was created to facilitate the transformation in E. coli and is listed herein as SEQ ID NO:978.
[00153] LigF, from Accession No. BAA02031.1 (P30347.1), is listed herein as SEQ ID NO:513 for the protein and SEQ ID NO:514 for the gene. An "optimized" nucleic acid sequence was created to facilitate the transformation in E. coli and is listed herein as SEQ ID NO:979.
[00154] LigF-1, from Accession No. ABD26530.1, is listed herein as SEQ ID NO:539 for the protein and SEQ ID NO:540 for the gene. An "optimized" nucleic acid sequence was created to facilitate the transformation in E. coli and is listed herein as SEQ ID NO:980.
[00155] LigF-2, from Accession No. ABD27301.1, is listed herein as SEQ ID NO:541 for the protein and SEQ ID NO:542 for the gene. An "optimized" nucleic acid sequence was created to facilitate the transformation in E. coli and is listed herein as SEQ ID NO:981.
[00156] LigF-3, from Accession No. ABD27309.1, is listed herein as SEQ ID NO:545 for the protein and SEQ ID NO:546 for the gene. An "optimized" nucleic acid sequence was created to facilitate the transformation in E. coli and is listed herein as SEQ ID NO:982.

[00157] This example describes a method for gene expression in E. coli, as well as beta-etherase biochemical assays. Expression of known and putative beta-etherase genes was performed using 5 ml cultures of the recombinant E. coli strains described herein in Luria Broth medium by induction of gene expression using isopropylthiogalactoside (IPTG) to a final concentration of 0.1 mM. Following induction and cell harvest, the cells were disrupted using either sonication or the BPER (Invitrogen) cell lysis system.
[00158] Clarified cell extracts were tested in the in vitro biochemical assay for beta-etherase activity on a fluorescent substrate, the model lignin dimer compound α-O-(β-methylumbelliferyl) acetovanillone (MUAV). In vitro reactions were performed in a total volume of 200 µl and contained: 25 mM Tris-HCl pH 7.5; 0.5 mM dithiothreitol; 1 mM glutathione; 0.05 mM or 0.1 mM MUAV; and 10 µl of clarified cell extract, which was used to initiate the reactions. Following incubation for 2.5 hours at room temperature, a 50 µl sample of each reaction was terminated using 150 µl of 300 mM glycine/NaOH buffer pH 9. The formation of 4-methylumbelliferone (4MU) upon hydrolysis of the aryl ether bond was monitored by the increase in fluorescence at λex = 360 nm and λem = 450 nm using a Spectramax UV/visible/fluorescent spectrophotometer.
[00159] The total protein concentrations of the cell lysates were determined using the BCA reagent system for protein quantification (Pierce).
[00160] Induction might also be performed using IPTG concentrations in the range of 0.01-1 mM. Cell disruption might also be performed using toluene permeabilization, French pressure techniques, or using multiple freeze/thaw cycles in conjunction with lysozyme. Assay conditions might be varied to include Tris-HCl at 10-150 mM concentrations and in the pH range of 6.5-8.5; 0-2 mM dithiothreitol; 0.05-2 mM glutathione; 0.01-5 mM MUAV substrate; and 22-42 °C reaction temperatures. The biochemical assay might be performed as a fixed time point assay with reaction times ranging from 5 minutes to 12 hours, or performed continuously without quenching with glycine/NaOH buffer to extract enzyme kinetic parameters.
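One way the kinetic parameters mentioned above might be extracted from initial rates is sketched below. This is an assumption-laden illustration: the text does not name a kinetic model, so Michaelis-Menten kinetics and a Lineweaver-Burk (double-reciprocal) linear fit are assumed here, and the substrate concentrations and rates are synthetic values generated from a hypothetical Km and Vmax rather than measured data.

```python
def lineweaver_burk(substrate_mM, rates):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by least squares; return (Km, Vmax)."""
    xs = [1.0 / s for s in substrate_mM]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # Km / Vmax
    intercept = mean_y - slope * mean_x   # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic initial rates generated from hypothetical Km = 0.2 mM, Vmax = 100 rfu/min.
true_km, true_vmax = 0.2, 100.0
muav_mM = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]  # within the 0.01-5 mM range in the text
rates = [true_vmax * s / (true_km + s) for s in muav_mM]

km, vmax = lineweaver_burk(muav_mM, rates)
print(f"Km ~ {km:.3f} mM, Vmax ~ {vmax:.1f} rfu/min")
```

The double-reciprocal fit is chosen only because it needs no external libraries; with noisy real data a direct nonlinear fit is generally less sensitive to error at low substrate concentrations.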

[00161] This example describes the tested biochemical activities of the newly-discovered beta-etherase enzymes.
[00162] FIG. 4 illustrates unexpected results from biochemical activity assays for beta-etherase function for the S. paucimobilis positive control polypeptides, and the N. aromaticivorans putative beta-etherase polypeptide, according to some embodiments. The much elevated beta-etherase activity exhibited by the putative ligE1 gene product from N. aromaticivorans as compared to the S. paucimobilis ligE gene product was a completely unexpected result of the enzyme discovery program.
[00163] In reactions containing 0.1 mM MUAV substrate, E. coli cell extracts expressing the N. aromaticivorans ligE1 protein yielded a total activity of 529 rfu/µg compared to 7 rfu/µg for the S. paucimobilis ligE protein. The newly discovered beta-etherase from N. aromaticivorans is approximately 75-fold more efficient than the previously described S. paucimobilis ligE beta-etherase enzyme. The highly efficient novel beta-etherase is ideally suited to be a biocatalyst for conversion of lignin aryl ethers to monomers in biotechnological processes.
[00164] It was also surprising to find that 3 novel N. aromaticivorans polypeptides having identities to the S. paucimobilis LigF sequence showed beta-etherase activity on the MUAV substrate. While all 3 putative ligF gene products from N. aromaticivorans exhibited beta-etherase activity, the LigF2 polypeptide is approximately 2-fold more efficient than the S. paucimobilis LigF protein. The N. aromaticivorans LigF2 protein yielded a total activity of 1206 rfu/µg compared to 558 rfu/µg for the S. paucimobilis LigF protein.
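The fold improvements quoted above follow directly from the reported total activities. The short sketch below reproduces that arithmetic; the activity values are those stated in the text, while the function and variable names are hypothetical.

```python
def fold_change(new_activity, reference_activity):
    """Ratio of a new enzyme's activity to that of the reference enzyme."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return new_activity / reference_activity


# Total activities (rfu per µg protein) reported in the text.
lige1_vs_lige = fold_change(529, 7)      # N. aromaticivorans LigE1 vs S. paucimobilis LigE
ligf2_vs_ligf = fold_change(1206, 558)   # N. aromaticivorans LigF2 vs S. paucimobilis LigF

print(f"LigE1 / LigE: {lige1_vs_lige:.1f}-fold")
print(f"LigF2 / LigF: {ligf2_vs_ligf:.1f}-fold")
```

The ratios come to roughly 75.6 and 2.2, consistent with the "approximately 75-fold" and "approximately 2-fold" figures stated in the example.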
[00165] As such, the enzyme discovery program unexpectedly and surprisingly generated four (4) novel polypeptides from N. aromaticivorans with beta-etherase activity.
This set of enzymes shows great potential for the catalysis of a complete depolymerization of lignin-derived compounds. The results were unexpected and surprising for at least the following reasons:
[00166] Four (4) novel gene sequences encoding polypeptides with beta-etherase activity were discovered from N. aromaticivorans. These sequences have GenBank Nos. ABD26841.1 (SEQ ID NO:101); ABD26530.1 (SEQ ID NO:539); ABD27301.1 (SEQ ID NO:541); and ABD27309.1 (SEQ ID NO:545).
[00167] One of skill will appreciate that the bioinformatic screen that was used to help identify putative enzymes is not a definitive predictor in itself of biochemical activities, particularly in view of (i) having only one known active enzyme for LigE in a different species, (ii) one known active enzyme for LigF, and (iii) the unexpected extent of such activities discovered. The tests for function therefore had to be performed empirically on the N. aromaticivorans putative beta-etherase gene set.
[00168] One of skill will also appreciate that the discovery of beta-etherase activities for all 4 N. aromaticivorans polypeptides was a complete surprise given the relatively low levels of identities (37%-62%) the sequences had with respect to the S. paucimobilis LigE and LigF proteins.
[00169] One of skill will also appreciate that the discovery of 2 novel beta-etherases from N. aromaticivorans with improved activities over the corresponding LigE and LigF proteins from S. paucimobilis was completely unexpected, and this exciting discovery provides a foundation for further enzyme development for industrial applications.

[00170] This example describes the extended use of bioinformatics to identify a pool of putative enzymes in the discovery program. As noted above, the bioinformatic screen that was used to help identify putative enzymes initially was not a definitive predictor in itself of biochemical activities, particularly in view of (i) having only one known active enzyme for LigE in a different species, (ii) one known active enzyme for LigF, and (iii) the unexpected extent of such activities discovered. Having the additional known active enzymes provided more information that could be used to enhance the effectiveness of the bioinformatics in identifying the pool of putative enzymes for both LigE-type and LigF-type enzymes.
[00171] Sequence to function correlations for the newly discovered beta-etherases were analyzed and identified. A bioinformatic survey of functional domains, essential catalytic residues, and sequence alignments was performed for the N. aromaticivorans LigE and LigF polypeptides. While not intending to be bound by any theory or mechanism of action, the rationale and key results of the survey include at least the following:
[00172] Identifying functional domains
[00173] As shown in FIG. 4, high levels of beta-etherase activities were discovered for the N. aromaticivorans LigE1 and LigF2 polypeptide sequences compared to the S. paucimobilis LigE and LigF proteins. The N. aromaticivorans LigE1 and LigF2 polypeptide sequences were used as query sequences for the identification of functional domains using the Conserved Domain Database (CDD) in GenBank.
[00174] The N. aromaticivorans LigE1 polypeptide is annotated as a glutathione S-transferase (GST)-like protein with similarity to the GST C family, and the beta-etherase LigE subfamily. The LigE sub-family is composed of proteins similar to S. paucimobilis beta etherase, LigE, a GST-like protein that catalyzes the cleavage of the beta-aryl ether linkages present in low-molecular weight lignins using reduced glutathione (GSH) as the hydrogen donor in the reaction. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains.

[00175] Table 5 describes conserved domains and essential amino acid residues in the N. aromaticivorans LigE1 polypeptide (ABD26841.1), according to some embodiments. The three (3) conserved functional domains annotated in the N. aromaticivorans LigE1 polypeptide are: i) the dimer interface; ii) the N terminal domain; iii) the lignin substrate binding pocket or the H site. Amino acid residues defining the functional domains in such embodiments are residues 98-221 in the N. aromaticivorans LigE1 polypeptide.
[00176] Table 5 also lists fifteen (15) amino acid residues as conserved and essential for catalytic activity (column 3 of Table 5), according to some embodiments. These include: K100; A101; N104; P166; W107; Y184; Y187; R188; G191; G192; F195; V111; G112; M115; F116. While not intending to be bound by any theory or mechanism of action, these residues appear responsible for the high beta-etherase catalytic activity discovered for the N. aromaticivorans LigE1 polypeptide compared to the S. paucimobilis ligE polypeptide.
[00177] In such embodiments, the essential amino acid residues of the N. aromaticivorans LigE1 polypeptide might be altered conservatively, and singly or in combination with similar amino acid residues that would retain or improve the catalytic function of the N. aromaticivorans LigE1 polypeptide. Examples of such alternate residues that might be incorporated at the essential positions are also shown in column 4 of Table 5.

[00178] Table 5.
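Functional domain: Dimer interface
Residues defining the domain in N. aromaticivorans LigE1: residues 98-221 of SEQ ID NO:101
Conserved residues essential for catalysis in N. aromaticivorans LigE1: K100; A101; N104; P166
Alternate residues suggested for the essential positions: K100->R; A101->L, I, V, G, S; N104->Q, H, S, A

Functional domain: N terminal domain interface
Residues defining the domain in N. aromaticivorans LigE1: residues 98-221 of SEQ ID NO:101
Conserved residues essential for catalysis in N. aromaticivorans LigE1: K100; W107; Y184; Y187; R188; G191; F195
Alternate residues suggested for the essential positions: K100->R; W107->Y, F, A, S; Y184->W, F, A, S; Y187->W, F, A, S; R188->K; G191->L, I, V, A, S; F195->W, Y, A, S

Functional domain: Lignin/substrate binding pocket or H site
Residues defining the domain in N. aromaticivorans LigE1: residues 98-221 of SEQ ID NO:101
Conserved residues essential for catalysis in N. aromaticivorans LigE1: W107; V111; G112; M115; F116; G192; F195
Alternate residues suggested for the essential positions: W107->Y, F, A, S; V111->L, I, G, A, S; G112->L, I, V, A, S; M115->S, A, G; G192->L, I, V, A, S; F195->W, Y, A, S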
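The alternate residues suggested in Table 5 can be expanded into a list of single-substitution variants in the usual wild-type/position/replacement notation. The sketch below does this for the dimer interface entries only. It assumes that the character printed as "1" in the alternate-residue lists stands for isoleucine (I), and the names `DIMER_INTERFACE_ALTERNATES` and `single_substitutions` are hypothetical.

```python
# Dimer interface positions of N. aromaticivorans LigE1 and their suggested alternates
# (Table 5). P166 is omitted because no alternate residue is listed for it.
DIMER_INTERFACE_ALTERNATES = {
    ("K", 100): "R",
    ("A", 101): "LIVGS",
    ("N", 104): "QHSA",
}


def single_substitutions(alternates):
    """List variant names such as 'K100R', ordered by residue position."""
    names = []
    for (wild_type, position), replacements in sorted(
        alternates.items(), key=lambda item: item[0][1]
    ):
        for new_residue in replacements:
            names.append(f"{wild_type}{position}{new_residue}")
    return names


variants = single_substitutions(DIMER_INTERFACE_ALTERNATES)
print(len(variants), "single-substitution variants")
print(", ".join(variants))
```

The same dictionary layout could hold the other two functional domains; combinations of substitutions, which the text also contemplates, would multiply the number of candidates considerably.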
[00179] The N. aromaticivorans LigF2 polypeptide is annotated as a glutathione S-transferase (GST)-like protein with similarity to the GST C family, catalyzing the conjugation of glutathione with a wide range of xenobiotic agents.
[00180] Table 6 describes conserved domains and essential amino acid residues in the N. aromaticivorans LigF2 polypeptide (ABD27301.1), according to some embodiments. The three (3) conserved functional domains annotated for the N. aromaticivorans LigF2 polypeptide are similar to those described for the N. aromaticivorans LigE polypeptide and comprise: i) the dimer interface; ii) the N terminal domain; iii) the substrate binding pocket or the H site. In such embodiments, amino acid residues defining the functional domains are residues 99-230 in the N. aromaticivorans LigF2 polypeptide.
[00181] Table 6 also lists sixteen (16) amino acid residues as conserved and essential for catalytic activity (column 3 of Table 6) of the N. aromaticivorans LigF2 polypeptide, according to some embodiments. These include: R100; Y101; K104; K176; D107; L194; I197; N198; S201; M206; M111; N112; S115; M116; M206; H202. While not intending to be bound by any theory or mechanism of action, these 16 residues appear to be responsible for the high beta-etherase catalytic activity discovered for the N. aromaticivorans LigF2 polypeptide compared to the S. paucimobilis LigF polypeptide.

[00182] In such embodiments, the essential amino acid residues of the N. aromaticivorans LigF2 polypeptide might be altered conservatively, and singly or in combination with similar amino acid residues that would retain or improve the catalytic function of the N. aromaticivorans LigF2 polypeptide. Examples of such alternate residues that might be incorporated at the essential positions are shown in column 4 of Table 6.
[00183] Table 6.
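Functional domain: Dimer interface
Residues defining the domain in N. aromaticivorans LigF2: residues 99-230 of SEQ ID NO:541
Conserved residues essential for catalysis in N. aromaticivorans LigF2: R100; Y101; K104; K176
Alternate residues suggested for the essential positions: R100->K; Y101->W, F, A, S; K104->R; K176->R

Functional domain: N terminal domain interface
Residues defining the domain in N. aromaticivorans LigF2: residues 99-230 of SEQ ID NO:541
Conserved residues essential for catalysis in N. aromaticivorans LigF2: R100; D107; L194; I197; N198; S201; M206
Alternate residues suggested for the essential positions: R100->K; D107->E; L194->V, I, G, A, S; I197->L, V, G, A, S; N198->Q; S201->A, M, G; M206->S, A, G

Functional domain: Substrate binding pocket or H site
Residues defining the domain in N. aromaticivorans LigF2: residues 99-230 of SEQ ID NO:541
Conserved residues essential for catalysis in N. aromaticivorans LigF2: D107; M111; N112; S115; M116; M206; H202
Alternate residues suggested for the essential positions: D107->E; M111->S, A, G; N112->Q; S115->A, M, G; M116->S, A, G; M206->S, A, G; H202->N, Q, S, M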
[00184] Identifying additional functional domains
[00185] Bioinformatic methods were used to further understand the protein structure that may result in the desired activities. First, the LigE1 and LigF2 were analyzed together. Amino acid sequence alignments were performed using the N. aromaticivorans ligE1 (ABD26841.1) and ligF2 (ABD27301.1) sequences using the BLAST-P program in GenBank, and the ProDom and PraLine programs. Full length sequence alignments yielded hits with relatively low identities, for example, identities of <70%.
[00186] Next, regions in LigE1 and LigF2 were analyzed independently in GENBANK. For LigE1, an alignment was performed against the database in GENBANK using the following query sequence: "tispfvwatkyalkhkgfdldvvpggftgilertgg" (residues 19-54 of SEQ ID NO:101), from N. aromaticivorans ligE1. The BLAST yielded at least 3 subject sequences with high identities in the thioredoxin (TRX)-like superfamily of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif.
[00187] Without intending to be bound by any theory or mechanism of action, they are thought to function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), TlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others.
[00188] Table 7 lists 3 subject sequences having high identities (>80%) to residues 19-54 of LigE-1 (SEQ ID NO:101). In some embodiments, these sequences are likely to be essential to catalytic functions similar to those discovered for the N. aromaticivorans ligE1 polypeptide.
[00189] Table 7.
Subject sequence | Species; Gene | GenBank accession # | Identity/Similarity to N. aromaticivorans LigE1 query sequence residues 19-54 (%)
(residues 19-54 of SEQ ID NO:1) TISPYVWRTKYALKHKGFDIDIVPGGFTGILERTGG | Sphingomonas paucimobilis; beta etherase | BAA02032.1 | 89/97
(residues 19-54 of SEQ ID NO:89) TISPFVWRTKYALAHKGFDVDIVPGGFTGIAERTGG | Novosphingobium sp. PP1Y; glutathione S transferase like protein | YP_004533906.1 | 86/92
(residues 19-54 of SEQ ID NO:3) TISPFVWATKYAIAHKGFELDIVPGGFSGIPERTGG | Sphingobium sp. SYK-6; beta-etherase | BAJ11989.1 | 83/94
[00190] The nucleotide and amino acid sequences in Table 7 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.
[00191] Likewise, for LigF2, separate alignments were performed against the database in GENBANK using the following 2 query sequences from N. aromaticivorans ligF2 (ABD27301.1): "ainpegqvpvl" (residues 47-57 of SEQ ID NO:541); and "iithttvineyled" (residues 63-76 of SEQ ID NO:541). The alignments yielded multiple subject sequences with high identities in the GST-N superfamily of proteins. Without intending to be bound by any theory or mechanism of action, the N terminal region (residues 43-75 of SEQ ID NO:541) of the N. aromaticivorans ligF2 polypeptide is annotated in the CDD to encompass:
[00192] i. N terminal residues thought to make contact with the C terminal interface in forming the tertiary protein structure for the GST-N family of proteins;
[00193] ii. N terminal residues thought to be involved in dimerization of the polypeptides; and,
[00194] iii. Residues thought to be involved in the binding of glutathione substrate.
[00195] Table 8 provides the percent identities and similarities to N. aromaticivorans LigF2 query sequence residues 47-57.

[00196] Table 8.
Subject sequence | Species; Gene | GenBank accession # | Identity/Similarity to N. aromaticivorans LigF2 query sequence residues 47-57 (%)
(residues 45-55 of SEQ ID NO:983) AINPKGQVPVL | Proteus mirabilis ATCC 29906; glutathione S-transferase | ZP_03840063.1 | 91/91
(residues 60-70 of SEQ ID NO:985) AINPQGQVPAL | Neisseria macacae ATCC 33926; glutathione S-transferase | ZP_08683997.1 | 82/91
(residues 43-53 of SEQ ID NO:987) AMNPEGEVPVL | Rhodospirillum rubrum; glutathione S-transferase-like protein | YP_425114.1 | 82/91
(residues 46-56 of SEQ ID NO:989) AINPQGQVPAL | Neisseria sicca ATCC 29256; glutathione S-transferase | ZP_05317369.1 | 82/91
(residues 46-56 of SEQ ID NO:991) AINPQGQVPAL | Neisseria mucosa ATCC 25996; glutathione S-transferase | ZP_05978410.1 | 82/91
(residues 19-29 of SEQ ID NO:993) AINPAGEVPVL | alpha proteobacterium BAL199; Glutathione S-transferase-like protein | ZP_02189431.1 | 82/91
(residues 31-41 of SEQ ID NO:995) AINPLGQVPVL | Marinomonas sp. MED121; glutathione S-transferase | ZP_01077889.1 | 91/91
(residues 46-55 of SEQ ID NO:997) | Proteus penneri ATCC 35198; hypothetical protein | ZP_03805830.1 | 90/90
(residues 45-55 of SEQ ID NO:999) AINPQGKVPVL | Aureococcus anophagefferens; hypothetical protein AURANDRAFT_7474 | EGB13094.1 | 82/91
[00197] The nucleotide and amino acid sequences in Table 8 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.
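By way of illustration only, the identity values reported in Table 8 can be approximated for short, ungapped peptides by comparing the query and subject residues position by position. The following is a minimal sketch under stated assumptions: the function name `percent_identity` is hypothetical, the comparison is ungapped and requires equal-length peptides, and BLAST-P itself computes identities over its own local alignment rather than by this simplified count.

```python
def percent_identity(query: str, subject: str) -> int:
    """Return the rounded percentage of identical positions in an ungapped comparison."""
    if len(query) != len(subject):
        raise ValueError("ungapped comparison needs sequences of equal length")
    # Compare residues position by position, ignoring letter case.
    matches = sum(q == s for q, s in zip(query.upper(), subject.upper()))
    return round(100 * matches / len(query))


# Query peptide from the text: residues 47-57 of SEQ ID NO:541.
QUERY = "ainpegqvpvl"

# Subject peptides copied from Table 8.
SUBJECTS = {
    "Proteus mirabilis ATCC 29906": "AINPKGQVPVL",
    "Neisseria macacae ATCC 33926": "AINPQGQVPAL",
    "Rhodospirillum rubrum": "AMNPEGEVPVL",
}

for species, peptide in SUBJECTS.items():
    print(f"{species}: {percent_identity(QUERY, peptide)}% identity")
```

The similarity values in Table 8 additionally credit conservative substitutions, which would require a substitution matrix and is not attempted in this sketch.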
[00198] Table 9 provides the percent identities and similarities to N. aromaticivorans LigF2 query sequence residues 63-76.

[00199] Table 9.
Subject sequence | Species; Gene | GenBank accession # | Identity/Similarity to N. aromaticivorans LigF2 query sequence residues 63-76 (%)
(residues 107-115 of SEQ ID NO:1001) TVINEYLED | Trichophyton verrucosum HKI 0517; conserved hypothetical protein | XP_003019921.1 | 100/100
(residues 103-111 of SEQ ID NO:1003) TVINEYLED | Arthroderma benhamiae CBS 112371; conserved hypothetical protein | XP_003017304.1 | 100/100
(residues 72-80 of SEQ ID NO:1005) TVINEYLED | Trichophyton rubrum CBS 118892; glutathione transferase | XP_003232549.1 | 100/100
(residues 62-75 of SEQ ID NO:1007) IITESTVICEYLED | Novosphingobium sp. PP1Y; glutathione S-transferase-like protein | YP_004533905.1 | 79/79
(residues 84-92 of SEQ ID NO:1009) | Arthroderma gypseum CBS 118893; hypothetical protein | XP_003171868.1 | 89/100
(residues 61-69 of SEQ ID NO:1011) | Trichophyton equinum CBS 127.97; hypothetical protein | EGE04518.1 | 89/100
[00200] The nucleotide and amino acid sequences in Table 9 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.
[00201] The bioinformatics provides valuable information about protein structure that can assist in identifying test candidates. For example, LigE1 has the 98-221 region, which is annotated in the databases as potentially responsible as a component of binding and activity, dimerization, and binding and catalysis in general. While not intending to be bound by any theory or mechanism of action, the variability in active site structures is reflected by the variability in substrate structures. Likewise, upon further research using bioinformatics, it was discovered that the 19-54 region is annotated in the databases as a second region that is potentially responsible as a component of the reductase function, and thus potentially responsible for catalysis in addition to the 98-221 region, while being more conserved between members.
[00202] Obtaining additional structural information that will assist in finding high performing proteins within each family of strains is within the scope of the teachings to the extent that the methodology is known to one of skill. A variety of research techniques are known to one of skill. Bioinformatic methods, such as motif finding, are an example of one way to obtain the additional structural information. Motif finding, also known as profile analysis, constructs global multiple sequence alignments that attempt to align short conserved sequence motifs among the sequences in the query set. This can be done, for example, by first constructing a general global multiple sequence alignment, after which highly conserved regions are isolated, in a manner similar to what is taught herein, and used to construct a set of profile matrices. The profile matrix for each conserved region is arranged like a scoring matrix but its frequency counts for each amino acid or nucleotide at each position are derived from the conserved region's character distribution rather than from a more general empirical distribution. The profile matrices are then used to search other sequences for occurrences of the motif they characterize.
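As a non-limiting illustration of the profile analysis described above, the following sketch builds a profile matrix from a conserved region and uses it to search another sequence for the motif. It is a simplified example under stated assumptions: the function names `build_profile` and `best_hit` are hypothetical, the motif instances are the pre-aligned Table 8 peptides, a uniform background distribution and a pseudocount of 1 are assumed, and the target sequence is invented purely for demonstration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(instances, pseudocount=1.0):
    """Build a log-odds profile matrix from equal-length, pre-aligned motif instances."""
    width = len(instances[0])
    if any(len(seq) != width for seq in instances):
        raise ValueError("all motif instances must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies
    total = len(instances) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for pos in range(width):
        # Frequency counts come from the conserved region's own residue distribution.
        counts = Counter(seq[pos] for seq in instances)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_hit(profile, sequence):
    """Slide the profile along the sequence; return (score, 0-based start) of the best window."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues outside the 20-letter alphabet receive the column's lowest score.
        score = sum(col.get(aa, min(col.values())) for col, aa in zip(profile, window))
        if score > best[0]:
            best = (score, start)
    return best


# Aligned motif instances copied from Table 8.
MOTIF_INSTANCES = [
    "AINPKGQVPVL", "AINPQGQVPAL", "AMNPEGEVPVL", "AINPQGQVPAL",
    "AINPQGQVPAL", "AINPAGEVPVL", "AINPLGQVPVL", "AINPQGKVPVL",
]

# Hypothetical target sequence, invented for this example only.
TARGET = "MKLTAINPEGQVPVLDGRW"

profile = build_profile(MOTIF_INSTANCES)
score, start = best_hit(profile, TARGET)
print(f"best window {TARGET[start:start + len(profile)]} at position {start + 1}, score {score:.2f}")
```

A log-odds score against a background is one common way to use such a matrix; other scoring schemes could be substituted without changing the overall approach.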
[00203] LigE-1 and LigF-2 were further examined by comparing their structures to other polypeptides of the LigE-type and LigF-type, respectively. Table 10A shows conserved residues between the polypeptide sequences of LigE and LigE-1, and Table 10B shows conserved residues between the polypeptide sequences of LigF and LigF-2.

[00204] Table 10A.
Res Pos Res Pos Res Pos Res Pos Res Pos Res Pos
[00205] As can be seen, there is a high degree of between-species similarity between LigE and LigE-1 in the LigE-type family. The LigE residues are from S. paucimobilis (BAA02032.1) and the LigE-1 residues are from N. aromaticivorans LigE1 (ABD26841.1). The numbering is done according to the S. paucimobilis sequence (BAA02032.1) in the PRALINE alignment file (gaps not included).

[00206] Table 10B.
Res Pos Res Pos
[00207] As can be seen, there is less between-species similarity between LigF and LigF-2 in the LigF-type family. The LigF residues are from S. paucimobilis (BAA02031.1) and the LigF-2 residues are from N. aromaticivorans (ABD27301.1). Numbering is according to the S. paucimobilis sequence (BAA02031.1) in the PRALINE alignment file (gaps not included).

[00208] This example provides additional sequences for a second round of assays. The sequences contain the 3 conserved functional domains described herein for the GST C family of proteins and belong to the beta-etherase LigE subfamily. Table 11 lists nine (9) additional sequences having identities of 51%-73% at the amino acid level that were identified in the SwissProt database using the S. paucimobilis LigE sequence (P27457.3) as the query. The bioinformatics information suggests that these 9 sequences are excellent candidates for the next round of synthesis, cloning, expression and testing for the desired biochemical functions using the methods described herein.
[00209] Table 11.
# | Annotation | Accession # SwissProt/GenBank | Identity to S. paucimobilis LigE polypeptide (%)
7 | Dianthus caryophyllus; Glutathione S transferase | P28342.1/121736 | 59
8 | Euphorbia esula; Glutathione S transferase | P57108.1/11132235 | 51
9 | Zea mays; Glutathione S transferase | P04907.4/1170090 | 70
10 | Pseudomonas aeruginosa; Maleylacetoacetate isomerase | P57109.1/11133449 | 58
11 | Zea mays; Glutathione S transferase | P46420.2/1170092 | 63
12 | Arabidopsis thaliana; Glutathione S transferase | Q8L7C9.1/75329755 | 61
13 | Arabidopsis thaliana; Glutathione S transferase | P42769.1/1170093 | 73
14 | Oryza sativa Japonica Group; Probable Glutathione S transferase | O65857.2/57012737 | 59
15 | Oryza sativa Japonica Group; Probable Glutathione S transferase | O82451.3/57012739 | 62
[00210] The nucleotide and amino acid sequences in Table 11 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.

[00211] This example describes how native lignin core structures can be hydrolyzed by the action of C alpha-dehydrogenases, beta-etherases, and glutathione-eliminating enzymes.
[00212] FIG. 5 illustrates beta-aryl-ether compounds to be tested as substrates representing native lignin structures, according to some embodiments. While MUAV was used as a model substrate in the identification of novel beta-etherase enzymes, additional aryl-ether compounds such as those shown in FIG. 5 might be used to assess substrate specificities of the beta-etherases towards dimers and trimers of aromatic compounds containing the beta-aryl ether linkage and representative of native lignin structures. Higher order oligomers of molecular weights <2000 might be synthesized and tested as well. The compounds might be obtained by custom organic synthesis, as for the fluorescent substrate MUAV.
[00213] FIG. 6 illustrates pathways of guaiacylglycerol-beta-guaiacyl ether (GGE) metabolism by S. paucimobilis, according to some embodiments. Enzymes in addition to LigE/F-like beta etherases might be required to hydrolyze native lignin core structures. The model beta-aryl ether compound guaiacylglycerol-beta-guaiacyl ether (GGE) is believed to contain the main chemical linkages present in native lignin, including the hydroxyl, aryl-ether and methoxy functionalities. The biotransformation of GGE to the lignin monomer beta-hydroxypropiovanillone (beta-HPV) is partially understood for S. paucimobilis, and proposed to occur via the action of 3 separate enzymes in a step-wise manner. The ligD gene encodes a C alpha-dehydrogenase which oxidizes GGE to alpha-(2-methoxyphenoxy)-beta-hydroxypropiovanillone (MPHPV); the ether bond of MPHPV is cleaved by the beta-etherase activities of the ligE and ligF gene products to yield the lignin monomer guaiacol and alpha-glutathionylhydroxypropiovanillone (GS-HPV). The ligG gene encodes a glutathione (GSH)-eliminating glutathione S transferase (GST) which catalyzes the elimination of GSH from GS-HPV to yield the lignin monomer hydroxypropiovanillone (HPV).
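For illustration only, the step-wise GGE pathway described above can be written down as a small data structure and traversed to list the compounds that are not consumed by a later step. This is a minimal sketch: the names `Step`, `GGE_PATHWAY` and `end_products` are hypothetical, and cofactors such as glutathione are deliberately omitted.

```python
from typing import NamedTuple


class Step(NamedTuple):
    gene: str
    activity: str
    substrate: str
    products: tuple


# The three enzymatic steps as described in the text (cofactors omitted).
GGE_PATHWAY = [
    Step("ligD", "C alpha-dehydrogenase", "GGE", ("MPHPV",)),
    Step("ligE/ligF", "beta-etherase", "MPHPV", ("guaiacol", "GS-HPV")),
    Step("ligG", "GSH-eliminating glutathione S transferase", "GS-HPV", ("HPV",)),
]


def end_products(start, pathway):
    """Follow the pathway from `start` and return compounds no later step consumes."""
    by_substrate = {step.substrate: step for step in pathway}
    pending, finals = [start], []
    while pending:
        compound = pending.pop(0)
        step = by_substrate.get(compound)
        if step is None:
            finals.append(compound)  # nothing acts on it: treated as an end product
        else:
            pending.extend(step.products)
    return finals


print(end_products("GGE", GGE_PATHWAY))
```

Candidate LigD, LigE/F or LigG homologs could be recorded against the corresponding step in the same structure when planning which enzyme combinations to test.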
[00214] While the LigE and LigF polypeptides, or similar ones described herein, might be sufficient to hydrolyze native lignin structures, it would be useful to discover novel C alpha dehydrogenases (S. paucimobilis LigD homologs) and glutathione (GSH)-eliminating glutathione S transferases (S. paucimobilis LigG homologs) for industrial applications. The enzyme discovery programs might be conducted by methods similar to those described herein. The lignin substrates, intermediates, and products of biochemical reactions might be measured following filtration and extraction of the substrates and products into ethyl acetate. Substrates and products might be separated using reverse phase HPLC conditions with a C18 column developed with a gradient solvent system of methanol and water, and detected at 230 nm or 254 nm.
[00215] Table 12 lists potential C alpha-dehydrogenase polypeptide sequences, the LigD-type, for use in conjunction with beta etherases including, but not limited to, LigE/F.
The sequences were identified using bioinformatic methods, such as those taught herein.

These C alpha-dehydrogenases are classified in the CDD as short-chain dehydrogenase/reductases (SDRs) and are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold (alpha/beta folding pattern with a central beta-sheet), an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Classical SDRs are typically about 250 residues long, while extended SDRs are approximately 350 residues. Sequence identity between different SDR enzymes is typically in the 15-30% range, but the enzymes share the Rossmann fold NAD-binding motif and characteristic NAD-binding and catalytic sequence patterns.
[00216] Without intending to be bound by any theory or mechanism of action, these enzymes are thought to catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human prostaglandin dehydrogenase (PGDH) numbering). In addition to the Tyr and Lys, there is often an upstream Ser (Ser-138, PGDH numbering) and/or an Asn (Asn-107, PGDH numbering) contributing to the active site, while substrate binding is in the C-terminal region, which determines specificity.
[00217] Without intending to be bound by any theory or mechanism of action, the standard reaction mechanism is thought to be a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide.
Extended SDRs have additional elements in the C-terminal region, and typically have a TGXXGXXG cofactor binding motif. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase can have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases can have a TGXXXGX(1-2)G
NAD(P)-binding motif. Some atypical SDRs are thought to have lost catalytic activity and/or have an unusual NAD(P)-binding motif and missing or unusual active site residues.
Reactions catalyzed within the SDR family can include isomerization, decarboxylation, epimerization, C=N bond reduction, dehydratase activity, dehalogenation, enoyl-CoA reduction, and carbonyl-alcohol oxidoreduction.
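By way of example, the SDR sequence motifs named above can be converted to regular expressions and used to screen candidate polypeptide sequences. The sketch below is illustrative only: the names `SDR_MOTIFS`, `motif_to_regex` and `find_motifs` are hypothetical, X is assumed to mean any single residue, X(1-2) one or two residues, and the example sequence is invented for demonstration rather than taken from any sequence listed herein.

```python
import re

# Motifs as written in the text; X = any residue, X(1-2) = one or two residues.
SDR_MOTIFS = {
    "classical cofactor binding": "TGXXX[AG]XG",
    "classical active site": "YXXXK",
    "extended cofactor binding": "TGXXGXXG",
    "complex NAD(P) binding": "GGXGXXG",
    "complex active site": "YXXXN",
    "fungal ketoacyl reductase NAD(P) binding": "TGXXXGX(1-2)G",
}


def motif_to_regex(motif: str) -> str:
    """Translate the X-based motif notation into a regular expression."""
    pattern = re.sub(r"X\((\d+)-(\d+)\)", r".{\1,\2}", motif)
    return pattern.replace("X", ".")


def find_motifs(sequence: str) -> dict:
    """Return {motif name: [(1-based start, matched text), ...]} for motifs that occur."""
    hits = {}
    for name, motif in SDR_MOTIFS.items():
        regex = re.compile(motif_to_regex(motif))
        found = [(m.start() + 1, m.group()) for m in regex.finditer(sequence.upper())]
        if found:
            hits[name] = found
    return hits


# Hypothetical sequence fragment, invented for this example only.
EXAMPLE = "MKVALVTGASSGIGLATARAYSASKHAL"

for name, found in find_motifs(EXAMPLE).items():
    print(name, found)
```

Because several of the motifs overlap in what they accept, a single region of a sequence can match more than one pattern; a match is therefore a hint for further analysis rather than a classification.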

[00218] Table 12.
# | Species | GenBank Accession Number | Identity/Similarity to S. paucimobilis LigD polypeptide (%)
1 | N. aromaticivorans | YP_495487.1 | 78/88
2 | N. aromaticivorans | YP_496072.1 | 39/58
3 | N. aromaticivorans | YP_496073.1 | 39/59
4 | N. aromaticivorans | YP_495984.1 | 35/56
5 | N. aromaticivorans | YP_497149.1 | 38/58
[00219] The nucleotide and amino acid sequences in Table 12 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.
[00220] Table 13 lists potential LigG (glutathione-eliminating)-like enzyme sequences for use in conjunction with beta etherases including, but not limited to, LigE/F.
The sequences were identified using bioinformatic methods, such as those taught herein.
These might be utilized in conjunction with C-alpha dehydrogenases, and/or with LigE/F-like beta-etherases.
The LigG-like proteins are annotated in the CDD as glutathione S-transferase (GST)-like proteins with similarity to the GST C family, the GST-N family, and the thioredoxin (TRX)-like superfamily of proteins containing a TRX fold.
[00221] Table 13.
# | Species | GenBank Accession Number | Identity/Similarity to S. paucimobilis LigG polypeptide (%)
1 | N. aromaticivorans | YP_498160.1 | 23/41
2 | A. vinelandii DJ | YP_002798340 | 32/50
[00222] The nucleotide and amino acid sequences in Table 13 are incorporated herein by reference in their entirety through the GenBank Accession Numbers.

[00223] This example describes the creation of a novel recombinant microbial system for the conversion of lignin oligomers to monomers. Azotobacter vinelandii strain DJ, for example, might be transformed with beta-etherase encoding genes from N. aromaticivorans with the objective of creating a lignin phenolics-tolerant A. vinelandii strain capable of converting lignin oligomers to monomers at high yields in industrial processes.
Table 14 lists additional A. vinelandii strains that might be used as host strains for beta-etherase gene expression, for example, by their strain designation and American Type Culture Collection (ATCC) number.
[00224] Table 14.
# | Strain Designation | ATCC Number | # | Strain Designation | ATCC Number | # | Strain Designation | ATCC Number
1 | Wisconsin 0 | 12518 | 8 | Ad116 | 17962 | 14 | B-6 | 7489
2 | 3a | 12837 | 9 | NRS 16 | 25308 | 15 | B-9 | 7492
7 | 135 [VKM B-547] |
[00225] The heterologous production of beta etherases, C alpha-dehydrogenases, and other enzymes for the production of lignin monomers and aromatic products in A. vinelandii might be achieved using the expression plasmid system described herein. The broad host range multicopy plasmid pKT230 (ATCC) encoding streptomycin resistance might be used for gene cloning. Genes can be synthesized by methods described above, and cloned into the SmaI site of pKT230. The nifH promoter from A. vinelandii strain BAA 1303 DJ can be used to control gene expression.
[00226] A. vinelandii strain BAA 1303 DJ might be transformed with pKT230 derivatives using electroporation of electrocompetent cells (Eppendorf method), or by incubation of plasmid DNA with chemically competent cells prepared in TF medium (1.9718 g of MgSO4, 0.0136 g of CaSO4, 1.1 g of CH3COONH4, 10 g of glucose, 0.25 g of KH2PO4, and 0.55 g of K2HPO4 per liter). Transformants might be selected by screening for resistance to streptomycin. Gene expression might be induced by cell growth under nitrogen-free Burk's medium (0.2 g of MgSO4, 0.1 g of CaSO4, 0.5 g of yeast extract, 20 g of sucrose, 0.8 g of K2HPO4, and 0.2 g of KH2PO4, with trace amounts of FeCl3 and Na2MoO4, per liter).
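As a simple worked example, the per-liter Burk's medium composition given above can be scaled to a different culture volume. The sketch below is illustrative: the names `BURKS_MEDIUM_G_PER_L` and `scale_recipe` are hypothetical, and the trace components FeCl3 and Na2MoO4 are left out because the text does not state their amounts.

```python
# Grams per liter, as listed in the text for nitrogen-free Burk's medium.
# Trace FeCl3 and Na2MoO4 are omitted because no amounts are specified.
BURKS_MEDIUM_G_PER_L = {
    "MgSO4": 0.2,
    "CaSO4": 0.1,
    "yeast extract": 0.5,
    "sucrose": 20.0,
    "K2HPO4": 0.8,
    "KH2PO4": 0.2,
}


def scale_recipe(recipe_g_per_l, volume_l):
    """Return grams of each component needed for `volume_l` liters of medium."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: round(grams * volume_l, 4) for name, grams in recipe_g_per_l.items()}


# Example: a 250 mL shake-flask culture.
for component, grams in scale_recipe(BURKS_MEDIUM_G_PER_L, 0.25).items():
    print(f"{component}: {grams} g")
```

The same helper could be applied to the TF medium composition by supplying its per-liter amounts in place of the Burk's medium values.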
[00227] The biochemical activity of a newly-discovered beta-etherase enzyme functionally expressed in A. vinelandii strain BAA 1303 DJ can be tested using methods known to one of skill, such as the methods provided herein. Biochemical activity assays for beta-etherase function, and for total protein might be performed as described herein.

[00228] This example describes the design and use of recombinant Azotobacter strains heterologously expressing enzymes for the production of high value aromatic compounds from lignin core structures. Table 15 lists a few examples of aromatic compounds that might be produced by the microbial platforms described herein.

[00229] Table 15.
Chemical Product | Market Volume (metric ton/year) | Market Value ($/lb) | Uses
Catechol | 30x10^3 | 2.34 | Antioxidant: 4-tert-butylcatechol. Flavors: piperonal; veratrol. Insecticides: carbofuran; propoxur.
Vanillin | 20x10^3 | 6.12 | Flavor agent. Precursor for pharmaceutical methyldopa.
2,4-Diaminotoluene | 3x10^6 | 1.65 | Precursor to toluene diisocyanates for urethane polymers.
Salicylic acid | 1.6x10^3 (US) | 3.92 | Precursor to analgesic drug acetylsalicylic acid. Precursor to fragrances: amyl and methyl esters of salicylic acid.
Aminosalicylic acid | | 57.38 | Tuberculosis drug.
ortho-Cresol | 38x10^3 | 0.8 | Precursors to herbicides: 4-chloro-2-methylphenoxyacetic acid; 2-(4-chloro-2-methylphenoxy)-propionic acid.
[00230] One example of a microbial process to a commercial aromatic compound might be the production of catechol from lignin-derived phenolic compounds. Catechol might be produced from guaiacol using an A. vinelandii or A. chroococcum strain engineered with enzymes including beta-etherases and demethylases, or demethylase enzymes alone.

Azotobacter strains might be engineered to express the heterologous enzymes by the methods described herein.
[00231] FIG. 7 illustrates an example of a biochemical process for the production of catechol from lignin oligomers, according to some embodiments. The biochemical processes leading to aromatic products such as catechol might be designed as 3 unit operations described below:
[00232] i) Fractionation of soluble lignin - Concentration or partial purification of soluble biorefinery lignin fractions or phenolic streams using methods known to one of skill.
[00233] ii) Biotransformation - The biotransformation of the phenolic substrate stream might be carried out in a fed-batch bioprocess using Azotobacter strains engineered to specifically and optimally convert specific lignin-derived phenolic substrates to the final product, such as catechol. Corn steep liquor might be used as the base medium in the biotransformations. The phenolic stream might be introduced in fed-batch mode, at concentrations that will be tolerated by the strains.
[00234] iii) Product separation - The product, such as catechol, might be purified from the aqueous culture broths using standard chemical separation methods such as liquid-liquid extractions (LLE) with solvents of varying polarities applied in a sequential manner.
[00235] Additional examples of designed biochemical routes to aromatic products are described below:
[00236] i) Lignin-derived syringic acid might be converted to gallic acid via a 2-step biochemical conversion using aryl aldehyde oxidases and demethylases.
[00237] ii) Lignin-derived vanillin might be converted to protocatechuic acid via a 2-step biochemical conversion using aryl aldehyde oxidases and demethylases.
[00238] iii) Lignin-derived vanillin might be converted to catechol via a 3-step biochemical conversion using aryl aldehyde oxidases, aromatic decarboxylases, and demethylases.
[00239] iv) Lignin-derived 2-methoxytoluene might be converted to the urethane precursor 2,4-diaminotoluene via a 4-step biochemical conversion using demethylases, ferulate-5-hydroxylases, 2,4-nitrophenol oxidoreductases, and 2,4-nitrobenzene reductases.

[00240] In each case, the specific enzymes might be engineered into A. vinelandii or A. chroococcum strains, for example, and the process might be performed using unit operations similar to those described herein for the biochemical production of catechol.
[00241] FIG. 8 illustrates an example of a biochemical process for the production of vanillin from lignin oligomers, according to some embodiments. Vanillin can be used as a flavoring agent, and as a precursor for pharmaceuticals such as methyldopa.
Synthetic vanillin, for example, can be produced from petroleum-derived guaiacol by reaction with glyoxylic acid. Vanillin, however, can also be produced from lignin-derived beta-hydroxypropiovanillone (beta-HPV) according to the process scheme indicated in FIG. 8. A 2-step biochemical route to vanillin from beta-HPV can be achieved using the enzymes 2,4-dihydroxyacetophenone oxidoreductase, and vanillin dehydrogenase or carboxylic acid reductases, engineered into A. vinelandii.
[00242] FIG. 9 illustrates an example of a biochemical process for the production of 2,4-diaminotoluene from lignin oligomers, according to some embodiments. Toluene diisocyanate (TDI) can be used in the manufacture of polyurethanes. For example, 2,4-diaminotoluene (2,4-DAT) is the key precursor to TDI. Diaminotoluenes can be produced industrially by the sequential nitration of toluene with nitric acid, followed by the reduction of the dinitrotoluenes to the corresponding diaminotoluenes. Both nitration and reduction reactions yield mixtures of toluene isomers from which the 2,4-DAT isomer is purified by distillation. The conversion of lignin-derived 2-methoxytoluene to 2,4-DAT can be achieved according to the process scheme outlined in FIG. 9. 2-Methoxytoluene can be converted to 2,4-DAT by A. vinelandii engineered with 4 enzymes to specifically demethylate, hydroxylate, nitrate and aminate methoxytoluene.
[00243] FIG. 10 illustrates process schemes for additional product targets that include ortho-cresol, salicylic acid, and aminosalicylic acid, for the production of valuable chemicals from lignin oligomers, according to some embodiments. These chemicals, as with the others, have traditionally been obtained from the problematic petrochemical processes. A
few of the process schemes for producing these chemicals using the teachings herein, based on guaiacol or 2-methoxytoluene, are shown schematically in FIG. 10.
Designed biochemical routes, combined with the remarkable phenolics-tolerance traits of Azotobacter strains are proposed for conversions of lignin structures to industrial and fine chemicals.

[00244] This example describes potential LigE-, LigF-, LigG-, and LigD-type polypeptides, and the genes encoding them. The potential polypeptides were identified using bioinformatic methods, such as those taught herein.
[00245] As described above, the query sequences in the initial pass for the LigE-type and LigF-type were Sphingomonas paucimobilis sequences, such as those discussed in Masai, E., et al. Likewise, the query sequences for the LigG-type and LigD-type were also Sphingomonas paucimobilis sequences, such as those discussed in Masai. The following sequences were used in the initial pass for all queries:
[00246] LigE, from Accession No BAA02032.1, is listed herein as SEQ ID NO:1 for the protein and SEQ ID NO:2 for the gene.
[00247] LigF, from Accession No BAA02031.1 (P30347.1), is listed herein as SEQ ID
NO:513 for the protein and SEQ ID NO:514 for the gene.
[00248] LigG, from Accession No Q9Z339.2, is listed herein as SEQ ID NO:733 for the protein and SEQ ID NO:734 for the gene.
[00249] LigD, from Accession No Q01198.1, is listed herein as SEQ ID NO:777 for the protein and SEQ ID NO:778 for the gene.
[00250] The following sequences were used in a modified query to further refine the LigE-type and LigF-type, and the query sequences were the LigE-1 and LigF-2 that showed the surprising and unexpected results shown in FIG. 4:
[00251] LigE-1, from Accession No ABD26841.1, is listed herein as SEQ ID
NO:101 for the protein and SEQ ID NO:102 for the gene.
[00252] LigF-2, from Accession No ABD27301.1, is listed herein as SEQ ID
NO:541 for the protein and SEQ ID NO:542 for the gene.
[00253] Table 16 lists SEQ ID NOs:1-246, which are potential protein sequences of the LigE-type, as well as a respective gene sequence encoding the protein. Table 17 lists SEQ
ID NOs:247-576, which are potential protein sequences of the LigF-type, as well as a respective gene sequence encoding the protein. Table 18 lists SEQ ID NOs:577-776, which are potential protein sequences of the LigG-type, as well as a respective gene sequence encoding the protein. Table 19 lists SEQ ID NOs: 777-976, which are potential protein sequences of the LigD-type, as well as a respective gene sequence encoding the protein.
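For convenience, the SEQ ID NO ranges recited above can be captured in a small lookup that reports the polypeptide type and table for a given sequence number. The sketch below is illustrative: the names `SEQ_ID_RANGES` and `classify_seq_id` are hypothetical, and the rule that odd numbers denote proteins and even numbers denote the encoding genes is an assumption inferred from the pairs recited herein (for example, SEQ ID NO:101 and SEQ ID NO:102 for LigE-1).

```python
# (first SEQ ID NO, last SEQ ID NO, polypeptide type, table), as recited in the text.
SEQ_ID_RANGES = [
    (1, 246, "LigE", "Table 16"),
    (247, 576, "LigF", "Table 17"),
    (577, 776, "LigG", "Table 18"),
    (777, 976, "LigD", "Table 19"),
]


def classify_seq_id(seq_id: int):
    """Return (type, 'protein' or 'gene', table) for a SEQ ID NO, or None if out of range."""
    for first, last, family, table in SEQ_ID_RANGES:
        if first <= seq_id <= last:
            # Assumption: odd numbers are proteins, even numbers the encoding genes.
            kind = "protein" if seq_id % 2 == 1 else "gene"
            return family, kind, table
    return None


for seq_id in (1, 102, 541, 734, 777):
    print(seq_id, classify_seq_id(seq_id))
```

Sequence numbers outside the four recited ranges, such as those referenced in Tables 8 and 9, return None in this sketch.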
[00254] Bioinformatic methods, such as those described herein, can be used to suggest an efficient order of experimentation to identify additional potential enzymes for use with the teachings provided herein. Moreover, mutations and amino acid substitutions can be used to test effects on enzyme activity to further understand the structure of the most active proteins with respect to the enzyme functions sought by teachings provided herein.

[00255] Table 16.

PROTEIN SEQ ID NO: | GENE SEQ ID NO: | GENBANK ACCESSION NO: | DESCRIPTION | TYPE
1 | 2 | BAA02032.1 | Sphingomonas paucimobilis | LIGE
3 | 4 | BAJ11989.1 | beta-etherase [Sphingobium sp. SYK-6] | LIGE
5 | 6 | EFV85608.1 | glutathione S-transferase domain-containing protein [Achromobacter xylosoxidans C54] | LIGE
7 | 8 | EFW42705.1 | predicted protein [Capsaspora owczarzaki ATCC | LIGE
9 | 10 | EGE55257.1 | Glutathione S-transferase domain-containing protein [Rhizobium etli CNPAF512] | LIGE
11 | 12 | EGP48556.1 | glutathione S-transferase domain-containing protein [Achromobacter xylosoxidans AXX-A] | LIGE
13 | 14 | EGP57475.1 | lignin degradation protein [Agrobacterium | LIGE
15 | 16 | EGU12703.1 | Glutathione S-transferase [Rhodotorula glutinis ATCC 204091] | LIGE
17 | 18 | EGU56510.1 | glutathione S-transferase domain-containing protein [Vibrio tubiashii ATCC 19109] | LIGE
19 | 20 | NP_053324.1 | hypothetical protein pTi-SAKURA_p086 [Agrobacterium tumefaciens] >dbj|BAA87709.1| tiorf84 [Agrobacterium tumefaciens] | LIGE
21 | 22 | NP_108131.1 | lignin beta-ether hydrolase [Mesorhizobium loti MAFF303099] >dbj|BAB54276.1| lignin beta-ether hydrolase [Mesorhizobium loti | LIGE
23 | 24 | NP_354140.2 | lignin degradation protein [Agrobacterium tumefaciens str. C58] >gb|AAK86925.2| lignin degradation protein [Agrobacterium tumefaciens | LIGE
25 | 26 | NP_385269.1 | putative BETA-etherase (BETA-aryl ether cleaving enzyme) protein [Sinorhizobium meliloti 1021] >emb|CAC45742.1| Putative beta-etherase (beta-aryl ether cleaving enzyme) protein [Sinorhizobium meliloti 1021] >gb|AEG03720.1| Glutathione S-transferase domain protein [Sinorhizobium meliloti BL225C] >gb|AEH79753.1| putative BETA-etherase | LIGE
27 | 28 | NP_774067.1 | ligninase [Bradyrhizobium japonicum USDA 110] >dbj|BAC52692.1| ligE [Bradyrhizobium japonicum USDA 110] | LIGE
29 | 30 | NP_949676.1 | putative lignin beta-ether hydrolase [Rhodopseudomonas palustris CGA009] >emb|CAE29781.1| putative lignin beta-ether | LIGE
31 | 32 | P27457.3 | RecName: Full=Beta-etherase; AltName: Full=Beta-aryl ether cleaving enzyme >gb|AAA25878.1| beta-etherase [Sphingomonas paucimobilis] >dbj|BAA02032.1| beta-etherase | LIGE
33 | 34 | P_003028922. | hypothetical protein SCHCODRAFT_85860 [Schizophyllum commune H4-8] >gb|EFI94019.1| hypothetical protein | LIGE
35 | 36 | P_003030384. | hypothetical protein SCHCODRAFT_57691 [Schizophyllum commune H4-8] >gb|EFI95481.1| hypothetical protein | LIGE
37 | 38 | P_003033715. | hypothetical protein SCHCODRAFT_81614 [Schizophyllum commune H4-8] >gb|EFI98812.1| hypothetical protein | LIGE
39 | 40 | P_003041213. | hypothetical protein NECHADRAFT_55532 [Nectria haematococca mpVI 77-13-4] >gb|EEU35500.1| hypothetical protein NECHADRAFT_55532 [Nectria haematococca | LIGE
41 | 42 | XP_382462.1 | hypothetical protein FG02286.1 [Gibberella zeae | LIGE
43 | 44 | P001207860. | putative glutathione S-transferase (GST) [Bradyrhizobium sp. ORS278] >emb|CAL79645.1| putative glutathione S- | LIGE
45 | 46 | P_001236206. | glutathione S-transferase domain-containing protein [Acidiphilium cryptum JF-5] >gb|ABQ32287.1| Glutathione S-transferase, N-terminal domain protein [Acidiphilium cryptum JF- | LIGE
47 | 48 | P_001237901. | putative glutathione S-transferase [Bradyrhizobium sp. BTAi1] >gb|ABQ33995.1| putative glutathione S-transferase (GST) | LIGE
49 | 50 | YP_001262153. | hypothetical protein Swit_1652 [Sphingomonas wittichii RW1] >gb|ABQ68015.1| hypothetical protein Swit_1652 [Sphingomonas wittichii RW1] | LIGE
51 | 52 | YP_001326465. | glutathione S-transferase domain-containing protein [Sinorhizobium medicae WSM419] >gb|ABR59630.1| Glutathione S-transferase domain [Sinorhizobium medicae WSM419] | LIGE
53 | 54 | YP_001413220. | glutathione S-transferase domain-containing protein [Parvibaculum lavamentivorans DS-1] >gb|ABS63563.1| Glutathione S-transferase domain [Parvibaculum lavamentivorans DS-1] | LIGE
55 | 56 | YP_001526182. | glutathione S-transferase [Azorhizobium caulinodans ORS 571] >dbj|BAF89264.1| glutathione S-transferase [Azorhizobium | LIGE
57 | 58 | P_001616516. | lignin degradation protein [Sorangium cellulosum 'So ce 56'] >emb|CAN96036.1| lignin degradation protein [Sorangium cellulosum 'So | LIGE
59 | 60 | YP_001772944. | glutathione S-transferase domain-containing protein [Methylobacterium sp. 4-46] >gb|ACA20610.1| Glutathione S-transferase | LIGE
61 | 62 | P_001833458. | glutathione S-transferase domain-containing protein [Beijerinckia indica subsp. indica ATCC 9039] >gb|ACB95969.1| Glutathione S-transferase domain [Beijerinckia indica subsp. | LIGE
63 | 64 | P_001977695. | beta-aryl ether cleaving enzyme, lignin degradation protein [Rhizobium etli CIAT 652] >gb|ACE90517.1| beta-aryl ether cleaving enzyme, lignin degradation protein [Rhizobium | LIGE
65 | 66 | YP_001993784. | glutathione S-transferase domain-containing protein [Rhodopseudomonas palustris TIE-1] >gb|ACF03309.1| Glutathione S-transferase domain [Rhodopseudomonas palustris TIE-1] | LIGE
67 | 68 | P_002280598. | glutathione S-transferase domain [Rhizobium leguminosarum bv. trifolii WSM2304] >gb|ACI54372.1| Glutathione S-transferase domain [Rhizobium leguminosarum bv. trifolii | LIGE
69 | 70 | P_002290149. | glutathione S-transferase [Oligotropha carboxidovorans OM5] >ref|YP_004631892.1| beta etherase [Oligotropha carboxidovorans OM5] >gb|ACI94284.1| glutathione S-transferase [Oligotropha carboxidovorans OM5] >gb|AEI02075.1| putative beta etherase [Oligotropha carboxidovorans OM4] | LIGE
71 | 72 | P_002362903. | glutathione S-transferase domain-containing protein [Methylocella silvestris BL2] >gb|ACK51541.1| glutathione S-transferase | LIGE
73 | 74 | P_002502105. | glutathione S-transferase domain-containing protein [Methylobacterium nodulans ORS 2060] >gb|ACL61802.1| Glutathione S-transferase domain protein [Methylobacterium nodulans | LIGE
75 76 P_002549116. lignin degradation protein [Agrobacterium vitis S4] >gb|ACM36110.1| lignin degradation protein [Agrobacterium vitis S4] LIGE
77 78 P_002797805. glutathione S-transferase-like protein [Azotobacter vinelandii DJ] >gb|ACO76830.1| Glutathione S-transferase-like protein LIGE
79 80 P_002825455. putative lignin beta-ether hydrolase [Sinorhizobium fredii NGR234] >gb|ACP24702.1| putative lignin beta-ether LIGE
81 82 P_002975056. glutathione S-transferase domain protein [Rhizobium leguminosarum bv. trifolii WSM1325] >gb|ACS55517.1| Glutathione S-transferase domain protein [Rhizobium leguminosarum bv. LIGE
83 84 P_004278359. lignin degradation protein [Agrobacterium sp. H13-3] >gb|ADY64039.1| lignin degradation protein [Agrobacterium sp. H13-3] LIGE
85 86 P_004285673. putative beta-etherase [Acidiphilium multivorum AIU301] >dbj|BAJ82791.1| putative beta-etherase [Acidiphilium multivorum AIU301] LIGE
87 88 P_004378290. glutathione S-transferase-like protein [Pseudomonas mendocina NK-01] >gb|AEB56538.1| glutathione S-transferase-like LIGE
89 90 P_004533906. glutathione S-transferase-like protein [Novosphingobium sp. PP1Y] >emb|CCA92088.1| glutathione S-transferase- LIGE
91 92 P_004548326. glutathione S-transferase domain-containing protein [Sinorhizobium meliloti AK83] >gb|AEG52712.1| Glutathione S-transferase LIGE
93 94 P_004613710. glutathione S-transferase domain-containing protein [Mesorhizobium opportunistum WSM2075] >gb|AEH89616.1| Glutathione S-transferase domain protein [Mesorhizobium LIGE
95 96 YP_269568.1 putative lignin beta-etherase [Colwellia psychrerythraea 34H] >gb|AAZ24120.1| putative lignin beta-etherase [Colwellia psychrerythraea LIGE
97 98 YP_469001.1 beta-aryl ether cleaving enzyme, lignin degradation protein [Rhizobium etli CFN 42] >gb|ABC90274.1| beta-aryl ether cleaving enzyme, lignin degradation protein [Rhizobium LIGE
99 100 YP_487746.1 glutathione S-transferase-like protein [Rhodopseudomonas palustris HaA2] >gb|ABD08835.1| Glutathione S-transferase-like LIGE
101 102 YP_497675.1 glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] >gb|ABD26841.1| glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM LIGE
103 104 YP_533979.1 glutathione S-transferase-like protein [Rhodopseudomonas palustris BisB18] >gb|ABD89660.1| glutathione S-transferase-like LIGE
105 106 YP_574731.1 glutathione S-transferase-like protein [Chromohalobacter salexigens DSM 3043] >gb|ABE60032.1| glutathione S-transferase-like protein [Chromohalobacter salexigens DSM LIGE
107 108 YP_723508.1 glutathione S-transferase-like protein [Trichodesmium erythraeum IMS101] >gb|ABG53035.1| glutathione S-transferase-like LIGE
109 110 YP_767183.1 etherase [Rhizobium leguminosarum bv. viciae 3841] >emb|CAK07074.1| putative etherase [Rhizobium leguminosarum bv. viciae 3841] LIGE
111 112 YP_783091.1 glutathione S-transferase [Rhodopseudomonas palustris BisA53] >gb|ABJ08111.1| Glutathione S-transferase [Rhodopseudomonas palustris LIGE
113 114 YP_915395.1 glutathione S-transferase domain-containing protein [Paracoccus denitrificans PD1222] >gb|ABL69699.1| Glutathione S-transferase, N-terminal domain [Paracoccus denitrificans LIGE
115 116 ZP_02146530.1 putative beta-etherase (beta-aryl ether cleaving enzyme) protein [Phaeobacter gallaeciensis BS107] >gb|EDQ11875.1| putative beta-etherase (beta-aryl ether cleaving enzyme) LIGE
117 118 ZP_02149699.1 putative beta-etherase (beta-aryl ether cleaving enzyme) protein [Phaeobacter gallaeciensis 2.10] >gb|EDQ08644.1| putative beta-etherase (beta-aryl ether cleaving enzyme) protein LIGE
119 120 ZP_02166231.1 putative beta-etherase (beta-aryl ether cleaving enzyme) protein [Hoeflea phototrophica DFL-43] >gb|EDQ33834.1| putative beta-etherase (beta-aryl ether cleaving enzyme) protein [Hoeflea LIGE
121 122 ZP_02190934. glutathione S-transferase-like protein [alpha proteobacterium BAL199] >gb|EDP62276.1| glutathione S-transferase-like protein [alpha LIGE
123 124 ZP_03503368. Glutathione S-transferase domain [Rhizobium LIGE
125 126 ZP_03507162. Glutathione S-transferase domain [Rhizobium LIGE
127 128 ZP_03513891. Glutathione S-transferase domain [Rhizobium LIGE
129 130 ZP_03519388. Glutathione S-transferase domain [Rhizobium LIGE
131 132 ZP_03520502. putative etherase [Rhizobium etli GR56] LIGE
133 134 ZP_05084767. glutathione S-transferase, N-terminal domain [Pseudovibrio sp. JE062] >gb|EEA94709.1| glutathione S-transferase, N-terminal domain LIGE
135 136 ZP_06688746. lignin degradation protein [Achromobacter piechaudii ATCC 43553] >gb|EFF74366.1| lignin degradation protein [Achromobacter piechaudii LIGE
137 138 ZP_06898146. glutathione S-transferase family protein [Roseomonas cervicalis ATCC 49957] >gb|EFH10151.1| glutathione S-transferase family protein [Roseomonas cervicalis ATCC LIGE
139 140 ZP_07027473. Glutathione S-transferase domain protein [Afipia sp. 1NLS2] >gb|EFI51229.1| Glutathione S-transferase domain protein [Afipia sp. 1NLS2] LIGE
141 142 ZP_07373940. beta-etherase [Ahrensia sp. R2A130] >gb|EFL90585.1| beta-etherase [Ahrensia sp. LIGE
143 144 ZP_08328512. Glutathione S-transferase [gamma proteobacterium IMCC1989] >gb|EGG95341.1| Glutathione S-transferase [gamma LIGE
145 146 ZP_08529965. lignin degradation protein [Agrobacterium sp. ATCC 31749] >gb|EGL63395.1| lignin degradation protein [Agrobacterium sp. ATCC LIGE
147 148 ZP_08627134. lignin beta-ether hydrolase [Bradyrhizobiaceae bacterium SG-6C] >gb|EGP10168.1| lignin beta-ether hydrolase [Bradyrhizobiaceae bacterium LIGE
149 150 ZP_08631370. Glutathione S-transferase domain-containing protein [Acidiphilium sp. PM] >gb|EGO96849.1| Glutathione S-transferase domain-containing LIGE
151 152 ZP_08634908. Glutathione S-transferase domain-containing protein [Acidiphilium sp. PM] >gb|EGO93307.1| Glutathione S-transferase domain-containing LIGE
153 154 ZP_08635074.1 glutathione S-transferase domain-containing protein [Halomonas sp. TD01] >gb|EGP21558.1| glutathione S-transferase domain-containing LIGE
155 156 EGN93792.1 hypothetical protein SERLA73DRAFT_115219 [Serpula lacrymans var. lacrymans S7.3] >gb|EGO19163.1| hypothetical protein SERLADRAFT_453680 [Serpula lacrymans var. LIGE
157 158 EGN94392.1 hypothetical protein SERLA73DRAFT_188253 [Serpula lacrymans var. lacrymans S7.3] >gb|EGO19875.1| hypothetical protein SERLADRAFT_478300 [Serpula lacrymans var. LIGE
159 160 EGN96317.1 hypothetical protein SERLA73DRAFT_186005 [Serpula lacrymans var. lacrymans S7.3] >gb|EGO21854.1| hypothetical protein SERLADRAFT_474829 [Serpula lacrymans var. LIGE
161 162 EGN96924.1 hypothetical protein SERLA73DRAFT_185168 [Serpula lacrymans var. lacrymans S7.3] >gb|EGO22516.1| hypothetical protein SERLADRAFT_473468 [Serpula lacrymans var. LIGE
163 164 EGO00367.1 hypothetical protein SERLA73DRAFT_107446 [Serpula lacrymans var. lacrymans S7.3] >gb|EGO25928.1| hypothetical protein SERLADRAFT_415302 [Serpula lacrymans var. LIGE
165 166 P_001215222. conserved hypothetical protein [Aspergillus terreus NIH2624] >gb|EAU33805.1| conserved hypothetical protein [Aspergillus terreus LIGE
167 168 P_001823934. hypothetical protein AOR_1_322094 [Aspergillus oryzae RIB40] >dbj|BAE62801.1| unnamed protein product [Aspergillus oryzae RIB40] LIGE
169 170 P_001839188. hypothetical protein CC1G_07903 [Coprinopsis cinerea okayama7#130] >gb|EAU82621.1| hypothetical protein CC1G_07903 [Coprinopsis LIGE
171 172 P_001885678. predicted protein [Laccaria bicolor S238N-H82] >gb|EDR03530.1| predicted protein [Laccaria bicolor S238N-H82] LIGE
173 174 P_002152364. conserved hypothetical protein [Penicillium marneffei ATCC 18224] >gb|EEA19427.1| conserved hypothetical protein [Penicillium LIGE
175 176 P_002380998. conserved hypothetical protein [Aspergillus flavus NRRL3357] >gb|EED49097.1| conserved hypothetical protein [Aspergillus flavus LIGE
177 178 P_002392962. hypothetical protein MPER_07394 [Moniliophthora perniciosa FA553] >gb|EEB93892.1| hypothetical protein LIGE
179 180 P_002468854. predicted protein [Postia placenta Mad-698-R] >gb|EED86077.1| predicted protein [Postia placenta Mad-698-R] LIGE
181 182 P_002472522. predicted protein [Postia placenta Mad-698-R] >gb|EED82308.1| predicted protein [Postia placenta Mad-698-R] LIGE
183 184 P_002557398. Pc12g05530 [Penicillium chrysogenum Wisconsin 54-1255] >emb|CAP80180.1| Pc12g05530 [Penicillium chrysogenum LIGE
185 186 P_003026159. hypothetical protein SCHCODRAFT_12387 [Schizophyllum commune H4-8] >gb|EFI91256.1| hypothetical protein LIGE
187 188 P_003028923. hypothetical protein SCHCODRAFT_111982 [Schizophyllum commune H4-8] >gb|EFI94020.1| hypothetical protein LIGE
189 190 P_003890246. Glutathione S-transferase domain-containing protein [Cyanothece sp. PCC 7822] >gb|ADN16971.1| Glutathione S-transferase LIGE
191 192 P_003896657. glutathione S-transferase-like [Halomonas elongata DSM 2581] >emb|CBV41472.1| glutathione S-transferase-like [Halomonas LIGE
193 194 P_003980382. glutathione S-transferase [Achromobacter xylosoxidans A8] >gb|ADP17667.1| glutathione S-transferase, N-terminal domain protein LIGE
195 196 P_004110838. glutathione S-transferase domain-containing protein [Rhodopseudomonas palustris DX-1] >gb|ADU46105.1| Glutathione S-transferase domain [Rhodopseudomonas palustris DX-1] LIGE
197 198 P_004143867. glutathione S-transferase [Mesorhizobium ciceri biovar biserrulae WSM1271] >gb|ADV13817.1| Glutathione S-transferase domain [Mesorhizobium ciceri biovar biserrulae LIGE
199 200 ZP_01102591. conserved hypothetical protein [Congregibacter litoralis KT71] >gb|EAQ98305.1| conserved hypothetical protein [Congregibacter litoralis LIGE
201 202 AAA87183.1 auxin-induced protein [Vigna radiata] LIGE
203 204 AAG34797.1 glutathione S-transferase GST 7 [Glycine max] LIGE
205 206 AAO69664.1 glutathione S-transferase [Phaseolus acutifolius] LIGE
207 208 ACU24385.1 unknown [Glycine max] LIGE
209 210 ADP99065.1 glutathione S-transferase [Marinobacter LIGE
211 212 ADY82158.1 putative glutathione S-transferase [Acinetobacter calcoaceticus PHEA-2] LIGE
213 214 BAA77215.1 beta-etherase [Sphingomonas paucimobilis] LIGE
215 216 P_001839584. hypothetical protein CC1G_12612 [Coprinopsis cinerea okayama7#130] >gb|EAU82225.1| hypothetical protein CC1G_12612 [Coprinopsis LIGE
217 218 P_002336443. predicted protein [Populus trichocarpa] >gb|EEE73479.1| predicted protein [Populus LIGE
219 220 P_003028624. hypothetical protein SCHCODRAFT_59314 [Schizophyllum commune H4-8] >gb|EFI93721.1| hypothetical protein LIGE
221 222 XP_456365.1 DEHA2A00660p [Debaryomyces hansenii CBS767] >emb|CAG84310.1| DEHA2A00660p [Debaryomyces hansenii] LIGE
223 224 XP_572781.1 hypothetical protein [Cryptococcus neoformans var. neoformans JEC21] >ref|XP_773999.1| hypothetical protein CNBH0460 [Cryptococcus neoformans var. neoformans B-3501A] >gb|EAL19352.1| hypothetical protein CNBH0460 [Cryptococcus neoformans var. neoformans B-3501A] >gb|AAW45474.1| LIGE
225 226 P_001236206. glutathione S-transferase domain-containing protein [Acidiphilium cryptum JF-5] >gb|ABQ32287.1| Glutathione S-transferase, N-terminal domain protein [Acidiphilium cryptum JF- LIGE
227 228 P_001237901. putative glutathione S-transferase [Bradyrhizobium sp. BTAi1] >gb|ABQ33995.1| putative glutathione S-transferase (GST) LIGE
229 230 001262153. hypothetical protein Swit_1652 [Sphingomonas wittichii RW1] >gb|ABQ68015.1| hypothetical protein Swit_1652 [Sphingomonas wittichii RW1] LIGE
231 232 P_001326465. glutathione S-transferase domain-containing protein [Sinorhizobium medicae WSM419] >gb|ABR59630.1| Glutathione S-transferase domain [Sinorhizobium medicae WSM419] LIGE
233 234 P_001413220. glutathione S-transferase domain-containing protein [Parvibaculum lavamentivorans DS-1] >gb|ABS63563.1| Glutathione S-transferase domain [Parvibaculum lavamentivorans DS-1] LIGE
235 236 P_001526182. glutathione S-transferase [Azorhizobium caulinodans ORS 571] >dbj|BAF89264.1| glutathione S-transferase [Azorhizobium LIGE
237 238 YP_171459.1 glutathione S-transferase [Synechococcus elongatus PCC 6301] >ref|YP_399807.1| glutathione S-transferase [Synechococcus elongatus PCC 7942] >dbj|BAD78939.1| glutathione S-transferase [Synechococcus elongatus PCC 6301] >gb|ABB56820.1| LIGE
239 240 YP_322424.1 glutathione S-transferase-like protein [Anabaena variabilis ATCC 29413] >gb|ABA21529.1| Glutathione S-transferase-like protein LIGE
241 242 ZP_01625805.1 glutathione S-transferase, putative [marine gamma proteobacterium HTCC2080] >gb|EAW41324.1| glutathione S-transferase, putative [marine gamma proteobacterium LIGE
243 244 ZP_01631145. Glutathione S-transferase-like protein [Nodularia spumigena CCY9414] >gb|EAW44220.1| Glutathione S-transferase-like protein [Nodularia LIGE
245 246 ZP_06057261.1 glutathione S-transferase [Acinetobacter calcoaceticus RUH2202] >gb|EEY78560.1| glutathione S-transferase [Acinetobacter LIGE

[00256] Table 17.

PROTEIN GENE GENBANK DESCRIPTION: TYPE
SEQ ID SEQ ID ACCESSION
NO: NO: NO:
247 248 AAB65163.1 glutathione S-transferase, class-phi [Solanum commersonii] LigF
249 250 AAG34850.1 glutathione S-transferase GST 42 [Zea mays] LigF
251 252 AAK98535.1 putative glutathione S-transferase OsGSTU7 [Oryza sativa Japonica Group] LigF
253 254 AAL61612.1 glutathione S-transferase [Allium cepa] LigF
255 256 ABE86679.1 Intracellular chloride channel [Medicago truncatula] LigF
257 258 ABE86683.1 Intracellular chloride channel [Medicago truncatula] LigF
259 260 ABQ96853.1 glutathione S-transferase [Solanum tuberosum] LigF
261 262 ACF15452.1 glutathione-S-transferase [Phanerochaete chrysosporium] LigF
263 264 ACG44597.1 glutathione S-transferase GSTU6 [Zea mays] LigF
265 266 ACJ86045.1 unknown [Medicago truncatula] LigF
267 268 ACO15091.1 Probable maleylacetoacetate isomerase 2 [Caligus clemensi] LigF
269 270 ADB11335.1 phi class glutathione transferase GSTF7 [Populus trichocarpa] LigF
271 272 BAB70616.1 glutathione S-transferase [Medicago sativa] LigF
273 274 BAF56180.1 glutathione S-transferase [Allium cepa] LigF
275 276 BAJ90004.1 predicted protein [Hordeum vulgare subsp. vulgare] >dbj|BAJ99460.1| predicted protein [Hordeum vulgare subsp. vulgare] LigF
277 278 CAI51314.2 glutathione S-transferase GST1 [Capsicum chinense] LigF
279 280 EAY79299.1 hypothetical protein OsI_34425 [Oryza sativa Indica Group] LigF
281 282 EAZ16758.1 hypothetical protein OsJ_32234 [Oryza sativa Japonica Group] LigF
283 284 EEC67342.1 hypothetical protein OsI_34397 [Oryza sativa Indica Group] LigF
285 286 EFV87279.1 glutathione S-transferase [Achromobacter xylosoxidans C54] LigF
287 288 EGN92742.1 hypothetical protein SERLA73DRAFT_190579 [Serpula lacrymans var. lacrymans S7.3] >gb|EGO26403.1| hypothetical protein SERLADRAFT_463437 [Serpula lacrymans var. lacrymans S7.9] LigF
289 290 EGU75635.1 hypothetical protein FOXB_13869 [Fusarium oxysporum Fo5176] LigF
291 292 NP_001065115.1 Os10g0525600 [Oryza sativa Japonica Group] >gb|AAM12493.1|AC074232_20 putative glutathione S-transferase [Oryza sativa Japonica Group] >dbj|BAF27029.1| Os10g0525600 [Oryza sativa Japonica Group] LigF
293 294 NP_001065118.1 Os10g0527400 [Oryza sativa Japonica Group] >gb|AAM12310.1|AC091680_11 putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAM12478.1|AC074232_5 putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAP54729.1| glutathione S-transferase GSTU6, putative, expressed [Oryza sativa Japonica Group] >dbj|BAF27032.1| Os10g0527400 [Oryza sativa Japonica Group] >gb|EEE51298.1| hypothetical protein OsJ_32225 [Oryza sativa Japonica Group] LigF
295 296 NP_001065126.1 Os10g0529300 [Oryza sativa Japonica Group] >gb|AAK98546.1|AF402805_1 putative glutathione S-transferase OsGSTU18 [Oryza sativa Japonica Group] >gb|AAM12302.1|AC091680_3 putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAM94529.1| putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAP54753.1| glutathione S-transferase GSTU6, putative, expressed [Oryza sativa Japonica Group] >dbj|BAF27040.1| Os10g0529300 [Oryza sativa Japonica Group] >gb|EAY79288.1| hypothetical protein OsI_34414 [Oryza sativa Indica Group] >dbj|BAG87628.1| unnamed protein product [Oryza sativa Japonica Group] >dbj|BAG97643.1| unnamed protein product [Oryza sativa Japonica Group] >dbj|BAG87189.1| unnamed protein product [Oryza sativa Japonica Group] LigF
297 298 NP_001065132.1 Os10g0529900 [Oryza sativa Japonica Group] >gb|AAM12331.1|AC091680_32 putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAM94517.1| putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAP54759.1| glutathione S-transferase GSTU6, putative [Oryza sativa Japonica Group] >dbj|BAF27046.1| Os10g0529900 [Oryza sativa Japonica Group] >gb|EAZ16763.1| hypothetical protein OsJ_32239 [Oryza sativa Japonica Group] LigF
299 300 NP_001105627.1 LOC542632 [Zea mays] >gb|AAG34835.1|AF244692_1 glutathione S-transferase GST 27 [Zea mays] >gb|ACF85142.1| unknown [Zea mays] LigF
301 302 NP_001152229.1 glutathione S-transferase GSTU6 [Zea mays] >gb|ACG46501.1| glutathione S-transferase GSTU6 [Zea mays] LigF
303 304 NP_384409.1 putative glutathione S-transferase protein [Sinorhizobium meliloti 1021] >ref|YP_004550950.1| glutathione S-transferase domain-containing protein [Sinorhizobium meliloti AK83] >emb|CAC41740.1| Putative glutathione S-transferase [Sinorhizobium meliloti 1021] >gb|AEG06303.1| Glutathione S-transferase domain protein [Sinorhizobium meliloti BL225C] >gb|AEG55336.1| Glutathione S-transferase domain protein [Sinorhizobium meliloti AK83] >gb|AEH81005.1| putative glutathione S-transferase protein [Sinorhizobium meliloti SM11] LigF
305 306 XP_001555922.1 hypothetical protein BC1G_05597 [Botryotinia fuckeliana B05.10] >gb|EDN24875.1| hypothetical protein BC1G_05597 [Botryotinia fuckeliana B05.10] LigF
307 308 XP_001805855.1 hypothetical protein SNOG_15716 [Phaeosphaeria nodorum SN15] >gb|EAT76811.2| hypothetical protein SNOG_15716 [Phaeosphaeria nodorum SN15] LigF
309 310 XP_002321320.1 predicted protein [Populus trichocarpa] >gb|EEE99635.1| predicted protein [Populus trichocarpa] LigF
311 312 XP_002455784.1 hypothetical protein SORBIDRAFT_03g025210 [Sorghum bicolor] >gb|EES00904.1| hypothetical protein SORBIDRAFT_03g025210 [Sorghum bicolor] LigF
313 314 XP_002467606.1 hypothetical protein SORBIDRAFT_01g030860 [Sorghum bicolor] >gb|EER94604.1| hypothetical protein SORBIDRAFT_01g030860 [Sorghum bicolor] LigF
315 316 XP_002734706.1 PREDICTED: ganglioside-induced differentiation-associated protein 1-like [Saccoglossus kowalevskii] LigF
317 318 XP_002734707.1 PREDICTED: ganglioside-induced differentiation-associated protein 1-like [Saccoglossus kowalevskii] LigF
319 320 XP_002737947.1 PREDICTED: Glutathione S-Transferase family member (gst-42)-like [Saccoglossus kowalevskii] LigF
321 322 XP_002989538.1 hypothetical protein SELMODRAFT_184606 [Selaginella moellendorffii] >gb|EFJ09414.1| hypothetical protein SELMODRAFT_184606 [Selaginella moellendorffii] LigF
323 324 XP_003146962.1 glutathione S-transferase domain-containing protein [Loa loa] >gb|EFO17107.1| glutathione S-transferase domain-containing protein [Loa loa] LigF
325 326 YP_001187408.1 glutathione S-transferase domain-containing protein [Pseudomonas mendocina ymp] >gb|ABP84676.1| Glutathione S-transferase, N-terminal domain protein [Pseudomonas mendocina ymp] LigF
327 328 YP_001239734.1 glutathione S-transferase domain-containing protein [Bradyrhizobium sp. BTAi1] >gb|ABQ35828.1| putative glutathione S-transferase enzyme with thioredoxin-like domain [Bradyrhizobium sp. BTAi1] LigF
329 330 YP_001261939.1 glutathione S-transferase domain-containing protein [Sphingomonas wittichii RW1] >gb|ABQ67801.1| Glutathione S-transferase, N-terminal domain [Sphingomonas wittichii RW1] LigF
331 332 YP_001263066.1 glutathione S-transferase domain-containing protein [Sphingomonas wittichii RW1] >gb|ABQ68928.1| Glutathione S-transferase, N-terminal domain [Sphingomonas wittichii RW1] LigF
333 334 YP_001414366.1 glutathione S-transferase domain-containing protein [Parvibaculum lavamentivorans DS-1] >gb|ABS64709.1| Glutathione S-transferase domain [Parvibaculum lavamentivorans DS-1] LigF
335 336 YP_001414838.1 maleylacetoacetate isomerase [Parvibaculum lavamentivorans DS-1] >gb|ABS65181.1| maleylacetoacetate isomerase [Parvibaculum lavamentivorans DS-1] LigF
337 338 YP_001684291.1 glutathione S-transferase domain-containing protein [Caulobacter sp. K31] >gb|ABZ71793.1| Glutathione S-transferase domain [Caulobacter sp. K31] LigF
339 340 YP_001770584.1 glutathione S-transferase domain-containing protein [Methylobacterium sp. 4-46] >gb|ACA18150.1| Glutathione S-transferase domain [Methylobacterium sp. 4-46] LigF
341 342 YP_002828116.1 predicted glutathione S-transferase protein [Sinorhizobium fredii NGR234] >gb|ACP27363.1| predicted glutathione S-transferase protein [Sinorhizobium fredii NGR234] LigF
343 344 YP_003593122.1 glutathione S-transferase domain-containing protein [Caulobacter segnis ATCC 21756] >gb|ADG10504.1| Glutathione S-transferase domain protein [Caulobacter segnis ATCC 21756] LigF
345 346 YP_003930867.1 glutathione S-transferase [Pantoea vagans C9-1] >gb|ADO09418.1| Glutathione S-transferase [Pantoea vagans C9-1] LigF
347 348 YP_004434596.1 Glutathione S-transferase domain protein [Glaciecola agarilytica 4H-3-7+YE-5] >gb|AEE23328.1| Glutathione S-transferase domain protein [Glaciecola sp. 4H-3-7+YE-5] LigF
349 350 YP_004620883.1 glutathione S-transferase [Ramlibacter tataouinensis TTB310] >gb|AEG94864.1| glutathione S-transferase-like protein [Ramlibacter tataouinensis TTB310] LigF
351 352 YP_067874.1 glutathione S-transferase family protein [Aeromonas punctata] >emb|CAG15111.1| glutathione S-transferase family protein [Aeromonas caviae] LigF
353 354 YP_168502.1 glutathione S-transferase, putative [Ruegeria pomeroyi DSS-3] >gb|AAV96533.1| glutathione S-transferase, putative [Ruegeria pomeroyi DSS-3] LigF
355 356 YP_339058.1 glutathione S-transferase [Pseudoalteromonas haloplanktis TAC125] >emb|CAI85615.1| putative glutathione S-transferase [Pseudoalteromonas haloplanktis TAC125] LigF
357 358 YP_612204.1 glutathione S-transferase-like [Ruegeria sp. TM1040] >gb|ABF62942.1| glutathione S-transferase-like protein [Ruegeria sp. TM1040] LigF
359 360 ZP_00954574.1 glutathione S-transferase family protein [Sulfitobacter sp. EE-36] >ref|ZP_00961889.1| glutathione S-transferase family protein [Sulfitobacter sp. NAS-14.1] >gb|EAP81303.1| glutathione S-transferase family protein [Sulfitobacter sp. NAS-14.1] >gb|EAP85807.1| glutathione S-transferase family protein [Sulfitobacter sp. EE-36] LigF
361 362 ZP_01165363.1 maleylacetoacetate isomerase [Oceanospirillum sp. MED92] >gb|EAR62715.1| maleylacetoacetate isomerase [Oceanospirillum sp. MED92] LigF
363 364 ZP_01881157.1 glutathione S-transferase, putative [Roseovarius sp. TM1035] >gb|EDM30676.1| glutathione S-transferase, putative [Roseovarius sp. TM1035] LigF
365 366 ZP_03523367.1 Glutathione S-transferase domain [Rhizobium etli GR56] LigF
367 368 ZP_04614975.1 Glutathione S-transferase GST-6.0 [Yersinia ruckeri ATCC 29473] >gb|EEQ00521.1| Glutathione S-transferase GST-6.0 [Yersinia ruckeri ATCC 29473] LigF
369 370 ZP_05125190.1 glutathione S-transferase, N-terminal domain protein [Rhodobacteraceae bacterium KLH11] >gb|EEE36118.1| glutathione S-transferase, N-terminal domain protein [Rhodobacteraceae bacterium KLH11] LigF
371 372 ZP_05786193.1 glutathione S-transferase [Silicibacter lacuscaerulensis ITI-1157] >gb|EEX09309.1| glutathione S-transferase [Silicibacter lacuscaerulensis ITI-1157] LigF
373 374 ZP_08264339.1 maleylacetoacetate isomerase [Asticcacaulis biprosthecum C19] >gb|EGF90974.1| maleylacetoacetate isomerase [Asticcacaulis biprosthecum C19] LigF
375 376 ZP_08630058.1 glutathione S-transferase [Bradyrhizobiaceae bacterium SG-6C] >gb|EGP07427.1| glutathione S-transferase [Bradyrhizobiaceae bacterium SG-6C] LigF
377 378 AAG34806.1 glutathione S-transferase GST 16 [Glycine max] LigF
379 380 AAQ02687.1 tau class GST protein 3 [Oryza sativa Indica Group] >gb|EAY79295.1| hypothetical protein OsI_34421 [Oryza sativa Indica Group] >emb|CAZ68078.1| glutathione S-transferase [Oryza sativa Indica Group] LigF
381 382 ADV56298.1 Glutathione S-transferase domain protein [Shewanella putrefaciens 200] LigF
383 384 BAB70616.1 glutathione S-transferase [Medicago sativa] LigF
385 386 BAJ94610.1 predicted protein [Hordeum vulgare subsp. vulgare] LigF
387 388 CAN68934.1 hypothetical protein VITISV_002763 [Vitis vinifera] LigF
389 390 CBW26056.1 putative glutathione S-transferase [Bacteriovorax marinus SJ] LigF
391 392 EFW18159.1 glutathione S-transferase [Coccidioides posadasii str. Silveira] LigF
393 394 EGF84337.1 hypothetical protein [Batrachochytrium dendrobatidis JAM81] LigF
395 396 NP_001065124.1 Os10g0528400 [Oryza sativa Japonica Group] >gb|AAG32472.1|AF309379_1 putative glutathione S-transferase OsGSTU3 [Oryza sativa Japonica Group] >gb|AAM12325.1|AC091680_26 putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAM94544.1| putative glutathione S-transferase [Oryza sativa Japonica Group] >gb|AAP54745.1| glutathione S-transferase GSTU6, putative, expressed [Oryza sativa Japonica Group] >dbj|BAF27038.1| Os10g0528400 [Oryza sativa Japonica Group] >gb|EAZ16756.1| hypothetical protein OsJ_32232 [Oryza sativa Japonica Group] LigF
397 398 NP_191835.1 Glutathione S-transferase-like protein [Arabidopsis thaliana] >emb|CAB83126.1| Glutathione transferase III-like protein [Arabidopsis thaliana] >gb|AEE80388.1| Glutathione S-transferase-like protein [Arabidopsis thaliana] LigF
399 400 NP_717190.1 glutathione S-transferase family protein [Shewanella oneidensis MR-1] >gb|AAN54634.1|AE015603_8 glutathione S-transferase family protein [Shewanella oneidensis MR-1] LigF
401 402 NP_769143.1 glutathione S-transferase [Bradyrhizobium japonicum USDA 110] >dbj|BAC47768.1| glutathione S-transferase [Bradyrhizobium japonicum USDA 110] LigF
403 404 NP_900642.1 glutathione transferase zeta 1 [Chromobacterium violaceum ATCC 12472] >gb|AAQ58646.1| probable glutathione transferase zeta 1 [Chromobacterium violaceum ATCC 12472] LigF
405 406 XP_001246353.1 glutathione S-transferase [Coccidioides immitis RS] LigF
407 408 XP_002171087.1 PREDICTED: similar to glutathione S-transferase [Hydra magnipapillata] LigF
409 410 XP_002263386.1 PREDICTED: hypothetical protein [Vitis vinifera] >emb|CBI32223.3| unnamed protein product [Vitis vinifera] LigF
411 412 XP_002263424.1 PREDICTED: hypothetical protein [Vitis vinifera] >emb|CBI32222.3| unnamed protein product [Vitis vinifera] LigF
413 414 XP_002272099.1 PREDICTED: hypothetical protein isoform 2 [Vitis vinifera] LigF
415 416 XP_002527848.1 glutathione s-transferase, putative [Ricinus communis] >gb|EEF34551.1| glutathione s-transferase, putative [Ricinus communis] LigF
417 418 XP_002786341.1 Glutathione S-transferase A, putative [Perkinsus marinus ATCC 50983] >gb|EER18137.1| Glutathione S-transferase A, putative [Perkinsus marinus ATCC 50983] LigF
419 420 XP_003066789.1 Glutathione S-transferase, putative [Coccidioides posadasii C735 delta SOWgp] >gb|EER24644.1| Glutathione S-transferase, putative [Coccidioides posadasii C735 delta SOWgp] LigF
421 422 XP_970577.1 PREDICTED: similar to ganglioside-induced differentiation-associated-protein 1 [Tribolium castaneum] >gb|EFA00477.1| hypothetical protein TcasGA2_TC003336 [Tribolium castaneum] LigF
423 424 YP_001263559.1 glutathione S-transferase domain-containing protein [Sphingomonas wittichii RW1] >gb|ABQ69421.1| Glutathione S-transferase, N-terminal domain [Sphingomonas wittichii RW1] LigF
425 426 YP_001503032.1 glutathione S-transferase domain-containing protein [Shewanella pealeana ATCC 700345] >gb|ABV88497.1| Glutathione S-transferase domain [Shewanella pealeana ATCC 700345] LigF
427 428 YP_001516981.1 glutathione S-transferase ll [Acaryochloris marina MBIC11017] >gb|ABW27665.1| glutathione S-transferase ll [Acaryochloris marina MBIC11017] LigF
429 430 YP_001615392.1 glutathione S-transferase, [Sorangium cellulosum 'So ce 56'] >emb|CAN94912.1| glutathione S-transferase, putative [Sorangium cellulosum 'So ce 56'] LigF
431 432 YP_001685556.1 glutathione S-transferase domain-containing protein [Caulobacter sp. K31] >gb|ABZ73058.1| Glutathione S-transferase domain [Caulobacter sp. K31] LigF
433 | 434 | YP_001748054.1 | glutathione S-transferase domain-containing protein [Pseudomonas putida W619] >gb|ACA71685.1| Glutathione S-transferase domain [Pseudomonas putida W619] | LigF
435 | 436 | YP_001804371.1 | glutathione S-transferase [Cyanothece sp. ATCC 51142] >gb|ACB52305.1| glutathione S-transferase [Cyanothece sp. ATCC 51142] | LigF
437 | 438 | YP_002007283.1 | glutathione s-transferase protein; gsta protein [Cupriavidus taiwanensis LMG 19424] >emb|CAQ71222.1| putative glutathione S-transferase protein; gstA protein [Cupriavidus taiwanensis LMG 19424] | LigF
439 | 440 | YP_002130812.1 | glutathione S-transferase [Phenylobacterium zucineum HLK1] >gb|ACG78383.1| glutathione S-transferase [Phenylobacterium zucineum HLK1] | LigF
441 | 442 | YP_002220633.1 | glutathione S-transferase domain [Acidithiobacillus ferrooxidans ATCC 53993] >ref|YP_002426974.1| glutathione S-transferase [Acidithiobacillus ferrooxidans ATCC 23270] >gb|ACH84426.1| Glutathione S-transferase domain [Acidithiobacillus ferrooxidans ATCC 53993] >gb|ACK78121.1| glutathione S-transferase [Acidithiobacillus ferrooxidans ATCC 23270] | LigF
443 | 444 | YP_002482418.1 | glutathione S-transferase domain-containing protein [Cyanothece sp. PCC 7425] >gb|ACL44057.1| Glutathione S-transferase domain protein [Cyanothece sp. PCC 7425] | LigF

445 | 446 | YP_002543747.1 | glutathione S-transferase protein [Agrobacterium radiobacter K84] >gb|ACM25821.1| glutathione S-transferase protein [Agrobacterium radiobacter K84] | LigF
447 | 448 | YP_002974739.1 | glutathione S-transferase domain protein [Rhizobium leguminosarum bv. trifolii WSM1325] >gb|ACS55200.1| Glutathione S-transferase domain protein [Rhizobium leguminosarum bv. trifolii WSM1325] | LigF
449 | 450 | YP_004065207.1 | glutathione transferase [Pseudoalteromonas sp. SM9913] >gb|ADT70298.1| glutathione transferase [Pseudoalteromonas sp. SM9913] | LigF
451 | 452 | YP_004357179.1 | glutathione S-transferase [Pseudomonas brassicacearum subsp. brassicacearum NFM421] >gb|AEA72175.1| putative glutathione S-transferase [Pseudomonas brassicacearum subsp. brassicacearum NFM421] | LigF
453 | 454 | YP_004680920.1 | glutathione S-transferase [Cupriavidus necator N-1] >gb|AEI79688.1| glutathione S-transferase [Cupriavidus necator N-1] | LigF
455 | 456 | YP_468810.1 | glutathione S-transferase [Rhizobium etli CFN 42] >gb|ABC90083.1| glutathione S-transferase protein [Rhizobium etli CFN 42] | LigF
457 | 458 | YP_554040.1 | glutathione S-transferase [Burkholderia xenovorans LB400] >gb|ABE34690.1| Glutathione S-transferase [Burkholderia xenovorans LB400] | LigF

459 | 460 | YP_612103.1 | glutathione S-transferase-like [Ruegeria sp. TM1040] >gb|ABF62841.1| glutathione S-transferase-like protein [Ruegeria sp. TM1040] | LigF
461 | 462 | YP_735310.1 | glutathione S-transferase domain-containing protein [Shewanella sp. MR-4] >gb|ABI40253.1| Glutathione S-transferase, N-terminal domain protein [Shewanella sp. MR-4] | LigF
463 | 464 | YP_747567.1 | glutathione S-transferase domain-containing protein [Nitrosomonas eutropha C91] >gb|ABI59602.1| Glutathione S-transferase, C-terminal domain [Nitrosomonas eutropha C91] | LigF
465 | 466 | YP_757227.1 | maleylacetoacetate isomerase [Maricaulis maris MCS10] >gb|ABI66289.1| maleylacetoacetate isomerase [Maricaulis maris MCS10] | LigF
467 | 468 | YP_868399.1 | glutathione S-transferase domain-containing protein [Shewanella sp. ANA-3] >gb|ABK46993.1| Glutathione S-transferase, N-terminal domain protein [Shewanella sp. ANA-3] | LigF
469 | 470 | YP_870498.1 | glutathione S-transferase domain-containing protein [Shewanella sp. ANA-3] >gb|ABK49092.1| Glutathione S-transferase, N-terminal domain protein [Shewanella sp. ANA-3] | LigF
471 | 472 | YP_957711.1 | glutathione S-transferase domain-containing protein [Marinobacter aquaeolei VT8] >gb|ABM17524.1| Glutathione S-transferase, N-terminal domain [Marinobacter aquaeolei VT8] | LigF
473 | 474 | YP_957873.1 | glutathione S-transferase domain-containing protein [Marinobacter aquaeolei VT8] >gb|ABM17686.1| Glutathione S-transferase, N-terminal domain [Marinobacter aquaeolei VT8] | LigF

475 | 476 | YP_960793.1 | glutathione S-transferase domain-containing protein [Marinobacter aquaeolei VT8] >gb|ABM20606.1| Glutathione S-transferase, N-terminal domain [Marinobacter aquaeolei VT8] | LigF
477 | 478 | YP_963418.1 | glutathione S-transferase domain-containing protein [Shewanella sp. W3-18-1] >gb|ABM24864.1| Glutathione S-transferase, N-terminal domain [Shewanella sp. W3-18-1] | LigF
479 | 480 | ZP_01000028.1 | glutathione S-transferase family protein [Oceanicola batsensis HTCC2597] >gb|EAQ02499.1| glutathione S-transferase family protein [Oceanicola batsensis HTCC2597] | LigF
481 | 482 | ZP_01459182.1 | glutathione S-transferase [Stigmatella aurantiaca DW4/3-1] >ref|YP_003956548.1| glutathione s-transferase [Stigmatella aurantiaca DW4/3-1] >gb|EAU70026.1| glutathione S-transferase [Stigmatella aurantiaca DW4/3-1] >gb|ADO74721.1| Glutathione S-transferase [Stigmatella aurantiaca DW4/3-1] | LigF
483 | 484 | ZP_02886014.1 | Glutathione S-transferase domain [Burkholderia graminis C4D1M] >gb|EDT08402.1| Glutathione S-transferase domain [Burkholderia graminis C4D1M] | LigF
485 | 486 | ZP_04713937.1 | Glutathione S-transferase [Alteromonas macleodii ATCC 27126] | LigF

487 | 488 | ZP_05075049.1 | Glutathione S-transferase, N-terminal domain protein [Rhodobacterales bacterium HTCC2083] >gb|EDZ42709.1| Glutathione S-transferase, N-terminal domain protein [Rhodobacteraceae bacterium HTCC2083] | LigF
489 | 490 | ZP_05101428.1 | glutathione S-transferase protein [Roseobacter sp. GAI101] >gb|EEB85730.1| glutathione S-transferase protein [Roseobacter sp. GAI101] | LigF
491 | 492 | ZP_05124402.1 | glutathione S-transferase [Rhodobacteraceae bacterium KLH11] >gb|EEE39034.1| glutathione S-transferase [Rhodobacteraceae bacterium KLH11] | LigF
493 | 494 | ZP_05926645.1 | glutathione S-transferase [Vibrio sp. RC341] >gb|EEX64947.1| glutathione S-transferase [Vibrio sp. RC341] | LigF
495 | 496 | ZP_06308936.1 | Glutathione S-transferase-like protein [Cylindrospermopsis raciborskii CS-505] >gb|EFA69058.1| Glutathione S-transferase-like protein [Cylindrospermopsis raciborskii CS-505] | LigF
497 | 498 | ZP_06838829.1 | Glutathione S-transferase domain protein [Burkholderia sp. Ch1-1] >gb|EFG73275.1| Glutathione S-transferase domain protein [Burkholderia sp. Ch1-1] | LigF
499 | 500 | ZP_08104209.1 | glutathione S-transferase III [Vibrio sinaloensis DSM 21326] >gb|EGA68654.1| glutathione S-transferase III [Vibrio sinaloensis DSM 21326] | LigF

501 | 502 | ZP_08275708.1 | Glutathione S-transferase [Oxalobacteraceae bacterium IMCC9480] >gb|EGF30821.1| Glutathione S-transferase [Oxalobacteraceae bacterium IMCC9480] | LigF
503 | 504 | ZP_08409706.1 | glutathione S-transferase [Pseudoalteromonas haloplanktis ANT/505] >gb|EGI73123.1| glutathione S-transferase [Pseudoalteromonas haloplanktis ANT/505] | LigF
505 | 506 | ZP_08565123.1 | glutathione S-transferase [Shewanella sp. HN-41] >gb|EGM70872.1| glutathione S-transferase [Shewanella sp. HN-41] | LigF
507 | 508 | CAA12269.1 | ORF 3 [Sphingomonas sp. RW5] | LigF
509 | 510 | CAC94002.1 | glutathione transferase [Triticum aestivum] | LigF
511 | 512 | NP_967294.1 | maleylacetoacetate isomerase / glutathione S-transferase [Bdellovibrio bacteriovorus HD100] >emb|CAE77948.1| maleylacetoacetate isomerase / glutathione S-transferase [Bdellovibrio bacteriovorus HD100] | LigF
513 | 514 | P30347.1 | RecName: Full=Protein ligF >dbj|BAA02031.1| beta-etherase [Sphingomonas paucimobilis] >prf||1914145A beta etherase | LigF
515 | 516 | XP_002964271.1 | hypothetical protein SELMODRAFT_142654 [Selaginella moellendorffii] >gb|EFJ34604.1| hypothetical protein SELMODRAFT_142654 [Selaginella moellendorffii] | LigF

517 | 518 | YP_001021314.1 | glutathione S-transferase-like protein [Methylibium petroleiphilum PM1] >gb|ABM95079.1| glutathione S-transferase-like protein [Methylibium petroleiphilum PM1] | LigF
519 | 520 | YP_001862387.1 | glutathione S-transferase domain-containing protein [Burkholderia phymatum STM815] >gb|ACC75341.1| Glutathione S-transferase domain [Burkholderia phymatum STM815] | LigF
521 | 522 | YP_002130750.1 | glutathione S-transferase [Phenylobacterium zucineum HLK1] >gb|ACG78321.1| glutathione S-transferase [Phenylobacterium zucineum HLK1] | LigF
523 | 524 | YP_002825255.1 | glutathione S-transferase [Sinorhizobium fredii NGR234] >gb|ACP24502.1| glutathione S-transferase [Sinorhizobium fredii NGR234] | LigF
525 | 526 | YP_003908670.1 | glutathione S-transferase domain-containing protein [Burkholderia sp. CCGE1003] >gb|ADN59379.1| Glutathione S-transferase domain protein [Burkholderia sp. CCGE1003] | LigF
527 | 528 | YP_004154430.1 | glutathione s-transferase domain-containing protein [Variovorax paradoxus EPS] >gb|ADU36319.1| Glutathione S-transferase domain [Variovorax paradoxus EPS] | LigF
529 | 530 | YP_004229981.1 | glutathione S-transferase domain-containing protein [Burkholderia sp. CCGE1001] >gb|ADX56921.1| Glutathione S-transferase domain protein [Burkholderia sp. CCGE1001] | LigF

531 | 532 | YP_004302768.1 | glutathione S-transferase, N-terminal domain protein [Polymorphum gilvum SL003B-26A1] >gb|ADZ69468.1| Glutathione S-transferase, N-terminal domain protein [Polymorphum gilvum SL003B-26A1] | LigF
533 | 534 | YP_004533892.1 | glutathione S-transferase-like protein [Novosphingobium sp. PP1Y] >emb|CCA92074.1| glutathione S-transferase-like [Novosphingobium sp. PP1Y] | LigF
535 | 536 | YP_004533893.1 | glutathione S-transferase-like protein [Novosphingobium sp. PP1Y] >emb|CCA92075.1| glutathione S-transferase-like [Novosphingobium sp. PP1Y] | LigF
537 | 538 | YP_004533905.1 | glutathione S-transferase-like protein [Novosphingobium sp. PP1Y] >emb|CCA92087.1| glutathione S-transferase-like [Novosphingobium sp. PP1Y] | LigF
539 | 540 | YP_497364.1 | glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] >gb|ABD26530.1| glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] | LigF
541 | 542 | YP_498135.1 | glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] >gb|ABD27301.1| glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] | LigF
543 | 544 | YP_498142.1 | glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] >gb|ABD27308.1| glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] | LigF

545 | 546 | YP_498143.1 | glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] >gb|ABD27309.1| glutathione S-transferase-like protein [Novosphingobium aromaticivorans DSM 12444] | LigF
547 | 548 | ZP_00952372.1 | maleylacetoacetate isomerase [Oceanicaulis alexandrii HTCC2633] >gb|EAP91525.1| maleylacetoacetate isomerase [Oceanicaulis alexandrii HTCC2633] | LigF
549 | 550 | ZP_00959702.1 | glutathione S-transferase, putative [Roseovarius nubinhibens ISM] >gb|EAP78164.1| glutathione S-transferase, putative [Roseovarius nubinhibens ISM] | LigF
551 | 552 | ZP_01034543.1 | glutathione S-transferase, putative [Roseovarius sp. 217] >gb|EAQ27224.1| glutathione S-transferase, putative [Roseovarius sp. 217] | LigF
553 | 554 | ZP_01057917.1 | glutathione S-transferase, putative [Roseobacter sp. MED193] >gb|EAQ44057.1| glutathione S-transferase, putative [Roseobacter sp. MED193] | LigF
555 | 556 | ZP_01223510.1 | glutathione S-transferase [marine gamma proteobacterium HTCC2207] >gb|EAS48069.1| glutathione S-transferase [marine gamma proteobacterium HTCC2207] | LigF
557 | 558 | ZP_01753989.1 | glutathione S-transferase, putative [Roseobacter sp. SK209-2-6] >gb|EBA17470.1| glutathione S-transferase, putative [Roseobacter sp. SK209-2-6] | LigF

559 | 560 | ZP_02146800.1 | glutathione S-transferase-like protein [Phaeobacter gallaeciensis BS107] >gb|EDQ11817.1| glutathione S-transferase-like protein [Phaeobacter gallaeciensis BS107] | LigF
561 | 562 | ZP_02150992.1 | glutathione S-transferase, putative [Phaeobacter gallaeciensis 2.10] >gb|EDQ07480.1| glutathione S-transferase, putative [Phaeobacter gallaeciensis 2.10] | LigF
563 | 564 | ZP_05073592.1 | glutathione S-transferase 2 [Rhodobacterales bacterium HTCC2083] >gb|EDZ41252.1| glutathione S-transferase 2 [Rhodobacteraceae bacterium HTCC2083] | LigF
565 | 566 | ZP_05077451.1 | glutathione S-transferase [Rhodobacterales bacterium Y4I] >gb|EDZ45430.1| glutathione S-transferase [Rhodobacterales bacterium Y4I] | LigF
567 | 568 | ZP_05087035.1 | Glutathione S-transferase, N-terminal domain protein [Pseudovibrio sp. JE062] >gb|EEA92555.1| Glutathione S-transferase, N-terminal domain protein [Pseudovibrio sp. JE062] | LigF
569 | 570 | ZP_05089424.1 | glutathione S-transferase [Ruegeria sp. R11] >gb|EEB71116.1| glutathione S-transferase [Ruegeria sp. R11] | LigF
571 | 572 | ZP_05126316.1 | protein LigF [gamma proteobacterium NOR5-3] >gb|EED32863.1| protein LigF [gamma proteobacterium NOR5-3] | LigF
573 | 574 | ZP_05126823.1 | maleylacetoacetate isomerase [gamma proteobacterium NOR5-3] >gb|EED33370.1| maleylacetoacetate isomerase [gamma proteobacterium NOR5-3] | LigF

575 | 576 | ZP_05741946.1 | glutathione S-transferase [Silicibacter sp. TrichCH4B] >gb|EEW58747.1| glutathione S-transferase [Silicibacter sp. TrichCH4B] | LigF

[00257] Table 18.

PROTEIN SEQ ID NO: | GENE SEQ ID NO: | GENBANK ACCESSION NO: | DESCRIPTION | TYPE
577 | 578 | BAA77216.1 | glutathione S-transferase homolog [Sphingomonas paucimobilis] | LigG
579 | 580 | YP_004533907.1 | glutathione S-transferase family protein [Novosphingobium sp. PP1Y] >emb|CCA92089.1| glutathione S-transferase family protein | LigG
581 | 582 | YP_314808.1 | glutathione S-transferase family protein [Thiobacillus denitrificans ATCC 25259] >gb|AAZ97003.1| glutathione S-transferase family protein [Thiobacillus | LigG
583 | 584 | YP_167289.1 | glutathione S-transferase family protein [Ruegeria pomeroyi DSS-3] >gb|AAV95330.1| glutathione S-transferase family protein [Ruegeria | LigG
585 | 586 | ZP_01011943.1 | glutathione S-transferase family protein [Maritimibacter alkaliphilus HTCC2654] >gb|EAQ14262.1| glutathione S-transferase family protein | LigG
587 | 588 | YP_002540613.1 | glutathione S-transferase protein [Agrobacterium radiobacter K84] >gb|ACM29018.1| glutathione S- | LigG
589 | 590 | CAJ81793.1 | Novel glutathione S-transferase omega protein [Xenopus (Silurana) tropicalis] | LigG
591 | 592 | NP_001005086.1 | glutathione S-transferase omega 2 [Xenopus (Silurana) tropicalis] >gb|AAH77010.1| MGC89704 protein | LigG
593 | 594 | XP_624501.1 | PREDICTED: glutathione S-transferase omega-1 [Apis mellifera] | LigG
595 | 596 | XP_002029736.1 | GM24932 [Drosophila sechellia] >gb|EDW40722.1| GM24932 [Drosophila | LigG
597 | 598 | NP_001002621.1 | hypothetical protein LOC436894 [Danio rerio] >gb|AAH75965.1| Zgc:92254 [Danio | LigG
599 | 600 | XP_002431486.1 | predicted protein [Pediculus humanus corporis] >gb|EEB18748.1| predicted protein [Pediculus humanus corporis] | LigG
601 | 602 | ADD18952.1 | glutathione S-transferase [Glossina morsitans morsitans] | LigG

603 | 604 | XP_002093444.1 | GE21298 [Drosophila yakuba] >gb|EDW93156.1| GE21298 [Drosophila | LigG
605 | 606 | XP_002068563.1 | GK20540 [Drosophila willistoni] >gb|EDW79549.1| GK20540 [Drosophila | LigG
607 | 608 | NP_001165912.1 | glutathione S-transferase O1 [Nasonia | LigG
609 | 610 | CAM34501.1 | putative glutathione S-transferase [Cotesia congregata] | LigG
611 | 612 | XP_421747.1 | PREDICTED: similar to glutathione-S-transferase homolog isoform 2 [Gallus | LigG
613 | 614 | XP_002135069.1 | GA23449 [Drosophila pseudoobscura pseudoobscura] >gb|EDY73696.1| GA23449 [Drosophila pseudoobscura | LigG
615 | 616 | NP_034492.1 | glutathione S-transferase omega-1 [Mus musculus] >sp|O09131.2|GSTO1_MOUSE RecName: Full=Glutathione S-transferase omega-1; Short=GSTO-1; AltName: Full=p28 >gb|AAB70110.1| glutathione-S-transferase homolog [Mus musculus] >dbj|BAC25667.1| unnamed protein product [Mus musculus] >gb|AAH85165.1| Glutathione S-transferase omega 1 [Mus musculus] >dbj|BAE27469.1| unnamed protein product [Mus musculus] | LigG
617 | 618 | ZP_03524422.1 | glutathione S-transferase domain-containing protein [Rhizobium etli GR56] | LigG
619 | 620 | NP_729388.1 | CG6673, isoform A [Drosophila melanogaster] >gb|AAF50404.2| CG6673, isoform A [Drosophila melanogaster] >gb|ACZ02426.1| glutathione S- | LigG
621 | 622 | ZP_08179398.1 | glutathione S-transferase [Xanthomonas vesicatoria ATCC 35937] >gb|EGD08414.1| glutathione S- | LigG
623 | 624 | XP_003218563.1 | PREDICTED: glutathione S-transferase omega-1-like isoform 1 [Anolis | LigG
625 | 626 | ABC86304.1 | IP16242p [Drosophila melanogaster] | LigG

627 | 628 | XP_002026470.1 | GL15567 [Drosophila persimilis] >gb|EDW33419.1| GL15567 [Drosophila | LigG
629 | 630 | NP_001108461.1 | glutathione S-transferase omega 4 [Bombyx mori] >gb|ABY66601.1| glutathione S-transferase 13 [Bombyx | LigG
631 | 632 | NP_999215.1 | glutathione S-transferase omega-1 [Sus scrofa] >ref|XP_001929519.1| PREDICTED: glutathione S-transferase omega-1-like [Sus scrofa] >sp|Q9N1F5.2|GSTO1_PIG RecName: Full=Glutathione S-transferase omega-1; Short=GSTO-1; AltName: Full=Glutathione-dependent dehydroascorbate reductase | LigG
633 | 634 | NP_001007373.1 | hypothetical protein LOC492500 [Danio rerio] >gb|AAH85467.1| Zgc:101897 [Danio rerio] >gb|AAI65433.1| Zgc:101897 | LigG
635 | 636 | YP_001566654.1 | glutathione S-transferase domain-containing protein [Delftia acidovorans SPH-1] >gb|ABX38269.1| Glutathione S-transferase domain [Delftia acidovorans | LigG
637 | 638 | ADY80021.1 | omega class glutathione S-transferase [Oplegnathus fasciatus] | LigG
639 | 640 | YP_001329158.1 | glutathione S-transferase domain-containing protein [Sinorhizobium medicae WSM419] >gb|ABR62323.1| Glutathione S-transferase domain | LigG
641 | 642 | NP_001084924.1 | hypothetical protein LOC431979 [Xenopus laevis] >gb|AAH70673.1| MGC82327 protein [Xenopus laevis] | LigG
643 | 644 | XP_003396907.1 | PREDICTED: glutathione S-transferase omega-1-like [Bombus terrestris] | LigG
645 | 646 | XP_001368758.1 | PREDICTED: glutathione S-transferase omega-1-like isoform 1 [Monodelphis | LigG
647 | 648 | XP_001983981.1 | GH16193 [Drosophila grimshawi] >gb|EDV96329.1| GH16193 [Drosophila | LigG
649 | 650 | ADK66966.1 | glutathione s-transferase [Chironomus | LigG

651 | 652 | XP_001232808.1 | PREDICTED: similar to glutathione-S-transferase homolog isoform 1 [Gallus | LigG
653 | 654 | XP_002068565.1 | GK20354 [Drosophila willistoni] >gb|EDW79551.1| GK20354 [Drosophila | LigG
655 | 656 | YP_001611239.1 | hypothetical protein sce0602 [Sorangium cellulosum 'So ce 56'] >emb|CAN90759.1| gst2 [Sorangium cellulosum 'So ce 56'] | LigG
657 | 658 | XP_001499427.'4. | PREDICTED: glutathione S-transferase omega-1-like isoform 1 [Equus caballus] | LigG
659 | 660 | NP_384409.1 | putative glutathione S-transferase protein [Sinorhizobium meliloti 1021] >ref|YP_004550950.1| glutathione S-transferase domain-containing protein [Sinorhizobium meliloti AK83] >emb|CAC41740.1| Putative glutathione S-transferase [Sinorhizobium meliloti 1021] >gb|AEG06303.1| Glutathione S-transferase domain protein [Sinorhizobium meliloti BL225C] >gb|AEG55336.1| Glutathione S- | LigG
661 | 662 | CAG05035.1 | unnamed protein product [Tetraodon | LigG
663 | 664 | ZP_01365353.1 | hypothetical protein PaerPA_01002475 [Pseudomonas aeruginosa PACS2] >ref|YP_002440902.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa LESB58] >ref|ZP_04928412.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa C3719] >gb|EAZ52531.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa C3719] >emb|CAW28043.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa | LigG
665 | 666 | YP_001348642.1 | maleylacetoacetate isomerase [Pseudomonas aeruginosa PA7] >gb|ABR84080.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa | LigG
667 | 668 | ZP_04933765.1 | maleylacetoacetate isomerase [Pseudomonas aeruginosa 2192] >gb|EAZ57884.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa | LigG
669 | 670 | NP_250697.1 | maleylacetoacetate isomerase [Pseudomonas aeruginosa PAO1] >sp|P57109.1|MAAI_PSEAE RecName: Full=Maleylacetoacetate isomerase; Short=MAAI >gb|AAG05395.1|AE004627_3 | LigG
671 | 672 | EFN59352.1 | hypothetical protein CHLNCDRAFT_137800 [Chlorella | LigG
673 | 674 | YP_002945584.1 | glutathione S-transferase domain-containing protein [Variovorax paradoxus S110] >gb|ACS20318.1| Glutathione S-transferase domain protein [Variovorax | LigG
675 | 676 | XP_002197460.1 | PREDICTED: glutathione S-transferase omega 1 [Taeniopygia guttata] | LigG
677 | 678 | XP_001971643.1 | GG15075 [Drosophila erecta] >gb|EDV50669.1| GG15075 [Drosophila | LigG
679 | 680 | NP_001155757.1 | glutathione S-transferase omega-1-like [Acyrthosiphon pisum] >dbj|BAH71013.1| ACYPI008340 [Acyrthosiphon pisum] | LigG
681 | 682 | XP_002026468.1 | GL15565 [Drosophila persimilis] >gb|EDW33417.1| GL15565 [Drosophila | LigG
683 | 684 | XP_001353820.1 | GA19760 [Drosophila pseudoobscura pseudoobscura] >gb|EAL29555.1| GA19760 [Drosophila pseudoobscura | LigG
685 | 686 | YP_791232.1 | maleylacetoacetate isomerase [Pseudomonas aeruginosa UCBPP-PA14] >gb|ABJ11194.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa | LigG
687 | 688 | ZP_06879058.1 | maleylacetoacetate isomerase [Pseudomonas aeruginosa PAb1] >ref|ZP_07797003.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa 39016] >gb|EFQ42099.1| maleylacetoacetate isomerase [Pseudomonas aeruginosa 39016] >gb|EGM14661.1| maleylacetoacetate | LigG
689 | 690 | EFZ22366.1 | hypothetical protein SINV_14968 | LigG
691 | 692 | ZP_03527925.1 | Glutathione S-transferase domain [Rhizobium etli CIAT 894] | LigG
693 | 694 | ABD77536.1 | hypothetical protein [Ictalurus punctatus] | LigG
695 | 696 | XP_002756473.1 | PREDICTED: glutathione S-transferase omega-1-like [Callithrix jacchus] | LigG
697 | 698 | XP_001636996.1 | predicted protein [Nematostella vectensis] >gb|EDO44933.1| predicted protein [Nematostella vectensis] | LigG
699 | 700 | YP_467831.1 | glutathione S-transferase [Rhizobium etli CFN 42] >gb|ABC89104.1| glutathione S-transferase protein [Rhizobium etli CFN | LigG
701 | 702 | NP_103005.1 | glutathione-S-transferase [Mesorhizobium loti MAFF303099] >dbj|BAB48791.1| glutathione-S-transferase [Mesorhizobium | LigG
703 | 704 | ADY47623.1 | Glutathione transferase omega-1 [Ascaris | LigG
705 | 706 | BAG36430.1 | unnamed protein product [Homo sapiens] | LigG
707 | 708 | XP_002718774.1 | PREDICTED: glutathione-S-transferase omega 1 [Oryctolagus cuniculus] | LigG
709 | 710 | 3LFL_A | Chain A, Crystal Structure Of Human Glutathione Transferase Omega 1, Delta 155 >pdb|3LFL|B Chain B, Crystal Structure Of Human Glutathione Transferase Omega 1, Delta 155 >pdb|3LFL|C Chain C, Crystal Structure | LigG
711 | 712 | XP_002805857.1 | PREDICTED: glutathione S-transferase omega-1-like [Macaca mulatta] >gb|ABO21635.1| glutathione S- | LigG
713 | 714 | NP_001007603.1 | glutathione S-transferase omega-1 [Rattus norvegicus] >gb|AAH79363.1| Glutathione S-transferase omega 1 [Rattus norvegicus] >gb|EDL94393.1| glutathione S-transferase omega 1, | LigG
715 | 716 | XP_535007.1 | PREDICTED: similar to glutathione-S-transferase omega 1 isoform 1 [Canis | LigG
717 | 718 | NP_004823.1 | glutathione S-transferase omega-1 isoform 1 [Homo sapiens] >sp|P78417.2|GSTO1_HUMAN RecName: Full=Glutathione S-transferase omega-1; Short=GSTO-1 >pdb|1EEM|A Chain A, Glutathione Transferase From Homo Sapiens >gb|AAF73376.1|AF212303_1 glutathione transferase omega [Homo sapiens] >gb|AAB70109.1| glutathione-S-transferase homolog [Homo sapiens] >gb|AAH00127.1| Glutathione S-transferase omega 1 [Homo sapiens] >gb|AAV68046.1| glutathione S-transferase omega 1-1 [Homo sapiens] | LigG
719 | 720 | XP_002758417.1 | PREDICTED: glutathione S-transferase omega-1-like [Callithrix jacchus] | LigG
721 | 722 | XP_003218564.1 | PREDICTED: glutathione S-transferase omega-1-like isoform 2 [Anolis | LigG
723 | 724 | EFN62827.1 | Glutathione transferase omega-1 [Camponotus floridanus] | LigG
725 | 726 | XP_508020.3 | PREDICTED: glutathione S-transferase omega-1 isoform 3 [Pan troglodytes] | LigG
727 | 728 | CAD97673.1 | hypothetical protein [Homo sapiens] | LigG
729 | 730 | BAJ20927.1 | glutathione S-transferase omega 1 [synthetic construct] | LigG
731 | 732 | ACR43779.1 | glutathione S-transferase [Chironomus | LigG
733 | 734 | Q9Z339.2 | RecName: Full=Glutathione S-transferase omega-1; Short=GSTO-1; AltName: Full=Glutathione-dependent dehydroascorbate reductase >gb|ACI32122.1| glutathione S- | LigG
735 | 736 | XP_001956909.1 | GF10159 [Drosophila ananassae] >gb|EDV39715.1| GF10159 [Drosophila | LigG
737 | 738 | XP_001742278.1 | hypothetical protein [Monosiga brevicollis MX1] >gb|EDQ92516.1| predicted protein [Monosiga brevicollis MX1] | LigG
739 | 740 | XP_002821176.1 | PREDICTED: glutathione S-transferase omega-1-like [Pongo abelii] | LigG
741 | 742 | XP_003255483.1 | PREDICTED: glutathione S-transferase omega-1-like isoform 1 [Nomascus | LigG
743 | 744 | YP_325490.1 | glutathione S-transferase-like protein [Anabaena variabilis ATCC 29413] >gb|ABA24595.1| Glutathione S-transferase-like protein [Anabaena | LigG
745 | 746 | XP_003208190.1 | PREDICTED: glutathione S-transferase omega-1-like [Meleagris gallopavo] | LigG
747 | 748 | XP_002068562.1 | GK20539 [Drosophila willistoni] >gb|EDW79548.1| GK20539 [Drosophila | LigG
749 | 750 | XP_001956911.1 | GF10161 [Drosophila ananassae] >gb|EDV39717.1| GF10161 [Drosophila | LigG
751 | 752 | ABV24048.1 | gluthathione S-transferase omega [Takifugu obscurus] | LigG
753 | 754 | ZP_05086262.1 | putative glutathione S-transferase protein [Pseudovibrio sp. JE062] >gb|EEA93528.1| putative glutathione S-transferase protein [Pseudovibrio sp. | LigG
755 | 756 | AAI28951.1 | LOC100037104 protein [Xenopus laevis] | LigG
757 | 758 | XP_001956910.1 | GF10160 [Drosophila ananassae] >gb|EDV39716.1| GF10160 [Drosophila | LigG
759 | 760 | NP_001099052.1 | glutathione S-transferase omega 2 [Xenopus laevis] >gb|AAI53758.1| LOC100037104 protein [Xenopus laevis] | LigG
761 | 762 | ZP_03503214.1 | Glutathione S-transferase domain [Rhizobium etli Kim 5] | LigG
763 | 764 | XP_002046961.1 | GJ12198 [Drosophila virilis] >gb|EDW69303.1| GJ12198 [Drosophila | LigG
765 | 766 | XP_001956912.1 | GF24331 [Drosophila ananassae] >gb|EDV39718.1| GF24331 [Drosophila | LigG
767 | 768 | XP_001368790.1 | PREDICTED: glutathione S-transferase omega-1-like isoform 1 [Monodelphis | LigG
769 | 770 | ZP_06308936.1 | Glutathione S-transferase-like protein [Cylindrospermopsis raciborskii CS-505] >gb|EFA69058.1| Glutathione S-transferase-like protein | LigG
771 | 772 | ABJ15788.1 | glutathione S-transferase omega 1 [Bombyx mandarina] >dbj|BAF91356.1| omega-class glutathione S-transferase | LigG
773 | 774 | NP_001037406.1 | glutathione S-transferase omega 2 [Bombyx mori] >gb|ABC79689.1| glutathione S-transferase 6 [Bombyx mori] | LigG
775 | 776 | NP_001040131.1 | glutathione S-transferase omega 1 [Bombyx mori] >gb|ABD36128.1| glutathione S-transferase omega 1 | LigG

[00258] Table 19.

PROTEIN SEQ ID NO: | GENE SEQ ID NO: | GENBANK ACCESSION NO: | DESCRIPTION | TYPE
777 | 778 | Q01198.1 | RecName: Full=C alpha-dehydrogenase >dbj|BAA02030.1| C alpha-dehydrogenase [Sphingomonas paucimobilis] >dbj|BAA01953.1| C alpha-dehydrogenase [Sphingomonas paucimobilis] >gb|AAC60455.1| C alpha-dehydrogenase [Sphingomonas paucimobilis] | LigD
779 | 780 | YP_495487.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] >gb|ABD24653.1| short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] | LigD
781 | 782 | YP_004533898.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] >emb|CCA92080.1| short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] | LigD
783 | 784 | BAH56687.1 | Calpha-dehydrogenase [Sphingobium sp. SYK-6] | LigD
785 | 786 | YP_004533921.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] >emb|CCA92103.1| short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] | LigD
787 | 788 | YP_496072.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] >gb|ABD25238.1| short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] | LigD
789 | 790 | 310Y_A | Chain A, Structure Of Putative Short-Chain Dehydrogenase (Saro_0793) From Novosphingobium Aromaticivorans >pdb|310Y|B Chain B, Structure Of Putative Short-Chain Dehydrogenase (Saro_0793) From Novosphingobium Aromaticivorans | LigD
791 | 792 | YP_496073.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] >gb|ABD25239.1| short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] | LigD
793 | 794 | BAH56683.1 | Calpha-dehydrogenase [Sphingobium sp. SYK-6] | LigD
795 | 796 | YP_004533920.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] >emb|CCA92102.1| short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] | LigD
797 | 798 | YP_003592832.1 | short-chain dehydrogenase/reductase SDR [Caulobacter segnis ATCC 21756] >gb|ADG10214.1| short-chain dehydrogenase/reductase SDR [Caulobacter segnis ATCC 21756] | LigD
799 | 800 | YP_495984.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] >gb|ABD25150.1| short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] | LigD
801 | 802 | YP_497149.1 | short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] >gb|ABD26315.1| short-chain dehydrogenase/reductase SDR [Novosphingobium aromaticivorans DSM 12444] | LigD
803 | 804 | YP_003592830.1 | short-chain dehydrogenase/reductase SDR [Caulobacter segnis ATCC 21756] >gb|ADG10212.1| short-chain dehydrogenase/reductase SDR [Caulobacter segnis ATCC 21756] | LigD
805 | 806 | YP_001260886.1 | short-chain dehydrogenase/reductase SDR [Sphingomonas wittichii RW1] >gb|ABQ66748.1| short-chain dehydrogenase/reductase SDR [Sphingomonas wittichii RW1] | LigD
807 | 808 | YP_001413979.1 | short-chain dehydrogenase/reductase SDR [Parvibaculum lavamentivorans DS-1] >gb|ABS64322.1| short-chain dehydrogenase/reductase SDR [Parvibaculum lavamentivorans DS-1] | LigD
809 | 810 | YP_001412300.1 | short-chain dehydrogenase/reductase SDR [Parvibaculum lavamentivorans DS-1] >gb|ABS62643.1| short-chain dehydrogenase/reductase SDR [Parvibaculum lavamentivorans DS-1] | LigD
811 | 812 | YP_001412299.1 | short-chain dehydrogenase/reductase SDR [Parvibaculum lavamentivorans DS-1] >gb|ABS62642.1| short-chain dehydrogenase/reductase SDR [Parvibaculum lavamentivorans DS-1] | LigD
813 | 814 | BAH56685.1 | Calpha-dehydrogenase [Sphingobium sp. SYK-6] | LigD
815 | 816 | NP_959644.1 | short chain dehydrogenase [Mycobacterium avium subsp. paratuberculosis K-10] >ref|YP_880159.1| short chain dehydrogenase [Mycobacterium avium 104] >ref|ZP_05215302.1| short chain dehydrogenase [Mycobacterium avium subsp. avium ATCC 25291] >gb|AAS03027.1| hypothetical protein MAP_0710c [Mycobacterium avium subsp. paratuberculosis K-10] >gb|ABK67661.1| short chain dehydrogenase [Mycobacterium avium 104] >gb|EGO40035.1| short-chain alcohol dehydrogenase [Mycobacterium avium subsp. paratuberculosis S397] | LigD
817 | 818 | ZP_08717023.1 | short chain dehydrogenase [Mycobacterium colombiense CECT 3035] >gb|EGT85268.1| short chain dehydrogenase [Mycobacterium colombiense CECT 3035] | LigD
819 | 820 | ZP_05127447.1 | oxidoreductase, short chain dehydrogenase/reductase family protein [gamma proteobacterium NOR5-3] >gb|EED33994.1| oxidoreductase, short chain dehydrogenase/reductase family protein [gamma proteobacterium NOR5-3] | LigD
821 | 822 | YP_004555419.1 | Estradiol 17-beta-dehydrogenase [Sphingobium chlorophenolicum L-1] >gb|AEG50913.1| Estradiol 17-beta-dehydrogenase [Sphingobium chlorophenolicum L-1] | LigD
823 | 824 | YP_004230838.1 | short-chain dehydrogenase/reductase SDR [Burkholderia sp. CCGE1001] >gb|ADX57778.1| short-chain dehydrogenase/reductase SDR [Burkholderia sp. CCGE1001] | LigD
825 | 826 | YP_004284589.1 | putative oxidoreductase [Acidiphilium multivorum AIU301] >dbj|BAJ81707.1| putative oxidoreductase [Acidiphilium multivorum AIU301] | LigD
827 | 828 | YP_001235233.1 | hypothetical protein Acry_2115 [Acidiphilium cryptum JF-5] >gb|ABQ31314.1| short-chain dehydrogenase/reductase SDR [Acidiphilium cryptum JF-5] | LigD
829 | 830 | ZP_01617820.1 | hypothetical protein GP2143_09415 [marine gamma proteobacterium HTCC2143] >gb|EAW30413.1| hypothetical protein GP2143_09415 [marine gamma proteobacterium HTCC2143] | LigD
831 | 832 | ZP_08629833.1 | short-chain dehydrogenase/reductase [Bradyrhizobiaceae bacterium SG-6C] >gb|EGP07476.1| short-chain dehydrogenase/reductase [Bradyrhizobiaceae bacterium SG-6C] | LigD
833 834 YP_001853014.1 short-chain type dehydrogenase/reductase [Mycobacterium marinum M] >gb|ACC43159.1| short-chain type dehydrogenase/reductase [Mycobacterium marinum M] LigD
835 836 YP_004754457.1 short-chain dehydrogenase/reductase SDR [Collimonas fungivorans Ter331] >gb|AEK63634.1| short-chain dehydrogenase/reductase SDR [Collimonas fungivorans Ter331] LigD
837 838 ZP_05129129.1 short-chain dehydrogenase/reductase SDR [gamma proteobacterium NOR5-3] >gb|EED30944.1| short-chain dehydrogenase/reductase SDR [gamma proteobacterium NOR5-3] LigD
839 840 ZP_05223648.1 short chain dehydrogenase [Mycobacterium intracellulare ATCC 13950] LigD
841 842 YP_004555383.1 short-chain dehydrogenase/reductase SDR [Sphingobium chlorophenolicum L-1] >gb|AEG50877.1| short-chain dehydrogenase/reductase SDR [Sphingobium chlorophenolicum L-1] LigD
843 844 YP_976997.1 short chain dehydrogenase [Mycobacterium bovis BCG str. Pasteur 1173P2] >ref|YP_002643932.1| short-chain dehydrogenase [Mycobacterium bovis BCG str. Tokyo 172] >ref|ZP_06432004.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis T46] >ref|ZP_06449040.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis T17] >ref|ZP_06453700.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis K85] >ref|ZP_06508748.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis T92] >ref|ZP_06512283.1| short chain dehydrogenase [Mycobacterium tuberculosis EAS054] >ref|YP_004722558.1| short-chain type dehydrogenase/reductase [Mycobacterium africanum GM041182] >emb|CAL70889.1| Putative short-chain type dehydrogenase/reductase [Mycobacterium bovis BCG str. Pasteur 1173P2] >dbj|BAH25164.1| short-chain dehydrogenase [Mycobacterium bovis BCG str. Tokyo 172] >gb|EFD12419.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis T46] >gb|EFD42482.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis K85] >gb|EFD46215.1| short- LigD
845 846 ZP_01101659.1 Short-chain dehydrogenase/reductase SDR [Congregibacter litoralis KT71] >gb|EAQ98875.1| Short-chain dehydrogenase/reductase SDR [Congregibacter litoralis KT71] LigD
847 848 ZP_01615364.1 short chain dehydrogenase [marine gamma proteobacterium HTCC2143] >gb|EAW32447.1| short chain dehydrogenase [marine gamma proteobacterium HTCC2143] LigD
849 850 ZP_06436160.1 short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis CPHL_A] >gb|EFD16575.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis CPHL_A] LigD
851 852 NP_854532.1 short chain dehydrogenase [Mycobacterium bovis AF2122/97] >emb|CAD93736.1| PUTATIVE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE [Mycobacterium bovis AF2122/97] LigD
853 854 YP_004744317.1 putative short-chain type dehydrogenase/reductase [Mycobacterium canettii CIPT 140010059] >emb|CCC43191.1| putative short-chain type dehydrogenase/reductase [Mycobacterium canettii CIPT 140010059] LigD
855 856 YP_003947586.1 short-chain dehydrogenase/reductase sdr [Paenibacillus polymyxa SC2] >gb|ADO57345.1| Short-chain dehydrogenase/reductase SDR [Paenibacillus polymyxa SC2] LigD
857 858 YP_003951191.1 short-chain dehydrogenase/reductase [Stigmatella aurantiaca DW4/3-1] >gb|ADO69364.1| Short-chain dehydrogenase/reductase SDR [Stigmatella aurantiaca DW4/3-1] LigD
859 860 YP_583994.1 hypothetical protein Rmet_1846 [Cupriavidus metallidurans CH34] >gb|ABF08725.1| conserved hypothetical protein [Cupriavidus metallidurans CH34] LigD
861 862 NP_215366.1 short chain dehydrogenase [Mycobacterium tuberculosis H37Rv] >ref|YP_001282151.1| short chain dehydrogenase [Mycobacterium tuberculosis H37Ra] >ref|YP_001286813.1| short chain dehydrogenase [Mycobacterium tuberculosis F11] >ref|ZP_02549252.1| short chain dehydrogenase [Mycobacterium tuberculosis H37Ra] >ref|YP_003033128.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis KZN 1435] >ref|ZP_04924487.1| hypothetical protein TBCG_00842 [Mycobacterium tuberculosis C] >ref|ZP_04979832.1| hypothetical short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis str. Haarlem] >ref|ZP_05140274.1| short chain dehydrogenase [Mycobacterium tuberculosis '98-R604 INH-RIF-EM'] >ref|ZP_06444578.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis KZN 605] >ref|ZP_06503955.1| short chain dehydrogenase [Mycobacterium tuberculosis 02_1987] >ref|ZP_06516315.1| short chain dehydrogenase [Mycobacterium tuberculosis T85] >ref|ZP_06520361.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis GM 1503] >ref|ZP_06802023.1| short chain dehydrogenase [Mycobacterium tuberculosis 210] >ref|ZP_06951148.1| short chain LigD
863 864 YP_904525.1 short chain dehydrogenase [Mycobacterium ulcerans Agy99] >gb|ABL03054.1| short-chain type dehydrogenase/reductase [Mycobacterium ulcerans Agy99] LigD
865 866 ZP_06851131.1 short-chain dehydrogenase/reductase family oxidoreductase [Mycobacterium parascrofulaceum ATCC BAA-614] >gb|EFG75472.1| short-chain dehydrogenase/reductase family oxidoreductase [Mycobacterium parascrofulaceum ATCC BAA-614] LigD
867 868 YP_003871369.1 3-oxoacyl-[acyl-carrier-protein] reductase (3-ketoacyl-acyl carrier protein reductase) [Paenibacillus polymyxa E681] >gb|ADM70831.1| 3-oxoacyl-[acyl-carrier-protein] reductase (3-ketoacyl-acyl carrier protein reductase) [Paenibacillus polymyxa E681] LigD
869 870 ZP_05094873.1 oxidoreductase, short chain dehydrogenase/reductase family [marine gamma proteobacterium HTCC2148] >gb|EEB78920.1| oxidoreductase, short chain dehydrogenase/reductase family [marine gamma proteobacterium HTCC2148] LigD
871 872 ZP_01224235.1 probable oxidoreductase dehydrogenase signal peptide protein [marine gamma proteobacterium HTCC2207] >gb|EAS47242.1| probable oxidoreductase dehydrogenase signal peptide protein [marine gamma proteobacterium HTCC2207] LigD
873 874 YP_634033.1 short chain dehydrogenase [Myxococcus xanthus DK 1622] >gb|ABF86178.1| oxidoreductase, short chain dehydrogenase/reductase family [Myxococcus xanthus DK 1622] LigD
875 876 ABL97174.1 short-chain dehydrogenase/reductase [uncultured marine bacterium EBO 49D07] LigD
877 878 NP_335301.1 short chain dehydrogenase [Mycobacterium tuberculosis CDC1551] >ref|ZP_07413312.2| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis SUMu001] >ref|ZP_07668817.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis SUMu010] >ref|ZP_07669069.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis SUMu011] >gb|AAK45115.1| oxidoreductase, short-chain dehydrogenase/reductase family [Mycobacterium tuberculosis CDC1551] >gb|EFO75870.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis SUMu001] >gb|EFP48221.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis SUMu010] >gb|EFP52129.1| short-chain type dehydrogenase/reductase [Mycobacterium tuberculosis SUMu011] LigD
879 880 ZP_01627272.1 short-chain dehydrogenase/reductase SDR [marine gamma proteobacterium HTCC2080] >gb|EAW39988.1| short-chain dehydrogenase/reductase SDR [marine gamma proteobacterium HTCC2080] LigD
881 882 YP_002774647.1 short chain dehydrogenase [Brevibacillus brevis NBRC 100599] >dbj|BAH46143.1| probable short chain dehydrogenase [Brevibacillus brevis NBRC 100599] LigD
883 884 YP_004533909.1 short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] >emb|CCA92091.1| short-chain dehydrogenase/reductase SDR [Novosphingobium sp. PP1Y] LigD
885 886 ZP_04751842.1 short chain dehydrogenase [Mycobacterium kansasii ATCC 12478] LigD
887 888 ZP_08271356.1 short-chain dehydrogenase/reductase SDR [gamma proteobacterium IMCC3088] >gb|EGG29327.1| short-chain dehydrogenase/reductase SDR [gamma proteobacterium IMCC3088] LigD
889 890 YP_004666338.1 short chain dehydrogenase [Myxococcus fulvus HW-1] >gb|AEI65260.1| short chain dehydrogenase [Myxococcus fulvus HW-1] LigD
891 892 YP_001704647.1 putative short chain dehydrogenase/reductase [Mycobacterium abscessus ATCC 19977] >emb|CAM63993.1| Putative short chain dehydrogenase/reductase [Mycobacterium abscessus] LigD
893 894 ZP_07283949.1 cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase [Streptomyces sp. AA4] >gb|EFL12318.1| cis-2,3-dihydrobiphenyl-2,3-diol dehydrogenase [Streptomyces sp. AA4] LigD
895 896 YP_002005492.1 hypothetical protein RALTA_A1476 [Cupriavidus taiwanensis LMG 19424] >emb|CAQ69425.1| putative OXIDOREDUCTASE DEHYDROGENASE [Cupriavidus taiwanensis LMG 19424] LigD
897 898 YP_003543705.1 SDR-family protein [Sphingobium japonicum UT26S] >dbj|BAI95093.1| SDR-family protein [Sphingobium japonicum UT26S] LigD
899 900 YP_759628.1 short chain dehydrogenase/reductase family oxidoreductase [Hyphomonas neptunium ATCC 15444] >gb|ABI75402.1| oxidoreductase, short chain dehydrogenase/reductase family [Hyphomonas neptunium ATCC 15444] LigD
901 902 ZP_03543905.1 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni KF-1] >gb|EED68191.1| short-chain dehydrogenase/reductase SDR [Comamonas testosteroni KF-1] LigD
903 904 YP_003487191.1 hypothetical protein SCAB_14801 [Streptomyces scabiei 87.22] >emb|CBG68626.1| putative PROBABLE SHORT-CHAIN TYPE DEHYDROGENASE/REDUCTASE [Streptomyces scabiei 87.22] LigD
905 906 AEG69105.1 3-oxoacyl-[acyl-carrier-protein] reductase [Ralstonia solanacearum Po82] LigD
907 908 YP_003841993.1 short-chain dehydrogenase/reductase SDR [Clostridium cellulovorans 743B] >ref|ZP_07630916.1| short-chain dehydrogenase/reductase SDR [Clostridium cellulovorans 743B] >gb|ADL50229.1| short-chain dehydrogenase/reductase SDR [Clostridium cellulovorans 743B] LigD
909 910 YP_001899010.1 hypothetical protein Rpic_1437 [Ralstonia pickettii 12J] >gb|ACD26578.1| short-chain dehydrogenase/reductase SDR [Ralstonia pickettii 12J] LigD
911 912 ZP_07965490.1 short chain dehydrogenase [Segniliparus rugosus ATCC BAA-974] >gb|EFV13275.1| short chain dehydrogenase [Segniliparus rugosus ATCC BAA-974] LigD
913 914 NP_250228.1 short-chain dehydrogenase [Pseudomonas aeruginosa PAO1] >ref|ZP_01364886.1| hypothetical protein PaerPA_01001998 [Pseudomonas aeruginosa PACS2] >ref|YP_002441374.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa LESB58] >ref|ZP_04933207.1| hypothetical protein PA2G_00514 [Pseudomonas aeruginosa 2192] >gb|AAG04926.1|AE004582_4 probable short-chain dehydrogenase [Pseudomonas aeruginosa PAO1] >gb|EAZ57326.1| hypothetical protein PA2G_00514 [Pseudomonas aeruginosa 2192] >emb|CAW28518.1| probable short-chain dehydrogenase [Pseudomonas aeruginosa LESB58] >gb|EGM16253.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa 138244] LigD
915 916 YP_001020978.1 hypothetical protein Mpe_A1784 [Methylibium petroleiphilum PM1] >gb|ABM94743.1| putative oxidoreductase dehydrogenase signal peptide protein [Methylibium petroleiphilum PM1] LigD
917 918 YP_003745682.1 oxidoreductase dehydrogenase [Ralstonia solanacearum CFBP2957] >emb|CBJ43067.1| putative oxidoreductase dehydrogenase [Ralstonia solanacearum CFBP2957] LigD
919 920 ADD82954.1 BatM [Pseudomonas fluorescens] LigD
921 922 ZP_06846575.1 short-chain dehydrogenase/reductase family oxidoreductase [Mycobacterium parascrofulaceum ATCC BAA-614] >gb|EFG80090.1| short-chain dehydrogenase/reductase family oxidoreductase [Mycobacterium parascrofulaceum ATCC BAA-614] LigD
923 924 ZP_05041687.1 oxidoreductase, short chain dehydrogenase/reductase family [Alcanivorax sp. DG881] >gb|EDX89108.1| oxidoreductase, short chain dehydrogenase/reductase family [Alcanivorax sp. DG881] LigD
925 926 YP_726036.1 hypothetical protein H16_A1536 [Ralstonia eutropha H16] >emb|CAJ92668.1| conserved hypothetical protein [Ralstonia eutropha H16] LigD
927 928 ZP_08275744.1 Hypothetical Protein IMCC9480_775 [Oxalobacteraceae bacterium IMCC9480] >gb|EGF30787.1| Hypothetical Protein IMCC9480_775 [Oxalobacteraceae bacterium IMCC9480] LigD
929 930 YP_791716.1 putative short-chain dehydrogenase [Pseudomonas aeruginosa UCBPP-PA14] >ref|ZP_06879570.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa PAb1] >ref|ZP_07792770.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa 39016] >gb|ABJ10717.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa UCBPP-PA14] >gb|EFQ37866.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa 39016] >gb|EGM15719.1| putative short-chain dehydrogenase [Pseudomonas aeruginosa 152504] LigD
931 932 CAQ35702.1 oxidoreductase dehydrogenase protein [Ralstonia solanacearum MolK2] LigD
933 934 ZP_07966320.1 short chain dehydrogenase [Segniliparus rugosus ATCC BAA-974] >gb|EFV12481.1| short chain dehydrogenase [Segniliparus rugosus ATCC BAA-974] LigD
935 936 YP_002981437.1 hypothetical protein Rpic12D_1478 [Ralstonia pickettii 12D] >gb|ACS62765.1| short-chain dehydrogenase/reductase SDR [Ralstonia pickettii 12D] LigD
937 938 YP_004685391.1 C alpha-dehydrogenase LigD [Cupriavidus necator N-1] >gb|AEI76910.1| C alpha-dehydrogenase LigD [Cupriavidus necator N-1] LigD
939 940 ZP_00945631.1 Hypothetical Protein RRSL_01608 [Ralstonia solanacearum UW551] >ref|YP_002259522.1| oxidoreductase dehydrogenase protein [Ralstonia solanacearum IPO1609] >gb|EAP71895.1| Hypothetical Protein RRSL_01608 [Ralstonia solanacearum UW551] >emb|CAQ61454.1| oxidoreductase dehydrogenase protein [Ralstonia solanacearum IPO1609] LigD
941 942 NP_519890.1 hypothetical protein RSc1769 [Ralstonia solanacearum GMI1000] >emb|CAD15471.1| probable oxidoreductase dehydrogenase signal peptide protein [Ralstonia solanacearum GMI1000] LigD
943 944 ZP_07676733.1 oxidoreductase dehydrogenase signal peptide protein [Ralstonia sp. 5_7_47FAA] >gb|EFP64736.1| oxidoreductase dehydrogenase signal peptide protein [Ralstonia sp. 5_7_47FAA] LigD
945 946 YP_003752456.1 oxidoreductase dehydrogenase [Ralstonia solanacearum PSI07] >emb|CBJ51176.1| putative oxidoreductase dehydrogenase [Ralstonia solanacearum PSI07] LigD
947 948 YP_004533099.1 hypothetical protein PP1Y_AT3242 [Novosphingobium sp. PP1Y] >emb|CCA91281.1| conserved hypothetical protein [Novosphingobium sp. PP1Y] LigD
949 950 YP_001564386.1 hypothetical protein Daci_3363 [Delftia acidovorans SPH-1] >gb|ABX36001.1| short-chain dehydrogenase/reductase SDR [Delftia acidovorans SPH-1] LigD
951 952 YP_004488753.1 short-chain dehydrogenase/reductase SDR [Delftia sp. Cs1-4] >gb|AEF90398.1| short-chain dehydrogenase/reductase SDR [Delftia sp. Cs1-4] LigD
953 954 YP_001188109.1 short-chain dehydrogenase/reductase SDR [Pseudomonas mendocina ymp] >gb|ABP85377.1| short-chain dehydrogenase/reductase SDR [Pseudomonas mendocina ymp] LigD
955 956 ADP99633.1 short-chain dehydrogenase/reductase SDR [Marinobacter adhaerens HP15] LigD
957 958 YP_693638.1 short-chain dehydrogenase/reductase family protein [Alcanivorax borkumensis SK2] >emb|CAL17366.1| short-chain dehydrogenase/reductase family [Alcanivorax borkumensis SK2] LigD
short-chain dehydrogenase/reductase SDR [Cupriavidus metallidurans CH34] >gb|ABF10471.1| short-chain dehydrogenase/reductase SDR [Cupriavidus LigD
961 962 YP_003277769.1 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni CNB-2] >gb|ACY32473.1| short-chain dehydrogenase/reductase SDR [Comamonas testosteroni CNB-2] LigD
963 964 ZP_08406457.1 hypothetical protein HGR_11311 [Hylemonella gracilis ATCC 19624] >gb|EGI76405.1| hypothetical protein HGR_11311 [Hylemonella gracilis ATCC 19624] LigD
965 966 YP_003842521.1 short-chain dehydrogenase/reductase SDR [Clostridium cellulovorans 743B] >ref|ZP_07632312.1| short-chain dehydrogenase/reductase SDR [Clostridium cellulovorans 743B] >gb|ADL50757.1| short-chain dehydrogenase/reductase SDR [Clostridium cellulovorans 743B] LigD
967 968 ZP_07043693.1 short-chain dehydrogenase/reductase SDR [Comamonas testosteroni S44] >gb|EFI62855.1| short-chain dehydrogenase/reductase SDR [Comamonas testosteroni S44] LigD
969 970 YP_295629.1 hypothetical protein Reut_A1415 [Ralstonia eutropha JMP134] >gb|AAZ60785.1| Short-chain dehydrogenase/reductase SDR [Ralstonia eutropha JMP134] LigD
971 972 CBJ37979.1 putative oxidoreductase dehydrogenase [Ralstonia solanacearum CMR15] LigD
973 974 YP_004155471.1 short-chain dehydrogenase/reductase sdr [Variovorax paradoxus EPS] >gb|ADU37360.1| short-chain dehydrogenase/reductase SDR [Variovorax paradoxus EPS] LigD
975 976 YP_001353681.1 hypothetical protein mma_1991 [Janthinobacterium sp. Marseille] >gb|ABR91341.1| short-chain dehydrogenase/reductase SDR [Janthinobacterium sp. Marseille] LigD
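
The cross-referenced records in the DESCRIPTION column of the table above appear to follow the legacy NCBI defline layout, in which each record reads `>db|accession| description [organism]`. As an illustrative sketch only, and assuming the `|` delimiters are intact, such entries could be split into fields as shown below. The names `Defline` and `parse_deflines` are hypothetical helpers introduced for this example and are not part of the specification.

```python
import re
from typing import NamedTuple


class Defline(NamedTuple):
    """One cross-referenced record from a DESCRIPTION cell (hypothetical type)."""
    database: str     # source database tag, e.g. "gb", "ref", "emb", "dbj"
    accession: str    # accession with version, e.g. "ABR91341.1"
    description: str  # free-text protein description
    organism: str     # organism/strain taken from the trailing [brackets]


# Assumed layout: >db|accession| description [organism]
# The organism is the LAST bracketed group of a record, so a description such
# as "3-oxoacyl-[acyl-carrier-protein] reductase" is not split at its own
# brackets: the lookahead requires the next record (">") or end of text.
_RECORD = re.compile(
    r">(?P<db>[a-z]+)\|(?P<acc>[^|\s]+)\|\s*"
    r"(?P<desc>[^>]*?)\s*\[(?P<org>[^\[\]>]+)\]\s*(?=>|$)"
)


def parse_deflines(text: str) -> list[Defline]:
    """Return every >db|accession| record found in a DESCRIPTION cell."""
    return [
        Defline(m["db"], m["acc"], m["desc"], m["org"])
        for m in _RECORD.finditer(text)
    ]


if __name__ == "__main__":
    cell = (
        ">gb|ABR91341.1| short-chain dehydrogenase/reductase SDR "
        "[Janthinobacterium sp. Marseille]"
    )
    for record in parse_deflines(cell):
        print(record.accession, "|", record.organism)
```

The leading description of each row carries no `>db|accession|` prefix because its accession sits in the GENBANK ACCESSION NO: column, so this sketch only picks up the additional records listed after it.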
[00259] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, that there are many equivalents to the specific embodiments described herein that have been described and enabled to the extent that one of skill in the art can practice the invention well beyond the scope of the specific embodiments taught herein. Such equivalents are intended to be encompassed by the following claims. In addition, there are numerous lists and Markush groups taught and claimed herein. One of skill will appreciate that each such list and group contains various species and can be modified by the removal, or addition, of one or more species, since every list and group taught and claimed herein may not be applicable to every embodiment feasible in the practice of the invention. As such, components in such lists can be removed and are expected to be removed to reflect some embodiments taught herein. All publications, patents, patent applications, other references, accession numbers, ATCC numbers, etc., mentioned in this application are herein incorporated by reference into the specification to the same extent as if each was specifically indicated to be herein incorporated by reference in its entirety.

Claims (128)

1. An isolated recombinant polypeptide, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 1, 2, 4-8, 10-12, 14, 17, 19-22, 24, 25, 27-37, 39, 41-54, 57, 58, 60, 62-67, 69-73, 75, 77-80, 82-87, 89, 100, 102, 103, 104, 105, 107, 110-114, 117, 212, 122, 124-130, 133, 134, 137-139, 148, 149, 151-156, 159, 160, 166-168, 170, 173, 174, 178-181, 184, 185, 187-189, 198-201, 204, 205, 207, 210-216, 219, 222, 223, 226-232, 235-239, 242-246, 249, 251, 254, 257, 264, 266, 267, 270, 275, and 278 of SEQ ID NO:101;
wherein, an amino acid substitution outside of the conserved residues is a conservative substitution; and, the amino acid sequence functions to cleave a beta-aryl ether.
2. An isolated recombinant polypeptide, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101;
wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
3. The isolated recombinant polypeptide of claim 2, wherein the amino acid sequence functions to cleave a beta-aryl ether.
4. An isolated recombinant polypeptide, comprising:
SEQ ID NO:101; or conservative substitutions thereof outside of conserved residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101.
5. An isolated recombinant glutathione S-transferase enzyme, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101;
wherein, the amino acid sequence functions to cleave a beta-aryl ether.
6. An isolated recombinant glutathione S-transferase enzyme, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:101; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
7. An isolated recombinant polypeptide, comprising:
a length ranging from about 279 to about 281 amino acids;
a first amino acid region consisting of residues 19-54 from SEQ ID NO:101, or conservative substitutions thereof outside of conserved residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, and 50-54 of SEQ ID NO:101; and, a second amino acid region consisting of residues 98-221 from SEQ ID NO:101, or conservative substitutions thereof outside of conserved residues 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101.
8. An isolated recombinant glutathione S-transferase enzyme, comprising:
a length ranging from about 279 to about 281 amino acids;
a first amino acid region having at least 95% identity to residues 19-54 from SEQ ID NO:101 while conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, and 50-54 of SEQ ID NO:101; wherein, the first amino acid region is located in the recombinant polypeptide from about residue 14 to about residue 59; and,
a second amino acid region having at least 95% identity to residues 98-221 from SEQ ID NO:101 while conserving residues 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101; wherein, the second amino acid region is located in the recombinant polypeptide from about residue 93 to about residue 226; and,
wherein, the recombinant glutathione S-transferase enzyme functions to cleave a beta-aryl ether.
9. The isolated recombinant polypeptide of claim 8, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
10. A method of cleaving a beta-aryl ether bond, comprising:
contacting a polypeptide comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101, with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
11. The method of claim 10, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
12. The method of claim 10, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
13. The method of claim 10, wherein the solvent environment comprises water.
14. The method of claim 10, wherein the solvent environment comprises a polar organic solvent.
15. A method of cleaving a beta-aryl ether bond, comprising:
contacting a polypeptide comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101, with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
16. The method of claim 15, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
17. The method of claim 15, wherein the solvent environment comprises water.
18. The method of claim 15, wherein the solvent environment comprises a polar organic solvent.
19. A system for bioprocessing lignin-derived compounds, comprising:
a polypeptide comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
20. The system of claim 19, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
21. A recombinant polynucleotide comprising a nucleotide sequence that encodes a polypeptide comprising an amino acid sequence having at least 95% identity to SEQ ID NO:101, the amino acid sequence conserving residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101.
22. A recombinant polynucleotide comprising a nucleotide sequence that encodes a polypeptide comprising SEQ ID NO:101; or conservative substitutions thereof outside of conserved residues 19-22, 24, 25, 27-30, 33-36, 39-45, 47, 48, 50-54; 100, 101, 104, 111, 112, 115, 116, 166, 107, 184, 187, 188, 191, 192, and 195 of SEQ ID NO:101.
23. A vector comprising the polynucleotide of claim 21.
24. A vector comprising the polynucleotide of claim 22.
25. A plasmid comprising the polynucleotide of claim 21.
26. A plasmid comprising the polynucleotide of claim 22.
27. A host cell transformed by the vector of claim 23.
28. A host cell transformed by the vector of claim 24.
29. A method of cleaving a beta-aryl ether bond, comprising culturing the host cell of claim 27 under conditions suitable to produce the polypeptide;
recovering the polypeptide from the host cell culture; and, contacting the polypeptide with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
30. The method of claim 29, wherein the host cell is E. coli.
31. The method of claim 29, wherein the host cell is Azotobacter vinelandii.
32. The method of claim 29, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
33. The method of claim 29, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
34. The method of claim 29, wherein the solvent environment comprises water.
35. The method of claim 29, wherein the solvent environment comprises a polar organic solvent.
36. A method of cleaving a beta-aryl ether bond, comprising culturing the host cell of claim 28 under conditions suitable to produce the polypeptide;
recovering the polypeptide from the host cell culture; and, contacting the polypeptide with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
37. The method of claim 36, wherein the host cell is E. coli.
38. The method of claim 36, wherein the host cell is Azotobacter vinelandii.
39. The method of claim 36, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
40. The method of claim 36, wherein the solvent environment comprises water.
41. The method of claim 36, wherein the solvent environment comprises a polar organic solvent.
42. A system for bioprocessing lignin-derived compounds, comprising:
the transformed host cell of claim 27;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
43. The system of claim 42, wherein the transformed host cell comprises Azotobacter vinelandii.
44. The system of claim 42, wherein the transformed host cell expresses the polypeptide in the solvent in which the lignin-derived compound is soluble.
45. A system for bioprocessing lignin-derived compounds, comprising:
a transformant including a host cell transformed with the vector of claim 23, the transformant expressing the polypeptide;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
46. The system of claim 45, wherein the transformant comprises E. coli.
47. The system of claim 45, wherein the transformant comprises Azotobacter vinelandii.
48. The system of claim 45, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
49. The system of claim 45, wherein the transformed host cell expresses the polypeptide in the solvent in which the lignin-derived compound is soluble.
50. The system of claim 45, wherein the lignin-derived compound has a molecular weight ranging from about 180 Daltons to about 1000 Daltons.
51. The system of claim 45, wherein the solvent environment comprises water.
52. The system of claim 45, wherein the solvent environment comprises a polar organic solvent.
53. A system for bioprocessing lignin-derived compounds, comprising:
a transformant including a host cell transformed with the vector of claim 24, the transformant expressing the polypeptide;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
54. The system of claim 53, wherein the transformant comprises E. coli.
55. The system of claim 53, wherein the transformant comprises Azotobacter vinelandii.
56. The system of claim 53, wherein the transformed host cell expresses the polypeptide in the solvent in which the lignin-derived compound is soluble.
57. The system of claim 53, wherein the lignin-derived compound has a molecular weight ranging from about 180 Daltons to about 1000 Daltons.
58. The system of claim 53, wherein the solvent environment comprises water.
59. The system of claim 53, wherein the solvent environment comprises a polar organic solvent.
60. A system for bioprocessing lignin-derived compounds, comprising:
a transformant including an Azotobacter vinelandii host cell transformed with the vector of claim 23, the transformant expressing the polypeptide;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
61. The system of claim 60, wherein the transformed host cell expresses the polypeptide in the solvent in which the lignin-derived compound is soluble.
62. The system of claim 60, wherein the lignin-derived compound has a molecular weight ranging from about 180 Daltons to about 1000 Daltons.
63. The system of claim 60, wherein the solvent environment comprises water.
64. The system of claim 60, wherein the solvent environment comprises a polar organic solvent.
65. An isolated recombinant polypeptide, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:541, the amino acid sequence conserving residues 47-57, 63-76, 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206.
66. The isolated recombinant polypeptide of claim 65, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
67. The isolated recombinant polypeptide of claim 65, wherein, the amino acid sequence functions to cleave a beta-aryl ether.
68. An isolated recombinant polypeptide, comprising:
SEQ ID NO:541; or conservative substitutions thereof outside of conserved residues 47-57, 63-76, 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206.
69. An isolated recombinant glutathione S-transferase enzyme, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:541, the amino acid sequence conserving residues 47-57, 63-76, 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206;
wherein, the amino acid sequence functions to cleave a beta-aryl ether.
70. An isolated recombinant glutathione S-transferase enzyme, comprising:
an amino acid sequence having at least 95% identity to SEQ ID NO:541; wherein, the amino acid sequence functions to cleave a beta-aryl ether.
71. An isolated recombinant polypeptide, comprising:
a length ranging from about 256 to about 260 amino acids;
a first amino acid region consisting of residues 47-57 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues 47, 48, 49, 50, 52, 54, 55, 56, 57;
a second amino acid region consisting of 63-76 from SEQ ID NO:541; and, a third amino acid region consisting of residues 99-230 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206.
72. An isolated recombinant glutathione S-transferase enzyme, comprising:
a length ranging from about 279 to about 281 amino acids;
a first amino acid region having at least 95% identity to 47-57 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues 47, 48, 49, 50, 52, 54, 55, 56, 57;
a second amino acid region consisting of 63-76 from SEQ ID NO:541; and, a third amino acid region having at least 95% identity to residues 99-230 from SEQ ID NO:541, or conservative substitutions thereof outside of conserved residues 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206;
wherein, the recombinant glutathione S-transferase enzyme functions to cleave a beta-aryl ether.
73. The isolated recombinant polypeptide of claim 72, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
74. A method of cleaving a beta-aryl ether bond, comprising:
contacting an amino acid sequence having at least 95% identity to SEQ ID NO:541, the amino acid sequence conserving residues 47-57, 63-76, 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206 with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
75. The method of claim 74, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
76. The method of claim 74, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
77. The method of claim 74, wherein the solvent environment comprises water.
78. The method of claim 74, wherein the solvent environment comprises a polar organic solvent.
79. A method of cleaving a beta-aryl ether bond, comprising:
contacting a polypeptide comprising SEQ ID NO:541; or conservative substitutions thereof outside of conserved residues 47-57, 63-76, 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206 with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
80. The method of claim 79, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
81. The method of claim 79, wherein the solvent environment comprises water.
82. The method of claim 79, wherein the solvent environment comprises a polar organic solvent.
83. A system for bioprocessing lignin-derived compounds, comprising:
a polypeptide having at least 95% identity to SEQ ID NO:541, the amino acid sequence conserving residues 47-57, 63-76, 100, 101, 104, 107, 111, 112, 115, 116, 176, 194, 197, 198, 201, 202, and 206;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
84. The system of claim 83, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
85. A recombinant polynucleotide comprising a nucleotide sequence that encodes the polypeptide of claim 65.
86. A recombinant polynucleotide comprising a nucleotide sequence that encodes the polypeptide of claim 68.
87. A vector comprising the polynucleotide of claim 65.
88. A vector comprising the polynucleotide of claim 68.
89. A plasmid comprising the polynucleotide of claim 65.
90. A plasmid comprising the polynucleotide of claim 68.
91. A host cell transformed by the vector of claim 87.
92. A host cell transformed by the vector of claim 88.
93. A method of cleaving a beta-aryl ether bond, comprising culturing the host cell of claim 91 under conditions suitable to produce the polypeptide;
recovering the polypeptide from the host cell culture; and, contacting the polypeptide with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
94. The method of claim 93, wherein the host cell is E. coli.
95. The method of claim 93, wherein the host cell is Azotobacter vinelandii.
96. The method of claim 93, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
97. The method of claim 93, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
98. The method of claim 93, wherein the solvent environment comprises water.
99. The method of claim 93, wherein the solvent environment comprises a polar organic solvent.
100. A method of cleaving a beta-aryl ether bond, comprising culturing the host cell of claim 92 under conditions suitable to produce the polypeptide;
recovering the polypeptide from the host cell culture; and, contacting the polypeptide with a lignin-derived compound having (i) a beta-aryl ether bond and (ii) a molecular weight ranging from about 180 Daltons to about 3000 Daltons;
wherein, the contacting occurs in a solvent environment in which the lignin-derived compound is soluble.
101. The method of claim 100, wherein the host cell is E. coli.
102. The method of claim 100, wherein the host cell is Azotobacter vinelandii.
103. The method of claim 100, wherein the lignin-derived compound has a molecular weight of about 180 Daltons to about 1000 Daltons.
104. The method of claim 100, wherein the solvent environment comprises water.
105. The method of claim 100, wherein the solvent environment comprises a polar organic solvent.
106. A system for bioprocessing lignin-derived compounds, comprising:
the transformed host cell of claim 91;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
107. The system of claim 106, wherein the transformed host cell comprises Azotobacter vinelandii.
108. The system of claim 106, wherein the transformed host cell expresses the polypeptide of claim 65 in the solvent in which the lignin-derived compound is soluble.
109. A system for bioprocessing lignin-derived compounds, comprising:
a transformant including a host cell transformed with the vector of claim 87, the transformant expressing the polypeptide;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
110. The system of claim 109, wherein the transformant comprises E. coli.
111. The system of claim 109, wherein the transformant comprises Azotobacter vinelandii.
112. The system of claim 109, wherein an amino acid substitution outside of the conserved residues is a conservative substitution.
113. The system of claim 109, wherein the transformed host cell expresses the polypeptide in the solvent in which the lignin-derived compound is soluble.
114. The system of claim 109, wherein the lignin-derived compound has a molecular weight ranging from about 180 Daltons to about 1000 Daltons.
115. The system of claim 109, wherein the solvent environment comprises water.
116. The system of claim 109, wherein the solvent environment comprises a polar organic solvent.
117. A system for bioprocessing lignin-derived compounds, comprising:
a transformant including a host cell transformed with the vector of claim 88, the transformant expressing the polypeptide;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
118. The system of claim 117, wherein the transformant comprises E. coli.
119. The system of claim 117, wherein the transformant comprises Azotobacter vinelandii.
120. The system of claim 117, wherein the transformed host cell expresses the polypeptide in the solvent in which the lignin-derived compound is soluble.
121. The system of claim 117, wherein the lignin-derived compound has a molecular weight ranging from about 180 Daltons to about 1000 Daltons.
122. The system of claim 117, wherein the solvent environment comprises water.
123. The system of claim 117, wherein the solvent environment comprises a polar organic solvent.
124. A system for bioprocessing lignin-derived compounds, comprising:
a transformant including an Azotobacter vinelandii host cell transformed with the vector of claim 87, the transformant expressing the polypeptide;
a lignin-derived compound having a beta-aryl ether bond and a molecular weight ranging from about 180 Daltons to about 3000 Daltons; and, a solvent in which the lignin-derived compound is soluble;
wherein, the system functions to cleave the beta-aryl ether bond by contacting the polypeptide with the lignin-derived compound in the solvent.
125. The system of claim 124, wherein the transformed host cell expresses the polypeptide of claim 65 in the solvent in which the lignin-derived compound is soluble.
126. The system of claim 124, wherein the lignin-derived compound has a molecular weight ranging from about 180 Daltons to about 1000 Daltons.
127. The system of claim 124, wherein the solvent environment comprises water.
128. The system of claim 124, wherein the solvent environment comprises a polar organic solvent.
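The identity and conserved-residue limits recited in claims such as claim 65 can be illustrated with a minimal Python sketch. It assumes the candidate and reference sequences are already aligned to equal length with no gaps, which is a simplification (this section does not say how percent identity is computed). The function names are hypothetical, and the reference sequence must be supplied by the caller, since SEQ ID NO:541 is not reproduced here.

```python
# Conserved positions as recited in claim 65 (1-based, ranges inclusive).
CONSERVED_RANGES = [(47, 57), (63, 76)]
CONSERVED_SINGLES = [100, 101, 104, 107, 111, 112, 115, 116,
                     176, 194, 197, 198, 201, 202, 206]


def conserved_positions():
    """Return the set of 1-based positions that must remain unchanged."""
    positions = set(CONSERVED_SINGLES)
    for start, end in CONSERVED_RANGES:
        positions.update(range(start, end + 1))
    return positions


def percent_identity(reference, candidate):
    """Ungapped identity over two pre-aligned, equal-length sequences.

    Assumption: a real comparison would first run a sequence alignment;
    this sketch only counts matching positions.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(reference, candidate) if a == b)
    return 100.0 * matches / len(reference)


def meets_claim_65_sketch(reference, candidate, min_identity=95.0):
    """Hypothetical check: identity threshold plus unchanged conserved residues."""
    positions = conserved_positions()
    if len(reference) < max(positions):
        raise ValueError("reference is shorter than the last conserved position")
    if percent_identity(reference, candidate) < min_identity:
        return False
    # Convert 1-based claim positions to 0-based string indices.
    return all(reference[p - 1] == candidate[p - 1] for p in positions)
```

A candidate that differs from the reference only at a non-conserved position passes this check, while a single change at, say, position 50 fails it regardless of overall identity.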
CA2811403A 2010-09-15 2011-08-29 Bioproduction of aromatic chemicals from lignin-derived compounds Abandoned CA2811403A1 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US40340010P 2010-09-15 2010-09-15
US61/403,400 2010-09-15
US45570910P 2010-10-25 2010-10-25
US61/455,709 2010-10-25
PCT/US2011/049619 WO2012036884A2 (en) 2010-09-15 2011-08-29 Bioproduction of aromatic chemicals from lignin-derived compounds

Publications (1)

Publication Number Publication Date
CA2811403A1 (en) 2012-03-22

Family

ID=45832163

Family Applications (1)

Application Number Title Priority Date Filing Date
CA2811403A Abandoned CA2811403A1 (en) 2010-09-15 2011-08-29 Bioproduction of aromatic chemicals from lignin-derived compounds

Country Status (6)

Country Link
EP (1) EP2616481A4 (en)
JP (1) JP2014506115A (en)
CN (1) CN103797026A (en)
AU (1) AU2011302522A1 (en)
CA (1) CA2811403A1 (en)
WO (1) WO2012036884A2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9775347B2 (en) 2013-06-14 2017-10-03 Washington State University Methods to convert lignin to phenolic and carboxylate compounds
MY180037A (en) 2013-07-09 2020-11-20 Toray Industries Method of producing sugar liquid
CA3059588A1 (en) * 2017-04-17 2018-10-25 Board Of Trustees Of Michigan State University Methods for lignin depolymerization using thiols
WO2018204424A1 (en) * 2017-05-01 2018-11-08 National Technology & Engineering Solutions Of Sandia, Llc Novel compositions and methods for synthesizing deep eutectic solvents from lignin derived phenolic compounds
CN109423456B (en) * 2017-08-30 2022-08-19 中国石油化工股份有限公司 Azotobacter chroococcum as well as identification method and application thereof
KR101831966B1 (en) 2017-10-27 2018-02-23 경상대학교산학협력단 Method for producing humified lignin conversion product
CN117285663A (en) * 2023-10-17 2023-12-26 中国石油大学(华东) Method for mildly and stepwise separating lignocellulose biomass components

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4399216A (en) 1980-02-25 1983-08-16 The Trustees Of Columbia University Processes for inserting DNA into eucaryotic cells and for producing proteinaceous materials
ZA811368B (en) 1980-03-24 1982-04-28 Genentech Inc Bacterial polypedtide expression employing tryptophan promoter-operator
NZ201705A (en) 1981-08-31 1986-03-14 Genentech Inc Recombinant dna method for production of hepatitis b surface antigen in yeast
US4943529A (en) 1982-05-19 1990-07-24 Gist-Brocades Nv Kluyveromyces as a host strain
AU2353384A (en) 1983-01-19 1984-07-26 Genentech Inc. Amplification in eukaryotic host cells
US4713339A (en) 1983-01-19 1987-12-15 Genentech, Inc. Polycistronic expression vector construction
DD266710A3 (en) 1983-06-06 1989-04-12 Ve Forschungszentrum Biotechnologie Process for the biotechnical production of alkaline phosphatase
AU3145184A (en) 1983-08-16 1985-02-21 Zymogenetics Inc. High expression of foreign genes in schizosaccharomyces pombe
US4879231A (en) 1984-10-30 1989-11-07 Phillips Petroleum Company Transformation of yeasts of the genus pichia
GB8610600D0 (en) 1986-04-30 1986-06-04 Novo Industri As Transformation of trichoderma
US4946783A (en) 1987-01-30 1990-08-07 President And Fellows Of Harvard College Periplasmic protease mutants of Escherichia coli
US5010182A (en) 1987-07-28 1991-04-23 Chiron Corporation DNA constructs containing a Kluyveromyces alpha factor leader sequence for directing secretion of heterologous polypeptides
GB8724885D0 (en) 1987-10-23 1987-11-25 Binns M M Fowlpox virus promotors
EP0397687B1 (en) 1987-12-21 1994-05-11 The University Of Toledo Agrobacterium mediated transformation of germinating plant seeds
AU4005289A (en) 1988-08-25 1990-03-01 Smithkline Beecham Corporation Recombinant saccharomyces
EP0394538B1 (en) 1989-04-28 1996-10-16 Rhein Biotech Gesellschaft Für Neue Biotechnologische Prozesse Und Produkte Mbh A yeast cell of the genus schwanniomyces
FR2646437B1 (en) 1989-04-28 1991-08-30 Transgene Sa NOVEL DNA SEQUENCES, THEIR APPLICATION AS A SEQUENCE ENCODING A SIGNAL PEPTIDE FOR THE SECRETION OF MATURE PROTEINS BY RECOMBINANT YEASTS, EXPRESSION CASSETTES, PROCESSED YEASTS AND PROCESS FOR PREPARING THE SAME
EP0402226A1 (en) 1989-06-06 1990-12-12 Institut National De La Recherche Agronomique Transformation vectors for yeast yarrowia
FR2649120B1 (en) 1989-06-30 1994-01-28 Cayla NOVEL STRAIN AND ITS MUTANTS OF FILAMENTOUS MUSHROOMS, PROCESS FOR PRODUCING RECOMBINANT PROTEINS USING SAID STRAIN, AND STRAINS AND PROTEINS OBTAINED BY SAID METHOD
JP2002034557A (en) * 2000-07-18 2002-02-05 Rengo Co Ltd ENZYME FOR CLEAVING ARYLGLYCEROL-beta-ARYL ETHER TYPE BOND WHICH IS MAIN BOND BETWEEN UNIT OF LIGNIN, METHOD FOR PRODUCING THE SAME AND MICROORGANISM FOR PRODUCING THE SAME

Also Published As

Publication number Publication date
EP2616481A2 (en) 2013-07-24
AU2011302522A1 (en) 2013-05-02
CN103797026A (en) 2014-05-14
WO2012036884A2 (en) 2012-03-22
EP2616481A4 (en) 2014-04-02
WO2012036884A3 (en) 2012-08-02
JP2014506115A (en) 2014-03-13

Similar Documents

Publication Publication Date Title
US20120202257A1 (en) Lige-type enzymes for bioconversion of lignin-derived compounds
CA2811403A1 (en) Bioproduction of aromatic chemicals from lignin-derived compounds
Gall et al. A group of sequence-related sphingomonad enzymes catalyzes cleavage of β-aryl ether linkages in lignin β-guaiacyl and β-syringyl ether dimers
US11685908B2 (en) Prenyltransferase variants and methods for production of prenylated aromatic compounds
De Gonzalo et al. Bacterial enzymes involved in lignin degradation
US20200270585A1 (en) Cytochrome p450 and cytochrome p450 reductase polypeptides, encoding nucleic acid molecules and uses thereof
Majumdar et al. Roles of small laccases from Streptomyces in lignin degradation
Granja-Travez et al. Functional genomic analysis of bacterial lignin degraders: diversity in mechanisms of lignin oxidation and metabolism
US10961539B2 (en) Promoter system inducing expression by 3-hydroxypropionic acid and method for biological production of 3-hydroxypropionic acid using same
US11447754B2 (en) In vitro methods of chemical conversion using non-stereospecific glutathione lyases
Sun et al. Molecular cloning and biochemical characterization of two cinnamyl alcohol dehydrogenases from a liverwort Plagiochasma appendiculatum
WO2020210810A1 (en) Compositions and methods for using genetically modified enzymes
EP3090043B1 (en) 3-hydroxyisovalerate (hiv) synthase variants
WO2009115114A1 (en) Polypeptide having glyoxylase iii activity, polynucleotide encoding the same and uses thereof
US20220411766A1 (en) Compositions and methods for using genetically modified orthologous enzymes
Ribitsch et al. Heterologous expression and characterization of choline oxidase from the soil bacterium Arthrobacter nicotianae
US20220243230A1 (en) Bioconversion of 4-coumaric acid to resveratrol
US11981946B2 (en) Microorganisms and methods for producing 2-pyrone-4,6-dicarboxylic acid and other compounds
US20240102057A1 (en) Methods and compositions useful for the production of 4-vinylphenol
Nguyen et al. Recent Advances in Enzymatic Conversion of Lignin to Value Added Products
US20200255841A1 (en) Host cells and methods for reducing isoprenoid precursors and isoprenoids by geranylgeranyl reductase
RIYADI BIOCONVERSION OF LIGNIN BY OXIDATIVE ENZYMES FOR LIGNIN DEPOLYMERIZATION FROM TROPICAL BACTERIA ISOLATES
CA3191268A1 (en) Methods and cells for production of volatile compounds
Matsuno et al. An Elicitor-Inducible NADP-Malic Enzyme in Lithospermum erythrorhizon Cultured Cells: cDNA Cloning and Characterization
Gall Beta-Etherase and benzoyl-CoA pathway enzymes mediate biodegradation of lignin-related aromatic compounds

Legal Events

Date Code Title Description
FZDE Dead

Effective date: 20170829