WO2005082398A2 - Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on genes differentially expressed in muscle cells - Google Patents

Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on genes differentially expressed in muscle cells

Info

Publication number
WO2005082398A2
Authority
WO
WIPO (PCT)
Prior art keywords
protein
human
actin
mouse
gene
Prior art date
Application number
PCT/US2005/005596
Other languages
French (fr)
Other versions
WO2005082398A3 (en)
Inventor
John J. Kopchick
Karen T. Coschigano
Keith S. Boyce
Andres Kriete
Original Assignee
Ohio University
Icoria, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ohio University, Icoria, Inc.
Priority to EP05713932A (EP1732582A2)
Priority to AU2005216922A (AU2005216922A1)
Priority to CA002557181A (CA2557181A1)
Publication of WO2005082398A2
Publication of WO2005082398A3

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A61K38/16 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • A61K38/17 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • A61K38/1703 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • A61K38/1709 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals

Definitions

  • the human molecules, or antagonists thereof, could be used for protection against faster-than-normal biological aging, or to achieve slower-than-normal biological aging. It was also taught that the human molecules may also be used as markers of biological aging.
  • In provisional application Ser. No. 60/474,606, filed June 2, 2003 (our docket Kopchick7-USA), our research group used a gene chip to study the genetic changes in the liver of C57BL/6J mice that occur at frequent intervals of the aging process. Differential hybridization techniques were used to identify mouse genes that are differentially expressed in mice, depending upon their age. The level of gene expression of approximately 10,000 mouse genes (from the Amersham Codelink UniSet Mouse I Bioarray, product code 300013) in the liver of mice with average ages of 35, 49, 56, 77, 118, 133, 207, 403, 558 and 725 days was determined.
  • Complementary RNA derived from mice of different ages was screened for hybridization with oligonucleotide probes each specific to a particular mouse gene, each gene in turn representative of a particular mouse gene cluster (Unigene).
  • 60/460,415 (our docket: Kopchick6-USA), filed April 7, 2003, was similar, but complementary RNA, derived from RNA of mouse liver, was screened against a mouse gene chip. See also 60/506,716, filed Sept. 30, 2003 (Kopchick6.1). Gene chip analyses have also been used to identify genes differentially expressed in normal vs. hyperinsulinemic, hyperinsulinemic vs. type II diabetic, or normal vs. type II diabetic mouse pancreas, see U.S. Provisional Appl. 60/517,376, filed Nov. 6, 2003.
  • the invention relates to various nucleic acid molecules and proteins, and their use in (1) diagnosing hyperinsulinemia and type II diabetes, or conditions associated with their development, and (2) protecting mammals (including humans) against them.
  • diabetes mellitus A deficiency of insulin in the body results in diabetes mellitus, which affects about 18 million individuals in the United States. It is characterized by a high blood glucose (sugar) level and glucose spilling into the urine due to a deficiency of insulin. As more glucose concentrates in the urine, more water is excreted, resulting in extreme thirst, rapid weight loss, drowsiness, fatigue, and possibly dehydration. Because the cells of the diabetic cannot use glucose for fuel, the body uses stored protein and fat for energy, which leads to a buildup of acid (acidosis) in the blood.
  • acidosis acid
  • Type II diabetes is the predominant form found in the Western world; fewer than 8% of diabetic Americans have the type I disease.
  • Type I diabetes In Type I diabetes, formerly called juvenile-onset or insulin-dependent diabetes mellitus, the pancreas cannot produce insulin. People with Type I diabetes must have daily insulin injections. But they need to avoid taking too much insulin because that can lead to insulin shock, which begins with a mild hunger.
  • Type I diabetics are often characterized by their low or absent levels of circulating endogenous insulin, i.e., hypoinsulinemia (1). Islet cell antibodies causing damage to the pancreas are frequently present at diagnosis. Injection of exogenous insulin is required to prevent ketosis and sustain life.
  • Type II diabetes Type II diabetes, formerly called adult-onset or non-insulin-dependent diabetes mellitus (NIDDM), can occur at any age. The pancreas can produce insulin, but the cells do not respond to it. Type II diabetes is a metabolic disorder that affects approximately 17 million Americans. It is estimated that another 10 million individuals are "prone" to becoming diabetic. These vulnerable individuals can become resistant to insulin, a pancreatic hormone that signals glucose (blood sugar) uptake by fat and muscle. In order to maintain normal glucose levels, the islet cells of the pancreas produce more insulin, resulting in a condition called hyperinsulinemia.
  • NIDDM non-insulin-dependent diabetes mellitus
  • Type II diabetes is a metabolic disorder that is characterized by insulin resistance and impaired glucose-stimulated insulin secretion (2,3,4).
  • Type II diabetes and atherosclerotic disease are viewed as consequences of having the insulin resistance syndrome (IRS) for many years (5).
  • the current theory of the pathogenesis of Type II diabetes is often referred to as the "insulin resistance/islet cell exhaustion" theory.
  • a condition causing insulin resistance compels the pancreatic islet cells to hypersecrete insulin in order to maintain glucose homeostasis.
  • the islet cells eventually fail and the symptoms of clinical diabetes are manifested. Therefore, this theory implies that, at some point, peripheral hyperinsulinemia will be an antecedent of Type II diabetes.
  • Peripheral hyperinsulinemia can be viewed as the difference between what is produced by the β cell minus that which is taken up by the liver. Therefore, peripheral hyperinsulinemia can be caused by increased β cell production, decreased hepatic uptake or some combination of both. It is also important to note that it is not possible to determine the origin of insulin resistance once it is established since the onset of peripheral hyperinsulinemia leads to a condition of global insulin resistance. Multiple environmental and genetic factors are involved in the development of insulin resistance, hyperinsulinemia and type II diabetes. An important risk factor for the development of insulin resistance, hyperinsulinemia and type II diabetes is obesity, particularly visceral obesity (6,7,8). Type II diabetes exists world-wide, but in developed societies, the prevalence has risen as the average age of the population increases and the average individual becomes more obese.
  • Obesity and Diabetes are a serious and growing problem in the United States. Obesity-related health risks include high blood pressure, hardening of the arteries, cardiovascular disease, and Type II diabetes (also known as non-insulin-dependent diabetes mellitus, Type II diabetes) (9,10,11). Recent studies show that 85% of the individuals with Type II diabetes are obese (12).
  • Metformin insulin therapy for Type I and oral sulfonylureas and/or insulin therapy for Type II.
  • Metformin (Glucophage) was the first antidiabetic drug approved by FDA (May 1995) for the treatment of Type II diabetes since the oral sulfonylureas were introduced in 1984. Metformin promotes the use of insulin already in the blood. This May 1995 approval was followed by the September 1995 approval of another antidiabetic drug, Acarbose (Precose). It slows down the digestion and absorption of complex sugars, which reduces blood sugar levels after meals. Before 1982, insulin was purified from beef or pork pancreas. This was a problem for those diabetics allergic to animal insulin.
  • Complications of diabetes include retinopathy, neuropathy, and nephropathy (traditionally designated as microvascular complications) as well as atherosclerosis (a macrovascular complication).
  • mice Animal Models Transgenic Mouse Models of Diabetes or Diabetes Resistance. McGrane, et al., J. Biol. Chem. 263:11443-51 (1988) and Chen, et al., J. Biol. Chem., 269:15892-7 (1994) describe the genetic engineering of mice to express bovine growth hormone (bGH) or human growth hormone (hGH), respectively. These mice exhibited an enhanced growth phenotype. They also developed kidney lesions similar to those seen in diabetic glomerulosclerosis, see Yang, et al., Lab. Invest., 68:62-70 (1993). Ogueta, et al., J.
  • bGH bovine growth hormone
  • hGH human growth hormone
  • Endocrinol., 165:321-8 (2000) reported that transgenic mice expressing bovine GH develop arthritic disorder and self-antibodies.
  • Growth hormone has many roles, ranging from regulation of protein, fat and carbohydrate metabolism to growth promotion.
  • GH is produced in the somatotrophic cells of the anterior pituitary and exerts its effects either through the GH-induced action of IGF-I, in the case of growth promotion, or by direct interaction with the GHR on target cells including liver, muscle, adipose, and kidney cells.
  • Hyposecretion of GH during development leads to dwarfism, and hypersecretion before puberty leads to gigantism.
  • GH hypersecretion of GH results in acromegaly, a clinical condition characterized by enlarged facial bones, hands, feet, fatigue and an increase in weight. Of those individuals with acromegaly, 25% develop type II diabetes. This may be due to insulin resistance caused by the high circulating levels of GH leading to high circulating levels of insulin (Kopchick et al., Annual Rev. Nutrition 1999. 19:437-61). A further mode of GH action may be through the transcriptional regulation of a number of genes contributing to the physiological effects of GH.
  • mice have been made that express the GH antagonists bGH-G119R or hGH-G120R, and which exhibit a dwarf phenotype. Chen, et al., J. Biol. Chem., 269:15892-7 (1994); Chen, et al., Mol. Endocrinol., 5:1845-52 (1991); Chen, et al., Proc. Nat. Acad. Sci. USA 87:5061-5 (1990). These mice did not develop kidney lesions. See Yang (1993), supra.
  • mice phenotype. GHR/BP-KO mice, made diabetic by streptozotocin treatment, are protected from the development of diabetes-associated nephropathy. Bellush, et al., Endocrinol., 141:163-8 (2000).
  • High-Fat Diets have been shown to induce both obesity and Type II diabetes in laboratory animals (13).
  • high-fat fed animals had significantly elevated fasting blood-glucose and insulin levels and also demonstrated a decrease in insulin sensitivity (14).
  • Ahren and colleagues (15) reported evidence of insulin resistance as well as diminished glucose-stimulated insulin release, after feeding with a high-fat diet for 12 weeks.
  • Muscle Muscle tissue constitutes about 40% of the body mass. Muscles may be classified by location, i.e., skeletal if attached to bone, cardiac if forming the wall of the heart, and visceral if associated with another body organ. Muscles
  • Skeletal muscles are voluntary, while cardiac and visceral muscles are involuntary. It is also possible to classify muscles morphologically; skeletal and cardiac muscle cells are striated, whereas visceral muscle cells are not.
  • Each skeletal muscle is composed of many individual muscle cells called muscle fibers. The fibers are held together by
  • fascia fibrous connective-tissue membranes
  • The fascium which envelops the entire muscle is the epimysium, and the fascia which penetrate the muscle, separating the fibers into bundles (fasciculi), are called perimysium.
  • muscles are attached either directly to a bone, or indirectly through a tendon.
  • the individual muscle fibers (cells) comprise threadlike protein structures called myofibrils. There are over 600 muscles in the human body. We will have occasion later to refer to the gastrocnemius. It is a superficial muscle in the posterior compartment of the lower leg, which together with the underlying soleus forms the characteristic bulge of the calf.
  • Muscle cells respond to insulin by increasing glucose uptake from the bloodstream. Muscle tissue can become resistant to insulin, causing the beta cells to initially increase insulin secretion. Eventually, though, the beta cells become unable to compensate for this increasing insulin resistance from muscle and other cells, and they fail to respond to elevated blood glucose levels. Thus, clinical type 2 diabetes results from the combination of insulin resistance and impaired beta cell function. Defects in muscle glycogen synthesis are known to play a role in the development of insulin resistance. At least three steps (those mediated by glycogen synthase, hexokinase, and GLUT4) have been reported to be defective in patients with type 2 diabetes.
  • Myopathy is a general term used to describe any disease of muscles, such as the muscular dystrophies and myopathies associated with thyroid disease. It can be caused by endocrine disorders, including diabetes, metabolic disorders, infection or inflammation of the muscle, certain drugs and mutations in genes. In diabetes, myopathy is thought to be caused by neuropathy, a complication of diabetes. General symptoms of myopathies include muscle weakness of limbs, sometimes occurring during exercise, although in some cases the symptoms diminish as exercise increases. Depending on the type of myopathy, one muscle group may be more affected than others. See "Joint and Muscle Problems Associated with Diabetes", www.iddtinternational.org/jointandmuscleproblems.html [Last modified June 12, 2003].
  • Patti, et al., "Coordinated reduction of genes of oxidative metabolism in humans with insulin resistance and diabetes: Potential role of PGC1 and NRF1", Proc. Nat. Acad. Sci. (USA), 100(14): 8466 (July 8, 2003) used microarrays to analyze skeletal muscle expression of genes in nondiabetic insulin-resistant subjects at high risk for diabetes (based on family history of diabetes and Mexican-American ethnicity) and diabetic Mexican-American subjects. Of 7,129 sequences represented on the microarray, 187 were differentially expressed between control and diabetic subjects.
  • the top-ranked cellular component terms were mitochondrion, mitochondrial membrane, mitochondrial inner membrane, and ribosome, and the top- ranked process term was ATP biosynthesis.
  • the over-represented groups were energy generation, protein biosynthesis/ribosomal proteins, RNA binding, ribosomal structural protein, and ATP synthase complex.
  • the measured or calculated parameters were total body mass, lean body mass, left leg lean mass (by biopsy), maximum isometric left knee extension force, left knee extension force/left leg lean mass, Peak VO2/lean body mass, and Peak VO2/left leg lean mass.
  • There were 1178 "probe sets" (representing 1053 different Unigene clusters) for which differential expression was detected; 550 for which expression was higher in older women, and 628 the inverse effect. The differences ranged from 1.2 to 4 fold; most (78%) were less than 1.5 fold.
  • Kidney androgen-regulated protein gene was used as a positive control, as it is known to be up-regulated by DHT. See also Holland, et al., Abstract 607, "Identification of Genes Possibly Involved in Nephropathy of Bovine Growth Hormone Transgenic Mice" (Endocrine Society Meeting, June 22, 2000) and Coschigano, et al., Abstract 333, "Identification of Genes Potentially Involved in Kidney Protection During Diabetes" (Endocrine Society Meeting, June 22, 2000).
  • the following differential hybridization articles may also be of interest: Wada, et al.
  • Apoptotic cells undergo an orchestrated cascade of morphological changes such as membrane blebbing, nuclear shrinkage, chromatin condensation, and formation of apoptotic bodies, which then undergo phagocytosis by neighboring cells.
  • One of the hallmarks of cellular apoptosis is the cleavage of chromosomal DNA into discrete oligonucleosomal size fragments. This orderly removal of unwanted cells minimizes the release of cellular components that may affect neighboring tissue. In contrast, membrane rupture and release of cellular components during necrosis often leads to tissue inflammation.
  • Caspases are a family of cysteine proteases that are synthesized as inactive proenzymes. Their activation by apoptotic signals such as CD95 (Fas) death receptor activation or tumor necrosis factor results in the cleavage of specific target proteins and execution of the apoptotic program. Apoptosis may occur by either an extrinsic pathway involving the activation of cell surface death receptors (DR) or by an intrinsic mitochondrial pathway. Yoon, J-H., Gores, G.J. (2002) Death receptor-mediated apoptosis and the liver. J. Hepatology 37:400-410. These pathways are not mutually exclusive and some cell types require the activation of both pathways for maximal apoptotic signaling.
  • DR cell surface death receptors
  • type-I cells death receptor activation leads to the recruitment and activation of caspases-8/10 and the rapid cleavage and activation of caspase-3 in a mitochondrial-independent manner. Hepatocytes are members of the Type-II cells, in which mitochondria are essential for DR-mediated apoptosis
  • Bid, a Bcl2 interacting protein, mediates cytochrome c release from mitochondria in response to activation of cell surface death receptors.
  • DFF DNA fragmentation factor
  • DFF45 cleavage by activated caspase-3 results in its dissociation from DFF40 and allows the caspase-activated DNAse (CAD) activity of DFF40 to cleave chromosomal DNA.
  • the 40-kDa subunit of DNA fragmentation factor induces DNA fragmentation and chromatin condensation during apoptosis. Proc. Natl. Acad. Sci. USA. 95:8461-8466; Halenbeck, R., MacDonald, H.
  • CIDEs cell-death-inducing DFF45-like effectors
  • CIDE-3 a novel member of the cell-death-inducing DNA-fragmentation-factor (DFF45)-like effector family. Biochem. J. 370:195-203.
  • the CIDEs contain an N-terminal domain that shares homology with the N-terminal region of DFF45 and may represent a regulatory region via protein interaction. See Inohara, supra; Lugovskoy, A.A., Zhou, P., Chou, J.J., McCarty, J.S., Li, P., Wagner, G.
  • CIDE-A brown adipose tissue
  • BAT brown adipose tissue
  • CIDE-A can interact with and inhibit UCP1 in BAT and may therefore play a role in regulating energy balance, see Zhou supra.
  • CIDE-A is not expressed in either adult human or mouse liver tissue, see Inohara supra, Zhou supra. The human protein cell death activator CIDE-A is of particular interest because of its highly dramatic change in liver expression with age, first demonstrated in our Kopchick7 application, supra. CIDE-A expression is elevated in older normal mice. CIDE-A expression was studied for normal C57BL/6J mouse ages 35, 49, 77, 133, 207, 403 and 558 days. Expression is low at the first five data points, then rises sharply at 403 days, and again at 558 days. CIDE-A was therefore classified as an "unfavorable protein", i.e.
  • mice that are differentially expressed in the muscle (gastrocnemius) of mice, depending upon their development of hyperinsulinemia or type II diabetes.
  • complementary RNA derived from normal mice, or mouse models of hyperinsulinemia or type II diabetes, was screened for hybridization with oligonucleotide probes each specific to a particular mouse database DNA, the latter being identified, by database accession number, by the gene chip manufacturer.
  • Each database DNA in turn was also identified by the gene chip manufacturer as representative of a particular mouse gene cluster (Unigene).
  • this database DNA sequence is a full length genomic DNA or cDNA sequence, and is therefore either identical to, or otherwise encodes the same protein as does, a natural full-length genomic DNA protein coding sequence. Those which don't present at least a partial sequence of a natural gene or its cDNA equivalent.
  • mouse database DNA sequences whether full-length or partial, and whether cDNA or genomic DNA, are referred to herein as "mouse genes". When only the genomic sequence is intended, we will refer specifically to "genomic DNA" or "gDNA".
  • mouse proteins regardless of whether they are in fact full length sequences.
  • mouse genes which were differentially expressed (normal vs. hyperinsulinemic, hyperinsulinemic vs. diabetic, or normal vs. diabetic), as measured by different levels of hybridization of the respective cRNA samples with the particular probe corresponding to that mouse gene, were identified.
  • "normal" and "control" are used interchangeably in this specification, unless expressly stated otherwise.
  • the control or normal subject is a mouse which is normal vis-a-vis fasting insulin and fasting glucose levels.
  • normal means normal relative to those parameters, and does not necessitate that the mouse be normal in every respect.
  • a mouse gene is said to have exhibited a favorable behavior if, for a particular mouse age of observation, its average level of expression in mice which are in a more favored state is higher than that in mice which are in a less favored state.
  • A mouse gene is said to have exhibited an unfavorable behavior if, for a particular mouse age of observation, its average level of expression in mice which are in a more favored state is lower than that in mice which are in a less favored state.
  • mouse gene were observed at an age other than one of the ages noted in the Examples, we would have observed a still stronger differential expression behavior. Nonetheless, we must classify the mouse genes on the basis of the behavior which we actually observed, not the behavior which might have been observed at some other age.
  • mouse genes which exhibit strongly favorable or unfavorable differential expression behaviors.
  • a behavior is considered strong if the ratio of the higher level to the lower level is at least two-fold.
  • a mouse gene may still be identified as favorable or unfavorable even if none of its observed behaviors are strong as defined above.
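The favorable/unfavorable and "strong" definitions above can be illustrated with a minimal Python sketch. The function name `classify_behavior`, its arguments, and the example expression values are hypothetical and do not come from the specification; only the direction rule and the two-fold threshold are taken from the text.

```python
def classify_behavior(more_favored_level, less_favored_level, strong_ratio=2.0):
    """Classify one expression behavior observed at a single mouse age.

    more_favored_level: average expression in mice in the more favored state
    less_favored_level: average expression in mice in the less favored state
    Both levels are assumed to be positive (hypothetical inputs).
    Returns (direction, is_strong).
    """
    if more_favored_level <= 0 or less_favored_level <= 0:
        raise ValueError("expression levels must be positive")
    if more_favored_level == less_favored_level:
        return ("none", False)  # no differential expression observed
    # Favorable: expression is higher in the more favored state.
    if more_favored_level > less_favored_level:
        direction = "favorable"
    else:
        direction = "unfavorable"
    higher = max(more_favored_level, less_favored_level)
    lower = min(more_favored_level, less_favored_level)
    # Strong: ratio of the higher level to the lower level is at least two-fold.
    is_strong = (higher / lower) >= strong_ratio
    return (direction, is_strong)
```

For a control vs. hyperinsulinemic comparison, the control mice would be the more favored state, so a gene expressed at 200 units in controls and 80 units in hyperinsulinemic mice would be called favorable and strong under this sketch.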
  • tissue than in either normal or type II diabetic tissue (i.e., C<HI, HI>D) will be deemed both "unfavorable", by virtue of the control:hyperinsulinemic comparison, and "favorable", by virtue of the hyperinsulinemic:diabetic comparison.
  • This is one of several possible "mixed” expression patterns.
  • the genes/proteins with "mixed" expression patterns are, by definition, both partially favorable and partially unfavorable.
  • use of the wholly favorable or wholly unfavorable genes/proteins is preferred to use of the partially favorable or partially unfavorable ones.
  • mixed genes/proteins are those exhibiting a combination of favorable and unfavorable behavior.
  • a mixed gene/protein can be used as would a favorable gene/protein if its favorable behavior outweighs the unfavorable. It can be used as would an unfavorable gene/protein if its unfavorable behavior outweighs the favorable.
  • they are used in conjunction with other agents that affect their balance of favorable and unfavorable behavior.
  • a mouse gene is classified on the basis of the strongest C-HI behavior among the ages tested, the strongest HI-D behavior among the ages tested, and the strongest C-D behavior among the ages tested. If at least one of these three behaviors is significantly favorable, and none of the others of these three behaviors is significantly unfavorable, the mouse gene will be classified as wholly favorable and listed in subtable 1A of Master Table 1. However, that does not mean that it may not have exhibited a weaker but unfavorable expression behavior at some tested age.
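The classification rule above can be sketched as follows. This is only an illustration: the function name `classify_gene`, the dictionary layout, and the "unclassified" label are assumptions, and the "wholly unfavorable" and "mixed" branches are written by analogy with the stated "wholly favorable" rule rather than quoted from the text.

```python
def classify_gene(strongest):
    """Classify a mouse gene from its strongest behavior in each comparison.

    strongest: dict mapping 'C-HI', 'HI-D' and 'C-D' to 'favorable',
    'unfavorable', or None (no significant behavior in that comparison).
    """
    calls = {call for call in strongest.values() if call is not None}
    if calls == {"favorable"}:
        return "wholly favorable"      # at least one favorable, none unfavorable
    if calls == {"unfavorable"}:
        return "wholly unfavorable"    # by analogy (assumption)
    if calls == {"favorable", "unfavorable"}:
        return "mixed"                 # both kinds of behavior observed
    return "unclassified"              # no significant behavior (hypothetical label)
```

For example, a gene whose strongest C-HI behavior is unfavorable and whose strongest HI-D behavior is favorable comes out as "mixed", matching the C<HI, HI>D pattern described earlier.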
  • the "favorable", "unfavorable" and "mixed" mouse proteins of the present invention include the mouse database proteins listed in the Master Table in the same row as a particular "favorable", "unfavorable" or "mixed" mouse gene, respectively. These proteins may be the exact translation product of the identified mouse gene (database DNA). However, if they were sequenced directly, they could be shorter or longer than that translation product. They could also differ in sequence from the exact translation product as a result of post-translational modifications.
  • mouse proteins of interest also include mouse proteins which, while not listed in the table, correspond to (i.e., homologous to, i.e., which could be aligned in a statistically significant manner to) such mouse proteins or genes, and mouse proteins which are at least substantially identical or conservatively identical to the listed mouse proteins.
  • human genes databases DNAs
  • proteins were identified by searching a database comprising human DNAs or proteins for sequences corresponding to (i.e., homologous to, i.e., which could be aligned in a statistically significant manner to) the mouse gene or protein. More than one human protein may be identified as corresponding to a particular mouse chip probe and to a particular mouse gene.
  • "human genes" and "human proteins" are used in a manner analogous to that already discussed in the case of "mouse genes" and "mouse proteins".
  • the term "corresponding" does not mean identical, but rather implies the existence of a statistically significant sequence similarity, such as one sufficient to qualify the human protein or gene as a homologous protein or DNA as defined below.
  • the greater the degree of relationship as thus defined (i.e., by the statistical significance of each alignment used to connect the mouse cDNA to the human protein or gene, measured by an E value), the closer the correspondence.
  • the connection may be direct (mouse gene to human protein) or indirect (e.g., mouse gene to human gene, human gene to human protein). By "mouse gene", we mean the mouse gene from which the gene chip DNA in question was derived.
  • the human genes/proteins which most closely correspond, directly or indirectly, to the mouse genes are preferred, such as the one(s) ranked highest, or among the top two, top three, top four, top five, or top ten, by the statistical significance (E value) of the final alignment in the connection process.
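A minimal sketch of this ranking step is given below. It assumes the usual convention of BLAST-type searches, in which a smaller E value indicates a more statistically significant alignment, so the "top" hits are those with the lowest E values. The function name `top_matches` and the identifiers in the usage note are hypothetical.

```python
def top_matches(hits, n=1):
    """Return the n human candidates with the most significant final alignment.

    hits: list of (human_identifier, e_value) pairs, one per candidate.
    Assumption: a smaller E value means a more significant alignment.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sort ascending by E value; sorted() is stable, so ties keep input order.
    return sorted(hits, key=lambda hit: hit[1])[:n]
```

Calling `top_matches([("human_A", 1e-5), ("human_B", 1e-40), ("human_C", 0.01)], n=2)` would select `human_B` and then `human_A` as the two closest correspondents.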
  • the human genes/proteins deemed to correspond to our mouse genes are identified in the Master Tables. Note that it is possible to identify homologous full-length human genes and proteins, if they are present in the database, even if the query mouse DNA or protein sequence is not a full-length sequence. If there is no homologous full-length human gene or protein in the database, but there is a partial one, the latter may nonetheless be useful.
  • a partial protein may still have biological activity, and a molecule which binds the partial protein may also bind the full-length protein so as to antagonize a biological activity of the full-length protein.
  • a partial human gene may encode a partial protein which has biological activity, or the gene may be useful in the design of a hybridization probe or in the design of a therapeutic antisense DNA.
  • the partial genes and protein sequences may of course also be used in the design of probes intended to identify the full length gene or protein sequence.
  • a human protein For the sake of convenience, we refer to a human protein as favorable if (1) it is listed in Master Table 1 as corresponding to a favorable mouse gene, or (2) it is at least substantially identical or conservatively identical to a listed protein per (1), or (3) it is a member of a human protein class listed in Master Table 2 (if provided) as corresponding to a favorable mouse gene.
  • a human protein We define a human protein as unfavorable in an analogous manner.
  • a human gene which encodes a particular human protein may be classified in the same way as the human protein which it encodes. However, it should be noted that this classification is not based on the direct study of the expression of the human gene/protein. Of course, the human genes/proteins of ultimate interest will be the ones whose change in level of expression is, in fact, correlated, directly or inversely, with the change of state (normal, hyperinsulinemic, diabetic) of the subject.
  • one may formulate agents useful in screening humans at risk for progression toward hyperinsulinemia or toward type II diabetes, or protecting humans at risk thereof from progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state.
  • Agents which bind the "favorable" and "unfavorable" nucleic acids (e.g., the agent is a substantially complementary nucleic acid hybridization probe), or the corresponding proteins, may be used to evaluate whether a human subject is at increased or decreased risk for progression toward type II diabetes.
  • a subject with one or more elevated “unfavorable” and/or one or more depressed “favorable” genes/proteins is
  • the assay may be used as a preliminary screening assay to select subjects for further analysis, or as a formal diagnostic assay.
  • 35 use of the corresponding -mouse or human proteins, in diagnostic agents, to measure progression toward .hyperinsulinemia or type II diabetes, or protection against ⁇ the disorder(s), or to estimate related end ⁇ organ damage such as kidney damage; • • ' (5) use of the corresponding mouse or human proteins in assays to determine whether a substance binds to (and hence may neutralize) the protein; and (6) use of the neutralizing substance to protect 5 against the disorder (s) .
  • DNAs of interest include those which specifically hybridize to the aforementioned mouse or human genes, and are thus of interest as hybridization assay reagents or for 10 antisense therapy. They also include synthetic DNA sequences which encode the same polypeptide as is encoded by the database DNA, and thus are useful for producing the polypeptide in cell culture or in situ (i.e ., gene therapy) . Moreover, they include DNA sequences which encode , 15 polypeptides which are substantially structurally identical or conservatively identical in amino acid sequence to the mouse and human proteins identified in 'the Master Table 1, subtables 1A or IC. Finally, they include DHSTA sequences which encode peptide (including antibody) antagonists of the 20 proteins of Master Table 1, subtables IB or IC.
  • the related human DNAs may be identified by comparing the mouse sequence (or its AA translation product) to known human DNAs (and their AA translation products). Related human DNAs also may be identified by screening human cDNA or genomic DNA libraries using the mouse gene of the Master Table, or a fragment thereof, as a probe. If the mouse gene of Master Table 1 is not full-length, and there is no closely corresponding full-length mouse gene in the sequence databank, then the mouse DNA may first be used as a hybridization probe to screen a mouse cDNA library to isolate the corresponding full-length sequence. Alternatively, the mouse DNA may be used as a probe to screen a mouse genomic DNA library.
  • It is possible that the genes found to be unfavorable act indirectly by accentuating obesity. Consequently, it is within the compass of the present invention to use the favorable genes and proteins, or to use antagonists of the unfavorable genes and proteins, to protect against obesity, as well as against sequelae of obesity such as hyperinsulinemia and diabetes. Since type II diabetes is an age-related disease, the agents of the present invention may be used in conjunction with known anti-aging or anti-age-related disease agents.
  • FIG. 1a Body weight gain [Fig. 1a], fasting glucose [Fig. 1b] and fasting insulin [Fig. 1c] levels of mice on the HF or Std diets.
  • Figure 2 Expression levels of Actin, alpha, cardiac (Actc1, NM_009608) using RNA isolated from gastrocnemius muscle of individual diabetic HF mice and corresponding Std mice at different time points.
  • Figure 3 Data shown are expression levels for additional actin-related and actin-binding genes exhibiting a consistent decrease in expression in the HF mice in comparison to Std mice at all four time points (Fig. 3(a)) or at three of the four time points (Fig. 3(b)).
  • a "full length" gene is here defined as (1) a naturally occurring DNA sequence which begins with an initiation codon (almost always the Met codon, ATG), and ends with a stop codon in phase with said initiation codon (when introns, if any, are ignored), and thereby encodes a naturally occurring polypeptide with biological activity, or a naturally occurring precursor thereof, or (2) a synthetic DNA sequence which encodes the same polypeptide as that which is encoded by (1).
  • the gene may, but need not, include introns.
  • a "full-length" protein is here defined as a naturally occurring protein encoded by a full-length gene, or a protein derived naturally by post-translational modification of such a protein. Thus, it includes mature proteins, proproteins, preproteins and preproproteins. It also includes substitution and extension mutants of such naturally occurring proteins.
  • a mouse is considered to be a diabetic subject if, regardless of its fasting plasma insulin level, it has a fasting plasma glucose level of at least 190 mg/dL.
  • a mouse is considered to be a hyperinsulinemic subject if its fasting plasma insulin level is at least 0.67 ng/mL and it does not qualify as a diabetic subject.
  • a mouse is considered to be "normal" if it is neither diabetic nor hyperinsulinemic. Thus, normality is defined in a very limited manner.
  • a mouse is considered "obese" if its weight is at least 15% in excess of the mean weight for mice of its age and sex.
  • a mouse which does not satisfy this standard may be characterized as "nonobese", the term "normal" being reserved for use in reference to glucose and insulin levels as previously described.
  • a human is considered a diabetic subject if, regardless of his or her fasting plasma insulin level, the fasting plasma glucose level is at least 126 mg/dL.
  • a human is considered a hyperinsulinemic subject if the fasting plasma insulin level is more than 26 micro International Units/mL (it is believed that this is equivalent to 1.08 ng/mL), and the subject does not qualify as a diabetic subject.
  • a human is considered to be "normal" if he or she is neither diabetic nor hyperinsulinemic.
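The mouse and human definitions above amount to a simple decision rule, with the glucose test taking priority over the insulin test. A minimal Python sketch follows; the function and constant names are hypothetical, and only the numeric thresholds come from the text.

```python
# Thresholds quoted in the definitions above; all names are illustrative only.
MOUSE_GLUCOSE_MG_DL = 190.0    # diabetic if fasting glucose is at least this
MOUSE_INSULIN_NG_ML = 0.67     # hyperinsulinemic if fasting insulin is at least this
HUMAN_GLUCOSE_MG_DL = 126.0    # diabetic if fasting glucose is at least this
HUMAN_INSULIN_UIU_ML = 26.0    # hyperinsulinemic if fasting insulin is MORE than this


def classify_mouse(glucose_mg_dl: float, insulin_ng_ml: float) -> str:
    """Classify a mouse from fasting plasma glucose and insulin."""
    if glucose_mg_dl >= MOUSE_GLUCOSE_MG_DL:
        return "diabetic"          # regardless of the insulin level
    if insulin_ng_ml >= MOUSE_INSULIN_NG_ML:
        return "hyperinsulinemic"
    return "normal"


def classify_human(glucose_mg_dl: float, insulin_uiu_ml: float) -> str:
    """Classify a human from fasting plasma glucose and insulin."""
    if glucose_mg_dl >= HUMAN_GLUCOSE_MG_DL:
        return "diabetic"          # regardless of the insulin level
    if insulin_uiu_ml > HUMAN_INSULIN_UIU_ML:
        return "hyperinsulinemic"
    return "normal"


print(classify_mouse(150.0, 0.80))   # hyperinsulinemic
print(classify_human(130.0, 40.0))   # diabetic
```

Note that the mouse insulin threshold is inclusive ("at least") while the human threshold is exclusive ("more than"), which the two comparisons reflect.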
  • NIDDK Non-overweight
  • HDL cholesterol level >35 mg/dL (0.90 mmol/L)
  • the diagnostic and protective methods of the present invention are applied to human subjects exhibiting one or more of the aforementioned risk factors. Likewise, in a preferred embodiment, they are applied to human subjects who, while not diabetic, exhibit impaired glucose homeostasis (110 to <126 mg/dL).
  • the age of the subjects is at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, or at least 75.
  • NIDDK says that "The relative risk of diabetes increases by approximately 25 percent for each additional unit of BMI over 22."
  • the BMIs of the human subjects are at least 23, at least 24, at least 25 (i.e., overweight by our criterion), at least 26, at least 27, at least 28, at least 29, at least 30 (i.e., obese), at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, or over 40.
  • Age-related (senescent) diseases include certain cancers, atherosclerosis, diabetes (type 2), osteoporosis, hypertension, depression, Alzheimer's, Parkinson's, glaucoma, certain immune system defects, kidney failure, and liver steatosis. In general, they are diseases for which the relative risk (comparing a subpopulation over age 55 to a suitably matched population under age 55) is at least 1.1.
  • the agents of the present invention protect against one or more age-related diseases for at least a subpopulation of mature (post-puberty) adult subjects.
  • the mouse or human genes may be used directly. For diagnostic or screening purposes, they (or specific binding fragments thereof) may be labeled and used as hybridization probes. For therapeutic purposes, they (or specific binding fragments thereof) may be used as antisense reagents to inhibit the expression of the corresponding gene, or of a sufficiently homologous gene of another species. If the database DNA appears to be a full-length cDNA or gDNA, that is, it encodes an entire, functional, naturally occurring protein, then it may be used in the expression of that protein. Likewise, if the corresponding human gene is known in full-length, it may be used to express the human protein.
  • the disclosed genes (gDNA or cDNA) have significant similarities to known DNAs (and their translated AA sequences to known proteins)
  • the results of several such searches are set forth in the Examples. Such results are dependent, to some degree, on the search parameters. Preferred parameters are set forth in Example 1.
  • the results are also dependent on the content of the database. While the raw similarity score of a particular target (database) sequence will not vary with content (as long as it remains in the database) , its informational value (in bits), expected value, and relative ranking can change. Generally speaking, the changes are small.
  • the nucleic acid and protein databases keep growing. Hence a later search may identify high scoring target sequences which were not uncovered by an earlier search because the target sequences were not previously part of a database.
  • the cognate DNAs and proteins include not only those set forth in the examples, but those which would have been highly ranked (top ten, more preferably top three, even more preferably top two, most preferably the top one) in a search run with the same parameters on the date of filing of this application.
  • an antagonist of a protein or other molecule may be obtained by preparing a combinatorial library, as described below, of potential antagonists, and screening the library members for binding to the protein or other molecule in question. The binding members may then be further screened for the ability to antagonize the biological activity of the target.
  • the antagonists may be used therapeutically, or, in suitably labeled or immobilized form, diagnostically. If the identified mouse or human database DNA is related to a known protein, then, substances known to interact with that protein (e.g., agonists, antagonists, substrates, receptors, second messengers, regulators, and so forth) , and binding molecules which bind them, are also of utility. Such binding molecules can likewise be identified by screening a combinatorial library.
  • the possession of one DNA greatly facilitates the isolation of homologous DNAs. If only a partial DNA is known, this partial DNA may first be used as a probe to isolate the corresponding full length DNA for the same species, and then the latter may be used as the starting DNA in the search for homologous genes.
  • the starting DNA, or a fragment thereof is used as a hybridization probe to screen a cDNA or genomic DNA library for clones containing inserts which encode either the entire homologous protein, or a recognizable fragment thereof.
  • the minimum length of the hybridization probe is dictated by the need for specificity.
  • the human cDNA library is about 10^8 bases and the human genomic DNA library is about 10^10 bases.
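The link between library size and minimum probe length can be illustrated with a rough estimate. The sketch below assumes a random sequence with equal base frequencies (an assumption not stated in the text), so that an n-base probe is expected to match about N / 4^n positions by chance in a library of N bases. The function name and the "fewer than one chance hit" criterion are hypothetical choices for illustration.

```python
def min_probe_length(library_bases: int, max_chance_hits: float = 1.0) -> int:
    """Smallest probe length n whose expected number of chance matches,
    library_bases / 4**n, falls below max_chance_hits.

    Assumes a random library with equal frequencies of A, C, G and T.
    """
    n = 1
    while library_bases / 4 ** n >= max_chance_hits:
        n += 1
    return n


print(min_probe_length(10 ** 8))    # 14 (cDNA library of about 10^8 bases)
print(min_probe_length(10 ** 10))   # 17 (genomic library of about 10^10 bases)
```

Real genomes are not random, so practical probes are usually chosen longer than this lower bound.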
  • the library is preferably derived from an organism which is known, on biochemical evidence, to produce a homologous protein, and more preferably from the genomic DNA or mRNA of cells of that organism which are likely to be relatively high producers of that protein.
  • a cDNA library (which is derived from an mRNA library) is especially preferred.
  • a synthetic hybridization probe may be used which encodes the same amino acid sequence but whose codon utilization is more similar to that of the DNA of the target organism.
  • the synthetic probe may employ inosine as a substitute for those bases which are most likely to be divergent, or the probe may be a mixed probe which mixes the codons for the source DNA with the preferred codons (encoding the same amino acid) for the target organism.
  • the Tm of a perfect duplex of starting DNA is determined. One may then select a hybridization temperature which is sufficiently lower than the perfect duplex Tm to allow hybridization.
  • the library is screened under conditions where the temperature is at least 20°C, more preferably at least 50°C, below the perfect duplex Tm. Since salt reduces the Tm, one ordinarily would carry out the search for DNAs encoding highly homologous proteins under relatively low salt hybridization conditions, e.g., <1M NaCl.
  • the corresponding mouse protein can be identified by performing a BlastX search on a mouse protein database with the mouse database DNA sequence as the query sequence. Even if the protein sequence is not in the database, if the DNA sequence comprises a full-length coding sequence, the corresponding protein can be identified by translating the coding sequence in accordance with the Genetic Code.
  • a human protein can be said to be identifiable as corresponding (homologous) to a gene chip DNA if it is identified as corresponding (homologous) to the mouse gene (gDNA or cDNA, whole or partial) identified by the gene chip manufacturer as corresponding to that gene chip DNA. In turn, it is identifiable as corresponding (homologous) to said identified mouse gene, if
  • BlastX it is encoded by a human gene, or can be aligned to a human gene by BlastX, which in turn can be aligned by BlastN to said mouse gene and/or
  • BlastP to a mouse protein, the latter being encoded by said mouse gene, or aligned to said mouse gene BlastX, where any alignment by BlastN, BlastP or BlastX is in accordance with the default parameters set forth below, and the expected value (E) of each alignment (the probability that such an alignment would have occurred by chance alone) is less than e-10. (Note that because this is a negative exponent, a value such as e-50 is less than e-10.)
  • a human protein with a score worse (i.e., higher) than e-50 may appear in Master Table 1. If the manufacturer of the gene chip identifies the gene chip DNA as corresponding to an EST, or other DNA which is not a full-length mouse gene or cDNA, a longer (possibly full length) mouse gene or cDNA may be identified by a BlastN search of the mouse DNA database.
  • the identified DNA may be used to conduct a BlastN search of a human DNA database, or a BlastX search of a mouse or human protein database.
  • a human protein can be said to be identifiable as corresponding (homologous) to a gene chip DNA, or to a DNA identified by the manufacturer as corresponding to that gene chip DNA, if
  • any alignment by BlastN, BlastP, or BlastX is in accordance with the default parameters set forth below, and the expected value (E) of each alignment (the probability that such an alignment would have occurred by chance alone) is less than e-10. (Note that because this is a negative exponent, a value such as e-50 is less than e-10.)
  • the E value is less than e-50, more preferably less than e-60, still more preferably less than e-70, even more preferably less than e-80, considerably more preferably less than e-90, and most preferably less than e-100.
  • one or more of these standards of preference are met for two, three, four or all five of conditions (1')-(5').
  • the E value is preferably so limited for all of said alignments in the connecting chain.
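The E-value conditions above can be sketched as a check over the chain of alignments connecting a human protein to the gene chip DNA. This sketch reads "e-10" as 10^-10, which is the usual BLAST notation but is an interpretation here; the function and constant names are hypothetical.

```python
from typing import Optional, Sequence

E_CUTOFF = 1e-10   # baseline: every alignment in the chain must be below this
PREFERRED_CUTOFFS = [1e-50, 1e-60, 1e-70, 1e-80, 1e-90, 1e-100]


def chain_is_homologous(e_values: Sequence[float], cutoff: float = E_CUTOFF) -> bool:
    """True if the chain is non-empty and every alignment has E below the cutoff."""
    return len(e_values) > 0 and all(e < cutoff for e in e_values)


def strictest_cutoff_met(e_values: Sequence[float]) -> Optional[float]:
    """Most stringent preferred cutoff satisfied by all alignments, else None."""
    if not e_values:
        return None
    met = [c for c in PREFERRED_CUTOFFS if all(e < c for e in e_values)]
    return min(met) if met else None


# Example chain: human protein -> human gene (BlastX), human gene -> mouse gene (BlastN)
chain = [1e-75, 1e-64]
print(chain_is_homologous(chain))    # True
print(strictest_cutoff_met(chain))   # 1e-60
```

Because the weakest alignment bounds the whole chain, the second function reports the tier met by the largest E value in the chain.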
  • a human gene corresponds (is homologous) to a gene chip DNA or manufacturer identified corresponding DNA if it encodes a homologous human protein as defined above, or if it can be aligned either directly to that DNA, or indirectly through a mouse gene which can be aligned to said DNA, according to the conditions set forth above. Master Table 1 assembles a list of human proteins corresponding to each of the mouse DNAs/proteins identified as related to the chip DNA. These human proteins form a set and can be given a percentile rank, with respect to E value, within that set.
  • the human proteins of the present invention preferably are those scorers with a percentile rank of at least 50%, more preferably at least 60%, still more preferably at least 70%, even more preferably at least 80%, and most preferably at least 90%.
  • These human proteins form a subset of the set above and can be given a percentile rank within that subset, e.g., the human proteins with scores in the top 10% of that subset have a percentile rank of 90% or higher.
  • the human proteins of the present invention preferably are those best scorer subset proteins with a percentile rank within the subset of at least 50%, more preferably at least 60%, still more preferably at least 70%, even more preferably at least 80%, and most preferably at least 90%.
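A percentile rank by E value can be sketched as follows. The text does not give a formula, so this sketch assumes one: a protein's rank is the percentage of set members with a strictly worse (higher) E value, which places the best of ten proteins at 90%. The function name and the example labels are hypothetical.

```python
def percentile_ranks(e_by_protein: dict) -> dict:
    """Map each protein to the percentage of set members with a worse (higher) E value.

    Assumed definition: lower E is better, and tied proteins share a rank.
    """
    values = list(e_by_protein.values())
    n = len(values)
    return {
        name: 100.0 * sum(1 for v in values if v > e) / n
        for name, e in e_by_protein.items()
    }


# Hypothetical E values for four human proteins matched to one mouse gene.
ranks = percentile_ranks({"A": 1e-100, "B": 1e-50, "C": 1e-20, "D": 1e-12})
preferred = [name for name, r in ranks.items() if r >= 50.0]
print(ranks)       # {'A': 75.0, 'B': 50.0, 'C': 25.0, 'D': 0.0}
print(preferred)   # ['A', 'B']
```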
  • BlastN and BlastX report very low expected values as
  • a human protein may be said to be functionally homologous to the mouse gene if the human protein has at least one biological activity in common with the mouse protein encoded by said mouse gene.
  • the human proteins of interest also include those that are substantially and/or conservatively identical (as defined below) to the homologous and/or functionally homologous human proteins defined above.
  • the degree of differential expression may be expressed as the ratio of the higher expression level to the lower expression level. Preferably, this is at least 2-fold, and more preferably, it is higher, such as at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, or at least 10-fold.
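The ratio described above is symmetric in the two conditions, since the higher level is always divided by the lower. A minimal sketch, with a hypothetical function name and illustrative expression values:

```python
def fold_difference(level_a: float, level_b: float) -> float:
    """Ratio of the higher expression level to the lower one (always >= 1)."""
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


# Illustrative levels for one gene in Std and HF mice (made-up numbers).
ratio = fold_difference(120.0, 40.0)
print(ratio)           # 3.0
print(ratio >= 2.0)    # True: meets the "at least 2-fold" preference
```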
  • the human protein of interest corresponds to a mouse gene for which the degree of differential expression places it among the top 10% of the mouse genes in the appropriate subtable.
  • the complementary strand of the gene, or a portion thereof, may be used in labeled form as a hybridization probe to detect messenger RNA and thereby monitor the level of expression of the gene in a subject. Elevated levels are indicative of progression, or propensity to progression, to a less favored state, and clinicians may take appropriate preventative, curative or ameliorative action.
  • the messenger RNA product (or equivalent cDNA), the protein product, or a binding molecule specific for that product (e.g., an antibody which binds the product), or a downstream product which mediates the activity (e.g., a signaling intermediate) or a binding molecule (e.g., an antibody) therefor, may be used, preferably in labeled or immobilized form, as an assay reagent in an assay for said nucleic acid product, protein product, or downstream product (e.g., a signaling intermediate).
  • elevated levels are indicative of a present or future problem.
  • an agent which down-regulates expression of the gene may be used to reduce levels of the corresponding protein and thereby inhibit further damage.
  • This agent could inhibit transcription of the gene in the subject, or translation of the corresponding messenger RNA.
  • Possible inhibitors of transcription and translation include antisense molecules and repressor molecules.
  • the agent could also inhibit a post-translational modification (e.g., glycosylation, phosphorylation, cleavage, GPI attachment) required for activity, or post-translationally modify the protein so as to inactivate it.
  • it could be an agent which down- or up-regulates a positive or negative regulatory gene, respectively.
  • an agent which is an antagonist of the messenger RNA product or protein product of the gene, or of a downstream product through which its activity is manifested (e.g., a signaling intermediate), may be used to inhibit its activity.
  • This antagonist could be an antibody, a peptide, a peptoid, a nucleic acid, a peptide nucleic acid (PNA) oligomer, a small organic molecule of a kind for which a combinatorial library exists (e.g., a benzodiazepine), etc.
  • An antagonist is simply a binding molecule which, by binding, reduces or abolishes the undesired activity of its target.
  • the antagonist if not an oligomeric molecule, is
  • an agent which degrades, or abets the degradation of, that messenger RNA, its protein product or a downstream product which mediates its activity e.g., a
  • the complementary strand of the gene, or a portion thereof may be used in labeled form as a hybridization probe to detect messenger RNA and thereby monitor the level of expression of the gene in a subject. Depressed levels are indicative of damage, or possibly of a
  • RNA product the messenger RNA product, the equivalent cDNA, protein product, or a binding molecule specific for those products, or a downstream product, or a signaling
  • an agent which up-regulates expression of the gene may be used to increase levels of the corresponding protein and thereby inhibit further progression to a less favored state.
  • it could be a vector which carries a copy of the gene, but which expresses the gene at higher levels than does the endogenous expression system.
  • it could be an agent which up- or down-regulates a positive or negative regulatory gene.
  • an agent which is an agonist of the protein product of the gene, or of a downstream product through which its activity (of inhibition of progression to a less favored state) is manifested, or of a signaling intermediate may be used to foster its activity.
  • an agent which inhibits the degradation of that protein product or of a downstream product or of a signaling intermediate may be used to increase the effective period of activity of the protein.
  • Mutant Proteins The present invention also contemplates mutant proteins (peptides) which are substantially identical (as defined below) to the parental protein (peptide).
  • the fewer the mutations the more likely the mutant protein is to retain the activity of the parental protein.
  • the effect of mutations is usually (but not always) additive. Certain individual mutations are more likely to be tolerated than others.
  • a protein is more likely to tolerate a mutation which (a) is a substitution rather than an insertion or deletion; (b) is an insertion or deletion at the terminus, rather than internally, or, if internal, is at a domain boundary, or a loop or turn, rather than in an alpha helix or beta strand; (c) affects a surface residue rather than an interior residue; (d) affects a part of the molecule distal to the binding site; (e) is a substitution of one amino acid for another of similar size, charge, and/or hydrophobicity, and does not destroy a disulfide bond or other crosslink; and (f) is at a site which is subject to substantial variation among a family of homologous proteins to which the protein of interest belongs.
  • Binding Site Residues forming the binding site may be identified by (1) comparing the effects of labeling the surface residues before and after complexing the protein to its target, (2) labeling the binding site directly with affinity ligands, (3) fragmenting the protein and testing the fragments for binding activity, and (4) systematic mutagenesis (e.g., alanine-scanning mutagenesis) to determine which mutants destroy binding. If the binding site of a homologous protein is known, the binding site may be postulated by analogy. Protein libraries may be constructed and screened so that a large family (e.g., 10^8) of related mutants may be evaluated simultaneously. Hence, the mutations are preferably conservative modifications as defined below.
  • Substantially Identical A mutant protein (peptide) is substantially identical to a reference protein (peptide) if (a) it has at least 10% of a specific binding activity or a non-nutritional biological activity of the reference protein, and (b) is at least 50% identical in amino acid sequence to the reference protein (peptide). It is "substantially structurally identical" if condition (b) applies, regardless of (a).
  • Percentage amino acid identity is determined by aligning the mutant and reference sequences according to a rigorous dynamic programming algorithm which globally aligns their sequences to maximize their similarity, the similarity being scored as the sum of scores for each aligned pair according to an unbiased PAM250 matrix, and a penalty for each internal gap of -12 for the first null of the gap and -4 for each additional null of the same gap.
  • the percentage identity is the number of matches expressed as a percentage of the adjusted (i.e., counting inserted nulls) length of the reference sequence.
  • a mutant DNA sequence is substantially identical to a reference DNA sequence if they are structural sequences, and encode mutant and reference proteins which are substantially identical as described above.
  • mutant sequence has at least 10% of the regulatory activity of the reference sequence, and is at least 50% identical in nucleotide sequence to the reference sequence. Percentage identity is determined as for proteins except that matches are scored +5, mismatches -4, the gap open penalty is -12, and the gap extension penalty (per additional null) is -4. More preferably, the sequence is not merely substantially identical but rather is at least 51%, at least 66%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical in sequence to the reference sequence.
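The nucleotide identity calculation described above can be sketched with a global alignment using affine gap penalties (a Gotoh-style dynamic program), using the stated scores of +5, -4, -12 and -4. This is an illustrative sketch, not the implementation used for the searches: the function name is hypothetical, and for simplicity it penalizes terminal gaps in the same way as internal ones, which is an assumption. Percent identity is the number of matched columns divided by the alignment length, which equals the reference length plus the nulls inserted into it.

```python
NEG = float("-inf")


def percent_identity(mutant: str, reference: str, match: int = 5, mismatch: int = -4,
                     gap_open: int = -12, gap_extend: int = -4) -> float:
    """Globally align two DNA strings with affine gaps; return percent identity."""
    n, m = len(mutant), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # Score tables over prefixes mutant[:i], reference[:j]:
    #   M ends in an aligned pair, X ends in a null in the reference,
    #   Y ends in a null in the mutant.
    M = [[NEG] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if mutant[i - 1] == reference[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] + gap_open, Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open, X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)

    # Trace back from the best final state, counting matches and alignment columns.
    tables = {"M": M, "X": X, "Y": Y}
    state = max(tables, key=lambda k: tables[k][n][m])
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        here = tables[state][i][j]
        if state == "M":
            same = mutant[i - 1] == reference[j - 1]
            s = match if same else mismatch
            matches += 1 if same else 0
            i, j = i - 1, j - 1
            prev = {k: tables[k][i][j] + s for k in tables}
        elif state == "X":
            i -= 1
            prev = {k: tables[k][i][j] + (gap_extend if k == "X" else gap_open)
                    for k in tables}
        else:
            j -= 1
            prev = {k: tables[k][i][j] + (gap_extend if k == "Y" else gap_open)
                    for k in tables}
        # Step to a predecessor state that reproduces the current score.
        state = next(k for k in tables if prev[k] == here)
    return 100.0 * matches / columns


print(percent_identity("ACGTTCGT", "ACGTACGT"))   # 87.5 (one mismatch in 8 columns)
print(percent_identity("ACGTCGT", "ACGTACGT"))    # 87.5 (one null in the mutant)
```

The same structure would serve for proteins if the match and mismatch scores were replaced by a lookup in a substitution matrix such as PAM250, which is omitted here.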
  • DNA sequences may also be considered "substantially identical" if they hybridize to each other under stringent conditions, i.e., conditions at which the Tm of the heteroduplex of the one strand of the mutant DNA and the more complementary strand of the reference DNA is not in excess of 10°C less than the Tm of the reference DNA homoduplex. Typically this will correspond to a percentage identity of 85-90%.
  • Conservative Modifications "Conservative modifications" are defined as (a) conservative substitutions of amino acids as hereafter defined; or (b) single or multiple insertions (extension) or deletions (truncation) of amino acids at the termini. Conservative modifications are preferred to other modifications. Conservative substitutions are preferred to other conservative modifications.
  • "Semi-Conservative Modifications" are modifications which are not conservative, but which are (a) semi-conservative substitutions as hereafter defined; or (b) single or multiple insertions or deletions internally, but at interdomain boundaries, in loops or in other segments of relatively high mobility. Semi-conservative modifications are preferred to nonconservative modifications. Semi-conservative substitutions are preferred to other semi-conservative modifications. Non-conservative substitutions are preferred to other non-conservative modifications.
  • a priori sense, i.e., modifications which would be expected to preserve 3D structure and activity, based on analysis of the naturally occurring families of homologous proteins and of past experience with the effects of deliberate mutagenesis, rather than post facto, a modification already known to conserve activity.
  • a modification which is conservative a priori may be, and usually is, also conservative post facto.
  • no more than about five amino acids are inserted or deleted at a particular locus, and the modifications are outside regions known to contain binding sites important to activity.
  • insertions or deletions are limited to the termini.
  • a conservative substitution is a substitution of one amino acid for another of the same exchange group, the exchange groups being defined as follows:
      I Gly, Pro, Ser, Ala (Cys) (and any nonbiogenic, neutral amino acid with a hydrophobicity not exceeding that of the aforementioned a.a.'s)
      II Arg, Lys, His (and any nonbiogenic, positively-charged amino acids)
      III Asp, Glu, Asn, Gln (and any nonbiogenic negatively-charged amino acids)
      IV Leu, Ile, Met, Val (Cys) (and any nonbiogenic, aliphatic, neutral amino acid with a
  • Cys belongs to both I and IV. Residues Pro, Gly and Cys have special conformational roles. Cys participates in formation of disulfide bonds. Gly imparts flexibility to the chain. Pro imparts rigidity to the chain and disrupts α helices. These residues may be essential in certain regions of the polypeptide, but substitutable elsewhere. One, two or three conservative substitutions are more likely to be tolerated than a larger number.
  • substitutions are defined herein as being substitutions within supergroup I/II/III or within supergroup IV/V, but not within a single one of groups I-V. They also include replacement of any other amino acid with alanine. If a substitution is not conservative, it preferably is semi-conservative. "Non-conservative substitutions" are substitutions which are not "conservative" or "semi-conservative".
  • "Highly conservative substitutions" are a subset of conservative substitutions, and are exchanges of amino acids within the groups Phe/Tyr/Trp, Met/Leu/Ile/Val, His/Arg/Lys, Asp/Glu and Ser/Thr/Ala. They are more likely to be tolerated than other conservative substitutions. Again, the smaller the number of substitutions, the more likely they are to be tolerated.
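The five highly conservative groups named above lend themselves to a simple membership test. The sketch below encodes only those five groups, because they are listed in full in the text; the broader exchange groups I to V are not encoded, since their definitions are incomplete as reproduced here. The names are hypothetical and residues use three-letter codes.

```python
# The five highly conservative groups, as listed in the text above.
HIGHLY_CONSERVATIVE_GROUPS = [
    {"Phe", "Tyr", "Trp"},
    {"Met", "Leu", "Ile", "Val"},
    {"His", "Arg", "Lys"},
    {"Asp", "Glu"},
    {"Ser", "Thr", "Ala"},
]


def is_highly_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False   # no substitution took place
    return any(original in group and replacement in group
               for group in HIGHLY_CONSERVATIVE_GROUPS)


print(is_highly_conservative("Leu", "Ile"))   # True
print(is_highly_conservative("Asp", "Lys"))   # False
```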
  • a protein (peptide) is conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by conservative modifications, the protein (peptide) remaining at least seven amino acids long if the reference protein (peptide) was at least seven amino acids long.
  • a protein is at least semi-conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by semi-conservative or conservative modifications.
  • a protein (peptide) is nearly conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by one or more conservative modifications and/or a single nonconservative substitution. It is highly conservatively identical if it differs, if at all, solely by highly conservative substitutions. Highly conservatively identical proteins are preferred to those merely conservatively identical. An absolutely identical protein is even more preferred.
  • the core sequence of a reference protein is the largest single fragment which retains at least 10% of a particular specific binding activity, if one is specified, or otherwise of at least one specific binding activity of the referent. If the referent has more than one specific binding activity, it may have more than one core sequence, and these may overlap or not.
  • a peptide of the present invention may have a particular similarity relationship (e.g., markedly identical) to a reference protein (peptide)
  • preferred peptides are those which comprise a sequence having that relationship to a core sequence of the reference protein (peptide), but with internal insertions or deletions in either sequence excluded. Even more preferred peptides are those whose entire sequence has that relationship, with the same exclusion, to a core sequence of that reference protein (peptide).
  • Library generally refers to a collection of chemical or biological entities which are related in origin, structure, and/or function, and which can be screened simultaneously for a property of interest. Libraries may be classified by how they are constructed (natural vs. artificial diversity; combinatorial vs. noncombinatorial), how they are screened (hybridization, expression, display), or by the nature of the screened library members (peptides, nucleic acids, etc.). In a "natural diversity" library, essentially all of the diversity arose without human intervention. This would be true, for example, of messenger RNA extracted from a non-engineered cell.
  • a limitation might be to cells of a particular individual, to a particular species, or to a particular genus, or, more complexly, to individuals of a particular species who are of a particular age, sex, physical condition, geographical location, occupation and/or familial relationship. Alternatively or additionally, it might be to cells of a particular tissue or organ. Or it could be cells exposed to particular pharmacological, environmental, or pathogenic conditions. Or the library could be of chemicals, or a particular class of chemicals, produced by such cells. In a "controlled structure" library, the library members are deliberately limited by the production conditions to particular chemical structures. For example, if they are oligomers, they may be limited in length and monomer composition, e.g. hexapeptides composed of the twenty genetically encoded amino acids.
  • Hybridization Library In a hybridization library, the library members are nucleic acids, and are screened using a nucleic acid hybridization probe. Bound nucleic acids may then be amplified, cloned, and/or sequenced.
  • the screened library members are gene expression products, but one may also speak of an underlying library of genes encoding those products.
  • the library is made by subcloning DNA encoding the library members (or portions thereof) into expression vectors (or into cloning vectors which subsequently are used to construct expression vectors), each vector comprising an expressible gene encoding a particular library member, introducing the expression vectors into suitable cells, and expressing the genes so the expression products are produced.
  • the expression products are secreted, so the library can be screened using an affinity reagent, such as an antibody or receptor.
  • the bound expression products may be sequenced directly, or their sequences inferred by, e.g., sequencing at least the variable portion of the encoding DNA.
  • the cells are lysed, thereby exposing the expression products, and the latter are screened with the affinity reagent.
  • the cells express the library members in such a manner that they are displayed on the surface of the cells, or on the surface of viral particles produced by the cells. (See display libraries, below).
  • the screening is not for the ability of the expression product to bind to an affinity reagent, but rather for its ability to alter the phenotype of the host cell in a particular detectable manner.
  • the screened library members are transformed cells, but there is a first underlying library of expression products which mediate the behavior of the cells, and a second underlying library of genes which encode those products.
  • the library members are each conjugated to, and displayed upon, a support of some kind.
  • the support may be living (a cell or virus), or nonliving (e.g., a bead or plate).
  • display will normally be effectuated by expressing a fusion protein which comprises the library member, a carrier moiety allowing integration of the fusion protein into the surface of the cell or virus, and optionally a linking moiety.
  • the cell coexpresses a first fusion comprising the library member and a linking moiety L1, and a second fusion comprising a linking moiety L2 and the carrier moiety.
  • L1 and L2 interact to associate the first fusion with the second fusion and hence, indirectly, the library member with the surface of the cell or virus.
  • Soluble Library In a soluble library, the library members are free in solution.
  • a soluble library may be produced directly, or one may first make a display library and then release the library members from their supports.
  • Encapsulated Library In an encapsulated library, the library members are inside cells or liposomes. Generally speaking, encapsulated libraries are used to store the library members for future use; the members are extracted in some way for screening purposes. However, if they differentially affect the phenotype of the cells, they may be screened indirectly by screening the cells.
  • a cDNA library is usually prepared by extracting RNA from cells of particular origin, fractionating the RNA to isolate the messenger RNA (mRNA has a poly(A) tail, so this is usually done by oligo-dT affinity chromatography), synthesizing complementary DNA (cDNA) using reverse transcriptase, DNA polymerase, and other enzymes, subcloning the cDNA into vectors, and introducing the vectors into cells. Often, only mRNAs or cDNAs of particular sizes will be used, to make it more likely that the cDNA encodes a functional polypeptide.
  • a cDNA library explores the natural diversity of the transcribed DNAs of cells from a particular source. It is not a combinatorial library.
  • a cDNA library may be used to make a hybridization library, or it may be used as (or to make) an expression library.
  • Genomic DNA Library A genomic DNA library is made by extracting DNA from a particular source, fragmenting the DNA, isolating fragments of a particular size range, subcloning the DNA fragments into vectors, and introducing the vectors into cells. Like a cDNA library, a genomic DNA library is a natural diversity library, and not a combinatorial library. A genomic DNA library may be used the same way as a cDNA library.
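The cDNA preparation steps described above (poly(A) selection, size fractionation, reverse transcription) can be illustrated with a toy model. This is only a sketch and not a laboratory protocol: the function names, the minimum tail length of 10 and the size window are illustrative assumptions that do not come from the source.

```python
# Toy model of cDNA library preparation. All names and thresholds here are
# hypothetical choices made for illustration only.

COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA base -> DNA base


def has_poly_a_tail(rna: str, min_tail: int = 10) -> bool:
    """Mimic oligo-dT selection: keep RNAs that end in a run of A residues."""
    return rna.endswith("A" * min_tail)


def reverse_transcribe(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') complementary to an mRNA."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def make_cdna_library(rnas, min_len=20, max_len=10_000):
    """Select poly(A)+ RNAs inside a size window and convert each to cDNA."""
    selected = [
        r for r in rnas if has_poly_a_tail(r) and min_len <= len(r) <= max_len
    ]
    return [reverse_transcribe(r) for r in selected]


rnas = [
    "AUGGCCUUUGGGCAUUAA" + "A" * 12,  # mRNA-like: coding stretch plus poly(A) tail
    "GGGCUAGCUAGCUAGCUAGCUAGC",       # no poly(A) tail, so it is not selected
]
library = make_cdna_library(rnas)
print(library)
```

Only the first RNA survives selection, and its cDNA begins with the oligo-dT-primed run of T residues.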
  • Synthetic DNA library A synthetic DNA library may be screened directly (as a hybridization library) , or used in the creation of an expression or display library of peptides/proteins.
  • the term "combinatorial library" refers to a library in which the individual members are either systematic or random combinations of a limited set of basic elements, the properties of each member being dependent on the choice and location of the elements incorporated into it. Typically, the members of the library are at least capable of being screened simultaneously. Randomization may be complete or partial; some positions may be randomized and others predetermined, and at random positions, the choices may be limited in a predetermined manner.
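As an illustration of partial randomization, the following sketch enumerates every member of a small library in which some positions are predetermined and the others are drawn from a restricted set of elements. The function name `enumerate_library`, the template and the three-letter alphabet are hypothetical and chosen only to keep the example small.

```python
from itertools import product


def enumerate_library(template, choices):
    """Enumerate all members of a library defined by a positional template.

    template: list whose entries are a fixed element (str) or None for a
              randomized position.
    choices:  the elements allowed at every randomized position.
    """
    variable_slots = [i for i, elem in enumerate(template) if elem is None]
    members = []
    for combo in product(choices, repeat=len(variable_slots)):
        member = list(template)
        for slot, elem in zip(variable_slots, combo):
            member[slot] = elem
        members.append("".join(member))
    return members


# Positions 1 and 3 are predetermined; positions 2 and 4 are randomized over a
# restricted three-element set (an assumption made for this example).
library = enumerate_library(["C", None, "G", None], choices="AKD")
print(len(library), library[:3])
```

With two randomized positions and three allowed elements at each, the library has 3 x 3 = 9 members.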
  • the members of a combinatorial library may be oligomers or polymers, of some kind, in which the variation occurs through the choice of monomeric building block at one or more positions of the oligomer or polymer, and possibly in terms of the connecting linkage, or the length of the oligomer or polymer, too.
  • the members may be nonoligomeric molecules with a standard core structure, like the 1, 4-benzodiazepine structure, with the variation being introduced by the choice of substituents at particular variable sites on the core structure.
  • the members may be nonoligomeric molecules assembled like a jigsaw puzzle, but wherein each piece has both one or more variable moieties (contributing to library diversity) and one or more constant moieties (providing the functionalities for coupling the piece in question to other pieces) .
  • a "composite combinatorial library" is a mixture of two or more simple libraries, e.g., DNAs and peptides, or peptides, peptoids, and PNAs, or benzodiazepines and carbamates.
  • the number of component simple libraries in a composite library will, of course, normally be smaller than the average number of members in each simple library, as otherwise the advantage of a library over individual synthesis is small.
  • the first combinatorial libraries were composed of peptides or proteins, in which all or selected amino acid positions were randomized. Peptides and proteins can exhibit high and specific binding activity, and can act as catalysts. In consequence, they are of great importance in biological systems. Nucleic acids have also been used in combinatorial libraries.
  • the size of a library is the number of molecules in it.
  • the simple diversity of a library is the number of unique structures in it. There is no formal minimum or maximum diversity. If the library has a very low diversity, the library has little advantage over just synthesizing and screening the members individually. If the library is of very high diversity, it may be inconvenient to handle, at least without automatizing the process.
  • the simple diversity of a library is preferably at least 10, 10E2, 10E3, 10E4, 10E6, 10E7, 10E8 or 10E9, the higher the better under most circumstances.
  • the simple diversity is usually not more than 10E15, and more usually not more than 10E10.
  • the average sampling level is the size divided by the simple diversity.
  • the expected average sampling level must be high enough to provide a reasonable assurance that, if a given structure were expected, as a consequence of the library design, to be present, the actual average sampling level will be high enough so that the structure, if satisfying the screening criteria, will yield a positive result when the library is screened.
  • the preferred average sampling level is a function of the detection limit, which in turn is a function of the strength of the signal to be screened.
  • There are more complex measures of diversity than simple diversity. These attempt to take into account the degree of structural difference between the various unique sequences. These more complex measures are usually used in the context of small organic compound libraries, see below.
  • the library members may be presented as solutes in solution, or immobilized on some form of support.
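The definitions of size, simple diversity and average sampling level given above translate directly into a short calculation. The sketch below is illustrative only; the function names and the toy library are assumptions, and the Poisson estimate additionally assumes that every designed structure is equally likely to be synthesized, which the source does not state.

```python
import math


def library_statistics(molecules):
    """Return (size, simple diversity, average sampling level) for a library."""
    size = len(molecules)                    # number of molecules
    simple_diversity = len(set(molecules))   # number of unique structures
    return size, simple_diversity, size / simple_diversity


def chance_structure_present(avg_sampling_level):
    """Poisson estimate that a designed structure occurs at least once.

    Assumes all designed structures are equally likely to be made.
    """
    return 1.0 - math.exp(-avg_sampling_level)


# A toy library of six molecules drawn from three unique structures.
molecules = ["GAV", "GAV", "GLV", "GAV", "GLV", "GKV"]
size, diversity, sampling = library_statistics(molecules)
print(size, diversity, sampling)
print(round(chance_structure_present(sampling), 3))
```

Under the stated assumption, a higher average sampling level raises the chance that any given designed structure is physically present in the library when it is screened.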
  • the support may be living (cell, virus) or nonliving (bead, plate, etc.).
  • the supports may be separable (cells, virus particles, beads) so that binding and nonbinding members can be separated, or nonseparable (plate).
  • the members will normally be placed on addressable positions on the support.
  • the advantage of a soluble library is that there is no carrier moiety that could interfere with the binding of the members to the support.
  • the advantage of an immobilized library is that it is easier to identify the structure of the members which were positive.
  • oligonucleotide libraries An oligonucleotide library is a combinatorial library, at least some of whose members are single-stranded oligonucleotides having three or more nucleotides connected by phosphodiester or analogous bonds.
  • the oligonucleotides may be linear, cyclic or branched, and may include non-nucleic acid moieties.
  • The nucleotides are not limited to the nucleotides normally found in DNA or RNA. For examples of nucleotides modified to increase nuclease resistance and chemical stability of aptamers, see Chart 1 in Osborne and Ellington, Chem. Rev., 97: 349-70 (1997).
  • For screening of RNA, see Ellington and Szostak, Nature, 346: 818-22 (1990). There is no formal minimum or maximum size for these oligonucleotides. However, the number of conformations which an oligonucleotide can assume increases exponentially with its length in bases. Hence, a longer oligonucleotide is more likely to be able to fold to adapt itself to a protein surface. On the other hand, while very long molecules can be synthesized and screened, unless they provide a much superior affinity to that of shorter molecules, they are not likely to be found in the selected population, for the reasons explained by Osborne and Ellington (1997).
  • the libraries of the present invention are preferably composed of oligonucleotides having a length of 3 to 100 bases, more preferably 15 to 35 bases.
  • the oligonucleotides in a given library may be of the same or of different lengths.
  • Oligonucleotide libraries have the advantage that libraries of very high diversity (e.g., 10^15) are feasible, and binding molecules are readily amplified in vitro by polymerase chain reaction (PCR).
  • nucleic acid molecules can have very high specificity and affinity to targets.
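To put the diversity figure quoted above in context, the number of distinct sequences available to a fully randomized oligonucleotide can be computed from its length. The sketch assumes four possible bases at every position and complete randomization; the function name `max_simple_diversity` is hypothetical.

```python
def max_simple_diversity(length, alphabet_size=4):
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


# Lengths spanning the 3 to 100 base range mentioned in the text.
for n in (3, 15, 25, 35):
    print(n, f"{max_simple_diversity(n):.2e}")
```

Under these assumptions a fully randomized 25-mer already has about 1.1 x 10^15 possible sequences, so a physical library cannot sample every sequence of much longer randomized oligonucleotides.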
  • this invention prepares and screens oligonucleotide libraries by the SELEX method, as described in King and Famulok, Molec. Biol.
  • the term "aptamer" is conferred on those oligonucleotides which bind the target protein. Such aptamers may be used to characterize the target protein, both directly (through identification of the aptamer and the points of contact between the aptamer and the protein) and indirectly (by use of the aptamer as a ligand to modify the chemical reactivity of the protein).
  • each nucleotide (monomeric unit) is composed of a phosphate group, a sugar moiety, and either a purine or a pyrimidine base.
  • in DNA, the sugar is deoxyribose, and in RNA it is ribose.
  • the nucleotides are linked by 5'-3' phosphodiester bonds.
  • the deoxyribose phosphate backbone of DNA can be modified to increase resistance to nuclease and to increase penetration of cell membranes.
  • Derivatives such as mono- or dithiophosphates, methyl phosphonates, boranophosphates, formacetals, carbamates, siloxanes, and dimethylenethio-, -sulfoxido- and -sulfono-linked species are known in the art.
  • a peptide is composed of a plurality of amino acid residues joined together by peptidyl (-NHCO-) bonds.
  • a biogenic peptide is a peptide in which the residues are all genetically encoded amino acid residues; it is not necessary that the biogenic peptide actually be produced by gene expression.
  • Amino acids are the basic building blocks with which peptides and proteins are constructed. Amino acids possess both an amino group (-NH2) and a carboxylic acid group (-COOH). Many amino acids, but not all, have the alpha amino acid structure NH2-CHR-COOH, where R is hydrogen, or any of a variety of functional groups.
  • Twenty amino acids are genetically encoded: Alanine, Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, and Valine. Of these, all save Glycine are optically isomeric; however, only the L-form is found in humans. Nevertheless, the D-forms of these amino acids do have biological significance; D-Phe, for example, is a known analgesic.
  • Other amino acids are also known, including: 2-Aminoadipic acid; 3-Aminoadipic acid; beta-Aminopropionic acid; 2-Aminobutyric acid; 4-Aminobutyric acid (Piperidinic acid); 6-Aminocaproic acid; 2-Aminoheptanoic acid; 2-Aminoisobutyric acid; 3-Aminoisobutyric acid; 2-Aminopimelic acid; 2,4-Diaminobutyric acid; Desmosine; 2,2'-Diaminopimelic acid; 2,3-Diaminopropionic acid; N-Ethylglycine; N-Ethylasparagine; Hydroxylysine; allo-Hydroxylysine; 3-Hydroxyproline; 4-Hydroxyproline;
  • Peptides are constructed by condensation of amino acids and/or smaller peptides.
  • the amino group of one amino acid (or peptide) reacts with the carboxylic acid group of a second amino acid (or peptide) to form a peptide (-NHCO-) bond, releasing one molecule of water.
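Because each peptide bond formed releases one molecule of water, the mass of a peptide follows from the masses of its free amino acids. The sketch below is illustrative: the function name and the choice of glycine and alanine are assumptions, and the masses are rounded average molecular masses.

```python
WATER = 18.015  # g/mol, average molecular mass of H2O

# Average molecular masses (g/mol) of two free amino acids, for illustration.
FREE_AMINO_ACID_MASS = {"Gly": 75.067, "Ala": 89.094}


def peptide_mass(residues, cyclic=False):
    """Mass of a peptide built by condensing the given free amino acids.

    A linear peptide of n residues has n - 1 peptide bonds; a head-to-tail
    cyclic peptide has n. One water molecule is released per bond formed.
    """
    n_bonds = len(residues) if cyclic else len(residues) - 1
    total = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    return total - n_bonds * WATER


print(round(peptide_mass(["Gly", "Gly"]), 2))        # linear dipeptide
print(round(peptide_mass(["Gly", "Gly"], True), 2))  # head-to-tail cyclic form
```

The `cyclic` flag covers only head-to-tail closure through one additional peptide bond; cyclization through disulfide bonds would need a different mass correction.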
  • the core of that residue is the moiety which excludes the -NH and -CO linking functionalities which connect it to other residues.
  • This moiety consists of one or more main chain atoms (see below) and the attached side chains.
  • the main chain moiety of each amino acid consists of the -NH and -CO linking functionalities and a core main chain moiety. Usually, the latter is a single carbon atom.
  • the core main chain moiety may include additional carbon atoms, and may also include nitrogen, oxygen or sulfur atoms, which together form a single chain.
  • the core main chain atoms consist solely of carbon atoms.
  • the side chains are attached to the core main chain atoms.
  • the C-1, C-2 and N-2 of each residue form the repeating unit of the main chain, and the word "side chain" refers to the C-3 and higher numbered carbon atoms and their substituents. It also includes H atoms attached to the main chain atoms.
  • Amino acids may be classified according to the number of carbon atoms which appear in the main chain between the carbonyl carbon and amino nitrogen atoms which participate in the peptide bonds.
  • alpha, beta, gamma and delta amino acids are known. These have 1-4 intermediary carbons. Only alpha amino acids occur in proteins. Proline is a special case of an alpha amino acid; its side chain also binds to the peptide bond nitrogen. For beta and higher order amino acids, there is a choice as to which main chain core carbon a side chain other than H is attached to. The preferred attachment site is the C-2 (alpha) carbon, i.e., the one adjacent to the carboxyl carbon of the -CO linking functionality. It is also possible for more than one main chain atom to carry a side chain other than H.
  • a main chain carbon atom may carry either one or two side chains; one is more common.
  • a side chain may be attached to a main chain carbon atom by a single or a double bond; the former is more common.
  • a simple combinatorial peptide library is one whose members are peptides having three or more amino acids connected via peptide bonds.
  • the peptides may be linear, branched, or cyclic, and may covalently or noncovalently include nonpeptidyl moieties.
  • the amino acids are not limited to the naturally occurring or to the genetically encoded amino acids.
  • a biased peptide library is one in which one or more (but not all) residues of the peptides are constant residues.
  • Cyclic Peptides Many naturally occurring peptides are cyclic.
  • Cyclization is a common mechanism for stabilization of peptide conformation, thereby achieving improved association of the peptide with its ligand and hence improved biological activity. Cyclization is usually achieved by intra-chain cystine formation, or by formation of a peptide bond between side chains or between the N- and C-termini. Cyclization has usually been carried out on peptides in solution, but several publications have appeared that describe cyclization of peptides on beads.
  • a peptide library may be an oligopeptide library or a protein library.
  • the oligopeptides are at least five, six, seven or eight amino acids in length. Preferably, they are composed of less than 50, more preferably less than 20 amino acids. In the case of an oligopeptide library, all or just some of the residues may be variable.
  • the oligopeptide may be unconstrained, or constrained to a particular conformation by, e.g., the participation of constant cysteine residues in the formation of a constraining disulfide bond.
  • Proteins, like oligopeptides, are composed of a plurality of amino acids, but the term protein is usually reserved for longer peptides, which are able to fold into a stable conformation.
  • a protein may be composed of two or more polypeptide chains, held together by covalent or noncovalent crosslinks. These may occur in a homooligomeric or a heterooligomeric state.
  • a peptide is considered a protein if it (1) is at least 50 amino acids long, or (2) has at least two stabilizing covalent crosslinks (e.g., disulfide bonds). Thus, conotoxins are considered proteins.
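The two-part rule above can be written as a one-line predicate. The function name `is_protein` and the example inputs are hypothetical; the 13-residue peptide with two disulfide bonds is only a conotoxin-like illustration, not data from the source.

```python
def is_protein(length, covalent_crosslinks=0):
    """Apply the rule stated in the text: a peptide counts as a protein if it
    has at least 50 residues or at least two stabilizing covalent crosslinks."""
    return length >= 50 or covalent_crosslinks >= 2


# A short peptide with two disulfide bonds versus the same length with none.
print(is_protein(13, covalent_crosslinks=2))
print(is_protein(13))
```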
  • the proteins of a protein library will be characterizable as having both constant residues (the same for all proteins in the library) and variable residues (which vary from member to member). This is simply because, for a given range of variation at each position, the sequence space (simple diversity) grows exponentially with the number of residue positions, so at some point it becomes inconvenient for all residues of a peptide to be variable positions. Since proteins are usually larger than oligopeptides, it is more common for protein libraries than oligopeptide libraries to feature variable positions. In the case of a protein library, it is desirable to focus the mutations at those sites which are tolerant of mutation.
  • variable domains of an antibody possess hypervariable regions and hence, in some embodiments, the protein library comprises members which comprise a mutant of VH or VL chain, or a mutant of an antigen-specific binding fragment of such a chain.
  • VH and VL chains are usually each about 110 amino acid residues, and are held in proximity by a disulfide bond between the adjoining CL and CH1 regions to form a variable domain. Together, the VH, VL, CL and CH1 form an Fab fragment.
  • the hypervariable regions are at 31-35, 49-65, 98-111 and 84-88, but only the first three are involved in antigen binding.
  • VH and VL chains may be covalently joined by a suitable linker moiety, as in a "single chain antibody", or they may be noncovalently joined, as in a naturally occurring variable domain. If the joining is noncovalent, and the library is displayed on cells or virus, then either the VH or the VL chain may be fused to the carrier surface/coat protein.
  • the complementary chain may be co-expressed, or added exogenously to the library.
  • the members may further comprise some or all of an antibody constant heavy and/or constant light chain, or a mutant thereof.
  • Each may be chosen independently from the group consisting of amine (-NH-), substituted amine (-NR-), carbonyl (-CO-), thiocarbonyl (-CS-), methylene (-CH2-), monosubstituted methylene (-CHR-), disubstituted methylene (-CR1R2-), ether (-O-) and thioether (-S-).
  • the more preferred pseudopeptide bonds include: N-modified, -NRCO-; Carba Ψ, -CH2-CH2-; Depsi Ψ, -CO-O-; Hydroxyethylene Ψ, -CHOH-CH2-; Ketomethylene Ψ, -CO-CH2-; Methylene-Oxy, -CH2-O-; Reduced, -CH2-NH-; Thiomethylene, -CH2-S-; Thiopeptide, -CS-NH-; Retro-Inverso, -CO-NH-.
  • a single peptoid molecule may include more than one kind of pseudopeptide bond.
  • (1) the side chains attached to the core main chain atoms of the monomers linked by the pseudopeptide bonds and/or (2) the side chains (e.g., the -R of an -NRCO-) of the pseudopeptide bonds.
  • the monomeric units which are not amino acid residues are of the structure -NR1-CR2-CO-, where at least one of R1 and R2 is not hydrogen. If there is variability in the pseudopeptide bond, this is most conveniently done by using an -NRCO- or other pseudopeptide bond with an R group, and varying the R group.
  • the R group will usually be any of the side chains characterizing the amino acids of peptides, as previously discussed. If the R group of the pseudopeptide bond is not variable, it will usually be small, e.g., not more than 10 atoms (e.g., hydroxyl, amino, carboxyl, methyl, ethyl, propyl).
  • a simple combinatorial library may include both peptides and peptoids.
  • a PNA oligomer is here defined as one comprising a plurality of units, at least one of which is a PNA monomer which comprises a side chain comprising a nucleobase.
  • the classic PNA oligomer is composed of (2-aminoethyl)glycine units, with nucleobases attached by methylene carbonyl linkers. That is, it has the structure
  • nucleobase B is separated from the backbone N by three bonds, and the points of attachment of the side chains are separated by six bonds.
  • the nucleobase may be any of the bases included in the nucleotides discussed in connection with oligonucleotide libraries.
  • the bases of nucleotides A, G, T, C and U are preferred.
  • a PNA oligomer may further comprise one or more amino acid residues, especially glycine and proline.
  • a side chain is attached to one of the three main chain carbons not participating in the peptide bond (either instead of or in addition to the side chain attached to the N of the classic PNA); and/or (3) the peptide bonds are replaced by pseudopeptide bonds as disclosed previously in the context of peptoids.
  • PNA oligomer libraries have been made; see e.g. Cook, 6,204,326.
  • Small Organic Compound Library The small organic compound library ("compound library", for short) is a combinatorial library whose members are suitable for use as drugs if, indeed, they have the ability to mediate a biological activity of the target protein. Peptides have certain disadvantages as drugs.
  • alteration, in which one moiety is replaced by another which may be similar or different, but which is not in effect a disjunction or conjunction.
  • the use of the terms "disjunction”, “conjunction” and “alteration” is intended only to connote the structural relationship of the end product to the original leads, and not how the new drugs are actually synthesized, although it is possible that the two are the same.
  • the process of disjunction is illustrated by the evolution of neostigmine (1931) and edrophonium (1952) from physostigmine (1925). Subsequent conjunction is illustrated by demecarium (1956) and ambenonium (1956).
  • Alterations may modify the size, polarity, or electron distribution of an original moiety. Alterations include ring closing or opening, formation of lower or higher homologues, introduction or saturation of double bonds, introduction of optically active centers, introduction, removal or replacement of bulky groups, isosteric or bioisosteric substitution, changes in the position or orientation of a group, introduction of alkylating groups, and introduction, removal or replacement of groups with a view toward inhibiting or promoting inductive (electrostatic) or conjugative (resonance) effects.
  • the substituents may include electron acceptors and/or electron donors.
  • Typical electron donors (+I) include -CH3, -CH2R, -CHR2, -CR3 and -COO-.
  • the substituents may also include those which increase or decrease electronic density in conjugated systems.
  • a library / a compound, or a family of compounds having one or more pharmacological activities (which need not be related to the known or suspected activities of the target protein), may be disjoined into two or more known or potential pharmacophoric moieties. Analogues of each of these moieties may be identified, and mixtures of these analogues reacted so as to reassemble compounds which have some similarity to the original lead compound. It is not necessary that all members of the library possess moieties analogous to all of the moieties of the lead compound.
  • benzodiazepines have widespread biological activities; derivatives have been reported to act not only as anxiolytics, but also as anticonvulsants; cholecystokinin (CCK) receptor subtype A or B, kappa opioid receptor, platelet activating factor, and HIV transactivator Tat antagonists; and GPIIbIIIa, reverse transcriptase and ras farnesyltransferase inhibitors.
  • the benzodiazepine structure has been disjoined into a 2-aminobenzophenone, an amino acid, and an alkylating agent. See Bunin, et al., Proc. Nat. Acad. Sci.
  • the hydantoins were synthesized by first simultaneously deprotecting and then treating each of five amino acid resins with each of eight isocyanates.
  • the benzodiazepines were synthesized by treating each of five deprotected amino acid resins with each of eight 2-amino benzophenone imines. Chen, et al., J. Am. Chem. Soc., 116:2661-62 (1994) described the preparation of a pilot (9 member) combinatorial library of formate esters.
  • a polymer bead-bound aldehyde preparation was "split" into three aliquots, each reacted with one of three different ylide reagents. The reaction products were combined, and then divided into three new aliquots, each of which was reacted with a different Michael donor. Compound identity was found to be determinable on a single bead basis by gas chromatography/mass spectroscopy analysis. Holmes, USP 5,549,974 (1996) sets forth methodologies for the combinatorial synthesis of libraries of thiazolidinones and metathiazanones. These libraries are made by combination of amines, carbonyl compounds, and thiols under cyclization conditions.
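The "split" and recombine procedure described above can be simulated to show why two rounds with three reagents each give a nine-member library. This is a minimal sketch: the function name `split_and_pool` and the reagent labels are placeholders, since the specific reagents are not named here.

```python
def split_and_pool(starting_material, reagent_rounds):
    """Simulate split synthesis.

    In each round the pool is split into one aliquot per reagent, each aliquot
    is reacted with its reagent, and the products are recombined. Each entry
    in the returned pool is the reaction history carried by one kind of bead.
    """
    pool = [(starting_material,)]
    for reagents in reagent_rounds:
        pool = [history + (reagent,) for reagent in reagents for history in pool]
    return pool


# Placeholder labels for the three ylide reagents and three Michael donors.
ylides = ["ylide_1", "ylide_2", "ylide_3"]
michael_donors = ["donor_1", "donor_2", "donor_3"]

library = split_and_pool("bead-bound aldehyde", [ylides, michael_donors])
print(len(library))
for member in library[:3]:
    print(member)
```

The same counting applies to the five-resin by eight-reagent syntheses mentioned above, which give 5 x 8 = 40 combinations.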
  • each member is synthesized only at a particular coordinate on or in a matrix, or in a particular chamber.
  • This might be, for example, the location of a particular pin, or a particular well on a microtiter plate, or inside a "tea bag".
  • the present invention is not limited to any particular form of identification. However, it is possible to simply characterize those members of the library which are found to be active, based on the characteristic spectroscopic indicia of the various building blocks. Solid phase synthesis permits greater control over which derivatives are formed. However, the solid phase could interfere with activity.
  • the preferred animal subject of the present invention is a mammal.
  • by "mammal" is meant an individual belonging to the class Mammalia.
  • the invention is particularly useful in the treatment of human subjects, although it is intended for veterinary and nutritional uses as well.
  • Preferred nonhuman subjects are of the orders Primates (e.g., apes and monkeys), Artiodactyla or Perissodactyla (e.g., cows, pigs, sheep, horses, goats), Carnivora (e.g., cats, dogs), Rodentia (e.g., rats, mice, guinea pigs, hamsters), Lagomorpha (e.g., rabbits) or other pet, farm or laboratory mammals.
  • the term "protection" is intended to include "prevention," "suppression" and "treatment."
  • "Prevention," strictly speaking, involves administration of the pharmaceutical prior to the induction of the disease (or other adverse clinical condition).
  • "Suppression" involves administration of the composition prior to the clinical appearance of the disease.
  • "Treatment" involves administration of the protective composition after the appearance of the disease. It will be understood that in human and veterinary medicine, it is not always possible to distinguish between "preventing" and "suppressing" since the ultimate inductive event or events may be unknown, latent, or the patient is not ascertained until well after the occurrence of the event or events.
  • "prevention" will be understood to refer to both prevention in the strict sense, and to suppression.
  • the preventative or prophylactic use of a pharmaceutical usually involves identifying subjects who are at higher risk than the general population of contracting the disease, and administering the pharmaceutical to them in advance of the clinical appearance of the disease. The effectiveness of such use is measured by comparing the subsequent incidence or severity of the disease, or of particular symptoms of the disease, in the treated subjects against that in untreated subjects of the same high risk group.
  • While high risk factors vary from disease to disease, in general, these include (1) prior occurrence of the disease in one or more members of the same family, or, in the case of a contagious disease, in individuals with whom the subject has come into potentially contagious contact at a time when the earlier victim was likely to be contagious; (2) a prior occurrence of the disease in the subject; (3) prior occurrence of a related disease, or a condition known to increase the likelihood of the disease, in the subject; (4) appearance of a suspicious level of a marker of the disease, or a related disease or condition; (5) a subject who is immunologically compromised, e.g., by radiation treatment, HIV infection, drug use, etc.; or (6) membership in a particular group (e.g., a particular age, sex, race, ethnic group, etc.) which has been epidemiologically associated with that disease.
  • prophylaxis for the general population, and not just a high risk group. This is most likely to be the case when essentially all are at risk of contracting the disease, the effects of the disease are serious, the therapeutic index of the prophylactic agent is high, and the cost of the agent is low.
  • a prophylaxis or treatment may be curative, that is, directed at the underlying cause of a disease, or ameliorative, that is, directed at the symptoms of the disease, especially those which reduce the quality of life. It should also be understood that to be useful, the protection provided need not be absolute, provided that it is sufficient to carry clinical value.
  • administration may be systemic or topical.
  • administration of such a composition may be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, transdermal, or buccal routes.
  • administration may be by the oral route.
  • Parenteral administration can be by bolus injection or by gradual perfusion over time.
  • a typical regimen comprises administration of an effective amount of the drug, administered over a period ranging from a single dose, to dosing over a period of hours, days, weeks, months, or years.
  • the suitable dosage of a drug of the present invention will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired. However, the most preferred dosage can be tailored to the individual subject, as is understood and determinable by one of skill in the art, without undue experimentation. This will typically involve adjustment of a standard dose, e.g., reduction of the dose if the patient has a low body weight. Prior to use in humans, a drug will first be evaluated for safety and efficacy in laboratory animals.
  • the total dose required for each treatment may be administered by multiple doses or in a single dose.
  • the protein may be administered alone or in conjunction with other therapeutics, directed to the disease or directed to other symptoms thereof.
  • Typical pharmaceutical doses for adult humans are in the range of 1 ng to 10 g per day, more often 1 mg to 1 g per day.
  • the appropriate dosage form will depend on the disease, the pharmaceutical, and the mode of administration; possibilities include tablets, capsules, lozenges, dental pastes, suppositories, inhalants, solutions, ointments and parenteral depots. See, e.g., Berker, supra, Goodman, supra, Avery, supra and Ebadi, supra, which are entirely incorporated herein by reference, including all references cited therein.
  • the drug may be administered in the form of an expression vector comprising a nucleic acid encoding the peptide; such a vector, after incorporation into the genetic complement of a cell of the patient, directs synthesis of the peptide.
  • Suitable vectors include genetically engineered poxviruses (vaccinia), adenoviruses, adeno-associated viruses, herpesviruses and lentiviruses which are or have been rendered nonpathogenic.
  • a pharmaceutical composition may contain suitable pharmaceutically acceptable carriers, such as excipients, carriers and/or auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. See, e.g., Berker, supra, Goodman, supra, Avery, supra and Ebadi, supra, which are entirely incorporated herein by reference, including all references cited therein.
  • the invention contemplates that it may be appropriate to ascertain or to mediate the biological activity of a substance of this invention in a target organism.
  • the target organism may be a plant, animal, or microorganism.
  • the drug may be intended to increase the disease, weather or pest resistance, alter the growth characteristics, or otherwise improve the useful characteristics or mute undesirable characteristics of the plant.
  • it may be a weed, in which case the drug may be intended to kill or otherwise inhibit the growth of the plant, or to alter its characteristics to convert it from a weed to an economic plant.
  • the plant may be a tree, shrub, crop, grass, etc.
  • the plant may be an alga (algae are in some cases also microorganisms), or a vascular plant, especially gymnosperms (particularly conifers) and angiosperms.
  • Angiosperms may be monocots or dicots.
  • the plants of greatest interest are rice, wheat, corn, alfalfa, soybeans, potatoes, peanuts, tomatoes, melons, apples, pears, plums, pineapples, fir, spruce, pine, cedar, and oak.
  • the target organism is a microorganism, it may be algae, bacteria, fungi, or a virus (although the biological activity of a.
  • the microorganism may be human or other animal or plant pathogen, or it may be nonpathogenic. It may be a soil or water organism, or one which normally lives inside other living things. If the target organism is an animal, it may be a vertebrate or a nonvertebrate animal. Nonvertebrate animals are chiefly of interest when they act as pathogens or parasites, and the drugs are intended to act as biocidic or biostatic agents. Nonvertebrate animals of interest include worms, mollusks, and arthropods.
  • the target organism may also be a vertebrate animal, i.e., a mammal, bird, reptile, fish or amphibian. Among mammals, the target animal preferably belongs to the order
  • the target animals are preferably of the orders Anseriformes (e.g., ducks, geese, swans) or Galliformes (e.g., quails, grouse, pheasants, turkeys and chickens).
  • the target animal is preferably of the order Clupeiformes (e.g., sardines, shad, anchovies, whitefish, salmon).
  • Target Tissues refers to any whole animal, physiological system, whole organ, part of organ, miscellaneous tissue, cell, or cell component (e.g., the cell membrane) of a target animal in which biological activity may be measured. Routinely in mammals one would choose to compare and contrast the biological impact on virtually any and all tissues which express the subject receptor protein.
  • the main tissues to use are: brain, heart, lung, kidney, liver, pancreas, skin, intestines, adipose, stomach, skeletal muscle, adrenal glands, breast, prostate, vasculature, retina, cornea, thyroid gland, parathyroid glands, thymus, bone marrow, bone, etc.
  • B cells, T cells, macrophages, neutrophils, eosinophils, mast cells, platelets, megakaryocytes, erythrocytes, bone marrow stromal cells, fibroblasts, neurons, astrocytes, neuroglia, microglia, epithelial cells (from any organ, e.g. skin, breast, prostate, lung, intestines, etc.), cardiac muscle cells, smooth muscle cells, striated muscle cells, osteoblasts, osteocytes, chondroblasts, chondrocytes, keratinocytes, melanocytes, etc.
  • Screening Assays Assays intended to determine the binding or the biological activity of a substance are called preliminary screening assays. Screening assays will typically be either in vitro (cell-free) assays (for binding to an immobilized receptor) or cell-based assays (for alterations in the phenotype of the cell). They will not involve screening of whole multicellular organisms, or isolated organs. The comments on diagnostic biological assays apply mutatis mutandis to screening cell-based assays.
  • in vivo is descriptive of an event, such as binding or enzymatic action, which occurs within a living organism.
  • the organism in question may, however, be genetically modified.
  • the term in vitro refers to an event which occurs outside a living organism. Parts of an organism (e.g., a membrane, or an isolated biochemical) are used, together with artificial substrates and/or conditions.
  • the term in vitro excludes events occurring inside or on an intact cell, whether of a unicellular or multicellular organism.
  • In vivo assays include both cell-based assays, and organismic assays.
  • the cell-based assays include both assays on unicellular organisms, and assays on isolated cells or cell cultures derived from multicellular organisms.
  • the cell cultures may be mixed, provided that they are not organized into tissues or organs.
  • organismic assay refers to assays on whole multicellular organisms, and assays on isolated organs or tissues of such organisms.
  • the assay may be a binding assay, in which one step involves the binding of a diagnostic reagent to the analyte, or a reaction assay, which involves the reaction of a reagent with the analyte.
  • the reagents used in a binding assay may be classified as to the nature of their interaction with analyte: (1) analyte analogues, or (2) analyte binding molecules (ABM). They may be labeled or insolubilized.
  • the assay may look for a direct reaction between the analyte and a reagent which is reactive with the analyte, or if the analyte is an enzyme or enzyme inhibitor, for a reaction catalyzed or inhibited by the analyte.
  • the reagent may be a reactant, a catalyst, or an inhibitor for the reaction.
  • An assay may involve a cascade of steps in which the product of one step acts as the target for the next step. These steps may be binding steps, reaction steps, or a combination thereof.
  • Signal Producing System (SPS)
  • In order to detect the presence, or measure the amount, of an analyte, the assay must provide for a signal producing system (SPS) in which there is a detectable difference in the signal produced, depending on whether the analyte is present or absent (or, in a quantitative assay, on the amount of the analyte).
  • the detectable signal may be one which is visually detectable, or one detectable only with instruments. Possible signals include production of colored or luminescent products, alteration of the characteristics (including amplitude or polarization) of absorption or emission of radiation by an assay component or product, and precipitation or agglutination of a component or product.
  • signal is intended to include the discontinuance of an existing signal, or a change in the rate of change of an observable parameter, rather than a change in its absolute value.
  • the signal may be monitored manually or automatically.
  • the signal is often a product of the reaction.
  • a binding assay it is normally provided by a label borne by a labeled reagent.
  • a label may be, e.g., a radioisotope, a fluorophore, an enzyme, a co-enzyme, an enzyme substrate, an electron-dense compound, an agglutinable particle.
  • the radioactive isotope can be detected by such means as the use of a gamma counter or a scintillation counter or by autoradiography.
  • Isotopes which are particularly useful for the purpose of the present invention include ³H, ¹²⁵I, ¹³¹I, ³⁵S, ¹⁴C, ³²P and ³³P. ¹²⁵I is preferred for antibody labeling.
  • the label may also be a fluorophore.
  • When the fluorescently labeled reagent is exposed to light of the proper wavelength, its presence can then be detected due to fluorescence.
  • fluorescence-emitting metals such as ¹²⁵Eu, or others of the lanthanide series, may be incorporated into a diagnostic reagent using such metal chelating groups as diethylenetriaminepentaacetic acid (DTPA) or ethylenediamine-tetraacetic acid (EDTA).
  • the label may also be a chemiluminescent compound. The presence of the chemiluminescently labeled reagent is then determined by detecting the presence of luminescence that arises during the course of a chemical reaction.
  • Enzyme labels, such as horseradish peroxidase and alkaline phosphatase, are preferred.
  • the signal producing system must also include a substrate for the enzyme. If the enzymatic reaction product is not itself detectable, the SPS will include one or more additional reactants so that a detectable product appears.
  • An enzyme analyte may act as its own label if an enzyme inhibitor is used as a diagnostic reagent.
  • Binding assays may be divided into two basic types, heterogeneous and homogeneous.
  • In heterogeneous assays, the interaction between the affinity molecule and the analyte does not affect the label; hence, to determine the amount or presence of analyte, bound label must be separated from free label.
  • In homogeneous assays, the interaction does affect the activity of the label, and therefore analyte levels can be deduced without the need for a separation step.
  • the ABM is insolubilized by coupling it to a macromolecular support, and analyte in the sample is allowed to compete with a known quantity of.
  • analyte analogue is a molecule capable of competing with analyte for binding to the ABM, and the term is intended to include analyte itself. It may be labeled already, or it may be labeled subsequently by specifically binding the label to a moiety differentiating the analyte analogue from analyte.
  • the solid and liquid phases are separated, and the labeled analyte analogue in one phase is quantified. The higher the level of analyte analogue in the solid phase, i.e., sticking to the ABM, the lower the level of analyte in the
  • both an insolubilized ABM, and a labeled ABM are employed.
  • the analyte is captured by the insolubilized ABM and is tagged by the labeled ABM, forming a ternary complex.
  • the reagents may be added to the sample
  • the ABMs may be the same or different.
  • the amount of labeled ABM in the ternary complex is directly proportional to the amount of analyte in the sample.
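  • The proportionality described for the sandwich format is what allows a standard curve to convert a measured signal into an analyte amount. The following is a minimal sketch, assuming a linear response; the standard amounts, the signal values, and the function names `fit_line` and `analyte_from_signal` are hypothetical and are not taken from the disclosure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def analyte_from_signal(signal, slope, intercept):
    """Invert the fitted line to estimate the analyte amount for a signal."""
    return (signal - intercept) / slope


# Hypothetical standards: known analyte amounts and the labeled-ABM signal
# measured in the ternary complex for each (arbitrary units).
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.1, 1.1, 2.1, 4.1]

slope, intercept = fit_line(standard_amounts, standard_signals)
unknown_amount = analyte_from_signal(3.1, slope, intercept)
print(round(slope, 6), round(intercept, 6), round(unknown_amount, 6))
```

  • With these illustrative standards the fitted slope is about 1, the intercept (background signal) is about 0.1, and a sample signal of 3.1 interpolates to about 3 units of analyte. A competitive format would need an inverse (decreasing) calibration instead.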
  • a label may be conjugated, directly or indirectly
  • the ABM may be conjugated to a solid phase support to form a solid phase ("capture") diagnostic reagent.
  • Suitable supports include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, agaroses, and magnetite.
  • the nature of the carrier can be either soluble to some extent or insoluble for the purposes of the
  • the support material may have virtually any possible structural configuration so long as the coupled molecule is capable of binding to its target.
  • the support configuration may be spherical, as in a bead, or cylindrical, as in the inside surface of a test tube, or the external surface of a rod.
  • the surface may be flat such as a sheet, test strip, etc.
  • Biological Assays A biological assay measures or detects a biological response of a biological entity to a substance.
  • the biological entity may be a whole organism, an isolated organ or tissue, freshly isolated cells, an immortalized cell line, or a subcellular component (such as a membrane; this term should not be construed as including an isolated receptor).
  • the entity may be, or may be derived from, an organism which occurs in nature, or which is modified in some way. Modifications may be genetic (including radiation and chemical mutants, and genetic engineering) or somatic (e.g., surgical, chemical, etc.). In the case of a multicellular entity, the modifications may affect some or all cells.
  • the entity need not be the target organism, or a derivative thereof, if there is a reasonable correlation between bioassay activity in the assay entity and biological activity in the target organism.
  • the entity is placed in a particular environment, which may be more or less natural. For example, a culture medium may, but need not, contain serum or serum substitutes, and it may, but need not, include a support matrix of some kind; it may be still, or agitated.
  • particular nutrients e.g., consumption of oxygen, production of CO₂, production of organic acids, uptake or discharge of ions
  • the direct signal produced by the biological marker may be transformed by a signal producing system into a different signal which is more observable, for example, a fluorescent or colorimetric signal.
  • the entity, environment, marker and signal producing system are chosen to achieve a clinically acceptable level of sensitivity, specificity and accuracy.
  • the goal will be to identify substances which mediate the biological activity of a natural biological entity, and the assay is carried out directly with that entity.
  • the biological entity is used simply as a model of some more complex (or otherwise inconvenient to work with) biological entity. In that event, the model biological entity is used because activity in the model system is considered more predictive of activity in the ultimate natural biological entity than is simple binding activity in an in vitro system.
  • the model entity is used instead of the ultimate entity because the latter is more expensive or slower to work with, or because ethical considerations forbid working with the ultimate entity yet.
  • the model entity may be naturally occurring, if the model entity usefully models the ultimate entity under some conditions. Or it may be non-naturally occurring, with modifications that increase its resemblance to the ultimate entity.
  • Transgenic animals, such as transgenic mice, rats, and rabbits, have been found useful as model systems.
  • the receptor may be functionally connected to a signal (biological marker) producing system, which may be endogenous or exogenous to the cell.
  • the binding of a peptide to the target protein results in a screenable or selectable phenotypic change, without resort to fusing the target protein (or a ligand binding moiety thereof) to an endogenous protein. It may be that the target protein is endogenous to the host cell, or is substantially identical to an endogenous receptor so that it can take advantage of the latter's native signal transduction pathway. Or sufficient elements of the signal transduction pathway normally associated with the target protein may be engineered into the cell so that the cell signals binding to the target protein.
  • a chimera receptor a hybrid of the target protein and an endogenous receptor
  • the chimeric receptor has the ligand binding characteristics of the target protein and the signal transduction characteristics of the endogenous receptor.
  • the normal signal transduction pathway of the endogenous receptor is subverted.
  • the endogenous receptor is inactivated, or the conditions of the assay avoid activation of the endogenous receptor, to improve the signal-to-noise ratio. See Fowlkes USP 5,789,184 for a yeast system.
  • Another type of "one-hybrid" system combines a peptide:DNA-binding domain fusion with an unfused target receptor that possesses an activation domain.
  • the cell-based assay is a two hybrid system.
  • This term implies that the ligand is incorporated into a first hybrid protein, and the receptor into a second hybrid protein.
  • the first hybrid also comprises component A of a signal generating system, and the second hybrid comprises component B of that system.
  • Components A and B, by themselves, are insufficient to generate a signal. However, if the ligand binds the receptor, components A and B are brought into sufficiently close proximity so that they can cooperate to generate a signal.
  • Components A and B may naturally occur, or be substantially identical to moieties which naturally occur, as components of a single naturally occurring biomolecule, or they may naturally occur, or be substantially identical to moieties which naturally occur, as separate naturally occurring biomolecules which interact in nature.
  • Two-Hybrid System: Transcription Factor Type
  • one member of a peptide ligand:receptor binding pair is expressed as a fusion to a DNA-binding domain (DBD) from a transcription factor (this fusion protein is called the "bait")
  • the other is expressed as a fusion to a transactivation domain (TAD) (this fusion protein is called the "fish", the "prey", or the "catch")
  • the transactivation domain should be complementary to the DNA-binding domain, i.e., it should interact with the latter so as to activate transcription of a specially designed reporter gene that carries a binding site for the DNA-binding domain.
  • the two fusion proteins must likewise be complementary.
  • This complementarity may be achieved by use of the complementary and separable DNA-binding and transcriptional activator domains of a single transcriptional activator protein, or one may use complementary domains derived from different proteins.
  • the domains may be identical to the native domains, or mutants thereof.
  • the assay members may be fused directly to the DBD or TAD, or fused through an intermediate linker.
  • the target DNA operator may be the native operator sequence, or a mutant operator. Mutations in the operator may be coordinated with mutations in the DBD and the TAD.
  • the two fusion proteins may be expressed from the same or different vectors.
  • the activatable reporter gene may be expressed from the same vector as either fusion protein (or both proteins), or from a third vector.
  • Potential DNA-binding domains include Gal4, LexA, and mutant domains substantially identical to the above. Potential activation domains include E.
  • the assay system will include a signal producing system, too.
  • the first element of this system is a reporter gene operably linked to an operator responsive to the DBD and TAD of choice. The expression of this reporter gene will result, directly or indirectly, in a selectable or screenable phenotype (the signal).
  • the signal producing system may include, besides the reporter gene, additional genetic or biochemical elements which cooperate in the production of the signal. Such an element could be, for example, a selective agent in the cell growth medium.
  • the system may include more than one reporter gene.
  • the sensitivity of the system may be adjusted by, e.g., use of competitive inhibitors of any step in the activation or signal production process, increasing or decreasing the number of operators, using a stronger or weaker DBD or TAD, etc.
  • the assay is said to be a selection.
  • the signal merely results in a detectable phenotype by which the signaling cell may be differentiated from the same cell in a nonsignaling state (either way being a living cell), the assay is a screen.
  • screening assay may be used in a broader sense to include a selection.
  • nonselective screen Various screening and selection systems are discussed in Ladner, USP 5,198,346. Screening and selection may be for or against the peptide:target protein or compound:target protein interaction.
  • Preferred assay cells are microbial (bacterial, yeast, algal, protozoal), invertebrate, vertebrate (esp. mammalian, particularly human).
  • the best developed two-hybrid assays are yeast and mammalian systems.
  • two hybrid assays are used to determine whether a protein X and a protein Y interact, by virtue of their ability to reconstitute the interaction of the DBD and the TAD.
  • augmented two-hybrid assays have been used to detect interactions that depend on a third, non-protein ligand.
  • two-hybrid assays see Brent and Finley, Jr., Ann. Rev. Genet., 31:663-704 (1997); Fremont-Racine, et al., Nature Genetics, 277-281 (16 July 1997);
  • Radio-labeled ABM may be administered to the human or animal subject. Administration is typically by injection, e.g., intravenous or arterial or other means of administration in a quantity sufficient to permit subsequent dynamic and/or static imaging using suitable radio-detecting devices. The dosage is the smallest amount capable of providing a diagnostically effective image, and may be determined by means conventional in the art, using known radio-imaging agents as a guide.
  • the imaging is carried out on the whole body of the subject, or on that portion of the body or organ relevant to the condition or disease under study.
  • the amount of radio-labeled ABM accumulated at a given point in time in relevant target organs can then be quantified.
  • a particularly suitable radio-detecting device is a scintillation camera, such as a gamma camera.
  • a scintillation camera is a stationary device that can be used to image distribution of radio-labeled ABM.
  • the detection device in the camera senses the radioactive decay, the distribution of which can be recorded.
  • Data produced by the imaging system can be digitized. The digitized information can be analyzed over time discontinuously or continuously.
  • the digitized data can be processed to produce images, called frames, of the pattern of uptake of the radio-labeled ABM in the target organ at a discrete point in time.
  • quantitative data is obtained by observing changes in distributions of radioactive decay in target organs over time.
  • a time-activity analysis of the data will illustrate uptake through clearance of the radio-labeled binding protein by the target organs with time.
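  • The time-activity analysis of digitized frames can be sketched as follows. This is an illustration only: the frame layout (a time stamp plus a 2D grid of counts), the region-of-interest representation, and all numbers are assumptions made for the example, not details given in the text.

```python
def time_activity_curve(frames, roi):
    """Sum the counts inside the region of interest (ROI) for each frame.

    frames: list of (time, image) pairs, where image is a 2D list of counts.
    roi:    set of (row, col) pixel coordinates covering the target organ.
    Returns a list of (time, total_counts) pairs in frame order.
    """
    curve = []
    for time_point, image in frames:
        total = sum(image[row][col] for row, col in roi)
        curve.append((time_point, total))
    return curve


def peak_time(curve):
    """Time of the frame with the highest ROI count (end of the uptake phase)."""
    return max(curve, key=lambda point: point[1])[0]


# Hypothetical 2x2 frames acquired at 0, 10 and 20 minutes.
frames = [
    (0, [[1, 2], [0, 0]]),
    (10, [[5, 6], [1, 1]]),
    (20, [[2, 3], [4, 4]]),
]
organ_roi = {(0, 0), (0, 1)}  # top row stands in for the target organ

curve = time_activity_curve(frames, organ_roi)
print(curve, peak_time(curve))
```

  • In this toy data the ROI counts rise from 3 to 11 and then fall to 5, so uptake peaks at the 10-minute frame and clearance follows.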
  • Various factors should be taken into consideration in selecting an appropriate radioisotope.
  • the radioisotope must be selected with a view to obtaining good quality resolution upon imaging, should be safe for diagnostic use in humans and animals, and should preferably have a short physical half-life so as to decrease the amount of radiation received by the body.
  • the radioisotope used should preferably be pharmacologically inert, and, in the quantities administered, should not have any substantial physiological effect.
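  • The preference for a short physical half-life follows from the standard exponential decay law, under which the fraction of activity remaining after time t is 0.5 raised to the power t divided by the half-life. A minimal sketch follows; the 6-hour and 60-day half-lives are hypothetical round numbers chosen for illustration and are not attributed to any particular isotope.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of the initial activity left after `elapsed` time.

    Both arguments must use the same time unit. Physical decay only:
    biological clearance is not modeled.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)


# Hypothetical comparison 12 hours after administration.
short_lived = fraction_remaining(12.0, 6.0)        # half-life of 6 hours
long_lived = fraction_remaining(12.0, 60.0 * 24)   # half-life of 60 days
print(short_lived, long_lived)
```

  • After two half-lives the short-lived label retains a quarter of its activity, whereas the long-lived label is still almost fully active and keeps irradiating the subject.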
  • the ABM may be radio-labeled with different isotopes of iodine, for example ¹²³I, ¹²⁵I, or ¹³¹I (see, for example, U.S. Patent 4,609,725).
  • the extent of radio-labeling must, however, be monitored, since it will affect the calculations made based on the imaging results (i.e., a diiodinated ABM will result in twice the radiation count of a similar monoiodinated ABM over the same time frame).
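  • The effect of labeling extent on the calculations can be illustrated by normalizing raw counts to the average number of label atoms per ABM. This is a sketch under the assumption that counts scale linearly with the number of label atoms (same isotope, same time frame); the function name and the count values are hypothetical.

```python
def normalized_counts(raw_counts, labels_per_abm):
    """Express counts per singly labeled ABM equivalent.

    labels_per_abm is the average number of radioactive atoms per ABM
    molecule (1 for monoiodinated, 2 for diiodinated, and so on).
    """
    if labels_per_abm <= 0:
        raise ValueError("labels_per_abm must be positive")
    return raw_counts / labels_per_abm


# Hypothetical readings for the same amount of bound ABM.
mono = normalized_counts(1000.0, 1)
di = normalized_counts(2000.0, 2)
print(mono, di)
```

  • Both preparations normalize to the same value, which is the comparison the imaging calculations need.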
  • radioisotopes other than ¹²⁵I for labeling in order to decrease the total dosimetry exposure of the human body and to optimize the detectability of the labeled molecule (though this radioisotope can be used if circumstances require). Ready availability for clinical use is also a factor. Accordingly, for human applications, preferred radio-labels are, for example, ⁹⁹ᵐTc, ⁶⁷Ga, ⁶⁸Ga, ⁹⁰Y, ¹¹¹In, ¹¹³ᵐIn, ¹²³I, ¹⁸⁶Re, ¹⁸⁸Re or ²¹¹At.
  • the radio-labeled ABM may be prepared by various methods. These include radio-halogenation by the chloramine-T method or the lactoperoxidase method and subsequent purification by HPLC (high pressure liquid chromatography), for example as described by J. Gutkowska et al. in "Endocrinology and Metabolism Clinics of America" (1987) 16(1):183. Other known methods of radio-labeling can be used, such as IODOBEADS™. There are a number of different methods of delivering the radio-labeled ABM to the end-user. It may be administered by any means that enables the active agent to reach the agent's site of action in the body of a mammal.
  • parenteral administration, i.e., intravenous, subcutaneous, intramuscular, would ordinarily be used to optimize absorption of an ABM, such as an antibody, which is a protein.
  • Obesity and subsequent hyperinsulinemia and hyperglycemia were induced by feeding a group of 3 week old mice (50 C57BL/6 males) a high-fat diet (Bio-Serve, Frenchtown, NJ, #F1850 High Carbohydrate-High Fat; 56% of calories from fat, 16% from protein and 27% from carbohydrates). Another group of 3 week old mice (20 C57BL/6 males) were fed the normal control diet (PMI Nutrition International Inc., Brentwood, MO, Prolab RMH3000; 14% of calories from fat, 16% from protein and 60% from carbohydrates). The mice were placed onto the respective diets immediately following weaning. Animal weights were determined weekly.
  • Results reflect mean ± SE of 50 mice on the HF diet and 20 mice on the Std diet. Normal weight, normal fasting blood glucose and normal fasting plasma insulin levels are defined as the respective mean values of the animals fed the control diet. Two of the "most typical" animals were selected for each group (Control, Hyperinsulinemic and Diabetic) at each time point (2, 4, 8, and 16 weeks after commencement of diet) for sacrifice. The selected mice were sacrificed and muscle tissue obtained and immediately processed for RNA isolation.
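  • The summary statistics and the selection of typical animals might be computed along the following lines. The text does not state how "most typical" was judged, so the criterion used here (closest to the group mean on a single measurement) is an assumption, as are the animal identifiers and weights.

```python
import math


def mean_and_se(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def most_typical(measurements, count=2):
    """Pick the `count` animals whose value lies closest to the group mean.

    measurements: dict mapping animal id -> measured value.
    Assumed selection rule; the source does not define "most typical".
    """
    mean, _ = mean_and_se(list(measurements.values()))
    ranked = sorted(measurements, key=lambda animal: abs(measurements[animal] - mean))
    return ranked[:count]


# Hypothetical body weights (g) for one group at one time point.
weights = {"m1": 20.0, "m2": 22.0, "m3": 24.0, "m4": 30.0}
group_mean, group_se = mean_and_se(list(weights.values()))
selected = most_typical(weights)
print(group_mean, round(group_se, 3), selected)
```

  • Here the group mean is 24.0 g, the SE is about 2.16 g, and animals m3 and m2 would be chosen for sacrifice. In practice weight, glucose and insulin would presumably all be considered together.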
  • Blood glucose measurements. Blood glucose was measured from a drop of blood taken from the tip of the tail of fasted (8 hr) mice using a Lifescan Genuine One Touch glucometer. All measurements occurred between 2:00 pm and 5:00 pm. Plasma insulin measurements. Blood was collected from the tail of fasted (8 hr) mice into a heparinized capillary tube and stored on ice. All collections occurred between 2:00 pm and 5:00 pm. Plasma was separated from red blood cells by centrifugation for 10 minutes at 8000 x g and then stored at -20°C. Insulin concentrations were determined using the Rat Insulin ELISA kit and rat insulin standards (ALPCO) essentially as instructed by the manufacturer. Values were adjusted by a factor of 1.23 as determined by the manufacturer to correct for the species difference in cross-reactivity with the antibody.
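  • The species correction can be applied as a simple scaling step after reading concentrations off the rat standard curve. The sketch below assumes the adjustment is a multiplication by 1.23; the text says only "adjusted by a factor of 1.23", so the direction of the correction, the function name, and the sample readings are assumptions.

```python
SPECIES_CORRECTION_FACTOR = 1.23  # factor stated in the text


def adjust_insulin(raw_values, factor=SPECIES_CORRECTION_FACTOR):
    """Scale ELISA readings obtained against rat standards to mouse values.

    Assumes a multiplicative correction; divide instead if the kit
    documentation specifies the opposite convention.
    """
    return [value * factor for value in raw_values]


# Hypothetical raw readings (ng/ml) from the rat standard curve.
raw_readings = [1.0, 2.0, 4.0]
adjusted = adjust_insulin(raw_readings)
print([round(value, 2) for value in adjusted])
```

  • Applied to the illustrative readings, this gives 1.23, 2.46 and 4.92 ng/ml.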
  • RNA isolation. Total RNA was isolated from muscle (skeletal muscle, specifically, gastrocnemius) of two mice at each time point during the progression of HF diet-induced type 2 diabetes, as well as age-matched controls on the Std diet, using the RNA STAT-60 Total RNA/mRNA Isolation Reagent according to the manufacturer's instructions (Tel-Test, Friendswood, TX). Sample Quantification and Quality Assessment. Total RNA was quantified and assessed for quality on a Bioanalyzer RNA 6000 Nano chip (Agilent). Each chip contained an interconnected set of gel-filled channels that allowed for molecular sieving of nucleic acids.
  • Pin electrodes in the chip were used to create electrokinetic forces capable of driving molecules through these micro-channels to perform electrophoretic separations. Ribosomal peaks were measured by fluorescence signal and displayed in an electropherogram. A successful total RNA sample featured 2 distinct ribosomal peaks (18S and 28S rRNA).
  • RNA was prepared for use as a hybridization target as described in the manufacturer's instructions for CodeLink Expression Bioarrays (TM) (Amersham Biosciences).
  • the CodeLink Expression Bioarrays utilize nucleic acid hybridization of a biotin-labeled complementary RNA(cRNA) target with DNA oligonucleotide probes attached to a gel matrix.
  • the biotin-labeled cRNA target is prepared by a linear amplification method.
  • Poly(A)+ RNA (within the total RNA population) is primed for reverse transcription by a DNA oligonucleotide containing a T7 RNA polymerase promoter 5' to a (dT)24 sequence.
  • the cDNA serves as the template in an in vitro transcription (IVT) reaction to produce the target cRNA.
  • the IVT is performed in the presence of biotinylated nucleotides to label the target cRNA. This procedure results in a 50-200 fold linear amplification of the input poly(A)+ RNA.
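  • As a quick arithmetic check on what a 50-200 fold amplification implies, the expected cRNA yield range can be estimated from the input amount. The 10 ng input is a hypothetical figure, and the calculation treats the fold range as applying to mass, which is a simplifying assumption.

```python
def expected_yield_range(input_amount, min_fold=50, max_fold=200):
    """Return the (low, high) expected product amount for a linear amplification."""
    if input_amount < 0:
        raise ValueError("input_amount must be non-negative")
    return input_amount * min_fold, input_amount * max_fold


# Hypothetical input of 10 ng poly(A)+ RNA.
low_ng, high_ng = expected_yield_range(10)
print(low_ng, high_ng)
```

  • For the illustrative 10 ng input, the expected cRNA yield would fall between 500 ng and 2000 ng.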
  • Hybridization Probes The oligonucleotide probes were provided by the Codelink Uniset Mouse I Bioarray (Amersham, product code 300013). Amine-terminated oligonucleotide probes are attached to a three-dimensional polyacrylamide gel matrix. There are 10,000 oligonucleotide probes, each specific to a well-characterized mouse gene. Each mouse gene is representative of a unique gene cluster from the fourth quarter 2001 Genbank Unigene build. There are also 500 control probes. The sequences of the probes are proprietary to Amersham.
  • Hybridization Using the cRNA target, the hybridization reaction mixture is prepared and loaded into array chambers for bioarray processing as set forth in the manufacturer's instructions for CodeLink Gene Expression Bioarrays™ (Amersham Biosciences). Each sample is hybridized to an individual microarray. Hybridization is at 37°C. The hybridization buffer is prepared as set forth in the Motorola instructions. Hybridization to the microarray is detected with an avidinated fluorescent reagent, Streptavidin-Alexa Fluor® 647 (Amersham).
  • Normal mice compared to hyperinsulinemic mice at 2, 4, 8 and 16 weeks on normal vs. high-fat diet.
  • Hyperinsulinemic compared to hyperinsulinemic/hyperglycemic mice at 2, 4, 8 and 16 weeks on high-fat diets.
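  • Each of these group comparisons reduces, for a given time point, to comparing expression levels gene by gene. The sketch below computes a ratio of group means and applies a 2-fold cutoff; the cutoff, the gene names, and the intensity values are all hypothetical and are not the criteria used in the disclosure.

```python
def fold_changes(reference, test):
    """Ratio of mean test intensity to mean reference intensity per gene.

    reference, test: dicts mapping gene name -> list of array intensities
    (one value per animal) for the same time point.
    """
    ratios = {}
    for gene, ref_values in reference.items():
        ref_mean = sum(ref_values) / len(ref_values)
        test_mean = sum(test[gene]) / len(test[gene])
        ratios[gene] = test_mean / ref_mean
    return ratios


def classify(ratio, threshold=2.0):
    """Label a ratio as up, down or unchanged using an assumed fold cutoff."""
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Hypothetical intensities for two animals per group at one time point.
normal = {"geneA": [1.0, 1.0], "geneB": [4.0, 4.0], "geneC": [2.0, 2.0]}
hyperinsulinemic = {"geneA": [2.0, 4.0], "geneB": [1.0, 1.0], "geneC": [2.0, 3.0]}

ratios = fold_changes(normal, hyperinsulinemic)
calls = {gene: classify(ratio) for gene, ratio in ratios.items()}
print(ratios, calls)
```

  • Repeating the calculation for each time point (2, 4, 8 and 16 weeks) would give a per-gene time course of differential expression for each comparison.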
  • Nucleotide sequences and predicted amino acid sequences were compared to public domain databases using the Blast 2.0 program (National Center for Biotechnology Information, National Institutes of Health). Nucleotide sequences were displayed using ABI prism Edit View 1.0.1 (PE Applied Biosystems, Foster City, CA). Nucleotide database searches were conducted with the then current version of BLASTN 2.0.12, see Altschul, et al., "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res., 25:3389-3402 (1997). Searches employed the default parameters, unless otherwise stated.
  • RefSeq records are owned by NCBI and therefore can be updated as needed to maintain current annotation or to incorporate additional sequence information.” See also http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html. It will be appreciated by those in the art that the exact results of a database search will change from day to day, as new sequences are added. Also, if you query with a longer version of the original sequence, the results will change. The results given here were obtained at one time and no guarantee is made that the exact same hits would be obtained in a search on the filing date. However, if an alignment between a particular query sequence and a particular database sequence is discussed, that alignment should not change (if the parameters and sequences remain unchanged).
  • Northern analysis may be used to confirm the results.
  • Favorable and unfavorable genes, identified as described above, or fragments thereof, will be used as probes in Northern hybridization analyses to confirm their differential expression.
  • Total RNA isolated from subject mice will be resolved by agarose gel electrophoresis through a 1% agarose, 1% formaldehyde denaturing gel, transferred to positively charged nylon membrane, and hybridized to a probe labeled with [³²P]dCTP that was generated from the aforementioned gene or fragment using the Random Primed DNA Labeling Kit (Roche, Palo Alto, CA), or to a probe labeled with digoxigenin (Roche Molecular Biochemicals, Indianapolis, IN), according to the manufacturer's instructions.
  • Real-Time RNA Analysis may also be used for confirmation.
  • RNA will be converted to cDNA and then probed with gene-specific primers made for each clone.
  • "Real-time" incorporation of fluorescent dye will be measured to determine the amount of specific transcript present in each sample. Sample differences (control vs. hyperinsulinemic, hyperinsulinemic vs. diabetic, or control vs. diabetic) will be evaluated. Confirmation using several independent animals is desirable.
  • In situ hybridizations on selected human (obtained by Tissue Informatics) and mouse tissues using cRNA probes generated from mouse genes found to be up- or down-regulated during the disease progression.
  • In situ hybridizations may also be performed on mouse tissues using cRNA probes generated from differentially expressed DNAs. These cRNAs will hybridize to their corresponding messenger RNAs present in cells and will provide information regarding the particular cell types within a tissue that are expressing the particular gene as well as the relative level of gene expression.
  • the cRNA probes may be generated by in vitro transcription of template cDNA by Sp6.
  • tissue sections can also be analyzed using TissueInformatics.
  • TissueAnalytics™ software. A single representative section may be cut from each tissue block, placed on a slide, and stained with H&E. Digital images of each slide may be acquired using a research microscope and digital camera (Olympus E600 microscope and Sony DKC-ST5). These images may be acquired at 20x magnification with a resolution of 0.64 µm/pixel. A hyperquantitative analysis may be performed on the resulting images: first, a digital image analysis can identify and annotate structural objects in a tissue using machine vision. These objects, which are constituents of the tissue, can be annotated because they are visually identifiable and have a biological meaning.
  • Mathematical statistics provides a rich set of additional tools to analyze time-resolved data sets of hyper-quantitative and gene expression profiles for similarities, including rank correlation, the calculation of regression and correlation coefficients, and clustering. Continuous functions may also be fitted through the data points of individual gene and tissue feature data. The relation between gene expression and hyper-quantitative tissue data may be linear or non-linear, in synchronous or asynchronous arrangements.
  • Example 1 Obesity is increasing at an alarming rate in the United States. In parallel, the incidence of type II diabetes is also rising. We are interested in defining alterations in gene expression that correlate with the development of these conditions in the hopes of reversing these dangerous trends. Insulin plays a major role in regulating blood glucose levels. It stimulates the uptake of glucose in adipose tissue and striated muscle for storage as intracellular triglycerides and glycogen. Insulin also inhibits the release of glucose from the liver. Normally, this would prevent the rise in blood sugar concentration that occurs after eating. However, in the early stages of type 2 diabetes, resistance to insulin is seen. Muscle plays a major role in glucose metabolism.
  • type 2 diabetes In normal situations, muscle cells respond to increasing levels of insulin by increasing glucose uptake from the bloodstream. However, during the very early stages of type 2 diabetes, muscle tissue becomes resistant to insulin, requiring the pancreatic beta cells to increase insulin secretion. Eventually, the beta cells become unable to compensate for this increasing insulin resistance from muscle and other cells, and insulin production drops. Thus, clinical type 2 diabetes results from the combination of insulin resistance and impaired beta cell function. Defects in muscle glycogen synthesis are known to play a role in the development of insulin resistance (Petersen and Shulman, 2002).
  • microarray analysis In order to identify additional muscle genes involved in the development of type 2 diabetes, we used microarray analysis to compare RNA expression levels of 10,000 genes in muscle of high fat diet fed and control diet fed mice at various time points in the progression of type 2 diabetes. Microarray analysis provides a more global picture of gene regulation, allowing the identification of families or groups of genes showing similar expression patterns that potentially imply similar or coordinated roles in disease progression. Consumption of the HF diet resulted in significant, progressive increases in body weight and fasting insulin levels in comparison to consumption of the Std diet. Fasting glucose levels of mice on the HF diet were dramatically increased at the first time point assayed (2 weeks) and remained high through the duration of the experiment (16 weeks).
  • Actin, alpha, cardiac (Actc1, NM_009608) was one of the most down-regulated genes when comparing HF to Std mice. It was consistently expressed at lower levels in the HF diabetic mice in comparison to the Std mice and also steadily decreased over the 16-week study.
  • the master tables reflect applicants' analysis of the gene chip data.
  • Col. 1 The mouse gene (upper) and mouse protein (lower) database accession #s.
  • Col. 2 The corresponding mouse Unigene Cluster, as of the 4th Quarter 2001 build.
  • Col. 4 A related human protein, identified by its database accession number. Usually, several such proteins are identified relative to each mouse gene. These proteins have been identified by BLAST searches, as explained in cols. 6-
  • Col. 5 The name of the related human protein.
  • Col. 6 The score (in bits) for the alignment performed by the BLAST program.
  • Col. 7 The E-value for the alignment performed by the BLAST program. It is worth noting that Unigene considers a Blastx E Value of less than 1e-6 to be a "match" to the reference sequence of a cluster.
  • bit score and E-value for the alignment is with respect to the alignment of the mouse DNA of col. 1 to the human protein of col. 4 by BlastX, according to the default parameters.
  • Master Table 1 is divided into three subtables on the basis of the behavior in col. 3. If a gene has at least one significantly favorable behavior, and no significantly unfavorable ones, it is put into Subtable 1A. In the opposite case, it is put into Subtable 1B. If its behavior is mixed, i.e., at least one significantly favorable and at least one significantly unfavorable, it is put into Subtable 1C. Note that this classification is based on the strongest observed differential expression behaviors for each of the three subject comparisons, C-HI, HI-D and C-D.
  • Unigene record link Additional information of interest may be accessed by searching with the mouse gene accession # in the Mouse Gene
  • Subtable 1A Wholly Favorable Genes and Proteins
  • M12866 F (C-D) AAA37164.1 Mm.214950 -1.69 NP_001091.1 alpha 1 actin precursor; alpha skeletal muscle actin 765 NP_005150.1 cardiac muscle alpha actin proprotein; smooth muscle actin 759 NP_001604.1 alpha 2 actin; alpha-cardiac actin 753
  • Actin-related protein 2 358 2e- AAP37280.1 actin alpha 1 skeletal muscle protein 332 7e- XP_208204.1 similar to actin-related protein 2 331 2e- XP_377904.1 similar to cytoplasmic beta-actin 323 4e- AAH36253.1 ACTR2 protein 321 2e- AAH10417.2 ACTG1 protein 321 2e- NP_006678.1 actin-like 7A; actin-like 7-alpha 321 2e- NP_006677.1 actin-like 7B; actin-like 310 3e- AAH09544.1 Unknown (protein for IMAGE:3897065) 310 5e- NP_848620.1 actin-like 300 3e- AAP20052.1 HSD21 299 9e- XP_377631.1 similar to beta actin 299 9e-
  • beta actin beta actin 724 0
  • XP_293924.1 similar to RIKEN cDNA 4732495G21 gene 689 0
  • XP_377904.1 similar to cytoplasmic beta-actin 325 2e-88 AAP37280.1 actin alpha 1 skeletal muscle protein 323 6e-88 AAH10417.2 ACTG1 protein 323 8e-88 AAH36253.1 ACTR2 protein 318 1e-86 NP_006677.1 actin-like 7B; actin-like 7-beta 316 9e-86
  • AAH09544.1 Unknown (protein for IMAGE:3897065) 311 2e-84 BAB71690.1 unnamed protein product 303 6e-82 NP_848620.1 actin-like 303 8e-82 AAP20052.1 HSD21 301 2e-81
  • NP_003118.1 spectrin, alpha, non-erythrocytic 1 (alpha-fodrin) 259 2e-68 plectin 1 isoform 1; hemidesmosomal protein 1; epidermolysis bullosa simplex 1
  • PLE1_HUMAN Plectin 1 (PLTN) (PGN) (Hemidesmosomal protein 1) (HD1) 241 4e-63
  • BPA Hemidesmosomal plaque protein
  • NP_899236.1 230/240kD
  • dystonin hemidesmosomal plaque protein 231 4e-60
  • actin-binding LIM protein 1 isoform m; LIM actin-binding protein 1; limatin; 111 NP_006710.2 actin-binding LIM protein 3 actin-binding LIM protein 1 isoform s; LIM actin-binding protein 1; limatin; NP_006711.2 actin-binding LIM protein 756 BAA74866.2.
  • NM_016860 F (C-D) protein 1 , yeast) homolog A (centractin alpha); centractin alpha; actin-RPV;
  • beta-actin (beta'-actin) 423 e-118
  • XP_293924.1 similar to RIKEN cDNA 4732495G21 gene 417 e-116
  • XP_292982.4 similar to pote protein; Expressed in prostate, ovary, testis, and placenta 404 e-112
  • NP_536356.3 actin-related protein M2; actin-related protein hArpM2; actin-related protein T2 309 1e-83
  • XP_208204.1 similar to actin-related protein 2 296 1e-79
  • coronin actin binding protein, 2A; coronin, actin-binding protein, 2A; coronin 2A; NP_438171.1 coronin-like protein B; WD-repeat protein 2; WD protein IR10 408 e-113 coronin, actin binding protein, 2A; coronin, actin-binding protein, 2A; coronin 2A; NP_003380.2.
  • coronin-like protein B coronin-like protein B; WD-repeat protein 2; WD protein IR10 408 e-113 AAB47807.1 WD protein IR10 404 e-112 T47174 hypothetical protein DKFZp762I166.1 - human (fragment) 389 e-107 AAS48630.1 unknown 314 7e-85 NP_116243.1 hypothetical protein FLJ14871 311 5e-84 AAQ04659.1 Unknown 311 6e-84 NP_078811.1 hypothetical protein FLJ22021 234 ?e-61
  • AA118546 F (C-D) ARP3 actin-related protein 3 homolog; ARP3 (actin-related protein 3, yeast)
  • NP_005712.1 homolog 850 0 actin-related protein 3-beta; actin-related protein 3-beta; actin-related protein NP_065178.1 Arp11; actin-related protein Arp11 793 0 AAP97150.1 actin related protein 662 0 AAH15207.1 ARP3BETA protein 597 e-170 XP_374583.1 similar to actin-related protein Arp11 348 3e-95 JC7580 actin-related protein Arp11 - human 344 4e-94 AAK31778.1 FKSG74 253 8e-67 AAK31776.1 FKSG72
  • beta-actin (beta'-actin) 247 6e-65 AAH08633.1 actin, beta 247 8e-65 NP_005150.1 cardiac muscle alpha actin proprotein; smooth muscle actin 247 8e-65 XP_293924.1 similar to RIKEN cDNA 4732495G21 gene 246 1e-64 ATHUS actin alpha 2, aortic smooth muscle - human 246 1e-64 NP_001604.1 alpha 2 actin; alpha-cardiac actin
  • NP_001606.1 actin gamma 2 propeptide
  • actin alpha-3 245 3e-64
  • ARP1 actin-related protein 1 homolog B centractin beta; centractin beta; ARP1 (actin-related protein 1, yeast) homolog B (centractin beta); PC3; ARP1, yeast NP_005726.1 homolog B 236 1e-61
  • NP 14084.1 Mm.21772 -1.21 NP_003068.2 complex 60 kDa subunit B 828 AAC50696.1 SWI/SNF complex 60 KDa subunit 745 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d3; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60C; Swp73-like protein; chromatin remodeling complex BAF60C subunit; SWI/SNF NP_003069.2 complex 60 kDa subunit C 622 e-178 AAR88510.1 60kDa BRG-1/Brm associated factor subunit c isoform 2 619 e-177 AAC50697.1 SWI/SNF complex 60 KDa subunit 596 e-170 AAH09368.2 SMARCD1 protein 569 e-168
  • SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d1 isoform a; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60A; chromatin remodeling complex BAF60A subunit; Swp73-like protein; NP_003067.2 SWI/SNF complex 60 kDa subunit A 589 e-168 AAD23390.1 SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin D1 582 e-165 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d1 isoform b; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60A; chromatin remodeling complex BAF60A subunit; Swp73-like protein; NP_620710.1 SWI/SNF complex 60 kDa subunit A 505 e-142 AAC50695.1 SWI/SNF complex 60 KDa
  • Actin-related protein 2/3 complex subunit 1A (SOP2-like protein) 723 0 actin related protein 2/3 complex subunit 1B; ARP2/3 protein complex subunit NP_005711.1 p41; actin related protein 2/3 complex, subunit 1A (41 kD) 533 e-151
  • NM_011418 F (C-D) subfamily b, member 1; sucrose nonfermenting, yeast, homolog-like 1; integrase
  • NP_035548.1 Mm.279751 -1.14 NP_003064.2 interactor 1 754 0 SNF5_HUMAN SWI/SNF related, matrix associated, actin dependent regulator of chromatin subfamily B member 1 (Integrase interactor 1 protein) (hSNF5) (BAF47) 749 0 CAA09759.1 Ini1b 728 0 BAB14784.1 unnamed protein product 710 0 CAA76639.1 SNF5/INI1 protein 685 0
  • Subtable 1B Wholly Unfavorable Genes and Proteins
  • Subtable 1C Mixed Genes and Proteins
  • Citation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that the cited documents are considered material to the patentability of any of the claims of the present application. All statements as to the date or representation as to the contents of these documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of these documents. The appended claims are to be treated as a non-limiting recitation of preferred embodiments.
  • references cited herein, including journal articles or abstracts, published, corresponding, prior or otherwise related U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. Additionally, the entire contents of the references cited within the references cited herein are also entirely incorporated by reference. Reference to known method steps, conventional methods steps, known methods or conventional methods is not in any way an admission that any aspect, description or embodiment of the present invention is disclosed, taught or suggested in the relevant art.
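The subtable assignment rule stated above for Master Table 1 (at least one significantly favorable behavior and no unfavorable one gives Subtable 1A, the opposite case gives 1B, and mixed behavior gives 1C) may be easier to follow as a short sketch. The Python below is illustrative only and is not the applicants' software: the function name `assign_subtable` is hypothetical, the code "F" follows the "F (C-D)" entries shown in the table, and the code "U" for an unfavorable behavior is an assumption made for this example.

```python
from typing import Iterable


def assign_subtable(behaviors: Iterable[str]) -> str:
    """Return '1A', '1B' or '1C' for one gene.

    `behaviors` holds the gene's significant behaviors across the subject
    comparisons (C-HI, HI-D, C-D): 'F' for favorable, 'U' for unfavorable.
    The 'U' code is an assumption of this sketch, not taken from the table.
    """
    codes = set(behaviors)
    unknown = codes - {"F", "U"}
    if unknown:
        raise ValueError(f"unexpected behavior codes: {sorted(unknown)}")
    if codes == {"F"}:
        return "1A"  # wholly favorable
    if codes == {"U"}:
        return "1B"  # wholly unfavorable
    if codes == {"F", "U"}:
        return "1C"  # mixed
    # No significant behavior at all: such a gene would not be tabulated.
    raise ValueError("gene has no significant behavior")
```

For instance, a gene whose only significant behavior is favorable in the C-D comparison would be passed as `["F"]` and placed in Subtable 1A.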

Abstract

Mouse genes differentially expressed in comparisons of normal vs. hyperinsulinemic, hyperinsulinemic vs. type 2 diabetic, and normal vs. type 2 diabetic muscle by gene chip analysis have been identified, as have corresponding human genes and proteins. The human molecules, or antagonists thereof, may be used for protection against or diagnosis of hyperinsulinemia or type 2 diabetes, or their sequelae.

Description

DIAGNOSIS OF HYPERINSULINEMIA AND TYPE II DIABETES AND PROTECTION AGAINST SAME BASED ON GENES DIFFERENTIALLY EXPRESSED IN MUSCLE CELLS (15.1)
Cross-Reference to Related Applications
Anti-Aging Applications. Mice with a disrupted growth hormone receptor/binding protein gene enjoy an increased lifespan. In U.S. Prov. Appl. 60/485,222, filed July 8, 2003 (Kopchick8), mouse genes differentially expressed in comparisons of gene expression in growth hormone receptor/binding protein gene-disrupted mouse livers and normal mouse livers were identified, as were corresponding human genes and proteins. It was suggested that the human molecules, or antagonists thereof, could be used for protection against faster-than-normal biological aging, or to achieve slower-than-normal biological aging. It was also taught that the human molecules may also be used as markers of biological aging.
In provisional application Ser. No. 60/474,606, filed June 2, 2003 (our docket Kopchick7-USA), our research group used a gene chip to study the genetic changes in the liver of C57BL/6J mice that occur at frequent intervals of the aging process. Differential hybridization techniques were used to identify mouse genes that are differentially expressed in mice, depending upon their age. The level of gene expression of approximately 10,000 mouse genes (from the Amersham Codelink UniSet Mouse I Bioarray, product code: 300013) in the liver of mice with average ages of 35, 49, 56, 77, 118, 133, 207, 403, 558 and 725 days was determined. In essence, complementary RNA derived from mice of different ages was screened for hybridization with oligonucleotide probes each specific to a particular mouse gene, each gene in turn representative of a particular mouse gene cluster (Unigene). Mouse genes which were differentially expressed (younger vs. older), as measured by different levels of hybridization of the respective cRNA samples with the particular probe corresponding to that mouse gene, were identified. Related human genes and proteins were identified by sequence comparisons to the mouse gene or protein. In the international appl. Kopchick7A-PCT, filed June 2, 2004, we added some additional studies of CIDE-A (see below).
In a like manner, the effect of aging on the expression of genes in mouse skeletal muscle was studied; see provisional application Ser. No. 60/566,068, filed April 29, 2004 (our docket Kopchick14-USA).
Anti-Diabetes Applications. In U.S. Provisional Appl. Ser. No. 60/458,398 (our docket Kelder1-USA), filed March 31, 2003, members of our research group describe the identification of genes differentially expressed in normal vs. hyperinsulinemic, hyperinsulinemic vs. type II diabetic, or normal vs. type II diabetic mouse liver. Forward- and reverse-subtracted cDNA libraries were prepared, clones were isolated, and differentially expressed cDNA inserts were sequenced and compared with sequences in publicly available sequence databases. The corresponding mouse and human genes and proteins were identified.
The purpose of our research group's provisional application Ser. No. 60/460,415 (our docket Kopchick6-USA), filed April 7, 2003, was similar, but complementary RNA, derived from RNA of mouse liver, was screened against a mouse gene chip. See also 60/506,716, filed Sept. 30, 2003 (Kopchick6.1).
Gene chip analyses have also been used to identify genes differentially expressed in normal vs. hyperinsulinemic, hyperinsulinemic vs. type II diabetic, or normal vs. type II diabetic mouse pancreas, see U.S. Provisional Appl. 60/517,376, filed Nov. 6, 2003 (Kopchick12), and muscle, see U.S. Provisional Appl. 60/547,512, filed Feb. 26, 2004 (Kopchick15).
Other differential hybridization applications. The use of differential hybridization to identify genes and proteins is also described in our research group's Ser. No. PCT/US00/12145 (Kopchick3A-PCT), Ser. No. PCT/US00/12366 (Kopchick4A-PCT), and Ser. No. 60/400,052 (Kopchick5).
All of the foregoing applications are hereby incorporated by reference in their entirety.
BACKGROUND OF THE INVENTION
Field of the Invention
The invention relates to various nucleic acid molecules and proteins, and their use in (1) diagnosing hyperinsulinemia and type II diabetes, or conditions associated with their development, and (2) protecting mammals (including humans) against them.
Description of the Background Art
Diabetes
A deficiency of insulin in the body results in diabetes mellitus, which affects about 18 million individuals in the United States. It is characterized by a high blood glucose (sugar) level and glucose spilling into the urine due to a deficiency of insulin. As more glucose concentrates in the urine, more water is excreted, resulting in extreme thirst, rapid weight loss, drowsiness, fatigue, and possibly dehydration. Because the cells of the diabetic cannot use glucose for fuel, the body uses stored protein and fat for energy, which leads to a buildup of acid (acidosis) in the blood. If this condition is prolonged, the person can fall into a diabetic coma, characterized by deep labored breathing and fruity-odored breath. There are two types of diabetes mellitus, Type I and Type II. Type II diabetes is the predominant form found in the Western world; fewer than 8% of diabetic Americans have the Type I disease.
Type I diabetes. In Type I diabetes, formerly called juvenile-onset or insulin-dependent diabetes mellitus, the pancreas cannot produce insulin. People with Type I diabetes must have daily insulin injections. But they need to avoid taking too much insulin because that can lead to insulin shock, which begins with a mild hunger. This is quickly followed by sweating, shallow breathing, dizziness, palpitations, trembling, and mental confusion. As the blood sugar falls, the body tries to compensate by breaking down fat and protein to make more sugar. Eventually, low blood sugar leads to a decrease in the sugar supply to the brain, resulting in a loss of consciousness. Eating a sugary food can prevent insulin shock until appropriate medical measures can be taken. Type I diabetics are often characterized by their low or absent levels of circulating endogenous insulin, i.e., hypoinsulinemia (1). Islet cell antibodies causing damage to the pancreas are frequently present at diagnosis. Injection of exogenous insulin is required to prevent ketosis and sustain life.
Type II diabetes. Type II diabetes, formerly called adult-onset or non-insulin-dependent diabetes mellitus (NIDDM), can occur at any age. The pancreas can produce insulin, but the cells do not respond to it. Type II diabetes is a metabolic disorder that affects approximately 17 million Americans. It is estimated that another 10 million individuals are "prone" to becoming diabetic. These vulnerable individuals can become resistant to insulin, a pancreatic hormone that signals glucose (blood sugar) uptake by fat and muscle. In order to maintain normal glucose levels, the islet cells of the pancreas produce more insulin, resulting in a condition called hyperinsulinemia. When the pancreas can no longer produce enough insulin to compensate for the insulin resistance, and thereby maintain normal glucose levels, hyperglycemia (elevated blood glucose) results, and type II diabetes is diagnosed.
Early Type II diabetics are often characterized by hyperinsulinemia and resistance to insulin. Late Type II diabetics may be normoinsulinemic or hypoinsulinemic. Type II diabetics are usually not insulin dependent or prone to ketosis under normal circumstances.
Little is known about the disease progression from the normoinsulinemic state to the hyperinsulinemic state, and from the hyperinsulinemic state to the Type II diabetic state. As stated above, type II diabetes is a metabolic disorder that is characterized by insulin resistance and impaired glucose-stimulated insulin secretion (2,3,4).
However, Type II diabetes and atherosclerotic disease are viewed as consequences of having the insulin resistance syndrome (IRS) for many years (5). The current theory of the pathogenesis of Type II diabetes is often referred to as the "insulin resistance/islet cell exhaustion" theory. According to this theory, a condition causing insulin resistance compels the pancreatic islet cells to hypersecrete insulin in order to maintain glucose homeostasis. However, after many years of hypersecretion, the islet cells eventually fail and the symptoms of clinical diabetes are manifested. Therefore, this theory implies that, at some point, peripheral hyperinsulinemia will be an antecedent of Type II diabetes. Peripheral hyperinsulinemia can be viewed as the amount of insulin produced by the β cell minus the amount taken up by the liver. Therefore, peripheral hyperinsulinemia can be caused by increased β cell production, decreased hepatic uptake or some combination of both. It is also important to note that it is not possible to determine the origin of insulin resistance once it is established since the onset of peripheral hyperinsulinemia leads to a condition of global insulin resistance.
Multiple environmental and genetic factors are involved in the development of insulin resistance, hyperinsulinemia and type II diabetes. An important risk factor for the development of insulin resistance, hyperinsulinemia and type II diabetes is obesity, particularly visceral obesity (6,7,8). Type II diabetes exists world-wide, but in developed societies, the prevalence has risen as the average age of the population increases and the average individual becomes more obese.
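The balance described above, in which peripheral insulin is what the β cell produces minus what the liver takes up, can be illustrated with a minimal Python sketch. It is a toy calculation under stated assumptions, not a physiological model: the function name `peripheral_insulin`, the arbitrary units, and the example numbers are all hypothetical and do not come from the studies cited.

```python
def peripheral_insulin(beta_cell_output: float, hepatic_uptake: float) -> float:
    """Insulin reaching the periphery = beta cell output - hepatic uptake.

    Both arguments are rates in the same arbitrary units (an assumption of
    this sketch).
    """
    if beta_cell_output < 0 or hepatic_uptake < 0:
        raise ValueError("rates must be non-negative")
    if hepatic_uptake > beta_cell_output:
        raise ValueError("hepatic uptake cannot exceed beta cell output")
    return beta_cell_output - hepatic_uptake


# Illustrative values only (arbitrary units, not measured data).
baseline = peripheral_insulin(10.0, 5.0)
more_production = peripheral_insulin(14.0, 5.0)  # increased beta cell output
less_uptake = peripheral_insulin(10.0, 1.0)      # decreased hepatic uptake
```

Either change alone raises the peripheral value above the baseline, which is the point made in the text: the two causes cannot be told apart from the peripheral level by itself.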
Obesity and Diabetes. Obesity is a serious and growing problem in the United States. Obesity-related health risks include high blood pressure, hardening of the arteries, cardiovascular disease, and Type II diabetes (also known as non-insulin-dependent diabetes mellitus) (9,10,11). Recent studies show that 85% of the individuals with Type II diabetes are obese (12).
Treatment of Diabetes. For many years, treatment was insulin therapy for Type I and oral sulfonylureas and/or insulin therapy for Type II. Metformin (Glucophage) was the first antidiabetic drug approved by FDA (May 1995) for the treatment of Type II diabetes since the oral sulfonylureas were introduced in 1984. Metformin promotes the use of insulin already in the blood. This May 1995 approval was followed by the September 1995 approval of another antidiabetic drug, acarbose (Precose). It slows down the digestion and absorption of complex sugars, which reduces blood sugar levels after meals. Before 1982, insulin was purified from beef or pork pancreas. This was a problem for those diabetics allergic to animal insulin. Researchers produced a synthetic insulin called Humulin. Approved by FDA in 1982, it was the first genetically engineered consumer health product manufactured for diabetics. Synthetic insulins can be produced in unlimited quantities. Another possible treatment for diabetes includes surgically replacing the pancreas' endocrine tissues (islets of Langerhans) with healthy islet of Langerhans tissue grafts. Since 1988, 45 patients worldwide have undergone successful transplantation.
Complications. Complications of diabetes (end organ damage) include retinopathy, neuropathy, and nephropathy (traditionally designated as microvascular complications) as well as atherosclerosis (a macrovascular complication).
Early stages of hyperglycemia can usually be controlled by an alteration in diet and increasing the amount of exercise, but drug treatment, including insulin, may be required. It has been shown that meticulous blood glucose control can often slow down or halt the progression of diabetic complications if caught early enough (1). However, tight metabolic control is extremely difficult to achieve.
Animal Models
Transgenic Mouse Models of Diabetes or Diabetes Resistance. McGrane, et al., J. Biol. Chem. 263:11443-51 (1988) and Chen, et al., J. Biol. Chem., 269:15892-7 (1994) describe the genetic engineering of mice to express bovine growth hormone (bGH) or human growth hormone (hGH), respectively. These mice exhibited an enhanced growth phenotype. They also developed kidney lesions similar to those seen in diabetic glomerulosclerosis, see Yang, et al., Lab. Invest., 68:62-70 (1993). Ogueta, et al., J. Endocrinol., 165:321-8 (2000) reported that transgenic mice expressing bovine GH develop arthritic disorder and self-antibodies.
Growth hormone has many roles, ranging from regulation of protein, fat and carbohydrate metabolism to growth promotion. GH is produced in the somatotrophic cells of the anterior pituitary and exerts its effects either through the GH-induced action of IGF-I, in the case of growth promotion, or by direct interaction with the GHR on target cells including liver, muscle, adipose, and kidney cells. Hyposecretion of GH during development leads to dwarfism, and hypersecretion before puberty leads to gigantism. In adults, hypersecretion of GH results in acromegaly, a clinical condition characterized by enlarged facial bones, hands, feet, fatigue and an increase in weight. Of those individuals with acromegaly, 25% develop type II diabetes. This may be due to insulin resistance caused by the high circulating levels of GH leading to high circulating levels of insulin (Kopchick et al., Annual Rev. Nutrition 1999. 19:437-61).
A further mode of GH action may be through the transcriptional regulation of a number of genes contributing to the physiological effects of GH.
Growth hormone genes and the proteins encoded by them can be converted into growth hormone antagonists by mutation, see Kopchick USP 5,350,836.
Transgenic mice have been made that express the GH antagonists bGH-G119R or hGH-G120R, and which exhibit a dwarf phenotype. Chen, et al., J. Biol. Chem., 269:15892-7 (1994); Chen, et al., Mol. Endocrinol., 5:1845-52 (1991); Chen, et al., Proc. Nat. Acad. Sci. USA 87:5061-5 (1990). These mice did not develop kidney lesions. See Yang (1993), supra.
Chen, et al., Endocrinol., 136:660-7 (1995) compared the effect of streptozotocin treatment in normal nontransgenic mice, and in mice transgenic for (1) a GH receptor antagonist, the G119R mutant of bovine growth hormone, or (2) the E117L mutant of bGH. (According to Chen's ref. 24, these large GH transgenic streptozotocin-treated mice constitute an animal model for diabetes.) Glomerulosclerosis was seen in diabetic (STZ-treated) nontransgenic mice and in diabetic bGH-E117L mice, but not in diabetic bGH-G119R (GH antagonist) mice.
Two of the proteins which mediate growth hormone activity are the growth hormone receptor and the growth hormone binding protein, encoded by the same gene in mice (GHR/BP). It is possible to genetically engineer mice so that the gene encoding these proteins is disrupted ("knocked-out"; inactivated), see Zhou, et al., Proc. Nat. Acad. Sci. (USA), 94:13215-20 (1997). Zhou, et al. inactivated the GHR/BP gene by replacing the 3' portion of exon 4 (which encodes a portion of the GH binding domains) and the 5' region of intron 4 with a neomycin gene cassette. The modified gene was introduced into the target mice by homologous recombination. Like mice expressing a GH antagonist, homozygous GHR/BP-KO mice exhibit a dwarf phenotype. GHR/BP-KO mice, made diabetic by streptozotocin treatment, are protected from the development of diabetes-associated nephropathy. Bellush, et al., Endocrinol., 141:163-8 (2000).
High-Fat Diets. High-fat diets have been shown to induce both obesity and Type II diabetes in laboratory animals (13). Surwit and colleagues demonstrated that male C57BL/6J mice are extremely sensitive to the diabetogenic effects of a high-fat diet when initiated at weaning. At six months of age, high-fat fed animals had significantly elevated fasting blood-glucose and insulin levels and also demonstrated a decrease in insulin sensitivity (14). Ahren and colleagues (15) reported evidence of insulin resistance as well as diminished glucose-stimulated insulin release, after feeding with a high-fat diet for 12 weeks. These mice also showed elevated levels of total cholesterol, triglycerides, and free fatty acids, another hallmark of Type II diabetes.
Anatomy and Physiology of Muscle
Muscle tissue constitutes about 40% of the body mass. Muscles may be classified by location, i.e., skeletal if attached to bone, cardiac if forming the wall of the heart, and visceral if associated with another body organ. Muscles may also be classified as voluntary or involuntary, depending on how their contractions and relaxations are controlled. Skeletal muscles are voluntary, while cardiac and visceral muscles are involuntary. It is also possible to classify muscles morphologically; skeletal and cardiac muscle cells are striated, whereas visceral muscle cells are not.
Each skeletal muscle is composed of many individual muscle cells called muscle fibers. The fibers are held together by fibrous connective-tissue membranes called fascia. The fascia which envelops the entire muscle is the epimysium, and the fascia which penetrate the muscle, separating the fibers into bundles (fasciculi), are called perimysium. Very thin fascia (endomysium) sheathe each muscle fiber. Skeletal muscles are attached either directly to a bone, or indirectly through a tendon.
The individual muscle fibers (cells) comprise threadlike protein structures called myofibrils. There are over 600 muscles in the human body. We will have occasion later to refer to the gastrocnemius. It is a superficial muscle in the posterior compartment of the lower leg, which together with the underlying soleus forms the characteristic bulge of the calf.
Role of Muscle in Development of Type II Diabetes
Muscle, fat and liver tissues are the major contributors to the development of insulin resistance, hyperinsulinemia, and, ultimately, type II diabetes.
Muscle cells respond to insulin by increasing glucose uptake from the bloodstream. Muscle tissue can become resistant to insulin, causing the beta cells to initially increase insulin secretion. Eventually, though, the beta cells become unable to compensate for this increasing insulin resistance from muscle and other cells, and they fail to respond to elevated blood glucose levels. Thus, clinical type 2 diabetes results from the combination of insulin resistance and impaired beta cell function. Defects in muscle glycogen synthesis are known to play a role in the development of insulin resistance. At least three steps (those mediated by glycogen synthase, hexokinase, and GLUT4) have been reported to be defective in patients with type 2 diabetes. Fatty acids can induce insulin resistance, and it has been suggested that this is a consequence of altered insulin signaling through PI3-kinase. PKC-theta has also been implicated. See generally Petersen, et al., "Pathogenesis of skeletal muscle insulin resistance in type 2 diabetes mellitus", in "A Symposium: Evolution of type 2 diabetes mellitus management", at Amer. J. Cardiol., 90(5A):11G-18G (Sept. 5, 2002).
Adverse Effects of Type II Diabetes on Muscle
"Myopathy is a general term used to describe any disease of muscles, such as the muscular dystrophies and myopathies associated with thyroid disease. It can be caused by endocrine disorders, including diabetes, metabolic disorders, infection or inflammation of the muscle, certain drugs and mutations in genes. In diabetes, myopathy is thought to be caused by neuropathy, a complication of diabetes. General symptoms of myopathies include muscle weakness of limbs sometimes occurring during exercise although in some cases the symptoms diminish as exercise increases. Depending on the type of myopathy, one muscle group may be more affected than others." See "Joint and Muscle Problems Associated with Diabetes", www.iddtinternational.org/jointandmuscleproblems.html [last modified June 12, 2003].
Diabetic muscle infarction can spontaneously affect patients with a long history of poorly controlled diabetes. "Most affected patients have multiple microvascular complications (neuropathy, nephropathy, and retinopathy). The clinical presentation is an acute onset of pain and swelling over days to weeks in the affected muscle groups (usually the thigh or calf), along with varying degrees of tenderness.... Therapy consists of rest and analgesia. Routine daily activities are not deleterious to the condition, but physical therapy may cause exacerbation. Spontaneous diabetic muscle infarction tends to resolve over a period of weeks to months in most cases." See "Musculoskeletal Complications of Diabetes - Part 2", www.diabetic-lifestyle.com/articles/jan02_whats_1.htm [last modified Feb. 9, 2004]. See also Trujillo-Santos, et al., "Diabetes muscle infarction: an underdiagnosed complication of long-standing diabetes," Diabetes Care, 26(1):211-5 (2003).
Identification of genes involved in hyperinsulinemia and type II diabetes, generally
Our attention recently has focused on the generation of muscle mRNA expression profiles and the identification of genes involved in the genesis of obesity-induced hyperinsulinemia and type II diabetes. To date, no one has attempted to study the actual progression from the normal condition to that of hyperinsulinemia, or from hyperinsulinemia to Type II diabetes, in an attempt to identify genes that are up-regulated or down-regulated in muscle as the disease progresses. In previous studies aimed at identifying genes involved in diabetes-induced glomerulosclerosis, differential display and traditional subtractive hybridization techniques were used (16-20). While effective for the identification of a few genes (e.g., hmunc13, PED/PEA-15, lactate dehydrogenase, amiloride sensitive sodium channel, ubiquitin-like protein, mdr 1, and a-amyloid protein precursor, as well as a few novel genes), these techniques can be quite labor-intensive. The PCR-based method of subtractive hybridization requires less starting material, and allows the simultaneous isolation of all differentially expressed cDNAs into two groups (up-regulated and down-regulated). However, the PCR-based method of subtractive hybridization is also quite labor-intensive, produced large numbers of false positive candidates and ultimately resulted in the identification of a relatively limited number of differentially expressed genes (see Kelder1-USA application).
In order to expand the number of genes that can be analyzed simultaneously, several groups have begun to utilize DNA microarray analysis to measure differences in gene expression between normal and diseased states. However, these experiments have been limited in regard to the number of experimental conditions analyzed. DNA microarray analysis has been performed on normal, obese and diabetic mice (21). Also, the obesity and diabetes in the mouse models examined were caused by a specific endogenous genetic mutation (22). The differentially expressed genes in the above models may be very different from genes differentially expressed due to diet-induced obesity and Type II diabetes.
The use of differential expression and related techniques to identify genes useful in the treatment of diabetes has been reviewed by Perfetti, et al., Diabetes Technol. & Therapeut., 5(3):421-3 (2003); Bernal-Mizrachi, et al., Diabetes Metab. Res. Rev., 19:32-42 (2003).
Other papers of interest include:
Wada, et al., "Gene expression profile in streptozotocin-induced diabetic mice kidneys undergoing glomerulosclerosis", Kidney Int., 59:1363-73 (2001);
Song, et al., "Cloning of a novel gene in the human kidney homologous to rat munc13S: its potential role in diabetic nephropathy", Kidney Int., 53:1689-95 (1998);
Page, et al., "Isolation of diabetes-associated kidney genes using differential display", Biochem. Biophys. Res. Comm., 232:49-53 (1997);
Peradi, "Subtractive hybridization claims: An efficient technique to detect overexpressed mRNAs in diabetic nephropathy," Kidney Int., 53:926-31 (1998);
Condorelli, EMBO J., 17:3858-66 (1998).
Diabetes-Specific Differential Expression in Muscle
Sreekumar, et al., "Gene expression profile in skeletal muscle of type 2 diabetes and the effect of insulin treatment," Diabetes 51: 1913 (June 2002) surveyed 6,451 genes, and identified 85 genes for which there was an alteration in skeletal muscle transcription in diabetic patients after withdrawal of insulin treatment. Subsequent insulin treatment resulted in further changes in transcription of 74 of the 85 genes (15 increased, 59 decreased), and also resulted in alteration of 29 additional gene transcripts.
Mootha, et al., "PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes," Nature Genetics 34(3): 267 (July 2003), used DNA microarrays to detect changes in the expression of sets of related genes, rather than of individual genes. They classified over 22,000 genes into 149 data sets; some of these data sets overlapped. They looked for a statistical correlation between the overall rank order of the genes in differential expression, and the groups to which the genes belonged. Expression was compared pairwise among three groups: males with normal glucose tolerance; males with impaired glucose tolerance; and males with type 2 diabetes. The set with the highest enrichment score (the one whose members ranked highly most often relative to chance expectation) was an internally curated set of 106 genes involved in oxidative phosphorylation. While the average decrease for the individual genes was modest (~20%), it was also consistent, being observed in 89% (94/106) of the genes in question. This paper is reviewed by Toye and Gauguier, "Genetics and functional genomics of type 2 diabetes mellitus", Genome Biology, 4: 241 (2003).
Patti, et al., "Coordinated reduction of genes of oxidative metabolism in humans with insulin resistance and diabetes: Potential role of PGC1 and NRF1", Proc. Nat. Acad. Sci. (USA), 100(14): 8466 (July 8, 2003) used microarrays to analyze skeletal muscle expression of genes in nondiabetic insulin-resistant subjects at high risk for diabetes (based on family history of diabetes and Mexican-American ethnicity) and diabetic Mexican-American subjects. Of 7,129 sequences represented on the microarray, 187 were differentially expressed between control and diabetic subjects. However, no single gene remained significantly differentially expressed after controlling for multiple comparison false discovery by using the Benjamini-Hochberg method, see Benjamini, et al., J. R. Stat. Soc. Ser. B, 57:289-300 (1995); Dudoit, et al., Stat. Sin., 12:111-139 (2002). Consequently, Patti et al. sought to identify groups of related genes with similar patterns of differential expression using MAPP FINDER and ONTOEXPRESS. According to MAPP FINDER, the top-ranked cellular component terms were mitochondrion, mitochondrial membrane, mitochondrial inner membrane, and ribosome, and the top-ranked process term was ATP biosynthesis. According to ONTOEXPRESS, the over-represented groups were energy generation, protein biosynthesis/ribosomal proteins, RNA binding, ribosomal structural protein, and ATP synthase complex.
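By way of illustration only, the Benjamini-Hochberg step-up procedure referred to above can be sketched as follows. This is a generic sketch of the published procedure, not a reconstruction of the analysis actually performed by Patti et al.; the function name, the false discovery rate q = 0.05, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    at false discovery rate q (Benjamini-Hochberg step-up procedure)."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values; five are below 0.05 before correction.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p))
```

In this hypothetical example only the two smallest p-values survive the correction, which illustrates how a gene that appears differentially expressed in isolation may fail to remain significant once the number of comparisons is taken into account.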
Huang, Xudong, "Identification of abnormally expressed genes in skeletal muscle contributing to insulin resistance and type 2 diabetes", Thesis, document id: 9576 Lunds University 2002, reported differential expression of the mitochondrially-encoded ND1 gene in human diabetic patients and of the nuclear-encoded cathepsin L gene in mice.
Standaert, et al., "Skeletal muscle insulin resistance in obesity-associated type 2 diabetes in monkeys is linked to a defect in insulin activation of protein kinase C-zeta/lambda/iota", Diabetes 51: 2936 (Oct. 2002). The authors concluded that defective activation of atypical PKCs (aPKCs) played an important role in the pathogenesis of peripheral insulin resistance in both obese prediabetic and diabetic monkeys. They attributed this linkage to the apparent requirement for aPKCs during insulin-stimulated glucose transport.
Strommer, et al., "Skeletal muscle insulin resistance after trauma: insulin signaling and glucose transport", Am. J. Physiol., 275(2 Pt. 1): E351-8 (Aug. 1998) concluded that insulin resistance in skeletal muscle after surgical trauma is associated with reduced glucose transport but not with impaired insulin signaling to PI 3-kinase or its downstream target, Akt.
Aging-Specific Differential Expression in Muscle
Gene Chip-Based Identification of genes involved in aging of skeletal muscle
Several groups have used DNA microarrays to measure differences in gene expression caused by the aging process. However, these experiments are extremely limited in regard to the number of aging time points or experimental conditions.
Weindruch, et al., "Microarray profiling of gene expression in aging and its alteration by caloric restriction in mice" in Symposium: Calorie Restriction: effects on Body Composition, Insulin Signaling and Aging 918S-923S (2001) (21) compared expression in gastrocnemius muscle from 5- and 30-month old C57BL/6 mice, with and without caloric restriction. In this analysis, the expression of 113 genes was found to be changed by at least two-fold in 5-month old mice compared to 30-month old mice. Caloric restriction of comparable mice caused a reversal of the altered gene expression of 33 genes. Of the 6347 genes surveyed in the oligonucleotide microarray, only 58 (0.9%) displayed a greater than 2-fold increase in gene expression as a function of aging, whereas 55 (0.9%) displayed a greater than 2-fold decrease. Of the genes positively correlated with aging, 16% could be assigned to stress responses. The largest differential expression between young and aged animals (3.8-fold) was the mitochondrial sarcomeric creatine kinase. Of the genes negatively correlated with aging, 13% were involved in energy metabolism. A noteworthy number were genes encoding biosynthetic enzymes (cytochrome P450 IIC12, squalene synthase, stearoyl-CoA desaturase, EF-1-gamma). Another down regulator was a CpG binding protein, MeCP2. Weindruch further reported that age-related changes in gene expression profile were "remarkably attenuated" by caloric restriction.
What appears to be the same experiment is discussed in Lee, et al., "Gene expression profile of aging and its retardation by caloric restriction," Science, 285: 1390 (Aug. 27, 1999). This paper lists the individual genes which were differentially expressed by more than 2-fold, and classifies them as energy metabolism, neuronal factors, protein metabolism, stress response, biosynthesis, calcium metabolism or DNA repair genes.
Welle, et al., "Skeletal muscle gene expression profiles in 20-29 year old and 65-71 year old women," Exper. Gerontol., 39:369-77 (2004), available electronically as doi:10.1016/j.exger.2003.11.011, studied gene expression and physical condition in seven young and eight older women. With respect to physical condition, the measured or calculated parameters were total body mass, lean body mass, left leg lean mass (by biopsy), maximum isometric left knee extension force, left knee extension force/left leg lean mass, Peak VO2/lean body mass, and Peak VO2/left leg lean mass. There were 1178 "probe sets" (representing 1053 different Unigene clusters) for which differential expression was detected: 550 for which expression was higher in older women, and 628 showing the inverse effect. The differences ranged from 1.2 to 4 fold; most (78%) were less than 1.5 fold. The complete list of differentially expressed genes is given in the Rochester Muscle database website, www.urmc.rochester.edu/smd/crc/swindex (".html" omitted, in accordance with USPTO requirements, so that the publication of this application will not create an active hyperlink). The gene most highly overexpressed in older muscle was p21 (cyclin-dependent kinase inhibitor 1A) (4.01 fold). This is one of several genes (see Welle Table 2) which are potentially related to DNA damage and repair.
Welle also thought it noteworthy how many of the differentially expressed genes were ones that encode proteins which bind to pre-mRNAs or mRNAs (see Welle Table 3).
Other Differential/Subtractive Hybridization Studies of Interest
Zhang, et al., Kidney International, 56:549-558 (1999) identified genes up-regulated in 5/6 nephrectomized (subtotal renal ablation) mouse kidney by a PCR-based subtraction method. Ten known and nine novel genes were identified. The ultimate goal was to identify genes involved in glomerular hyperfiltration and hypertrophy. elia, et al., Endocrinol., 139:688-95 (1998) applied subtractive hybridization methods for the identification of androgen-regulated genes in mouse kidney. The treatment mice were dosed with dihydrotestosterone (DHT), an androgen. Kidney androgen-regulated protein gene was used as a positive control, as it is known to be up-regulated by DHT. See also Holland, et al., Abstract 607, "Identification of Genes Possibly Involved in Nephropathy of Bovine Growth Hormone Transgenic Mice" (Endocrine Society Meeting, June 22, 2000) and Coschigano, et al., Abstract 333, "Identification of Genes Potentially Involved in Kidney Protection During Diabetes" (Endocrine Society Meeting, June 22, 2000).
The following differential hybridization articles may also be of interest: Wada, et al., "Gene expression profile in streptozotocin-induced diabetic mice kidneys undergoing glomerulosclerosis", Kidney Int., 59:1363-73 (2001); Song, et al., "Cloning of a novel gene in the human kidney homologous to rat munc13S: its potential role in diabetic nephropathy", Kidney Int., 53:1689-95 (1998); Page, et al., "Isolation of diabetes-associated kidney genes using differential display", Biochem. Biophys. Res. Comm., 232:49-53 (1997); Peradi, "Subtractive hybridization claims: An efficient technique to detect overexpressed mRNAs in diabetic nephropathy," Kidney Int., 53:926-31 (1998); Condorelli, EMBO J., 17:3858-66 (1998).
Apoptosis and CIDE-A
Apoptosis is a form of programmed cell death that occurs in an active and controlled manner to eliminate unwanted cells. Apoptotic cells undergo an orchestrated cascade of morphological changes such as membrane blebbing, nuclear shrinkage, chromatin condensation, and formation of apoptotic bodies, which then undergo phagocytosis by neighboring cells. One of the hallmarks of cellular apoptosis is the cleavage of chromosomal DNA into discrete oligonucleosomal size fragments. This orderly removal of unwanted cells minimizes the release of cellular components that may affect neighboring tissue. In contrast, membrane rupture and release of cellular components during necrosis often leads to tissue inflammation. The process of apoptosis is highly conserved and involves the activation of the caspase cascade. Cohen, G.M. (1997) Caspases: the executioners of apoptosis. Biochem. J. 326:1-16; Budihardjo, I., Oliver, H., Lutter, M., Luo, X., Wang, X. (1999) Biochemical pathways of caspase activation during apoptosis. Annu. Rev. Cell. Dev. Biol. 15:269-290; Jacobson, M.D., Weil, M., Raff, M.C. (1997) Programmed cell death in animal development. Cell 88:347-354. Caspases are a family of cysteine proteases that are synthesized as inactive proenzymes. Their activation by apoptotic signals such as CD95 (Fas) death receptor activation or tumor necrosis factor results in the cleavage of specific target proteins and execution of the apoptotic program. Apoptosis may occur by either an extrinsic pathway involving the activation of cell surface death receptors (DR) or by an intrinsic mitochondrial pathway. Yoon, J-H., Gores, G.J. (2002) Death receptor-mediated apoptosis and the liver. J. Hepatology 37:400-410. These pathways are not mutually exclusive and some cell types require the activation of both pathways for maximal apoptotic signaling.
In type-I cells, death receptor activation leads to the recruitment and activation of caspases-8/10 and the rapid cleavage and activation of caspase-3 in a mitochondrial-independent manner. Hepatocytes are members of the Type-II cells, in which mitochondria are essential for DR-mediated apoptosis. Scaffidi, C., Fulda, S., Srinivasan, A., Friesen, C., Li, F., Tomaselli, K.J., Debatin, K.M., Krammer, P.H., Peter, M.E. (1998) Two CD95 (APO-1/Fas) signaling pathways. EMBO J. 17:1675-1687. In this pathway, the pro-apoptotic protein Bid is truncated by activated caspases-8/10 and translocates to the mitochondria. Luo, X., Budihardjo, I., Zou, H., Slaughter, C., Wang, X. (1998) Bid, a Bcl2 interacting protein, mediates cytochrome c release from mitochondria in response to activation of cell surface death receptors. Cell 94:481-490; Li, H., Zhu, H., Xu, C.J., Yuan, J. (1998) Cleavage of BID by caspase 8 mediates the mitochondrial damage in the Fas pathway of apoptosis. Cell 94:491-501. This translocation leads to mitochondrial cytochrome c release and eventual activation of caspases-3 and 7 via cleavage by activated caspase-9.
One of the substrates for activated caspase-3 is the DNA fragmentation factor (DFF). DFF is composed of a 45 kDa regulatory subunit (DFF45) and a 40 kDa catalytic subunit (DFF40). Liu, X., Zou, H., Slaughter, C., Wang, X. (1997) DFF, a heterodimeric protein that functions downstream of caspase-3 to trigger DNA fragmentation during apoptosis. Cell 89:175-184. DFF45 cleavage by activated caspase-3 results in its dissociation from DFF40 and allows the caspase-activated DNAse (CAD) activity of DFF40 to cleave chromosomal DNA into oligonucleosomal size fragments. Liu, X., Li, P., Widlak, P., Zou, H., Luo, X., Garrard, W.T., Wang, X. (1998) The 40-kDa subunit of DNA fragmentation factor induces DNA fragmentation and chromatin condensation during apoptosis. Proc. Natl. Acad. Sci. USA 95:8461-8466; Halenbeck, R., MacDonald, H., Roulston, A., Chen, T.T., Conroy, L., Williams, L.T. (1998) CPAN, a human nuclease regulated by the caspase-sensitive inhibitor DFF45. Curr. Biol. 8:537-540; Nagata, S. (2000) Apoptotic DNA fragmentation. Exp. Cell Res. 256:12-8.
Recently, a novel family of cell-death-inducing DFF45-like effectors (CIDEs) has been identified that includes CIDE-A, CIDE-B and CIDE-3/FSP27. Inohara, N., Koseki, T., Chen, S., Wu, X., Nunez, G. (1998) CIDE, a novel family of cell death activators with homology to the 45 kDa subunit of the DNA fragmentation factor. EMBO J. 17:2526-2533; Danesch, U., Hoeck, W., Ringold, G.M. (1992) Cloning and transcriptional regulation of a novel adipocyte-specific gene, FSP27. CAAT-enhancer-binding protein (C/EBP) and C/EBP-like proteins interact with sequences required for differentiation-dependent expression. J. Biol. Chem. 267:7185-7193; Liang, L., Zhao, M., Xu, Z., Yokoyama, K.K., Li, T. (2003) Molecular cloning and characterization of CIDE-3, a novel member of the cell-death-inducing DNA-fragmentation-factor (DFF45)-like effector family. Biochem. J. 370:195-203. The CIDEs contain an N-terminal domain that shares homology with the N-terminal region of DFF45 and may represent a regulatory region via protein interaction. See Inohara, supra; Lugovskoy, A.A., Zhou, P., Chou, J.J., McCarty, J.S., Li, P., Wagner, G. (1999) Solution structure of the CIDE-N domain of CIDE-B and a model for CIDE-N/CIDE-N interactions in the DNA fragmentation pathway of apoptosis. Cell 99:747-755. The family members also share a C-terminal domain that is necessary and sufficient for inducing cell death and DNA fragmentation; see Inohara supra. The overexpression of CIDE-A induces cell death that can be inhibited by DFF45. However, CIDE-A-induced apoptosis is not inhibited by caspase-8 inhibitors, thereby suggesting the presence of additional, caspase-independent pathway(s) for the induction of apoptosis; see Inohara supra. Previous reports have indicated that human and mouse CIDE-A are expressed in several tissues such as brown adipose tissue (BAT) and heart and are localized to the mitochondria. Zhou, Z., Yon Toh, S., Chen, Z., Guo, K., Ng, C.P., Ponniah, S., Lin, S.C., Hong, W., Li, P. (2003) Cidea-deficient mice have lean phenotype and are resistant to obesity. Nat. Genet. 35:49-56.
In addition to the ability to induce apoptosis, CIDE-A can interact with and inhibit UCP1 in BAT and may therefore play a role in regulating energy balance, see Zhou supra. Previous reports have indicated that CIDE-A is not expressed in either adult human or mouse liver tissue, see Inohara supra, Zhou supra.
The human protein cell death activator CIDE-A is of particular interest because of its highly dramatic change in liver expression with age, first demonstrated in our Kopchick7 application, supra. CIDE-A expression is elevated in older normal mice. CIDE-A expression was studied for normal C57BL/6J mice at ages 35, 49, 77, 133, 207, 403 and 558 days. Expression is low at the first five data points, then rises sharply at 403 days, and again at 558 days. CIDE-A was therefore classified as an "unfavorable protein", i.e., it was taught that an antagonist to CIDE-A could retard biological aging. In Kopchick7A-PCT we reported that CIDE-A is also prematurely expressed in hyperinsulinemic and type II diabetic mouse liver tissue. CIDE-A expression also correlates with liver steatosis in diet-induced obesity, hyperinsulinemia and type II diabetes. These observations suggest an additional pathway of apoptotic cell death in Non-Alcoholic Fatty Liver Disease (NAFLD) and that CIDE-A may play a role in this serious disease and potentially in liver dysfunction associated with type II diabetes.
SUMMARY OF THE INVENTION
Differential hybridization techniques have been used to identify mouse genes that are differentially expressed in the muscle (gastrocnemius) of mice, depending upon their development of hyperinsulinemia or type II diabetes. In essence, complementary RNA (cRNA) derived from normal mice, or mouse models of hyperinsulinemia or type II diabetes, was screened for hybridization with oligonucleotide probes each specific to a particular mouse database DNA, the latter being identified, by database accession number, by the gene chip manufacturer. Each database DNA in turn was also identified by the gene chip manufacturer as representative of a particular mouse gene cluster (Unigene). In most cases, this database DNA sequence is a full length genomic DNA or cDNA sequence, and is therefore either identical to, or otherwise encodes the same protein as does, a natural full-length genomic DNA protein coding sequence. Those which are not full length present at least a partial sequence of a natural gene or its cDNA equivalent. For the sake of simplicity, all of these mouse database DNA sequences, whether full-length or partial, and whether cDNA or genomic DNA, are referred to herein as "mouse genes". When only the genomic sequence is intended, we will refer specifically to "genomic DNA" or "gDNA".
The sequences in the protein databases are determined either by directly sequencing the protein or, more commonly, by sequencing a DNA, and then determining the translated amino acid sequence in accordance with the Genetic Code. All of the mouse sequences in the mouse polypeptide database are referred to herein as "mouse proteins" regardless of whether they are in fact full length sequences.
Mouse genes which were differentially expressed (normal vs. hyperinsulinemic, hyperinsulinemic vs. diabetic, or normal vs. diabetic), as measured by different levels of hybridization of the respective cRNA samples with the particular probe corresponding to that mouse gene, were identified. Since the progression is from normal to hyperinsulinemic, and thence from hyperinsulinemic to type II diabetic, one may define mammalian subjects as being more favored or less favored, with normal subjects being more favored than hyperinsulinemic subjects, and hyperinsulinemic subjects being more favored than type II diabetic subjects. The subjects' state may then be correlated with their gene expression activity.
The terms "normal" and "control" are used interchangeably in this specification, unless expressly stated otherwise. The control or normal subject is a mouse which is normal vis-a-vis fasting insulin and fasting glucose levels. The term "normal", as used herein, means normal relative to those parameters, and does not necessitate that the mouse be normal in every respect.
A mouse gene is said to have exhibited a favorable behavior if, for a particular mouse age of observation, its average level of expression in mice which are in a more favored state is higher than that in mice which are in a less favored state. A mouse gene is said to have exhibited an unfavorable behavior if, for a particular mouse age of observation, its average level of expression in mice which are in a more favored state is lower than that in mice which are in a less favored state.
When we observe the mice at several different ages, it is possible for their expression behavior to vary from time point to time point. For a given comparison of subjects, e.g., normal vs. hyperinsulinemic, we classify the mouse gene as favorable or unfavorable on the basis of the direction of the largest expression change, and it is the magnitude of this largest expression change, expressed as a ratio of greater to lesser, which is set forth in the Master Table 1 data for that mouse gene.
Thus, if at 2 weeks there was a 3-fold favorable behavior, and at 8 weeks there was a 4-fold unfavorable behavior, and at all other observed time points the behavior was weaker than 3-fold, the mouse gene would be classified as an unfavorable gene with respect to the subject comparison in question. It will be appreciated that if the mouse gene were observed at an age other than one of the ages noted in the Examples, we might have observed a still stronger differential expression behavior. Nonetheless, we must classify the mouse genes on the basis of the behavior which we actually observed, not the behavior which might have been observed at some other age.
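The classification rule for a single subject comparison may be illustrated by the following minimal sketch. It is offered only as an illustration under stated assumptions: the class and function names, the use of mean expression levels as inputs, and the numerical values are hypothetical and are not taken from the Examples or from Master Table 1.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    age_weeks: float
    more_favored: float  # mean expression in the more favored group (e.g., control)
    less_favored: float  # mean expression in the less favored group (e.g., hyperinsulinemic)


def classify_comparison(observations):
    """Return (label, ratio) for the age showing the largest expression change.

    The ratio is expressed as greater/lesser. The label is "favorable" when
    expression is higher in the more favored group, "unfavorable" when lower.
    """
    best_label, best_ratio = None, 1.0
    for obs in observations:
        greater = max(obs.more_favored, obs.less_favored)
        lesser = min(obs.more_favored, obs.less_favored)
        if lesser <= 0:
            continue  # assumption: skip ages lacking a usable expression level
        ratio = greater / lesser
        if ratio > best_ratio:
            best_ratio = ratio
            best_label = ("favorable" if obs.more_favored > obs.less_favored
                          else "unfavorable")
    return best_label, best_ratio


# Hypothetical levels reproducing the worked example in the text:
# 3-fold favorable at 2 weeks, 4-fold unfavorable at 8 weeks, weaker elsewhere.
example = [
    Observation(age_weeks=2, more_favored=3.0, less_favored=1.0),
    Observation(age_weeks=8, more_favored=1.0, less_favored=4.0),
    Observation(age_weeks=16, more_favored=1.5, less_favored=1.0),
]
print(classify_comparison(example))  # ('unfavorable', 4.0)
```

The sketch returns the direction and magnitude of the single largest change, so a weaker behavior in the opposite direction at another age does not alter the classification.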
We are particularly interested in mouse genes which exhibit strongly favorable or unfavorable differential expression behaviors. A behavior is considered strong if the ratio of the higher level to the lower level is at least two-fold. However, a mouse gene may still be identified as favorable or unfavorable even if none of its observed behaviors are strong as defined above. In general, we consider the consistency of its behaviors (that is, are all or most of the differential expression behaviors at different ages in the same direction, e.g., hyperinsulinemic higher than control), the magnitude of the behaviors (the higher the better), and the expression behavior of structurally or functionally related mouse genes (a mouse gene is more likely to be identified as favorable on the basis of a weakly favorable behavior if it is related to other mouse genes which exhibited favorable, especially strongly favorable, behavior). If we considered a mouse gene with only weak differential expression behavior to be worthy of consideration on the basis of these criteria, then we listed it in Master Table 1 in the appropriate subtable. Preferably, the differential behavior observed is both strong and consistent. Preferably, if related mouse genes were tested, they exhibit the same direction of differential expression behavior.
A mouse gene which was more strongly expressed in hyperinsulinemic tissue than in either normal or type II diabetic tissue (i.e., C<HI, HI>D) will be deemed both "unfavorable", by virtue of the control:hyperinsulinemic comparison, and "favorable", by virtue of the hyperinsulinemic:diabetic comparison. This is one of several possible "mixed" expression patterns. Thus, we can subdivide the "favorables" into wholly and partially favorables. Likewise, we can subdivide the unfavorables into wholly and partially unfavorables. The genes/proteins with "mixed" expression patterns are, by definition, both partially favorable and partially unfavorable. In general, use of the wholly favorable or wholly unfavorable genes/proteins is preferred to use of the partially favorable or partially unfavorable ones.
It is evident from the foregoing that mixed genes/proteins are those exhibiting a combination of favorable and unfavorable behavior. A mixed gene/protein can be used as would a favorable gene/protein if its favorable behavior outweighs the unfavorable. It can be used as would an unfavorable gene/protein if its unfavorable behavior outweighs the favorable. Preferably, they are used in conjunction with other agents that affect their balance of favorable and unfavorable behavior. Use of mixed genes/proteins is, in general, less desirable than use of purely favorable or purely unfavorable genes/proteins, but it is not excluded.
It should be noted that a mouse gene is classified on the basis of the strongest C-HI behavior among the ages tested, the strongest HI-D behavior among the ages tested, and the strongest C-D behavior among the ages tested. If at least one of these three behaviors is significantly favorable, and none of the others of these three behaviors is significantly unfavorable, the mouse gene will be classified as wholly favorable and listed in subtable 1A of Master Table 1. However, that does not mean that it may not have exhibited a weaker but unfavorable expression behavior at some tested age.
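The combination of the three comparisons into a single classification may be sketched as follows. This is a hypothetical illustration only: the specification does not state a numerical cut-off for "significantly" favorable or unfavorable, so the sketch borrows the two-fold "strong" ratio as a stand-in threshold, and the function name, dictionary keys and example values are assumptions.

```python
def classify_gene(behaviors, threshold=2.0):
    """Combine the strongest C-HI, HI-D and C-D behaviors into one class.

    `behaviors` maps a comparison name to a (label, ratio) pair, where label is
    "favorable" or "unfavorable" and ratio is greater/lesser.
    `threshold` is an assumed stand-in for "significant" (not from the source).
    """
    significant = {
        label
        for label, ratio in behaviors.values()
        if label is not None and ratio >= threshold
    }
    if significant == {"favorable"}:
        return "wholly favorable"
    if significant == {"unfavorable"}:
        return "wholly unfavorable"
    if significant == {"favorable", "unfavorable"}:
        return "mixed"
    return "unclassified"  # no behavior reached the assumed threshold


# The C<HI, HI>D pattern discussed in the text, with hypothetical ratios.
mixed_pattern = {
    "C-HI": ("unfavorable", 3.0),  # control lower than hyperinsulinemic
    "HI-D": ("favorable", 2.5),    # hyperinsulinemic higher than diabetic
    "C-D": ("favorable", 1.2),     # weak behavior, below the assumed threshold
}
print(classify_gene(mixed_pattern))  # mixed
```

Under this sketch a weak behavior in the opposite direction does not disturb a "wholly" classification, consistent with the observation that a wholly favorable gene may still have exhibited a weaker but unfavorable behavior at some tested age.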
The "favorable", "unfavorable" and "mixed" mouse proteins of the present invention include the mouse database proteins listed in the Master Table in the same row as a particular "favorable", "unfavorable" or "mixed" mouse gene, respectively. These proteins may be the exact translation product of the identified mouse gene (database DNA). However, if they were sequenced directly, they could be shorter or longer than that translation product. They could also differ in sequence from the exact translation product as a result of post-translational modifications. The mouse proteins of interest also include mouse proteins which, while not listed in the table, correspond to (i.e., are homologous to, i.e., could be aligned in a statistically significant manner to) such mouse proteins or genes, and mouse proteins which are at least substantially identical or conservatively identical to the listed mouse proteins.
Related human genes (database DNAs) and proteins were identified by searching a database comprising human DNAs or proteins for sequences corresponding to (i.e., homologous to, i.e., which could be aligned in a statistically significant manner to) the mouse gene or protein. More than one human protein may be identified as corresponding to a particular mouse chip probe and to a particular mouse gene. Note that the terms "human genes" and "human proteins" are used in a manner analogous to that already discussed in the case of "mouse genes" and "mouse proteins". As used herein, the term "corresponding" does not mean identical, but rather implies the existence of a statistically significant sequence similarity, such as one sufficient to qualify the human protein or gene as a homologous protein or DNA as defined below. The greater the degree of relationship as thus defined (i.e., by the statistical significance of each alignment used to connect the mouse cDNA to the human protein or gene, measured by an E value), the closer the correspondence. The connection may be direct (mouse gene to human protein) or indirect (e.g., mouse gene to human gene, human gene to human protein). By "mouse gene", we mean the mouse gene from which the gene chip DNA in question was derived. In general, the human genes/proteins which most closely correspond, directly or indirectly, to the mouse genes are preferred, such as the one(s) ranked highest, in the top two, top three, top four, top five, or top ten by E value (i.e., those with the lowest E values) for the final alignment in the connection process. The human genes/proteins deemed to correspond to our mouse genes are identified in the Master Tables. Note that it is possible to identify homologous full-length human genes and proteins, if they are present in the database, even if the query mouse DNA or protein sequence is not a full-length sequence.
If there is no homologous full-length human gene or protein in the database, but there is a partial one, the latter may nonetheless be useful. For example, a partial protein may still have biological activity, and a molecule which binds the partial protein may also bind the full-length protein so as to antagonize a biological activity of the full-length protein. Likewise, a partial human gene may encode a partial protein which has biological activity, or the gene may be useful in the design of a hybridization probe or in the design of a therapeutic antisense DNA. The partial genes and protein sequences may of course also be used in the design of probes intended to identify the full-length gene or protein sequence.
For the sake of convenience, we refer to a human protein as favorable if (1) it is listed in Master Table 1 as corresponding to a favorable mouse gene, or (2) it is at least substantially identical or conservatively identical to a listed protein per (1), or (3) it is a member of a human protein class listed in Master Table 2 (if provided) as corresponding to a favorable mouse gene. We define a human protein as unfavorable in an analogous manner. We may further identify a human protein as being wholly favorable (see mouse genes of subtable 1A), wholly unfavorable (see mouse genes of subtable 1B), or mixed, i.e., both partially favorable and partially unfavorable (see mouse genes of subtable 1C). Likewise, a human gene which encodes a particular human protein may be classified in the same way as the human protein which it encodes. However, it should be noted that this classification is not based on the direct study of the expression of the human gene/protein. Of course, the human genes/proteins of ultimate interest will be the ones whose change in level of expression is, in fact, correlated, directly or inversely, with the change of state (normal, hyperinsulinemic, diabetic) of the subject.
After identifying related human genes and proteins, one may formulate agents useful in screening humans at risk for progression toward hyperinsulinemia or toward type II diabetes, or in protecting humans at risk thereof from progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state.
Agents which bind the "favorable" and "unfavorable" nucleic acids (e.g., the agent is a substantially complementary nucleic acid hybridization probe), or the corresponding proteins (e.g., an antibody vs. the protein), may be used to evaluate whether a human subject is at increased or decreased risk for progression toward type II diabetes. A subject with one or more elevated "unfavorable" and/or one or more depressed "favorable" genes/proteins is at increased risk, and one with one or more elevated "favorable" and/or one or more depressed "unfavorable" genes/proteins is at decreased risk. One may further take into account whether the subject is normoinsulinemic or hyperinsulinemic at the time of the assay. If the subject is non-diabetic and normoinsulinemic, we are especially interested in the "favorable" and "unfavorable" human genes/proteins corresponding to mouse genes differentially expressed in hyperinsulinemic vs. normal muscle. If the subject is already hyperinsulinemic, yet non-diabetic, we are especially interested in the "favorable" and "unfavorable" human genes/proteins corresponding to mouse genes differentially expressed in type II diabetic vs. hyperinsulinemic muscle. The assay may be used as a preliminary screening assay to select subjects for further analysis, or as a formal diagnostic assay.
The identification of the related genes and proteins may also be useful in protecting humans against these disorders. Thus, Applicants contemplate:
(1) use of the "favorable" mouse DNAs (or fragments thereof) of the Master Tables (below) to isolate or identify related human DNAs;
(2) use of human DNAs, related to favorable mouse DNAs, to express the corresponding human proteins;
(3) use of the corresponding human proteins (and mouse proteins, if biologically active in humans), to protect against the disorder(s);
(4) use of the corresponding mouse or human proteins, or nucleic acid probes derived from the mouse or human genes, in diagnostic agents, in assays to measure progression toward hyperinsulinemia or type II diabetes, or protection against the disorder(s), or to estimate related end organ damage such as kidney damage; and
(5) use of the corresponding human or mouse genes therapeutically in gene therapy, to protect against the disorder(s).
Moreover, Applicants contemplate:
(1) use of the "unfavorable" mouse DNAs (or fragments thereof) of the Master Tables to isolate or identify related human DNAs;
(2) use of the complement to the "unfavorable" mouse DNAs or related human DNAs, as antisense molecules to inhibit expression of the related human DNAs;
(3) use of the mouse or human DNAs to express the corresponding mouse or human proteins;
(4) use of the corresponding mouse or human proteins, in diagnostic agents, to measure progression toward hyperinsulinemia or type II diabetes, or protection against the disorder(s), or to estimate related end organ damage such as kidney damage;
(5) use of the corresponding mouse or human proteins in assays to determine whether a substance binds to (and hence may neutralize) the protein; and
(6) use of the neutralizing substance to protect against the disorder(s).
Thus, DNAs of interest include those which specifically hybridize to the aforementioned mouse or human genes, and are thus of interest as hybridization assay reagents or for antisense therapy. They also include synthetic DNA sequences which encode the same polypeptide as is encoded by the database DNA, and thus are useful for producing the polypeptide in cell culture or in situ (i.e., gene therapy). Moreover, they include DNA sequences which encode polypeptides which are substantially structurally identical or conservatively identical in amino acid sequence to the mouse and human proteins identified in Master Table 1, subtables 1A or 1C. Finally, they include DNA sequences which encode peptide (including antibody) antagonists of the proteins of Master Table 1, subtables 1B or 1C.
The related human DNAs may be identified by comparing the mouse sequence (or its AA translation product) to known human DNAs (and their AA translation products). Related human DNAs also may be identified by screening human cDNA or genomic DNA libraries using the mouse gene of the Master Table, or a fragment thereof, as a probe. If the mouse gene of Master Table 1 is not full-length, and there is no closely corresponding full-length mouse gene in the sequence databank, then the mouse DNA may first be used as a hybridization probe to screen a mouse cDNA library to isolate the corresponding full-length sequence. Alternatively, the mouse DNA may be used as a probe to screen a mouse genomic DNA library.
Our animal models of hyperinsulinemia and diabetes are also obese. It is possible that the genes found to be favorable act indirectly by inhibiting obesity. Likewise, it is possible that the genes found to be unfavorable act indirectly by accentuating obesity. Consequently, it is within the compass of the present invention to use the favorable genes and proteins, or to use antagonists of the unfavorable genes and proteins, to protect against obesity, as well as against sequelae of obesity such as hyperinsulinemia and diabetes. Since type II diabetes is an age-related disease, the agents of the present invention may be used in conjunction with known anti-aging or anti-age-related disease agents.
It is of particular interest to use the agents of the present invention in conjunction with an agent disclosed in one of the related applications cited above, in particular, an antagonist to CIDE-A, the latter having been taught in
Kopchick7 and Kopchick7A-PCT.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1. Body weight gain [Fig. 1a], fasting glucose [Fig. 1b] and fasting insulin [Fig. 1c] levels of mice on the HF or Std diets.
Figure 2. Expression levels of Actin, alpha, cardiac (Actc1, NM_009608) using RNA isolated from gastrocnemius muscle of individual diabetic HF mice and corresponding Std mice at different time points.
Figure 3. Data shown are expression levels for additional actin-related and actin-binding genes exhibiting a consistent decrease in expression in the HF mice in comparison to Std mice at all four time points (Fig. 3(a)) or at three of the four time points (Fig. 3(b)).
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS OF THE INVENTION
Full-Length vs. Partial Length Genes/Proteins
A "full-length" gene is here defined as (1) a naturally occurring DNA sequence which begins with an initiation codon (almost always the Met codon, ATG), and ends with a stop codon in phase with said initiation codon (when introns, if any, are ignored), and thereby encodes a naturally occurring polypeptide with biological activity, or a naturally occurring precursor thereof, or (2) a synthetic DNA sequence which encodes the same polypeptide as that which is encoded by (1). The gene may, but need not, include introns.
A "full-length" protein is here defined as a naturally occurring protein encoded by a full-length gene, or a protein derived naturally by post-translational modification of such a protein. Thus, it includes mature proteins, proproteins, preproteins and preproproteins. It also includes substitution and extension mutants of such naturally occurring proteins.
Subjects
A mouse is considered to be a diabetic subject if, regardless of its fasting plasma insulin level, it has a fasting plasma glucose level of at least 190 mg/dL. A mouse is considered to be a hyperinsulinemic subject if its fasting plasma insulin level is at least 0.67 ng/mL and it does not qualify as a diabetic subject. A mouse is considered to be "normal" if it is neither diabetic nor hyperinsulinemic. Thus, normality is defined in a very limited manner.
A mouse is considered "obese" if its weight is at least 15% in excess of the mean weight for mice of its age and sex. A mouse which does not satisfy this standard may be characterized as "non-obese", the term "normal" being reserved for use in reference to glucose and insulin levels as previously described.
A human is considered a diabetic subject if, regardless of his or her fasting plasma insulin level, the fasting plasma glucose level is at least 126 mg/dL. A human is considered a hyperinsulinemic subject if the fasting plasma insulin level is more than 26 micro International Units/mL (it is believed that this is equivalent to 1.08 ng/mL), and he or she does not qualify as a diabetic subject. A human is considered to be "normal" if he or she is neither diabetic nor hyperinsulinemic. Thus, normality is defined in a very limited manner.
A human is considered "obese" if the body mass index (BMI) (weight divided by height squared) is at least 30 kg/m2. A human who does not satisfy this standard may be characterized as "non-obese", the term "normal" being reserved for use in reference to glucose and insulin levels as previously described. A human is considered overweight if the BMI is at least 25 kg/m2. Thus, we define overweight to include obese individuals, consistent with the recommendations of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). A human who does not satisfy this standard may be characterized as "non-overweight."
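The thresholds defined in this section can be collected into a short Python sketch. It is offered only as an illustration of how the definitions combine: the function names are hypothetical, and the inputs are assumed to be fasting plasma values in the units stated in the text (mg/dL for glucose, ng/mL for mouse insulin, micro International Units/mL for human insulin).

```python
def classify_mouse(fasting_glucose_mg_dl: float, fasting_insulin_ng_ml: float) -> str:
    """Mouse status: diabetic takes precedence regardless of insulin level."""
    if fasting_glucose_mg_dl >= 190:
        return "diabetic"
    if fasting_insulin_ng_ml >= 0.67:
        return "hyperinsulinemic"
    return "normal"


def classify_human(fasting_glucose_mg_dl: float, fasting_insulin_uiu_ml: float) -> str:
    """Human status: note the insulin threshold is strictly 'more than 26'."""
    if fasting_glucose_mg_dl >= 126:
        return "diabetic"
    if fasting_insulin_uiu_ml > 26:
        return "hyperinsulinemic"
    return "normal"


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight divided by height squared (kg/m2)."""
    return weight_kg / (height_m ** 2)


def human_weight_category(bmi: float) -> str:
    """Obese subjects are also overweight under the definitions in the text."""
    if bmi >= 30:
        return "obese"
    if bmi >= 25:
        return "overweight"
    return "non-overweight"


print(classify_mouse(200, 0.3))
print(classify_human(100, 30))
print(human_weight_category(body_mass_index(90, 1.70)))
```

The mouse obesity criterion (weight at least 15% above the age- and sex-matched mean) is omitted here because it requires reference weight data that the text does not supply.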
According to the Report of the Expert Committee on the Diagnosis and Classification of Diabetes Mellitus, Diabetes Care 20: 1183-97 (1997), the following are risk factors for diabetes type II:
older (e.g., at least 45; see below)
excessive weight (see below)
first-degree relative with diabetes mellitus
member of high risk ethnic group (black, Hispanic, Native American, Asian)
history of gestational diabetes mellitus or delivering a baby weighing more than 9 pounds (4.082 kg)
hypertensive (>140/90 mm Hg)
HDL cholesterol level <=35 mg/dL (0.90 mmol/L)
triglyceride level >=250 mg/dL (2.83 mmol/L)
Hence, in a preferred embodiment, the diagnostic and protective methods of the present invention are applied to human subjects exhibiting one or more of the aforementioned risk factors. Likewise, in a preferred embodiment, they are applied to human subjects who, while not diabetic, exhibit impaired glucose homeostasis (110 to <126 mg/dL) .
The risk of diabetes increases with age. Hence, in successive preferred embodiments, the age of the subjects is at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, and at least 75. With regard to excessive weight, NIDDK says that "The relative risk of diabetes increases by approximately 25 percent for each additional unit of BMI over 22." Hence, in successive preferred embodiments, the BMI of the human subjects is at least 23, at least 24, at least 25 (i.e., overweight by our criterion), at least 26, at least 27, at least 28, at least 29, at least 30 (i.e., obese), at least 31, at least 32, at least 33, at least 34, at least 35, at least 36, at least 37, at least 38, at least 39, at least 40, or over 40.
Age-Related Diseases
Age-related (senescent) diseases include certain cancers, atherosclerosis, diabetes (type 2), osteoporosis, hypertension, depression, Alzheimer's, Parkinson's, glaucoma, certain immune system defects, kidney failure, and liver steatosis. In general, they are diseases for which the relative risk (comparing a subpopulation over age 55 to a suitably matched population under age 55) is at least 1.1. Preferably, the agents of the present invention protect against one or more age-related diseases for at least a subpopulation of mature (post-puberty) adult subjects.
Direct and Indirect Utility of Identified Nucleic Acid Sequences and Related Molecules
The mouse or human genes (or fragments thereof) may be used directly. For diagnostic or screening purposes, they (or specific binding fragments thereof) may be labeled and used as hybridization probes. For therapeutic purposes, they (or specific binding fragments thereof) may be used as antisense reagents to inhibit the expression of the corresponding gene, or of a sufficiently homologous gene of another species.
If the database DNA appears to be a full-length cDNA or gDNA, that is, it encodes an entire, functional, naturally occurring protein, then it may be used in the expression of that protein. Likewise, if the corresponding human gene is known in full-length, it may be used to express the human protein. Such expression may be in cell culture, with the protein subsequently isolated and administered exogenously to subjects who would benefit therefrom, or in vivo, i.e., administration by gene therapy. Naturally, any DNA encoding the same protein may be used for the same purpose, and a DNA encoding a protein which is a fragment or a mutant of that naturally occurring protein, and which retains the desired activity, may be used for the purpose of producing the active fragment or mutant. The encoded protein of course has utility therapeutically and, in labeled or immobilized form, diagnostically.
The genes may also be used indirectly, that is, to identify other useful DNAs, proteins, or other molecules. We have attempted to determine whether the mouse genes disclosed herein have significant similarity to any known human DNA, and whether, in any of the six possible combinations of reading frame and strand, they encode a protein similar to a known human protein. If so, then it follows that the known human protein, and DNAs encoding that protein, may be used in a similar manner.
In addition, if the known human protein is known to have additional homologues, then those homologous proteins, and DNAs encoding them, may be used in a similar manner.
There thus are several ways that a human protein homologue of interest can be identified by database searching, including but not limited to:
1) a DNA->DNA (BlastN) search for human database DNAs closely related to the mouse gene identifies a known human gene, and the sequence of the human protein is deduced by the Genetic Code;
2) a DNA->Protein (BlastX) search for human database proteins closely related to the translated DNA of the mouse gene identifies a known human protein; and
3) the sequence of the mouse protein is known or is deduced by the Genetic Code, and a Protein->Protein (BlastP) search for closely related database proteins identifies a known human protein.
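The three search routes listed above differ only in the type of the query and the type of the database. A minimal Python sketch of that mapping follows; the function name `select_blast_program` and the string labels are hypothetical conveniences, and the sketch does not run any search.

```python
# Map (query type, database type) to the BLAST program named in the text.
BLAST_PROGRAMS = {
    ("dna", "dna"): "BlastN",          # route 1: mouse DNA vs. human DNA
    ("dna", "protein"): "BlastX",      # route 2: translated mouse DNA vs. human protein
    ("protein", "protein"): "BlastP",  # route 3: mouse protein vs. human protein
}


def select_blast_program(query_type: str, database_type: str) -> str:
    """Return the BLAST program for a query/database pairing discussed in the text."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"pairing not covered by the three routes in the text: {key}")
    return BLAST_PROGRAMS[key]


print(select_blast_program("DNA", "protein"))
```

The protein-query, DNA-database pairing is deliberately left out because the text does not use it.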
Once a known human gene is identified, it may be used in further BlastN or BlastX searches to identify other human genes or proteins. Once a known human protein is identified, it may be used in further BlastP searches to identify other human proteins.
Searches may also take cognizance, intermediately, of known genes and proteins other than mouse or human ones, e.g., use the mouse sequence to identify a known rat sequence and then the rat sequence to identify a human one.
If we have identified a mouse gene, and it encodes a mouse protein which appears similar to a human protein, then that human protein may be used (especially in humans) for purposes analogous to the proposed use of the mouse protein in mice. Moreover, a specific binding fragment of an appropriate strand of the corresponding human gene (gDNA or cDNA) could be labeled and used as a hybridization probe (especially against samples of human mRNA or cDNA).
In determining whether the disclosed genes (gDNA or cDNA) have significant similarities to known DNAs (and their translated AA sequences to known proteins), one would generally use the disclosed gene as a query sequence in a search of a sequence database. The results of several such searches are set forth in the Examples. Such results are dependent, to some degree, on the search parameters. Preferred parameters are set forth in Example 1. The results are also dependent on the content of the database. While the raw similarity score of a particular target (database) sequence will not vary with content (as long as it remains in the database), its informational value (in bits), expected value, and relative ranking can change. Generally speaking, the changes are small.
It will be appreciated that the nucleic acid and protein databases keep growing. Hence a later search may identify high scoring target sequences which were not uncovered by an earlier search because the target sequences were not previously part of a database. Hence, in a preferred embodiment, the cognate DNAs and proteins include not only those set forth in the examples, but those which would have been highly ranked (top ten, more preferably top three, even more preferably top two, most preferably the top one) in a search run with the same parameters on the date of filing of this application.
If the known mouse or human database DNA appears to be a partial sequence (that is, partial relative to a cDNA or gDNA encoding the whole naturally occurring protein) , it may be used as a hybridization probe to isolate the full-length DNA. If the partial DNA encodes a biologically functional fragment of the cognate protein, it may be used in a manner similar to the full length DNA, i.e., to produce the functional fragment.
If we have indicated that an antagonist of a protein or other molecule is useful, then such an antagonist may be obtained by preparing a combinatorial library, as described below, of potential antagonists, and screening the library members for binding to the protein or other molecule in question. The binding members may then be further screened for the ability to antagonize the biological activity of the target. The antagonists may be used therapeutically, or, in suitably labeled or immobilized form, diagnostically. If the identified mouse or human database DNA is related to a known protein, then substances known to interact with that protein (e.g., agonists, antagonists, substrates, receptors, second messengers, regulators, and so forth), and binding molecules which bind them, are also of utility. Such binding molecules can likewise be identified by screening a combinatorial library.
Isolation of Full Length DNAs Using Partial DNAs as Probes
If it is determined that a DNA of the present invention is a partial DNA, and the cognate full length DNA is not listed in a sequence database, the available DNA may be used as a hybridization probe to isolate the full-length DNA from a suitable DNA library. Stringent hybridization conditions are appropriate, that is, conditions in which the hybridization temperature is 5-10 deg. C. below the Tm of the DNA as a perfect duplex.
Identification and Isolation of Homologous Genes Using a DNA Probe
It may be that the sequence databases available do not include the sequence of any homologous gene (cDNA or gDNA), or at least of the homologous gene for a species of interest. However, given the cDNAs set forth above, one may readily obtain the homologous gene.
The possession of one DNA (the "starting DNA") greatly facilitates the isolation of homologous DNAs. If only a partial DNA is known, this partial DNA may first be used as a probe to isolate the corresponding full length DNA for the same species, and the latter may be used as the starting DNA in the search for homologous genes. The starting DNA, or a fragment thereof, is used as a hybridization probe to screen a cDNA or genomic DNA library for clones containing inserts which encode either the entire homologous protein, or a recognizable fragment thereof.
The minimum length of the hybridization probe is dictated by the need for specificity. If the size of the library in bases is L, and the GC content is 50%, then the probe should have a length of at least l, where L = 4^l. This will yield, on average, a single perfect match in random DNA of L bases. The human cDNA library is about 10^8 bases and the human genomic DNA library is about 10^10 bases.
The library is preferably derived from an organism which is known, on biochemical evidence, to produce a homologous protein, and more preferably from the genomic DNA or mRNA of cells of that organism which are likely to be relatively high producers of that protein. A cDNA library (which is derived from an mRNA library) is especially preferred.
If the organism in question is known to have substantially different codon preferences from that of the organism whose relevant cDNA or genomic DNA is known, a synthetic hybridization probe may be used which encodes the same amino acid sequence but whose codon utilization is more similar to that of the DNA of the target organism. Alternatively, the synthetic probe may employ inosine as a substitute for those bases which are most likely to be divergent, or the probe may be a mixed probe which mixes the codons for the source DNA with the preferred codons (encoding the same amino acid) for the target organism.
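The probe-length rule stated earlier in this section (probe length l such that 4^l is at least the library size L) can be worked through numerically. The Python sketch below is an illustration only; the function name `min_probe_length` is hypothetical, and it assumes, as the text does, random sequence with 50% GC content so that each of the four bases is equally likely at every position.

```python
def min_probe_length(library_size_bases: int) -> int:
    """Smallest length l with 4**l >= library size, using exact integer arithmetic."""
    if library_size_bases < 1:
        raise ValueError("library size must be a positive number of bases")
    length = 0
    distinct_sequences = 1  # 4**length
    while distinct_sequences < library_size_bases:
        distinct_sequences *= 4
        length += 1
    return length


# Library sizes of about 10^8 bases (cDNA) and 10^10 bases (genomic).
print(min_probe_length(10**8))   # 14, since 4**13 < 10**8 <= 4**14
print(min_probe_length(10**10))  # 17, since 4**16 < 10**10 <= 4**17
```

These are lower bounds for specificity under the random-sequence assumption; practical probes are usually longer.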
By routine methods, the Tm of a perfect duplex of starting DNA is determined. One may then select a hybridization temperature which is sufficiently lower than the perfect duplex Tm to allow hybridization of the starting DNA (or other probe) to a target DNA which is divergent from the starting DNA. A 1% sequence divergence typically lowers the Tm of a duplex by 1-2°C, and the DNAs encoding homologous proteins of different species typically have sequence identities of around 50-80%. Preferably, the library is screened under conditions where the temperature is at least 20°C, more preferably at least 50°C, below the perfect duplex Tm. Since salt raises the Tm by stabilizing the duplex, one ordinarily would carry out the search for DNAs encoding highly homologous proteins under relatively low salt hybridization conditions, e.g., <1M NaCl. The higher the salt concentration, and/or the lower the temperature, the greater the sequence divergence which is tolerated.
For the use of probes to identify homologous genes in other species, see, e.g., Schwinn et al., J. Biol. Chem., 265:8183-89 (1990) (hamster 67-bp cDNA probe vs. human leukocyte genomic library; human 0.32-kb DNA probe vs. bovine brain cDNA library, both with hybridization at 42°C in 6xSSC); Jenkins et al., J. Biol. Chem., 265:19624-31 (1990) (chicken 770-bp cDNA probe vs. human genomic libraries; hybridization at 40°C in 50% formamide and 5xSSC); Murata et al., J. Exp. Med., 175:341-51 (1992) (1.2-kb mouse cDNA probe vs. human eosinophil cDNA library; hybridization at 65°C in 6xSSC); Guyer et al., J. Biol. Chem., 265:17307-17 (1990) (2.95-kb human genomic DNA probe vs. porcine genomic DNA library; hybridization at 42°C in 5xSSC). The conditions set forth in these articles may each be considered suitable for the purpose of isolating homologous genes.
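The rule of thumb quoted above (each 1% of sequence divergence lowers the duplex Tm by roughly 1-2°C) gives a quick estimate of how far below the perfect-duplex Tm one might need to go. The Python sketch below is a rough illustration under that linear assumption only; the function name `expected_tm_range` and the example Tm of 85°C are hypothetical, and real melting behavior also depends on salt, formamide, probe length and base composition.

```python
def expected_tm_range(perfect_tm_c: float, percent_identity: float,
                      drop_per_percent: tuple = (1.0, 2.0)) -> tuple:
    """Estimated (lowest, highest) Tm in deg C of a mismatched duplex.

    Assumes a linear Tm drop of 1-2 deg C per 1% sequence divergence.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    divergence = 100.0 - percent_identity
    smaller_drop, larger_drop = drop_per_percent
    return (perfect_tm_c - larger_drop * divergence,
            perfect_tm_c - smaller_drop * divergence)


# Hypothetical starting DNA with a perfect-duplex Tm of 85 deg C and a target of 80% identity.
low, high = expected_tm_range(85.0, 80.0)
print(low, high)  # 45.0 65.0
```

For 80% identity the estimated Tm reduction is 20-40°C, which is consistent with the stated preference for screening at least 20°C below the perfect duplex Tm.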
Corresponding (Homologous) Proteins and DNAs
In the case of a gene chip, the manufacturer of the gene chip determines which DNA to place at each position on the chip. This DNA may correspond in sequence to a genomic DNA, a cDNA, or a fragment of genomic DNA or cDNA, and may be natural, synthetic or partially natural and partially synthetic in origin.
The manufacturer of the gene chip will normally identify the gene chip DNA for a mouse gene chip as corresponding to a particular mouse gene, in which case it will be assumed that the alignment of chip DNA to mouse gene satisfies the homology criteria of the invention. Usually, the gene chip manufacturer will provide a sequence database accession number for the mouse DNA. If so, to identify the corresponding mouse protein, we will first inspect the database record for that mouse DNA. Often, the mouse protein accession number will appear in that record or in a linked record. If it does not, the corresponding mouse protein can be identified by performing a BlastX search on a mouse protein database with the mouse database DNA sequence as the query sequence. Even if the protein sequence is not in the database, if the DNA sequence comprises a full-length coding sequence, the corresponding protein can be identified by translating the coding sequence in accordance with the Genetic Code.
A human protein can be said to be identifiable as corresponding (homologous) to a gene chip DNA if it is identified as corresponding (homologous) to the mouse gene (gDNA or cDNA, whole or partial) identified by the gene chip manufacturer as corresponding to that gene chip DNA. In turn, it is identifiable as corresponding (homologous) to said identified mouse gene, if
(1) it can be aligned by BlastX directly to that mouse gene, and/or
(2) it is encoded by a human gene, or can be aligned to a human gene by BlastX, which in turn can be aligned by BlastN to said mouse gene, and/or
(3) it can be aligned by BlastP to a mouse protein, the latter being encoded by said mouse gene, or aligned to said mouse gene by BlastX,
where any alignment by BlastN, BlastP or BlastX is in accordance with the default parameters set forth below, and the expected value (E) of each alignment (the probability that such an alignment would have occurred by chance alone) is less than e-10. (Note that because this is a negative exponent, a value such as e-50 is less than e-10.)
Desirably, two or all three of these conditions (1)-(3) are satisfied for the corresponding (homologous) human genes and proteins.
A human gene is corresponding (homologous) to a mouse gene chip DNA, and hence to said identified mouse gene (or cDNA) and protein, if it encodes a corresponding (homologous) human protein as defined above, or it can be aligned by BlastN to said mouse gene. Preferably, for at least one of conditions (1)-(3), the E value is less than e-50, more preferably less than e-60, still more preferably less than e-70, even more preferably less than e-80, considerably more preferably less than e-90, and most preferably less than e-100. Desirably, it is true for two or even all three of these conditions. In constructing Master Table 1, we generally used a BlastX (mouse gene vs. human protein) alignment E value cutoff of e-50. However, if there were no human proteins with that good an alignment to the mouse DNA in question, or if there were other reasons for including a particular human protein (e.g., a known functionality supportive of the observed differential cognate mouse protein expression), then a human protein with a score worse (i.e., higher) than e-50 may appear in Master Table 1.
If the manufacturer of the gene chip identifies the gene chip DNA as corresponding to an EST, or other DNA which is not a full-length mouse gene or cDNA, a longer (possibly full length) mouse gene or cDNA may be identified by a BlastN search of the mouse DNA database. Alternatively, the identified DNA may be used to conduct a BlastN search of a human DNA database, or a BlastX search of a mouse or human protein database.
Thus, more generally, a human protein can be said to be identifiable as corresponding (homologous) to a gene chip DNA, or to a DNA identified by the manufacturer as corresponding to that gene chip DNA, if
(1') it can be aligned directly to the gene chip or corresponding manufacturer identified DNA by BlastX, and/or
(2') it can be aligned to a human gene/cDNA by BlastX, whose genomic DNA (gDNA) or cDNA (DNA complementary to messenger RNA) in turn can be aligned to the gene chip or corresponding manufacturer identified DNA by BlastN, and/or
(3') it can be aligned to a mouse gene/cDNA by BlastX, whose gDNA or cDNA in turn can be aligned to the gene chip or corresponding manufacturer identified DNA by BlastN, and/or
(4') it can be aligned to a mouse protein by BlastP, which in turn can be aligned to the gene chip or corresponding manufacturer identified DNA by BlastX, and/or
(5') it can be aligned to a mouse protein by BlastP, which in turn can be aligned to a mouse gene/cDNA by BlastX, whose gDNA or cDNA can in turn be aligned to the gene chip or corresponding manufacturer identified DNA by BlastN;
where any alignment by BlastN, BlastP, or BlastX is in accordance with the default parameters set forth below, and the expected value (E) of each alignment (the probability that such an alignment would have occurred by chance alone) is less than e-10. (Note that because this is a negative exponent, a value such as e-50 is less than e-10.)
Preferably, two, three, four, or all five of conditions (1')-(5') are satisfied. Preferably, for at least one of conditions (1')-(5'), for at least the final alignment (i.e., vs. the human protein), the E value is less than e-50, more preferably less than e-60, still more preferably less than e-70, even more preferably less than e-80, considerably more preferably less than e-90, and most preferably less than e-100. Desirably, one or more of these standards of preference are met for two, three, four or all five of conditions (1')-(5'). In particular, for those conditions in which the gene chip or corresponding manufacturer identified DNA is indirectly connected to the human protein by virtue of two or more successive alignments, the E value is preferably so limited for all of said alignments in the connecting chain.
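The cutoffs above can be illustrated with a minimal sketch. It assumes that "e-10" denotes 10 to the power -10, as in BLAST report notation. The function names (passes_baseline, preference_tier, chain_passes) are hypothetical helpers introduced only for illustration; they are not part of BLAST and are not asserted to be the procedure used to build Master Table 1.

```python
# Illustrative only: E-value screening as described in the text.
# Assumption: "e-10" is read as 1e-10 (10 to the power -10).

BASELINE_EXPONENT = -10
PREFERRED_EXPONENTS = (-50, -60, -70, -80, -90, -100)


def passes_baseline(e_value: float) -> bool:
    """True if the alignment's E value is below the e-10 baseline."""
    return e_value < 10.0 ** BASELINE_EXPONENT


def preference_tier(e_value: float) -> int:
    """Count how many preferred cutoffs (e-50 ... e-100) the E value beats."""
    return sum(1 for exp in PREFERRED_EXPONENTS if e_value < 10.0 ** exp)


def chain_passes(e_values, exponent: int = BASELINE_EXPONENT) -> bool:
    """True if every alignment in a connecting chain is below the cutoff."""
    e_values = list(e_values)
    if not e_values:
        raise ValueError("a chain needs at least one alignment")
    return all(e < 10.0 ** exponent for e in e_values)
```

For a condition such as (5'), the chain would hold the E values of the BlastP, BlastX and BlastN alignments in turn, and chain_passes(chain, exponent=-50) would test the stricter preference that every link meets the e-50 cutoff.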
A human gene corresponds (is homologous) to a gene chip DNA or manufacturer identified corresponding DNA if it encodes a homologous human protein as defined above, or if it can be aligned either directly to that DNA, or indirectly through a mouse gene which can be aligned to said DNA, according to the conditions set forth above.
Master Table 1 assembles a list of human proteins corresponding to each of the mouse DNAs/proteins identified as related to the chip DNA. These human proteins form a set and can be given a percentile rank, with respect to E value, within that set. The human proteins of the present invention preferably are those scorers with a percentile rank of at least 50%, more preferably at least 60%, still more preferably at least 70%, even more preferably at least 80%, and most preferably at least 90%.
For each mouse gene (gDNA or cDNA) in Master Table 1, there is a particular human protein which provides the best alignment match as measured by BlastX, i.e., the human protein with the best score (lowest E value). These human proteins form a subset of the set above and can be given a percentile rank within that subset, e.g., the human proteins with scores in the top 10% of that subset have a percentile rank of 90% or higher. The human proteins of the present invention preferably are those best scorer subset proteins with a percentile rank within the subset of at least 50%, more preferably at least 60%, still more preferably at least 70%, even more preferably at least 80%, and most preferably at least 90%.
BlastN and BlastX report very low expected values as "0.0". This does not truly mean that the expected value is exactly zero (since any alignment could occur by chance), but merely that it is so infinitesimal that it is not reported. The documentation does not state the cutoff value, but alignments with explicit E values as low as e-178 (624 bits) have been reported as nonzero values, while a score of 636 bits was reported as "0.0".
Functionally homologous human proteins are also of interest. A human protein may be said to be functionally homologous to the mouse gene if the human protein has at least one biological activity in common with the mouse protein encoded by said mouse gene.
The human proteins of interest also include those that are substantially and/or conservatively identical (as defined below) to the homologous and/or functionally homologous human proteins defined above.
Degree of Differential Expression
The degree of differential expression may be expressed as the ratio of the higher expression level to the lower expression level. Preferably, this is at least 2-fold, and more preferably, it is higher, such as at least 3-fold, at least 4-fold, at least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least 9-fold, or at least 10-fold.
Most preferably, the human protein of interest corresponds to a mouse gene for which the degree of differential expression places it among the top 10% of the mouse genes in the appropriate subtable.
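As a minimal sketch of the ratio defined above, the following computes the fold difference between two expression levels and selects the genes in the top 10% of a subtable by that ratio. The function names, the dictionary layout, and the rounding-up rule for "top 10%" when the subtable size is not a multiple of ten are assumptions made for illustration.

```python
# Illustrative only: degree of differential expression as higher/lower ratio.


def fold_difference(level_a, level_b):
    """Ratio of the higher expression level to the lower one (always >= 1)."""
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


def top_fraction(fold_by_gene, denominator=10):
    """Genes in the top 1/denominator of the subtable by fold difference.

    Assumption: the cut is rounded up, so at least one gene is returned
    for any non-empty subtable.
    """
    ranked = sorted(fold_by_gene, key=fold_by_gene.get, reverse=True)
    k = max(1, (len(ranked) + denominator - 1) // denominator)
    return ranked[:k]
```

Because the ratio is always taken as higher over lower, the same value is returned whichever of the two samples is passed first.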
Relevance of Favorable and Unfavorable Genes
If a gene is down-regulated in more favored mammals, or up-regulated in less favored mammals (i.e., an "unfavorable gene"), then several utilities are apparent.
First, the complementary strand of the gene, or a portion thereof, may be used in labeled form as a hybridization probe to detect messenger RNA and thereby monitor the level of expression of the gene in a subject. Elevated levels are indicative of progression, or propensity to progression, to a less favored state, and clinicians may take appropriate preventative, curative or ameliorative action.
Secondly, the messenger RNA product (or equivalent cDNA), the protein product, or a binding molecule specific for that product (e.g., an antibody which binds the product), or a downstream product which mediates the activity (e.g., a signaling intermediate) or a binding molecule (e.g., an antibody) therefor, may be used, preferably in labeled or immobilized form, as an assay reagent in an assay for said nucleic acid product, protein product, or downstream product (e.g., a signaling intermediate). Again, elevated levels are indicative of a present or future problem.
Thirdly, an agent which down-regulates expression of the gene may be used to reduce levels of the corresponding protein and thereby inhibit further damage. This agent could inhibit transcription of the gene in the subject, or translation of the corresponding messenger RNA. Possible inhibitors of transcription and translation include antisense molecules and repressor molecules. The agent could also inhibit a post-translational modification (e.g., glycosylation, phosphorylation, cleavage, GPI attachment) required for activity, or post-translationally modify the protein so as to inactivate it. Or it could be an agent which down- or up-regulates a positive or negative regulatory gene, respectively.
Fourthly, an agent which is an antagonist of the messenger RNA product or protein product of the gene, or of a downstream product through which its activity is manifested (e.g., a signaling intermediate), may be used to inhibit its activity. This antagonist could be an antibody, a peptide, a peptoid, a nucleic acid, a peptide nucleic acid (PNA) oligomer, a small organic molecule of a kind for which a combinatorial library exists (e.g., a benzodiazepine), etc. An antagonist is simply a binding molecule which, by binding, reduces or abolishes the undesired activity of its target. The antagonist, if not an oligomeric molecule, is preferably less than 1000 daltons, more preferably less than 500 daltons.
Fifthly, an agent which degrades, or abets the degradation of, that messenger RNA, its protein product or a downstream product which mediates its activity (e.g., a signaling intermediate), may be used to curb the effective period of activity of the protein.
If a gene is up-regulated in more favored mammals, or down-regulated in less favored animals, then the utilities are converse to those stated above.
First, the complementary strand of the gene, or a portion thereof, may be used in labeled form as a hybridization probe to detect messenger RNA and thereby monitor the level of expression of the gene in a subject. Depressed levels are indicative of damage, or possibly of a propensity to damage, and clinicians may take appropriate preventative, curative or ameliorative action.
Secondly, the messenger RNA product, the equivalent cDNA, protein product, or a binding molecule specific for those products, or a downstream product, or a signaling intermediate, or a binding molecule therefor, may be used, preferably in labeled or immobilized form, as an assay reagent in an assay for said protein product or downstream product. Again, depressed levels are indicative of a present or future problem.
Thirdly, an agent which up-regulates expression of the gene may be used to increase levels of the corresponding protein and thereby inhibit further progression to a less favored state. By way of example, it could be a vector which carries a copy of the gene, but which expresses the gene at higher levels than does the endogenous expression system. Or it could be an agent which up- or down-regulates a positive or negative regulatory gene.
Fourthly, an agent which is an agonist of the protein product of the gene, or of a downstream product through which its activity (of inhibition of progression to a less favored state) is manifested, or of a signaling intermediate may be used to foster its activity.
Fifthly, an agent which inhibits the degradation of that protein product or of a downstream product or of a signaling intermediate may be used to increase the effective period of activity of the protein.
Mutant Proteins
The present invention also contemplates mutant proteins (peptides) which are substantially identical (as defined below) to the parental protein (peptide). In general, the fewer the mutations, the more likely the mutant protein is to retain the activity of the parental protein. The effect of mutations is usually (but not always) additive. Certain individual mutations are more likely to be tolerated than others. A protein is more likely to tolerate a mutation which
(a) is a substitution rather than an insertion or deletion;
(b) is an insertion or deletion at the terminus, rather than internally, or, if internal, is at a domain boundary, or a loop or turn, rather than in an alpha helix or beta strand;
(c) affects a surface residue rather than an interior residue;
(d) affects a part of the molecule distal to the binding site;
(e) is a substitution of one amino acid for another of similar size, charge, and/or hydrophobicity, and does not destroy a disulfide bond or other crosslink; and
(f) is at a site which is subject to substantial variation among a family of homologous proteins to which the protein of interest belongs.
These considerations can be used to design functional mutants.
Surface vs. Interior Residues
Charged amino acid residues almost always lie on the surface of the protein. For uncharged residues, there is less certainty, but in general, hydrophilic residues are partitioned to the surface and hydrophobic residues to the interior. Of course, for a membrane protein, the membrane-spanning segments are likely to be rich in hydrophobic residues. Surface residues may be identified experimentally by various labeling techniques, or by 3-D structure mapping techniques like X-ray diffraction and NMR. A 3-D model of a homologous protein can be helpful.
Binding Site Residues
Residues forming the binding site may be identified by (1) comparing the effects of labeling the surface residues before and after complexing the protein to its target, (2) labeling the binding site directly with affinity ligands, (3) fragmenting the protein and testing the fragments for binding activity, and (4) systematic mutagenesis (e.g., alanine-scanning mutagenesis) to determine which mutants destroy binding. If the binding site of a homologous protein is known, the binding site may be postulated by analogy. Protein libraries may be constructed and screened so that a large family (e.g., 10^8) of related mutants may be evaluated simultaneously.
Hence, the mutations are preferably conservative modifications as defined below.
"Substantially Identical"
A mutant protein (peptide) is substantially identical to a reference protein (peptide) if (a) it has at least 10% of a specific binding activity or a non-nutritional biological activity of the reference protein, and (b) is at least 50% identical in amino acid sequence to the reference protein (peptide). It is "substantially structurally identical" if condition (b) applies, regardless of (a).
Percentage amino acid identity is determined by aligning the mutant and reference sequences according to a rigorous dynamic programming algorithm which globally aligns their sequences to maximize their similarity, the similarity being scored as the sum of scores for each aligned pair according to an unbiased PAM250 matrix, and a penalty for each internal gap of -12 for the first null of the gap and -4 for each additional null of the same gap. The percentage identity is the number of matches expressed as a percentage of the adjusted (i.e., counting inserted nulls) length of the reference sequence.
A mutant DNA sequence is substantially identical to a reference DNA sequence if they are structural sequences encoding mutant and reference proteins which are substantially identical as described above. If instead they are regulatory sequences, they are substantially identical if the mutant sequence has at least 10% of the regulatory activity of the reference sequence, and is at least 50% identical in nucleotide sequence to the reference sequence. Percentage identity is determined as for proteins except that matches are scored +5, mismatches -4, the gap open penalty is -12, and the gap extension penalty (per additional null) is -4.
More preferably, the sequence is not merely substantially identical but rather is at least 51%, at least 66%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical in sequence to the reference sequence.
DNA sequences may also be considered "substantially identical" if they hybridize to each other under stringent conditions, i.e., conditions at which the Tm of the heteroduplex of the one strand of the mutant DNA and the more complementary strand of the reference DNA is not in excess of 10°C less than the Tm of the reference DNA homoduplex. Typically this will correspond to a percentage identity of 85-90%.
"Conservative Modifications"
"Conservative modifications" are defined as (a) conservative substitutions of amino acids as hereafter defined; or (b) single or multiple insertions (extension) or deletions (truncation) of amino acids at the termini. Conservative modifications are preferred to other modifications. Conservative substitutions are preferred to other conservative modifications.
"Semi-Conservative Modifications" are modifications which are not conservative, but which are (a) semi-conservative substitutions as hereafter defined; or (b) single or multiple insertions or deletions internally, but at interdomain boundaries, in loops or in other segments of relatively high mobility. Semi-conservative modifications are preferred to nonconservative modifications. Semi-conservative substitutions are preferred to other semi-conservative modifications.
Non-conservative substitutions are preferred to other non-conservative modifications.
The term "conservative" is used here in an a priori sense, i.e., modifications which would be expected to preserve 3D structure and activity, based on analysis of the naturally occurring families of homologous proteins and of past experience with the effects of deliberate mutagenesis, rather than post facto, a modification already known to conserve activity. Of course, a modification which is conservative a priori may be, and usually is, also conservative post facto.
Preferably, except at the termini, no more than about five amino acids are inserted or deleted at a particular locus, and the modifications are outside regions known to contain binding sites important to activity.
Preferably, insertions or deletions are limited to the termini.
A conservative substitution is a substitution of one amino acid for another of the same exchange group, the exchange groups being defined as follows:
I Gly, Pro, Ser, Ala (Cys) (and any nonbiogenic, neutral amino acid with a hydrophobicity not exceeding that of the aforementioned a.a.'s)
II Arg, Lys, His (and any nonbiogenic, positively-charged amino acids)
III Asp, Glu, Asn, Gln (and any nonbiogenic negatively-charged amino acids)
IV Leu, Ile, Met, Val (Cys) (and any nonbiogenic, aliphatic, neutral amino acid with a hydrophobicity too high for I above)
V Phe, Trp, Tyr (and any nonbiogenic, aromatic neutral amino acid with a hydrophobicity too high for I above).
Note that Cys belongs to both I and IV.
Residues Pro, Gly and Cys have special conformational roles. Cys participates in formation of disulfide bonds. Gly imparts flexibility to the chain. Pro imparts rigidity to the chain and disrupts α helices. These residues may be essential in certain regions of the polypeptide, but substitutable elsewhere.
One, two or three conservative substitutions are more likely to be tolerated than a larger number.
"Semi-conservative substitutions" are defined herein as being substitutions within supergroup I/II/III or within supergroup IV/V, but not within a single one of groups I-V. They also include replacement of any other amino acid with alanine. If a substitution is not conservative, it preferably is semi-conservative.
"Non-conservative substitutions" are substitutions which are not "conservative" or "semi-conservative".
"Highly conservative substitutions" are a subset of conservative substitutions, and are exchanges of amino acids within the groups Phe/Tyr/Trp, Met/Leu/Ile/Val, His/Arg/Lys, Asp/Glu and Ser/Thr/Ala. They are more likely to be tolerated than other conservative substitutions. Again, the smaller the number of substitutions, the more likely they are to be tolerated.
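The substitution classes defined above can be sketched as a small lookup, using one-letter codes for the twenty genetically encoded amino acids. The function name classify_substitution and the "identical" label are assumptions made for illustration; nonbiogenic amino acids are not handled. Note that Thr appears in the highly conservative Ser/Thr/Ala group but in none of exchange groups I-V as listed, so the sketch leaves it unassigned to any exchange group rather than guessing.

```python
# Illustrative only: classify a single amino acid substitution.
# Exchange groups I-V as listed in the text (Cys is in both I and IV).
EXCHANGE_GROUPS = {
    "I": set("GPSAC"),
    "II": set("RKH"),
    "III": set("DENQ"),
    "IV": set("LIMVC"),
    "V": set("FWY"),
}
SUPERGROUPS = ({"I", "II", "III"}, {"IV", "V"})
# Phe/Tyr/Trp, Met/Leu/Ile/Val, His/Arg/Lys, Asp/Glu, Ser/Thr/Ala
HIGHLY_CONSERVATIVE = (set("FYW"), set("MLIV"), set("HRK"),
                       set("DE"), set("STA"))


def groups_of(residue):
    """Names of the exchange groups that contain the residue."""
    return {name for name, members in EXCHANGE_GROUPS.items()
            if residue in members}


def classify_substitution(original, replacement):
    """Return the most specific class that applies to the substitution."""
    if original == replacement:
        return "identical"
    if any(original in g and replacement in g for g in HIGHLY_CONSERVATIVE):
        return "highly conservative"
    orig_groups, repl_groups = groups_of(original), groups_of(replacement)
    if orig_groups & repl_groups:
        return "conservative"
    # Replacement of any other amino acid with alanine is semi-conservative.
    if replacement == "A":
        return "semi-conservative"
    if any(orig_groups & sg and repl_groups & sg for sg in SUPERGROUPS):
        return "semi-conservative"
    return "non-conservative"
```

Because Cys belongs to both I and IV, it falls within both supergroups, so under this sketch any substitution involving Cys is at least semi-conservative.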
"Conservatively Identical" A protein (peptide) is conservatively identical to a reference protein (peptide) it differs from the latter, if at all, solely by conservative modifications, the protein (peptide remaining at least seven amino acids long if the reference protein (peptide) was at least seven amino acids long. A protein is at least semi-conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by semi-conservative or conservative modifications. A protein (peptide) is nearly conservatively identical to a reference protein (peptide) if it differs from the latter, if at all, solely by one or more conservative modifications and/or a single nonconservative substitution. It is highly conservatively identical if it differs, if at all, solely by highly conservative substitutions. Highly conservatively identical proteins are preferred to those merely conservatively identical. An absolutely identical protein is even more preferred.
The core sequence of a reference protein (peptide) is the largest single fragment which retains at least 10% of a particular specific binding activity, if one is specified, or otherwise of at least one specific binding activity of the referent. If the referent has more than one specific binding activity, it may have more than one core sequence, and these may overlap or not.
If it is taught that a peptide of the present invention may have a particular similarity relationship (e.g., markedly identical) to a reference protein (peptide), preferred peptides are those which comprise a sequence having that relationship to a core sequence of the reference protein (peptide), but with internal insertions or deletions in either sequence excluded. Even more preferred peptides are those whose entire sequence has that relationship, with the same exclusion, to a core sequence of that reference protein (peptide).
Library
The term "library" generally refers to a collection of chemical or biological entities which are related in origin, structure, and/or function, and which can be screened simultaneously for a property of interest.
Libraries may be classified by how they are constructed (natural vs. artificial diversity; combinatorial vs. noncombinatorial), how they are screened (hybridization, expression, display), or by the nature of the screened library members (peptides, nucleic acids, etc.).
In a "natural diversity" library, essentially all of the diversity arose without human intervention. This would be true, for example, of messenger RNA extracted from a non-engineered cell.
In a "synthetic diversity" library, essentially all of the diversity arose deliberately as a result of human intervention. This would be true, for example, of a combinatorial library; note that a small level of natural diversity could still arise as a result of spontaneous mutation. It would also be true of a noncombinatorial library of compounds collected from diverse sources, even if they were all natural products.
In a "non-natural diversity" library, at least some of the diversity arose deliberately through human intervention.
In a "controlled origin" library, the source of the diversity is limited in some way. A limitation might be to cells of a particular individual, to a particular species, or to a particular genus, or, more complexly, to individuals of a particular species who are of a particular age, sex, physical condition, geographical location, occupation and/or familial relationship. Alternatively or additionally, it might be to cells of a particular tissue or organ. Or it could be cells exposed to particular pharmacological, environmental, or pathogenic conditions. Or the library could be of chemicals, or a particular class of chemicals, produced by such cells.
In a "controlled structure" library, the library members are deliberately limited by the production conditions to particular chemical structures. For example, if they are oligomers, they may be limited in length and monomer composition, e.g., hexapeptides composed of the twenty genetically encoded amino acids.
Hybridization Library
In a hybridization library, the library members are nucleic acids, and are screened using a nucleic acid hybridization probe. Bound nucleic acids may then be amplified, cloned, and/or sequenced.
Expression Library
In an expression library, the screened library members are gene expression products, but one may also speak of an underlying library of genes encoding those products. The library is made by subcloning DNA encoding the library members (or portions thereof) into expression vectors (or into cloning vectors which subsequently are used to construct expression vectors), each vector comprising an expressible gene encoding a particular library member, introducing the expression vectors into suitable cells, and expressing the genes so the expression products are produced.
In one embodiment, the expression products are secreted, so the library can be screened using an affinity reagent, such as an antibody or receptor. The bound expression products may be sequenced directly, or their sequences inferred by, e.g., sequencing at least the variable portion of the encoding DNA.
In a second embodiment, the cells are lysed, thereby exposing the expression products, and the latter are screened with the affinity reagent.
In a third embodiment, the cells express the library members in such a manner that they are displayed on the surface of the cells, or on the surface of viral particles produced by the cells. (See display libraries, below.)
In a fourth embodiment, the screening is not for the ability of the expression product to bind to an affinity reagent, but rather for its ability to alter the phenotype of the host cell in a particular detectable manner. Here, the screened library members are transformed cells, but there is a first underlying library of expression products which mediate the behavior of the cells, and a second underlying library of genes which encode those products.
Display Library
In a display library, the library members are each conjugated to, and displayed upon, a support of some kind. The support may be living (a cell or virus), or nonliving (e.g., a bead or plate). If the support is a cell or virus, display will normally be effectuated by expressing a fusion protein which comprises the library member, a carrier moiety allowing integration of the fusion protein into the surface of the cell or virus, and optionally a linking moiety. In a variation on this theme, the cell coexpresses a first fusion comprising the library member and a linking moiety L1, and a second fusion comprising a linking moiety L2 and the carrier moiety. L1 and L2 interact to associate the first fusion with the second fusion and hence, indirectly, the library member with the surface of the cell or virus.
Soluble Library
In a soluble library, the library members are free in solution. A soluble library may be produced directly, or one may first make a display library and then release the library members from their supports.
Encapsulated Library
In an encapsulated library, the library members are inside cells or liposomes. Generally speaking, encapsulated libraries are used to store the library members for future use; the members are extracted in some way for screening purposes. However, if they differentially affect the phenotype of the cells, they may be screened indirectly by screening the cells.
cDNA Library
A cDNA library is usually prepared by extracting RNA from cells of particular origin, fractionating the RNA to isolate the messenger RNA (mRNA has a poly(A) tail, so this is usually done by oligo-dT affinity chromatography), synthesizing complementary DNA (cDNA) using reverse transcriptase, DNA polymerase, and other enzymes, subcloning the cDNA into vectors, and introducing the vectors into cells. Often, only mRNAs or cDNAs of particular sizes will be used, to make it more likely that the cDNA encodes a functional polypeptide.
A cDNA library explores the natural diversity of the transcribed DNAs of cells from a particular source. It is not a combinatorial library.
A cDNA library may be used to make a hybridization library, or it may be used as (or to make) an expression library.
Genomic DNA Library
A genomic DNA library is made by extracting DNA from a particular source, fragmenting the DNA, isolating fragments of a particular size range, subcloning the DNA fragments into vectors, and introducing the vectors into cells.
Like a cDNA library, a genomic DNA library is a natural diversity library, and not a combinatorial library. A genomic DNA library may be used the same way as a cDNA library.
Synthetic DNA Library
A synthetic DNA library may be screened directly (as a hybridization library), or used in the creation of an expression or display library of peptides/proteins.
Combinatorial Libraries
The term "combinatorial library" refers to a library in which the individual members are either systematic or random combinations of a limited set of basic elements, the properties of each member being dependent on the choice and location of the elements incorporated into it. Typically, the members of the library are at least capable of being screened simultaneously. Randomization may be complete or partial; some positions may be randomized and others predetermined, and at random positions, the choices may be limited in a predetermined manner.
The members of a combinatorial library may be oligomers or polymers of some kind, in which the variation occurs through the choice of monomeric building block at one or more positions of the oligomer or polymer, and possibly in terms of the connecting linkage, or the length of the oligomer or polymer, too. Or the members may be nonoligomeric molecules with a standard core structure, like the 1,4-benzodiazepine structure, with the variation being introduced by the choice of substituents at particular variable sites on the core structure. Or the members may be nonoligomeric molecules assembled like a jigsaw puzzle, but wherein each piece has both one or more variable moieties (contributing to library diversity) and one or more constant moieties (providing the functionalities for coupling the piece in question to other pieces).
Thus, in a typical combinatorial library, chemical building blocks are at least partially randomly combined into a large number (as high as 10^15) of different compounds, which are then simultaneously screened for binding (or other) activity against one or more targets.
In a "simple combinatorial library", all of the members belong to the same class of compounds (e.g., peptides) and can be synthesized simultaneously.
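Partial randomization of an oligomer can be sketched by listing the allowed monomers at each position and taking every combination. The representation (one string of allowed one-letter amino acid codes per position, with no repeated letters) and the function names are assumptions made for illustration; real libraries of the sizes mentioned above are of course synthesized, not enumerated in memory.

```python
# Illustrative only: enumerate a small combinatorial peptide library.
from itertools import product
from math import prod

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty genetically encoded residues


def theoretical_diversity(position_choices):
    """Number of distinct members implied by the per-position choices."""
    return prod(len(choices) for choices in position_choices)


def enumerate_library(position_choices):
    """All members, one per combination of per-position choices.

    A position holding a single letter is predetermined; a position
    holding several letters is randomized over just those letters.
    """
    return ["".join(combo) for combo in product(*position_choices)]
```

For example, enumerate_library(["A", "DE", "KRH"]) fixes the first position, allows two residues at the second and three at the third, and yields six tripeptides, while theoretical_diversity([AMINO_ACIDS] * 6) gives the 64,000,000 possible fully randomized hexapeptides.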
A "composite combinatorial library" is a mixture of two or more simple libraries, e.g., DNAs and peptides, or peptides, peptoids, and PNAs, or benzodiazepines and carbamates. The number of component simple libraries in a composite library will, of course, normally be smaller than the average number of members in each simple library, as otherwise the advantage of a library over individual synthesis is small.
Libraries of thousands, even millions, of random oligopeptides have been prepared by chemical synthesis (Houghten et al., Nature, 354:84-6(1991)), or gene expression (Marks et al., J Mol Biol, 222:581-97(1991)), displayed on chromatographic supports (Lam et al., Nature, 354:82-4(1991)), inside bacterial cells (Colas et al., Nature, 380:548-550(1996)), on bacterial pili (Lu, Bio/Technology, 13:366-372(1990)), or phage (Smith, Science, 228:1315-7(1985)), and screened for binding to a variety of targets including antibodies (Valadon et al., J Mol Biol, 261:11-22(1996)), cellular proteins (Schmitz et al., J Mol Biol, 260:664-677(1996)), viral proteins (Hong and Boulanger, Embo J, 14:4714-4727(1995)), bacterial proteins (Jacobsson and Frykberg, Biotechniques, 18:878-885(1995)), nucleic acids (Cheng et al., Gene, 171:1-8(1996)), and plastic (Siani et al., J Chem Inf Comput Sci, 34:588-593(1994)).
Libraries of proteins (Ladner, USP 4,664,989), peptoids (Simon et al., Proc Natl Acad Sci U S A, 89:9367-71(1992)), nucleic acids (Ellington and Szostak, Nature, 246:818(1990)), carbohydrates, and small organic molecules (Eichler et al., Med Res Rev, 15:481-96(1995)) have also been prepared or suggested for drug screening purposes.
The first combinatorial libraries were composed of peptides or proteins, in which all or selected amino acid positions were randomized. Peptides and proteins can exhibit high and specific binding activity, and can act as catalysts. In consequence, they are of great importance in biological systems.
Nucleic acids have also been used in combinatorial libraries. Their great advantage is the ease with which a nucleic acid with appropriate binding activity can be amplified. As a result, combinatorial libraries composed of nucleic acids can be of low redundancy and hence, of high diversity.
There has also been much interest in combinatorial libraries based on small molecules, which are more suited to pharmaceutical use, especially those which, like benzodiazepines, belong to a chemical class which has already yielded useful pharmacological agents. The techniques of combinatorial chemistry have been recognized as the most efficient means for finding small molecules that act on these targets. At present, small molecule combinatorial chemistry involves the synthesis of either pooled or discrete molecules that present varying arrays of functionality on a common scaffold. These compounds are grouped in libraries that are then screened against the target of interest either for binding or for inhibition of biological activity.
The size of a library is the number of molecules in it. The simple diversity of a library is the number of unique structures in it. There is no formal minimum or maximum diversity. If the library has a very low diversity, the library has little advantage over just synthesizing and screening the members individually.
If the library is of very high diversity, it may be inconvenient to handle, at least without automating the process. The simple diversity of a library is preferably at least 10, 10E2, 10E3, 10E4, 10E6, 10E7, 10E8 or 10E9, the higher the better under most circumstances. The simple diversity is usually not more than 10E15, and more usually not more than 10E10.
The average sampling level is the size divided by the simple diversity. The expected average sampling level must be high enough to provide a reasonable assurance that, if a given structure were expected, as a consequence of the library design, to be present, the actual average sampling level will be high enough so that the structure, if satisfying the screening criteria, will yield a positive result when the library is screened. Thus, the preferred average sampling level is a function of the detection limit, which in turn is a function of the strength of the signal to be screened.
There are more complex measures of diversity than simple diversity. These attempt to take into account the degree of structural difference between the various unique sequences. These more complex measures are usually used in the context of small organic compound libraries, see below.
The library members may be presented as solutes in solution, or immobilized on some form of support. In the latter case, the support may be living (cell, virus) or nonliving (bead, plate, etc.). The supports may be separable (cells, virus particles, beads) so that binding and nonbinding members can be separated, or nonseparable (plate). In the latter case, the members will normally be placed on addressable positions on the support.
The advantage of a soluble library is that there is no carrier moiety that could interfere with the binding of the members to the support. The advantage of an immobilized library is that it is easier to identify the structure of the members which were positive.
When screening a soluble library, or one with a separable support, the target is usually immobilized. When screening a library on a nonseparable support, the target will usually be labeled.
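The relationship between library size, simple diversity and average sampling level defined above can be illustrated with a short calculation. The Python sketch below is illustrative only: the function names are hypothetical, and the presence estimate assumes that copies are distributed randomly over the unique structures (a Poisson approximation), an assumption that is not stated in the text.

```python
import math


def average_sampling_level(size: int, simple_diversity: int) -> float:
    """Average number of copies per unique structure (size / simple diversity)."""
    if simple_diversity <= 0:
        raise ValueError("simple diversity must be positive")
    return size / simple_diversity


def probability_present(sampling_level: float) -> float:
    """Chance that a given designed structure occurs at least once.

    Assumption (not from the text): copy numbers follow a Poisson
    distribution whose mean is the average sampling level.
    """
    return 1.0 - math.exp(-sampling_level)


if __name__ == "__main__":
    # Hypothetical library: 10^9 molecules spread over 10^6 unique structures.
    level = average_sampling_level(size=10**9, simple_diversity=10**6)
    print(level)                       # copies per unique structure
    print(probability_present(level))  # close to 1 at this sampling level
```

Under this simplified model, a sampling level near 1 leaves a substantial chance that a designed structure is absent, which is one way to read the requirement that the sampling level be matched to the detection limit.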
Oligonucleotide Libraries
An oligonucleotide library is a combinatorial library, at least some of whose members are single-stranded oligonucleotides having three or more nucleotides connected by phosphodiester or analogous bonds. The oligonucleotides may be linear, cyclic or branched, and may include non-nucleic acid moieties. The nucleotides are not limited to the nucleotides normally found in DNA or RNA. For examples of nucleotides modified to increase nuclease resistance and chemical stability of aptamers, see Chart 1 in Osborne and Ellington, Chem. Rev., 97: 349-70 (1997). For screening of RNA, see Ellington and Szostak, Nature, 346:818-22 (1990).
There is no formal minimum or maximum size for these oligonucleotides. However, the number of conformations which an oligonucleotide can assume increases exponentially with its length in bases. Hence, a longer oligonucleotide is more likely to be able to fold to adapt itself to a protein surface. On the other hand, while very long molecules can be synthesized and screened, unless they provide a much superior affinity to that of shorter molecules, they are not likely to be found in the selected population, for the reasons explained by Osborne and Ellington (1997). Hence, the libraries of the present invention are preferably composed of oligonucleotides having a length of 3 to 100 bases, more preferably 15 to 35 bases. The oligonucleotides in a given library may be of the same or of different lengths.
Oligonucleotide libraries have the advantage that libraries of very high diversity (e.g., 10E15) are feasible, and binding molecules are readily amplified in vitro by polymerase chain reaction (PCR). Moreover, nucleic acid molecules can have very high specificity and affinity to targets.
In a preferred embodiment, this invention prepares and screens oligonucleotide libraries by the SELEX method, as described in King and Famulok, Molec. Biol. Repts., 20: 97-107 (1994); L. Gold, C. Tuerk.
Methods of producing nucleic acid ligands, US#5595877; Oliphant et al. Gene 44:177 (1986).
The term "aptamer" is conferred on those oligonucleotides which bind the target protein. Such aptamers may be used to characterize the target protein, both directly (through identification of the aptamer and the points of contact between the aptamer and the protein) and indirectly (by use of the aptamer as a ligand to modify the chemical reactivity of the protein).
In a classic oligonucleotide, each nucleotide (monomeric unit) is composed of a phosphate group, a sugar moiety, and either a purine or a pyrimidine base. In DNA, the sugar is deoxyribose and in RNA it is ribose. The nucleotides are linked by 5'-3' phosphodiester bonds.
The deoxyribose phosphate backbone of DNA can be modified to increase resistance to nuclease and to increase penetration of cell membranes. Derivatives such as mono- or dithiophosphates, methyl phosphonates, boranophosphates, formacetals, carbamates, siloxanes, and dimethylenethio-, -sulfoxido- and -sulfono-linked species are known in the art.
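The link between oligonucleotide length and attainable diversity can be made concrete with a small calculation. The following sketch assumes a fully randomized sequence drawn from the four standard bases; modified nucleotides, which the text allows, would enlarge the alphabet. The function names are hypothetical.

```python
def sequence_space(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of the given length (alphabet_size ** length)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


def shortest_length_for(diversity: int, alphabet_size: int = 4) -> int:
    """Smallest fully randomized length whose sequence space reaches `diversity`."""
    length = 0
    while sequence_space(length, alphabet_size) < diversity:
        length += 1
    return length


if __name__ == "__main__":
    for n in (3, 15, 35):
        print(n, sequence_space(n))
    # Length needed before the sequence space reaches the 10^15 figure quoted above.
    print(shortest_length_for(10**15))
```

With four bases, the sequence space first reaches 10^15 at a randomized length of 25, which lies inside the preferred range of 15 to 35 bases.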
Peptide Library
A peptide is composed of a plurality of amino acid residues joined together by peptidyl (-NHCO-) bonds. A biogenic peptide is a peptide in which the residues are all genetically encoded amino acid residues; it is not necessary that the biogenic peptide actually be produced by gene expression.
Amino acids are the basic building blocks with which peptides and proteins are constructed. Amino acids possess both an amino group (-NH2) and a carboxylic acid group (-COOH). Many amino acids, but not all, have the alpha amino acid structure NH2-CHR-COOH, where R is hydrogen, or any of a variety of functional groups.
Twenty amino acids are genetically encoded: Alanine, Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, and Valine. Of these, all save Glycine are optically isomeric; however, only the L-form is found in humans. Nevertheless, the D-forms of these amino acids do have biological significance; D-Phe, for example, is a known analgesic.
Many other amino acids are also known, including: 2-Aminoadipic acid; 3-Aminoadipic acid; beta-Aminopropionic acid; 2-Aminobutyric acid; 4-Aminobutyric acid (Piperidinic acid); 6-Aminocaproic acid; 2-Aminoheptanoic acid; 2-Aminoisobutyric acid; 3-Aminoisobutyric acid; 2-Aminopimelic acid; 2,4-Diaminobutyric acid; Desmosine; 2,2'-Diaminopimelic acid; 2,3-Diaminopropionic acid; N-Ethylglycine; N-Ethylasparagine; Hydroxylysine; allo-Hydroxylysine; 3-Hydroxyproline; 4-Hydroxyproline; Isodesmosine; allo-Isoleucine; N-Methylglycine (Sarcosine); N-Methylisoleucine; N-Methylvaline; Norvaline; Norleucine; and Ornithine.
Peptides are constructed by condensation of amino acids and/or smaller peptides. The amino group of one amino acid (or peptide) reacts with the carboxylic acid group of a second amino acid (or peptide) to form a peptide (-NHCO-) bond, releasing one molecule of water. Therefore, when an amino acid is incorporated into a peptide, it should, technically speaking, be referred to as an amino acid residue. The core of that residue is the moiety which excludes the -NH and -CO linking functionalities which connect it to other residues. This moiety consists of one or more main chain atoms (see below) and the attached side chains.
The main chain moiety of each amino acid consists of the -NH and -CO linking functionalities and a core main chain moiety. Usually, the latter is a single carbon atom. However, the core main chain moiety may include additional carbon atoms, and may also include nitrogen, oxygen or sulfur atoms, which together form a single chain. In a preferred embodiment, the core main chain atoms consist solely of carbon atoms.
The side chains are attached to the core main chain atoms. For alpha amino acids, in which the side chain is attached to the alpha carbon, the C-1, C-2 and N-2 of each residue form the repeating unit of the main chain, and the word "side chain" refers to the C-3 and higher numbered carbon atoms and their substituents. It also includes H atoms attached to the main chain atoms.
Amino acids may be classified according to the number of carbon atoms which appear in the main chain between the carbonyl carbon and amino nitrogen atoms which participate in the peptide bonds. Among the 150 or so amino acids which occur in nature, alpha, beta, gamma and delta amino acids are known. These have 1-4 intermediary carbons. Only alpha amino acids occur in proteins.
Proline is a special case of an alpha amino acid; its side chain also binds to the peptide bond nitrogen.
For beta and higher order amino acids, there is a choice as to which main chain core carbon a side chain other than H is attached to. The preferred attachment site is the C-2 (alpha) carbon, i.e., the one adjacent to the carboxyl carbon of the -CO linking functionality. It is also possible for more than one main chain atom to carry a side chain other than H. However, in a preferred embodiment, only one main chain core atom carries a side chain other than H.
A main chain carbon atom may carry either one or two side chains; one is more common. A side chain may be attached to a main chain carbon atom by a single or a double bond; the former is more common.
A simple combinatorial peptide library is one whose members are peptides having three or more amino acids connected via peptide bonds. The peptides may be linear, branched, or cyclic, and may covalently or noncovalently include nonpeptidyl moieties. The amino acids are not limited to the naturally occurring or to the genetically encoded amino acids.
A biased peptide library is one in which one or more (but not all) residues of the peptides are constant residues.
Cyclic Peptides
Many naturally occurring peptides are cyclic. Cyclization is a common mechanism for stabilization of peptide conformation, thereby achieving improved association of the peptide with its ligand and hence improved biological activity. Cyclization is usually achieved by intra-chain cystine formation, or by formation of a peptide bond between side chains or between the N- and C-terminals. Cyclization was usually carried out on peptides in solution, but several publications have appeared that describe cyclization of peptides on beads.
A peptide library may be an oligopeptide library or a protein library.
Oligopeptides
Preferably, the oligopeptides are at least five, six, seven or eight amino acids in length. Preferably, they are composed of less than 50, more preferably less than 20 amino acids. In the case of an oligopeptide library, all or just some of the residues may be variable. The oligopeptide may be unconstrained, or constrained to a particular conformation by, e.g., the participation of constant cysteine residues in the formation of a constraining disulfide bond.
Proteins
Proteins, like oligopeptides, are composed of a plurality of amino acids, but the term protein is usually reserved for longer peptides, which are able to fold into a stable conformation. A protein may be composed of two or more polypeptide chains, held together by covalent or noncovalent crosslinks. These may occur in a homooligomeric or a heterooligomeric state.
A peptide is considered a protein if it (1) is at least 50 amino acids long, or (2) has at least two stabilizing covalent crosslinks (e.g., disulfide bonds). Thus, conotoxins are considered proteins.
Usually, the proteins of a protein library will be characterizable as having both constant residues (the same for all proteins in the library) and variable residues (which vary from member to member). This is simply because, for a given range of variation at each position, the sequence space (simple diversity) grows exponentially with the number of residue positions, so at some point it becomes inconvenient for all residues of a peptide to be variable positions. Since proteins are usually larger than oligopeptides, it is more common for protein libraries than oligopeptide libraries to feature variable positions.
In the case of a protein library, it is desirable to focus the mutations at those sites which are tolerant of mutation. These may be determined by alanine scanning mutagenesis or by comparison of the protein sequence to that of homologous proteins of similar activity. It is also more likely that mutation of surface residues will directly affect binding. Surface residues may be determined by inspecting a 3D structure of the protein, or by labeling the surface and then ascertaining which residues have received labels. They may also be inferred by identifying regions of high hydrophilicity within the protein.
Because proteins are often altered at some sites but not others, protein libraries can be considered a special case of the biased peptide library.
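The distinction between constant and variable residues in a biased library can be sketched by enumerating every member of a small design. This is a minimal illustration, not a method described in the source: the template notation (with "X" marking a variable position), the function name and the toy alphabet are all assumptions made for the example.

```python
from itertools import product


def enumerate_biased_library(template: str, alphabet: str,
                             variable_symbol: str = "X") -> list[str]:
    """List every member of a biased library.

    `template` holds constant residues as one-letter codes and marks each
    variable position with `variable_symbol` (a hypothetical convention).
    Each variable position ranges independently over `alphabet`.
    """
    variable_count = template.count(variable_symbol)
    members = []
    for choice in product(alphabet, repeat=variable_count):
        residues = iter(choice)
        member = "".join(
            next(residues) if symbol == variable_symbol else symbol
            for symbol in template
        )
        members.append(member)
    return members


if __name__ == "__main__":
    # Two constant cysteines flanking two variable positions, toy alphabet of two residues.
    print(enumerate_biased_library("CXXC", "AG"))
    # Simple diversity is len(alphabet) ** (number of variable positions).
    print(len(enumerate_biased_library("CXXXC", "ACDE")))
```

Because the member count is the alphabet size raised to the number of variable positions, each additional variable position multiplies the simple diversity, which is why larger proteins are rarely randomized at every residue.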
There are several reasons that one might screen a protein library instead of an oligopeptide library, including (1) a particular protein, mutated in the library, has the desired activity to some degree already, and (2) the oligopeptides are not expected to have a sufficiently high affinity or specificity, since they do not have a stable conformation.
When the protein library is based on a parental protein which does not have the desired activity, the parental protein will usually be one which is of high stability (melting point >= 50 deg. C.) and/or possessed of hypervariable regions.
The variable domains of an antibody possess hypervariable regions and hence, in some embodiments, the protein library comprises members which comprise a mutant of a VH or VL chain, or a mutant of an antigen-specific binding fragment of such a chain. VH and VL chains are usually each about 110 amino acid residues, and are held in proximity by a disulfide bond between the adjoining CL and CH1 regions to form a variable domain. Together, the VH, VL, CL and CH1 form an Fab fragment.
In human heavy chains, the hypervariable regions are at 31-35, 49-65, 98-111 and 84-88, but only the first three are involved in antigen binding. There is variation among VH and VL chains at residues outside the hypervariable regions, but to a much lesser degree.
A sequence is considered a mutant of a VH or VL chain if it is at least 80% identical to a naturally occurring VH or VL chain at all residues outside the hypervariable region.
In a preferred embodiment, such antibody library members comprise both at least one VH chain and at least one VL chain, at least one of which is a mutant chain, and which chains may be derived from the same or different antibodies.
The VH and VL chains may be covalently joined by a suitable linker moiety, as in a "single chain antibody", or they may be noncovalently joined, as in a naturally occurring variable domain.
If the joining is noncovalent, and the library is displayed on cells or virus, then either the VH or the VL chain may be fused to the carrier surface/coat protein. The complementary chain may be co-expressed, or added exogenously to the library.
The members may further comprise some or all of an antibody constant heavy and/or constant light chain, or a mutant thereof.
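The 80% identity criterion for a mutant VH or VL chain can be expressed as a simple check. The sketch below is a simplification under stated assumptions: the two sequences are taken to be already aligned and of equal length, residue positions are 1-based and inclusive, and the heavy-chain region boundaries are the ones quoted in the text. The function and constant names are hypothetical.

```python
# Heavy-chain hypervariable regions as quoted in the text (1-based, inclusive).
HEAVY_CHAIN_HYPERVARIABLE = [(31, 35), (49, 65), (84, 88), (98, 111)]


def framework_identity(candidate: str, reference: str,
                       hypervariable=HEAVY_CHAIN_HYPERVARIABLE) -> float:
    """Fraction of identical residues outside the hypervariable regions.

    Assumption: `candidate` and `reference` are pre-aligned, gap-free and
    of equal length, so that position i in one corresponds to i in the other.
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    masked = set()
    for start, end in hypervariable:
        masked.update(range(start, end + 1))
    positions = [i for i in range(1, len(reference) + 1) if i not in masked]
    if not positions:
        raise ValueError("no residues outside the hypervariable regions")
    matches = sum(candidate[i - 1] == reference[i - 1] for i in positions)
    return matches / len(positions)


def is_mutant_chain(candidate: str, reference: str,
                    threshold: float = 0.80) -> bool:
    """Apply the 'at least 80% identical outside the hypervariable region' test."""
    return framework_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    reference = "A" * 115                   # placeholder sequence, not a real VH chain
    candidate = "G" * 10 + reference[10:]   # ten framework substitutions
    print(framework_identity(candidate, reference))
    print(is_mutant_chain(candidate, reference))
```

A real comparison would first align the candidate against a naturally occurring chain; the fixed coordinates used here only hold once both sequences share one numbering scheme.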
Peptoid Library
A peptoid is an analogue of a peptide in which one or more of the peptide bonds (-NH-CO-) are replaced by pseudopeptide bonds, which may be the same or different. It is not necessary that all of the peptide bonds be replaced, i.e., a peptoid may include one or more conventional amino acid residues, e.g., proline.
A peptide bond has two small divalent linker elements, -NH- and -CO-. Thus, a preferred class of pseudopeptide bonds are those which consist of two small divalent linker elements. Each may be chosen independently from the group consisting of amine (-NH-), substituted amine (-NR-), carbonyl (-CO-), thiocarbonyl (-CS-), methylene (-CH2-), monosubstituted methylene (-CHR-), disubstituted methylene (-CR1R2-), ether (-O-) and thioether (-S-). The more preferred pseudopeptide bonds include:
  N-modified          -NRCO-
  Carba               Ψ -CH2-CH2-
  Depsi               Ψ -CO-O-
  Hydroxyethylene     Ψ -CHOH-CH2-
  Ketomethylene       Ψ -CO-CH2-
  Methylene-Oxy       -CH2-O-
  Reduced             -CH2-NH-
  Thiomethylene       -CH2-S-
  Thiopeptide         -CS-NH-
  Retro-Inverso       -CO-NH-
A single peptoid molecule may include more than one kind of pseudopeptide bond.
For the purposes of introducing diversity into a peptoid library, one may vary (1) the side chains attached to the core main chain atoms of the monomers linked by the pseudopeptide bonds, and/or (2) the side chains (e.g., the -R of an -NRCO-) of the pseudopeptide bonds. Thus, in one embodiment, the monomeric units which are not amino acid residues are of the structure -NR1-CR2-CO-, where at least one of R1 and R2 is not hydrogen. If there is variability in the pseudopeptide bond, this is most conveniently done by using an -NRCO- or other pseudopeptide bond with an R group, and varying the R group. In this event, the R group will usually be any of the side chains characterizing the amino acids of peptides, as previously discussed.
If the R group of the pseudopeptide bond is not variable, it will usually be small, e.g., not more than 10 atoms (e.g., hydroxyl, amino, carboxyl, methyl, ethyl, propyl).
If the conjugation chemistries are compatible, a simple combinatorial library may include both peptides and peptoids.
Peptide Nucleic Acid Library
A PNA oligomer is here defined as one comprising a plurality of units, at least one of which is a PNA monomer which comprises a side chain comprising a nucleobase. For nucleobases, see USP 6,077,835.
The classic PNA oligomer is composed of (2-aminoethyl)glycine units, with nucleobases attached by methylene carbonyl linkers. That is, it has the structure
H-(-HN-CH2-CH2-N(-CO-CH2-B)-CH2-CO-)n-OH
where the outer parenthesized substructure is the PNA monomer.
In this structure, the nucleobase B is separated from the backbone N by three bonds, and the points of attachment of the side chains are separated by six bonds. The nucleobase may be any of the bases included in the nucleotides discussed in connection with oligonucleotide libraries. The bases of nucleotides A, G, T, C and U are preferred.
A PNA oligomer may further comprise one or more amino acid residues, especially glycine and proline.
One can readily envision related molecules in which (1) the -COCH2- linker is replaced by another linker, especially one composed of two small divalent linkers as defined previously, (2) a side chain is attached to one of the three main chain carbons not participating in the peptide bond (either instead of or in addition to the side chain attached to the N of the classic PNA); and/or (3) the peptide bonds are replaced by pseudopeptide bonds as disclosed previously in the context of peptoids.
PNA oligomer libraries have been made; see e.g. Cook, 6,204,326.
Small Organic Compound Library
The small organic compound library ("compound library", for short) is a combinatorial library whose members are suitable for use as drugs if, indeed, they have the ability to mediate a biological activity of the target protein.
Peptides have certain disadvantages as drugs. These include susceptibility to degradation by serum proteases, and difficulty in penetrating cell membranes. Preferably, all or most of the compounds of the compound library avoid, or at least do not suffer to the same degree, one or more of the pharmaceutical disadvantages of peptides.
In designing a compound library, it is helpful to bear in mind the methods of molecular modification typically used to obtain new drugs.
Three basic kinds of modification may be identified: disjunction, in which a lead drug is simplified to identify its component pharmacophoric moieties; conjunction, in which two or more known pharmacophoric moieties, which may be the same or different, are associated, covalently or noncovalently, to form a new drug; and alteration, in which one moiety is replaced by another which may be similar or different, but which is not in effect a disjunction or conjunction. The use of the terms "disjunction", "conjunction" and "alteration" is intended only to connote the structural relationship of the end product to the original leads, and not how the new drugs are actually synthesized, although it is possible that the two are the same.
The process of disjunction is illustrated by the evolution of neostigmine (1931) and edrophonium (1952) from physostigmine (1925). Subsequent conjunction is illustrated by demecarium (1956) and ambenonium (1956).
Alterations may modify the size, polarity, or electron distribution of an original moiety. Alterations include ring closing or opening, formation of lower or higher homologues, introduction or saturation of double bonds, introduction of optically active centers, introduction, removal or replacement of bulky groups, isosteric or bioisosteric substitution, changes in the position or orientation of a group, introduction of alkylating groups, and introduction, removal or replacement of groups with a view toward inhibiting or promoting inductive (electrostatic) or conjugative (resonance) effects.
Thus, the substituents may include electron acceptors and/or electron donors. Typical electron donors (+I) include -CH3, -CH2R, -CHR2, -CR3 and -COO-. Typical electron acceptors (-I) include -NH3+, -NR3+, -NO2, -CN, -COOH, -COOR, -CHO, -COR, -F, -Cl, -Br, -OH, -OR, -SH, -SR, -CH=CH2, -CR=CR2, and -C≡CH.
The substituents may also include those which increase or decrease electronic density in conjugated systems.
The former (+R) groups include -CH3, -CR3, -F, -Cl, -Br, -I, -OH, -OR, -OCOR, -SH, -SR, -NH2, -NR2, and -NHCOR. The latter (-R) groups include -NO2, -CN, -CHO, -COR, -COOH, -COOR, -CONH2, -SO2R and -CF3.
Synthetically speaking, the modifications may be achieved by a variety of unit processes, including nucleophilic and electrophilic substitution, reduction and oxidation, addition elimination, double bond cleavage, and cyclization.
For the purpose of constructing a library, a compound, or a family of compounds, having one or more pharmacological activities (which need not be related to the known or suspected activities of the target protein), may be disjoined into two or more known or potential pharmacophoric moieties. Analogues of each of these moieties may be identified, and mixtures of these analogues reacted so as to reassemble compounds which have some similarity to the original lead compound. It is not necessary that all members of the library possess moieties analogous to all of the moieties of the lead compound.
The design of a library may be illustrated by the example of the benzodiazepines. Several benzodiazepine drugs, including chlordiazepoxide, diazepam and oxazepam, have been used as anti-anxiety drugs. Derivatives of benzodiazepines have widespread biological activities; derivatives have been reported to act not only as anxiolytics, but also as anticonvulsants; cholecystokinin (CCK) receptor subtype A or B, kappa opioid receptor, platelet activating factor, and HIV transactivator Tat antagonists; and GPIIbIIIa, reverse transcriptase and ras farnesyltransferase inhibitors.
The benzodiazepine structure has been disjoined into a 2-aminobenzophenone, an amino acid, and an alkylating agent. See Bunin, et al., Proc. Nat. Acad. Sci. USA, 91:4708 (1994).
Since only a few 2-aminobenzophenone derivatives are commercially available, it was later disjoined into 2-aminoarylstannane, an acid chloride, an amino acid, and an alkylating agent. Bunin, et al., Meth. Enzymol., 267:448 (1996). The arylstannane may be considered the core structure upon which the other moieties are substituted, or all four may be considered equals which are conjoined to make each library member.
A basic library synthesis plan and member structure is shown in Figure 1 of Fowlkes, et al., U.S. Serial No. 08/740,671, incorporated by reference in its entirety. The acid chloride building block introduces variability at the R1 site. The R2 site is introduced by the amino acid, and the R3 site by the alkylating agent. The R4 site is inherent in the arylstannane. Bunin, et al. generated a 1,4-benzodiazepine library of 11,200 different derivatives prepared from 20 acid chlorides, 35 amino acids, and 16 alkylating agents. (No diversity was introduced at R4; this group was used to couple the molecule to a solid phase.) According to the Available Chemicals Directory (MDL Information Systems, San Leandro CA), over 300 acid chlorides, 80 Fmoc-protected amino acids and 800 alkylating agents were available for purchase (and more, of course, could be synthesized). The particular moieties used were chosen to maximize structural dispersion, while limiting the numbers to those conveniently synthesized in the wells of a microtiter plate. In choosing between structurally similar compounds, preference was given to the least substituted compound.
The variable elements included both aliphatic and aromatic groups. Among the aliphatic groups, both acyclic and cyclic (mono- or poly-) structures, substituted or not, were tested. (While all of the acyclic groups were linear, it would have been feasible to introduce a branched aliphatic.) The aromatic groups featured either single or multiple rings, fused or not, substituted or not, and with heteroatoms or not. The secondary substituents included -NH2, -OH, -OMe, -CN, -Cl, -F, and -COOH. While not used, spacer moieties, such as -O-, -S-, -CO-, -CS-, -NH-, and -NR-, could have been incorporated.
Bunin et al. suggest that instead of using a 1,4-benzodiazepine as a core structure, one may instead use a 1,4-benzodiazepine-2,5-dione structure.
As noted by Bunin et al., it is advantageous, although not necessary, to use a linkage strategy which leaves no trace of the linking functionality, as this permits construction of a more diverse library.
Other combinatorial nonoligomeric compound libraries known or suggested in the art have been based on carbamates, mercaptoacylated pyrrolidines, phenolic agents, aminimides, N-acylamino ethers (made from amino alcohols, aromatic hydroxy acids, and carboxylic acids), N-alkylamino ethers (made from aromatic hydroxy acids, amino alcohols and aldehydes), 1,4-piperazines, and 1,4-piperazine-6-ones.
DeWitt, et al., Proc. Nat. Acad. Sci. (USA), 90:6909-13 (1993) describe the simultaneous, but separate, synthesis of 40 discrete hydantoins and 40 discrete benzodiazepines.
They carry out their synthesis on a solid support (inside a gas dispersion tube), in an array format, as opposed to other conventional simultaneous synthesis techniques (e.g., in a well, or on a pin). The hydantoins were synthesized by first simultaneously deprotecting and then treating each of five amino acid resins with each of eight isocyanates. The benzodiazepines were synthesized by treating each of five deprotected amino acid resins with each of eight 2-amino benzophenone imines.
Chen, et al., J. Am. Chem. Soc., 116:2661-62 (1994) described the preparation of a pilot (9 member) combinatorial library of formate esters. A polymer bead-bound aldehyde preparation was "split" into three aliquots, each reacted with one of three different ylide reagents. The reaction products were combined, and then divided into three new aliquots, each of which was reacted with a different Michael donor. Compound identity was found to be determinable on a single bead basis by gas chromatography/mass spectroscopy analysis.
Holmes, USP 5,549,974 (1996) sets forth methodologies for the combinatorial synthesis of libraries of thiazolidinones and metathiazanones. These libraries are made by combination of amines, carbonyl compounds, and thiols under cyclization conditions.
Ellman, USP 5,545,568 (1996) describes combinatorial synthesis of benzodiazepines, prostaglandins, beta-turn mimetics, and glycerol-based compounds. See also Ellman, USP 5,288,514.
Summerton, USP 5,506,337 (1996) discloses methods of preparing a combinatorial library formed predominantly of morpholino subunit structures.
Heterocyclic combinatorial libraries are reviewed generally in Nefzi, et al., Chem. Rev., 97:449-472 (1997).
For pharmacological classes, see, e.g., Goth, Medical Pharmacology: Principles and Concepts (C.V. Mosby Co.: 8th ed. 1976); Korolkovas and Burckhalter, Essentials of Medicinal Chemistry (John Wiley & Sons, Inc.: 1976).
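The library sizes quoted for the syntheses above follow directly from the number of building blocks used at each variable site. The sketch below reproduces that arithmetic and models a split-and-pool synthesis in an idealized way. It assumes that every aliquot contains every intermediate and that every reaction goes to completion; the reagent labels and function names are placeholders, not taken from the cited papers.

```python
from itertools import product


def array_library(*building_block_sets):
    """Every combination of one building block per variable site (parallel synthesis)."""
    return list(product(*building_block_sets))


def split_and_pool(rounds):
    """Idealized split-and-pool synthesis.

    In each round the pooled intermediates are split into one aliquot per
    reagent, each aliquot is reacted with its reagent, and the products are
    recombined. Assumption: each aliquot contains every intermediate.
    """
    pool = [()]
    for reagents in rounds:
        pool = [member + (reagent,) for reagent in reagents for member in pool]
    return pool


if __name__ == "__main__":
    # 20 acid chlorides x 35 amino acids x 16 alkylating agents.
    print(len(array_library(range(20), range(35), range(16))))
    # Five amino acid resins, each treated with each of eight isocyanates.
    print(len(array_library(range(5), range(8))))
    # Three ylide reagents followed by three Michael donors (placeholder labels).
    members = split_and_pool([["ylide-1", "ylide-2", "ylide-3"],
                              ["donor-1", "donor-2", "donor-3"]])
    print(len(set(members)))
```

In the array format each member is made in its own vessel, so its identity is known from its position; in the split-and-pool format the members end up mixed, which is why per-bead analysis or tagging is needed to identify them.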
For synthetic methods, see, e.g., Warren, Organic Synthesis: The Disconnection Approach (John Wiley & Sons, Ltd.: 1982); Fuson, Reactions of Organic Compounds (John Wiley & Sons: 1966); Payne and Payne, How to do an Organic Synthesis (Allyn and Bacon, Inc.: 1969); Greene, Protective Groups in Organic Synthesis (Wiley-Interscience). For selection of substituents, see e.g., Hansch and Leo, Substituent Constants for Correlation Analysis in Chemistry and Biology (John Wiley & Sons: 1979).
The library is preferably synthesized so that the individual members remain identifiable so that, if a member is shown to be active, it is not necessary to analyze it. Several methods of identification have been proposed, including:
(1) encoding, i.e., the attachment to each member of an identifier moiety which is more readily identified than the member proper. This has the disadvantage that the tag may itself influence the activity of the conjugate.
(2) spatial addressing, e.g., each member is synthesized only at a particular coordinate on or in a matrix, or in a particular chamber. This might be, for example, the location of a particular pin, or a particular well on a microtiter plate, or inside a "tea bag".
The present invention is not limited to any particular form of identification.
However, it is possible to simply characterize those members of the library which are found to be active, based on the characteristic spectroscopic indicia of the various building blocks.
Solid phase synthesis permits greater control over which derivatives are formed. However, the solid phase could interfere with activity. To overcome this problem, some or all of the molecules of each member could be liberated, after synthesis but before screening.
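Spatial addressing, as described above, amounts to a fixed mapping between a library member and a physical coordinate. The sketch below shows one possible bookkeeping scheme. It assumes a 96-well microtiter plate laid out as 8 lettered rows by 12 numbered columns and filled row by row; the plate format, the fill order and the function names are assumptions made for illustration and are not specified in the text.

```python
import string


def well_address(member_index: int, rows: int = 8, columns: int = 12):
    """Map a 0-based member index to (plate number, well label).

    Assumptions: plates are filled row by row, rows are lettered from 'A',
    columns are numbered from 1, and there are at most 26 rows per plate.
    """
    if member_index < 0:
        raise ValueError("member index must be non-negative")
    if not 0 < rows <= len(string.ascii_uppercase):
        raise ValueError("rows must be between 1 and 26")
    plate, offset = divmod(member_index, rows * columns)
    row, column = divmod(offset, columns)
    return plate + 1, f"{string.ascii_uppercase[row]}{column + 1}"


def address_book(members, rows: int = 8, columns: int = 12):
    """Dictionary from (plate, well) to the member synthesized at that position."""
    return {well_address(i, rows, columns): m for i, m in enumerate(members)}


if __name__ == "__main__":
    book = address_book([f"compound-{i}" for i in range(100)])
    print(book[(1, "A1")])   # first member on the first plate
    print(book[(2, "A4")])   # member 99 spills onto a second plate
```

With such a table, a positive well identifies the active member directly, so no tag has to be attached to the compound itself.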
Examples of candidate simple libraries which might be evaluated include derivatives of the following:
Cyclic Compounds Containing One Hetero Atom
  Heteronitrogen
    pyrroles
    pentasubstituted pyrroles
    pyrrolidines
    pyrrolines
    prolines
    indoles
    beta-carbolines
    pyridines
    dihydropyridines
    1,4-dihydropyridines
    pyrido[2,3-d]pyrimidines
    tetrahydro-3H-imidazo[4,5-c]pyridines
    isoquinolines
    tetrahydroisoquinolines
    quinolones
    beta-lactams
    azabicyclo[4.3.0]nonen-8-one amino acid
  Heterooxygen
    furans
    tetrahydrofurans
    2,5-disubstituted tetrahydrofurans
    pyrans
    hydroxypyranones
    tetrahydroxypyranones
    gamma-butyrolactones
  Heterosulfur
    sulfolenes
Cyclic Compounds with Two or More Hetero Atoms
  Multiple heteronitrogens
    imidazoles
    pyrazoles
    piperazines
    diketopiperazines
    arylpiperazines
    benzylpiperazines
    benzodiazepines
    1,4-benzodiazepine-2,5-diones
    hydantoins
    5-alkoxyhydantoins
    dihydropyrimidines
    1,3-disubstituted-5,6-dihydropyrimidine-2,4-diones
    cyclic ureas
    cyclic thioureas
    quinazolines
    chiral 3-substituted-quinazoline-2,4-diones
    triazoles
    1,2,3-triazoles
    purines
  Heteronitrogen and Heterooxygen
    diketomorpholines
    isoxazoles
    isoxazolines
  Heteronitrogen and Heterosulfur
    thiazolidines
    N-acylthiazolidines
    dihydrothiazoles
    2-methylene-2,3-dihydrothiazoles
    2-aminothiazoles
    thiophenes
    3-aminothiophenes
    4-thiazolidinones
    4-metathiazanones
    benzisothiazolones
For details on synthesis of libraries, see Nefzi, et al., Chem. Rev., 97:449-72 (1997), and references cited therein.
Pharmaceutical Methods and Preparations

The preferred animal subject of the present invention is a mammal. By the term "mammal" is meant an individual belonging to the class Mammalia. The invention is particularly useful in the treatment of human subjects, although it is intended for veterinary and nutritional uses as well. Preferred nonhuman subjects are of the orders Primata (e.g., apes and monkeys), Artiodactyla or Perissodactyla (e.g., cows, pigs, sheep, horses, goats), Carnivora (e.g., cats, dogs), Rodentia (e.g., rats, mice, guinea pigs, hamsters), Lagomorpha (e.g., rabbits) or other pet, farm or laboratory mammals.

The term "protection", as used herein, is intended to include "prevention," "suppression" and "treatment." "Prevention", strictly speaking, involves administration of the pharmaceutical prior to the induction of the disease (or other adverse clinical condition). "Suppression" involves administration of the composition prior to the clinical appearance of the disease. "Treatment" involves administration of the protective composition after the appearance of the disease.

It will be understood that in human and veterinary medicine, it is not always possible to distinguish between "preventing" and "suppressing", since the ultimate inductive event or events may be unknown or latent, or the patient may not be ascertained until well after the occurrence of the event or events. Therefore, unless qualified, the term "prevention" will be understood to refer to both prevention in the strict sense, and to suppression.

The preventative or prophylactic use of a pharmaceutical usually involves identifying subjects who are at higher risk than the general population of contracting the disease, and administering the pharmaceutical to them in advance of the clinical appearance of the disease.
The effectiveness of such use is measured by comparing the subsequent incidence or severity of the disease, or of particular symptoms of the disease, in the treated subjects against that in untreated subjects of the same high risk group.

While high risk factors vary from disease to disease, in general, these include (1) prior occurrence of the disease in one or more members of the same family, or, in the case of a contagious disease, in individuals with whom the subject has come into potentially contagious contact at a time when the earlier victim was likely to be contagious, (2) a prior occurrence of the disease in the subject, (3) prior occurrence of a related disease, or a condition known to increase the likelihood of the disease, in the subject; (4) appearance of a suspicious level of a marker of the disease, or a related disease or condition; (5) a subject who is immunologically compromised, e.g., by radiation treatment, HIV infection, drug use, etc., or (6) membership in a particular group (e.g., a particular age, sex, race, ethnic group, etc.) which has been epidemiologically associated with that disease.

In some cases, it may be desirable to provide prophylaxis for the general population, and not just a high risk group. This is most likely to be the case when essentially all are at risk of contracting the disease, the effects of the disease are serious, the therapeutic index of the prophylactic agent is high, and the cost of the agent is low.

A prophylaxis or treatment may be curative, that is, directed at the underlying cause of a disease, or ameliorative, that is, directed at the symptoms of the disease, especially those which reduce the quality of life. It should also be understood that to be useful, the protection provided need not be absolute, provided that it is sufficient to carry clinical value.
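The comparison of incidence between treated and untreated subjects of the same high risk group, described above, can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed methods; the function name and the case counts are hypothetical, and it reports only the fractional reduction in incidence, without any test of statistical significance.

```python
def relative_incidence_reduction(treated_cases, treated_total,
                                 untreated_cases, untreated_total):
    """Fractional reduction in disease incidence in treated subjects,
    relative to untreated subjects of the same high risk group."""
    if treated_total <= 0 or untreated_total <= 0:
        raise ValueError("group sizes must be positive")
    treated_incidence = treated_cases / treated_total
    untreated_incidence = untreated_cases / untreated_total
    if untreated_incidence == 0:
        raise ValueError("untreated incidence is zero; reduction is undefined")
    return 1.0 - treated_incidence / untreated_incidence


# Hypothetical counts: 12 of 200 treated and 30 of 200 untreated subjects
# develop the disease during follow-up.
reduction = relative_incidence_reduction(12, 200, 30, 200)
print(f"relative reduction in incidence: {reduction:.0%}")
```

A positive value indicates lower incidence in the treated group; a value below 1.0 reflects protection that is partial rather than absolute.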
An agent which provides protection to a lesser degree than do competitive agents may still be of value if the other agents are ineffective for a particular individual, if it can be used in combination with other agents to enhance the level of protection, or if it is safer than competitive agents.

It is desirable that there be a statistically significant (p = 0.05 or less) improvement in the treated subject relative to an appropriate untreated control, and it is desirable that this improvement be at least 10%, more preferably at least 25%, still more preferably at least 50%, even more preferably at least 100%, in some indicia of the incidence or severity of the disease or of at least one symptom of the disease.

At least one of the drugs of the present invention may be administered, by any means that achieve their intended purpose, to protect a subject against a disease or other adverse condition. The form of administration may be systemic or topical. For example, administration of such a composition may be by various parenteral routes such as subcutaneous, intravenous, intradermal, intramuscular, intraperitoneal, intranasal, transdermal, or buccal routes. Alternatively, or concurrently, administration may be by the oral route. Parenteral administration can be by bolus injection or by gradual perfusion over time.

A typical regimen comprises administration of an effective amount of the drug, administered over a period ranging from a single dose, to dosing over a period of hours, days, weeks, months, or years.

It is understood that the suitable dosage of a drug of the present invention will be dependent upon the age, sex, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired. However, the most preferred dosage can be tailored to the individual subject, as is understood and
determinable by one of skill in the art, without undue experimentation. This will typically involve adjustment of a standard dose, e.g., reduction of the dose if the patient has a low body weight.

Prior to use in humans, a drug will first be evaluated for safety and efficacy in laboratory animals. In human clinical studies, one would begin with a dose expected to be safe in humans, based on the preclinical data for the drug in question, and on customary doses for analogous drugs (if any). If this dose is effective, the dosage may be decreased, to determine the minimum effective dose, if
desired. If this dose is ineffective, it will be cautiously increased, with the patients monitored for signs of side effects. See, e.g., Berkow et al., eds., The Merck Manual, 15th edition, Merck and Co., Rahway, N.J., 1987; Goodman et al., eds., Goodman and Gilman's The Pharmacological Basis of Therapeutics, 8th edition, Pergamon Press, Inc., Elmsford, N.Y., (1990); Avery's Drug Treatment: Principles and Practice of Clinical Pharmacology and Therapeutics, 3rd edition, ADIS Press, LTD., Williams and Wilkins, Baltimore, MD. (1987); Ebadi, Pharmacology, Little, Brown and Co., Boston, (1985), which references, and references cited therein, are entirely incorporated herein by reference.

The total dose required for each treatment may be administered by multiple doses or in a single dose. The protein may be administered alone or in conjunction with other therapeutics directed to the disease or directed to other symptoms thereof.

Typical pharmaceutical doses, for adult humans, are in the range of 1 ng to 10 g per day, more often 1 mg to 1 g per day.

The appropriate dosage form will depend on the disease, the pharmaceutical, and the mode of administration; possibilities include tablets, capsules, lozenges, dental pastes, suppositories, inhalants, solutions, ointments and parenteral depots. See, e.g., Berkow, supra, Goodman, supra, Avery, supra and Ebadi, supra, which are entirely incorporated herein by reference, including all references cited therein.

In the case of peptide drugs, the drug may be administered in the form of an expression vector comprising a nucleic acid encoding the peptide; such a vector, after incorporation into the genetic complement of a cell of the patient, directs synthesis of the peptide. Suitable vectors include genetically engineered poxviruses (vaccinia), adenoviruses, adeno-associated viruses, herpesviruses and lentiviruses which are or have been rendered nonpathogenic.
In addition to at least one drug as described herein, a pharmaceutical composition may contain suitable pharmaceutically acceptable carriers, such as excipients, carriers and/or auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. See, e.g., Berkow, supra, Goodman, supra, Avery, supra and Ebadi, supra, which are entirely incorporated herein by reference, including all references cited therein.
Assay Compositions and Methods

Target Organism

The invention contemplates that it may be appropriate to ascertain or to mediate the biological activity of a substance of this invention in a target organism. The target organism may be a plant, animal, or microorganism.

In the case of a plant, it may be an economic plant, in which case the drug may be intended to increase the disease, weather or pest resistance, alter the growth characteristics, or otherwise improve the useful characteristics or mute undesirable characteristics of the plant. Or it may be a weed, in which case the drug may be intended to kill or otherwise inhibit the growth of the plant, or to alter its characteristics to convert it from a weed to an economic plant. The plant may be a tree, shrub, crop, grass, etc. The plant may be an alga (algae are in some cases also microorganisms), or a vascular plant, especially gymnosperms (particularly conifers) and angiosperms. Angiosperms may be monocots or dicots. The plants of greatest interest are rice, wheat, corn, alfalfa, soybeans, potatoes, peanuts, tomatoes, melons, apples, pears, plums, pineapples, fir, spruce, pine, cedar, and oak.

If the target organism is a microorganism, it may be algae, bacteria, fungi, or a virus (although the biological activity of a virus must be determined in a virus-infected cell). The microorganism may be a human or other animal or plant pathogen, or it may be nonpathogenic. It may be a soil or water organism, or one which normally lives inside other living things.

If the target organism is an animal, it may be a vertebrate or a nonvertebrate animal. Nonvertebrate animals are chiefly of interest when they act as pathogens or parasites, and the drugs are intended to act as biocidic or biostatic agents. Nonvertebrate animals of interest include worms, mollusks, and arthropods.

The target organism may also be a vertebrate animal, i.e., a mammal, bird, reptile, fish or amphibian.
Among mammals, the target animal preferably belongs to the order Primata (humans, apes and monkeys), Artiodactyla (e.g., cows, pigs, sheep, goats, horses), Rodentia (e.g., mice, rats), Lagomorpha (e.g., rabbits, hares), or Carnivora (e.g., cats, dogs). Among birds, the target animals are preferably of the orders Anseriformes (e.g., ducks, geese, swans) or Galliformes (e.g., quails, grouse, pheasants, turkeys and chickens). Among fish, the target animal is preferably of the order Clupeiformes (e.g., sardines, shad, anchovies, whitefish, salmon).
Target Tissues

The term "target tissue" refers to any whole animal, physiological system, whole organ, part of organ, miscellaneous tissue, cell, or cell component (e.g., the cell membrane) of a target animal in which biological activity may be measured.

Routinely in mammals one would choose to compare and contrast the biological impact on virtually any and all tissues which express the subject receptor protein. The main tissues to use are: brain, heart, lung, kidney, liver, pancreas, skin, intestines, adipose, stomach, skeletal muscle, adrenal glands, breast, prostate, vasculature, retina, cornea, thyroid gland, parathyroid glands, thymus, bone marrow, bone, etc.

Another classification would be by cell type: B cells, T cells, macrophages, neutrophils, eosinophils, mast cells, platelets, megakaryocytes, erythrocytes, bone marrow stromal cells, fibroblasts, neurons, astrocytes, neuroglia, microglia, epithelial cells (from any organ, e.g., skin, breast, prostate, lung, intestines, etc.), cardiac muscle cells, smooth muscle cells, striated muscle cells, osteoblasts, osteocytes, chondroblasts, chondrocytes, keratinocytes, melanocytes, etc.

Of course, in the case of a unicellular organism, there is no distinction between the "target organism" and the "target tissue".

Screening Assays

Assays intended to determine the binding or the biological activity of a substance are called preliminary screening assays. Screening assays will typically be either in vitro (cell-free) assays (for binding to an immobilized receptor) or cell-based assays (for alterations in the phenotype of the cell). They will not involve screening of whole multicellular organisms, or isolated organs. The comments on diagnostic biological assays apply mutatis mutandis to screening cell-based assays.
In Vitro vs. In Vivo Assays

The term in vivo is descriptive of an event, such as binding or enzymatic action, which occurs within a living organism. The organism in question may, however, be genetically modified. The term in vitro refers to an event which occurs outside a living organism. Parts of an organism (e.g., a membrane, or an isolated biochemical) are used, together with artificial substrates and/or conditions. For the purpose of the present invention, the term in vitro excludes events occurring inside or on an intact cell, whether of a unicellular or multicellular organism.

In vivo assays include both cell-based assays and organismic assays. The cell-based assays include both assays on unicellular organisms, and assays on isolated cells or cell cultures derived from multicellular organisms. The cell cultures may be mixed, provided that they are not organized into tissues or organs. The term organismic assay refers to assays on whole multicellular organisms, and assays on isolated organs or tissues of such organisms.
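The terminology defined above amounts to a small classification scheme, which the following Python sketch restates. It is offered only as an illustrative summary; the enumeration members and the function name are hypothetical labels introduced here, not terms used elsewhere in this specification.

```python
from enum import Enum
from typing import Optional, Tuple


class AssaySystem(Enum):
    """Hypothetical labels for the kinds of test system named in the text."""
    ISOLATED_BIOCHEMICAL = "isolated biochemical or membrane preparation"
    UNICELLULAR_ORGANISM = "intact unicellular organism"
    ISOLATED_CELLS = "isolated cells or unorganized cell culture"
    ISOLATED_ORGAN_OR_TISSUE = "isolated organ or tissue"
    WHOLE_ORGANISM = "whole multicellular organism"


def classify_assay(system: AssaySystem) -> Tuple[str, Optional[str]]:
    """Return (in vitro / in vivo, subtype) following the definitions above."""
    # Cell-free material is in vitro; anything on an intact cell is excluded.
    if system is AssaySystem.ISOLATED_BIOCHEMICAL:
        return ("in vitro", None)
    # Intact cells, whether unicellular organisms or isolated cells or
    # cultures not organized into tissues, give cell-based in vivo assays.
    if system in (AssaySystem.UNICELLULAR_ORGANISM, AssaySystem.ISOLATED_CELLS):
        return ("in vivo", "cell-based")
    # Whole multicellular organisms and isolated organs or tissues.
    return ("in vivo", "organismic")


for system in AssaySystem:
    print(system.value, "->", classify_assay(system))
```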
In vitro Diagnostic Methods and Reagents
The in vitro assays of the present invention may be applied to any suitable analyte-containing sample, and may be qualitative or quantitative in nature.

Sample

The sample will normally be a biological fluid, such as blood, urine, lymph, semen, milk, or cerebrospinal fluid, or a fraction or derivative thereof, or a biological tissue, in the form of, e.g., a tissue section or homogenate. However, the sample conceivably could be (or derived from) a food or beverage, a pharmaceutical or diagnostic composition, soil, or surface or ground water. If a biological fluid or tissue, it may be taken from a human or other mammal, vertebrate or animal, or from a plant. The preferred sample is blood, or a fraction or derivative thereof.
Binding and Reaction Assays

The assay may be a binding assay, in which one step involves the binding of a diagnostic reagent to the analyte, or a reaction assay, which involves the reaction of a reagent with the analyte.

The reagents used in a binding assay may be classified as to the nature of their interaction with analyte: (1) analyte analogues, or (2) analyte binding molecules (ABM). They may be labeled or insolubilized.

In a reaction assay, the assay may look for a direct reaction between the analyte and a reagent which is reactive with the analyte, or, if the analyte is an enzyme or enzyme inhibitor, for a reaction catalyzed or inhibited by the analyte. The reagent may be a reactant, a catalyst, or an inhibitor for the reaction.

An assay may involve a cascade of steps in which the product of one step acts as the target for the next step. These steps may be binding steps, reaction steps, or a combination thereof.
Signal Producing System (SPS)

In order to detect the presence, or measure the amount, of an analyte, the assay must provide for a signal producing system (SPS) in which there is a detectable difference in the signal produced, depending on whether the analyte is present or absent (or, in a quantitative assay, on the amount of the analyte). The detectable signal may be one which is visually detectable, or one detectable only with instruments. Possible signals include production of colored or luminescent products, alteration of the characteristics (including amplitude or polarization) of absorption or emission of radiation by an assay component or product, and precipitation or agglutination of a component or product. The term "signal" is intended to include the discontinuance of an existing signal, or a change in the rate of change of an observable parameter, rather than a change in its absolute value. The signal may be monitored manually or automatically.

In a reaction assay, the signal is often a product of the reaction. In a binding assay, it is normally provided by a label borne by a labeled reagent.
Labels

The component of the signal producing system which is most intimately associated with the diagnostic reagent is called the "label". A label may be, e.g., a radioisotope, a fluorophore, an enzyme, a co-enzyme, an enzyme substrate, an electron-dense compound, or an agglutinable particle.

The radioactive isotope can be detected by such means as the use of a gamma counter or a scintillation counter or by autoradiography. Isotopes which are particularly useful for the purpose of the present invention include 3H, 125I, 131I, 35S, 14C, 32P and 33P. 125I is preferred for antibody labeling.

The label may also be a fluorophore. When the fluorescently labeled reagent is exposed to light of the proper wavelength, its presence can then be detected due to fluorescence. Among the most commonly used fluorescent labeling compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

Alternatively, fluorescence-emitting metals such as 152Eu, or others of the lanthanide series, may be incorporated into a diagnostic reagent using such metal chelating groups as diethylenetriaminepentaacetic acid (DTPA) or ethylenediamine-tetraacetic acid (EDTA).

The label may also be a chemiluminescent compound. The presence of the chemiluminescently labeled reagent is then determined by detecting the presence of luminescence that arises during the course of a chemical reaction. Examples of particularly useful chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester.

Likewise, a bioluminescent compound may be used for labeling. Bioluminescence is a type of chemiluminescence found in biological systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The presence of a bioluminescent protein is determined by detecting the presence of luminescence.
Important bioluminescent compounds for purposes of labeling are luciferin, luciferase and aequorin.

Enzyme labels, such as horseradish peroxidase and alkaline phosphatase, are preferred. When an enzyme label is used, the signal producing system must also include a substrate for the enzyme. If the enzymatic reaction product is not itself detectable, the SPS will include one or more additional reactants so that a detectable product appears.

An enzyme analyte may act as its own label if an enzyme inhibitor is used as a diagnostic reagent.
Binding Assay Formats

Binding assays may be divided into two basic types, heterogeneous and homogeneous. In heterogeneous assays, the interaction between the affinity molecule and the analyte does not affect the label; hence, to determine the amount or presence of analyte, bound label must be separated from free label. In homogeneous assays, the interaction does affect the activity of the label, and therefore analyte levels can be deduced without the need for a separation step.

In one embodiment, the ABM is insolubilized by coupling it to a macromolecular support, and analyte in the sample is allowed to compete with a known quantity of a labeled or specifically labelable analyte analogue. The "analyte analogue" is a molecule capable of competing with analyte for binding to the ABM, and the term is intended to include analyte itself. It may be labeled already, or it may be labeled subsequently by specifically binding the label to a moiety differentiating the analyte analogue from analyte. The solid and liquid phases are separated, and the labeled analyte analogue in one phase is quantified. The higher the level of analyte analogue in the solid phase, i.e., sticking to the ABM, the lower the level of analyte in the sample.

In a "sandwich assay", both an insolubilized ABM and a labeled ABM are employed. The analyte is captured by the insolubilized ABM and is tagged by the labeled ABM, forming a ternary complex. The reagents may be added to the sample in either order, or simultaneously. The ABMs may be the same or different. The amount of labeled ABM in the ternary complex is directly proportional to the amount of analyte in the sample.

The two embodiments described above are both heterogeneous assays. However, homogeneous assays are conceivable. The key is that the label be affected by whether or not the complex is formed.

Conjugation Methods

A label may be conjugated, directly or indirectly (e.g., through a labeled anti-ABM antibody), covalently (e.g., with SPDP) or noncovalently, to the ABM, to produce a diagnostic reagent. Similarly, the ABM may be conjugated to a solid phase support to form a solid phase ("capture") diagnostic reagent.
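Returning to the sandwich format described under Binding Assay Formats, the stated proportionality between labeled ABM in the ternary complex and analyte in the sample suggests how a reading could be converted into an analyte amount. The Python sketch below is a simplified illustration under that assumption of strict proportionality; the function names and calibrator values are hypothetical, and practical assays commonly require a more elaborate calibration curve.

```python
def fit_slope_through_origin(concentrations, signals):
    """Least-squares slope of signal versus analyte concentration for a
    line through the origin (signal assumed directly proportional)."""
    if len(concentrations) != len(signals) or not concentrations:
        raise ValueError("need matching, non-empty calibrator lists")
    sum_cc = sum(c * c for c in concentrations)
    if sum_cc == 0:
        raise ValueError("at least one calibrator must be nonzero")
    sum_cs = sum(c * s for c, s in zip(concentrations, signals))
    return sum_cs / sum_cc


def estimate_analyte(signal, slope):
    """Convert a measured label signal into an analyte concentration."""
    if slope <= 0:
        raise ValueError("slope must be positive for a sandwich assay")
    return signal / slope


# Hypothetical calibrators (arbitrary concentration and signal units).
slope = fit_slope_through_origin([1.0, 2.0, 4.0], [10.0, 20.0, 40.0])
print(estimate_analyte(25.0, slope))
```

In the competitive format, by contrast, the solid-phase signal falls as analyte rises, so a calibration for that format would have to be decreasing rather than proportional.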
Suitable supports include glass, polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, agaroses, and magnetite. The nature of the carrier can be either soluble to some extent or insoluble for the purposes of the present invention. The support material may have virtually any possible structural configuration so long as the coupled molecule is capable of binding to its target. Thus the support configuration may be spherical, as in a bead, or cylindrical, as in the inside surface of a test tube, or the external surface of a rod. Alternatively, the surface may be flat such as a sheet, test strip, etc.

Biological Assays

A biological assay measures or detects a biological response of a biological entity to a substance.

The biological entity may be a whole organism, an isolated organ or tissue, freshly isolated cells, an immortalized cell line, or a subcellular component (such as a membrane; this term should not be construed as including an isolated receptor). The entity may be, or may be derived from, an organism which occurs in nature, or which is modified in some way. Modifications may be genetic (including radiation and chemical mutants, and genetic engineering) or somatic (e.g., surgical, chemical, etc.). In the case of a multicellular entity, the modifications may affect some or all cells. The entity need not be the target organism, or a derivative thereof, if there is a reasonable correlation between bioassay activity in the assay entity and biological activity in the target organism.

The entity is placed in a particular environment, which may be more or less natural. For example, a culture medium may, but need not, contain serum or serum substitutes, and it may, but need not, include a support matrix of some kind; it may be still or agitated. It may contain particular biological or chemical agents, or have particular physical parameters (e.g., temperature), that are intended to nourish or challenge the biological entity.

There must also be a detectable biological marker for the response.
At the cellular level, the most common markers are cell survival and proliferation, cell behavior (clustering, motility), cell morphology (shape, color), and biochemical activity (overall DNA synthesis, overall protein synthesis, and specific metabolic activities, such as utilization of particular nutrients, e.g., consumption of oxygen, production of CO2, production of organic acids, uptake or discharge of ions).

The direct signal produced by the biological marker may be transformed by a signal producing system into a different signal which is more observable, for example, a fluorescent or colorimetric signal.

The entity, environment, marker and signal producing system are chosen to achieve a clinically acceptable level of sensitivity, specificity and accuracy.

In some cases, the goal will be to identify substances which mediate the biological activity of a natural biological entity, and the assay is carried out directly with that entity. In other cases, the biological entity is used simply as a model of some more complex (or otherwise inconvenient to work with) biological entity. In that event, the model biological entity is used because activity in the model system is considered more predictive of activity in the ultimate natural biological entity than is simple binding activity in an in vitro system. The model entity is used instead of the ultimate entity because the latter is more expensive or slower to work with, or because ethical considerations forbid working with the ultimate entity yet.

The model entity may be naturally occurring, if the model entity usefully models the ultimate entity under some conditions. Or it may be non-naturally occurring, with modifications that increase its resemblance to the ultimate entity.

Transgenic animals, such as transgenic mice, rats, and rabbits, have been found useful as model systems.
In cell-based model assays, where the biological activity is mediated by binding to a receptor (target protein), the receptor may be functionally connected to a signal (biological marker) producing system, which may be endogenous or exogenous to the cell. There are a number of techniques of doing this.

"Zero-Hybrid" Systems

In these systems, the binding of a peptide to the target protein results in a screenable or selectable phenotypic change, without resort to fusing the target protein (or a ligand binding moiety thereof) to an endogenous protein. It may be that the target protein is endogenous to the host cell, or is substantially identical to an endogenous receptor so that it can take advantage of the latter's native signal transduction pathway. Or sufficient elements of the signal transduction pathway normally associated with the target protein may be engineered into the cell so that the cell signals binding to the target protein.

"One-Hybrid" Systems

In these systems, a chimeric receptor, a hybrid of the target protein and an endogenous receptor, is used. The chimeric receptor has the ligand binding characteristics of the target protein and the signal transduction characteristics of the endogenous receptor. Thus, the normal signal transduction pathway of the endogenous receptor is subverted. Preferably, the endogenous receptor is inactivated, or the conditions of the assay avoid activation of the endogenous receptor, to improve the signal-to-noise ratio. See Fowlkes, USP 5,789,184, for a yeast system.

Another type of "one-hybrid" system combines a peptide:DNA-binding domain fusion with an unfused target receptor that possesses an activation domain.

"Two-Hybrid" System

In a preferred embodiment, the cell-based assay is a two-hybrid system. This term implies that the ligand is incorporated into a first hybrid protein, and the receptor into a second hybrid protein. The first hybrid also comprises component A of a signal generating system, and the second hybrid comprises component B of that system. Components A and B, by themselves, are insufficient to generate a signal. However, if the ligand binds the receptor, components A and B are brought into sufficiently close proximity so that they can cooperate to generate a signal.

Components A and B may naturally occur, or be substantially identical to moieties which naturally occur, as components of a single naturally occurring biomolecule, or they may naturally occur, or be substantially identical to moieties which naturally occur, as separate naturally occurring biomolecules which interact in nature.
Two-Hybrid System: Transcription Factor Type

In a preferred "two-hybrid" embodiment, one member of a peptide ligand:receptor binding pair is expressed as a fusion to a DNA-binding domain (DBD) from a transcription factor (this fusion protein is called the "bait"), and the other is expressed as a fusion to a transactivation domain (TAD) (this fusion protein is called the "fish", the "prey", or the "catch"). The transactivation domain should be complementary to the DNA-binding domain, i.e., it should interact with the latter so as to activate transcription of a specially designed reporter gene that carries a binding site for the DNA-binding domain. Naturally, the two fusion proteins must likewise be complementary.

This complementarity may be achieved by use of the complementary and separable DNA-binding and transcriptional activator domains of a single transcriptional activator protein, or one may use complementary domains derived from different proteins. The domains may be identical to the native domains, or mutants thereof. The assay members may be fused directly to the DBD or TAD, or fused through an intermediate linker.

The target DNA operator may be the native operator sequence, or a mutant operator. Mutations in the operator may be coordinated with mutations in the DBD and the TAD. An example of a suitable transcription activation system is one comprising the DNA-binding domain from the bacterial repressor LexA and the activation domain from the yeast transcription factor Gal4, with the reporter gene operably linked to the LexA operator.

It is not necessary to employ the intact target receptor; just the ligand-binding moiety is sufficient.

The two fusion proteins may be expressed from the same or different vectors. Likewise, the activatable reporter gene may be expressed from the same vector as either fusion protein (or both proteins), or from a third vector.
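The logic of the transcription factor type two-hybrid readout described above can be summarized as a simple truth condition: the reporter is transcribed only when a DBD fusion and a TAD fusion are both present and their fused assay members bind one another. The Python sketch below illustrates that condition only; the class, function and partner names are hypothetical, and the model ignores quantitative effects such as expression level or operator number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    partner: str  # the fused assay member, e.g. a ligand or a receptor
    domain: str   # "DBD" for the bait, "TAD" for the prey


def reporter_active(bait, prey, interacting_pairs):
    """True only if the bait carries the DBD, the prey carries the TAD,
    and their fused partners form a known binding pair."""
    if bait.domain != "DBD" or prey.domain != "TAD":
        return False  # the DBD and TAD must both be supplied
    return frozenset((bait.partner, prey.partner)) in interacting_pairs


# Hypothetical binding pair used only for illustration.
pairs = {frozenset(("ligand", "receptor"))}
print(reporter_active(Fusion("ligand", "DBD"), Fusion("receptor", "TAD"), pairs))
print(reporter_active(Fusion("ligand", "DBD"), Fusion("unrelated", "TAD"), pairs))
```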
Potential DNA-binding domains include Gal4, LexA, and mutant domains substantially identical to the above.

Potential activation domains include E. coli B42, Gal4 activation domain II, and HSV VP16, and mutant domains substantially identical to the above.

Potential operators include the native operators for the desired activation domain, and mutant domains substantially identical to the native operator.

The fusion proteins may comprise nuclear localization signals.

The assay system will include a signal producing system, too. The first element of this system is a reporter gene operably linked to an operator responsive to the DBD and TAD of choice. The expression of this reporter gene will result, directly or indirectly, in a selectable or screenable phenotype (the signal). The signal producing system may include, besides the reporter gene, additional genetic or biochemical elements which cooperate in the production of the signal. Such an element could be, for example, a selective agent in the cell growth medium. There may be more than one signal producing system, and the system may include more than one reporter gene.

The sensitivity of the system may be adjusted by, e.g., use of competitive inhibitors of any step in the activation or signal production process, increasing or decreasing the number of operators, using a stronger or weaker DBD or TAD, etc.

When the signal is the death or survival of the cell in question, or proliferation or nonproliferation of the cell in question, the assay is said to be a selection. When the signal merely results in a detectable phenotype by which the signaling cell may be differentiated from the same cell in a nonsignaling state (either way being a living cell), the assay is a screen. However, the term "screening assay" may be used in a broader sense to include a selection. When the narrower sense is intended, we will use the term "nonselective screen".
Various screening and selection systems are discussed in Ladner, USP 5,198,346. Screening and selection may be for or against the peptide:target protein or compound:target protein interaction. Preferred assay cells are microbial (bacterial, yeast, algal, protozoal), invertebrate, vertebrate (esp. mammalian, particularly human). The best developed two-hybrid assays are yeast and mammalian systems. Normally, two-hybrid assays are used to determine whether a protein X and a protein Y interact, by virtue of their ability to reconstitute the interaction of the DBD and the TAD. However, augmented two-hybrid assays have been used to detect interactions that depend on a third, non-protein ligand. For more guidance on two-hybrid assays, see Brent and Finley, Jr., Ann. Rev. Genet., 31:663-704 (1997); Fremont-Racine, et al., Nature Genetics, 277-281 (16 July 1997); Allen, et al., TIBS, 511-16 (Dec. 1995); LeCrenier, et al., BioEssays, 20:1-6 (1998); Xu, et al., Proc. Nat. Acad. Sci. (USA), 94:12473-8 (Nov. 1992); Estojak, et al., Mol. Cell. Biol., 15:5820-9 (1995); Yang, et al., Nucleic Acids Res., 23:1152-6 (1995); Bendixen, et al., Nucleic Acids Res., 22:1778-9 (1994); Fuller, et al., BioTechniques, 25:85-92 (July 1998); Cohen, et al., PNAS (USA) 95:14272-7 (1998); Kolonin and Finley, Jr., PNAS (USA) 95:14266-71 (1998). See also Vasavada, et al., PNAS (USA), 88:10686-90 (1991) (contingent replication assay), and Rehrauer, et al., J. Biol. Chem., 271:23865-73 (1996) (LexA repressor cleavage assay).
Two-Hybrid Systems: Reporter Enzyme Type
In another embodiment, the components A and B reconstitute an enzyme which is not a transcription factor. As in the last example, the effect of the reconstitution of the enzyme is a phenotypic change which may be a screenable change, a selectable change, or both.

In vivo Diagnostic Uses
Radio-labeled ABM may be administered to the human or animal subject. Administration is typically by injection, e.g., intravenous or arterial or other means of administration in a quantity sufficient to permit subsequent dynamic and/or static imaging using suitable radio-detecting devices. The dosage is the smallest amount capable of providing a diagnostically effective image, and may be determined by means conventional in the art, using known radio-imaging agents as a guide. Typically, the imaging is carried out on the whole body of the subject, or on that portion of the body or organ relevant to the condition or disease under study. The amount of radio-labeled ABM accumulated at a given point in time in relevant target organs can then be quantified. A particularly suitable radio-detecting device is a scintillation camera, such as a gamma camera. A scintillation camera is a stationary device that can be used to image distribution of radio-labeled ABM. The detection device in the camera senses the radioactive decay, the distribution of which can be recorded. Data produced by the imaging system can be digitized. The digitized information can be analyzed over time discontinuously or continuously. The digitized data can be processed to produce images, called frames, of the pattern of uptake of the radio-labeled ABM in the target organ at a discrete point in time. In most continuous (dynamic) studies, quantitative data is obtained by observing changes in distributions of radioactive decay in target organs over time. In other words, a time-activity analysis of the data will illustrate uptake through clearance of the radio-labeled binding protein by the target organs with time.
Various factors should be taken into consideration in selecting an appropriate radioisotope. The radioisotope must be selected with a view to obtaining good quality resolution upon imaging, should be safe for diagnostic use in humans and animals, and should preferably have a short physical half-life so as to decrease the amount of radiation received by the body. The radioisotope used should preferably be pharmacologically inert, and, in the quantities administered, should not have any substantial physiological effect. The ABM may be radio-labeled with different isotopes of iodine, for example 123I, 125I, or 131I (see for example, U.S. Patent 4,609,725). The extent of radio-labeling must, however, be monitored, since it will affect the calculations made based on the imaging results (i.e., a diiodinated ABM will result in twice the radiation count of a similar monoiodinated ABM over the same time frame). In applications to human subjects, it may be desirable to use radioisotopes other than 125I for labeling in order to decrease the total dosimetry exposure of the human body and to optimize the detectability of the labeled molecule (though this radioisotope can be used if circumstances require). Ready availability for clinical use is also a factor. Accordingly, for human applications, preferred radio-labels are, for example, 99mTc, 67Ga, 68Ga, 90Y, 111In, 113mIn, 123I, 186Re, 188Re or 211At. The radio-labeled ABM may be prepared by various methods. These include radio-halogenation by the chloramine-T method or the lactoperoxidase method and subsequent purification by HPLC (high pressure liquid chromatography), for example as described by J. Gutkowska et al. in "Endocrinology and Metabolism Clinics of America" (1987) 16(1):183. Other known methods of radio-labeling can be used, such as IODOBEADS™.
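The remark that a diiodinated ABM yields twice the count of a monoiodinated one can be illustrated with a short sketch. It assumes, purely for illustration, that counts scale linearly with the average number of radio-iodine atoms per ABM molecule; the function name and the numbers are hypothetical and are not taken from the specification.

```python
def counts_per_labeled_molecule(counts, iodines_per_molecule):
    """Scale a measured radiation count by the labeling degree.

    Assumption (illustrative only): counts are proportional to the average
    number of radio-iodine atoms carried by each ABM molecule.
    """
    if iodines_per_molecule <= 0:
        raise ValueError("labeling degree must be positive")
    return counts / iodines_per_molecule


# Hypothetical counts over the same time frame.
mono = counts_per_labeled_molecule(1000.0, 1)  # monoiodinated ABM
di = counts_per_labeled_molecule(2000.0, 2)    # diiodinated ABM
```

After scaling, the two hypothetical preparations report the same value, which is why the extent of labeling needs to be known before counts from different preparations are compared.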
There are a number of different methods of delivering the radio-labeled ABM to the end-user. It may be administered by any means that enables the active agent to reach the agent's site of action in the body of a mammal. Because proteins are subject to being digested when administered orally, parenteral administration, i.e., intravenous, subcutaneous, intramuscular, would ordinarily be used to optimize absorption of an ABM, such as an antibody, which is a protein.
EXAMPLES
We are utilizing a mouse model of diet-induced obesity that progresses to diabetes. The diet is high in fat and has been documented to lead to diabetes in C57BL/6J mice (Surwit et al., 1988). After weaning, C57BL/6J mice were fed either the high fat diet or a standard lab chow diet for 16 weeks. Body weight was monitored bi-weekly. Fasting glucose and insulin levels were measured after 2, 4, 8, and 16 weeks on the diets. At each time point, several diabetic and control mice were sacrificed and a number of tissues collected. For further analysis, RNA was extracted from the gastrocnemius muscles at each time point and used in DNA microarray analyses.
Animal Models. Obesity and subsequent hyperinsulinemia and hyperglycemia were induced by feeding a group of 3 week old mice (50 C57BL/6 males) a high-fat diet (Bio-Serve, Frenchtown, NJ, #F1850 High Carbohydrate-High Fat; 56% of calories from fat, 16% from protein and 27% from carbohydrates). Another group of 3 week old mice (20 C57BL/6 males) were fed the normal control diet (PMI Nutrition International Inc., Brentwood, MO, Prolab RMH 3000; 14% of calories from fat, 16% from protein and 60% from carbohydrates). The mice were placed onto the respective diets immediately following weaning. Animal weights were determined weekly. Fasting blood glucose and plasma insulin measurements were determined after 2, 4, 8 and 16 weeks on the respective diets. The day after obtaining body weight measurements at the indicated time points, mice were fasted 8 hours and blood glucose concentrations were measured via tail blood samples using a One Touch Glucometer (Lifescan). For insulin measurements, blood was collected into heparinized tubes, plasma obtained by centrifugation and insulin concentrations determined using an Ultra-Sensitive Rat Insulin ELISA kit (ALPCO) as instructed by the manufacturer.
Values were adjusted by a factor of 1.23 as determined by the manufacturer to correct for species difference in cross-reactivity with the antibody (bottom panel). Results reflect mean ± SE of 50 mice on the HF diet and 20 mice on the Std diet. Normal weight, normal fasting blood glucose and normal fasting plasma insulin levels are defined as the respective mean values of the animals fed the control diet. Two of the "most typical" animals were selected for each group (control, hyperinsulinemic and diabetic) at each time point (2, 4, 8, and 16 weeks after commencement of diet) for sacrifice. The selected mice were sacrificed and muscle tissue obtained and immediately processed for RNA isolation.
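As a minimal sketch of the arithmetic described above, the following applies the 1.23 cross-reactivity factor and reports mean ± SE. The text does not say whether the factor multiplies or divides the kit reading; multiplication is assumed here, and the raw readings are hypothetical.

```python
import math
import statistics

CROSS_REACTIVITY_FACTOR = 1.23  # value stated in the text


def adjust_insulin(raw_values, factor=CROSS_REACTIVITY_FACTOR):
    """Apply the species correction (assumed here to be a multiplication)."""
    return [value * factor for value in raw_values]


def mean_and_se(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


raw_readings = [1.0, 2.0, 3.0]  # hypothetical ELISA readings
adjusted = adjust_insulin(raw_readings)
mean, se = mean_and_se(adjusted)
```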
Fasting Blood Glucose Levels. Blood glucose levels were measured from a drop of blood taken from the tip of the tail of fasted (8 hr) mice using a Lifescan Genuine One Touch glucometer. All measurements occurred between 2:00 pm and 5:00 pm.
Plasma insulin measurements. Blood was collected from the tail of fasted (8 hr) mice into a heparinized capillary tube and stored on ice. All collections occurred between 2:00 pm and 5:00 pm. Plasma was separated from red blood cells by centrifugation for 10 minutes at 8000 x g and then stored at -20°C. Insulin concentrations were determined using the Rat Insulin ELISA kit and rat insulin standards (ALPCO) essentially as instructed by the manufacturer. Values were adjusted by a factor of 1.23 as determined by the manufacturer to correct for the species difference in cross-reactivity with the antibody.
RNA isolation. Total RNA was isolated from muscle (skeletal muscle, specifically, gastrocnemius) of two mice at each time point during the progression of HF diet-induced type 2 diabetes, as well as age-matched controls on the Std diet, using the RNA STAT-60 Total RNA/mRNA Isolation Reagent according to the manufacturer's instructions (Tel-Test, Friendswood, TX).
Sample Quantification and Quality Assessment. Total RNA was quantified and assessed for quality on a Bioanalyzer RNA 6000 Nano chip (Agilent). Each chip contained an interconnected set of gel-filled channels that allowed for molecular sieving of nucleic acids. Pin-electrodes in the chip were used to create electrokinetic forces capable of driving molecules through these micro-channels to perform electrophoretic separations. Ribosomal peaks were measured by fluorescence signal and displayed in an electropherogram. A successful total RNA sample featured 2 distinct ribosomal peaks (18S and 28S rRNA).
Biotinylated cRNA Hybridization Target. Total RNA was prepared for use as a hybridization target as described in the manufacturer's instructions for CodeLink Expression Bioarrays (TM) (Amersham Biosciences). The CodeLink Expression Bioarrays utilize nucleic acid hybridization of a biotin-labeled complementary RNA (cRNA) target with DNA oligonucleotide probes attached to a gel matrix. The biotin-labeled cRNA target is prepared by a linear amplification method. Poly(A)+ RNA (within the total RNA population) is primed for reverse transcription by a DNA oligonucleotide containing a T7 RNA polymerase promoter 5' to a (dT)24 sequence. After second-strand cDNA synthesis, the cDNA serves as the template in an in vitro transcription (IVT) reaction to produce the target cRNA. The IVT is performed in the presence of biotinylated nucleotides to label the target cRNA. This procedure results in a 50-200 fold linear amplification of the input poly(A)+ RNA.
Hybridization Probes. The oligonucleotide probes were provided by the Codelink Uniset Mouse I Bioarray (Amersham, product code 300013). Amine-terminated oligonucleotide probes are attached to a three-dimensional polyacrylamide gel matrix. There are 10,000 oligonucleotide probes, each specific to a well-characterized mouse gene. Each mouse gene is representative of a unique gene cluster from the fourth quarter 2001 Genbank Unigene build. There are also 500 control probes. The sequences of the probes are proprietary to Amersham. However, for each probe, Amersham identifies the corresponding mouse gene by NCBI accession number, OGS, LocusLink, Unigene Cluster ID, and description (name). This information should be available from Amersham. In the case of the differentially expressed probes, this information is duplicated in Master Table 1. For the complete list, see http://www4.amershambiosciences.com/aptrix/upp01077.nsf/Content/codelink_literature. Under "Gene Lists", select "Uniset Mouse I", and a gene list, in Excel format, can be downloaded.
Hybridization
Using the cRNA target, the hybridization reaction mixture is prepared and loaded into array chambers for bioarray processing as set forth in the manufacturer's instructions for CodeLink Gene Expression Bioarrays™ (Amersham Biosciences). Each sample is hybridized to an individual microarray. Hybridization is at 37°C. The hybridization buffer is prepared as set forth in the Motorola instructions. Hybridization to the microarray is detected with an avidinated fluorescent reagent, Streptavidin-Alexa Fluor® 647 (Amersham).
Mouse Gene Expression Analysis
Processed arrays were scanned using a GenePix 4000B Microarray Scanner (Axon Instruments, Inc.); array images were acquired using the Amersham CodeLink™ Analysis Software (Release 2.2). The Amersham CodeLink™ Analysis Software gives an integrated optical density (IOD) value for every spot; a unique background value for that spot is subtracted, resulting in "raw" data points. Individual chips are then normalized by the Amersham CodeLink™ software according to the median raw intensity for all 10,000 genes. A negative control threshold (0.2) is also calculated according to the control probes. The expression data was analyzed to identify genes whose expression levels changed significantly with respect to:
Normal mice compared to hyperinsulinemic mice at 2, 4, 8 and 16 weeks on normal vs. high-fat diet.
Normal mice compared to hyperinsulinemic/hyperglycemic mice at 2, 4, 8 and 16 weeks on normal vs. high-fat diet.
Hyperinsulinemic compared to hyperinsulinemic/hyperglycemic mice at 2, 4, 8 and 16 weeks on high-fat diets.
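The background subtraction and per-chip median normalization described above can be sketched as follows. This is not the Amersham software's implementation: dividing by the median raw intensity and flagging spots above the 0.2 threshold are assumptions about how those steps are applied, and the spot values are hypothetical.

```python
import statistics

NEGATIVE_CONTROL_THRESHOLD = 0.2  # value stated in the text


def normalize_chip(iod_values, background_values):
    """Subtract each spot's background, then divide by the chip's median raw value."""
    raw = [iod - bg for iod, bg in zip(iod_values, background_values)]
    median_raw = statistics.median(raw)
    return [value / median_raw for value in raw]


def above_threshold(normalized, threshold=NEGATIVE_CONTROL_THRESHOLD):
    """Flag spots whose normalized intensity exceeds the negative control threshold."""
    return [value > threshold for value in normalized]


# Hypothetical IOD and background values for five spots on one chip.
iod = [120.0, 260.0, 510.0, 55.0, 1010.0]
background = [20.0, 60.0, 10.0, 50.0, 10.0]
normalized = normalize_chip(iod, background)
flags = above_threshold(normalized)
```

Normalizing each chip to its own median puts different arrays on a common scale before groups are compared at each time point.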
Database Searches
Nucleotide sequences and predicted amino acid sequences were compared to public domain databases using the Blast 2.0 program (National Center for Biotechnology Information, National Institutes of Health). Nucleotide sequences were displayed using ABI Prism Edit View 1.0.1 (PE Applied Biosystems, Foster City, CA). Nucleotide database searches were conducted with the then-current version of BLASTN 2.0.12, see Altschul, et al., "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res., 25:3389-3402 (1997). Searches employed the default parameters, unless otherwise stated. For blastN searches, the default was the blastN matrix (1,-3), with gap penalties of 5 for existence and 2 for extension. Protein database searches were conducted with the then-current version of BLAST X, see Altschul et al. (1997), supra. Searches employed the default parameters, unless otherwise stated. The scoring matrix was BLOSUM62, with gap costs of 11 for existence and 1 for extension. The standard low complexity filter was used.
"ref" indicates that NCBI's RefSeq is the source database. The identifier that follows is a RefSeq accession number, not a GenBank accession number. "RefSeq sequences are derived from GenBank and provide non-redundant curated data representing our current knowledge of known genes. Some records include additional sequence information that was never submitted to an archival database but is available in the literature. A small number of sequences are provided through collaboration; the underlying primary sequence data is available in GenBank, but may not be available in any one GenBank record. RefSeq sequences are not submitted primary sequences. RefSeq records are owned by NCBI and therefore can be updated as needed to maintain current annotation or to incorporate additional sequence information."
See also http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html
It will be appreciated by those in the art that the exact results of a database search will change from day to day, as new sequences are added. Also, if you query with a longer version of the original sequence, the results will change. The results given here were obtained at one time and no guarantee is made that the exact same hits would be obtained in a search on the filing date. However, if an alignment between a particular query sequence and a particular database sequence is discussed, that alignment should not change (if the parameters and sequences remain unchanged).
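To make the stated blastN defaults concrete (match +1, mismatch -3, gap existence 5, gap extension 2), the sketch below scores an alignment that has already been computed. It is an illustration only and not BLAST itself; it assumes a gap of length k costs existence + k × extension, and the sequences are hypothetical.

```python
def blastn_raw_score(aligned_query, aligned_subject,
                     match=1, mismatch=-3, gap_existence=5, gap_extension=2):
    """Raw score of a pre-computed pairwise alignment ('-' marks a gap)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_side = None  # which sequence currently carries an open gap, if any
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            side = "query" if q == "-" else "subject"
            if side != gap_side:
                score -= gap_existence  # charged once when a gap opens
            score -= gap_extension      # charged for every gap position
            gap_side = side
        else:
            gap_side = None
            score += match if q == s else mismatch
    return score


# Hypothetical alignment: six matches, one mismatch, one single-base gap.
example_score = blastn_raw_score("ACGTACGT", "ACG-ACCT")
```

With these parameters a single mismatch cancels three matches, which is one reason the nucleotide defaults favor nearly identical sequences.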
Northern Analysis. Northern analysis may be used to confirm the results. Favorable and unfavorable genes, identified as described above, or fragments thereof, will be used as probes in Northern hybridization analyses to confirm their differential expression. Total RNA isolated from subject mice will be resolved by agarose gel electrophoresis through a 1% agarose, 1% formaldehyde denaturing gel, transferred to positively charged nylon membrane, and hybridized to a probe labeled with [32P]dCTP that was generated from the aforementioned gene or fragment using the Random Primed DNA Labeling Kit (Roche, Palo Alto, CA), or to a probe labeled with digoxigenin (Roche Molecular Biochemicals, Indianapolis, IN), according to the manufacturer's instructions.
Real-Time RNA Analysis. Real-time RNA analysis may also be used for confirmation. For "real-time" RNA analysis, RNA will be converted to cDNA and then probed with gene-specific primers made for each clone. "Real-time" incorporation of fluorescent dye will be measured to determine the amount of specific transcript present in each sample. Sample differences (control vs. hyperinsulinemic, hyperinsulinemic vs. diabetic, or control vs. diabetic) will be evaluated. Confirmation using several independent animals is desirable.
In situ Hybridization
Another form of confirmation may be provided by nonisotopic in situ hybridizations (NISH) on selected human (obtained by Tissue Informatics) and mouse tissues using cRNA probes generated from mouse genes found to be up- or down-regulated during the disease progression. In situ hybridizations may also be performed on mouse tissues using cRNA probes generated from differentially expressed DNAs. These cRNAs will hybridize to their corresponding messenger RNAs present in cells and will provide information regarding the particular cell types within a tissue that are expressing the particular gene as well as the relative level of gene expression. The cRNA probes may be generated by in vitro transcription of template cDNA by Sp6 or T7 RNA polymerase in the presence of digoxigenin-11-UTP (Roche Molecular Biochemicals, Mannheim, Germany; Pardue, M.L., 1985. In: In situ hybridization, Nucleic acid hybridization, a practical approach: IRL Press, Oxford, 179-202).
Transgenic Animals. Transgenic expression may be used to confirm the results. In one embodiment, a mouse is engineered to overexpress the favorable or unfavorable mouse gene in question. In another embodiment, a mouse is engineered to express the corresponding favorable or unfavorable human gene. In a third embodiment, a nonhuman animal other than a mouse, such as a rat, rabbit, goat, sheep or pig, is engineered to express the favorable or unfavorable mouse or human gene.
Hyperquantitative Tissue Analysis
In addition to gene expression analysis, the tissue sections can also be analyzed using TissueInformatics, Inc.'s TissueAnalytics™ software. A single representative section may be cut from each tissue block, placed on a slide, and stained with H&E. Digital images of each slide may be acquired using a research microscope and digital camera (Olympus E600 microscope and Sony DKC-ST5). These images may be acquired at 20x magnification with a resolution of 0.64 µm/pixel. A hyperquantitative analysis may be performed on the resulting images. First, a digital image analysis can identify and annotate structural objects in a tissue using machine vision. These objects, which are constituents of the tissue, can be annotated because they are visually identifiable and have a biological meaning.
Subsequently, these structures may be quantified with respect to their geometric properties, such as area or stain intensity, and their relationship to the field of view or per unit area in terms of % coverage. Features or parameters for hyper-quantification are specific for each tissue, and may also include relations between features and measures of overall heterogeneity, including orientation, relative locations, and textures.
Correlation Analysis
Mathematical statistics provides a rich set of additional tools to analyze time resolved data sets of hyper-quantitative and gene expression profiles for similarities, including rank correlation, the calculation of regression and correlation coefficients, and clustering. Continuous functions may also be fitted through the data points of individual gene and tissue feature data. Relations between gene expression and hyper-quantitative tissue data may be linear or non-linear, in synchronous or asynchronous arrangements.
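Rank correlation, one of the tools named above, can be sketched in a few lines. The example computes a Spearman coefficient (Pearson correlation of average ranks) between a gene's expression and a tissue feature across the four time points; all values and variable names are hypothetical.

```python
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = (sum((a - mean_x) ** 2 for a in x)
                   * sum((b - mean_y) ** 2 for b in y)) ** 0.5
    return numerator / denominator


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values at 2, 4, 8 and 16 weeks.
expression = [1.00, 0.90, 0.75, 0.60]   # normalized expression of one gene
tissue_feature = [10.0, 12.0, 30.0, 31.0]  # e.g. % coverage of one structure
rho = spearman(expression, tissue_feature)
```

Because only ranks are used, the coefficient captures monotonic but non-linear relations, which matches the text's remark that relations may be linear or non-linear.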
Example 1
Obesity is increasing at an alarming rate in the United States. In parallel, the incidence of type II diabetes is also rising. We are interested in defining alterations in gene expression that correlate with the development of these conditions in the hopes of reversing these dangerous trends. Insulin plays a major role in regulating blood glucose levels. It stimulates the uptake of glucose in adipose tissue and striated muscle for storage as intracellular triglycerides and glycogen. Insulin also inhibits the release of glucose from the liver. Normally, this would prevent the rise in blood sugar concentration that occurs after eating. However, in the early stages of type 2 diabetes, resistance to insulin is seen. Muscle plays a major role in glucose metabolism. Thus, it also is a major contributor to the development of type 2 diabetes. In normal situations, muscle cells respond to increasing levels of insulin by increasing glucose uptake from the bloodstream. However, during the very early stages of type 2 diabetes, muscle tissue becomes resistant to insulin, requiring the pancreatic beta cells to increase insulin secretion. Eventually, the beta cells become unable to compensate for this increasing insulin resistance from muscle and other cells, and insulin production drops. Thus, clinical type 2 diabetes results from the combination of insulin resistance and impaired beta cell function. Defects in muscle glycogen synthesis are known to play a role in the development of insulin resistance (Petersen and Shulman, 2002). At least three steps, those mediated by glycogen synthase, hexokinase, and GLUT4, have been reported to be defective in patients with type 2 diabetes. Fatty acids also can induce insulin resistance, and it has been suggested that this was a consequence of altered insulin signaling through PI3-kinase. We are utilizing a mouse model of diet-induced obesity that progresses to diabetes.
The diet is high in fat, an increasing component in the U.S. diet, and has been documented to lead to diabetes in C57BL/6J mice (Surwit et al., 1988). After weaning, C57BL/6J mice were fed either the high fat diet or a standard lab chow diet for 16 weeks.
Body weight was monitored bi-weekly. Fasting glucose and insulin levels were measured after 2, 4, 8, and 16 weeks on the diets. Consumption of the HF diet resulted in significant, progressive increases in body weight and fasting insulin levels in comparison to consumption of the Std diet. Fasting glucose levels of mice on the HF diet were dramatically increased at the first time point assayed (2 weeks) and remained high through the duration of the experiment (16 weeks). At each time point, several diabetic and control mice were sacrificed and a number of tissues collected. RNA was extracted from the gastrocnemius muscle at each time point. In order to identify additional muscle genes involved in the development of type 2 diabetes, we used microarray analysis to compare RNA expression levels of 10,000 genes in muscle of high fat diet fed and control diet fed mice at various time points in the progression of type 2 diabetes. Microarray analysis provides a more global picture of gene regulation, allowing the identification of families or groups of genes showing similar expression patterns that potentially imply similar or coordinated roles in disease progression. Of 10,000 genes analyzed, 121 were up-regulated but only 7 down-regulated greater than two-fold in the diabetic relative to non-diabetic mice. These genes are listed in Master Table 1. This distribution of up- and down-regulated genes was much different from that seen for other organs (liver, pancreas, and white adipose tissue), where there was a much closer balance between the number of up- and down-regulated genes.
Actin, alpha, cardiac (Actc1, NM_009608) was one of the most down-regulated genes when comparing HF to Std mice. It was consistently expressed at lower levels in the HF diabetic mice in comparison to the Std mice and also steadily decreased over the 16 week study.
Example 2
Interestingly, further analysis of the time points and exploration of gene pathways and functionally related genes revealed a subset of actin-related and actin-binding genes exhibiting a consistent decrease in expression (although less than two-fold) in the diabetic mice; 9 of 37 functionally related genes were decreased in diabetic muscle at all four time points and an additional 9 were decreased at three of the four time points. Only two of these genes had been included in the original list of 7 down-regulated genes using the two-fold cut-off criterion. It is possible that this subtle but coordinated down-regulation of actin-related or actin-binding genes reflects a role in the decreased glucose uptake by skeletal muscle that occurs in diabetes. With nearly half (18 of 37) of the genes in a related family of genes being consistently down-regulated in a study that did not identify a large number of down-regulated genes, we feel that actin and genes in actin-related pathways may prove to play key roles in muscle as obesity and diabetes progress. The actin-related and actin-binding mouse genes in question have been included at the end of Master Table 1, Subtable 1A.

Introduction to Master Tables
The master tables reflect applicants' analysis of the gene chip data.
For each probe corresponding to a differentially expressed mouse gene, Master Table 1 identifies
Col. 1: The mouse gene (upper) and mouse protein (lower) database accession #s .
Col. 2: The corresponding mouse Unigene Cluster, as of the 4th Quarter 2001 build.
Col. 3: The behavior (differential expression) observed for the mouse gene. This column identifies the gene as favorable (F) or unfavorable (U) on the basis of its strongest differential behavior at the ages tested. There are three possible comparisons, HI-D, C-HI, and C-D, where C=control (normal), HI=hyperinsulinemic, and D=diabetic. If HI>D, C>HI, or C>D, the behavior for that subject comparison is considered unfavorable. If the inequality is reversed, the behavior for that subject comparison is considered favorable. In the Master Table, the numerical value is the ratio of the greater value to the lesser value. If this ratio is at least two fold, the degree of differential expression is considered strong. Usually only mouse genes exhibiting at least one strong differential expression behavior are listed in the Master Table; exceptions are noted in the Examples.
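The ratio convention of col. 3 (greater value over lesser value, "strong" at two-fold or more) can be sketched as below. The sketch deliberately covers only the ratio and the two-fold cut-off, not the favorable/unfavorable label; the function names and expression levels are hypothetical.

```python
def differential_ratio(level_a, level_b):
    """Ratio of the greater expression level to the lesser one."""
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


def is_strong(ratio, cutoff=2.0):
    """A differential expression is considered strong at two-fold or more."""
    return ratio >= cutoff


# Hypothetical normalized expression levels for one gene.
control_level, diabetic_level = 1.00, 0.59
ratio = differential_ratio(control_level, diabetic_level)
strong = is_strong(ratio)
```

A ratio below the cut-off, as in this hypothetical case, corresponds to the kind of sub-two-fold change that the Examples note as an exception when listing genes.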
Figure imgf000113_0001
Figure imgf000114_0001
Col. 4: A related human protein, identified by its database accession number. Usually, several such proteins are identified relative to each mouse gene. These proteins have been identified by BLAST searches, as explained in cols. 6-8.
Col. 5: The name of the related human protein.
Col. 6: The score (in bits) for the alignment performed by the BLAST program.
Col. 7: The E-value for the alignment performed by the BLAST program. It is worth noting that Unigene considers a Blastx E-value of less than 1e-6 to be a "match" to the reference sequence of a cluster.
Unless otherwise indicated, the bit score and E-value for the alignment is with respect to the alignment of the mouse DNA of col. 1 to the human protein of col. 4 by BlastX, according to the default parameters.
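The 1e-6 E-value criterion mentioned for col. 7 can be applied to a list of hits as in the sketch below. The accession strings, scores and E-values are placeholders invented for illustration; they are not entries from Master Table 1.

```python
UNIGENE_MATCH_E_VALUE = 1e-6  # cut-off cited in the text for a Unigene "match"


def matching_hits(hits, cutoff=UNIGENE_MATCH_E_VALUE):
    """Keep hits whose E-value is strictly below the cut-off."""
    return [hit for hit in hits if hit["e_value"] < cutoff]


# Hypothetical BlastX hits (placeholder accessions and values).
hits = [
    {"accession": "HYPOTHETICAL_1", "bit_score": 765.0, "e_value": 0.0},
    {"accession": "HYPOTHETICAL_2", "bit_score": 52.0, "e_value": 3e-7},
    {"accession": "HYPOTHETICAL_3", "bit_score": 38.0, "e_value": 2e-3},
]
kept = [hit["accession"] for hit in matching_hits(hits)]
```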
Master Table 1 is divided into three subtables on the basis of the behavior in col. 3. If a gene has at least one significantly favorable behavior, and no significantly unfavorable ones, it is put into Subtable 1A. In the opposite case, it is put into Subtable 1B. If its behavior is mixed, i.e., at least one significantly favorable and at least one significantly unfavorable, it is put into Subtable 1C. Note that this classification is based on the strongest observed differential expression behaviors for each of the three subject comparisons, C-HI, HI-D and C-D.
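The subtable rule above reduces to a small decision function. The sketch assumes each gene's significant behaviors have already been labeled "F" (favorable) or "U" (unfavorable) for the C-HI, HI-D and C-D comparisons; the function name and the use of a set of labels are illustrative choices, not part of the specification.

```python
def assign_subtable(significant_behaviors):
    """Map a gene's significant behaviors ('F' and/or 'U') to a subtable."""
    has_favorable = "F" in significant_behaviors
    has_unfavorable = "U" in significant_behaviors
    if has_favorable and has_unfavorable:
        return "1C"  # mixed behavior
    if has_favorable:
        return "1A"  # wholly favorable
    if has_unfavorable:
        return "1B"  # wholly unfavorable
    return None      # no significant behavior: not assigned by this rule


example = assign_subtable({"F"})
```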
The corresponding human gene clusters are also of interest. These may be obtained in a number of ways. First, one may search on Unigene (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene) for the identified human protein. Review the "hits" (each of which is a Unigene record) for those prefixed by "Hs." Secondly, one may access the Unigene record for the mouse gene cluster (which is given in Master Table 1), and then click on "Homologene". This will bring up a new page which includes the section "Possible Homologous Genes". One of the entries should be a Homo sapiens gene (considered by Unigene to be the most related human gene); click on its Unigene record link. Additional information of interest may be accessed by searching with the mouse gene accession # in the Mouse Gene Informatics database, at http://www.informatics.jax.org/.
MASTER TABLE 1 SIGNIFICANTLY DIFFERENTIALLY EXPRESSED MOUSE GENES/PROTEINS AND CORRESPONDING HUMAN PROTEINS
Subtable 1A: Wholly Favorable Genes and Proteins
Figure imgf000116_0001
Figure imgf000117_0001
Figure imgf000118_0001
Figure imgf000119_0001
Figure imgf000120_0001
Figure imgf000121_0001
Figure imgf000122_0001
Figure imgf000123_0001
Figure imgf000124_0001
Figure imgf000125_0001
Figure imgf000126_0001
Figure imgf000127_0001
Figure imgf000128_0001
Figure imgf000129_0001
Figure imgf000130_0001
Figure imgf000131_0001
Figure imgf000132_0001
Figure imgf000133_0001
Figure imgf000134_0001
Figure imgf000135_0001
Figure imgf000136_0001
NM_025891 NP_080167.2 Mm.279751 -1.70
AAR88510.1 60kDa BRG-1/Brm associated factor subunit c isoform 2
NP_003069.2 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d3; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60C; Swp73-like protein; chromatin remodeling complex BAF60C subunit; SWI/SNF complex 60 kDa subunit C 786
AAC50697.1 SWI/SNF complex 60 KDa subunit 745
NP_003067.2 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d1 isoform a; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60A; chromatin remodeling complex BAF60A subunit; Swp73-like protein; SWI/SNF complex 60 kDa subunit A 623 e-1
AAH09368.2 SMARCD1 protein 623 e-1
AAD23390.1 SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin D1 619 e-1
NP_003068.2 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d2; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60B; Swp73-like protein; chromatin remodeling complex BAF60B subunit; SWI/SNF complex 60 kDa subunit B 594 e-1
AAC50696.1 SWI/SNF complex 60 KDa subunit 575 e-1
NP 20710.1 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d1 isoform b; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60A; chromatin remodeling complex BAF60A subunit; Swp73-like protein; SWI/SNF complex 60 KDa subunit A 541 e-1
AAC50695.1 SWI/SNF complex 60 KDa subunit 541 e-1
AAS02031.1 unknown 450 e-1
AAS00380.1 unknown 340 5e
M12866 F:(C-D) AAA37164.1 Mm.214950 -1.69 NP_001091.1 alpha 1 actin precursor; alpha skeletal muscle actin 765 NP_005150.1 cardiac muscle alpha actin proprotein; smooth muscle actin 759 NP 001604.1 alpha 2 actin; alpha-cardiac actin 753
Figure imgf000138_0001
ARM1_HUMAN Actin related protein M1 389 e-1
NP_115876.2 actin related protein M1 385 e-1
AAH07289.1 Actin related protein M1 384 e-1
CAA57692.1 beta-centractin 380 e-1
NP_612146.1 actin-related protein T1 366 e-1
AAM00432.1 actin-related protein T1 366 e-1
NP_536356.3 actin-related protein M2; actin-related protein hArpM2; actin-related protein T2 363 e-1
AAP20055.1 HSD27 362 e-1
BAB85862.1 actin-related protein hArpM2 362 1e-
NP_005713.1 actin-related protein 2; ARP2 (actin-related protein 2, yeast) homolog 361 2e-
AAH29499.1 Actin-related protein M2 359 6e-
AAH14546.1 Actin-related protein 2 358 2e-
AAP37280.1 actin alpha 1 skeletal muscle protein 332 7e-
XP_208204.1 similar to actin-related protein 2 331 2e-
XP_377904.1 similar to cytoplasmic beta-actin 323 4e-
AAH36253.1 ACTR2 protein 321 2e-
AAH10417.2 ACTG1 protein 321 2e-
NP_006678.1 actin-like 7A; actin-like 7-alpha 321 2e
NP_006677.1 actin-like 7B; actin-like 7-beta 310 3e-
AAH09544.1 Unknown (protein for IMAGE:3897065) 310 5e-
NP_848620.1 actin-like 300 3e
AAP20052.1 HSD21 299 9e-
XP_377631.1 similar to beta actin 299 9e-
NM_007392 Mm.213025 F:(C-D)
NP_031418.1 -1.53
NP_001604.1 alpha 2 actin; alpha-cardiac actin 765
ATHUSM actin alpha 2, aortic smooth muscle 762
NP_005150.1 cardiac muscle alpha actin proprotein; smooth muscle actin 755 0
NP_001606.1 actin, gamma 2 propeptide; actin, alpha-3 754 0
NP_001091.1 alpha 1 actin precursor; alpha skeletal muscle actin 753 0
NP_001605.1 actin, gamma 1 propeptide; actin, cytoplasmic 2; deafness, autosomal dominant 26; deafness, autosomal dominant 20; 724 0
JC5818 gamma-actin 724 0
NP_001092.1 beta actin; beta cytoskeletal actin 724 0
AAH16045.1 Beta actin 722 0
CAA45026.1 mutant beta-actin (beta'-actin) 720 0
AAH08633.1 actin, beta 719 0
AAH12854.1 ACTB protein 703 0
AAH17450.1 Unknown (protein for IMAGE:3538275) 701 0
XP_293924.1 similar to RIKEN cDNA 4732495G21 gene 689 0
XP_371558.2 similar to FKSG30 672 0
XP_065237.5 similar to FKSG30 671 0
AAG50355.1 FKSG30 671 0
XP_292982.4 similar to pote protein; Expressed in prostate, ovary, testis, and placenta 668 0
XP_372957.1 similar to FKSG30 668 0
AAA51586.1 actin prepeptide 661 0
0902248A actin beta related pseudogene 575 e-163
AAH23548.1 ACTG1 protein 506 e-143
AAA51580.1 gamma-actin 445 e-124
AAH06372.1 ARP1 actin-related protein 1 homolog B, centractin beta 431 e-120
NP_005726.1 ARP1 actin-related protein 1 homolog B, centractin beta; centractin beta; ARP1 (actin-related protein 1, yeast) homolog B (centractin beta); PC3; ARP1, yeast homolog B 429 e-120
NP_005727.1 ARP1 actin-related protein 1 homolog A, centractin alpha; ARP1 (actin-related protein 1, yeast) homolog A (centractin alpha); centractin alpha; actin-RPV; centrosome-associated actin homolog; ARP1, yeast homolog 422 e-118
1818358A actin-related protein 421 e-117
ARM1_HUMAN Actin related protein M1 387 e-107
NP_115876.2 actin related protein M1 382 e-105
AAH07289.1 Actin related protein M1 382 e-105
CAA57692.1 beta-centractin 380 e-105
NP_612146.1 actin-related protein T1 369 e-102
AAM00432.1 actin-related protein T1 369 e-102
NP_536356.3 actin-related protein M2; actin-related protein hArpM2; actin-related protein T2 369 e-102
BAB85862.1 actin-related protein hArpM2 367 e-101
AAP20055.1 HSD27 366 e-101
AAH29499.1 Actin-related protein M2 365 e-100
NP_005713.1 actin-related protein 2; ARP2 (actin-related protein 2, yeast) homolog 356 6e-98
AAH14546.1 Actin-related protein 2 353 5e-97
NP_006678.1 actin-like 7A; actin-like 7-alpha 328 2e-89
XP_208204.1 similar to actin-related protein 2 326 7e-89
XP_377904.1 similar to cytoplasmic beta-actin 325 2e-88
AAP37280.1 actin alpha 1 skeletal muscle protein 323 6e-88
AAH10417.2 ACTG1 protein 323 8e-88
AAH36253.1 ACTR2 protein 318 1e-86
NP_006677.1 actin-like 7B; actin-like 7-beta 316 9e-86
AAH09544.1 Unknown (protein for IMAGE:3897065) 311 2e-84
BAB71690.1 unnamed protein product 303 6e-82
NP_848620.1 actin-like 303 8e-82
AAP20052.1 HSD21 301 2e-81
NM_013456 F:(C-D)
NP_038484.1 Mm.5316 -1.46
NP_001095.1 skeletal muscle specific actinin, alpha 3 1685 0
NP_001094.1 actinin, alpha 2 1449 0
NP_001093.1 actinin, alpha 1 1410 0
FAHUAA alpha-actinin 1 - human 1407 0
NP_004915.2 actinin, alpha 4 1348 0
Figure imgf000142_0001
AAF93173.1 betaIV spectrin isoform sigma4 394 e-109
1QUU_A Chain A, Crystal Structure Of Two Central Spectrin-Like Repeats From Alpha-Actinin 379 e-104
NP_057726.1 spectrin, beta, non-erythrocytic 5; beta V spectrin 344 5e-94
AAB41498.1 alpha II spectrin 264 7e-70
AAH53521.1 SPTAN1 protein 264 7e-70
NP_003118.1 spectrin, alpha, non-erythrocytic 1 (alpha-fodrin) 259 2e-68
NP_000436.2 plectin 1 isoform 1; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
G02520 plectin - human 245 3e-64
NP_958782.1 plectin 1 isoform 6; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
NP_958785.1 plectin 1 isoform 10; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
NP_958784.1 plectin 1 isoform 8; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
NP_958786.1 plectin 1 isoform 11; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
NP_958781.1 plectin 1 isoform 3; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
NP_958780.1 plectin 1 isoform 2; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
NP_958783.1 plectin 1 isoform 7; hemidesmosomal protein 1; epidermolysis bullosa simplex 1 (Ogna) 245 3e-64
PLE1_HUMAN Plectin 1 (PLTN) (PGN) (Hemidesmosomal protein 1) (HD1) 241 4e-63
139160 dystonin isoform 1 - human (fragment) 231 4e-60
BPA1_HUMAN Bullous pemphigoid antigen 1 isoforms 1/2/3/4/5/8 (230 kDa bullous pemphigoid antigen) (BPA) (Hemidesmosomal plaque protein) (Dystonia musculorum protein) 231 4e-60
NP_899236.1 bullous pemphigoid antigen 1 isoform 1; bullous pemphigoid antigen 1 (230/240kD); dystonin; hemidesmosomal plaque protein 231 4e-60
MACF_HUMAN Microtubule-actin crosslinking factor 1, isoforms 1/2/3 (Actin cross-linking family protein 7) (Macrophin 1) (Trabeculin-alpha) (620 kDa actin-binding protein) 224 8e-58
Figure imgf000144_0001
NP_006710.2 actin-binding LIM protein 1 isoform m; LIM actin-binding protein 1; limatin; actin-binding LIM protein 1113
NP_006711.2 actin-binding LIM protein 1 isoform s; LIM actin-binding protein 1; limatin; actin-binding LIM protein 756
BAA74866.2 KIAA0843 protein 651
NP_055760.1 actin binding LIM protein family, member 3 651
AAH67214.1 Unknown (protein for IMAGE:6188753) 518 e-1
BAB47437.1 KIAA1808 protein 508 e-1
NP_115808.2 actin binding LIM protein family, member 2 506 e-1
BAC0441 .1 unnamed protein product 501 e-1
AAH02448.1 ABLIM1 protein 433 e-1
AAH01665.1 ABLIM3 protein 401 e-1
NM_019785 F:(C-D)
NP_062759.1 Mm.29317 -1.32
NP_060947.1 uncharacterized hypothalamus protein HARP11 813
BAA91243.1 unnamed protein product 813
BAB14083.1 unnamed protein product 811
CAD62610.1 unnamed protein product 561 e-1
CAD61940.1 unnamed protein product 430 e-1
NM_016860 F:(C-D)
NP_058556.1 Mm.3118 -1.31
NP_005727.1 ARP1 actin-related protein 1 homolog A, centractin alpha; ARP1 (actin-related protein 1, yeast) homolog A (centractin alpha); centractin alpha; actin-RPV; centrosome-associated actin homolog; ARP1, yeast homolog 755
1818358A actin-related protein 753
NP_005726.1 ARP1 actin-related protein 1 homolog B, centractin beta; centractin beta; ARP1 (actin-related protein 1, yeast) homolog B (centractin beta); PC3; ARP1, yeast homolog B 709
AAH06372.1 ARP1 actin-related protein 1 homolog B, centractin beta 708
CAA57692.1 beta-centractin 616 e-1
NP_001605.1 actin, gamma 1 propeptide; actin, cytoplasmic 2; deafness, autosomal dominant 26; deafness, autosomal dominant 20; 425 e-1
JC5818 gamma-actin 425 e-1
NP_005150.1 cardiac muscle alpha actin proprotein; smooth muscle actin 425 e-1
NP_001092.1 beta actin; beta cytoskeletal actin 424 e-1
AAH08633.1 actin, beta 424 e-118
AAH16045.1 Beta actin 424 e-118
CAA45026.1 mutant beta-actin (beta'-actin) 423 e-118
NP_001091.1 alpha 1 actin precursor; alpha skeletal muscle actin 423 e-118
NP_001604.1 alpha 2 actin; alpha-cardiac actin 422 e-117
NP_001606.1 actin, gamma 2 propeptide; actin, alpha-3 422 e-117
ATHUSM actin alpha 2, aortic smooth muscle 422 e-117
XP_293924.1 similar to RIKEN cDNA 4732495G21 gene 417 e-116
AAH17450.1 Unknown (protein for IMAGE:3538275) 410 e-114
AAH12854.1 ACTB protein 408 e-113
AAG50355.1 FKSG30 408 e-113
XP_065237.5 similar to FKSG30 408 e-113
XP_371558.2 similar to FKSG30 404 e-112
XP_292982.4 similar to pote protein; Expressed in prostate, ovary, testis, and placenta 404 e-112
XP_372957.1 similar to FKSG30 404 e-112
AAA51586.1 actin prepeptide 355 2e-97
0902248A actin beta related pseudogene 330 6e-90
NP_005713.1 actin-related protein 2; ARP2 (actin-related protein 2, yeast) homolog 322 2e-87
AAH14546.1 Actin-related protein 2 318 2e-86
NP_115876.2 actin related protein M1 314 6e-85
ARM1_HUMAN Actin related protein M1 314 6e-85
NP_536356.3 actin-related protein M2; actin-related protein hArpM2; actin-related protein T2 309 1e-83
AAH07289.1 Actin related protein M1 309 2e-83
BAB85862.1 actin-related protein hArpM2 308 2e-83
AAH29499.1 Actin-related protein M2 307 7e-83
AAH23548.1 ACTG1 protein 297 6e-80
XP_208204.1 similar to actin-related protein 2 296 1e-79
NP_612146.1 actin-related protein T1 295 4e-79
AAM00432.1 actin-related protein T1 295 4e-
AAP20055.1 HSD27 291 4e-
AAH36253.1 ACTR2 protein 287 8e-
NP_006678.1 actin-like 7A; actin-like 7-alpha 267 6e-
NP_006677.1 actin-like 7B; actin-like 7-beta 260 1e-
AAA51580.1 gamma-actin 253 9e-
BAB71690.1 unnamed protein product 248 4e-
NP_848620.1 actin-like 247 7e-
AAP20052.1 HSD21 246 2e-
NP_065178.1 actin-related protein 3-beta; actin-related protein 3-beta; actin-related protein Arp11; actin-related protein Arp11 235 3e-
NP_005712.1 ARP3 actin-related protein 3 homolog; ARP3 (actin-related protein 3, yeast) homolog 235 3e-
NP_057272.1 BAF53b; actin-related protein; hArpN alpha 213 1e-
CAB66543.1 hypothetical protein 203 1e-
NM_020618 F:(C-D)
NP_065643.1 Mm.27330 -1.30
NP_003070.3 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin e1; mammalian chromatin remodeling complex BRG1-associated factor 57 597 e-1
AAH07082.1 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin e1 594 e-1
NM_011779 F:(C-D)
NP_035909.2 Mm.320560 -1.30
T47172 hypothetical protein DKFZp762H186.1 - human (fragment) 954
NP_055140.1 coronin, actin binding protein, 1C; coronin, actin-binding protein, 1C; coronin 1C 946
NP_065174.1 coronin, actin binding protein, 1B 758
NP_009005.1 coronin, actin binding protein, 1A; coronin, actin-binding, 1A; coronin, actin-binding protein, 1A; coronin-1 648
AAA77058.1 coronin-like protein 644
BAA76769.1 KIAA0925 protein 412 e-1
NP_006082.1 coronin, actin binding protein, 2B; clipin C; coronin, actin-binding, 2B; coronin, actin-binding protein, 2B 411 e-1
C02B_HUMAN Coronin 2B (Coronin-like protein C) (ClipinC) (Protein FC96) 409 e-1
NP_438171.1 coronin, actin binding protein, 2A; coronin, actin-binding protein, 2A; coronin 2A; coronin-like protein B; WD-repeat protein 2; WD protein IR10 408 e-113
NP_003380.2 coronin, actin binding protein, 2A; coronin, actin-binding protein, 2A; coronin 2A; coronin-like protein B; WD-repeat protein 2; WD protein IR10 408 e-113
AAB47807.1 WD protein IR10 404 e-112
T47174 hypothetical protein DKFZp762I166.1 - human (fragment) 389 e-107
AAS48630.1 unknown 314 7e-85
NP_116243.1 hypothetical protein FLJ14871 311 5e-84
AAQ04659.1 Unknown 311 6e-84
NP_078811.1 hypothetical protein FLJ22021 234 βθ-61
NM_033268 F:(C-D)
NP 50371.2 Mm.195067 -1.29
NP_001094.1 actinin, alpha 2 1712 0
NP_001095.1 skeletal muscle specific actinin, alpha 3 1412 0
NP_001093.1 actinin, alpha 1 1394 0
FAHUAA alpha-actinin 1 - human 1391 0
NP_004915.2 actinin, alpha 4 1361 0
BAA24447.1 alpha actinin 4 1361 0
AAC17470.1 alpha actinin 1265 0
AAH15620.2 ACTN4 protein 941 0
1HCI_A Chain A, Crystal Structure Of The Rod Domain Of Alpha-Actinin 891 0
1HCI_B Chain B, Crystal Structure Of The Rod Domain Of Alpha-Actinin 891 0
CAA38970.1 alpha-actinin 887 0
CAD62344.1 unnamed protein product 835 0
XP_293669.4 similar to actinin, alpha 4 524 e-148
Figure imgf000149_0001
Figure imgf000150_0001
SPCA_HUMAN Spectrin alpha chain, erythrocyte (Erythroid alpha-spectrin) 210 1e-53
S66292 actin-crosslinking protein ACF7 - human (fragment) 209 2e-53
AA118546 F:(C-D)
NP_076224 Mm.183102 -1.23
NP_005712.1 ARP3 actin-related protein 3 homolog; ARP3 (actin-related protein 3, yeast) homolog 850 0
Figure imgf000151_0001
NP_065178.1 actin-related protein 3-beta; actin-related protein 3-beta; actin-related protein Arp11; actin-related protein Arp11 793 0
AAP97150.1 actin related protein 662 0
AAH15207.1 ARP3BETA protein 597 e-170
XP_374583.1 similar to actin-related protein Arp11 348 3e-95
JC7580 actin-related protein Arp11 - human 344 4e-94
AAK31778.1 FKSG74 253 8e-67
AAK31776.1 FKSG72 252 2e-66
AAK31777.1 FKSG73 249 2e-65
AAH16045.1 Beta actin 248 3e-65
NP_001092.1 beta actin; beta cytoskeletal actin 248 3e-65
NP_001605.1 actin, gamma 1 propeptide; actin, cytoplasmic 2; deafness, autosomal dominant 26; deafness, autosomal dominant 20; cytoskeletal gamma-actin 248 4e-65
JC5818 gamma-actin - human 248 4e-65
NP_001091.1 alpha 1 actin precursor; alpha skeletal muscle actin 248 5e-65
CAA45026.1 mutant beta-actin (beta'-actin) 247 6e-65
AAH08633.1 actin, beta 247 8e-65
NP_005150.1 cardiac muscle alpha actin proprotein; smooth muscle actin 247 8e-65
XP_293924.1 similar to RIKEN cDNA 4732495G21 gene 246 1e-64
ATHUSM actin alpha 2, aortic smooth muscle - human 246 1e-64
NP_001604.1 alpha 2 actin; alpha-cardiac actin 246 2e-64
NP_001606.1 actin, gamma 2 propeptide; actin, alpha-3 245 3e-64
AAH17450.1 Unknown (protein for IMAGE:3538275) 239 2e-62
NP_005726.1 ARP1 actin-related protein 1 homolog B, centractin beta; centractin beta; ARP1 (actin-related protein 1, yeast) homolog B (centractin beta); PC3; ARP1, yeast homolog B 236 1e-61
AAH12854.1 ACTB protein 236 2e-61
NP_005727.1 ARP1 actin-related protein 1 homolog A, centractin alpha; ARP1 (actin-related protein 1, yeast) homolog A (centractin alpha); centractin alpha; actin-RPV; centrosome-associated actin homolog; ARP1, yeast homolog A 235 3e-61
1818358A actin-related protein 234 4e-61
AAH06372.1 ARP1 actin-related protein 1 homolog B, centractin beta 234 4e-61
XP_372957.1 similar to FKSG30 223 1e-57
XP_065237.5 similar to FKSG30 223 1e-57
AAG50355.1 FKSG30 223 1e-57
XP_292982.4 similar to pote protein; Expressed in prostate, ovary, testis, and placenta 223 1e-57
XP_371558.2 similar to FKSG30 223 2e-57
AAA51586.1 actin prepeptide 211 5e-54
CAA57692.1 beta-centractin 211 6e-54
AAH14546.1 Actin-related protein 2 203 1e-51
NP_005713.1 actin-related protein 2; ARP2 (actin-related protein 2, yeast) homolog 203 1e-51
NM_031878 F:(C-D)
NP 14084.1 Mm.21772 -1.21
NP_003068.2 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d2; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60B; Swp73-like protein; chromatin remodeling complex BAF60B subunit; SWI/SNF complex 60 kDa subunit B 828
AAC50696.1 SWI/SNF complex 60 KDa subunit 745
NP_003069.2 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d3; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60C; Swp73-like protein; chromatin remodeling complex BAF60C subunit; SWI/SNF complex 60 kDa subunit C 622 e-178
AAR88510.1 60kDa BRG-1/Brm associated factor subunit c isoform 2 619 e-177
AAC50697.1 SWI/SNF complex 60 KDa subunit 596 e-170
AAH09368.2 SMARCD1 protein 569 e-168
NP_003067.2 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d1 isoform a; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60A; chromatin remodeling complex BAF60A subunit; Swp73-like protein; SWI/SNF complex 60 kDa subunit A 589 e-168
AAD23390.1 SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin D1 582 e-165
NP_620710.1 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin d1 isoform b; Rsc6p; mammalian chromatin remodeling complex BRG1-associated factor 60A; chromatin remodeling complex BAF60A subunit; Swp73-like protein; SWI/SNF complex 60 kDa subunit A 505 e-142
AAC50695.1 SWI/SNF complex 60 KDa subunit 505 e-142
AAS02031.1 unknown 366 e-100
AAS00380.1 unknown 261 5e-69
AAH18953.2 SMARCD2 protein 159 2e-38
AAF20280.1 PR02451 152 2e-36
NM_019767 F:(C-D)
NP_062741.1 Mm.34695 -1.18
NP_006400.2 actin related protein 2/3 complex subunit 1A; actin binding protein (Schizosaccharomyces pombe sop2-like); SOP2-like protein 730
AR1A_HUMAN Actin-related protein 2/3 complex subunit 1A (SOP2-like protein) 723 0
NP_005711.1 actin related protein 2/3 complex subunit 1B; ARP2/3 protein complex subunit p41; actin related protein 2/3 complex, subunit 1A (41 kD) 533 e-151
AAS00381.1 unknown 357 2e-98
NM_011418 F:(C-D)
NP_035548.1 Mm.279751 -1.14
NP_003064.2 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily b, member 1; sucrose nonfermenting, yeast, homolog-like 1; integrase interactor 1 754 0
SNF5_HUMAN SWI/SNF related, matrix associated, actin dependent regulator of chromatin subfamily B member 1 (Integrase interactor 1 protein) (hSNF5) (BAF47) 749 0
CAA09759.1 Ini1b 728 0
BAB14784.1 unnamed protein product 710 0
CAA76639.1 SNF5/INI1 protein 685 0
Figure imgf000154_0001
Subtable IB: Wholly Unfavorable Genes and Proteins
Figure imgf000155_0001 through Figure imgf000267_0001 (table pages available only as images)
Subtable IC: Mixed Genes and Proteins
Figure imgf000269_0001 through Figure imgf000284_0001 (table pages available only as images)
References
1. Unger, R.H., Foster, D.W. (1998) Diabetes mellitus. In: Williams Textbook of Endocrinology, J.D. Wilson, D.W. Foster, H.M. Kronenberg, and P.R. Larsen, eds. (Philadelphia, W.B. Saunders Company), pp. 973-1059.
2. Polonsky, K.S. (1995) The beta-cell in diabetes: from molecular genetics to clinical research. Diabetes 44:705-717.
3. Velho, G., Froguel, P. (1997) Genetic determinants of non-insulin-dependent diabetes mellitus: strategies and recent results. Diabete et Metabolisme 23:7-17.
4. Groop, L.C., Tuomi, T. (1997) Non-insulin-dependent diabetes mellitus-a collision between thrifty genes and an affluent society. Ann. Med. 29:37-53.
5. Reaven, G.M. (1988) Role of insulin resistance in human disease. Diabetes 37:1595-1607.
6. Clark, M.G., Rattigan, S., Clark, D.G. (1983) Obesity with insulin resistance: experimental insights. Lancet (ii):1236-1240.
7. Kissebah, A.H., Vydelingum, N., Murray, R., Evans, D.J., Hartz, A.J., Kakloff, R.K., Adams, P.W. (1982) Relation of body fat distribution to metabolic complications of obesity. J Clin. Endo and Metab 54(2):254-260.
8. Kissebah, A.H. (1996) Intra-abdominal fat: is it a major factor in developing diabetes and coronary artery disease? Diabetes Res Clin Pract 30(Suppl):25-30.
9. Friedman, J.M., Leibel, R. (1992) Tackling a weighty problem. Cell 69:217-220.
10. Bjorntorp, P. (1991) Metabolic implications of body fat distribution. Diabetes Care 14:1132-1143.
11. Emery, E.M., Schmid, T.L., Kahn, H.S., Filozof, P.P. (1993) A review of the association between abdominal fat distribution, health outcome measures, and modifiable risk factors. Am J Health Promot 7:342-353.
12. Wickelgren, I. (1998) Obesity: how big a problem? Science 280:1365.
13. Surwit, R.S., Kuhn, C.M., Cochrane, C., McCubbin, J.A., Feinglos, M.N. (1988) Diet-induced type-II diabetes in C57BL/6J mice. Diabetes 37:1163-1167.
14. Surwit, R.S., Feinglos, M.N., Rodin, J., Sutherland, A., Petro, A.E., Opara, E.C., Kuhn, C.M., Rebuffe-Scrive, M. (1995) Differential effects of fat and sucrose on the development of obesity and diabetes in C57BL/6J and A/J mice. Metabolism 44(5):645-651.
15. Ahren, B.E., Simonson, E., Scheurink, A.J.W., Mulder, H., Myerson, U., Sundler, F. (1997) Dissociated insulinotropic sensitivity to glucose and carbachol in high-fat diet-induced insulin resistance in C57BL/6J mice. Metabolism 46(1):97-106.
16. Page, R., Morris, C., Williams, J., von Ruhland, C., Malik, A.N. (1997) Isolation of diabetes-associated kidney genes using differential display. Biochem Biophys Res Commun 232(1):49-53.
17. Condorelli, G., Vigliotta, G., Iavarone, C., Caruso, M., Tocchetti, C.G., Andreozzi, F., Cafieri, A., Tecce, M.F., Formisano, P., Beguinot, L., Beguinot, F. (1998) PED/PEA-15 gene controls glucose transport and is overexpressed in type 2 diabetes mellitus. Embo J 17(14):3858-66.
18. Peraldi, M.N., Berrou, J., Hagege, J., Rondeau, E., Sraer, J.D. (1998) Subtractive hybridization cloning: an efficient technique to detect overexpressed RNAs in diabetic nephropathy. Kidney Int 53(4):926-31.
19. Song, Y., Ailenberg, M., Silverman, M. (1998) Cloning of a novel gene in the human kidney homologous to rat munc13s: its potential role in diabetic nephropathy. Kidney Int 53(6):1689-95.
20. Imagawa, M., Tsughiya, T., and Nishihara, T. (1999) Identification of inducible genes at the early stage of adipocyte differentiation of 3T3-L1 cells. Biochem. Biophys. Res. Comm. 254:299-305.
21. Nadler, S.T., Stoehr, J.P., Schueler, K.L., Tanimoto, G., Yandell, B.S., Attie, A.D. (2000) The expression of adipogenic genes is decreased in obesity and diabetes mellitus. Proc Natl Acad Sci U S A 97:11371-11376.
22. Lan H, Rabaglia ME, Stoehr JP, Nadler ST, Schueler KL, Zou F, Yandell BS, Attie AD. (2003) Gene expression profiles of nondiabetic and diabetic obese mice suggest a role of hepatic lipogenic capacity in diabetes susceptibility. Diabetes 52:688-700.
23. Petersen KF, Shulman GI (2002) Pathogenesis of skeletal muscle insulin resistance in type 2 diabetes mellitus. Am J Cardiol 90:11G-18G.
Citation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that the cited documents is considered material to the patentability of any of the claims of the present application. All statements as to the date or representation as to the contents of these documents is based on the information available to the applicant and does not constitute any admission as to the correctness of the dates or contents of these documents.
The appended claims are to be treated as a non-limiting recitation of preferred embodiments.
In addition to those set forth elsewhere, the following references are hereby incorporated by reference, in their most recent editions as of the time of filing of this application: Kay, Phage Display of Peptides and Proteins: A Laboratory Manual; the John Wiley and Sons Current Protocols series, including Ausubel, Current Protocols in Molecular Biology; Coligan, Current Protocols in Protein Science; Coligan, Current Protocols in Immunology; Current Protocols in Human Genetics; Current Protocols in Cytometry; Current Protocols in Pharmacology; Current Protocols in Neuroscience; Current Protocols in Cell Biology; Current Protocols in Toxicology; Current Protocols in Field Analytical Chemistry; Current Protocols in Nucleic Acid Chemistry; and Current Protocols in Human Genetics; and the following Cold Spring Harbor Laboratory publications: Sambrook, Molecular Cloning: A Laboratory Manual; Harlow, Antibodies: A Laboratory Manual; Manipulating the Mouse Embryo: A Laboratory Manual; Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual; Drosophila Protocols; Imaging Neurons: A Laboratory Manual; Early Development of Xenopus laevis: A Laboratory Manual; Using Antibodies: A Laboratory Manual; At the Bench: A Laboratory Navigator; Cells: A Laboratory Manual; Methods in Yeast Genetics: A Laboratory Course Manual; Discovering Neurons: The Experimental Basis of Neuroscience; Genome Analysis: A Laboratory Manual Series; Laboratory DNA Science; Strategies for Protein Purification and Characterization: A Laboratory Course Manual; Genetic Analysis of Pathogenic Bacteria: A Laboratory Manual; PCR Primer: A Laboratory Manual; Methods in Plant Molecular Biology: A Laboratory Course Manual; Manipulating the Mouse Embryo: A Laboratory Manual; Molecular Probes of the Nervous System; Experiments with Fission Yeast: A Laboratory Course Manual; A Short Course in Bacterial Genetics: A Laboratory Manual and Handbook for Escherichia coli and Related Bacteria; DNA Science: A First Course in Recombinant DNA Technology; Methods in Yeast Genetics: A Laboratory Course Manual; Molecular Biology of Plants: A Laboratory Course Manual.
All references cited herein, including journal articles or abstracts, published, corresponding, prior or otherwise related U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. Additionally, the entire contents of the references cited within the references cited herein are also entirely incorporated by reference.
Reference to known method steps, conventional methods steps, known methods or conventional methods is not in any way an admission that any aspect, description or embodiment of the present invention is disclosed, taught or suggested in the relevant art.
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art (including the contents of the references cited herein), readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance presented herein, in combination with the knowledge of one of ordinary skill in the art.
Any description of a class or range as being useful or preferred in the practice of the invention shall be deemed a description of any subclass (e.g., a disclosed class with one or more disclosed members omitted) or subrange contained therein, as well as a separate description of each individual member or value in said class or range. The description of preferred embodiments individually shall be deemed a description of any possible combination of such preferred embodiments, except for combinations which are impossible (e.g., mutually exclusive choices for an element of the invention) or which are expressly excluded by this specification. If an embodiment of this invention is disclosed in the prior art, the description of the invention shall be deemed to include the invention as herein disclosed with such embodiment excised.

Claims

1. A method of protecting a human subject from progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state, which comprises administering to the subject a protective amount of an agent which is
(1) a polypeptide which is substantially structurally identical or conservatively identical in sequence to a reference protein which is selected from the group consisting of mouse and human proteins set forth in master table 1, subtables 1A and IC, or
(2) an expression vector encoding the polypeptide of (1) above and expressible in a human cell, under conditions conducive to expression of the polypeptide of (1) ;
where said agent protects said subject from progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state.
2. A method of protecting a human subject from progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state which comprises administering to the subject a protective amount of an agent which is
(1) an antagonist of a polypeptide, occurring in said subject, which is substantially structurally identical or conservatively identical in sequence to a reference protein which is selected from the group consisting of mouse and human proteins set forth in master table 1, subtables 1B and 1C, or
(2) an anti-sense vector which inhibits expression of said polypeptide in said subject,
where said agent protects said subject from progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state.
3. A method of screening for human subjects who are prone to progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state, which comprises assaying tissue or body fluid samples from said subjects to determine the level of expression of a "favorable" human marker gene, said human marker gene encoding a human protein which is substantially structurally identical or conservatively identical in sequence to a reference protein which is selected from the group consisting of mouse and human proteins set forth in master table 1, subtables 1A and 1C, and directly correlating the level of expression of said marker gene with the propensity to progression in said patient.
4. A method of screening for human subjects who have a propensity for progression from a normoinsulinemic state to a hyperinsulinemic state, or from either to a type II diabetic state, which comprises assaying tissue or body fluid samples from said subjects to determine the level of expression of an "unfavorable" human marker gene, said human marker gene encoding a human protein which is substantially structurally identical or conservatively identical in sequence to a reference protein which is selected from the group consisting of mouse and human proteins set forth in master table 1, subtables 1B and 1C, and inversely correlating the level of expression of said marker gene with the propensity to progression in said patient.
5. The method of claims 1 or 3 in which the reference protein is of subtable 1A.
6. The method of claims 1 or 3 in which the reference protein is of subtable 1B.
7. The method of claim 3 or 4 in which the sample is a muscle tissue sample.
8. The method of any one of claims 1-7 in which the reference protein is a human protein.
9. The method of any one of claims 1-7 in which the reference protein is a mouse protein.
10. The method of any one of claims 3 or 4 in which the level of expression of the marker protein is ascertained by measuring the level of the corresponding messenger RNA.
11. The method of any one of claims 3 or 4 in which the level of expression is ascertained by measuring the level of a protein encoded by said marker gene.
12. The method of any one of claims 1-9 in which said polypeptide is at least 80% identical or at least highly conservatively identical to said reference protein.
13. The method of any one of claims 1-10 in which said polypeptide is at least 90% identical to said reference protein.
14. The method of any one of claims 1-11 in which said polypeptide is identical to said reference protein.
15. The method of any one of claims 1-14 in which the E-value cited for the reference protein in Master Table 1 is not more than e-6.
16. The method of claim 15 in which the E-value cited for the reference protein in Master Table 1 is less than e-10.
17. The method of claim 17 in which the E value calculated by BLASTN or BLASTX would be less than e-15, more preferably less than e-20, still more preferably less than e-40, even more preferably less than e-60, considerably more preferably less than e-80, and most preferably less than e-100.
18. The method of any of claims 2-17 in which the antagonist is an antibody, or an antigen-specific binding fragment of an antibody.
19. The method of any of claims 2-17 in which the antagonist is a peptide, peptoid, nucleic acid, or peptide nucleic acid oligomer.
20. The method of any of claims 2-17 in which the antagonist is an organic molecule with a molecular weight of less than 500 daltons.
21. The method of claim 20 in which said organic molecule is identifiable as a molecule which binds said polypeptide by screening a combinatorial library.
22. The method of claim 1 or 2 in which the agent is delivered systemically.
23. The method of claim 1 or 2 in which the agent is selectively delivered to muscle tissue.
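
The E-value limits recited in claims 15-17 form a ladder of progressively stricter cutoffs. The following is an illustrative sketch only, not part of the claims: it reports the strictest recited cutoff that a given E-value satisfies. It assumes that "e-N" denotes 10^-N, as in BLAST output; the names THRESHOLDS and strictest_tier are hypothetical and introduced here for illustration.

```python
# Illustrative sketch only; not part of the claimed subject matter.
# Assumption: "e-N" in the claims is read as 10**-N, as in BLAST output.
# THRESHOLDS and strictest_tier are hypothetical names.

# Strict ("less than") cutoffs from claims 16 and 17, strictest first.
THRESHOLDS = [
    (1e-100, "claim 17: less than e-100"),
    (1e-80, "claim 17: less than e-80"),
    (1e-60, "claim 17: less than e-60"),
    (1e-40, "claim 17: less than e-40"),
    (1e-20, "claim 17: less than e-20"),
    (1e-15, "claim 17: less than e-15"),
    (1e-10, "claim 16: less than e-10"),
]

# Claim 15 uses an inclusive bound ("not more than e-6").
CLAIM_15_CUTOFF = 1e-6


def strictest_tier(e_value: float) -> str:
    """Return the strictest recited cutoff that e_value satisfies."""
    if e_value < 0:
        raise ValueError("an E-value cannot be negative")
    for cutoff, label in THRESHOLDS:
        if e_value < cutoff:
            return label
    if e_value <= CLAIM_15_CUTOFF:
        return "claim 15: not more than e-6"
    return "outside claims 15-17"


if __name__ == "__main__":
    for value in (1e-3, 1e-6, 1e-10, 3e-45, 0.0):
        print(value, "->", strictest_tier(value))
```

Note that an E-value of exactly e-10 satisfies only the inclusive limit of claim 15, because claim 16 recites "less than" rather than "not more than".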

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP05713932A EP1732582A2 (en) 2004-02-26 2005-02-24 Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on genes differentially expressed in muscle cells
AU2005216922A AU2005216922A1 (en) 2004-02-26 2005-02-24 Diagnosis of hyperinsulinemia and type II diabetes and protection against same based on genes differentially expressed in muscle cells
CA002557181A CA2557181A1 (en) 2004-02-26 2005-02-24 Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on genes differentially expressed in muscle cells

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US54751204P 2004-02-26 2004-02-26
US60/547,512 2004-02-26
US57934204P 2004-06-15 2004-06-15
US60/579,342 2004-06-15

Publications (2)

Publication Number Publication Date
WO2005082398A2 true WO2005082398A2 (en) 2005-09-09
WO2005082398A3 WO2005082398A3 (en) 2006-01-26

Family

ID=34915602

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/005596 WO2005082398A2 (en) 2004-02-26 2005-02-24 Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on genes differentially expressed in muscle cells

Country Status (4)

Country Link
EP (1) EP1732582A2 (en)
AU (1) AU2005216922A1 (en)
CA (1) CA2557181A1 (en)
WO (1) WO2005082398A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000066787A2 (en) * 1999-05-05 2000-11-09 Ohio University Growth hormone-regulatable liver genes and proteins, and uses thereof
WO2004092419A2 (en) * 2003-03-31 2004-10-28 Ohio University Diagnosis of hyperinsulinemia and type ii diabetes and protection against same (i)

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
"CLONTECH.PCR-Select differential screening kit. User Manual" CLONTECH, 10 September 2001 (2001-09-10), pages 1-35, XP002307356 *
BERNAL-MIZRACHI E ET AL: "GENE EXPRESSION PROFILING IN ISLET BIOLOGY AND DIABETES RESEARCH" DIABETES/METABOLISM RESEARCH AND REVIEWS, WILEY, LONDON,, GB, vol. 19, no. 1, 2003, pages 32-42, XP008045358 ISSN: 1520-7552 *
CALVO ROSA MARIA ET AL: "Immunohistochemical and morphometric studies of the fetal pancreas in diabetic pregnant rats. Effects of insulin administration" ANATOMICAL RECORD, vol. 251, no. 2, June 1998 (1998-06), pages 173-180, XP002332470 ISSN: 0003-276X *
COROMINOLA H ET AL: "Identification of novel genes differentially expressed in omental fat of obese subjects and obese type 2 diabetic patients" DIABETES, NEW YORK, NY, US, vol. 50, no. 12, December 2001 (2001-12), pages 2822-2830, XP002293068 ISSN: 0012-1797 *
GERLACH C ET AL: "PROLIFERATION-ASSOCIATED KI-67 PROTEIN IS A TARGET FOR AUTOANTIBODIES IN THE HUMAN AUTOIMMUNE DISEASE SYSTEMIC LUPUS ERYTHEMATOSUS" LABORATORY INVESTIGATION, UNITED STATES AND CANADIAN ACADEMY OF PATHOLOGY, BALTIMORE,, US, vol. 78, no. 1, January 1998 (1998-01), pages 129-130, XP002073106 ISSN: 0023-6837 *
LIM H W ET AL: "Identification of differentially expressed mRNA during pancreas regeneration of rat by mRNA differential display" BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, ACADEMIC PRESS, SAN DIEGO, CA, US, vol. 299, no. 5, 20 December 2002 (2002-12-20), pages 806-812, XP002324520 ISSN: 0006-291X *
PIETILAEINEN T ET AL: "THE IMPORTANT PROGNOSTIC VALUE OF KI-67 EXPRESSION AS DETERMINED BY IMAGE ANALYSIS IN BREAST CANCER" JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY, SPRINGER INTERNATIONAL, BERLIN, DE, vol. 122, no. 11, 1996, pages 687-692, XP008028143 ISSN: 0171-5216 *
SONE H ET AL: "Pancreatic beta cell senescence contributes to the pathogenesis of type 2 diabetes in high-fat diet-induced diabetic mice." DIABETOLOGIA. JAN 2005, vol. 48, no. 1, January 2005 (2005-01), pages 58-67, XP002332471 ISSN: 0012-186X *
SURWIT R S ET AL: "Differential effects of fat and sucrose on the development of obesity and diabetes in C57BL/6J and AJ mice" METABOLISM, CLINICAL AND EXPERIMENTAL, W.B. SAUNDERS CO., PHILADELPHIA, PA, US, vol. 44, no. 5, May 1995 (1995-05), pages 645-651, XP004540280 ISSN: 0026-0495 cited in the application *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10414824B2 (en) 2002-11-22 2019-09-17 Ganymed Pharmaceuticals Ag Genetic products differentially expressed in tumors and the use thereof
US9775785B2 (en) 2004-05-18 2017-10-03 Ganymed Pharmaceuticals Ag Antibody to genetic products differentially expressed in tumors and the use thereof
EP1824518A2 (en) * 2004-12-07 2007-08-29 Ohio University Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on proteins differentially expressed in serum
EP1824518A4 (en) * 2004-12-07 2009-10-28 Univ Ohio Diagnosis of hyperinsulinemia and type ii diabetes and protection against same based on proteins differentially expressed in serum
EP2293075A3 (en) * 2005-06-17 2011-05-04 Randox Laboratories Ltd. Method for diagnosing neuro-degenerative disease
EP2293074A3 (en) * 2005-06-17 2011-05-04 Randox Laboratories Ltd. Method for diagnosing neuro-degenerative disease
US9751934B2 (en) 2005-11-24 2017-09-05 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US9499609B2 (en) 2005-11-24 2016-11-22 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US9212228B2 (en) 2005-11-24 2015-12-15 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US10017564B2 (en) 2005-11-24 2018-07-10 Ganymed Pharmaceuticals Gmbh Monoclonal antibodies against claudin-18 for treatment of cancer
US10174104B2 (en) 2005-11-24 2019-01-08 Ganymed Pharmaceuticals Gmbh Monoclonal antibodies against claudin-18 for treatment of cancer
US10738108B2 (en) 2005-11-24 2020-08-11 Astellas Pharma Inc. Monoclonal antibodies against claudin-18 for treatment of cancer
US11739139B2 (en) 2005-11-24 2023-08-29 Astellas Pharma Inc. Monoclonal antibodies against Claudin-18 for treatment of cancer
WO2010096875A1 (en) * 2009-02-27 2010-09-02 Verva Pharmaceuticals Ltd A drug identification protocol for type 2 diabetes based on gene expression signatures
US9512232B2 (en) 2012-05-09 2016-12-06 Ganymed Pharmaceuticals Ag Antibodies against Claudin 18.2 useful in cancer diagnosis
US10053512B2 (en) 2012-05-09 2018-08-21 Ganymed Pharmaceuticals Ag Antibodies against claudin 18.2 useful in cancer diagnosis

Also Published As

Publication number Publication date
AU2005216922A1 (en) 2005-09-09
CA2557181A1 (en) 2005-09-09
WO2005082398A3 (en) 2006-01-26
EP1732582A2 (en) 2006-12-20

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2557181

Country of ref document: CA

Ref document number: 2005216922

Country of ref document: AU

ENP Entry into the national phase in:

Ref document number: 2005216922

Country of ref document: AU

Date of ref document: 20050224

Kind code of ref document: A

WWP Wipo information: published in national office

Ref document number: 2005216922

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 2005713932

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2005713932

Country of ref document: EP