WO2014195730A2

WO2014195730A2 - Auto-antigen biomarkers for lupus

Info

Publication number: WO2014195730A2
Application number: PCT/GB2014/051760
Authority: WO
Inventors: Timothy VYSE; Michael Bernard Mcandrew; Colin Henry Wheeler; Jens-Oliver Koopmann
Original assignee: Sense Proteomic Limited; King's College London
Priority date: 2013-06-07
Filing date: 2014-06-06
Publication date: 2014-12-11
Also published as: GB201310216D0; WO2014195730A3

Abstract

The presence of certain auto-antibodies indicates that a subject has lupus. The auto-antibodies recognise antigens listed in Table 1 herein. These auto-antibodies and/or the antigens themselves can be used as biomarkers for assessing lupus in a subject.

Description

AUTO-ANTIGEN BIOMARKERS FOR LUPUS

This patent application claims priority from GB patent application 1310216.5, filed 7th June 2013, the complete contents of which are incorporated herein by reference for all purposes.

TECHNICAL FIELD

The invention relates to biomarkers useful in diagnosis, monitoring and/or treatment of lupus.

BACKGROUND

Systemic lupus erythematosus (SLE) or lupus is a chronic autoimmune disease that can affect the joints and almost every major organ in the body, including heart, kidneys, skin, lungs, blood vessels, liver, and the nervous system. As in other autoimmune diseases, the body's immune system attacks the body's own tissues and organs, leading to inflammation. A person's risk of developing lupus appears to be determined mainly by genetic factors, but environmental factors, such as infection or stress, may trigger the onset of the disease. Ethnicity also appears to play a role, in that the severity of disease varies across ethnic groups, being generally more severe in Afro-Caribbean and Asian groups compared with Caucasian groups [1,2]. The course of lupus varies, and is often characterised by alternating periods of flares, i.e. increased disease activity, and periods of remission. Subjects with lupus may develop a variety of conditions such as lupus nephritis, musculoskeletal complications, haematological disorders and cardiac inflammation.

Lupus occurs approximately 9 times more frequently in women than in men. It is part of a family of closely related disorders known as the connective tissue diseases which also includes rheumatoid arthritis (RA), polymyositis-dermatomyositis (PM-DM), systemic sclerosis (SSc or scleroderma), Sjogren's syndrome (SS) and various forms of vasculitis. These diseases share a number of clinical symptoms and abnormalities. Subjects suffering from lupus can present with a variety of diverse symptoms, many of which occur in other connective tissue diseases, fibromyalgia, dermatomyositis or haematological conditions such as idiopathic thrombocytopenic purpura. For the patient, a mis-diagnosis of RA rather than SLE may result in the patient being treated in the inappropriate clinic which may compromise their care. Diagnosis can therefore be challenging and mis-diagnosis may have clinical consequences.

It takes on average 4 years to obtain a correct diagnosis for lupus, in part due to the range and complexity of symptoms and the necessity to discount other possible causes. The American College of Rheumatologists has established eleven criteria to assist in the diagnosis of lupus for the inclusion of patients in clinical trials and developed the SLE Disease Activity Index (SLEDAI) to assess lupus activity. In addition to considering medical history, the subject's age and gender and a physical examination, a number of laboratory tests are also available to assist in diagnosis. These include tests for the presence of antinuclear antibodies (ANA), extractable nuclear antigens (ENA) and tests for other auto-antibodies such as anti-double stranded DNA (dsDNA), anti-Smith (Sm), anti-RNP, anti-Ro (SSA), anti-La (SSB) and anti-cardiolipin antibodies. Other diagnostic tools include tests for serum complement levels, immune complexes, urine analysis, and biopsies of an affected organ. Some of these criteria are very specific for lupus but have poor sensitivity, and none of these tests provides a definitive diagnosis, so the results of multiple differing tests must be integrated to enable a clinical judgement by an expert. For example, a positive ANA test can occur due to infections or rheumatic diseases, and even healthy people without lupus can test positive. The ANA test has high sensitivity (93%) but low specificity (57%) [3]. Antibodies to double-stranded DNA and/or nucleosomes were associated with lupus over 50 years ago and active lupus is generally associated with IgG auto-antibodies. The sensitivity and specificity of the Farr test for anti-dsDNA are 78.8% and 90.9%, respectively [4]. The auto-antibody reactivities e.g. titre can vary with factors including ethnicity and disease severity.

Thus it is apparent that the status of multiple auto-antibody species can provide information on the lupus status of a patient, but to date these clinical analyses are frequently still performed individually in a piecemeal fashion rather than in an integrated workflow, and the data are analysed in a relatively crude manner. By identifying the individual antigens responsible for general ANA or anti-dsDNA reactivity, it may be possible to increase the sensitivity and/or specificity of testing for SLE. The necessity for a unified test offering both high sensitivity and specificity for lupus is clear.
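The sensitivity and specificity figures quoted above follow the standard definitions: sensitivity is the fraction of subjects with the disease who test positive, and specificity is the fraction of subjects without the disease who test negative. The following minimal Python sketch illustrates the arithmetic. The cohort counts are hypothetical and were chosen only so that the result reproduces the ANA percentages cited in the text; they are not data from the cited references.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of subjects WITH the disease that the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of subjects WITHOUT the disease that the test reports as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cohort (assumption): 100 subjects with lupus, 100 without.
ana_sensitivity = sensitivity(true_pos=93, false_neg=7)
ana_specificity = specificity(true_neg=57, false_pos=43)

print(f"sensitivity={ana_sensitivity:.2f}, specificity={ana_specificity:.2f}")
# prints "sensitivity=0.93, specificity=0.57"
```

With these illustrative counts, 43 of the 100 subjects without lupus would test positive, which is why a high-sensitivity, low-specificity test cannot provide a diagnosis on its own.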

Many auto-antibody species have been described in connection with lupus [5] and their cognate antigens include numerous classes of proteins, subcellular organs such as the nucleus and non-protein species such as phospholipid and DNA. Frequently the antigen is either poorly described or uncharacterised at the molecular level. Given the challenges in obtaining a correct diagnosis, there is a need for new or improved in vitro tests with good specificity and sensitivity to enable non-invasive diagnosis of lupus. Such tests can be based on biomarkers that can be used in methods of diagnosing lupus, for the early detection of lupus, subclinical or presymptomatic lupus or a predisposition to lupus, or for monitoring the progression of lupus or the likelihood of transition from remission to flare or vice versa, or the efficacy of a therapeutic treatment thereof. Such improved diagnostic methods would provide significant clinical benefit by enabling earlier active management of lupus while reducing unnecessary intervention caused by mis-diagnosis. It is an object of the invention to meet any or all of these needs.

DISCLOSURE OF THE INVENTION

The invention is based on the identification of correlations between lupus and the level of auto-antibodies against certain auto-antigens. The inventors have identified antigens for which the level of auto-antibodies can be used to indicate that a subject has SLE. Auto-antibodies against these antigens are present at significantly different levels in subjects with lupus and without lupus and so the auto-antibodies and their antigens function as biomarkers of lupus. Detection of the biomarkers in a subject sample can thus be used to improve the diagnosis, prognosis and monitoring of lupus. Advantageously, the invention can be used to distinguish between lupus and other autoimmune diseases, particularly other connective tissue diseases such as rheumatoid arthritis (RA), polymyositis-dermatomyositis (PM-DM), systemic sclerosis (SSc or scleroderma), Sjogren's syndrome and vasculitis where inflammation and similar symptoms are common.

The inventors have identified 42 such biomarkers and the invention uses at least one of these to assist in the diagnosis of lupus by measuring level(s) of auto-antibodies against the antigen(s) and/or the level(s) of the antigen(s) themselves. The biomarker can be (i) an auto-antibody which binds to an antigen in Table 1 and/or (ii) an antigen in Table 1, but is preferably the former.

The invention thus provides a method for analysing a subject sample, comprising a step of determining the level of a Table 1 biomarker in the sample, wherein the level of the biomarker provides a diagnostic indicator of whether the subject has lupus.

Analysis of a single Table 1 biomarker can be performed, and detection of the auto-antibody/antigen can provide a useful diagnostic indicator for lupus even without considering any of the other Table 1 biomarkers. The sensitivity and specificity of diagnosis can be improved, however, by combining data for multiple biomarkers. It is thus preferred to analyse more than one Table 1 biomarker. Analysis of two or more different biomarkers (a "panel") can enhance the sensitivity and/or specificity of diagnosis compared to analysis of a single biomarker. The data derived from a panel can be combined in a multivariate analysis [6]. The combination of biomarkers may increase the classification power relative to a single biomarker. The biomarkers which constitute the panel do not need to be assayed simultaneously and the data derived for each biomarker can be combined post-assay.

Each different biomarker in a panel is shown in a different row in Table 1 i.e. measuring both auto-antibody which binds to an antigen listed in Table 1 and the antigen itself is measurement of a single biomarker rather than of a panel.

Thus the invention provides a method for analysing a subject sample, comprising a step of determining the levels of x different biomarkers of Table 1, wherein the levels of the biomarkers provide a diagnostic indicator of whether the subject has lupus. The value of x is 2 or more e.g. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more (e.g. up to 42). These panels may include (i) any specific one of the 42 biomarkers in Table 1 in combination with (ii) any of the other 41 biomarkers in Table 1. Suitable panels are described below and panels of particular interest include those listed in Tables 5, 6 and 13. Preferred panels have from 2 to 15 biomarkers, as using >15 of them generally adds little to sensitivity and specificity.
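To give a sense of scale, the number of distinct panels of x biomarkers that can be drawn from the 42 Table 1 biomarkers is the binomial coefficient C(42, x). The short Python sketch below computes it. The function name `panel_count` is a hypothetical helper introduced here for illustration only, and the calculation simply counts unordered selections; it says nothing about which panels perform well.

```python
from math import comb

TABLE_1_SIZE = 42  # number of Table 1 biomarkers stated in the text


def panel_count(x: int, total: int = TABLE_1_SIZE) -> int:
    """Number of distinct unordered panels of x biomarkers chosen from `total`."""
    if not 2 <= x <= total:
        raise ValueError("a panel needs between 2 and `total` biomarkers")
    return comb(total, x)


print(panel_count(2))  # prints 861
print(panel_count(3))  # prints 11480
```

For example, any specific one of the 42 biomarkers can be paired with any of the other 41, giving 42 × 41 / 2 = 861 distinct two-biomarker panels.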

The Table 1 biomarkers can be used in combination with one or more of: (a) known biomarkers for lupus, which may or may not be auto-antibodies or antigens; and/or (b) other information about the subject from whom a sample was taken e.g. age, genotype (genetic variations can affect auto-antibody profiles [7] and considerable progress on the elucidation of the genetics of lupus has been made [8]), weight, other clinically-relevant data or phenotypic information; and/or (c) other diagnostic tests or clinical indicators for lupus; and/or any biomarkers from Table 2. Such combinations can enhance the sensitivity and/or specificity of diagnosis. Known lupus biomarkers of particular interest include, but are not limited to, auto-antibodies against dsDNA, SSA and/or any of the antigens listed in Table 3.

For example, a useful panel includes auto-antibodies against x different biomarkers from Table 1 (as described above) in combination with auto-antibodies against one or more of dsDNA, SSA and/or any of the antigens listed in Table 3. Examples of such panels are disclosed in Tables 5 and 6.

Thus the invention provides a method for analysing a subject sample, comprising a step of determining:

(a) the level(s) of y Table 1 biomarker(s), wherein the levels of the biomarkers provide a diagnostic indicator of whether the subject has lupus; and also one or more of:

(b) if a sample from the subject contains a known biomarker selected from the group consisting of auto-antibodies including ANA, anti-Smith, anti-dsDNA, anti-phospholipid, anti-single stranded DNA (ssDNA), anti-RNP, anti-Ro, anti-La, anti-cardiolipin, anti-histone and/or those antibodies against antigens described in Sherer et al. [5] (and optionally, any other known biomarkers e.g. see above); wherein detection of the known biomarker provides a second diagnostic indicator of whether the subject has lupus;

(c) if the subject has one or more of a false positive serological test for syphilis, serositis, pleuritis, pericarditis, oral ulcers, nonerosive arthritis of two or more peripheral joints, photosensitivity, hemolytic anemia, leukopenia, lymphopenia, thrombocytopenia, hypocomplementemia, renal disorder, seizures, psychosis, malar rash, and/or discoid rash, wherein a positive test for these provides a third diagnostic indicator of whether the subject has lupus;

(d) the subject's age and/or gender, and combining the different diagnostic indicators (and optionally age and/or gender) to provide an aggregate diagnostic indicator of whether the subject has lupus.

The samples used in (a) and (b) may be the same or different.

The value of y is 1 or more e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 (e.g. up to 42). When y>1 the invention uses a panel of different Table 1 biomarkers.

The invention also provides, in a method for diagnosing if a subject has lupus, an improvement consisting of determining in a sample from the subject the level(s) of y biomarker(s) of Table 1, wherein the level(s) of the biomarker(s) provide a diagnostic indicator of whether the subject has lupus. The biomarker(s) of Table 1 can be used in combination with known lupus biomarkers, as discussed above.

The invention also provides a method for diagnosing a subject as having lupus, comprising steps of: (i) determining the levels of y biomarkers of Table 1 in a sample from the subject; and (ii) comparing the determination from step (i) to data obtained from samples from subjects without lupus and/or from subjects with lupus, wherein the comparison provides a diagnostic indicator of whether the subject has lupus. The comparison in step (ii) can use a classifier algorithm as discussed in more detail below. The biomarkers measured in step (i) can be used in combination with known lupus biomarkers, as discussed above.

The invention also provides a method for monitoring development of lupus in a subject, comprising steps of: (i) determining the levels of z₁ biomarker(s) of Table 1 in a first sample from the subject taken at a first time; and (ii) determining the levels of z₂ biomarker(s) of Table 1 in a second sample from the subject taken at a second time, wherein: (a) the second time is later than the first time; (b) one or more of the z₂ biomarker(s) were present in the first sample; and (c) a change in the level(s) of the biomarker(s) in the second sample compared with the first sample indicates that lupus is in remission or is progressing. Thus the method monitors the biomarker(s) over time, with changing levels indicating whether the disease is getting better or worse. The disease development can be either an improvement or a worsening, and this method may be used in various ways e.g. to monitor the natural progress of a disease, or to monitor the efficacy of a therapy being administered to the subject. Thus a subject may receive a therapeutic agent before the first time, at the first time, or between the first time and the second time. Increased levels of antibodies against a particular antigen may be due to "epitope spreading", in which additional antibodies or antibody classes are raised to antigens against which an antibody response has already been mounted [9].

The value of z₁ is 1 or more e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 (e.g. up to 42). The value of z₂ is 1 or more e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 (e.g. up to 42). The values of z₁ and z₂ may be the same or different. If they are different, it is usual that z₁>z₂ as the later analysis (z₂) can focus on biomarkers which were already detected in the earlier analysis; in other embodiments, however, z₂ can be larger than z₁ e.g. if previous data have indicated that an expanded panel should be used; in other embodiments z₁=z₂ e.g. so that, for convenience, the same panel can be used for both analyses. When z₁>1 or z₂>1, the biomarkers are different biomarkers. The z₁ and/or z₂ biomarker(s) can be used in combination with known lupus biomarkers, as discussed above.

The invention also provides a method for monitoring development of lupus in a subject, comprising steps of: (i) determining the level of at least w₁ Table 1 biomarkers in a first sample taken at a first time from the subject; and (ii) determining the level of at least w₂ Table 1 biomarkers in a second sample taken at a second time from the subject, wherein: (a) the second time is later than the first time; (b) at least one biomarker is common to both the w₁ and w₂ biomarkers; (c) the level of at least one biomarker common to both the w₁ and w₂ biomarkers is different in the first and second samples, thereby indicating that the lupus is progressing or regressing. Thus the method monitors the range of biomarkers over time, with a broadening in the number of detected biomarkers indicating that the disease is getting worse. As mentioned above, this method may be used to monitor disease development in various ways.

The value of w₁ is 1 or more e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 (e.g. up to 42). The value of w₂ is 2 or more e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 (e.g. up to 42). The values of w₁ and w₂ may be the same or different. If they are different, it is usual that w₂>w₁, as the later analysis should focus on a biomarker panel that is at least as wide as the number already detected in the earlier analysis. There will usually be an overlap between the w₁ and w₂ biomarkers (including situations where they are the same, such that the same biomarkers are measured at two time points) but it is also possible for w₁ and w₂ to have no biomarkers in common. The w₁ and/or w₂ biomarker(s) can be used in combination with known lupus biomarkers, as discussed above.

Where the methods involve a first time and a second time, these times may differ by at least 1 day, 1 week, 1 month or 1 year. Samples may be taken regularly. The methods may involve measuring biomarkers in more than 2 samples taken at more than 2 time points i.e. there may be a 3rd sample, a 4th sample, a 5th sample, etc.
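The comparison of a first and a second sample described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the inventors' method: the function name `changed_biomarkers`, the use of a fold change, and the threshold of 1.5 are all hypothetical choices, and the levels shown are made-up values in arbitrary units. The sketch only reports which shared biomarkers changed; interpreting a change as progression or remission is left to the analysis described in the text.

```python
from __future__ import annotations


def changed_biomarkers(
    first: dict[str, float],
    second: dict[str, float],
    min_fold_change: float = 1.5,  # hypothetical threshold, not from the source
) -> dict[str, float]:
    """Return fold changes (second / first) for biomarkers measured at both
    time points whose level changed by at least `min_fold_change` in either
    direction. Biomarkers measured at only one time point are ignored."""
    changes = {}
    for name in first.keys() & second.keys():
        if first[name] <= 0 or second[name] <= 0:
            continue  # a fold change is undefined for non-positive levels
        fold = second[name] / first[name]
        if fold >= min_fold_change or fold <= 1 / min_fold_change:
            changes[name] = fold
    return changes


# Made-up auto-antibody levels (arbitrary units) at two time points.
first_sample = {"ATF4": 100.0, "WAS": 200.0, "PRM2": 50.0}
second_sample = {"ATF4": 300.0, "WAS": 210.0, "SMAD5": 80.0}

print(changed_biomarkers(first_sample, second_sample))
# prints {'ATF4': 3.0}
```

Here ATF4 and WAS are common to both samples, but only ATF4 changed by more than the chosen threshold; PRM2 and SMAD5 were each measured at only one time point and so cannot be compared.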

The invention also provides a diagnostic device for use in diagnosis of lupus, wherein the device permits determination of the level(s) of y Table 1 biomarkers. The value of y is defined above. The device may also permit determination of whether a sample contains one or more of the known lupus biomarkers mentioned above.

The invention also provides a kit comprising (i) a diagnostic device of the invention and (ii) instructions for using the device to detect y of the Table 1 biomarkers. The value of y is defined above. The kit is useful in the diagnosis of lupus.

The invention also provides a kit comprising reagents for measuring the levels of x different Table 1 biomarkers. The kit may also include reagents for determining whether a sample contains one or more of the known lupus biomarkers mentioned above. The value of x is defined above. The kit is useful in the diagnosis of lupus.

The invention also provides a kit comprising components for preparing a diagnostic device of the invention. For instance, the kit may comprise individual detection reagents for x different biomarkers, such that an array of those x biomarkers can be prepared.

The invention also provides a product comprising (i) one or more detection reagents which permit measurement of x different Table 1 biomarkers, and (ii) a sample from a subject.

The invention also provides a software product comprising (i) code that accesses data attributed to a sample, the data comprising measurement of y Table 1 biomarkers, and (ii) code that executes an algorithm for assessing the data to represent a level of y of the biomarkers in the sample. The software product may also comprise (iii) code that executes an algorithm for assessing the result of step (ii) to provide a diagnostic indicator of whether the subject has lupus. As discussed below, suitable algorithms for use in part (iii) include support vector machine algorithms, artificial neural networks, tree-based methods, genetic programming, etc. The algorithm can preferably classify the data of part (ii) to distinguish between subjects with lupus and subjects without based on measured biomarker levels in samples taken from such subjects. The invention also provides methods for training such algorithms. The y biomarker(s) can be used in combination with known lupus biomarkers, as discussed above.
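The three parts of such a software product (accessing the data, representing biomarker levels, and classifying) can be sketched as follows. The text names support vector machines, artificial neural networks, tree-based methods and genetic programming as suitable algorithms; this sketch deliberately substitutes a much simpler nearest-centroid classifier so that it stays self-contained, and it should not be read as the algorithm used by the inventors. The function names and the training values are hypothetical.

```python
from math import dist


def train_centroids(samples, labels):
    """Average the biomarker-level vectors of each class (training step)."""
    sums, counts = {}, {}
    for vec, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(centroids, vec):
    """Assign a sample to the class whose centroid is nearest (Euclidean)."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Made-up training data: each vector holds the levels of two biomarkers.
train_samples = [[5.0, 4.0], [6.0, 5.0], [1.0, 1.0], [2.0, 0.0]]
train_labels = ["lupus", "lupus", "non-lupus", "non-lupus"]

centroids = train_centroids(train_samples, train_labels)
print(classify(centroids, [5.0, 5.0]))  # prints "lupus"
print(classify(centroids, [1.0, 0.5]))  # prints "non-lupus"
```

A real implementation would be trained on measured biomarker levels from subjects with and without lupus and validated on independent samples before being used as a diagnostic indicator.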

The invention also provides a computer which is loaded with and/or is running a software product of the invention.

The invention also extends to methods for communicating the results of a method of the invention. This method may involve communicating assay results and/or diagnostic results. Such communication may be to, for example, technicians, physicians or patients. In some embodiments, detection methods of the invention will be performed in one country and the results will be communicated to a recipient in a different country.

The invention also provides an isolated antibody (preferably a human antibody) which recognises one of the antigens listed in Table 1. The invention also provides an isolated nucleic acid encoding the heavy and/or light chain of the antibody. The invention also provides a vector comprising this nucleic acid, and a host cell comprising this vector. The invention also provides a method for expressing the antibody comprising culturing the host cell under conditions which permit production of the antibody. The invention also provides derivatives of the human antibody e.g. F(ab')₂ and F(ab) fragments, Fv fragments, single-chain antibodies such as single chain Fv molecules (scFv), minibodies, dAbs, etc.

The invention also provides the use of a Table 1 biomarker as a biomarker for lupus.

The invention also provides the use of x different Table 1 biomarkers as biomarkers for lupus. The value of x is defined above. These may include (i) any specific one of the 42 biomarkers in Table 1 in combination with (ii) any of the other 41 biomarkers in Table 1.

The invention also provides the use as combined biomarkers for lupus of (a) at least y Table 1 biomarker(s) and (b) biomarkers including auto-antibodies including ANA, anti-Smith, anti-dsDNA, anti-phospholipid, anti-ssDNA, anti-histone, false positive serological test for syphilis, indicators of serositis, oral ulcers, arthritis, photosensitivity, haematological disorder, renal disorder, antinuclear antibody, immunologic disorder, neurologic disorder, malar rash, discoid rash (and optionally, any other known biomarkers e.g. see above). The value of y is defined above. When y>1 the invention uses a panel of biomarkers of the invention. Such combinations include those discussed above.

Biomarkers of the invention

Auto-antibodies against 77 different human antigens have been identified and these can be used as lupus biomarkers. Details of the 77 antigens are given in Tables 1 and 2. Within the 77 antigens, 42 human antigens mentioned in Table 1 are particularly useful for distinguishing between samples from subjects with lupus and from subjects without lupus. Preferred subsets of these 42 antigens are listed in Table 7.

The biomarkers from Table 1 have sufficient performance such that a panel consisting of any of the biomarkers from Table 1 is useful with the invention. Adding any of the biomarkers from Table 2 to any of the biomarkers from Table 1 is also useful with the invention because this improves the sensitivity and/or specificity of diagnosis over that of Table 1 alone. Panels which do not include at least one biomarker from the 77 biomarkers listed in Tables 1 and 2 do not form part of the invention.

Further auto-antibody biomarkers can be used in addition to the 77 biomarkers listed in Tables 1 and 2 (e.g. any of the biomarkers listed in Table 3).

The following antigens from Table 1 have been identified by the inventors to provide the best performance in the examples (they have the lowest p-values), so they are particularly useful with the invention: BIRC3, ATF4, DLX3, WAS, NEUROD4, PRM2, SMAD5, CARHSP1, TGIF1 and HIST1H4I. The following antigens from Table 1 are present at high frequencies in the preferred panels of the invention (Tables 5 and 6), so they are also particularly useful with the invention: BIRC3, ATF4, WAS, PRM2, CARHSP1, HIST1H4I, HOXC10, IRF4, SUB1 and PPP2R5A.

The sequence listing provides an example of a natural coding sequence for these antigens. These specific coding sequences are not limiting on the invention, however, and auto-antibody biomarkers may recognise variants of polypeptides encoded by these natural sequences (e.g. allelic variants, polymorphic forms, mutants, splice variants, or gene fusions), provided that the variant has an epitope recognised by the auto-antibody. Details on allelic variants of or mutations in human genes are available from various sources, such as the ALFRED database [10] or, in relation to disease associations, the OMIM [11] and HGMD [12] databases. Details of splice variants of human genes are available from various sources, such as ASD [13].

As mentioned above, detection of a single Table 1 biomarker can provide useful diagnostic information, but each biomarker might not individually provide information which is useful i.e. auto-antibodies against a Table 1 antigen may be present in some, but not all, subjects with lupus. An inability of a single biomarker to provide universal diagnostic results for all subjects does not mean that this biomarker has no diagnostic utility, however, or else ANA also would not be useful; rather, any such inability means that the test results (as in all diagnostic tests) have to be properly understood and interpreted.

To address the possibility that a single biomarker might not provide universal diagnostic results, and to increase the overall confidence that an assay is giving sensitive and specific results across a disease population, it is advantageous to analyse a plurality of the Table 1 biomarkers (i.e. a panel). For instance, although a negative signal for a particular Table 1 antigen is not necessarily indicative of the absence of lupus (just as absence of antibodies to DNA is not), confidence that a subject does not have lupus increases as the number of negative results increases. For example, if all 77 biomarkers are tested and are negative then the result provides a higher degree of confidence than if only 1 biomarker is tested and is negative. Thus biomarker panels are most useful for enhancing the distinction seen between diseased and non-diseased samples. As mentioned above, though, preferred panels have from 2 to 15 biomarkers as the burden of measuring a higher number of markers is usually not rewarded by better sensitivity or specificity. Preferred panels are given below, including panels which include known lupus biomarkers.

Where a biomarker or panel provides a strong distinction between lupus and non-lupus subjects then a method for analysing a subject sample can function as a method for diagnosing if a subject has lupus. As with many diagnostic tests, however, and as is already known for other diagnostic tests e.g. the PSA test used for prostate cancer, a method may not always provide a definitive diagnosis and so a method for analysing a subject sample can sometimes function only as a method for aiding in the diagnosis of lupus, or as a method for contributing to a diagnosis of lupus, where the method's result may imply that the subject has lupus (e.g. the disease is more likely than not) and/or may confirm other diagnostic indicators (e.g. based on clinical symptoms). The test may therefore function as an adjunct to, or be integrated into, the SLEDAI analysis, or similar methodologies for assessing disease activity e.g. adjusted mean SLEDAI, SELENA-SLEDAI, Systemic Lupus Activity Measure (SLAM), British Isles Lupus Activity Group (BILAG). Dealing with these considerations of certainty/uncertainty is well known in the diagnostic field.

The biomarkers of the invention are also useful for distinguishing between subjects with lupus and subjects with confounding diseases, such as rheumatoid arthritis (RA), polymyositis-dermatomyositis (PM-DM), systemic sclerosis (SSc or scleroderma), Sjogren's syndrome and vasculitis. The inventors found that the biomarkers from Tables 1-3 are all effective in distinguishing between SLE, healthy cohorts and confounding disease subjects (connective tissue disease, polymyositis, RA, scleroderma, and Sjogren's Syndrome). In addition, the inventors found that TROVE2, SSB and PSME3 are particularly useful in distinguishing between SLE, healthy cohorts and confounding disease subjects (connective tissue disease, polymyositis, RA, scleroderma, and Sjogren's Syndrome).

The subject

The invention is used for diagnosing disease in a subject. The subject will usually be female and at least 10 years old (e.g. >15, >20, >25, >30, >35, >40, >45, >50, >55, >60, >65, >70). They will usually be at least of child-bearing age as the risk of lupus increases in this age group, and for these subjects it may be appropriate to offer a screening service for Table 1 biomarkers. The subject may be a post-menopausal female.

The subject may be pre-symptomatic for lupus or may already be displaying clinical symptoms. For pre-symptomatic subjects the invention is useful for predicting that symptoms may develop in the future if no preventative action is taken. For subjects already displaying clinical symptoms, the invention may be used to confirm or resolve another diagnosis. The subject may already have begun treatment for lupus.

In some embodiments the subject may already be known to be predisposed to development of lupus e.g. due to family or genetic links. In other embodiments, the subject may have no such predisposition, and may develop the disease as a result of environmental factors e.g. as a result of exposure to particular chemicals (such as toxins or pharmaceuticals), as a result of diet [14], of infection, of oral contraceptive use, of postmenopausal use of hormones, etc. [15].

Because the invention can be implemented relatively easily and cheaply it is not restricted to being used in patients who are already suspected of having lupus. Rather, it can be used to screen the general population or a high risk population e.g. subjects at least 10 years old, as listed above.

The subject will typically be a human being. In some embodiments, however, the invention is useful in non-human organisms e.g. mouse, rat, rabbit, guinea pig, cat, dog, horse, pig, cow, or non-human primate (monkeys or apes, such as macaques or chimpanzees). In non-human embodiments, any detection antigens used with the invention will typically be based on the relevant non-human ortholog of the human antigens disclosed herein. In some embodiments animals can be used experimentally to monitor the impact of a therapeutic on a particular biomarker.

The sample

The invention analyses samples from subjects. Many types of sample can include auto-antibodies and/or antigens suitable for detection by the invention, but the sample will typically be a body fluid. Suitable body fluids include, but are not limited to, blood, serum, plasma, saliva, lymphatic fluid, a wound secretion, urine, faeces, mucus, sweat, tears and/or cerebrospinal fluid. The sample is typically serum or plasma.

In some embodiments, a method of the invention involves an initial step of obtaining the sample from the subject. In other embodiments, however, the sample is obtained separately from and prior to performing a method of the invention. After a sample has been obtained then methods of the invention are generally performed in vitro.

Detection of biomarkers may be performed directly on a sample taken from a subject, or the sample may be treated between being taken from a subject and being analysed. For example, a blood sample may be treated to remove cells, leaving antibody-containing plasma for analysis, or to remove cells and various clotting factors, leaving antibody-containing serum for analysis. Faeces samples usually require physical treatment prior to protein detection e.g. suspension, homogenisation and centrifugation. For some body fluids, though, such separation treatments are not usually required (e.g. tears or saliva) but other treatments may be used. For example, various types of sample may be subjected to treatments such as dilution, aliquoting, sub-sampling, heating, freezing, irradiation, etc. between being taken from the body and being analysed e.g. serum is usually diluted prior to analysis. Also, addition of processing reagents is typical for various sample types e.g. addition of anticoagulants to blood samples.

Biomarker detection

The invention involves determining the level of Table 1 biomarker(s) in a sample. Immunochemical techniques for detecting antibodies against specific antigens are well known in the art, as are techniques for detecting specific antigens themselves. Detection of an antibody will typically involve contacting a sample with a detection antigen, wherein a binding reaction between the sample and the detection antigen indicates the presence of the antibody of interest. Detection of an antigen will typically involve contacting a sample with a detection antibody, wherein a binding reaction between the sample and the detection antibody indicates the presence of the antigen of interest. Detection of an antigen can also be determined by non-immunological methods, depending on the nature of the antigen e.g. if the antigen is an enzyme then its enzymatic activity can be assayed, or if the antigen is a receptor then its binding activity can be assayed, etc. For example, the CLK1 kinase can be assayed using methods known in the art.

A detection antigen for a biomarker antibody can be a natural antigen recognised by the auto-antibody (e.g. a mature human protein disclosed in Table 1), or it may be an antigen comprising an epitope which is recognised by the auto-antibody. It may be a recombinant protein or synthetic peptide. Where a detection antigen is a polypeptide its amino acid sequence can vary from the natural sequences disclosed above, provided that it has the ability to specifically bind to an auto-antibody of the invention (i.e. the binding is not non-specific and so the detection antigen will not arbitrarily bind to antibodies in a sample). It may even have little in common with the natural sequence (e.g. a mimotope, an aptamer, etc.). Typically, though, a detection antigen will comprise an amino acid sequence (i) having at least 90% (e.g. >91%, >92%, >93%, >94%, >95%, >96%, >97%, >98%, >99%) sequence identity to the relevant SEQ ID NO disclosed herein across the length of the detection antigen, and/or (ii) comprising at least one epitope from the relevant SEQ ID NO disclosed herein. Thus the detection antigen may be one of the variants discussed above. Epitopes are the parts of an antigen that are recognised by and bind to the antigen binding sites of antibodies and are also known as "antigenic determinants". An epitope-containing fragment may contain a linear epitope from within a SEQ ID NO and so may comprise a fragment of at least n consecutive amino acids of the SEQ ID NO, wherein n may be 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). B-cell epitopes can be identified empirically (e.g. using PEPSCAN [16,17] or similar methods), or they can be predicted e.g. using the Jameson-Wolf antigenic index [18], ADEPT [19], hydrophilicity [20], antigenic index [21], MAPITOPE [22], SEPPA [23], matrix-based approaches [24], the amino acid pair antigenicity scale [25], or any other suitable method e.g. see ref. 26. Predicted epitopes can readily be tested for actual immunochemical reactivity with samples.
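
By way of illustration only, the notions of "a fragment of at least n consecutive amino acids" and of percentage identity "across the length of the detection antigen" can be sketched as follows. The function names and the toy sequences used with them are hypothetical and are not taken from any SEQ ID NO; the identity calculation is a simplified ungapped comparison of equal-length sequences, whereas a practical comparison would normally use a sequence alignment tool.

```python
def linear_fragments(sequence, n=7, step=1):
    """Return every window of n consecutive residues from a sequence.

    Hypothetical helper: enumerates candidate linear-epitope fragments
    in the manner of an overlapping-peptide scan.
    """
    if n < 1 or step < 1:
        raise ValueError("n and step must be positive")
    return [sequence[i:i + n] for i in range(0, len(sequence) - n + 1, step)]


def percent_identity(candidate, reference):
    """Ungapped percent identity across the length of the candidate.

    Simplifying assumption: both sequences have the same length and are
    compared position by position, with no alignment step.
    """
    if not candidate or len(candidate) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(candidate)
```

For a toy nine-residue sequence and n = 7, `linear_fragments` yields three overlapping fragments; a ten-residue candidate differing from its reference at one position has 90% identity under this simplified measure.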

Detection antigens can be purified from human sources but it is more typical to use recombinant antigens (particularly where the detection antigen uses sequences which are not present in the natural antigen e.g. for attachment). Various systems are available for recombinant expression, and the choice of system may depend on the auto-antibody to be detected. For example, prokaryotic expression (e.g. using E.coli) is useful for detecting many auto-antibodies, but if an auto-antibody recognises a glycoprotein then eukaryotic expression may be required. Similarly, if an auto-antibody recognises a specific discontinuous epitope then a recombinant expression system which provides correct protein folding may be required.

The detection antigen may be a fusion polypeptide with a first region and a second region, wherein the first region can react with an auto-antibody in a sample and the second region can react with a substrate to immobilise the fusion polypeptide thereon.

A detection antibody for a biomarker antigen can be a monoclonal antibody or a polyclonal antibody. Typically it will be a monoclonal antibody. The detection antibody should have the ability to specifically bind to a Table 1 antigen (i.e. the binding is not non-specific and so the detection antibody will not arbitrarily bind to other antigens in a sample).

Various assay formats can be used for detecting biomarkers in samples. For example, the invention may use one or more of western blot, immunoprecipitation, silver staining, mass spectrometry (e.g. MALDI-MS), conductivity-based methods, dot blot, slot blot, colorimetric methods, fluorescence-based detection methods, or any form of immunoassay, etc. The binding of antibodies to antigens can be detected by any means, including enzyme-linked assays such as ELISA, radioimmunoassays (RIA), immunoradiometric assays (IRMA), immunoenzymatic assays (IEMA), DELFIA™ assays, surface plasmon resonance or other evanescent light techniques (e.g. using planar waveguide technology), label-free electrochemical sensors, etc. Sandwich assays are typical for immunological methods. In embodiments where multiple biomarkers are to be detected an array-based assay format is preferable, in which a sample that potentially contains the biomarkers is simultaneously contacted with multiple detection reagents (antibodies and/or antigens) in a single reaction compartment. Antigen and antibody arrays are well known in the art e.g. see references 27-33, including arrays for detecting auto-antibodies. Such arrays may be prepared by various techniques, such as those disclosed in references 34-38, which are particularly useful for preparing microarrays of correctly-folded polypeptides to facilitate binding interactions with auto-antibodies. It has been estimated that most B-cell epitopes are discontinuous and such epitopes are known to be important in diseases with an autoimmune component. For example, in autoimmune thyroid diseases, auto-antibodies arise to discontinuous epitopes on the immunodominant region on the surface of thyroid peroxidase and in Goodpasture disease auto-antibodies arise to two major conformational epitopes. Protein arrays which have been developed to present correctly-folded polypeptides displaying native structures and discontinuous epitopes are therefore particularly well suited to studies of diseases where auto-antibody responses occur [31].

Methods and apparatuses for detecting binding reactions on protein arrays are now standard in the art. Preferred detection methods are fluorescence-based detection methods. To detect biomarkers which have bound to immobilised proteins a sandwich assay is typical e.g. in which the primary antibody is an auto-antibody from the sample and the secondary antibody is a labelled anti-sample antibody (e.g. an anti-human antibody).

Where a biomarker is an auto-antibody the invention will generally detect IgG antibodies, but detection of auto-antibodies with other subtypes is also possible e.g. by using a detection reagent which recognises the appropriate class of auto-antibody (IgA, IgM, IgE or IgD rather than IgG). The assay format may be able to distinguish between different antibody subtypes and/or isotypes. Different subtypes [39] and isotypes [40] can influence auto-antibody repertoires. For instance, a sandwich assay can distinguish between different subtypes by using differentially-labelled secondary antibodies e.g. different labels for anti-IgG and anti-IgM.

As mentioned above, the invention provides a diagnostic device which permits determination of whether a sample contains Table 1 biomarkers. Such devices will typically comprise one or more antigen(s) and/or antibodies immobilised on a solid substrate (e.g. on glass, plastic, nylon, etc.). Immobilisation may be by covalent or non-covalent bonding (e.g. non-covalent bonding of a fusion polypeptide, as discussed above, to an immobilised functional group such as an avidin [36] or a bleomycin-family antibiotic [38]). Antigen arrays are a preferred format, with detection antigens being individually addressable. The immobilised antigens will be able to react with auto-antibodies which recognise a Table 1 antigen.

In some embodiments, the solid substrate may comprise a strip, a slide, a bead, a well of a microtitre plate, a conductive surface suitable for performing mass spectrometry analysis [41], a semiconductive surface [42,43], a surface plasmon resonance support, a planar waveguide technology support, a microfluidic device, or any other device or technology suitable for detection of antibody-antigen binding.

Where the invention provides or uses an antigen array for detecting a panel of auto-antibodies as disclosed herein, in some embodiments the array may include only antigens for detecting these auto-antibodies. In other embodiments, however, the array may include polypeptides in addition to those useful for detecting the auto-antibodies. For example, an array may include one or more control polypeptides. Suitable positive control polypeptides include an anti-human immunoglobulin antibody, such as an anti-IgM antibody, an anti-IgG antibody, an anti-IgA antibody, an anti-IgE antibody or combinations thereof. Other suitable positive control polypeptides which can bind to sample antibodies include protein A or protein G, typically in recombinant form. Suitable negative control polypeptides include, but are not limited to, β-galactosidase, serum albumins (e.g. bovine serum albumin (BSA) or human serum albumin (HSA)), protein tags, bacterial proteins, yeast proteins, citrullinated polypeptides, etc. Negative control features on an array can also be polypeptide-free e.g. buffer alone, DNA, etc. An array's control features are used during performance of a method of the invention to check that the method has performed as expected e.g. to ensure that expected proteins are present (e.g. a positive signal from serum proteins in a serum sample) and that unexpected substances are not present (e.g. a positive signal from an array spot of buffer alone would be unexpected).

In an antigen array of the invention, at least 10% (e.g. >20%, >30%, >40%, >50%, >60%, >70%, >80%, >90%, >95%, or more) of the total number of different proteins present on the array may be for detecting auto-antibodies as disclosed herein.

An antigen array of the invention may include one or more replicates of a detection antigen and/or control feature e.g. duplicates, triplicates or quadruplicates. Replicates provide redundancy, provide intra-array controls, and facilitate inter-array comparisons.

An antigen array of the invention may include detection antigens for more than just the 44 different auto-antibodies described here, but preferably it can detect antibodies against fewer than 10000 antigens (e.g. <5000, <4000, <3000, <2000, <1000, <500, <250, <100, etc.). An array is advantageous because it allows simultaneous detection of multiple biomarkers in a sample. Such simultaneous detection is not mandatory, however, and a panel of biomarkers can also be evaluated in series. Thus, for instance, a sample could be split into sub-samples and the sub-samples could be assayed in series. In this embodiment it may not be necessary to complete analysis of the whole panel e.g. the diagnostic indicators obtained on a subset of the panel may indicate that a patient has lupus without requiring analysis of any further members of the panel. Such incomplete analysis of the panel is encompassed by the invention because of the intention or potential of the method to analyse the complete panel.

As mentioned above, some embodiments of the invention can include a contribution from known tests for lupus, such as ANA and/or anti-dsDNA tests. Any known tests can be used e.g. Farr test, Crithidia, etc.

Thus an array of the invention (or any other assay format) may also provide an assay for one or more of these additional markers e.g. an array may include a DNA spot.

Data interpretation

The invention involves a step of determining the level of Table 1 biomarker(s). In some embodiments of the invention this determination for a particular marker can be a simple yes/no determination, whereas other embodiments may require a quantitative or semi-quantitative determination; still other embodiments may involve a relative determination (e.g. a ratio relative to another marker, or a measurement relative to the same marker in a control sample), and other embodiments may involve a threshold determination (e.g. a yes/no determination whether a level is above or below a threshold). Usually biomarkers will be measured to provide quantitative or semi-quantitative results (whether as relative concentration, absolute concentration, titre, relative fluorescence etc.) as this gives more data for use with classifier algorithms. Usually the raw data obtained from an assay for determining the presence, absence, or level (absolute or relative) require some sort of manipulation prior to their use. For instance, the nature of most detection techniques means that some signal will sometimes be seen even if no antigen/antibody is actually present and so this noise may be removed before the results are interpreted. Similarly, there may be a background level of the antigen/antibody in the general population which needs to be compensated for. Data may need scaling or standardising to facilitate inter-experiment comparisons. These and similar issues, and techniques for dealing with them, are well known in the immunodiagnostic area.

Various techniques are available to compensate for background signal in a particular experiment. For example, replicate measurements will usually be performed (e.g. using multiple features of the same detection antigen on a single array) to determine intra-assay variation, and average values from the replicates can be compared (e.g. the median value of binding to quadruplicate array features). Furthermore, standard markers can be used to determine inter-assay variation and to permit calibration and/or normalisation e.g. an array can include one or more standards for indicating whether measured signals should be proportionally increased or decreased. For example, an assay might include a step of analysing the level of one or more control marker(s) in a sample e.g. levels of an antigen or antibody unrelated to lupus. Signal may be adjusted according to distribution in a single experiment. For instance, signals in a single array experiment may be expressed as a percentage of interquartile differences e.g. as [observed signal - 25th percentile] / [75th percentile - 25th percentile]. This percentage may then be normalised e.g. using a standard quantile normalisation matrix, such as disclosed in reference 44, in which all percentage values on a single array are ranked and replaced by the average of percentages for antigens with the same rank on all arrays. Overall, this process gives data distributions with identical median and quartile values. Data transformations of this type are standard in the art for permitting valid inter-array comparisons despite variation between different experiments.
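
Purely as an illustrative sketch, the replicate consolidation, interquartile scaling and rank-based quantile normalisation described above might be expressed as follows. The function names and data layout (one dictionary of antigen name to signal per array) are assumptions made for the example, as is the choice of the "inclusive" percentile convention; the procedure of reference 44 may differ in detail, for instance in how tied values are handled.

```python
from statistics import median, quantiles


def consolidate_replicates(replicates):
    """Collapse replicate features to their median, per antigen."""
    return {antigen: median(values) for antigen, values in replicates.items()}


def iqr_scale(signals):
    """Express each signal as (signal - Q1) / (Q3 - Q1) for one array.

    Assumption: percentiles follow the 'inclusive' convention of
    statistics.quantiles; other conventions give slightly different values.
    """
    q1, _, q3 = quantiles(list(signals.values()), n=4, method="inclusive")
    spread = q3 - q1
    if spread == 0:
        raise ValueError("interquartile difference is zero")
    return {antigen: (s - q1) / spread for antigen, s in signals.items()}


def quantile_normalise(arrays):
    """Replace each value by the mean of same-rank values across all arrays.

    `arrays` is a list of dicts that must share the same antigen names.
    Ties are broken by insertion order (a simplification).
    """
    names = set(arrays[0])
    if any(set(a) != names for a in arrays):
        raise ValueError("all arrays must contain the same antigens")
    sorted_values = [sorted(a.values()) for a in arrays]
    rank_means = [
        sum(column[rank] for column in sorted_values) / len(arrays)
        for rank in range(len(names))
    ]
    normalised = []
    for a in arrays:
        ranked = sorted(a, key=a.get)  # antigen names in ascending signal order
        normalised.append({name: rank_means[r] for r, name in enumerate(ranked)})
    return normalised
```

After `quantile_normalise`, every array holds the same set of values (only their assignment to antigens differs), which is why the resulting distributions share identical median and quartile values.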

The level of a biomarker relative to a single baseline level may be defined as a fold difference. Normally it is desirable to use techniques that can indicate a change of at least 1.5-fold e.g. >1.75-fold, >2-fold, >2.5-fold, >5-fold, etc.
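
As a simple worked illustration, a fold difference relative to a baseline, and a check against a minimum fold change such as 1.5-fold, could be computed as below. The function names are hypothetical, and reporting a decrease as the reciprocal ratio (so that a halving counts as a 2-fold decrease) is an assumption about convention rather than a requirement of the invention.

```python
def fold_difference(level, baseline):
    """Return (fold, direction) for a level relative to a baseline.

    Assumption: both values are positive, and decreases are reported as
    the reciprocal ratio, e.g. 50 vs 100 is a 2-fold 'down' change.
    """
    if level <= 0 or baseline <= 0:
        raise ValueError("level and baseline must be positive")
    ratio = level / baseline
    if ratio >= 1:
        return ratio, "up"
    return 1 / ratio, "down"


def meets_fold_threshold(level, baseline, minimum_fold=1.5):
    """True when the change in either direction is at least minimum_fold."""
    fold, _ = fold_difference(level, baseline)
    return fold >= minimum_fold
```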

As well as compensating for variation which is inherent between different experiments, it can also be important to compensate for background levels of a biomarker which are present in the general population. Again, suitable techniques are well known. For example, levels of a particular antigen or auto-antibody in a sample will usually be measured quantitatively or semi-quantitatively to permit comparison to the background level of that biomarker. Various controls can be used to provide a suitable baseline for comparison, and choosing suitable controls is routine in the diagnostic field. Further details of suitable controls are given below.

The measured level(s) of biomarker(s), after any compensation/normalisation/etc., can be transformed into a diagnostic result in various ways. This transformation may involve an algorithm which provides a diagnostic result as a function of the measured level(s). Where a panel is used then each individual biomarker may make a different contribution to the overall diagnostic result and so two biomarkers may be weighted differently.

The creation of algorithms for converting measured levels or raw data into scores or results is well known in the art. For example, linear or non-linear classifier algorithms can be used. These algorithms can be trained using data from any particular technique for measuring the marker(s). Suitable training data will have been obtained by measuring the biomarkers in "case" and "control" samples i.e. samples from subjects known to suffer from lupus and from subjects known not to suffer from lupus. Most usefully the control samples will also include samples from subjects with a related disease which is to be distinguished from the disease of interest e.g. it is useful to train the algorithm with data from rheumatoid arthritis subjects and/or with data from subjects with connective tissue diseases other than lupus. The classifier algorithm is modified until it can distinguish between the case and control samples e.g. by adding or removing markers from the analysis, by changes in weighting, etc. Thus a method of the invention may include a step of analysing biomarker levels in a subject's sample by using a classifier algorithm which distinguishes between lupus subjects and non-lupus subjects based on measured biomarker levels in samples taken from such subjects.

Various suitable classifier algorithms are available e.g. linear discriminant analysis, naive Bayes classifiers, perceptrons, support vector machines (SVM) [45] and genetic programming (GP) [46]. GP is particularly useful as it generally selects relatively small numbers of biomarkers and overcomes the problem of trapping in a local maximum which is inherent in many other classification methods. SVM-based approaches have previously been applied to lupus datasets [47]. The inventors have previously confirmed that both SVM and GP approaches can be trained on the same biomarker panels to distinguish the auto-antibody/antigen biomarker profiles of case and control cohorts with similar sensitivity and specificity i.e. auto-antibody biomarkers are not dependent on a single method of analysis. Moreover, these approaches can potentially distinguish lupus subjects from subjects with (i) other forms of autoimmune disease and (ii) rheumatoid arthritis. The biomarkers in Table 1 can be used to train such algorithms to reliably make such distinctions. The classification performance (sensitivity and specificity, ROC analysis) of any putative biomarkers can be rigorously assessed using nested cross validation and permutation analyses prior to further validation. Biological support for putative biomarkers can be sought using tools and databases including Genespring (version 11.5.1), Biopax pathway for GSEA analysis and pathway analysis e.g. Pathway Studio (version 9.1).
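
To illustrate how a linear classifier assigns a different weight to each biomarker, the following is a minimal sketch of a perceptron, one of the classifier types listed above. It is not the SVM or GP approach used by the inventors; the function names, learning rate and epoch limit are assumptions chosen for the example, and any data supplied to it here would be invented toy values rather than measured biomarker levels.

```python
def train_perceptron(samples, labels, epochs=50, learning_rate=0.1):
    """Learn one weight per biomarker plus a bias from labelled samples.

    samples: list of equal-length lists of normalised biomarker levels.
    labels:  1 for "case" (lupus), 0 for "control".
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for levels, label in zip(samples, labels):
            predicted = classify(levels, weights, bias)
            update = learning_rate * (label - predicted)
            if update:
                # Shift the decision boundary towards the misclassified sample.
                weights = [w + update * v for w, v in zip(weights, levels)]
                bias += update
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return weights, bias


def classify(levels, weights, bias):
    """Return 1 (case) if the weighted sum of levels exceeds zero, else 0."""
    score = sum(w * v for w, v in zip(weights, levels)) + bias
    return 1 if score > 0 else 0
```

A perceptron only converges when the case and control samples are linearly separable, and in practice performance would be estimated on held-out samples (e.g. by cross validation) rather than on the training data.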

It will be appreciated that, although there may be some biomarkers in Table 1 which always give a negative absolute signal when contacted with negative control samples (and thus any positive signal is immediately indicative of lupus), it is more common that a biomarker will give at least a low absolute signal (and thus that a disease-indicating positive signal requires detection of auto-antibody levels above that background level). Thus references herein to detecting a biomarker may not be references to absolute detection but rather (as is standard in the art) to a level above the levels seen in an appropriate negative control. Such controls may be assayed in parallel to a test sample but it can be more convenient to use an absolute control level based on empirical data, or to analyse data using an algorithm which can (e.g. by previous training) use biomarker levels to distinguish samples from disease patients vs. non-disease patients.

The level of a particular biomarker in a sample from a lupus-diseased subject may be above or below the level seen in a negative control sample. Antibodies that react with self-antigens occur naturally in healthy individuals and it is believed that these are necessary for survival of T- and B-cells in the peripheral immune system [48]. In a control population of healthy individuals there may thus be significant levels of circulating auto-antibodies against some of the antigens disclosed in Table 1 and these may occur at a significant frequency in the population. The level and frequency of these biomarkers may be altered in a disease cohort, compared with the control cohort. An analysis of the level and frequency of these biomarkers in the case and control populations may identify differences which provide diagnostic information. The level of auto-antibodies directed against a specific antigen may increase or decrease in a lupus sample, compared with a healthy sample.

In general, therefore, a method of the invention will involve determining whether a sample contains a biomarker level which is associated with lupus. Thus a method of the invention can include a step of comparing biomarker levels in a subject's sample to levels in (i) a sample from a patient with lupus and/or (ii) a sample from a patient without lupus. The comparison provides a diagnostic indicator of whether the subject has lupus. An aberrant level of one or more biomarker(s), as compared to known or standard expression levels of those biomarker(s) in a sample from a patient without lupus, indicates that the subject has lupus.

The level of a biomarker should be significantly different from that seen in a negative control. Advanced statistical tools (e.g. principal component analysis, unsupervised hierarchical clustering and linear modelling) can be used to determine whether two levels are the same or different. For example, an in vitro diagnosis will rarely be based on comparing a single determination. Rather, an appropriate number of determinations will be made with an appropriate level of accuracy to give a desired statistical certainty with an acceptable sensitivity and/or specificity. Antigen and/or antibody levels can be measured quantitatively to permit proper comparison, and enough determinations will be made to ensure that any difference in levels can be assigned a statistical significance to a level of p<0.05 or better. The number of determinations will vary according to various criteria (e.g. the degree of variation in the baseline, the degree of up-regulation in disease states, the degree of noise, etc.) but, again, this falls within the normal design capabilities of a person of ordinary skill in this field. For example, interquartile differences of normalised data can be assessed, and the threshold for a positive signal (i.e. indicating the presence of a particular auto-antibody) can be defined as requiring that antibodies in a sample react with a diagnostic antigen at least 2.5-fold more strongly than the interquartile difference above the 75th percentile. Other criteria are familiar to those skilled in the art and, depending on the assays being used, they may be more appropriate than quantile normalisation. Other methods to normalise data include data transformation strategies known in the art e.g. scaling, log normalisation, median normalisation, etc. For example, raw protein array data can be normalised by consolidating the replicates, transforming the data and applying median normalisation, which has been demonstrated to be appropriate for this type of analysis. Gene expression data can be subjected to background correction via 2D spatial correction and dye bias normalisation via MvA lowess. Normalised gene expression and proteomic data can be analysed for any potential signatures relating to differences between patient cohorts referring to levels of statistical significance (generally p < 0.05), multiple testing correction and fold changes within the expression data that could be indicative of biological effect (generally 2 fold in mRNA compared with a reference value).
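
One possible reading of the interquartile threshold criterion mentioned above is that a signal is scored as positive when it is at least the 75th percentile plus 2.5 times the interquartile difference of the normalised data. The sketch below implements that reading only; the function names, the choice of reference distribution passed in as `normalised_signals`, and the "inclusive" percentile convention are all assumptions made for illustration.

```python
from statistics import quantiles


def positive_threshold(normalised_signals, fold=2.5):
    """Return Q3 + fold * (Q3 - Q1) for a list of normalised signals.

    Assumption: this is one interpretation of "2.5-fold more strongly
    than the interquartile difference above the 75th percentile".
    """
    q1, _, q3 = quantiles(normalised_signals, n=4, method="inclusive")
    return q3 + fold * (q3 - q1)


def is_positive(signal, normalised_signals, fold=2.5):
    """True when a signal reaches the interquartile-based threshold."""
    return signal >= positive_threshold(normalised_signals, fold)
```

With reference signals of 1, 2, 3, 4 and 5, the quartiles under this convention are 2 and 4, so the threshold is 4 + 2.5 × 2 = 9.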

The underlying aim of these data interpretation techniques is to distinguish between the presence of a Table 1 biomarker and of an arbitrary control biomarker, and also to distinguish between the response of a sample from a lupus subject and that of a sample from a control subject. Methods of the invention may have sensitivity of at least 70% (e.g. >70%, >75%, >80%, >85%, >90%, >95%, >96%, >97%, >98%, >99%). Methods of the invention may have specificity of at least 70% (e.g. >70%, >75%, >80%, >85%, >90%, >95%, >96%, >97%, >98%, >99%). Advantageously, methods of the invention may have both specificity and sensitivity of at least 70% (e.g. >70%, >75%, >80%, >85%, >90%, >95%, >96%, >97%, >98%, >99%). As shown in the examples, the invention can consistently provide specificities above approximately 70% and sensitivities greater than approximately 70%.
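
For clarity, sensitivity and specificity as used above can be computed from known disease status and test results as in the following sketch. The function name and the 1/0 encoding of lupus and non-lupus subjects are assumptions for the example.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    Labels: 1 = lupus (case), 0 = non-lupus (control).
    Sensitivity = true positives / all cases.
    Specificity = true negatives / all controls.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 0:
                tn += 1
            else:
                fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)
```

For example, if three of four lupus samples and three of four control samples are classified correctly, both sensitivity and specificity are 75%.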

Data obtained from methods of the invention, and/or diagnostic information based on those data, may be stored in a computer medium (e.g. in RAM, in non-volatile computer memory, on CD, DVD, etc.) and/or may be transmitted between computers e.g. over the internet.

If a method of the invention indicates that a subject has lupus, further steps may then follow. For instance, the subject may undergo confirmatory diagnostic procedures, such as those involving physical inspection of the subject, and/or may be treated with therapeutic agent(s) suitable for treating lupus.

Monitoring the efficacy of therapy

As mentioned above, some methods of the invention involve testing samples from the same subject at two or more different points in time. In general, where the above text refers to the presence or absence of biomarker(s), the invention also includes an increasing or decreasing level of the biomarker(s) over time. An increasing level of an auto-antibody biomarker includes a spread of antibodies in which additional antibodies or antibody classes are raised against a single antigen. Methods which determine changes in biomarker(s) over time can be used, for instance, to monitor the efficacy of a therapy being administered to the subject (e.g. in theranostics). The therapy may be administered before the first sample is taken, at the same time as the first sample is taken, or after the first sample is taken.

The invention can be used to monitor a subject who is receiving lupus therapy. There is presently no cure for lupus. Current therapies for lupus include therapeutic drugs, alternative medicines or life-style changes. Approved drugs include non-steroidal and steroidal anti-inflammatory drugs (e.g. prednisolone), anti-malarials (e.g. hydroxychloroquine) and immunosuppressants (e.g. cyclosporin A). A series of new drugs are being developed, many of which target B-cells, such as Rituximab which targets CD20 and Belimumab (Benlysta) which is directed against B-lymphocyte stimulator (BLyS). The appropriate treatment regime will depend on the severity of the disease, and the responsiveness of the patient. Disease-modifying antirheumatic drugs can be used preventively to reduce the incidence of flares. When flares occur, they are often treated with corticosteroids. Given the similarities between rheumatic diseases, discussed below, it is not surprising that many of the therapeutics developed for one disease may have efficacy in another. In particular, the success of cytokine inhibitors in treating RA has advanced our understanding of these diseases and has opened up the possibility that some of these new classes of therapeutics will be of use in multiple disease areas. For example, Belimumab failed to meet its target in RA but demonstrated efficacy in a phase III trial for lupus and is now marketed as Benlysta. Another anti-CD20 antibody, Ocrelizumab, is being investigated for use in RA and lupus and Imatinib which targets kit, abl and PDGFR kinases is in Phase II for RA and scleroderma. Other representative molecules which are directed towards rheumatic diseases are (target in parentheses): Tocilizumab (IL-6 receptor), AMG714 mAb (IL-15), AIN457 mAb (IL-17), Ustekinumab (IL-23/IL-12), Belimumab (BLyS/BAFF), Atacicept (BLyS/BAFF and APRIL), Baminercept (LTα/LTβ/LIGHT), Ocrelizumab (CD20), Ofatumumab (CD20), TRU-015/SMIP (CD20), Epratuzumab (CD22), Abatacept (CD80/CD86), Denosumab (RANKL), INCB018424 (JAK1/JAK2/Tyk2), CP-690,550 (JAK3), Fostamatinib (Syk), multiple compounds (p38), Imatinib (PDGF-R, c-kit, c-abl), ARRY-162 (ERK/MEK), AS-605240 (PI3Kγ), Maraviroc (CCR5), IB-MECA/CF101 (Adenosine A3 receptor agonist) and CE-224,535 (P2X7 antagonist). Recently, tofacitinib, the first oral Janus kinase inhibitor for RA, was approved.

In related embodiments of the invention, the results of monitoring a therapy are used for future therapy prediction. For example, if treatment with a particular therapy is effective in reducing or eliminating disease symptoms in a subject, and is also shown to decrease levels of a particular biomarker in that subject, detection of that biomarker in another subject may indicate that this other subject will respond to the same therapy. Conversely, if a particular therapy was not effective in reducing or eliminating disease symptoms in a subject who had a particular biomarker or biomarker profile, detection of that biomarker or profile in another subject may indicate that this other subject will also fail to respond to the same therapy.

In other embodiments, the presence of a particular biomarker can be used as the basis of proposing or initiating a particular therapy (patient stratification). For instance, if it is known that levels of a particular auto-antibody can be reduced by administering a particular therapy then that auto-antibody's detection may suggest that the therapy should begin. Thus the invention is useful in a theranostic setting.

Normally at least one sample will be taken from a subject before a therapy begins.

Immunotherapy

Where the development of auto-antibodies to a newly-exposed auto-antigen is causative for a disease, early priming of the immune response can prepare the body to remove antigen-exposing cells when they arise, thereby removing the cause of disease before auto-antibodies develop dangerously. For example, one antigen known to be recognised by auto-antibodies is p53, and this protein is considered to be both a vaccine target and a therapeutic target for the modulation of cancer [49-51]. The antigens listed in Table 1 are thus therapeutic targets for treating lupus.

Thus the invention provides a method for raising an antibody response in a subject, comprising administering to the subject an immunogen which elicits antibodies which recognise an antigen listed in Table 1. The method is suitable for immunoprophylaxis of lupus.

The invention also provides an immunogen for use in medicine, wherein the immunogen can elicit antibodies which recognise an antigen listed in Table 1. Similarly, the invention also provides the use of an immunogen in the manufacture of a medicament for immunoprophylaxis of lupus, wherein the immunogen can elicit antibodies which recognise an antigen listed in Table 1.

As discussed above for detection antigens, the immunogen may be the antigen itself or may comprise an amino acid sequence having identity and/or comprising an epitope from the antigen. Thus the immunogen may comprise an amino acid sequence (i) having at least 90% (e.g. >91%, >92%, >93%, >94%, >95%, >96%, >97%, >98%, >99%) sequence identity to the relevant SEQ ID NO disclosed herein, and/or (ii) comprising at least one epitope from the relevant SEQ ID NO disclosed herein. Other immunogens may also be used, provided that they can elicit antibodies which recognise the antigen of interest.

As an alternative to immunising a subject with a polypeptide immunogen, it is possible to administer a nucleic acid (e.g. DNA or RNA) immunogen encoding the polypeptide, for in situ expression in the subject, thereby leading to the development of an antibody response.

The immunogen may be delivered in conjunction (e.g. in admixture) with an immunological adjuvant. Such adjuvants include, but are not limited to, insoluble aluminium salts, water-in-oil emulsions, oil-in-water emulsions such as MF59 and AS03, saponins, ISCOMs, 3-O-deacylated MPL, immunostimulatory oligonucleotides (e.g. including one or more CpG motifs), bacterial ADP-ribosylating toxins and detoxified derivatives thereof, cytokines, chitosan, biodegradable microparticles, liposomes, imidazoquinolones, phosphazenes (e.g. PCPP), aminoalkyl glucosaminide phosphates, gamma inulins, etc. Combinations of such adjuvants can also be used. The adjuvant(s) may be selected to elicit an immune response involving CD4 or CD8 T cells. The adjuvant(s) may be selected to bias an immune response towards a TH1 phenotype or a TH2 phenotype.

The immunogen may be delivered by any suitable route. For example, it may be delivered by parenteral injection (e.g. subcutaneously, intraperitoneally, intravenously, intramuscularly), or mucosally, such as by oral (e.g. tablet, spray), topical, transdermal, transcutaneous, intranasal, ocular, aural, pulmonary or other mucosal administration.

The immunogen may be administered in a liquid or solid form. For example, the immunogen may be formulated for topical administration (e.g. as an ointment, cream or powder), for oral administration (e.g. as a tablet or capsule, as a spray, or as a syrup), for pulmonary administration (e.g. as an inhaler, using a fine powder or a spray), as a suppository or pessary, as drops, or as an injectable solution or suspension.

Imaging and staining

The antigens listed in Table 1 can be useful for imaging. A labelled antibody against the antigen can be injected in vivo and the distribution of the antigen can then be detected. This method may identify the source of the antigen (e.g. an area in the body where there is a high concentration of the antigen), potentially offering early identification of lupus. Imaging techniques can also be used to monitor the progress or remission of disease, or the impact of a therapy.

The antigens listed in Table 1 can be useful for analysing tissue samples by staining e.g. using standard immunocytochemistry. A labelled antibody against a Table 1 antigen can be contacted with a tissue sample to visualise the location of the antigen. A single sample could be stained with different antibodies against multiple different antigens, and these different antibodies may be differentially labelled to enable them to be distinguished. As an alternative, a plurality of different samples can each be stained with a single antibody.

Thus the invention provides a labelled antibody which recognises an antigen listed in Table 1. The antibody may be a human antibody, as discussed above. Any suitable label can be used e.g. quantum dots, spin labels, fluorescent labels, dyes, etc.

Alternative biomarkers

The invention has been described above by reference to auto-antibody and antigen biomarkers, with assays of auto-antibodies against an antigen being used in preference to assays of the antigen itself. In addition to these biomarkers, however, the invention can be used with other biological manifestations of the Table 1 antigens. For example, the level of mRNA transcripts encoding a Table 1 antigen can be measured, particularly in tissues where that gene is not normally transcribed (such as in the potential disease tissue). Similarly, the chromosomal copy number of a gene encoding a Table 1 antigen can be measured e.g. to check for a gene duplication event. The level of a regulator of a Table 1 antigen can be measured e.g. to look at a microRNA regulator of a gene encoding the antigen. Furthermore, things which are regulated by or respond to a Table 1 antigen can be assessed e.g. if an antigen is a regulator of a metabolic pathway then disturbances in that pathway can be measured. Further possibilities will be apparent to the skilled reader.

Preferred panels

Preferred embodiments of the invention are based on at least two different biomarkers i.e. a panel. Panels of particular interest consist of or comprise combinations of one or more biomarkers listed in Table 1, optionally in combination with at least 1 further biomarker(s) e.g. from Table 2, from Table 3, etc. Preferred panels have from 2 to 15 biomarkers in total. Panels of particular interest consist of or comprise the combinations of biomarkers listed in any of Tables 5, 6 and 13. The panels useful for the invention (e.g. the panels listed in Tables 5, 6 and 13) can be expanded by adding further (i.e. one or more) biomarker(s) to create a larger panel. The further biomarkers can usefully be selected from known biomarkers (as discussed above e.g. see Table 3), or from any of Tables 1, 2 and 8 to 11. Table 8 lists biomarkers described in reference 52. Table 9 lists biomarkers described in reference 53. Table 10 lists biomarkers described in reference 54. Table 11 lists biomarkers disclosed in the literature. In general the addition does not decrease the sensitivity or specificity of the panel shown in the Tables. Such panels include, but are not limited to:

• A panel comprising or consisting of 2 different biomarkers, namely: (i) a biomarker selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 3 different biomarkers, namely: (i) any 2 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 4 different biomarkers, namely: (i) any 3 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 5 different biomarkers, namely: (i) any 4 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 6 different biomarkers, namely: (i) any 5 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 7 different biomarkers, namely: (i) any 6 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 8 different biomarkers, namely: (i) any 7 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 9 different biomarkers, namely: (i) any 8 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 10 different biomarkers, namely: (i) any 9 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 11 different biomarkers, namely: (i) any 10 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 12 different biomarkers, namely: (i) any 11 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 13 different biomarkers, namely: (i) any 12 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 14 different biomarkers, namely: (i) any 13 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 15 different biomarkers, namely: (i) any 14 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

Another preferred panel comprises >1 of the biomarkers from Table 1 plus one or more of: (i) >1 of the biomarkers from Table 8 (ii) >1 of the biomarkers from Table 9 (iii) >1 of the biomarkers from Table 10 and (iv) >1 of the biomarkers from Table 11.

Preferably the panel comprises >1 of the biomarkers from Table 1 plus >1 of the biomarkers from Table 8. Preferably the panel comprises >1 of the biomarkers from Table 1 plus >1 of the biomarkers from Table 9. Preferably the panel comprises >1 of the biomarkers from Table 1 plus >1 of the biomarkers from Table 10. Preferably the panel comprises >1 of the biomarkers from Table 1 plus >1 of the biomarkers from Table 11.

Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8 and (ii) >1 of the biomarkers from Table 9. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8 and (iii) >1 of the biomarkers from Table 10. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8 and (iv) >1 of the biomarkers from Table 11. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (ii) >1 of the biomarkers from Table 9 and (iii) >1 of the biomarkers from Table 10. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (ii) >1 of the biomarkers from Table 9 and (iv) >1 of the biomarkers from Table 11. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (iii) >1 of the biomarkers from Table 10 and (iv) >1 of the biomarkers from Table 11.

Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8, (ii) >1 of the biomarkers from Table 9 and (iii) >1 of the biomarkers from Table 10. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8, (ii) >1 of the biomarkers from Table 9 and (iv) >1 of the biomarkers from Table 11. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8, (iii) >1 of the biomarkers from Table 10 and (iv) >1 of the biomarkers from Table 11. Preferably the panel comprises >1 of the biomarkers from Table 1 plus (ii) >1 of the biomarkers from Table 9, (iii) >1 of the biomarkers from Table 10 and (iv) >1 of the biomarkers from Table 11.

Preferably the panel comprises >1 of the biomarkers from Table 1 plus (i) >1 of the biomarkers from Table 8, (ii) >1 of the biomarkers from Table 9, (iii) >1 of the biomarkers from Table 10 and (iv) >1 of the biomarkers from Table 11.

The >1 of the biomarkers from Table 8 are preferably selected from the group consisting of: FUS, HMG20B, E1B-AP5/HNRNPUL1, HOXB6, LIN28, PABPC1, PSME3, SMN1 and SSA2/TROVE2.

The >1 of the biomarkers from Table 9 are preferably selected from the group consisting of: MAP2K7, MARK4, MLF1 and SSX4.

The >1 of the biomarkers from Table 10 are preferably selected from the group consisting of: ANXA1, APOBEC3G, ARAF, CDC25B, CLK1, CREB1, CSNK2A1, DLX4, EGR2, EZH2, GEM, HMGB2, HNRNPA2B1, HNRNPUL1, HOXB6, ID2, IGF2BP3, LIN28A, MLLT3, NFIL3, PABPC1, PATZ1, PPP2CB, PRM1, PTK2, PTPN4, PYGB, RRAS, SH2B1, SMAD2, SSB, TROVE2, VAV1, WT1 and ZAP70.

The >1 of the biomarkers from Table 11 are preferably selected from the group consisting of: DEK, LYN, MAGEB2, PIK3C3, PPP2CB and VAV1.

Panels of specific interest are the panels shown in Tables 5, 6 and 13. Each of these panels can be combined with a further biomarker selected from Table 1 or Table 2.

A preferred panel comprises or consists of BIRC3 and ANA. Another preferred panel comprises or consists of ANXA1 and RQCD1.

A preferred panel comprises or consists of PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, ANA and dsDNA.

A preferred panel comprises or consists of PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, RNASEL and dsDNA.

A preferred panel comprises or consists of TROVE2, PSME3, PABPC1, RQCD1, MAGEB2, HMG20B, HNRNPUL1, IRF5, ZNF207, NFKBIA and dsDNA.

A preferred panel comprises or consists of dsDNA, PSME3, ZNF207, HNRNPUL1, RQCD1, MAGEB2, PABPC1, NFKBIA, APEX1, HMG20B, RNASEL, NPM1, SMN1, IGF2BP3 and SSB/La.

Table 7 lists the preferred subsets of biomarkers from Table 1. A preferred panel of the invention comprises or consists of two or more of the biomarkers listed in any one of the subsets from Table 7. Preferably, a panel of the invention comprises or consists of two or more biomarkers selected from the group consisting of: TROVE2, HNRNPUL1, PABPC1, LIN28A, PSME3, HMG20B, CDC25B, HNRNPA2B1, SSB/La, IGF2BP3, DLX3, CARHSP1, APEX1, MAP3K7, RPS6KA6, RQCD1, MAGEB1, LYN, MAGEB2 and RARA.

Another preferred panel comprises or consists of a subset of at least 10 of the 12 biomarkers from any of the panels listed in Table 5D. Another preferred panel comprises or consists of a subset of 7-8 of the 20 biomarkers from any of the panels listed in Table 6C. Another preferred panel comprises or consists of a subset of 8-13 of the 18 biomarkers from any of the panels listed in Table 6D.

Another preferred panel comprises or consists of a subset of at least 7, 8, 9, 10, 11, 12, 13, 14 or 15 of the biomarkers from any of the panels listed in Table 13. Preferably, the panel comprises or consists of the biomarkers listed in subset no. 83 of Table 7. Preferably, the panel comprises or consists of the biomarkers listed in any of the subsets no. 84-87 of Table 7.

Panels of the invention contain at least one biomarker from the 77 biomarkers listed in Tables 1 and 2.

General

The term "comprising" encompasses "including" as well as "consisting" e.g. a composition "comprising" X may consist exclusively of X or may include something additional e.g. X + Y.

References to an antibody's ability to "bind" an antigen mean that the antibody and antigen interact strongly enough to withstand standard washing procedures in the assay in question. Thus non-specific binding will be minimised or eliminated.

References to a "level" of a biomarker mean the amount of an analyte measured in a sample and this encompasses relative and absolute concentrations of the analyte, analyte titres, relationships to a threshold, rankings, percentiles, etc.

An assay's "sensitivity" is the proportion of true positives which are correctly identified i.e. the proportion of lupus subjects who test positive by a method of the invention. This can apply to individual biomarkers, panels of biomarkers, single assays or assays which combine data integrated from multiple sources e.g. ANA, anti-dsDNA and/or other clinical tests such as those included in the SLEDAI index. It can relate to the ability of a method to identify samples containing a specific analyte (e.g. antibodies) or to the ability of a method to correctly identify samples from subjects with lupus.

An assay's "specificity" is the proportion of true negatives which are correctly identified i.e. the proportion of subjects without lupus who test negative by a method of the invention. This can apply to individual biomarkers, panels of biomarkers, single assays or assays which combine data integrated from multiple sources e.g. ANA, anti-dsDNA and/or other clinical tests such as those included for consideration in the SLEDAI index. It can relate to the ability of a method to identify samples containing a specific analyte (e.g. antibodies) or to the ability of a method to correctly identify samples from subjects with lupus.
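The two definitions above can be expressed as a short calculation. The following Python sketch is illustrative only: the function name and the example counts are hypothetical and are not data from this application.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: lupus subjects who test positive      fn: lupus subjects who test negative
    tn: non-lupus subjects who test negative  fp: non-lupus subjects who test positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one subject with and one without lupus")
    sensitivity = tp / (tp + fn)  # proportion of lupus subjects testing positive
    specificity = tn / (tn + fp)  # proportion of non-lupus subjects testing negative
    return sensitivity, specificity


# Hypothetical counts, for illustration only
sens, spec = sensitivity_specificity(tp=60, fn=40, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The same calculation applies whether the test result comes from a single biomarker or from a panel whose combined output has been reduced to a positive or negative call.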

Unless specifically stated, a method comprising a step of mixing two or more components does not require any specific order of mixing. Thus components can be mixed in any order. Where there are three components then two components can be combined with each other, and then the combination may be combined with the third component, etc.

References to a percentage sequence identity between two amino acid sequences mean that, when aligned, that percentage of amino acids are the same in comparing the two sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in section 7.7.18 of ref. 55. A preferred alignment is determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman homology search algorithm is disclosed in ref. 56.
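Once two sequences have been aligned, the percentage identity can be computed by counting matching positions. The Python sketch below is illustrative and makes several assumptions: the alignment itself (e.g. by Smith-Waterman with the parameters given above) is produced by separate software, gaps are written as "-", and the denominator is the number of alignment columns. Other conventions for the denominator exist, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned and therefore of equal length.
    Columns where both sequences have a gap are ignored (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap can never match here)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only
print(percent_identity("MKT-LLV", "MKTALIV"))
```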

"anti-dsDNA" is used interchangeably with "dsDNA", indicating the presence of antibodies in a sample which react against native dsDNA or possibly factors associated with native dsDNA.

Table 1 lists 42 biomarkers. From within these 42, a preferred subset is the 37 listed in Table 17. Thus, any reference herein to the 42 biomarkers of Table 1 can be replaced by a reference to the 37 biomarkers in Table 17. For instance, the invention provides a method for analysing a subject sample, comprising a step of determining the levels of x different biomarkers of Table 17, wherein the levels of the biomarkers provide a diagnostic indicator of whether the subject has lupus. The value of x is 2 or more e.g. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more (e.g. up to 37). These panels may include (i) any specific one of the 37 biomarkers in Table 17 in combination with (ii) any of the other 36 biomarkers in Table 17.
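As a rough illustration of how many distinct panels the preceding paragraph encompasses, the number of panels of a given size drawn from the 37 biomarkers of Table 17 can be counted with a binomial coefficient. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes that a panel is an unordered set of distinct biomarkers.

```python
from math import comb


def count_panels(n_biomarkers, panel_size):
    """Number of unordered panels of panel_size distinct biomarkers from n_biomarkers."""
    return comb(n_biomarkers, panel_size)


# Panels of x = 2 biomarkers from the 37 biomarkers of Table 17
print(count_panels(37, 2))  # 37 * 36 / 2 = 666
```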

Table 7 lists 87 preferred subsets of biomarkers from Table 1. From within these 87, 82 preferred subsets are listed in Table 18.

In all embodiments of the invention, where only one biomarker is used, the biomarker is preferably not NFKBIA or CCNB1.

In all embodiments of the invention, where a panel comprises NFKBIA, preferably the panel further comprises one or more biomarkers from Table 1 that is not NFKBIA.

In all embodiments of the invention, where a panel comprises CCNB1, preferably the panel further comprises one or more biomarkers from Table 1 that is not CCNB1.

In some experiments the clone for CDC25B was mixed with CCNB1. Thus, where the application refers to CDC25B, this can optionally be taken as a reference to CDC25B and/or CCNB1. CCNB1 has the sequence listed in SEQ ID NO: 237; details of this gene are: official full name given by NCBI: cyclin B1; HGNC no. 1579; GenInfo Identifier no. 33990813.

BRIEF DESCRIPTION OF DRAWINGS

Figure 1 shows the comparison of anti-dsDNA results for Afro-Caribbean (diamonds) and European (circles) SLE cohorts. The samples were ordered by anti-dsDNA reactivity.

Figure 2 shows the comparison of ANA results for Afro-Caribbean (diamonds) and European (squares) SLE cohorts and confounding disease cohorts: connective tissue disease (black triangles), polymyositis (black circles), RA (grey triangles), scleroderma (grey circles) and Sjogren's syndrome (white circles). The samples were ordered by ANA reactivity.

Figure 3 shows a volcano plot displaying the p-value of a microarray t-test on the y-axis versus the fold change in antibody levels between case and controls on the x-axis. The most relevant features (biomarker candidates) can be found in the top left and top right area of the volcano plot. A dotted line is plotted in the graph to differentiate between potential markers and insignificant events. The minimum selection criteria of a p-value smaller than 0.05 and a fold change of greater than 1.004 were used to identify candidate biomarkers. Global median normalised data and not raw data is used to derive the fold-change values. Large differences in raw RFUs translate to small changes in this value following normalisation. Several of the best-performing markers (including HNRNPA2B1 (C), TROVE2 (A) and SSB/La (B)) in this analysis are highlighted. D: PSME3; E: HMG20B; F: PABPC1; G: LIN28A; H: APOBEC3G; I: IGF2BP3; J: HMGB2; K: DUSP12; L: DLX4; M: HNRNPUL1; N: APEX1; O: ANXA1
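The minimum selection criteria described for Figure 3 amount to a simple filter on two values per feature. The Python sketch below is illustrative only: the function name, the data layout and the example values are hypothetical, and it assumes that the fold change is expressed as case relative to control so that values above 1 indicate higher antibody levels in cases.

```python
def select_candidates(features, p_max=0.05, fold_min=1.004):
    """Return names of features passing both minimum selection criteria.

    features: iterable of (name, p_value, fold_change) tuples.
    A feature is kept when p_value < p_max and fold_change > fold_min.
    """
    return [name for name, p_value, fold_change in features
            if p_value < p_max and fold_change > fold_min]


# Hypothetical values, for illustration only (not data from this application)
example = [
    ("marker_1", 0.001, 1.200),  # passes both criteria
    ("marker_2", 0.200, 1.500),  # fails the p-value criterion
    ("marker_3", 0.010, 1.001),  # fails the fold-change criterion
]
print(select_candidates(example))
```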

Figure 4 shows data for array platform in the absence of ANA and anti-dsDNA data. Receiver operating characteristic (ROC) curve for (A) Wilcoxon feature ranking (AUC=0.83747, S+S=1.567), (B) Entropy feature ranking (AUC=0.845365, S+S=1.5616), (C) Bhattacharyya feature ranking (AUC=0.833, S+S=1.537), (D) tTest feature ranking (AUC=0.828, S+S=1.558), (E) ROC feature ranking (AUC=0.829, S+S=1.564) and (F) FS feature ranking (AUC=0.821, S+S=1.529). The top curve shows the performance of the original data and the bottom curve shows the performance of the permuted data. The optimal performance point is indicated by a circle. Figures 4(G) and (H) show data relating the number of biomarkers in a panel to performance, as measured by: sensitivity + specificity score and AUC, respectively. The optimal number of biomarkers in a panel is indicated by a circle. It can be seen that there is no increase in performance beyond the inclusion of 12 biomarkers in a panel derived using a Wilcoxon algorithm. Figure 4(I) shows data relating the normalised selection frequency for the biomarkers most frequently chosen in the Wilcoxon analysis. A selection frequency of 1.0 indicates that the biomarker is chosen in each panel derived by the analysis.
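The AUC and sensitivity + specificity (S+S) values quoted for these ROC curves can be understood from the following Python sketch. It is illustrative only and is not the software used to generate the Figures: the function names and example scores are hypothetical, a sample is assumed to be called positive when its score is at or above a threshold, and the optimal performance point is assumed to be the threshold maximising sensitivity + specificity.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score used as cut-off."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # case scores
    neg = [s for s, y in zip(scores, labels) if y == 0]  # control scores
    points = []
    for t in sorted(set(scores)):
        sens = sum(s >= t for s in pos) / len(pos)
        spec = sum(s < t for s in neg) / len(neg)
        points.append((t, sens, spec))
    return points


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_operating_point(scores, labels):
    """Threshold with the highest sensitivity + specificity score."""
    return max(roc_points(scores, labels), key=lambda pt: pt[1] + pt[2])


# Hypothetical classifier scores (1 = case, 0 = control), for illustration only
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, sens, spec = best_operating_point(scores, labels)
print(auc(scores, labels), threshold, sens + spec)
```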

Figure 5 shows data for array platform in the presence of ANA and anti-dsDNA data. Receiver operating characteristic (ROC) curves for (A) Entropy feature ranking (AUC=0.863, S+S=1.653), (B) T-test feature ranking (AUC=0.873, S+S=1.654), (C) ROC feature ranking (AUC=0.876, S+S=1.671) and (D) FS feature ranking (AUC=0.879, S+S=1.671). The top curve shows the performance of the original data and the bottom curve shows the performance of the permuted data. The optimal performance point is indicated by a circle.

Figure 5(E) shows data relating the number of biomarkers in a panel selected by the Entropy algorithm to performance, as measured by sensitivity + specificity score. The optimal number of biomarkers in a panel (n=20) as calculated by an analysis algorithm is indicated by a circle.

Figure 5(F) shows data relating the number of biomarkers in a panel selected by the Bhattacharyya algorithm to performance, as measured by sensitivity + specificity score. The optimal number of biomarkers in a panel (n=15) as calculated by an analysis algorithm is indicated by a circle.

Figure 5(G) shows data relating the number of biomarkers in a panel selected by the tTest algorithm to performance, as measured by sensitivity + specificity score. The optimal number of biomarkers in a panel (n=20) as calculated by an analysis algorithm is indicated by a circle.

Figure 5(H) shows data relating the number of biomarkers in a panel selected by the ROC algorithm to performance, as measured by sensitivity + specificity score. The optimal number of biomarkers in a panel (n=18) as calculated by an analysis algorithm is indicated by a circle.

Figure 5(I) shows data relating the number of biomarkers in a panel selected by the FS algorithm to performance, as measured by sensitivity + specificity score. The optimal number of biomarkers in a panel (n=2) as calculated by an analysis algorithm is indicated by a circle.

Figure 6(A) shows the receiver operating characteristic (ROC) curve for ANA and a biomarker panel (ROC, Table 6; ANA, TROVE2, SSB/La, HMGB2, PABPC1, dsDNA, BIRC3, ANXA1, SMN1, IGF2BP3, HNRNPA2B1, WAS, HMG20B, PATZ1, HNRNPUL1, NFIL3, HOXB6, DLX3) incorporating ANA plus additional biomarkers with the specificity measured at 93% sensitivity. ANA+biomarkers: specificity=0.38; Biomarkers: specificity=0.33; ANA: specificity=0.28.

Figure 6(B) shows the receiver operating characteristic (ROC) curve for ANA and a biomarker panel (ROC, Table 6; ANA, TROVE2, SSB/La, HMGB2, PABPC1, dsDNA, BIRC3, ANXA1, SMN1, IGF2BP3, HNRNPA2B1, WAS, HMG20B, PATZ1, HNRNPUL1, NFIL3, HOXB6, DLX3) incorporating ANA plus additional biomarkers with the specificity measured at 85% sensitivity. ANA+biomarkers: specificity=0.71; Biomarkers: specificity=0.58; ANA: specificity=0.65.

Figure 7 shows the auto-antibody reactivity against TROVE2, SSB and PSME3 in serum samples from three cohorts: European, Afro-Caribbean and confounding disease cohorts. The data was ordered from left to right in increasing ANA reactivity. The confounding disease cohort includes samples from patients suffering from polymyositis, RA, Scleroderma and Sjogren's syndrome.

Figure 8 shows scatter plots for TROVE2 (A), SSB/La (B), PSME3 (C) and ANA (D) in serum samples from three cohorts: European, Afro-Caribbean and confounding disease cohorts. The Pool control is a combined pool of healthy human serum, used as a QC step to monitor the assay process.

Figure 9(A) shows the receiver operating characteristic (ROC) curve for the LR1 biomarker panel (PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, ANA, dsDNA), ANA and dsDNA, measured in the validation cohorts described.

Figure 9(B) shows the receiver operating characteristic (ROC) curve for the LR2 biomarker panel (PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, RNASEL, dsDNA), ANA and dsDNA, measured in the validation cohorts described.

Figure 10(A) shows the receiver operating characteristic (ROC) curve for the LR3 biomarker panel (TROVE2, PSME3, PABPC1, RQCD1, MAGEB2, HMG20B, HNRNPUL1, IRF5, ZNF207, NFKBIA, dsDNA) and dsDNA, measured in the validation cohorts described.

Figure 10(B) shows the receiver operating characteristic (ROC) curve for the LR4 biomarker panel (dsDNA, PSME3, ZNF207, HNRNPUL1, RQCD1, MAGEB2, PABPC1, NFKBIA, APEX1, HMG20B, RNASEL, NPM1, SMN1, IGF2BP3, SSB/La) and dsDNA, measured in the validation cohorts described. The confidence intervals are also shown.

Figure 11 shows data relating the number of biomarkers in the LR4 panel to performance, as measured by AUC from ROC analysis. The optimal number of biomarkers in a panel is indicated by the filled circle.

MODES FOR CARRYING OUT THE INVENTION

Anti-dsDNA and ANA analysis

Each serum sample was subjected to an anti-dsDNA ELISA (QUANTA Lite Cat No: 704650; Inova Diagnostics, San Diego, USA) and an ANA ELISA (QUANTA Lite Cat No: 708750; Inova Diagnostics, San Diego, USA).

The European SLE cohort was split into two, named SLE I and SLE II, prior to assay and analysis.

When the anti-dsDNA ELISA data for the Afro Caribbean SLE cohort and matched controls were compared, it was found that the sensitivity of the dsDNA assay for disease samples was 65.6% and the specificity was 97.0%:

When the ANA ELISA data for the Afro Caribbean SLE cohort and matched controls were compared, it was found that the sensitivity of ANA for disease samples was 92.6% and the specificity was 63.0%:

When the anti-dsDNA ELISA data for the European SLE cohort and matched controls were compared, it was found that the sensitivity of the dsDNA assay for disease samples was 37.2% and the specificity was 93.6%:

When the ANA ELISA data for the European SLE cohort and matched controls were compared, it was found that the sensitivity of the dsDNA assay for disease samples was 37.2% and the specificity was 93.6%:

When the anti-dsDNA ELISA data for the Afro Caribbean and European SLE cohorts were compared, it was found that the percentage of dsDNA positive patients is greater in the Afro Caribbean cohort:

It was also found that ANA reactivity varies according to ethnic group (Figure 2). The ANA responses were stronger in Afro Caribbean SLE (83.2% positive), scleroderma (83.3% positive) and Sjogren's syndrome (100% positive) compared to connective tissue disease (66.7% positive) and European SLE (53.2% positive). The ANA response was only 16.2% positive in RA and 0% response in polymyositis.

Array preparation

We used a unique "functional protein" array technology which has the ability to display native, discontinuous epitopes [27,57]. Proteins are full-length, expressed with a folding tag in insect cells and screened for correct folding before being arrayed in a specific, oriented manner designed to conserve native epitopes. Each array contains approximately 1550 human proteins representing ~1500 distinct genes chosen from multiple functional and disease pathways printed in quadruplicate together with control proteins. In addition to the proteins on each array, four control proteins for the BCCP-myc tag (BCCP, BCCP-myc, β-galactosidase-BCCP-myc and β-galactosidase-BCCP) were arrayed, along with additional controls including Cy3-labeled biotin-BSA, dilution series of biotinylated-IgG and biotinylated IgM and buffer-only spots. Incubation of the arrays with serum samples allows detection of binding of serum immunoglobulins to specific proteins on the arrays, enabling the identification of both autoantibodies and their cognate antigens [31].

Biomarker confirmation

Serum samples were obtained from several groups of subjects:

1. Discovery Study 1:

• Sample cohort sourced from USA (n=86); plus matched controls (n=90)

2. Validation study 1:

• European sample cohort (n=95); plus matched controls (n=86)

3. Validation study 2:

• Afro-Caribbean sample cohort (n=95); plus matched controls (n=99)

4. Validation study 3:

• Confounding/Interfering disease cohort: connective tissue disease cohort comprising connective tissue disease (n=3), polymyositis (n=3), RA (n=68), scleroderma (n=12), Sjogren's Syndrome (n=6)

ANA and anti-dsDNA tests were performed on all samples and data analysis performed with and without this data to identify biomarkers from the protein array data which were additive to, correlated or decorrelated with ANA and anti-dsDNA activity.

For auto-antibody profiling, serum samples were incubated with arrays separately. Serum samples were clarified by centrifugation at 10-13K rpm for 3 minutes at 20°C/room temperature to remove particulates, including lipids. The samples were then diluted 200-fold in 0.1% v/v Triton/0.1% v/v BSA in 1X PBS (Triton-BSA buffer) and then applied to the arrays. Diluted serum (4 mL) sample was added to each array housed in a separate compartment of a plastic dish. All arrays were incubated for 2 hours at room temperature (RT, 20°C) with gentle orbital shaking (~50 rpm). Arrays were removed from the dish and any excess probing solution was removed by blotting the sides of the array onto lint-free tissue. Probed arrays were washed three times in fresh Triton-BSA buffer at RT for 20 minutes with gentle orbital shaking. The washed slides were then blotted onto lint-free tissue to remove excess wash buffer and were incubated in a secondary staining solution (prepared just prior to use) at RT for 2 hours, with gentle orbital shaking and protected from light using aluminium foil. The secondary staining solution was a labelled anti-human IgG antibody. Slides were washed three times in Triton-BSA buffer for 5 minutes at RT with gentle orbital shaking, rinsed briefly (5-10 seconds) in distilled water, and centrifuged for 2 minutes at 240g in a container suitable for centrifugation.
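The volumes implied by a 200-fold dilution to a final volume of 4 mL can be worked out as follows. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes that exactly 4 mL of diluted serum is prepared per array with no excess.

```python
def dilution_volumes(final_volume_ul, fold):
    """Return (sample volume, diluent volume) in microlitres for a given fold dilution."""
    if fold < 1:
        raise ValueError("fold dilution must be at least 1")
    sample_ul = final_volume_ul / fold        # serum needed
    diluent_ul = final_volume_ul - sample_ul  # Triton-BSA buffer needed
    return sample_ul, diluent_ul


# 200-fold dilution to 4 mL (4000 microlitres) per array
serum_ul, buffer_ul = dilution_volumes(4000, 200)
print(serum_ul, buffer_ul)  # 20.0 3980.0
```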

The probed and dried arrays were scanned using an Agilent High-Resolution microarray scanner at 10 µm resolution. The resulting 20-bit tiff images were feature extracted using Agilent's Feature Extraction software version 10.5 or 10.7.3.1. The microarray scans produced images for each array that were used to determine the intensity of fluorescence bound to each protein spot which were used to normalize and score array data.

The local median background intensity was subtracted from the raw median signal intensity (also referred to as the relative fluorescent unit, RFU) of each protein feature (also referred to as a spot or antigen) on the array. Alternative analyses use other measures of spot intensity such as the mean fluorescence or total fluorescence, as known in the art. The results of QC analyses showed that the platform performed well within expected parameters with relatively low technical variation.

The raw array data was normalized by consolidating the replicates (median consolidation), followed by normal transformation and then global median normalisation. Outliers were identified and removed. Data normalisation refers to the process of identifying and removing systematic effects, and bringing the data from different microarrays onto a common scale for biomarker selection. There is no method of normalisation which is universally appropriate and factors such as study design and sample properties must be considered. For the current study median normalisation was used. Other normalisation methods include, amongst others, SAM, quantile normalisation [44], multiplication of net fluorescent intensities by a normalisation factor consisting of the product of the 1st quartile of all intensities of a sample and the mean of the 1st quartiles of all samples and the "VSN" method [58]. Such normalisation methods are known in the art of microarray analysis.
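Two of the steps named above, median consolidation of replicate spots and global median normalisation, can be sketched in Python as follows. The sketch is illustrative only and is not the software used in the study: the function names and example intensities are hypothetical, the "normal transformation" step is omitted because its details are not specified here, and global median normalisation is assumed to mean dividing each value by the median of all values on the same array.

```python
from statistics import median


def consolidate_replicates(replicates):
    """Collapse replicate spots to one value per protein by taking their median.

    replicates: dict mapping protein name -> list of background-subtracted intensities.
    """
    return {protein: median(values) for protein, values in replicates.items()}


def global_median_normalise(array):
    """Divide every value by the array-wide median so arrays share a common scale."""
    centre = median(array.values())
    if centre == 0:
        raise ValueError("array median is zero; cannot normalise by division")
    return {protein: value / centre for protein, value in array.items()}


# Hypothetical quadruplicate intensities for one array, for illustration only
spots = {
    "protein_1": [100, 110, 90, 105],
    "protein_2": [200, 210, 190, 220],
    "protein_3": [50, 55, 45, 60],
}
normalised = global_median_normalise(consolidate_replicates(spots))
print(normalised)
```

Dividing by a per-array median places each array's typical signal at 1, which is consistent with fold-change values close to 1 after normalisation, although the exact scaling used in the study may differ.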

Normalised data was used for the identification of individual candidate biomarkers and for the development of combinations of biomarkers ("panels"). Data analysis was performed using 6 different feature ranking methods and the performance of the classification compared to a randomized set of case and control status samples (permutation assay). The analysis parameters were specifically adjusted in order to create larger biomarker panels which could include low penetrance biomarkers. Furthermore, the biomarkers that were used in this analysis were preselected, and poor biomarkers and markers which are up-regulated in the healthy cohort were excluded from the biomarker selection process. Nested cross-validation was also applied to the classification procedures in order to assess the classification prediction accuracy. Tools such as volcano plots (Figure 3), scatter plots, boxplots and ROC plots were used to identify biomarkers in conjunction with combinations of strong p-values, robust fold-changes and frequency of selection for inclusion in biomarker panels when comparing case and control cohorts. Some of the biomarkers identified (e.g. SSB, HNRNPA2B1 and TROVE2/SSA) have previously been demonstrated to be associated with lupus, thus validating this approach.

The biomarker frequencies in an SLE cohort (European vs matched healthy cohort), as shown in Figure 3, are listed below:

These data illustrate the range of frequency of occurrence of biomarkers which might be expected in a population. Even a "strong" biomarker (i.e. low p-value, high fold-change) such as TROVE2 is not detected in every SLE patient sample and does appear in a low percentage of matched, healthy controls. Other biomarkers shown to have low p-values and high fold-changes, such as HNRNPA2B1, appear at low frequencies (<10% of subjects in a cohort). Such biomarkers may still be specific and have clinical value. For example, in a study [59] of the frequency of auto-antibodies against Smith antigen, using a cut-off value of 30 units/ml, the specificity was 99.5% with a sensitivity of 27.2%. The authors concluded that the anti-Sm test by itself is not useful as a screening test for lupus but a positive result is highly specific for SLE. This illustrates that highly specific auto-antibodies are of value clinically, even when their sensitivity and frequency of occurrence within an SLE population is not high.

Biomarker panels

It is not possible to predict a priori which classifier will perform best with a given dataset, therefore data analysis was performed with 5 different feature ranking methods (1-5) plus forward and backward feature selection:

1. Entropy

2. Bhattacharyya

3. T-test

4. Wilcoxon

5. ROC

6. Forward selection

7. Backward selection
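
As a non-limiting sketch of how two of the listed ranking criteria (T-test and ROC) might be applied, the following example scores each antigen by a Welch t-statistic and by the area under the ROC curve and then ranks the antigens. The antigen names and intensity values are invented for illustration, and the use of the Welch form of the t-statistic is an assumption rather than a statement of the method used in the study.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(cases, controls):
    """Welch t-statistic comparing case and control intensities for one antigen."""
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se

def roc_auc(cases, controls):
    """Probability that a random case scores above a random control (ties count 0.5)."""
    wins = sum((c > h) + 0.5 * (c == h) for c in cases for h in controls)
    return wins / (len(cases) * len(controls))

def rank_features(data, score):
    """Antigen names ordered from highest to lowest score."""
    return sorted(data, key=lambda name: score(*data[name]), reverse=True)

# Hypothetical normalised intensities: antigen -> (case values, control values).
data = {
    "antigen_B": ([3.0, 4.0, 5.0, 6.0], [2.0, 3.0, 4.0, 5.0]),
    "antigen_C": ([2.0, 3.0, 2.5, 3.5], [2.0, 3.0, 2.5, 3.5]),
    "antigen_A": ([5.0, 6.0, 7.0, 8.0], [1.0, 2.0, 3.0, 4.0]),
}

ranking_by_t = rank_features(data, t_statistic)
ranking_by_auc = rank_features(data, roc_auc)
```

Both criteria place the fully separated antigen first and the antigen with identical case and control values last. On real data the rankings produced by different criteria need not agree, which is one reason for applying several methods.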

Other classification methods as known in the art could be used. Classifiers were then assessed for performance by referring to the combined sensitivity and specificity (S+S score) and area under the curve (AUC). Data were repeatedly split and analysis cycles repeated until a stable set of classifiers ("panels") was identified. Nested cross validation was applied to the classification procedures in order to avoid overfitting of the study data. The performance of the classification was compared to a randomized set of case-control status samples (permutation assay) which should give no predictive performance and provides an indication of the background in the analysis. A figure close to 1.0 is expected for the null assay (equivalent to a sensitivity + specificity (S+S) score of 0.5 + 0.5, respectively) whereas an S+S score of 2.0 would indicate 100% sensitivity and 100% specificity. The difference between the values for the permutation analysis and the classifier performance indicates the relative strength of the classifier. For each analysis, multiple combinations of putative biomarkers were derived and the performance of the derived panels was then ranked by AUC value. The biomarkers for the best performing panels (containing up to 36 biomarkers; shown in Tables 5, 6 and 7) were taken and the frequency of appearance of each protein in these panels was used to rank the predictive power of each protein included in these panels. A frequency of 1.00 indicates that this biomarker was chosen each time a novel biomarker panel was selected. The biomarkers with the greatest diagnostic power, as judged by p value or appearance in the panels derived were identified and combined into a single list (Table 4). These represent biomarkers of particular interest as they correspond to the subset of biomarkers with the greatest predictive properties.
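
The S+S score and the permutation (null) comparison described above can be illustrated with the following minimal sketch. The labels and predictions are invented, and the number of permutation rounds and the random seed are arbitrary assumptions.

```python
import random

def s_plus_s(predicted, actual):
    """Sensitivity + specificity for binary labels (1 = case, 0 = control)."""
    tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))
    tn = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))
    n_cases = sum(actual)
    n_controls = len(actual) - n_cases
    return tp / n_cases + tn / n_controls

def permutation_baseline(predicted, actual, rounds=1000, seed=0):
    """Mean S+S score after repeatedly shuffling the case/control labels."""
    rng = random.Random(seed)
    shuffled = list(actual)
    total = 0.0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        total += s_plus_s(predicted, shuffled)
    return total / rounds

# Hypothetical cohort: 10 cases then 10 controls; 8 cases and 9 controls called correctly.
actual = [1] * 10 + [0] * 10
predicted = [1] * 8 + [0] * 2 + [0] * 9 + [1] * 1

observed = s_plus_s(predicted, actual)              # 0.8 + 0.9
baseline = permutation_baseline(predicted, actual)  # expected to be close to 1.0
```

The gap between `observed` and `baseline` corresponds to the difference between classifier performance and the permutation analysis referred to above.
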
The analysis methods described above were used to build, test and identify combinations of biomarkers with greater sensitivity, specificity or AUC than the individual biomarkers disclosed in Tables 1 to 3. Specific examples of the results of this approach are shown below. Analyses were performed both with the inclusion and exclusion of ANA and anti-dsDNA data generated using ELISA assays as described above to identify biomarkers from the protein array data which were additive to, correlated or decorrelated with ANA and anti-dsDNA activity.

Validation approaches

• Wilcoxon feature selection:

o The best panel of 12 markers: 67.6% sensitivity / 89.0% specificity

o 6/12 markers chosen in all panels developed using this method

o 3/12 markers previously associated with SLE, 2 of which are well known

o 2 other markers belong to a class of protein reported to elicit auto-antibodies in drug-induced lupus

• Entropy feature selection

• Bhattacharyya feature selection

• T-Test feature selection

• ROC feature selection

• FS feature selection

Biomarker panels excluding ANA and dsDNA

Biomarker panels were developed using the data generated with the protein array platform. ANA and anti-dsDNA data were not included. This allows biomarkers to be selected which would otherwise not be included because their function in the panel overlaps with ANA or anti-dsDNA. Cohorts used were SLE, matching cohorts and confounding disease samples. Table 5 summarises the characteristics of the best performing panels derived using each feature-ranking method (A: Entropy feature selection, B: Bhattacharyya feature selection, C: t-Test feature selection, D: Wilcoxon feature selection, E: ROC feature selection, and F: FS feature selection). The top 30 panels with respect to performance (AUC, sensitivity, specificity) for each method are shown in Tables 5A to 5F, together with the frequency of marker selection. In general, the more frequently selected markers (such as those in Table 7) may have greater utility in diagnosis. Figure 4 shows the ROC curves and Table 5G summarises the data for the best performing panel derived by each feature ranking method.

Figures 4G to 4H show how panel performance changes with the number of constituent biomarkers. It can be seen that the loss of one or a few biomarkers from the optimal number of biomarkers in a panel may cause a modest reduction in performance, but a panel comprising or consisting of a subset of at least 10 of the 12 biomarkers from any of the panels listed in Table 5D may still be useful with the invention.

The performance of the panels is remarkably consistent across the different methods used to derive the panels, indicating that at least a subset of the biomarkers identified maintain their performance and are not limited in utility to individual algorithms.

The best performing panels were derived by applying a Wilcoxon analysis (AUC = 0.84; sensitivity = 0.68, specificity = 0.89; Table 5D and 5G) and consist of 12 biomarkers of which TROVE2, SSB, PABPC1, BIRC3, HMGB2 and HMG20B were selected in every panel derived, indicating the robustness of these markers. ANXA1 and RQCD1 were selected in greater than 90% of panels. The frequency of marker selection is one characteristic which can be used to gauge the contribution of a marker to panel performance. Demonstrating the power of this approach, TROVE2 and SSB have previously been demonstrated to be strongly associated with connective tissue diseases, particularly SLE and Sjogren's Syndrome, and in this case TROVE2 and SSB were selected in every panel. In general, these markers were selected from this dataset regardless of the analysis approach taken.

Other markers are selected frequently and could be considered as forming the core of many biomarker panels. Any of the markers which were chosen in >90% of panels selected for a given algorithm in at least three independent methods could be considered to be core proteins. These consist of: TROVE2, HMGB2, HNRNPA2B1, SSB/La, BIRC3, HMG20B, PABPC1, ANXA1, IGF2BP3, CDC25B, SMN1, PSME3, RQCD1, LIN28A, NFIL3, APOBEC3G.
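
The core-protein criterion stated above (selection in >90% of panels for at least three independent methods) can be expressed as a simple filter. The sketch below is illustrative only: the marker names and selection frequencies are placeholders and do not reproduce the values in Tables 5 to 7.

```python
def core_markers(frequencies, min_frequency=0.9, min_methods=3):
    """Markers selected in more than `min_frequency` of panels for at least `min_methods` methods.

    `frequencies` maps a method name to {marker: fraction of panels containing the marker}.
    """
    counts = {}
    for per_marker in frequencies.values():
        for marker, frequency in per_marker.items():
            if frequency > min_frequency:
                counts[marker] = counts.get(marker, 0) + 1
    return sorted(marker for marker, n in counts.items() if n >= min_methods)

# Placeholder selection frequencies for four feature-ranking methods.
frequencies = {
    "entropy":       {"marker_1": 1.00, "marker_2": 0.95, "marker_3": 0.40},
    "bhattacharyya": {"marker_1": 1.00, "marker_2": 0.93, "marker_3": 0.97},
    "t_test":        {"marker_1": 0.97, "marker_2": 0.50, "marker_3": 0.92},
    "wilcoxon":      {"marker_1": 1.00, "marker_2": 0.91, "marker_3": 0.10},
}

core = core_markers(frequencies)
```

Here `marker_1` and `marker_2` each exceed 90% in at least three methods, whereas `marker_3` does so in only two and is therefore excluded.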

Biomarker panels with the inclusion of ANA and dsDNA

Biomarker panels were developed using the data generated with the protein array platform. ANA and anti-dsDNA ELISA data were included. Cohorts used were SLE, matching cohorts and confounding disease samples. Table 6 summarises the characteristics of the best performing panels derived using each feature-ranking method (A: Entropy feature selection, B: Bhattacharyya feature selection, C: t-Test feature selection, D: ROC feature selection, and E: FS feature selection). The top 30 panels with respect to performance (AUC, sensitivity, specificity) for each method are shown in Tables 6A to 6E, together with the frequency of marker selection.

Figures 5A to 5D show the ROC curves and Table 6F summarises the data for the best performing panel derived by each feature ranking method.

Figures 5E to 5I show how panel performance changes with the number of constituent biomarkers. The optimal number of biomarkers in a panel as calculated by an analysis algorithm is indicated by a circle but inspection shows that in some cases this is unreliable. Figure 5E shows that there is little increase in performance beyond the inclusion of 3-4 biomarkers. Thus, a panel comprising or consisting of a subset of 3-4 of the 20 biomarkers from any of the panels listed in Table 6A can maintain performance. Figure 5F shows that there is little increase in performance beyond the inclusion of 5 biomarkers. Thus, a panel comprising or consisting of a subset of 5 of the 15 biomarkers from any of the panels listed in Table 6B can maintain performance. Figure 5G shows that there is little increase in performance beyond the inclusion of 7-8 biomarkers. Thus, a panel comprising or consisting of a subset of 7-8 of the 20 biomarkers from any of the panels listed in Table 6C can maintain performance and is useful with the invention. Figure 5H shows that there is little increase in performance beyond the inclusion of 8-13 biomarkers. Thus, a panel comprising or consisting of a subset of 8-13 of the 18 biomarkers from any of the panels listed in Table 6D can maintain performance and is useful with the invention. Figure 5I shows that as additional biomarkers are added to the panel beyond the optimal number of biomarkers, the performance of the panel decreases. Thus, a panel comprising or consisting of ANA and BIRC3 (as shown in Table 6E) is useful with the invention.

Biomarker panel performance at a defined sensitivity or specificity

The sensitivity and specificity of a given biomarker or biomarker panel are measured at the optimal performance point of the receiver operating characteristic (ROC) curve, indicated by a circle. However, it is sometimes desirable to measure the performance at a fixed sensitivity or specificity, such as when a specific clinical cut-off may be employed. Examples of this include the recommendations regarding cut-offs provided with the anti-dsDNA ELISA assay for assessing SLE patients and, in cancer, PSA at 4 ng/mL for identifying individuals at increased risk of prostate cancer. Using a fixed % sensitivity or specificity may also be helpful in modelling clinical utility where it is desirable either to reduce false positive results [60] or to identify as many subjects at risk as possible, such as in population screening applications, e.g. fecal occult blood test for colorectal cancer. To assess how the biomarker panels performed at high sensitivity, the specificity of several panels was measured at the sensitivity reported for the ANA test in the literature (93%).
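
One way of reading specificity at a fixed sensitivity from a set of classifier scores is sketched below. The scores are invented, and the convention that a sample is called positive when its score is at or above the cut-off is an assumption made for this example.

```python
import math

def specificity_at_sensitivity(case_scores, control_scores, target_sensitivity):
    """Specificity at the highest cut-off that still reaches the target sensitivity.

    A sample is called positive when its score is >= the cut-off.
    """
    ranked = sorted(case_scores, reverse=True)
    # Number of cases that must be called positive (rounded to avoid float artefacts).
    needed = max(1, math.ceil(round(target_sensitivity * len(ranked), 9)))
    cutoff = ranked[needed - 1]
    true_negatives = sum(score < cutoff for score in control_scores)
    return cutoff, true_negatives / len(control_scores)

# Hypothetical panel scores for 10 cases and 10 controls.
case_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]
control_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.45, 0.55, 0.15, 0.05]

cutoff, specificity = specificity_at_sensitivity(case_scores, control_scores, 0.8)
```

With these invented scores, 8 of the 10 cases score at or above 0.3, and 7 of the 10 controls fall below that cut-off. The same function can be called with different target sensitivities (for example 0.93 or 0.85) to compare panels on a like-for-like basis.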

In this study, at 93% sensitivity, ANA alone (28% specificity) gave the lowest specificity compared with a panel (ROC, Table 6, 38% specificity; ANA, TROVE2, SSB/La, HMGB2, PABPC1, dsDNA, BIRC3, ANXA1, SMN1, IGF2BP3, HNRNPA2B1, WAS, HMG20B, PATZ1, HNRNPUL1, NFIL3, HOXB6, DLX3) incorporating ANA plus additional biomarkers which, at the same sensitivity, provided a 10% better specificity than ANA alone (Figure 6A). The biomarker panel without the inclusion of ANA gave a specificity of 33%. These data suggest some decorrelation between the biomarkers in the panel and ANA, that is, the ability of the biomarkers to distinguish between case and control samples is not identical to ANA. In other words, the biomarker panel may correctly identify case/control status of some samples which are misclassified by the ANA test. The biomarker panel is therefore additive to ANA. This may represent a clinically significant improvement. A similar analysis was performed, calculating specificity at 85% sensitivity, comparing the performance of ANA alone (65% specificity), a biomarker panel incorporating ANA (ROC, Table 6, 71% specificity; ANA, TROVE2, SSB/La, HMGB2, PABPC1, dsDNA, BIRC3, ANXA1, SMN1, IGF2BP3, HNRNPA2B1, WAS, HMG20B, PATZ1, HNRNPUL1, NFIL3, HOXB6, DLX3) and the biomarker panel with the exclusion of ANA (58% specificity). In each case, the addition of the biomarkers to ANA increased the specificity by 5-6% above that of ANA alone (Figure 6B). This may represent a clinically significant improvement and supports the observation of some decorrelation between the biomarkers in the panel and ANA.

Biomarker panels distinguish between SLE, healthy subjects and confounding diseases

An analysis was performed to determine if the identified biomarkers were able to not only differentiate between SLE and healthy subjects but also between SLE and confounding diseases. Figure 7 shows that for three of the strongest biomarkers, TROVE2, SSB and PSME3, autoantibody reactivity against these antigens is frequent and substantial in SLE cohorts but occurs at low frequency in both healthy controls and confounding diseases. Scatter plots (Figure 8) confirm that these biomarkers can distinguish between SLE, healthy controls and confounding disease subjects. These analyses were performed for all the biomarkers disclosed in Tables 1-3, confirming their ability to distinguish between SLE, healthy controls and confounding disease subjects.

It is interesting to note that while TROVE2 and SSB are constituent antigens of the ANA test, their reactivity in the confounding disease cohort differs from that of the ANA assay itself. While initially appearing to be a surprising result, it demonstrates that by individually analysing antigens present within the group of antigens covered by the ANA test, it is possible to extract additional information from an individual sample which may have clinical value. Thus, molecular characterisation at the level of individual antigens could be considered to be analogous to the classical ANA test where the pattern of staining of cells is attributable to the presence of auto-antibodies in a subject's serum. ANA staining patterns are well characterised and considered to be clinically informative. By developing panels at the level of individual antigens or biomarkers, it should be possible to leverage the clinical value obtainable from individual antigens in isolation.

Biomarker panels perform in different ethnic backgrounds

The biomarkers of the invention and/or the panels of the invention can be tested to assess their performance in various ethnic backgrounds, e.g. US, European and Afro Caribbean. The biomarkers and/or panels of the invention ideally provide consistently high performances in a range of ethnic backgrounds.

Derivation of biomarker panels containing 2-15 members

The methodology described above can be used to select panels of biomarkers of interest based on combining biomarkers and monitoring their performance with respect to sensitivity, specificity, AUC of a Receiver Operating Characteristic (ROC) curve and other appropriate metrics useful for measuring diagnostic performance. The number of members constituting the panels can be varied. Backward selection can be used for feature selection as described above and panels of biomarkers containing from 2 to 15 members can be derived following 50 rounds of nested cross-validation. The panels can then be ranked in order of performance and the top 10 panels for each n-mer (where n=2-15) would be useful for the invention.

This approach demonstrates that panels of biomarkers of a given size can be derived from the biomarkers presented in Table 1, optionally in combination with known lupus biomarkers. This enables panels to be developed or tuned according to specific requirements. Thus, biomarkers previously identified through their association with lupus can be integrated into panels with the biomarkers described here in Table 1. Also, where for a specific reason (e.g. performance in an assay) a particular biomarker is preferred, or should be removed and substituted by another or others, this approach provides the means to develop and validate such a required biomarker panel.

Development of additional biomarker panels

The normalised dataset from the three validation studies described above was used in a second, independent set of analyses to further mine the data and to identify biomarkers with similar or superior performance to tests in current practice, in particular, ANA and anti-dsDNA. Individual biomarkers and biomarker panels were identified that differed between SLE and Controls (Healthy and/or Confounding Disease). Three possible settings for an improved test were considered, with application 2 being preferred:

1. Application 1

a. Improved test better than ANA test (Sens = 95%, Spec>54%)

2. Application 2

a. Improved test to use on ANA positive subject samples, to effectively replace anti-dsDNA test (Sens=54.2%, Spec=94.1%)

3. Application 3

a. A further test that adds to the anti-dsDNA test

The overall objectives remained the development of a diagnostic assay to improve clinical discrimination of disease and assist in exclusion of confounding diseases. Univariate analysis was used to identify candidate biomarkers to which multivariate analyses were applied.

Univariate analysis

Univariate analysis with reference to statistical parameters including ROC curves and P values identified 40 candidate biomarkers that were significant between SLE and Controls (Healthy and/or Confounding Disease) following adjustment for multiple comparisons (Table 12). Of the 40 biomarkers, 35 were identified in the previous analyses, validating this approach. The five additional candidates identified were: CSNK1G2, ZNF207, MAGEB4, NFKBIA and RNASEL. Note that in such analyses, it is common to apply a somewhat arbitrary cut-off based on the parameters analysed, below which candidates are excluded. Therefore, when comparing the result from this analysis with the previous analyses, one would expect the lists to be similar but not identical. In keeping with this, the five additional candidates not identified in the previous analyses were all in the lower half of the ranked top 40 candidates in this analysis. None of these markers has been previously linked to the generation of autoantibodies in SLE, indicating their potential utility in the characterisation of connective tissue disease in general and SLE in particular.

Multivariate analysis

Pairwise combinations of the 40 candidate biomarkers were analysed via logistic regression. Feature selection was performed by stepwise logistic regression (both forwards and backwards). Cross-validation through random sampling was used both at the feature selection and evaluation of model performance. In summary:

• Stepwise logistic regression models fitted for 100 random subsets of data

• Final models extracted and frequency of each variable determined

• Variables present in >50% of the final models were selected

• Validation of the final model performed using 100 randomly allocated training and testing samples
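
The resampling scheme summarised in the bullets above can be sketched as follows. This is an illustrative outline only: `toy_selector` is a hypothetical stand-in for stepwise logistic regression, and the subset fraction, random seed, variable names and data are all assumptions made for the example.

```python
import random
from collections import Counter

def stable_variables(samples, select_features, rounds=100, fraction=0.8, threshold=0.5, seed=0):
    """Run a feature selector on random subsets; keep variables chosen in > `threshold` of runs."""
    rng = random.Random(seed)
    counts = Counter()
    subset_size = int(fraction * len(samples))
    for _ in range(rounds):
        subset = rng.sample(samples, subset_size)
        counts.update(set(select_features(subset)))
    frequency = {name: n / rounds for name, n in counts.items()}
    return sorted(name for name, f in frequency.items() if f > threshold), frequency

def toy_selector(subset):
    """Hypothetical stand-in for stepwise logistic regression.

    Keeps variables whose case mean exceeds the control mean by more than 1.
    """
    cases = [values for label, values in subset if label == 1]
    controls = [values for label, values in subset if label == 0]
    chosen = []
    for name in subset[0][1]:
        case_mean = sum(v[name] for v in cases) / len(cases)
        control_mean = sum(v[name] for v in controls) / len(controls)
        if case_mean - control_mean > 1.0:
            chosen.append(name)
    return chosen

# Invented data: (label, {variable: value}); "strong" separates the groups, "noise" does not.
samples = [(1, {"strong": 5.0, "noise": 1.0}) for _ in range(10)]
samples += [(0, {"strong": 1.0, "noise": 1.0}) for _ in range(10)]

selected, frequency = stable_variables(samples, toy_selector)
```

In practice the selector would be replaced by the stepwise regression procedure, and the retained variables would then be refitted and validated on separate randomly allocated training and testing samples.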

Two models were developed, LR1 (PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, ANA, dsDNA; Table 13, row 2) and LR2 (PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, RNASEL, dsDNA; Table 13, row 4) which differ only in that ANA in LR1 is substituted by a different variable, RNASEL, in LR2. LR1 was selected with ANA available as a variable whereas LR2 was selected in the absence of ANA. Note, this does not suggest that RNASEL can directly substitute for ANA but rather, that the combination of biomarkers together performs a similar role. The performance of LR1 is also superior to LR2, demonstrating that ANA (used as a continuous variable) contributes substantially to LR1 performance, as would be expected given its AUC of 0.8 which is greater than any other individual biomarker.

The ROC curves for LR1 and LR2 are shown in Figure 9. LR1 demonstrates a marked improvement over ANA alone, reflected in an AUC = 0.85 (Table 13). LR2 shows that a molecularly defined panel performs similarly to ANA alone (AUC = 0.81 cf AUC = 0.80). The individual constituents of ANA are not completely defined. Identifying precisely the components in the panel provides potential advantages in the manufacture and reproducibility of a test. It may also be possible to ascertain whether individual biomarkers provide additional information with respect to disease pathology, analogous to the use of the staining pattern in the ANA IIF assay to derive information on the status of autoimmune conditions. Such information could have utility in devising specific tests for diseases with an autoimmune component.

Table 13 also enables the comparative performance at fixed sensitivities and specificities to be read. For example, the literature value for anti-dsDNA is 94.1% specificity, 54% sensitivity.

Table 14 and Table 15 show the performance of LR1 and LR2, respectively, in comparison with ANA and anti-dsDNA, in the context of the three applications described above. In addition to sensitivity, specificity and accuracy, the numbers of individual subjects classified as true positives (tp), false positives (fp), true negatives (tn) and false negatives (fn) are included along with the total number of samples in the tested cohort (N). Of a starting cohort of 487, 1 sample was removed during QC. The classification performance of ANA alone ("ANA on all") is compared to the model ("Model on all"; N = 486). LR1 outperforms ANA (accuracy of 0.813 vs 0.663) when applied to the entire cohort ("Application 1").

The second application is where the ANA assay would be applied as standard on the entire cohort. The subset of ANA positive samples (N = 293) would be further tested either using the anti-dsDNA assay (using either the low or high cut-off values provided by the manufacturer) or the panel (LR1 or LR2). The "combined performance" shows the effect of applying the ANA test as standardly used followed by the panel. This combined performance provides an accuracy of 0.815 for LR1 compared with 0.741/0.761 for the current standard (ANA then anti-dsDNA). For LR2, the combined performance is 0.798, also superior to ANA followed by anti-dsDNA (0.763/0.741).
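
The calculation of a "combined performance" for two tests applied in series can be illustrated as follows, on the assumption that a sample negative in the first test is reported negative and only first-test positives proceed to the second test. The counts used are invented for the example and are not those of Tables 14 to 16.

```python
def metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

def combined_counts(first, second):
    """Confusion-matrix counts for serial testing.

    `first` holds the counts of the first test on the whole cohort; `second` holds the
    counts of the second test applied only to the first-test positives.
    """
    if sum(second.values()) != first["tp"] + first["fp"]:
        raise ValueError("second-stage counts must cover exactly the first-stage positives")
    return {
        "tp": second["tp"],
        "fp": second["fp"],
        "tn": first["tn"] + second["tn"],
        "fn": first["fn"] + second["fn"],
    }

# Invented counts: 200 subjects, 130 positive in the first test and retested.
first_test = {"tp": 90, "fp": 40, "tn": 60, "fn": 10}
second_test = {"tp": 80, "fp": 10, "tn": 30, "fn": 10}

combined = combined_counts(first_test, second_test)
combined_metrics = metrics(**combined)
first_metrics = metrics(**first_test)
```

In this invented example the second stage removes 30 of the 40 first-stage false positives at the cost of 10 additional false negatives, raising accuracy from 0.75 to 0.85.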

In the third application of interest, the panel is analysed following application to the ANA positive and anti-dsDNA positive (N = 112) or anti-dsDNA negative cohorts (N = 181). In this context, the performance of the LR1 and LR2 models is again superior to the tests currently used as standard.

Development of models specifically for use on ANA positive cohorts

To ascertain if it was possible to derive a panel with improved performance for use specifically with ANA positive subjects, a further model, LR3 (TROVE2, PSME3, PABPC1, RQCD1, MAGEB2, HMG20B, HNRNPUL1, IRF5, ZNF207, NFKBIA, dsDNA; Table 13, row 6), was derived according to the previous analysis using the ANA positive cohort (N = 293) rather than the entire cohort (N = 487). An additional model was developed, LR4 (dsDNA, PSME3, ZNF207, HNRNPUL1, RQCD1, MAGEB2, PABPC1, NFKBIA, APEX1, HMG20B, RNASEL, NPM1, SMN1, IGF2BP3, SSB/La; Table 13, row 8), which was specifically optimised on the ANA positive cohort with the aim of reducing the false positive rate of the anti-dsDNA test (fp = 20 subjects in this cohort). Figure 10 shows the ROC curves with confidence intervals for both panels. Since these show the ANA positive cohort and the relevant comparison is with the anti-dsDNA assay, only the performance of the model and anti-dsDNA is shown. Very good separation between the models and anti-dsDNA is apparent, with LR3 and LR4 both outperforming anti-dsDNA. At 95% specificity, the sensitivities of LR3 and LR4 are 51.9% and 48.1%, respectively.

Table 16 shows the performance of LR3 and LR4, in comparison with ANA and anti-dsDNA, in the context of the three applications described above. The "combined performance" shows the effect of applying the ANA test as standardly used followed by the panel. The specificity was fixed at the value of the standard anti-dsDNA assay in this study (93.3%), demonstrating an improvement in sensitivity over anti-dsDNA of 5%, 10% and 11% for LR2, LR3 and LR4, respectively. The improvement in overall accuracy was 2%, 4% and 4%, respectively.

Despite these improvements in performance over the standard anti-dsDNA assay, the number of false positives remained at 20. By altering the cut-off of LR4, it was possible to reduce the number of false positives to 15 while the overall accuracy of the model remained at 0.8.
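
Adjusting a model cut-off so that the number of false positives does not exceed a chosen limit can be sketched as below. The scores are invented, and the convention that a sample is called positive when its score is strictly above the cut-off is an assumption for this example. It is not the procedure by which the LR4 cut-off was actually altered.

```python
def cutoff_for_max_false_positives(control_scores, max_false_positives):
    """Lowest cut-off (positive if score > cut-off) giving at most the allowed false positives."""
    ranked = sorted(control_scores, reverse=True)
    if max_false_positives >= len(ranked):
        return float("-inf")  # every control may be called positive
    return ranked[max_false_positives]

def count_positives(scores, cutoff):
    """Number of samples called positive at the given cut-off."""
    return sum(score > cutoff for score in scores)

# Invented model scores for controls and cases.
control_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
case_scores = [0.9, 0.75, 0.65, 0.55, 0.3]

cutoff = cutoff_for_max_false_positives(control_scores, 2)
false_positives = count_positives(control_scores, cutoff)
true_positives = count_positives(case_scores, cutoff)
```

Raising the cut-off in this way trades sensitivity for fewer false positives, so the effect on overall accuracy has to be checked for each chosen limit.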

Figure 11 shows data relating the number of biomarkers in the LR4 panel to performance, as measured by AUC from ROC analysis. As demonstrated previously, panels of biomarkers can be developed and tested by measuring performance using this approach. The optimal number of biomarkers in a panel is indicated by the filled circle. It can be seen that there is no absolute increase in performance beyond the inclusion of 15 biomarkers and relatively little additive performance beyond 7 biomarkers in a panel derived using logistic regression. Using this approach, panels of biomarkers derived from the pool of biomarkers identified in Table 1 may be usefully defined and tested.

In summary, LR1, LR2, LR3 and LR4 demonstrate the derivation of models which equal or improve on the performance of the ANA and anti-dsDNA assays. Thus, assays using these biomarker panels improve diagnosis of SLE.

It will be understood that the invention has been described by way of example only and modifications may be made whilst remaining within the scope and spirit of the invention.

TABLE 1: Biomarkers useful with the invention

Table 1 lists biomarkers useful with the invention. The measured biomarker can be (i) the presence of an auto-antibody which binds to an antigen listed in Table 1 and/or (ii) the presence of an antigen listed in Table 1, but is preferably the former.

No: Symbol ID Name HGNC GI p-value

1. APEX1 328 APEX nuclease (multifunctional DNA repair enzyme) 1 587 33876570 0.112123

2. ATF4 468 activating transcription factor 4 (tax-responsive enhancer element B67) 786 14198041 4.14E-06

3. BIRC2 329 baculoviral IAP repeat containing 2 590 22382083 0.016232

4. BIRC3 330 baculoviral IAP repeat containing 3 591 22766815 4.06E-10

5. CARHSP1 23589 calcium regulated heat stable protein 1, 24kDa 17150 13097197 0.005046

6. CASP9 842 caspase 9, apoptosis-related cysteine peptidase 1511 38014291 0.01901

7. COX6C 1345 cytochrome c oxidase subunit VIc 2285 34783038 0.208511

8. DLX3 1747 distal-less homeobox 3 2916 15214474 0.000407

9. ERCC5 2073 excision repair cross-complementing rodent repair deficiency, complementation group 5 3437 34192346 0.182423

10. HIST1H4I 8294 histone cluster 1, H4i 4793 16740964 0.011267

11. HOXC10 3226 homeobox C10 5122 12654896 0.014371

12. IRF4 3662 interferon regulatory factor 4 6119 16041743 0.018691

13. MAFG 4097 v-maf musculoaponeurotic fibrosarcoma oncogene homolog G (avian) 6781 15147379 0.062494

14. MAGEB1 4112 melanoma antigen family B, 1 6808 257796250 0.791781

15. MAP3K14 9020 mitogen-activated protein kinase kinase kinase 14 6853 23272579 0.112088

16. MAP3K7 6885 mitogen-activated protein kinase kinase kinase 7 6859 34189719 0.135015

17. MYD88 4615 myeloid differentiation primary response gene (88) 7562 15488922 0.223747

18. NEUROD4 58158 neuronal differentiation 4 13802 26454740 0.002205

19. PLD2 5338 phospholipase D2 9068 15929159 0.028245

20. PPP2R5A 5525 protein phosphatase 2, regulatory subunit B', alpha 9309 18490281 0.03865

21. PRM2 5620 protamine 2 9448 68989266 0.002365

22. PSIP1 11168 PC4 and SFRS1 interacting protein 1 9527 190014584 0.722355

23. RPL10 6134 ribosomal protein L10 10298 13097176 0.388947

24. RPS6KA6 27330 ribosomal protein S6 kinase, 90kDa, polypeptide 6 10435 283483967 0.147917

25. RQCD1 9125 RCD1 required for cell differentiation1 homolog (S. pombe) 10445 410515402 0.585852

26. SERPINB5 5268 serpin peptidase inhibitor, clade B (ovalbumin), member 5 8949 18089113 0.296317

27. SGK3 23678 serum/glucocorticoid regulated kinase family, member 3 10812 15929809 0.096028

28. SLA 6503 Src-like-adaptor 10902 13937869 0.737308

29. SMAD5 4090 SMAD family member 5 6771 34189276 0.002738

30. SUB1 10923 SUB1 homolog (S. cerevisiae) 19985 16307066 0.025215

31. TEK 7010 TEK tyrosine kinase, endothelial 11724 23273967 0.014237

32. TFE3 7030 transcription factor binding to IGHM enhancer 3 11752 19684175 0.029027

33. TGIF1 7050 TGFB-induced factor homeobox 1 11776 12654024 0.00719

34. TGIF2 60436 TGFB-induced factor homeobox 2 15764 33870164 0.128123

35. VAX2 25806 ventral anterior homeobox 2 12661 13623466 0.255454

36. WAS 7454 Wiskott-Aldrich syndrome (eczema-thrombocytopenia) 12731 15215302 0.000798

37. ZMYND11 10771 zinc finger, MYND-type containing 11 16966 21961556 0.019902

232. CSNK1G2 1455 casein kinase 1, gamma 2 2455 33870264 0.001263

233. ZNF207 7756 zinc finger protein 207 12998 33876613 9.59E-05

234. MAGEB4 4115 melanoma antigen family B, 4 6811 261878498 0.000199

235. NFKBIA 4792 nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor 7797 33876916 0.000416

236. RNASEL 6041 ribonuclease L (2'5'-oligoisoadenylate synthetase-dependent) (RNASEL) 10050 315221148 0.000612

Columns

(i) This number is the SEQ ID NO: for the coding sequence for the auto-antigen biomarker, as shown in the sequence listing.

(ii) The "Symbol" column gives the gene symbol which has been approved by the HGNC. The symbol thus identifies a unique human gene.

(iii) The "ID" column shows the Entrez GeneID number for the antigen marker. An Entrez GeneID value is unique across all taxa.

(iv) This name is taken from the Official Full Name provided by NCBI. An antigen may have been referred to by one or more pseudonyms in the prior art. The invention relates to these antigens regardless of their nomenclature.

(v) The HUGO Gene Nomenclature Committee aims to give unique and meaningful names to every human gene. The HGNC number thus identifies a unique human gene.

(vi) A "GI" number, "GenInfo Identifier", is a series of digits assigned consecutively to each sequence record processed by NCBI when sequences are added to its databases. The GI number bears no resemblance to the accession number of the sequence record. When a sequence is updated (e.g. for correction, or to add more annotation or information) it receives a new GI number. Thus the sequence associated with a given GI number is never changed. The GI numbers given here are for coding DNA sequences (except for SEQ ID NO: 7).

(vii) The "p-value" represents the p-value of a microarray T-test derived from comparing case with control.
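By way of illustration only, the comparison referred to in note (vii) can be sketched in Python as a two-sample t-test on the array signal of a single antigen. The sketch assumes Welch's unequal-variance form of the test, which is not specified herein. The function names (`t_pdf`, `two_sided_p`, `welch_t_test`) and the signal values are hypothetical and are not taken from the tables.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # probability that |T| <= |t|
    return max(0.0, 1.0 - central_mass)


def welch_t_test(case, control):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    va = variance(case) / len(case)
    vb = variance(control) / len(control)
    t = (mean(case) - mean(control)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(case) - 1) + vb ** 2 / (len(control) - 1))
    return t, df, two_sided_p(t, df)


# Hypothetical normalised array signals for one antigen (illustrative only).
case_signal = [2.9, 3.4, 3.1, 3.8, 3.3, 3.6]
control_signal = [2.1, 2.4, 2.0, 2.6, 2.2, 2.3]
t_stat, dof, p = welch_t_test(case_signal, control_signal)
print(f"t = {t_stat:.2f}, df = {dof:.1f}, p = {p:.2g}")
```

The numerical integration is used only to keep the sketch free of external dependencies. In practice a statistics library would normally be used, and very small p-values such as those in the tables are beyond the precision of this simple approach.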

TABLE 2

No: Symbol ID Name HGNC GI Frequency

38. AK3L1 205 adenylate kinase 3-like 1, transcript variant 3, 363 16740594 0.3

39. AK7 122481 adenylate kinase 7, 20091 23272320 0.3

40. ATXN3 4287 ataxin 3 7106 21708051 0.3

41. BRSK2 9024 BR serine/threonine kinase 2 11405 375298745 0.03

42. CALM1 801 calmodulin 1 (phosphorylase kinase, delta) 1442 33869376 0.63

43. CALM2 805 calmodulin 2 (phosphorylase kinase, delta), 1445 13097164 0.43

44. CASP7 840 caspase 7, apoptosis-related cysteine protease, transcript variant alpha, 1508 16041818 0.17

45. CHEK2 11200 CHK2 checkpoint homolog (S. pombe), transcript variant 1, 16627 38114706 0.03

46. CITED1 4435 Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain 1986 13278989 0.07

47. COPS6 10980 COP9 constitutive photomorphogenic homolog subunit 6 (Arabidopsis), 21749 33876807 0.03

48. DBNL 28988 src homology 3 domain-containing protein HIP-55 2696 21619482 0.17

49. DCLK1 9201 doublecortin-like kinase 1 2700 306518603 0.67

50. FIP1L1 81608 FIP1 like 1 (S. cerevisiae). 19124 17389362 0.07

51. FOXN2 3344 Homo sapiens human T-cell leukemia virus enhancer factor 5281 38648780 0.13

52. GPHN 10243 gephyrin 15465 34783414 0.1

53. GRK5 2869 G protein-coupled receptor kinase 5, mRNA (cDNA clone MGC:71228) 4544 40352898 0.43

54. IFIT5 24138 interferon-induced protein with tetratricopeptide repeats 5, 13328 19343998 0.83

55. KLHL12 59349 kelch-like protein C3IP1 19360 37588915 0.23

56. MAGEA4 4103 melanoma antigen, family A, 4, 6802 17389359 0.27

57. MAP2K1 5604 mitogen-activated protein kinase kinase 1 (MAP2K1) 6840 169790828 0.1

58. MAP2K6 5608 mitogen-activated protein kinase kinase 6, transcript variant 1, 6846 15080539 0.07

59. MATK 4145 megakaryocyte-associated tyrosine kinase, transcript variant 1, 6906 12652728 0.2

60. NHLH1 4807 nescient helix loop helix 1, 7817 15489390 0.03

61. NME5 8382 non-metastatic cells 5, protein expressed in (nucleoside-diphosphate kinase). 7853 34190528 0.03

62. PIM1 5292 pim-1 oncogene. 8986 18044377 0.4

63. PIN1 5300 protein (peptidyl-prolyl cis/trans isomerase) NIMA-interacting 1, 8988 12804092 0.1

64. PRKRA 8575 protein kinase, interferon-inducible double stranded RNA dependent activator 9438 14495716 0.13

65. PTPN11 5781 protein tyrosine phosphatase, non-receptor type 11 (Noonan syndrome 1), 9644 14250500 0.03

66. SRPK1 6732 SFRS protein kinase 1, 11305 23468344 0.8

67. SSX2 6757 synovial sarcoma, X breakpoint 2, transcript variant 2, 11336 33872900 0.03

68. STK3 6788 serine/threonine kinase 3 (STE20 homolog, yeast), 11406 34189966 0.2

69. TARDBP 23435 TAR DNA binding protein, 11571 33876276 0.13

70. TP73 7161 P73 tumor protein p73 12003 323668322 0.03

71. YARS 8565 tyrosyl-tRNA synthetase 12840 37588917 0.03

72. TFCP2 7024 transcription factor CP2 11748 33872685

TABLE 3

Known biomarkers for lupus are listed below. The measured biomarker can be (i) the presence of auto-antibody which binds to an antigen listed in Table 3 and/or (ii) the presence of an antigen listed in Table 3, but is preferably the former.

No: Symbol ID Name HGNC GI p-value

73. NPM1 4869 nucleophosmin 7910 15214851 0.045470488

74. ANXA1 301 annexin A1 533 12654862 4.74E-10

75. APOBEC3G 60489 apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G 17357 18999452 2.46E-08

76. ARAF 369 v-raf murine sarcoma 3611 viral oncogene homolog 646 33876716 1.06E-05

77. CDC25B 994 cell division cycle 25 homolog B (S. pombe) 1726 33991200 4.18E-06

78. CLK1 1195 CDC-like kinase 1 2068 21618730 0.003099

79. CREB1 1385 cAMP responsive element binding protein 1 2345 14714955 0.000104

80. CSNK2A1 1457 casein kinase 2, alpha 1 polypeptide 2457 33991298 3.85E-05

81. DEK 7913 DEK oncogene 2768 23273865 0.046016

82. DLX4 1748 distal-less homeobox 4 2917 16359376 6.69E-08

83. EGR2 1959 early growth response 2 3239 23272557 0.035327

84. EZH2 2146 enhancer of zeste homolog 2 (Drosophila) 3527 34194096 0.012993

85. FUS 2521 fused in sarcoma 4010 33875401 0.000104

86. GEM 2669 GTP binding protein overexpressed in skeletal muscle 4234 34193982 0.038199

87. HMG20B 10362 high mobility group 20B 5002 33876853 6.63E-07

88. HMGB2 3148 high mobility group box 2 5000 14705263 3.77E-09

89. HNRNPA2B1 3181 heterogeneous nuclear ribonucleoprotein A2/B1 5033 33875522 2.14E-08

90. HNRNPUL1 11100 heterogeneous nuclear ribonucleoprotein U-like 1 17011 33987968 3.61E-09

91. HOXB6 3216 homeobox B6 5117 15779174 3.95E-06

92. ID2 3398 inhibitor of DNA binding 2, dominant negative helix-loop-helix protein 5361 34190057 0.002404

93. IGF2BP3 10643 insulin-like growth factor 2 mRNA binding protein 3 28868 30795211 1.12E-09

94. IRF5 3663 interferon regulatory factor 5 6120 34782796 0.001923

95. LIN28A 79727 lin-28 homolog A (C. elegans) 15986 33872076 2.07E-08

96. LYN 4067 v-yes-1 Yamaguchi sarcoma viral related oncogene homolog 6735 50960483 0.247057

97. MAGEB2 4113 melanoma antigen family B, 2 6809 222418638 0.190527

98. MAP2K7 5609 mitogen-activated protein kinase kinase 7 6847 34192881 0.308226

99. MARK4 57787 MAP/microtubule affinity-regulating kinase 4 13538 47940615 0.391928

100. MLF1 4291 myeloid leukemia factor 1 7125 13937875 0.001328

101. MLLT3 4300 myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 3 7136 23273580 0.005034

102. NFIL3 4783 nuclear factor, interleukin 3 regulated 7787 14198273 1.37E-06

103. PABPC1 26986 poly(A) binding protein, cytoplasmic 1 8554 33872187 8.6E-12

104. PATZ1 23598 POZ (BTB) and AT hook containing zinc finger 1 13071 18088881 8.47E-06

105. PIK3C3 5289 phosphoinositide-3-kinase, class 3 8974 34783440 0.061461

106. PPP2CB 5516 protein phosphatase 2, catalytic subunit, beta isozyme 9300 15080564 0.009479

107. PRKCB 5579 protein kinase C, beta 9395 22209071 0.29732

108. PRM1 5619 protamine 1 9447 121582462 8.13E-06

109. PSME3 10197 proteasome (prosome, macropain) activator subunit 3 (PA28 gamma; Ki) 9570 33876201 2.25E-07

110. PTK2 5747 PTK2 protein tyrosine kinase 2 9611 34786073 0.042487

111. PTPN4 5775 protein tyrosine phosphatase, non-receptor type 4 (megakaryocyte) 9656 14715026 0.545816

112. PYGB 5834 phosphorylase, glycogen; brain 9723 34189295 1.09E-05

113. RRAS 6237 related RAS viral (r-ras) oncogene homolog 10447 16740850 5.19E-05

114. SH2B1 25970 SH2B adaptor protein 1 30417 14715078 0.022286

115. SMAD2 4087 SMAD family member 2 6768 15928761 0.02062

116. SMN1 6606 survival of motor neuron 1, telomeric 11117 13111817 0.004949

117. SSB 6741 Sjogren syndrome antigen B (auto-antigens La) 11316 357430791 1.78E-17

118. SSX4 6759 synovial sarcoma, X breakpoint 4 11338 13529094 0.001654

119. TROVE2 6738 TROVE domain family, member 2 11313 34192599 4.07E-18

120. VAV1 7409 vav 1 guanine nucleotide exchange factor 12657 33991319 0.000136

121. WT1 7490 Wilms tumor 1 12796 34190661 2.79E-05

122. ZAP70 7535 zeta-chain (TCR) associated protein kinase 70kDa 12858 24657845 0.000135

123. BANK1 55024 B-cell scaffold protein with ankyrin repeats 1 18233 21619549 0.499107

124. STAT4 6775 signal transducer and activator of transcription 4 11365 21411473 0.247855281

TABLE 4

Symbol Gene ID Name HGNC GI corrected p-value

COX6C 1345 cytochrome c oxidase subunit VIc 2285 34783038 0.2085109

FUS 2521 fused in sarcoma 4010 33875401 0.000104072

HNRNPA2B1 3181 heterogeneous nuclear ribonucleoprotein A2/B1 5033 33875522 2.14037E-08

TGIF1 7050 TGFB-induced factor homeobox 1 11776 12654024 0.007189694

HMGB2 3148 high mobility group box 2 5000 14705263 3.76897E-09

SMN1 6606 survival of motor neuron 1, telomeric 11117 13111817 0.004948867

ANXA1 301 annexin A1 533 12654862 4.74307E-10

HOXC10 3226 homeobox C10 5122 12654896 0.01437102

PSME3 10197 proteasome (prosome, macropain) activator subunit 3 (PA28 gamma; Ki) 9570 33876201 2.25295E-07

APEX1 328 APEX nuclease (multifunctional DNA repair enzyme) 1 587 33876570 0.1121227

CASP9 842 caspase 9, apoptosis-related cysteine peptidase 1511 38014291 0.01901049

ARAF 369 v-raf murine sarcoma 3611 viral oncogene homolog 646 33876716 1.05854E-05

HMG20B 10362 high mobility group 20B 5002 33876853 6.63004E-07

RPL10 6134 ribosomal protein L10 10298 13097176 0.3889473

CARHSP1 23589 calcium regulated heat stable protein 1, 24kDa 17150 13097197 0.005045564

IRF5 3663 interferon regulatory factor 5 6120 34782796 0.001923048

SSX4 6759 synovial sarcoma, X breakpoint 4 11338 13529094 0.001653565

VAX2 25806 ventral anterior homeobox 2 12661 13623466 0.2554537

CDC25B 994 cell division cycle 25 homolog B (S. pombe) 1726 33991200 4.1826E-06

SLA 6503 Src-like-adaptor 10902 13937869 0.737308

MLF1 4291 myeloid leukemia factor 1 7125 13937875 0.001328172

ATF4 468 activating transcription factor 4 (tax-responsive enhancer element B67) 786 14198041 4.14191E-06

NFIL3 4783 nuclear factor, interleukin 3 regulated 7787 14198273 1.36651E-06

SUB1 10923 SUB1 homolog (S. cerevisiae) 19985 16307066 0.02521463

SMAD5 4090 SMAD family member 5 6771 34189276 0.002737759

HNRNPUL1 11100 heterogeneous nuclear ribonucleoprotein U-like 1 17011 33987968 3.61217E-09

CREB1 1385 cAMP responsive element binding protein 1 2345 14714955 0.00010396

PTPN4 5775 protein tyrosine phosphatase, non-receptor type 4 (megakaryocyte) 9656 14715026 0.5458158

SH2B1 25970 SH2B adaptor protein 1 30417 14715078 0.02228639

EZH2 2146 enhancer of zeste homolog 2 (Drosophila) 3527 34194096 0.01299325

CSNK2A1 1457 casein kinase 2, alpha 1 polypeptide 2457 33991298 3.85376E-05

PPP2CB 5516 protein phosphatase 2, catalytic subunit, beta isozyme 9300 15080564 0.009478866

MAFG 4097 v-maf musculoaponeurotic fibrosarcoma oncogene homolog G (avian) 6781 15147379 0.0624943

DLX3 1747 distal-less homeobox 3 2916 15214474 0.000407169

WAS 7454 Wiskott-Aldrich syndrome (eczema-thrombocytopenia) 12731 15215302 0.000797565

TGIF2 60436 TGFB-induced factor homeobox 2 15764 33870164 0.1281234

VAV1 7409 vav 1 guanine nucleotide exchange factor 12657 33991319 0.000136377

MYD88 4615 myeloid differentiation primary response gene (88) 7562 15488922 0.2237467

HOXB6 3216 homeobox B6 5117 15779174 3.95005E-06

SMAD2 4087 SMAD family member 2 6768 15928761 0.02062045

PLD2 5338 phospholipase D2 9068 15929159 0.02824466

SGK3 23678 serum/glucocorticoid regulated kinase family, member 3 10812 15929809 0.09602847

IRF4 3662 interferon regulatory factor 4 6119 16041743 0.01869144

PABPC1 26986 poly(A) binding protein, cytoplasmic 1 8554 33872187 8.59833E-12

DLX4 1748 distal-less homeobox 4 2917 16359376 6.69202E-08

RRAS 6237 related RAS viral (r-ras) oncogene homolog 10447 16740850 5.19321E-05

HIST1H4I 8294 histone cluster 1, H4i 4793 16740964 0.01126656

PYGB 5834 phosphorylase, glycogen; brain 9723 34189295 1.0884E-05

MAP3K7 6885 mitogen-activated protein kinase kinase kinase 7 6859 34189719 0.1350147

SERPINB5 5268 serpin peptidase inhibitor, clade B (ovalbumin), member 5 8949 18089113 0.2963169

PATZ1 23598 POZ (BTB) and AT hook containing zinc finger 1 13071 18088881 8.46956E-06

GEM 2669 GTP binding protein overexpressed in skeletal muscle 4234 34193982 0.03819914

PPP2R5A 5525 protein phosphatase 2, regulatory subunit B', alpha 9309 18490281 0.03864966

APOBEC3G 60489 apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G 17357 18999452 2.46024E-08

TFE3 7030 transcription factor binding to IGHM enhancer 3 11752 19684175 0.02902675

LIN28A 79727 lin-28 homolog A (C. elegans) 15986 33872076 2.07033E-08

BIRC2 329 baculoviral IAP repeat containing 2 590 22382083 0.01623158

ID2 3398 inhibitor of DNA binding 2, dominant negative helix-loop-helix protein 5361 34190057 0.002404469

ERCC5 2073 excision repair cross-complementing rodent repair deficiency, complementation group 5 3437 34192346 0.182423

CLK1 1195 CDC-like kinase 1 2068 21618730 0.003098854

WT1 7490 Wilms tumor 1 12796 34190661 2.78892E-05

PIK3C3 5289 phosphoinositide-3-kinase, class 3 8974 34783440 0.06146097

ZMYND11 10771 zinc finger, MYND-type containing 11 16966 21961556 0.01990216

DEK 7913 DEK oncogene 2768 23273865 0.04601563

PTK2 5747 PTK2 protein tyrosine kinase 2 9611 34786073 0.04248696

TEK 7010 TEK tyrosine kinase, endothelial 11724 23273967 0.01423735

MAP3K14 9020 mitogen-activated protein kinase kinase kinase 14 6853 23272579 0.1120875

EGR2 1959 early growth response 2 3239 23272557 0.03532683

MLLT3 4300 myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 3 7136 23273580 0.005033636

PRKCB 5579 protein kinase C, beta 9395 22209071 0.2973195

TROVE2 6738 TROVE domain family, member 2 11313 34192599 4.0653E-18

BIRC3 330 baculoviral IAP repeat containing 3 591 22766815 4.05605E-10

MAP2K7 5609 mitogen-activated protein kinase kinase 7 6847 34192881 0.3082262

ZAP70 7535 zeta-chain (TCR) associated protein kinase 70kDa 12858 24657845 0.000135242

NEUROD4 58158 neuronal differentiation 4 13802 26454740 0.002205443

MARK4 57787 MAP/microtubule affinity-regulating kinase 4 13538 47940615 0.3919276

LYN 4067 v-yes-1 Yamaguchi sarcoma viral related oncogene homolog 6735 50960483 0.2470574

MAGEB1 4112 melanoma antigen family B, 1 6808 257796250 0.7917813

MAGEB2 4113 melanoma antigen family B, 2 6809 222418638 0.190527

PRM1 5619 protamine 1 9447 121582462 8.12608E-06

PRM2 5620 protamine 2 9448 68989266 0.002364523

SSB 6741 Sjogren syndrome antigen B (auto-antigens La) 11316 357430791 1.78318E-17

RQCD1 9125 RCD1 required for cell differentiation1 homolog (S. pombe) 10445 410515402 0.5858517

IGF2BP3 10643 insulin-like growth factor 2 mRNA binding protein 3 28868 30795211 1.1162E-09

RPS6KA6 27330 ribosomal protein S6 kinase, 90kDa, polypeptide 6 10435 283483967 0.1479165

PSIP1 11168 PC4 and SFRS1 interacting protein 1 9527 190014584 0.7223553

TABLE 5A

HMGB2, PRM1, PABPC1, CDC25B, CARHSP1, APOBEC3G, HNRNPUL1, HOXB6, SUB1, HMG20B, RQCD1, ANXA1, DLX4, ID2, BIRC3, PATZ1, SLA, CSNK2A1, NFIL3, ARAF, ATF4, IRF4, WT1, FUS, PPP2R5A

TROVE2, SMN1, SSB/La, PSME3, RRAS, PRM2, HNRNPA2B1, CDC25B, SSX4, HMGB2, LIN28A, APOBEC3G, IGF2BP3, CARHSP1, WAS, PABPC1, HOXB6, HNRNPUL1, PRM1, ID2, HMG20B, PPP2R5A, SUB1, PATZ1, RQCD1, IRF4, CSNK2A1, DLX4, ATF4, ANXA1, SLA, BIRC3, WT1, NFIL3, FUS, PSIP1

SSB/La, SMN1, TROVE2, HNRNPA2B1, PSME3, CDC25B, IGF2BP3, RRAS, WAS, LIN28A, SSX4, HMGB2, APOBEC3G, PABPC1, ID2, HOXB6, PRM2, PPP2R5A, HNRNPUL1, HMG20B, PRM1, CSNK2A1, ANXA1, CARHSP1, BIRC3, SUB1, PATZ1, DLX4, NFIL3, ATF4, SLA, WT1, SH2B1, FUS, ARAF, CASP9

SMN1, SSB/La, TROVE2, PSME3, IGF2BP3, WAS, PRM2, CARHSP1, RRAS, CDC25B, HNRNPA2B1, HMGB2, APOBEC3G, LIN28A, PABPC1, PRM1, HMG20B, HNRNPUL1, PPP2R5A, ANXA1, SUB1, DLX4, RQCD1, HOXB6, ATF4, ID2, PATZ1, SLA, SSX4, CSNK2A1, SH2B1, NFIL3, BIRC3, PYGB, WT1, IRF4

TROVE2, SMN1, SSB/La, PSME3, HNRNPA2B1, CDC25B, SSX4, WAS, RRAS, IGF2BP3, LIN28A, PRM2, PABPC1, HMGB2, APOBEC3G, CARHSP1, ID2, HOXB6, HMG20B, HNRNPUL1, ANXA1, RQCD1, PRM1, CSNK2A1, PATZ1, NFIL3, DLX4, BIRC3, ATF4, ARAF, SH2B1, PPP2R5A, SLA, PYGB, SMAD5, VAV1

TROVE2, SSB/La, PSME3, SMN1, RRAS, HNRNPA2B1, PRM2, LIN28A, IGF2BP3, CARHSP1, SSX4, WAS, HMGB2, CDC25B, HOXB6, SUB1, PABPC1, APOBEC3G, PRM1, HNRNPUL1, HMG20B, RQCD1, ID2, ANXA1, CSNK2A1, SH2B1, DLX4, PPP2R5A, PATZ1, SLA, ARAF, FUS, BIRC3, IRF5, NFIL3, ATF4

TROVE2, SSB/La, SMN1, HNRNPA2B1, PSME3, WAS, IGF2BP3, RRAS, CDC25B, LIN28A, SSX4, APOBEC3G, HMGB2, CARHSP1, PABPC1, HMG20B, HNRNPUL1, RQCD1, PRM2, PRM1, BIRC3, ID2, PPP2R5A, PSIP1, ANXA1, DLX4, HOXB6, ATF4, PATZ1, NFIL3, CSNK2A1, MAGEB2, PYGB, WT1, CREB1, MAFG

SMN1, SSB/La, TROVE2, PSME3, WAS, CARHSP1, CDC25B, PRM2, SSX4, HNRNPA2B1, RRAS, LIN28A, APOBEC3G, IGF2BP3, HOXB6, HMGB2, PABPC1, ID2, PRM1, HMG20B, HNRNPUL1, SUB1, PATZ1, ANXA1, CSNK2A1, IRF4, RQCD1, ATF4, BIRC3, DLX4, ARAF, PPP2R5A, NFIL3, WT1, FUS, SMAD5

TROVE2, SSB/La, HNRNPA2B1, PSME3, SMN1, RRAS, IGF2BP3, PRM2, CDC25B, LIN28A, SSX4, CARHSP1, APOBEC3G, HMGB2, WAS, PABPC1, HOXB6, HMG20B, PRM1, HNRNPUL1, SUB1, RQCD1, ID2, PPP2R5A, SH2B1, ATF4, ANXA1, BIRC3, DLX4, CSNK2A1, SLA, PATZ1, SMAD2, FUS, NFIL3, IRF4

TROVE2, SSB/La, SMN1, PSME3, HNRNPA2B1, IGF2BP3, CARHSP1, APOBEC3G, CDC25B, LIN28A, SSX4, RRAS, PABPC1, WAS, HMGB2, SUB1, HOXB6, HMG20B, ID2, HNRNPUL1, CSNK2A1, PRM1, PRM2, RQCD1, PATZ1, ANXA1, BIRC3, DLX4, WT1, NFIL3, FUS, ATF4, ARAF, IRF5, PPP2R5A, SLA

SMN1, SSB/La, TROVE2, PSME3, WAS, HNRNPA2B1, RRAS, SSX4, PRM2, IGF2BP3, LIN28A, CDC25B, HOXB6, HNRNPUL1, PABPC1, HMGB2, APOBEC3G, HMG20B, PRM1, ID2, CARHSP1, SLA, PATZ1, PPP2R5A, BIRC3, ANXA1, SH2B1, IRF4, MAGEB2, DLX4, ARAF, CSNK2A1, ATF4, RQCD1, SUB1, NFIL3

SSB/La, TROVE2, SMN1, HNRNPA2B1, PSME3, CDC25B, SSX4, WAS, RRAS, PRM2, IGF2BP3, LIN28A, CARHSP1, HMGB2, APOBEC3G, PABPC1, HOXB6, RQCD1, HMG20B, PRM1, ATF4, HNRNPUL1, BIRC3, ANXA1, ID2, PPP2R5A, PATZ1, NFIL3, CSNK2A1, DLX4, PYGB, GEM, WT1, CREB1, ARAF, FUS

SMN1, SSB/La, TROVE2, PSME3, HNRNPA2B1, CARHSP1, SSX4, PRM2, RQCD1, RRAS, LIN28A, IGF2BP3, APOBEC3G, CDC25B, WAS, HMGB2, PABPC1, PRM1, HNRNPUL1, HMG20B, ANXA1, ATF4, PPP2R5A, CSNK2A1, ID2, DLX4, BIRC3, PATZ1, IRF4, FUS, NFIL3, HOXB6, SUB1, IRF5, ARAF, WT1

TROVE2, SSB/La, SMN1, PSME3, WAS, CDC25B, HNRNPA2B1, PRM2, SSX4, APOBEC3G, RRAS, LIN28A, IGF2BP3, HMGB2, PABPC1, IRF4, PRM1, CARHSP1, HMG20B, HNRNPUL1, PPP2R5A, SUB1, DLX4, ANXA1, ID2, SLA, PATZ1, CSNK2A1, HOXB6, BIRC3, SH2B1, NFIL3, FUS, RQCD1, ATF4, PYGB

22 TROVE2, SSB/La, PSME3, SMN1, WAS, HNRNPA2B1, HMGB2, SSX4, LIN28A, APOBEC3G, RRAS, PRM2, IGF2BP3, HOXB6, CARHSP1, PABPC1, CDC25B, PPP2R5A, PRM1, SUB1, BIRC3, HNRNPUL1, ID2, RQCD1, HMG20B, ANXA1, CSNK2A1, NFIL3, DLX4, PATZ1, SLA, ATF4, FUS, IRF5, SH2B1, ARAF

23 SMN1, SSB/La, TROVE2, HNRNPA2B1, WAS, PSME3, SSX4, RRAS, IGF2BP3, LIN28A, PRM2, CDC25B, CARHSP1, APOBEC3G, RQCD1, PABPC1, HMGB2, HNRNPUL1, HOXB6, HMG20B, PRM1, ID2, SUB1, ANXA1, MAFG, PATZ1, DLX4, ATF4, NFIL3, HOXC10, PPP2R5A, BIRC3, CSNK2A1, SLA, TEK, COX6C

24 SMN1, TROVE2, SSB/La, PSME3, CDC25B, CARHSP1, RRAS, HNRNPA2B1, WAS, SSX4, IGF2BP3, HMGB2, PRM2, APOBEC3G, LIN28A, HOXB6, HMG20B, PABPC1, HNRNPUL1, BIRC3, RQCD1, IRF4, SUB1, PPP2R5A, ANXA1, CSNK2A1, PRM1, ID2, SH2B1, DLX4, GEM, PATZ1, PYGB, NFIL3, CASP9, ATF4

25 SSB/La, TROVE2, SMN1, PSME3, WAS, RRAS, IGF2BP3, HNRNPA2B1, PRM2, LIN28A, PABPC1, CDC25B, HOXB6, HMGB2, APOBEC3G, HMG20B, PRM1, CARHSP1, HNRNPUL1, SSX4, ANXA1, ATF4, PATZ1, WT1, BIRC3, SLA, SUB1, FUS, NFIL3, ID2, CSNK2A1, DLX4, PPP2R5A, PYGB, SH2B1, RQCD1

26 TROVE2, SSB/La, SMN1, PSME3, RRAS, IGF2BP3, SSX4, HNRNPA2B1, PRM2, CDC25B, LIN28A, HMG20B, CARHSP1, APOBEC3G, HMGB2, PABPC1, WAS, PRM1, HNRNPUL1, ID2, ANXA1, RQCD1, DLX4, SLA, NFIL3, BIRC3, PATZ1, CSNK2A1, ARAF, FUS, SUB1, PYGB, HOXB6, VAV1, GEM, CREB1

27 SMN1, TROVE2, SSB/La, PSME3, CDC25B, IGF2BP3, HNRNPA2B1, RRAS, LIN28A, SSX4, HMGB2, RQCD1, PABPC1, APOBEC3G, CARHSP1, HMG20B, WAS, ATF4, PRM2, PRM1, HNRNPUL1, ID2, DLX4, ANXA1, PATZ1, PPP2R5A, BIRC3, CSNK2A1, FUS, NFIL3, WT1, PYGB, HOXB6, ARAF, SLA, SH2B1

28 SSB/La, TROVE2, SMN1, PSME3, CDC25B, WAS, HNRNPA2B1, RRAS, IGF2BP3, SSX4, LIN28A, HMG20B, APOBEC3G, RQCD1, HMGB2, PABPC1, CARHSP1, PRM2, HOXB6, HNRNPUL1, ID2, PRM1, PPP2R5A, PATZ1, IRF4, SUB1, ANXA1, CSNK2A1, BIRC3, ATF4, WT1, FUS, DLX4, SLA, SH2B1, NFIL3

29 SSB/La, TROVE2, SMN1, PSME3, SSX4, WAS, RRAS, CARHSP1, HNRNPA2B1, IGF2BP3, CDC25B, PRM2, LIN28A, APOBEC3G, PABPC1, HMGB2, RQCD1, HNRNPUL1, HOXB6, PRM1, ID2, HMG20B, CSNK2A1, ANXA1, PPP2R5A, BIRC3, SLA, ATF4, PATZ1, FUS, DLX4, NFIL3, SH2B1, PYGB, WT1, MAGEB2

30 TROVE2, SSB/La, SMN1, PSME3, WAS, PRM2, HNRNPA2B1, CDC25B, LIN28A, RRAS, IGF2BP3, CARHSP1, HMGB2, APOBEC3G, PABPC1, HMG20B, PRM1, SUB1, RQCD1, HNRNPUL1, ID2, ANXA1, HOXB6, DLX4, MAGEB1, CSNK2A1, SSX4, PATZ1, MAGEB2, BIRC3, NFIL3, FUS, IRF4, PSIP1, ATF4, SLA

TABLE 5B

Panel Biomarker

1 SMN1, TROVE2, SSB/La, PSME3, HNRNPA2B1, WAS, CDC25B, PRM2, RQCD1, RRAS, SSX4, LIN28A, IGF2BP3, CARHSP1, HMGB2, APOBEC3G, PRM1, SUB1, HMG20B

2 SMN1, TROVE2, SSB/La, PSME3, RRAS, HNRNPA2B1, IGF2BP3, CDC25B, LIN28A, WAS, RQCD1, CARHSP1, HMGB2, APOBEC3G, SUB1, PPP2R5A, HMG20B, MAFG, PRM2

3 SMN1, SSB/La, TROVE2, PSME3, SSX4, CDC25B, CARHSP1, RRAS, HNRNPA2B1, PRM2, IGF2BP3, LIN28A, WAS, HMGB2, HMG20B, APOBEC3G, PPP2R5A, PRM1, PABPC1

4 SSB/La, TROVE2, SMN1, PSME3, SSX4, PRM2, RQCD1, RRAS, CDC25B, WAS, HNRNPA2B1, IGF2BP3, APOBEC3G, CARHSP1, LIN28A, HMGB2, PRM1, ID2, HOXB6

5 SSB/La, SMN1, TROVE2, PSME3, RRAS, CARHSP1, SSX4, CDC25B, HNRNPA2B1, HMGB2, LIN28A, SUB1, IGF2BP3, PPP2R5A, ID2, APOBEC3G, PABPC1, RQCD1, HMG20B

6 TROVE2, SSB/La, SMN1, PSME3, HNRNPA2B1, SSX4, CARHSP1, CDC25B, RRAS, PRM2, RQCD1, LIN28A, WAS, IGF2BP3, APOBEC3G, SUB1, PABPC1, HMGB2, HMG20B

7 SMN1, SSB/La, TROVE2, PSME3, WAS, RRAS, SSX4, CDC25B, HNRNPA2B1, PRM2, HMGB2, LIN28A, IGF2BP3, CARHSP1, SUB1, RQCD1, HOXB6, ID2, PPP2R5A

8 SSB/La, SMN1, TROVE2, WAS, PSME3, RRAS, HNRNPA2B1, PRM2, SSX4, LIN28A, IGF2BP3, HMGB2, CARHSP1, CDC25B, PRM1, SUB1, RQCD1, APOBEC3G, HOXB6

9 SMN1, TROVE2, SSB/La, PSME3, RRAS, PRM2, CDC25B, HNRNPA2B1, SSX4, HMGB2, LIN28A, CARHSP1, APOBEC3G, IGF2BP3, WAS, HOXB6, ID2, PRM1, SUB1

10 SSB/La, SMN1, TROVE2, HNRNPA2B1, PSME3, CDC25B, IGF2BP3, RRAS, WAS, LIN28A, HMGB2, SSX4, APOBEC3G, ID2, PRM2, PPP2R5A, PABPC1, SUB1, CARHSP1

11 SMN1, SSB/La, TROVE2, PSME3, WAS, PRM2, IGF2BP3, CARHSP1, CDC25B, RRAS, HNRNPA2B1, APOBEC3G, HMGB2, LIN28A, PRM1, SUB1, RQCD1, PPP2R5A, PABPC1

12 SMN1, SSB/La, TROVE2, PSME3, CDC25B, HNRNPA2B1, WAS, RRAS, SSX4, IGF2BP3, PRM2, LIN28A, CARHSP1, RQCD1, HMGB2, ID2, APOBEC3G, PABPC1, HOXB6

13 TROVE2, SSB/La, PSME3, SMN1, RRAS, PRM2, HNRNPA2B1, CARHSP1, IGF2BP3, LIN28A, WAS, CDC25B, SSX4, SUB1, HMGB2, APOBEC3G, RQCD1, PRM1, HOXB6

14 TROVE2, SSB/La, SMN1, HNRNPA2B1, PSME3, WAS, CDC25B, RRAS, IGF2BP3, LIN28A, CARHSP1, APOBEC3G, RQCD1, HMGB2, SSX4, PSIP1, PRM2, ID2, PABPC1

15 SMN1, SSB/La, TROVE2, PSME3, WAS, CARHSP1, CDC25B, PRM2, HNRNPA2B1, RRAS, SSX4, LIN28A, APOBEC3G, IGF2BP3, ID2, HMGB2, RQCD1, HOXB6, SUB1

16 TROVE2, SSB/La, HNRNPA2B1, PSME3, SMN1, RRAS, IGF2BP3, PRM2, CDC25B, CARHSP1, LIN28A, SSX4, APOBEC3G, WAS, RQCD1, HMGB2, SUB1, HMG20B, HOXB6

17 SSB/La, SMN1, TROVE2, PSME3, HNRNPA2B1, CARHSP1, IGF2BP3, CDC25B, APOBEC3G, RRAS, LIN28A, SSX4, SUB1, WAS, HMGB2, PABPC1, HOXB6, ID2, RQCD1

18 SMN1, SSB/La, TROVE2, PSME3, HNRNPA2B1, RRAS, WAS, PRM2, SSX4, IGF2BP3, CDC25B, LIN28A, CARHSP1, PRM1, APOBEC3G, ID2, SLA, HMGB2, HOXB6

19 SSB/La, SMN1, TROVE2, HNRNPA2B1, PSME3, CDC25B, WAS, SSX4, PRM2, RRAS, CARHSP1, IGF2BP3, LIN28A, RQCD1, APOBEC3G, HMGB2, HOXB6, PRM1, HMG20B

20 SSB/La, SMN1, TROVE2, PSME3, HNRNPA2B1, CARHSP1, RQCD1, PRM2, SSX4, RRAS, LIN28A, CDC25B, IGF2BP3, WAS, APOBEC3G, HMGB2, PRM1, PABPC1, SUB1

21 SSB/La, TROVE2, SMN1, PSME3, WAS, CDC25B, PRM2, HNRNPA2B1, RRAS, APOBEC3G, SSX4, LIN28A, IGF2BP3, HMGB2, CARHSP1, IRF4, PRM1, SUB1, PPP2R5A

22 TROVE2, SSB/La, PSME3, SMN1, WAS, HNRNPA2B1, SSX4, APOBEC3G, RRAS, LIN28A, HMGB2, PRM2, IGF2BP3, CARHSP1, CDC25B, SUB1, PPP2R5A, RQCD1, PRM1

23 SMN1, SSB/La, TROVE2, HNRNPA2B1, WAS, PSME3, RRAS, SSX4, PRM2, IGF2BP3, CDC25B, RQCD1, CARHSP1, LIN28A, APOBEC3G, HMGB2, SUB1, PRM1, ID2

24 SMN1, SSB/La, TROVE2, PSME3, CARHSP1, CDC25B, RRAS, HNRNPA2B1, WAS, PRM2, SSX4, IGF2BP3, APOBEC3G, HMGB2, RQCD1, SUB1, LIN28A, HOXB6, IRF4

25 SSB/La, TROVE2, SMN1, PSME3, WAS, RRAS, IGF2BP3, HNRNPA2B1, PRM2, LIN28A, CDC25B, CARHSP1, APOBEC3G, HMGB2, PRM1, HOXB6, SUB1, PABPC1, RQCD1

26 TROVE2, SSB/La, SMN1, PSME3, RRAS, PRM2, IGF2BP3, CDC25B, HNRNPA2B1, LIN28A, SSX4, CARHSP1, HMGB2, APOBEC3G, HMG20B, WAS, RQCD1, PRM1, PABPC1

27 SMN1, SSB/La, TROVE2, PSME3, CDC25B, IGF2BP3, HNRNPA2B1, RRAS, RQCD1, LIN28A, SSX4, CARHSP1, APOBEC3G, HMGB2, PRM2, WAS, ID2, PABPC1, HMG20B

28 SSB/La, SMN1, TROVE2, PSME3, WAS, CDC25B, HNRNPA2B1, RRAS, RQCD1, IGF2BP3, SSX4, LIN28A, CARHSP1, APOBEC3G, HMG20B, HMGB2, PRM2, PABPC1, PPP2R5A

29 SSB/La, TROVE2, SMN1, PSME3, WAS, SSX4, CARHSP1, RRAS, HNRNPA2B1, CDC25B, PRM2, IGF2BP3, LIN28A, RQCD1, APOBEC3G, HMGB2, ID2, PABPC1, HNRNPUL1

30 SSB/La, TROVE2, SMN1, PSME3, WAS, PRM2, HNRNPA2B1, CDC25B, RRAS, IGF2BP3, CARHSP1, LIN28A, HMGB2, APOBEC3G, SUB1, RQCD1, PRM1, HMG20B, PABPC1

TABLE 5C

Panel Biomarker

1 TROVE2, SSB/La, PABPC1, ANXA1, HNRNPUL1, BIRC3, HMG20B, IGF2BP3, LIN28A, NFIL3, ATF4, HMGB2, DLX4, HNRNPA2B1, ARAF, PYGB, WT1, APOBEC3G, CSNK2A1, PSME3, PATZ1, FUS, VAV1, CREB1, HOXB6

2 TROVE2, SSB/La, PABPC1, HNRNPUL1, ANXA1, IGF2BP3, HMG20B, LIN28A, NFIL3, HMGB2, BIRC3, DLX4, APOBEC3G, HNRNPA2B1, PYGB, SSX4, CSNK2A1, ATF4, WT1, ARAF, CREB1, ZAP70, VAV1, HOXB6, CDC25B

3 SSB/La, TROVE2, PABPC1, IGF2BP3, HMG20B, HNRNPA2B1, BIRC3, PSME3, ANXA1, HMGB2, CDC25B, APOBEC3G, LIN28A, HNRNPUL1, DLX4, ATF4, RRAS, NFIL3, CSNK2A1, FUS, PRM1, HOXB6, PYGB, CARHSP1, SMAD5

4 TROVE2, SSB/La, PABPC1, BIRC3, ANXA1, HMG20B, HMGB2, HNRNPUL1, IGF2BP3, LIN28A, APOBEC3G, NFIL3, ATF4, HNRNPA2B1, DLX4, PATZ1, CSNK2A1, RRAS, PYGB, HOXB6, CDC25B, FUS, VAV1, PRM1, PSME3

5 SSB/La, TROVE2, PABPC1, BIRC3, ANXA1, HMG20B, HMGB2, DLX4, HNRNPUL1, PSME3, HNRNPA2B1, LIN28A, IGF2BP3, NFIL3, ZAP70, ATF4, CSNK2A1, WT1, RRAS, PYGB, CDC25B, HOXB6, PATZ1, ARAF, APOBEC3G

6 TROVE2, SSB/La, BIRC3, PABPC1, HNRNPA2B1, APOBEC3G, IGF2BP3, HMG20B, ATF4, ANXA1, LIN28A, PSME3, HNRNPUL1, HMGB2, HOXB6, DLX4, PATZ1, ARAF, CDC25B, WT1, NFIL3, CSNK2A1, PRM1, CREB1, FUS

7 TROVE2, SSB/La, PABPC1, ANXA1, HMG20B, HNRNPUL1, HNRNPA2B1, BIRC3, HMGB2, IGF2BP3, LIN28A, APOBEC3G, DLX4, CDC25B, PSME3, ATF4, PATZ1, NFIL3, PRM1, CSNK2A1, HOXB6, WT1, ZAP70, PYGB, ARAF

8 SSB/La, TROVE2, PABPC1, HNRNPUL1, ANXA1, IGF2BP3, DLX4, LIN28A, BIRC3, HMG20B, HMGB2, PRM1, APOBEC3G, HNRNPA2B1, NFIL3, PSME3, ATF4, ARAF, PYGB, RRAS, CDC25B, HOXB6, WT1, FUS, PATZ1

9 TROVE2, SSB/La, PABPC1, HMG20B, APOBEC3G, IGF2BP3, HMGB2, PSME3, LIN28A, HNRNPUL1, ANXA1, HNRNPA2B1, BIRC3, DLX4, RRAS, ATF4, CDC25B, PRM1, NFIL3, CSNK2A1, ID2, FUS, HOXB6, PYGB, PATZ1

10 TROVE2, SSB/La, ANXA1, PABPC1, BIRC3, HNRNPUL1, HNRNPA2B1, LIN28A, CSNK2A1, IGF2BP3, DLX4, CDC25B, RRAS, ATF4, HMG20B, NFIL3, APOBEC3G, SH2B1, HMGB2, FUS, PSME3, WT1, PRM1, ARAF, HOXB6

11 TROVE2, SSB/La, PABPC1, IGF2BP3, HMG20B, ANXA1, HMGB2, HNRNPUL1, LIN28A, DLX4, HNRNPA2B1, HOXB6, PRM1, ATF4, NFIL3, PSME3, BIRC3, APOBEC3G, CDC25B, DLX3, PYGB, SSX4, PATZ1, CREB1, RRAS

12 TROVE2, SSB/La, PABPC1, HMG20B, ANXA1, HNRNPUL1, IGF2BP3, BIRC3, DLX4, HNRNPA2B1, LIN28A, HMGB2, NFIL3, ATF4, APOBEC3G, CDC25B, ARAF, PATZ1, PYGB, PSME3, CSNK2A1, WT1, ZAP70, VAV1, SMAD5

13 TROVE2, SSB/La, PABPC1, HNRNPUL1, ANXA1, LIN28A, HMGB2, IGF2BP3, HMG20B, HNRNPA2B1, PSME3, APOBEC3G, BIRC3, HOXB6, DLX4, ATF4, PRM1, CSNK2A1, RRAS, NFIL3, CARHSP1, PYGB, IRF5, CDC25B, FUS

14 TROVE2, SSB/La, PABPC1, BIRC3, ANXA1, HNRNPA2B1, HNRNPUL1, IGF2BP3, LIN28A, DLX4, HOXB6, HMG20B, HMGB2, ATF4, NFIL3, APOBEC3G, WT1, PYGB, PRM1, PSME3, ZAP70, PATZ1, RRAS, DLX3, VAV1

15 TROVE2, SSB/La, PABPC1, BIRC3, HMGB2, ANXA1, HMG20B, LIN28A, IGF2BP3, HNRNPUL1, APOBEC3G, HNRNPA2B1, ATF4, PATZ1, PSME3, DLX4, HOXB6, NFIL3, ARAF, WT1, PYGB, CDC25B, RRAS, VAV1, ID2

16 TROVE2, SSB/La, IGF2BP3, PABPC1, LIN28A, BIRC3, HMGB2, ANXA1, ATF4, HNRNPA2B1, HMG20B, DLX4, HNRNPUL1, APOBEC3G, PSME3, HOXB6, PATZ1, PRM1, NFIL3, ARAF, RRAS, CDC25B, PYGB, CSNK2A1, CREB1

17 TROVE2, SSB/La, PABPC1, IGF2BP3, APOBEC3G, HNRNPA2B1, BIRC3, PSME3, ANXA1, HNRNPUL1, HMGB2, HMG20B, DLX4, CDC25B, LIN28A, ATF4, NFIL3, PRM1, PATZ1, HOXB6, CSNK2A1, CREB1, ARAF, WAS, PYGB

18 TROVE2, SSB/La, BIRC3, PABPC1, HNRNPA2B1, ANXA1, HNRNPUL1, IGF2BP3, HMG20B, HMGB2, DLX4, LIN28A, PSME3, ATF4, HOXB6, ARAF, CDC25B, APOBEC3G, NFIL3, RRAS, PRM1, ID2, PYGB, CREB1, VAV1

19 SSB/La, TROVE2, PABPC1, BIRC3, ANXA1, HNRNPA2B1, ATF4, NFIL3, HMGB2, IGF2BP3, LIN28A, DLX4, HMG20B, PYGB, HNRNPUL1, CREB1, CDC25B, ZAP70, CSNK2A1, PATZ1, ARAF, VAV1, APOBEC3G, WT1, PSME3

20 TROVE2, SSB/La, PABPC1, ANXA1, BIRC3, DLX4, HNRNPUL1, ATF4, HNRNPA2B1, NFIL3, APOBEC3G, HMG20B, HOXB6, LIN28A, ARAF, PATZ1, IRF5, WT1, VAV1, IGF2BP3, HMGB2, PSME3, PYGB, CSNK2A1, ZAP70

21 TROVE2, SSB/La, PABPC1, ANXA1, BIRC3, IGF2BP3, LIN28A, DLX4, HOXB6, HMG20B, APOBEC3G, HMGB2, HNRNPUL1, HNRNPA2B1, CDC25B, NFIL3, PYGB, ATF4, ARAF, WT1, PRM1, NEUROD4, ZAP70, PATZ1, CREB1

22 TROVE2, SSB/La, BIRC3, HMGB2, ANXA1, PABPC1, HMG20B, HNRNPUL1, LIN28A, DLX4, APOBEC3G, HOXB6, IGF2BP3, PSME3, ATF4, CSNK2A1, NFIL3, HNRNPA2B1, ARAF, PRM2, PRM1, WT1, IRF5, RRAS, PYGB

23 TROVE2, SSB/La, PABPC1, HNRNPUL1, ANXA1, HMG20B, HNRNPA2B1, NFIL3, DLX4, IGF2BP3, BIRC3, LIN28A, ATF4, VAV1, PATZ1, HMGB2, PYGB, ARAF, ZAP70, APOBEC3G, CREB1, HOXB6, CDC25B, PSME3, ZMYND11

24 TROVE2, SSB/La, PABPC1, HMG20B, BIRC3, HNRNPUL1, ANXA1, HMGB2, IGF2BP3, DLX4, LIN28A, CDC25B, NFIL3, ATF4, PYGB, HOXB6, HNRNPA2B1, ZAP70, CSNK2A1, APOBEC3G, PATZ1, WT1, PSME3, ARAF, VAV1

25 TROVE2, SSB/La, ANXA1, PABPC1, IGF2BP3, BIRC3, HMG20B, ATF4, HNRNPUL1, HMGB2, SSX4, NFIL3, LIN28A, DLX4, WT1, PYGB, FUS, ARAF, HNRNPA2B1, ZAP70, CREB1, PATZ1, CSNK2A1, APOBEC3G, PSME3

26 TROVE2, SSB/La, PABPC1, IGF2BP3, ANXA1, HMG20B, HNRNPA2B1, BIRC3, HNRNPUL1, LIN28A, APOBEC3G, HMGB2, ARAF, NFIL3, DLX4, PATZ1, CDC25B, PYGB, PRM1, CSNK2A1, HOXB6, PSME3, CREB1, VAV1, ATF4

27 TROVE2, SSB/La, PABPC1, IGF2BP3, ANXA1, LIN28A, HMG20B, ATF4, BIRC3, HMGB2, DLX4, NFIL3, HNRNPA2B1, CDC25B, HNRNPUL1, PYGB, APOBEC3G, HOXB6, CREB1, PATZ1, ARAF, PSME3, FUS, PRM1, VAV1

28 TROVE2, SSB/La, HMG20B, PABPC1, IGF2BP3, LIN28A, HNRNPA2B1, HNRNPUL1, ANXA1, HMGB2, BIRC3, APOBEC3G, DLX4, ATF4, PATZ1, NFIL3, HOXB6, PYGB, CDC25B, PRM1, WT1, PSME3, RRAS, SMAD5, PRM2

29 TROVE2, SSB/La, PABPC1, IGF2BP3, HNRNPUL1, ANXA1, BIRC3, LIN28A, ATF4, APOBEC3G, HMGB2, HMG20B, HNRNPA2B1, DLX4, NFIL3, PATZ1, PSME3, PYGB, HOXB6, WT1, PRM1, CDC25B, FUS, RRAS, ZAP70

30 TROVE2, SSB/La, PABPC1, HMG20B, LIN28A, ANXA1, HNRNPA2B1, HOXB6, HMGB2, IGF2BP3, BIRC3, DLX4, HNRNPUL1, PSME3, PRM1, APOBEC3G, ATF4, CDC25B, NFIL3, SSX4, RRAS, ARAF, PYGB, IRF5, ZMYND11

TABLE 5D

Panel Biomarker

1 SSB/La, TROVE2, HMGB2, HMG20B, SMN1, PABPC1, ANXA1, BIRC3, HNRNPA2B1, IGF2BP3, CDC25B, RQCD1

2 TROVE2, SSB/La, PABPC1, HMGB2, HMG20B, BIRC3, ANXA1, RQCD1, IGF2BP3, HNRNPA2B1, LYN, NFIL3

3 SSB/La, TROVE2, PABPC1, HMGB2, RQCD1, HNRNPA2B1, SMN1, BIRC3, LYN, MARK4, HMG20B, CDC25B

4 TROVE2, SSB/La, HMGB2, HMG20B, PABPC1, ANXA1, SMN1, BIRC3, APEX1, RQCD1, HNRNPA2B1, NFIL3

5 TROVE2, SSB/La, BIRC3, HMGB2, PABPC1, SMN1, HMG20B, RQCD1, ANXA1, HNRNPA2B1, LYN, NFIL3

6 SSB/La, TROVE2, BIRC3, RQCD1, HMG20B, HMGB2, SMN1, PABPC1, ANXA1, LYN, HNRNPA2B1, RPL10

7 TROVE2, SSB/La, MARK4, LYN, HMGB2, HMG20B, PABPC1, HNRNPA2B1, BIRC3, RPL10, ANXA1, SMN1

8 SSB/La, TROVE2, HMGB2, PABPC1, ANXA1, HMG20B, BIRC3, SMN1, RQCD1, NFIL3, DLX4, APEX1

9 TROVE2, SSB/La, HMGB2, PABPC1, HMG20B, ANXA1, BIRC3, SMN1, RQCD1, HNRNPA2B1, LYN, NFIL3

10 TROVE2, SSB/La, RQCD1, ANXA1, CSNK2A1, BIRC3, SMN1, PABPC1, HMGB2, CDC25B, HMG20B, NFIL3

11 TROVE2, SSB/La, PABPC1, RQCD1, HMGB2, HMG20B, IGF2BP3, ANXA1, HNRNPA2B1, DLX3, APEX1, BIRC3

12 SSB/La, TROVE2, PABPC1, HMGB2, ANXA1, HMG20B, BIRC3, RQCD1, SMN1, HNRNPA2B1, NFIL3, IGF2BP3

13 TROVE2, SSB/La, HMGB2, PABPC1, HMG20B, ANXA1, HNRNPA2B1, BIRC3, APEX1, LIN28A, HOXB6, SMN1

14 TROVE2, SSB/La, PABPC1, HMGB2, BIRC3, ANXA1, HMG20B, RQCD1, HNRNPA2B1, NFIL3, SMN1, ATF4

15 TROVE2, SSB/La, PABPC1, HMGB2, RQCD1, HMG20B, BIRC3, SMN1, ANXA1, HNRNPA2B1, PATZ1, LIN28A

16 SSB/La, TROVE2, RQCD1, HMGB2, PABPC1, BIRC3, ANXA1, HNRNPA2B1, SMN1, IGF2BP3, HMG20B, ATF4

17 TROVE2, SSB/La, PABPC1, HMGB2, HNRNPA2B1, MARK4, LYN, RPL10, BIRC3, HMG20B, RQCD1, ANXA1

18 TROVE2, SSB/La, SMN1, BIRC3, RQCD1, HNRNPA2B1, HMG20B, PABPC1, LYN, ANXA1, HMGB2, MARK4

19 SSB/La, TROVE2, BIRC3, HMGB2, PABPC1, RQCD1, ANXA1, SMN1, HMG20B, NFIL3, WAS, ATF4

20 TROVE2, SSB/La, RQCD1, ANXA1, BIRC3, SMN1, PABPC1, NFIL3, HMGB2, HMG20B, ARAF, DLX4

21 TROVE2, SSB/La, RQCD1, BIRC3, ANXA1, HMGB2, HMG20B, PABPC1, PSME3, HNRNPA2B1, DLX4, SMN1

22 SSB/La, TROVE2, BIRC3, HMGB2, SMN1, ANXA1, HMG20B, PABPC1, RQCD1, HOXB6, NFIL3, ARAF

23 TROVE2, SSB/La, PABPC1, RQCD1, HMG20B, ANXA1, HMGB2, BIRC3, HNRNPUL1, NFIL3, HNRNPA2B1, CDC25B

24 SSB/La, TROVE2, HMG20B, PABPC1, HMGB2, BIRC3, ANXA1, RQCD1, HNRNPA2B1, HNRNPUL1, IGF2BP3, WAS

25 TROVE2, SSB/La, BIRC3, HMGB2, PABPC1, ANXA1, HMG20B, RQCD1, NFIL3, SMN1, ATF4, APEX1

26 TROVE2, SSB/La, PABPC1, HMG20B, RQCD1, ANXA1, HNRNPA2B1, HMGB2, BIRC3, CDC25B, NFIL3, IGF2BP3

27 TROVE2, SSB/La, PABPC1, HMGB2, HMG20B, RQCD1, ANXA1, BIRC3, HNRNPA2B1, IGF2BP3, APEX1, ATF4

28 TROVE2, SSB/La, HMG20B, HNRNPA2B1, HMGB2, PABPC1, BIRC3, ANXA1, SMN1, RQCD1, IGF2BP3, LIN28A

29 SSB/La, TROVE2, PABPC1, BIRC3, RQCD1, HMGB2, ANXA1, HMG20B, NFIL3, HNRNPA2B1, ATF4, IGF2BP3

30 TROVE2, SSB/La, HMGB2, HMG20B, PABPC1, ANXA1, RQCD1, BIRC3, HNRNPA2B1, SMN1, ARAF, LIN28A

TABLE 5E

Panel Biomarker

1 SSB/La, TROVE2, HMGB2, HMG20B, SMN1, PABPC1, ANXA1, BIRC3, HNRNPA2B1, IGF2BP3, CDC25B, LIN28A, DLX4, NFIL3

2 TROVE2, SSB/La, PABPC1, HMGB2, HMG20B, BIRC3, ANXA1, IGF2BP3, HNRNPA2B1, NFIL3, HNRNPUL1, LIN28A, SMN1, PYGB

3 SSB/La, TROVE2, PABPC1, HMGB2, HNRNPA2B1, BIRC3, SMN1, HMG20B, ANXA1, CDC25B, IGF2BP3, APEX1, NFIL3, APOBEC3G

4 TROVE2, SSB/La, HMGB2, HMG20B, PABPC1, ANXA1, SMN1, BIRC3, APEX1, HNRNPA2B1, NFIL3, CSNK2A1, ATF4, PSIP1

5 TROVE2, SSB/La, BIRC3, HMGB2, PABPC1, SMN1, HMG20B, ANXA1, HNRNPA2B1, NFIL3, SSX4, ZAP70, PATZ1, CSNK2A1

6 SSB/La, TROVE2, BIRC3, HMG20B, HMGB2, PABPC1, SMN1, ANXA1, HNRNPA2B1, ATF4, IGF2BP3, LIN28A, NFIL3, SSX4

7 TROVE2, SSB/La, HMGB2, HMG20B, PABPC1, BIRC3, HNRNPA2B1, ANXA1, SMN1, APOBEC3G, CSNK2A1, NFIL3, APEX1, CDC25B

8 SSB/La, TROVE2, HMGB2, PABPC1, ANXA1, HMG20B, BIRC3, SMN1, NFIL3, APEX1, DLX4, HNRNPA2B1, HNRNPUL1, PYGB

9 TROVE2, SSB/La, HMGB2, PABPC1, HMG20B, ANXA1, BIRC3, SMN1, HNRNPA2B1, NFIL3, IGF2BP3, CDC25B, APOBEC3G, CSNK2A1

10 TROVE2, SSB/La, ANXA1, CSNK2A1, BIRC3, SMN1, PABPC1, HMGB2, CDC25B, HMG20B, NFIL3, SH2B1, ARAF, ATF4

11 TROVE2, SSB/La, HMGB2, PABPC1, HMG20B, IGF2BP3, ANXA1, HNRNPA2B1, DLX3, BIRC3, APEX1, LIN28A, DLX4, NFIL3

12 SSB/La, TROVE2, PABPC1, HMGB2, ANXA1, HMG20B, BIRC3, SMN1, HNRNPA2B1, NFIL3, IGF2BP3, HNRNPUL1, APEX1, ATF4

13 TROVE2, SSB/La, HMGB2, PABPC1, HMG20B, ANXA1, HNRNPA2B1, BIRC3, HOXB6, APEX1, SMN1, LIN28A, CSNK2A1, PSIP1

14 TROVE2, SSB/La, PABPC1, HMGB2, BIRC3, HMG20B, ANXA1, NFIL3, HNRNPA2B1, SMN1, ATF4, HNRNPUL1, DLX4, IGF2BP3

15 TROVE2, SSB/La, PABPC1, HMGB2, HMG20B, BIRC3, SMN1, ANXA1, HNRNPA2B1, PATZ1, LIN28A, HOXB6, NFIL3, ATF4

16 SSB/La, TROVE2, HMGB2, PABPC1, BIRC3, ANXA1, HNRNPA2B1, SMN1, HMG20B, IGF2BP3, ATF4, HOXB6, LIN28A, NFIL3

17 TROVE2, SSB/La, PABPC1, HMGB2, HNRNPA2B1, BIRC3, HMG20B, ANXA1, IGF2BP3, SMN1, NFIL3, CDC25B, WAS, APOBEC3G

18 TROVE2, SSB/La, SMN1, BIRC3, HNRNPA2B1, HMG20B, PABPC1, ANXA1, HMGB2, NFIL3, ATF4, SSX4, ARAF, CDC25B

19 SSB/La, TROVE2, BIRC3, HMGB2, PABPC1, ANXA1, SMN1, HMG20B, NFIL3, WAS, ATF4, HNRNPA2B1, SH2B1, PATZ1

20 TROVE2, SSB/La, ANXA1, BIRC3, SMN1, PABPC1, NFIL3, HMGB2, HMG20B, ARAF, DLX4, CSNK2A1, PATZ1, SSX4

21 TROVE2, SSB/La, BIRC3, ANXA1, HMGB2, PABPC1, HMG20B, HNRNPA2B1, DLX4, SMN1, IGF2BP3, NFIL3, DLX3, LIN28A

22 SSB/La, TROVE2, BIRC3, HMGB2, SMN1, ANXA1, HMG20B, PABPC1, HOXB6, NFIL3, ARAF, HNRNPA2B1, DLX4, PSIP1

23 TROVE2, SSB/La, PABPC1, ANXA1, HMG20B, HMGB2, BIRC3, NFIL3, HNRNPUL1, HNRNPA2B1, CDC25B, ATF4, IGF2BP3, SMN1

24 SSB/La, TROVE2, HMG20B, PABPC1, HMGB2, BIRC3, ANXA1, HNRNPA2B1, HNRNPUL1, IGF2BP3, WAS, SMN1, NFIL3, LIN28A

25 TROVE2, SSB/La, BIRC3, HMGB2, PABPC1, ANXA1, HMG20B, NFIL3, ATF4, SMN1, APEX1, WAS, IGF2BP3, SSX4

26 TROVE2, SSB/La, PABPC1, HMG20B, ANXA1, HNRNPA2B1, HMGB2, BIRC3, CDC25B, NFIL3, IGF2BP3, PYGB, LIN28A, ARAF

27 TROVE2, SSB/La, PABPC1, HMGB2, HMG20B, ANXA1, BIRC3, IGF2BP3, HNRNPA2B1, APEX1, ATF4, LIN28A, NFIL3, SMN1

28 TROVE2, SSB/La, HMG20B, HNRNPA2B1, HMGB2, PABPC1, BIRC3, ANXA1, SMN1, IGF2BP3, LIN28A, NFIL3, ATF4, HNRNPUL1

29 SSB/La, TROVE2, PABPC1, BIRC3, HMGB2, ANXA1, NFIL3, HMG20B, ATF4, HNRNPA2B1, IGF2BP3, HNRNPUL1, PATZ1, SMN1

30 TROVE2, SSB/La, HMGB2, HMG20B, PABPC1, ANXA1, BIRC3, HNRNPA2B1, SMN1, ARAF, LIN28A, NFIL3, HOXB6, DLX4

TABLE 5F

Panel Biomarker

1 TROVE2, PSME3, BIRC3, HMG20B, CDC25B, RQCD1, CSNK2A1, PSIP1, PRM1, SGK3, SMN1, PATZ1, SSX4, EGR2

2 TROVE2, PSME3, BIRC3, CDC25B, CSNK2A1, RQCD1, PSIP1, HOXB6, CARHSP1, IGF2BP3, SMN1, MARK4, HOXC10, EGR2

3 TROVE2, BIRC3, PSME3, HMG20B, CDC25B, RQCD1, PSIP1, ANXA1, PPP2CB, FUS, SLA, HIST1H4I, PABPC1, SSX4

4 TROVE2, BIRC3, HNRNPA2B1, CDC25B, PSME3, CSNK2A1, RQCD1, IGF2BP3, CARHSP1, ATF4, PSIP1, DLX4, SLA, SMN1

5 TROVE2, BIRC3, HNRNPA2B1, PSME3, CDC25B, PSIP1, HMG20B, RQCD1, CSNK2A1, HMGB2, CARHSP1, IGF2BP3, HIST1H4I, VAV1

6 TROVE2, HNRNPA2B1, BIRC3, PSME3, CDC25B, PSIP1, SSX4, RQCD1, ANXA1, HIST1H4I, HOXC10, MLLT3, MAP3K7, SLA

7 TROVE2, HNRNPA2B1, BIRC3, PSME3, CDC25B, PSIP1, PRM1, IGF2BP3, SMN1, MARK4, SUB1, RRAS, CSNK2A1, RQCD1

8 TROVE2, HNRNPA2B1, BIRC3, PSME3, HMG20B, CDC25B, PSIP1, RQCD1, CSNK2A1, HMGB2, SMAD5, SLA, PRM1, ATF4

9 TROVE2, HNRNPA2B1, BIRC3, PSME3, HMG20B, CDC25B, RQCD1, CSNK2A1, PSIP1, SMN1, HOXC10, HIST1H4I, ANXA1, CLK1

10 TROVE2, BIRC3, HNRNPA2B1, CDC25B, PSME3, CSNK2A1, RQCD1, HOXC10, SMN1, PSIP1, SSB/La, SERPINB5, RRAS, HMGB2

11 TROVE2, SMN1, IGF2BP3, CDC25B, RQCD1, PSIP1, PSME3, ARAF, CSNK2A1, BIRC3, HMGB2, PPP2CB, HOXC10, HIST1H4I

12 TROVE2, PSME3, CDC25B, BIRC3, HMG20B, PSIP1, RQCD1, CSNK2A1, IGF2BP3, SMN1, HIST1H4I, PRKCB, PABPC1, SSX4

13 TROVE2, BIRC3, APOBEC3G, HNRNPA2B1, PSIP1, PSME3, CDC25B, RQCD1, ANXA1, HMG20B, HIST1H4I, HOXC10, SSX4, TGIF1

14 SSB/La, PSME3, ANXA1, CDC25B, RQCD1, PSIP1, BIRC3, TROVE2, HMG20B, HIST1H4I, HOXC10, SSX4, PABPC1, COX6C

15 TROVE2, HMGB2, PSME3, CDC25B, BIRC3, RQCD1, IGF2BP3, RRAS, HIST1H4I, CREB1, SSX4, PABPC1, WAS, SH2B1

16 TROVE2, PSME3, ANXA1, CDC25B, BIRC3, HIST1H4I, HOXC10, HMG20B, RQCD1, PSIP1, SMN1, MLLT3, CARHSP1, ATF4

17 SSB/La, BIRC3, CDC25B, PSME3, PABPC1, RQCD1, PSIP1, ANXA1, IRF5, SSX4, HOXC10, HIST1H4I, TROVE2, SMN1

18 TROVE2, SMN1, IGF2BP3, CDC25B, PSME3, CSNK2A1, RQCD1, BIRC3, PSIP1, RRAS, ATF4, HOXC10, SERPINB5, ANXA1

19 TROVE2, BIRC3, HNRNPA2B1, PSME3, CDC25B, HMG20B, RQCD1, PSIP1, RRAS, CSNK2A1, HIST1H4I, HOXC10, SSX4, WAS

20 TROVE2, APOBEC3G, BIRC3, HNRNPA2B1, CDC25B, PSME3, HMGB2, RQCD1, CSNK2A1, SMN1, SGK3, ATF4, HOXC10, SERPINB5

TABLE 5G

TABLE 6A

Panel Biomarker

1 dsDNA, ANA, TROVE2, SSB/La, SMN1, PSME3, IFIT5, HNRNPA2B1, WAS, SRPK1, CALM1, PRM2, IGF2BP3, MAGEA4, RRAS, LIN28A, FOXN2, CALM2, CDC25B, PIN1

2 dsDNA, ANA, SSB/La, TROVE2, SMN1, KLHL12, PSME3, MAP2K6, GPHN, SRPK1, WAS, BANK1, HNRNPA2B1, DCLK1, CITED1, PRM2, FOXN2, CALM1, CDC25B, AK7

3 dsDNA, ANA, SMN1, TROVE2, SSB/La, PSME3, SRPK1, CALM1, IFIT5, PRM2, CALM2, SSX4, RRAS, WAS, DCLK1, GRK5, HNRNPA2B1, CDC25B, STK3, LIN28A

4 dsDNA, ANA, SSB/La, TROVE2, SMN1, PSME3, IFIT5, SRPK1, IGF2BP3, GRK5, DCLK1, HNRNPA2B1, CDC25B, SSX4, PRM2, RRAS, LIN28A, HMGB2, AK3L1, MAGEA4

5 dsDNA, ANA, SMN1, TROVE2, SSB/La, PSME3, IFIT5, WAS, SRPK1, HNRNPA2B1, RRAS, GRK5, PRM2, CDC25B, PIM1, TARDBP, DCLK1, PIN1, PRKRA, AK3L1

6 dsDNA, ANA, SSB/La, TROVE2, IFIT5, PSME3, HNRNPA2B1, SMN1, WAS, BANK1, IGF2BP3, LIN28A, RRAS, CITED1, PIM1, TP73, CDC25B, DCLK1, HMGB2, SRPK1

7 dsDNA, ANA, SSB/La, SMN1, TROVE2, IFIT5, PSME3, KLHL12, WAS, RRAS, CARHSP1, CDC25B, HNRNPA2B1, PRM2, AK7, SSX4, PIM1, DCLK1, LIN28A, SRPK1

8 dsDNA, ANA, TROVE2, SSB/La, PSME3, SMN1, KLHL12, RRAS, WAS, FOXN2, DCLK1, LIN28A, HMGB2, SRPK1, MAGEA4, IGF2BP3, AK3L1, CDC25B, PIM1, PRKRA

9 dsDNA, ANA, SSB/La, SMN1, TROVE2, IFIT5, PSME3, TARDBP, HNRNPA2B1, SRPK1, MAP2K1, RQCD1, RRAS, DCLK1, AK7, HMGB2, CALM1, IGF2BP3, GRK5, GPHN

10 dsDNA, SSB/La, ANA, TROVE2, SMN1, PSME3, HNRNPA2B1, RRAS, CASP7, WAS, PIM1, DCLK1, MAGEA4, IFIT5, MATK, BRSK2, SRPK1, PRM2, IGF2BP3, MAP2K1

11 dsDNA, ANA, TROVE2, SSB/La, SMN1, PSME3, CDC25B, IFIT5, WAS, HNRNPA2B1, RRAS, TARDBP, CARHSP1, AK7, PIM1, SRPK1, SSX4, IGF2BP3, STK3, FIP1L1

12 dsDNA, SSB/La, ANA, KLHL12, TROVE2, SMN1, PSME3, IFIT5, PRM2, DCLK1, NME5, CALM1, CDC25B, GRK5, HNRNPA2B1, IGF2BP3, LIN28A, CARHSP1, APOBEC3G, MATK

13 dsDNA, ANA, TROVE2, SSB/La, SMN1, PSME3, IFIT5, IGF2BP3, SRPK1, TARDBP, SSX4, WAS, HMGB2, LIN28A, RRAS, HNRNPA2B1, CDC25B, MATK, PRKRA, CHEK2

14 dsDNA, ANA, SSB/La, SMN1, TROVE2, PSME3, IFIT5, KLHL12, CALM1, IGF2BP3, HNRNPA2B1, CALM2, GRK5, CDC25B, DCLK1, AK3L1, AK7, LIN28A, RRAS, SSX4

15 dsDNA, ANA, SSB/La, TROVE2, SMN1, PSME3, CALM1, CALM2, IGF2BP3, SRPK1, PRM2, FOXN2, LIN28A, DBNL, HNRNPA2B1, GRK5, RRAS, PIN1, STK3, WAS

16 dsDNA, ANA, SSB/La, TROVE2, SMN1, KLHL12, PSME3, SRPK1, WAS, MAGEA4, SSX4, CDC25B, RRAS, PRM2, HNRNPA2B1, PIM1, CARHSP1, LIN28A, GRK5, AK7

17 dsDNA, ANA, TROVE2, SSB/La, SMN1, PSME3, HNRNPA2B1, IFIT5, SRPK1, CDC25B, DCLK1, RRAS, COPS6, PRM2, CALM1, GRK5, LIN28A, WAS, IGF2BP3, CALM2

18 dsDNA, ANA, SSB/La, TROVE2, SMN1, IFIT5, PSME3, RRAS, WAS, STK3, CALM1, IGF2BP3, LIN28A, HNRNPA2B1, CALM2, GRK5, CASP7, HMGB2, PTPN11, PRM2

19 dsDNA, ANA, SMN1, SSB/La, TROVE2, PSME3, IFIT5, PIM1, SRPK1, HNRNPA2B1, CALM1, DBNL, SSX4, MATK, CDC25B, RRAS, LIN28A, PRM2, IGF2BP3, CALM2

20 dsDNA, SSB/La, ANA, TROVE2, SMN1, PSME3, IFIT5, CALM1, WAS, DCLK1, RRAS, HNRNPA2B1, SRPK1, CALM2, IGF2BP3, LIN28A, MAGEA4, PRKRA, AK3L1, YARS

21 dsDNA, ANA, SSB/La, SMN1, TROVE2, PSME3, IFIT5, CDC25B, HNRNPA2B1, MAP2K6, SRPK1, PRM2, MAP2K1, MATK, DBNL, RRAS, LIN28A, AK3L1, IGF2BP3, AK7

22 dsDNA, ANA, SSB/La, SMN1, TROVE2, PSME3, SRPK1, RRAS, SSX4, IFIT5, HNRNPA2B1, WAS, PRM2, GPHN, AK3L1, IGF2BP3, CALM1, CARHSP1, CDC25B, HMGB2

23 dsDNA, ANA, TROVE2, SSB/La, PSME3, HNRNPA2B1, SMN1, IFIT5, SRPK1, WAS, CDC25B, PRM2, HMGB2, DCLK1, GRK5, RRAS, IGF2BP3, SSX4, MATK, APOBEC3G

24 dsDNA, ANA, SSB/La, TROVE2, PSME3, SRPK1, SMN1, RRAS, PIM1, WAS, LIN28A, HNRNPA2B1, IGF2BP3, PRM2, CASP7, CALM1, MAGEA4, DCLK1, CDC25B, SSX4

25 dsDNA, ANA, SSB/La, TROVE2, SMN1, PSME3, IFIT5, CALM1, CARHSP1, CALM2, CDC25B, DCLK1, HNRNPA2B1, WAS, DBNL, STK3, LIN28A, PRM2, IGF2BP3, FIP1L1

26 dsDNA, ANA, SMN1, SSB/La, TROVE2, IFIT5, PSME3, HNRNPA2B1, IGF2BP3, CDC25B, CALM1, DCLK1, LIN28A, SRPK1, GRK5, PIM1, WAS, AK7, PRM2, CALM2

27 dsDNA, ANA, SMN1, TROVE2, SSB/La, PSME3, KLHL12, WAS, PRM2, HNRNPA2B1, DCLK1, LIN28A, CALM1, IGF2BP3, SSX4, RRAS, SSX2, APOBEC3G, IFIT5, CASP7

28 dsDNA, SMN1, ANA, TROVE2, SSB/La, IFIT5, PSME3, CALM1, SRPK1, WAS, CALM2, GRK5, PIM1, RRAS, HMGB2, SSX4, IGF2BP3, CDC25B, AK3L1, HNRNPA2B1

29 dsDNA, ANA, TROVE2, SMN1, SSB/La, IFIT5, PSME3, HNRNPA2B1, CARHSP1, CALM1, WAS, IGF2BP3, SSX4, RRAS, DCLK1, CALM2, LIN28A, CDC25B, STK3, DBNL

30 dsDNA, ANA, TROVE2, SSB/La, SMN1, PSME3, SRPK1, BANK1, RRAS, CALM1, IGF2BP3, WAS, LIN28A, CALM2, AK3L1, HNRNPA2B1, PIM1, AK7, PRM2, MAGEA4

TABLE 6B

Panel Biomarker

1 ANA, dsDNA, SSB/La, TROVE2, SMN1, PSME3, IFIT5, SRPK1, CALM1, HNRNPA2B1, MAGEA4, WAS, FOXN2, PRM2, CALM2

2 ANA, dsDNA, SSB/La, SMN1, TROVE2, KLHL12, PSME3, MAP2K6, GPHN, SRPK1, BANK1, DCLK1, FOXN2, CITED1, WAS

3 ANA, dsDNA, SMN1, SSB/La, TROVE2, PSME3, SRPK1, IFIT5, CALM1, CALM2, DCLK1, PRM2, GRK5, RRAS, STK3

4 ANA, dsDNA, SSB/La, TROVE2, SMN1, PSME3, IFIT5, SRPK1, DCLK1, GRK5, CDC25B, AK3L1, MAGEA4, PIM1, PRM2

5 ANA, dsDNA, SMN1, SSB/La, TROVE2, IFIT5, PSME3, SRPK1, WAS, HNRNPA2B1, GRK5, TARDBP, DCLK1, RRAS, PIM1

6 ANA, dsDNA, SSB/La, TROVE2, IFIT5, PSME3, HNRNPA2B1, SMN1, WAS, BANK1, DCLK1, CITED1, PIM1, TP73, SRPK1

7 ANA, dsDNA, SSB/La, SMN1, TROVE2, IFIT5, PSME3, KLHL12, WAS, RRAS, CARHSP1, CDC25B, DCLK1, AK7, PRM2

8 ANA, dsDNA, SSB/La, TROVE2, PSME3, SMN1, KLHL12, WAS, RRAS, FOXN2, DCLK1, SRPK1, AK3L1, MAGEA4, PIM1

9 ANA, dsDNA, SSB/La, SMN1, IFIT5, TROVE2, PSME3, TARDBP, SRPK1, HNRNPA2B1, MAP2K1, DCLK1, RQCD1, CALM1, AK7

10 ANA, dsDNA, SSB/La, TROVE2, SMN1, PSME3, CASP7, HNRNPA2B1, PIM1, RRAS, DCLK1, MAGEA4, IFIT5, WAS, MATK

11 ANA, dsDNA, TROVE2, SSB/La, SMN1, PSME3, IFIT5, CDC25B, TARDBP, WAS, RRAS, HNRNPA2B1, SRPK1, AK7, CARHSP1

12 ANA, dsDNA, SSB/La, KLHL12, TROVE2, SMN1, PSME3, IFIT5, DCLK1, NME5, CALM1, GRK5, PRM2, MATK, CDC25B

13 ANA, dsDNA, SSB/La, SMN1, TROVE2, PSME3, IFIT5, SRPK1, TARDBP, IGF2BP3, WAS, SSX4, MATK, CDC25B, CHEK2

14 ANA, dsDNA, SSB/La, SMN1, TROVE2, IFIT5, PSME3, KLHL12, CALM1, CALM2, GRK5, DCLK1, HNRNPA2B1, AK3L1, IGF2BP3

15 ANA, dsDNA, SSB/La, TROVE2, SMN1, PSME3, CALM1, CALM2, SRPK1, FOXN2, DBNL, GRK5, PRM2, IGF2BP3, STK3

16 ANA, dsDNA, SSB/La, TROVE2, SMN1, SRPK1, KLHL12, PSME3, MAGEA4, WAS, PIM1, GRK5, CDC25B, AK7, DCLK1

17 ANA, dsDNA, TROVE2, SSB/La, SMN1, PSME3, HNRNPA2B1, IFIT5, SRPK1, DCLK1, COPS6, CALM1, CDC25B, GRK5, RRAS

18 ANA, dsDNA, SSB/La, TROVE2, IFIT5, SMN1, PSME3, RRAS, STK3, CALM1, WAS, CASP7, GRK5, CALM2, PTPN11

19 ANA, dsDNA, SMN1, SSB/La, TROVE2, IFIT5, PSME3, PIM1, SRPK1, CALM1, HNRNPA2B1, MATK, DBNL, CALM2, GRK5

20 ANA, dsDNA, SSB/La, SMN1, TROVE2, PSME3, IFIT5, CALM1, DCLK1, SRPK1, CALM2, MAGEA4, WAS, HNRNPA2B1, AK3L1

21 ANA, dsDNA, SSB/La, SMN1, TROVE2, IFIT5, PSME3, CDC25B, MAP2K6, HNRNPA2B1, SRPK1, MATK, MAP2K1, AK3L1, DBNL

22 ANA, dsDNA, SSB/La, SMN1, TROVE2, PSME3, SRPK1, RRAS, IFIT5, AK3L1, GPHN, CALM1, PIM1, WAS, HNRNPA2B1

23 ANA, dsDNA, TROVE2, SSB/La, PSME3, IFIT5, SMN1, HNRNPA2B1, SRPK1, WAS, DCLK1, GRK5, CASP7, CDC25B, MATK

24 ANA, dsDNA, SSB/La, TROVE2, PSME3, SRPK1, PIM1, SMN1, RRAS, CASP7, CALM1, WAS, IFIT5, DCLK1, MAGEA4

25 ANA, dsDNA, SSB/La, SMN1, TROVE2, PSME3, IFIT5, CALM1, CALM2, DCLK1, CARHSP1, STK3, DBNL, CDC25B, HNRNPA2B1

26 ANA, dsDNA, SMN1, SSB/La, TROVE2, IFIT5, PSME3, HNRNPA2B1, CALM1, CDC25B, DCLK1, GRK5, SRPK1, IGF2BP3, PIM1

27 ANA, dsDNA, SMN1, TROVE2, SSB/La, PSME3, KLHL12, WAS, DCLK1, PRM2, CALM1, HNRNPA2B1, IFIT5, SSX2, AK3L1

28 ANA, dsDNA, SMN1, SSB/La, TROVE2, IFIT5, PSME3, CALM1, SRPK1, CALM2, GRK5, WAS, PIM1, AK3L1, CASP7

29 ANA, dsDNA, TROVE2, SSB/La, SMN1, IFIT5, PSME3, HNRNPA2B1, CALM1, CARHSP1, DCLK1, WAS, CALM2, STK3, RRAS

30 ANA, dsDNA, SMN1, SSB/La, TROVE2, PSME3, SRPK1, BANK1, CALM1, CALM2, AK3L1, RRAS, MAGEA4, AK7, PIM1

TABLE 6C

20 ANA, SSB/La, TROVE2, PABPC1, ANXA1, dsDNA, IGF2BP3, LIN28A, BIRC3, HNRNPUL1, HMGB2, DLX4, HNRNPA2B1, SSX4, PSME3, APOBEC3G, FUS, PRM1, NFIL3, RRAS

21 ANA, TROVE2, SSB/La, PABPC1, ANXA1, dsDNA, HNRNPUL1, HMG20B, HMGB2, BIRC3, IGF2BP3, LIN28A, DLX4, HOXB6, CDC25B, NFIL3, HNRNPA2B1, ARAF, PRM1, ATF4

22 ANA, SSB/La, TROVE2, PABPC1, dsDNA, ANXA1, HNRNPUL1, DLX4, IGF2BP3, HMGB2, NFIL3, BIRC3, ARAF, LIN28A, PYGB, HOXB6, HNRNPA2B1, VAV1, APOBEC3G, ZAP70

23 ANA, TROVE2, SSB/La, PABPC1, ANXA1, dsDNA, BIRC3, HMGB2, HNRNPUL1, HNRNPA2B1, DLX4, IGF2BP3, APOBEC3G, ARAF, LIN28A, NFIL3, HMG20B, ATF4, HOXB6, PSME3

24 ANA, TROVE2, SSB/La, dsDNA, PABPC1, HMGB2, ANXA1, DLX4, BIRC3, HNRNPUL1, IGF2BP3, LIN28A, HOXB6, APOBEC3G, HNRNPA2B1, NFIL3, ARAF, HMG20B, PRM1, PSME3

25 ANA, SSB/La, TROVE2, PABPC1, dsDNA, IGF2BP3, PSME3, HMGB2, ANXA1, BIRC3, APOBEC3G, HNRNPA2B1, LIN28A, HNRNPUL1, CDC25B, DLX4, PATZ1, HOXB6, ATF4, HMG20B

26 ANA, SSB/La, TROVE2, dsDNA, BIRC3, IGF2BP3, PABPC1, HNRNPA2B1, HMGB2, HNRNPUL1, ANXA1, PSME3, LIN28A, DLX4, CDC25B, APOBEC3G, SH2B1, NFIL3, PATZ1, HOXB6

27 ANA, TROVE2, SSB/La, dsDNA, ANXA1, PABPC1, IGF2BP3, APOBEC3G, BIRC3, HNRNPA2B1, LIN28A, HMGB2, PSME3, HNRNPUL1, DLX4, CDC25B, PRM1, HMG20B, CSNK2A1, NFIL3

28 ANA, TROVE2, SSB/La, PABPC1, dsDNA, HNRNPUL1, ANXA1, IGF2BP3, HMGB2, BIRC3, HMG20B, DLX4, LIN28A, APOBEC3G, HNRNPA2B1, NFIL3, PSME3, PYGB, CDC25B, HOXB6

29 ANA, TROVE2, SSB/La, PABPC1, ANXA1, dsDNA, IGF2BP3, HNRNPUL1, HNRNPA2B1, DLX4, BIRC3, HMGB2, LIN28A, NFIL3, HMG20B, ATF4, APOBEC3G, HOXB6, ARAF, PYGB

30 ANA, TROVE2, SSB/La, PABPC1, IGF2BP3, BIRC3, ANXA1, dsDNA, HOXB6, HNRNPUL1, LIN28A, HMG20B, HMGB2, HNRNPA2B1, NFIL3, PATZ1, ATF4, WT1, DLX4, APOBEC3G

TABLE 6D

Panel Biomarker

1 ANA, TROVE2, SSB/La, HMGB2, PABPC1, dsDNA, BIRC3, ANXA1, SMN1, IGF2BP3, HNRNPA2B1, WAS, HMG20B, PATZ1, HNRNPUL1, NFIL3, HOXB6, DLX3

2 ANA, TROVE2, SSB/La, dsDNA, HMGB2, PABPC1, BIRC3, ANXA1, HNRNPA2B1, HMG20B, ATXN3, SMN1, IGF2BP3, ATF4, PSIP1, NFIL3, HOXB6, DLX3

3 ANA, TROVE2, SSB/La, dsDNA, PABPC1, BIRC3, HMGB2, SMN1, ANXA1, HMG20B, HNRNPA2B1, IGF2BP3, APOBEC3G, PATZ1, LIN28A, ATXN3, NFIL3, WAS

4 ANA, TROVE2, SSB/La, dsDNA, BIRC3, PABPC1, HMGB2, ANXA1, SMN1, HMG20B, HNRNPA2B1, PATZ1, ARAF, HOXB6, IGF2BP3, WAS, APEX1, ATXN3

5 ANA, TROVE2, SSB/La, BIRC3, dsDNA, ANXA1, SMN1, HMGB2, PABPC1, ATF4, ARAF, HOXB6, HNRNPA2B1, NFIL3, HMG20B, DLX3, ATXN3, CDC25B

6 ANA, TROVE2, SSB/La, dsDNA, HMGB2, BIRC3, HNRNPA2B1, PABPC1, ANXA1, SMN1, HMG20B, IGF2BP3, NFIL3, DLX4, APEX1, ATXN3, HNRNPUL1, APOBEC3G

7 ANA, dsDNA, TROVE2, SSB/La, HMGB2, PABPC1, HMG20B, SMN1, ANXA1, BIRC3, HNRNPA2B1, APEX1, IGF2BP3, WAS, PSIP1, NFIL3, HOXB6, KLHL12

8 ANA, SSB/La, TROVE2, dsDNA, HMGB2, PABPC1, SMN1, BIRC3, ANXA1, HMG20B, APEX1, CSNK2A1, NFIL3, HNRNPA2B1, PSIP1, APOBEC3G, DLX4, ATF4

9 ANA, TROVE2, SSB/La, dsDNA, HMGB2, PABPC1, HNRNPA2B1, HMG20B, SMN1, BIRC3, ANXA1, HOXB6, IGF2BP3, APEX1, APOBEC3G, NFIL3, PSIP1, HNRNPUL1

10 ANA, TROVE2, SSB/La, PABPC1, dsDNA, ANXA1, HMGB2, HMG20B, BIRC3, IGF2BP3, KLHL12, SMN1, DLX3, HNRNPA2B1, WAS, PATZ1, ATF4, ARAF

11 ANA, TROVE2, SSB/La, dsDNA, HMGB2, PABPC1, BIRC3, HMG20B, HNRNPA2B1, WAS, ANXA1, PSIP1, SMN1, PATZ1, DLX3, HOXB6, CDC25B, DLX4

12 ANA, SSB/La, TROVE2, dsDNA, PABPC1, HMGB2, ANXA1, CSNK2A1, HNRNPA2B1, SMN1, BIRC3, HMG20B, NFIL3, SH2B1, APOBEC3G, LIN28A, CDC25B, APEX1

13 ANA, TROVE2, SSB/La, dsDNA, HMGB2, PABPC1, HMG20B, BIRC3, ANXA1, HNRNPA2B1, SMN1, APEX1, APOBEC3G, ATXN3, IGF2BP3, PSIP1, TGIF2, KLHL12

14 ANA, TROVE2, SSB/La, dsDNA, HMGB2, PABPC1, IGF2BP3, ANXA1, HMG20B, HNRNPA2B1, BIRC3, SMN1, NFIL3, APEX1, PSIP1, SSX4, HNRNPUL1, CDC25B

15 ANA, TROVE2, dsDNA, SSB/La, HMGB2, PABPC1, HMG20B, HNRNPA2B1, IGF2BP3, SMN1, ANXA1, BIRC3, LIN28A, CDC25B, APEX1, PSIP1, HNRNPUL1, DLX4

16 ANA, TROVE2, SSB/La, dsDNA, BIRC3, SMN1, ANXA1, PABPC1, HMG20B, DLX4, HNRNPA2B1, ATXN3, NFIL3, ATF4, HMGB2, ARAF, PATZ1, WAS

17 ANA, TROVE2, SSB/La, dsDNA, PABPC1, HMGB2, HNRNPA2B1, BIRC3, SMN1, ANXA1, IGF2BP3, HMG20B, CDC25B, APOBEC3G, WAS, CSNK2A1, LIN28A, APEX1

18 ANA, TROVE2, dsDNA, SSB/La, PABPC1, HMGB2, HMG20B, ANXA1, BIRC3, IGF2BP3, NFIL3, LIN28A, HNRNPUL1, HNRNPA2B1, CDC25B, DLX4, PATZ1, PYGB

19 ANA, SSB/La, TROVE2, dsDNA, PABPC1, ANXA1, HMGB2, SMN1, BIRC3, NFIL3, HMG20B, CDC25B, HNRNPUL1, ATF4, WAS, HNRNPA2B1, PYGB, PATZ1

20 ANA, SSB/La, TROVE2, dsDNA, HMGB2, ANXA1, PABPC1, BIRC3, SMN1, HMG20B, NFIL3, HNRNPA2B1, ATF4, DLX4, WAS, APEX1, PYGB, PATZ1

21 ANA, dsDNA, TROVE2, SSB/La, PABPC1, HMGB2, ANXA1, HMG20B, BIRC3, NFIL3, HOXB6, APEX1, SMN1, PATZ1, CDC25B, ATF4, IGF2BP3, ARAF

22 ANA, SSB/La, TROVE2, PABPC1, dsDNA, HMGB2, ANXA1, BIRC3, IGF2BP3, DLX4, NFIL3, HMG20B, SMN1, ARAF, KLHL12, HNRNPUL1, HNRNPA2B1, PYGB

23 ANA, TROVE2, SSB/La, dsDNA, HMGB2, PABPC1, BIRC3, ANXA1, HNRNPA2B1, HMG20B, NFIL3, SSX4, ARAF, CDC25B, DLX4, SMN1, ATF4, WAS

24 ANA, SSB/La, TROVE2, dsDNA, HMGB2, HMG20B, ANXA1, BIRC3, PABPC1, DLX4, ARAF, NFIL3, SMN1, HNRNPA2B1, PYGB, LIN28A, CSNK2A1, HNRNPUL1

25 ANA, dsDNA, SSB/La, TROVE2, PABPC1, HMGB2, HNRNPA2B1, BIRC3, ANXA1, IGF2BP3, HMG20B, SMN1, APEX1, PSIP1, APOBEC3G, KLHL12, HNRNPUL1, ATXN3

26 ANA, SSB/La, TROVE2, dsDNA, HMGB2, PABPC1, HNRNPA2B1, BIRC3, HMG20B, ANXA1, SMN1, IGF2BP3, LIN28A, APEX1, CDC25B, DLX4, NHLH1, NFIL3

27 ANA, SSB/La, TROVE2, dsDNA, HMGB2, ANXA1, PABPC1, HNRNPA2B1, BIRC3, SMN1, HMG20B, IGF2BP3, CDC25B, ATXN3, NFIL3, KLHL12, LIN28A, WAS

28 ANA, TROVE2, SSB/La, dsDNA, PABPC1, ANXA1, HMGB2, HMG20B, BIRC3, SMN1, NFIL3, HNRNPUL1, IGF2BP3, HNRNPA2B1, DLX4, PYGB, ATF4, CDC25B

29 ANA, TROVE2, dsDNA, SSB/La, PABPC1, HMGB2, ANXA1, HMG20B, BIRC3, NFIL3, SSX4, DLX4, IGF2BP3, HNRNPA2B1, HNRNPUL1, DLX3, PYGB, ARAF

30 ANA, TROVE2, dsDNA, SSB/La, PABPC1, BIRC3, HMGB2, ANXA1, WAS, HNRNPA2B1, HMG20B, IGF2BP3, PATZ1, SMN1, NFIL3, HOXB6, ATF4, LIN28A

TABLE 6E

Panel Biomarker

1 ANA, BIRC3

2 ANA, BIRC3

3 ANA, BIRC3

4 ANA, BIRC3

5 ANA, BIRC3

6 ANA, BIRC3

7 ANA, BIRC3

8 ANA, BIRC3

9 ANA, BIRC3

10 ANA, BIRC3

11 ANA, BIRC3

12 ANA, dsDNA

13 ANA, BIRC3

14 ANA, BIRC3

15 ANA, BIRC3

16 ANA, BIRC3

17 ANA, BIRC3

18 ANA, BIRC3

19 ANA, BIRC3

20 ANA, BIRC3

TABLE 6F

TABLE 7

Preferred subsets of Table 1

1. BIRC3, ATF4

2. BIRC3, RQCD1

3. BIRC3, APEX1

4. BIRC3, WAS

5. BIRC3, DLX3

6. BIRC3, PSIP1

7. BIRC3, RPL10

8. BIRC3, PRM2

9. BIRC3, SMAD5

10. BIRC3, SUB1

11. BIRC3, IRF4

12. BIRC3, ATF4, RQCD1

13. BIRC3, APEX1, RQCD1

14. BIRC3, ATF4, APEX1

15. BIRC3, ATF4, DLX3

16. BIRC3, ATF4, WAS

17. BIRC3, ATF4, ZMYND11

18. BIRC3, RPL10, RQCD1

19. BIRC3, APEX1, PSIP1

20. BIRC3, ATF4, CARHSP1

21. BIRC3, ATF4, NEUROD4

22. BIRC3, ATF4, PRM2

23. BIRC3, ATF4, SMAD5

24. BIRC3, DLX3, APEX1

25. BIRC3, WAS, RQCD1

26. WAS, PRM2, CARHSP1, RQCD1

27. BIRC3, ATF4, APEX1, RQCD1

28. BIRC3, ATF4, APEX1, PSIP1

29. BIRC3, ATF4, PRM2, SMAD5

30. BIRC3, ATF4, SMAD5, CARHSP1

31. BIRC3, ATF4, WAS, APEX1

32. BIRC3, ATF4, WAS, RQCD1

33. BIRC3, DLX3, APEX1, RQCD1

34. BIRC3, HIST1H4I, RQCD1, PSIP1

35. BIRC3, SGK3, RQCD1, PSIP1

36. BIRC3, SUB1, RQCD1, PSIP1

37. BIRC3, WAS, HIST1H4I, RQCD1

38. CARHSP1, SUB1, PPP2R5A, RQCD1

39. WAS, CARHSP1, SUB1, RQCD1

40. WAS, PRM2, CARHSP1, PPP2R5A

41. WAS, PRM2, CARHSP1, SLA

42. WAS, PRM2, CARHSP1, SUB1

43. WAS, PRM2, CARHSP1, SUB1, RQCD1

44. BIRC3, HIST1H4I, HOXC10, RQCD1, PSIP1

45. BIRC3, CARHSP1, HIST1H4I, RQCD1, PSIP1

46. BIRC3, CARHSP1, HOXC10, RQCD1, PSIP1

47. BIRC3, HIST1H4I, RQCD1, PSIP1, SLA

48. BIRC3, HOXC10, SERPINB5, RQCD1, PSIP1

49. WAS, PRM2, CARHSP1, PPP2R5A, RQCD1

50. WAS, PRM2, CARHSP1, RQCD1, PSIP1

51. WAS, PRM2, CARHSP1, SUB1, PPP2R5A

52. WAS, PRM2, CARHSP1, SUB1, PPP2R5A, RQCD1

53. BIRC3, ATF4, CARHSP1, RQCD1, PSIP1, SLA

54. BIRC3, ATF4, HOXC10, SERPINB5, RQCD1, PSIP1

55. BIRC3, ATF4, HOXC10, SGK3, SERPINB5, RQCD1

56. BIRC3, ATF4, SMAD5, RQCD1, PSIP1, SLA

57. BIRC3, HIST1H4I, HOXC10, COX6C, RQCD1, PSIP1

58. BIRC3, TGIF1, HIST1H4I, HOXC10, RQCD1, PSIP1

59. BIRC3, WAS, HIST1H4I, HOXC10, RQCD1, PSIP1

60. WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A

61. WAS, PRM2, CARHSP1, IRF4, SUB1, RQCD1

62. BIRC3, ATF4, CARHSP1, HIST1H4I, HOXC10, RQCD1, PSIP1

63. BIRC3, ATF4, WAS, PRM2, CARHSP1, PPP2R5A, RQCD1

64. BIRC3, HIST1H4I, HOXC10, MAP3K7, RQCD1, PSIP1, SLA

65. BIRC3, WAS, PRM2, CARHSP1, SUB1, RQCD1, SLA

66. WAS, PRM2, CARHSP1, SUB1, PPP2R5A, MAFG, RQCD1

67. BIRC3, ATF4, WAS, PRM2, CARHSP1, PPP2R5A, RQCD1, SLA

68. BIRC3, ATF4, WAS, PRM2, CARHSP1, SUB1, PPP2R5A, RQCD1

69. BIRC3, ATF4, WAS, PRM2, CARHSP1, SUB1, PPP2R5A, RQCD1, SLA

70. BIRC3, ATF4, WAS, PRM2, CARHSP1, CASP9, SUB1, PPP2R5A, SLA

71. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1

72. BIRC3, ATF4, WAS, PRM2, CARHSP1, PPP2R5A, MAFG, RQCD1, PSIP1

73. BIRC3, ATF4, WAS, PRM2, SMAD5, CARHSP1, PPP2R5A, RQCD1, SLA

74. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1, SLA

75. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, CASP9, SUB1, PPP2R5A, RQCD1

76. BIRC3, ATF4, WAS, PRM2, SMAD5, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1

77. BIRC3, ATF4, WAS, PRM2, SMAD5, CARHSP1, SUB1, PPP2R5A, RQCD1, SLA

78. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1, PSIP1, SLA

79. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, RQCD1, PSIP1, SLA, MAGEB1

80. BIRC3, WAS, PRM2, CARHSP1, HOXC10, IRF4, CASP9, SUB1, PLD2, PPP2R5A, MAFG, RQCD1

81. BIRC3, ATF4, WAS, PRM2, CARHSP1, TEK, HOXC10, SUB1, PPP2R5A, MAFG, COX6C, RQCD1, SLA

82. DLX3, CARHSP1, APEX1, MAP3K7, RPS6KA6, RQCD1, MAGEB1

83. APEX1, CARHSP1, MAGEB1, PRM2, RQCD1, SUB1, WAS, CSNK1G2, ZNF207, MAGEB4, NFKBIA, RNASEL

84. APEX1, NFKBIA, RNASEL, RQCD1, ZNF207

85. RQCD1, ZNF207

86. RQCD1, ZNF207, RNASEL

87. RQCD1, ZNF207, NFKBIA

TABLE 8

Table 8 lists biomarkers described in reference 52. The measured biomarker can be (i) presence of auto-antibody which binds to an antigen listed in Table 8 and/or (ii) the presence of an antigen listed in Table 8, but is preferably the former.

No. Biomarker HGNC GI

85 FUS 4010 33875401

87 HMG20B 5002 33876853

90 E1B-AP5/HNRNPUL1 17011 33987968

91 HOXB6 5117 15779174

95 LIN28 15986 33872076

103 PABPC1 8554 33872187

109 PSME3 9570 33876201

116 SMN1 11117 13111817

119 SSA2/TROVE2 11313 34192599

123 BANK1 18233 21619549

125. DOM3Z 2992 33878616

126. ZMAT2 26433 34785080

127. ASPSCR1 13825 17511731

128. BY21P1/MAP1S 15715 13623504

129. CEBPG 1837 14043188

130. DDX55 20085 34190861

131. HAGH 4805 12654064

132. IFI16 5395 16877621

133. KRT8 6446 14198277

134. LNX 6657 18605734

135. NDUFV3 7719 33871569

136. PHLDA1 8933 39644938

137. PIAS2 17311 15929521

138. PRKCBP1 9397 21315038

139. PRKRA 9438 14495716

140. RAB11FIP3 17224 30411060

141. RALBP1 9841 15341886

142. RAN 9846 33871120

143. RARA 9864 33873941

144. RBMS1 9907 33869903

145. RDBP 13974 34193418

146. RNF12 13429 33872118

147. RPL30 10333 34783378

148. RPL31 10334 40226052

149. RUFY1 19760 21595719

150. SNK 19699 33988188

151. SRPK1 11305 23468344

152. SSNA1 11321 12654102

153. STAU 11370 29792189

154. STK11 11389 33872385

155. TOM1 11982 28374254

156. TXNL2 15987 13528998

157. TXNRD1 12437 17390271

158. VCL 12665 24657578

159. ZNF38 13104 28703926

TABLE 9

Table 9 lists biomarkers described in reference 53. The measured biomarker can be (i) presence of auto-antibody which binds to an antigen listed in Table 9 and/or (ii) the presence of an antigen listed in Table 9, but is preferably the former.

161. RPL18A 10311 38196939

162. ACTL7B 162 21707461

163. BAG3 939 13623600

164. C6orf93 21173 33872922

165. CCNI 1595 38197480

166. CCT3 1616 14124983

167. CDK3 1772 28839544

168. CKS1B 19083 40226240

169. COPG2 2237 16924304

170. DNCLI2 2966 19684162

171. EEF1D 3211 33988346

172. FBXO9 13588 33875682

173. GTF2H2 4656 40674449

174. KATNB1 6217 38197184

175. KIAA0643 19009 34190884

176. KIT 6342 47938801

177. MAP2K5 6845 33871775

178. MGC42105 34783729

179. MTO1 19261 15029678

180. NFE2L2 7782 15079436

181. NME6 20567 38197001

182. NTRK3 8033 15489167

183. PFKFB3 8874 26251768

184. PIAS2 17311 15929521

185. POLR2E 9192 13325243

186. PRKCBP1 9397 21315038

187. RALBP1 9841 15341886

188. RPL15 10306 15928752

189. RPL34 10340 12804692

190. RPL37A 10348 34783289

191. RPS6KA1 10430 15929012

192. RRP41 18189 38114779

193. STK4 11408 38327560

194. SUCLA2 11448 34783884

195. TCEB3 11620 38197222

196. TRIM37 7523 23271191

197. TUBA1 12407 37589861

198. WDR45L 25072 12803025

199. EEF1G 3213 38197136

200. RNF38 18052 21707089

201. PHLDA2 12385 13477152

202. KCMF1 20589 13111812

203. NUBP2 8042 33990898

204. VPS45A 14579 15277874

TABLE 10

Table 10 lists biomarkers described in reference 54. The measured biomarker can be (i) presence of auto-antibody which binds to an antigen listed in Table 10 and/or (ii) the presence of an antigen listed in Table 10, but is preferably the former.

206. BCL2A1 991 16740835

207. CWC27 10664 15082404

208. DPPA2 19197 239835766

209. EFHD2 28670 34782922

210. ERCC2 3434 14249929

211. EWSR1 3508 15029674

212. FES 3657 23271524

213. FOS 3796 33872858

214. FTHL17 3987 261862240

215. GNA15 4383 15488913

216. GNG4 4407 18490900

217. IFI35 5399 33876082

218. JUNB 6205 14495708

219. KLF6 2235 13279169

220. LGALS7 6568 194688138

221. NRBF2 19692 15079806

222. PCGF2 12929 38197067

223. PPP3CC 9316 33991135

224. RET 9967 13279040

225. RPS7 10440 33877263

226. SCEL 10573 238908500

227. STAM 11357 34192153

228. TAF9 11542 34782794

229. TIE1 11809 23398604

230. UBA3 12470 18605782

231. ZNRD1 13182 15012006

TABLE 11

Known biomarkers for SLE are listed below.

TABLE 12

TABLE 13

Table 13 shows the performance characteristics of LR1 to LR4 at fixed sensitivities and specificities with clinical relevance.
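For orientation, sensitivity and specificity follow from the four confusion-matrix counts (tp, fp, tn, fn) in the usual way. The sketch below is illustrative only: the class and function names are hypothetical, and the example counts are invented for demonstration and are not taken from Tables 13 to 16.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts obtained by comparing panel calls with the clinical diagnosis."""
    tp: int  # true positives: lupus samples called positive
    fp: int  # false positives: control samples called positive
    tn: int  # true negatives: control samples called negative
    fn: int  # false negatives: lupus samples called negative

    @property
    def n(self) -> int:
        """Total number of samples (N)."""
        return self.tp + self.fp + self.tn + self.fn


def _ratio(numerator: int, denominator: int) -> float:
    # Guard against an empty class, where the metric is undefined.
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of diseased samples correctly called positive: tp / (tp + fn)."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of control samples correctly called negative: tn / (tn + fp)."""
    return _ratio(c.tn, c.tn + c.fp)


# Hypothetical counts, for illustration only (not data from this document).
example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
print(f"N={example.n} sensitivity={sensitivity(example):.2f} "
      f"specificity={specificity(example):.2f}")
```

Fixing one of the two quantities (for example, choosing the decision threshold that gives a stated specificity) and reporting the other is one common way such tables are laid out.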

tp=true-positive, fp=false-positive, tn=true-negative, fn=false-negative, N=number of samples

TABLE 14

Table 14 shows the performance characteristics of LR1 (PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, ANA, dsDNA).

tp=true-positive, fp=false-positive, tn=true-negative, fn=false-negative, N=number of samples

TABLE 15

Table 15 shows the performance characteristics of LR2 (PSME3, PABPC1, RQCD1, HMG20B, NPM1, HNRNPUL1, ZNF207, RNASEL, dsDNA).

tp=true-positive, fp=false-positive, tn=true-negative, fn=false-negative, N=number of samples

TABLE 16

Table 16 shows the performance characteristics of LR3 (TROVE2, PSME3, PABPC1, RQCD1, MAGEB2, HMG20B, HNRNPUL1, IRF5, ZNF207, NFKBIA, dsDNA) and LR4 (dsDNA, PSME3, ZNF207, HNRNPUL1, RQCD1, MAGEB2, PABPC1, NFKBIA, APEX1, HMG20B, RNASEL, NPM1, SMN1, IGF2BP3, SSB/La).

tp=true-positive, fp=false-positive, tn=true-negative, fn=false-negative, N=number of samples

TABLE 17

Table 17 lists biomarkers useful with the invention. The measured biomarker can be (i) presence of auto-antibody which binds to an antigen listed in Table 17 and/or (ii) the presence of an antigen listed in Table 17, but is preferably the former.

No. Symbol ID Name HGNC GI p-value

125. APEX1 328 APEX nuclease (multifunctional DNA repair enzyme) 1 587 33876570 0.112123

126. ATF4 468 activating transcription factor 4 (tax-responsive enhancer element B67) 786 14198041 4.14E-06

127. BIRC2 329 baculoviral IAP repeat containing 2 590 22382083 0.016232

128. BIRC3 330 baculoviral IAP repeat containing 3 591 22766815 4.06E-10

129. CARHSP1 23589 calcium regulated heat stable protein 1, 24kDa 17150 13097197 0.005046

130. CASP9 842 caspase 9, apoptosis-related cysteine peptidase 1511 38014291 0.01901

131. COX6C 1345 cytochrome c oxidase subunit VIc 2285 34783038 0.208511

132. DLX3 1747 distal-less homeobox 3 2916 15214474 0.000407

133.

deficiency, complementation

group 5

134. HIST1H4I 8294 histone cluster 1, H4i 4793 16740964 0.011267

135. HOXC10 3226 homeobox C10 5122 12654896 0.014371

136. IRF4 3662 interferon regulatory factor 4 6119 16041743 0.018691

137. MAFG 4097 v-maf musculoaponeurotic fibrosarcoma oncogene homolog G (avian) 6781 15147379 0.062494

138. MAGEB1 4112 melanoma antigen family B, 1 6808 257796250 0.791781

139. MAP3K14 9020 mitogen-activated protein kinase kinase kinase 14 6853 23272579 0.112088

140. MAP3K7 6885 mitogen-activated protein kinase kinase kinase 7 6859 34189719 0.135015

141. MYD88 4615 myeloid differentiation primary response gene (88) 7562 15488922 0.223747

142. NEUROD4 58158 neuronal differentiation 4 13802 26454740 0.002205

143. PLD2 5338 phospholipase D2 9068 15929159 0.028245

144. PPP2R5A 5525 protein phosphatase 2, regulatory subunit B', alpha 9309 18490281 0.03865

145. PRM2 5620 protamine 2 9448 68989266 0.002365

146. PSIP1 11168 PC4 and SFRS1 interacting protein 1 9527 190014584 0.722355

147. RPL10 6134 ribosomal protein L10 10298 13097176 0.388947

148. RPS6KA6 27330 ribosomal protein S6 kinase, 90kDa, polypeptide 6 10435 283483967 0.147917

149. RQCD1 9125 RCD1 required for cell differentiation1 homolog (S. pombe) 10445 410515402 0.585852

150. SERPINB5 5268 serpin peptidase inhibitor, clade B (ovalbumin), member 5 8949 18089113 0.296317

151. SGK3 23678 serum/glucocorticoid regulated kinase family, member 3 10812 15929809 0.096028

152. SLA 6503 Src-like-adaptor 10902 13937869 0.737308

153. SMAD5 4090 SMAD family member 5 6771 34189276 0.002738

154. SUB1 10923 SUB1 homolog (S. cerevisiae) 19985 16307066 0.025215

155. TEK 7010 TEK tyrosine kinase, endothelial 11724 23273967 0.014237

156. TFE3 7030 transcription factor binding to IGHM enhancer 3 11752 19684175 0.029027

157. TGIF1 7050 TGFB-induced factor homeobox 1 11776 12654024 0.00719

158. TGIF2 60436 TGFB-induced factor homeobox 2 15764 33870164 0.128123

159. VAX2 25806 ventral anterior homeobox 2 12661 13623466 0.255454

160. WAS 7454 Wiskott-Aldrich syndrome (eczema-thrombocytopenia) 12731 15215302 0.000798

161. ZMYND11 10771 zinc finger, MYND-type containing 11 16966 21961556 0.019902

Columns

(iv) This name is taken from the Official Full Name provided by NCBI. An antigen may have been referred to by one or more pseudonyms in the prior art. The invention relates to these antigens regardless of their nomenclature.

(v) The HUGO Gene Nomenclature Committee aims to give unique and meaningful names to every human gene. The HGNC number thus identifies a unique human gene.

(vi) A "GI" number, "GenInfo Identifier", is a series of digits assigned consecutively to each sequence record processed by NCBI when sequences are added to its databases. The GI number bears no resemblance to the accession number of the sequence record. When a sequence is updated (e.g. for correction, or to add more annotation or information) it receives a new GI number. Thus the sequence associated with a given GI number is never changed. The GI numbers given here are for coding DNA sequences (except for SEQ ID NO: 7).

TABLE 18

4. BIRC3, WAS

5. BIRC3, DLX3

6. BIRC3, PSIP1

7. BIRC3, RPL10

8. BIRC3, PRM2

9. BIRC3, SMAD5

10. BIRC3, SUB1

11. BIRC3, IRF4

12. BIRC3, ATF4, RQCD1

13. BIRC3, APEX1, RQCD1

14. BIRC3, ATF4, APEX1

15. BIRC3, ATF4, DLX3

16. BIRC3, ATF4, WAS

17. BIRC3, ATF4, ZMYND11

18. BIRC3, RPL10, RQCD1

19. BIRC3, APEX1, PSIP1

20. BIRC3, ATF4, CARHSP1

21. BIRC3, ATF4, NEUROD4

22. BIRC3, ATF4, PRM2

23. BIRC3, ATF4, SMAD5

24. BIRC3, DLX3, APEX1

25. BIRC3, WAS, RQCD1

26. WAS, PRM2, CARHSP1, RQCD1

27. BIRC3, ATF4, APEX1, RQCD1

28. BIRC3, ATF4, APEX1, PSIP1

29. BIRC3, ATF4, PRM2, SMAD5

30. BIRC3, ATF4, SMAD5, CARHSP1

31. BIRC3, ATF4, WAS, APEX1

32. BIRC3, ATF4, WAS, RQCD1

33. BIRC3, DLX3, APEX1, RQCD1

34. BIRC3, HIST1H4I, RQCD1, PSIP1

35. BIRC3, SGK3, RQCD1, PSIP1

36. BIRC3, SUB1, RQCD1, PSIP1

37. BIRC3, WAS, HIST1H4I, RQCD1

38. CARHSP1, SUB1, PPP2R5A, RQCD1

39. WAS, CARHSP1, SUB1, RQCD1

40. WAS, PRM2, CARHSP1, PPP2R5A

41. WAS, PRM2, CARHSP1, SLA

42. WAS, PRM2, CARHSP1, SUB1

43. WAS, PRM2, CARHSP1, SUB1, RQCD1

44. BIRC3, HIST1H4I, HOXC10, RQCD1, PSIP1

45. BIRC3, CARHSP1, HIST1H4I, RQCD1, PSIP1

46. BIRC3, CARHSP1, HOXC10, RQCD1, PSIP1

47. BIRC3, HIST1H4I, RQCD1, PSIP1, SLA

48. BIRC3, HOXC10, SERPINB5, RQCD1, PSIP1

49. WAS, PRM2, CARHSP1, PPP2R5A, RQCD1

50. WAS, PRM2, CARHSP1, RQCD1, PSIP1

51. WAS, PRM2, CARHSP1, SUB1, PPP2R5A

52. WAS, PRM2, CARHSP1, SUB1, PPP2R5A, RQCD1

53. BIRC3, ATF4, CARHSP1, RQCD1, PSIP1, SLA

54. BIRC3, ATF4, HOXC10, SERPINB5, RQCD1, PSIP1

55. BIRC3, ATF4, HOXC10, SGK3, SERPINB5, RQCD1

56. BIRC3, ATF4, SMAD5, RQCD1, PSIP1, SLA

57. BIRC3, HIST1H4I, HOXC10, COX6C, RQCD1, PSIP1

58. BIRC3, TGIF1, HIST1H4I, HOXC10, RQCD1, PSIP1

59. BIRC3, WAS, HIST1H4I, HOXC10, RQCD1, PSIP1

60. WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A

61. WAS, PRM2, CARHSP1, IRF4, SUB1, RQCD1

62. BIRC3, ATF4, CARHSP1, HIST1H4I, HOXC10, RQCD1, PSIP1

63. BIRC3, ATF4, WAS, PRM2, CARHSP1, PPP2R5A, RQCD1

64. BIRC3, HIST1H4I, HOXC10, MAP3K7, RQCD1, PSIP1, SLA

65. BIRC3, WAS, PRM2, CARHSP1, SUB1, RQCD1, SLA

66. WAS, PRM2, CARHSP1, SUB1, PPP2R5A, MAFG, RQCD1

67. BIRC3, ATF4, WAS, PRM2, CARHSP1, PPP2R5A, RQCD1, SLA

68. BIRC3, ATF4, WAS, PRM2, CARHSP1, SUB1, PPP2R5A, RQCD1

69. BIRC3, ATF4, WAS, PRM2, CARHSP1, SUB1, PPP2R5A, RQCD1, SLA

70. BIRC3, ATF4, WAS, PRM2, CARHSP1, CASP9, SUB1, PPP2R5A, SLA

71. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1

72. BIRC3, ATF4, WAS, PRM2, CARHSP1, PPP2R5A, MAFG, RQCD1, PSIP1

73. BIRC3, ATF4, WAS, PRM2, SMAD5, CARHSP1, PPP2R5A, RQCD1, SLA

74. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1, SLA

75. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, CASP9, SUB1, PPP2R5A, RQCD1

76. BIRC3, ATF4, WAS, PRM2, SMAD5, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1

77. BIRC3, ATF4, WAS, PRM2, SMAD5, CARHSP1, SUB1, PPP2R5A, RQCD1, SLA

78. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, PPP2R5A, RQCD1, PSIP1, SLA

79. BIRC3, ATF4, WAS, PRM2, CARHSP1, IRF4, SUB1, RQCD1, PSIP1, SLA, MAGEB1

80. BIRC3, WAS, PRM2, CARHSP1, HOXC10, IRF4, CASP9, SUB1, PLD2, PPP2R5A, MAFG, RQCD1

81. BIRC3, ATF4, WAS, PRM2, CARHSP1, TEK, HOXC10, SUB1, PPP2R5A, MAFG, COX6C, RQCD1, SLA

82. DLX3, CARHSP1, APEX1, MAP3K7, RPS6KA6, RQCD1, MAGEB1
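By way of illustration only, marker combinations such as those listed above can be handled programmatically as sets of gene symbols. The following is a minimal sketch, not a method taken from this document: the per-marker cut-off values, the function names `positive_markers` and `panel_score`, and the simple "fraction of panel markers above cut-off" rule are all assumptions introduced for the example. Only the three panel compositions (entries 5, 14 and 31) are copied from the table.

```python
from typing import Dict, FrozenSet, Set

# Three panels copied from entries 5, 14 and 31 of the table above.
PANELS: Dict[int, FrozenSet[str]] = {
    5: frozenset({"BIRC3", "DLX3"}),
    14: frozenset({"BIRC3", "ATF4", "APEX1"}),
    31: frozenset({"BIRC3", "ATF4", "WAS", "APEX1"}),
}


def positive_markers(levels: Dict[str, float], cutoffs: Dict[str, float]) -> Set[str]:
    """Return the markers whose measured level exceeds its cut-off.

    The cut-offs are hypothetical; a marker with no cut-off is never positive.
    """
    return {m for m, v in levels.items() if m in cutoffs and v > cutoffs[m]}


def panel_score(panel: FrozenSet[str], positives: Set[str]) -> float:
    """Fraction of the panel's markers that are positive (assumed scoring rule)."""
    return len(panel & positives) / len(panel)


if __name__ == "__main__":
    # Made-up auto-antibody signal levels and cut-offs, for illustration only.
    levels = {"BIRC3": 2.0, "ATF4": 0.5, "APEX1": 3.0}
    cutoffs = {"BIRC3": 1.0, "ATF4": 1.0, "APEX1": 1.0, "WAS": 1.0, "DLX3": 1.0}
    positives = positive_markers(levels, cutoffs)
    for number, panel in PANELS.items():
        print(number, sorted(panel), panel_score(panel, positives))
```

In practice a trained classifier would normally replace the fixed cut-offs and the counting rule; the set representation of each panel is the only part of the sketch that does not depend on those assumptions.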

REFERENCES

[1] Lau et al. (2006) Lupus. 15(11):715-9.

[2] Pons-Estel et al. (2010) Semin. Arthritis Rheum. 39(4):257-68

[3] Habash-Bseiso (2005) Clin Med Res. 3(3): 190-3.

[4] Antico et al. (2010) Lupus doi: 10.1177/0961203310362995.

[5] Sherer et al. (2004) Arthritis Rheum. 34(2):501-37.

[6] Wild et al. (2008) Biomarkers. 13(1):88-105

[7] Pappworth et al. (2009) Mol Immunol 46:1042-9.

[8] Guerra et al (2012) Arthritis Res Ther. 29;14(3):21

[9] Vanderlugt & Miller (1996) Curr Opin Immunol. 8:831-6.

[10] Cheung et al. (2000) Nucleic Acids Res. 28(1):361-3. http://alfred.med.yale.edu/alfred/

[11] McKusick (1998) Mendelian Inheritance in Man. A Catalog of Human Genes and Genetic Disorders. Baltimore: Johns Hopkins University Press, 1998 (12th edition). See also http://www.ncbi.nlm.nih.gov/omim/.

[12] Stenson et al. (2009) Genome Med 1:13.

[13] Stamm et al. (2006) Nucleic Acids Res 34: D46-D55.

[14] Sonn et al. (2005) Lupus Prostatic Dis 8:304-10.

[15] Costenbader et al. (2007) Arthritis Rheum. 56(4):1251-62.

[16] Geysen et al. (1984) PNAS USA 81:3998-4002.

[17] Carter (1994) Methods Mol Biol 36:207-23.

[18] Jameson, BA et al. 1988, CABIOS 4(1):181-186.

[19] Maksyutov & Zagrebelnaya (1993) Comput Appl Biosci 9(3):291-7.

[20] Hopp (1993) Peptide Research 6:183-190.

[21] Welling et al. (1985) FEBS Lett. 188:215-218.

[22] Bublil et al. (2007) Proteins 68(1):294-304.

[23] Sun et al. (2009) Nucleic Acids Res 37:W612-6.

[24] Raddrizzani & Hammer (2000) Brief Bioinform 1(2):179-89.

[25] Chen et al. (2007) Amino Acids 33(3):423-8.

[26] Reimer (2009) Methods Mol Biol 524:335-44.

[27] Boutell et al. (2004) Proteomics 4:1950-8.

[28] Tassinari et al. (2008) Curr Opin Mol Ther 10:107-15.

[29] Stoevesandt et al. (2009) Expert Rev Proteomics 6:145-57.

[30] Tao et al. (2007) Comb Chem High Throughput Screen 10:706-18.

[31] Gnjatic et al. (2009) J Immunol Methods 341:50-8.

[32] Hartmann et al. (2009) Anal Bioanal Chem 393:1407-16.

[33] Fall & Niessner (2009) Methods Mol Biol 509:107-22.

[34] WO01/57198.

[35] WO02/27327.

[36] Blackburn & Hart (2005) Methods Mol Biol. 310:197-216.

[37] WO03/064656.

[38] WO2004/046730.

[39] Stahl et al. (2006) Immunol Lett 102:50-9.

[40] Quintana (2008) PNAS USA 105:18889-94.

[41] Koopmann & Blackburn (2003) Rapid Commun Mass Spectrom. 17:455-62.

[42] WO01/61040.

[43] Oleinikov et al. (2003) J Proteome Res. 2:313-9.

[44] Bolstad et al. (2003) Bioinformatics 19:185-93.

[45] Meyer et al. (2003) Neurocomputing 55:169-86.

[46] Koza (1992), Genetic Programming: On the Programming of Computers by Means

[47] Wang & Japkowicz (2008) Lecture Notes in Computer Science 4994/2008, 38-47.

[48] Elkon & Casali (2008) Nat Clin Pract Rheumatol. 4(9):491-8.

[49] Chada et al. (2003) Curr Opin Drug Discov Devel. 6(2):169-73.

[50] Chene (2003) Nature Reviews Cancer 3, 102-109.

[51] Wang & El-Deiry (2008) Curr Opin Oncol. 20(1):90-6.

[52] WO 2009/150422

[53] WO 2012/049664

[54] GB application nos 1213790.7 and 1217288.8

[55] Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30

[56] Smith & Waterman (1981) Adv. Appl. Math. 2: 482-489.

[57] Koopmann, J.O., McAndrew, M.B. and Blackburn, J.M. (2005) in "Protein Microarrays

[58] Huber et al. (2002) Bioinformatics 18 suppl. 1 S96-S104.

[59] Pan et al., (1998) Ann Acad Med Singapore. 27(1):21-3.

[60] Brodersen and Siersma (2013) Ann. Fam. Med. 11(2):106-115

Claims

1. A method for analysing a subject sample, comprising a step of determining the levels of x different biomarkers in the sample, wherein the levels of the biomarkers provide a diagnostic indicator of whether the subject has lupus; wherein x is 1 or more and wherein the x different biomarkers are selected from auto-antibodies against RQCD1, BIRC3, ATF4, DLX3, WAS, NEUROD4, PRM2, SMAD5, CARHSP1, TGIF1, HIST1H4I, TEK, HOXC10, BIRC2, IRF4, CASP9, ZMYND11, SUB1, PLD2, TFE3, PPP2R5A, MAFG, SGK3, MAP3K14, APEX1, TGIF2, MAP3K7, RPS6KA6, ERCC5, COX6C, MYD88, VAX2, SERPINB5, RPL10, PSIP1, SLA, MAGEB1, CSNK1G2, ZNF207, MAGEB4, NFKBIA and RNASEL.

2. The method of claim 1, wherein x is 2 or more.

3. The method of claim 2, wherein x is 10 or more.

4. The method of any preceding claim, wherein x is 60 or fewer.

5. The method of claim 4, wherein x is 15 or fewer.

6. The method of any preceding claim, wherein the method also includes a step of determining if a sample from the subject contains one or more of ANA, anti-dsDNA auto-antibodies, anti-SSA antibodies and/or antibodies against any of the antigens listed in Table 2 or Table 3.

7. The method of any preceding claim, wherein the sample is a body fluid.

8. The method of claim 7, wherein the sample is blood, serum or plasma.

9. The method of any preceding claim, wherein the subject is (i) pre-symptomatic for lupus or (ii) already displaying clinical symptoms of lupus.

10. The method of any preceding claim, wherein the presence of auto-antibodies is determined using an immunoassay.

11. The method of claim 10, wherein the immunoassay utilises an antigen comprising an amino acid sequence (i) having at least 90% sequence identity to an amino acid sequence encoded by a SEQ ID NO listed in Table 1, and/or (ii) comprising at least one epitope from an amino acid sequence encoded by a SEQ ID NO listed in Table 1.

12. The method of claim 10 or claim 11, wherein the immunoassay utilises a fusion polypeptide with a first region and a second region, wherein the first region can react with an auto-antibody in a sample and the second region can react with a substrate to immobilise the fusion polypeptide thereon.

13. The method of any preceding claim, wherein the subject is a human male.

14. The method of any preceding claim, wherein the method involves comparing levels of the biomarkers in the subject sample to levels in (i) a sample from a patient with lupus and/or (ii) a sample from a patient without lupus.

15. The method of any preceding claim, wherein the method involves analysing levels of the biomarkers in the sample with a classifier algorithm which uses the measured levels to distinguish between patients with lupus and patients without lupus.

16. The method of any one of claims 2 to 15, wherein the 2 or more different biomarkers are:

• A panel comprising or consisting of 2 different biomarkers, namely: (i) a biomarker selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 3 different biomarkers, namely: (i) any 2 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 4 different biomarkers, namely: (i) any 3 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 5 different biomarkers, namely: (i) any 4 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 6 different biomarkers, namely: (i) any 5 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 7 different biomarkers, namely: (i) any 6 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 8 different biomarkers, namely: (i) any 7 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 9 different biomarkers, namely: (i) any 8 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 10 different biomarkers, namely: (i) any 9 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 11 different biomarkers, namely: (i) any 10 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 12 different biomarkers, namely: (i) any 11 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 13 different biomarkers, namely: (i) any 12 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

• A panel comprising or consisting of 14 different biomarkers, namely: (i) any 13 biomarkers selected from Table 1 and (ii) a further biomarker selected from Table 2 or 3.

17. A diagnostic device for use in diagnosis of systemic lupus erythematosus, wherein the device permits determination of the level(s) of 1 or more Table 1 biomarkers.

18. The device of claim 17, wherein the device comprises a plurality of antigens immobilised on a solid substrate as an array.

19. The device of claim 18, wherein the device contains antigens for detecting autoantibodies against all of the antigens listed in Table 1.

20. The device of claim 18 or 19, wherein the array includes one or more control polypeptides.

21. The device of claim 20, comprising one or more anti-human immunoglobulin antibody(s).

22. The device of any one of claims 17 to 21, including one or more replicates of an antigen.

23. The method of any one of claims 1 to 15, using the device of any one of claims 17 to 21.

24. In a method for diagnosing if a subject has systemic lupus erythematosus, an improvement consisting of determining in a sample from the subject the level(s) of y biomarker(s) of Table 1, wherein y is 1 or more and the level(s) of the biomarker(s) provide a diagnostic indicator of whether the subject has lupus.

25. A human antibody which recognises an antigen listed in Table 1.