A 609-gene panel for screening mutated genes
Technical field
This application belongs to the field of genetic testing, and in particular relates to the detection of mutated genes.
Background art
Humans have more than 20,000 genes and about 3.16 billion base pairs. Determining which sites of which
genes are closely related to human tumors is a complicated and enormous task. Domestic genetic testing
companies target different tumors and testing purposes, and their choices of panel vary widely. If the
genes of a panel are chosen improperly, or the gene sites or regions are chosen improperly, important
sites are missed and the test cannot examine the patient in a targeted way. This raises the patient's
cost without giving any guidance, and it supplies misleading information for the clinician's early
diagnosis. The selection of the gene panel is therefore critical.
Closest prior art: On November 15, 2017, the U.S. FDA officially approved MSK-IMPACT™ (Integrated
Mutation Profiling of Actionable Cancer Targets), the NGS-based multi-gene tumor testing platform of
Memorial Sloan Kettering Cancer Center (MSK), one of the best cancer centers in the United States. The
platform can rapidly identify variants in 468 genes, as well as other molecular changes in the genome,
to guide subsequent treatment. Using its proprietary MSK-IMPACT™ tumor testing technology, MSK
sequenced more than 10,000 patients with advanced cancer, covering as many as 63 solid tumor types;
this is currently the largest tumor sequencing study among patients with metastatic cancer.
MSK-IMPACT™ is an NGS-based hybrid-capture panel that can rapidly detect all protein-coding mutations,
copy number variations, promoter mutations and structural rearrangements in 468 unique cancer-related
genes.
A small panel is usually limited to the sites targeted by specific drugs. It is economical and
practical, but it may miss sites relevant to other targeted drugs, and it lacks detection of sites
associated with toxicity and side effects. When a small panel cannot meet the need, a panel with more
detection sites must be chosen. A large panel covers sites comprehensively and covers more targeted
and chemotherapy drugs, but judging from test results, the medical role of some genes is not yet clear
and is unrelated to actual medication decisions, and most of the recommended drugs are hard to obtain.
This often wastes testing cost, gives no guidance, and adds to the patient's financial burden. Large
panels currently on the market generally detect about 500 genes, and the testing cost is high.
Summary of the invention
Object of the invention: To overcome the drawbacks of panels currently on the market, namely the
inaccurate detection of small panels and the wasted testing cost of large panels.
Technical solution: A 609-gene panel for screening mutated genes, characterized in that it comprises
1273 sites in 609 genes. The 609 genes are obtained by taking the 468 genes from the samples provided
by MSK and the more than 400 driver genes from 48 research centers, annotating them, adding
drug-related site information, and screening. The 609 genes are shown in the table below:
Table 1
ABL1 ABL2 ACACA ACSL3 ACSL6 ACTB ACTG1 ACVR1 ACVR1B ACVR2A ADAM10
ADCY1 AFF4 AHCTF1 AHNAK AHR AKAP9 AKT1 AKT2 AKT3 ALK ALOX12B AMER1 ANK3
ANKRD11 APAF1 APC AQR AR ARAF ARAP3 ARFGAP1 ARFGAP3 ARFGEF1 ARFGEF2 ARHGAP29
ARHGAP35 ARHGEF2 ARID1A ARID1B ARID2 ARID4A ARID4B ARID5B ARNTL ASH1L ASPM
ASXL1 ASXL2 ATM ATR ATRX AURKA AURKB AXIN1 AXIN2 AXL B2M BAP1 BARD1 BBC3
BCL10 BCL11A BCL2 BCL2L11 BCL6 BCOR BIRC3 BLM BMPR1A BMPR2 BNC2 BPTF BRAF
BRCA1 BRCA2 BRD4 BRIP1 BRWD1 BTK CAD CALR CARD11 CARM1 CASP1 CASP8 CAST CAT
CBFB CBL CCAR1 CCND1 CCND2 CCND3 CCNE1 CCT5 CD274 CD276 CD79A CD79B CDC27
CDC73 CDH1 CDK12 CDK4 CDK6 CDK8 CDKN1A CDKN1B CDKN2A CDKN2B CEBPA CEP290
CHD1L CHD3 CHD4 CHD6 CHD8 CHD9 CHEK1 CHEK2 CIC CLOCK CLSPN CLTC CNOT3 CNTNAP1
COL1A1 CREBBP CRKL CRNKL1 CRTC3 CSDE1 CSF1R CSF3R CSNK2A1 CTCF CTNNB1 CTNND1
CTTN CUL1 CUL3 CUX1 CXCR4 CYLD DAXX DDR2 DDX3X DDX5 DEPDC1B DHX15 DHX35 DHX9
DICER1 DIS3 DNAJB1 DNMT1 DNMT3A DNMT3B DOT1L E2F3 EED EEF1A1 EFTUD2 EGFL7
EGFR EIF1AX EIF2AK3 EIF4A2 EIF4E EIF4G1 EIF4G3 ELF3 EP300 EPC1 EPHA1 EPHA2
EPHA3 EPHA4 EPHA5 EPHA7 EPHB1 EPHB2 ERBB2 ERBB3 ERBB4 ERCC2 ERCC3 ERCC4 ERCC5
ERRFI1 ESR1 ETV1 ETV6 EZH2 F8 FAM175A FANCA FANCI FAS FAT1 FAT2 FBXO11 FBXW7
FGF19 FGF3 FGF4 FGFR1 FGFR2 FGFR3 FGFR4 FH FKBP5 FLCN FLT1 FLT3 FLT4 FMR1 FN1
FOXA1 FOXA2 FOXL2 FOXO1 FOXP1 FUS FXR1 FYN G3BP1 G3BP2 GATA1 GATA3 GNA11
GNAI1 GNAQ GNAS GNG2 GPS2 GPSM2 GREM1 GRIN2A GSK3B H3F3A H3F3B H3F3C HCFC1
HDAC3 HDAC9 HGF HIST1H2BD HIST1H3B HIST1H3C HIST1H3D HIST1H3E HIST1H3F
HIST1H3G HIST1H3H HIST1H3I HIST1H3J HIST2H3D HIST3H3 HLA-A HLA-B HLF HNF1A
HOXB13 HRAS HSP90AA1 HSP90AB1 HSPA8 IDH1 IDH2 IGF1 IGF1R IGF2 IKBKE IL7R ING1
INHA INHBA INPP4A INPP4B INPPL1 INSR IREB2 IRF2 IRF6 IRF7 IRS1 IRS2 ITGA9
ITSN1 JAK1 JAK2 JAK3 JMY JUN KALRN KAT6B KDM5A KDM5C KDM6A KDR KEAP1 KIT KLF4
KMT2C KMT2D KRAS LAMA2 LATS2 LCP1 LDHA LIMA1 LMO1 LNPEP LRP6 LRPPRC MACF1
MAGI2 MAP2K1 MAP2K2 MAP2K4 MAP3K1 MAP3K11 MAP3K13 MAP3K14 MAP3K4 MAP4K1 MAPK1
MAPK3 MAT2A MAX MCM3 MCM8 MDC1 MDM2 MDM4 MECOM MED12 MED17 MED23 MEF2B MEF2C
MEN1 MET MFNG MGA MGMT MITF MKL1 MLH1 MLH3 MMP2 MPL MRE11A MSH2 MSH6 MSR1
MST1 MST1R MTOR MUC20 MUTYH MYB MYC MYCN MYD88 MYH10 MYH11 MYH14 MYH9 MYOD1
NBN NCF2 NCK1 NCKAP1 NCOA3 NCOR1 NCOR2 NDRG1 NEDD4L NF1 NF2 NFATC4 NFE2L2
NFKBIA NKX2-1 NKX3-1 NOTCH1 NOTCH2 NOTCH3 NOTCH4 NPM1 NR2F2 NR4A2 NRAS NSD1
NTN4 NTRK1 NTRK2 NTRK3 NUP107 NUP93 NUP98 PABPC1 PABPC3 PAK1 PALB2 PARK2
PARP1 PAX5 PBRM1 PCBP1 PCDH18 PCSK5 PCSK6 PDCD1 PDGFRA PDGFRB PER1 PGR PHF6
PIK3C2B PIK3C2G PIK3CA PIK3CB PIK3CD PIK3CG PIK3R1 PIK3R2 PIK3R3 PIM1 PLCB1
PLCG1 PLK2 PLXNA1 PLXNB2 PMS1 PMS2 POLE POLR2B POM121 PPM1D PPP2R1A PPP2R5A
PPP2R5C PPP6C PRDM1 PRKAR1A PRKCZ PRPF8 PRRX1 PSIP1 PSMA6 PSMD11 PSME3 PTCH1
PTEN PTPN11 PTPRD PTPRF PTPRS PTPRT PTPRU RAB35 RAC1 RAD21 RAD23B RAD50 RAD51
RAD51B RAD51C RAD52 RAD54L RAF1 RARA RASA1 RASGRP1 RB1 RBBP7 RBM10 RBM5
RECQL4 REL RET RFC4 RFWD2 RGS3 RHEB RHOA RHOT1 RICTOR RIT1 RNF43 ROBO2 ROS1
RPL5 RPS6KA4 RPS6KB2 RPTOR RTN4 RUNX1 RYBP SDHA SDHAF2 SDHB SETD2 SETDB1
SF3B1 SFPQ SH2B3 SH2D1A SHMT1 SHQ1 SIN3A SMAD2 SMAD3 SMAD4 SMARCA1 SMARCA4
SMARCB1 SMARCD1 SMC1A SMO SMURF2 SOCS1 SOS1 SOS2 SOX17 SOX2 SOX9 SPEN SPOP
SRC SRGAP1 SRGAP3 SRSF2 STAG1 STAG2 STARD13 STAT3 STAT5A STAT5B STIP1 STK11
STK4 STK40 SUFU SUZ12 SVEP1 SYK SYNCRIP TAF1 TAOK1 TBL1XR1 TBX3 TCEB1 TCF12
TCF3 TCF4 TCF7L2 TERT TET2 TFDP1 TGFBR1 TGFBR2 THRAP3 TJP1 TJP2 TMEM127
TNFAIP3 TNPO1 TOM1 TOP1 TP53 TP63 TRAF7 TRERF1 TRIO TRIP10 TSC1 TSHR TXNIP
U2AF1 USP6 VEGFA VHL VIM VTCN1 WASF3 WHSC1 WHSC1L1 WIPF1 WNK1 WNT5A WT1 XIAP
XPO1 XRCC2 XRN1 YAP1 YBX1 YES1 ZC3H11A ZFHX3 ZFP36L2 ZNF292 ZNF638
The screening process for the 609-gene panel for screening mutated genes is as follows:
1) select mutation site information related to cancer types from the samples provided by MSK,
obtaining 468 genes;
2) select the sample data published by 48 other research centers and predict driver genes, obtaining
more than 400 driver genes;
3) merge the two parts above and annotate the variant sites;
4) add drug-related site information for the genes selected above;
5) screen all of the site information above according to experience and reference databases, finally
retaining 1273 sites in 609 genes.
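The five screening steps can be read as set operations on site records. The following is a minimal sketch under stated assumptions, not the applicants' actual pipeline: the names `Site` and `build_panel`, the record layout, and the placeholder inputs are all hypothetical, and the experience-based curation of step 5 is stood in for by a caller-supplied predicate.

```python
from typing import Callable, Iterable, NamedTuple


class Site(NamedTuple):
    """One candidate site (hypothetical layout, for illustration only)."""
    gene: str
    chrom: str
    pos: int
    ref: str
    alt: str


def build_panel(
    msk_sites: Iterable[Site],
    driver_sites: Iterable[Site],
    drug_sites: Iterable[Site],
    keep: Callable[[Site], bool],
) -> set[Site]:
    """Merge two site sources, add drug-related sites, then filter."""
    # Steps 1-3: union of the two sources; identical sites collapse into one.
    merged = set(msk_sites) | set(driver_sites)
    genes = {s.gene for s in merged}
    # Step 4: add drug-related sites, but only for genes already selected.
    merged |= {s for s in drug_sites if s.gene in genes}
    # Step 5: curation, represented here by an arbitrary predicate.
    return {s for s in merged if keep(s)}


# Placeholder records; gene names and coordinates are not real data.
msk = [Site("GENE_A", "1", 100, "A", "T"), Site("GENE_B", "2", 200, "C", "G")]
drivers = [Site("GENE_B", "2", 200, "C", "G"), Site("GENE_C", "3", 300, "G", "A")]
drugs = [Site("GENE_A", "1", 150, "G", "C"), Site("GENE_Z", "9", 900, "T", "C")]

panel = build_panel(msk, drivers, drugs, keep=lambda s: s.gene != "GENE_C")
for site in sorted(panel):
    print(site)
```

In this toy run the duplicate GENE_B site is kept once, the GENE_Z drug site is ignored because GENE_Z was never selected, and the predicate drops GENE_C.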
Advantageous effects:
1) The gene panel of Memorial Sloan Kettering Cancer Center (MSK) covers close to 300 cancer types,
whereas our panel covers 367 cancer types; at present almost no panel covers this many cancer types;
2) The MSK gene panel has 468 genes, whereas we selected 609 tumor-related genes, giving broader
coverage; moreover, for these 609 genes we chose only the most frequently mutated sites, so the cost
is low and the panel is well targeted;
3) FDA-approved drug-related site information is added, which helps guide clinicians in prescribing;
4) The 609 genes not only cover most genes of the large panels on the market, but the hot-spot
mutation regions and drug-related sites of each gene are also chosen in a targeted way, so the overall
cost can fall while comprehensive information on tumor medication is still obtained.
Detailed description of the embodiments
First, based on the article published in Nature Medicine by Memorial Sloan Kettering Cancer Center
(MSK), we downloaded all clinical study samples and the gene list, mined the mutation information of
these samples to extract useful data, and performed a statistical analysis by tumor type, obtaining
mutation information for 262,595 sites in 468 genes;
Second, we took the publicly uploaded high-throughput data of more than 6,000 patients from 48
research centers. Starting from gene variants, we performed topological analysis on biological
networks of proteins, transcriptional regulation and non-coding RNA, followed by further screening,
obtaining mutation information for 48,121 sites in more than 400 driver genes;
Finally, we took the union of these two gene sets, annotated all mutation sites, and combined
information from multiple databases to judge which genes and mutations are tumor-related. Based on
existing experience, we chose high-frequency mutated genes and mutation sites from the more than 8,000
mutation records obtained, added the known drug-related sites of these genes, and finally determined
1273 mutation sites in 609 genes, as shown in the final table.
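One way to read "choosing high-frequency mutated genes and mutation sites" is as a recurrence count across samples. The sketch below illustrates that reading with `collections.Counter`; the function name `recurrent_sites`, the tuple layout, the cutoff `min_samples`, and the placeholder records are assumptions, since the text does not state how frequency was measured or which threshold was used.

```python
from collections import Counter


def recurrent_sites(calls, min_samples):
    """Return (site, count) pairs seen in at least `min_samples` samples.

    `calls` is an iterable of (sample_id, gene, site) tuples.
    """
    # De-duplicate first so that each site is counted once per sample.
    unique_calls = set(calls)
    counts = Counter((gene, site) for _sample, gene, site in unique_calls)
    # Rank by descending sample count, then by name for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(key, n) for key, n in ranked if n >= min_samples]


# Placeholder calls; sample IDs, genes and site names are not real data.
calls = [
    ("S1", "GENE_A", "site_10"),
    ("S2", "GENE_A", "site_10"),
    ("S2", "GENE_A", "site_10"),  # duplicate call within the same sample
    ("S3", "GENE_A", "site_10"),
    ("S1", "GENE_B", "site_20"),
    ("S3", "GENE_B", "site_20"),
    ("S2", "GENE_C", "site_30"),
]

top = recurrent_sites(calls, min_samples=2)
print(top)
```

Known drug-related sites would then be added to the retained list regardless of their count, as the paragraph above describes.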
CHROM: Chromosome
POS: Position on the chromosome
REF: Base of the reference chromosome at this position
ALT: Base detected at this position
ANN[*].HGVS_P: HGVS name (protein level) produced by ANNOVAR annotation
ANN[*].HGVS_C: HGVS name (DNA level) produced by ANNOVAR annotation
ANN[*].EFFECT: Mutation type produced by ANNOVAR annotation
ANN[*].RANK: Exon or intron number within the corresponding gene where the site lies, as annotated by
ANNOVAR
ANN[*].GENE: Gene produced by ANNOVAR annotation
ANN[*].FEATUREID: Version number of the reference transcript used for annotation
COSID: ID in the COSMIC database
CDS: Abbreviation of coding sequence
AA: Abbreviation of amino acid
CNT: Number of samples in the COSMIC database that carry this mutation
EXAC_ALL: Mutation frequency at this site across all samples in the ExAC database;
EXAC_AC_AFR: Mutation frequency at this site in the African samples in the ExAC database;
EXAC_AC_AMR: Mutation frequency at this site in the Admixed American samples in the ExAC database;
EXAC_AC_EAS: Mutation frequency at this site in the East Asian samples in the ExAC database;
EXAC_AC_FIN: Mutation frequency at this site in the Finnish samples in the ExAC database;
EXAC_AC_NFE: Mutation frequency at this site in the non-Finnish European samples in the ExAC database;
EXAC_AC_OTH: Mutation frequency at this site in samples from populations other than the above in the ExAC database;
EXAC_AC_SAS: Mutation frequency at this site in the South Asian samples in the ExAC database;
1000G_ALL: Mutation frequency at this site across all samples in the 1000 Genomes database;
1000G_EAS_AF: Mutation frequency at this site in the East Asian samples in the 1000 Genomes database;
1000G_AMR_AF: Mutation frequency at this site in the Admixed American samples in the 1000 Genomes database;
1000G_AFR_AF: Mutation frequency at this site in the African samples in the 1000 Genomes database;
1000G_EUR_AF: Mutation frequency at this site in the European samples in the 1000 Genomes database;
1000G_SAS_AF: Mutation frequency at this site in the South Asian samples in the 1000 Genomes database;
clinvar_CLINSIG: Clinical significance of the mutation as recorded in the ClinVar database;
clinvar_CLNDBN: Disease information related to this site as recorded in the ClinVar database;
FATHMM prediction: Pathogenicity prediction for the site in the COSMIC database
FATHMM score: Pathogenicity score for the site in the COSMIC database; the higher the score, the more
likely the site is pathogenic
Mutation somatic status: Prediction in the COSMIC database of whether the mutation at this site is a
somatic mutation or a germline mutation
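To show how the columns defined above might be consumed, here is a minimal sketch that reads a tab-separated table with these headers and keeps rows that are rare in ExAC and recurrent in COSMIC. Everything beyond the column names is an assumption: the tab-separated layout, the use of "." for a missing value, the helper names `to_float` and `filter_rows`, the two thresholds, and the placeholder rows.

```python
import csv
import io


def to_float(value, default=0.0):
    """Parse a numeric field; treat missing, empty or '.' values as `default`."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def filter_rows(handle, max_pop_freq, min_cosmic_count):
    """Yield rows with EXAC_ALL <= max_pop_freq and CNT >= min_cosmic_count."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # Skip variants that are common in the general population.
        if to_float(row.get("EXAC_ALL")) > max_pop_freq:
            continue
        # Skip variants seen in too few COSMIC samples.
        if to_float(row.get("CNT")) < min_cosmic_count:
            continue
        yield row


# Placeholder table; positions, counts and frequencies are not real data.
table = io.StringIO(
    "CHROM\tPOS\tREF\tALT\tCNT\tEXAC_ALL\n"
    "1\t100\tA\tT\t25\t.\n"
    "2\t200\tC\tG\t3\t0.0001\n"
    "3\t300\tG\tA\t40\t0.12\n"
)

kept = list(filter_rows(table, max_pop_freq=0.01, min_cosmic_count=10))
for row in kept:
    print(row["CHROM"], row["POS"], row["REF"], row["ALT"])
```

Treating a missing population frequency as zero keeps variants that ExAC has never observed; a stricter pipeline might instead flag them for manual review.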