CN114316025A - VEGFR2 gene humanized non-human animal and construction method and application thereof


Info

Publication number
CN114316025A
Authority
CN
China
Prior art keywords
vegfr2
human
ser
leu
humanized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111596716.9A
Other languages
Chinese (zh)
Inventor
赵磊
尚诚彰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baccetus Beijing Pharmaceutical Technology Co ltd
Original Assignee
Baccetus Beijing Pharmaceutical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baccetus Beijing Pharmaceutical Technology Co ltd

Abstract

The invention provides a VEGFR2 gene humanized non-human animal and a construction method thereof, in which a partial nucleotide sequence encoding human VEGFR2 protein is introduced into the genome of the non-human animal by homologous recombination. The humanized VEGFR2 protein is expressed normally in the animal, which can serve as an animal model for studying the signaling mechanism of human VEGFR2 and for screening drugs against tumors and immune-related diseases, and is of significant value for the research and development of new drugs against immune targets. The invention also provides a humanized VEGFR2 protein, a humanized VEGFR2 gene, a targeting vector, a non-human animal obtained by the construction method, and application of the non-human animal in the field of biomedicine.

Description

VEGFR2 gene humanized non-human animal and construction method and application thereof
Technical Field
The invention belongs to the field of animal genetic engineering and genetic modification, and particularly relates to a VEGFR2 gene humanized non-human animal, a construction method thereof and application thereof in the field of biomedicine.
Background
Angiogenesis is the process by which new blood vessels sprout from pre-existing capillaries. Many important physiological and pathological processes in the human body involve the formation of new blood vessels, including embryonic development, wound healing, diabetic retinopathy, and tumor infiltration and metastasis. The process is triggered and regulated by a variety of factors, including basic fibroblast growth factor (bFGF), vascular endothelial growth factor (VEGF), platelet-derived growth factor (PDGF), and interleukin 6 (IL6). Among them, VEGF is one of the most important factors regulating the growth and proliferation of vascular endothelial cells. It is a glycosylated mitogen that acts specifically on endothelial cells: it induces capillary proliferation, promotes endothelial cell proliferation and migration, enhances vascular permeability, and inhibits apoptosis. Its action depends on binding to its specific receptors on vascular endothelial cells (VEGFRs), which in turn activates signal transduction to exert biological effects.
VEGF family members include VEGFA, VEGFB, VEGFC, VEGFD and VEGFE, of which VEGFA is the most important member and one of the most potent vascular permeability-promoting factors known to date. The VEGF receptor family has 3 members: VEGFR1, VEGFR2 and VEGFR3. VEGFR2 is one of the most important regulators; it is distributed mainly in vascular endothelial cells and lymphatic endothelial cells and plays a leading role in VEGF signal transduction and the formation of vascular endothelium. On binding VEGFA, VEGFR2 forms a dimer and undergoes autophosphorylation, which activates a cell membrane/cytoplasmic kinase cascade and transmits the signal to the nucleus. This initiates a series of responses such as endothelial cell proliferation, invasion and migration, regulates endothelial cells through mechanisms such as anti-apoptosis, promotes angiogenesis and maintains vascular integrity. The VEGFA/VEGFR2 signaling pathway is the most important rate-limiting step of physiological and pathological angiogenesis and is also an important pathway of tumor angiogenesis: it not only directly leads to the generation of tumor vessels but also indirectly promotes lymphatic metastasis. High expression of VEGFR2 has been detected in various tumor tissues, such as non-small cell lung cancer, metastatic colorectal cancer, cervical cancer, ovarian cancer, nasopharyngeal cancer, gastric cancer and brain glioma, and its expression is associated with tumor staging, recurrence and prognosis. VEGFR2 is therefore regarded as an important target of antitumor drugs and can be used as a tumor prognosis indicator to assist clinical medication, providing an important basis for determining treatment schemes, evaluating efficacy and assessing prognosis.
Experimental animal disease models are indispensable research tools for studying the etiology and pathogenesis of human diseases, developing prevention and treatment technologies, and developing medicines. However, because the physiological structures and metabolic systems of animals differ from those of humans, traditional animal models do not reflect the real condition of the human body well, and establishing disease models in animals that are closer to human physiological characteristics is an urgent need of the biomedical industry. Given these physiological and pathological differences, together with the complexity of genes, constructing an "effective" humanized animal model that approximates human physiological characteristics for new drug development remains the greatest challenge.
The VEGFA/VEGFR2 signaling pathway has great application value in the field of tumor immunotherapy. To further study the relevant biological characteristics, improve the effectiveness of preclinical pharmacodynamic tests, raise the success rate of research and development, and minimize development failures, there is an urgent need in the art to develop a non-human animal model involving the VEGFA/VEGFR2 signaling pathway. In addition, the non-human animal obtained by the method can be mated with other gene humanized non-human animals to obtain a multi-gene humanized animal model for screening human drugs directed at this signaling pathway and for evaluating the efficacy of such drugs and of combination therapies. The invention has wide application prospects in academic and clinical research.
Disclosure of Invention
The present invention utilizes gene editing technology to replace homologous genes in non-human animal genome with human normal or mutant genes and to create non-human animals with normal or mutant genes that more closely approximate the physiological or disease characteristics of humans. The cell or tissue transplantation can be improved and promoted through gene humanization, and more importantly, due to the insertion of human gene segments, human proteins can be expressed or partially expressed in an animal body and can be used as targets of drugs only capable of recognizing human protein sequences, so that the possibility of screening anti-human antibodies and other drugs at the animal level is provided.
Specifically, in the first aspect of the present invention, a humanized VEGFR2 protein is provided, wherein the humanized VEGFR2 protein comprises all or part of a human VEGFR2 protein.
Preferably, the humanized VEGFR2 protein comprises all or part of the signal peptide, transmembrane region, cytoplasmic region, and/or extracellular region of the human VEGFR2 protein. Further preferably, the humanized VEGFR2 protein comprises all or part of the extracellular domain of human VEGFR2 protein.
Preferably, the humanized VEGFR2 protein further comprises a portion of a non-human animal VEGFR2 protein, more preferably all or a portion of the signal peptide, transmembrane region, and/or cytoplasmic region of the non-human animal VEGFR2 protein. Further preferably, the VEGFR2 protein may also comprise a portion of the extracellular domain of a non-human animal VEGFR2 protein.
In one embodiment of the present invention, the humanized VEGFR2 protein comprises a signal peptide, a transmembrane region, a cytoplasmic region, and an extracellular region. Wherein the extracellular domain comprises all or part of the extracellular domain of human VEGFR2 protein. Preferably, the portion of the extracellular domain of human VEGFR2 protein comprises an extracellular domain of human VEGFR2 protein with 0-20, preferably 5-15, specifically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 amino acid residues removed from the N-terminus and/or C-terminus. Further preferred, the extracellular domain of human VEGFR2 protein comprises 6 amino acid residues removed from the N-terminus and 13 amino acid residues removed from the C-terminus. The signal peptide, transmembrane region and cytoplasmic region are derived from a non-human animal.
Preferably, the humanized VEGFR2 protein comprises at least 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 710, 720, 726, 730, 732, 738, 739, 740, 741 or 745 consecutive amino acids of the extracellular domain of the human VEGFR2 protein.
In one embodiment of the invention, the humanized VEGFR2 protein comprises an amino acid sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4, or comprises the amino acid sequence shown at positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4. In a specific embodiment of the present invention, the humanized VEGFR2 protein comprises a non-human animal VEGFR2 protein signal peptide, a non-human animal VEGFR2 protein transmembrane region, a non-human animal VEGFR2 protein cytoplasmic region, a portion of the extracellular domain of the human VEGFR2 protein (preferably, the extracellular domain of the human VEGFR2 protein with 0 to 20 amino acid residues removed from the N-terminus and/or C-terminus), and a portion of the extracellular domain of the non-human animal VEGFR2 protein (0 to 20 amino acid residues from the N-terminus and/or C-terminus).
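The position ranges and residue counts quoted above can be cross-checked with simple interval arithmetic. The sketch below is illustrative only: it assumes that the extracellular domain corresponds to positions 20-764 of SEQ ID NO: 4 (an inference from the listed ranges, not a statement in the text), and the helper names `span_length` and `trim` are hypothetical.

```python
# Position ranges of SEQ ID NO: 4 quoted in the text (1-based, inclusive).
RANGES = [(1, 764), (14, 764), (20, 764), (26, 764),
          (23, 760), (20, 751), (26, 751)]


def span_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def trim(start, end, n_term, c_term):
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    return start + n_term, end - c_term


# ASSUMPTION: the extracellular domain spans positions 20-764 of SEQ ID NO: 4.
ECD = (20, 764)

# Removing 6 N-terminal and 13 C-terminal residues, as in the preferred embodiment.
trimmed = trim(*ECD, n_term=6, c_term=13)
print(trimmed, span_length(*trimmed))  # (26, 751) 726

for start, end in RANGES:
    print(f"{start}-{end}: {span_length(start, end)} residues")
```

Under that assumption, the trimmed domain lands on positions 26-751, and several of the range lengths (745, 739, 738, 732, 726) coincide with values in the "consecutive amino acids" list.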
Preferably, the humanized VEGFR2 protein comprises all or part of an amino acid sequence encoded by exon 1 to exon 30 of the human VEGFR2 gene. Further preferably, it comprises all or part of an amino acid sequence encoded by any one, two, or three or more exons, or by two or more consecutive exons, of exon 1 to exon 30 of the human VEGFR2 gene. Even more preferably, it comprises all or part of the amino acid sequence encoded by exons 2 to 15 of the human VEGFR2 gene. Still further preferably, it comprises an amino acid sequence encoded by part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, wherein the part of exon 2 of the human VEGFR2 gene comprises the nucleotide sequence of exon 2 excluding that encoding the N-terminal 0-3 amino acids, or comprises at least 20, 30, 40, 50, 60, 70, 80, 86, 90, or 94 bp of contiguous exon 2 nucleotide sequence; and the part of exon 15 of the human VEGFR2 gene comprises the nucleotide sequence of exon 15 excluding that encoding the C-terminal 0-4 amino acids, or comprises at least 20, 50, 80, 100, 110, 119, 120, 130, or 132 bp of contiguous exon 15 nucleotide sequence. Most preferably, it comprises an amino acid sequence encoded by a nucleotide sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to SEQ ID NO: 7, or encoded by the nucleotide sequence shown in SEQ ID NO: 7.
Preferably, the humanized VEGFR2 protein further comprises all or part of an amino acid sequence encoded by a non-human animal VEGFR2 gene, preferably by exon 1 and exons 16 to 30, and preferably further by part of exon 2 and/or part of exon 15 of the non-human animal VEGFR2 gene.
In one embodiment of the present invention, the amino acid sequence of the humanized VEGFR2 protein comprises one of the following groups:
A) the amino acid sequence shown at positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4;
B) an amino acid sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identity to positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4;
C) an amino acid sequence that differs from positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or by no more than 1 amino acid; or
D) an amino acid sequence as shown at positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4 that includes substitution, deletion and/or insertion of one or more amino acid residues.
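The identity thresholds and "differs by no more than N" criteria in this list can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with `-` marking gaps, and it uses one common convention (matches divided by aligned columns); the text does not specify which identity definition or alignment tool applies, so the function names and the toy sequences are purely hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a, aligned_b))
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in pairs if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def count_differences(aligned_a, aligned_b):
    """Number of aligned columns that differ (substitution, deletion or insertion)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


# Toy 10-residue sequences, not taken from any SEQ ID NO.
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"  # one substitution at the last position

print(percent_identity(reference, variant))   # 90.0
print(count_differences(reference, variant))  # 1
```

With this convention, the toy variant would satisfy a "at least 90% identity" criterion and a "differs by no more than 1 amino acid" criterion, but not "at least 95% identity".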
In one embodiment of the present invention, the amino acid sequence of the humanized VEGFR2 protein comprises one of the following groups:
I) all or part of the amino acid sequence shown in SEQ ID NO: 13;
II) an amino acid sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% identity to the amino acid sequence shown in SEQ ID NO: 13;
III) an amino acid sequence that differs from the amino acid sequence shown in SEQ ID NO: 13 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or no more than 1 amino acid; or
IV) an amino acid sequence as shown in SEQ ID NO: 13 that includes substitution, deletion and/or insertion of one or more amino acid residues.
In a second aspect of the present invention, a humanized VEGFR2 gene is provided, wherein the humanized VEGFR2 gene comprises a portion of the human VEGFR2 gene.
Preferably, the humanized VEGFR2 gene comprises all or part of a nucleotide sequence encoding the signal peptide, extracellular region, transmembrane region and/or cytoplasmic region of human VEGFR2 protein. Further preferably, it comprises all or part of the nucleotide sequence encoding the extracellular domain of human VEGFR2 protein. Still further preferably, it comprises a nucleotide sequence encoding the extracellular domain of human VEGFR2 protein with 0-20, preferably 5-15, more preferably 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 amino acid residues removed from the N-terminus and/or C-terminus. Still further preferably, it comprises a nucleotide sequence encoding the extracellular domain of human VEGFR2 protein with 6 amino acid residues removed from the N-terminus and 13 amino acid residues removed from the C-terminus. Still further preferably, it comprises a nucleotide sequence encoding at least 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 710, 720, 726, 730, 732, 738, 739, 740, 741 or 745 consecutive amino acids of the extracellular domain of human VEGFR2 protein.
Preferably, the humanized VEGFR2 gene further comprises a nucleotide sequence encoding the signal peptide of human VEGFR2. Most preferably, it comprises a nucleotide sequence encoding an amino acid sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4, or a nucleotide sequence encoding the amino acid sequence shown at positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4.
Preferably, the humanized VEGFR2 gene further comprises all or part of a nucleotide sequence encoding non-human animal VEGFR2 protein, for example a partial nucleotide sequence encoding the signal peptide, transmembrane region, cytoplasmic region and/or extracellular region of the non-human animal VEGFR2 protein.
Preferably, the humanized VEGFR2 gene encodes the humanized VEGFR2 protein described above.
Preferably, the humanized VEGFR2 gene comprises all or part of exons 1 to 30 of the human VEGFR2 gene. Further preferably, it comprises all or part of any one, two, or three or more exons, or of two or more consecutive exons, of exon 1 to exon 30 of the human VEGFR2 gene. Even more preferably, it comprises all or part of exons 2 to 15 of the human VEGFR2 gene. Still further preferably, it comprises part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, and preferably further comprises intron 2-3 and/or intron 14-15, wherein the part of exon 2 of the human VEGFR2 gene comprises the nucleotide sequence of exon 2 excluding that encoding the N-terminal 0-3 amino acids, or comprises at least 20, 30, 40, 50, 60, 70, 80, 86, 90, or 94 bp of contiguous exon 2 nucleotide sequence; and the part of exon 15 of the human VEGFR2 gene comprises the nucleotide sequence of exon 15 excluding that encoding the C-terminal 0-4 amino acids, or comprises at least 20, 50, 80, 100, 110, 119, 120, 130 or 132 bp of contiguous exon 15 nucleotide sequence.
Preferably, the humanized VEGFR2 gene further comprises a portion of a non-human animal VEGFR2 gene; preferably exons 1, 16-30, preferably also part of exon 2 and/or part of exon 15 of the non-human animal VEGFR2 gene.
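The exon composition described for the preferred embodiment (non-human exon 1, partly human exon 2, human exons 3 to 14, partly human exon 15, non-human exons 16 to 30) can be written down as a simple lookup. This is an illustrative sketch only; the function name `chimeric_exon_layout` and the labels are hypothetical, and exons 2 and 15 are marked "chimeric" on the assumption that each carries both a human part and a non-human part, as the preferred wording suggests.

```python
def chimeric_exon_layout(total_exons=30, first_human=2, last_human=15):
    """Map each exon number to its origin in the humanized gene."""
    layout = {}
    for exon in range(1, total_exons + 1):
        if exon < first_human or exon > last_human:
            layout[exon] = "non-human"
        elif exon in (first_human, last_human):
            # Boundary exons: part human, part non-human (assumed).
            layout[exon] = "chimeric"
        else:
            layout[exon] = "human"
    return layout


layout = chimeric_exon_layout()
for exon, origin in layout.items():
    print(f"exon {exon:2d}: {origin}")
```

The defaults reflect the exon 2 to exon 15 embodiment; other combinations named in the text could be explored by changing `first_human` and `last_human`.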
Preferably, the humanized VEGFR2 gene comprises the nucleotide sequence shown in SEQ ID NO: 8 and/or 9, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, or at least 99% identity to SEQ ID NO: 8 and/or 9.
Preferably, the humanized VEGFR2 gene comprises the nucleotide sequence shown in SEQ ID NO: 10 and/or 11, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, or at least 99% identity to SEQ ID NO: 10 and/or 11.
In one embodiment of the present invention, the humanized VEGFR2 gene comprises one of the following groups:
(A) all or part of the nucleotide sequence shown in SEQ ID NO: 7;
(B) a nucleotide sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identity to the nucleotide sequence shown in SEQ ID NO: 7;
(C) a nucleotide sequence that differs from the nucleotide sequence shown in SEQ ID NO: 7 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or no more than 1 nucleotide; or
(D) a nucleotide sequence as shown in SEQ ID NO: 7 that includes substitution, deletion and/or insertion of one or more nucleotides.
In one embodiment of the present invention, the nucleotide sequence of the humanized VEGFR2 gene comprises one of the following groups:
(i) the transcribed mRNA is all or part of the nucleotide sequence shown in SEQ ID NO: 12;
(ii) the transcribed mRNA has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identity to the nucleotide sequence shown in SEQ ID NO: 12;
(iii) the transcribed mRNA differs from the nucleotide sequence shown in SEQ ID NO: 12 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or no more than 1 nucleotide; or
(iv) the transcribed mRNA has the nucleotide sequence shown in SEQ ID NO: 12 with substitution, deletion and/or insertion of one or more nucleotides.
Preferably, the humanized VEGFR2 gene further comprises a specific inducer or repressor. Further preferably, the specific inducer or repressor may be a substance conventionally used for induction or repression.
In one embodiment of the invention, the specific inducer is selected from the tetracycline System (Tet-Off System/Tet-On System) or Tamoxifen System (Tamoxifen System).
In a third aspect of the present invention, there is provided a targeting vector, said targeting vector comprising any one of:
A) a portion of the human VEGFR2 gene, preferably comprising part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, preferably further comprising intron 2-3 and/or intron 14-15, wherein the part of exon 2 of the human VEGFR2 gene comprises the nucleotide sequence of exon 2 excluding that encoding the N-terminal 0-3 amino acids, or comprises at least 20, 30, 40, 50, 60, 70, 80, 86, 90, or 94 bp of contiguous exon 2 nucleotide sequence; the part of exon 15 of the human VEGFR2 gene comprises the nucleotide sequence of exon 15 excluding that encoding the C-terminal 0-4 amino acids, or comprises at least 20, 50, 80, 100, 110, 119, 120, 130, or 132 bp of contiguous exon 15 nucleotide sequence; further preferably, the portion of the human VEGFR2 gene comprises SEQ ID NO: 7;
B) a nucleotide sequence encoding all or a portion of a human VEGFR2 protein, preferably comprising all or a portion of an extracellular region encoding a human VEGFR2 protein, further preferably comprising a nucleotide sequence encoding at least 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 710, 720, 726, 730, 732, 738, 739, 740, 741, or 745 consecutive amino acids of an extracellular region of a human VEGFR2 protein, more preferably comprising a nucleotide sequence encoding SEQ ID NO: 4 at position 1-764, 14-764, 20-764, 26-764, 23-760, 20-751, or 26-751;
C) a nucleotide sequence encoding the humanized VEGFR2 protein described above; or
D) the humanized VEGFR2 gene described above.
Preferably, the targeting vector further comprises a 5' arm (5' homology arm) and/or a 3' arm (3' homology arm). The 5' arm is a DNA fragment homologous to the 5' end of the transition region to be altered, selected from 100-10000 nucleotides of the genomic DNA of the non-human animal VEGFR2 gene. Preferably, the 5' arm has at least 90% homology with NCBI accession number NC_000071.6. Further preferably, the 5' arm sequence is as shown in SEQ ID NO: 5. The 3' arm is a DNA fragment homologous to the 3' end of the transition region to be altered, selected from 100-10000 nucleotides of the genomic DNA of the non-human animal VEGFR2 gene. Preferably, the 3' arm has at least 90% homology with NCBI accession number NC_000071.6. Further preferably, the 3' arm sequence is as shown in SEQ ID NO: 6.
Preferably, the transition region to be altered of the targeting vector is located at the non-human animal VEGFR2 locus. Further preferably, it is located on exon 1 to 30 of the non-human animal VEGFR2 gene. Even more preferably, it is located on exon 2 to exon 15 of the VEGFR2 gene of a non-human animal.
Preferably, the targeting vector further comprises a marker gene. Further preferably, the marker gene is a gene encoding a negative selection marker. Still more preferably, the gene encoding the negative selection marker is a gene encoding diphtheria toxin subunit a (DTA).
In one embodiment of the present invention, the targeting vector further comprises a resistance gene for positive clone selection. Further preferably, the resistance gene selected by the positive clone is neomycin phosphotransferase coding sequence Neo.
In one embodiment of the present invention, the targeting vector further comprises a specific recombination system. Further preferably, the specific recombination system is a Frt recombination site (a conventional LoxP recombination system can also be selected). The specific recombination system is provided with two Frt recombination sites which are respectively connected to two sides of the resistance gene.
In one embodiment of the present invention, the targeting vector further comprises the nucleotide sequence shown in SEQ ID NO: 8 and/or 9, or a nucleotide sequence having at least 85%, 90%, 95%, or at least 99% identity to SEQ ID NO: 8 and/or 9.
In one embodiment of the present invention, the targeting vector further comprises the nucleotide sequence shown in SEQ ID NO: 10 and/or 11, or a nucleotide sequence having at least 85%, 90%, 95% or at least 99% identity to SEQ ID NO: 10 and/or 11.
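The components of the targeting vector described in this aspect (two homology arms of 100-10000 nucleotides, a donor fragment, a Neo positive-selection gene flanked by two Frt sites, and a DTA negative-selection marker) can be captured in a small design checklist. The sketch below is a hypothetical model, not the applicant's design tool: the class name, field names and example arm lengths are all assumptions, and it checks only the numeric and structural constraints stated in the text.

```python
from dataclasses import dataclass


@dataclass
class TargetingVector:
    """Hypothetical checklist model of the targeting vector components."""
    five_prime_arm_len: int
    three_prime_arm_len: int
    donor: str = "human VEGFR2, part of exon 2 to part of exon 15"
    positive_marker: str = "Neo"          # resistance gene for positive clone selection
    negative_marker: str = "DTA"          # diphtheria toxin subunit A
    recombination_sites: tuple = ("Frt", "Frt")  # sites flanking the resistance gene

    def problems(self):
        """Return a list of constraint violations (empty if none are found)."""
        issues = []
        arms = (("5' arm", self.five_prime_arm_len),
                ("3' arm", self.three_prime_arm_len))
        for name, length in arms:
            if not 100 <= length <= 10000:
                issues.append(f"{name} length {length} is outside 100-10000 nt")
        sites = self.recombination_sites
        if len(sites) != 2 or len(set(sites)) != 1:
            issues.append("resistance gene needs two recombination sites of the same type")
        return issues


# Arm lengths here are made-up example values.
print(TargetingVector(4000, 5000).problems())  # []
print(TargetingVector(50, 5000).problems())
```

Swapping `("Frt", "Frt")` for `("LoxP", "LoxP")` would model the alternative recombination system mentioned in the text.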
In a fourth aspect of the invention, there is provided a cell comprising the targeting vector described above.
In a fifth aspect of the present invention, there is provided the use of the targeting vector or the cell comprising the targeting vector described above for the modification of the VEGFR2 gene. Preferably, said use includes, but is not limited to, knock-out, insertion or substitution.
In a sixth aspect of the present invention, there is provided a non-human animal humanized with VEGFR2 gene, said non-human animal expressing human or humanized VEGFR2 protein; alternatively, the genome of the non-human animal comprises a portion of the human VEGFR2 gene.
Preferably, the non-human animal has reduced or absent expression of endogenous VEGFR2 protein.
Preferably, the genome of the non-human animal comprises a humanized VEGFR2 gene.
Preferably, a portion of the human VEGFR2 gene or the nucleotide sequence of the humanized VEGFR2 gene is operably linked to a non-human animal endogenous regulatory element.
In a seventh aspect of the present invention, a method for constructing a non-human animal humanized with VEGFR2 gene is provided, wherein the non-human animal expresses the human or humanized VEGFR2 protein in vivo; alternatively, the genome of the non-human animal comprises a portion of the human VEGFR2 gene.
Preferably, the genome of the non-human animal comprises a nucleotide sequence encoding a human VEGFR2 protein. Further preferably, it comprises all or part of the nucleotide sequence encoding the extracellular region, signal peptide, cytoplasmic region and/or transmembrane region of the human VEGFR2 protein. Even more preferably, it comprises all or part of the nucleotide sequence encoding the extracellular domain of the human VEGFR2 protein. Still further preferably, it comprises a nucleotide sequence encoding the extracellular domain of human VEGFR2 protein with 0-20, preferably 5-15, particularly 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 amino acid residues removed from the N-terminus and/or C-terminus. In one embodiment of the present invention, it comprises a nucleotide sequence encoding the extracellular domain of human VEGFR2 protein with 6 amino acid residues removed from the N-terminus and 13 amino acid residues removed from the C-terminus. Still further preferably, it comprises a nucleotide sequence encoding at least 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 710, 720, 726, 730, 732, 738, 739, 740, 741 or 745 consecutive amino acids of the extracellular domain of human VEGFR2 protein.
In one embodiment of the invention, the genome of the non-human animal comprises a nucleotide sequence encoding an amino acid sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4, or a nucleotide sequence encoding the amino acid sequence shown at positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4.
In one embodiment of the invention, the genome of the non-human animal comprises a nucleotide sequence encoding the amino acid sequence shown in SEQ ID NO: 13.
Preferably, the genome of the non-human animal comprises all or part of exons 1 to 30 of the human VEGFR2 gene. Further preferably, it comprises all or part of any one, two, or three or more exons, or of two or more consecutive exons, of exon 1 to exon 30 of the human VEGFR2 gene. Even more preferably, it comprises all or part of exons 2 to 15 of the human VEGFR2 gene. Still further preferably, it comprises part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, and preferably further comprises intron 2-3 and/or intron 14-15, wherein the part of exon 2 of the human VEGFR2 gene comprises the nucleotide sequence of exon 2 excluding that encoding the N-terminal 0-3 amino acids, or comprises at least 20, 30, 40, 50, 60, 70, 80, 86, 90, or 94 bp of contiguous exon 2 nucleotide sequence; and the part of exon 15 of the human VEGFR2 gene comprises the nucleotide sequence of exon 15 excluding that encoding the C-terminal 0-4 amino acids, or comprises at least 20, 50, 80, 100, 110, 119, 120, 130, or 132 bp of contiguous exon 15 nucleotide sequence. In one embodiment of the invention, the genome of the non-human animal comprises SEQ ID NO: 7.
In a specific embodiment of the present invention, the genome of the non-human animal comprises the human or humanized VEGFR2 gene described above.
Preferably, the method of construction comprises introducing into the non-human animal VEGFR2 locus a nucleotide sequence comprising any one of:
A) a portion of the human VEGFR2 gene, preferably comprising part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, preferably further comprising intron 2-3 and/or intron 14-15, wherein the part of exon 2 of the human VEGFR2 gene comprises the nucleotide sequence of exon 2 excluding that encoding the N-terminal 0-3 amino acids, or comprises at least 20, 30, 40, 50, 60, 70, 80, 86, 90, or 94 bp of contiguous exon 2 nucleotide sequence; the part of exon 15 of the human VEGFR2 gene comprises the nucleotide sequence of exon 15 excluding that encoding the C-terminal 0-4 amino acids, or comprises at least 20, 50, 80, 100, 110, 119, 120, 130, or 132 bp of contiguous exon 15 nucleotide sequence; further preferably, comprising SEQ ID NO: 7;
B) a nucleotide sequence encoding all or a portion of a human VEGFR2 protein, preferably comprising all or a portion of an extracellular region encoding a human VEGFR2 protein, preferably comprising a nucleotide sequence encoding SEQ ID NO: 4 at position 1-764, 14-764, 20-764, 26-764, 23-760, 20-751, or 26-751;
C) a nucleotide sequence encoding the humanized VEGFR2 protein described above; or
D) the humanized VEGFR2 gene described above.
The introduction is insertion or replacement. Preferably, the site of insertion or substitution is after the endogenous regulatory elements of the VEGFR2 gene.
Here, insertion means placing the target fragment directly between two adjacent bases without deleting any nucleotide. The target fragment is, for example, a human VEGFR2 gene, a humanized VEGFR2 gene, a nucleotide sequence encoding a human or humanized VEGFR2 protein, or a nucleotide sequence obtained by splicing a human VEGFR2 gene with a non-human animal VEGFR2 gene. A partial nucleotide sequence of the human VEGFR2 gene may of course also be used; preferably, exon x+1 to exon 30 of the human VEGFR2 gene is inserted immediately after exon x of the non-human animal VEGFR2 gene. For example, exons 2 to 30 of the human VEGFR2 gene are inserted immediately after exon 1 of the non-human animal VEGFR2 gene, exons 3 to 30 of the human VEGFR2 gene are inserted immediately after exon 2 of the non-human animal VEGFR2 gene, exons 4 to 30 of the human VEGFR2 gene are inserted immediately after exon 3 of the non-human animal VEGFR2 gene, exons 5 to 30 of the human VEGFR2 gene are inserted immediately after exon 4 of the non-human animal VEGFR2 gene, and so on.
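The distinction between insertion (no nucleotide deleted) and replacement can be illustrated on plain strings. This is a toy sketch: the sequences are made up, the coordinates are 0-based Python string offsets rather than genomic coordinates, and the function names are hypothetical.

```python
def insert_fragment(genome, position, fragment):
    """Place fragment between two adjacent bases; no base of genome is removed."""
    if not 0 <= position <= len(genome):
        raise ValueError("position out of range")
    return genome[:position] + fragment + genome[position:]


def replace_region(genome, start, end, fragment):
    """Swap genome[start:end] for fragment; the original bases in the region are lost."""
    if not 0 <= start <= end <= len(genome):
        raise ValueError("region out of range")
    return genome[:start] + fragment + genome[end:]


host = "AAAATTTT"   # toy host sequence
donor = "GGCC"      # toy donor fragment

inserted = insert_fragment(host, 4, donor)
replaced = replace_region(host, 2, 6, donor)

print(inserted)  # AAAAGGCCTTTT
print(replaced)  # AAGGCCTT
```

After insertion every original base is still present and the length grows by the donor length, whereas replacement removes the bases in the targeted region.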
Preferably, depending on the needs of the particular embodiment, the insertion may further comprise first disrupting the coding frame of the endogenous VEGFR2 gene in the non-human animal, or disrupting the coding frame of the endogenous VEGFR2 gene downstream of the insertion site, and then performing the insertion. Alternatively, the insertion step itself may both cause a frameshift mutation in the endogenous VEGFR2 gene and insert the human sequence.
Further preferably, as required by the particular embodiment, once the target fragment has been inserted, an auxiliary sequence (e.g., a stop codon or a sequence with a termination function) may be added after it, or another method (e.g., inverting or knocking out a sequence) may be used, so that the endogenous VEGFR2 protein sequence located after the insertion site is not normally expressed in the non-human animal.
Wherein the replacement includes replacement at a corresponding position or replacement at a non-corresponding position. Replacement at a corresponding position does not merely mean a mechanical, base-for-base replacement between the human and non-human animal VEGFR2 genes; it also includes replacement of the corresponding functional region, for example: the nucleotide sequence encoding the extracellular region of the human VEGFR2 protein replaces the nucleotide sequence encoding the extracellular region of the non-human animal VEGFR2 protein; the nucleotide sequence encoding the signal peptide of the human VEGFR2 protein replaces the nucleotide sequence encoding the signal peptide of the non-human animal VEGFR2 protein; the nucleotide sequence encoding the transmembrane region of the human VEGFR2 protein replaces the nucleotide sequence encoding the transmembrane region of the non-human animal VEGFR2 protein; the nucleotide sequence encoding the cytoplasmic region of the human VEGFR2 protein replaces the nucleotide sequence encoding the cytoplasmic region of the non-human animal VEGFR2 protein; the nucleotide sequence encoding the signal peptide and extracellular region of the human VEGFR2 protein replaces the nucleotide sequence encoding the extracellular region of the non-human animal VEGFR2 protein; the nucleotide sequences encoding the signal peptide, extracellular region and cytoplasmic region of the human VEGFR2 protein replace the nucleotide sequences encoding the signal peptide, extracellular region and cytoplasmic region of the non-human animal VEGFR2 protein; the nucleotide sequences encoding the signal peptide, extracellular, cytoplasmic and transmembrane regions of the human VEGFR2 protein replace the nucleotide sequences encoding the signal peptide, extracellular, cytoplasmic and transmembrane regions of the non-human animal VEGFR2 protein; and the nucleotide sequences encoding the extracellular and cytoplasmic regions of the human VEGFR2 protein replace the nucleotide sequences encoding the extracellular and transmembrane regions of the non-human animal VEGFR2 protein.
Preferably, said introducing into the VEGFR2 locus of a non-human animal is a replacement for the corresponding region of the non-human animal; it is further preferred that all or part of exons 2 to 15 of non-human animal VEGFR2 gene be replaced.
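The distinction drawn above between insertion (no nucleotide deleted) and replacement (a host region exchanged for the target fragment) can be illustrated with a minimal sketch on toy strings. The function names, coordinates and sequences below are hypothetical illustrations and are not taken from the sequence listing; real targeting is carried out on the genome, not on Python strings.

```python
def insert_fragment(host: str, position: int, fragment: str) -> str:
    """Place `fragment` between two adjacent host bases; no host base is deleted."""
    if not 0 <= position <= len(host):
        raise ValueError("position lies outside the host sequence")
    return host[:position] + fragment + host[position:]


def replace_region(host: str, start: int, end: int, fragment: str) -> str:
    """Exchange host[start:end] (0-based, end-exclusive) for `fragment`."""
    if not 0 <= start <= end <= len(host):
        raise ValueError("invalid region")
    return host[:start] + fragment + host[end:]


# Toy stand-ins (assumptions): uppercase = non-human sequence, lowercase = human fragment.
host = "AAACCCGGGTTT"
human = "atgc"

inserted = insert_fragment(host, 6, human)   # every host base is still present
replaced = replace_region(host, 3, 9, human)  # host bases 3..8 are removed

print(inserted)
print(replaced)
```

In the insertion case the result is longer than the host by exactly the fragment length, whereas in the replacement case the length changes by the difference between the fragment and the region removed.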
Preferably, the method of construction comprises inserting into, or substituting at, the non-human animal VEGFR2 locus a nucleotide sequence comprising a sequence encoding a human VEGFR2 protein. Further preferably, a nucleotide sequence encoding all or part of the extracellular region of the human VEGFR2 protein is inserted into or substituted at the non-human animal VEGFR2 locus. Even more preferably, a nucleotide sequence encoding the extracellular region of the human VEGFR2 protein with 0-20, preferably 5-15, more preferably 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 amino acid residues removed from the N-terminus and/or C-terminus is inserted into or substituted at the non-human animal VEGFR2 locus. Still further preferably, a nucleotide sequence encoding the extracellular region of the human VEGFR2 protein with 6 amino acid residues removed from the N-terminus and 13 amino acid residues removed from the C-terminus is inserted into or substituted at the non-human animal VEGFR2 locus. Most preferably, a nucleotide sequence encoding an amino acid sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4, or a nucleotide sequence encoding positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4, is inserted into or substituted at the non-human animal VEGFR2 locus.
In one embodiment of the invention, the method of construction comprises replacing the nucleotide sequence of the non-human animal VEGFR2 gene that encodes positions 1-762, 14-762, 20-762, 26-762, 23-758, 20-749 or 26-749 of SEQ ID NO: 2 with a nucleotide sequence encoding positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4.
Preferably, this comprises inserting into, or substituting at, the non-human animal VEGFR2 locus a partial nucleotide sequence of the human VEGFR2 gene. Further preferably, all or part of the nucleotide sequence of exons 1 to 30 of the human VEGFR2 gene is inserted into or substituted at the non-human animal VEGFR2 locus. Still further preferably, all or part of the nucleotide sequence of any one, two, three or more, two consecutive, or three or more consecutive exons among exons 1 to 30 of the human VEGFR2 gene is inserted into or substituted at the non-human animal VEGFR2 locus. Still further preferably, all or part of the nucleotide sequence of exons 2 to 15 of the human VEGFR2 gene is inserted into or substituted at the non-human animal VEGFR2 locus. Even more preferably, a nucleotide sequence comprising part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, preferably further comprising intron 2-3 and/or intron 14-15, is inserted into or substituted at the non-human animal VEGFR2 locus, wherein the part of exon 2 of the human VEGFR2 gene comprises the nucleotide sequence of exon 2 excluding the portion encoding the N-terminal 0-3 amino acids, or the nucleotide sequence of exon 2 excluding the portion encoding the C-terminal 0-4 amino acids, or at least 20, 30, 40, 50, 60, 70, 80, 86, 90 or 94 bp of contiguous nucleotide sequence of exon 2; and the part of exon 15 of the human VEGFR2 gene comprises the nucleotide sequence of exon 15 excluding the portion encoding the C-terminal 0-4 amino acids, or at least 20, 50, 80, 100, 110, 119, 120, 130 or 132 bp of contiguous nucleotide sequence of exon 15.
Most preferably, a nucleotide sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to SEQ ID NO: 7, or a nucleotide sequence comprising SEQ ID NO: 7, is inserted into or substituted at the non-human animal VEGFR2 locus.
In a specific embodiment of the invention, the method of construction comprises replacing all or part of exons 2 to 15 of non-human animal VEGFR2 gene with all or part of exons 2 to 15 of human VEGFR2 gene.
In a specific embodiment of the invention, the method of construction comprises replacing part of exon 2, all of exon 3 to 14, and part of exon 15 of the VEGFR2 gene of a non-human animal with a construct comprising part of exon 2, all of exon 3 to 14, and part of exon 15 of the human VEGFR2 gene.
In a specific embodiment of the invention, the method of construction comprises replacing part of exon 2, all of intron 2-3, all of exons 3 to 14, all of intron 14-15, and part of exon 15 of the VEGFR2 gene of a non-human animal with a construct comprising part of exon 2, all of intron 2-3, all of exons 3 to 14, all of intron 14-15, and part of exon 15 of the human VEGFR2 gene.
In one embodiment of the invention, the method of construction comprises replacing all or part of exons 2 to 15 of a non-human animal VEGFR2 gene with a nucleotide sequence comprising SEQ ID NO: 7.
In one embodiment of the invention, the method of construction comprises replacing the nucleotide sequence of the non-human animal VEGFR2 gene that encodes positions 26-749 of SEQ ID NO: 2 with a nucleotide sequence comprising SEQ ID NO: 7.
In a specific embodiment of the invention, the cDNA sequence encoding the human VEGFR2 protein is inserted or substituted into the non-human animal VEGFR2 locus.
In a specific embodiment of the invention, the non-human animal VEGFR2 locus is inserted or substituted with a nucleotide sequence comprising a sequence encoding a humanized VEGFR2 protein.
In a specific embodiment of the invention, the nucleotide sequence comprising the humanized VEGFR2 gene is inserted or substituted at the non-human animal VEGFR2 locus.
Preferably, the human or humanized VEGFR2 gene is regulated in a non-human animal by endogenous regulatory elements.
Preferably, the non-human animal is homozygous or heterozygous.
Preferably, the non-human animal comprises a humanized VEGFR2 gene on at least one chromosome in its genome.
Preferably, at least one cell in the non-human animal expresses a human or humanized VEGFR2 protein.
Preferably, the non-human animal is constructed using gene editing techniques, including gene targeting in embryonic stem cells, clustered regularly interspaced short palindromic repeats (CRISPR/Cas9) technology, zinc finger nuclease (ZFN) technology, transcription activator-like effector nuclease (TALEN) technology, homing endonucleases (meganucleases), or other molecular biology techniques.
Preferably, the targeting vector described above is used for the construction of non-human animals.
In a specific embodiment of the invention, the construction method comprises introducing the targeting vector into a non-human animal cell, culturing the cell (preferably an embryonic stem cell), transplanting the cultured cell into an oviduct of a female non-human animal, allowing the female non-human animal to develop, and identifying and screening the non-human animal humanized with the VEGFR2 gene.
The construction method further comprises mating the VEGFR2 gene humanized non-human animal with another genetically modified non-human animal, performing in vitro fertilization, or directly performing gene editing, and screening to obtain a multi-gene modified non-human animal. Preferably, the other gene is one of, or a combination of two or more of, CD3, PD-L1, PD-1, CD47, CD27, CD28, SIRPA, GITR and TIGIT.
The non-human animal according to any of the above aspects may be selected from any non-human animal such as rodents, pigs, rabbits, monkeys, etc. which can be genetically engineered to humanise a gene.
Preferably, the non-human animal is a non-human mammal. Further preferably, the non-human mammal is a rodent. Still more preferably, the rodent is a rat or a mouse.
Preferably, the non-human animal is an immunodeficient non-human mammal. Further preferably, the immunodeficient non-human mammal is an immunodeficient rodent, an immunodeficient pig, an immunodeficient rabbit or an immunodeficient monkey. Still further preferably, the immunodeficient rodent is an immunodeficient mouse or rat. Most preferably, the immunodeficient mouse is a NOD-Prkdcscid IL-2rγnull mouse, a NOD-Rag1-/--IL2rg-/- (NRG) mouse, a Rag2-/--IL2rg-/- (RG) mouse, a NOD/SCID mouse or a nude mouse.
The eighth aspect of the present invention provides a method for constructing a polygene-modified non-human animal, comprising the steps of:
(I) providing the above non-human animal or the non-human animal obtained by the above construction method;
(II) mating the non-human animal provided in step (I) with another genetically modified non-human animal, performing in vitro fertilization, or directly performing gene editing, and screening to obtain the multi-gene modified non-human animal.
Preferably, the other genetically modified non-human animal includes, but is not limited to, non-human animals modified by a combination of one or more of the genes CD3, PD-L1, PD-1, CD47, CD27, CD28, SIRPA, GITR or TIGIT.
Preferably, the polygenic modified non-human animal is a two-gene humanized non-human animal, a three-gene humanized non-human animal, a four-gene humanized non-human animal, a five-gene humanized non-human animal, a six-gene humanized non-human animal, a seven-gene humanized non-human animal, an eight-gene humanized non-human animal or a nine-gene humanized non-human animal.
Preferably, each of the plurality of genes humanized in the genome of the polygenic modified non-human animal may be homozygous or heterozygous.
In a ninth aspect of the present invention, there is provided a non-human animal or its progeny obtained by the above construction method.
In a tenth aspect of the present invention, there is provided a tumor-bearing or inflammation model of an animal, wherein the tumor-bearing or inflammation model is derived from the above-mentioned non-human animal, the non-human animal obtained by the above-mentioned construction method, or the above-mentioned non-human animal or its progeny.
In an eleventh aspect of the present invention, there is provided a method for constructing a tumor-bearing or inflammation model in an animal, comprising the above-described method for constructing a non-human animal, a non-human animal or a progeny thereof, or a polygene-modified non-human animal.
In a twelfth aspect, the present invention provides use of the above non-human animal, the non-human animal obtained by the above construction method, the above non-human animal or its progeny, or the polygene-modified non-human animal constructed as above, in preparing a tumor-bearing or inflammation model of an animal.
In a thirteenth aspect of the present invention, there is provided a cell, a tissue or an organ, wherein the cell, the tissue or the organ expresses the above-mentioned humanized VEGFR2 protein, or a genome of the cell, the tissue or the organ comprises the above-mentioned humanized VEGFR2 gene, or the cell, the tissue or the organ is derived from the above-mentioned non-human animal, the non-human animal obtained by the above-mentioned construction method, the above-mentioned non-human animal or a progeny thereof, or the above-mentioned tumor-bearing or inflammation model. Preferably, the cell, tissue or organ is incapable of developing into an individual.
In a fourteenth aspect of the present invention, there is provided a tumor tissue after tumor bearing, wherein the tumor tissue is derived from the above non-human animal, the non-human animal obtained by the above construction method, the above non-human animal or its progeny, or the above tumor bearing or inflammation model.
In a fifteenth aspect of the present invention, there is provided a cell humanized of the VEGFR2 gene, said cell expressing a human or humanized VEGFR2 protein; alternatively, the genome of the cell comprises a portion of the human VEGFR2 gene. Preferably, the cell expresses the above-described humanized VEGFR2 protein.
Preferably, the cell comprises a humanized VEGFR2 gene as described above.
In a sixteenth aspect, the present invention provides use of the above humanized VEGFR2 protein, the above humanized VEGFR2 gene, the above non-human animal, the non-human animal obtained by the above construction method or its progeny, the above tumor-bearing or inflammation model, the above cell, tissue or organ, the above tumor tissue after tumor bearing, or the above cell, wherein the use comprises:
A) use in the development of products involving the immune processes of human cells, the product preferably being an antibody;
B) use as a model system for pharmacological, immunological, microbiological or medical research;
C) use in the production and application of animal experimental disease models for etiological research and/or for the development of diagnostic strategies and/or therapeutic strategies;
D) use in screening, verifying, evaluating or studying VEGFR2 pathway function, preferably the signaling mechanism of the human VEGFR2 pathway; or
E) use in screening and evaluating human drugs and in drug efficacy research, the drug preferably being an antibody or an immune-related drug.
Preferably, the application is an application related to VEGFR2 gene or protein.
Preferably, said use is a non-disease diagnostic and therapeutic method.
In a seventeenth aspect of the present invention, there is provided a method of screening for a modulator specific for human VEGFR2, the method comprising administering the modulator to an individual implanted with tumor cells, and detecting tumor suppression; wherein the individual is selected from the group consisting of the above non-human animal, the non-human animal obtained by the above construction method, the above non-human animal or a progeny thereof, or the above tumor-bearing or inflammation model.
Preferably, the modulator is selected from CAR-T and drugs. Further preferably, the drug is an antibody.
Preferably, the modulator is a monoclonal antibody or a bispecific antibody or a combination of two or more drugs.
Preferably, the detection comprises determining the size and/or proliferation rate of the tumor cells.
Preferably, the detection method comprises vernier caliper measurement, flow cytometry detection and/or animal in vivo imaging detection.
Preferably, the detecting comprises assessing the weight, fat mass, activation pathways, neuroprotective activity or metabolic changes in the individual, including changes in food consumption or water consumption.
Preferably, the tumor cell is derived from a human or non-human animal.
Preferably, the method of screening for a modulator specific for human VEGFR2 is not a therapeutic method. The method is used to screen or evaluate drugs: the efficacy of candidate drugs is detected and compared to determine which candidates can and cannot be used as drugs, or to compare the sensitivity of the response to different drugs. A therapeutic effect is therefore not a necessary outcome but only a possibility.
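As an illustration of how vernier caliper measurements might be turned into a tumor suppression readout, the sketch below uses the widely applied ellipsoid approximation V = 0.5 × length × width² and a simple ratio of group means. Neither the formula nor the function names are specified in this document; they are assumptions made for the example, and the numbers are invented toy values rather than experimental data.

```python
from statistics import mean
from typing import Sequence


def tumor_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Ellipsoid approximation (assumed convention): V = 0.5 * L * W^2."""
    if length_mm <= 0 or width_mm <= 0:
        raise ValueError("caliper readings must be positive")
    return 0.5 * length_mm * width_mm ** 2


def relative_inhibition_percent(treated: Sequence[float],
                                control: Sequence[float]) -> float:
    """Percent reduction of mean treated volume relative to mean control volume."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * (1.0 - mean(treated) / control_mean)


# Toy caliper readings (length, width) in mm; purely illustrative.
control_volumes = [tumor_volume_mm3(10.0, 8.0), tumor_volume_mm3(12.0, 10.0)]
treated_volumes = [tumor_volume_mm3(10.0, 4.0), tumor_volume_mm3(10.0, 6.0)]

inhibition = relative_inhibition_percent(treated_volumes, control_volumes)
print(f"relative inhibition: {inhibition:.1f}%")
```

Other volume formulas and inhibition definitions (for example, ones corrected for baseline volume) are also in common use; the choice should follow the study protocol.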
In the eighteenth aspect of the present invention, an evaluation method of an intervention program is provided, the evaluation method comprises implanting tumor cells into an individual, applying the intervention program to the individual implanted with the tumor cells, and detecting and evaluating a tumor inhibition effect of the individual after applying the intervention program; wherein the individual is selected from the group consisting of the above-mentioned non-human animal, the non-human animal obtained by the above-mentioned construction method, the above-mentioned non-human animal or a progeny thereof, or the above-mentioned tumor-bearing or inflammation model.
Preferably, the intervention regimen is selected from CAR-T and drug therapy. Further preferably, the drug is an antigen-binding protein. The antigen-binding protein is preferably an antibody.
Preferably, the tumor cell is derived from a human or non-human animal.
Preferably, the method of assessing the intervention regimen is not a method of treatment. The evaluation method detects and evaluates the effect of the intervention regimen to determine whether it has a therapeutic effect; that is, a therapeutic effect is not a necessary outcome but only a possibility.
In a nineteenth aspect of the present invention, there is provided use of the above non-human animal, the non-human animal obtained by the above construction method, the above non-human animal or its progeny, or the above tumor-bearing or inflammation model in the preparation of a human VEGFR2-specific modulator.
In a twentieth aspect, the invention provides use of the above non-human animal, the non-human animal obtained by the above construction method, the above non-human animal or its progeny, or the above tumor-bearing or inflammation model in the preparation of a medicament for treating tumors, angiogenesis-related diseases, embryonic development or immune-related diseases.
The VEGFR2 gene humanized non-human animal can normally express human or humanized VEGFR2 protein in vivo, can be used for drug screening, drug effect evaluation, immunity-related diseases and tumor treatment aiming at human VEGFR2 pathway target sites, can accelerate the development process of new drugs, and can save time and cost.
The "angiogenesis-related diseases" described in the present invention include, but are not limited to, wound healing, diabetes, retinopathy, tumor infiltration and metastasis, and the like.
The "immune-related diseases" described in the present invention include, but are not limited to, allergy, asthma, myocarditis, nephritis, hepatitis, systemic lupus erythematosus, rheumatoid arthritis, scleroderma, hyperthyroidism, idiopathic thrombocytopenic purpura, autoimmune hemolytic anemia, ulcerative colitis, autoimmune liver disease, diabetes, pain, or neurological disorder, etc.
The "tumor" according to the present invention includes, but is not limited to, lymphoma, non-small cell lung cancer, cervical cancer, leukemia, ovarian cancer, nasopharyngeal cancer, breast cancer, endometrial cancer, colon cancer, rectal cancer, gastric cancer, bladder cancer, brain glioma, lung cancer, bronchial cancer, bone cancer, prostate cancer, pancreatic cancer, liver and bile duct cancer, esophageal cancer, kidney cancer, thyroid cancer, head and neck cancer, testicular cancer, glioblastoma, astrocytoma, melanoma, myelodysplastic syndrome, and sarcoma. Wherein the leukemia is selected from acute lymphocytic (lymphoblastic) leukemia, acute myelogenous leukemia, chronic lymphocytic leukemia, multiple myeloma, plasma cell leukemia, and chronic myelogenous leukemia; said lymphoma is selected from Hodgkin's lymphoma and non-Hodgkin's lymphoma, including B-cell lymphoma, diffuse large B-cell lymphoma, follicular lymphoma, mantle cell lymphoma, marginal zone B-cell lymphoma, T-cell lymphoma, and Waldenstrom's macroglobulinemia; the sarcoma is selected from osteosarcoma, Ewing's sarcoma, leiomyosarcoma, synovial sarcoma, soft tissue sarcoma, angiosarcoma, liposarcoma, fibrosarcoma, rhabdomyosarcoma, and chondrosarcoma. In one embodiment of the present invention, the tumor is non-small cell lung cancer, metastatic colorectal cancer, cervical cancer, ovarian cancer, nasopharyngeal carcinoma, gastric cancer, brain glioma.
In the present invention, "all or part" is to be understood as follows: "all" means the whole, and "part" means a portion of the whole or an individual element making up the whole.
The "humanized VEGFR2 protein" of the present invention comprises a portion derived from the human VEGFR2 protein and a portion of the non-human VEGFR2 protein.
Wherein the humanized VEGFR2 protein comprises 5 to 1356 contiguous or non-contiguous amino acids identical to the amino acid sequence of the human VEGFR2 protein, preferably 10 to 726 contiguous or non-contiguous amino acids, more preferably 5, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 726, 750, 800, 900, 1000, 1100, 1200, 1300 or 1356 contiguous or non-contiguous amino acids identical to the amino acid sequence of the human VEGFR2 protein.
The "humanized VEGFR2 gene" of the present invention includes a portion derived from the human VEGFR2 gene and a portion derived from the non-human VEGFR2 gene.
Wherein the humanized VEGFR2 gene comprises a contiguous or non-contiguous nucleotide sequence of 20 bp to 47000 bp identical to the nucleotide sequence of the human VEGFR2 gene, preferably 20 to 19273 bp or 20 to 2178 bp, more preferably 20, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2178, 2500, 3000, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10000, 19273, 20000, 30000, 40000 or 47000 bp of contiguous or non-contiguous nucleotide sequence identical to the nucleotide sequence of the human VEGFR2 gene.
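The figures 726 (amino acids) and 2178 (nucleotides) in the two definitions above appear to correspond to positions 26-751 of SEQ ID NO: 4 and to three coding nucleotides per residue; this correspondence is an inference from the numbers, not a statement made in the text. A minimal sketch of the arithmetic, with hypothetical helper names:

```python
def segment_length(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based position range."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return last - first + 1


def coding_length_bp(first: int, last: int) -> int:
    """Coding nucleotides for the range, at three nucleotides per residue."""
    return 3 * segment_length(first, last)


# Position ranges of SEQ ID NO: 4 listed in this document.
human_ranges = [(1, 764), (14, 764), (20, 764), (26, 764),
                (23, 760), (20, 751), (26, 751)]

for first, last in human_ranges:
    print(f"{first}-{last}: {segment_length(first, last)} aa, "
          f"{coding_length_bp(first, last)} bp coding")
```

By the same arithmetic, the replaced positions 26-749 of SEQ ID NO: 2 span 724 residues, two fewer than the human segment that replaces them.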
In the present invention, "exons xx to xxx" or "all of exons xx to xxx" includes the nucleotide sequences of the exons and of all introns between them. For example, "exons 2 to 15" includes the complete nucleotide sequences of exon 2, intron 2-3, exon 3, intron 3-4, exon 4, intron 4-5, exon 5, intron 5-6, exon 6, intron 6-7, exon 7, intron 7-8, exon 8, intron 8-9, exon 9, intron 9-10, exon 10, intron 10-11, exon 11, intron 11-12, exon 12, intron 12-13, exon 13, intron 13-14, exon 14, intron 14-15 and exon 15.
The "x-xx intron" described herein represents an intron between the x exon and the xx exon. For example, "intron 2-3" means an intron between exon 2 and exon 3.
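The naming convention defined above (an exon range includes every exon and every intervening intron, and "intron x-xx" lies between exon x and exon xx) can be sketched as a small helper. The function name is hypothetical and serves only to make the convention explicit.

```python
from typing import List


def expand_exon_range(first: int, last: int) -> List[str]:
    """List the exons and intervening introns covered by 'exons first to last'."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    parts: List[str] = []
    for n in range(first, last):
        parts.append(f"exon {n}")
        parts.append(f"intron {n}-{n + 1}")  # intron between exon n and exon n+1
    parts.append(f"exon {last}")
    return parts


region = expand_exon_range(2, 15)
print(len(region), "elements")
print(", ".join(region))
```

For "exons 2 to 15" this yields 14 exons and 13 introns, starting with exon 2 and ending with intron 14-15 followed by exon 15.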
The "locus" of the present invention refers, in a broad sense, to the position of a gene on a chromosome and, in a narrow sense, to a DNA fragment of a given gene; the gene may be a single gene or part of a single gene. For example, the "VEGFR2 locus" refers to a DNA fragment of any segment of exons 1 to 30 of the VEGFR2 gene. In one embodiment of the present invention, the VEGFR2 locus that is replaced may be a DNA fragment of any segment of exons 1 to 30 of the VEGFR2 gene. In one embodiment of the present invention, the VEGFR2 locus that is replaced may be a DNA fragment of any segment of exons 2 to 15 of the VEGFR2 gene.
The "nucleotide sequence" of the present invention includes a natural or modified ribonucleotide sequence and a deoxyribonucleotide sequence. Preferably DNA, cDNA, pre-mRNA, rRNA, hnRNA, miRNAs, scRNA, snRNA, siRNA, sgRNA, tRNA.
The term "three or more" as used herein includes, but is not limited to, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty, etc.
The term "three or more consecutive" as used herein includes, but is not limited to, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty, twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty, etc. consecutive items. For example, "three or more consecutive exons from exon 2 to exon 15" includes three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen or fourteen consecutive exons, and also includes the intron nucleotide sequences between them.
The term "treating" (or "treatment") as used herein means slowing, interrupting, arresting, controlling, stopping, alleviating, or reversing the progression or severity of one sign, symptom, disorder, condition, or disease, but does not necessarily refer to the complete elimination of all disease-related signs, symptoms, conditions, or disorders. The term "treatment" or the like refers to a therapeutic intervention that ameliorates the signs, symptoms, etc. of a disease or pathological state after the disease has begun to develop.
All combinations of items joined by "and/or" herein are to be understood as if each combination had been individually listed. For example, "A and/or B" includes "A", "A and B", and "B". As another example, "A, B and/or C" includes "A", "B", "C", "A and B", "A and C", "B and C", and "A and B and C".
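The "and/or" convention above amounts to listing every non-empty combination of the items. A minimal sketch using the Python standard library follows; the function name is a hypothetical illustration.

```python
from itertools import combinations
from typing import List, Sequence


def and_or_combinations(items: Sequence[str]) -> List[str]:
    """Return every non-empty combination of `items`, smallest combinations first."""
    labels: List[str] = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            labels.append(" and ".join(combo))
    return labels


two = and_or_combinations(["A", "B"])
three = and_or_combinations(["A", "B", "C"])
print(two)
print(three)
```

For n items this gives 2^n - 1 combinations, which is three for "A and/or B" and seven for "A, B and/or C", matching the examples in the text.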
The term "comprising" or "comprises" as used herein is open-ended, and when used to describe a sequence of a protein or nucleic acid, the protein or nucleic acid may be composed of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still possess the activity of the invention. Furthermore, it is clear to the skilled person that the methionine at the N-terminus of the polypeptide encoded by the start codon may be retained in certain practical cases (e.g.during expression in a particular expression system), but does not substantially affect the function of the polypeptide. Thus, in describing a particular polypeptide amino acid sequence in the specification and claims of this application, although it may not contain a methionine encoded by the start codon at the N-terminus, the sequence containing the methionine is also encompassed herein, and accordingly, the encoding nucleotide sequence may also contain the start codon; and vice versa.
The term "homology" as used herein refers to the fact that, in the context of using an amino acid sequence or a nucleotide sequence, a person skilled in the art can adjust the sequence to have (including but not limited to) 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 70%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% identity.
One skilled in the art can determine and compare sequence elements or degrees of identity to distinguish between additional mouse and human sequences.
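Percent identity thresholds such as the "at least 80%, 85%, 90%, 95% or at least 99%" used in this document are evaluated on aligned sequences. The sketch below is one simple way to compute identity from two pre-aligned strings of equal length, counting a column as a match only when both residues are the same non-gap character. The denominator convention (all columns that are not gap-in-both) is an assumption, since alignment tools differ on this point, and the sequences are toy strings rather than SEQ ID NO: 2 or 4.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns holding identical non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b and a != gap:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when identity reaches the stated minimum, e.g. 80.0 or 95.0."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences (assumptions): one substitution in ten columns.
identity = percent_identity("ACDEFGHIKL", "ACDQFGHIKL")
print(identity, meets_threshold("ACDEFGHIKL", "ACDQFGHIKL", 95.0))
```

In practice the alignment itself would come from a dedicated alignment program; this helper only scores an alignment that has already been produced.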
In one aspect, the non-human animal is a mammal. In one aspect, the non-human animal is a small mammal. In one embodiment, the genetically humanized non-human animal is a rodent. In one embodiment, the rodent is selected from a mouse, a rat, and a hamster. In one embodiment, the rodent is selected from the murine family. In one embodiment, the genetically modified animal is from a rodent family, for example a family that includes hamsters, New World rats and mice, or true rats and mice. In a particular embodiment, the genetically modified rodent is selected from a true mouse or rat (family Muridae), a gerbil, a spiny mouse, and a crested rat. In one embodiment, the genetically modified mouse is from a member of the murine family. In one embodiment, the animal is a rodent. In a particular embodiment, the rodent is selected from a mouse and a rat. In one embodiment, the non-human animal is a mouse.
In a particular embodiment, the non-human animal is a rodent that is a mouse of a strain selected from C57BL, C58, A/Br, CBA/Ca, CBA/J, CBA, BALB/c, A/He, A/J, A/WySN, AKR/A, AKR/J, AKR/N, TA1, TA2, RF, SWR, C3H, C57BR, SJL, C57L, DBA/2, KM, NIH, ICR, CFW, FACA, C57BL/A, C57BL/An, C57BL/GrFa, C57BL/KaLwN, C57BL/6, C57BL/6J, C57BL/6ByJ, C57BL/6NJ, C57BL/10, C57BL/10Sn, C57BL/10Cr and C57BL/Ola.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology. These techniques are explained in detail in the following documents, for example: Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al., U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the series Methods In Enzymology (J. Abelson and M. Simon, eds.-in-chief, Academic Press, Inc., New York), specifically Vols. 154 and 155 (Wu et al. eds.) and Vol. 185, "Gene Expression Technology" (D. Goeddel, ed.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); and Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).
The foregoing is merely a summary of aspects of the invention and is not, and should not be taken as, limiting the invention in any way.
All patents and publications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication was specifically and individually indicated to be incorporated herein by reference. Those skilled in the art will recognize that certain changes may be made to the invention without departing from the spirit or scope of the invention.
The following examples further illustrate the invention in detail and are not to be construed as limiting the scope of the invention or the particular methods described herein.
Drawings
Embodiments of the invention are described in detail below with reference to the attached drawing figures, wherein:
FIG. 1: schematic comparison of mouse VEGFR2 gene and human VEGFR2 locus (not to scale);
FIG. 2: schematic representation of humanization of mouse VEGFR2 gene (not to scale);
FIG. 3: VEGFR2 gene targeting strategy and targeting vector design schematic (not to scale);
FIG. 4: VEGFR2 recombinant cell Southern blot result, wherein WT is wild type control, 1-A04, 1-A05, 1-C08, 1-E02, 2-B02, 2-C06, 2-F02, 2-H06, 3-A02, 3-B12, 3-E05, 3-F07 are clone numbers;
FIG. 5: schematic representation of the humanized VEGFR2 mouse FRT recombination process (not to scale);
FIG. 6: VEGFR2 humanized mouse F1 genotype identification, where WT is the wild-type control, H2O is the water control, and PC is the positive control;
FIG. 7: partial alignment of the human and mouse VEGFR2 protein sequences, where Query is the amino acid sequence of human VEGFR2 protein and Sbjct is the amino acid sequence of mouse VEGFR2 protein;
FIG. 8: detection of VEGFR2 mRNA in embryonic lung tissue of C57BL/6 wild-type mice (+/+) and VEGFR2 gene humanized homozygous mice (H/H).
Detailed Description
The invention will be further described with reference to specific embodiments, and the advantages and features of the invention will become apparent as the description proceeds. These examples are illustrative only and do not limit the scope of the present invention in any way. It will be understood by those skilled in the art that various changes in form and details may be made without departing from the spirit and scope of the invention, and that such changes and modifications fall within the scope of the invention.
In each of the following examples, the equipment and materials were obtained from several companies as indicated below:
XmnI, ScaI and HindIII enzymes were purchased from NEB, Cat. Nos. R0194, R0122 and R0104, respectively;
Heraeus™ Fresco™ 21 Microcentrifuge was purchased from Thermo Fisher Scientific, model Fresco 21;
Mouse VEGFR2/Flk-1 Quantikine ELISA Kit was purchased from R&D, Cat. No. MVR200B;
Human VEGF Receptor 2 ELISA kit was purchased from Abcam, Cat. No. ab213476;
C57BL/6 mice and Flp tool mice were purchased from the National Rodent Laboratory Animal Seed Center of the National Institutes for Food and Drug Control (China);
BV480 Rat Anti-Mouse CD31 was purchased from BD Horizon™, Cat. No. 565629;
PE/Cyanine7 anti-mouse CD34 Antibody was purchased from Biolegend, Cat. No. 128617;
APC anti-mouse CD309 (VEGFR2, Flk-1) Antibody was purchased from Biolegend, Cat. No. 136405;
Human VEGFR2/KDR/Flk-1 PE-conjugated Antibody was purchased from R&D, Cat. No. FAB357P-025;
Zombie NIR™ Fixable Viability Kit was purchased from Biolegend, Cat. No. 423105;
Purified anti-mouse CD16/32 Antibody was purchased from Biolegend, Cat. No. 101302;
Trizol kit was purchased from Takara, Cat. No. 6110A.
Example 1 VEGFR2 gene humanized mice
In this example, a non-human animal (for example, a mouse) is modified so that it contains a nucleotide sequence encoding a humanized VEGFR2 protein, yielding a genetically modified non-human animal that expresses the humanized VEGFR2 protein. A schematic comparison of the mouse VEGFR2 gene (NCBI Gene ID: 16542, Primary source: MGI:96683, located at positions 75932827 to 75979072 of chromosome 5 NC_000071.6, based on transcript NM_010612.3 (SEQ ID NO: 1) and its encoded protein NP_034742.2 (SEQ ID NO: 2)) and the human VEGFR2 gene (NCBI Gene ID: 3791, Primary source: HGNC:6307, UniProt ID: P35968, located at positions 55078481 to 55125595 of chromosome 4 NC_000004.12, based on transcript NM_002253.3 (SEQ ID NO: 3) and its encoded protein NP_002244.1 (SEQ ID NO: 4)) is shown in FIG. 1.
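The relationship between the mouse transcript (SEQ ID NO: 1) and its encoded protein (SEQ ID NO: 2) can be spot-checked directly from the sequence listing. The sketch below is only an illustration: it parses two lines of SEQ ID NO: 1 as they are printed in the listing (nucleotides 241 to 360), takes the first ATG found in that fragment as the start codon (an assumption made for this check), and translates eleven codons with the standard genetic code. The helper names are hypothetical.

```python
# Standard genetic code, built from the conventional T-C-A-G codon ordering.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_listing_lines(lines):
    """Join sequence-listing lines, dropping spaces and trailing position numbers."""
    return "".join(tok for line in lines for tok in line.split() if not tok.isdigit())


def translate(dna, n_codons):
    """Translate the first n_codons codons of a lowercase DNA string."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, 3 * n_codons, 3))


# Two lines of SEQ ID NO: 1, copied as printed in the sequence listing.
listing_lines = [
    "gagtctgtgc ctgagaactg ggctctgtgc ccagcgcgag gtgcaggatg gagagcaagg 300",
    "cgctgctagc tgtcgctctg tggttctgcg tggagacccg agccgcctct gtgggtttgc 360",
]
fragment = parse_listing_lines(listing_lines)
start = fragment.find("atg")  # assumed start codon within this fragment
peptide = translate(fragment[start:], 11)
print(peptide)  # MESKALLAVAL
```

The translated peptide corresponds to the first eleven residues listed for SEQ ID NO: 2 (Met Glu Ser Lys Ala Leu Leu Ala Val Ala Leu).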
For the purposes of the present invention, a portion of the nucleotide sequence encoding the human VEGFR2 protein may be introduced at the endogenous mouse VEGFR2 locus so that the mouse expresses the humanized VEGFR2 protein. Specifically, gene editing technology can be used to replace the 18321 bp sequence running from part of exon 2 to part of exon 15 of the mouse VEGFR2 gene with the corresponding human DNA sequence, giving a humanized VEGFR2 gene sequence (shown schematically in FIG. 2) and thereby humanizing the mouse VEGFR2 gene.
In the schematic of the targeting strategy shown in FIG. 3, the targeting vector carries homology arm sequences upstream and downstream of the mouse VEGFR2 gene, together with an A fragment containing the human VEGFR2 sequence. The upstream homology arm sequence (5' homology arm, SEQ ID NO: 5) is identical to the nucleotide sequence from position 75974488 to 75978469 of NCBI accession No. NC_000071.6, and the downstream homology arm sequence (3' homology arm, SEQ ID NO: 6) is identical to the nucleotide sequence from position 75951590 to 75955601 of NCBI accession No. NC_000071.6. The A fragment contains a human VEGFR2 sequence (SEQ ID NO: 7) running from part of exon 2 to part of exon 15 of the human VEGFR2 gene; this sequence is identical to the nucleotide sequence from position 55101910 to 55121182 of NCBI accession No. NC_000004.12. The junction between the upstream end of the human VEGFR2 sequence in the A fragment and the mouse VEGFR2 gene is designed as 5'-ttggatccaactcttttctaatttgtttctaggtttgcctagtgtttctcttgatctgcccaggctcagcatacaaaaag-3' (SEQ ID NO: 8), wherein the last "t" in the sequence "tgcct" is the last mouse nucleotide and the "a" in the sequence "agtgt" is the first human nucleotide. The junction between the downstream end of the human VEGFR2 sequence and the mouse VEGFR2 gene is designed as 5'-ggcatgcagtgttcttggctgtgcaaaagtggaggcatttttcataatagaaggtcagtgtgatgtcataggctcatcag-3' (SEQ ID NO: 9), wherein the "t" immediately preceding the sequence "ttcat" is the last human nucleotide and the first "t" in the sequence "ttcat" is the first mouse nucleotide.
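The junction annotation for SEQ ID NO: 8 can be checked mechanically. The following is a minimal sketch, not part of the described method: the hypothetical helper `split_junction` locates the point where the quoted mouse motif ("tgcct") is immediately followed by the quoted human motif ("agtgt") and splits the junction sequence there.

```python
def split_junction(seq, left_tail, right_head):
    """Split seq where left_tail is immediately followed by right_head.

    Returns (left_part, right_part). Raises ValueError unless the
    combined motif occurs exactly once, so the split is unambiguous.
    """
    motif = left_tail + right_head
    if seq.count(motif) != 1:
        raise ValueError("junction motif must occur exactly once")
    cut = seq.index(motif) + len(left_tail)
    return seq[:cut], seq[cut:]


# SEQ ID NO: 8 as quoted in the description.
SEQ_ID_NO_8 = (
    "ttggatccaactcttttctaatttgtttctaggtttgcct"
    "agtgtttctcttgatctgcccaggctcagcatacaaaaag"
)

mouse_part, human_part = split_junction(SEQ_ID_NO_8, "tgcct", "agtgt")
print(len(mouse_part), mouse_part[-5:])  # 40 tgcct
print(len(human_part), human_part[:5])   # 40 agtgt
```

Under this reading, the 80-nucleotide junction sequence consists of 40 mouse nucleotides followed by 40 human nucleotides.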
The targeting vector also contains a resistance gene for positive clone screening, namely the neomycin phosphotransferase coding sequence Neo, flanked on both sides by two Frt site-specific recombination sites arranged in the same orientation, forming a Neo cassette. The junction between the 5' end of the Neo cassette and the mouse gene is designed as 5'-ctggggaaaaatagatttggtgttgggcattgctattcaa[fragment shown as an image in the source]TGATATCGAATTCCGAAGTTCCTATTCTCTAGAAA-3' (SEQ ID NO: 10), wherein the last "a" in the sequence "ttcaa" is the last mouse nucleotide and the first "a" of the image-rendered fragment is the first nucleotide of the Neo cassette. The junction between the 3' end of the Neo cassette and the mouse gene is designed as 5'-GAACTTCATCAGTCAGGTACATAATGGTGGATCCG[fragment shown as an image in the source]ctagtgacaaggatcaaaggttcaggattttccccagagc-3' (SEQ ID NO: 11), wherein the "C" in the image-rendered fragment is the last nucleotide of the Neo cassette and the "c" in the sequence "ctagt" is the first mouse nucleotide. In addition, a gene encoding a negative selection marker (the diphtheria toxin A subunit coding gene, DTA) was placed downstream of the 3' homology arm of the targeting vector. The mRNA sequence of the modified humanized mouse VEGFR2 is shown in SEQ ID NO: 12, and the expressed protein sequence is shown in SEQ ID NO: 13.
Furthermore, based on the amino acid sequence structures of the human VEGFR2 protein and the mouse VEGFR2 protein (see FIG. 7), the human-derived VEGFR2 sequence is not limited to the amino acid residues of SEQ ID NO: 4 encoded by the human sequence described above; it may also be taken from the extracellular region and/or the signal peptide, for example comprising positions 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4. The mRNA sequence of the humanized mouse VEGFR2 after modification is shown in SEQ ID NO: 12, and the expressed protein sequence is shown in SEQ ID NO: 13.
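As a consistency check on the targeting vector design, the lengths of the homology arms and of the human fragment can be computed from the genomic positions quoted in this example. The sketch below assumes the positions are 1-based and inclusive, which is the usual convention for NCBI sequence coordinates; the function and dictionary names are illustrative only.

```python
def span_length(start, end):
    """Length in bp of a 1-based, inclusive coordinate range."""
    return end - start + 1


# Coordinates as quoted in the description.
regions = {
    "5' homology arm (mouse chr5, SEQ ID NO: 5)": (75974488, 75978469),
    "3' homology arm (mouse chr5, SEQ ID NO: 6)": (75951590, 75955601),
    "human VEGFR2 fragment (chr4, SEQ ID NO: 7)": (55101910, 55121182),
}

lengths = {name: span_length(*coords) for name, coords in regions.items()}
for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
```

With these assumptions the arms come out at 3982 bp and 4012 bp, and the human fragment at 19273 bp.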
The targeting vector can be constructed by conventional methods, such as restriction digestion and ligation. The constructed targeting vector was first verified by restriction digestion and then sent to a sequencing company for sequence verification. The sequence-verified targeting vector was electroporated into embryonic stem cells of C57BL/6 mice, the resulting cells were screened using the positive clone selection marker gene, and integration of the exogenous gene was confirmed by PCR and Southern blot to identify correct positive clones. Clones verified as positive by PCR were examined by Southern blot (cellular DNA was digested with XmnI, ScaI or HindIII and hybridized with the 3 probes, respectively; the probes and target fragment lengths are shown in Table 1), and the results are shown in FIG. 4. The results show that, of the 12 clones verified as positive by PCR, sequencing confirmed that all except 3-F07 (11 clones) are positive clones without random insertion, namely 1-A04, 1-A05, 1-C08, 1-E02, 2-B02, 2-C06, 2-F02, 2-H06, 3-A02, 3-B12 and 3-E05.
Table 1: Specific probes and target fragment lengths
Restriction enzyme    Probe        Wild-type fragment size    Recombinant sequence fragment size
XmnI                  5' Probe     7.0 kb                     11.6 kb
ScaI                  3' Probe     7.4 kb                     12.6 kb
HindIII               Neo Probe    -                          5.9 kb
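The fragment sizes in Table 1 can be turned into a simple lookup for interpreting Southern blot lanes. The following is a hedged sketch rather than the procedure actually used: it assumes that the single HindIII/Neo Probe value (5.9 kb) belongs to the recombinant allele (the wild-type genome carries no Neo sequence), that a correctly targeted ES cell clone is heterozygous, and that a band-size tolerance of 0.3 kb is acceptable. All function names and the tolerance are hypothetical.

```python
# Expected fragment sizes in kb from Table 1; None means no band expected.
EXPECTED_KB = {
    ("XmnI", "5' Probe"): {"wild_type": 7.0, "recombinant": 11.6},
    ("ScaI", "3' Probe"): {"wild_type": 7.4, "recombinant": 12.6},
    # Assumption: the Neo probe detects only the recombinant allele.
    ("HindIII", "Neo Probe"): {"wild_type": None, "recombinant": 5.9},
}


def expected_bands(enzyme, probe, genotype):
    """Sorted band sizes (kb) expected for 'wild_type' or 'heterozygous' DNA."""
    sizes = EXPECTED_KB[(enzyme, probe)]
    alleles = ["wild_type"] if genotype == "wild_type" else ["wild_type", "recombinant"]
    return sorted(sizes[a] for a in alleles if sizes[a] is not None)


def matches(observed_kb, expected_kb, tolerance=0.3):
    """True if observed bands pair one-to-one with expected bands within tolerance."""
    if len(observed_kb) != len(expected_kb):
        return False
    return all(abs(o - e) <= tolerance for o, e in zip(sorted(observed_kb), expected_kb))


# Hypothetical band estimates for one lane (not data from FIG. 4).
lane = [7.1, 11.5]
print(matches(lane, expected_bands("XmnI", "5' Probe", "heterozygous")))  # True
```

In this scheme an extra band with the Neo Probe would fail the comparison, which is one way a random insertion could show up.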
The PCR assay uses the following primers:
F1:5’-TTGCTCTCAGATGCGACTTGCC-3’(SEQ ID NO:14),
R1:5’-GTGTATGCATGTTGGCAGAGAATAG-3’(SEQ ID NO:15);
F2:5’-GCTCGACTAGAGCTTGCGGA-3’(SEQ ID NO:16),
R2:5’-TTCACTGATAGCAAACGCCTCTCC-3’(SEQ ID NO:17);
The Southern blot detection uses the following probe primers:
5' Probe:
5’Probe-F:5’-CACTCTAGCAGCCACTGGAGAAGGA-3’(SEQ ID NO:18),
5’Probe-R:5’-TGTAGCCAGTAATGGGCTCTGAGAC-3’(SEQ ID NO:19);
3' Probe:
3’Probe-F:5’-GACTTACAGCTCTCTGTTAGCATCTG-3’(SEQ ID NO:20),
3’Probe-R:5’-GCAGAATTCCACAATCACCATGAGA-3’(SEQ ID NO:21);
Neo Probe:
Neo Probe-F:5’-GGATCGGCCATTGAACAAGAT-3’(SEQ ID NO:22),
Neo Probe-R:5’-CAGAAGAACTCGTCAAGAAGGC-3’(SEQ ID NO:23).
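Basic properties of the PCR primers listed above (length, GC content, and a rough melting temperature) can be tabulated with a few lines of code. This is only an illustrative sketch: the Wallace rule used here, Tm = 2(A+T) + 4(G+C), is a rule of thumb intended for short oligonucleotides and is not stated in the description; actual annealing conditions would need to be determined experimentally.

```python
# PCR primers as listed in the description (SEQ ID NOs: 14-17).
PRIMERS = {
    "F1": "TTGCTCTCAGATGCGACTTGCC",
    "R1": "GTGTATGCATGTTGGCAGAGAATAG",
    "F2": "GCTCGACTAGAGCTTGCGGA",
    "R2": "TTCACTGATAGCAAACGCCTCTCC",
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(seq.upper().count(base) for base in "GC")


def wallace_tm(seq):
    """Rough Tm estimate in degrees C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


for name, seq in PRIMERS.items():
    gc_percent = 100 * gc_count(seq) / len(seq)
    print(f"{name}: {len(seq)} nt, {gc_percent:.1f}% GC, ~{wallace_tm(seq)} C")
```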
The correctly identified positive clone cells (from black mice) were introduced into isolated blastocysts (from white mice) according to techniques known in the art; the resulting chimeric blastocysts were transferred to culture medium for short-term culture and then transplanted into the oviducts of recipient female mice (white mice) to produce F0 generation chimeric mice (black and white patched coat). F0 generation chimeric mice were backcrossed with wild-type mice to obtain F1 generation mice, and F1 generation heterozygous mice were mated with each other to obtain F2 generation homozygous mice. Alternatively, positive mice may be mated with Flp tool mice to remove the positive clone selection marker gene (see FIG. 5 for a schematic of the process) and then mated with each other to obtain humanized VEGFR2 gene homozygous mice. The somatic genotypes of the progeny mice were identified by PCR (primers shown in Table 2). Identification results for exemplary F1 generation mice (with the Neo marker gene removed) are shown in FIG. 6, in which the 2 mice numbered F1-01 and F1-02 are both positive heterozygous mice. This indicates that the method can be used to construct VEGFR2 gene humanized mice that can be stably passaged and carry no random insertions.
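The expected genotype ratios when F1 heterozygotes are intercrossed follow from a single-locus Punnett square. The sketch below is a generic illustration of that calculation, assuming simple Mendelian segregation; the allele symbols "H" (humanized) and "+" (wild type) follow the labels used in FIG. 8, and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Offspring genotype probabilities for one locus.

    Each parent is a 2-character string of alleles, e.g. "H+" for a
    heterozygote with one humanized (H) and one wild-type (+) allele.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


f2 = cross("H+", "H+")
for genotype, probability in sorted(f2.items()):
    print(genotype, probability)
```

For a heterozygote intercross this gives 1/4 homozygous humanized (HH), 1/2 heterozygous (+H) and 1/4 wild type (++).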
Table 2: primer name and specific sequence
[Table 2 is provided as an image in the original publication.]
Expression of the humanized VEGFR2 protein in VEGFR2 gene humanized mice can be detected by conventional methods, such as ELISA. VEGFR2 protein expression in 11-week-old male C57BL/6 wild-type mice and VEGFR2 gene humanized homozygous mice was measured following the instructions of the Mouse VEGFR2/Flk-1 Quantikine ELISA Kit and the Human VEGF Receptor 2 ELISA kit. The results show that murine VEGFR2 protein, but not humanized VEGFR2 protein, was detected in C57BL/6 wild-type mice, whereas humanized VEGFR2 protein, but not murine VEGFR2 protein, was detected in VEGFR2 gene humanized homozygous mice.
The expression of the humanized VEGFR2 protein in VEGFR2 gene humanized mice was also confirmed by flow cytometry. Specifically, one 11-week-old female C57BL/6 wild-type mouse and one 11-week-old female VEGFR2 gene humanized homozygous mouse, each at 12-14 days of gestation, were taken. Embryonic lung tissue was collected and stained with the anti-mouse CD31 antibody BV480 Rat Anti-Mouse CD31, the anti-mouse CD34 antibody PE/Cyanine7 anti-mouse CD34 Antibody, the anti-mouse VEGFR2 antibody APC anti-mouse CD309 (VEGFR2, Flk-1) Antibody, the anti-human VEGFR2 antibody Human VEGFR2/KDR/Flk-1 PE-conjugated Antibody, the Zombie NIR™ Fixable Viability Kit, and the Purified anti-mouse CD16/32 Antibody, and then analyzed by flow cytometry to detect VEGFR2 protein expression. The results are shown in Table 3.
Table 3: Flow cytometry detection results for VEGFR2 protein
[Table 3 is provided as an image in the original publication.]
The results (Table 3) show that only murine VEGFR2 protein, and no human or humanized VEGFR2 protein, was detected in embryonic lung tissue from C57BL/6 wild-type mice, and that humanized VEGFR2 protein was detected only in embryonic lung tissue from VEGFR2 humanized homozygous mice (hVEGFR2). However, mouse VEGFR2 protein was detected in about 15% of the endothelial cells (CD31+ or CD34+ cells) of hVEGFR2 mice, and it was presumed that the anti-human VEGFR2 antibody Human VEGFR2/KDR/Flk-1 PE-conjugated Antibody may be a human-mouse cross-reactive antibody.
To further verify VEGFR2 expression in VEGFR2 gene humanized homozygous mice, expression of human VEGFR2 mRNA can be confirmed by a conventional detection method, such as RT-PCR. Specifically, one 11-week-old female C57BL/6 wild-type mouse and one 11-week-old female VEGFR2 gene humanized homozygous mouse prepared in this example, each at 12-14 days of gestation, were sacrificed by cervical dislocation. Embryonic lung cells were collected, cellular RNA was extracted according to the Trizol kit instructions, and the RNA was reverse transcribed into cDNA for RT-PCR (primers shown in Table 4). The results are shown in FIG. 8: only murine VEGFR2 mRNA, and no human VEGFR2 mRNA, was detected in C57BL/6 wild-type mice, while human VEGFR2 mRNA was detected only in VEGFR2 humanized homozygous mice. Taken together with the flow cytometry results, this indicates that only the humanized VEGFR2 protein is detectable in VEGFR2 gene humanized homozygous mice.
Table 4: name and specific sequence of RT-PCR primer
[Table 4 is provided as an image in the original publication.]
Example 2 drug efficacy verification
Ramucirumab (full-length heavy chain (HC) sequence shown in SEQ ID NO: 37 and full-length light chain (LC) sequence shown in SEQ ID NO: 38) is a humanized monoclonal antibody that binds VEGFR2 and was developed by Eli Lilly.
A tumor model constructed with the VEGFR2 gene humanized mice prepared by this method can be used to test the efficacy of drugs targeting human VEGFR2. Specifically, VEGFR2 gene humanized homozygous mice prepared in Example 1 were inoculated subcutaneously with mouse colon cancer MC38 cells (5 × 10^5 cells). Once the tumor volume reached about 100 mm³, the mice were divided according to tumor volume into a control group injected with PBS and treatment groups given Ramucirumab (n = 6 per group). Administration was by intraperitoneal (i.p.) injection, starting on the day of grouping, 2 times per week for a total of 6 doses. Tumor volume was measured 2 times per week, and a mouse was euthanized when its tumor volume reached 3000 mm³ after inoculation. The specific grouping and dosing are shown in Table 5.
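Tumor volume monitoring against the 3000 mm³ endpoint can be expressed compactly in code. The description does not say how tumor volume was calculated, so the sketch below assumes the commonly used caliper approximation V = 0.5 × length × width²; both the formula and the example measurements are assumptions for illustration, not data from this study.

```python
ENDPOINT_MM3 = 3000  # euthanasia threshold stated in the description


def tumor_volume_mm3(length_mm, width_mm):
    """Approximate tumor volume from caliper readings: 0.5 * L * W^2.

    Assumption: the description does not state which formula was used.
    """
    return 0.5 * length_mm * width_mm ** 2


def reached_endpoint(volume_mm3, limit=ENDPOINT_MM3):
    """True once the tumor volume is at or above the euthanasia threshold."""
    return volume_mm3 >= limit


# Hypothetical caliper readings in mm.
for length, width in [(10.0, 8.0), (20.0, 18.0)]:
    volume = tumor_volume_mm3(length, width)
    print(length, width, volume, reached_endpoint(volume))
```

With these hypothetical readings, a 10 × 8 mm tumor gives 320 mm³ (below the endpoint) and a 20 × 18 mm tumor gives 3240 mm³ (endpoint reached).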
Table 5: grouping and administration of drugs
[Table 5 is provided as an image in the original publication.]
The results show that the animals in every group remained in good health during the experiment; body weight in all treatment groups (G2 and G3) increased and showed no obvious difference from the control group (G1). In terms of tumor volume, tumors in control mice grew continuously throughout the experiment, whereas tumor volumes in all treatment group mice were reduced and/or disappeared to varying degrees compared with the control. This indicates that the anti-human VEGFR2 antibody Ramucirumab, at different concentrations, has a tumor-inhibiting effect, produces no obvious toxicity in the animals, and has good safety. VEGFR2 humanized mice can therefore be used for screening human VEGFR2-targeted drugs (for example, antibody drugs) and for in vivo efficacy testing.
Example 3 preparation of double humanized or multiple humanized mice
The VEGFR2 mouse prepared by this method can also be used to prepare double-gene or multi-gene humanized mouse models. For example, in Example 1, the embryonic stem cells used for blastocyst microinjection may be taken from mice carrying other gene modifications such as CD3, PD-L1, PD-1, CD47, CD27, CD28, SIRPA, GITR or TIGIT; alternatively, embryonic stem cells may be isolated from humanized VEGFR2 mice and further modified by gene recombination targeting, yielding a double-gene or multi-gene modified mouse model carrying VEGFR2 and other gene modifications. Homozygous or heterozygous VEGFR2 mice obtained by this method can also be mated with other genetically modified homozygous or heterozygous mice and the offspring screened; according to Mendel's laws, there is a certain probability of obtaining double-gene or multi-gene modified heterozygous mice carrying humanized VEGFR2 and the other gene modifications, and these heterozygotes can be mated with each other to obtain double-gene or multi-gene modified homozygotes. These double-gene or multi-gene modified mice can be used for in vivo efficacy verification of modulators targeting human VEGFR2 and the other genes.
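The "certain probability" of recovering a given multi-gene genotype can be estimated by multiplying per-locus Mendelian probabilities. The sketch below is a simplified illustration that assumes every locus is inherited from a heterozygote × heterozygote cross and that the loci assort independently (that is, they are unlinked); neither assumption is stated in the description, and the names used are hypothetical.

```python
from fractions import Fraction

# Per-locus offspring probabilities for a heterozygote x heterozygote cross.
HET_X_HET = {
    "hom": Fraction(1, 4),  # homozygous for the modified allele
    "het": Fraction(1, 2),  # heterozygous
    "wt": Fraction(1, 4),   # homozygous wild type
}


def joint_probability(states):
    """Probability that one offspring has the given state at every locus.

    Assumes independent assortment and a het x het cross at each locus.
    """
    probability = Fraction(1)
    for state in states:
        probability *= HET_X_HET[state]
    return probability


# Double-gene case, e.g. humanized VEGFR2 plus one other modified gene.
print(joint_probability(["hom", "hom"]))         # 1/16
print(joint_probability(["het", "het"]))         # 1/4
print(joint_probability(["hom", "hom", "hom"]))  # 1/64
```

Under these assumptions roughly one pup in sixteen from a double-heterozygote intercross would be homozygous at both loci, which gives a sense of how many offspring may need to be screened.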
The preferred embodiments of the present invention have been described in detail above. However, the present invention is not limited to the specific details of these embodiments; various simple modifications may be made to the technical solution of the present invention within its technical concept, and these simple modifications all fall within the protective scope of the present invention. It should be noted that the various technical features described in the above embodiments can be combined in any suitable manner provided there is no contradiction; to avoid unnecessary repetition, the possible combinations are not described separately.
In addition, any combination of the various embodiments of the present invention is also possible, and the same should be considered as the disclosure of the present invention as long as it does not depart from the spirit of the present invention.
Sequence listing
<110> Biocytogen Pharmaceuticals (Beijing) Co., Ltd.
<120> VEGFR2 gene humanized non-human animal and construction method and application thereof
<130> BHVEGFR2.002CN1-P0102020040247Z
<150> 202011564564X
<151> 2020-12-25
<160> 38
<170> SIPOSequenceListing 1.0
<210> 1
<211> 5921
<212> DNA
<213> Mus musculus
<400> 1
gagtcctcag gaccccaaga gagtaagctg tgtttcctta gatcgcgcgg accgctaccc 60
ggcaggactg aaagcccaga ctgtgtcccg cagccgggat aacctggctg acccgattcc 120
gcggacaccg ctgcagccgc ggctggagcc agggcgccgg tgccccgcgc tctccccggt 180
cttgcgctgc gggggcgcat accgcctctg tgacttcttt gcgggccagg gacggagaag 240
gagtctgtgc ctgagaactg ggctctgtgc ccagcgcgag gtgcaggatg gagagcaagg 300
cgctgctagc tgtcgctctg tggttctgcg tggagacccg agccgcctct gtgggtttgc 360
ctggcgattt tctccatccc cccaagctca gcacacagaa agacatactg acaattttgg 420
caaatacaac ccttcagatt acttgcaggg gacagcggga cctggactgg ctttggccca 480
atgctcagcg tgattctgag gaaagggtat tggtgactga atgcggcggt ggtgacagta 540
tcttctgcaa aacactcacc attcccaggg tggttggaaa tgatactgga gcctacaagt 600
gctcgtaccg ggacgtcgac atagcctcca ctgtttatgt ctatgttcga gattacagat 660
caccattcat cgcctctgtc agtgaccagc atggcatcgt gtacatcacc gagaacaaga 720
acaaaactgt ggtgatcccc tgccgagggt cgatttcaaa cctcaatgtg tctctttgcg 780
ctaggtatcc agaaaagaga tttgttccgg atggaaacag aatttcctgg gacagcgaga 840
taggctttac tctccccagt tacatgatca gctatgccgg catggtcttc tgtgaggcaa 900
agatcaatga tgaaacctat cagtctatca tgtacatagt tgtggttgta ggatatagga 960
tttatgatgt gattctgagc cccccgcatg aaattgagct atctgccgga gaaaaacttg 1020
tcttaaattg tacagcgaga acagagctca atgtggggct tgatttcacc tggcactctc 1080
caccttcaaa gtctcatcat aagaagattg taaaccggga tgtgaaaccc tttcctggga 1140
ctgtggcgaa gatgtttttg agcaccttga caatagaaag tgtgaccaag agtgaccaag 1200
gggaatacac ctgtgtagcg tccagtggac ggatgatcaa gagaaataga acatttgtcc 1260
gagttcacac aaagcctttt attgctttcg gtagtgggat gaaatctttg gtggaagcca 1320
cagtgggcag tcaagtccga atccctgtga agtatctcag ttacccagct cctgatatca 1380
aatggtacag aaatggaagg cccattgagt ccaactacac aatgattgtt ggcgatgaac 1440
tcaccatcat ggaagtgact gaaagagatg caggaaacta cacggtcatc ctcaccaacc 1500
ccatttcaat ggagaaacag agccacatgg tctctctggt tgtgaatgtc ccaccccaga 1560
tcggtgagaa agccttgatc tcgcctatgg attcctacca gtatgggacc atgcagacat 1620
tgacatgcac agtctacgcc aaccctcccc tgcaccacat ccagtggtac tggcagctag 1680
aagaagcctg ctcctacaga cccggccaaa caagcccgta tgcttgtaaa gaatggagac 1740
acgtggagga tttccagggg ggaaacaaga tcgaagtcac caaaaaccaa tatgccctga 1800
ttgaaggaaa aaacaaaact gtaagtacgc tggtcatcca agctgccaac gtgtcagcgt 1860
tgtacaaatg tgaagccatc aacaaagcgg gacgaggaga gagggtcatc tccttccatg 1920
tgatcagggg tcctgaaatt actgtgcaac ctgctgccca gccaactgag caggagagtg 1980
tgtccctgtt gtgcactgca gacagaaata cgtttgagaa cctcacgtgg tacaagcttg 2040
gctcacaggc aacatcggtc cacatgggcg aatcactcac accagtttgc aagaacttgg 2100
atgctctttg gaaactgaat ggcaccatgt tttctaacag cacaaatgac atcttgattg 2160
tggcatttca gaatgcctct ctgcaggacc aaggcgacta tgtttgctct gctcaagata 2220
agaagaccaa gaaaagacat tgcctggtca aacagctcat catcctagag cgcatggcac 2280
ccatgatcac cggaaatctg gagaatcaga caacaaccat tggcgagacc attgaagtga 2340
cttgcccagc atctggaaat cctaccccac acattacatg gttcaaagac aacgagaccc 2400
tggtagaaga ttcaggcatt gtactgagag atgggaaccg gaacctgact atccgcaggg 2460
tgaggaagga ggatggaggc ctctacacct gccaggcctg caatgtcctt ggctgtgcaa 2520
gagcggagac gctcttcata atagaaggtg cccaggaaaa gaccaacttg gaagtcatta 2580
tcctcgtcgg cactgcagtg attgccatgt tcttctggct ccttcttgtc attgtcctac 2640
ggaccgttaa gcgggccaat gaaggggaac tgaagacagg ctacttgtct attgtcatgg 2700
atccagatga attgcccttg gatgagcgct gtgaacgctt gccttatgat gccagcaagt 2760
gggaattccc cagggaccgg ctgaaactag gaaaacctct tggccgcggt gccttcggcc 2820
aagtgattga ggcagacgct tttggaattg acaagacagc gacttgcaaa acagtagccg 2880
tcaagatgtt gaaagaagga gcaacacaca gcgagcatcg agccctcatg tctgaactca 2940
agatcctcat ccacattggt caccatctca atgtggtgaa cctcctaggc gcctgcacca 3000
agccgggagg gcctctcatg gtgattgtgg aattctgcaa gtttggaaac ctatcaactt 3060
acttacgggg caagagaaat gaatttgttc cctataagag caaaggggca cgcttccgcc 3120
agggcaagga ctacgttggg gagctctccg tggatctgaa aagacgcttg gacagcatca 3180
ccagcagcca gagctctgcc agctcaggct ttgttgagga gaaatcgctc agtgatgtag 3240
aggaagaaga agcttctgaa gaactgtaca aggacttcct gaccttggag catctcatct 3300
gttacagctt ccaagtggct aagggcatgg agttcttggc atcaaggaag tgtatccaca 3360
gggacctggc agcacgaaac attctcctat cggagaagaa tgtggttaag atctgtgact 3420
tcggcttggc ccgggacatt tataaagacc cggattatgt cagaaaagga gatgcccgac 3480
tccctttgaa gtggatggcc ccggaaacca tttttgacag agtatacaca attcagagcg 3540
atgtgtggtc tttcggtgtg ttgctctggg aaatattttc cttaggtgcc tccccatacc 3600
ctggggtcaa gattgatgaa gaattttgta ggagattgaa agaaggaact agaatgcggg 3660
ctcctgacta cactacccca gaaatgtacc agaccatgct ggactgctgg catgaggacc 3720
ccaaccagag accctcgttt tcagagttgg tggagcattt gggaaacctc ctgcaagcaa 3780
atgcgcagca ggatggcaaa gactatattg ttcttccaat gtcagagaca ctgagcatgg 3840
aagaggattc tggactctcc ctgcctacct cacctgtttc ctgtatggag gaagaggaag 3900
tgtgcgaccc caaattccat tatgacaaca cagcaggaat cagtcattat ctccagaaca 3960
gtaagcgaaa gagccggcca gtgagtgtaa aaacatttga agatatccca ttggaggaac 4020
cagaagtaaa agtgatccca gatgacagcc agacagacag tgggatggtc cttgcatcag 4080
aagagctgaa aactctggaa gacaggaaca aattatctcc atcttttggt ggaatgatgc 4140
ccagtaaaag cagggagtct gtggcctcgg aaggctccaa ccagaccagt ggctaccagt 4200
ctgggtatca ctcagatgac acagacacca ccgtgtactc cagcgacgag gcaggacttt 4260
taaagatggt ggatgctgca gttcacgctg actcagggac cacactgcgc tcacctcctg 4320
tttaaatgga agtggtcctg tcccggctcc gcccccaact cctggaaatc acgagagagg 4380
tgctgcttag attttcaagt gttgttcttt ccaccacccg gaagtagcca catttgattt 4440
tcatttttgg aggagggacc tcagactgca aggagcttgt cctcagggca tttccagaga 4500
agatgcccat gacccaagaa tgtgttgact ctactctctt ttccattcat ttaaaagtcc 4560
tatataatgt gccctgctgt ggtctcacta ccagttaaag caaaagactt tcaaacagtg 4620
gctctgtcct ccaagaagtg gcaacggcac ctctgtgaaa ctggatcgaa tgggcaatgc 4680
tttgtgtgtt gaggatgggt gagatgtccc agggccgagt ctgtctacct tggaggcttt 4740
gtggaggatg cgggctatga gccaagtgtt aagtgtggga tgtggactgg gaggaaggaa 4800
ggcgcaagct cgctcggaga gcggttggag cctgcagatg cattgtgctg gctctggtgg 4860
aggtgggctt gtggcctgtc aggaaacgca aaggcggccg gcagggtttg gttttggaag 4920
gtttgcgtgc tcttcacagt cgggttacag gcgagttccc tgtggcgttt cctactccta 4980
atgagagttc cttccggact cttacgtgtc tcctggcctg gccccaggaa ggaaatgatg 5040
cagcttgctc cttcctcatc tctcaggctg tgccttaatt cagaacacca aaagagagga 5100
acgtcggcag aggctcctga cggggccgaa gaattgtgag aacagaacag aaactcaggg 5160
tttctgctgg gtggagaccc acgtggctgc cctggtggca gtgtctgagg gttctctgtc 5220
aagtggcggt aaaggctcag gctggtgttc ttcctctatc tccactcctg tcaggccccc 5280
aagtcctcag tattttagct ttgtggcttc ctgatggcag aaaaatctta attggttggt 5340
ttgctctcca gataatcact agccagattt cgaaattact ttttagccga ggttatgata 5400
acatctactg tatcctttag aattttaacc tataaaacta tgtctactgg tttctgcctg 5460
tgtgcttatg ttaaaaaaaa aaagaaagaa agaaactgtt cttttcattt ggtaccatag 5520
tgtgaagagc tgggagcaat gactgttaaa catgctatgg cacatctatt tatagtctgt 5580
tatgtagaac aaatgtaata tattaaaacg ttatattata tataatgaac tttgtactac 5640
ccaccttttg tatcagtatt atgtaccact agagagatta caaggctttc agcagccgct 5700
gttgttttgt taaagacttt gagaaactcg aaggaatcct ttcatggaat atgcagctat 5760
ataccctacc gtctctctca tctcaaacgg aggaggagga ggaggagtca ggtataatgt 5820
gagtgtgttc tacgtgtcct tgttctctgt tcttaggagg aatgatttca tcaaatgttt 5880
atatgcttta taaaccaata aacgtattct gagtaaagag a 5921
<210> 2
<211> 1345
<212> PRT
<213> Mus musculus
<400> 2
Met Glu Ser Lys Ala Leu Leu Ala Val Ala Leu Trp Phe Cys Val Glu
1 5 10 15
Thr Arg Ala Ala Ser Val Gly Leu Pro Gly Asp Phe Leu His Pro Pro
20 25 30
Lys Leu Ser Thr Gln Lys Asp Ile Leu Thr Ile Leu Ala Asn Thr Thr
35 40 45
Leu Gln Ile Thr Cys Arg Gly Gln Arg Asp Leu Asp Trp Leu Trp Pro
50 55 60
Asn Ala Gln Arg Asp Ser Glu Glu Arg Val Leu Val Thr Glu Cys Gly
65 70 75 80
Gly Gly Asp Ser Ile Phe Cys Lys Thr Leu Thr Ile Pro Arg Val Val
85 90 95
Gly Asn Asp Thr Gly Ala Tyr Lys Cys Ser Tyr Arg Asp Val Asp Ile
100 105 110
Ala Ser Thr Val Tyr Val Tyr Val Arg Asp Tyr Arg Ser Pro Phe Ile
115 120 125
Ala Ser Val Ser Asp Gln His Gly Ile Val Tyr Ile Thr Glu Asn Lys
130 135 140
Asn Lys Thr Val Val Ile Pro Cys Arg Gly Ser Ile Ser Asn Leu Asn
145 150 155 160
Val Ser Leu Cys Ala Arg Tyr Pro Glu Lys Arg Phe Val Pro Asp Gly
165 170 175
Asn Arg Ile Ser Trp Asp Ser Glu Ile Gly Phe Thr Leu Pro Ser Tyr
180 185 190
Met Ile Ser Tyr Ala Gly Met Val Phe Cys Glu Ala Lys Ile Asn Asp
195 200 205
Glu Thr Tyr Gln Ser Ile Met Tyr Ile Val Val Val Val Gly Tyr Arg
210 215 220
Ile Tyr Asp Val Ile Leu Ser Pro Pro His Glu Ile Glu Leu Ser Ala
225 230 235 240
Gly Glu Lys Leu Val Leu Asn Cys Thr Ala Arg Thr Glu Leu Asn Val
245 250 255
Gly Leu Asp Phe Thr Trp His Ser Pro Pro Ser Lys Ser His His Lys
260 265 270
Lys Ile Val Asn Arg Asp Val Lys Pro Phe Pro Gly Thr Val Ala Lys
275 280 285
Met Phe Leu Ser Thr Leu Thr Ile Glu Ser Val Thr Lys Ser Asp Gln
290 295 300
Gly Glu Tyr Thr Cys Val Ala Ser Ser Gly Arg Met Ile Lys Arg Asn
305 310 315 320
Arg Thr Phe Val Arg Val His Thr Lys Pro Phe Ile Ala Phe Gly Ser
325 330 335
Gly Met Lys Ser Leu Val Glu Ala Thr Val Gly Ser Gln Val Arg Ile
340 345 350
Pro Val Lys Tyr Leu Ser Tyr Pro Ala Pro Asp Ile Lys Trp Tyr Arg
355 360 365
Asn Gly Arg Pro Ile Glu Ser Asn Tyr Thr Met Ile Val Gly Asp Glu
370 375 380
Leu Thr Ile Met Glu Val Thr Glu Arg Asp Ala Gly Asn Tyr Thr Val
385 390 395 400
Ile Leu Thr Asn Pro Ile Ser Met Glu Lys Gln Ser His Met Val Ser
405 410 415
Leu Val Val Asn Val Pro Pro Gln Ile Gly Glu Lys Ala Leu Ile Ser
420 425 430
Pro Met Asp Ser Tyr Gln Tyr Gly Thr Met Gln Thr Leu Thr Cys Thr
435 440 445
Val Tyr Ala Asn Pro Pro Leu His His Ile Gln Trp Tyr Trp Gln Leu
450 455 460
Glu Glu Ala Cys Ser Tyr Arg Pro Gly Gln Thr Ser Pro Tyr Ala Cys
465 470 475 480
Lys Glu Trp Arg His Val Glu Asp Phe Gln Gly Gly Asn Lys Ile Glu
485 490 495
Val Thr Lys Asn Gln Tyr Ala Leu Ile Glu Gly Lys Asn Lys Thr Val
500 505 510
Ser Thr Leu Val Ile Gln Ala Ala Asn Val Ser Ala Leu Tyr Lys Cys
515 520 525
Glu Ala Ile Asn Lys Ala Gly Arg Gly Glu Arg Val Ile Ser Phe His
530 535 540
Val Ile Arg Gly Pro Glu Ile Thr Val Gln Pro Ala Ala Gln Pro Thr
545 550 555 560
Glu Gln Glu Ser Val Ser Leu Leu Cys Thr Ala Asp Arg Asn Thr Phe
565 570 575
Glu Asn Leu Thr Trp Tyr Lys Leu Gly Ser Gln Ala Thr Ser Val His
580 585 590
Met Gly Glu Ser Leu Thr Pro Val Cys Lys Asn Leu Asp Ala Leu Trp
595 600 605
Lys Leu Asn Gly Thr Met Phe Ser Asn Ser Thr Asn Asp Ile Leu Ile
610 615 620
Val Ala Phe Gln Asn Ala Ser Leu Gln Asp Gln Gly Asp Tyr Val Cys
625 630 635 640
Ser Ala Gln Asp Lys Lys Thr Lys Lys Arg His Cys Leu Val Lys Gln
645 650 655
Leu Ile Ile Leu Glu Arg Met Ala Pro Met Ile Thr Gly Asn Leu Glu
660 665 670
Asn Gln Thr Thr Thr Ile Gly Glu Thr Ile Glu Val Thr Cys Pro Ala
675 680 685
Ser Gly Asn Pro Thr Pro His Ile Thr Trp Phe Lys Asp Asn Glu Thr
690 695 700
Leu Val Glu Asp Ser Gly Ile Val Leu Arg Asp Gly Asn Arg Asn Leu
705 710 715 720
Thr Ile Arg Arg Val Arg Lys Glu Asp Gly Gly Leu Tyr Thr Cys Gln
725 730 735
Ala Cys Asn Val Leu Gly Cys Ala Arg Ala Glu Thr Leu Phe Ile Ile
740 745 750
Glu Gly Ala Gln Glu Lys Thr Asn Leu Glu Val Ile Ile Leu Val Gly
755 760 765
Thr Ala Val Ile Ala Met Phe Phe Trp Leu Leu Leu Val Ile Val Leu
770 775 780
Arg Thr Val Lys Arg Ala Asn Glu Gly Glu Leu Lys Thr Gly Tyr Leu
785 790 795 800
Ser Ile Val Met Asp Pro Asp Glu Leu Pro Leu Asp Glu Arg Cys Glu
805 810 815
Arg Leu Pro Tyr Asp Ala Ser Lys Trp Glu Phe Pro Arg Asp Arg Leu
820 825 830
Lys Leu Gly Lys Pro Leu Gly Arg Gly Ala Phe Gly Gln Val Ile Glu
835 840 845
Ala Asp Ala Phe Gly Ile Asp Lys Thr Ala Thr Cys Lys Thr Val Ala
850 855 860
Val Lys Met Leu Lys Glu Gly Ala Thr His Ser Glu His Arg Ala Leu
865 870 875 880
Met Ser Glu Leu Lys Ile Leu Ile His Ile Gly His His Leu Asn Val
885 890 895
Val Asn Leu Leu Gly Ala Cys Thr Lys Pro Gly Gly Pro Leu Met Val
900 905 910
Ile Val Glu Phe Cys Lys Phe Gly Asn Leu Ser Thr Tyr Leu Arg Gly
915 920 925
Lys Arg Asn Glu Phe Val Pro Tyr Lys Ser Lys Gly Ala Arg Phe Arg
930 935 940
Gln Gly Lys Asp Tyr Val Gly Glu Leu Ser Val Asp Leu Lys Arg Arg
945 950 955 960
Leu Asp Ser Ile Thr Ser Ser Gln Ser Ser Ala Ser Ser Gly Phe Val
965 970 975
Glu Glu Lys Ser Leu Ser Asp Val Glu Glu Glu Glu Ala Ser Glu Glu
980 985 990
Leu Tyr Lys Asp Phe Leu Thr Leu Glu His Leu Ile Cys Tyr Ser Phe
995 1000 1005
Gln Val Ala Lys Gly Met Glu Phe Leu Ala Ser Arg Lys Cys Ile His
1010 1015 1020
Arg Asp Leu Ala Ala Arg Asn Ile Leu Leu Ser Glu Lys Asn Val Val
1025 1030 1035 1040
Lys Ile Cys Asp Phe Gly Leu Ala Arg Asp Ile Tyr Lys Asp Pro Asp
1045 1050 1055
Tyr Val Arg Lys Gly Asp Ala Arg Leu Pro Leu Lys Trp Met Ala Pro
1060 1065 1070
Glu Thr Ile Phe Asp Arg Val Tyr Thr Ile Gln Ser Asp Val Trp Ser
1075 1080 1085
Phe Gly Val Leu Leu Trp Glu Ile Phe Ser Leu Gly Ala Ser Pro Tyr
1090 1095 1100
Pro Gly Val Lys Ile Asp Glu Glu Phe Cys Arg Arg Leu Lys Glu Gly
1105 1110 1115 1120
Thr Arg Met Arg Ala Pro Asp Tyr Thr Thr Pro Glu Met Tyr Gln Thr
1125 1130 1135
Met Leu Asp Cys Trp His Glu Asp Pro Asn Gln Arg Pro Ser Phe Ser
1140 1145 1150
Glu Leu Val Glu His Leu Gly Asn Leu Leu Gln Ala Asn Ala Gln Gln
1155 1160 1165
Asp Gly Lys Asp Tyr Ile Val Leu Pro Met Ser Glu Thr Leu Ser Met
1170 1175 1180
Glu Glu Asp Ser Gly Leu Ser Leu Pro Thr Ser Pro Val Ser Cys Met
1185 1190 1195 1200
Glu Glu Glu Glu Val Cys Asp Pro Lys Phe His Tyr Asp Asn Thr Ala
1205 1210 1215
Gly Ile Ser His Tyr Leu Gln Asn Ser Lys Arg Lys Ser Arg Pro Val
1220 1225 1230
Ser Val Lys Thr Phe Glu Asp Ile Pro Leu Glu Glu Pro Glu Val Lys
1235 1240 1245
Val Ile Pro Asp Asp Ser Gln Thr Asp Ser Gly Met Val Leu Ala Ser
1250 1255 1260
Glu Glu Leu Lys Thr Leu Glu Asp Arg Asn Lys Leu Ser Pro Ser Phe
1265 1270 1275 1280
Gly Gly Met Met Pro Ser Lys Ser Arg Glu Ser Val Ala Ser Glu Gly
1285 1290 1295
Ser Asn Gln Thr Ser Gly Tyr Gln Ser Gly Tyr His Ser Asp Asp Thr
1300 1305 1310
Asp Thr Thr Val Tyr Ser Ser Asp Glu Ala Gly Leu Leu Lys Met Val
1315 1320 1325
Asp Ala Ala Val His Ala Asp Ser Gly Thr Thr Leu Arg Ser Pro Pro
1330 1335 1340
Val
1345
<210> 3
<211> 5849
<212> DNA
<213> Homo sapiens
<400> 3
actgagtccc gggaccccgg gagagcggtc aatgtgtggt cgctgcgttt cctctgcctg 60
cgccgggcat cacttgcgcg ccgcagaaag tccgtctggc agcctggata tcctctccta 120
ccggcacccg cagacgcccc tgcagccgcg gtcggcgccc gggctcccta gccctgtgcg 180
ctcaactgtc ctgcgctgcg gggtgccgcg agttccacct ccgcgcctcc ttctctagac 240
aggcgctggg agaaagaacc ggctcccgag ttctgggcat ttcgcccggc tcgaggtgca 300
ggatgcagag caaggtgctg ctggccgtcg ccctgtggct ctgcgtggag acccgggccg 360
cctctgtggg tttgcctagt gtttctcttg atctgcccag gctcagcata caaaaagaca 420
tacttacaat taaggctaat acaactcttc aaattacttg caggggacag agggacttgg 480
actggctttg gcccaataat cagagtggca gtgagcaaag ggtggaggtg actgagtgca 540
gcgatggcct cttctgtaag acactcacaa ttccaaaagt gatcggaaat gacactggag 600
cctacaagtg cttctaccgg gaaactgact tggcctcggt catttatgtc tatgttcaag 660
attacagatc tccatttatt gcttctgtta gtgaccaaca tggagtcgtg tacattactg 720
agaacaaaaa caaaactgtg gtgattccat gtctcgggtc catttcaaat ctcaacgtgt 780
cactttgtgc aagataccca gaaaagagat ttgttcctga tggtaacaga atttcctggg 840
acagcaagaa gggctttact attcccagct acatgatcag ctatgctggc atggtcttct 900
gtgaagcaaa aattaatgat gaaagttacc agtctattat gtacatagtt gtcgttgtag 960
ggtataggat ttatgatgtg gttctgagtc cgtctcatgg aattgaacta tctgttggag 1020
aaaagcttgt cttaaattgt acagcaagaa ctgaactaaa tgtggggatt gacttcaact 1080
gggaataccc ttcttcgaag catcagcata agaaacttgt aaaccgagac ctaaaaaccc 1140
agtctgggag tgagatgaag aaatttttga gcaccttaac tatagatggt gtaacccgga 1200
gtgaccaagg attgtacacc tgtgcagcat ccagtgggct gatgaccaag aagaacagca 1260
catttgtcag ggtccatgaa aaaccttttg ttgcttttgg aagtggcatg gaatctctgg 1320
tggaagccac ggtgggggag cgtgtcagaa tccctgcgaa gtaccttggt tacccacccc 1380
cagaaataaa atggtataaa aatggaatac cccttgagtc caatcacaca attaaagcgg 1440
ggcatgtact gacgattatg gaagtgagtg aaagagacac aggaaattac actgtcatcc 1500
ttaccaatcc catttcaaag gagaagcaga gccatgtggt ctctctggtt gtgtatgtcc 1560
caccccagat tggtgagaaa tctctaatct ctcctgtgga ttcctaccag tacggcacca 1620
ctcaaacgct gacatgtacg gtctatgcca ttcctccccc gcatcacatc cactggtatt 1680
ggcagttgga ggaagagtgc gccaacgagc ccagccaagc tgtctcagtg acaaacccat 1740
acccttgtga agaatggaga agtgtggagg acttccaggg aggaaataaa attgaagtta 1800
ataaaaatca atttgctcta attgaaggaa aaaacaaaac tgtaagtacc cttgttatcc 1860
aagcggcaaa tgtgtcagct ttgtacaaat gtgaagcggt caacaaagtc gggagaggag 1920
agagggtgat ctccttccac gtgaccaggg gtcctgaaat tactttgcaa cctgacatgc 1980
agcccactga gcaggagagc gtgtctttgt ggtgcactgc agacagatct acgtttgaga 2040
acctcacatg gtacaagctt ggcccacagc ctctgccaat ccatgtggga gagttgccca 2100
cacctgtttg caagaacttg gatactcttt ggaaattgaa tgccaccatg ttctctaata 2160
gcacaaatga cattttgatc atggagctta agaatgcatc cttgcaggac caaggagact 2220
atgtctgcct tgctcaagac aggaagacca agaaaagaca ttgcgtggtc aggcagctca 2280
cagtcctaga gcgtgtggca cccacgatca caggaaacct ggagaatcag acgacaagta 2340
ttggggaaag catcgaagtc tcatgcacgg catctgggaa tccccctcca cagatcatgt 2400
ggtttaaaga taatgagacc cttgtagaag actcaggcat tgtattgaag gatgggaacc 2460
ggaacctcac tatccgcaga gtgaggaagg aggacgaagg cctctacacc tgccaggcat 2520
gcagtgttct tggctgtgca aaagtggagg catttttcat aatagaaggt gcccaggaaa 2580
agacgaactt ggaaatcatt attctagtag gcacggcggt gattgccatg ttcttctggc 2640
tacttcttgt catcatccta cggaccgtta agcgggccaa tggaggggaa ctgaagacag 2700
gctacttgtc catcgtcatg gatccagatg aactcccatt ggatgaacat tgtgaacgac 2760
tgccttatga tgccagcaaa tgggaattcc ccagagaccg gctgaagcta ggtaagcctc 2820
ttggccgtgg tgcctttggc caagtgattg aagcagatgc ctttggaatt gacaagacag 2880
caacttgcag gacagtagca gtcaaaatgt tgaaagaagg agcaacacac agtgagcatc 2940
gagctctcat gtctgaactc aagatcctca ttcatattgg tcaccatctc aatgtggtca 3000
accttctagg tgcctgtacc aagccaggag ggccactcat ggtgattgtg gaattctgca 3060
aatttggaaa cctgtccact tacctgagga gcaagagaaa tgaatttgtc ccctacaaga 3120
ccaaaggggc acgattccgt caagggaaag actacgttgg agcaatccct gtggatctga 3180
aacggcgctt ggacagcatc accagtagcc agagctcagc cagctctgga tttgtggagg 3240
agaagtccct cagtgatgta gaagaagagg aagctcctga agatctgtat aaggacttcc 3300
tgaccttgga gcatctcatc tgttacagct tccaagtggc taagggcatg gagttcttgg 3360
catcgcgaaa gtgtatccac agggacctgg cggcacgaaa tatcctctta tcggagaaga 3420
acgtggttaa aatctgtgac tttggcttgg cccgggatat ttataaagat ccagattatg 3480
tcagaaaagg agatgctcgc ctccctttga aatggatggc cccagaaaca atttttgaca 3540
gagtgtacac aatccagagt gacgtctggt cttttggtgt tttgctgtgg gaaatatttt 3600
ccttaggtgc ttctccatat cctggggtaa agattgatga agaattttgt aggcgattga 3660
aagaaggaac tagaatgagg gcccctgatt atactacacc agaaatgtac cagaccatgc 3720
tggactgctg gcacggggag cccagtcaga gacccacgtt ttcagagttg gtggaacatt 3780
tgggaaatct cttgcaagct aatgctcagc aggatggcaa agactacatt gttcttccga 3840
tatcagagac tttgagcatg gaagaggatt ctggactctc tctgcctacc tcacctgttt 3900
cctgtatgga ggaggaggaa gtatgtgacc ccaaattcca ttatgacaac acagcaggaa 3960
tcagtcagta tctgcagaac agtaagcgaa agagccggcc tgtgagtgta aaaacatttg 4020
aagatatccc gttagaagaa ccagaagtaa aagtaatccc agatgacaac cagacggaca 4080
gtggtatggt tcttgcctca gaagagctga aaactttgga agacagaacc aaattatctc 4140
catcttttgg tggaatggtg cccagcaaaa gcagggagtc tgtggcatct gaaggctcaa 4200
accagacaag cggctaccag tccggatatc actccgatga cacagacacc accgtgtact 4260
ccagtgagga agcagaactt ttaaagctga tagagattgg agtgcaaacc ggtagcacag 4320
cccagattct ccagcctgac tcggggacca cactgagctc tcctcctgtt taaaaggaag 4380
catccacacc cccaactcct ggacatcaca tgagaggtgc tgctcagatt ttcaagtgtt 4440
gttctttcca ccagcaggaa gtagccgcat ttgattttca tttcgacaac agaaaaagga 4500
cctcggactg cagggagcca gtcttctagg catatcctgg aagaggcttg tgacccaaga 4560
atgtgtctgt gtcttctccc agtgttgacc tgatcctctt tttcattcat ttaaaaagca 4620
tttatcatgc cccctgctgc gggtctcacc atgggtttag aacaaagacg ttcaagaaat 4680
ggccccatcc tcaaagaagt agcagtacct ggggagctga cacttctgta aaactagaag 4740
ataaaccagg caatgtaagt gttcgaggtg ttgaagatgg gaaggatttg cagggctgag 4800
tctatccaag aggctttgtt taggacgtgg gtcccaagcc aagccttaag tgtggaattc 4860
ggattgatag aaaggaagac taacgttacc ttgctttgga gagtactgga gcctgcaaat 4920
gcattgtgtt tgctctggtg gaggtgggca tggggtctgt tctgaaatgt aaagggttca 4980
gacggggttt ctggttttag aaggttgcgt gttcttcgag ttgggctaaa gtagagttcg 5040
ttgtgctgtt tctgactcct aatgagagtt ccttccagac cgttacgtgt ctcctggcca 5100
agccccagga aggaaatgat gcagctctgg ctccttgtct cccaggctga tcctttattc 5160
agaataccac aaagaaagga cattcagctc aaggctccct gccgtgttga agagttctga 5220
ctgcacaaac cagcttctgg tttcttctgg aatgaatacc ctcatatctg tcctgatgtg 5280
atatgtctga gactgaatgc gggaggttca atgtgaagct gtgtgtggtg tcaaagtttc 5340
aggaaggatt ttaccctttt gttcttcccc ctgtccccaa cccactctca ccccgcaacc 5400
catcagtatt ttagttattt ggcctctact ccagtaaacc tgattgggtt tgttcactct 5460
ctgaatgatt attagccaga cttcaaaatt attttatagc ccaaattata acatctattg 5520
tattatttag acttttaaca tatagagcta tttctactga tttttgccct tgttctgtcc 5580
tttttttcaa aaaagaaaat gtgttttttg tttggtacca tagtgtgaaa tgctgggaac 5640
aatgactata agacatgcta tggcacatat atttatagtc tgtttatgta gaaacaaatg 5700
taatatatta aagccttata tataatgaac tttgtactat tcacattttg tatcagtatt 5760
atgtagcata acaaaggtca taatgctttc agcaattgat gtcattttat taaagaacat 5820
tgaaaaactt gaaaaaaaaa aaaaaaaaa 5849
<210> 4
<211> 1356
<212> PRT
<213> Homo sapiens
<400> 4
Met Gln Ser Lys Val Leu Leu Ala Val Ala Leu Trp Leu Cys Val Glu
1 5 10 15
Thr Arg Ala Ala Ser Val Gly Leu Pro Ser Val Ser Leu Asp Leu Pro
20 25 30
Arg Leu Ser Ile Gln Lys Asp Ile Leu Thr Ile Lys Ala Asn Thr Thr
35 40 45
Leu Gln Ile Thr Cys Arg Gly Gln Arg Asp Leu Asp Trp Leu Trp Pro
50 55 60
Asn Asn Gln Ser Gly Ser Glu Gln Arg Val Glu Val Thr Glu Cys Ser
65 70 75 80
Asp Gly Leu Phe Cys Lys Thr Leu Thr Ile Pro Lys Val Ile Gly Asn
85 90 95
Asp Thr Gly Ala Tyr Lys Cys Phe Tyr Arg Glu Thr Asp Leu Ala Ser
100 105 110
Val Ile Tyr Val Tyr Val Gln Asp Tyr Arg Ser Pro Phe Ile Ala Ser
115 120 125
Val Ser Asp Gln His Gly Val Val Tyr Ile Thr Glu Asn Lys Asn Lys
130 135 140
Thr Val Val Ile Pro Cys Leu Gly Ser Ile Ser Asn Leu Asn Val Ser
145 150 155 160
Leu Cys Ala Arg Tyr Pro Glu Lys Arg Phe Val Pro Asp Gly Asn Arg
165 170 175
Ile Ser Trp Asp Ser Lys Lys Gly Phe Thr Ile Pro Ser Tyr Met Ile
180 185 190
Ser Tyr Ala Gly Met Val Phe Cys Glu Ala Lys Ile Asn Asp Glu Ser
195 200 205
Tyr Gln Ser Ile Met Tyr Ile Val Val Val Val Gly Tyr Arg Ile Tyr
210 215 220
Asp Val Val Leu Ser Pro Ser His Gly Ile Glu Leu Ser Val Gly Glu
225 230 235 240
Lys Leu Val Leu Asn Cys Thr Ala Arg Thr Glu Leu Asn Val Gly Ile
245 250 255
Asp Phe Asn Trp Glu Tyr Pro Ser Ser Lys His Gln His Lys Lys Leu
260 265 270
Val Asn Arg Asp Leu Lys Thr Gln Ser Gly Ser Glu Met Lys Lys Phe
275 280 285
Leu Ser Thr Leu Thr Ile Asp Gly Val Thr Arg Ser Asp Gln Gly Leu
290 295 300
Tyr Thr Cys Ala Ala Ser Ser Gly Leu Met Thr Lys Lys Asn Ser Thr
305 310 315 320
Phe Val Arg Val His Glu Lys Pro Phe Val Ala Phe Gly Ser Gly Met
325 330 335
Glu Ser Leu Val Glu Ala Thr Val Gly Glu Arg Val Arg Ile Pro Ala
340 345 350
Lys Tyr Leu Gly Tyr Pro Pro Pro Glu Ile Lys Trp Tyr Lys Asn Gly
355 360 365
Ile Pro Leu Glu Ser Asn His Thr Ile Lys Ala Gly His Val Leu Thr
370 375 380
Ile Met Glu Val Ser Glu Arg Asp Thr Gly Asn Tyr Thr Val Ile Leu
385 390 395 400
Thr Asn Pro Ile Ser Lys Glu Lys Gln Ser His Val Val Ser Leu Val
405 410 415
Val Tyr Val Pro Pro Gln Ile Gly Glu Lys Ser Leu Ile Ser Pro Val
420 425 430
Asp Ser Tyr Gln Tyr Gly Thr Thr Gln Thr Leu Thr Cys Thr Val Tyr
435 440 445
Ala Ile Pro Pro Pro His His Ile His Trp Tyr Trp Gln Leu Glu Glu
450 455 460
Glu Cys Ala Asn Glu Pro Ser Gln Ala Val Ser Val Thr Asn Pro Tyr
465 470 475 480
Pro Cys Glu Glu Trp Arg Ser Val Glu Asp Phe Gln Gly Gly Asn Lys
485 490 495
Ile Glu Val Asn Lys Asn Gln Phe Ala Leu Ile Glu Gly Lys Asn Lys
500 505 510
Thr Val Ser Thr Leu Val Ile Gln Ala Ala Asn Val Ser Ala Leu Tyr
515 520 525
Lys Cys Glu Ala Val Asn Lys Val Gly Arg Gly Glu Arg Val Ile Ser
530 535 540
Phe His Val Thr Arg Gly Pro Glu Ile Thr Leu Gln Pro Asp Met Gln
545 550 555 560
Pro Thr Glu Gln Glu Ser Val Ser Leu Trp Cys Thr Ala Asp Arg Ser
565 570 575
Thr Phe Glu Asn Leu Thr Trp Tyr Lys Leu Gly Pro Gln Pro Leu Pro
580 585 590
Ile His Val Gly Glu Leu Pro Thr Pro Val Cys Lys Asn Leu Asp Thr
595 600 605
Leu Trp Lys Leu Asn Ala Thr Met Phe Ser Asn Ser Thr Asn Asp Ile
610 615 620
Leu Ile Met Glu Leu Lys Asn Ala Ser Leu Gln Asp Gln Gly Asp Tyr
625 630 635 640
Val Cys Leu Ala Gln Asp Arg Lys Thr Lys Lys Arg His Cys Val Val
645 650 655
Arg Gln Leu Thr Val Leu Glu Arg Val Ala Pro Thr Ile Thr Gly Asn
660 665 670
Leu Glu Asn Gln Thr Thr Ser Ile Gly Glu Ser Ile Glu Val Ser Cys
675 680 685
Thr Ala Ser Gly Asn Pro Pro Pro Gln Ile Met Trp Phe Lys Asp Asn
690 695 700
Glu Thr Leu Val Glu Asp Ser Gly Ile Val Leu Lys Asp Gly Asn Arg
705 710 715 720
Asn Leu Thr Ile Arg Arg Val Arg Lys Glu Asp Glu Gly Leu Tyr Thr
725 730 735
Cys Gln Ala Cys Ser Val Leu Gly Cys Ala Lys Val Glu Ala Phe Phe
740 745 750
Ile Ile Glu Gly Ala Gln Glu Lys Thr Asn Leu Glu Ile Ile Ile Leu
755 760 765
Val Gly Thr Ala Val Ile Ala Met Phe Phe Trp Leu Leu Leu Val Ile
770 775 780
Ile Leu Arg Thr Val Lys Arg Ala Asn Gly Gly Glu Leu Lys Thr Gly
785 790 795 800
Tyr Leu Ser Ile Val Met Asp Pro Asp Glu Leu Pro Leu Asp Glu His
805 810 815
Cys Glu Arg Leu Pro Tyr Asp Ala Ser Lys Trp Glu Phe Pro Arg Asp
820 825 830
Arg Leu Lys Leu Gly Lys Pro Leu Gly Arg Gly Ala Phe Gly Gln Val
835 840 845
Ile Glu Ala Asp Ala Phe Gly Ile Asp Lys Thr Ala Thr Cys Arg Thr
850 855 860
Val Ala Val Lys Met Leu Lys Glu Gly Ala Thr His Ser Glu His Arg
865 870 875 880
Ala Leu Met Ser Glu Leu Lys Ile Leu Ile His Ile Gly His His Leu
885 890 895
Asn Val Val Asn Leu Leu Gly Ala Cys Thr Lys Pro Gly Gly Pro Leu
900 905 910
Met Val Ile Val Glu Phe Cys Lys Phe Gly Asn Leu Ser Thr Tyr Leu
915 920 925
Arg Ser Lys Arg Asn Glu Phe Val Pro Tyr Lys Thr Lys Gly Ala Arg
930 935 940
Phe Arg Gln Gly Lys Asp Tyr Val Gly Ala Ile Pro Val Asp Leu Lys
945 950 955 960
Arg Arg Leu Asp Ser Ile Thr Ser Ser Gln Ser Ser Ala Ser Ser Gly
965 970 975
Phe Val Glu Glu Lys Ser Leu Ser Asp Val Glu Glu Glu Glu Ala Pro
980 985 990
Glu Asp Leu Tyr Lys Asp Phe Leu Thr Leu Glu His Leu Ile Cys Tyr
995 1000 1005
Ser Phe Gln Val Ala Lys Gly Met Glu Phe Leu Ala Ser Arg Lys Cys
1010 1015 1020
Ile His Arg Asp Leu Ala Ala Arg Asn Ile Leu Leu Ser Glu Lys Asn
1025 1030 1035 1040
Val Val Lys Ile Cys Asp Phe Gly Leu Ala Arg Asp Ile Tyr Lys Asp
1045 1050 1055
Pro Asp Tyr Val Arg Lys Gly Asp Ala Arg Leu Pro Leu Lys Trp Met
1060 1065 1070
Ala Pro Glu Thr Ile Phe Asp Arg Val Tyr Thr Ile Gln Ser Asp Val
1075 1080 1085
Trp Ser Phe Gly Val Leu Leu Trp Glu Ile Phe Ser Leu Gly Ala Ser
1090 1095 1100
Pro Tyr Pro Gly Val Lys Ile Asp Glu Glu Phe Cys Arg Arg Leu Lys
1105 1110 1115 1120
Glu Gly Thr Arg Met Arg Ala Pro Asp Tyr Thr Thr Pro Glu Met Tyr
1125 1130 1135
Gln Thr Met Leu Asp Cys Trp His Gly Glu Pro Ser Gln Arg Pro Thr
1140 1145 1150
Phe Ser Glu Leu Val Glu His Leu Gly Asn Leu Leu Gln Ala Asn Ala
1155 1160 1165
Gln Gln Asp Gly Lys Asp Tyr Ile Val Leu Pro Ile Ser Glu Thr Leu
1170 1175 1180
Ser Met Glu Glu Asp Ser Gly Leu Ser Leu Pro Thr Ser Pro Val Ser
1185 1190 1195 1200
Cys Met Glu Glu Glu Glu Val Cys Asp Pro Lys Phe His Tyr Asp Asn
1205 1210 1215
Thr Ala Gly Ile Ser Gln Tyr Leu Gln Asn Ser Lys Arg Lys Ser Arg
1220 1225 1230
Pro Val Ser Val Lys Thr Phe Glu Asp Ile Pro Leu Glu Glu Pro Glu
1235 1240 1245
Val Lys Val Ile Pro Asp Asp Asn Gln Thr Asp Ser Gly Met Val Leu
1250 1255 1260
Ala Ser Glu Glu Leu Lys Thr Leu Glu Asp Arg Thr Lys Leu Ser Pro
1265 1270 1275 1280
Ser Phe Gly Gly Met Val Pro Ser Lys Ser Arg Glu Ser Val Ala Ser
1285 1290 1295
Glu Gly Ser Asn Gln Thr Ser Gly Tyr Gln Ser Gly Tyr His Ser Asp
1300 1305 1310
Asp Thr Asp Thr Thr Val Tyr Ser Ser Glu Glu Ala Glu Leu Leu Lys
1315 1320 1325
Leu Ile Glu Ile Gly Val Gln Thr Gly Ser Thr Ala Gln Ile Leu Gln
1330 1335 1340
Pro Asp Ser Gly Thr Thr Leu Ser Ser Pro Pro Val
1345 1350 1355
<210> 5
<211> 3982
<212> DNA
<213> Artificial Sequence
<400> 5
tctctttctg ccctgagtcc tcaggacccc aagagagtaa gctgtgtttc cttagatcgc 60
gcggaccgct acccggcagg actgaaagcc cagactgtgt cccgcagccg ggataacctg 120
gctgacccga ttccgcggac accgctgcag ccgcggctgg agccagggcg ccggtgcccc 180
gcgctctccc cggtcttgcg ctgcgggggc gcataccgcc tctgtgactt ctttgcgggc 240
cagggacgga gaaggagtct gtgcctgaga actgggctct gtgcccagcg cgaggtgcag 300
gatggagagc aaggcgctgc tagctgtcgc tctgtggttc tgcgtggaga cccgagccgc 360
ctctgtgggt aagaagccca ctctttagta gtaaggcgga gaagtagggt gcgggcggag 420
agtgggaata gaagaggacc taactcgtag agctctagag accctcctcc cttgggtgtt 480
ctttcactta ccaatgggga aactgaggtt caaagactct tccgaaatga ctcagccagg 540
attctactct cccccgggca tcggttggag cgtgtcctgc ggagccgtca cagcccctgg 600
cgctaggtag gcaggagtgg aaaggcggcc tgagccgggg caggagatgc tcccactggc 660
aggaacaggc ggtcaaacgc tgggaagcca gctcaagcca agcggcccgg ctggcatcaa 720
tcactccgtg ctgttgccca ccgccctagt gtggggcagg gaatccgcct ctggctccgc 780
tcccctttag ctccagcgtg taagcgcacg gactatgtga ggtgaggtct cttcatagag 840
caacactttc ctccctcaac tttctttgat gcagaatgct atttttgctg gtaggaggaa 900
gacgcggctt tctcttctgt gacagcttct ccaggtgtat taaactaaat aactctccac 960
ttaccgactc caaagcgctg gtcctggggt aaactctgaa agtctcagaa actcttgagc 1020
ttggcaccta gttataggtc acttttcttg ttttaaaatg ccctctgctt caaggttagg 1080
cccacactcg ctcttgggct tttgtgcaat aatttccctt cccttccctt cccttccctt 1140
cccttccctt cccttccctt cccttccctt cccttccctt tccctcttcc ttttcctcct 1200
cctcttcctc ctctatttct ctgtcatttc ctttttgaag ccacagtttg cagatttcca 1260
atctccaccc attggagaat ggagaatcag gaaaaaagaa gtcaattctg cagaaacatt 1320
ccttgcgccc taagagaatc gcatggctta aaagcattgg cactgacata cggcgccaag 1380
atcgcctgtc tagagctatt gagttttcct cataatgact tggttcatca ggctagctcc 1440
accacgagtg ccctcttgtt cctgagaagg ccgcactctc cccctttctg ggaagagaaa 1500
gacagcctgg aacatgtgct tgccctgggt tccatagaga agcaagttgc tttaaagccc 1560
agagaattcc tagtgtagca gcttaacagc gtcccgttct ctgaataaga tggaggttgc 1620
ccttttggag tgtgtgactt gcttaattgg attgggctat aattggtgcc atccaagtct 1680
cgagacagag ccgctgttgt ttttccttct ggtctttgag cgggaaggat aacagtgcac 1740
aaattaatta atgttggtta tcggatttga acataaaagg gcttttattg tatagtagca 1800
tatgtacctc ttgcagtcag aatgagctgt ctaaagaaca gaacccaaac ttgccgatga 1860
aaatgaatga ggtttaataa aggcgatgga tgagcattag tcactgatgt aaatctccag 1920
ttattgataa cctcattgac tggatttgat tgcagacatg tattggtatg gggcatcctt 1980
taaagatgag catagccaac gtgcctgcac tctaagagaa tctatggctg tatgttatta 2040
cagagacagt tgagaagctc ttagtggctc tggcgtgtag atcagcggta gagcgctgag 2100
gctctgcgct cgcttcctgg cactgaagaa taaaggccat ttactgtggt ggtgcagtgg 2160
gcgcagtttg tgacgagtta ctactacatt ttcctcacac atctgcctga ctaatgagtt 2220
catcagatga gcgtatccag tgattgtttg caggttaatg gttctcagtc atgtttagaa 2280
tctacttatc aaacaaattg ttttctcatt tcctgcttct tctcaaacaa agtaagattc 2340
cattattgaa aggcttgttt aagagcattt taactgcttg cctatgttag ggacagtgac 2400
ttatttcata ttgacaaata ttatgccgat taattgaata tgactaccca gttctatagc 2460
tgtctcaggg cagaccaaga gcatctgtga tccagtcact ttaaatgcca tttaaaatgc 2520
ataatttgtt ggtctaggaa taaacacact gtaaagttta gaatcacggc ccaaacacaa 2580
gtctttaaca atgccaacta gcttctgaga ttcattaatg tcatttaatt accaatgttt 2640
taaaaatatg tcattaatta ctaaatctat agttgtaaca gcaacacatg tacatcttat 2700
taggttgggt atattcaggg tggcatagct gtagactatt gcacatctgt gtaggtgagc 2760
cagtggacag ctgcctctgg ctgttctcag agggccacag tgtcacggca ttggctattt 2820
gccttggctc tttgctaata ctttattgac atggcctcat cttcgttcac gttcacttat 2880
ttgcccaaca acgtcaatgc cagctgaggc cttaggagtc atctgttctt agtcagtgtg 2940
aattagaaag cctggatgcc tgcctgctat taattagtta ttcttctctt ctgagacaga 3000
gtctcactgt gtggcccagg ctagtctcaa acttgcggtc catttgtctc actcatcaga 3060
atgctgggct tccaggtgtg tgcaccacac taggtagctc gcgttttaag ctaagagctg 3120
gaagatcctg atgtccttta ccatggtggg catgttacag gttagttgac tgaaaactag 3180
ttatctcgct gtgtaatgac ctgcagtggt atgtatctct caagatgctt ttttgcattt 3240
caatcagtta ggtaacaagt gtcttatgtc tccagctttg tattggtatg agctcagagc 3300
tttgattaat gagttgggac cccctagcta ttgctcatta gacttacact atttttagtt 3360
ttgctctgag tttatgaata tgcatgtatg catgaacttg ggagatattt ctcttcccca 3420
attccttttc ctccatttaa atgtgctgtc tttagaagcc actgcctcag cttctgcagc 3480
tcagatacca aaggaagtct ggtacacagc atgataaaag acaatgggac ggggtcacag 3540
tggctcccgt ccctttcagg ggtatggaga cgagctgtag agagatgtct ccagggagtt 3600
ttcattaatc agcaatttag tcagatctgt gcatcctatg ctttacaaga aatgtcagtg 3660
ggcctgagat catcagatgg aggttcatcg ggtttcaatg tcccgtatcc ttttgtaaga 3720
ccttgaagtt ggcaacgcag gaaaacagga actccaccct ggtgccgtga attgcagagc 3780
tgttgtgttg gtttgtgacc atctgcccat tcttcctgtt atgacagagc ttgtgaactt 3840
taactgggac tggggcaaag tcaatcccac ctttatacaa tgaattgctg aagaggcctt 3900
ttaaaacttg gagtgtgcat tgtttatgga agggctttcc tattggatcc aactcttttc 3960
taatttgttt ctaggtttgc ct 3982
<210> 6
<211> 4012
<212> DNA
<213> Artificial Sequence
<400> 6
ctagtgacaa ggatcaaagg ttcaggattt tccccagagc agagacaaga ttcactcata 60
aacaatagat gattaatagg tgattgcttt tgtgatatgg acaatccggg gttataattg 120
aaatatgaaa ggaaacttcg tttcccctgg ctgtgcgtgt gcgagagagt gtgtgtgtgt 180
atgagtgtgt tttgtgtgtg tgtgtgtgta tgggtgtgtt atgtgtgtgt gtgtgtatga 240
tcgatatgaa ccctaacctt tgtttttcag tcagtattct gtgtcagaac atttcatctt 300
cagaagagcg ttcctggatc ttgtttttgt aatacactaa gggaacaatg tgactctatt 360
gtgtgagtga aggagtggtg tgcaaggggg tgactcaggg tggttagaga ggtaggtttg 420
gcagaattct aacatctcca agtctaaatg ttggcccctt tcatctgccc aactcaccat 480
gtccagtgct gcatctagag ccactagcat gctctcaaat ctctaagatg gttaggattc 540
tataaactgc acctgcaact gaagatgcca aagtgcatgg tagttctgtc actaccaaag 600
tggttcatgg cttggtgcca aatccaggga ggatgcaccc gccagttctc tctggtactc 660
tgcctccacc cactagccat ctgcattagg ttatcatgcc tatgtgggca cagtggcata 720
gatactatgg agacaggaaa ggggtggggg ggagcgagag agagagagag agagagagag 780
agagagagag agagagagag agagagagag agaggaagag aggaagagag gaagagagga 840
agagaggaag agagaggaga gggtagacat tgaatgcttc tacagaagtt cagctgtcag 900
atgttaatat gagactattc acacataaac tgattgagac cattgaatct ttgacctttc 960
ttctgtctct tacatgagta ggcttttaag cctttacata caggtagaac cattacttct 1020
ctggaaacat cagagaaacc aaatccactt acagatgcac tcagagctca ctcccaactg 1080
gtaactctta ttttgccaac taattaattg cttgtcttct ggcttctggc ttcaagcctc 1140
ccagcttgag gctcatttca actgacactg cacagtgtga gtggagagac tgagacccta 1200
agtgataatt gttttctctt aatcacctgg tgctaagcat ttggtgctgc gagcctcctg 1260
cccgtatctc tccacttgga atgtgttctg gagagcaacc tacagtctta cttgctagcc 1320
tttcccttgc tccctccaca cataaggcct ttatcttggc gctctctgtc tctgcctgtc 1380
tctctgtgtg agtctctgtc tgtctgtcta tctgtctatc tctgtgttcg tctttctgtc 1440
tgcctgcctg cctgcctgcc tgtctgtctg tctgtctgtc tgtctgtctc tccctctctt 1500
tatttctgtg tctctccctc acctatcctt cccttctctg tcttcctctc ttttctcctc 1560
tttcaggctt ctgccacata gggacaaggt ggtcagctca atttggacaa atccagtata 1620
acccttgaac agaccaccaa gttcctgtat ctcacagaca agtagcttaa ccttttaagc 1680
cttatttctt catctgaaaa ctgaaggtta ctgaaaaaat aaagaagggc gctatattta 1740
aagtgaaggg cgctatattt aaagtgacca gtgcatagta ggtgtgtggt cacaagtagc 1800
tgtctgcgca aagagcttta cttatctcct tgtgtagatt cctaaacaat caattttccc 1860
atgccacata agcaaagatg ttgactatca tcacaaaatc agactcaagt acccatgaat 1920
agaagtatat tttttcctta gagttctctc accagacaca tccctataat gtatctcttt 1980
tttaaaatat ttttattaca tattttcctc aattacattt ccaatgctat ctccaaagtc 2040
ccccataccc tccccccctt ccctacccag ccattcccat ttttttggcc ctggcattcc 2100
cctgtactgg ggcatataca gtttgcgtgt ccaatgggcc tctctttcca gtgatggccg 2160
actaggccat cttttgatac atatgcagct agagtcaaga gctccggggt actggttagt 2220
tcataatgtt gttccaccta tagggttgca gatcccttta gctccttggg aactttctct 2280
agctcctcct ttgggagccc tgtgatccat ccaatagctg actgagcatc cacttctgtg 2340
tttgctaggc cctggcatag tctcacaaga gacagctaca tctgggtcct ttcgataaaa 2400
tcttgctata taatgtatct cttaagacac gtgtaagcac ctgtggttca ttgctggatg 2460
tcttcatgag tggtgtgttt ctgttgttct ttaaaagtat gaactctatc cagcttaagg 2520
agctattggt ccctaggaat cttaaaagtg ggtgacatga ggtcctgcct tttctccctt 2580
catgtcaccc actttgtcac cttctaccat aggctatgaa tttccagctg ttgatcactc 2640
aaacatgtct tccaggtgcc caggaaaaga ccaacttgga agtcattatc ctcgtcggca 2700
ctgcagtgat tgccatgttc ttctggctcc ttcttgtcat tgtcctacgg accgttaagc 2760
gggtaacaac ataatttccc tcctgtccct tgtgtcttgg tttttgtgat taatggaagc 2820
tgactgggtt tctttcagcg gcttcttccc attgttattg gctcaatggg cacattttta 2880
cctcaataca ataacattct tgtccatttt ctttgggtgg actgtgggca ttaattgatg 2940
ggagccacca gaggggtgaa aggtttggac tgtccactgt aattaagatt tagaaacctt 3000
attctgaact cttttttgga aactgaagtg gcactaaagg caggattata atagcagcgc 3060
tctcaaacac catgagtttt attggaaaat gagattatca tgaatgaggc ccttattaaa 3120
caacaatatg tatacaaaac aacaaccaaa caagaggccg tgtgtgtatg acggggagga 3180
tttatgcttt ccaggccaat gaaggggaac tgaagacagg ctacttgtct attgtcatgg 3240
atccagatga attgcccttg gatgagcgct gtgaacgctt gccttatgat gccagcaagt 3300
gggaattccc cagggaccgg ctgaaactag gtgagttgtc aactgctatt aacttgatat 3360
aagtttttac ccgctcatct ggctctctgt taagacaatg acagatctgg tctatttaga 3420
tgatgtattc tgatttataa atattaattt tatctcttga ctttgggtaa tcatccattt 3480
agctttctag tagtaaggag cctgtgcaac catccaaagc agggcatttt gaaaagcaaa 3540
tggaaaaacc agaacaaagg aggaaagtgc tctgtgggtc taactgggat gcaccatcct 3600
cttgaccaat tggaatcttc atatgccttt tgacagtgtg acaatcaaag tgtattttgt 3660
aataccactg tcattggttc taggaaaacc tcttggccgc ggtgccttcg gccaagtgat 3720
tgaggcagac gcttttggaa ttgacaagac agcgacttgc aaaacagtag ccgtcaagat 3780
gttgaaaggt aaaagggaaa atcattatgt ttctctttct gttttgtccc tcactaaaca 3840
tgagtcttgt gtcagtggcc ctttgcatag gtaggcttta agcccctggc ttatattgcc 3900
tttgcacaaa ttgaaaacag ttttctcctt cagagcctca tgacctcaaa acaagggcta 3960
acttgcacaa ttctatttag tgatcctggt tacacagagg aggggcaata tc 4012
<210> 7
<211> 19273
<212> DNA
<213> Artificial Sequence
<400> 7
agtgtttctc ttgatctgcc caggctcagc atacaaaaag acatacttac aattaaggct 60
aatacaactc ttcaaattac ttgcaggtaa ggattcattc tagatctaga tttcttgtgt 120
taagtaactg attgtttatt gagtggaaat aatttccagt agagcagaat tataatagag 180
cttgtagtaa ttgttcataa gtggtgaggt ttctaagaac tgatgtaata atggaaaatg 240
agaagaattt tctctcaaaa attctgtaca attttgctgg tgtttttata ctattctctg 300
ccaacatgca tacacacaca cacacacaca cgcacacaaa tacacaccca cacccacatt 360
ccaataacca gtacagccac ctggcgtata gtagacatac gctcaataaa tatgaatgaa 420
taaatgaagt tgagggcata catttaagga atagagttga aaaaatttgg gactatattt 480
attatgcttg gtatgattct tgaacactta ttatcccttt ccaaaaactt tgctttataa 540
gaaatttatt actataatta cttaggcagt aatatttaat agcaatttaa tatttagtgg 600
gtaatattac tgagcgcatg atctacataa ataatggact tcgggccctg ccttgatatt 660
ctggaatgca tctttcccca cttgctagca agaagtcatg ctattgattt ttgataactg 720
gagaagtaga cttctttgtc aagaagaaga ggcctttaaa ttttgccttt caacccttac 780
cccaggacga aagatagaag acccttgggt ttaacatagt gatcacacac gaaaggcatg 840
gagccttctt aggacctgtg tgtttttggt agagactgtg acaagtggag gtgatgttac 900
cctcctggaa gagtgctggg ggtccacaaa ggaccttggg taggttattg ccattgcttc 960
atacttgttg aatactaagc attaaaccga atgacataca tctattttag actgcagtat 1020
aaagaatacc ctagcccctt accaataccc agcccttggg aaaaaacaca gtagcaggtg 1080
ctgtttctct agctttactt gtttaagaca catttcccat tagattttcc ttttaccgac 1140
cctcgataac aaggttattt gaaatcccca aggatcccat gctccctttt taaaactctg 1200
cataaacatt tcttatgttc tgaaaaaaac catggagtgt gttaaaagta acttcattga 1260
tttagctgca acttcctgga aattttaagt tctttgaatg aagggccaat aatgttacat 1320
tcttcttgat gttgactatc ttcttatctt ccttggggcc ttgtagagaa atgctgcagt 1380
acaagccatc tatgttttaa tgcgaggtcc ttacaaggtc ctgagggact cttacttgca 1440
cctccttcct tcctaacctc acttcttact cccctttgct cactcttacc tggctgctct 1500
ggtttcctgg ctgttccctt aatactccag atatgcacct gctccagggc ctttccatgt 1560
gctgtttttg ctcctgtaat actgctcttc atgatgttcc tatggctagc tttatcaaga 1620
ccacctcctg caaaattctt tactcttttc tttgtatctt ctatattttt ctccatagta 1680
ctaaacacta tcttttatac aataaacttt ccttactttt taattgcctg ttttctccag 1740
ttagactgag gttccataaa ggcattgatt tttgtctgat ttgttcactg ctctttctct 1800
agtccttaac aagtttggca catagtagat gcttaataga tatttgttga aagaaagaat 1860
gcattaatta atggaaaact caggaatctt tataagtgac ttctgaagct gagtttataa 1920
cttttcatca tatgtcaatc tgacttgttg gtagaagact ttgttttttt ttttttgagg 1980
cagggttgcc ctcttgccca ggctgaagtg cagtggtgtg attttggctc actgcaacct 2040
ccacctcccg ggttcaagca attctcatgc ctcagcctcc tgagtagctg ggattacagg 2100
catgcgccac cacacctggc tcatttttgt atttttagta gagacagggt tttaccatgt 2160
tgcccagcct ggtctcgaac tcctggcctc aggtgatcca tccgccttgg cctcccaaag 2220
tgctgggatt ataggcatga gccaccatgc ctggccggta gaagactgac tgtgtctgtt 2280
gaagagttta tttaagtttc aaaaccaaat tttctctttt cttagaaata gcctcacagt 2340
ctggcacttc atattaatac ctccctgaaa ttaatttttc aggggacaga gggacttgga 2400
ctggctttgg cccaataatc agagtggcag tgagcaaagg gtggaggtga ctgagtgcag 2460
cgatggcctc ttctgtaaga cactcacaat tccaaaagtg atcggaaatg acactggagc 2520
ctacaagtgc ttctaccggg aaactgactt ggcctcggtc atttatgtct atgttcaagg 2580
taagtggtga aataaaattc atttcccacg tctctttacc agttataaaa gacaataggc 2640
tcaaagaaga attgagtaca acaaagggct tgctctaaag gctgtttgcc aagaggaata 2700
cacacaattc ttctctcctg aggctttctc tgagaaataa gactcattga ttctggagct 2760
tgggccgtgt tacctctttt ttgcccagtt agtttgggtc tgatctttgt ttccaaggta 2820
aatctgtgtt cactgttggc cattgagact tataaaaagt cttcctatgt ttgagaagaa 2880
aacctaaaat tcttgaaatc gaggaagatt tgggggtgaa ttatggagaa atttctgtgg 2940
agagataagt tatctacagc agagtaggag attttcccaa gaatgcatag gaaagcattt 3000
tttgccaagg gctctggagt tttttgcaca taggaacctt tttttcttac tagtatttca 3060
taaaaaacaa ttcccatact catgtgcaaa taaagacatt gcttcagact cttttcagga 3120
caatgtttct ttcctttgct tgtttggtct gagatcttgg atgatatgct gtatctttct 3180
aggatgtgca gtttgggatt gatattatga aggctgactt aacatccata tagtataaaa 3240
taaatgtcac acatattctg catttataat gagttatgca ttcttttgtg tttcaaaaat 3300
cttacactat cttatctttt ctgtgaaaac ctaacttaac taatgagatc cctatgatat 3360
aaatttaagg aatgtaaggg ctgcatcata gtttggttgg atgtaccaaa tatttttctt 3420
ttcagtgaag ataaacagac attttatgta tttacgtata tgccttttta catcccagag 3480
tatttgagac aggtgaagat gacttagact tttttcccag aagcagcttt tacagggcaa 3540
gaatttcatc agctttggga aacacacttg catatctctg cttacatttc agtagtgtaa 3600
tatggtcagt gcaatgaaaa agtggagacc acatcaaaat aacctatgcc actggattca 3660
caatgtttga gaaatatctt tgcccagagt aagcactgtc aaagatagaa ttctgtgccc 3720
tcctccttcc ctccacaaga tttgaaagag acaaggctca catcttggag aatttctggc 3780
tccttttgac ctggcagtct tgagagatgc agctcggtca gaagattgca aggatttcct 3840
gctttcagcc tgtctagaaa tactacaaga tgaacatccc ccatatctca ttatttactt 3900
cttcctaagt caggaaactt ggagacatgt gaaaattcat ttcatgagtt tcagtaaata 3960
ttttattttg agaggctggg tggtggtttg ggtttctttt gtttatttcc tttttttgag 4020
ataccgaaat agaattgatt tactaaatag gtttagtctt acgtcaaagg gttaatttag 4080
cttccaaagg cttgctctgt aagcaagtta tgtaatattt cataacatgt ggatgaaagg 4140
taggcaatat taagaagtgg caatccctag cactgtttat tggtacactg cctgtctttg 4200
ggtataccat taaattctgc ttcctgtcta agcttaaagt tctaggagtt gggctgtcca 4260
agattttggc catgaagtta aacaatggga aaggaaacac tgaagtattc tctatggata 4320
ggtgtttaat gtcccctctg gtcgccacct tacttcccta gtcttctgac cccattctct 4380
tcagcaatgg atggagccag gaagtgagcc ctggcctcat aagataatgg ctatggcatg 4440
tggtgggcta gattggctgc ttttctgtgc tttccagctg ggaaggaaat caaacttctg 4500
ctgttgcagg gaattagctg cctttgtccc ctgtggttta attaactctt tcttcacttt 4560
gactgactat tatgaagcac tctgagaatg cttgatggga tgtgttgggc atagcaatgt 4620
gaaatgttat ctctctgaga tttcaagcat gactccacac cacatcatct ctatctctga 4680
ggaatggact aggtttccag cagcatgtta acattgtatg agtaatgttt gattggcctt 4740
gaaatctttt tttttttttt tttttgagac ggagttttgc tcttgttgcc caggctaaag 4800
tgcagtggtg ctatctcagc tcactgcaac ttctgccccc cggttcaaat gattctcctg 4860
cctcagcctc tgaaatagct gggactacag gtgcgtgcca tcatgcctgg ctaatttttt 4920
gtatttttcg tagagatggg gttttgccac gttggtcagg ctggtctcaa actcctgacc 4980
tcaagtgatc cacctgcctc agcctcccaa agtgctggga ttacaggcgt gagccaagaa 5040
cccagtcaga atctcttcag ttttcttctc agtctttgga gtggtgactt ttcaaatgtt 5100
tgtcattgaa gatatcaatg actgctaaat gttaaactaa atgcaaaaac aattaaacat 5160
ggttttagaa agaatcatat ccctagtctt cagaatctta aaatgctcac atgaatggtc 5220
ctcttgaata accaaattca aaagtgttag ctgtttcctg ttaatctaaa gatcctttgg 5280
gatccattca tttattttca tggaatttac attatttacc taaagagaga gcacatgagt 5340
attttaaata ttagtaaaac ttgtcggtaa agtgtataga tttaacttta aattttaaag 5400
taaatattat ccttcatttt gaaaaaatta taatgattaa tcttttaaaa tgtgaaatct 5460
ataaaaatat attctgcttg tcaataaacc ttgtgaaagg agtcaatctc aattgggagt 5520
tttttttcaa aatttttata cacacagata tatacacatg catgtgcatg cacaaacaca 5580
cacacacaca tacacacaca ccctcatgta gcacagatat ctatcagcag aataatctgt 5640
ggatgccttt ggttgtgtga ggtgtccctt ccagtcattc acttgtctgg ttagagttta 5700
ggaacctgaa aaatgaccaa cttttctagt aaatactatt aactcattaa taaaactaaa 5760
ttttcttcta gattacagat ctccatttat tgcttctgtt agtgaccaac atggagtcgt 5820
gtacattact gagaacaaaa acaaaactgt ggtgattcca tgtctcgggt ccatttcaaa 5880
tctcaacgtg tcactttgtg cagtaagttg catctcctcc aatcgtctct taagttttta 5940
taattttaag ctaatattaa gatgggtaac ctgtttataa tattcacaat gagttttaag 6000
gatcctttag gaagggtcaa atgcaatgaa taaaactaat tagtattctt aaaaataaga 6060
tgaattcttc agtgatcatt gtacatggct ctcatttttg gtactggatt aaatatttga 6120
tatgtctttt tattacccag agatacccag aaaagagatt tgttcctgat ggtaacagaa 6180
tttcctggga cagcaagaag ggctttacta ttcccagcta catgatcagc tatgctggca 6240
tggtcttctg tgaagcaaaa attaatgatg aaagttacca gtctattatg tacatagttg 6300
tcgttgtagg taagaggaca tttcctttcc atatcattaa taacatatcc ttgtattaag 6360
atcttggaga taacaacata gagtgaagaa ggatattgaa aagtatagga actcaggata 6420
tggtgttggg caattcatct gctcttctct accaaataaa cccatgtgca attgaggttg 6480
tctcttttct tgccaagatt aaggaagaaa aagaaaactt tttaaaaaaa ggatgaaagc 6540
gaatggtatt actcgagcac attttatgaa gaattcaatg ttcagagcat tgcttgctat 6600
caattatttc aattatgact attttatgga aacttcagca atttgctaaa gctggcccta 6660
ctggcctagg gctactgacc actgaaagtt tactactttt ctgtccactg ggttacaaca 6720
tctttgagat ctgtgaaggt agtgctttgt aaacctctgt tggccatttt cctgggagct 6780
accaagtatt ggtgaggcct gcagggaaaa acaatgtggc atgttttaaa gttgcattac 6840
tttaaaaaat aaatctgtgc aaagttatag gcttatttgc tctctcatgt tctgtttttt 6900
caatttactt gctctagggt ataggattta tgatgtggtt ctgagtccgt ctcatggaat 6960
tgaactatct gttggagaaa agcttgtctt aaattgtaca gcaagaactg aactaaatgt 7020
ggggattgac ttcaactggg aatacccttc ttcgaaggta acgctaatga ttcaaagcca 7080
gacctccaaa tacttagata ataagcccca gtgaagtttg cttgagagat aggggcctct 7140
ttggccagat aaaatgtaag agccttaaac acacacacat acacacccac tcacacacac 7200
atacacacac acacaattta agggaattgc agaacagata gcacccacca aaaggtgaaa 7260
taccaggaat tttgtcctat tctgcaatag ccaggctatg aatattagtt ttctctaggt 7320
gattacatct ttccacatta tgtcatttct ctgttctcca aagtttttga tctacattcc 7380
ttttaaggga atttctcttt aagaggtggc atgagataca ctgctcctta aacagtggtc 7440
acatttactt gtgtttctgc agtttatatc catctcactt tcaccacgtg aggttttaaa 7500
aatcctaatt cagttggttc catttatttc tcctgaaaca aaatatattt gttgtctgca 7560
tgaggttaaa agttctggtg tccctgtttt tagcattaaa taatgtttac caaagcccag 7620
atttaattct gtgtgttact agaagttatt gggtaatgtt atatgctgtg ctttggaagt 7680
tcagtcaact ctttttttca gcatcagcat aagaaacttg taaaccgaga cctaaaaacc 7740
cagtctggga gtgagatgaa gaaatttttg agcaccttaa ctatagatgg tgtaacccgg 7800
agtgaccaag gattgtacac ctgtgcagca tccagtgggc tgatgaccaa gaagaacagc 7860
acatttgtca gggtccatgg taagctatgg tcttggaaat tattctgtgc cttgacaagt 7920
gagataattt aaataaattt aggtcactta gtgattccta ttttgttcat tcagaagata 7980
gtttctagtt tttcttgtta gggaggccac atgacctaga ggtcaagagc atagctttgt 8040
agtcaggaac ttgggttcaa acctcaactt taaagatgag atgtgctgat atacagtaag 8100
agttcattta gtattactta ttatagttat tgctgctatt aggattgtta ctatgataaa 8160
tagtattagc taaggtagtt tttaaatttt cattttattg caaggctgag aggcctactt 8220
gaataagcat gagctttgca aactggggaa acatttagca atatacagtt gacctgtgag 8280
caactcaggg attgggggaa ctcaggggag ttcccctaac tttccctcct ctgcagtcaa 8340
aaatccatgt ataggccggg cgcggtggct cacgcctgta atcccaacac tttgggagtc 8400
tgaggtgggt ggatcacctg agatcaggag ttcgaaacca gcctggtcaa catggtggaa 8460
ccccatctct actaaaaatc caaaaaatta gcctggtgtg gtggtgggag cttgtaatcc 8520
cagctactca ggaggctgag gcaggagaat tgcttgaacc caggaggtgg aggttgcagt 8580
gagccaagat cgtgccattg taccccagcc tgggcaacaa gagtgaaact ccttctcaaa 8640
aaaaaaaaaa aaaaaaaaat caaggtataa cttttgactt ccacaaaaca taactaatgg 8700
cctactgttg actggaagcc ctactgataa cataaacagt caattaacac atattttata 8760
tgttatatgt attatatact gtattcttcc aataaagcta gagaaaagaa aatgttatta 8820
agaaaattgt aaggaagaga aaatatattt actattcatt aagtgtaagt ggatcatcat 8880
aaaggtcttc atccttgtct tcacgttgag taggctgagg aaaaggggga agaggagggg 8940
gtggttttgc tgtctcaggg gtggcagagg tggaagaaaa tctgcttata agtggactca 9000
tgtagttcaa gtttgtgtta tttaagggtc aactgtaatt gaactggaat taaattgaac 9060
tggccttgag aaaatcacct taattttttg tttattctct ttcatttaca taaatgtctg 9120
agtttacatg gtaatttgtg tggcatccta cttataagcc ttggaaagga ttttggagtt 9180
tatattatga gaatgcatca atacagtgaa attttaaaaa taccttagat aatgctattt 9240
attagagttg taatcataaa agtggcaaca actataacaa gtatgattta gtgagcactt 9300
actttattag ctcatctcat ctttgaagct gagattggaa ctcaagttcc tgactacaaa 9360
gctatgctct tgacctctag gtcacgtggc atccctagca agaacttgaa aatttcttct 9420
gaatgaacaa aatagaaatc actaagtgtc ctaaatttat ttaaattatt tcacttgcca 9480
agatgcactt gtcaaaatac acagagagag atgtgctctg gcttatgttt ttatagaatt 9540
acttttgttt tccagaatac ttcagggaaa taggggcaga aataaggagg tcagttggga 9600
ggctaattgc agttatccaa gtgagagttg aggggtggct tagacaaggg tagttgaggt 9660
ggaggtagtg agaggtgatc tgcttctgga tatattttga aggtagagtc aacagggtcc 9720
gctgatcaat tcattggttg tggagtataa gagaaaaaga gtggaagatg actcgagcgt 9780
tagcatgagc aactgagtaa atgatggtgt tatttactga gatggcaaag atcgagaagg 9840
cagtgagatt tagggaaaca gtgttagata tgtttatctg gagatgcctg ttaaacatcc 9900
aagtggagat atttaacata tcaacccgga acccagagga gtcagggcag aagataacac 9960
atttaggagg tacgtgaatg atactttaaa cctgaggcta gaggaaggtg taaataaaga 10020
ggaggtctga ggactgagtc ctggggcctc atggtggaag aggtgtgtgg aggctgtcat 10080
gggagcagag gagaaggagc acccaagcat ccctggggga cttagagaaa gctgcacaga 10140
ggagcaagtg tttgagttga gacttgagca atcactaggc ttgtgggagt gcactagcgg 10200
ggagagaaaa gcaaatgcaa acacaggagg tgtgggagaa acacgggagg tgtgggagaa 10260
gctgaaaagt gacccactga aagatagtac aggaaatctt ggaactgcag ctactcagac 10320
cctcaaggtc tttgacgttt cacttgaaat gaaaaactaa atcaaatgac catttacagt 10380
aagttgacct tttttttttt ttattttctt ccagaaaaac cttttgttgc ttttggaagt 10440
ggcatggaat ctctggtgga agccacggtg ggggagcgtg tcagaatccc tgcgaagtac 10500
cttggttacc cacccccaga aataaaatgg taactactgg aaataaatgc aaagcatcat 10560
ttcgtgtgag agcaaatcct ttgactatac taattcctga gaattttttt tcataggtat 10620
aaaaatggaa taccccttga gtccaatcac acaattaaag cggggcatgt actgacgatt 10680
atggaagtga gtgaaagaga cacaggaaat tacactgtca tccttaccaa tcccatttca 10740
aaggagaagc agagccatgt ggtctctctg gttgtgtatg gtgagtccat tcaattttcc 10800
tctctgccca agatttatta tgatacattg tcttccaaat cagccaaacc accgttcctc 10860
tgcctcctgc tgcttcactc atatcatggc tgggcctgcg tacaaaagtc atctggcgtg 10920
gtgaagctga agtgaaacgt aggaccatgt gctctggcca tgtttgttta agaggccgtg 10980
taaatgagct ttgtggtgga caaatgcaag attaaagtag tgataccctc gatagctaaa 11040
tgttgtgaaa taagaatgcc cacagggaca gttgtcaagc taagttatac taccatgttc 11100
ccctctcatg gaattgccca cctggtacac agatgtgtaa gacccttctc cttagatttt 11160
gtgcaaagct tctagtttga tgttgtagtt gatgtatcag agatgtgcag gcacgttcca 11220
actctgaagg cttttgaagt tgacactgtt ggcttggttg ggagcttttc ttttttcctt 11280
tttgacagga gttcaggatc tgattttgag tctgtaaagg aaagatagta agtttttgat 11340
gtaaagataa tttgaacttt gttttctgaa actgaaaggt acaaataagt gtttggaatg 11400
gagtggggag aagggtgcca tggtcaagtg agtgtgagag gtgctaaggt gatgtgtaga 11460
tgtgtaacag gtttctttat tgcaggactt cgcagaacct tttatatgct aatgtatatt 11520
ggtattctcc aggaggagag acatagagta ttcaaggttt aacaaaccta tttgaccaga 11580
gcaccttttt tcccctgagc aaattcatta atctctcact ccaaacagtt tgagaaatgc 11640
ttctctgttg taattctttg ttcccccttc tggtacggca tattaaaact tcaggatatt 11700
ttcccatgac attaaggtgc ttccctacgt gtcctgatac tcttctgtag gccgctgaac 11760
ttggctttat tatttttttt cagggaatat tttaaagata ggctgggtgc cgtggtttgc 11820
atctgtaatc ccagcacttt gggaggccga ggcggatgga tcacctgagg tcaggagttc 11880
gagaccagcc tggccaacat gatgaaaacc cgtctctact aaaaatataa aaattagcca 11940
ggcatggtgg tgggcacctg taatcccagc tacttgggag gctgaggcag gagaatcact 12000
tgaacccagg aggtggaggt tgcagatagc cgagatcgca ccattgtact ccagcctggt 12060
gacaagagca aaactccgtc tcaaaaaaaa agttaacagg ttccaaaaag gttgtttaga 12120
agcagcatag gtgtagggga ctggggagag gagaaactgg aaagtgtata agtaggatgg 12180
gaggaggaaa tgaacaggaa ataaaaacaa aacacggaca gcaaatagcc catttcatca 12240
gttcatgaag ccactaaata ttttattcac tttagcaaat tctctgctat atgaaataaa 12300
cataaaaaag aagtcaagtc ttcaaagcat aatctgaggc tttaggttga cagtaataag 12360
gaaatagttt tgactttgga gtcaaaaaag aaagaaagga aaaagggaga gaagaaagaa 12420
ggaagtgaga gaagggagaa ggaagaaagg ggaagaggga aagggagtgg agagggaggg 12480
agggaggaag agggagagag aatgaaaaac tcagatgatg gtggcaggaa tgcattctct 12540
aaagatttac accttccttt aacatgaggt ggtttacgtg tttgggttca gaagtcagag 12600
tgtctaggtt tgttccaggt tttgccgttc gttaactgag tgaccttggg cgagtcattt 12660
ttttctgttt catttttttc tcacgtataa agctgtggac agtaatagtg gttgtgagga 12720
ttaagtgaat gaattcatgc aaagcacttc aaacaatgct tggcacataa taaatgtatt 12780
tactgtgcta tttcagctgt tttctgtagc ctttccctga tctcctaaac ttgagaggac 12840
agagagaact atctctgtaa tacagatgag aggcacagga tttcaacact tccataaagt 12900
cattcagctt gttagtttat tattattatt agcttattgt catttttatt ttatttcgtt 12960
actttattcc tttttttttt ttttggtaga gatggggtct caccatgtgg cccaggctgg 13020
tcttgatctc ctgggcttaa gcgatccacc taccttggcg tcccaaaata ctgagattac 13080
aggcataagc ccccatgcct ggctagttgt tatttttatg agtatcacta gaactcaggt 13140
ctcttgtttc cacatctagg tgttcttcga aaaagaaagt ggaagcaaaa tcatatgctt 13200
aaagaaagtc agctttagtt gctaaaatcc tctatttccc attcttcaaa gctgactgac 13260
aattcaaaag ttgtttttcc catcttcagt cccaccccag attggtgaga aatctctaat 13320
ctctcctgtg gattcctacc agtacggcac cactcaaacg ctgacatgta cggtctatgc 13380
cattcctccc ccgcatcaca tccactggta ttggcagttg gaggaagagt gcgccaacga 13440
gcccaggtga gtaaggccac atgctctttg ctttcctgcc atcttgcatt tcttacagct 13500
gagctatgat atgactccat cctaaatgga gaagcctaaa ccaaaaaaag ttttctctca 13560
agaggtagcc tgaatctcca tccatctttc tctgtgtctt acattttagg ggatgtcttt 13620
gcttggagta tcctcctttg gggttagcta agctcagcct tgttaggtta gccgtgaggt 13680
acacttctcc aaacacaggc tatttgctca gtttgctaat tgccagtctt tggtttttct 13740
cccgatacca atcggctggt gaataccaca tccctccttc ttgtgtgtgt gaagatccat 13800
ctctcagagg aaatgctgat agatgagagg cagtgataga cccagcccca gtcctcaggg 13860
tctcaggccc agcttatcat gctctgacac aagtccagac atccttaggg aaaaacacaa 13920
caacagcagc caacccacca ccaccctaag cagtccactt cctgttgttg tttttgaaat 13980
ggccactatg agcttcttcc tcagctgctg atcatttcct tcacagagac catggtccca 14040
gagaaattac tttaaggagc ccagtggctt ctaagtttcc ttgccttcct ttgaactaaa 14100
ttaacttgaa ttgtcttgtc gatccaattt atgaatgaag gtttattccc agaatagctg 14160
cttccctcct gtatcctgaa tgaatctacc tagaaccttt tccttcattg tcaatgccta 14220
tttttaattg gcgccaagtc ttgtaccatg gtaggctgcg ttggaagtta tttctaagaa 14280
cagaataacc aaagtctgaa tcttttcctt actcttgact ctaattaaag aaaaattaaa 14340
tcataatatg cgctgttatc tctttcttat agccaagctg tctcagtgac aaacccatac 14400
ccttgtgaag aatggagaag tgtggaggac ttccagggag gaaataaaat tgaagttaat 14460
aaaaatcaat ttgctctaat tgaaggaaaa aacaaagtga gtttgaagtt ttaaaatttg 14520
aaaatctctc tctctttaat ggaaggatgg tacaataata tgtgaggcat attggagatt 14580
aataatcaaa tagtctggat gattaaatag agcgtattaa gtcactttga aaataccatt 14640
gacttttagc agtaccatta acttattaat agcttatcag agaaaaataa aaacatctat 14700
gacattaaat ctatgcatct gtgtagggtg attctgattt tataaacatg agaatgaaaa 14760
aatgtgtatc atatcatatt aaaacacatc attagtttca tggcttccaa agcccttttt 14820
atataatgtg tgagctccac agcagcataa ttatacaaat tgagtaaata tcccaaacct 14880
aaaaacccca aatccaaaat gctccagatt ctgaaccttt ttgagtgccg acatggtgct 14940
caaaggaaac gctcgttgga gcattttgga ttttcagatt agggatgctc aactggtaag 15000
tatacaatgc aaatattcca aaatccaaaa aaaaaaatcc aaaatccaaa ccacttttgg 15060
tcccaagcgt tttgagtaag ggatactcaa cctgcaattg cataaatttg agcgtgtcca 15120
accgctgcag aagtgggaat ggcataggca ggttggagtg attgtggaga ctgctggact 15180
gagtgcttgt gcacaaacag ccgcgttgtt tatggcctgg gatttgtttt ttccccgcac 15240
agactgtaag tacccttgtt atccaagcgg caaatgtgtc agctttgtac aaatgtgaag 15300
cggtcaacaa agtcgggaga ggagagaggg tgatctcctt ccacgtgacc agtaagtact 15360
cttctctgga ggtttgggtt ggatcactca cacagtgggt actaagctat gtaattccct 15420
gttgtttttg ccattcatgt gagtggcatg gcatttagga aagaggactt ggattgatca 15480
ttgatgcttt cattcataaa ttacaacttc tcaggtatct cctgggctta tgtgaagtca 15540
gtgcgtctaa ctacactgga gagagaatgg tttcacagat gctttaaacc acaagctctg 15600
tgtggtattt acatctcagt cttcagagtc tggcacagtg cctggcttat tgagcttcag 15660
tacatattgg tgggcttgct gtggaacagt tgatgagggt gggctttatg gaggcaatca 15720
gaaggacata ggagcagtgc cctcccaatg ctgccgattt tgcctgtgca tcttagtttt 15780
atggataagc tttagctgat tgtgctgaat ggaatattat agccagggct aattcattgg 15840
cataaatgta gctttcatat cattgagtgt tagtgttaat gaagacctaa ttttaaaatt 15900
ctgttagaat tagagatttt gctttggatt tttaatatat taaacattgc gtagagctca 15960
tagtggagat gtggtaaata tctgaggaat tcgtttacat tttcaagtaa tgtgtttggc 16020
caaataagat attttgggac ctgaattgtc tagtttgttt gtcaagttgt agtacatcac 16080
ctggaacgga tagagcttca tttcttttgg tactttgtag tagtctgaaa gcagcaagat 16140
gatagtgagc tgtaccaagt taaatcacca ttcaataact atggcctctt cattttaggg 16200
ggtcctgaaa ttactttgca acctgacatg cagcccactg agcaggagag cgtgtctttg 16260
tggtgcactg cagacagatc tacgtttgag aacctcacat ggtacaagct tggcccacag 16320
cctctgccaa tccatgtggg agagttgccc acacctgttt gcaagaactt ggatactctt 16380
tggaaattga atgccaccat gttctctaat agcacaaatg acattttgat catggagctt 16440
aagaatgcat ccttgcagga ccaaggagac tatgtctgcc ttgctcaaga caggaagacc 16500
aagaaaagac attgcgtggt caggcagctc acagtcctag gtagggagac aattctggat 16560
cattgtgcag aggcagttgg aatgccttaa atgtagtgca attcaggtgc tatgcaaaga 16620
ttactgtcct ctaggagatt atgttgtaaa ctggtgcaca cttcttcacc gaaagtcctt 16680
gaggaagaaa gaagctaata ataatgaaat gatatatcga aaggagaaaa taacaaaacc 16740
tgatgatgga gtaattcact agtatatgca agggatatta gcttgaacca gggaaacttc 16800
tgccttatct tgggcatcca tttatttaaa tagacaaata tttgtggaat gcctgctatg 16860
agctaggaga gtgtcagaaa ttcacagtgg taaacatgaa ggaaaggagg agaacatagg 16920
caaccactgg gaagtcacag cacagtgagg tctctgtgtc catgagaaca ggaattgttc 16980
tctgttttgc tccctgctat agctctagtc atagagcata gcagcatata ctaactgctc 17040
aataaggcac ctgctgcatg aagagtggga tgatgggctg cgtttaagac ctagaagact 17100
ccatgggaag gaagctacat tcactgtctg tacctctggg tcatcccaca tgatccagcg 17160
tagcccaagg tcaatgggac gatcacttca gtgagcagat agctctgtaa attcctccat 17220
agaggcactg tctacccctt gtctaacctc atgccttgtg caaaagctgg gcagccatgg 17280
ctttgtctgt gggaaaatca ggcaaatttg gggagcgtct ctttgtgcca cttctctcca 17340
ttttctcctc ttgtggtgtc cctttccaat tcctaggata tatgtgccct ctgttttttt 17400
tttactgtta ggaaggaaat tgcccaagta aattcatcta taccacagtt ttagagggta 17460
acgtcttcat cagaggcctt ggcgtatttg aagaggcacc ttctgacaga cactagcata 17520
aagttcgcta gttttaagac tcaggtgtca taataagaga tactttgggg tcaagtcatc 17580
cccagcatcc ttcaagtcac accacataga tcacatggat tttctgttgg cttgtctggc 17640
ttcaaggtta tggcagaatt gagaaagaga tgtgaagtag gctcctggcc tagctgtgcc 17700
cagaaaatat gtgctcgcag ttagctgctt tgcttcccta aggactccta acttgttttc 17760
ctaaaaccta ttcttagaaa taggctagaa tccagtacat ttgcttagac ttcaatgtag 17820
tacgctgttg aggtaatctc attttgctaa gtgttgacgt ggattttttc agcatgattc 17880
cttttgatgt tcagttggtt gggacaagat atttccacag cactttgatg atctgaagaa 17940
agaataaatc taaagtgttc ttgtacactt aaacaaatac tcatgggctt cattttcttt 18000
aaatccaaga cttcccttag ggtattgttg ttttgtttgt gttttagtgg aaatagcact 18060
gaactggtct tttagcctca ccagattctg taaacagttc aactgtttac ttagttgcag 18120
ggacatggac aagtggttta atgtcgctga acatcattta tttcatctgt gagataacgc 18180
taacagtcct attctgctca ttacataaga tcactagtga ggaacacaaa ttgtgtaaac 18240
aagttttata agaattgcca aataaatgta aggcattatt ggttgaatga tactaaaatt 18300
tggcacttcc aagagaaatt tgaagggatt ctagggtatt attgactaga atcttcatgg 18360
gagggaagtt ttcacctggg gaggctgtgt ctaattagag gaaaaatcca taaaggtgac 18420
cctgaacctt tcttttgtga tgggattacc agctagtatc actaatatga atgttaaaag 18480
ccattaatct gtttgcagtg tcctgactga cttgtttcat ttaactttac ccagtgacca 18540
gtgtattttc ccagaagtta atatatcaac aagttccttt ttactaaatt taaactgttt 18600
aaaagtttgc tgataccaga accatttcaa aagttataat tccatgttct gtgattttct 18660
ttttgtgtgt ctagagcgtg tggcacccac gatcacagga aacctggaga atcagacgac 18720
aagtattggg gaaagcatcg aagtctcatg cacggcatct gggaatcccc ctccacagat 18780
catgtggttt aaagataatg agacccttgt agaagactca ggtaaataga atttggctat 18840
cactcttggg ttgcagaact ttcccaggga tgttatctaa aaagccatat tatttcttga 18900
tgtaatgtag aaaaaaagca gtattggtgt ccatgacctg gctcatttca cagacttaga 18960
attggagtat ggggccctgt tgaattttca tgaaagccat ataggagatt agtcagcagt 19020
agatcccatg tgactctaca gagttagata atagaacaag atgaagggca gcatttatat 19080
tttctaaatt tccctgaaaa acttcacaga ctacatcatc ataaatgaga atgatcgttt 19140
tcttcctctg ttaggcattg tattgaagga tgggaaccgg aacctcacta tccgcagagt 19200
gaggaaggag gacgaaggcc tctacacctg ccaggcatgc agtgttcttg gctgtgcaaa 19260
agtggaggca ttt 19273
<210> 8
<211> 80
<212> DNA
<213> Artificial Sequence
<400> 8
ttggatccaa ctcttttcta atttgtttct aggtttgcct agtgtttctc ttgatctgcc 60
caggctcagc atacaaaaag 80
<210> 9
<211> 80
<212> DNA
<213> Artificial Sequence
<400> 9
ggcatgcagt gttcttggct gtgcaaaagt ggaggcattt ttcataatag aaggtcagtg 60
tgatgtcata ggctcatcag 80
<210> 10
<211> 80
<212> DNA
<213> Artificial Sequence
<400> 10
ctggggaaaa atagatttgg tgttgggcat tgctattcaa aagcttgata tcgaattccg 60
aagttcctat tctctagaaa 80
<210> 11
<211> 80
<212> DNA
<213> Artificial Sequence
<400> 11
gaacttcatc agtcaggtac ataatggtgg atccgatatc ctagtgacaa ggatcaaagg 60
ttcaggattt tccccagagc 80
<210> 12
<211> 5927
<212> DNA
<213> Artificial Sequence
<400> 12
gagtcctcag gaccccaaga gagtaagctg tgtttcctta gatcgcgcgg accgctaccc 60
ggcaggactg aaagcccaga ctgtgtcccg cagccgggat aacctggctg acccgattcc 120
gcggacaccg ctgcagccgc ggctggagcc agggcgccgg tgccccgcgc tctccccggt 180
cttgcgctgc gggggcgcat accgcctctg tgacttcttt gcgggccagg gacggagaag 240
gagtctgtgc ctgagaactg ggctctgtgc ccagcgcgag gtgcaggatg gagagcaagg 300
cgctgctagc tgtcgctctg tggttctgcg tggagacccg agccgcctct gtgggtttgc 360
ctagtgtttc tcttgatctg cccaggctca gcatacaaaa agacatactt acaattaagg 420
ctaatacaac tcttcaaatt acttgcaggg gacagaggga cttggactgg ctttggccca 480
ataatcagag tggcagtgag caaagggtgg aggtgactga gtgcagcgat ggcctcttct 540
gtaagacact cacaattcca aaagtgatcg gaaatgacac tggagcctac aagtgcttct 600
accgggaaac tgacttggcc tcggtcattt atgtctatgt tcaagattac agatctccat 660
ttattgcttc tgttagtgac caacatggag tcgtgtacat tactgagaac aaaaacaaaa 720
ctgtggtgat tccatgtctc gggtccattt caaatctcaa cgtgtcactt tgtgcaagat 780
acccagaaaa gagatttgtt cctgatggta acagaatttc ctgggacagc aagaagggct 840
ttactattcc cagctacatg atcagctatg ctggcatggt cttctgtgaa gcaaaaatta 900
atgatgaaag ttaccagtct attatgtaca tagttgtcgt tgtagggtat aggatttatg 960
atgtggttct gagtccgtct catggaattg aactatctgt tggagaaaag cttgtcttaa 1020
attgtacagc aagaactgaa ctaaatgtgg ggattgactt caactgggaa tacccttctt 1080
cgaagcatca gcataagaaa cttgtaaacc gagacctaaa aacccagtct gggagtgaga 1140
tgaagaaatt tttgagcacc ttaactatag atggtgtaac ccggagtgac caaggattgt 1200
acacctgtgc agcatccagt gggctgatga ccaagaagaa cagcacattt gtcagggtcc 1260
atgaaaaacc ttttgttgct tttggaagtg gcatggaatc tctggtggaa gccacggtgg 1320
gggagcgtgt cagaatccct gcgaagtacc ttggttaccc acccccagaa ataaaatggt 1380
ataaaaatgg aatacccctt gagtccaatc acacaattaa agcggggcat gtactgacga 1440
ttatggaagt gagtgaaaga gacacaggaa attacactgt catccttacc aatcccattt 1500
caaaggagaa gcagagccat gtggtctctc tggttgtgta tgtcccaccc cagattggtg 1560
agaaatctct aatctctcct gtggattcct accagtacgg caccactcaa acgctgacat 1620
gtacggtcta tgccattcct cccccgcatc acatccactg gtattggcag ttggaggaag 1680
agtgcgccaa cgagcccagc caagctgtct cagtgacaaa cccataccct tgtgaagaat 1740
ggagaagtgt ggaggacttc cagggaggaa ataaaattga agttaataaa aatcaatttg 1800
ctctaattga aggaaaaaac aaaactgtaa gtacccttgt tatccaagcg gcaaatgtgt 1860
cagctttgta caaatgtgaa gcggtcaaca aagtcgggag aggagagagg gtgatctcct 1920
tccacgtgac caggggtcct gaaattactt tgcaacctga catgcagccc actgagcagg 1980
agagcgtgtc tttgtggtgc actgcagaca gatctacgtt tgagaacctc acatggtaca 2040
agcttggccc acagcctctg ccaatccatg tgggagagtt gcccacacct gtttgcaaga 2100
acttggatac tctttggaaa ttgaatgcca ccatgttctc taatagcaca aatgacattt 2160
tgatcatgga gcttaagaat gcatccttgc aggaccaagg agactatgtc tgccttgctc 2220
aagacaggaa gaccaagaaa agacattgcg tggtcaggca gctcacagtc ctagagcgtg 2280
tggcacccac gatcacagga aacctggaga atcagacgac aagtattggg gaaagcatcg 2340
aagtctcatg cacggcatct gggaatcccc ctccacagat catgtggttt aaagataatg 2400
agacccttgt agaagactca ggcattgtat tgaaggatgg gaaccggaac ctcactatcc 2460
gcagagtgag gaaggaggac gaaggcctct acacctgcca ggcatgcagt gttcttggct 2520
gtgcaaaagt ggaggcattt ttcataatag aaggtgccca ggaaaagacc aacttggaag 2580
tcattatcct cgtcggcact gcagtgattg ccatgttctt ctggctcctt cttgtcattg 2640
tcctacggac cgttaagcgg gccaatgaag gggaactgaa gacaggctac ttgtctattg 2700
tcatggatcc agatgaattg cccttggatg agcgctgtga acgcttgcct tatgatgcca 2760
gcaagtggga attccccagg gaccggctga aactaggaaa acctcttggc cgcggtgcct 2820
tcggccaagt gattgaggca gacgcttttg gaattgacaa gacagcgact tgcaaaacag 2880
tagccgtcaa gatgttgaaa gaaggagcaa cacacagcga gcatcgagcc ctcatgtctg 2940
aactcaagat cctcatccac attggtcacc atctcaatgt ggtgaacctc ctaggcgcct 3000
gcaccaagcc gggagggcct ctcatggtga ttgtggaatt ctgcaagttt ggaaacctat 3060
caacttactt acggggcaag agaaatgaat ttgttcccta taagagcaaa ggggcacgct 3120
tccgccaggg caaggactac gttggggagc tctccgtgga tctgaaaaga cgcttggaca 3180
gcatcaccag cagccagagc tctgccagct caggctttgt tgaggagaaa tcgctcagtg 3240
atgtagagga agaagaagct tctgaagaac tgtacaagga cttcctgacc ttggagcatc 3300
tcatctgtta cagcttccaa gtggctaagg gcatggagtt cttggcatca aggaagtgta 3360
tccacaggga cctggcagca cgaaacattc tcctatcgga gaagaatgtg gttaagatct 3420
gtgacttcgg cttggcccgg gacatttata aagacccgga ttatgtcaga aaaggagatg 3480
cccgactccc tttgaagtgg atggccccgg aaaccatttt tgacagagta tacacaattc 3540
agagcgatgt gtggtctttc ggtgtgttgc tctgggaaat attttcctta ggtgcctccc 3600
cataccctgg ggtcaagatt gatgaagaat tttgtaggag attgaaagaa ggaactagaa 3660
tgcgggctcc tgactacact accccagaaa tgtaccagac catgctggac tgctggcatg 3720
aggaccccaa ccagagaccc tcgttttcag agttggtgga gcatttggga aacctcctgc 3780
aagcaaatgc gcagcaggat ggcaaagact atattgttct tccaatgtca gagacactga 3840
gcatggaaga ggattctgga ctctccctgc ctacctcacc tgtttcctgt atggaggaag 3900
aggaagtgtg cgaccccaaa ttccattatg acaacacagc aggaatcagt cattatctcc 3960
agaacagtaa gcgaaagagc cggccagtga gtgtaaaaac atttgaagat atcccattgg 4020
aggaaccaga agtaaaagtg atcccagatg acagccagac agacagtggg atggtccttg 4080
catcagaaga gctgaaaact ctggaagaca ggaacaaatt atctccatct tttggtggaa 4140
tgatgcccag taaaagcagg gagtctgtgg cctcggaagg ctccaaccag accagtggct 4200
accagtctgg gtatcactca gatgacacag acaccaccgt gtactccagc gacgaggcag 4260
gacttttaaa gatggtggat gctgcagttc acgctgactc agggaccaca ctgcgctcac 4320
ctcctgttta aatggaagtg gtcctgtccc ggctccgccc ccaactcctg gaaatcacga 4380
gagaggtgct gcttagattt tcaagtgttg ttctttccac cacccggaag tagccacatt 4440
tgattttcat ttttggagga gggacctcag actgcaagga gcttgtcctc agggcatttc 4500
cagagaagat gcccatgacc caagaatgtg ttgactctac tctcttttcc attcatttaa 4560
aagtcctata taatgtgccc tgctgtggtc tcactaccag ttaaagcaaa agactttcaa 4620
acagtggctc tgtcctccaa gaagtggcaa cggcacctct gtgaaactgg atcgaatggg 4680
caatgctttg tgtgttgagg atgggtgaga tgtcccaggg ccgagtctgt ctaccttgga 4740
ggctttgtgg aggatgcggg ctatgagcca agtgttaagt gtgggatgtg gactgggagg 4800
aaggaaggcg caagctcgct cggagagcgg ttggagcctg cagatgcatt gtgctggctc 4860
tggtggaggt gggcttgtgg cctgtcagga aacgcaaagg cggccggcag ggtttggttt 4920
tggaaggttt gcgtgctctt cacagtcggg ttacaggcga gttccctgtg gcgtttccta 4980
ctcctaatga gagttccttc cggactctta cgtgtctcct ggcctggccc caggaaggaa 5040
atgatgcagc ttgctccttc ctcatctctc aggctgtgcc ttaattcaga acaccaaaag 5100
agaggaacgt cggcagaggc tcctgacggg gccgaagaat tgtgagaaca gaacagaaac 5160
tcagggtttc tgctgggtgg agacccacgt ggctgccctg gtggcagtgt ctgagggttc 5220
tctgtcaagt ggcggtaaag gctcaggctg gtgttcttcc tctatctcca ctcctgtcag 5280
gcccccaagt cctcagtatt ttagctttgt ggcttcctga tggcagaaaa atcttaattg 5340
gttggtttgc tctccagata atcactagcc agatttcgaa attacttttt agccgaggtt 5400
atgataacat ctactgtatc ctttagaatt ttaacctata aaactatgtc tactggtttc 5460
tgcctgtgtg cttatgttaa aaaaaaaaag aaagaaagaa actgttcttt tcatttggta 5520
ccatagtgtg aagagctggg agcaatgact gttaaacatg ctatggcaca tctatttata 5580
gtctgttatg tagaacaaat gtaatatatt aaaacgttat attatatata atgaactttg 5640
tactacccac cttttgtatc agtattatgt accactagag agattacaag gctttcagca 5700
gccgctgttg ttttgttaaa gactttgaga aactcgaagg aatcctttca tggaatatgc 5760
agctatatac cctaccgtct ctctcatctc aaacggagga ggaggaggag gagtcaggta 5820
taatgtgagt gtgttctacg tgtccttgtt ctctgttctt aggaggaatg atttcatcaa 5880
atgtttatat gctttataaa ccaataaacg tattctgagt aaagaga 5927
<210> 13
<211> 1347
<212> PRT
<213> Artificial Sequence
<400> 13
Met Glu Ser Lys Ala Leu Leu Ala Val Ala Leu Trp Phe Cys Val Glu
1 5 10 15
Thr Arg Ala Ala Ser Val Gly Leu Pro Ser Val Ser Leu Asp Leu Pro
20 25 30
Arg Leu Ser Ile Gln Lys Asp Ile Leu Thr Ile Lys Ala Asn Thr Thr
35 40 45
Leu Gln Ile Thr Cys Arg Gly Gln Arg Asp Leu Asp Trp Leu Trp Pro
50 55 60
Asn Asn Gln Ser Gly Ser Glu Gln Arg Val Glu Val Thr Glu Cys Ser
65 70 75 80
Asp Gly Leu Phe Cys Lys Thr Leu Thr Ile Pro Lys Val Ile Gly Asn
85 90 95
Asp Thr Gly Ala Tyr Lys Cys Phe Tyr Arg Glu Thr Asp Leu Ala Ser
100 105 110
Val Ile Tyr Val Tyr Val Gln Asp Tyr Arg Ser Pro Phe Ile Ala Ser
115 120 125
Val Ser Asp Gln His Gly Val Val Tyr Ile Thr Glu Asn Lys Asn Lys
130 135 140
Thr Val Val Ile Pro Cys Leu Gly Ser Ile Ser Asn Leu Asn Val Ser
145 150 155 160
Leu Cys Ala Arg Tyr Pro Glu Lys Arg Phe Val Pro Asp Gly Asn Arg
165 170 175
Ile Ser Trp Asp Ser Lys Lys Gly Phe Thr Ile Pro Ser Tyr Met Ile
180 185 190
Ser Tyr Ala Gly Met Val Phe Cys Glu Ala Lys Ile Asn Asp Glu Ser
195 200 205
Tyr Gln Ser Ile Met Tyr Ile Val Val Val Val Gly Tyr Arg Ile Tyr
210 215 220
Asp Val Val Leu Ser Pro Ser His Gly Ile Glu Leu Ser Val Gly Glu
225 230 235 240
Lys Leu Val Leu Asn Cys Thr Ala Arg Thr Glu Leu Asn Val Gly Ile
245 250 255
Asp Phe Asn Trp Glu Tyr Pro Ser Ser Lys His Gln His Lys Lys Leu
260 265 270
Val Asn Arg Asp Leu Lys Thr Gln Ser Gly Ser Glu Met Lys Lys Phe
275 280 285
Leu Ser Thr Leu Thr Ile Asp Gly Val Thr Arg Ser Asp Gln Gly Leu
290 295 300
Tyr Thr Cys Ala Ala Ser Ser Gly Leu Met Thr Lys Lys Asn Ser Thr
305 310 315 320
Phe Val Arg Val His Glu Lys Pro Phe Val Ala Phe Gly Ser Gly Met
325 330 335
Glu Ser Leu Val Glu Ala Thr Val Gly Glu Arg Val Arg Ile Pro Ala
340 345 350
Lys Tyr Leu Gly Tyr Pro Pro Pro Glu Ile Lys Trp Tyr Lys Asn Gly
355 360 365
Ile Pro Leu Glu Ser Asn His Thr Ile Lys Ala Gly His Val Leu Thr
370 375 380
Ile Met Glu Val Ser Glu Arg Asp Thr Gly Asn Tyr Thr Val Ile Leu
385 390 395 400
Thr Asn Pro Ile Ser Lys Glu Lys Gln Ser His Val Val Ser Leu Val
405 410 415
Val Tyr Val Pro Pro Gln Ile Gly Glu Lys Ser Leu Ile Ser Pro Val
420 425 430
Asp Ser Tyr Gln Tyr Gly Thr Thr Gln Thr Leu Thr Cys Thr Val Tyr
435 440 445
Ala Ile Pro Pro Pro His His Ile His Trp Tyr Trp Gln Leu Glu Glu
450 455 460
Glu Cys Ala Asn Glu Pro Ser Gln Ala Val Ser Val Thr Asn Pro Tyr
465 470 475 480
Pro Cys Glu Glu Trp Arg Ser Val Glu Asp Phe Gln Gly Gly Asn Lys
485 490 495
Ile Glu Val Asn Lys Asn Gln Phe Ala Leu Ile Glu Gly Lys Asn Lys
500 505 510
Thr Val Ser Thr Leu Val Ile Gln Ala Ala Asn Val Ser Ala Leu Tyr
515 520 525
Lys Cys Glu Ala Val Asn Lys Val Gly Arg Gly Glu Arg Val Ile Ser
530 535 540
Phe His Val Thr Arg Gly Pro Glu Ile Thr Leu Gln Pro Asp Met Gln
545 550 555 560
Pro Thr Glu Gln Glu Ser Val Ser Leu Trp Cys Thr Ala Asp Arg Ser
565 570 575
Thr Phe Glu Asn Leu Thr Trp Tyr Lys Leu Gly Pro Gln Pro Leu Pro
580 585 590
Ile His Val Gly Glu Leu Pro Thr Pro Val Cys Lys Asn Leu Asp Thr
595 600 605
Leu Trp Lys Leu Asn Ala Thr Met Phe Ser Asn Ser Thr Asn Asp Ile
610 615 620
Leu Ile Met Glu Leu Lys Asn Ala Ser Leu Gln Asp Gln Gly Asp Tyr
625 630 635 640
Val Cys Leu Ala Gln Asp Arg Lys Thr Lys Lys Arg His Cys Val Val
645 650 655
Arg Gln Leu Thr Val Leu Glu Arg Val Ala Pro Thr Ile Thr Gly Asn
660 665 670
Leu Glu Asn Gln Thr Thr Ser Ile Gly Glu Ser Ile Glu Val Ser Cys
675 680 685
Thr Ala Ser Gly Asn Pro Pro Pro Gln Ile Met Trp Phe Lys Asp Asn
690 695 700
Glu Thr Leu Val Glu Asp Ser Gly Ile Val Leu Lys Asp Gly Asn Arg
705 710 715 720
Asn Leu Thr Ile Arg Arg Val Arg Lys Glu Asp Glu Gly Leu Tyr Thr
725 730 735
Cys Gln Ala Cys Ser Val Leu Gly Cys Ala Lys Val Glu Ala Phe Phe
740 745 750
Ile Ile Glu Gly Ala Gln Glu Lys Thr Asn Leu Glu Val Ile Ile Leu
755 760 765
Val Gly Thr Ala Val Ile Ala Met Phe Phe Trp Leu Leu Leu Val Ile
770 775 780
Val Leu Arg Thr Val Lys Arg Ala Asn Glu Gly Glu Leu Lys Thr Gly
785 790 795 800
Tyr Leu Ser Ile Val Met Asp Pro Asp Glu Leu Pro Leu Asp Glu Arg
805 810 815
Cys Glu Arg Leu Pro Tyr Asp Ala Ser Lys Trp Glu Phe Pro Arg Asp
820 825 830
Arg Leu Lys Leu Gly Lys Pro Leu Gly Arg Gly Ala Phe Gly Gln Val
835 840 845
Ile Glu Ala Asp Ala Phe Gly Ile Asp Lys Thr Ala Thr Cys Lys Thr
850 855 860
Val Ala Val Lys Met Leu Lys Glu Gly Ala Thr His Ser Glu His Arg
865 870 875 880
Ala Leu Met Ser Glu Leu Lys Ile Leu Ile His Ile Gly His His Leu
885 890 895
Asn Val Val Asn Leu Leu Gly Ala Cys Thr Lys Pro Gly Gly Pro Leu
900 905 910
Met Val Ile Val Glu Phe Cys Lys Phe Gly Asn Leu Ser Thr Tyr Leu
915 920 925
Arg Gly Lys Arg Asn Glu Phe Val Pro Tyr Lys Ser Lys Gly Ala Arg
930 935 940
Phe Arg Gln Gly Lys Asp Tyr Val Gly Glu Leu Ser Val Asp Leu Lys
945 950 955 960
Arg Arg Leu Asp Ser Ile Thr Ser Ser Gln Ser Ser Ala Ser Ser Gly
965 970 975
Phe Val Glu Glu Lys Ser Leu Ser Asp Val Glu Glu Glu Glu Ala Ser
980 985 990
Glu Glu Leu Tyr Lys Asp Phe Leu Thr Leu Glu His Leu Ile Cys Tyr
995 1000 1005
Ser Phe Gln Val Ala Lys Gly Met Glu Phe Leu Ala Ser Arg Lys Cys
1010 1015 1020
Ile His Arg Asp Leu Ala Ala Arg Asn Ile Leu Leu Ser Glu Lys Asn
1025 1030 1035 1040
Val Val Lys Ile Cys Asp Phe Gly Leu Ala Arg Asp Ile Tyr Lys Asp
1045 1050 1055
Pro Asp Tyr Val Arg Lys Gly Asp Ala Arg Leu Pro Leu Lys Trp Met
1060 1065 1070
Ala Pro Glu Thr Ile Phe Asp Arg Val Tyr Thr Ile Gln Ser Asp Val
1075 1080 1085
Trp Ser Phe Gly Val Leu Leu Trp Glu Ile Phe Ser Leu Gly Ala Ser
1090 1095 1100
Pro Tyr Pro Gly Val Lys Ile Asp Glu Glu Phe Cys Arg Arg Leu Lys
1105 1110 1115 1120
Glu Gly Thr Arg Met Arg Ala Pro Asp Tyr Thr Thr Pro Glu Met Tyr
1125 1130 1135
Gln Thr Met Leu Asp Cys Trp His Glu Asp Pro Asn Gln Arg Pro Ser
1140 1145 1150
Phe Ser Glu Leu Val Glu His Leu Gly Asn Leu Leu Gln Ala Asn Ala
1155 1160 1165
Gln Gln Asp Gly Lys Asp Tyr Ile Val Leu Pro Met Ser Glu Thr Leu
1170 1175 1180
Ser Met Glu Glu Asp Ser Gly Leu Ser Leu Pro Thr Ser Pro Val Ser
1185 1190 1195 1200
Cys Met Glu Glu Glu Glu Val Cys Asp Pro Lys Phe His Tyr Asp Asn
1205 1210 1215
Thr Ala Gly Ile Ser His Tyr Leu Gln Asn Ser Lys Arg Lys Ser Arg
1220 1225 1230
Pro Val Ser Val Lys Thr Phe Glu Asp Ile Pro Leu Glu Glu Pro Glu
1235 1240 1245
Val Lys Val Ile Pro Asp Asp Ser Gln Thr Asp Ser Gly Met Val Leu
1250 1255 1260
Ala Ser Glu Glu Leu Lys Thr Leu Glu Asp Arg Asn Lys Leu Ser Pro
1265 1270 1275 1280
Ser Phe Gly Gly Met Met Pro Ser Lys Ser Arg Glu Ser Val Ala Ser
1285 1290 1295
Glu Gly Ser Asn Gln Thr Ser Gly Tyr Gln Ser Gly Tyr His Ser Asp
1300 1305 1310
Asp Thr Asp Thr Thr Val Tyr Ser Ser Asp Glu Ala Gly Leu Leu Lys
1315 1320 1325
Met Val Asp Ala Ala Val His Ala Asp Ser Gly Thr Thr Leu Arg Ser
1330 1335 1340
Pro Pro Val
1345
<210> 14
<211> 22
<212> DNA
<213> Artificial Sequence
<400> 14
ttgctctcag atgcgacttg cc 22
<210> 15
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 15
gtgtatgcat gttggcagag aatag 25
<210> 16
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 16
gctcgactag agcttgcgga 20
<210> 17
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 17
ttcactgata gcaaacgcct ctcc 24
<210> 18
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 18
cactctagca gccactggag aagga 25
<210> 19
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 19
tgtagccagt aatgggctct gagac 25
<210> 20
<211> 26
<212> DNA
<213> Artificial Sequence
<400> 20
gacttacagc tctctgttag catctg 26
<210> 21
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 21
gcagaattcc acaatcacca tgaga 25
<210> 22
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 22
ggatcggcca ttgaacaaga t 21
<210> 23
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 23
cagaagaact cgtcaagaag gc 22
<210> 24
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
gctataaagc agccacctca catt 24
<210> 25
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
gatcgaattg ccagagacca tct 23
<210> 26
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 26
aggagattag tcagcagtag atccc 25
<210> 27
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 27
gtctgtgagg tctgagttac gg 22
<210> 28
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 28
caggggaaac gaagtttcct ttcat 25
<210> 29
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 29
gacaagcgtt agtaggcaca tatac 25
<210> 30
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 30
gctccaattt cccacaacat tagt 24
<210> 31
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 31
acatgcacag tctacgccaa 20
<210> 32
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 32
gccatgcgct ctaggatgat 20
<210> 33
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 33
ccctgcgaag taccttggtt 20
<210> 34
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 34
tctcccgact ttgttgaccg 20
<210> 35
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 35
tcaccatctt ccaggagcga ga 22
<210> 36
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 36
gaaggccatg ccagtgagct t 21
<210> 37
<211> 446
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 37
Glu Val Gln Leu Val Gln Ser Gly Gly Gly Leu Val Lys Pro Gly Gly
1 5 10 15
Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Ser Tyr
20 25 30
Ser Met Asn Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val
35 40 45
Ser Ser Ile Ser Ser Ser Ser Ser Tyr Ile Tyr Tyr Ala Asp Ser Val
50 55 60
Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn Ala Lys Asn Ser Leu Tyr
65 70 75 80
Leu Gln Met Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys
85 90 95
Ala Arg Val Thr Asp Ala Phe Asp Ile Trp Gly Gln Gly Thr Met Val
100 105 110
Thr Val Ser Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala
115 120 125
Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu
130 135 140
Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly
145 150 155 160
Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser
165 170 175
Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu
180 185 190
Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr
195 200 205
Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Asp Lys Thr His Thr
210 215 220
Cys Pro Pro Cys Pro Ala Pro Glu Leu Leu Gly Gly Pro Ser Val Phe
225 230 235 240
Leu Phe Pro Pro Lys Pro Lys Asp Thr Leu Met Ile Ser Arg Thr Pro
245 250 255
Glu Val Thr Cys Val Val Val Asp Val Ser His Glu Asp Pro Glu Val
260 265 270
Lys Phe Asn Trp Tyr Val Asp Gly Val Glu Val His Asn Ala Lys Thr
275 280 285
Lys Pro Arg Glu Glu Gln Tyr Asn Ser Thr Tyr Arg Val Val Ser Val
290 295 300
Leu Thr Val Leu His Gln Asp Trp Leu Asn Gly Lys Glu Tyr Lys Cys
305 310 315 320
Lys Val Ser Asn Lys Ala Leu Pro Ala Pro Ile Glu Lys Thr Ile Ser
325 330 335
Lys Ala Lys Gly Gln Pro Arg Glu Pro Gln Val Tyr Thr Leu Pro Pro
340 345 350
Ser Arg Glu Glu Met Thr Lys Asn Gln Val Ser Leu Thr Cys Leu Val
355 360 365
Lys Gly Phe Tyr Pro Ser Asp Ile Ala Val Glu Trp Glu Ser Asn Gly
370 375 380
Gln Pro Glu Asn Asn Tyr Lys Thr Thr Pro Pro Val Leu Asp Ser Asp
385 390 395 400
Gly Ser Phe Phe Leu Tyr Ser Lys Leu Thr Val Asp Lys Ser Arg Trp
405 410 415
Gln Gln Gly Asn Val Phe Ser Cys Ser Val Met His Glu Ala Leu His
420 425 430
Asn His Tyr Thr Gln Lys Ser Leu Ser Leu Ser Pro Gly Lys
435 440 445
<210> 38
<211> 214
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 38
Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Val Ser Ala Ser Ile Gly
1 5 10 15
Asp Arg Val Thr Ile Thr Cys Arg Ala Ser Gln Gly Ile Asp Asn Trp
20 25 30
Leu Gly Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile
35 40 45
Tyr Asp Ala Ser Asn Leu Asp Thr Gly Val Pro Ser Arg Phe Ser Gly
50 55 60
Ser Gly Ser Gly Thr Tyr Phe Thr Leu Thr Ile Ser Ser Leu Gln Ala
65 70 75 80
Glu Asp Phe Ala Val Tyr Phe Cys Gln Gln Ala Lys Ala Phe Pro Pro
85 90 95
Thr Phe Gly Gly Gly Thr Lys Val Asp Ile Lys Gly Thr Val Ala Ala
100 105 110
Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly
115 120 125
Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala
130 135 140
Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln
145 150 155 160
Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser
165 170 175
Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr
180 185 190
Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser
195 200 205
Phe Asn Arg Gly Glu Cys
210
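The primer records above (SEQ ID NO: 14-36) list each sequence in blocks of ten bases followed by its length, which repeats the value in the <211> field. As a minimal sketch, the Python below checks that declared length against the actual base count, computes GC content, and converts a three-letter residue run to one-letter code. The function names are hypothetical, only three records are transcribed for illustration, and none of this is part of the method described in the patent.

```python
# Illustrative helpers (hypothetical names) for reading ST.25-style sequence records.

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_dna_line(line):
    """Split e.g. 'ttgctctcag atgcgacttg cc 22' into (sequence, declared_length)."""
    *blocks, declared = line.split()
    return "".join(blocks), int(declared)


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def to_one_letter(residues):
    """Convert a space-separated three-letter residue run to one-letter code."""
    return "".join(THREE_TO_ONE[r] for r in residues.split())


# A few primer lines copied from the listing above (SEQ ID NO: 14, 16, 22).
PRIMER_LINES = {
    14: "ttgctctcag atgcgacttg cc 22",
    16: "gctcgactag agcttgcgga 20",
    22: "ggatcggcca ttgaacaaga t 21",
}

if __name__ == "__main__":
    for seq_id, line in PRIMER_LINES.items():
        seq, declared = parse_dna_line(line)
        status = "ok" if len(seq) == declared else "MISMATCH"
        print(f"SEQ ID NO: {seq_id}: {len(seq)} nt "
              f"(declared {declared}, {status}), GC = {gc_fraction(seq):.2f}")
    # Last six residues of SEQ ID NO: 38 as a small conversion example.
    print(to_one_letter("Phe Asn Arg Gly Glu Cys"))
```

The same length check could be applied to every primer record by adding its line to `PRIMER_LINES`.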

Claims (15)

1. A humanized VEGFR2 protein, wherein the humanized VEGFR2 protein comprises all or a portion of a human VEGFR2 protein.
2. The humanized VEGFR2 protein of claim 1, wherein the humanized VEGFR2 protein comprises at least 200 consecutive amino acids of the extracellular region of human VEGFR2 protein; preferably, the humanized VEGFR2 protein comprises an amino acid sequence having at least 80%, 85%, 90%, 95% or at least 99% identity to positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4, or comprises the amino acid sequence shown in positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751 or 26-751 of SEQ ID NO: 4; and/or,
the humanized VEGFR2 protein comprises an amino acid sequence encoded by a part of exon 2, all of exons 3 to 14 and a part of exon 15 of human VEGFR2 gene; preferably, comprises an amino acid sequence encoded by SEQ ID NO: 7.
3. The humanized VEGFR2 protein of claim 1 or 2, wherein the amino acid sequence of the humanized VEGFR2 protein comprises one of the following group:
I) all or part of the amino acid sequence of SEQ ID NO: 13;
II) an amino acid sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% identity to SEQ ID NO: 13;
III) an amino acid sequence that differs from SEQ ID NO: 13 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or no more than 1 amino acid; or
IV) the amino acid sequence shown in SEQ ID NO: 13 with substitution, deletion and/or insertion of one or more amino acid residues.
4. A humanized VEGFR2 gene, wherein the humanized VEGFR2 gene comprises a portion of the human VEGFR2 gene; preferably, the humanized VEGFR2 gene encodes the humanized VEGFR2 protein of any one of claims 1-3.
5. The humanized VEGFR2 gene of claim 4, wherein the humanized VEGFR2 gene comprises part of exon 2, all of exons 3 to 14, and part of exon 15 of human VEGFR2 gene, and preferably further comprises intron 2-3 and/or intron 14-15; preferably, the humanized VEGFR2 gene comprises SEQ ID NO: 7.
6. The humanized VEGFR2 gene of claim 4 or 5, wherein the nucleotide sequence of the humanized VEGFR2 gene comprises one of the following group:
(i) the transcribed mRNA is part or all of the nucleotide sequence set forth in SEQ ID NO: 12;
(ii) the transcribed mRNA has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identity to the nucleotide sequence of SEQ ID NO: 12;
(iii) the transcribed mRNA differs from SEQ ID NO: 12 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or no more than 1 nucleotide; or
(iv) the transcribed mRNA has the nucleotide sequence of SEQ ID NO: 12 with one or more nucleotides substituted, deleted and/or inserted;
preferably, the humanized VEGFR2 gene comprises SEQ ID NO: 8 and/or 9.
7. A targeting vector, wherein said targeting vector comprises any one of:
A) a portion of the human VEGFR2 gene, preferably comprising part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, preferably further comprising intron 2-3 and/or intron 14-15; further preferably, said humanized VEGFR2 gene comprises SEQ ID NO: 7;
B) a nucleotide sequence encoding all or a portion of a human VEGFR2 protein, preferably a nucleotide sequence encoding all or a portion of the extracellular region of a human VEGFR2 protein, more preferably a nucleotide sequence encoding positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751, or 26-751 of SEQ ID NO: 4;
C) a nucleotide sequence encoding the humanized VEGFR2 protein of any of claims 1-3; or
D) the humanized VEGFR2 gene of any one of claims 4-6;
preferably, the targeting vector further comprises a 5' arm and/or a 3' arm, the 5' arm being a nucleotide sequence having at least 90% homology to NCBI accession No. NC_000071.6; preferably, the 5' arm is identical to, or as shown in, SEQ ID NO: 5; the 3' arm being a nucleotide sequence having at least 90% homology to NCBI accession No. NC_000071.6; preferably, the 3' arm is identical to, or as shown in, SEQ ID NO: 6.
8. A method for constructing a non-human animal with a humanized VEGFR2 gene, wherein the non-human animal expresses a human or humanized VEGFR2 protein and/or the genome of the non-human animal comprises a portion of the human VEGFR2 gene; preferably, the non-human animal expresses a humanized VEGFR2 protein of any of claims 1-3.
9. The method of claim 8, wherein the genome of the non-human animal comprises the humanized VEGFR2 gene of any one of claims 4-6, preferably the genome of the non-human animal comprises part of exon 2, all of exons 3 to 14, and part of exon 15 of human VEGFR2 gene, preferably further comprises intron 2-3 and/or intron 14-15; more preferably, the genome of said non-human animal comprises SEQ ID NO: 7.
10. The method of construction of claim 8 or 9, comprising introducing into the non-human animal VEGFR2 locus a nucleotide sequence comprising any one of:
A) a portion of the human VEGFR2 gene, preferably comprising part of exon 2, all of exons 3 to 14, and part of exon 15 of the human VEGFR2 gene, preferably further comprising intron 2-3 and/or intron 14-15; further preferably, comprising SEQ ID NO: 7; or
B) a nucleotide sequence encoding all or a portion of a human VEGFR2 protein, preferably a nucleotide sequence encoding all or a portion of the extracellular region of a human VEGFR2 protein, more preferably a nucleotide sequence encoding positions 1-764, 14-764, 20-764, 26-764, 23-760, 20-751, or 26-751 of SEQ ID NO: 4;
preferably, said introducing is an insertion or a substitution; more preferably, said introducing replaces the corresponding region at the VEGFR2 locus of the non-human animal; still further preferably, all or part of exons 2 to 15 of the non-human animal VEGFR2 gene is replaced;
preferably, the human or humanized VEGFR2 gene is regulated in a non-human animal by endogenous regulatory elements.
11. The method of any one of claims 8 to 10, wherein the targeting vector of claim 7 is used to construct a non-human animal.
12. The method of any one of claims 8-11, further comprising mating the VEGFR2 gene humanized non-human animal with a non-human animal modified in other genes, or performing in vitro fertilization or direct gene editing, and screening to obtain a multi-gene modified non-human animal;
preferably, the other genes comprise one or more of CD3, PD-L1, PD-1, CD47, CD27, CD28, SIRPA, GITR or TIGIT.
13. The method of any one of claims 8-12, wherein the non-human animal is a mouse or rat.
14. A cell, tissue or organ that expresses the humanized VEGFR2 protein of any of claims 1-3, or that comprises the humanized VEGFR2 gene of any of claims 4-6 in its genome, or that is derived from a non-human animal obtained by the construction method of any of claims 8-13.
15. Use of the humanized VEGFR2 protein of any one of claims 1-3, the humanized VEGFR2 gene of any one of claims 4-6, the non-human animal obtained by the method of construction of any one of claims 8-13, or the cell, tissue or organ of claim 14, said use comprising:
A) development of products involving immune processes of human cells;
B) use as a model system for pharmacological, immunological, microbiological or medical research;
C) production and use of animal experimental disease models for etiology research and/or for the development of diagnostic strategies and/or for the development of therapeutic strategies;
D) screening, validating, evaluating or studying VEGFR2 pathway function; or
E) screening and evaluating human drugs, and drug efficacy studies.
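Claims 3 and 6 define variants by a minimum percent identity and by a maximum number of differing residues or nucleotides. The sketch below shows one simple way such thresholds could be checked. It assumes the two sequences have already been aligned to equal length, with '-' marking gaps; the claims above do not specify an alignment method, so that step, the function names, and the toy sequences are all hypothetical.

```python
# Minimal sketch: compare two pre-aligned sequences column by column.
# Assumption: both strings have equal length and use '-' for alignment gaps.

def count_differences(aligned_a, aligned_b):
    """Number of alignment columns where the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


def percent_identity(aligned_a, aligned_b):
    """Identity over all alignment columns (one convention among several)."""
    diffs = count_differences(aligned_a, aligned_b)
    return 100.0 * (len(aligned_a) - diffs) / len(aligned_a)


def meets_thresholds(aligned_a, aligned_b, min_identity=80.0, max_differences=10):
    """Return (identity_ok, differences_ok) for the two claim-style criteria."""
    identity_ok = percent_identity(aligned_a, aligned_b) >= min_identity
    differences_ok = count_differences(aligned_a, aligned_b) <= max_differences
    return identity_ok, differences_ok


if __name__ == "__main__":
    reference = "EVQLVQSGGG"   # toy fragment, not a claimed sequence
    variant = "EVQLVQSGGA"     # one substitution at the last position
    print(count_differences(reference, variant))
    print(percent_identity(reference, variant))
    print(meets_thresholds(reference, variant, min_identity=95.0))
```

Counting identity over all alignment columns, gaps included, is only one convention; a real comparison would use whichever alignment tool and definition the specification relies on.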
CN202111596716.9A 2020-12-25 2021-12-24 VEGFR2 gene humanized non-human animal and construction method and application thereof Pending CN114316025A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011564564X 2020-12-25
CN202011564564 2020-12-25

Publications (1)

Publication Number Publication Date
CN114316025A true CN114316025A (en) 2022-04-12

Family

ID=81012545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111596716.9A Pending CN114316025A (en) 2020-12-25 2021-12-24 VEGFR2 gene humanized non-human animal and construction method and application thereof

Country Status (1)

Country Link
CN (1) CN114316025A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106892980A (en) * 2017-01-25 2017-06-27 长春金赛药业有限责任公司 Anti-VEGFR2 monoclonal antibody and application thereof
CN107815466A (en) * 2016-08-31 2018-03-20 北京百奥赛图基因生物技术有限公司 Preparation method and application of humanized genetically modified animal model
CN109053895A (en) * 2018-08-30 2018-12-21 中山康方生物医药有限公司 Anti-PD-1/anti-VEGFA bifunctional antibody, pharmaceutical composition thereof and use thereof
CN111440244A (en) * 2020-04-09 2020-07-24 诺未科技(北京)有限公司 Metastatic cancer vaccine targeting VEGFR2

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107815466A (en) * 2016-08-31 2018-03-20 北京百奥赛图基因生物技术有限公司 Preparation method and application of humanized genetically modified animal model
CN106892980A (en) * 2017-01-25 2017-06-27 长春金赛药业有限责任公司 Anti-VEGFR2 monoclonal antibody and application thereof
CN109053895A (en) * 2018-08-30 2018-12-21 中山康方生物医药有限公司 Anti-PD-1/anti-VEGFA bifunctional antibody, pharmaceutical composition thereof and use thereof
CN111440244A (en) * 2020-04-09 2020-07-24 诺未科技(北京)有限公司 Metastatic cancer vaccine targeting VEGFR2

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHANGCHIEN CY et al.: "Accession No.: NP_002244.1", Accession No.: NP_002244.1 *

Similar Documents

Publication Publication Date Title
CN111304246B (en) Humanized cytokine animal model, preparation method and application
CN111057721B (en) Preparation method and application of humanized IL-4 and/or IL-4R alpha modified animal model
CN112779285B (en) Construction method and application of humanized IL-10 and IL-10RA gene modified animal
CN112430621B (en) Construction method and application of IL2RA gene humanized non-human animal
CN111793646B (en) Construction method and application of non-human animal subjected to IL1R1 gene humanization transformation
CN114277055A (en) Non-human animal humanized by IL1B and IL1A genes and construction method and application thereof
CN113429472B (en) CD94 and NKG2A gene humanized non-human animal and its preparation method and use
CN112300265B (en) Construction method and application of IL33 gene humanized non-human animal
CN108070614B (en) Preparation method and application of humanized gene modified animal model
CN113046390B (en) Humanized non-human animal of CSF1R gene, construction method and application thereof
CN112553213B (en) CX3CR1 gene humanized non-human animal and construction method and application thereof
CN113881681B (en) CCR8 gene humanized non-human animal and construction method and application thereof
CN111793648B (en) Construction method and application of ETAR gene humanized and transformed non-human animal
CN112501205B (en) Construction method and application of CEACAM1 gene humanized non-human animal
CN114316025A (en) VEGFR2 gene humanized non-human animal and construction method and application thereof
CN113264996A (en) Humanized non-human animal and preparation method and application thereof
CN113461802A (en) CD276 gene humanized non-human animal and construction method and application thereof
CN113046389A (en) CCR2 gene humanized non-human animal and construction method and application thereof
CN112553252A (en) Construction method and application of TNFR2 gene humanized non-human animal
CN114853871B (en) Humanized non-human animal of CSF1 and/or CSF1R gene, construction method and application thereof
CN112501202B (en) CXCR4 gene humanized non-human animal and construction method and application thereof
CN113388640B (en) CCR4 gene humanized non-human animal and construction method and application thereof
CN113817770B (en) Construction method and application of CD73 gene humanized non-human animal
CN112501203B (en) Construction method and application of IL17RB gene humanized non-human animal
CN112481303B (en) IL15RA gene humanized non-human animal and construction method and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination