CN114395017A

CN114395017A - Preparation method and application of SARS-CoV-2 virus-like particle

Info

Publication number: CN114395017A
Application number: CN202111271508.1A
Authority: CN
Inventors: 戴俊彪; 马英新; 毛国斌
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2021-10-29
Filing date: 2021-10-29
Publication date: 2022-04-26
Also published as: WO2023070873A1

Abstract

The present invention relates to a composition for preparing SARS-CoV-2 virus-like particles, comprising a first plasmid comprising a StruΔS fragment, and either or both of a second plasmid comprising an S fragment and a third plasmid comprising a packaging signal fragment of SARS-CoV-2 virus. The preparation method of the composition comprises splitting the StruΔS fragment, the S fragment and/or the ORF1ab packaging signal fragment into short DNA fragments with homologous sequences between adjacent fragments, and carrying out homologous recombination with a linearized plasmid vector in a yeast cell. The invention also relates to a method for preparing SARS-CoV-2 virus-like particles using the composition, to the SARS-CoV-2 virus-like particles so prepared, and to the use of the composition in preparing a vaccine for preventing or treating SARS-CoV-2 virus infection and in in vitro studies of SARS-CoV-2 infection of cells.

Description

Preparation method and application of SARS-CoV-2 virus-like particle

Technical Field

The invention relates to the technical field of molecular biology, in particular to a preparation method and application of SARS-CoV-2 virus-like particles.

Background

A virus-like particle (VLP) is a highly structured protein particle that self-assembles from one or more structural proteins of a virus. VLPs cannot replicate within the host, but because they contain some or all of the viral proteins, they still elicit a strong immune response. VLPs may therefore serve as highly effective vaccine formulations. In general, VLPs form characteristic structures covered with repetitive protein arrays, and this strong immunogenicity makes them useful as vaccines. VLPs are of great interest in vaccine research. Among VLP vaccines currently marketed or under development, the human VLP vaccines are mainly HBV, HPV and HEV vaccines, while a veterinary VLP vaccine is used for prevention of porcine circovirus type 2. Human VLP vaccines under research also target various viruses, including influenza virus, human immunodeficiency virus (HIV), Ebola virus and the like, and veterinary VLP vaccines under research include those against foot-and-mouth disease virus, porcine parvovirus, classical swine fever virus and the like. However, most current VLPs contain only some or all of the structural proteins; research on including non-structural proteins, accessory proteins and the like is limited, and the immune responses such VLPs induce as vaccines are correspondingly limited.

The novel coronavirus (SARS-CoV-2) is a single-stranded RNA virus that mutates readily, and its variant strains are highly infectious, which makes vaccine development particularly important. Although SARS-CoV-2 vaccines based on various technologies have entered the clinical stage, SARS-CoV-2 VLP-based vaccines have been less studied. One of the key reasons is the lack of a simple, highly efficient platform for the synthesis and preparation of SARS-CoV-2 VLPs.

Disclosure of Invention

In view of these technical problems, the invention provides a method for de novo artificial synthesis and preparation of SARS-CoV-2 VLPs, comprising splitting the SARS-CoV-2 genome into a plurality of fragments, synthesizing the fragments, assembling them on vectors, and transfecting a packaging cell to obtain the VLPs.

The first aspect of the present invention provides a composition for preparing SARS-CoV-2 virus-like particles, comprising the following (a), and further comprising either or both of the following (b) and (c):

(a) a first plasmid comprising a StruΔS fragment, said StruΔS fragment comprising a nucleic acid sequence encoding at least one structural protein of SARS-CoV-2 virus or a mutant thereof and/or at least one accessory protein or a mutant thereof, and said StruΔS fragment not comprising a nucleic acid sequence encoding an S protein of SARS-CoV-2 virus or a mutant thereof;

(b) a second plasmid comprising an S fragment comprising a nucleic acid sequence encoding an S protein of SARS-CoV-2 virus or a mutant thereof;

(c) a third plasmid comprising a packaging signal fragment of SARS-CoV-2 virus, said packaging signal fragment comprising a packaging signal sequence of SARS-CoV-2 virus.
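The combination rule in (a) to (c) can be sketched in a few lines of Python. This is only an illustration of the stated logic, assuming the composition is modeled as a tuple of plasmid labels; the label strings and the function name valid_compositions are hypothetical and do not come from the invention.

```python
from itertools import combinations

# Labels follow the text: (a) is always present, (b) and (c) are optional additions.
# The label strings themselves are illustrative only.
REQUIRED = "first plasmid (StruΔS fragment)"
OPTIONAL = (
    "second plasmid (S fragment)",
    "third plasmid (packaging signal fragment)",
)


def valid_compositions():
    """Return every composition holding (a) plus at least one of (b) and (c)."""
    result = []
    for k in range(1, len(OPTIONAL) + 1):
        for extra in combinations(OPTIONAL, k):
            result.append((REQUIRED,) + extra)
    return result


for composition in valid_compositions():
    print(" + ".join(composition))
```

Under this reading there are three admissible compositions: (a)+(b), (a)+(c), and (a)+(b)+(c).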

In some embodiments, the mutant of the S protein comprises the mutation N331Q, N501Y, D614G and/or P681H relative to the S protein of the SARS-CoV-2 virus.

In some embodiments, the StruΔS fragment comprises any one of:

(1) nucleic acid sequences encoding all structural proteins and all accessory proteins of the SARS-CoV-2 virus except the S protein;

(2) the nucleic acid sequences of (1), in which at least one of the structural and accessory proteins has a mutation relative to wild-type SARS-CoV-2 virus; and

(3) the nucleic acid sequences of (1), from which any one of the structural proteins and accessory proteins is deleted.

In some embodiments, the packaging signal sequence is nucleotides 19900-20000 or nucleotides 19773-20335 of NCBI sequence No. NC_045512.2.

In some embodiments, the packaging signal fragment comprises ORF1ab of the SARS-CoV-2 viral genome.

Another aspect of the present invention provides a method for preparing the above composition, comprising preparing the first plasmid, the second plasmid and/or the third plasmid by the following methods, respectively:

(1) splitting the StruΔS fragment, the S fragment and/or the ORF1ab packaging signal fragment, respectively, into short DNA fragments with homologous sequences between adjacent fragments;

(2) transferring the short DNA fragments obtained in step (1) and a linearized plasmid vector into a yeast cell, and performing homologous recombination to obtain the first plasmid, the second plasmid and/or the third plasmid, respectively.
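Step (1) can be illustrated in silico with a minimal sketch that cuts a sequence into fragments whose adjacent ends share a fixed-length overlap. The fragment length, the overlap length, the toy sequence and the function name split_with_overlaps are all assumptions made for illustration; the text above does not specify fragment or overlap sizes.

```python
def split_with_overlaps(sequence, fragment_length, overlap):
    """Split `sequence` into fragments in which each adjacent pair shares `overlap` bases."""
    if not 0 < overlap < fragment_length:
        raise ValueError("overlap must be positive and shorter than fragment_length")
    step = fragment_length - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(sequence[start:start + fragment_length])
        if start + fragment_length >= len(sequence):
            break  # the last fragment reaches the end of the sequence
        start += step
    return fragments


# Toy sequence and sizes, chosen only to keep the example readable.
toy_sequence = "ACGTACGTTGCAAGCTTAGC"
for fragment in split_with_overlaps(toy_sequence, fragment_length=8, overlap=3):
    print(fragment)
```

Each fragment begins with the last `overlap` bases of the one before it, which is the shared homologous end that the recombination in step (2) relies on.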

In another aspect of the present invention, there is provided a method for preparing SARS-CoV-2 virus-like particles, comprising: transfecting a packaging cell with the composition of any one of the above to obtain SARS-CoV-2 virus-like particles.

In some embodiments, the preparation method further comprises preparing the first plasmid, the second plasmid and/or the third plasmid by:

(1) splitting the StruΔS fragment, the S fragment and/or the packaging signal fragment, respectively, into short DNA fragments with homologous sequences between adjacent fragments;

In some embodiments, the preparation method further comprises separately amplifying the first, second and/or third plasmids in E. coli.

Another aspect of the present invention provides SARS-CoV-2 virus-like particles prepared by the above-described preparation method.

Another aspect of the present invention provides the use of SARS-CoV-2 virus-like particles prepared from the above composition or prepared by the above preparation method in the preparation of a vaccine for preventing or treating SARS-CoV-2 virus infection.

Another aspect of the present invention provides the use of the SARS-CoV-2 virus-like particle prepared from the above composition or by the above preparation method in vitro studies for SARS-CoV-2 virus infected cells.

The study of SARS-CoV-2 infection of cells may be a study of the interaction of VLPs with cells, a study of the process by which the virus infects cells, and/or a study of the effects of viral components.

Compared with the prior art, the invention has the following advantages and effects:

A. The invention uses a mammalian-cell eukaryotic expression system to express the structural, non-structural and accessory proteins of SARS-CoV-2 and assemble them into VLPs that can be used in vaccines, and the vaccine preparation method does not involve live virus. The method is safer than inactivated virus vaccines and live attenuated vaccines, and the resulting VLPs are more immunogenic than polypeptide or nucleic acid vaccines.

B. The invention splits the viral genome, first synthesizing small gene fragments and then assembling them through a yeast homologous recombination system; the method is simple, convenient, rapid and efficient.

C. By using the genome splitting strategy, the VLP has a complete virus particle morphology that includes structural proteins, non-structural proteins and the like, has strong immunogenicity, and is expected to be developed into an efficient vaccine with cross-protective efficacy.

Drawings

The present invention will become more fully understood from the detailed description given herein below and the accompanying drawings, wherein like elements are numbered alike, and wherein:

FIG. 1 shows the production of SARS-CoV-2 pseudovirus. (A) Full-length SARS-CoV-2 genome and the scheme for splitting it into the SARS-CoV-2 S, ORF1ab and StruΔS parts. (B) The split SARS-CoV-2 genome was assembled in yeast to construct the plasmids SARS-CoV-2 S, ORF1ab and StruΔS. (C) SARS-CoV-2 pseudovirus was obtained by co-transfection of the plasmids SARS-CoV-2 S, ORF1ab and StruΔS. The SARS-CoV-2 pseudovirus cannot produce progeny virus after infecting 293T/hACE2 cells.

FIG. 2 shows the construction and characterization of SARS-CoV-2 pseudovirus. (A) Confocal microscopy images of 293T cells 24 hours after co-transfection with the plasmids SARS-CoV-2 S, ORF1ab-mCherry and StruΔS-EGFP. (B) SARS-CoV-2 S and N proteins were detected in cell lysates (top) and SARS-CoV-2 pseudovirus (bottom) by Western blotting. GAPDH was used as a loading control for Western blotting. (C) Morphology and size of SARS-CoV-2 pseudoviruses and VLPs detected by TEM. Scale bar: 20 nm. (D, E) Confocal microscopy images (D) and scheme (E) of 293T/hACE2 cell infection by SARS-CoV-2 pseudovirus obtained by co-transfection of the plasmids SARS-CoV-2 S, ORF1ab-mCherry and StruΔS-EGFP.

FIG. 3 shows the analysis and verification of the packaging signal (PS). (A) The sequences of the ORF1b regions of SARS-CoV-2, SARS-CoV and bat SARS-CoV were analyzed using Biopython software (nt 19500-. The secondary structure of the same location in these three SARS-associated viruses was predicted by RNAstructure Web Server using default parameters and analyzed by Vienna RNA Web Server. Two identical RNA stem loops (SL1 and SL2) are indicated with rectangular boxes. (B) RT-PCR analysis of the PS583 gene in lysates of cells transfected with the plasmid EGFP-PS583. Red arrows indicate the expected amplification sequence. nRT: non-reverse transcribed group. (C) The EGFP, SARS-CoV-2 S and N proteins were detected by Western blotting in cell lysates (left) and VLP (EGFP-PS583) (right). GAPDH was used as a loading control for Western blotting and RT-PCR. (D, E) Confocal microscopy images (D) and scheme (E) 48 hours after infection of 293T/hACE2 cells with VLP (EGFP-PS583). As controls, 293T cells were infected with VLP (EGFP-PS583) and 293T/hACE2 cells were infected with VLP (EGFP).

FIG. 4 shows functional verification of different SARS-CoV-2 proteins. (A) Statistical analysis of DiO-labeled SARS-CoV-2 VLPs in the cytoplasm to test the effect of SARS-CoV-2 S mutations on viral infectivity. (B) Confocal microscopy images of 293T/hACE2 cells 48 hours after infection with VLP (EGFP-PS583), VLP (ΔN-EGFP-PS583) or VLP (EGFP), showing intracellular green fluorescence. (C) S and N proteins in different types of VLPs were detected by Western blotting. The morphology and size of the VLPs were observed by TEM. Scale bar: 20 nm.

FIG. 5 shows (A) a scheme for dual fluorescent labeling of VLPs (QD-DiO). (B) The morphology and size of VLPs (QD-DiO) were characterized by TEM, with white arrows indicating QDs. Scale bar: 20 nm. (C) Co-localization of DiO and QD signals in VLPs (QD-DiO). (D) Dynamic process of VLP (QD-DiO) entry into 293T/hACE2 cells. (E) Differential Interference Contrast (DIC) images of 293T/hACE2 cells. The black line indicates the track of the virus. (F, G) analysis of the mean velocity (F) and MSD plot (G) of the viral particles shown in (D). (H) Dynamic process of RNA-QD and Env-DiO dissociation of VLP (QD-DiO) in 293T/hACE2 cells in real time. (I-K) traces (I), mean velocities (J) and MSD plots (K) for RNA-QD and Env-DiO.

FIG. 6 shows the construction and characterization of the SARS-CoV-2 S plasmid. (A) RT-PCR analysis of the SARS-CoV-2 S protein coding gene in cell lysate. Red arrows indicate the expected amplification sequence. nRT: non-reverse transcribed group. (B) SARS-CoV-2 S protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 7 shows the construction and characterization of the SARS-CoV-2 StruΔS-EGFP plasmid. (A) Schematic representation of StruΔS-EGFP construction. (B) RT-PCR analysis of the genes encoding SARS-CoV-2 ORF3a and the N protein in cell lysates; red arrows indicate the expected amplification sequence, GAPDH was used as a loading control, nRT: non-reverse transcribed group. (C) Fluorescence imaging of 293T cells 24 hours after transfection with the SARS-CoV-2 StruΔS-EGFP plasmid. (D) Cell lysates (left) and VLPs (right) were assayed for N protein by Western blotting; GAPDH was used as a loading control. (E) The morphology and size of SARS-CoV-2 VLP (StruΔS) were examined by transmission electron microscopy. Scale bar: 20 nm.

FIG. 8 shows the construction and characterization of the SARS-CoV-2 ORF1ab-mCherry plasmid. (A) ORF1ab-mCherry construction scheme. (B) RT-PCR analysis of the genes encoding SARS-CoV-2 NSP1 and NSP16 in cell lysates; red arrows indicate the expected amplification sequence, GAPDH was used as a loading control, nRT: non-reverse transcribed group. (C) Fluorescence imaging of 293T cells 24 hours after transfection with the SARS-CoV-2 ORF1ab-mCherry plasmid.

FIG. 9 shows the construction and characterization of SARS-CoV-2 VLP (StruΔS-S). (A) Fluorescence imaging of 293T cells 24 hours after co-transfection with the SARS-CoV-2 StruΔS-EGFP and S plasmids. (B) Cell lysates (top) and VLP (StruΔS-S) (bottom) were assayed for N protein by Western blotting, and GAPDH was used as a loading control. (C) The morphology and size of SARS-CoV-2 VLP (StruΔS-S) were examined by transmission electron microscopy. Scale bar: 20 nm.

FIG. 10 shows the construction and characterization of SARS-CoV-2 VLP (StruΔS-ORF1ab). (A) Fluorescence imaging of 293T cells 24 hours after co-transfection with the SARS-CoV-2 StruΔS-EGFP and ORF1ab-mCherry plasmids. (B) The N protein in cell lysates (top) and VLP (StruΔS-ORF1ab) (bottom) was detected by Western blotting, and GAPDH was used as a loading control. (C) The morphology and size of SARS-CoV-2 VLP (StruΔS-ORF1ab) were examined by transmission electron microscopy. Scale bar: 20 nm.

FIG. 11 shows the verification of the biological safety of SARS-CoV-2 pseudovirus. (A) Expression of the hACE2 receptor on the constructed 293T/hACE2 cells, detected by Western blotting. (B) Cell lysates (left) and supernatants (right) were assayed for S and N proteins by Western blotting. (C) 293T/hACE2 cells were infected 48 hours after co-transfection with VLP or plasmid SARS-CoV-2 S, ORF1ab-mCherry and StruΔS-EGFP, and GAPDH was used as a loading control. (C) Images of 293T/hACE2 cells 72 hours after infection with wild-type SARS-CoV-2 virus and SARS-CoV-2 VLP.

FIG. 12 shows RNA structure prediction based on alignment of consensus sequences. LocARNA multiple alignment with default parameters was used to predict the consensus structure of the bat SARS-like coronavirus PS (MG772933.1: 19773-.

FIG. 13 shows the analysis and verification of the packaging signal. (A) The PS101 gene was detected in cell lysates by RT-PCR; red arrows indicate the expected amplification sequence, GAPDH was used as a loading control, nRT: non-reverse transcribed group. (B) Cell lysates (left) and VLPs (right) were assayed by Western blotting for the S and N proteins and EGFP, with GAPDH as a loading control. (C) Fluorescence imaging 48 hours after infection of 293T/hACE2 cells with VLP (EGFP-PS101) and VLP (EGFP).

FIG. 14 shows the construction and characterization of SARS-CoV-2 S (N331Q). (A) Schematic representation of the construction of SARS-CoV-2 S (N331Q). (B) RT-PCR analysis of the SARS-CoV-2 S (N331Q) encoding gene in cell lysates; red arrows indicate the expected amplification sequence, nRT: non-reverse transcribed group. (C) SARS-CoV-2 S (N331Q) protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 15 shows the construction and characterization of SARS-CoV-2 S (N501Y). (A) Schematic representation of the construction of SARS-CoV-2 S (N501Y). (B) RT-PCR analysis of the SARS-CoV-2 S (N501Y) encoding gene in cell lysates; red arrows indicate the expected amplicon, nRT: non-reverse transcribed group. (C) SARS-CoV-2 S (N501Y) protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 16 shows the construction and characterization of SARS-CoV-2 S (D614G). (A) Schematic representation of the construction of SARS-CoV-2 S (D614G). (B) RT-PCR analysis of the SARS-CoV-2 S (D614G) encoding gene in cell lysates; red arrows indicate the expected amplification sequence, nRT: non-reverse transcribed group. (C) SARS-CoV-2 S (D614G) protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 17 shows the construction and characterization of SARS-CoV-2 S (P681H). (A) Schematic representation of the construction of SARS-CoV-2 S (P681H). (B) RT-PCR analysis of the SARS-CoV-2 S (P681H) encoding gene in cell lysates; red arrows indicate the expected amplification sequence, nRT: non-reverse transcribed group. (C) SARS-CoV-2 S (P681H) protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 18 shows the construction and characterization of SARS-CoV-2 VLPs with S protein mutations. VLP (N331Q) (A), VLP (N501Y) (B), VLP (D614G) (C) and VLP (P681H) (D) were detected by Western blotting. The morphology and size of the SARS-CoV-2 VLPs were examined by transmission electron microscopy. Scale bar: 20 nm.

FIG. 19 shows the construction and characterization of the SARS-CoV-2 StruΔS-EGFP/ΔN plasmid. (A) Schematic representation of the construction of the SARS-CoV-2 StruΔS-EGFP/ΔN plasmid. (B) RT-PCR analysis of the genes encoding SARS-CoV-2 ORF3a and the N protein in cell lysates; red arrows indicate the expected amplification sequence, nRT: non-reverse transcribed group. (C) SARS-CoV-2 N protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 20 shows the construction and characterization of the SARS-CoV-2 StruΔS-EGFP/ΔE plasmid. (A) Schematic representation of the construction of the SARS-CoV-2 StruΔS-EGFP/ΔE plasmid. (B) RT-PCR analysis of the genes encoding SARS-CoV-2 ORF3a, E and N proteins in cell lysates; red arrows indicate the expected amplification sequence, nRT: non-reverse transcribed group. (C) SARS-CoV-2 N protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 21 shows the construction and characterization of SARS-CoV-2 VLP with the E protein deleted (VLP/ΔE). (A) The morphology and size of VLP/ΔE were examined by transmission electron microscopy. Scale bar: 20 nm. (B) Confocal microscopy images 48 hours after VLP/ΔE infection of 293T/hACE2 cells.

FIG. 22 shows the construction and characterization of the SARS-CoV-2 StruΔS-EGFP/ΔM plasmid. (A) Schematic representation of the construction of the SARS-CoV-2 StruΔS-EGFP/ΔM plasmid. (B) RT-PCR analysis of the genes encoding SARS-CoV-2 ORF3a, M and N proteins in cell lysates; red arrows indicate the expected amplification sequence, nRT: non-reverse transcribed group. (C) SARS-CoV-2 N protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 23 shows the construction and characterization of SARS-CoV-2 VLP with the M protein deleted (VLP/ΔM). (A) The morphology and size of VLP/ΔM were examined by transmission electron microscopy. Scale bar: 20 nm. (B) Confocal microscopy images 48 hours after VLP/ΔM infection of 293T/hACE2 cells.

FIG. 24 shows the construction and characterization of the SARS-CoV-2 StruΔS-EGFP/ΔORF10 plasmid. (A) Schematic representation of the construction of the SARS-CoV-2 StruΔS-EGFP/ΔORF10 plasmid. (B) RT-PCR analysis of the genes encoding the SARS-CoV-2 ORF3a, N and ORF10 proteins in cell lysates; red arrows indicate the expected amplification sequences, nRT: non-reverse transcribed group. (C) SARS-CoV-2 N protein was detected in cell lysates by Western blotting, and GAPDH was used as a loading control for Western blotting and RT-PCR.

FIG. 25 shows the characterization of protein-deleted SARS-CoV-2 VLP. (A) Schematic of plasmid construction for protein-deficient SARS-CoV-2 VLP. (B) Fluorescence co-localization results of protein-deficient SARS-CoV-2 VLPs (DiO and QDs).

Detailed Description

Unless otherwise indicated, terms used in the present invention have meanings commonly understood in the art and can be understood by referring to standard texts, references, documents known to those skilled in the art. The definitions and explanations provided herein are intended to clarify their particular use within the scope of the present invention. All references cited herein are incorporated by reference in their entirety.

Without wishing to be bound by any particular theory, any explanation of theory or mechanism in the present invention is given only to aid understanding of the invention and should not be seen as limiting its scope of protection.

As used herein, the terms "comprising," "including," "having," and other terms of similar meaning are to be construed to cover the listed elements without excluding the presence of other elements; these terms also cover the case consisting only of the listed elements. The term "consisting of" means consisting only of the recited elements. The term "consisting essentially of" means that elements without a significant impact on the solution involved are not excluded.

The terms "a", "an" or "the" herein include both the singular and the plural, unless specifically stated otherwise. The term "at least one" and similar expressions are equivalent in meaning to "one or more".

Recitations of ranges of values herein, such as ranges of temperature, time, composition or concentration, are intended to include the endpoints of the range, all intermediate ranges and sub-ranges (e.g., ranges between an intermediate value and an endpoint), and all individual values, in particular the intermediate ranges, sub-ranges and individual values that are integers. Also, any intermediate ranges, sub-ranges and individual values within a recited numerical range may be excluded from that range.

As used herein, "about" means a range of ± 20%, ± 18%, ± 15%, ± 12%, ± 10%, ± 9%, ± 8%, ± 7%, ± 6%, ± 5%, ± 4%, ± 3%, ± 2%, ± 1%, or ± 0.5% of the stated value, the stated range including the endpoints of the range and any number within the stated range.
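As a rough illustration of how this definition maps to numbers, the sketch below computes the inclusive bounds that "about" implies for a stated value at one tolerance. The function name about_range and the example value are hypothetical; only the list of tolerances is taken from the definition.

```python
# Tolerances listed in the definition of "about", in percent.
TOLERANCES = (20, 18, 15, 12, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0.5)


def about_range(value, tolerance_percent):
    """Return the inclusive (low, high) bounds for 'about value' at one tolerance."""
    delta = abs(value) * tolerance_percent / 100.0
    return (value - delta, value + delta)


# Example value chosen for illustration only.
for tolerance in TOLERANCES:
    low, high = about_range(100, tolerance)
    print(f"about 100 at ±{tolerance}%: {low} to {high}")
```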

The term "and/or" as used herein shall be understood to mean any one, any two, any three, or any combination of more of the elements joined by the term.

In the present invention, the terms "nucleic acid sequence" and "nucleotide sequence" are used interchangeably and refer to a sequence consisting of bases, sugars and a phosphate backbone linked together. The nucleic acid sequences of the invention may be deoxyribonucleic acid (DNA) sequences or ribonucleic acid (RNA) sequences, may include natural or non-natural bases, may be single- or double-stranded, and may be coding or non-coding sequences.

In the present invention, when a gene or a protein encoded by a gene is expressed, the "vector" may be a plasmid, and in some cases, the terms "vector" and "plasmid" may be used interchangeably.

In the present invention, unless otherwise specified, a nucleic acid sequence is described from the 5' end to the 3' end, and an amino acid sequence is described from the N-terminus to the C-terminus.

The method of the invention may be carried out in vitro or in vivo.

Reverse genetics is one of the major means of virus rescue or VLP production, and obtaining the viral genome is its most important step. Enzymatic digestion and ligation is the main method for obtaining a viral genome, but the reaction process is complex and inefficient, and the full-length viral genome is highly toxic when transformed into Escherichia coli. Yeast in vivo homologous recombination uses the efficient homologous recombination system of the yeast cell to assemble multiple DNA fragments that share homologous sequences with one another. Through a unique genome splitting design, genome synthesis and yeast in vivo homologous recombination assembly, the invention splits the SARS-CoV-2 genome and assembles it into three plasmids, and obtains SARS-CoV-2 VLPs with high structural integrity by cell transfection, thereby providing a good tool platform for further vaccine development.
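The assembly principle can be mimicked in silico: ordered fragments are joined wherever the end of one matches the start of the next, with the two arms of a linearized vector flanking the inserts. The sketch below is a simplified string model, not a simulation of recombination in yeast; the function name join_by_homology, the fixed overlap length and all sequences are hypothetical.

```python
def join_by_homology(fragments, overlap):
    """Join ordered fragments, merging each adjacent pair at their shared `overlap`-base end."""
    assembled = fragments[0]
    for next_fragment in fragments[1:]:
        if assembled[-overlap:] != next_fragment[:overlap]:
            raise ValueError("adjacent fragments do not share a homologous end")
        # Keep the shared end once, then append the rest of the next fragment.
        assembled += next_fragment[overlap:]
    return assembled


# Toy data: two vector arms flanking two insert fragments, each pair sharing 4 bases.
vector_left_arm = "TTGACGGATC"
insert_fragments = ["GATCACGTAC", "GTACTTGCAA"]
vector_right_arm = "GCAAGGCCTA"

construct = join_by_homology(
    [vector_left_arm, *insert_fragments, vector_right_arm], overlap=4
)
print(construct)
```

A fragment whose end does not match its neighbor raises an error, which loosely mirrors the requirement that adjacent fragments carry homologous sequences.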

Specifically, the present invention splits the SARS-CoV-2 genome into three fragments, each of which is assembled into a plasmid, giving three plasmids in total; a SARS-CoV-2 VLP can be obtained by transfecting a packaging cell with any one, any two or all three of these plasmids.

The SARS-CoV-2 can be wild-type SARS-CoV-2 or any mutant thereof. The genome of wild-type SARS-CoV-2 can be determined from NCBI sequence No. NC_045512.2.

The three fragments are, respectively, an S fragment comprising the coding sequence of the S protein of SARS-CoV-2, a StruΔS fragment comprising the coding sequences of all structural proteins and all accessory proteins except the S protein, and a packaging signal fragment comprising a packaging signal sequence.

All structural proteins of SARS-CoV-2 include the S protein (spike glycoprotein), the E protein (small envelope protein), the M protein (membrane glycoprotein), and the N protein (nucleocapsid protein). All accessory proteins of SARS-CoV-2 include the proteins encoded by ORF3a, ORF3b, ORF6, ORF7a, ORF7b, ORF8 and ORF10.

The S protein (spike glycoprotein) encoded by the S fragment may be a wild-type S protein or a mutant thereof. In some embodiments, the S protein mutant has, relative to the wild-type S protein, any one or more mutations selected from N331Q, N501Y, D614G and P681H. The wild-type S protein sequence may be the protein sequence encoded by nucleotides 21563-25384 of NCBI sequence No. NC_045512.2 or the sequence given by NCBI sequence No. YP_009724390.1. In some embodiments, the sequence encoding the S protein contained in the S fragment is nucleotides 21563-25384 of NCBI sequence No. NC_045512.2.
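Mutation labels such as N331Q follow the usual pattern of wild-type residue, 1-based position, and replacement residue. A minimal sketch of parsing such a label and applying it to a protein string is given below. The function names are hypothetical, and the 700-residue placeholder is not the real S protein sequence; it only carries the expected wild-type residues at the four named positions.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'D614G' into (wild_type, position, mutant)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_substitution(protein, label):
    """Return `protein` with the substitution applied, checking the wild-type residue first."""
    wild_type, position, mutant = parse_substitution(label)
    if not 1 <= position <= len(protein) or protein[position - 1] != wild_type:
        raise ValueError(f"{label}: residue {position} is not {wild_type}")
    return protein[:position - 1] + mutant + protein[position:]


# Placeholder sequence (NOT the real S protein): alanine everywhere except
# the wild-type residues expected at the four positions named in the text.
LABELS = ("N331Q", "N501Y", "D614G", "P681H")
residues = ["A"] * 700
for label in LABELS:
    wild_type, position, _ = parse_substitution(label)
    residues[position - 1] = wild_type
placeholder = "".join(residues)

mutant = placeholder
for label in LABELS:
    mutant = apply_substitution(mutant, label)
print(mutant[330], mutant[500], mutant[613], mutant[680])
```

Checking the wild-type residue before substituting catches numbering mismatches, for example when a label is applied to a sequence that is not in the same coordinate frame.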

The StruΔS fragment encodes any one or more of the structural proteins and accessory proteins, for example any one, any two or all three of the E protein (small envelope protein), the M protein (membrane glycoprotein) and the N protein (nucleocapsid protein), and/or any one or more of the accessory proteins, each of which independently may be wild-type or a mutant thereof. Most preferably, the StruΔS fragment of the invention encodes all structural proteins and accessory proteins other than the S protein. In some embodiments, however, the proteins encoded by the StruΔS fragment may lack one or more structural proteins and/or one or more accessory proteins relative to this most preferred scheme; for example, they may omit any one, any two or all three of the E, M and N proteins, and/or any one or more of the accessory proteins. The wild-type structural proteins and/or accessory proteins are those of wild-type SARS-CoV-2. In some embodiments, the nucleic acid encoding the structural and accessory proteins contained in the StruΔS fragment is nucleotides 25393-29674 of NCBI sequence No. NC_045512.2.

The packaging signal fragment comprises at least the packaging signal sequence of SARS-CoV-2, and may also comprise other sequences from the ORF1ab sequence of SARS-CoV-2. In some embodiments, the packaging signal sequence is nucleotides 19900-20000 or nucleotides 19773-20335 of NCBI sequence No. NC_045512.2. In some embodiments, the packaging signal fragment may comprise the full length of ORF1ab of SARS-CoV-2 or a fragment thereof. The full length of ORF1ab, or any one or more of its fragments, may be wild-type or a mutant thereof. In some embodiments of the invention, the fragment of ORF1ab may lack one or more of the ORFs in the ORF1ab sequence, but comprises at least a packaging signal sequence. The ORFs in the wild-type ORF1ab sequence are the ORFs in the ORF1ab sequence of wild-type SARS-CoV-2. In some embodiments, ORF1ab is nucleotide 266-.
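The complete nucleotide ranges quoted above for the S coding sequence, the StruΔS coding region and the two packaging signal sequences are 1-based and inclusive, whereas Python slices are 0-based and end-exclusive. The sketch below shows the conversion and the resulting region lengths. The dictionary keys, the function names and the all-"N" placeholder genome are assumptions for illustration; 29903 is the length of the NC_045512.2 record.

```python
# 1-based inclusive coordinates on NC_045512.2, as quoted in the text.
# The dictionary keys are illustrative labels only.
REGIONS = {
    "S coding sequence": (21563, 25384),
    "StruΔS coding region": (25393, 29674),
    "packaging signal (19900-20000)": (19900, 20000),
    "packaging signal (19773-20335)": (19773, 20335),
}


def region_length(start, end):
    """Length of a 1-based inclusive range."""
    return end - start + 1


def extract_region(genome, start, end):
    """Return the 1-based inclusive range [start, end] of `genome`."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError("range lies outside the genome")
    return genome[start - 1:end]


# Placeholder standing in for the real genome sequence, which is not reproduced here.
placeholder_genome = "N" * 29903

for name, (start, end) in REGIONS.items():
    region = extract_region(placeholder_genome, start, end)
    print(f"{name}: {len(region)} nt")
```

With a real genome string loaded from the NCBI record, the same helper would return the actual subsequences.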

As used herein, the term "wild-type" may apply to a gene, protein or virus strain, and generally refers to one isolated from a naturally occurring source. A wild type is typically the gene, protein or virus strain most frequently observed in a population, and is therefore designated "wild type". The term "mutant" refers to a gene, protein or virus strain having an insertion, substitution or deletion of one or more nucleotides or amino acids in the nucleic acid sequence of the gene, the amino acid sequence of the protein or the genomic sequence of the virus strain, when compared to the wild-type gene, protein or virus strain. A "mutant" usually retains the essential functional properties of the wild type, but the relative level of those properties may be altered, e.g. the activity of the protein, or of the protein encoded by the gene, may be increased or decreased. It will be understood that "mutants" may also be naturally occurring, for example where a virus undergoes natural variation that produces naturally occurring mutants of the virus; in that case "wild type" may refer to the earliest naturally occurring form of the same gene, protein or virus strain. In the present invention, the genome of wild-type SARS-CoV-2 can be determined from NCBI sequence No. NC_045512.2, and each wild-type structural protein, accessory protein and/or ORF in ORF1ab can be encoded by or determined from the genome sequence shown in NCBI sequence No. NC_045512.2. Viruses that SARS-CoV-2 produces by natural variation while it exists and spreads in nature, and the genes contained in their genomes and the proteins encoded by those genes, can all be referred to as "mutants". The "mutants" of the present invention also include those obtained by artificially modifying naturally occurring wild-type SARS-CoV-2 or naturally mutated SARS-CoV-2, as well as mutants of their genomes, of any of their genes and of any of their proteins.

In the present invention, each of the S fragment, the Stru ΔS fragment and the ORF1ab fragment is assembled into a plasmid for expressing the genes in these fragments or the proteins encoded thereby. In some embodiments, the plasmid containing the Stru ΔS fragment is referred to as a first plasmid, the plasmid containing the S fragment is referred to as a second plasmid, and the plasmid containing the packaging signal fragment is referred to as a third plasmid. The numbering of the plasmids serves only to distinguish them and does not imply a particular order or structure among the plasmids.

The initial plasmid vectors used to prepare the first, second and third plasmids may be the same or different; e.g., all three may use the same initial plasmid vector, three different initial plasmid vectors may be used, or two may share one vector that differs from the third. The initial plasmid vector used to prepare the first, second, or third plasmid may be any suitable vector common in the art, such as a vector that can shuttle between yeast and bacteria and can be transfected into a packaging cell, including but not limited to pEASY-T1, pRS415, pJS356, pEGFP-N1, and the like.

The sequences encoding the structural and/or accessory proteins contained in the first plasmid, the sequence encoding the S protein in the second plasmid, and/or the packaging signal sequence or the full length of ORF1ab or fragments thereof in the third plasmid may be operably linked to an expression regulatory sequence. The expression regulatory sequence is a sequence that regulates the expression and/or translation of a fragment contained in a plasmid, and includes, but is not limited to, a 5' untranslated region (5' UTR), a 3' untranslated region (3' UTR), a promoter, an enhancer, a terminator, a selection marker gene, a post-transcriptional regulatory element, an internal ribosome entry site (IRES), a cleavable sequence and/or a polyadenylation signal (polyA), and the like.

Promoters that can be used include, but are not limited to, promoters from viruses or mammals (including humans), and the like. Promoters may be constitutive or inducible. Examples of viral promoters include, but are not limited to, the cytomegalovirus (CMV) immediate early promoter, the RSV promoter, the chicken β-actin (CBA) promoter, the CMV early enhancer/chicken β-actin (CAG) promoter, the simian virus 40 (SV40) promoter, and the like. Examples of mammalian promoters include, but are not limited to, the human elongation factor 1α (EF1α) promoter, the human ubiquitin C (UbC) promoter, the mouse phosphoglycerate kinase (PGK) promoter, RNA polymerase III promoters (e.g., U6 and H1), and the like. Examples of constitutive promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter, the cytomegalovirus (CMV) promoter, the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter. Examples of inducible promoters include, but are not limited to, the zinc-inducible ovine metallothionein (MT) promoter, the dexamethasone (Dex)-inducible mouse mammary tumor virus (MMTV) promoter, the T7 polymerase promoter system, and the like.

Examples of the posttranscriptional regulatory element that can be used include viral posttranscriptional regulatory elements such as woodchuck hepatitis virus posttranscriptional regulatory element (WPRE), Hepatitis B Virus Posttranscriptional Regulatory Element (HBVPRE), and RNA Transport Element (RTE).

Selectable markers that can be used include, but are not limited to, drug resistance selectable markers. Such a selectable marker gene may encode a factor necessary for the survival or growth of cells grown in the selection medium; host cells that are not transformed with a vector comprising the selection gene will not survive in culture. Proteins that may be selected as selectable markers include those that confer resistance to antibiotics or other toxins (examples of which include, but are not limited to, ampicillin, hygromycin, neomycin, methotrexate, kanamycin, gentamicin, Zeocin, or tetracycline), complement an auxotrophic defect, or supply key nutrients withheld from the culture medium. The selectable marker may also be a fluorescent protein, such as mCherry, GFP, BFP, EGFP, and the like.

The cleavable sequence can be any sequence that is capable of self-cleavage upon expression or that can be cleaved by other means (e.g., by a particular enzyme). Sequences capable of self-cleavage upon expression are well known to those skilled in the art, for example, ribozyme sequences having a self-cleaving function, i.e., self-cleaving ribozyme sequences. A self-cleaving ribozyme catalyzes cleavage of its own RNA at a specific site. Self-cleaving ribozymes include, but are not limited to, a hammerhead ribozyme, a hairpin ribozyme, an HDV ribozyme, or RNase P. Sequences that can be cleaved by other means, such as by a particular enzyme, are also well known to those skilled in the art, e.g., tRNA sequences. The additional sequence at the 5' end of a tRNA can be cut off by the tRNA 5' maturase (RNase P), and the sequence at the 3' end of the tRNA can be cut off by the 3' endonuclease RNase Z. Thus, the cleavable sequence may be a tRNA sequence whose 5' additional sequence can be cleaved by RNase P, or a tRNA sequence whose 3' sequence can be cleaved by RNase Z. Any tRNA that is capable of being cleaved at maturity can be used in the invention.

The polyadenylation signal sequence may be located in a transcription termination region, such as the 3' untranslated region; examples include, but are not limited to, bovine growth hormone (bGH) poly(A), SV40 poly(A), thymidine kinase (TK) poly(A) sequences, and the like.

For the 5' untranslated region (5' UTR) and the 3' untranslated region (3' UTR), any suitable viral 5' UTR and 3' UTR can be used, such as the 5' UTR and 3' UTR of SARS-CoV-2.

In some embodiments, the nucleic acid encoding the structural and/or accessory proteins in the first plasmid is operably linked to a promoter and a 5' UTR at the 5' end, and to a 3' UTR, a polyA, and optionally a cleavable sequence at the 3' end. In some embodiments, the nucleic acid encoding the S protein in the second plasmid is operably linked to a promoter at the 5' end and a polyA at the 3' end. In some embodiments, the sequence of ORF1ab from SARS-CoV-2 in the third plasmid is operably linked to a promoter and a 5' UTR at the 5' end, and to a 3' UTR, a polyA, and optionally a cleavable sequence at the 3' end.

In the present invention, the Stru ΔS fragment, S fragment or ORF1ab fragment can be assembled into the first plasmid, the second plasmid or the third plasmid by the transformation-associated recombination (TAR) cloning method. TAR cloning splices and synthesizes a large target DNA by using the recombination system in yeast. Specifically, a plurality of fragments of the target DNA having homologous ends are transformed into yeast together with a linearized TAR vector, and the efficient recombination system in the yeast cell carries out homologous recombination between the vector and the homologous sequences of the target DNA fragments, and between the homologous sequences of different target DNA fragments, so that a vector carrying the target DNA is generated. In the present invention, TAR cloning can be carried out using an S fragment, a Stru ΔS fragment or an ORF1ab fragment as the target DNA. The TAR vector may be a plasmid vector. In practice, the target DNA is split into a plurality of DNA fragments; adjacent DNA fragments share homologous sequences (also called homology arms, overlapping sequences or overlapping fragments), and the DNA fragments at the two ends of the target DNA have homologous sequences with the two ends of the linearized vector, respectively. These DNA fragments and the linearized vector are transferred into yeast together, where they undergo homologous recombination and assemble into a vector containing the target DNA. The term "adjacent" means that the two DNA fragments are adjacent in position on the target DNA. The length of each of the split DNA fragments is typically 2-5 kb; for example, the lower and upper limits of the length range of each split DNA fragment may each be about 2 kb, about 2.1 kb, about 2.2 kb, about 2.3 kb, about 2.4 kb, about 2.5 kb, about 2.6 kb, about 2.7 kb, about 2.8 kb, about 2.9 kb, about 3 kb, about 3.1 kb, about 3.2 kb, about 3.3 kb, about 3.4 kb, about 3.5 kb, about 3.6 kb, about 3.7 kb, about 3.8 kb, about 3.9 kb, about 4 kb, about 4.1 kb, about 4.2 kb, about 4.3 kb, about 4.4 kb, about 4.5 kb, about 4.6 kb, about 4.7 kb, about 4.8 kb, about 4.9 kb, or about 5 kb; the length is preferably about 3-4 kb, more preferably about 3 kb. The length of the homologous sequence is usually 50-300 bp, preferably 100-250 bp, more preferably 150-200 bp.
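The splitting of a target DNA into fragments whose neighbors share a homologous (overlapping) sequence can be sketched as follows. This is only an illustrative sketch, not the procedure used by the inventors: the function names `split_with_overlaps` and `reassemble`, the even spacing of the cut points, and the random 9 kb test sequence are all assumptions, and the homology between the terminal fragments and the linearized vector is not modeled.

```python
import random


def split_with_overlaps(target, n_fragments, overlap):
    """Split `target` into `n_fragments` pieces (hypothetical helper).

    Each piece except the last is extended by `overlap` bases into its
    right-hand neighbor, so adjacent pieces share exactly `overlap` bases.
    """
    if n_fragments < 1 or overlap < 1:
        raise ValueError("n_fragments and overlap must be at least 1")
    core = len(target) // n_fragments  # non-overlapping core length
    if n_fragments > 1 and core <= overlap:
        raise ValueError("fragments are too short for the requested overlap")
    fragments = []
    for i in range(n_fragments):
        start = i * core
        if i == n_fragments - 1:
            end = len(target)  # last fragment runs to the end of the target
        else:
            end = (i + 1) * core + overlap
        fragments.append(target[start:end])
    return fragments


def reassemble(fragments, overlap):
    """Join fragments by their shared ends (an in silico stand-in for recombination)."""
    assembled = fragments[0]
    for frag in fragments[1:]:
        if assembled[-overlap:] != frag[:overlap]:
            raise ValueError("adjacent fragments do not share the expected overlap")
        assembled += frag[overlap:]
    return assembled


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical 9 kb target; a 200 bp overlap lies within the 150-200 bp range.
    target = "".join(random.choice("ACGT") for _ in range(9000))
    frags = split_with_overlaps(target, n_fragments=3, overlap=200)
    print([len(f) for f in frags])
    print(reassemble(frags, 200) == target)
```

With these assumed numbers, each fragment stays close to the preferred length of about 3 kb, and the check in `reassemble` mirrors the requirement that adjacent fragments carry matching homologous sequences.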

In some embodiments, the Stru ΔS fragment can be split into 3 DNA fragments. In some embodiments, the S fragment can be split into 2 DNA fragments. In some embodiments, the ORF1ab fragment can be split into 2 DNA fragments.

The split DNA fragments can be synthesized by methods known in the art, such as in vitro synthesis, and the synthesized DNA fragments can be single-stranded or double-stranded.

It is understood that the Stru ΔS fragment, S fragment, or ORF1ab fragment of the invention may contain only the SARS-CoV-2 gene(s) to be expressed and no other sequences (e.g., coding sequences or expression regulatory sequences), or may contain the SARS-CoV-2 gene(s) to be expressed together with one, two or more of the aforementioned expression regulatory sequences. The expression regulatory sequences may be added to the vector by any means. For example, the expression regulatory sequences may be contained in the Stru ΔS fragment, the S fragment and/or the ORF1ab fragment, split together with the SARS-CoV-2 genes in these fragments into a plurality of DNA fragments containing homology arms, and assembled into a plasmid by yeast homologous recombination (TAR). Alternatively, the expression regulatory sequences may be added to the plasmid vector in advance, and the DNA fragments with homology arms split from the Stru ΔS fragment, S fragment or ORF1ab fragment are then assembled into that vector by homologous recombination in yeast; in this case the Stru ΔS fragment, S fragment or ORF1ab fragment contains only the SARS-CoV-2 gene(s) to be expressed and does not contain the expression regulatory sequences. In other embodiments, some of the expression regulatory sequences are contained in the Stru ΔS fragment, the S fragment, and/or the ORF1ab fragment, split together with the SARS-CoV-2 genes in these fragments into a plurality of DNA fragments containing homology arms, and assembled into a plasmid by yeast homologous recombination (TAR), while the other expression regulatory sequences are added to the plasmid vector in advance.

Methods for homologous recombination of DNA fragments and linearized vectors in yeast are well known to those skilled in the art. One specific example is as follows: first, competent yeast cells are prepared by picking a yeast strain from a plate, inoculating it into 5 mL of yeast extract peptone dextrose medium (YPD), and culturing overnight; an appropriate amount of the overnight culture is then inoculated into a fresh 5 mL of YPD and cultured at 30 ℃ and 200 rpm until the OD₆₀₀ reaches 0.4 to 0.6. The cells are washed; 150 ng of the DNA fragments to be joined and 100 ng of linearized vector are mixed well and added to the competent yeast cells. Subsequently, primers are designed about 100 bp from the two ends of each homology arm for amplification, single yeast colonies are picked and cultured overnight, and junction PCR is carried out according to the PCR program. After the assembly is verified to be correct, cells are picked from the plate, inoculated into 5 mL of liquid SC-LEU medium, cultured overnight, and collected; finally, the cells are resuspended and the yeast total DNA is extracted for later use.
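For orientation, the DNA masses used in the transformation above (150 ng of fragment, 100 ng of linearized vector) can be converted into approximate molar amounts. The sketch below assumes an average mass of about 660 g/mol per base pair of double-stranded DNA, a 3 kb fragment, and a purely hypothetical 8 kb vector; none of these values is stated for this step in the description, and the function name `ng_to_pmol` is illustrative.

```python
# Approximate average molar mass of one base pair of double-stranded DNA
# (a commonly used rule-of-thumb value; an assumption, not from the source).
GRAMS_PER_MOL_PER_BP = 660.0


def ng_to_pmol(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    if mass_ng < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    # ng -> g is 1e-9, mol -> pmol is 1e12, giving a net factor of 1e3.
    return mass_ng * 1000.0 / (length_bp * GRAMS_PER_MOL_PER_BP)


if __name__ == "__main__":
    fragment_pmol = ng_to_pmol(150, 3000)  # 3 kb fragment: assumed size
    vector_pmol = ng_to_pmol(100, 8000)    # 8 kb vector: hypothetical size
    print(f"fragment: {fragment_pmol:.4f} pmol")
    print(f"vector:   {vector_pmol:.4f} pmol")
    print(f"fragment:vector molar ratio = {fragment_pmol / vector_pmol:.1f}")
```

Under these assumptions the fragment is in molar excess over the vector; the actual ratio depends on the real fragment and vector lengths.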

The vector containing the target DNA assembled in yeast can be extracted from the yeast by a conventional plasmid extraction method to obtain the first plasmid, the second plasmid or the third plasmid, which can then optionally be transferred into Escherichia coli for amplification; alternatively, the yeast total DNA can be extracted directly and used to transform Escherichia coli for amplification to obtain the first plasmid, the second plasmid or the third plasmid. Methods for obtaining and amplifying recombinant plasmids from yeast are well known in the art; for example, the yeast total DNA can be transferred into E. coli, single colonies picked and verified, and the verified clones amplified in large quantities, after which the plasmids are extracted and stored for later use. As a specific example, a single colony of EPI300 is first picked, inoculated into 5 mL of LB liquid medium and cultured overnight; the yeast total DNA is introduced by electroporation, the transformed culture is spread on a selective plate, single colonies are picked, and junction PCR is carried out according to the PCR program. Plasmids are then extracted from the colonies verified to be correct and checked by restriction digestion, followed by sequencing once the digestion pattern is correct. After verification, the E. coli is amplified in large quantities, and the plasmids are extracted and stored for later use.

In the present invention, the virus-like particles of SARS-CoV-2 can be obtained by transfecting a packaging cell with any two or three plasmids selected from the group consisting of the first plasmid, the second plasmid and the third plasmid, and allowing them to be expressed and assembled in the packaging cell. In some embodiments, the plasmids that can be used to transfect packaging cells include the first plasmid and either or both of the second plasmid and the third plasmid.

Before transfection into packaging cells, the expression of each plasmid may be verified. For example, each plasmid may be transfected separately into mammalian cells (e.g., 293T cells); after cell lysis, intracellular RNA and protein may be extracted and detected, for example, by RT-PCR and Western blot, respectively, or expression may be detected via the selectable marker, for example, by fluorescent protein imaging.

In the present invention, the terms "virus-like particle (VLP)", "pseudovirus", "recombinant virus" and "recombinant virus particle (rVP)" are used interchangeably and refer to a virus particle obtained by a process comprising at least one recombinant DNA technology step, wherein at least one component comprised in the viral genome is expressed and assembled by genetic engineering methods. The "virus-like particle" of the present invention may contain no viral genome, or may contain a gene that does not encode a viral protein, and is therefore non-replicative and non-infectious. The "recombinant viral particle" in the present invention refers to a viral particle obtained by transfecting a packaging cell with any two or three of the first plasmid, the second plasmid, and the third plasmid, and it may or may not contain nucleic acid. When the plasmids used for transfecting the packaging cell include the plasmid having the packaging signal, the resulting viral particle contains nucleic acid; when they do not include the plasmid having the packaging signal, the resulting viral particle does not contain nucleic acid. For example, in the present invention, rSARS-CoV-2 and SARS-CoV-2 VLPs can be used interchangeably.
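The rule relating the transfected plasmids to the nucleic acid content of the resulting particle can be written as a small lookup. This is a minimal sketch of the rule as stated in this paragraph only; the labels "first", "second" and "third" and the function name `describe_particle` are illustrative assumptions, and the sketch does not check whether a given combination actually yields particles.

```python
# Plasmid labels as used in the description:
#   "first"  = plasmid containing the Stru ΔS fragment
#   "second" = plasmid containing the S fragment
#   "third"  = plasmid containing the packaging signal fragment
KNOWN_PLASMIDS = {"first", "second", "third"}


def describe_particle(plasmids):
    """Return the transfected plasmids and whether the particle holds nucleic acid.

    Hypothetical helper: the particle is taken to contain nucleic acid exactly
    when the packaging-signal ("third") plasmid is among those transfected.
    """
    chosen = set(plasmids)
    unknown = chosen - KNOWN_PLASMIDS
    if unknown:
        raise ValueError(f"unknown plasmid label(s): {sorted(unknown)}")
    return {
        "plasmids": tuple(sorted(chosen)),
        "contains_nucleic_acid": "third" in chosen,
    }


if __name__ == "__main__":
    for combo in (["first", "second"], ["first", "third"], ["first", "second", "third"]):
        print(describe_particle(combo))
```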

"transfection" as used herein refers to the transfer of a polynucleotide, such as a nucleic acid molecule, plasmid, etc., from outside a cell into a cell such that the polynucleotide functions within the cell. Methods of transfection are well known to those skilled in the art, for example calcium phosphate or liposome-mediated transfection. As known to those skilled in the art, liposomes that can be used include lipofectamine 8000 and the like. In some embodiments, the transfection method comprises growing 293T to a density of about 80%, transfecting 293T cells with plasmid mixed with lipofectamine 8000 in DMEM supplemented with 10% FBS; after 6h, the medium was changed to DMEM, and the supernatant was collected after 48 h.

Packaging cells for the production of recombinant viral particles are well known to those skilled in the art; packaging cells that may be used include, but are not limited to, mammalian (including human) cells, insect cells, plant cells, microorganisms, or yeast, such as cells of the HEK293 series (e.g., HEK293A, HEK293T, or HEK293FT), A549 cells, or Vero cells, and the like.

Media and culture methods for producing recombinant viral particles are known to those skilled in the art; commercially available or customized media can be used, optionally further supplemented with one or more cell culture components known in the art, including but not limited to glucose, vitamins, amino acids, and/or growth factors, to increase the titer of recombinant viral particles in the production culture. Recombinant viral particle production cultures can be grown under conditions appropriate for the particular host cell (i.e., packaging cell) used. After production of the recombinant viral particles, if desired, the viral particles can be purified from the cell lysate or viral supernatant using a variety of conventional methods, including: ultrafiltration, e.g., using nitrocellulose filters; adsorption, e.g., adsorbing viruses or impurities with calcium phosphate or an ion exchanger, followed by elution with a salt solution; chromatography, such as Sephadex chromatography, ion exchange chromatography, affinity chromatography, and the like; centrifugation, such as differential centrifugation or CsCl or sucrose density gradient centrifugation; and precipitation, such as polyethylene glycol precipitation (e.g., using PEG2000), isoelectric point precipitation or neutral salt precipitation. The above purification methods may also be used in combination. In some embodiments, as one example of the purification method, a 10% (w/v) PEG20000 solution is prepared and mixed with the cultured virus supernatant at a volume ratio of 1:1, the mixture is left to stand at 4 ℃ for 16 h for concentration, and the precipitate is collected by centrifugation and then resuspended in Opti-MEM for use; VLPs prepared in this way can be used in experiments such as VLP infection. As another example of the purification method, the collected virus culture supernatant is ultracentrifuged through a 20% (w/v) sucrose solution at 30000 rpm for 4 h to preliminarily purify the VLPs, which can then be used for characterization of RNA, protein, VLP morphology, etc.
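The arithmetic behind the PEG20000 precipitation step can be made explicit: mixing a 10% (w/v) stock 1:1 by volume with supernatant halves the PEG concentration. The following is only a worked calculation under the stated mixing ratio; the function names and the 50 mL stock volume are illustrative assumptions.

```python
def peg_grams_for_stock(percent_wv, volume_ml):
    """Grams of PEG needed for a stock solution; x% (w/v) means x g per 100 mL."""
    if percent_wv < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    return percent_wv * volume_ml / 100.0


def final_percent_wv(stock_percent_wv, stock_parts, sample_parts):
    """PEG concentration after mixing stock and sample in the given volume parts."""
    if stock_parts <= 0 or sample_parts < 0:
        raise ValueError("stock_parts must be positive, sample_parts non-negative")
    return stock_percent_wv * stock_parts / (stock_parts + sample_parts)


if __name__ == "__main__":
    # 50 mL of stock is a hypothetical batch size, not a value from the text.
    print(peg_grams_for_stock(10.0, 50.0), "g PEG20000 for 50 mL of 10% (w/v) stock")
    print(final_percent_wv(10.0, 1.0, 1.0), "% (w/v) PEG after 1:1 mixing")
```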

The recombinant virus particles of the present invention can be used as an active ingredient in pharmaceutical compositions or vaccines to treat diseases caused by SARS-CoV-2. The term "vaccine" refers to a formulation comprising the recombinant viral particles. The dosage of the recombinant viral particles comprised in the vaccine can be adjusted by the skilled person, e.g. according to the disease condition, the subject and the schedule of treatment. The vaccine typically comprises a therapeutically effective amount of recombinant viral particles. As used herein, "treating" or "treatment" refers to preventing or ameliorating (e.g., reducing, alleviating or curing) at least one symptom associated with a disease state. A "therapeutically effective amount" as referred to herein is an amount sufficient to prevent or alleviate (e.g., reduce, alleviate or cure) at least one symptom associated with a disease state. The dosage of the pharmaceutical composition or vaccine can be conveniently determined by one skilled in the art, e.g. by first identifying a dose effective to elicit a prophylactic or therapeutic immune response, e.g. by measuring the serum titer of virus-specific immunoglobulin or by measuring the inhibition ratio of the antibody in a serum sample or a urine sample or mucosal secretion.

The pharmaceutical composition or vaccine of the invention may comprise the recombinant viral particles of the invention and a pharmaceutically acceptable carrier or excipient. Pharmaceutically acceptable carriers include, but are not limited to, saline, buffered saline, dextrose, water, glycerol, sterile isotonic aqueous buffer, and combinations thereof. The pharmaceutical composition or vaccine of the invention may comprise adjuvants, which are well known to those skilled in the art. Exemplary adjuvants include complete Freund's adjuvant, incomplete Freund's adjuvant, aluminum hydroxide adjuvant, BCG, and the like.

The pharmaceutical composition or vaccine of the invention should be suitable for administration to a subject, e.g., be sterile, non-particulate and/or non-pyrogenic. The pharmaceutical composition or vaccine may be formulated in solid form, such as lyophilized powder, liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation or powder, etc., which may be used to formulate an injection.

Methods of administration of the pharmaceutical compositions or vaccines of the present invention include, but are not limited to, parenteral (e.g., intradermal, intramuscular, intravenous and subcutaneous), epidural, and mucosal (e.g., intranasal and oral or pulmonary routes or by suppositories). Administration may be systemic or topical.

The recombinant virus particles of the invention can also be used in vitro research of SARS-CoV-2 virus infected cells, for example, in vitro drug screening, such as in vitro screening of drugs capable of inhibiting SARS-CoV-2 virus infected cells; development of antibodies, e.g., antibodies can be prepared with the recombinant viral particles as immunogens or antibody neutralization activity can be detected; it can also be used for developing virus vaccine, for example, vaccine can be developed conveniently and rapidly aiming at new virus mutant, especially new virus mutant with mutant S protein.

In some embodiments, the principles of genome splitting, synthesis, and assembly are shown in figure 1. First, the genomic ORF1ab is split into 6 fragments of 3-5 kb, the S protein gene is split into 2 fragments, and the genes of the other structural proteins and accessory proteins are split into 3 fragments; then, each fragment is synthesized in vitro, and the fragments are assembled into the three fragments ORF1ab, S and Stru ΔS by yeast homologous recombination; the three fragments are constructed on the corresponding vectors and transferred into Escherichia coli to obtain plasmids; finally, the plasmids are transfected into mammalian cells to obtain VLPs. The characterization of the various VLPs is shown in figs. 3-4.

In some embodiments, the invention provides a method for de novo artificial synthesis and preparation of SARS-CoV-2 VLPs, comprising:

The first step: splitting, synthesis and assembly of the SARS-CoV-2 genome

(1) The SARS-CoV-2 genome is split into three parts, ORF1ab, Stru ΔS and S, wherein the ORF1ab sequence runs from nucleotide 266 to 21555 and comprises the sequences of all non-structural proteins; the Stru ΔS sequence runs from nucleotide 25393 to 29674 and comprises the sequences of all structural and accessory proteins except the S protein; and the S sequence runs from nucleotide 21563 to 25384 and comprises the sequence of the S protein (the specific sequences are shown in the attachment);

(2) ORF1ab is split into 6 fragments of about 3 kb, Stru ΔS into 3 fragments of about 3 kb, and S into 2 fragments of 3 kb; each fragment is synthesized in vitro, and overlapping sequences are designed on adjacent fragments;

(3) The fragments obtained in the previous step are subjected to homologous recombination in yeast: using the overlaps between adjacent fragments, they are assembled into the three complete fragments ORF1ab, Stru ΔS and S and, at the same time, assembled onto the plasmid vectors pJS356 and pEASY-T-CAG, respectively, to obtain the three plasmids SARS-CoV-2 ORF1ab, SARS-CoV-2 Stru ΔS and SARS-CoV-2 S. A 5' UTR and a 3' UTR with a polyA tail are added to the two ends of the ORF1ab and Stru ΔS coding sequences, respectively, and a CMV enhancer is added to the 5' end of each of these sequences; a polyA tail is added to the 3' end of the S protein coding sequence, and a CAG enhancer is added to its 5' end;

The specific operation steps of yeast homologous recombination are as follows: first, competent yeast cells are prepared by picking a yeast strain from a plate, inoculating it into 5 mL of yeast extract peptone dextrose medium (YPD), and culturing overnight; an appropriate amount of the overnight culture is then inoculated into a fresh 5 mL of YPD and cultured at 30 ℃ and 200 rpm until the OD₆₀₀ reaches 0.4 to 0.6. The cells are washed; 150 ng of the DNA fragments (about 3 kb) to be joined and 100 ng of linearized vector are mixed well and added to the competent yeast cells. Subsequently, primers are designed about 100 bp from the two ends of each homology arm for amplification, single yeast colonies are picked and cultured overnight, and junction PCR is carried out according to the PCR program. After the assembly is verified to be correct, cells are picked from the plate, inoculated into 5 mL of liquid SC-LEU medium, cultured overnight, and collected; finally, the cells are resuspended and the yeast total DNA is extracted for later use;

(4) The yeast total DNA is transferred into Escherichia coli, single colonies are picked and verified, and the verified clones are amplified in large quantities, after which the plasmids are extracted and stored for later use;

The specific operation steps are as follows: first, a single colony of EPI300 is picked, inoculated into 5 mL of LB liquid medium and cultured overnight; the yeast total DNA is introduced by electroporation, the transformed culture is spread on a selective plate, single colonies are picked, and junction PCR is carried out according to the PCR program. Plasmids are then extracted from the colonies verified to be correct and checked by restriction digestion, followed by sequencing once the digestion pattern is correct. After verification, the E. coli is amplified in large quantities, and the plasmids are extracted and stored for later use.
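As a rough check on the genome splitting described in items (1) and (2) of the first step, the length of each genomic region and the mean genomic length per fragment can be computed from the stated coordinates. The sketch assumes that the coordinates are 1-based and inclusive; the dictionary layout and the function name `region_length` are illustrative. The synthesized fragments also carry overlaps and added regulatory elements, so their real lengths differ from these genomic means.

```python
# (start, end, number of fragments) as given in items (1) and (2) above.
REGIONS = {
    "ORF1ab": (266, 21555, 6),
    "Stru ΔS": (25393, 29674, 3),
    "S": (21563, 25384, 2),
}


def region_length(start, end):
    """Length of a region, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


if __name__ == "__main__":
    for name, (start, end, n_fragments) in REGIONS.items():
        length = region_length(start, end)
        mean_bp = length / n_fragments
        print(f"{name}: {length} bp in {n_fragments} fragments, "
              f"mean genomic length {mean_bp:.0f} bp per fragment")
```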

The second step: expression and preparation of SARS-CoV-2 VLPs

(1) The three plasmids constructed from the split SARS-CoV-2 genome are verified separately. First, 293T cells are transfected with each of the three plasmids individually; the cells are then lysed, intracellular RNA and protein are extracted, and RT-PCR and western blot verification are carried out, respectively. In addition, plasmid expression is verified by fluorescence imaging of EGFP in SARS-CoV-2 Stru ΔS and mCherry in SARS-CoV-2 ORF1ab.

(2) 293T cells are grown to a density of about 80% and transfected, respectively, with SARS-CoV-2 Stru ΔS alone; with SARS-CoV-2 Stru ΔS and SARS-CoV-2 ORF1ab; or with SARS-CoV-2 Stru ΔS, SARS-CoV-2 S and SARS-CoV-2 ORF1ab, in Opti-MEM medium; after 6 h, the medium is changed to DMEM, and the supernatant is collected after 48 h.

(3) A 10% (w/v) PEG20000 solution is prepared and mixed with the virus supernatant at a volume ratio of 1:1; the mixture is left to stand at 4 ℃ for 16 h for concentration, and the precipitate is collected by centrifugation and then resuspended in Opti-MEM for use. Alternatively, the collected supernatant is ultracentrifuged through a 20% (w/v) sucrose solution at 30000 rpm for 4 h to preliminarily purify the VLPs, which are mainly used for characterization of RNA, protein and VLP morphology.

In some embodiments, the preparation of VLPs by plasmid co-transfection, and their validation, comprises any of the following steps:

(1) Preparation of Stru ΔS VLPs by transfecting the Stru ΔS plasmid alone into 293T cells

293T cells were plated on 100 mm cell culture dishes and grown to a cell density of about 90%; 15 μg of SARS-CoV-2 Stru ΔS was transfected into each dish with the transfection reagent lipo8000, and the medium was changed after 6 h. The supernatant was collected after 48 h to obtain the VLPs. The cells in the culture dish were lysed for RT-PCR and western blot to verify the expression of the plasmid in 293T cells; expression of the plasmid was also verified by fluorescence imaging, the N protein in the VLPs was analyzed by western blot, and the size and morphology of the VLPs were verified by transmission electron microscopy;

(2) Preparation of Stru ΔS + S VLPs by co-transfecting the Stru ΔS and S plasmids into 293T cells

293T cells were plated on 100 mm cell culture dishes and grown to a cell density of about 90%; 15 μg of SARS-CoV-2 Stru ΔS and 15 μg of SARS-CoV-2 S were transfected into each dish with the transfection reagent lipo8000, and the medium was changed after 6 h. The supernatant was collected after 72 h to obtain the Stru ΔS + S VLPs. Expression of the plasmids was verified by fluorescence imaging, intracellular expression and the S protein and N protein in the VLPs were analyzed by western blot, and the size and morphology of the VLPs were verified by transmission electron microscopy;

(3) Preparation of Stru ΔS + ORF1ab VLPs by co-transfecting the Stru ΔS and ORF1ab plasmids into 293T cells

293T cells were plated on 100 mm cell culture dishes and grown to a cell density of about 90%; 15 μg of SARS-CoV-2 Stru ΔS and 15 μg of SARS-CoV-2 ORF1ab were transfected into each dish with the transfection reagent lipo8000, and the medium was changed after 6 h. The supernatant was collected after 48 h to obtain the Stru ΔS + ORF1ab VLPs. Expression of the plasmids was verified by fluorescence imaging, intracellular expression and the N protein in the VLPs were analyzed by western blot, and the size and morphology of the VLPs were verified by transmission electron microscopy;

(4) Preparation of Stru ΔS + S + ORF1ab VLPs by co-transfecting the Stru ΔS, S and ORF1ab plasmids into 293T cells

293T cells were plated on 100 mm cell culture dishes and grown to a cell density of about 90%; 15 μg of SARS-CoV-2 Stru ΔS, 15 μg of SARS-CoV-2 S and 15 μg of SARS-CoV-2 ORF1ab were transfected into each dish with the transfection reagent lipo8000, and the medium was changed after 6 h. The supernatant was collected after 48 h to obtain the Stru ΔS + S + ORF1ab VLPs. Expression of the plasmids was verified by fluorescence imaging, intracellular expression and the S protein and N protein in the VLPs were analyzed by western blot, and the size and morphology of the VLPs were verified by transmission electron microscopy;

(5) Preparation of Stru ΔS + S_Mutations + ORF1ab VLPs by co-transfecting the Stru ΔS, ORF1ab and S_Mutations plasmids into 293T cells

First, single-site mutations were introduced into the S protein plasmid (namely SARS-CoV-2 S) by PCR; in this case, N331Q, N501Y, D614G and P681H are taken as examples. 293T cells were transfected with each S mutant plasmid and lysed, and the expression of the S mutant plasmids was verified by RT-PCR. Then, S mutant VLPs were prepared: for each S protein mutation, 293T cells were plated on 100 mm cell culture dishes and grown to a cell density of about 90%; 15 μg of SARS-CoV-2 Stru ΔS, 15 μg of SARS-CoV-2 S_Mutations and 15 μg of SARS-CoV-2 ORF1ab plasmid were transfected into each dish using the transfection reagent lipo8000, and the medium was changed after 6 h. The supernatant was collected after 48 h to obtain the Stru ΔS + S_Mutations + ORF1ab VLPs. Expression of the plasmids was verified by fluorescence imaging, intracellular expression and the S protein and N protein in the VLPs were analyzed by western blot, and the size and morphology of the VLPs were verified by transmission electron microscopy.
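The single-site mutation labels used above (N331Q, N501Y, D614G, P681H) follow the pattern wild-type residue, position, new residue. A minimal sketch of how such labels can be parsed and applied to a protein sequence in silico is given below, assuming positions are counted from amino acid 1 of the S protein; the function names and the short toy sequence are hypothetical and are not the real S protein sequence.

```python
import re

# Pattern: one-letter wild-type residue, 1-based position, one-letter new residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'D614G' into ('D', 614, 'G')."""
    match = _MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a single-site mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(protein, label):
    """Return `protein` with the substitution applied (positions are 1-based)."""
    wild_type, position, new = parse_mutation(label)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {protein[position - 1]}"
        )
    return protein[:position - 1] + new + protein[position:]


if __name__ == "__main__":
    for label in ("N331Q", "N501Y", "D614G", "P681H"):
        print(label, parse_mutation(label))
    toy = "MFVNL"  # hypothetical toy sequence, for illustration only
    print(apply_mutation(toy, "N4Q"))
```

Checking the wild-type residue before substituting guards against numbering mistakes, for example counting from a position other than amino acid 1.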

The technical solution of the present invention will be described in further detail below by way of examples with reference to the accompanying drawings, but the present invention is not limited to the following examples. Unless otherwise indicated, the nucleotide numbering of the SARS-CoV-2 genome and parts thereof as described herein and in the examples below is based on NCBI sequence number NC_045512.2, and the amino acid position numbering of the S protein and its mutants is based on the amino acid sequence of the S protein, counting from amino acid 1 of the S protein.

Example 1: materials and methods

Plasmid: three plasmids, SARS-CoV-2 S, ORF1ab-mCherry and Stru ΔS-EGFP, were constructed based on the TAR system in yeast to prepare virus particles of SARS-CoV-2. A CAG promoter was added at the 5' end of the codon-optimized cDNA sequence encoding the SARS-CoV-2 S glycoprotein and a bGH poly(A) signal at the 3' end; the resulting sequence was split into two fragments with overlapping sequences (SEQ ID NO: 1, SEQ ID NO: 2), which were assembled and cloned into the pEASY-T1 vector by homologous recombination in yeast to form plasmid pEASY-T-S (also referred to as the SARS-CoV-2 S plasmid or S plasmid in the following examples). The CMV promoter, 5' UTR, 3' UTR, HDV, SV40 polyA and mCherry-encoding genes were added to the 5' end of ORF1ab of SARS-CoV-2, and the sequence thus obtained was divided into six fragments with overlapping sequences (SEQ ID NOS: 3-8); these overlapping fragments were assembled in yeast and cloned into the pJS356 vector (obtained by adding, on the basis of the pR415 plasmid, ori2-oriV elements for inducing high expression of low-copy plasmids and the SopA, SopB, RepE and SopC functional elements for high expression in Escherichia coli) to form plasmid pJS356-ORF1ab-mCherry (also referred to in the following examples as the SARS-CoV-2 ORF1ab plasmid, SARS-CoV-2 ORF1ab-mCherry plasmid, ORF1ab-mCherry plasmid or ORF1ab plasmid). The CMV promoter, the T7 promoter, the 5' UTR, the 3' UTR, HDV, SV40 polyA and EGFP-encoding genes were added to the 3' end of the DNA sequences encoding all structural proteins (except the spike protein) and all accessory proteins (including ORF3a, ORF3b, ORF6, ORF7a, ORF7b, ORF8 and ORF10), and the sequence thus obtained was split into three fragments with overlapping sequences (SEQ ID NOS: 9-11), which were assembled in yeast and cloned into the pJS356 vector to form plasmid pJS356-Stru ΔS-EGFP (also referred to in the following examples as the SARS-CoV-2 Stru ΔS-EGFP plasmid, Stru ΔS-EGFP plasmid or Stru ΔS plasmid). The resulting plasmids were verified by restriction digestion and sequencing.

The DNA sequence of the packaging signal (PS for short) of SARS-CoV-2 genomic RNA (nt 19900-20000 or 19773-20335), obtained by bioinformatic prediction, was inserted into the 3' non-coding region of the EGFP gene in the pEGFP-N1 vector to construct plasmid pEGFP-N1-PS101 or pEGFP-N1-PS583. PS101 (nt 19900-20000) or PS583 (nt 19773-20335) was amplified from plasmid pJS356-ORF1ab-mCherry. Plasmid pEGFP-N1 was digested with Not I restriction endonuclease and the PS-containing fragment was inserted to produce pEGFP-N1-PS101 or pEGFP-N1-PS583.
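The nucleotide ranges above use 1-based, inclusive genome coordinates. The following is a minimal sketch, not part of the original method, of how such a range maps onto a 0-based Python slice and how its length is obtained. The function names are illustrative assumptions. Under this convention nt 19900-20000 spans 101 nt and nt 19773-20335 spans 563 nt; the construct names are used exactly as given in the text.

```python
# Sketch: 1-based inclusive genome coordinates -> Python slices.
# Function names are hypothetical; coordinates are those quoted in the text.

def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(genome: str, start: int, end: int) -> str:
    """Return the subsequence for 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(genome)):
        raise ValueError("coordinates fall outside the sequence")
    # A 1-based inclusive range [start, end] is the slice [start-1:end].
    return genome[start - 1:end]


# Packaging-signal candidate regions as named in the text.
PS_REGIONS = {"PS101": (19900, 20000), "PS583": (19773, 20335)}

if __name__ == "__main__":
    for name, (start, end) in PS_REGIONS.items():
        print(name, start, end, region_length(start, end))
```

With a genome string loaded from the reference record, `extract_region(genome, 19900, 20000)` would return the shorter candidate region.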

Site-directed mutagenesis: S protein gene mutants were constructed using the SARS-CoV-2 S plasmid as a template, giving four plasmids: SARS-CoV-2 S (N331Q), SARS-CoV-2 S (N501Y), SARS-CoV-2 S (D614G) and SARS-CoV-2 S (P681H). A stretch of 15-20 bases around the target mutation site was selected as the forward primer, and its complementary sequence was selected as the reverse primer, as shown in Table 1. After site-directed mutagenesis PCR, the template strand was digested with the restriction enzyme DpnI. The product was directly transformed into E. coli DH5α competent cells, and single clones were selected and sequenced.
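The primer rule described above (a short forward primer spanning the mutated codon, with the complementary sequence as the reverse primer) can be sketched as follows. This is an illustration only: the function names, the 8-base flank (giving a 19-base primer) and any template passed in are assumptions, and the primers actually used are those listed in Table 1.

```python
# Sketch of complementary mutagenesis primer design.
# All names and the default flank length are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mutagenesis_primers(template: str, codon_start: int,
                        new_codon: str, flank: int = 8) -> tuple[str, str]:
    """Build a forward primer carrying new_codon and its reverse complement.

    codon_start is the 0-based index of the first base of the target codon.
    The primer length is 2 * flank + 3 bases.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three bases")
    left = codon_start - flank
    right = codon_start + 3 + flank
    if left < 0 or right > len(template):
        raise ValueError("flank extends beyond the template")
    forward = template[left:codon_start] + new_codon + template[codon_start + 3:right]
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    # Toy template, not a real S gene sequence.
    toy = "A" * 8 + "AAC" + "T" * 8
    print(mutagenesis_primers(toy, 8, "CAG"))
```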

Prediction of ORF1b RNA secondary structure: the sequences in the ORF1b region of SARS-CoV-2, SARS-CoV and bat SARS-CoV were selected from the reference genome using Biopython (nt 19500-. Secondary structure was predicted with the RNAstructure web server using default parameters and visualized with the ViennaRNA web server.

Multiple sequence alignment and prediction of similar structures: we predicted and compared the conserved RNA stem loop at the 3' end of ORF1b, which is the predicted packaging signal in SARS-associated viruses. The LocARNA web server was used to compare SARS-CoV-2(NC_045512.2: 19773-.

Cell construction and culture: 293T cells stably expressing human ACE2 (293T/hACE2) were constructed for this study. The hACE2-expressing cell line was generated by transfecting 293T cells with the hACE2 plasmid and selecting with 2 μg/mL puromycin. hACE2 expression in the cells was detected by western blotting. 293T cells were cultured in DMEM medium supplemented with 10% FBS and 1% penicillin and streptomycin in a 37 °C, 5% CO₂ incubator. 293T/hACE2 cells were cultured in DMEM medium supplemented with 10% FBS, 1% penicillin and streptomycin, and 2 μg/mL puromycin in a 37 °C, 5% CO₂ incubator.

Construction of VLPs: to obtain VLPs, the plasmids SARS-CoV-2 S (10 μg), ORF1ab-mCherry (10 μg) and Stru ΔS-EGFP (10 μg) were mixed with lipofectamine 8000, transfected into 293T cells, and the cells were cultured in DMEM medium containing 10% FBS for 48 hours. To form VLPs (EGFP-PS101) or VLPs (EGFP-PS583) assembled with the packaging signal, pEGFP-N1-PS101 (1 μg) or pEGFP-N1-PS583 (1 μg) was co-transfected into 293T cells for 48 hours together with the SARS-CoV-2 S (10 μg) and Stru ΔS (10 μg) plasmids.

VLP purification: the culture supernatant was collected 48 hours after transfection and filtered through a 0.45 μm filter. The culture supernatant was layered onto a 20% sucrose cushion and ultracentrifuged at 30000 rpm for 3 hours using a SW32 rotor. The purified VLPs were used for transmission electron microscopy. For western blot analysis, the culture supernatant was layered on top of a 20% to 40% sucrose density gradient and ultracentrifuged at 30000 rpm for 4 hours using a SW41 rotor. For fluorescence imaging, VLPs were concentrated with PEG 20000.
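The text specifies rotor speed in rpm only. If a relative centrifugal force is needed to reproduce the run on a different rotor, the standard conversion RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm) can be applied. The sketch below is an illustration under that formula; the radius value is a placeholder assumption and must be replaced with the radius from the rotor manufacturer's data sheet.

```python
# Sketch: convert rotor speed (rpm) to relative centrifugal force (x g).
# The radius below is a placeholder, NOT a value from the source text.

def rcf_from_rpm(radius_cm: float, rpm: float) -> float:
    """Relative centrifugal force for a given radius (cm) and speed (rpm)."""
    if radius_cm <= 0 or rpm < 0:
        raise ValueError("radius must be positive and rpm non-negative")
    return 1.118e-5 * radius_cm * rpm ** 2


if __name__ == "__main__":
    ASSUMED_RADIUS_CM = 15.0  # placeholder; look up the actual rotor radius
    print(f"{rcf_from_rpm(ASSUMED_RADIUS_CM, 30000):.0f} x g")
```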

RNA extraction and RT-PCR: at 48 hours, total RNA was isolated from the cell culture supernatant with the FastPure cell/tissue Total RNA kit V2 and reverse-transcribed to cDNA according to the kit instructions. Using the cDNA as a template, the specific primers listed in Table 1 were used to amplify the coding region of SARS-CoV-2 S, N, NSP1, NSP16 or ORF3a, and the PCR products were verified by electrophoresis in a 1% agarose gel.

Western blot analysis: cells or VLPs were lysed on ice for five minutes with RIPA buffer containing protease inhibitor and PMSF and centrifuged at 13000 g for 5 minutes to remove large cell debris. Protein lysates were boiled at 100 °C for 10 min, separated on 4-20% acrylamide gels, and transferred to nitrocellulose membranes. The nitrocellulose membrane was blocked overnight at 4 °C with TBST buffer containing 1% Tween-20 and 5% milk, and then incubated with IRDye 680RD goat anti-mouse antibody or IRDye 800CW goat anti-rabbit antibody. The results were acquired with a ChemiDoc MP imaging system.

Transmission electron microscopy: the VLPs were adsorbed onto a carbon-coated copper grid for 10 minutes and stained with 2% (w/v) phosphotungstic acid (pH 7.1) for 1 minute, and the samples were observed with a 200 kV transmission electron microscope.

Preparation of the quantum dot nanobeacon: tellurium powder (20 mg) and sodium borohydride (11 mg) were added to a round-bottomed flask containing 1 mL of ultrapure water, and the mixture was stirred for 5 hours under oxygen-free and ice-bath conditions to obtain sodium hydrogen telluride. To prepare the cadmium precursor, cadmium chloride, zinc chloride and N-acetyl-L-cysteine were mixed in a 1:1:4 ratio and the pH was adjusted to 9.0 with NaOH. DNA co-modified with BHQ2 and phosphorothioate, the cadmium precursor and the sodium hydrogen telluride were mixed and transferred to a stainless steel autoclave lined with polytetrafluoroethylene. The mixture was heated at 200 °C for 36 minutes to obtain BHQ2-DNA functionalized CdTe:Zn²⁺ QDs, and the quantum dots were purified by centrifugation at 8000 rpm for 10 minutes using an Amicon Ultra-4 centrifugal filtration device (50 kDa). The BHQ2-DNA functionalized CdTe:Zn²⁺ QDs were annealed at 95 °C for 10 minutes and then held at room temperature for 30 minutes to form stem-loop structures.
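As a worked check of the reagent amounts quoted above, the masses of tellurium and sodium borohydride can be converted to millimoles. This is a minimal sketch; the molar masses (Te 127.60 g/mol, NaBH₄ 37.83 g/mol) are standard values supplied here as assumptions rather than taken from the text.

```python
# Sketch: convert the quoted reagent masses to millimoles.
# Molar masses are standard reference values, not given in the source.

MOLAR_MASS_G_PER_MOL = {"Te": 127.60, "NaBH4": 37.83}


def millimoles(mass_mg: float, molar_mass: float) -> float:
    """Amount in mmol from mass in mg and molar mass in g/mol."""
    return mass_mg / molar_mass


te_mmol = millimoles(20.0, MOLAR_MASS_G_PER_MOL["Te"])
nabh4_mmol = millimoles(11.0, MOLAR_MASS_G_PER_MOL["NaBH4"])
molar_ratio = nabh4_mmol / te_mmol  # NaBH4 : Te

if __name__ == "__main__":
    print(f"Te: {te_mmol:.3f} mmol, NaBH4: {nabh4_mmol:.3f} mmol, "
          f"ratio {molar_ratio:.2f}:1")
```

Under these molar masses the borohydride is in roughly 1.9-fold molar excess over tellurium.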

Fluorescent labeling of VLPs: to construct fluorescently labeled VLPs, 40 μL of annealed quantum dot nanobeacons (10 μM) were added to the culture medium 6 hours after transfection of the SARS-CoV-2 plasmids to label the viral genome, so that the nanobeacons were incorporated into the VLPs during the viral self-assembly phase. To 2 mL of the collected virus solution, 0.5 μL of DiO (1 mM) was added and incubated at 37 °C for 1 hour to label the viral envelope.
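The labeling quantities above can be restated as an absolute amount of nanobeacon and a final DiO concentration. The sketch below is a simple dilution calculation using only the volumes and concentrations quoted in the text; the function names are illustrative assumptions.

```python
# Sketch: amounts and final concentrations for the labeling step.
# Function names are hypothetical; numbers come from the text.

def amount_pmol(volume_ul: float, conc_um: float) -> float:
    """Amount in pmol: 1 uL of a 1 uM solution contains 1 pmol."""
    return volume_ul * conc_um


def final_conc_um(stock_um: float, stock_ul: float, sample_ul: float) -> float:
    """Final concentration after adding stock_ul of stock to sample_ul."""
    return stock_um * stock_ul / (stock_ul + sample_ul)


beacon_pmol = amount_pmol(40.0, 10.0)             # 40 uL of 10 uM nanobeacon
dio_final_um = final_conc_um(1000.0, 0.5, 2000.0)  # 0.5 uL of 1 mM DiO in 2 mL

if __name__ == "__main__":
    print(f"nanobeacon: {beacon_pmol:.0f} pmol")
    print(f"DiO final concentration: {dio_final_um:.3f} uM")
```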

Viral infection and fluorescence imaging: 293T/hACE2 cells were seeded in confocal culture dishes and cultured in DMEM containing 10% FBS, 1% penicillin and streptomycin and 2 μg/mL puromycin. The fluorescently labeled VLPs were incubated with the 293T/hACE2 host cells at 4 °C for 30 minutes, and the medium was replaced with fresh medium to remove unbound particles. The confocal culture dish was sealed with sealing film and then transferred to a 37 °C, 5% CO₂ incubator. Imaging was performed with an UltraView Vox confocal laser scanning system at 488 nm and 561 nm excitation.

Example 2: results and discussion

Results

Construction and characterization of SARS-CoV-2VLP

To assemble infectious SARS-CoV-2 VLPs, we split the full-length cDNA of the SARS-CoV-2 genome into three parts: SARS-CoV-2ORF1ab (nucleotide position: 266-. Chemically synthesized viral genome DNA fragments were assembled in yeast to construct plasmids containing SARS-CoV-2 ORF1ab, SARS-CoV-2 S and SARS-CoV-2 Stru ΔS, respectively; the constructed plasmids were co-transfected into 293T cells to generate SARS-CoV-2 VLPs (FIG. 1B). In 293T cells, the DNA fragments are transcribed and translated and the viral proteins are expressed; SARS-CoV-2 ORF1ab RNA comprising a packaging signal (PS) sequence is assembled with the structural proteins to form a SARS-CoV-2 VLP; infectivity of the SARS-CoV-2 VLP is mediated by binding of the spike protein to the hACE2 receptor on the host cell surface. Because only the RNA carrying the packaging sequence, and not the structural protein genes, is assembled into the virion, cells infected with SARS-CoV-2 VLPs do not produce progeny virus (FIG. 1C). Therefore, our SARS-CoV-2 VLP construction strategy is expected to provide a safe and universal platform for virology research.

The SARS-CoV-2 spike glycoprotein (S) carries the receptor binding site and mediates membrane fusion and viral entry. We assembled and optimized the S protein coding sequence and expressed the S protein in mammalian cells. The SARS-CoV-2 S plasmid was transfected into 293T cells and the S gene was analyzed by RT-PCR; the results are shown in FIG. 6A. The expressed SARS-CoV-2 S protein was verified by western blotting and showed two major bands, a full-length protein of 180 kDa and a cleaved protein of 110 kDa (FIG. 6B).

The SARS-CoV-2 Stru ΔS gene sequence was fused with EGFP to construct a plasmid named SARS-CoV-2 Stru ΔS-EGFP (FIG. 7A); this plasmid serves as an expression vector for the structural proteins (E, envelope; M, membrane; N, nucleocapsid) and the accessory proteins. To verify the activity of the SARS-CoV-2 Stru ΔS-EGFP plasmid, the ORF3a and N genes were analyzed by RT-PCR in 293T cells (FIG. 7B); EGFP expression was observed 24 hours after plasmid transfection (FIG. 7C), and expression of the SARS-CoV-2 N protein was determined by western blotting (FIG. 7D). The structural proteins can assemble into virus-like particles, VLP (Stru ΔS); we collected the medium 48 hours after transfection and transferred it to an ultracentrifuge tube over a 20% sucrose cushion to isolate the VLPs. The presence of the SARS-CoV-2 N protein in VLP (Stru ΔS) was confirmed by western blotting (FIG. 7D); the morphology and size of VLP (Stru ΔS) were then observed by TEM, and the estimated diameter of the spherical particles without spike protein was 75 ± 14 nm (n = 30, FIG. 7E).

We constructed a plasmid by fusing the SARS-CoV-2 ORF1ab gene sequence with the mCherry sequence; the polyprotein 1a (pp1a) and pp1ab expressed from this plasmid can be cleaved by the viral proteases into 16 non-structural proteins (NSPs) (FIG. 8A). The activity of the SARS-CoV-2 ORF1ab-mCherry plasmid was assessed by RT-PCR, and the mCherry signal was observed by confocal microscopy within 48 hours post-transfection (FIGS. 8B and 8C).

The assembly and secretion of VLP (Stru ΔS-S) depend on the co-expression of the SARS-CoV-2 S plasmid and the Stru ΔS-EGFP plasmid; 24 h after transfection, EGFP fluorescence was detected by confocal microscopy and the SARS-CoV-2 S and N proteins were detected by western blotting, indicating that SARS-CoV-2 S and Stru ΔS-EGFP RNA can be expressed in transfected cells (FIGS. 9A and 9B). To characterize their morphology and size, the VLPs (Stru ΔS-S) were negatively stained and examined by transmission electron microscopy (FIG. 9C); the observations indicated that VLPs (Stru ΔS-S) are spherical particles bearing spike proteins, with an estimated average size of 74 ± 16 nm (n = 30, excluding spikes). We also performed western blot analysis of the structural proteins of the assembled VLP (Stru ΔS-S) and obtained bands of the expected molecular weight (FIG. 9B).

VLP (Stru ΔS-ORF1ab) was produced by co-transfecting 293T cells with the SARS-CoV-2 Stru ΔS-EGFP plasmid and the ORF1ab-mCherry plasmid; when the cells were imaged at 24 hours post-transfection, significant fluorescence was observed (FIG. 10A); expression of the SARS-CoV-2 N protein in transfected cells was detected by western blotting (FIG. 10B). Next, transmission electron microscopy was performed to analyze the morphological characteristics of VLP (Stru ΔS-ORF1ab). The diameter of the spherical particles without spike protein was approximately 79 ± 13 nm (n = 30, FIG. 10C). The SARS-CoV-2 N protein of VLP (Stru ΔS-ORF1ab) was detected by western blotting (FIG. 10B).

To prepare SARS-CoV-2 VLPs, the three plasmids were co-transfected into 293T cells. Within 24 hours after transfection, EGFP and mCherry signals were observed, indicating that the plasmids were transcribed and expressed in the cells (FIG. 2A). To detect the expression of the SARS-CoV-2 S and N proteins, cell lysate was collected 48 hours after transfection for western blot analysis. To characterize the morphology of the virus particles, transmission electron microscopy was performed; the images showed round viral particles with the typical crown-like spikes, approximately 76 ± 12 nm in diameter (n = 30, excluding spikes). SARS-CoV-2 VLPs and the other VLPs appear to have similar shapes and sizes (FIG. 2C). The SARS-CoV-2 S and N proteins in the virions were analyzed by western blotting and gave the expected bands (FIG. 2B).
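The statement that the particle types have similar sizes can be illustrated from the reported summary statistics alone. The sketch below computes a Welch t statistic between two reported diameters. It assumes that each "±" value is a sample standard deviation and that n = 30 particles were measured per group; neither assumption is confirmed by the text, and this comparison is an illustration, not an analysis performed in the original work.

```python
# Sketch: compare reported mean diameters using summary statistics only.
# Assumes "mean ± value" denotes mean ± sample SD with n = 30 per group.
import math

# (mean_nm, sd_nm, n) as reported in the text for each particle type.
REPORTED = {
    "VLP (Stru dS)": (75.0, 14.0, 30),
    "VLP (Stru dS-S)": (74.0, 16.0, 30),
    "VLP (Stru dS-ORF1ab)": (79.0, 13.0, 30),
    "SARS-CoV-2 VLP": (76.0, 12.0, 30),
}


def welch_t(m1: float, s1: float, n1: int,
            m2: float, s2: float, n2: int) -> float:
    """Welch's t statistic from two means, SDs and sample sizes."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


if __name__ == "__main__":
    ref = REPORTED["SARS-CoV-2 VLP"]
    for name, stats in REPORTED.items():
        if name != "SARS-CoV-2 VLP":
            print(f"{name}: t = {welch_t(*ref, *stats):.2f}")
```

Small absolute t values under these assumptions are consistent with the qualitative observation of similar sizes.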

To address potential biosafety concerns, the rescued virus-like particles were safety tested in a BSL-3 laboratory. The 293T/hACE2 cell line constructed here is susceptible to SARS-CoV-2 infection; expression of the hACE2 receptor by the cells was verified by western blotting (FIG. 11A). 293T/hACE2 cells were infected with SARS-CoV-2 VLPs to test whether the infected cells produced progeny virus. Western blotting was used to detect whether viral structural proteins were expressed in the cells and assembled into progeny virus, and no corresponding band was observed. The results indicate that SARS-CoV-2 VLPs cannot replicate or express structural proteins, and thus cannot assemble progeny virus in host cells. Because the SARS-CoV-2 VLP cannot replicate, representative images of 293T/hACE2 cells infected with SARS-CoV-2 VLPs and with wild-type SARS-CoV-2 virus showed a significant difference in virus titer (FIG. 11C). Therefore, the SARS-CoV-2 VLP constructed by this method can serve as a safe and suitable model for studying the life cycle and infection mechanism of the SARS-CoV-2 virus, and provides a theoretical basis for the development of antiviral drugs and vaccines.

To further explore the assembly of viral genomic RNA, we infected 293T/hACE2 cells with SARS-CoV-2 VLPs (FIG. 2E). At 48 hours after infection, the mCherry signal was observed by confocal microscopy, but the red fluorescence in infected cells was very weak, probably owing to the low copy number of ORF1ab-mCherry mRNA (FIG. 2D). As a control, VLP (mCherry) was formed by co-transfection and assembly of three plasmids, namely the SARS-CoV-2 S plasmid, the Stru ΔS-EGFP plasmid and an mCherry plasmid (obtained by directly inserting mCherry into the pEGFP-N1 vector). No fluorescence signal was observed in VLP (mCherry)-infected cells, indicating that mCherry mRNA without the packaging signal (PS) sequence could not be packaged into particles. This demonstrates that the SARS-CoV-2 packaging signal located in the ORF1ab fragment plays an important role in the assembly of viral RNA into particles.

Prediction and validation of packaging signals in SARS-CoV-2

According to previous reports, the coronavirus nucleocapsid interacts with viral RNA by recognizing specific packaging signal (PS) sequences to facilitate the assembly of genomic RNA into viral particles. The position of the SARS-CoV-2 PS was predicted by bioinformatics analysis, which indicated that the PS lies near the 3' end of the ORF1b region. We also predicted the RNA secondary structures of SARS-CoV-2, SARS-CoV and bat SARS-CoV; the predictions showed that all three viruses contain two stable stem loops (SL1 and SL2) in the RNA secondary structure (FIG. 3A). Multiple alignment of the PS consensus structure further confirmed the high conservation of the RNA stem loops, especially SL1 and SL2 (FIG. 12). These results indicate that a stable stem-loop structure may have PS function in SARS-CoV-2. To explore this further, based on the bioinformatic predictions of the packaging sequence, we selected two regions to test packaging activity, one short region containing conserved stem-loop structures (PS101, nt 19900-.

The predicted PS sequences were fused to the 3' non-coding region of the EGFP reporter gene to construct plasmids pEGFP-N1-PS101 and pEGFP-N1-PS583, which were each transfected into 293T cells; expression of EGFP-PS101 and EGFP-PS583 RNA was determined by RT-PCR (FIGS. 13A and 3B). To confirm packaging of the predicted PS RNA, VLPs (EGFP-PS101) and VLPs (EGFP-PS583) were collected and used to infect 293T/hACE2 cells, as shown in FIG. 3E. The SARS-CoV-2 S plasmid, the Stru ΔS plasmid and the pEGFP-N1-PS101 or pEGFP-N1-PS583 plasmid were co-transfected into 293T cells to produce VLPs (EGFP-PS101) or VLPs (EGFP-PS583), respectively. In transfected cells, the S protein, N protein and EGFP were expressed, VLPs were assembled, and the proteins were analyzed by western blotting (FIGS. 13B and 3C). Structural proteins were detected in the VLPs, whereas EGFP could not be detected. To verify whether EGFP-PS101 or EGFP-PS583 RNA was packaged into VLPs, 293T/hACE2 cells were infected with the VLPs and examined for EGFP expression (FIG. 3E). At 48 hours after infection with VLP (EGFP-PS101) or VLP (EGFP-PS583), green EGFP fluorescence was observed, with higher fluorescence intensity indicating better packaging activity. As shown in FIGS. 13C and 3D, the fluorescence signal in VLP-infected cells was low, likely owing to the low copy number of EGFP-PS RNA packaged into the VLPs. As a control, VLP (EGFP) was obtained by co-transfection of the SARS-CoV-2 S plasmid, the Stru ΔS plasmid and the pEGFP-N1 plasmid; EGFP mRNA without a PS sequence could not be packaged into VLP (EGFP). These results confirm that the predicted sequence has PS function in SARS-CoV-2 and is important for the assembly of viral genomic RNA into VLPs.

Evaluation of the role of structural proteins in viral infection and Assembly

Previous reports demonstrated that the SARS-CoV-2 S protein is involved in receptor binding and membrane fusion, and that mutation of the S protein can greatly affect the ability to infect the host. Based on the SARS-CoV-2 S plasmid, we constructed four mutations in different regions: N331Q, with an A to C substitution at position 22553 and a C to G substitution at position 22555 in the RBD; N501Y, with an A to T substitution at position 23063; D614G, with an A to G substitution at position 23403 near the RBD region; and P681H, with a C to A substitution at position 23604 near the furin cleavage site; the construction and characterization of the S mutants are shown in FIGS. 14-17. VLP S (mutant) was assembled in cells co-transfected with the SARS-CoV-2 S (mutant) plasmid, the Stru ΔS plasmid and the ORF1ab plasmid. The SARS-CoV-2 S and N proteins in the virions were identified by western blotting, and the expected bands were obtained, as shown in FIG. 18. The morphology and size of VLP S (mutant) were analyzed by transmission electron microscopy, and the results indicated that the mutants were similar to VLPs with wild-type S protein (FIG. 18). To test the infectivity of the S mutant virus particles, 293T/hACE2 cells were infected with fluorescent VLP S (mutant) produced by labeling the lipid envelope with DiO. Cells were fixed at 1 or 2 hours post infection, then 500 cells were randomly selected for imaging under a confocal microscope and the fluorescent particles were counted for analysis. Compared with VLPs carrying wild-type S, VLP (N501Y), VLP (D614G) and VLP (P681H) were more infectious, while the infectivity of VLP (N331Q) was significantly reduced (FIG. 4A). The results are consistent with previous reports, indicating that the system we constructed can be used to assess the effect of S mutations on SARS-CoV-2 assembly and infectivity.
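The genome positions quoted for the four mutations can be related to S protein residue numbers with a short calculation. The sketch below assumes that the S coding sequence starts at nucleotide 21563 of NC_045512.2, a value taken from the public NCBI annotation rather than from the text; the function name is illustrative. Positions refer to the reference genome numbering, not to the codon-optimized plasmid sequence.

```python
# Sketch: map an S protein residue number to reference-genome positions.
# S_CDS_START is an assumption from the NCBI annotation of NC_045512.2.

S_CDS_START = 21563  # first nucleotide of the S start codon (assumed)


def codon_positions(residue: int, cds_start: int = S_CDS_START) -> tuple[int, int, int]:
    """1-based genome positions of the three bases encoding a residue."""
    if residue < 1:
        raise ValueError("residue numbering starts at 1")
    first = cds_start + 3 * (residue - 1)
    return first, first + 1, first + 2


# Positions quoted in the text for each substitution.
QUOTED = {331: (22553, 22555), 501: (23063,), 614: (23403,), 681: (23604,)}

if __name__ == "__main__":
    for residue, quoted in QUOTED.items():
        codon = codon_positions(residue)
        inside = all(pos in codon for pos in quoted)
        print(residue, codon, "quoted positions inside codon:", inside)
```

Under this assumption each quoted position falls within the codon of the corresponding residue.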

To investigate the necessity of the nucleocapsid (N) protein in viral RNA packaging, we deleted the coding region of the N protein from plasmid SARS-CoV-2 Stru ΔS, thereby constructing plasmid SARS-CoV-2 Stru ΔS-EGFP/ΔN (also referred to as the SARS-CoV-2 Stru ΔS-EGFP/ΔN plasmid or the Stru ΔS/ΔN plasmid) (FIG. 19A). The activity and expression of the SARS-CoV-2 Stru ΔS-EGFP/ΔN plasmid were detected by RT-PCR and western blotting, respectively (FIGS. 19B and 19C). VLP (ΔN) can be obtained by co-transfecting cells with three plasmids: the SARS-CoV-2 S plasmid, the Stru ΔS-EGFP/ΔN plasmid and the ORF1ab plasmid. As shown in the TEM images, VLP (ΔN) is spherical, with a crown-like structure and a diameter of 81 ± 17 nm (n = 30, excluding spikes). Western blotting of the VLPs produced detected the S protein but not the N protein, consistent with the expected results (FIG. 4C). Whether the viral RNA assembly process depends on the nucleocapsid was determined by infecting 293T/hACE2 cells with VLP (ΔN-EGFP-PS583); VLP (ΔN-EGFP-PS583) was produced by co-transfection with the plasmids SARS-CoV-2 S and Stru ΔS/ΔN and the packaging signal plasmid pEGFP-N1-PS583. Little green EGFP fluorescence was observed in VLP (ΔN-EGFP-PS583)-infected cells, as shown in FIG. 4B; this differs significantly from VLP (EGFP-PS583)-infected cells. The results indicate that the N protein is not essential for VLP formation but plays an important role in viral RNA packaging.

Earlier reports on coronaviruses showed that the membrane (M) and envelope (E) proteins, as major functional components, play an important role in the production of viral particles; this study also explored the effect of the M and E proteins on VLP production. The M or E protein coding sequence was deleted to give the SARS-CoV-2 Stru ΔS-EGFP/ΔM plasmid and the SARS-CoV-2 Stru ΔS-EGFP/ΔE plasmid, respectively, which were verified by RT-PCR and western blotting (FIGS. 20 and 22). The Stru ΔS mutant plasmid, either the SARS-CoV-2 Stru ΔS-EGFP/ΔM plasmid or the SARS-CoV-2 Stru ΔS-EGFP/ΔE plasmid, was co-transfected with the plasmids SARS-CoV-2 S and ORF1ab-mCherry into 293T cells, giving the Stru ΔS mutant viral particles rVP (ΔM) or rVP (ΔE), respectively. TEM images and western blot analysis showed that viral particles lacking either the M or the E protein could still be obtained from transfected cells (FIGS. 4C, 21A and 23A). When the Stru ΔS mutant virus particles were used to infect 293T/hACE2 cells, no fluorescence signal was observed 48 hours after infection (FIGS. 21B and 23B), in contrast to the expression seen in host cells after SARS-CoV-2 VLP infection; this indicates that transfected cells can produce mutant virus particles even if the M or E gene is deleted, but the function of the virus particles is affected. Unlike N-deleted mutants, deletion of either M or E protein does not affect packaging of viral RNA.

Open reading frame 10 (ORF10) in the SARS-CoV-2 genome is located downstream of the N gene, and according to previous reports the ORF10 protein does not appear to play a significant role. We constructed and characterized the SARS-CoV-2 Stru ΔS-EGFP/ΔORF10 plasmid, in which the ORF10 coding sequence is knocked out (FIG. 24); this plasmid was co-transfected into cells with the SARS-CoV-2 S and ORF1ab plasmids to produce VLP (ΔORF10). VLP (ΔORF10) showed morphology and function similar to those of the SARS-CoV-2 VLP we constructed (FIG. 4C), indicating that deletion of the ORF10 gene had no effect on the assembly of viral particles.

To further confirm the effect of the above structural/accessory proteins on VLP assembly, we fluorescently labeled the genome and envelope in VLP, respectively, and preliminarily explored the assembly efficiency of VLP by fluorescence co-localization efficiency (fig. 25), which is in substantial agreement with the above validation results.

Real-time imaging of the entry of the rSARS-CoV-2 virus into 293T/hACE2 cells

We prepared a dual fluorescence labeled SARS-CoV-2 VLP (QD-DiO) to gain insight into the viral entry process, as shown in FIG. 5A. The quantum dot (QD) nanobeacon hybridizes to its target nucleic acid sequence on the viral RNA, forming a complex that fluoresces red and is finally encapsulated in the viral particle. A dark electron-dense core was observed inside the QD-labeled virus (FIG. 5B). To obtain the dual-fluorescent particles, the lipid envelope was labeled with DiO. Fluorescence co-localization of RNA-QD and Env-DiO confirmed the successful construction of two-color rSARS-CoV-2 (QD-DiO) (FIG. 5C), where rSARS-CoV-2 refers to the SARS-CoV-2 VLP.

To show the dynamic process of viral entry, SARS-CoV-2VLP (QD-DiO) particles were followed in 293T/hACE2 cells by confocal microscopy imaging. During real-time imaging, only the co-localization of signals containing QD and DiO are considered as single virions. Bicolor particles were observed on the membrane of 293T/hACE2 cells and transported into the host cell in an active manner (FIG. 5D). The trajectory of the virus particle is shown in FIG. 5E. In this study, we tracked and analyzed the trajectories of more than 2000 individual virus particles in living cells. The results showed that 47.5% of the virus particles entered 293T/hACE2 cells by endocytosis, whereas the other particles were only attached to the cell membrane surface and were not transported into the cytoplasm.
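The criterion above, that only co-localized QD and DiO signals are counted as single virions, can be sketched as a nearest-neighbor match between spot coordinates from the two channels. The text does not describe the actual image-analysis procedure, so everything below is an assumption made for illustration: the function name, the greedy matching strategy, and the distance threshold, which would need to be set from the microscope's resolution.

```python
# Sketch: pair QD (red) and DiO (green) spots lying within a distance cutoff.
# The matching strategy and threshold are hypothetical, not from the source.
import math

Spot = tuple[float, float]


def colocalized_pairs(qd_spots: list[Spot], dio_spots: list[Spot],
                      max_dist: float) -> list[tuple[int, int]]:
    """Greedily pair each QD spot with its nearest unused DiO spot.

    Returns (qd_index, dio_index) pairs whose distance is <= max_dist.
    """
    pairs: list[tuple[int, int]] = []
    used: set[int] = set()
    for i, (qx, qy) in enumerate(qd_spots):
        best_j, best_d = None, max_dist
        for j, (dx, dy) in enumerate(dio_spots):
            if j in used:
                continue
            d = math.hypot(qx - dx, qy - dy)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs


if __name__ == "__main__":
    # Toy coordinates in arbitrary units, for illustration only.
    qd = [(0.0, 0.0), (10.0, 10.0)]
    dio = [(0.5, 0.0), (50.0, 50.0)]
    print(colocalized_pairs(qd, dio, max_dist=1.0))
```

Spots left unpaired in either channel would be excluded from single-virion tracking under this criterion.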

Release of the viral core from the envelope during endocytosis into the host cell was also captured by real-time imaging. Tracking of rSARS-CoV-2 (QD-DiO) particles in living cells showed that the viral core was released into the cytoplasm. Virions with co-localized QD and DiO signals (yellow) were imaged in the cytoplasm, and separation of the red and green spots was observed during dynamic movement of the virions (FIG. 5F). The trajectories of RNA-QD and Env-DiO diverge after the separation event, as shown in FIG. 5G. The results indicate successful escape of the viral core from the endosome, a necessary step for infection. In this study, we analyzed 1000 single particles in 293T/hACE2 cells and captured 64 such events. The low proportion may be due to tracked particles moving out of the focal plane during observation or to photobleaching of the fluorescence signal.

Discussion

To study SARS-CoV-2, we constructed a viral genome partitioning system (SVG) and characterized it. The system produces SARS-CoV-2 VLPs capable of a single round of infection, which are expected to be useful for biological exploration and vaccine development. SARS-CoV-2 ORF1ab RNA containing the packaging signal is assembled into rSARS-CoV-2 without the structural genes in the virion, thereby ensuring the safety of the system. We designed a series of SARS-CoV-2 mutants with mutations or deletions of viral genes to determine the function of viral components, and explored the roles of the four structural proteins and the ORF10 protein in SARS-CoV-2 VLP infection and assembly. In addition, by introducing the nanobeacon and a lipophilic dye, the dynamic process by which the SARS-CoV-2 VLP enters the host cell through endocytosis was shown by real-time imaging. Various fluorescent probes can be designed into the system to visualize other steps of the virus life cycle, as required by the study. A general platform for virology experiments in BSL-2 laboratories is also very important for the study of SARS-CoV-2. The system constructed by the inventors can be applied to several types of current SARS-CoV-2 research: (1) Functional analysis of each SARS-CoV-2 component can be performed by designing rSARS-CoV-2 mutants. (2) The prediction and verification of SARS-CoV-2 packaging signals with specific sequences provides a reliable way to study the molecular mechanisms involved in viral RNA assembly. (3) Single-virion tracking based on multicolor-labeled SARS-CoV-2 VLPs enables real-time and accurate imaging of the course of infection by virions in living host cells. (4) Assembled VLPs with different components facilitate the development of vectors and vaccines. Combining these advantages, the SVG system can be developed as a valuable platform for studying SARS-CoV-2 and other coronaviruses.

The experimental results demonstrate the safety of the SVG system. The whole genome of SARS-CoV-2 is divided into three parts, and the packaging signal necessary for viral RNA assembly is located in the ORF1ab fragment. Our system lacks multiple gene segments and does not produce wild-type virus. Unlike pseudoviruses, which contain only part of the functional structure, we produce virions with almost all the components of SARS-CoV-2. However, because the RNA packaged in the virion lacks the structural genes, SARS-CoV-2 VLPs are unable to replicate and assemble progeny viruses. These single-round infectious virions cannot infect host cells for multiple rounds, which blocks viral transmission. Unlike approaches such as the trans-complementation system, there is no need to construct specific stable cell lines to produce viral particles that can be safely manipulated in a BSL-2 laboratory. Depending on the needs of virus research, some ORF genes can be further deleted from the three fragments to generate rSARS-CoV-2 mutants with higher safety.

In the SVG system, we surprisingly found that deletion of the gene encoding the M protein did not affect the formation of viral particles, whereas no mCherry signal from viral RNA expression was observed in VLP (ΔM)-infected cells. This may be because the lack of M protein in the mutant virion interferes with the function of the virion. Thus, the SARS-CoV-2 M protein plays a major role in viral function but is not essential for the formation of viral particles. This differs from SARS-CoV, in which the M protein is a key factor in viral assembly. The system we have developed also has some limitations. The virus assembly efficiency needs to be improved, which is important for wider adoption of the system and for vaccine development. Optimization of the virion-packaged RNA fragments can facilitate the study of SARS-CoV-2 infection and replication. The SVG system, the composition and the method of the present invention can be used as a powerful tool for neutralization tests and antiviral tests; various pseudovirions can be obtained with the virus genome partitioning system and used for neutralizing antibody tests, screening of antiviral drugs, preparation of virus vaccines, and the like. In conclusion, we developed an SVG system for SARS-CoV-2 research that can be operated in a BSL-2 laboratory. The resulting virion, rSARS-CoV-2, can be readily engineered for analysis of the viral life cycle and the function of each component.

TABLE 1

In the following sequences, underlined sequences are homology arms
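Because the underlining that marks the homology arms does not survive in plain text, the shared region between two adjacent fragments can be recovered computationally. The following is a minimal sketch, not part of the original disclosure: the function name `overlap_length` and the `min_len` threshold are illustrative assumptions. It finds the longest suffix of an upstream fragment that is also a prefix of the downstream fragment, ignoring whitespace and letter case.

```python
def overlap_length(upstream: str, downstream: str, min_len: int = 20) -> int:
    """Length of the longest suffix of `upstream` that equals a prefix of
    `downstream`; returns 0 if no overlap of at least `min_len` bases exists.

    `min_len` is a hypothetical threshold chosen for illustration only.
    """
    # Remove spaces/line breaks left by text extraction and normalize case.
    a = "".join(upstream.split()).upper()
    b = "".join(downstream.split()).upper()
    longest_possible = min(len(a), len(b))
    # Try the longest candidate first so the first hit is the maximal overlap.
    for k in range(longest_possible, max(min_len, 1) - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


if __name__ == "__main__":
    # Toy sequences (not taken from Table 1) sharing a 6-base arm.
    frag_a = "AAACCCGGGTTT"
    frag_b = "gggttt ACGT"
    print(overlap_length(frag_a, frag_b, min_len=3))
```

Applied to each pair of adjacent fragments listed below (for example S-fragment 1 and S-fragment 2), a non-zero result would indicate the region available for homologous recombination between the two fragments.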

Split fragments of the S protein expression sequence (containing the codon-optimized S protein gene, 3822 bp):

s-fragment 1(SEQ ID NO: 1):

GCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATG CTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACG CCAAGCTTGGTACCGAGCTCGGATCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCTGCAGATATAACTTCGTATAATGTATGCTATACGAAGTTATccgttacataacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaatagtaacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggcattgtgcccagtacatgaccttatgggactttcctacttggcagtacatctacgtattagtcatcgctattaccatggtcgaggtgagccccacgttctgcttcactctccccatctcccccccctccccacccccaattttgtatttatttattttttaattattttgtgcagcgatgggggcggggggggggggggggcggggcgaggggcggggcggggcgaggcggagaggtgcggcggcagccaatcagagcggcgcgctccgaaagtttccttttatggcgaggcggcggcggcggcggccctataaaaagcgaagcgcgcggcgggcgggagtcgctgcgcgctgccttcgccccgtgccccgctccgccgccgcctcgcgccgcccgccccggctctgactgaccgcgttactcccacaggtgagcgggcgggacggcccttctcctccgggctgtaattagctgagcaagaggtaagggtttaagggatggttggttggtggggtattaatgtttaattacctggagcacctgcctgaaatcactttttttcaggttggaccggtgccaccATGTTCGTGTTCCTGGTGCTGCTGCCCCTGGTGAGCAGCCAGTGCGTGAACCTGACCACCCGCACCCAGCTGCCCCCCGCCTACACCAACAGCTTCACCCGCGGCGTGTACTACCCCGACAAGGTGTTCCGCAGCAGCGTGCTGCACAGCACCCAGGACCTGTTCCTGCCCTTCTTCAGCAACGTGACCTGGTTCCACGCCATCCACGTGAGCGGCACCAACGGCACCAAGCGCTTCGACAACCCCGTGCTGCCCTTCAACGACGGCGTGTACTTCGCCAGCACCGAGAAGAGCAACATCATCCGCGGCTGGATCTTCGGCACCACCCTGGACAGCAAGACCCAGAGCCTGCTGATCGTGAACAACGCCACCAACGTGGTGATCAAGGTGTGCGAGTTCCAGTTCTGCAACGACCCCTTCCTGGGCGTGTACTACCACAAGAACAACAAGAGCTGGATGGAGAGCGAGTTCCGCGTGTACAGCAGCGCCAACAACTGCACCTTCGAGTACGTGAGCCAGCCCTTCCTGATGGACCTGGAGGGCAAGCAGGGCAACTTCAAGAACCTGCGCGAGTTCGTGTTCAAGAACATCGACGGCTACTTCAAGATCTACAGCAAGCACACCCCCATCAACCTGGTGCGCGACCTGCCCCAGGGCTTCAGCGCCCTGGAGCCCCTGGTGGACCTGCCCATCGGCATCAACATCACCCGCTTCCAGACCCTGCTGGCCCTGCACCGCAGCTACCTGACCCCCGGCGACAGCAGCAGCGGCTGGACCGCCGGCGCCGCCGCCTACTACGTGGGCTACCTGCAGCCCCGCACCTTCCTGCTGAAGTACAACGAGAACGGCACCATCACCGACGCCGTGGACTGCGCCCTGGACCCCCTGAGCGAGACCAAGTGCACCCTGAAGAGCTTCACCGTGGAGAAGGGCATCTACCAGACCAGCAACTTCCGCGTGCAGCCC
ACCGAGAGCATCGTGCGCTTCCCCAACATCACCAACCTGTGCCCCTTCGGCGAGGTGTTCAACGCCACCCGCTTCGCCAGCGTGTACGCCTGGAACCGCAAGCGCATCAGCAACTGCGTGGCCGACTACAGCGTGCTGTACAACAGCGCCAGCTTCAGCACCTTCAAGTGCTACGGCGTGAGCCCCACCAAGCTGAACGACCTGTGCTTCACCAACGTGTACGCCGACAGCTTCGTGATCCGCGGCGACGAGGTGCGCCAGATCGCCCCCGGCCAGACCGGCAAGATCGCCGACTACAACTACAAGCTGCCCGACGACTTCACCGGCTGCGTGATCGCCTGGAACAGCAACAACCTGGACAGCAAGGTGGGCGGCAACTACAACTACCTGTACCGCCTGTTCCGCAAGAGCAACCTGAAGCCCTTCGAGCGCGACATCAGCACCGAGATCTACCAGGCCGGCAGCACCCCCTGCAACGGCGTGGAGGGCTTCAACTGCTACTTCCCCCTGCAGAGCTACGGCTTCCAGCCCACCAACGGCGTGGGCTACCAGCCCTACCGCGTGGTGGTGCTGAGCTTCGAGCTGCTGCACGCCCCCG CCACCGTGTGCGGCCCCAAGAAGAGCACCAACCTGGTGAAGAACAAGTGCGTGAACTTCAACTTCAACGGCCTGACC GGCACCGGCGTGCTGACCGAGAGCAACAAGAAGTTCCTGCCCTTCCAGCAGTTCGGCCGCGACATCGCCGACACCAC CGACGCCGTGCGCGACCCCCAGA

s-fragment 2(SEQ ID NO: 2):

CTTCGAGCTGCTGCACGCCCCCGCCACCGTGTGCGGCCCCAAGAAGAGCACCAACCTGGTGAAGAACA AGTGCGTGAACTTCAACTTCAACGGCCTGACCGGCACCGGCGTGCTGACCGAGAGCAACAAGAAGTTCCTGCCCTT CCAGCAGTTCGGCCGCGACATCGCCGACACCACCGACGCCGTGCGCGACCCCCAGACCCTGGAGATCCTGGACATCACCCCCTGCAGCTTCGGCGGCGTGAGCGTGATCACCCCCGGCACCAACACCAGCAACCAGGTGGCCGTGCTGTACCAGGACGTGAACTGCACCGAGGTGCCCGTGGCCATCCACGCCGACCAGCTGACCCCCACCTGGCGCGTGTACAGCACCGGCAGCAACGTGTTCCAGACCCGCGCCGGCTGCCTGATCGGCGCCGAGCACGTGAACAACAGCTACGAGTGCGACATCCCCATCGGCGCCGGCATCTGCGCCAGCTACCAGACCCAGACCAACAGCCCCCGCCGCGCCCGCAGCGTGGCCAGCCAGAGCATCATCGCCTACACCATGAGCCTGGGCGCCGAGAACAGCGTGGCCTACAGCAACAACAGCATCGCCATCCCCACCAACTTCACCATCAGCGTGACCACCGAGATCCTGCCCGTGAGCATGACCAAGACCAGCGTGGACTGCACCATGTACATCTGCGGCGACAGCACCGAGTGCAGCAACCTGCTGCTGCAGTACGGCAGCTTCTGCACCCAGCTGAACCGCGCCCTGACCGGCATCGCCGTGGAGCAGGACAAGAACACCCAGGAGGTGTTCGCCCAGGTGAAGCAGATCTACAAGACCCCCCCCATCAAGGACTTCGGCGGCTTCAACTTCAGCCAGATCCTGCCCGACCCCAGCAAGCCCAGCAAGCGCAGCTTCATCGAGGACCTGCTGTTCAACAAGGTGACCCTGGCCGACGCCGGCTTCATCAAGCAGTACGGCGACTGCCTGGGCGACATCGCCGCCCGCGACCTGATCTGCGCCCAGAAGTTCAACGGCCTGACCGTGCTGCCCCCCCTGCTGACCGACGAGATGATCGCCCAGTACACCAGCGCCCTGCTGGCCGGCACCATCACCAGCGGCTGGACCTTCGGCGCCGGCGCCGCCCTGCAGATCCCCTTCGCCATGCAGATGGCCTACCGCTTCAACGGCATCGGCGTGACCCAGAACGTGCTGTACGAGAACCAGAAGCTGATCGCCAACCAGTTCAACAGCGCCATCGGCAAGATCCAGGACAGCCTGAGCAGCACCGCCAGCGCCCTGGGCAAGCTGCAGGACGTGGTGAACCAGAACGCCCAGGCCCTGAACACCCTGGTGAAGCAGCTGAGCAGCAACTTCGGCGCCATCAGCAGCGTGCTGAACGACATCCTGAGCCGCCTGGACAAGGTGGAGGCCGAGGTGCAGATCGACCGCCTGATCACCGGCCGCCTGCAGAGCCTGCAGACCTACGTGACCCAGCAGCTGATCCGCGCCGCCGAGATCCGCGCCAGCGCCAACCTGGCCGCCACCAAGATGAGCGAGTGCGTGCTGGGCCAGAGCAAGCGCGTGGACTTCTGCGGCAAGGGCTACCACCTGATGAGCTTCCCCCAGAGCGCCCCCCACGGCGTGGTGTTCCTGCACGTGACCTACGTGCCCGCCCAGGAGAAGAACTTCACCACCGCCCCCGCCATCTGCCACGACGGCAAGGCCCACTTCCCCCGCGAGGGCGTGTTCGTGAGCAACGGCACCCACTGGTTCGTGACCCAGCGCAACTTCTACGAGCCCCAGATCATCACCACCGACAACACCTTCGTGAGCGGCAACTGCGACGTGGTGATCGGCATCGTGAACAACACCGTGTACGACCCCCTGCAGCCCGAGCTGGACAGCTTCAAGGAGGAGCTGGACAAGTACTTCAAGAACCACACCAGCCCCGACGTGGACCTGGGCGACATCAGCGGCATCAACGCCAGCGTGGTGAACATCCA
GAAGGAGATCGACCGCCTGAACGAGGTGGCCAAGAACCTGAACGAGAGCCTGATCGACCTGCAGGAGCTGGGCAAGTACGAGCAGTACATCAAGTGGCCCTGGTACATCTGGCTGGGCTTCATCGCCGGCCTGATCGCCATCGTGATGGTGACCATCATGCTGTGCTGCATGACCAGCTGCTGCAGCTGCCTGAAGGGCTGCTGCAGCTGCGGCAGCTGCTGCAAGTTCGACGAGGACGACAGCGAGCCCGTGCTGAAGGGCGTGAAGCTGCACTACACCTAAtgtacaaggaattctaactagagctcgctgatcagcctcgactgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcctttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagcaagggggaggattgggaagagaatagcaggcatgctggggaATAACTTCGTATAATGTATGCTATACGAAGTTATATCCATCACACTGGCGGCCGcCTCGAGCATGCATCTAGAGGGCCCAATTCGCCCTATAGTGAGTCGTATTACAATTCACTGGCCGTCG TTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGC TGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCT

Split fragments of the ORF1ab sequence:

ORF1ab-fragment 1 (SEQ ID NO: 3):

CTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAG CGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTC GAAAGACTGGGCCTTCCTGCCACTCATCGCAGTACTGTTGTAATTCATTAAgcatgTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCGattaaaggtttataccttcccaggtaacaaaccaaccaactttcgatctcttgtagatctgttctctaaacgaactttaaaatctgtgtggctgtcactcggctgcatgcttagtgcactcacgcagtataattaataactaattactgtcgttgacaggacacgagtaactcgtctatcttctgcaggctgcttacggtttcgtccgtgttgcagccgatcatcagcacatctaggtttcgtccgggtgtgaccgaaaggtaagatggagagccttgtccctggtttcaacgagaaaacacacgtccaactcagtttgcctgttttacaggttcgcgacgtgctcgtacgtggctttggagactccgtggaggaggtcttatcagaggcacgtcaacatcttaaagatggcacttgtggcttagtagaagttgaaaaaggcgttttgcctcaacttgaacagccctatgtgttcatcaaacgttcggatgctcgaactgcacctcatggtcatgttatggttgagctggtagcagaactcgaaggcattcagtacggtcgtagtggtgagacacttggtgtccttgtccctcatgtgggcgaaataccagtggcttaccgcaaggttcttcttcgtaagaacggtaataaaggagctggtggccatagttacggcgccgatctaaagtcatttgacttaggcgacgagcttggcactgatccttatgaagattttcaagaaaactggaacactaaacatagcagtggtgttacccgtgaactcatgcgtgagcttaacggaggggcatacactcgctatgtcgataacaacttctgtggccctgatggctaccctcttgagtgcattaaagaccttctagcacgtgctggtaaagcttcatgcactttgtccgaacaactggactttattgacactaagaggggtgtatactgctgccgtgaacatgagcatgaaattgcttggtacacggaacgttctgaaaagagctatgaattgcagacaccttttgaaattaaattggcaaagaaatttgacaccttcaatggggaatgtccaaattttgtatttcccttaaattccataatcaagactattcaaccaagggttgaaaagaaaaagcttgatggctttatgggtagaattcgatctgtctatccagttgcgtcaccaaatgaatgcaaccaaa
tgtgcctttcaactctcatgaagtgtgatcattgtggtgaaacttcatggcagacgggcgattttgttaaagccacttgcgaattttgtggcactgagaatttgactaaagaaggtgccactacttgtggttacttaccccaaaatgctgttgttaaaatttattgtccagcatgtcacaattcagaagtaggacctgagcatagtcttgccgaataccataatgaatctggcttgaaaaccattcttcgtaagggtggtcgcactattgcctttggaggctgtgtgttctcttatgttggttgccataacaagtgtgcctattgggttccacgtgctagcgctaacataggttgtaaccatacaggtgttgttggagaaggttccgaaggtcttaatgacaaccttcttgaaatactccaaaaagagaaagtcaacatcaatattgttggtgactttaaacttaatgaagagatcgccattattttggcatctttttctgcttccacaagtgcttttgtggaaactgtgaaaggtttggattataaagcattcaaacaaattgttgaatcctgtggtaattttaaagttacaaaaggaaaagctaaaaaaggtgcctggaatattggtgaacagaaatcaatactgagtcctctttatgcatttgcatcagaggctgctcgtgttgtacgatcaattttctcccgcactcttgaaactgctcaaaattctgtgcgtgttttacagaaggccgctataacaatactagatggaatttcacagtattcactgagactcattgatgctatgatgttcacatctgatttggctactaacaatctagttgtaatggcctacattacaggtggtgttgttcagttgacttcgcagtggctaactaacatctttggcactgtttatgaaaaactcaaacccgtccttgattggcttgaagagaagtttaaggaaggtgtagagtttcttagagacggttgggaaattgttaaatttatctcaacctgtgcttgtgaaattgtcggtggacaaattgtcacctgtgcaaaggaaattaaggagagtgttcagacattctttaagcttgtaaataaatttttggctttgtgtgctgactctatcattattggtggagctaaacttaaagccttgaatttaggtgaaacatttgtcacgcactcaaagggattgtacagaaagtgtgttaaatccagagaagaaactggcctactcatgcctctaaaagccccaaaagaaattatcttcttagagggagaaacacttcccacagaagtgttaacagaggaagttgtcttgaaaactggtgatttacaaccattagaacaacctactagtgaagctgttgaagctccattggttggtacaccagtttgtattaacgggcttatgttgctcgaaatcaaagacacagaaaagtactgtgcccttgcacctaatatgatggtaacaaacaataccttcacactcaaaggcggtgcaccaacaaaggttacttttggtgatgacactgtgatagaagtgcaaggttacaagagtgtgaatatcacttttgaacttgatgaaaggattgataaagtacttaatgagaagtgctctgcctatacagttgaactcggtacagaagtaaatgagttcgcctgtgttgtggcagatgctgtcataaaaactttgcaaccagtatctgaattacttacaccactgggcattgatttagatgagtggagtatggctacatactacttatttgatgagtctggtgagtttaaattggcttcacatatgtattgttctttctaccctccagatgaggatgaagaagaaggtgattgtgaagaagaagagtttgagccatcaactcaatatgagtatggtactgaagatgattaccaaggtaaacctttggaatttggtgccacttctgctgctcttcaacctgaagaagagcaagaagaagattggttagat
gatgatagtcaacaaactgttggtcaacaagacggcagtgaggacaatcagacaactactattcaaacaattgttgaggttcaacctcaattagagatggaacttacaccagttgttcagactattgaagtgaatagttttagtggttatttaaaacttactgacaatgtatacattaaaaatgcagacattgtggaagaagctaaaaaggtaaaaccaacagtggttgttaatgcagccaatgtttaccttaaacatggaggaggtgttgcaggagccttaaataaggctactaacaatgccatgcaagttgaatctgatgattacatagctactaatggaccacttaaagtgggtggtagttgtgttttaagcggacacaatcttgctaaacactgtctt catgttgtcggcccaaatgttaacaaaggtgaagacattcaacttcttaagagtgcttatgaaaattttaatcagc acgaagttctacttgcaccattattatcagctggtatttttggtgctgaccctatacattctttaagagtt

ORF1ab-fragment 2 (SEQ ID NO: 4):

AAGTGGGTGGTAGTTGTGTTTTAAGCGGACACAATCTTGCTAAACACTGTCTTCATGTTGTCGGCCCAA ATGTTAACAAAGGTGAAGACATTCAACTTCTTAAGAGTGCTTATGAAAATTTTAATCAGCACGAAGTTCTACTTGCA CCATTATTATCAGCTGGTATTTTTGGTGCTGACCCTATACATTCTTTAAGAGTTTGTGTAGATACTGTTCGCACAAATGTCTACTTAGCTGTCTTTGATAAAAATCTCTATGACAAACTTGTTTCAAGCTTTTTGGAAATGAAGAGTGAAAAGCAAGTTGAACAAAAGATCGCTGAGATTCCTAAAGAGGAAGTTAAGCCATTTATAACTGAAAGTAAACCTTCAGTTGAACAGAGAAAACAAGATGATAAGAAAATCAAAGCTTGTGTTGAAGAAGTTACAACAACTCTGGAAGAAACTAAGTTCCTCACAGAAAACTTGTTACTTTATATTGACATTAATGGCAATCTTCATCCAGATTCTGCCACTCTTGTTAGTGACATTGACATCACTTTCTTAAAGAAAGATGCTCCATATATAGTGGGTGATGTTGTTCAAGAGGGTGTTTTAACTGCTGTGGTTATACCTACTAAAAAGGCTGGTGGCACTACTGAAATGCTAGCGAAAGCTTTGAGAAAAGTGCCAACAGACAATTATATAACCACTTACCCGGGTCAGGGTTTAAATGGTTACACTGTAGAGGAGGCAAAGACAGTGCTTAAAAAGTGTAAAAGTGCCTTTTACATTCTACCATCTATTATCTCTAATGAGAAGCAAGAAATTCTTGGAACTGTTTCTTGGAATTTGCGAGAAATGCTTGCACATGCAGAAGAAACACGCAAATTAATGCCTGTCTGTGTGGAAACTAAAGCCATAGTTTCAACTATACAGCGTAAATATAAGGGTATTAAAATACAAGAGGGTGTGGTTGATTATGGTGCTAGATTTTACTTTTACACCAGTAAAACAACTGTAGCGTCACTTATCAACACACTTAACGATCTAAATGAAACTCTTGTTACAATGCCACTTGGCTATGTAACACATGGCTTAAATTTGGAAGAAGCTGCTCGGTATATGAGATCTCTCAAAGTGCCAGCTACAGTTTCTGTTTCTTCACCTGATGCTGTTACAGCGTATAATGGTTATCTTACTTCTTCTTCTAAAACACCTGAAGAACATTTTATTGAAACCATCTCACTTGCTGGTTCCTATAAAGATTGGTCCTATTCTGGACAATCTACACAACTAGGTATAGAATTTCTTAAGAGAGGTGATAAAAGTGTATATTACACTAGTAATCCTACCACATTCCACCTAGATGGTGAAGTTATCACCTTTGACAATCTTAAGACACTTCTTTCTTTGAGAGAAGTGAGGACTATTAAGGTGTTTACAACAGTAGACAACATTAACCTCCACACGCAAGTTGTGGACATGTCAATGACATATGGACAACAGTTTGGTCCAACTTATTTGGATGGAGCTGATGTTACTAAAATAAAACCTCATAATTCACATGAAGGTAAAACATTTTATGTTTTACCTAATGATGACACTCTACGTGTTGAGGCTTTTGAGTACTACCACACAACTGATCCTAGTTTTCTGGGTAGGTACATGTCAGCATTAAATCACACTAAAAAGTGGAAATACCCACAAGTTAATGGTTTAACTTCTATTAAATGGGCAGATAACAACTGTTATCTTGCCACTGCATTGTTAACACTCCAACAAATAGAGTTGAAGTTTAATCCACCTGCTCTACAAGATGCTTATTACAGAGCAAGGGCTGGTGAAGCTGCTAACTTTTGTGCACTTATCTTAGCCTACTGTAATAAGACAGTAGGTGAGTTAGGTGATGTTAGAGAAACAATGAGTTACTTGTTTCAACATGCCAATTTAGATTCTTGCAAAAGAGTCTTGAACGTGGTGTGTAAAACTTGTGGACAACAGCAGACAACCCTTAAGG
GTGTAGAAGCTGTTATGTACATGGGCACACTTTCTTATGAACAATTTAAGAAAGGTGTTCAGATACCTTGTACGTGTGGTAAACAAGCTACAAAATATCTAGTACAACAGGAGTCACCTTTTGTTATGATGTCAGCACCACCTGCTCAGTATGAACTTAAGCATGGTACATTTACTTGTGCTAGTGAGTACACTGGTAATTACCAGTGTGGTCACTATAAACATATAACTTCTAAAGAAACTTTGTATTGCATAGACGGTGCTTTACTTACAAAGTCCTCAGAATACAAAGGTCCTATTACGGATGTTTTCTACAAAGAAAACAGTTACACAACAACCATAAAACCAGTTACTTATAAATTGGATGGTGTTGTTTGTACAGAAATTGACCCTAAGTTGGACAATTATTATAAGAAAGACAATTCTTATTTCACAGAGCAACCAATTGATCTTGTACCAAACCAACCATATCCAAACGCAAGCTTCGATAATTTTAAGTTTGTATGTGATAATATCAAATTTGCTGATGATTTAAACCAGTTAACTGGTTATAAGAAACCTGCTTCAAGAGAGCTTAAAGTTACATTTTTCCCTGACTTAAATGGTGATGTGGTGGCTATTGATTATAAACACTACACACCCTCTTTTAAGAAAGGAGCTAAATTGTTACATAAACCTATTGTTTGGCATGTTAACAATGCAACTAATAAAGCCACGTATAAACCAAATACCTGGTGTATACGTTGTCTTTGGAGCACAAAACCAGTTGAAACATCAAATTCGTTTGATGTACTGAAGTCAGAGGACGCGCAGGGAATGGATAATCTTGCCTGCGAAGATCTAAAACCAGTCTCTGAAGAAGTAGTGGAAAATCCTACCATACAGAAAGACGTTCTTGAGTGTAATGTGAAAACTACCGAAGTTGTAGGAGACATTATACTTAAACCAGCAAATAATAGTTTAAAAATTACAGAAGAGGTTGGCCACACAGATCTAATGGCTGCTTATGTAGACAATTCTAGTCTTACTATTAAGAAACCTAATGAATTATCTAGAGTATTAGGTTTGAAAACCCTTGCTACTCATGGTTTAGCTGCTGTTAATAGTGTCCCTTGGGATACTATAGCTAATTATGCTAAGCCTTTTCTTAACAAAGTTGTTAGTACAACTACTAACATAGTTACACGGTGTTTAAACCGTGTTTGTACTAATTATATGCCTTATTTCTTTACTTTATTGCTACAATTGTGTACTTTTACTAGAAGTACAAATTCTAGAATTAAAGCATCTATGCCGACTACTATAGCAAAGAATACTGTTAAGAGTGTCGGTAAATTTTGTCTAGAGGCTTCATTTAATTATTTGAAGTCACCTAATTTTTCTAAACTGATAAATATTATAATTTGGTTTTTACTATTAAGTGTTTGCCTAGGTTCTTTAATCTACTCAACCGCTGCTTTAGGTGTTTTAATGTCTAATTTAGGCATGCCTTCTTACTGTACTGGTTACAGAGAAGGCTATTTGAACTCTACTAATGTCACTATTGCAACCTACTGTACTGGTTCTATACCTTGTAGTGTTTGTCTTAGTGGTTTAGATTCTTTAGACACCTATCCTTCTTTAGAAACTATACAAATTACCATTTCATCTTTTAAATGGGATTTAACTGCTTTTGGCTTAGTTGCAGAGTGGTTTTTGGCATATATTCTTTTCACTAGGTTTTTCTATGTACTTGGATTGGCTGCAATCATGCAATTGTTTTTCAGCTATTTTGCAGTACATTTTATTAGTAATTCTTGGCTTATGTGGTTAATAATTAATCTTGTACAAATGGCCCCGATTTCAGCTATGGTTAGAATGTACATCTTCTTTGCATCATTTTATTATGTATGGAAAAGTTATGTGCATGTTGTAGACGGTTGTAATTCATCAACTTGTATGATGTGTTACAAACGTAATAGAGCAACAAGAGTCGAATGTACAACTATTGTTAATGGTGTT
AGAAGGTCCTTTTATGTCTATGCTAATGGAGGTAAAGGCTTTTGCAAACTACACAATTGGAATTGTGTTAATTGTGATACATTCTGTGCTGGTAGTACATTTATTAGTGATGAAGTTGCGAGAGACTTGTCACTACAGTTTAAAAGACCAATAAATCCTACTGACCAGTCTTCTTACATCGTTGATAGTGTTACAGTGAAGAATGGTTCCATCCATCTTTACTTTGATAAAGCTGGTCAAAAGACTTATGAAAGACATTCTCTCTCTCATTTTGTTAACTTAGACAACCTGAGAGCTAATAACACT AAAGGTTCATTGCCTATTAATGTTATAGTTTTTGATGGTAAATCAAAATGTGAAGAATCATCTGCAAAATCAGCGTC TGTTTACTACAGTCAGCTTATGTGTCAACCTATACTGTTACTAGATCAGGCATTAGTGTCTGATGTTGGTGATAGTG CGGAAGTTGCAGTTAAAATGTTTGATGCTTACGTTAATACGT

ORF1ab-fragment 3 (SEQ ID NO: 5):

CACTAAAGGTTCATTGCCTATTAATGTTATAGTTTTTGATGGTAAATCAAAATGTGAAGAATCATCTG CAAAATCAGCGTCTGTTTACTACAGTCAGCTTATGTGTCAACCTATACTGTTACTAGATCAGGCATTAGTGTCTGA TGTTGGTGATAGTGCGGAAGTTGCAGTTAAAATGTTTGATGCTTACGTTAATACGTTTTCATCAACTTTTAACGTACCAATGGAAAAACTCAAAACACTAGTTGCAACTGCAGAAGCTGAACTTGCAAAGAATGTGTCCTTAGACAATGTCTTATCTACTTTTATTTCAGCAGCTCGGCAAGGGTTTGTTGATTCAGATGTAGAAACTAAAGATGTTGTTGAATGTCTTAAATTGTCACATCAATCTGACATAGAAGTTACTGGCGATAGTTGTAATAACTATATGCTCACCTATAACAAAGTTGAAAACATGACACCCCGTGACCTTGGTGCTTGTATTGACTGTAGTGCGCGTCATATTAATGCGCAGGTAGCAAAAAGTCACAACATTGCTTTGATATGGAACGTTAAAGATTTCATGTCATTGTCTGAACAACTACGAAAACAAATACGTAGTGCTGCTAAAAAGAATAACTTACCTTTTAAGTTGACATGTGCAACTACTAGACAAGTTGTTAATGTTGTAACAACAAAGATAGCACTTAAGGGTGGTAAAATTGTTAATAATTGGTTGAAGCAGTTAATTAAAGTTACACTTGTGTTCCTTTTTGTTGCTGCTATTTTCTATTTAATAACACCTGTTCATGTCATGTCTAAACATACTGACTTTTCAAGTGAAATCATAGGATACAAGGCTATTGATGGTGGTGTCACTCGTGACATAGCATCTACAGATACTTGTTTTGCTAACAAACATGCTGATTTTGACACATGGTTTAGCCAGCGTGGTGGTAGTTATACTAATGACAAAGCTTGCCCATTGATTGCTGCAGTCATAACAAGAGAAGTGGGTTTTGTCGTGCCTGGTTTGCCTGGCACGATATTACGCACAACTAATGGTGACTTTTTGCATTTCTTACCTAGAGTTTTTAGTGCAGTTGGTAACATCTGTTACACACCATCAAAACTTATAGAGTACACTGACTTTGCAACATCAGCTTGTGTTTTGGCTGCTGAATGTACAATTTTTAAAGATGCTTCTGGTAAGCCAGTACCATATTGTTATGATACCAATGTACTAGAAGGTTCTGTTGCTTATGAAAGTTTACGCCCTGACACACGTTATGTGCTCATGGATGGCTCTATTATTCAATTTCCTAACACCTACCTTGAAGGTTCTGTTAGAGTGGTAACAACTTTTGATTCTGAGTACTGTAGGCACGGCACTTGTGAAAGATCAGAAGCTGGTGTTTGTGTATCTACTAGTGGTAGATGGGTACTTAACAATGATTATTACAGATCTTTACCAGGAGTTTTCTGTGGTGTAGATGCTGTAAATTTACTTACTAATATGTTTACACCACTAATTCAACCTATTGGTGCTTTGGACATATCAGCATCTATAGTAGCTGGTGGTATTGTAGCTATCGTAGTAACATGCCTTGCCTACTATTTTATGAGGTTTAGAAGAGCTTTTGGTGAATACAGTCATGTAGTTGCCTTTAATACTTTACTATTCCTTATGTCATTCACTGTACTCTGTTTAACACCAGTTTACTCATTCTTACCTGGTGTTTATTCTGTTATTTACTTGTACTTGACATTTTATCTTACTAATGATGTTTCTTTTTTAGCACATATTCAGTGGATGGTTATGTTCACACCTTTAGTACCTTTCTGGATAACAATTGCTTATATCATTTGTATTTCCACAAAGCATTTCTATTGGTTCTTTAGTAATTACCTAAAGAGACGTGTAGTCTTTAATGGTGTTTCCTTTAGTACTTTTGAAGAAGCTGCGCTGTGCACCTTTTTGTTAAATAAAGAAATGTATCTAAAGTTGCGTAGTGATGT
GCTATTACCTCTTACGCAATATAATAGATACTTAGCTCTTTATAATAAGTACAAGTATTTTAGTGGAGCAATGGATACAACTAGCTACAGAGAAGCTGCTTGTTGTCATCTCGCAAAGGCTCTCAATGACTTCAGTAACTCAGGTTCTGATGTTCTTTACCAACCACCACAAACCTCTATCACCTCAGCTGTT TTGCAGAGTGGTTTTAGAAAAATGGCATTCCCATCTGGTAAAGTTGAGGGTTGTATGGTACAAGTAACTTGTGGTACAACTACACTTAACGGTCTTTGGCTTGATGACGTAGTTTACTGTCCAAGACATGTGATCTGCACCTCTGAAGACATGCTTAACCCTAATTATGAAGATTTACTCATTCGTAAGTCTAATCATAATTTCTTGGTACAGGCTGGTAATGTTCAACTCAGGGTTATTGGACATTCTATGCAAAATTGTGTACTTAAGCTTAAGGTTGATACAGCCAATCCTAAGACACCTAAGTATAAGTTTGTTCGCATTCAACCAGGACAGACTTTTTCAGTGTTAGCTTGTTACAATGGTTCACCATCTGGTGTTTACCAATGTGCTATGAGGCCCAATTTCACTATTAAGGGTTCATTCCT TAATGGTTCATGTGGTAGTGTTGGTTTTAACATAGATTATGACTGTGTCTCTTTTTGTTACATGCACCATATGGAATTACCAACTGGAGTTCATGCTGGCACAGACTTAGAAGGTAACTTTTATGGACCTTTTGTTGACAGGCAAACAGCACAAGCAGCTGGTACGGACACAACTATTACAGTTAATGTTTTAGCTTGGTTGTACGCTGCTGTTATAAATGGAGACAGGTGGTTTCTCAATCGATTTACCACAACT CTTAATGACTTTAACCTTGTGGCTATGAAGTACAATTATGAACCTCTAACACAAGACCATGTTGACATACTAGGACCTCTTTCTGCTCAAACTGGAATTGCCGTTTTAGATATGTGTGCTTCATTAAAAGAATTACTGCAAAATGGTATGAATGGACGTACCATATTGGGTAGTGCTTTATTAGAAGATGAATTTACACCTTTTGATGTTGTTAGACAATGCTCAGGTGTTACTTTCCAAAGTGCAGTGAAAAGAACAATCAAGGGTACACACCACTGGTTGTTACTCACAATTTTGACTTCACTTTTAGTTTTAGTCCAGAGTACTCAATGGTCTTTGTTCTTTTTTTTGTATGAAAATGCCTTTTTACCTTTTGCTATGGGTATTATTGCTATGTCTGCTTTTGCAATGATGTTTGTCAAACATAAGCATGCATTTCTCTGTTTGTTTTTGTTACCTTCTCTTGCCACTGTAGCTTATTTTAATATGGTCTATATGCCTGCTAGTTGGGT GATGCGTATTATGACATGGTTGGATATGGTTGATACTAGTTTGTCTGGTTTTAAGCTAAAAGACTGTGTTATGTATGCATCAGCTGTAGTGTTACTAATCCTTATGACAGCAAGAACTGTGTATGATGATGGTGCTAGGAGAGTGTGGACACTTATGAATGTCTTGACACTCGTTTATAAAGTTTATTATGGTAATGCTTTAGATCAAGCCATTTCCATGTGGGCTCTTATAATCTCTGTTACTTCTAACTACTCAGGTGTAGTTACAACTGTCATGTTTTTGGCCAGAGGTATTGTTTTTATGTGTGTTGAGTATTGCCCTATTTTCTTCATAACTGGTAATACACTTCAGTGTATAATGCTAGTTTATTGTTTCTTAGGCTATTTTTGTACTTGTTACTTTGGCCTCTTTTGTTTACTCAACCGCTACTTTAGACTGACTCTTGGTGTTTATGATTACTTAGTTTCTACACAGGAGTTTAGATATATGAATTCACAGGGACTACTCCCAC 
CCAAGAATAGCATAGATGCCTTCAAACTCAACATTAAATTGTTGGGTGTTGGTGGCAAACCTTGTATCAAAGTAGCCACTGTACAGTCTAAAATGTCAGATGTAAAGTGCACATCAGTAGTCTTACTCTCAGTTTTGCAACAACTCAGAGTAGAATCATCATCTAAATTGTGGGCTCAATGTGTCCAGTTACACAATGACATTCTCTTAGCTAAAGATACTACTGAAGCCTTTGAAAAAATGGTTTCACTACTTTCTGTTTTGCTTTCCATGCAGGGTGCTGTAGACATAAACAAGCTTTGTGAAGAAATGCTGGACAACAGGGCAACCTTACAAGCTATAGCCTCAGAGTTTAGTTCCCTTCCATCATATGCAGCTTTTGCTACTGCTCAAGAAGCTTATGAGCAGGCTGTTGCTAATGGTGAT TCTGAAGTTGTTCTTAAAAAGTTGAAGAAGTCTTTGAATGTGGCTAAATCTGAATTTGACCGTGATGCAGCCATGCA ACGTAAGTTGGAAAAGATGGCTGATCAAGCTATGACCCAAATGTATAAACAGGCTAGATCTGAGGACAAGAGGGCAA AAGTTACTAGTGCTATGCAGAC

ORF1ab-fragment 4 (SEQ ID NO: 6):

GAGCAGGCTGTTGCTAATGGTGATTCTGAAGTTGTTCTTAAAAAGTTGAAGAAGTCTTTGAATGTGGCT AAATCTGAATTTGACCGTGATGCAGCCATGCAACGTAAGTTGGAAAAGATGGCTGATCAAGCTATGACCCAAATGTA TAAACAGGCTAGATCTGAGGACAAGAGGGCAAAAGTTACTAGTGCTATGCAGACAATGCTTTTCACTATGCTTAGAAAGTTGGATAATGATGCACTCAACAACATTATCAACAATGCAAGAGATGGTTGTGTTCCCTTGAACATAATACCTCTTACAACAGCAGCCAAACTAATGGTTGTCATACCAGACTATAACACATATAAAAATACGTGTGATGGTACAACATTTACTTATGCATCAGCATTGTGGGAAATCCAACAGGTTGTAGATGCAGATAGTAAAATTGTTCAACTTAGTGAAATTAGTATGGACAATTCACCTAATTTAGCATGGCCTCTTATTGTAACAGCTTTAAGGGCCAATTCTGCTGTCAAATTACAGAATAATGAGCTTAGTCCTGTTGCACTACGACAGATGTCTTGTGCTGCCGGTACTACACAAACTGCTTGCACTGATGACAATGCGTTAGCTTACTACAACACAACAAAGGGAGGTAGGTTTGTACTTGCACTGTTATCCGATTTACAGGATTTGAAATGGGCTAGATTCCCTAAGAGTGATGGAACTGGTACTATCTATACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACACACCTAAAGGTCCTAAAGTGAAGTATTTATACTTTATTAAAGGATTAAACAACCTAAATAGAGGTATGGTACTTGGTAGTTTAGCTGCCACAGTACGTCTACAAGCTGGTAATGCAACAGAAGTGCCTGCCAATTCAACTGTATTATCTTTCTGTGCTTTTGCTGTAGATGCTGCTAAAGCTTACAAAGATTATCTAGCTAGTGGGGGACAACCAATCACTAATTGTGTTAAGATGTTGTGTACACACACTGGTACTGGTCAGGCAATAACAGTTACACCGGAAGCCAATATGGATCAAGAATCCTTTGGTGGTGCATCGTGTTGTCTGTACTGCCGTTGCCACATAGATCATCCAAATCCTAAAGGATTTTGTGACTTAAAAGGTAAGTATGTACAAATACCTACAACTTGTGCTAATGACCCTGTGGGTTTTACACTTAAAAACACAGTCTGTACCGTCTGCGGTATGTGGAAAGGTTATGGCTGTAGTTGTGATCAACTCCGCGAACCCATGCTTCAGTCAGCTGATGCACAATCGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCACAGGCACTAGTACTGATGTCGTATACAGGGCTTTTGACATCTACAATGATAAAGTAGCTGGTTTTGCTAAATTCCTAAAAACTAATTGTTGTCGCTTCCAAGAAAAGGACGAAGATGACAATTTAATTGATTCTTACTTTGTAGTTAAGAGACACACTTTCTCTAACTACCAACATGAAGAAACAATTTATAATTTACTTAAGGATTGTCCAGCTGTTGCTAAACATGACTTCTTTAAGTTTAGAATAGACGGTGACATGGTACCACATATATCACGTCAACGTCTTACTAAATACACAATGGCAGACCTCGTCTATGCTTTAAGGCATTTTGATGAAGGTAATTGTGACACATTAAAAGAAATACTTGTCACATACAATTGTTGTGATGATGATTATTTCAATAAAAAGGACTGGTATGATTTTGTAGAAAACCCAGATATATTACGCGTATACGCCAACTTAGGTGAACGTGTACGCCAAGCTTTGTTAAAAACAGTACAATTCTGTGATGCCATGCGAAATGCTGGTATTGTTGGTGTACTGACATTAGATAATCAAGATCTCAATGGTAACTGGTATGATTTCGGTGATTTCATACAAACCACGCCAGGTAGTGGAGTTCCTGTTGTAGATTCTTATTATT
CATTGTTAATGCCTATATTAACCTTGACCAGGGCTTTAACTGCAGAGTCACATGTTGACACTGACTTAACAAAGCCTTACATTAAGTGGGATTTGTTAAAATATGACTTCACGGAAGAGAGGTTAAAACTCTTTGACCGTTATTTTAAATATTGGGATCAGACATACCACCCAAATTGTGTTAACTGTTTGGATGACAGATGCATTCTGCATTGTGCAAACTTTAATGTTTTATTCTCTACAGTGTTCCCACCTACAAGTTTTGGACCACTAGTGAGAAAAATATTTGTTGATGGTGTTCCATTTGTAGTTTCAACTGGATACCACTTCAGAGAGCTAGGTGTTGTACATAATCAGGATGTAAACTTACATAGCTCTAGACTTAGTTTTAAGGAATTACTTGTGTATGCTGCTGACCCTGCTATGCACGCTGCTTCTGGTAATCTATTACTAGATAAACGCACTACGTGCTTTTCAGTAGCTGCACTTACTAACAATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAACAAAGACTTCTATGACTTTGCTGTGTCTAAGGGTTTCTTTAAGGAAGGAAGTTCTGTTGAATTAAAACACTTCTTCTTTGCTCAGGATGGTAATGCTGCTATCAGCGATTATGACTACTATCGTTATAATCTACCAACAATGTGTGATATCAGACAACTACTATTTGTAGTTGAAGTTGTTGATAAGTACTTTGATTGTTACGATGGTGGCTGTATTAATGCTAACCAAGTCATCGTCAACAACCTAGACAAATCAGCTGGTTTTCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGATTCAATGAGTTATGAGGATCAAGATGCACTTTTCGCATATACAAAACGTAATGTCATCCCTACTATAACTCAAATGAATCTTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTAGTACTATGACCAATAGACAGTTTCATCAAAAATTATTGAAATCAATAGCCGCCACTAGAGGAGCTACTGTAGTAATTGGAACAAGCAAATTCTATGGTGGTTGGCACAACATGTTAAAAACTGTTTATAGTGATGTAGAAAACCCTCACCTTATGGGTTGGGATTATCCTAAATGTGATAGAGCCATGCCTAACATGCTTAGAATTATGGCCTCACTTGTTCTTGCTCGCAAACATACAACGTGTTGTAGCTTGTCACACCGTTTCTATAGATTAGCTAATGAGTGTGCTCAAGTATTGAGTGAAATGGTCATGTGTGGCGGTTCACTATATGTTAAACCAGGTGGAACCTCATCAGGAGATGCCACAACTGCTTATGCTAATAGTGTTTTTAACATTTGTCAAGCTGTCACGGCCAATGTTAATGCACTTTTATCTACTGATGGTAACAAAATTGCCGATAAGTATGTCCGCAATTTACAACACAGACTTTATGAGTGTCTCTATAGAAATAGAGATGTTGACACAGACTTTGTGAATGAGTTTTACGCATATTTGCGTAAACATTTCTCAATGATGATACTCTCTGACGATGCTGTTGTGTGTTTCAATAGCACTTATGCATCTCAAGGTCTAGTGGCTAGCATAAAGAACTTTAAGTCAGTTCTTTATTATCAAAACAATGTTTTTATGTCTGAAGCAAAATGTTGGACTGAGACTGACCTTACTAAAGGACCTCATGAATTTTGCTCTCAACATACAATGCTAGTTAAACAGGGTGATGATTATGTGTACCTTCCTTACCCAGATCCATCAAGAATCCTAGGGGCCGGCTGTTTTGTAGATGATATCGTAAAAACAGATGGTACACTTATGATTGAACGGTTCGTGTCTTTAGCTATAGATGCTTACCCACTTACTAAACATCCTAATCAGGAGTATGCTGATGTCTTTCATTTGTACTTACAATACATAAGAAAGCTACATGATGAGTTAACAGGACACATGTTAGACATGTATTCTGTT
ATGCTTACTAATGATAACACTTCAAGGTATTGGGAACCTGAGTTTTATGAGGCTATGTACACACCGCATACAGTCTTACAGGCTGTTGGGGCTTGTGTTCTTTGCAATTCACAGACTTCATTAAGATGTGGTGCTTGCATACGTAGACCATTCTTATGTTGTAAATGCTGTTACGACCATGTCATATCAACATCACATAAATTAGTCTTGTCTGTTAATCCGTATGTTTGCAATGCTCCAGGTTGTGATGTCACAGATGTGACTCAACTTTACTTAGGAGGTATGAGCTATTATTGTAAATCACATAAACCACCCATTAGTTTTCCATTGTGTGCTAATGGACAAGTTTTTGGTTTATATAAAAATACATGTGTTGGTAGCGATAATGTTACTGACTTTAATGCAATTGCAACATGTGACTGGACAAATGCTGGTGATTACATTTTAGCTAACACCTGTACTGAAAGACTCAAGCTTTTTGCAGCAGAAACGCTCAAAGCTACTGAGGAGACATTTAAACTGTCTTATGGTATTGCTACTGTACGTGAAGTGCTGTCTGACAGAGAATTACATCTTTCATGGGAAGTTGGTAAACCTAGACCACCACTTAACCGAAATTATGTCTTTACTGGTTATCGTGTAACTAAAAACAGTAAAGTACAAATAGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTTTACCGAGGTACAACAACTTACAAATTAAATGTTGGTGATTATTTTGTGCTGACATCA CATACAGTAATGCCATTAAGTGCACCTACACTAGTGCCACAAGAGCACTATGTTAGAATTACTGGCTTATACCCAAC ACTCAATATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGGTTGGTATGCAAAAGTATTCTACACTC

ORF1ab-fragment 5 (SEQ ID NO: 7):

caacaacttacaaattaaatgttggtgattattttgtgctgacatcacatacagtaatgccattaagt gcacctacactagtgccacaagagcactatgttagaattactggcttatacccaacactcaatatctcagatgagt tttctagcaatgttgcaaattatcaaaaggttggtatgcaaaagtattctacactccagggaccacctggtactggtaagagtcattttgctattggcctagctctctactacccttctgctcgcatagtgtatacagcttgctctcatgccgctgttgatgcactatgtgagaaggcattaaaatatttgcctatagataaatgtagtagaattatacctgcacgtgctcgtgtagagtgttttgataaattcaaagtgaattcaacattagaacagtatgtcttttgtactgtaaatgcattgcctgagacgacagcagatatagttgtctttgatgaaatttcaatggccacaaattatgatttgagtgttgtcaatgccagattacgtgctaagcactatgtgtacattggcgaccctgctcaattacctgcaccacgcacattgctaactaagggcacactagaaccagaatatttcaatt

cagtgtgtagacttatgaaaactataggtccagacatgttcctcggaacttgtcggcgttgtcctgctgaaattgttgacactgtgagtgctttggtttatgataataagcttaaagcacataaagacaaatcagctcaatgctttaaaatgttttataagggtgttatcacgcatgatgtttcatctgcaattaacaggccacaaataggcgtggtaagagaattccttacacgtaaccctgcttggagaaaagctgtctttatttcaccttataattcacagaatgctgtagcctcaaagattttgggactaccaactcaaactgttgattcatcacagggctcagaatatgactatgtcatattcactcaaaccactgaaacagctcactcttgtaatgtaaacagatttaatgttgctattaccagagcaaaagtaggcatactttgcataatgtctgatagagacctttatgacaagttgcaatttacaagtcttgaaattccacgtaggaatgtggcaactttacaagctgaaaatgtaacaggactctttaaagattgtagtaaggtaatcactgggttacatcctacacaggcacctacacacctcagtgttgacactaaattcaaaactgaaggtttatgtgttgacatacctggcatacctaaggacatgacctatagaagactcatctctatgatgggttttaaaatgaattatcaagttaatggttaccctaacatgtttatcacccgcgaagaagctataagacatgtacgtgcatggattggcttcgatgtcgaggggtgtcatgctactagagaagctgttggtaccaatttacctttacagctaggtttttctacaggtgttaacctagttgctgtacctacaggttatgttgatacacctaataatacagatttttccagagttagtgctaaaccaccgcctggagatcaatttaaacacctcataccacttatgtacaaaggacttccttggaatgtagtgcgtataaagattgtacaaatgttaagtgacacacttaaaaatctctctgacagagtcgtatttgtcttatgggcacatggctttgagttgacatctatgaagtattttgtgaaaataggacctgagcgcacctgttgtctatgtgatagacgtgccacatgcttttccactgcttcagacacttatgcctgttggcatcattctattggatttgattacgtctataatccgtttatgattgatgttcaacaatggggttttacaggtaacctacaaagcaaccatgatctgtattgtcaagtccatggtaatgcacatgtagctagttgtgatgcaatcatgactaggtgtctagctgtccacgagtgctttgttaagcgtgttgactggactattgaatatcctataattggtgatgaactgaagattaatgcggcttgtagaaaggttcaacacatggttgttaaagctgcattattagcagacaaattcccagttcttcacgacattggtaaccctaaagctattaagtgtgtacctcaagctgatgtagaatggaagttctatgatgcacagccttgtagtgacaaagcttataaaatagaagaattattctattcttatgccacacattctgacaaattcacagatggtgtatgcctattttggaattgcaatgtcgatagatatcctgctaattccattgtttgtagatttgacactagagtgctatctaaccttaacttgcctggttgtgatggtggcagtttgtatgtaaataaacatgcattccacacaccagcttttgataaaagtgcttttgttaatttaaaacaattaccatttttctattactctgacagtccatgtgagtctcatggaaaacaagtagtgtcagatatagattatgtaccactaaagtctgctacgtgtataacacgttgcaatttaggtggtgctgtctgtagacatcatgctaat
gagtacagattgtatctcgatgcttataacatgatgatctcagctggctttagcttgtgggtttacaaacaatttgatacttataacctctggaacacttttacaagacttcagagtttagaaaatgtggcttttaatgttgtaaataagggacactttgatggacaacagggtgaagtaccagtttctatcatta ataacactgtttacacaaaagttgatggtgttgatgtagaattgtttgaaaataaaacaacattacctgttaatgt agcatttgagctttgggctaagcgcaacattaaaccagtaccagaggtgaaaatactcaataatttgggtgtggac attgctgctaatactgtgatct

ORF1ab-fragment 6 (SEQ ID NO: 8):

gggtgaagtaccagtttctatcattaataacactgtttacacaaaagttgatggtgttgatgtagaat tgtttgaaaataaaacaacattacctgttaatgtagcatttgagctttgggctaagcgcaacattaaaccagtacc agaggtgaaaatactcaataatttgggtgtggacattgctgctaatactgtgatctgggactacaaaagagatgctccagcacatatatctactattggtgtttgttctatgactgacatagccaagaaaccaactgaaacgatttgtgcaccactcactgtcttttttgatggtagagttgatggtcaagtagacttatttagaaatgcccgtaatggtgttcttattacagaaggtagtgttaaaggtttacaaccatctgtaggtcccaaacaagctagtcttaatggagtcacattaattggagaagccgtaaaaacacagttcaattattataagaaagttgatggtgttgtccaacaattacctgaaacttactttactcagagtagaaatttacaagaatttaaacccaggagtcaaatggaaattgatttcttagaattagctatggatgaattcattgaacggtataaattagaaggctatgccttcgaacatatcgtttatggagattttagtcatagtcagttaggtggtttacatctactgattggactagctaaacgttttaaggaatcaccttttgaattagaagattttattcctatggacagtacagttaaaaactatttcataacagatgcgcaaacaggttcatctaagtgtgtgtgttctgttattgatttattacttgatgattttgttgaaataataaaatcccaagatttatctgtagtttctaaggttgtcaaagtgactattgactatacagaaatttcatttatgctttggtgtaaagatggccatgtagaaacattttacccaaaattacaatctagtcaagcgtggcaaccgggtgttgctatgcctaatctttacaaaatgcaaagaatgctattagaaaagtgtgaccttcaaaattatggtgatagtgcaacattacctaaaggcataatgatgaatgtcgcaaaatatactcaactgtgtcaatatttaaacacattaacattagctgtaccctataatatgagagttatacattttggtgctggttctgataaaggagttgcaccaggtacagctgttttaagacagtggttgcctacgggtacgctgcttgtcgattcagatcttaatgactttgtctctgatgcagattcaactttgattggtgattgtgcaactgtacatacagctaataaatgggatctcattattagtgatatgtacgaccctaagactaaaaatgttacaaaagaaaatgactctaaagagggttttttcacttacatttgtgggtttatacaacaaaagctagctcttggaggttccgtggctataaagataacagaacattcttggaatgctgatctttataagctcatgggacacttcgcatggtggacagcctttgttactaatgtgaatgcgtcatcatctgaagcatttttaattggatgtaattatcttggcaaaccacgcgaacaaatagatggttatgtcatgcatgcaaattacatattttggaggaatacaaatccaattcagttgtcttcctattctttatttgacatgagtaaatttccccttaaattaaggggtactgctgttatgtctttaaaagaaggtcaaatcaatgatatgattttatctcttcttagtaaaggtagacttataattagagaaaacaacagagttgttatttctagtgatgttcttgttaacaacGCGGCCGCCGAGGGCCGCGGCAGCCTGCTGACCTGCGGCGACGTGGAGGAGAACCCCGGCCCCGCGGCCGCCatggtgagcaagggcgaggaggacaacatggccatcatcaaggagttcat
gcgcttcaaggtgcacatggagggcagcgtgaacggccacgagttcgagatcgagggcgagggcgagggccgcccctacgagggcacccagaccgccaagctgaaggtgaccaagggcggccccctgcccttcgcctgggacatcctgagcccccagttcatgtacggcagcaaggcctacgtgaagcaccccgccgacatccccgactacctgaagctgagcttccccgagggcttcaagtgggagcgcgtgatgaacttcgaggacggcggcgtggtgaccgtgacccaggacagcagcctgcaggacggcgagttcatctacaaggtgaagctgcgcggcaccaacttccccagcgacggccccgtgatgcagaagaagaccatgggctgggaggccagcagcgagcgcatgtaccccgaggacggcgccctgaagggcgagatcaagcagcgcctgaagctgaaggacggcggccactacgacgccgaggtgaagaccacctacaaggccaagaagcccgtgcagctgcccggcgcctacaacgtgaacatcaagctggacatcaccagccacaacgaggactacaccatcgtggagcagtacgagcgcgccgagggccgccacagcaccggcggcatggacgagctgtacaagGCGGCCGCCtaacaatctttaatcagtgtgtaacattagggaggacttgaaagagccaccacattttcaccgaggccacgcggagtacgatcgagtgtacagtgaacaatgctagggagagctgcctatatggaagagccctaatgtgtaaaattaattttagtagtgctatccccatgtgattttaatagcttcttaggagaatgacaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaAAAAAAAAAAAAgggtcggcatggcatctccacctcctcgcggtccgacctgggcatccgaaggaggacgcacgtccactcggatggctaagggagGCTAGCcagacatgataagatacattgatgagtttggacaaaccacaactagaatgcagtgaaaaaaatgctttatttgtgaaatttgtgatgctattgctttatttgtaaccattataagctgcaataaacaagttaacaacaacaattgcattcattttatgtttcaggttcagggggaggtgtgggaggttttttaaagcaagtaaaacctctacaaatgtggtaCAGAAAGTCAAAAGCCTCCGACCGGAGGCTTTTGACTAT TACTGCCGACATGGAAGCCATCACAAACGGCATGATGAACCTATAAACGCAGAAAGGCCCACCCGAAGGTGAGCCAG TGTGATTACCAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCAAAAAACCCCTCAAGACCCGTTTAGAG GCCCCAA

Split fragments of the StruΔS protein expression sequence:

StruΔS-fragment 1 (SEQ ID NO: 9):

CTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAG CGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTC GAAAGACTGGGCCTTCCTGCCACTCATCGCAGTACTGTTGTAATTCATTAAgcatgTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCTAATACGACTCACTATAGGGattaaaggtttataccttcccaggtaacaaaccaaccaactttcgatctcttgtagatctgttctctaaacgaactttaaaatctgtgtggctgtcactcggctgcatgcttagtgcactcacgcagtataattaataactaattactgtcgttgacaggacacgagtaactcgtctatcttctgcaggctgcttacggtttcgtccgtgttgcagccgatcatcagcacatctaggtttcgtccgggtgtgaccgaaaggtaagacgaacttatggatttgtttatgagaatcttcacaattggaactgtaactttgaagcaaggtgaaatcaaggatgctactccttcagattttgttcgcgctactgcaacgataccgatacaagcctcactccctttcggatggcttattgttggcgttgcacttcttgctgtttttcagagcgcttccaaaatcataaccctcaaaaagagatggcaactagcactctccaagggtgttcactttgtttgcaacttgctgttgttgtttgtaacagtttactcacaccttttgctcgttgctgctggccttgaagccccttttctctatctttatgctttagtctacttcttgcagagtataaactttgtaagaataataatgaggctttggctttgctggaaatgccgttccaaaaacccattactttatgatgccaactattttctttgctggcatactaattgttacgactattgtataccttacaatagtgtaacttcttcaattgtcattacttcaggtgatggcacaacaagtcctatttctgaacatgactaccagattggtggttatactgaaaaatgggaatctggagtaaaagactgtgttgtattacacagttacttcacttcagactattaccagctgtactcaactcaattgagtacagacactggtgttgaacatgttaccttcttcatctacaataaaattgttgatgagcctgaagaacatgtccaaattcacacaatcgacggttcatccggagttgttaatccagtaatggaaccaatttatgatgaaccgacgacgactactagcgtgcctttgtaagcacaagctgatgagtacgaacttatgtactcattcgtttcggaagagacaggtacgttaatagttaatagcgtacttctttttcttg
ctttcgtggtattcttgctagttacactagccatccttactgcgct tcgattgtgtgcgtactgctgcaatattgttaacgtgagtcttgtaaaaccttctttttacgtttactctcgtgtt aaaaatctgaattcttctagagttcctgatcttctggtctaaacgaactaaatattatattagtttttctgtttgg aactttaattttagccatg

StruΔS-fragment 2 (SEQ ID NO: 10):

ctagttacactagccatccttactgcgcttcgattgtgtgcgtactgctgcaatattgttaacgtgag tcttgtaaaaccttctttttacgtttactctcgtgttaaaaatctgaattcttctagagttcctgatcttctggtc taaacgaactaaatattatattagtttttctgtttggaactttaattttagccatggcagattccaacggtactattaccgttgaagagcttaaaaagctccttgaacaatggaacctagtaataggtttcctattccttacatggatttgtcttctacaatttgcctatgccaacaggaataggtttttgtatataattaagttaattttcctctggctgttatggccagtaactttagcttgttttgtgcttgctgctgtttacagaataaattggatcaccggtggaattgctatcgcaatggcttgtcttgtaggcttgatgtggctcagctacttcattgcttctttcagactgtttgcgcgtacgcgttccatgtggtcattcaatccagaaactaacattcttctcaacgtgccactccatggcactattctgaccagaccgcttctagaaagtgaactcgtaatcggagctgtgatccttcgtggacatcttcgtattgctggacaccatctaggacgctgtgacatcaaggacctgcctaaagaaatcactgttgctacatcacgaacgctttcttattacaaattgggagcttcgcagcgtgtagcaggtgactcaggttttgctgcatacagtcgctacaggattggcaactataaattaaacacagaccattccagtagcagtgacaatattgctttgcttgtacagtaagtgacaacagatgtttcatctcgttgactttcaggttactatagcagagatattactaattattatgaggacttttaaagtttccatttggaatcttgattacatcataaacctcataattaaaaatttatctaagtcactaactgagaataaatattctcaattagatgaagagcaaccaatggagattgattaaacgaacatgaaaattattcttttcttggcactgataacactcgctacttgtgagctttatcactaccaagagtgtgttagaggtacaacagtacttttaaaagaaccttgctcttctggaacatacgagggcaattcaccatttcatcctctagctgataacaaatttgcactgacttgctttagcactcaatttgcttttgcttgtcctgacggcgtaaaacacgtctatcagttacgtgccagatcagtttcacctaaactgttcatcagacaagaggaagttcaagaactttactctccaatttttcttattgttgcggcaatagtgtttataacactttgcttcacactcaaaagaaagacagaatgattgaactttcattaattgacttctatttgtgctttttagcctttctgctattccttgttttaattatgcttattatcttttggttctcacttgaactgcaagatcataatgaaacttgtcacgcctaaacgaacatgaaatttcttgttttcttaggaatcatcacaactgtagctgcatttcaccaagaatgtagtttacagtcatgtactcaacatcaaccatatgtagttgatgacccgtgtcctattcacttctattctaaatggtatattagagtaggagctagaaaatcagcacctttaattgaattgtgcgtggatgaggctggttctaaatcacccattcagtacatcgatatcggtaattatacagtttcctgtttaccttttacaattaattgccaggaacctaaattgggtagtcttgtagtgcgttgttcgttctatgaagactttttagagtatcatgacgttcgtgttgttttagatttcatctaaacgaacaaactaaaatgtctgataatggaccccaaaatcagcgaaatgcaccccgcattacgtt
tggtggaccctcagattcaactggcagtaaccagaatggagaacgcagtggggcgcgatcaaaacaacgtcggccccaaggtttacccaataatactgcgtcttggttcaccgctctcactcaacatggcaaggaagaccttaaattccctcgaggacaaggcgttccaattaacaccaatagcagtccagatgaccaaattggctactaccgaagagctaccagacgaattcgtggtggtgacggtaaaatgaaagatctcagtccaagatggtatttctactacctaggaactgggccagaagctggacttccctatggtgctaacaaagacggcatcatatgggttgcaactgagggagccttgaatacaccaaaagatcacattggcacccgcaatcctgctaacaatgctgcaatcgtgctacaacttcctcaaggaacaacattgccaaaaggcttctacgcagaagggagcagaggcggcagtcaagcctcttctcgttcctcatcacgtagtcgcaacagttcaagaaattcaactccaggcagcagtaggggaacttctcctgctagaatggctggcaatggcggtgatgctgctcttgctttgctgctgcttgacagattgaaccagcttgagagcaaaatgtctggtaaaggccaacaacaacaaggccaaactgtcactaagaaatctgctgctgaggcttctaagaagcctcggcaaaaacgtactgccactaaagcatacaatgtaacacaagctttcggcagacgtggtccagaacaaacccaaggaaatt ttggggaccaggaactaatcagacaaggaactgattacaaacattggccgcaaattgcacaatttgcccccagcgc ttcagcgttcttcggaatgtcgcgcattggcatggaagtcacaccttcgggaacgtggttgacctacacaggtgcc atcaaattggatgacaaagatccaaatt

StruΔS-fragment 3 (SEQ ID NO: 11):

agaacaaacccaaggaaattttggggaccaggaactaatcagacaaggaactgattacaaacattggc cgcaaattgcacaatttgcccccagcgcttcagcgttcttcggaatgtcgcgcattggcatggaagtcacaccttc gggaacgtggttgacctacacaggtgccatcaaattggatgacaaagatccaaatttcaaagatcaagtcattttgctgaataagcatattgacgcatacaaaacattcccaccaacagagcctaaaaaggacaaaaagaagaaggctgatgaaactcaagccttaccgcagagacagaagaaacagcaaactgtgactcttcttcctgctgcagatttggatgatttctccaaacaattgcaacaatccatgagcagtgctgactcaactcaggcctaaactcatgcagaccacacaaggcagatgggctatataaacgttttcgcttttccgtttacgatatatagtctactcttgtgcagaatgaattctcgtaactacatagcacaagtagatgtagttaactttaatctcacaGAGGGCCGCGGCAGCCTGCTGACCTGCGGCGACGTGGAGGAGAACCCCGGCCCCATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGCGTGGTGCCCATCCTGGTGGAGCTGGACGGCGACGTGAACGGCCACAAGTTCAGCGTGAGCGGCGAGGGCGAGGGCGACGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTGGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGAGCGCCATGCCCGAGGGCTACGTGCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGCCACAAGCTGGAGTACAACTACAACAGCCACAACGTGTACATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTGGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGAGCGCCCTGAGCAAGGACCCCAACGAGAAGCGCGACCACATGGTGCTGCTGGAGTTCGTGACCGCCGCCGGCATCACCCTGGGCATGGACGAGCTGTACAAGGCCACCAACTTCAGCCTGCTGAAGCAGGCCGGCGACGTGGAGGAGAACCCCGGCCCCATGATCGAGCAGGACGGCCTGCACGCCGGCAGCCCCGCCGCCTGGGTGGAGCGCCTGTTCGGCTACGACTGGGCCCAGCAGACCATCGGCTGCAGCGACGCCGCCGTGTTCCGCCTGAGCGCCCAGGGCCGCCCCGTGCTGTTCGTGAAGACCGACCTGAGCGGCGCCCTGAACGAGCTGCAGGACGAGGCCGCCCGCCTGAGCTGGCTGGCCACCACCGGCGTGCCCTGCGCCGCCGTGCTGGACGTGGTGACCGAGGCCGGCCGCGACTGGCTGCTGCTGGGCGAGGTGCCCGGCCAGGACCTGCTGAGCAGCCACCTGGCCCCCGCCGAGAAGGTGAGCATCATGGCCGACGCCATGCGCCGCCTGCACACCCTGGACCCCGCCACCTGCCCCTTCGACCACCAGGCCAAGCACCGCATCGAGCGCGCCCGCACCCGCATGGAGGCCGGCCTGGTGGACCAGGACGACCTGGACGAGGAGCACCAGGGCCTGGCCCCCGCCGAGCTGTTCGCCCGCCTGAAGGCCAGCATGCCCGACGGCGAGGACCTGGTGGTGACCCACGGCGACGCCTGCCTGCCCAACATCATGGTGGAGAACGGCCGCTT
CAGCGGCTTCATCGACTGCGGCAGGCTGGGCGTGGCCGACCGCTACCAGGACATCGCCCTGGCCACCCGCGACATCGCCGAGGAGCTGGGCGGCGAGTGGGCCGACCGCTTCCTGGTGCTGTACGGCATCGCCGCCCCCGACAGCCAGCGCATCGCCTTCTACCGCCTGCTGGACGAGTTCTTCTAAcaatctttaatcagtgtgtaacattagggaggacttgaaagagccaccacattttcaccgaggccacgcggagtacgatcgagtgtacagtgaacaatgctagggagagctgcctatatggaagagccctaatgtgtaaaattaattttagtagtgctatccccatgtgattttaatagcttcttaggagaatgacaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaAAAAAAAAAAAAgggtcggcatggcatctccacctcctcgcggtccgacctgggcatccgaaggaggacgcacgtccactcggatggctaagggagGCTAGCcagacatgataagatacattgatgagtttggacaaaccacaactagaatgcagtgaaaaaaatgctttatttgtgaaatttgtgatgctattgctttatttgtaaccattataagctgcaataaacaagttaacaacaacaattgcattcattttatgtttcaggttcagggggaggtgtgggaggttttttaaagcaagtaaaacctctacaaatgtggtaCAGAAAGTCAAAAGCCTCCGACCGGAGGCTTTTGACTATTACTGCCGACATGGAAGCCATCACAAA CGGCATGATGAACCTATAAACGCAGAAAGGCCCACCCGAAGGTGAGCCAGTGTGATTACCAGCGGCATCAGCACCTT GTCGCCTTGCGTATAATATTTGCAAAAAACCCCTCAAGACCCGTTTAGAGGCCCCAA

5' UTR used in Example 1 (SEQ ID NO: 12):

ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGACACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATCATCAGCACATCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAG

3' UTR used in Example 1, including the poly(A) tail (SEQ ID NO: 13):

CAATCTTTAATCAGTGTGTAACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCACCGAGGCCACGCGGAGTACGATCGAGTGTACAGTGAACAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTGATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

The embodiments of the present invention are not limited to the examples described above. Those skilled in the art may make various changes and modifications in form and detail without departing from the spirit and scope of the present invention, and all such changes and modifications are considered to fall within the scope of the present invention.

Sequence listing

<110> Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences

<120> Preparation method and application of SARS-CoV-2 virus-like particle

<130> XJYZL20210699

<140> CN202111271508.1

<141> 2021-10-29

<160> 13

<170> PatentIn version 3.5

<210> 1

<211> 2773

<212> DNA

<213> Artificial Sequence

<220>

<223> S-fragment 1

<400> 1

gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc caggctttac 60

actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa tttcacacag 120

gaaacagcta tgaccatgat tacgccaagc ttggtaccga gctcggatcc actagtaacg 180

gccgccagtg tgctggaatt ctgcagatat aacttcgtat aatgtatgct atacgaagtt 240

atccgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg acccccgccc 300

attgacgtca atagtaacgc caatagggac tttccattga cgtcaatggg tggagtattt 360

acggtaaact gcccacttgg cagtacatca agtgtatcat atgccaagta cgccccctat 420

tgacgtcaat gacggtaaat ggcccgcctg gcattgtgcc cagtacatga ccttatggga 480

ctttcctact tggcagtaca tctacgtatt agtcatcgct attaccatgg tcgaggtgag 540

ccccacgttc tgcttcactc tccccatctc ccccccctcc ccacccccaa ttttgtattt 600

atttattttt taattatttt gtgcagcgat gggggcgggg gggggggggg ggcggggcga 660

ggggcggggc ggggcgaggc ggagaggtgc ggcggcagcc aatcagagcg gcgcgctccg 720

aaagtttcct tttatggcga ggcggcggcg gcggcggccc tataaaaagc gaagcgcgcg 780

gcgggcggga gtcgctgcgc gctgccttcg ccccgtgccc cgctccgccg ccgcctcgcg 840

ccgcccgccc cggctctgac tgaccgcgtt actcccacag gtgagcgggc gggacggccc 900

ttctcctccg ggctgtaatt agctgagcaa gaggtaaggg tttaagggat ggttggttgg 960

tggggtatta atgtttaatt acctggagca cctgcctgaa atcacttttt ttcaggttgg 1020

accggtgcca ccatgttcgt gttcctggtg ctgctgcccc tggtgagcag ccagtgcgtg 1080

aacctgacca cccgcaccca gctgcccccc gcctacacca acagcttcac ccgcggcgtg 1140

tactaccccg acaaggtgtt ccgcagcagc gtgctgcaca gcacccagga cctgttcctg 1200

cccttcttca gcaacgtgac ctggttccac gccatccacg tgagcggcac caacggcacc 1260

aagcgcttcg acaaccccgt gctgcccttc aacgacggcg tgtacttcgc cagcaccgag 1320

aagagcaaca tcatccgcgg ctggatcttc ggcaccaccc tggacagcaa gacccagagc 1380

ctgctgatcg tgaacaacgc caccaacgtg gtgatcaagg tgtgcgagtt ccagttctgc 1440

aacgacccct tcctgggcgt gtactaccac aagaacaaca agagctggat ggagagcgag 1500

ttccgcgtgt acagcagcgc caacaactgc accttcgagt acgtgagcca gcccttcctg 1560

atggacctgg agggcaagca gggcaacttc aagaacctgc gcgagttcgt gttcaagaac 1620

atcgacggct acttcaagat ctacagcaag cacaccccca tcaacctggt gcgcgacctg 1680

ccccagggct tcagcgccct ggagcccctg gtggacctgc ccatcggcat caacatcacc 1740

cgcttccaga ccctgctggc cctgcaccgc agctacctga cccccggcga cagcagcagc 1800

ggctggaccg ccggcgccgc cgcctactac gtgggctacc tgcagccccg caccttcctg 1860

ctgaagtaca acgagaacgg caccatcacc gacgccgtgg actgcgccct ggaccccctg 1920

agcgagacca agtgcaccct gaagagcttc accgtggaga agggcatcta ccagaccagc 1980

aacttccgcg tgcagcccac cgagagcatc gtgcgcttcc ccaacatcac caacctgtgc 2040

cccttcggcg aggtgttcaa cgccacccgc ttcgccagcg tgtacgcctg gaaccgcaag 2100

cgcatcagca actgcgtggc cgactacagc gtgctgtaca acagcgccag cttcagcacc 2160

ttcaagtgct acggcgtgag ccccaccaag ctgaacgacc tgtgcttcac caacgtgtac 2220

gccgacagct tcgtgatccg cggcgacgag gtgcgccaga tcgcccccgg ccagaccggc 2280

aagatcgccg actacaacta caagctgccc gacgacttca ccggctgcgt gatcgcctgg 2340

aacagcaaca acctggacag caaggtgggc ggcaactaca actacctgta ccgcctgttc 2400

cgcaagagca acctgaagcc cttcgagcgc gacatcagca ccgagatcta ccaggccggc 2460

agcaccccct gcaacggcgt ggagggcttc aactgctact tccccctgca gagctacggc 2520

ttccagccca ccaacggcgt gggctaccag ccctaccgcg tggtggtgct gagcttcgag 2580

ctgctgcacg cccccgccac cgtgtgcggc cccaagaaga gcaccaacct ggtgaagaac 2640

aagtgcgtga acttcaactt caacggcctg accggcaccg gcgtgctgac cgagagcaac 2700

aagaagttcc tgcccttcca gcagttcggc cgcgacatcg ccgacaccac cgacgccgtg 2760

cgcgaccccc aga 2773
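
Each line of the listing ends with a running residue count, so a transcription of a <400> block can be sanity-checked against those counters. The following is a minimal Python sketch of such a check, not part of the filed listing; the function name `check_running_counts` and the variable `EXCERPT` are hypothetical, and the excerpt is the first three lines of SEQ ID NO: 1 as printed above.

```python
def check_running_counts(listing_lines):
    """Compare each line's trailing counter with the residues counted so far.

    Assumes the layout used in this listing: space-separated groups of
    residues followed by one cumulative position number per line.
    Returns (total_residues, mismatches), where mismatches is a list of
    (declared, counted) pairs for lines whose counter disagrees.
    """
    total = 0
    mismatches = []
    for line in listing_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines between listing rows
        *groups, declared = parts
        total += sum(len(group) for group in groups)
        if total != int(declared):
            mismatches.append((int(declared), total))
    return total, mismatches


# First three rows of SEQ ID NO: 1 (S-fragment 1), copied from the listing.
EXCERPT = """
gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc caggctttac 60
actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa tttcacacag 120
gaaacagcta tgaccatgat tacgccaagc ttggtaccga gctcggatcc actagtaacg 180
"""

total, mismatches = check_running_counts(EXCERPT.splitlines())
print(total, mismatches)
```

Applied to a complete <400> block, the returned total can also be compared with the <211> value declared for that sequence.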

<210> 2

<211> 2785

<212> DNA

<213> Artificial Sequence

<220>

<223> S-fragment 2

<400> 2

cttcgagctg ctgcacgccc ccgccaccgt gtgcggcccc aagaagagca ccaacctggt 60

gaagaacaag tgcgtgaact tcaacttcaa cggcctgacc ggcaccggcg tgctgaccga 120

gagcaacaag aagttcctgc ccttccagca gttcggccgc gacatcgccg acaccaccga 180

cgccgtgcgc gacccccaga ccctggagat cctggacatc accccctgca gcttcggcgg 240

cgtgagcgtg atcacccccg gcaccaacac cagcaaccag gtggccgtgc tgtaccagga 300

cgtgaactgc accgaggtgc ccgtggccat ccacgccgac cagctgaccc ccacctggcg 360

cgtgtacagc accggcagca acgtgttcca gacccgcgcc ggctgcctga tcggcgccga 420

gcacgtgaac aacagctacg agtgcgacat ccccatcggc gccggcatct gcgccagcta 480

ccagacccag accaacagcc cccgccgcgc ccgcagcgtg gccagccaga gcatcatcgc 540

ctacaccatg agcctgggcg ccgagaacag cgtggcctac agcaacaaca gcatcgccat 600

ccccaccaac ttcaccatca gcgtgaccac cgagatcctg cccgtgagca tgaccaagac 660

cagcgtggac tgcaccatgt acatctgcgg cgacagcacc gagtgcagca acctgctgct 720

gcagtacggc agcttctgca cccagctgaa ccgcgccctg accggcatcg ccgtggagca 780

ggacaagaac acccaggagg tgttcgccca ggtgaagcag atctacaaga ccccccccat 840

caaggacttc ggcggcttca acttcagcca gatcctgccc gaccccagca agcccagcaa 900

gcgcagcttc atcgaggacc tgctgttcaa caaggtgacc ctggccgacg ccggcttcat 960

caagcagtac ggcgactgcc tgggcgacat cgccgcccgc gacctgatct gcgcccagaa 1020

gttcaacggc ctgaccgtgc tgccccccct gctgaccgac gagatgatcg cccagtacac 1080

cagcgccctg ctggccggca ccatcaccag cggctggacc ttcggcgccg gcgccgccct 1140

gcagatcccc ttcgccatgc agatggccta ccgcttcaac ggcatcggcg tgacccagaa 1200

cgtgctgtac gagaaccaga agctgatcgc caaccagttc aacagcgcca tcggcaagat 1260

ccaggacagc ctgagcagca ccgccagcgc cctgggcaag ctgcaggacg tggtgaacca 1320

gaacgcccag gccctgaaca ccctggtgaa gcagctgagc agcaacttcg gcgccatcag 1380

cagcgtgctg aacgacatcc tgagccgcct ggacaaggtg gaggccgagg tgcagatcga 1440

ccgcctgatc accggccgcc tgcagagcct gcagacctac gtgacccagc agctgatccg 1500

cgccgccgag atccgcgcca gcgccaacct ggccgccacc aagatgagcg agtgcgtgct 1560

gggccagagc aagcgcgtgg acttctgcgg caagggctac cacctgatga gcttccccca 1620

gagcgccccc cacggcgtgg tgttcctgca cgtgacctac gtgcccgccc aggagaagaa 1680

cttcaccacc gcccccgcca tctgccacga cggcaaggcc cacttccccc gcgagggcgt 1740

gttcgtgagc aacggcaccc actggttcgt gacccagcgc aacttctacg agccccagat 1800

catcaccacc gacaacacct tcgtgagcgg caactgcgac gtggtgatcg gcatcgtgaa 1860

caacaccgtg tacgaccccc tgcagcccga gctggacagc ttcaaggagg agctggacaa 1920

gtacttcaag aaccacacca gccccgacgt ggacctgggc gacatcagcg gcatcaacgc 1980

cagcgtggtg aacatccaga aggagatcga ccgcctgaac gaggtggcca agaacctgaa 2040

cgagagcctg atcgacctgc aggagctggg caagtacgag cagtacatca agtggccctg 2100

gtacatctgg ctgggcttca tcgccggcct gatcgccatc gtgatggtga ccatcatgct 2160

gtgctgcatg accagctgct gcagctgcct gaagggctgc tgcagctgcg gcagctgctg 2220

caagttcgac gaggacgaca gcgagcccgt gctgaagggc gtgaagctgc actacaccta 2280

atgtacaagg aattctaact agagctcgct gatcagcctc gactgtgcct tctagttgcc 2340

agccatctgt tgtttgcccc tcccccgtgc cttccttgac cctggaaggt gccactccca 2400

ctgtcctttc ctaataaaat gaggaaattg catcgcattg tctgagtagg tgtcattcta 2460

ttctgggggg tggggtgggg caggacagca agggggagga ttgggaagag aatagcaggc 2520

atgctgggga ataacttcgt ataatgtatg ctatacgaag ttatatccat cacactggcg 2580

gccgcctcga gcatgcatct agagggccca attcgcccta tagtgagtcg tattacaatt 2640

cactggccgt cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc caacttaatc 2700

gccttgcagc acatccccct ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc 2760

gcccttccca acagttgcgc agcct 2785
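
The assembly described in this application relies on adjacent fragments sharing a homologous sequence, and the end of SEQ ID NO: 1 can be compared with the start of SEQ ID NO: 2 to see that shared region. The following Python sketch is an illustration only, not the applicant's procedure; the helper names `clean` and `overlap_length`, the `min_len` threshold, and the choice of excerpts (the last 253 nt of S-fragment 1 and the first 240 nt of S-fragment 2, copied from the listing above) are assumptions made for the example.

```python
import re


def clean(listing_text: str) -> str:
    """Strip spaces, line breaks and position counters from listing text."""
    return re.sub(r"[^acgtACGT]", "", listing_text).lower()


def overlap_length(upstream: str, downstream: str, min_len: int = 20) -> int:
    """Longest suffix of `upstream` that is also a prefix of `downstream`.

    Returns 0 when no shared region of at least `min_len` residues exists.
    """
    for k in range(min(len(upstream), len(downstream)), min_len - 1, -1):
        if upstream.endswith(downstream[:k]):
            return k
    return 0


# Tail of SEQ ID NO: 1 (S-fragment 1), positions 2521-2773.
S1_TAIL = clean("""
ttccagccca ccaacggcgt gggctaccag ccctaccgcg tggtggtgct gagcttcgag 2580
ctgctgcacg cccccgccac cgtgtgcggc cccaagaaga gcaccaacct ggtgaagaac 2640
aagtgcgtga acttcaactt caacggcctg accggcaccg gcgtgctgac cgagagcaac 2700
aagaagttcc tgcccttcca gcagttcggc cgcgacatcg ccgacaccac cgacgccgtg 2760
cgcgaccccc aga 2773
""")

# Head of SEQ ID NO: 2 (S-fragment 2), positions 1-240.
S2_HEAD = clean("""
cttcgagctg ctgcacgccc ccgccaccgt gtgcggcccc aagaagagca ccaacctggt 60
gaagaacaag tgcgtgaact tcaacttcaa cggcctgacc ggcaccggcg tgctgaccga 120
gagcaacaag aagttcctgc ccttccagca gttcggccgc gacatcgccg acaccaccga 180
cgccgtgcgc gacccccaga ccctggagat cctggacatc accccctgca gcttcggcgg 240
""")

shared = overlap_length(S1_TAIL, S2_HEAD)
print(f"shared region: {shared} nt")
```

For these two excerpts the function reports a shared region of 200 nt, that is, the last 200 residues of S-fragment 1 are identical to the first 200 residues of S-fragment 2. The same comparison can be run on any other pair of adjacent fragments in the listing by substituting their end and start rows.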

<210> 3

<211> 4547

<212> DNA

<213> Artificial Sequence

<220>

<223> ORF1ab-fragment 1

<400> 3

ctgcctggcg gcagtagcgc ggtggtccca cctgacccca tgccgaactc agaagtgaaa 60

cgccgtagcg ccgatggtag tgtggggtct ccccatgcga gagtagggaa ctgccaggca 120

tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttc ctgccactca tcgcagtact 180

gttgtaattc attaagcatg tagttattaa tagtaatcaa ttacggggtc attagttcat 240

agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 300

cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 360

gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 420

catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc 480

gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac 540

gtattagtca tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga 600

tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 660

ttttggcacc aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg 720

caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctgg tttagtgaac 780

cgtcagatcg attaaaggtt tataccttcc caggtaacaa accaaccaac tttcgatctc 840

ttgtagatct gttctctaaa cgaactttaa aatctgtgtg gctgtcactc ggctgcatgc 900

ttagtgcact cacgcagtat aattaataac taattactgt cgttgacagg acacgagtaa 960

ctcgtctatc ttctgcaggc tgcttacggt ttcgtccgtg ttgcagccga tcatcagcac 1020

atctaggttt cgtccgggtg tgaccgaaag gtaagatgga gagccttgtc cctggtttca 1080

acgagaaaac acacgtccaa ctcagtttgc ctgttttaca ggttcgcgac gtgctcgtac 1140

gtggctttgg agactccgtg gaggaggtct tatcagaggc acgtcaacat cttaaagatg 1200

gcacttgtgg cttagtagaa gttgaaaaag gcgttttgcc tcaacttgaa cagccctatg 1260

tgttcatcaa acgttcggat gctcgaactg cacctcatgg tcatgttatg gttgagctgg 1320

tagcagaact cgaaggcatt cagtacggtc gtagtggtga gacacttggt gtccttgtcc 1380

ctcatgtggg cgaaatacca gtggcttacc gcaaggttct tcttcgtaag aacggtaata 1440

aaggagctgg tggccatagt tacggcgccg atctaaagtc atttgactta ggcgacgagc 1500

ttggcactga tccttatgaa gattttcaag aaaactggaa cactaaacat agcagtggtg 1560

ttacccgtga actcatgcgt gagcttaacg gaggggcata cactcgctat gtcgataaca 1620

acttctgtgg ccctgatggc taccctcttg agtgcattaa agaccttcta gcacgtgctg 1680

gtaaagcttc atgcactttg tccgaacaac tggactttat tgacactaag aggggtgtat 1740

actgctgccg tgaacatgag catgaaattg cttggtacac ggaacgttct gaaaagagct 1800

atgaattgca gacacctttt gaaattaaat tggcaaagaa atttgacacc ttcaatgggg 1860

aatgtccaaa ttttgtattt cccttaaatt ccataatcaa gactattcaa ccaagggttg 1920

aaaagaaaaa gcttgatggc tttatgggta gaattcgatc tgtctatcca gttgcgtcac 1980

caaatgaatg caaccaaatg tgcctttcaa ctctcatgaa gtgtgatcat tgtggtgaaa 2040

cttcatggca gacgggcgat tttgttaaag ccacttgcga attttgtggc actgagaatt 2100

tgactaaaga aggtgccact acttgtggtt acttacccca aaatgctgtt gttaaaattt 2160

attgtccagc atgtcacaat tcagaagtag gacctgagca tagtcttgcc gaataccata 2220

atgaatctgg cttgaaaacc attcttcgta agggtggtcg cactattgcc tttggaggct 2280

gtgtgttctc ttatgttggt tgccataaca agtgtgccta ttgggttcca cgtgctagcg 2340

ctaacatagg ttgtaaccat acaggtgttg ttggagaagg ttccgaaggt cttaatgaca 2400

accttcttga aatactccaa aaagagaaag tcaacatcaa tattgttggt gactttaaac 2460

ttaatgaaga gatcgccatt attttggcat ctttttctgc ttccacaagt gcttttgtgg 2520

aaactgtgaa aggtttggat tataaagcat tcaaacaaat tgttgaatcc tgtggtaatt 2580

ttaaagttac aaaaggaaaa gctaaaaaag gtgcctggaa tattggtgaa cagaaatcaa 2640

tactgagtcc tctttatgca tttgcatcag aggctgctcg tgttgtacga tcaattttct 2700

cccgcactct tgaaactgct caaaattctg tgcgtgtttt acagaaggcc gctataacaa 2760

tactagatgg aatttcacag tattcactga gactcattga tgctatgatg ttcacatctg 2820

atttggctac taacaatcta gttgtaatgg cctacattac aggtggtgtt gttcagttga 2880

cttcgcagtg gctaactaac atctttggca ctgtttatga aaaactcaaa cccgtccttg 2940

attggcttga agagaagttt aaggaaggtg tagagtttct tagagacggt tgggaaattg 3000

ttaaatttat ctcaacctgt gcttgtgaaa ttgtcggtgg acaaattgtc acctgtgcaa 3060

aggaaattaa ggagagtgtt cagacattct ttaagcttgt aaataaattt ttggctttgt 3120

gtgctgactc tatcattatt ggtggagcta aacttaaagc cttgaattta ggtgaaacat 3180

ttgtcacgca ctcaaaggga ttgtacagaa agtgtgttaa atccagagaa gaaactggcc 3240

tactcatgcc tctaaaagcc ccaaaagaaa ttatcttctt agagggagaa acacttccca 3300

cagaagtgtt aacagaggaa gttgtcttga aaactggtga tttacaacca ttagaacaac 3360

ctactagtga agctgttgaa gctccattgg ttggtacacc agtttgtatt aacgggctta 3420

tgttgctcga aatcaaagac acagaaaagt actgtgccct tgcacctaat atgatggtaa 3480

caaacaatac cttcacactc aaaggcggtg caccaacaaa ggttactttt ggtgatgaca 3540

ctgtgataga agtgcaaggt tacaagagtg tgaatatcac ttttgaactt gatgaaagga 3600

ttgataaagt acttaatgag aagtgctctg cctatacagt tgaactcggt acagaagtaa 3660

atgagttcgc ctgtgttgtg gcagatgctg tcataaaaac tttgcaacca gtatctgaat 3720

tacttacacc actgggcatt gatttagatg agtggagtat ggctacatac tacttatttg 3780

atgagtctgg tgagtttaaa ttggcttcac atatgtattg ttctttctac cctccagatg 3840

aggatgaaga agaaggtgat tgtgaagaag aagagtttga gccatcaact caatatgagt 3900

atggtactga agatgattac caaggtaaac ctttggaatt tggtgccact tctgctgctc 3960

ttcaacctga agaagagcaa gaagaagatt ggttagatga tgatagtcaa caaactgttg 4020

gtcaacaaga cggcagtgag gacaatcaga caactactat tcaaacaatt gttgaggttc 4080

aacctcaatt agagatggaa cttacaccag ttgttcagac tattgaagtg aatagtttta 4140

gtggttattt aaaacttact gacaatgtat acattaaaaa tgcagacatt gtggaagaag 4200

ctaaaaaggt aaaaccaaca gtggttgtta atgcagccaa tgtttacctt aaacatggag 4260

gaggtgttgc aggagcctta aataaggcta ctaacaatgc catgcaagtt gaatctgatg 4320

attacatagc tactaatgga ccacttaaag tgggtggtag ttgtgtttta agcggacaca 4380

atcttgctaa acactgtctt catgttgtcg gcccaaatgt taacaaaggt gaagacattc 4440

aacttcttaa gagtgcttat gaaaatttta atcagcacga agttctactt gcaccattat 4500

tatcagctgg tatttttggt gctgacccta tacattcttt aagagtt 4547

<210> 4

<211> 4500

<212> DNA

<213> Artificial Sequence

<220>

<223> ORF1ab-fragment 2

<400> 4

aagtgggtgg tagttgtgtt ttaagcggac acaatcttgc taaacactgt cttcatgttg 60

tcggcccaaa tgttaacaaa ggtgaagaca ttcaacttct taagagtgct tatgaaaatt 120

ttaatcagca cgaagttcta cttgcaccat tattatcagc tggtattttt ggtgctgacc 180

ctatacattc tttaagagtt tgtgtagata ctgttcgcac aaatgtctac ttagctgtct 240

ttgataaaaa tctctatgac aaacttgttt caagcttttt ggaaatgaag agtgaaaagc 300

aagttgaaca aaagatcgct gagattccta aagaggaagt taagccattt ataactgaaa 360

gtaaaccttc agttgaacag agaaaacaag atgataagaa aatcaaagct tgtgttgaag 420

aagttacaac aactctggaa gaaactaagt tcctcacaga aaacttgtta ctttatattg 480

acattaatgg caatcttcat ccagattctg ccactcttgt tagtgacatt gacatcactt 540

tcttaaagaa agatgctcca tatatagtgg gtgatgttgt tcaagagggt gttttaactg 600

ctgtggttat acctactaaa aaggctggtg gcactactga aatgctagcg aaagctttga 660

gaaaagtgcc aacagacaat tatataacca cttacccggg tcagggttta aatggttaca 720

ctgtagagga ggcaaagaca gtgcttaaaa agtgtaaaag tgccttttac attctaccat 780

ctattatctc taatgagaag caagaaattc ttggaactgt ttcttggaat ttgcgagaaa 840

tgcttgcaca tgcagaagaa acacgcaaat taatgcctgt ctgtgtggaa actaaagcca 900

tagtttcaac tatacagcgt aaatataagg gtattaaaat acaagagggt gtggttgatt 960

atggtgctag attttacttt tacaccagta aaacaactgt agcgtcactt atcaacacac 1020

ttaacgatct aaatgaaact cttgttacaa tgccacttgg ctatgtaaca catggcttaa 1080

atttggaaga agctgctcgg tatatgagat ctctcaaagt gccagctaca gtttctgttt 1140

cttcacctga tgctgttaca gcgtataatg gttatcttac ttcttcttct aaaacacctg 1200

aagaacattt tattgaaacc atctcacttg ctggttccta taaagattgg tcctattctg 1260

gacaatctac acaactaggt atagaatttc ttaagagagg tgataaaagt gtatattaca 1320

ctagtaatcc taccacattc cacctagatg gtgaagttat cacctttgac aatcttaaga 1380

cacttctttc tttgagagaa gtgaggacta ttaaggtgtt tacaacagta gacaacatta 1440

acctccacac gcaagttgtg gacatgtcaa tgacatatgg acaacagttt ggtccaactt 1500

atttggatgg agctgatgtt actaaaataa aacctcataa ttcacatgaa ggtaaaacat 1560

tttatgtttt acctaatgat gacactctac gtgttgaggc ttttgagtac taccacacaa 1620

ctgatcctag ttttctgggt aggtacatgt cagcattaaa tcacactaaa aagtggaaat 1680

acccacaagt taatggttta acttctatta aatgggcaga taacaactgt tatcttgcca 1740

ctgcattgtt aacactccaa caaatagagt tgaagtttaa tccacctgct ctacaagatg 1800

cttattacag agcaagggct ggtgaagctg ctaacttttg tgcacttatc ttagcctact 1860

gtaataagac agtaggtgag ttaggtgatg ttagagaaac aatgagttac ttgtttcaac 1920

atgccaattt agattcttgc aaaagagtct tgaacgtggt gtgtaaaact tgtggacaac 1980

agcagacaac ccttaagggt gtagaagctg ttatgtacat gggcacactt tcttatgaac 2040

aatttaagaa aggtgttcag ataccttgta cgtgtggtaa acaagctaca aaatatctag 2100

tacaacagga gtcacctttt gttatgatgt cagcaccacc tgctcagtat gaacttaagc 2160

atggtacatt tacttgtgct agtgagtaca ctggtaatta ccagtgtggt cactataaac 2220

atataacttc taaagaaact ttgtattgca tagacggtgc tttacttaca aagtcctcag 2280

aatacaaagg tcctattacg gatgttttct acaaagaaaa cagttacaca acaaccataa 2340

aaccagttac ttataaattg gatggtgttg tttgtacaga aattgaccct aagttggaca 2400

attattataa gaaagacaat tcttatttca cagagcaacc aattgatctt gtaccaaacc 2460

aaccatatcc aaacgcaagc ttcgataatt ttaagtttgt atgtgataat atcaaatttg 2520

ctgatgattt aaaccagtta actggttata agaaacctgc ttcaagagag cttaaagtta 2580

catttttccc tgacttaaat ggtgatgtgg tggctattga ttataaacac tacacaccct 2640

cttttaagaa aggagctaaa ttgttacata aacctattgt ttggcatgtt aacaatgcaa 2700

ctaataaagc cacgtataaa ccaaatacct ggtgtatacg ttgtctttgg agcacaaaac 2760

cagttgaaac atcaaattcg tttgatgtac tgaagtcaga ggacgcgcag ggaatggata 2820

atcttgcctg cgaagatcta aaaccagtct ctgaagaagt agtggaaaat cctaccatac 2880

agaaagacgt tcttgagtgt aatgtgaaaa ctaccgaagt tgtaggagac attatactta 2940

aaccagcaaa taatagttta aaaattacag aagaggttgg ccacacagat ctaatggctg 3000

cttatgtaga caattctagt cttactatta agaaacctaa tgaattatct agagtattag 3060

gtttgaaaac ccttgctact catggtttag ctgctgttaa tagtgtccct tgggatacta 3120

tagctaatta tgctaagcct tttcttaaca aagttgttag tacaactact aacatagtta 3180

cacggtgttt aaaccgtgtt tgtactaatt atatgcctta tttctttact ttattgctac 3240

aattgtgtac ttttactaga agtacaaatt ctagaattaa agcatctatg ccgactacta 3300

tagcaaagaa tactgttaag agtgtcggta aattttgtct agaggcttca tttaattatt 3360

tgaagtcacc taatttttct aaactgataa atattataat ttggttttta ctattaagtg 3420

tttgcctagg ttctttaatc tactcaaccg ctgctttagg tgttttaatg tctaatttag 3480

gcatgccttc ttactgtact ggttacagag aaggctattt gaactctact aatgtcacta 3540

ttgcaaccta ctgtactggt tctatacctt gtagtgtttg tcttagtggt ttagattctt 3600

tagacaccta tccttcttta gaaactatac aaattaccat ttcatctttt aaatgggatt 3660

taactgcttt tggcttagtt gcagagtggt ttttggcata tattcttttc actaggtttt 3720

tctatgtact tggattggct gcaatcatgc aattgttttt cagctatttt gcagtacatt 3780

ttattagtaa ttcttggctt atgtggttaa taattaatct tgtacaaatg gccccgattt 3840

cagctatggt tagaatgtac atcttctttg catcatttta ttatgtatgg aaaagttatg 3900

tgcatgttgt agacggttgt aattcatcaa cttgtatgat gtgttacaaa cgtaatagag 3960

caacaagagt cgaatgtaca actattgtta atggtgttag aaggtccttt tatgtctatg 4020

ctaatggagg taaaggcttt tgcaaactac acaattggaa ttgtgttaat tgtgatacat 4080

tctgtgctgg tagtacattt attagtgatg aagttgcgag agacttgtca ctacagttta 4140

aaagaccaat aaatcctact gaccagtctt cttacatcgt tgatagtgtt acagtgaaga 4200

atggttccat ccatctttac tttgataaag ctggtcaaaa gacttatgaa agacattctc 4260

tctctcattt tgttaactta gacaacctga gagctaataa cactaaaggt tcattgccta 4320

ttaatgttat agtttttgat ggtaaatcaa aatgtgaaga atcatctgca aaatcagcgt 4380

ctgtttacta cagtcagctt atgtgtcaac ctatactgtt actagatcag gcattagtgt 4440

ctgatgttgg tgatagtgcg gaagttgcag ttaaaatgtt tgatgcttac gttaatacgt 4500

<210> 5

<211> 4500

<212> DNA

<213> Artificial Sequence

<220>

<223> ORF1ab-fragment 3

<400> 5

cactaaaggt tcattgccta ttaatgttat agtttttgat ggtaaatcaa aatgtgaaga 60

atcatctgca aaatcagcgt ctgtttacta cagtcagctt atgtgtcaac ctatactgtt 120

actagatcag gcattagtgt ctgatgttgg tgatagtgcg gaagttgcag ttaaaatgtt 180

tgatgcttac gttaatacgt tttcatcaac ttttaacgta ccaatggaaa aactcaaaac 240

actagttgca actgcagaag ctgaacttgc aaagaatgtg tccttagaca atgtcttatc 300

tacttttatt tcagcagctc ggcaagggtt tgttgattca gatgtagaaa ctaaagatgt 360

tgttgaatgt cttaaattgt cacatcaatc tgacatagaa gttactggcg atagttgtaa 420

taactatatg ctcacctata acaaagttga aaacatgaca ccccgtgacc ttggtgcttg 480

tattgactgt agtgcgcgtc atattaatgc gcaggtagca aaaagtcaca acattgcttt 540

gatatggaac gttaaagatt tcatgtcatt gtctgaacaa ctacgaaaac aaatacgtag 600

tgctgctaaa aagaataact taccttttaa gttgacatgt gcaactacta gacaagttgt 660

taatgttgta acaacaaaga tagcacttaa gggtggtaaa attgttaata attggttgaa 720

gcagttaatt aaagttacac ttgtgttcct ttttgttgct gctattttct atttaataac 780

acctgttcat gtcatgtcta aacatactga cttttcaagt gaaatcatag gatacaaggc 840

tattgatggt ggtgtcactc gtgacatagc atctacagat acttgttttg ctaacaaaca 900

tgctgatttt gacacatggt ttagccagcg tggtggtagt tatactaatg acaaagcttg 960

cccattgatt gctgcagtca taacaagaga agtgggtttt gtcgtgcctg gtttgcctgg 1020

cacgatatta cgcacaacta atggtgactt tttgcatttc ttacctagag tttttagtgc 1080

agttggtaac atctgttaca caccatcaaa acttatagag tacactgact ttgcaacatc 1140

agcttgtgtt ttggctgctg aatgtacaat ttttaaagat gcttctggta agccagtacc 1200

atattgttat gataccaatg tactagaagg ttctgttgct tatgaaagtt tacgccctga 1260

cacacgttat gtgctcatgg atggctctat tattcaattt cctaacacct accttgaagg 1320

ttctgttaga gtggtaacaa cttttgattc tgagtactgt aggcacggca cttgtgaaag 1380

atcagaagct ggtgtttgtg tatctactag tggtagatgg gtacttaaca atgattatta 1440

cagatcttta ccaggagttt tctgtggtgt agatgctgta aatttactta ctaatatgtt 1500

tacaccacta attcaaccta ttggtgcttt ggacatatca gcatctatag tagctggtgg 1560

tattgtagct atcgtagtaa catgccttgc ctactatttt atgaggttta gaagagcttt 1620

tggtgaatac agtcatgtag ttgcctttaa tactttacta ttccttatgt cattcactgt 1680

actctgttta acaccagttt actcattctt acctggtgtt tattctgtta tttacttgta 1740

cttgacattt tatcttacta atgatgtttc ttttttagca catattcagt ggatggttat 1800

gttcacacct ttagtacctt tctggataac aattgcttat atcatttgta tttccacaaa 1860

gcatttctat tggttcttta gtaattacct aaagagacgt gtagtcttta atggtgtttc 1920

ctttagtact tttgaagaag ctgcgctgtg cacctttttg ttaaataaag aaatgtatct 1980

aaagttgcgt agtgatgtgc tattacctct tacgcaatat aatagatact tagctcttta 2040

taataagtac aagtatttta gtggagcaat ggatacaact agctacagag aagctgcttg 2100

ttgtcatctc gcaaaggctc tcaatgactt cagtaactca ggttctgatg ttctttacca 2160

accaccacaa acctctatca cctcagctgt tttgcagagt ggttttagaa aaatggcatt 2220

cccatctggt aaagttgagg gttgtatggt acaagtaact tgtggtacaa ctacacttaa 2280

cggtctttgg cttgatgacg tagtttactg tccaagacat gtgatctgca cctctgaaga 2340

catgcttaac cctaattatg aagatttact cattcgtaag tctaatcata atttcttggt 2400

acaggctggt aatgttcaac tcagggttat tggacattct atgcaaaatt gtgtacttaa 2460

gcttaaggtt gatacagcca atcctaagac acctaagtat aagtttgttc gcattcaacc 2520

aggacagact ttttcagtgt tagcttgtta caatggttca ccatctggtg tttaccaatg 2580

tgctatgagg cccaatttca ctattaaggg ttcattcctt aatggttcat gtggtagtgt 2640

tggttttaac atagattatg actgtgtctc tttttgttac atgcaccata tggaattacc 2700

aactggagtt catgctggca cagacttaga aggtaacttt tatggacctt ttgttgacag 2760

gcaaacagca caagcagctg gtacggacac aactattaca gttaatgttt tagcttggtt 2820

gtacgctgct gttataaatg gagacaggtg gtttctcaat cgatttacca caactcttaa 2880

tgactttaac cttgtggcta tgaagtacaa ttatgaacct ctaacacaag accatgttga 2940

catactagga cctctttctg ctcaaactgg aattgccgtt ttagatatgt gtgcttcatt 3000

aaaagaatta ctgcaaaatg gtatgaatgg acgtaccata ttgggtagtg ctttattaga 3060

agatgaattt acaccttttg atgttgttag acaatgctca ggtgttactt tccaaagtgc 3120

agtgaaaaga acaatcaagg gtacacacca ctggttgtta ctcacaattt tgacttcact 3180

tttagtttta gtccagagta ctcaatggtc tttgttcttt tttttgtatg aaaatgcctt 3240

tttacctttt gctatgggta ttattgctat gtctgctttt gcaatgatgt ttgtcaaaca 3300

taagcatgca tttctctgtt tgtttttgtt accttctctt gccactgtag cttattttaa 3360

tatggtctat atgcctgcta gttgggtgat gcgtattatg acatggttgg atatggttga 3420

tactagtttg tctggtttta agctaaaaga ctgtgttatg tatgcatcag ctgtagtgtt 3480

actaatcctt atgacagcaa gaactgtgta tgatgatggt gctaggagag tgtggacact 3540

tatgaatgtc ttgacactcg tttataaagt ttattatggt aatgctttag atcaagccat 3600

ttccatgtgg gctcttataa tctctgttac ttctaactac tcaggtgtag ttacaactgt 3660

catgtttttg gccagaggta ttgtttttat gtgtgttgag tattgcccta ttttcttcat 3720

aactggtaat acacttcagt gtataatgct agtttattgt ttcttaggct atttttgtac 3780

ttgttacttt ggcctctttt gtttactcaa ccgctacttt agactgactc ttggtgttta 3840

tgattactta gtttctacac aggagtttag atatatgaat tcacagggac tactcccacc 3900

caagaatagc atagatgcct tcaaactcaa cattaaattg ttgggtgttg gtggcaaacc 3960

ttgtatcaaa gtagccactg tacagtctaa aatgtcagat gtaaagtgca catcagtagt 4020

cttactctca gttttgcaac aactcagagt agaatcatca tctaaattgt gggctcaatg 4080

tgtccagtta cacaatgaca ttctcttagc taaagatact actgaagcct ttgaaaaaat 4140

ggtttcacta ctttctgttt tgctttccat gcagggtgct gtagacataa acaagctttg 4200

tgaagaaatg ctggacaaca gggcaacctt acaagctata gcctcagagt ttagttccct 4260

tccatcatat gcagcttttg ctactgctca agaagcttat gagcaggctg ttgctaatgg 4320

tgattctgaa gttgttctta aaaagttgaa gaagtctttg aatgtggcta aatctgaatt 4380

tgaccgtgat gcagccatgc aacgtaagtt ggaaaagatg gctgatcaag ctatgaccca 4440

aatgtataaa caggctagat ctgaggacaa gagggcaaaa gttactagtg ctatgcagac 4500

<210> 6

<211> 4919

<212> DNA

<213> Artificial Sequence

<220>

<223> ORF1ab-fragment 4

<400> 6

gagcaggctg ttgctaatgg tgattctgaa gttgttctta aaaagttgaa gaagtctttg 60

aatgtggcta aatctgaatt tgaccgtgat gcagccatgc aacgtaagtt ggaaaagatg 120

gctgatcaag ctatgaccca aatgtataaa caggctagat ctgaggacaa gagggcaaaa 180

gttactagtg ctatgcagac aatgcttttc actatgctta gaaagttgga taatgatgca 240

ctcaacaaca ttatcaacaa tgcaagagat ggttgtgttc ccttgaacat aatacctctt 300

acaacagcag ccaaactaat ggttgtcata ccagactata acacatataa aaatacgtgt 360

gatggtacaa catttactta tgcatcagca ttgtgggaaa tccaacaggt tgtagatgca 420

gatagtaaaa ttgttcaact tagtgaaatt agtatggaca attcacctaa tttagcatgg 480

cctcttattg taacagcttt aagggccaat tctgctgtca aattacagaa taatgagctt 540

agtcctgttg cactacgaca gatgtcttgt gctgccggta ctacacaaac tgcttgcact 600

gatgacaatg cgttagctta ctacaacaca acaaagggag gtaggtttgt acttgcactg 660

ttatccgatt tacaggattt gaaatgggct agattcccta agagtgatgg aactggtact 720

atctatacag aactggaacc accttgtagg tttgttacag acacacctaa aggtcctaaa 780

gtgaagtatt tatactttat taaaggatta aacaacctaa atagaggtat ggtacttggt 840

agtttagctg ccacagtacg tctacaagct ggtaatgcaa cagaagtgcc tgccaattca 900

actgtattat ctttctgtgc ttttgctgta gatgctgcta aagcttacaa agattatcta 960

gctagtgggg gacaaccaat cactaattgt gttaagatgt tgtgtacaca cactggtact 1020

ggtcaggcaa taacagttac accggaagcc aatatggatc aagaatcctt tggtggtgca 1080

tcgtgttgtc tgtactgccg ttgccacata gatcatccaa atcctaaagg attttgtgac 1140

ttaaaaggta agtatgtaca aatacctaca acttgtgcta atgaccctgt gggttttaca 1200

cttaaaaaca cagtctgtac cgtctgcggt atgtggaaag gttatggctg tagttgtgat 1260

caactccgcg aacccatgct tcagtcagct gatgcacaat cgtttttaaa cgggtttgcg 1320

gtgtaagtgc agcccgtctt acaccgtgcg gcacaggcac tagtactgat gtcgtataca 1380

gggcttttga catctacaat gataaagtag ctggttttgc taaattccta aaaactaatt 1440

gttgtcgctt ccaagaaaag gacgaagatg acaatttaat tgattcttac tttgtagtta 1500

agagacacac tttctctaac taccaacatg aagaaacaat ttataattta cttaaggatt 1560

gtccagctgt tgctaaacat gacttcttta agtttagaat agacggtgac atggtaccac 1620

atatatcacg tcaacgtctt actaaataca caatggcaga cctcgtctat gctttaaggc 1680

attttgatga aggtaattgt gacacattaa aagaaatact tgtcacatac aattgttgtg 1740

atgatgatta tttcaataaa aaggactggt atgattttgt agaaaaccca gatatattac 1800

gcgtatacgc caacttaggt gaacgtgtac gccaagcttt gttaaaaaca gtacaattct 1860

gtgatgccat gcgaaatgct ggtattgttg gtgtactgac attagataat caagatctca 1920

atggtaactg gtatgatttc ggtgatttca tacaaaccac gccaggtagt ggagttcctg 1980

ttgtagattc ttattattca ttgttaatgc ctatattaac cttgaccagg gctttaactg 2040

cagagtcaca tgttgacact gacttaacaa agccttacat taagtgggat ttgttaaaat 2100

atgacttcac ggaagagagg ttaaaactct ttgaccgtta ttttaaatat tgggatcaga 2160

cataccaccc aaattgtgtt aactgtttgg atgacagatg cattctgcat tgtgcaaact 2220

ttaatgtttt attctctaca gtgttcccac ctacaagttt tggaccacta gtgagaaaaa 2280

tatttgttga tggtgttcca tttgtagttt caactggata ccacttcaga gagctaggtg 2340

ttgtacataa tcaggatgta aacttacata gctctagact tagttttaag gaattacttg 2400

tgtatgctgc tgaccctgct atgcacgctg cttctggtaa tctattacta gataaacgca 2460

ctacgtgctt ttcagtagct gcacttacta acaatgttgc ttttcaaact gtcaaacccg 2520

gtaattttaa caaagacttc tatgactttg ctgtgtctaa gggtttcttt aaggaaggaa 2580

gttctgttga attaaaacac ttcttctttg ctcaggatgg taatgctgct atcagcgatt 2640

atgactacta tcgttataat ctaccaacaa tgtgtgatat cagacaacta ctatttgtag 2700

ttgaagttgt tgataagtac tttgattgtt acgatggtgg ctgtattaat gctaaccaag 2760

tcatcgtcaa caacctagac aaatcagctg gttttccatt taataaatgg ggtaaggcta 2820

gactttatta tgattcaatg agttatgagg atcaagatgc acttttcgca tatacaaaac 2880

gtaatgtcat ccctactata actcaaatga atcttaagta tgccattagt gcaaagaata 2940

gagctcgcac cgtagctggt gtctctatct gtagtactat gaccaataga cagtttcatc 3000

aaaaattatt gaaatcaata gccgccacta gaggagctac tgtagtaatt ggaacaagca 3060

aattctatgg tggttggcac aacatgttaa aaactgttta tagtgatgta gaaaaccctc 3120

accttatggg ttgggattat cctaaatgtg atagagccat gcctaacatg cttagaatta 3180

tggcctcact tgttcttgct cgcaaacata caacgtgttg tagcttgtca caccgtttct 3240

atagattagc taatgagtgt gctcaagtat tgagtgaaat ggtcatgtgt ggcggttcac 3300

tatatgttaa accaggtgga acctcatcag gagatgccac aactgcttat gctaatagtg 3360

tttttaacat ttgtcaagct gtcacggcca atgttaatgc acttttatct actgatggta 3420

acaaaattgc cgataagtat gtccgcaatt tacaacacag actttatgag tgtctctata 3480

gaaatagaga tgttgacaca gactttgtga atgagtttta cgcatatttg cgtaaacatt 3540

tctcaatgat gatactctct gacgatgctg ttgtgtgttt caatagcact tatgcatctc 3600

aaggtctagt ggctagcata aagaacttta agtcagttct ttattatcaa aacaatgttt 3660

ttatgtctga agcaaaatgt tggactgaga ctgaccttac taaaggacct catgaatttt 3720

gctctcaaca tacaatgcta gttaaacagg gtgatgatta tgtgtacctt ccttacccag 3780

atccatcaag aatcctaggg gccggctgtt ttgtagatga tatcgtaaaa acagatggta 3840

cacttatgat tgaacggttc gtgtctttag ctatagatgc ttacccactt actaaacatc 3900

ctaatcagga gtatgctgat gtctttcatt tgtacttaca atacataaga aagctacatg 3960

atgagttaac aggacacatg ttagacatgt attctgttat gcttactaat gataacactt 4020

caaggtattg ggaacctgag ttttatgagg ctatgtacac accgcataca gtcttacagg 4080

ctgttggggc ttgtgttctt tgcaattcac agacttcatt aagatgtggt gcttgcatac 4140

gtagaccatt cttatgttgt aaatgctgtt acgaccatgt catatcaaca tcacataaat 4200

tagtcttgtc tgttaatccg tatgtttgca atgctccagg ttgtgatgtc acagatgtga 4260

ctcaacttta cttaggaggt atgagctatt attgtaaatc acataaacca cccattagtt 4320

ttccattgtg tgctaatgga caagtttttg gtttatataa aaatacatgt gttggtagcg 4380

ataatgttac tgactttaat gcaattgcaa catgtgactg gacaaatgct ggtgattaca 4440

ttttagctaa cacctgtact gaaagactca agctttttgc agcagaaacg ctcaaagcta 4500

ctgaggagac atttaaactg tcttatggta ttgctactgt acgtgaagtg ctgtctgaca 4560

gagaattaca tctttcatgg gaagttggta aacctagacc accacttaac cgaaattatg 4620

tctttactgg ttatcgtgta actaaaaaca gtaaagtaca aataggagag tacacctttg 4680

aaaaaggtga ctatggtgat gctgttgttt accgaggtac aacaacttac aaattaaatg 4740

ttggtgatta ttttgtgctg acatcacata cagtaatgcc attaagtgca cctacactag 4800

tgccacaaga gcactatgtt agaattactg gcttataccc aacactcaat atctcagatg 4860

agttttctag caatgttgca aattatcaaa aggttggtat gcaaaagtat tctacactc 4919

<210> 7

<211> 3000

<212> DNA

<213> Artificial Sequence

<220>

<223> ORF1ab-fragment 5

<400> 7

caacaactta caaattaaat gttggtgatt attttgtgct gacatcacat acagtaatgc 60

cattaagtgc acctacacta gtgccacaag agcactatgt tagaattact ggcttatacc 120

caacactcaa tatctcagat gagttttcta gcaatgttgc aaattatcaa aaggttggta 180

tgcaaaagta ttctacactc cagggaccac ctggtactgg taagagtcat tttgctattg 240

gcctagctct ctactaccct tctgctcgca tagtgtatac agcttgctct catgccgctg 300

ttgatgcact atgtgagaag gcattaaaat atttgcctat agataaatgt agtagaatta 360

tacctgcacg tgctcgtgta gagtgttttg ataaattcaa agtgaattca acattagaac 420

agtatgtctt ttgtactgta aatgcattgc ctgagacgac agcagatata gttgtctttg 480

atgaaatttc aatggccaca aattatgatt tgagtgttgt caatgccaga ttacgtgcta 540

agcactatgt gtacattggc gaccctgctc aattacctgc accacgcaca ttgctaacta 600

agggcacact agaaccagaa tatttcaatt cagtgtgtag acttatgaaa actataggtc 660

cagacatgtt cctcggaact tgtcggcgtt gtcctgctga aattgttgac actgtgagtg 720

ctttggttta tgataataag cttaaagcac ataaagacaa atcagctcaa tgctttaaaa 780

tgttttataa gggtgttatc acgcatgatg tttcatctgc aattaacagg ccacaaatag 840

gcgtggtaag agaattcctt acacgtaacc ctgcttggag aaaagctgtc tttatttcac 900

cttataattc acagaatgct gtagcctcaa agattttggg actaccaact caaactgttg 960

attcatcaca gggctcagaa tatgactatg tcatattcac tcaaaccact gaaacagctc 1020

actcttgtaa tgtaaacaga tttaatgttg ctattaccag agcaaaagta ggcatacttt 1080

gcataatgtc tgatagagac ctttatgaca agttgcaatt tacaagtctt gaaattccac 1140

gtaggaatgt ggcaacttta caagctgaaa atgtaacagg actctttaaa gattgtagta 1200

aggtaatcac tgggttacat cctacacagg cacctacaca cctcagtgtt gacactaaat 1260

tcaaaactga aggtttatgt gttgacatac ctggcatacc taaggacatg acctatagaa 1320

gactcatctc tatgatgggt tttaaaatga attatcaagt taatggttac cctaacatgt 1380

ttatcacccg cgaagaagct ataagacatg tacgtgcatg gattggcttc gatgtcgagg 1440

ggtgtcatgc tactagagaa gctgttggta ccaatttacc tttacagcta ggtttttcta 1500

caggtgttaa cctagttgct gtacctacag gttatgttga tacacctaat aatacagatt 1560

tttccagagt tagtgctaaa ccaccgcctg gagatcaatt taaacacctc ataccactta 1620

tgtacaaagg acttccttgg aatgtagtgc gtataaagat tgtacaaatg ttaagtgaca 1680

cacttaaaaa tctctctgac agagtcgtat ttgtcttatg ggcacatggc tttgagttga 1740

catctatgaa gtattttgtg aaaataggac ctgagcgcac ctgttgtcta tgtgatagac 1800

gtgccacatg cttttccact gcttcagaca cttatgcctg ttggcatcat tctattggat 1860

ttgattacgt ctataatccg tttatgattg atgttcaaca atggggtttt acaggtaacc 1920

tacaaagcaa ccatgatctg tattgtcaag tccatggtaa tgcacatgta gctagttgtg 1980

atgcaatcat gactaggtgt ctagctgtcc acgagtgctt tgttaagcgt gttgactgga 2040

ctattgaata tcctataatt ggtgatgaac tgaagattaa tgcggcttgt agaaaggttc 2100

aacacatggt tgttaaagct gcattattag cagacaaatt cccagttctt cacgacattg 2160

gtaaccctaa agctattaag tgtgtacctc aagctgatgt agaatggaag ttctatgatg 2220

cacagccttg tagtgacaaa gcttataaaa tagaagaatt attctattct tatgccacac 2280

attctgacaa attcacagat ggtgtatgcc tattttggaa ttgcaatgtc gatagatatc 2340

ctgctaattc cattgtttgt agatttgaca ctagagtgct atctaacctt aacttgcctg 2400

gttgtgatgg tggcagtttg tatgtaaata aacatgcatt ccacacacca gcttttgata 2460

aaagtgcttt tgttaattta aaacaattac catttttcta ttactctgac agtccatgtg 2520

agtctcatgg aaaacaagta gtgtcagata tagattatgt accactaaag tctgctacgt 2580

gtataacacg ttgcaattta ggtggtgctg tctgtagaca tcatgctaat gagtacagat 2640

tgtatctcga tgcttataac atgatgatct cagctggctt tagcttgtgg gtttacaaac 2700

aatttgatac ttataacctc tggaacactt ttacaagact tcagagttta gaaaatgtgg 2760

cttttaatgt tgtaaataag ggacactttg atggacaaca gggtgaagta ccagtttcta 2820

tcattaataa cactgtttac acaaaagttg atggtgttga tgtagaattg tttgaaaata 2880

aaacaacatt acctgttaat gtagcatttg agctttgggc taagcgcaac attaaaccag 2940

taccagaggt gaaaatactc aataatttgg gtgtggacat tgctgctaat actgtgatct 3000

<210> 8

<211> 3421

<212> DNA

<213> Artificial Sequence

<220>

<223> ORF1ab-fragment 6

<400> 8

gggtgaagta ccagtttcta tcattaataa cactgtttac acaaaagttg atggtgttga 60

tgtagaattg tttgaaaata aaacaacatt acctgttaat gtagcatttg agctttgggc 120

taagcgcaac attaaaccag taccagaggt gaaaatactc aataatttgg gtgtggacat 180

tgctgctaat actgtgatct gggactacaa aagagatgct ccagcacata tatctactat 240

tggtgtttgt tctatgactg acatagccaa gaaaccaact gaaacgattt gtgcaccact 300

cactgtcttt tttgatggta gagttgatgg tcaagtagac ttatttagaa atgcccgtaa 360

tggtgttctt attacagaag gtagtgttaa aggtttacaa ccatctgtag gtcccaaaca 420

agctagtctt aatggagtca cattaattgg agaagccgta aaaacacagt tcaattatta 480

taagaaagtt gatggtgttg tccaacaatt acctgaaact tactttactc agagtagaaa 540

tttacaagaa tttaaaccca ggagtcaaat ggaaattgat ttcttagaat tagctatgga 600

tgaattcatt gaacggtata aattagaagg ctatgccttc gaacatatcg tttatggaga 660

ttttagtcat agtcagttag gtggtttaca tctactgatt ggactagcta aacgttttaa 720

ggaatcacct tttgaattag aagattttat tcctatggac agtacagtta aaaactattt 780

cataacagat gcgcaaacag gttcatctaa gtgtgtgtgt tctgttattg atttattact 840

tgatgatttt gttgaaataa taaaatccca agatttatct gtagtttcta aggttgtcaa 900

agtgactatt gactatacag aaatttcatt tatgctttgg tgtaaagatg gccatgtaga 960

aacattttac ccaaaattac aatctagtca agcgtggcaa ccgggtgttg ctatgcctaa 1020

tctttacaaa atgcaaagaa tgctattaga aaagtgtgac cttcaaaatt atggtgatag 1080

tgcaacatta cctaaaggca taatgatgaa tgtcgcaaaa tatactcaac tgtgtcaata 1140

tttaaacaca ttaacattag ctgtacccta taatatgaga gttatacatt ttggtgctgg 1200

ttctgataaa ggagttgcac caggtacagc tgttttaaga cagtggttgc ctacgggtac 1260

gctgcttgtc gattcagatc ttaatgactt tgtctctgat gcagattcaa ctttgattgg 1320

tgattgtgca actgtacata cagctaataa atgggatctc attattagtg atatgtacga 1380

ccctaagact aaaaatgtta caaaagaaaa tgactctaaa gagggttttt tcacttacat 1440

ttgtgggttt atacaacaaa agctagctct tggaggttcc gtggctataa agataacaga 1500

acattcttgg aatgctgatc tttataagct catgggacac ttcgcatggt ggacagcctt 1560

tgttactaat gtgaatgcgt catcatctga agcattttta attggatgta attatcttgg 1620

caaaccacgc gaacaaatag atggttatgt catgcatgca aattacatat tttggaggaa 1680

tacaaatcca attcagttgt cttcctattc tttatttgac atgagtaaat ttccccttaa 1740

attaaggggt actgctgtta tgtctttaaa agaaggtcaa atcaatgata tgattttatc 1800

tcttcttagt aaaggtagac ttataattag agaaaacaac agagttgtta tttctagtga 1860

tgttcttgtt aacaacgcgg ccgccgaggg ccgcggcagc ctgctgacct gcggcgacgt 1920

ggaggagaac cccggccccg cggccgccat ggtgagcaag ggcgaggagg acaacatggc 1980

catcatcaag gagttcatgc gcttcaaggt gcacatggag ggcagcgtga acggccacga 2040

gttcgagatc gagggcgagg gcgagggccg cccctacgag ggcacccaga ccgccaagct 2100

gaaggtgacc aagggcggcc ccctgccctt cgcctgggac atcctgagcc cccagttcat 2160

gtacggcagc aaggcctacg tgaagcaccc cgccgacatc cccgactacc tgaagctgag 2220

cttccccgag ggcttcaagt gggagcgcgt gatgaacttc gaggacggcg gcgtggtgac 2280

cgtgacccag gacagcagcc tgcaggacgg cgagttcatc tacaaggtga agctgcgcgg 2340

caccaacttc cccagcgacg gccccgtgat gcagaagaag accatgggct gggaggccag 2400

cagcgagcgc atgtaccccg aggacggcgc cctgaagggc gagatcaagc agcgcctgaa 2460

gctgaaggac ggcggccact acgacgccga ggtgaagacc acctacaagg ccaagaagcc 2520

cgtgcagctg cccggcgcct acaacgtgaa catcaagctg gacatcacca gccacaacga 2580

ggactacacc atcgtggagc agtacgagcg cgccgagggc cgccacagca ccggcggcat 2640

ggacgagctg tacaaggcgg ccgcctaaca atctttaatc agtgtgtaac attagggagg 2700

acttgaaaga gccaccacat tttcaccgag gccacgcgga gtacgatcga gtgtacagtg 2760

aacaatgcta gggagagctg cctatatgga agagccctaa tgtgtaaaat taattttagt 2820

agtgctatcc ccatgtgatt ttaatagctt cttaggagaa tgacaaaaaa aaaaaaaaaa 2880

aaaaaaaaaa aaaaaaaaaa aaaaaaaaag ggtcggcatg gcatctccac ctcctcgcgg 2940

tccgacctgg gcatccgaag gaggacgcac gtccactcgg atggctaagg gaggctagcc 3000

agacatgata agatacattg atgagtttgg acaaaccaca actagaatgc agtgaaaaaa 3060

atgctttatt tgtgaaattt gtgatgctat tgctttattt gtaaccatta taagctgcaa 3120

taaacaagtt aacaacaaca attgcattca ttttatgttt caggttcagg gggaggtgtg 3180

ggaggttttt taaagcaagt aaaacctcta caaatgtggt acagaaagtc aaaagcctcc 3240

gaccggaggc ttttgactat tactgccgac atggaagcca tcacaaacgg catgatgaac 3300

ctataaacgc agaaaggccc acccgaaggt gagccagtgt gattaccagc ggcatcagca 3360

ccttgtcgcc ttgcgtataa tatttgcaaa aaacccctca agacccgttt agaggcccca 3420

a 3421

<210> 9

<211> 2215

<212> DNA

<213> Artificial Sequence

<220>

<223> Stru Δ S-fragment 1

<400> 9

ctgcctggcg gcagtagcgc ggtggtccca cctgacccca tgccgaactc agaagtgaaa 60

cgccgtagcg ccgatggtag tgtggggtct ccccatgcga gagtagggaa ctgccaggca 120

tcaaataaaa cgaaaggctc agtcgaaaga ctgggccttc ctgccactca tcgcagtact 180

gttgtaattc attaagcatg tagttattaa tagtaatcaa ttacggggtc attagttcat 240

agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 300

cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 360

gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 420

catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc 480

gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac 540

gtattagtca tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga 600

tagcggtttg actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg 660

ttttggcacc aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg 720

caaatgggcg gtaggcgtgt acggtgggag gtctatataa gcagagctgg tttagtgaac 780

cgtcagatct aatacgactc actataggga ttaaaggttt ataccttccc aggtaacaaa 840

ccaaccaact ttcgatctct tgtagatctg ttctctaaac gaactttaaa atctgtgtgg 900

ctgtcactcg gctgcatgct tagtgcactc acgcagtata attaataact aattactgtc 960

gttgacagga cacgagtaac tcgtctatct tctgcaggct gcttacggtt tcgtccgtgt 1020

tgcagccgat catcagcaca tctaggtttc gtccgggtgt gaccgaaagg taagacgaac 1080

ttatggattt gtttatgaga atcttcacaa ttggaactgt aactttgaag caaggtgaaa 1140

tcaaggatgc tactccttca gattttgttc gcgctactgc aacgataccg atacaagcct 1200

cactcccttt cggatggctt attgttggcg ttgcacttct tgctgttttt cagagcgctt 1260

ccaaaatcat aaccctcaaa aagagatggc aactagcact ctccaagggt gttcactttg 1320

tttgcaactt gctgttgttg tttgtaacag tttactcaca ccttttgctc gttgctgctg 1380

gccttgaagc cccttttctc tatctttatg ctttagtcta cttcttgcag agtataaact 1440

ttgtaagaat aataatgagg ctttggcttt gctggaaatg ccgttccaaa aacccattac 1500

tttatgatgc caactatttt ctttgctggc atactaattg ttacgactat tgtatacctt 1560

acaatagtgt aacttcttca attgtcatta cttcaggtga tggcacaaca agtcctattt 1620

ctgaacatga ctaccagatt ggtggttata ctgaaaaatg ggaatctgga gtaaaagact 1680

gtgttgtatt acacagttac ttcacttcag actattacca gctgtactca actcaattga 1740

gtacagacac tggtgttgaa catgttacct tcttcatcta caataaaatt gttgatgagc 1800

ctgaagaaca tgtccaaatt cacacaatcg acggttcatc cggagttgtt aatccagtaa 1860

tggaaccaat ttatgatgaa ccgacgacga ctactagcgt gcctttgtaa gcacaagctg 1920

atgagtacga acttatgtac tcattcgttt cggaagagac aggtacgtta atagttaata 1980

gcgtacttct ttttcttgct ttcgtggtat tcttgctagt tacactagcc atccttactg 2040

cgcttcgatt gtgtgcgtac tgctgcaata ttgttaacgt gagtcttgta aaaccttctt 2100

tttacgttta ctctcgtgtt aaaaatctga attcttctag agttcctgat cttctggtct 2160

aaacgaacta aatattatat tagtttttct gtttggaact ttaattttag ccatg 2215

<210> 10

<211> 2984

<212> DNA

<213> Artificial Sequence

<220>

<223> Stru Δ S-fragment 2

<400> 10

ctagttacac tagccatcct tactgcgctt cgattgtgtg cgtactgctg caatattgtt 60

aacgtgagtc ttgtaaaacc ttctttttac gtttactctc gtgttaaaaa tctgaattct 120

tctagagttc ctgatcttct ggtctaaacg aactaaatat tatattagtt tttctgtttg 180

gaactttaat tttagccatg gcagattcca acggtactat taccgttgaa gagcttaaaa 240

agctccttga acaatggaac ctagtaatag gtttcctatt ccttacatgg atttgtcttc 300

tacaatttgc ctatgccaac aggaataggt ttttgtatat aattaagtta attttcctct 360

ggctgttatg gccagtaact ttagcttgtt ttgtgcttgc tgctgtttac agaataaatt 420

ggatcaccgg tggaattgct atcgcaatgg cttgtcttgt aggcttgatg tggctcagct 480

acttcattgc ttctttcaga ctgtttgcgc gtacgcgttc catgtggtca ttcaatccag 540

aaactaacat tcttctcaac gtgccactcc atggcactat tctgaccaga ccgcttctag 600

aaagtgaact cgtaatcgga gctgtgatcc ttcgtggaca tcttcgtatt gctggacacc 660

atctaggacg ctgtgacatc aaggacctgc ctaaagaaat cactgttgct acatcacgaa 720

cgctttctta ttacaaattg ggagcttcgc agcgtgtagc aggtgactca ggttttgctg 780

catacagtcg ctacaggatt ggcaactata aattaaacac agaccattcc agtagcagtg 840

acaatattgc tttgcttgta cagtaagtga caacagatgt ttcatctcgt tgactttcag 900

gttactatag cagagatatt actaattatt atgaggactt ttaaagtttc catttggaat 960

cttgattaca tcataaacct cataattaaa aatttatcta agtcactaac tgagaataaa 1020

tattctcaat tagatgaaga gcaaccaatg gagattgatt aaacgaacat gaaaattatt 1080

cttttcttgg cactgataac actcgctact tgtgagcttt atcactacca agagtgtgtt 1140

agaggtacaa cagtactttt aaaagaacct tgctcttctg gaacatacga gggcaattca 1200

ccatttcatc ctctagctga taacaaattt gcactgactt gctttagcac tcaatttgct 1260

tttgcttgtc ctgacggcgt aaaacacgtc tatcagttac gtgccagatc agtttcacct 1320

aaactgttca tcagacaaga ggaagttcaa gaactttact ctccaatttt tcttattgtt 1380

gcggcaatag tgtttataac actttgcttc acactcaaaa gaaagacaga atgattgaac 1440

tttcattaat tgacttctat ttgtgctttt tagcctttct gctattcctt gttttaatta 1500

tgcttattat cttttggttc tcacttgaac tgcaagatca taatgaaact tgtcacgcct 1560

aaacgaacat gaaatttctt gttttcttag gaatcatcac aactgtagct gcatttcacc 1620

aagaatgtag tttacagtca tgtactcaac atcaaccata tgtagttgat gacccgtgtc 1680

ctattcactt ctattctaaa tggtatatta gagtaggagc tagaaaatca gcacctttaa 1740

ttgaattgtg cgtggatgag gctggttcta aatcacccat tcagtacatc gatatcggta 1800

attatacagt ttcctgttta ccttttacaa ttaattgcca ggaacctaaa ttgggtagtc 1860

ttgtagtgcg ttgttcgttc tatgaagact ttttagagta tcatgacgtt cgtgttgttt 1920

tagatttcat ctaaacgaac aaactaaaat gtctgataat ggaccccaaa atcagcgaaa 1980

tgcaccccgc attacgtttg gtggaccctc agattcaact ggcagtaacc agaatggaga 2040

acgcagtggg gcgcgatcaa aacaacgtcg gccccaaggt ttacccaata atactgcgtc 2100

ttggttcacc gctctcactc aacatggcaa ggaagacctt aaattccctc gaggacaagg 2160

cgttccaatt aacaccaata gcagtccaga tgaccaaatt ggctactacc gaagagctac 2220

cagacgaatt cgtggtggtg acggtaaaat gaaagatctc agtccaagat ggtatttcta 2280

ctacctagga actgggccag aagctggact tccctatggt gctaacaaag acggcatcat 2340

atgggttgca actgagggag ccttgaatac accaaaagat cacattggca cccgcaatcc 2400

tgctaacaat gctgcaatcg tgctacaact tcctcaagga acaacattgc caaaaggctt 2460

ctacgcagaa gggagcagag gcggcagtca agcctcttct cgttcctcat cacgtagtcg 2520

caacagttca agaaattcaa ctccaggcag cagtagggga acttctcctg ctagaatggc 2580

tggcaatggc ggtgatgctg ctcttgcttt gctgctgctt gacagattga accagcttga 2640

gagcaaaatg tctggtaaag gccaacaaca acaaggccaa actgtcacta agaaatctgc 2700

tgctgaggct tctaagaagc ctcggcaaaa acgtactgcc actaaagcat acaatgtaac 2760

acaagctttc ggcagacgtg gtccagaaca aacccaagga aattttgggg accaggaact 2820

aatcagacaa ggaactgatt acaaacattg gccgcaaatt gcacaatttg cccccagcgc 2880

ttcagcgttc ttcggaatgt cgcgcattgg catggaagtc acaccttcgg gaacgtggtt 2940

gacctacaca ggtgccatca aattggatga caaagatcca aatt 2984

<210> 11

<211> 2938

<212> DNA

<213> Artificial Sequence

<220>

<223> Stru Δ S-fragment 3

<400> 11

agaacaaacc caaggaaatt ttggggacca ggaactaatc agacaaggaa ctgattacaa 60

acattggccg caaattgcac aatttgcccc cagcgcttca gcgttcttcg gaatgtcgcg 120

cattggcatg gaagtcacac cttcgggaac gtggttgacc tacacaggtg ccatcaaatt 180

ggatgacaaa gatccaaatt tcaaagatca agtcattttg ctgaataagc atattgacgc 240

atacaaaaca ttcccaccaa cagagcctaa aaaggacaaa aagaagaagg ctgatgaaac 300

tcaagcctta ccgcagagac agaagaaaca gcaaactgtg actcttcttc ctgctgcaga 360

tttggatgat ttctccaaac aattgcaaca atccatgagc agtgctgact caactcaggc 420

ctaaactcat gcagaccaca caaggcagat gggctatata aacgttttcg cttttccgtt 480

tacgatatat agtctactct tgtgcagaat gaattctcgt aactacatag cacaagtaga 540

tgtagttaac tttaatctca cagagggccg cggcagcctg ctgacctgcg gcgacgtgga 600

ggagaacccc ggccccatgg tgagcaaggg cgaggagctg ttcaccggcg tggtgcccat 660

cctggtggag ctggacggcg acgtgaacgg ccacaagttc agcgtgagcg gcgagggcga 720

gggcgacgcc acctacggca agctgaccct gaagttcatc tgcaccaccg gcaagctgcc 780

cgtgccctgg cccaccctgg tgaccaccct gacctacggc gtgcagtgct tcagccgcta 840

ccccgaccac atgaagcagc acgacttctt caagagcgcc atgcccgagg gctacgtgca 900

ggagcgcacc atcttcttca aggacgacgg caactacaag acccgcgccg aggtgaagtt 960

cgagggcgac accctggtga accgcatcga gctgaagggc atcgacttca aggaggacgg 1020

caacatcctg ggccacaagc tggagtacaa ctacaacagc cacaacgtgt acatcatggc 1080

cgacaagcag aagaacggca tcaaggtgaa cttcaagatc cgccacaaca tcgaggacgg 1140

cagcgtgcag ctggccgacc actaccagca gaacaccccc atcggcgacg gccccgtgct 1200

gctgcccgac aaccactacc tgagcaccca gagcgccctg agcaaggacc ccaacgagaa 1260

gcgcgaccac atggtgctgc tggagttcgt gaccgccgcc ggcatcaccc tgggcatgga 1320

cgagctgtac aaggccacca acttcagcct gctgaagcag gccggcgacg tggaggagaa 1380

ccccggcccc atgatcgagc aggacggcct gcacgccggc agccccgccg cctgggtgga 1440

gcgcctgttc ggctacgact gggcccagca gaccatcggc tgcagcgacg ccgccgtgtt 1500

ccgcctgagc gcccagggcc gccccgtgct gttcgtgaag accgacctga gcggcgccct 1560

gaacgagctg caggacgagg ccgcccgcct gagctggctg gccaccaccg gcgtgccctg 1620

cgccgccgtg ctggacgtgg tgaccgaggc cggccgcgac tggctgctgc tgggcgaggt 1680

gcccggccag gacctgctga gcagccacct ggcccccgcc gagaaggtga gcatcatggc 1740

cgacgccatg cgccgcctgc acaccctgga ccccgccacc tgccccttcg accaccaggc 1800

caagcaccgc atcgagcgcg cccgcacccg catggaggcc ggcctggtgg accaggacga 1860

cctggacgag gagcaccagg gcctggcccc cgccgagctg ttcgcccgcc tgaaggccag 1920

catgcccgac ggcgaggacc tggtggtgac ccacggcgac gcctgcctgc ccaacatcat 1980

ggtggagaac ggccgcttca gcggcttcat cgactgcggc aggctgggcg tggccgaccg 2040

ctaccaggac atcgccctgg ccacccgcga catcgccgag gagctgggcg gcgagtgggc 2100

cgaccgcttc ctggtgctgt acggcatcgc cgcccccgac agccagcgca tcgccttcta 2160

ccgcctgctg gacgagttct tctaacaatc tttaatcagt gtgtaacatt agggaggact 2220

tgaaagagcc accacatttt caccgaggcc acgcggagta cgatcgagtg tacagtgaac 2280

aatgctaggg agagctgcct atatggaaga gccctaatgt gtaaaattaa ttttagtagt 2340

gctatcccca tgtgatttta atagcttctt aggagaatga caaaaaaaaa aaaaaaaaaa 2400

aaaaaaaaaa aaaaaaaaaa aaaaaagggt cggcatggca tctccacctc ctcgcggtcc 2460

gacctgggca tccgaaggag gacgcacgtc cactcggatg gctaagggag gctagccaga 2520

catgataaga tacattgatg agtttggaca aaccacaact agaatgcagt gaaaaaaatg 2580

ctttatttgt gaaatttgtg atgctattgc tttatttgta accattataa gctgcaataa 2640

acaagttaac aacaacaatt gcattcattt tatgtttcag gttcaggggg aggtgtggga 2700

ggttttttaa agcaagtaaa acctctacaa atgtggtaca gaaagtcaaa agcctccgac 2760

cggaggcttt tgactattac tgccgacatg gaagccatca caaacggcat gatgaaccta 2820

taaacgcaga aaggcccacc cgaaggtgag ccagtgtgat taccagcggc atcagcacct 2880

tgtcgccttg cgtataatat ttgcaaaaaa cccctcaaga cccgtttaga ggccccaa 2938

<210> 12

<211> 265

<212> DNA

<213> Artificial Sequence

<220>

<223> 5'UTR

<400> 12

attaaaggtt tataccttcc caggtaacaa accaaccaac tttcgatctc ttgtagatct 60

gttctctaaa cgaactttaa aatctgtgtg gctgtcactc ggctgcatgc ttagtgcact 120

cacgcagtat aattaataac taattactgt cgttgacagg acacgagtaa ctcgtctatc 180

ttctgcaggc tgcttacggt ttcgtccgtg ttgcagccga tcatcagcac atctaggttt 240

cgtccgggtg tgaccgaaag gtaag 265

<210> 13

<211> 229

<212> DNA

<213> Artificial Sequence

<220>

<223> 3'UTR

<400> 13

caatctttaa tcagtgtgta acattaggga ggacttgaaa gagccaccac attttcaccg 60

aggccacgcg gagtacgatc gagtgtacag tgaacaatgc tagggagagc tgcctatatg 120

gaagagccct aatgtgtaaa attaatttta gtagtgctat ccccatgtga ttttaatagc 180

ttcttaggag aatgacaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa 229
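The fragments listed above are built so that adjacent fragments share a homologous end; for example, SEQ ID NO: 7 opens with a stretch of roughly 200 nucleotides that also appears at the end of SEQ ID NO: 6. The following Python sketch shows one way such shared ends could be checked and collapsed in silico. It is an illustration only, not the yeast homologous recombination procedure itself; the function names, the `min_length` threshold and the toy sequences in the usage note are assumptions.

```python
def terminal_overlap(upstream, downstream, min_length=15):
    """Length of the longest suffix of `upstream` that is also a prefix of
    `downstream`; returns 0 if no shared end of at least `min_length` exists."""
    longest = min(len(upstream), len(downstream))
    for length in range(longest, min_length - 1, -1):
        if upstream[-length:] == downstream[:length]:
            return length
    return 0


def assemble(fragments, min_length=15):
    """Join fragments in the given order, keeping each shared end only once."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        shared = terminal_overlap(assembled, fragment, min_length)
        if shared == 0:
            raise ValueError("adjacent fragments share no homologous end")
        assembled += fragment[shared:]
    return assembled
```

With toy inputs, `terminal_overlap("AAAACCCCGGGG", "CCGGGGTTTT", 4)` finds the six shared nucleotides `CCGGGG`, and `assemble` joins the two strings into `AAAACCCCGGGGTTTT`.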

Claims

1. A composition for preparing SARS-CoV-2 virus-like particles, comprising the following (a), and further comprising any one or any two selected from the following (b) and (c):

2. The composition of claim 1, wherein the mutant of the S protein comprises, relative to the S protein of SARS-CoV-2 virus, the mutation N331Q, N501Y, D614G and/or P681H.
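The substitutions named in claim 2 use the common notation of original residue, 1-based position, replacement residue (D614G replaces aspartate at position 614 with glycine). A minimal Python sketch of parsing and applying that notation is given below; the function names are illustrative assumptions and form no part of the claimed composition.

```python
import re

# Substitutions named in claim 2: <original residue><1-based position><replacement>.
CLAIM_2_MUTATIONS = ["N331Q", "N501Y", "D614G", "P681H"]


def parse_substitution(code):
    """Split a code such as 'D614G' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitutions(protein, codes):
    """Return a copy of `protein` with each substitution applied.

    Raises ValueError if a position is out of range or the residue found
    there differs from the one the code expects.
    """
    residues = list(protein)
    for code in codes:
        original, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(f"{code}: found {residues[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)
```

On a toy four-residue sequence, `apply_substitutions("MKND", ["N3Q"])` returns `"MKQD"`; applying `CLAIM_2_MUTATIONS` would require the full-length S protein sequence as input.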

3. The composition of claim 1 or 2, wherein the Stru Δ S fragment comprises any one of:

4. The composition according to any one of claims 1 to 3, wherein the packaging signal sequence is nucleotides 19900 to 20000 of NCBI sequence No. NC_045512.2 or nucleotides 19773 to 20335 of NCBI sequence No. NC_045512.2.
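Reading the two coordinate pairs in claim 4 as 1-based, inclusive ranges on the reference genome (an interpretation, since the claim states only the endpoints), the corresponding regions can be sliced out of a genome string as sketched below in Python. The helper name `extract_region` is hypothetical, and the genome sequence itself is assumed to be supplied by the reader.

```python
# Coordinate pairs from claim 4, read as 1-based inclusive ranges (an assumption).
PACKAGING_SIGNAL_RANGES = [(19900, 20000), (19773, 20335)]


def extract_region(genome, start, end):
    """Return nucleotides start..end (1-based, inclusive) of `genome`."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError("coordinates fall outside the genome")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return genome[start - 1:end]


# Length of each region under the inclusive reading.
region_lengths = [end - start + 1 for start, end in PACKAGING_SIGNAL_RANGES]
```

Under this reading the two candidate packaging signal regions are 101 and 563 nucleotides long, and the shorter range lies entirely within the longer one.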

5. The composition of any one of claims 1-4, wherein the packaging signal fragment comprises ORF1ab of the SARS-CoV-2 viral genome.

6. A method of preparing the composition of any one of claims 1-5, comprising preparing the first, second and/or third plasmids by:

7. A method for preparing SARS-CoV-2 virus-like particles, comprising:

transfecting a packaging cell with the composition of any one of claims 1 to 5 to obtain a SARS-CoV-2 virus-like particle.

8. The method of claim 7, further comprising

preparing a first plasmid, a second plasmid and/or a third plasmid, respectively, by:

9. The method of claim 8, further comprising amplifying the first, second and/or third plasmids separately in E. coli before transfecting the packaging cell.

10. A SARS-CoV-2 virus-like particle produced by the method according to any one of claims 7 to 9.

11. Use of a composition according to any one of claims 1 to 5 or a SARS-CoV-2 virus-like particle according to claim 10 in the preparation of a vaccine for the prevention or treatment of SARS-CoV-2 virus infection.

12. Use of a composition according to any one of claims 1 to 5 or a SARS-CoV-2 virus-like particle according to claim 10 in the in vitro study of cells infected with SARS-CoV-2 virus.