WO2023218021A1 - Use of transposases for improving transgene expression and nuclear localization - Google Patents

Publication number
WO2023218021A1
Authority
WO
WIPO (PCT)
Prior art keywords
transposase
protein
nucleic acid
amino acid
seq
Application number
PCT/EP2023/062736
Other languages
French (fr)
Inventor
Avencia SÁNCHEZ-MEJÍAS GARCIA
Marc GÜELL CARGOL
Maria PALLARÈS MASMITJÀ
Original Assignee
Integra Therapeutics
Universitat Pompeu Fabra
Application filed by Integra Therapeutics, Universitat Pompeu Fabra
Publication of WO2023218021A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09 Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal

Definitions

  • the present invention relates to methods for improving transgene expression in a cell population.
  • NPCs nuclear pore complexes
  • Transposases are enzymes naturally evolved to “cut-and-paste” or “copy-and-paste” DNA fragments into genomic DNA of prokaryotes (e.g., Tn3, Tn5 and Tn7) or eukaryotes (e.g., Sleeping Beauty and PiggyBac). These enzymes have been found in the genome of certain species, with low transposition efficiencies that avoid genomic toxicity. Transposases bind to a payload DNA, called a transposon, that contains specific Inverted Terminal Repeats (ITRs), and copy or cut its sequence before pasting it into a random or semi-random genomic site.
  • ITRs Inverted Terminal Repeats
  • Transposases are naturally targeted to the nucleus thanks to a nuclear localization signal (see, e.g., Keith et al. “Analysis of the piggyBac transposase reveals a functional nuclear targeting signal in the 94 c-terminal residues.” BMC Molecular Biology 2008, vol. 9, 72).
  • the state of the art comprises different transposases that were modified to increase their efficiency and to create tools for mammalian genome editing such as SB 100 or hyperactive PiggyBac.
  • Transposases have also been fused to DNA binding domains such as, e.g., Cas9 (see, e.g., WO2022129438 or WO2020243085), dead-Cas9 (see, e.g., Hew et al. “RNA-guided piggyBac transposition in human cells.” Synthetic Biology 2019, vol. 4, 1), and ZNF, in order to specifically target and edit a certain genomic DNA location.
  • transposases have been described as tools for genome editing (see, e.g., Zhao et al. “PiggyBac transposon vectors: the tools of the human gene encoding.” Translational Lung Cancer Research 2016, vol. 5,1)
  • their applications rely on their insertional activity, either for inserting a gene in a host cell’s DNA, or for disrupting a gene, which also poses the problem of insertional mutagenesis.
  • transfection is usually performed on actively dividing (proliferating) cells, because they internalize nucleic acids much better than non-dividing, quiescent cells (Bai et al. 2017), and it is notably difficult to transfect confluent cells or certain cell types that do not divide in culture. Hence, there is a need to improve transfection efficiency in non-dividing cells.
  • the present invention provides evidence that, using transposases with vehicles that facilitate cytoplasmic transduction of nucleic acids, it is possible to increase the nuclear localization or nuclear uptake of a DNA of interest, where it can then be expressed by the cell (i.e., transiently expressed, transfected) or inserted in the genomic DNA.
  • the method of the present invention also enables efficient transfection of quiescent cells.
  • the method of the present invention also enables efficient transfection without insertional activity when using a catalytically inactive transposase.
  • An object of the present invention is a method of increasing the expression of a nucleic acid molecule encoding at least one transgene of interest in a cell population, comprising contacting a cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
  • Another object of the present invention is a method of increasing nuclear localization of a nucleic acid molecule encoding at least one transgene of interest in a cell population, comprising contacting a cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
  • Another object of the present invention is a method of editing the genome of a cell population, comprising contacting a cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
  • Another object of the present invention is a method of treating a genetic disease in a subject in need thereof, comprising administering to said subject a therapeutically effective amount of: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and a nucleic acid molecule encoding the at least one transgene of interest, wherein expression of the transgene of interest in at least one cell of the subject in need thereof compensates for a gene defect responsible for the genetic disease.
  • the nucleic acid molecule encoding the at least one transgene of interest is a deoxyribonucleic acid (DNA) molecule.
  • said DNA molecule is a complementary DNA (cDNA) molecule.
  • said DNA molecule is a genomic DNA (gDNA) molecule.
  • the nucleic acid molecule encoding the at least one transgene of interest further comprises at least one Inverted Terminal Repeat (ITR) sequence.
  • ITR Inverted Terminal Repeat
  • the nucleic acid molecule encoding the at least one transgene of interest further comprises two ITR sequences.
  • the at least one ITR sequence is adjacent to the transgene of interest.
  • the nucleic acid molecule encoding the at least one transgene of interest further comprises two ITR sequences flanking the transgene of interest.
  • the at least one ITR sequence interacts with, or non-covalently binds to, the transposase.
  • the DNA molecule is inserted into the genome of said cell population.
  • the nucleic acid molecule encoding the at least one transgene of interest is stably expressed by the cell population.
  • the transgene of interest is an exogenous gene. In some embodiments, the transgene of interest is an endogenous gene.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a plasmid, a fosmid, a cosmid, an artificial chromosome or a viral vector. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a plasmid. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a viral vector.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a DNA virus-based vector selected from the group consisting of viruses from the realms Duplodnaviria, Monodnaviria and Varidnaviria.
  • the protein or polypeptide comprising the transposase or a fragment thereof increases the nuclear localization of the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof translocates, or promotes translocation of, the nucleic acid molecule encoding the at least one transgene of interest to the nucleus.
  • the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least one additional polypeptide or protein.
  • the at least one additional polypeptide or protein is a nuclease.
  • the at least one additional polypeptide or protein is an RNA-guided nuclease.
  • the at least one additional polypeptide or protein is a Cas nuclease.
  • the at least one additional polypeptide or protein is a Cas9 nuclease.
  • the fusion protein further comprises a linker.
  • the fusion protein has at least 75 % amino acid sequence identity with SEQ ID NO: 2. In some embodiments, the fusion protein comprises or consists of the amino acid sequence of SEQ ID NO: 2.
  • the at least one additional polypeptide or protein is an aptamer binding protein.
  • the at least one additional polypeptide or protein is the MS2 bacteriophage coat protein (MCP).
  • MCP MS2 bacteriophage coat protein
  • the fusion protein interacts, or is capable of interacting, covalently or non-covalently through MCP with a gRNA molecule comprising at least one MS2 aptamer.
  • the gRNA molecule interacts, or is capable of interacting, covalently or non-covalently through the at least one MS2 aptamer with an RNA-guided nuclease.
  • the transposase is selected from the group consisting of hyperactive PiggyBac transposase, PiggyBac transposase, Sleeping Beauty transposase, SB 11 transposase, Tol2 transposase, Mos1 transposase, and Frog Prince transposase.
  • the transposase is selected from the group consisting of hyperactive PiggyBac transposase and Sleeping Beauty transposase.
  • the transposase is a hyperactive PiggyBac transposase.
  • the transposase is a hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1.
  • the transposase is a modified hyperactive PiggyBac transposase, comprising at least one amino acid mutation compared with the amino acid sequence of the hyperactive PiggyBac transposase of SEQ ID NO: 1.
  • the transposase is a Sleeping Beauty transposase.
  • the transposase or the fragment thereof has decreased catalytic activity.
  • the transposase or the fragment thereof is catalytically dead.
  • the catalytically dead transposase has at least 75 % amino acid sequence identity with SEQ ID NO: 3. In some embodiments, the catalytically dead transposase comprises or consists of the amino acid sequence of SEQ ID NO: 3.
  • the protein or polypeptide comprises a transposase fragment.
  • the transposase fragment comprises or consists of at least one transposase functional domain.
  • the transposase fragment comprises or consists of an ITR-binding domain.
  • the transposase fragment is a fragment of a hyperactive PiggyBac transposase or of a Sleeping Beauty transposase.
  • the transposase fragment is a fragment of a hyperactive PiggyBac transposase, preferably comprising an ITR-binding domain.
  • the transposase fragment is a fragment of a Sleeping Beauty transposase, preferably selected from the group consisting of the SB 100 domain and the N57 domain of Sleeping Beauty transposase.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a ribonucleic acid (RNA) molecule or a DNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is an RNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a messenger RNA (mRNA) molecule.
  • mRNA messenger RNA
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a cDNA molecule or a plasmid. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transfection or transformation. In some embodiments, transfection is selected from the group consisting of lipofection, electroporation, sonication, nanoparticles, microinjection and viral vector infection, including non-integrative and integrative viral vector infection. In some embodiments, transformation comprises using an integrative viral vector or a modified integrative virus.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population together with the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population prior to the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 72 hours before the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population about 4 hours before the nucleic acid molecule encoding the at least one transgene of interest.
  • the protein or polypeptide comprising a transposase or a fragment thereof, or the nucleic acid encoding the same is comprised in a pharmaceutical composition further comprising at least one acceptable excipient.
  • the cell population is a eukaryotic or a prokaryotic cell population. In some embodiments, the cell population is a eukaryotic cell population. In some embodiments, the cell population is a prokaryotic cell population. In some embodiments, the method is performed in vitro. In some embodiments, the cell population is an in vitro-cultured population of cells. In some embodiments, the method is performed ex vivo. In some embodiments, the method is performed in vivo.
  • the cell population is comprised in a tissue or an organ of a living organism.
  • the organism is an animal.
  • the organism is a mammal.
  • the organism is a human.
  • the method comprises the steps of: contacting the cell population with a vector comprising the nucleic acid encoding the protein or polypeptide comprising the transposase or a fragment thereof, culturing the cell population for a period of time ranging from 1 hour to 72 hours, and contacting the cell population with a vector comprising the nucleic acid molecule encoding the at least one transgene of interest.
  • the method comprises the steps of: contacting the cell population with a vector comprising the protein or polypeptide comprising the transposase or a fragment thereof, and contacting the cell population with a vector comprising the nucleic acid molecule encoding the at least one transgene of interest.
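The sequential embodiment described above (transposase first, a culture interval, then the transgene) can be sketched as a small validated schedule. This is an illustration only: the class name `SequentialDelivery`, the field `delay_hours` and the wording of the steps are hypothetical and not part of the disclosure. Only the 1 hour to 72 hour range and the example interval of about 4 hours come from the text.

```python
from dataclasses import dataclass
from typing import List

# Bounds taken from the text: the interval ranges from 1 hour to 72 hours.
MIN_DELAY_H = 1.0
MAX_DELAY_H = 72.0


@dataclass(frozen=True)
class SequentialDelivery:
    """Hypothetical record of a two-step delivery schedule."""

    # About 4 hours is the example interval mentioned in the text.
    delay_hours: float = 4.0

    def __post_init__(self) -> None:
        # Reject intervals outside the range stated in the text.
        if not MIN_DELAY_H <= self.delay_hours <= MAX_DELAY_H:
            raise ValueError(
                f"delay must be between {MIN_DELAY_H:g} h and {MAX_DELAY_H:g} h"
            )

    def steps(self) -> List[str]:
        """Return the ordered steps of the schedule as plain text."""
        return [
            "contact cells with the vector carrying the transposase-encoding nucleic acid",
            f"culture cells for {self.delay_hours:g} h",
            "contact cells with the vector carrying the transgene-encoding nucleic acid",
        ]
```

Calling `SequentialDelivery().steps()` lists the three steps with the default interval, while an out-of-range value such as `SequentialDelivery(0.5)` raises a `ValueError`.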
  • “Cas9” or “Cas9 nuclease” refer to an RNA-guided nuclease comprising a Cas9 protein, or a fragment thereof (e.g., a protein comprising an active or inactive DNA cleavage domain of Cas9, and/or the gRNA binding domain of Cas9).
  • a Cas9 nuclease is also referred to sometimes as a casn1 nuclease or a CRISPR (clustered regularly interspaced short palindromic repeat)-associated nuclease.
  • CRISPR is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids).
  • CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). In type II CRISPR systems, correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc) and a Cas9 protein. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA. Subsequently, the Cas9/crRNA/tracrRNA complex endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer.
  • tracrRNA trans-encoded small RNA
  • rnc endogenous ribonuclease 3
  • the target strand not complementary to crRNA is first cut endonucleolytically, then trimmed 3'-5' exonucleolytically.
  • DNA-binding and cleavage typically requires protein and both RNAs.
  • single guide RNAs (“sgRNA” or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species.
  • Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self vs. non-self.
  • Cas9 nuclease sequences and structures are well known to those of skill in the art. Cas9 orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus.
  • Exogenous refers to any molecule that is not naturally present in a cell or organism of interest, but which can be introduced thereinto by one or more genetic, biochemical or other methods.
  • the natural presence of a molecule in a cell or organism may also be determined with respect to the particular developmental stage and environmental conditions thereof.
  • a molecule that is present only during embryonic development of muscle is an exogenous molecule with respect to an adult muscle cell.
  • a molecule induced by heat shock is an exogenous molecule with respect to a non-heat-shocked cell.
  • An exogenous molecule can comprise, e.g., a functioning version of a malfunctioning endogenous molecule or a malfunctioning version of a normally functioning endogenous molecule.
  • Endogenous refers to any molecule that is normally present in a cell or organism, at a particular developmental stage under particular environmental conditions.
  • Fusion protein refers to a single-chain hybrid polypeptide which comprises two or more amino acid sequences fused together (i.e., from two or more different proteins and/or peptides). The two or more amino acid sequences can be fused together via a direct peptidic bond or indirectly through a peptidic linker. A fusion protein may be in particular fully encoded by a single nucleic acid sequence.
  • Gene typically refers to a DNA region encoding a protein (i.e., a coding region).
  • the term may also include DNA regions which do not per se encode a protein (i.e., a non-coding region).
  • the latter include, e.g., regions transcribed into functional non-coding RNA molecules (e.g., transfer RNA, ribosomal RNA, regulatory RNA, etc.).
  • Other non-coding regions regulate the transcription and translation of coding regions (i.e., regulatory elements), or serve as architectural elements (e.g., scaffold/matrix attachment region), as origins of DNA replication, as centromeres or telomeres, etc.
  • Regulatory elements include, without limitations, promoter sequences, terminators, translational regulatory sequences (e.g., ribosome binding sites [RBS] and internal ribosome entry sites [IRES]), enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
  • Identity, when used in a relationship between the sequences of two or more amino acid sequences, or of two or more nucleic acid sequences, refers to the degree of sequence relatedness between amino acid sequences or nucleic acid sequences, as determined by the number of matches between strings of two or more amino acid residues or nucleic acid residues. “Identity” measures the percent of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model or computer program (i.e., “algorithms”). Identity of related amino acid sequences or nucleic acid sequences can be readily calculated by known methods. Such methods include, but are not limited to, those described in Lesk A. M. (1988).
  • Preferred methods for determining identity are designed to give the largest match between the sequences tested. Methods of determining identity are described in publicly available computer programs. Preferred computer program methods for determining identity between two sequences include the GCG program package, including GAP (Genetics Computer Group, University of Wisconsin, Madison, WI; Devereux et al., 1984. Nucleic Acids Res. 12(1 Pt 1):387-95), BLASTP, BLASTN, and FASTA (Altschul et al., 1990. J Mol Biol. 215(3):403-10). The BLASTX program is publicly available from the National Center for Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul et al. NCB/NLM/NIH Bethesda, Md. 20894). The well-known Smith-Waterman algorithm may also be used to determine identity.
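As an illustration of the definition of identity given above, the following minimal sketch computes percent identity from two sequences that have already been aligned, for instance by one of the programs cited above. The function names, the `-` gap symbol and the toy sequences are assumptions made for this example. Using the shorter ungapped sequence as the denominator follows the wording of the definition; alignment programs may normalize differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical aligned residues relative to the shorter sequence.

    Both inputs must be rows of the same pairwise alignment, so they have
    equal length and use `gap` to mark alignment gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Count alignment columns where both rows carry the same residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Length of the shorter sequence once gap symbols are removed.
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * matches / shorter


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """Check an identity cutoff such as the 75 % figure used in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment (hypothetical sequences, not taken from any SEQ ID NO).
query = "MKTAYIAK"
subject = "MKTA-IAR"
print(round(percent_identity(query, subject), 1), meets_threshold(query, subject))
```

In this toy alignment six columns match and the shorter sequence has seven residues, so the pair clears a 75 % cutoff.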
  • Insertion refers to the addition of a nucleic acid sequence into a second nucleic acid sequence or into a genome or a portion thereof. Insertion may be “specific”, “site-specific”, “targeted” or “on-targeted”: these adjectives define the insertion of a nucleic acid into a specific site of a second nucleic acid or into a specific site of a genome or a portion thereof (i.e., a site that has been purposely selected for insertion). Conversely, the adjectives “random”, “non-targeted” or “off-targeted” refer to non-specific and/or unintended insertion of a nucleic acid into an unwanted site. The terms “total” or “overall” refer to the total number of insertions.
  • Linker refers to a chemical group or a molecule linking two adjacent molecules or moieties.
  • Modified refers to a protein or nucleic acid sequence that is different than a corresponding unmodified protein or nucleic acid sequence.
  • “Mutated”, in connection with a sequence means that the sequence is different than a reference sequence, such as a wild-type sequence.
  • a mutated sequence comprises at least one of a substitution, an addition or a deletion of one or several residues by comparison to a reference sequence, such as a corresponding wild-type sequence.
  • “Mutation” refers to a substitution of a residue within a sequence, e.g., a nucleic acid or amino acid sequence, with another residue; and/or to a deletion or insertion of one or more residues within a nucleic acid or amino acid sequence. Mutations are typically described herein by identifying the original residue followed by the position of the residue within the sequence, then the identity of the newly substituted residue. Various methods for making amino acid substitutions (mutations) provided herein are well known in the art, and are provided by, for example, Green & Sambrook, 2012 (Molecular cloning: a laboratory manual (4 th Ed.). Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).
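The substitution notation described above (original residue, then position, then new residue) can be parsed mechanically. The sketch below is an illustration under stated assumptions: one-letter residue codes, 1-based positions, and a made-up label `D3N` applied to a made-up five-residue peptide. None of these names or values come from the disclosure.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # residue in the reference sequence
    position: int     # 1-based position in the reference sequence
    replacement: str  # newly substituted residue


# One uppercase letter, a position, one uppercase letter (e.g. "D3N").
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'D3N' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    position = int(match.group(2))
    if position < 1:
        raise ValueError("positions are 1-based")
    return Substitution(match.group(1), position, match.group(3))


def apply_substitution(sequence: str, label: str) -> str:
    """Return a copy of `sequence` carrying the substitution in `label`."""
    sub = parse_substitution(label)
    if sub.position > len(sequence):
        raise ValueError("position lies beyond the end of the sequence")
    # Guard against applying a label to the wrong reference sequence.
    if sequence[sub.position - 1] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {sequence[sub.position - 1]}"
        )
    return sequence[: sub.position - 1] + sub.replacement + sequence[sub.position:]


# Hypothetical peptide and label, for illustration only.
print(apply_substitution("MKDLV", "D3N"))
```

Checking the original residue before substituting catches labels that are numbered against a different reference sequence.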
  • Nuclease refers to an enzyme catalyzing the hydrolysis of nucleic acids within a nucleic acid sequence. Nuclease activity can result in single-stranded or double-stranded breaks in nucleic acid molecules, wherein the nucleic acid molecule can be DNA or RNA.
  • Nucleic acid molecule/sequence and nucleotide sequence may be used interchangeably to refer to any molecule composed of, or comprising, monomeric nucleotides.
  • a nucleic acid may be an oligonucleotide or a polynucleotide; it can be a DNA, an RNA, or a mix thereof. It can be chemically modified or artificial; e.g., it encompasses peptide nucleic acids (PNA), morpholinos and locked nucleic acids (LNA), as well as glycol nucleic acids (GNA) and threose nucleic acid (TNA).
  • PNA peptide nucleic acids
  • LNA locked nucleic acids
  • GNA glycol nucleic acids
  • TNA threose nucleic acid
  • These nucleic acids are distinguished from naturally occurring DNA or RNA by changes in the backbone of the molecule.
  • phosphorothioate nucleotides may be used.
  • Other deoxynucleotide analogs include, without limitation, methylphosphonates, phosphoramidates, phosphorodithioates, N3'P5' phosphoramidates and oligoribonucleotide phosphorothioates and their 2’O-allyl analogs and 2’O-methylribonucleotide methylphosphonates which may be used in a nucleic acid of the disclosure.
  • “Polypeptide”, “peptide”, “protein” and “amino acid sequence” are used interchangeably to refer to a polymer of amino acid residues. Unless specified, a polymer of amino acid residues can be any length. The term also applies to amino acid polymers in which one or more amino acids are chemical analogues or modified derivatives of corresponding naturally-occurring amino acids.
  • Prevention and any declension thereof refers to prophylactic and preventative measures, wherein the object is to reduce the chances that a subject will develop a given pathologic condition or disorder over a given period of time. Such a reduction may be reflected, e.g., in a delayed onset of at least one symptom of the pathologic condition or disorder in the subject.
  • Specificity refers to the ability to selectively bind a sequence which shares a degree of sequence identity to a selected sequence.
  • Subject refers to a mammal, preferably a human.
  • a subject may be a “patient”, i.e., a warm-blooded animal, more preferably a human, who/which is awaiting the receipt of, or is receiving medical care or was/is/will be the object of a medical procedure, or is monitored for the development of a disease.
  • the term “mammal” refers here to any mammal, including humans, domestic and farm animals, and zoo, sports, or pet animals, such as dogs, cats, cattle, horses, sheep, pigs, goats, rabbits, etc.
  • the mammal is a primate, more preferably a human.
  • Transduction and any declension thereof refers to the introduction of one or more nucleic acid molecules (DNA and/or RNA) into one or more cells using a viral vector carrier, e.g., a virus, a viral particle or a viral vector, including without limitation, retroviruses (including lentiviruses), adenoviruses, adeno-associated viruses (AAV), and vectors derived therefrom.
  • AAV adeno-associated viruses
  • Transfection and any declension thereof refers to the introduction of one or several nucleic acid molecules (DNA and/or RNA) into one or more cells by non-viral means, whether in vitro or in vivo.
  • transfection refers to any method, technique or vehicle known in the art that facilitates or increases cytoplasmic transduction of a nucleic acid molecule or cargo. Methods for transfection are well known in the art and include, e.g., lipofection, PEI and electroporation.
  • Transgene refers to an exogenous nucleic acid sequence, in particular an exogenous DNA or cDNA encoding a gene product.
  • the gene product may be an RNA, peptide or protein.
  • the transgene may include or be associated with one or more operational sequences to facilitate or enhance expression, such as a promoter, enhancer(s), response element(s), reporter element(s), insulator element(s), polyadenylation signal(s) and/or other functional elements.
  • Embodiments of the disclosure may utilize any known suitable promoter, enhancer(s), response element(s), reporter element(s), insulator element(s), polyadenylation signal(s) and/or other functional elements, unless specified otherwise. Suitable elements and sequences will be well known to those skilled in the art.
  • Transposase refers to an enzyme that binds to the end of a transposon and catalyzes its movement to another part of the genome by a cut-and-paste mechanism, or a replicative transposition mechanism.
  • the transposase may be a fragment of a transposase, e.g., an ITR-binding domain or a functional domain, preferably an ITR-binding domain.
  • “Treatment”, “alleviation”, “curation” and any declensions thereof refer to a therapeutic treatment, excluding prophylactic or preventative measures; wherein the object is to slow down, lessen, stop or even reverse (either partially or totally) the evolution of a targeted pathologic condition or disorder.
  • Those in need of treatment include those already with the disorder as well as those suspected to have the disorder.
  • a subject is successfully “treated” for the targeted pathologic condition or disorder if, after receiving treatment, they show observable and/or measurable reduction in or absence of one or more symptoms associated with the pathologic condition or disorder; relief to some extent; reduced morbidity and/or mortality; and/or improvement in quality-of-life issues.
  • the above parameters for assessing successful treatment and improvement in the disease are readily measurable by routine procedures familiar to a physician.
  • Vector refers to any polynucleotide that can carry, e.g., a second polynucleotide of interest, and e.g., which can transfer gene sequences to target cells.
  • the term includes cloning and expression vehicles, as well as integrating vectors.
  • the present invention thus relates to a method of increasing the expression of a nucleic acid molecule encoding at least one transgene of interest in a cell population.
  • It also relates to a method of increasing nuclear localization of a nucleic acid molecule encoding at least one transgene of interest in a cell population.
  • It also relates to a method of editing the genome of a cell population, in particular by inserting a nucleic acid molecule encoding at least one transgene of interest in the genome of the cell population.
  • It also relates to a method of transfecting a nucleic acid molecule encoding at least one transgene of interest in a cell population, preferably a non-dividing cell population.
  • the nucleic acid molecule encoding the at least one transgene of interest is transfected in the cell population by at least one technique selected from the group comprising or consisting of lipofection, electroporation, sonication, nanoparticles, microinjection, PEI, and viral vector infection, including non-integrative and integrative viral vector infection.
  • the invention is based on the observation by the Inventors that a cytoplasmic transposase was capable of delivering a nucleic acid to the nucleus of a cell, thereby increasing its nuclear localization.
  • a cytoplasmic transposase is capable of translocating, or of promoting translocation, of a nucleic acid to the nucleus of a cell, including a quiescent cell. Increased nuclear localization thus induces increased expression levels of the nucleic acid.
  • the Inventors observed that increased nuclear localization by the transposase does not rely on its enzymatic activity.
  • a DNA molecule encoding the at least one transgene of interest when delivered to the cytoplasm of a cell, will be sensed by the immune system of the cell that will induce a response against the presence of a foreign DNA (e.g., inflammatory response). Besides, the DNA molecule encoding the at least one transgene of interest will also have to pass the nuclear envelope in order to reach the nucleus. These problems are overcome when the DNA molecule encoding the at least one transgene of interest is translocated by a cytoplasmic transposase.
  • these methods comprise contacting a cell population with: the nucleic acid molecule encoding the at least one transgene of interest, and a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same.
  • the nucleic acid molecule encoding the at least one transgene of interest is delivered to the cell population by transfection.
  • the nucleic acid molecule encoding the at least one transgene of interest is transfected in the cell population by at least one technique selected from the group comprising or consisting of lipofection, electroporation, sonication, nanoparticles, microinjection, PEI, and viral vector infection, including non-integrative and integrative viral vector infection.
  • the cell population may be contacted with the nucleic acid molecule encoding the at least one transgene of interest, and the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same, in any order.
  • the cell population is contacted with the nucleic acid molecule encoding the at least one transgene of interest concomitantly with the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same.
  • the cell population is contacted with the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same prior to being contacted with the nucleic acid molecule encoding the at least one transgene of interest.
  • the cell population is contacted with the nucleic acid molecule encoding the at least one transgene of interest prior to being contacted with the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same.
  • nucleic acid molecule encoding the at least one transgene of interest refers to a nucleic acid sequence to be inserted in the genome of a cell, preferably of a eukaryotic cell, more preferably of a mammalian cell (including of a human cell), that encodes at least one product of interest.
  • the product of interest may be a protein or a fragment thereof; in this case, the transgene of interest is said to be a coding nucleic acid sequence.
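  • alternatively, the product of interest may be a non-coding RNA, such as, e.g., a transfer RNA, a ribosomal RNA, a small RNA, a long non-coding RNA, etc.; in this case, the transgene of interest is said to be an “RNA gene” or “non-coding RNA” gene.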
  • the transgene of interest is a nucleic acid sequence encoding a peptide or protein (e.g., without limitation, an enzyme, a transcription factor, a growth factor, a trophic factor, a hormone, a cytokine, an antibody, an antigen, a receptor, an immune regulator, a differentiation factor, a suicide protein, a cell-cycle modifying protein, an anti-proliferative protein, an angiogenic factor, an anti-angiogenic factor, a genome editor, a nuclease, a recombinase, a transposase, a neurotransmitter, and a reporter, including any precursor thereof, as well as fusion proteins).
  • the sequence of interest may typically be (or be derived from) an mRNA, a cDNA, a gDNA, a synthetic nucleic acid, or any combinations thereof.
  • the transgene of interest is alternatively a nucleic acid sequence of a non-coding RNA.
  • examples of non-coding RNAs include, but are not limited to, transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), SmY RNAs, small Cajal body-specific RNAs (scaRNAs), guide RNAs (gRNAs), Y RNAs, telomerase RNA component (TERC), spliced leader RNAs (SL RNAs), catalytic RNAs (i.e., ribozymes; such as, e.g., ribonuclease P, ribonuclease MRP, and the like), antisense RNAs (aRNAs), cis-natural antisense transcript (cis-NAT), CRISPR RNAs (crRNAs), long non-coding RNAs (lncRNAs), microRNAs (miRNAs), piwi-interacting RNAs (p
  • examples of transgenes of interest include any nucleic acid sequence encoding a molecule of therapeutic interest, such as any nucleic acid sequence encoding a peptide or protein, or a non-coding RNA, that is lacking, deficient and/or non-functional in a subject.
  • the nucleic acid molecule encoding the at least one transgene of interest comprises the at least one transgene and at least one regulatory element.
  • regulatory elements include, without limitation, promoters, enhancers, silencers, insulators and the like.
  • the at least one regulatory element is located upstream, i.e., in 5’, of the at least one transgene of interest.
  • the nucleic acid molecule encoding the at least one transgene of interest may be double-stranded or single-stranded.
  • the nucleic acid molecule encoding the at least one transgene of interest may be a deoxyribonucleic acid (DNA) molecule, a ribonucleic acid (RNA) molecule, or a mix thereof; preferably, the nucleic acid molecule encoding the at least one transgene of interest is a DNA molecule.
  • the nucleic acid molecule encoding the at least one transgene of interest typically comprises natural nucleotides. It may however also comprise non-natural nucleotides.
  • a “natural nucleotide” refers to adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U).
  • non-natural nucleotides refers to chemically modified A, T, U, C or G nucleotides.
  • the nucleic acid molecule encoding the at least one transgene of interest has a length of at least 10 base pairs (bp), such as at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1 000, 2 000, 3 000, 4 000, 5 000, 6 000, 7 000, 8 000, 9 000, 10 000, 20 000, 30 000, 40 000, 50 000, 60 000, 70 000, 80 000, 90 000, 100 000 bp or more.
  • the nucleic acid molecule encoding the at least one transgene of interest is a DNA molecule selected from the group comprising or consisting of a complementary DNA (cDNA) and a genomic DNA (gDNA).
  • the DNA molecule is a cDNA.
  • the DNA molecule is a gDNA.
  • the nucleic acid molecule encoding the at least one transgene of interest further comprises at least one Inverted Terminal Repeat (ITR) sequence.
  • an “ITR sequence” refers to a nucleic acid sequence naturally found at the extremities of eukaryotic transposable elements (or transposons). They are referred to as “5’ ITR” and “3’ ITR”. Typically, the 5’ ITR and the 3’ ITR are complementary and capable of forming a hairpin structure. ITR sequences are useful for the recognition of the transposon by a transposase enzyme. In some embodiments, the ITR sequences are palindromic.
  • ITR sequences include SEQ ID NOs: 55 to 60, as well as complementary sequences thereof.
  • SEQ ID NO: 56 ccctagaaagatagtctgcgtaaaattgacgcatgagataatcaatattgtgacgtacgttaa
  • SEQ ID NO: 60 catgcgtcaattttacgcatgattatctttaacgtacgtacgtcacaatatgattatctttctaggg
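Since the ITR sequences are recited together with their complementary sequences, the following minimal sketch shows how a reverse complement could be derived from a listed sequence. The helper name `reverse_complement` is hypothetical and not part of the source; only SEQ ID NO: 56 is copied from the text above.

```python
# Minimal sketch (illustrative helper, not taken from the source):
# derive the reverse complement of a DNA sequence written with A/C/G/T.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    invalid = set(seq) - set("acgtACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement every base, then reverse the strand orientation.
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 56 as printed above.
seq_id_56 = "ccctagaaagatagtctgcgtaaaattgacgcatgagataatcaatattgtgacgtacgttaa"

if __name__ == "__main__":
    print(reverse_complement(seq_id_56))
```

Applying the function twice returns the input sequence, which is a quick sanity check for any ITR entered by hand.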
  • the at least one ITR sequence interacts with, or non-covalently binds to, the protein or polypeptide comprising a transposase or a fragment thereof.
  • the protein or polypeptide comprising a transposase or a fragment thereof specifically recognizes and binds to the at least one ITR sequence.
  • the at least one ITR sequence is adjacent to the transgene of interest.
  • by “adjacent”, it is meant that no nucleotide separates the transgene of interest from the at least one ITR sequence.
  • the at least one ITR sequence can be separated from the transgene of interest by at least one nucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 nucleotides or more.
  • the nucleic acid molecule encoding the at least one transgene of interest comprises two ITR sequences.
  • the two ITR sequences flank the transgene of interest.
  • the two ITR sequences are directly flanking the transgene of interest, i.e., no nucleotide separates the transgene of interest from the two flanking ITR sequences.
  • the transgene of interest can be separated from at least one of the two ITR sequences, or from both ITR sequences, by at least one nucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 nucleotides or more.
  • the nucleic acid molecule encoding the at least one transgene of interest is either stably or transiently expressed by the cell population, preferably stably expressed.
  • the nucleic acid molecule encoding the at least one transgene of interest is transiently expressed.
  • the nucleic acid molecule encoding the at least one transgene of interest is transfected.
  • the transfection comprises the use of at least one vehicle and/or method that facilitates or increases cytoplasmic transduction of the nucleic acid molecule encoding at least one transgene of interest. Examples of transfection techniques include, without limitation, lipofection, electroporation, sonication, nanoparticles, microinjection, PEI, and viral vector infection, including non-integrative and integrative viral vector infection.
  • the nucleic acid molecule encoding the at least one transgene of interest is a DNA molecule which is inserted into the genome of the cells of the cell population. It is to be understood that the term “inserted in the genome” implies that the DNA molecule is replicated alongside the genome of the cell during mitosis.
  • the transgene of interest may be an exogenous gene or an endogenous gene.
  • the transgene of interest is an exogenous gene.
  • exogenous gene refers to a gene that is not naturally present in a cell, but can be introduced into the cell by one or more genetic, biochemical or other methods. Natural presence in the cell may be determined with respect to the particular developmental stage and environmental conditions of the cell. For instance, a molecule that is present only in a cell during embryonic development is exogenous with respect to an adult stage cell. Similarly, a molecule induced by heat-shocking a cell is exogenous with respect to a non-heat- shocked cell.
  • the transgene of interest is an endogenous gene.
  • endogenous gene refers to a gene of which at least one copy is naturally present in the genome of the population of cells.
  • the at least one copy may be an allele, i.e., a version of the gene comprising at least one mutation.
  • expression of a transgenic endogenous gene increases the overall expression levels (i.e., transcripts) of the gene.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a vector, such as, without limitation, a plasmid, a fosmid, a cosmid, an artificial chromosome or a viral vector.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a plasmid.
  • the plasmid may be circular or linear, preferably circular.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a fosmid.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a cosmid.
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in an artificial chromosome (e.g., human artificial chromosome).
  • the nucleic acid molecule encoding the at least one transgene of interest is comprised in a viral vector.
  • the viral vector may be a DNA virus-based vector.
  • DNA virus-based vectors include vectors derived from viruses from the Duplodnaviria, Monodnaviria or Varidnaviria realms.
  • the DNA virus-based vector is derived from a virus from the Duplodnaviria realm. Viruses from the Duplodnaviria realm include viruses of the Herpesvirales order. In some embodiments, the DNA virus-based vector is therefore a vector derived from Herpesviridae, such as, e.g., from a herpes simplex virus.
  • the DNA virus-based vector is derived from a virus from the Monodnaviria realm. Viruses from the Monodnaviria realm include viruses of the Papillomaviridae and Polyomaviridae families. In some embodiments, the DNA virus-based vector is therefore a vector derived from a Papillomaviridae or a Polyomaviridae.
  • the DNA virus-based vector is derived from a virus from the Varidnaviria realm.
  • Viruses from the Varidnaviria realm include viruses of the Adenoviridae and Poxviridae families.
  • the DNA virus-based vector is therefore a vector derived from an Adenoviridae, such as, e.g., from an adenovirus; or derived from Poxviridae, such as, e.g., from a vaccinia virus.
  • the nucleic acid molecule encoding the at least one transgene of interest can be excised from the vector. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest can be excised from the vector by the at least one ITR sequence, typically by a transposase, or variant or fragment thereof.
  • the protein or polypeptide comprising the transposase or a fragment thereof increases the nuclear localization of the nucleic acid molecule encoding the at least one transgene of interest.
  • “increases the nuclear localization” means that the nuclear localization of the nucleic acid molecule encoding the at least one transgene of interest is increased at least 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 60-fold, 70-fold, 80-fold, 90-fold, 100-fold or more in presence of the protein or polypeptide comprising the transposase or a fragment thereof, compared to a condition in absence of the protein or polypeptide comprising the transposase or
  • the protein or polypeptide comprising the transposase or a fragment thereof translocates, or promotes translocation of, the nucleic acid molecule encoding the at least one transgene of interest to the nucleus.
  • the transposase or a fragment thereof (i) interacts and/or non-covalently binds with the nucleic acid molecule encoding the at least one transgene of interest, and (ii) translocates to the nucleus.
  • the transposase or a fragment thereof does not insert and/or transpose the nucleic acid molecule encoding the at least one transgene of interest into the genome.
  • the transposase or a fragment thereof increases at least 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 60-fold, 70-fold, 80-fold, 90-fold, 100-fold or more, the passage of the nucleic acid molecule encoding the at least one transgene of interest through the nuclear envelope.
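The fold-increase thresholds recited above compare a condition with the transposase to a condition without it. A minimal sketch of that comparison follows, assuming some quantitative readout of nuclear localization is available; the function names, the readout, and the numeric values are hypothetical and are not data from the source.

```python
# Minimal sketch (hypothetical names and values, not data from the source):
# compare a nuclear-localization readout with and without the transposase.

def fold_increase(with_transposase: float, without_transposase: float) -> float:
    """Ratio of the readout in presence vs. absence of the transposase."""
    if without_transposase <= 0:
        raise ValueError("the reference readout must be positive")
    return with_transposase / without_transposase


def meets_threshold(with_transposase: float,
                    without_transposase: float,
                    threshold: float = 1.2) -> bool:
    """True if the fold increase reaches the chosen threshold (default 1.2-fold)."""
    return fold_increase(with_transposase, without_transposase) >= threshold


if __name__ == "__main__":
    # Arbitrary illustrative readouts, in arbitrary units.
    print(fold_increase(30.0, 12.0))
    print(meets_threshold(30.0, 12.0, threshold=2.0))
```

The default of 1.2 mirrors the lowest threshold in the list; any other listed value can be passed as `threshold`.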
  • the transposase or a fragment thereof protects the nucleic acid molecule encoding the at least one transgene of interest from the cell’s sensing mechanisms and/or defense mechanisms. In some embodiments, the transposase or a fragment thereof does not induce a proinflammatory and/or inflammatory response.
  • the transposase is selected from the group consisting of hyperactive PiggyBac transposase, PiggyBac transposase, Sleeping Beauty transposase, SB11 transposase, Tol2 transposase, Mos1 transposase, and Frog Prince transposase.
  • the transposase is selected from the group consisting of hyperactive PiggyBac transposase and Sleeping Beauty transposase.
  • the transposase is a hyperactive PiggyBac transposase.
  • the transposase is a hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1.
  • the transposase is a modified hyperactive PiggyBac transposase.
  • by “modified hyperactive PiggyBac transposase”, it is referred to a transposase comprising one or more amino acid substitutions, typically no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions, as compared to the hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1.
  • a modified hyperactive PiggyBac transposase may comprise (i) one or more amino acid substitutions to increase excision activity as compared to the hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1, and/or (ii) one or more amino acid substitutions to decrease DNA binding activity as compared to the hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises an amino acid sequence at least 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, or 99 % identical to the sequence set forth in SEQ ID NO: 1.
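The percent-identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences are already aligned and of equal length, which is a simplification: a real comparison against SEQ ID NO: 1 would normally rely on a dedicated alignment tool. The function name and the toy sequences are hypothetical.

```python
# Minimal sketch (hypothetical helper; assumes pre-aligned, equal-length
# sequences, which real sequence comparisons generally do not guarantee).

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions carrying the same residue in both sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy 10-residue peptides differing at one position (not real sequences).
    print(percent_identity("MNDKRAAAAA", "MSDKRAAAAA"))
```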
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity selected among the amino acid mutations within the regions defined by the amino acid position numbers [194-200], [214-222], [434-442] or [446-456]; for example, amino acid substitution at the position D198, D201, R202, M212 and/or S213; said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity selected among the amino acid mutations at positions 450, 560, 564, 573, 589, 592, and/or 594; said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity selected among the amino acid mutations at position of M194 and/or D450, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1; preferably the amino acid substitution is M194V and/or D450N.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among the amino acid mutations at positions 254, 275, 277, 347, 372, 375, and/or 465; said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among R275, N347, R372, K375, R376, E377, and E380, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among R372, K375, R376, E377, and E380, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1; preferably the amino acid substitution is R372A, K375A, R376A, E377A, and/or E380A.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among N347, R372, and K375, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1; preferably the amino acid substitution is N347S, N347A, R372A, and/or K375A; more preferably the amino acid substitution is N347S or N347A.
  • the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity, as defined above; and one or more amino acid mutations to decrease DNA binding activity, as defined above.
  • the modified hyperactive PiggyBac transposase includes at least one amino acid substitution to increase excision activity at position D450, and at least two amino acid substitutions to decrease DNA binding activity at positions N347, R372 and K375; preferably said modified transposase of hyperactive PiggyBac comprises the double mutations N347S and D450N or triple mutations D450N, R372A and K375A, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified transposase of hyperactive PiggyBac comprises the double mutations N347S and D450N, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
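Substitutions such as N347S or D450N follow the usual notation of reference residue, 1-based position, and replacement residue, with combinations joined by a slash. The sketch below shows one way to parse and apply such labels; the function names and the toy four-residue sequence are hypothetical, and SEQ ID NO: 1 itself is not reproduced here.

```python
# Minimal sketch (hypothetical helpers and toy sequence, not SEQ ID NO: 1):
# parse substitution labels such as "N347S" and apply them to a sequence.
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label like 'D450N' into (reference, position, replacement)."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, labels: str) -> str:
    """Apply slash-separated substitutions (1-based positions) to a sequence."""
    residues = list(sequence)
    for label in labels.split("/"):
        reference, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Guard against numbering errors: the reference residue must match.
        if residues[position - 1] != reference:
            raise ValueError(f"{label}: found {residues[position - 1]!r} instead")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Toy example only: a double substitution on a four-residue peptide.
    print(apply_substitutions("MNDK", "N2S/D3N"))
```

Checking the reference residue before substituting catches off-by-one numbering mistakes, which matter here because every position is defined relative to SEQ ID NO: 1.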
  • the modified hyperactive PiggyBac transposase as disclosed in the previous embodiments further comprises at least one mutation in the region defined by the amino acid position numbers [158-169], for example A166S; and/or at least one mutation at position Y527, R518, K525, N463.
  • the modified hyperactive PiggyBac transposase further comprises one or more amino acid substitution at positions 34, 43, 117, 202, 230, 245, 268, 275, 277, 287, 290, 315, 325, 341, 346, 347, 350, 351, 356, 357, 388, 409, 411, 412, 432, 447, 460, 461, 465, 517, 560, 564, 571, 573, 576, 586, 587, 589, 592, and/or 594, the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises one of the following amino acid substitution or combination of amino acid substitutions: V34M, T43I, Y177H, R202K, S230N, R245A, D268N, K287A, K290A, K287A/K290A, R315A, G325A, R341A, D346N, N347A, N347S, T350A, S351E, S351P, S351A, K356E, N357A, R388A, K409A, A411T, K412A, K432A, D447A, D447N, D450N, R460A, K461A, W465A, S517A, T560A, S564P, S571N, S573A, K576A, H586A, I587A, M589V, S592G, F594L,
  • the modified hyperactive PiggyBac transposase comprises one of the following amino acid substitution or combination of amino acid substitutions:
  • modified hyperactive PiggyBac transposases include modified hyperactive PiggyBac comprising one of the following combinations of amino acid substitutions:
  • R275A/325A/R372A/T560A the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises one of the following combinations of amino acid substitutions:
  • the modified hyperactive PiggyBac transposase comprises one of the following combinations of amino acid substitutions: N347A/D450N, R275A/R277A/N347S/R372A/D450N/T560A/S564P/F594L, R245A/N347S/R372A/D450N/T560A/S564P/S573A/S592G, R277A/G325A/N347A/K375A/D450N/T560A/S564P/S573A/S592G/F594L, G325A/N347S/K375A/D450N/S573A/M589V/S592G, S230N/R277A/N347S/K375A/D450N, G325A/N347S/S351A/K375A/D450N/S573A/M589V/S592G, or Y177H/R275A/G325A/K375A/D450N/T560A/S564P/S592G
  • the modified hyperactive PiggyBac transposase comprises the R372A/K375A/D450N substitutions, said position numbers corresponding to the amino acid numbers of the hyperactive PiggyBac with SEQ ID NO: 1.
  • Said modified transposase has an amino acid sequence of SEQ ID NO: 4.
  • the modified hyperactive PiggyBac transposase has an amino acid sequence selected from the group comprising or consisting of SEQ ID NOs: 5-26.
  • the modified hyperactive PiggyBac transposase has an amino acid sequence selected from the group comprising or consisting of SEQ ID NOs: 5-13.
  • the modified hyperactive PiggyBac transposase has an amino acid sequence selected from the group comprising or consisting of SEQ ID NOs: 14-26.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in the conserved catalytic triad, e.g., at amino acid 268 and/or 346 (e.g., D268N and/or D346N) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are critical for excision, e.g., at amino acid 287, 287/290 and/or 460/461 (e.g., K287A, K287A/K290A, and/or R460A/K461A) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in target joining, e.g., at amino acid 351, 356, and/or 379 (e.g., S351E, S351P, S351A, and/or K356E) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are critical for integration, e.g., at amino acid 560, 564, 571, 573, 589, 592, and/or 594 (e.g., T560A, S564P, S571N, S573A, M589V, S592G, and/or F594L) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in alignment, e.g., at amino acid 325, 347, 350, 357 and/or 465 (e.g., G325A, N347A, N347S, T350A and/or W465A) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are well conserved, e.g., at amino acid 576 and/or 587 (e.g., K576A and/or I587A) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in Zn²⁺ binding, e.g., at amino acid 586 (e.g., H586A) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in integration, e.g., 315, 341, 372, and/or 375 (e.g., R315A, R341A, R372A, and/or K375A) corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac is selected for its high specificity of DNA integration into a genome compared to hyperactive PiggyBac.
  • the modified hyperactive PiggyBac comprises an amino acid sequence having one or more of the modifications disclosed herein relative to SEQ ID NO: 1, and retains at least 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 % or more sequence identity with any of SEQ ID NOs: 5-26.
  • the modified hyperactive PiggyBac transposase may comprise a mutation of one or more of amino acids selected from amino acid 245, 275, 277, 325, 347, 351, 372, 375, 388, 450, 465, 560, 564, 573, 589, 592, and 594, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase mutation may comprise one or more of the amino acid substitutions selected from R245A, R275A, R277A, R275A/R277A, G325A, N347A, N347S, S351E, S351P, S351A, R372A, K375A, R388A, D450N, W465A, T560A, S564P, S573A, M589V, S592G, and F594L corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises an amino acid substitution D450N corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises the amino acid substitutions R245A and D450, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises the amino acid substitutions R245A, G325A, and S573P, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises the amino acid substitutions R245A, G325A, D450 and S573P, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises the amino acid substitution N347S or N347A, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises the amino acid substitutions N347S and D450N, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • the modified hyperactive PiggyBac transposase comprises the amino acid substitutions N347A and D450N, corresponding to the amino acid numbering of SEQ ID NO: 1.
  • This modified hyperactive PiggyBac transposase comprises the amino acid sequence of SEQ ID NO: 14.
  • the modified hyperactive PiggyBac transposase comprises or consists of an amino acid sequence having at least 75 %, 80 %, 85 %, 90 %,
  • SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24,
  • the modified hyperactive PiggyBac transposase comprises or consists of an amino acid sequence having at least 75 %, 80 %, 85 %, 90 %,
  • SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26.
  • the modified transposase is not a Himar1C9 mutant.
  • the transposase is a Sleeping Beauty transposase.
  • the transposase is a hyperactive Sleeping Beauty SB 100 transposase with the amino acid sequence of SEQ ID NO: 61.
  • the transposase is a modified hyperactive Sleeping Beauty SB 100 transposase, comprising at least one amino acid substitution compared with the amino acid sequence of an unmodified Sleeping Beauty transposase, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more substitutions.
  • the modified hyperactive Sleeping Beauty SB 100 transposase comprises one or more of the amino acid substitutions at a position selected from C176, H187, I212, P247 and K248, corresponding to the amino acid numbering of SEQ ID NO: 61.
  • the modified hyperactive Sleeping Beauty SB 100 transposase comprises one or more of the amino acid substitutions selected from C176S, H187V/P, I212S, P247R/S, and K248A/C/I/L/M/N/R/S/T/V, corresponding to the amino acid numbering of SEQ ID NO: 61.
  • the transposase or the fragment thereof has decreased catalytic activity.
  • “decreased catalytic activity” means a 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold decrease, or more, compared to a wild-type transposase and/or to a hyperactive transposase.
  • the transposase or the fragment thereof is catalytically dead. In some embodiments, the transposase or the fragment thereof does not exert any catalytic activity. In some embodiments, catalytically dead transposases retain their ability to bind to other proteins, polypeptides and/or nucleic acid molecules.
  • the catalytically dead transposase has at least 75 %, 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 %, 99.5 %, 99.9 % or more sequence identity with the catalytically dead hyperactive PiggyBac transposase (dead hyPB) with SEQ ID NO: 3.
  • the dead hyPB comprises or consists of the amino acid sequence with SEQ ID NO: 3.
  • the protein or polypeptide comprises a transposase fragment.
  • the transposase fragment comprises or consists of at least one transposase functional domain.
  • the transposase fragment comprises or consists of an ITR-binding domain.
  • ITR-binding domains include, without limitation, SEQ ID NOs: 62 to 64, wherein SEQ ID NOs: 62 and 63 are ITR-binding domains of the hyperactive PiggyBac transposase and SEQ ID NO: 64 is an ITR-binding domain of the Sleeping Beauty transposase (also called “N57 targeting domain”).
  • the transposase fragment is a fragment of a hyperactive PiggyBac transposase or of a Sleeping Beauty transposase.
  • the transposase fragment is a fragment of a hyperactive PiggyBac transposase, preferably comprising an ITR-binding domain.
  • the transposase fragment is a fragment of a Sleeping Beauty transposase, preferably selected from the group consisting of the SB 100 domain and the N57 domain of Sleeping Beauty transposase.
  • the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least one additional polypeptide or protein.
  • the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least two additional polypeptides or proteins.
  • the fusion protein comprises a single, contiguous polypeptide chain.
  • the fusion protein further comprises at least one linker, in particular at least one peptide linker, between two polypeptides or proteins of the fusion protein.
  • Exemplary linkers include, without limitation, (G)n, (GS)n, (GGS)n, (GGGS)n (with SEQ ID NO: 49), (GGGGS)n (with SEQ ID NO: 50), (EAAAK)n (with SEQ ID NO: 51), XTEN linkers, and (XP)n linkers, as well as any combinations thereof, wherein n is an integer between 1 and 50.
  • the peptide linker is a glycine/serine-rich linker.
  • the peptide linker has an amino acid sequence selected from the group comprising or consisting of SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45 and SEQ ID NO: 46.
  • the peptide linker is an XTEN linker comprising or consisting of SEQ ID NO: 47.
  • SEQ ID NO: 47 SGSETPGTSESATPES
  • the peptide linker comprises or consists of SEQ ID NO: 48.
  • the peptide linker comprises or consists of SEQ ID NO: 55.
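As an illustration of the repeat-unit linkers described above, the following sketch builds a linker of the form (unit)n and places it between two polypeptide sequences. It assumes that "between 1 and 50" is inclusive; the function names and the placeholder sequences in the usage note are hypothetical. The XTEN sequence is SEQ ID NO: 47 as given in the text.

```python
XTEN_LINKER = "SGSETPGTSESATPES"  # SEQ ID NO: 47, as recited in the text


def repeat_linker(unit: str, n: int) -> str:
    """Return the linker (unit)n, e.g. (GGGGS)3 -> GGGGSGGGGSGGGGS."""
    if not 1 <= n <= 50:  # assumption: the range 1 to 50 is inclusive
        raise ValueError("n must be an integer between 1 and 50")
    return unit * n


def join_with_linker(first: str, linker: str, second: str) -> str:
    """Concatenate two polypeptide sequences with a peptide linker in between."""
    return first + linker + second
```

For example, `join_with_linker("AAA", repeat_linker("GGGGS", 2), "CCC")` yields a single contiguous chain in which the two placeholder polypeptides are separated by a (GGGGS)2 linker.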
  • Methods which are well-known to those skilled in the art can be used to construct expression vectors containing the coding sequence of a fusion protein along with appropriate transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. See, for example, the techniques described in Sambrook et al., 2012 (Molecular cloning: A laboratory manual (4th ed.). Cold Spring Harbor Laboratory Press).
  • the expression vector can be part of a plasmid, virus, or may be a nucleic acid fragment.
  • the expression vector includes an expression cassette into which the polynucleotide encoding the fusion protein (i.e., the coding region) is cloned in operable association with a promoter and/or other transcription or translation control elements.
  • a “coding region” is a portion of nucleic acid which consists of codons translated into amino acids.
  • Although a “stop codon” (TAG, TGA, or TAA) is not translated into an amino acid, it may be considered to be part of a coding region, if present, but any flanking sequences, for example promoters, ribosome binding sites, transcriptional terminators, introns, 5’- and 3’-untranslated regions, and the like, are not part of a coding region.
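The definition of a coding region above can be sketched as follows: the sequence is read as consecutive codons, and a terminal stop codon (TAG, TGA, or TAA), if present, belongs to the coding region without contributing an amino acid. This is a minimal illustration assuming a DNA sequence given in frame from its first codon; the function names are hypothetical.

```python
from typing import List

STOP_CODONS = {"TAG", "TGA", "TAA"}  # the three stop codons named in the text


def split_codons(coding_region: str) -> List[str]:
    """Split an in-frame DNA coding region into consecutive codons."""
    seq = coding_region.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_translated_codons(coding_region: str) -> int:
    """Number of codons translated into amino acids (a terminal stop codon is excluded)."""
    codons = split_codons(coding_region)
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # part of the coding region, but not translated
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal stop codon found")
    return len(codons)
```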
  • Two or more coding regions can be present in a single polynucleotide construct, e.g., on a single vector, or in separate polynucleotide constructs, e.g., on separate (different) vectors.
  • any vector may contain a single coding region, or may comprise two or more coding regions, e.g., a vector of the present invention may encode one or more polypeptides, which are post- or co-translationally separated into the final proteins via proteolytic cleavage.
  • a vector, polynucleotide, or nucleic acid of the invention may encode heterologous coding regions, either fused or unfused to a polynucleotide encoding the fusion protein (fragment) of the invention, or variant or derivative thereof.
  • Heterologous coding regions include without limitation specialized elements or motifs, such as a secretory signal peptide or a heterologous functional domain.
  • An operable association is when a coding region for a gene product, e.g., a polypeptide, is associated with one or more regulatory sequences in such a way as to place expression of the gene product under the influence or control of the regulatory sequence(s).
  • Two DNA fragments (such as a polypeptide coding region and a promoter associated therewith) are “operably associated” if induction of promoter function results in the transcription of mRNA encoding the desired gene product and if the nature of the linkage between the two DNA fragments does not interfere with the ability of the expression regulatory sequences to direct the expression of the gene product or interfere with the ability of the DNA template to be transcribed.
  • a promoter region would be operably associated with a nucleic acid encoding a polypeptide if the promoter was capable of effecting transcription of that nucleic acid.
  • the promoter may be a cell-specific promoter that directs substantial transcription of the DNA only in predetermined cells.
  • Other transcription control elements besides a promoter, for example enhancers, operators, repressors, and transcription termination signals, can be operably associated with the polynucleotide to direct cell-specific transcription. Suitable promoters and other transcription control regions are disclosed herein. A variety of transcription control regions are known to those skilled in the art.
  • transcription control regions which function in vertebrate cells, such as, but not limited to, promoter and enhancer segments from cytomegaloviruses (e.g., the immediate early promoter, in conjunction with intron-A), simian virus 40 (e.g., the early promoter), and retroviruses (such as, e.g., Rous sarcoma virus).
  • transcription control regions include those derived from vertebrate genes such as actin, heat shock protein, bovine growth hormone and rabbit a-globin, as well as other sequences capable of controlling gene expression in eukaryotic cells.
  • tissue-specific promoters and enhancers as well as inducible promoters (e.g., promoters inducible by tetracyclines).
  • translation control elements include, but are not limited to ribosome binding sites, translation initiation and termination codons, and elements derived from viral systems (particularly an internal ribosome entry site, or IRES, also referred to as a CITE sequence).
  • the expression cassette may also include other features such as an origin of replication, and/or chromosome integration elements such as retroviral long terminal repeats (LTRs), or adeno-associated viral (AAV) inverted terminal repeats (ITRs).
  • Fusion proteins prepared as described herein may be purified by art-known techniques such as high-performance liquid chromatography, ion exchange chromatography, gel electrophoresis, affinity chromatography, size exclusion chromatography, and the like.
  • the actual conditions used to purify a particular protein will depend, in part, on factors such as net charge, hydrophobicity, hydrophilicity etc., and will be apparent to those having skill in the art.
  • For affinity chromatography purification, an antibody, ligand, receptor or antigen to which the fusion protein binds can be used.
  • the purity of the fusion protein can be determined by any of a variety of well-known analytical methods including gel electrophoresis, high pressure liquid chromatography, and the like.
  • the nucleic acid encoding the fusion protein may be expressed as a single nucleic acid molecule that encodes the entire fusion protein or as multiple (e.g., two or more) nucleic acid molecules that are co-expressed. Polypeptides encoded by nucleic acid molecules that are co-expressed may associate through, e.g., disulfide bonds or other means, to form a functional fusion protein.
  • the fusion protein comprises or consists of the transposase or the fragment thereof fused at the C-terminal end of the at least one protein or polypeptide, either directly or indirectly via a linker. In other embodiments, the fusion protein comprises or consists of the transposase or a fragment thereof fused at the N-terminal end of the at least one protein or polypeptide, either directly or indirectly via a linker.
  • the fusion protein is a triple fusion protein, i.e., the fusion protein comprises the transposase or the fragment thereof, and at least two additional polypeptides or proteins.
  • the fusion protein further comprises a nuclear localization sequence (NLS).
  • one or more NLS should have sufficient strength to drive the accumulation of the fusion protein in the nucleus of the cell.
  • the strength of the nuclear localization activity is determined by the number and position of NLSs, and one or more specific NLSs used in the fusion protein.
  • the NLSs may be located at the N-terminus and/or the C-terminus of the fusion protein.
  • the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs.
  • the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the N-terminus.
  • the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the C-terminus.
  • the fusion protein comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus.
  • each NLS may be selected independently of the other NLSs.
  • the fusion protein comprises two NLSs, for example, the two NLSs are located at the N-terminus and the C-terminus, respectively.
  • an NLS consists of one or more short sequences of positively charged lysine or arginine exposed on the surface of a protein, but other types of NLS are also known in the art.
  • NLSs include (M)KKRKV (with SEQ ID NO: 52), (M)PKKKRKV (with SEQ ID NO: 53), or (M)SGGSPKKKRKV (with SEQ ID NO: 54), wherein (M) denotes an initiator methionine which may be present when the NLS is located at the N-terminus or be post-translationally removed, and which is absent when the NLS is located at the C-terminus.
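The placement of NLSs and the (M) convention described above can be sketched as follows. The example uses the PKKKRKV core of SEQ ID NO: 53 and adds the initiator methionine only when an NLS sits at the N-terminus. The function name and the placeholder payload are hypothetical, and placing copies directly in tandem without linkers is an assumption made for brevity.

```python
NLS_CORE = "PKKKRKV"  # SEQ ID NO: 53 without the optional initiator methionine (M)


def add_nls(protein: str, n_terminal: int = 0, c_terminal: int = 0,
            nls: str = NLS_CORE) -> str:
    """Attach n_terminal and c_terminal tandem copies of an NLS to a protein sequence."""
    if n_terminal < 0 or c_terminal < 0:
        raise ValueError("NLS counts must be non-negative")
    # An initiator methionine precedes the NLS block only at the N-terminus;
    # NLS copies at the C-terminus never carry it.
    prefix = "M" + nls * n_terminal if n_terminal > 0 else ""
    return prefix + protein + nls * c_terminal
```

For instance, `add_nls("AAA", n_terminal=1, c_terminal=1)` returns a sequence with one NLS at each end of the placeholder payload `AAA`, corresponding to the two-NLS arrangement mentioned above.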
  • the fusion protein may also include other localization sequences, such as cytoplasmic localization sequences, chloroplast localization sequences, mitochondrial localization sequences, and the like, depending on the desired localization of the fusion protein in a cell.
  • the at least one additional polypeptide or protein is a nuclease.
  • the nuclease is an endonuclease or an exonuclease.
  • the nuclease is a deoxyribonuclease or a ribonuclease.
  • the at least one additional polypeptide or protein is an RNA-guided nuclease.
  • the at least one additional polypeptide or protein is a Cas nuclease.
  • the Cas nuclease is selected from the group comprising or consisting of Cas9, Cas12a (Cpf1), Cas12b, Cas12f, and CasX. It shall be understood that variants and functional fragments thereof are also encompassed, such as nickase Cas (nCas) or dead Cas (dCas) variants.
  • Cas9 nucleases include, without limitation, Streptococcus pyogenes Cas9 (SpCas9), Staphylococcus haemolyticus Cas9 (ShCas9), and Campylobacter jejuni Cas9 (CjCas9).
  • the Cas nuclease has at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the sequence of SpCas9 with SEQ ID NO: 27, ShCas9 with SEQ ID NO: 28, Cpf1 with SEQ ID NO: 29, CjCas9 with SEQ ID NO: 30, nCas9 with SEQ ID NO: 31, and/or nCas9 with SEQ ID NO: 32.
  • the Cas nuclease has at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of ShCas9 with SEQ ID NO: 28 or SpCas9 with SEQ ID NO: 27.
  • the Cas nuclease is ShCas9 with SEQ ID NO: 28. In some embodiments, the Cas nuclease is SpCas9 with SEQ ID NO: 27.
  • the at least one additional polypeptide or protein is a variant or a functional fragment of a Cas9 nuclease.
  • a Cas9 variant typically comprises one or several amino acid substitutions as compared to the wild-type amino acid sequence of said Cas9.
  • the Cas9 variant is humanized Cas9 (hCas9) or a functional fragment thereof.
  • the term “humanized Cas9” or “hCas9” refers to an optimized Cas9 protein sequence for expression in human cells.
  • hCas9 has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 33. In some embodiments, hCas9 has an amino acid sequence consisting of SEQ ID NO: 33.
  • the at least one additional polypeptide or protein is CasX.
  • CasX has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 34.
  • CasX has an amino acid sequence consisting of SEQ ID NO: 34.
  • the at least one additional polypeptide or protein is a dead Cas9 protein (dCas9).
  • dCas9 has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 35.
  • dCas9 has an amino acid sequence consisting of SEQ ID NO: 35.
  • the at least one additional polypeptide or protein is TnpB (Transposase B from transposon PsiTn554).
  • TnpB has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 36.
  • TnpB has an amino acid sequence consisting of SEQ ID NO: 36.
  • the at least one additional polypeptide or protein is Cas12f.
  • Cas12f protein is from the bacterium Acidibacillus sulfuroxidans (AsCas12f).
  • Cas12f has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 37.
  • Cas12f has an amino acid sequence consisting of SEQ ID NO: 37.
  • the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and a nuclease as described hereinabove.
  • the fusion protein has at least 75 %, 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 % or more amino acid sequence identity with SEQ ID NO: 2. In some embodiments, the fusion protein comprises or consists of the amino acid sequence of SEQ ID NO: 2.
  • the at least one additional polypeptide or protein is an aptamer-binding protein.
  • the at least one additional polypeptide or protein is an aptamer-binding protein selected from the group comprising or consisting of MS2 bacteriophage coat protein (MCP), PP7 coat protein (PCP), λN22 peptide and COM.
  • the at least one additional polypeptide or protein is the MS2 bacteriophage coat protein (MCP).
  • MCP has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 39.
  • MCP is capable of binding to a MS2 RNA tetraloop binding sequence (or “MS2 aptamer”).
  • MS2 aptamer has a nucleic acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more nucleic acid sequence identity with the nucleic acid sequence of SEQ ID NO: 40.
  • the fusion protein interacts, or is capable of interacting, covalently or non-covalently through MCP with a guide RNA (gRNA) molecule comprising at least one MS2 aptamer.
  • the method of the invention may further comprise contacting the cell population with a guide RNA (gRNA) fused to at least one aptamer.
  • the aptamer is an RNA sequence comprising a tetraloop.
  • tetraloop refers to a four-base hairpin loop motif. The term is used interchangeably herein with the terms “stem loop” or “hairpin loop”.
  • the aptamer is a MS2 RNA tetraloop sequence (or MS2 aptamer).
  • the MS2 aptamer has a nucleic acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more nucleic acid sequence identity with the nucleic acid sequence of SEQ ID NO: 40.
  • the aptamer comprises or consists of the nucleic acid sequence with SEQ ID NO: 40.
  • the gRNA is capable of forming a complex with an RNA-guided nuclease as described hereinabove, e.g., a Cas protein or fusion protein comprising a Cas protein.
  • the gRNA molecule interacts, or is capable of interacting, covalently or non-covalently through the at least one MS2 aptamer with an RNA-guided nuclease.
  • the gRNA is capable of targeting the RNA-guided nuclease as described hereinabove to a specific sequence or region of the genome of a cell, preferably a mammalian cell, more preferably a human cell.
  • the specific sequence targeted by the gRNA is adjacent to a protospacer adjacent motif (PAM) specific for the Cas protein.
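The requirement that the targeted sequence lie next to a PAM can be illustrated with a minimal scan of a DNA sequence for candidate target sites. The sketch assumes a 20-nucleotide protospacer followed immediately on its 3' side by a 5'-NGG-3' PAM, which is the motif commonly used with SpCas9; other Cas proteins recognize other PAMs, and the text does not fix a particular one. Only the given (forward) strand is scanned, and the function name is hypothetical.

```python
import re
from typing import List, Tuple


def find_candidate_targets(sequence: str, protospacer_len: int = 20,
                           pam_regex: str = "[ACGT]GG",
                           pam_len: int = 3) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each site followed directly by a matching PAM.

    Assumptions: forward strand only; default PAM pattern NGG (SpCas9-style).
    """
    seq = sequence.upper()
    hits = []
    last_start = len(seq) - protospacer_len - pam_len
    for start in range(0, last_start + 1):
        pam_start = start + protospacer_len
        pam = seq[pam_start:pam_start + pam_len]
        if re.fullmatch(pam_regex, pam):
            hits.append((start, seq[start:pam_start], pam))
    return hits
```

A fuller search would also scan the reverse complement and filter candidates for off-target matches elsewhere in the genome; both steps are omitted here.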
  • the specific sequence targeted by the gRNA may be within a safe harbor locus in a cell’s genome.
  • a “safe harbor locus” refers to a region of a cell’s genome, where the integrated material can be adequately expressed without perturbing endogenous gene structure or function. Safe harbor loci include, but are not limited to, AAVS1 (intron 1 of PPP1R12C), HPRT, H11, hRosa26, albumin and F-A region.
  • the safe harbor locus may be an exon or an intron of a ubiquitously expressed gene and/or of a gene with tissue specific expression (e.g., muscle).
  • Safe harbor loci can be selected from the group consisting of: exon 1, intron 1 or exon 2 of PPP1R12C; exon 1, intron 1 or exon 2 of HPRT; exon 1, intron 1 or exon 2 of hRosa26; or intron 1 of the albumin gene.
  • a safe harbor locus may also include a region of the genome devoid of endogenous genes and with open chromatin that allows for the expression of the inserted transgene without perturbing the genome structure or function.
  • the gRNA comprises 20, 25, 30, 35, 40, 45, 50 nucleotides or more.
  • the protein or polypeptide comprising the transposase or the fragment thereof is delivered to the cell in the form of a protein.
  • the protein or polypeptide comprising the transposase or the fragment thereof is delivered to the cell in the form of a nucleic acid encoding said transposase or fragment thereof.
  • the nucleic acid may be a ribonucleic acid (RNA) or a deoxyribonucleic acid (DNA).
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is an RNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a messenger RNA (mRNA) molecule.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof may be a DNA molecule.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a complementary DNA (cDNA).
  • the DNA molecule may be comprised within a vector, such as, e.g., a plasmid.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is linear or circular. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is double-stranded or single-stranded.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transfection or transformation.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transfection.
  • transfection techniques include, without limitation, lipofection, electroporation, sonication, nanoparticles, microinjection and viral vector infection, including non-integrative and integrative viral vector infection.
  • transfection is performed by lipofection.
  • Means of performing lipofection are known in the art and comprise using a lipofection reagent such as, e.g., Lipofectamine™.
  • transfection is performed by electroporation.
  • transfection is performed by sonication.
  • transfection is performed by nanoparticles (e.g., polymeric nanoparticles such as JetPEI).
  • transfection is performed by microinjection.
  • transfection is performed by viral vector infection.
  • the viral vector is integrative or non-integrative, preferably non-integrative. Non-limitative examples of non-integrative viral vectors include adenoviral vectors, adeno-associated virus (AAV) vectors, poxviral vectors, herpes simplex virus vectors and the like.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transformation.
  • transformation techniques include, without limitation, the use of an integrative vector selected from the group comprising or consisting of integrating viral vectors, integrating plasmids, enzymes or genome editing methods.
  • transformation comprises using an integrative viral vector or a modified integrative virus.
  • the integrative viral vector or a modified integrative virus is an integrating virus or variant thereof or mutant thereof selected from the group comprising or consisting of the Retroviridae, Adenoviridae, Flaviviridae, Herpesviridae, Hepadnaviridae, Papillomaviridae, Polyomaviridae, Parvoviridae, Arenaviridae, Bornaviridae, Bunyaviridae, Filoviridae and Paramyxoviridae families of viruses, preferably selected from the group comprising or consisting of the Retroviridae, Adenoviridae and Flaviviridae families of viruses.
  • the integrative viral vector belongs to the family of Retroviridae.
  • the integrative viral vector is a lentivirus, or a modified lentivirus.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population together with the nucleic acid molecule encoding the at least one transgene of interest.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof, and the nucleic acid molecule encoding the at least one transgene of interest are delivered in the same vector.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof, and the nucleic acid molecule encoding the at least one transgene of interest are delivered concomitantly in separate vectors.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population prior to the nucleic acid molecule encoding the at least one transgene of interest, such as, e.g., 1, 2, 3, 4, 5, 10, 30, 60 minutes, 2, 3, 4, 5, 6, 12, 24, 36, 48, 60, 72 hours, 4, 5, 6, 7 days or more before the nucleic acid molecule encoding the at least one transgene of interest.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 2 hours to 72 hours, from 3 hours to 72 hours, from 4 hours to 72 hours, from 5 hours to 72 hours, from 6 hours to 72 hours, from 7 hours to 72 hours, from 8 hours to 72 hours, from 9 hours to 72 hours, from 10 hours to 72 hours, from 11 hours to 72 hours, from 12 hours to 72 hours, from 24 hours to 72 hours, from 36 hours to 72 hours, from 48 hours to 72 hours before the nucleic acid molecule encoding the at least one transgene of interest.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 60 hours, from 1 hour to 48 hours, from 1 hour to 36 hours, from 1 hour to 24 hours, from 1 hour to 12 hours, from 1 hour to 11 hours, from 1 hour to 10 hours, from 1 hour to 9 hours, from 1 hour to 8 hours, from 1 hour to 7 hours, from 1 hour to 6 hours, from 1 hour to 5 hours, from 1 hour to 4 hours, from 1 hour to 3 hours, from 1 hour to
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 12 hours, from 2 hours to 8 hours, from 3 hours to 6 hours before the nucleic acid molecule encoding the at least one transgene of interest.
  • the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population about 4 hours before the nucleic acid molecule encoding the at least one transgene of interest.
  • the protein or polypeptide comprising a transposase or a fragment thereof, or the nucleic acid encoding the same is comprised in a pharmaceutical composition further comprising at least one pharmaceutically acceptable excipient.
  • the term “pharmaceutically acceptable excipient” refers to an excipient that does not produce an adverse, allergic or other untoward reaction when administered to an animal, preferably a human, or to a population of cells. It includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like.
  • An acceptable excipient refers to a non-toxic solid, semi-solid or liquid filler, diluent, encapsulating material or formulation auxiliary of any type.
  • preparations should meet sterility, pyrogenicity, general safety and purity standards as required by EMA or FDA Office of Biologics standards.
  • the acceptable excipient is selected in a group comprising or consisting of a solvent, a diluent, a carrier, a dispersion medium, a coating, an antibacterial agent, an antifungal agent, an isotonic agent, an absorption delaying agent and any combinations thereof.
  • the excipient must be “acceptable” in the sense of being compatible with the protein, polypeptide or nucleic acid molecule of the method, and not be deleterious upon being administered to an individual or to a population of cells.
  • the excipient does not produce an adverse, allergic or other untoward reaction when administered to an individual, preferably a human individual, or to a population of cells.
  • the cell population is a population of eukaryotic cells or of prokaryotic cells. In some embodiments, the cell population is a eukaryotic cell population. In some embodiments, the cell population is a prokaryotic cell population.
  • the cell population is isolated from a donor.
  • donor refers to an animal, preferably a mammal, more preferably a human. The donor may be alive or dead when the cell population is isolated, preferably alive.
  • the cell population is isolated from at least one organ or tissue of the donor, optionally from more than one organ or tissue of the donor.
  • the cell population is homogeneous (i.e., it contains a single type of cells) or heterogeneous (i.e., it contains more than one type of cells), preferably homogeneous.
  • the cell population is cultured in vitro. In some embodiments, the cell population is comprised in a primary cell culture.
  • the cell population is from an immortalized cell line.
  • the method is performed in vitro.
  • the cell population is an in vitro-cultured population of cells.
  • the cell population is cultured from 2 hours to 72 hours, from 3 hours to 72 hours, from 4 hours to 72 hours, from 5 hours to 72 hours, from 6 hours to 72 hours, from 7 hours to 72 hours, from 8 hours to 72 hours, from 9 hours to 72 hours, from 10 hours to 72 hours, from 11 hours to 72 hours, from 12 hours to 72 hours, from 24 hours to 72 hours, from 36 hours to 72 hours, from 48 hours to 72 hours.
  • the cell population is cultured in a controlled atmosphere (e.g., 37°C, 5 % CO2) and in a suitable culture medium (e.g., Dulbecco’s Modified Eagle Medium).
  • the cell population is cultured from 1 hour to 60 hours, from 1 hour to 48 hours, from 1 hour to 36 hours, from 1 hour to 24 hours, from 1 hour to 12 hours, from 1 hour to 11 hours, from 1 hour to 10 hours, from 1 hour to 9 hours, from 1 hour to 8 hours, from 1 hour to 7 hours, from 1 hour to 6 hours, from 1 hour to 5 hours, from 1 hour to 4 hours, from 1 hour to 3 hours, from 1 hour to 2 hours.
  • the cell population is a non-dividing, non-replicating, terminally differentiated or quiescent, cell population.
  • non-dividing cell means that the cell remains out of the cell cycle but retains the capacity to divide, in other words, the cell is in a state of reversible growth arrest.
  • Quiescence may be induced by, non-limitatively, contact inhibition, chemical or pharmacological agents, signaling proteins, hormones, inhibitors, and the like, or, alternatively, by the lack of a signal such as a growth factor or the contact with one or more cell types.
  • the cell population is quiescent in the presence of one or more growth inhibiting agent in the culture medium.
  • the one or more growth inhibiting agent is selected from the group comprising or consisting of chemical agents, pharmacological agents, signaling proteins, hormones, growth factors, and inhibitors.
  • the cell population is quiescent in the absence of one or more growth stimulating agent in the culture medium.
  • the one or more growth stimulating agent is selected from the group comprising or consisting of chemical agents, pharmacological agents, signaling proteins, hormones, growth factors, amino acids, sugars, lipids, and fatty acids.
  • “terminally differentiated” cell means that the cell remains out of the cell cycle and has lost the capacity to divide, in other words, the cell is in a state of irreversible growth arrest.
  • the non-dividing or quiescent cell population is a confluent cell culture (i.e., quiescence is induced by contact inhibition).
  • the non-dividing or quiescent cell population consists of at least one cell type that does not replicate in culture.
  • the non-dividing or quiescent cell population is senescent.
  • the non-dividing or terminally differentiated cell population consists of at least one cell type that has completely lost the capacity to perform cell division.
  • the cell population is not actively dividing.
  • the cell population is actively dividing, replicating, or proliferating. In some embodiments, the dividing cell population is in exponential phase of cell culture.
  • the method is performed ex vivo.
  • the method is performed in vivo. In some embodiments, the method is performed in vivo in an animal subject, preferably a mammal subject. In certain embodiments, the method is performed in vivo in a human subject. In some embodiments, the cell population is comprised in a tissue or an organ of a living organism. In some embodiments, the organism is an animal. In some embodiments, the organism is a mammal. In some embodiments, the organism is a human.
  • the present invention further relates to a cell or cell population comprising at least one copy of a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and/or at least one copy of at least one transgene of interest.
  • a cell or cell population comprises both at least one copy of a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and at least one copy of at least one transgene of interest.
  • the protein or polypeptide comprising a transposase or a fragment thereof, or the nucleic acid encoding the same, the at least one transgene of interest, and the cell or cell population further comprise the features as disclosed hereinabove.
  • the present invention further relates to a method of treating a genetic disease in a subject in need thereof, comprising administering to said subject: the nucleic acid molecule encoding the at least one transgene of interest, as described hereinabove, and a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, as described hereinabove.
  • nucleic acid molecule encoding the at least one transgene of interest, as described hereinabove, and the protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, as described hereinabove, for use in treating a genetic disease in a subject in need thereof.
  • nucleic acid molecule encoding the at least one transgene of interest, as described hereinabove, and the protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, as described hereinabove, for the manufacture of a medicament for treating a genetic disease in a subject in need thereof.
  • the transgene of interest compensates for a gene defect responsible for the genetic disease in the subject.
  • the method further comprises administering to the subject a guide RNA (gRNA) fused to at least one aptamer, as described hereinabove.
  • the method may comprise administering to the subject in need thereof at least one additional therapeutic agent.
  • the at least one therapeutic agent is for treating the genetic disease.
  • the genetic disease is characterized in that at least one gene is mutated in the genome of the subject, so that the protein encoded by the at least one gene has impaired function or is not functional, or is degraded by the cellular protein quality control (e.g., proteasome), or is not produced; in other words, the function of the at least one gene is at least partially, if not completely lost.
  • Non-limitative examples of genetic diseases include sickle cell anemia, cystic fibrosis, Huntington disease, congenital muscular dystrophy, Duchenne muscular dystrophy, Fabry disease, Marfan syndrome, thalassemia, cystinosis, familial hypercholesterolemia, hemochromatosis and the like.
  • the transgene of interest restores partially or completely, preferably completely, the function of the at least one gene that is mutated. In some embodiments, the transgene of interest encodes the same protein as the at least one gene that is mutated. In some embodiments, the transgene of interest encodes a different protein than the one encoded by the at least one gene that is mutated, but that has a similar or identical biological function.
  • Figure 1 is a set of photographs illustrating the viability and transfection efficiency of GFP expressing transposon and transposase in Embryonic stem cells H9 electroporated with transposases (either hyPB alone or fused with a Cas9 nuclease).
  • BF Bright Field
  • GFP Green Fluorescent Protein.
  • Figure 2 is a histogram showing the transcription efficiency of a Red Fluorescent Protein (RFP) reporter in HepG2 cells expressing or not hyPB.
  • Figure 3 is a histogram showing the transcription efficiency of a RFP reporter in Huh7 cells expressing or not hyPB.
  • Figures 4 and 5 are graphs showing the transcription efficiency over time (at 3h, 6h, 9h, and 12h in Figure 4; at 8h, 10h, 12h, 14h and 16h in Figure 5) of a GFP reporter in Huh7 cells expressing or not hyPB.
  • Figure 6 is a histogram showing the transcription efficiency of a GFP reporter in Huh7 cells expressing or not a catalytically dead hyPB.
  • Figure 7 is a histogram showing the transcription efficiency of a GFP reporter in Huh7 cells expressing a fusion protein comprising a Cas9 nuclease and either a SB100 transposase or a N57 transposase.
  • Figures 8A-8B are a set of histograms comparing the transcription efficiency of a GFP reporter in Huh7 cells expressing or not hyPB, wherein the GFP DNA transposon comprises either one Inverted Terminal Repeat (ITR) (Figure 8A) or two ITRs (Figure 8B).
  • Figure 9 is a set of photographs showing the transcription efficiency of a luciferase reporter in vivo in mice, one day or 4 weeks after injection, with either hyPB mRNA or a catalytically dead hyPB mRNA.
  • Figure 10 is a set of photographs showing the transcription efficiency of a luciferase reporter in vivo, one day or one week after injection, in mice transfected or not with hyPB DNA using a polymeric nanoparticle vector (jetPEI).
  • Figure 11 is a set of photographs showing the transcription efficiency of a luciferase reporter in vivo, one day after injection, in mice transfected or not with hyPB and wherein the DNA payload (luciferase reporter) and the hyPB mRNA are co-transfected using the same vector (lipid nanoparticles).
  • Embryonic Stem Cells H9 were electroporated (CRG cell and tissue facility) with 1.8 µg of minicircle GFP transposon alone or together with 1 µg hyPB mRNA or 1.5 µg FiCAT mRNA and 1.75 µg of sgRNA targeting the AAVS1 locus. Cells were imaged with a fluorescence microscope and in bright field, and viability and % of GFP signal were estimated.
  • Day 0: 400k cells/well were seeded in p6 well plates.
  • Day 1: 1.25 µg of either hyPB plasmid or pUC19 mock plasmid was transfected with Lipofectamine 3000 according to the manufacturer’s protocol. Cells were kept for 2 hours with Opti-MEM + transfection mix, and then full media was added (Dulbecco’s modified Eagle medium, 10 % fetal bovine serum, 2 mM glutamine and 100 U penicillin/0.1 mg/ml streptomycin).
  • Day 3: Cells were transfected with 0.75 µg RFP DNA transposon and 0.25 µg pUC19 filling DNA with Lipofectamine 3000 according to the manufacturer’s protocol. Cells were kept for 2 hours with Opti-MEM + transfection mix, and then full media was added.
  • the cells were transfected on day 1 with Cas9_SB100 plasmid and Cas9_N57 plasmid instead of hyPB.
  • cells were transfected with GFP transposon containing SB100 ITRs and with TCR1 gRNA plasmid.
  • 400,000 cells were seeded 24h before transfection in p6 wells. Transfection was performed using 1.8 µg transposon DNA and 1.95 µg hyPB or mock mRNA of similar size with Lipofectamine 3000 according to the manufacturer’s protocol. Cells were kept for 2 hours with Opti-MEM + transfection mix, and then full media was added. Cells were lifted every 3 hours or every 14 hours post-transfection and fluorescence was measured by flow cytometry.
  • the cells were transfected on day 1 with Cas9_SB100 plasmid and Cas9_N57 plasmid instead of hyPB.
  • cells were transfected with GFP transposon containing SB100 ITRs and with TCR1 gRNA plasmid.
  • nucleic acids were injected into 6-7-week-old mice (3.2 µg MC-luciferase transposon, 1.42 µg hyPB/hyPB_dead mRNA). Nucleic acids were diluted with PBS, and a volume in ml equal to 7 % of the animal body weight was injected in less than 7 seconds via retro-orbital systemic injection.
  • Lipid proportions were 50 % Dlin-MC3, 10 % DSPC helper lipid, 38.5 % cholesterol and 1.5 % PEG-2000, with an N/P ratio of 6. The size of the nanoparticles was 100-130 nm and encapsulation efficiencies were higher than 90 %.
  • Embryonic Stem Cells H9 electroporated with transposases show GFP signal (Figure 1), meaning that the GFP transposon DNA has entered the nucleus and has been transcribed and translated. While the episomal signal of the GFP transposon without hyPB or FiCAT is low (20 % efficiency), hyPB yields a stronger GFP signal (70 % transfection efficiency).
  • The HepG2 hepatic cell line previously expressing hyPB shows better internalization of the RFP DNA transposon than cells which do not express hyPB (Figure 2). This result was also obtained with the Huh7 hepatic cell line (Figure 3).
  • Huh7 cells transfected with hyPB mRNA show higher capacity for uptake and expression of the DNA transposon compared to cells transfected with mock mRNA (Figures 4 & 5).
  • The Huh7 hepatic cell line transfected with catalytically dead hyPB mRNA shows higher capacity for nuclear uptake and expression of the DNA transposon compared to cells transfected with mock mRNA (Figure 6).
  • hyPB transposases show advantages in DNA payload internalization and expression, even if their catalytic activity is compromised (dead hyPB). One out of three mice shows luciferase expression 24h after hydrodynamic injection of the payload alone, whereas this increases to 2/3 or 3/3 mice when the payload is co-injected with hyPB or catalytically dead hyPB.
  • hyPB transposases show advantages in DNA payload internalization and expression when payload DNA and hyPB mRNA are administered using different delivery methods (Figure 10).
  • In Figure 10, when DNA alone is injected, 50 % of mice show a luciferase signal 24h later, whereas 100 % of mice show a signal when hyPB is co-administered using LNPs.
  • hyPB transposases show advantages in DNA payload internalization and expression when payload DNA and hyPB mRNA are administered using the same delivery method (Figure 11). When DNA alone is injected, mice do not show a luciferase signal, whereas the signal is recovered when hyPB is co-administered using LNPs in both cases. This result shows that hyPB helps internalize DNA into the nucleus, so that it can be expressed.
  • Example 2 Nuclear localization in non-dividing cells
  • NIH/3T3 cells (ATCC) were cultured in DMEM supplemented with 10 or 1% heat-inactivated fetal bovine serum, 100 U/ml penicillin and 0.1 mg/ml streptomycin in a humidified CO2 (5%) incubator at 37°C.
  • liver perfusion buffer (HBSS, KCl 0.4 g.L⁻¹, glucose 1 g.L⁻¹, NaHCO3 2.1 g.L⁻¹, EDTA 0.2 g.L⁻¹)
  • liver digest buffer (DMEM-GlutaMAX 1 g.L⁻¹ glucose, HEPES 15 mM pH 7.4, penicillin/streptomycin 1%, 5 mg per mouse Collagenase IV (C5138 Sigma))
  • liver was placed on ice in plating media (M199, fetal bovine serum 10%, penicillin/streptomycin 1%, sodium pyruvate 1%, L-glutamine 1%, 1 nM insulin, 1 mM dexamethasone, 2 mg.mL⁻¹ bovine serum albumin (BSA)).
  • hepatocytes were isolated from a wild-type 8-week-old mouse, and 75,000 cells were seeded per well. After 48h in culture, cells were transfected. Transfection was performed using 0.2 µg transposon DNA alone (Episomal) or in combination with 0.25 µg of hyPB-wt or a catalytically dead hyPB (dead-hyPB) with Lipofectamine 3000 according to the manufacturer’s protocol. Cells were incubated for 5 hours with the transfection mix in Opti-MEM, and then full media was added. 24 hours after transfection, cells were washed once with PBS and lysed with the appropriate buffer.
  • Luciferase activities were quantified 24h post-transfection using the Luciferase assay system (Promega, Madison, WI, USA), with measurements performed in duplicate. Data are shown as fold change over the luciferase values of non-transfected hepatocytes, and expressed in arbitrary units (AU).
  • FIG. 12 shows that uptake and expression of the DNA transposon (GFP) is increased in cells expressing hyPB or dead-hyPB, even with the lowest dose.
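The luciferase normalization described above (duplicate measurements, expressed as fold change over non-transfected hepatocytes) can be sketched as follows. This is only an illustrative sketch and not the analysis actually used: the function name `fold_change_over_control` and the choice of the arithmetic mean to combine duplicate readings are assumptions.

```python
from statistics import mean
from typing import Sequence


def fold_change_over_control(
    sample_readings: Sequence[float],
    control_readings: Sequence[float],
) -> float:
    """Return mean(sample) / mean(control) for raw luciferase readings.

    sample_readings:  replicate readings for one transfected condition
    control_readings: replicate readings for non-transfected cells
    Combining replicates with the arithmetic mean is an assumption.
    """
    if not sample_readings or not control_readings:
        raise ValueError("both conditions need at least one reading")
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control mean must be positive to compute a ratio")
    return mean(sample_readings) / control_mean
```

A value of 1 would mean no signal above the non-transfected background; each condition (episomal, hyPB-wt, dead-hyPB) would be passed separately against the same control readings.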

Abstract

The present invention relates to methods of increasing expression of a transgene of interest in a cell population and/or of increasing nuclear localization of a transgene of interest in a cell population, using a transposase.

Description

USE OF TRANSPOSASES FOR IMPROVING TRANSGENE EXPRESSION
AND NUCLEAR LOCALIZATION
FIELD OF INVENTION
[0001] The present invention relates to methods for improving transgene expression in a cell population.
BACKGROUND OF INVENTION
[0002] Having efficient methods to introduce nucleic acids or proteins into cells and tissues is key for genome engineering. Different methods have been developed for that purpose such as, e.g., viral infection, electroporation, transfection with polymers and nanoparticles. While these methods enable great efficiencies in transfection, the introduction of DNA of interest into the nucleus remains a major challenge.
[0003] Indeed, after entering the cell, the foreign DNA will face various obstacles before reaching the nucleus. First, mammalian cells are equipped with defense mechanisms aiming to sense and regulate foreign DNA, such as the interferon system. Semenova et al. (Nucleic acids research 2019, vol. 47,19) have reported that cytosolic DNA sensors bind transfected DNA in the hours following transfection, leading to a pro-inflammatory response. Second, the foreign DNA molecule must go through the barrier of the nuclear envelope. Vandenbroucke et al. (Nucleic acids research 2007, vol. 35,12) proposed to modulate the activity of nuclear pore complexes (NPCs) with amphipathic molecules for increasing nucleocytoplasmic transport; however, the amphipathic molecules are associated with a certain level of toxicity, and they do not seem to enable increased nuclear uptake for certain DNA carriers.
[0004] Additional obstacles for the translocation of foreign DNA into the nucleus may include trafficking, cell type specificity, and cell division (for review, see Bai et al. “Cytoplasmic transport and nuclear import of plasmid DNA.” Bioscience reports, 2017, vol. 37,6).
[0005] Thus, the difficulty for foreign DNA to reach the nucleus in eukaryotic cells (in vitro and in vivo) is a matter that many transfection methods have tried to overcome. The present invention provides a solution consisting in surrounding the gene of interest with ITRs and adding a transposase to the transfection mix, thereby helping to localize this foreign DNA into the nucleus.
[0006] Transposases are enzymes naturally evolved to “cut-and-paste” or “copy-and-paste” DNA fragments into genomic DNA of prokaryotes (e.g., Tn3, Tn5 and Tn7) or eukaryotes (e.g., Sleeping Beauty, and PiggyBac). These enzymes have been found in the genome of certain species with low transposition efficiencies to avoid genomic toxicity. Transposases bind to a payload DNA, called a transposon, that contains specific Inverted Terminal Repeats (ITRs), then copy or cut its sequence and paste it into a random or semi-random genomic site. Like most DNA-binding proteins, transposases are naturally addressed to the nucleus thanks to a nuclear localization signal (see, e.g., Keith et al. “Analysis of the piggyBac transposase reveals a functional nuclear targeting signal in the 94 c-terminal residues.” BMC Molecular Biology 2008, vol. 9, 72).
[0007] The state of the art comprises different transposases that were modified to increase their efficiency and to create tools for mammalian genome editing such as SB100 or hyperactive PiggyBac. Transposases have also been fused to DNA binding domains such as, e.g., Cas9 (see, e.g., WO2022129438 or WO2020243085), dead-Cas9 (see, e.g., Hew et al. “RNA-guided piggyBac transposition in human cells.” Synthetic Biology 2019, vol. 4,1), and ZNF, in order to specifically target and edit a certain genomic DNA location.
[0008] However, while transposases have been described as tools for genome editing (see, e.g., Zhao et al. “PiggyBac transposon vectors: the tools of the human gene encoding.” Translational Lung Cancer Research 2016, vol. 5,1), their applications rely on their insertional activity, either for inserting a gene in a host cell’s DNA, or for disrupting a gene, which also poses the problem of insertional mutagenesis.
[0009] Furthermore, transfection is usually performed on actively dividing (proliferating) cells, because they internalize nucleic acids much better than non-dividing, quiescent cells (Bai et al. 2017), and it is notably difficult to transfect confluent cells or certain cell types that do not divide in culture. Hence, there is a need to improve transfection efficiency in non-dividing cells.
[0010] Thus, the present invention provides evidence that, using transposases with vehicles that facilitate cytoplasmic transduction of nucleic acids, it is possible to increase the nuclear localization or nuclear uptake of a DNA of interest, where it can then be expressed by the cell (i.e., transiently expressed, transfected) or inserted in the genomic DNA. The method of the present invention also enables efficient transfection of quiescent cells. Finally, the method of the present invention also enables efficient transfection without insertional activity when using a catalytically inactive transposase.
SUMMARY
[0011] An object of the present invention is a method of increasing the expression of a nucleic acid molecule encoding at least one transgene of interest in a cell population, comprising contacting a cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
[0012] Another object of the present invention is a method of increasing nuclear localization of a nucleic acid molecule encoding at least one transgene of interest in a cell population, comprising contacting a cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
[0013] Another object of the present invention is a method of editing the genome of a cell population, comprising contacting a cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
[0014] Another object of the present invention is a method of treating a genetic disease in a subject in need thereof, comprising administering to said subject a therapeutically effective amount of: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and a nucleic acid molecule encoding the at least one transgene of interest, wherein expression of the transgene of interest in at least one cell of the subject in need thereof compensates for a gene defect responsible for the genetic disease.
[0015] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is a deoxyribonucleic acid (DNA) molecule. In some embodiments, said DNA molecule is a complementary DNA (cDNA) molecule. In some embodiments, said DNA molecule is a genomic DNA (gDNA) molecule.
[0016] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest further comprises at least one Inverted Terminal Repeat (ITR) sequence. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest further comprises two ITR sequences. In some embodiments, the at least one ITR sequence is adjacent to the transgene of interest. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest further comprises two ITR sequences flanking the transgene of interest. In some embodiments, the at least one ITR sequence interacts with, or non-covalently binds to, the transposase.
[0017] In some embodiments, the DNA molecule is inserted into the genome of said cell population. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is stably expressed by the cell population.
[0018] In some embodiments, the transgene of interest is an exogenous gene. In some embodiments, the transgene of interest is an endogenous gene.
[0019] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a plasmid, a fosmid, a cosmid, an artificial chromosome or a viral vector. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a plasmid. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a viral vector. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a DNA virus-based vector selected from the group consisting of viruses from the realms Duplodnaviria, Monodnaviria and Varidnaviria.
[0020] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof increases the nuclear localization of the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof translocates, or promotes translocation of, the nucleic acid molecule encoding the at least one transgene of interest to the nucleus.
[0021] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least one additional polypeptide or protein. In some embodiments, the at least one additional polypeptide or protein is a nuclease. In some embodiments, the at least one additional polypeptide or protein is a RNA-guided nuclease. In some embodiments, the at least one additional polypeptide or protein is a Cas nuclease. In some embodiments, the at least one additional polypeptide or protein is a Cas9 nuclease.
[0022] In some embodiments, the fusion protein further comprises a linker.
[0023] In some embodiments, the fusion protein has at least 75 % amino acid sequence identity with SEQ ID NO: 2. In some embodiments, the fusion protein comprises or consists of the amino acid sequence of SEQ ID NO: 2.
[0024] In some embodiments, the at least one additional polypeptide or protein is an aptamer binding protein. In some embodiments, the at least one additional polypeptide or protein is the MS2 bacteriophage coat protein (MCP). In some embodiments, the fusion protein interacts, or is capable of interacting, covalently or non-covalently through MCP with a gRNA molecule comprising at least one MS2 aptamer. In some embodiments, the gRNA molecule interacts, or is capable of interacting, covalently or non-covalently through the at least one MS2 aptamer with an RNA-guided nuclease.
[0025] In some embodiments, the transposase is selected from the group consisting of hyperactive PiggyBac transposase, PiggyBac transposase, Sleeping Beauty transposase, SB11 transposase, Tol2 transposase, Mos1 transposase, and Frog Prince transposase. In some embodiments, the transposase is selected from the group consisting of hyperactive PiggyBac transposase and Sleeping Beauty transposase. In some embodiments, the transposase is a hyperactive PiggyBac transposase. In some embodiments, the transposase is a hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1. In some embodiments, the transposase is a modified hyperactive PiggyBac transposase, comprising at least one amino acid mutation compared with the amino acid sequence of the hyperactive PiggyBac transposase of SEQ ID NO: 1. In some embodiments, the transposase is a Sleeping Beauty transposase. In some embodiments, the transposase or the fragment thereof has decreased catalytic activity. In some embodiments, the transposase or the fragment thereof is catalytically dead. In some embodiments, the catalytically dead transposase has at least 75 % amino acid sequence identity with SEQ ID NO: 3. In some embodiments, the catalytically dead transposase comprises or consists of the amino acid sequence of SEQ ID NO: 3.
[0026] In some embodiments, the protein or polypeptide comprises a transposase fragment. In some embodiments, the transposase fragment comprises or consists of at least one transposase functional domain. In some embodiments, the transposase fragment comprises or consists of an ITR-binding domain. In some embodiments, the transposase fragment is a fragment of a hyperactive PiggyBac transposase or of a Sleeping Beauty transposase. In some embodiments, the transposase fragment is a fragment of a hyperactive PiggyBac transposase, preferably comprising an ITR-binding domain. In some embodiments, the transposase fragment is a fragment of a Sleeping Beauty transposase, preferably selected from the group consisting of the SB100 domain and the N57 domain of Sleeping Beauty transposase.
[0027] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a ribonucleic acid (RNA) molecule or a DNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is an RNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a messenger RNA (mRNA) molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a cDNA molecule or a plasmid. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transfection or transformation. In some embodiments, transfection is selected from the group consisting of lipofection, electroporation, sonication, nanoparticles, microinjection and viral vector infection, including non-integrative and integrative viral vector infection. In some embodiments, transformation comprises using an integrative viral vector or a modified integrative virus.
[0028] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population together with the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population prior to the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 72 hours before the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population about 4 hours before the nucleic acid molecule encoding the at least one transgene of interest.
[0029] In some embodiments, the protein or polypeptide comprising a transposase or a fragment thereof, or the nucleic acid encoding the same, is comprised in a pharmaceutical composition further comprising at least one acceptable excipient.
[0030] In some embodiments, the cell population is a eukaryotic or a prokaryotic cell population. In some embodiments, the cell population is a eukaryotic cell population. In some embodiments, the cell population is a prokaryotic cell population. In some embodiments, the method is performed in vitro. In some embodiments, the cell population is an in vitro-cultured population of cells. In some embodiments, the method is performed ex vivo. In some embodiments, the method is performed in vivo.
[0031] In some embodiments, the cell population is comprised in a tissue or an organ of a living organism. In some embodiments, the organism is an animal. In some embodiments, the organism is a mammal. In some embodiments, the organism is a human.
[0032] In some embodiments, the method comprises the steps of: contacting the cell population with a vector comprising the nucleic acid encoding the protein or polypeptide comprising the transposase or a fragment thereof, culturing the cell population for a period of time ranging from 1 hour to 72 hours, and contacting the cell population with a vector comprising the nucleic acid molecule encoding the at least one transgene of interest.
[0033] In some embodiments, the method comprises the steps of: contacting the cell population with a vector comprising the protein or polypeptide comprising the transposase or a fragment thereof, and contacting the cell population with a vector comprising the nucleic acid molecule encoding the at least one transgene of interest.
DEFINITIONS
[0034] In the present invention, the following terms have the following meanings:
[0035] “Cas9” or “Cas9 nuclease” refers to an RNA-guided nuclease comprising a Cas9 protein, or a fragment thereof (e.g., a protein comprising an active or inactive DNA cleavage domain of Cas9, and/or the gRNA binding domain of Cas9). A Cas9 nuclease is also referred to sometimes as a Casn1 nuclease or a CRISPR (clustered regularly interspaced short palindromic repeat)-associated nuclease. CRISPR is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). In type II CRISPR systems, correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc) and a Cas9 protein. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA. Subsequently, the Cas9/crRNA/tracrRNA complex endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer. The target strand not complementary to crRNA is first cut endonucleolytically, then trimmed 3'-5' exonucleolytically. In nature, DNA-binding and cleavage typically requires protein and both RNAs. However, single guide RNAs (“sgRNA” or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species. Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self vs. non-self. Cas9 nuclease sequences and structures are well known to those of skill in the art. Cas9 orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus.
[0036] “Exogenous” refers to any molecule that is not naturally present in a cell or organism of interest, but which can be introduced thereinto by one or more genetic, biochemical or other methods. The natural presence of a molecule in a cell or organism may also be determined with respect to the particular developmental stage and environmental conditions thereof. Thus, for example, a molecule that is present only during embryonic development of muscle is an exogenous molecule with respect to an adult muscle cell. Similarly, a molecule induced by heat shock is an exogenous molecule with respect to a non-heat-shocked cell. An exogenous molecule can comprise, e.g., a functioning version of a malfunctioning endogenous molecule or a malfunctioning version of a normally functioning endogenous molecule. By contrast, the term “endogenous” designates any molecule that is normally present in a cell or organism, at a particular developmental stage under particular environmental conditions.
[0037] “Fusion protein” refers to a single-chain hybrid polypeptide which comprises two or more amino acid sequences fused together (i.e., from two or more different proteins and/or peptides). The two or more amino acid sequences can be fused together via a direct peptidic bond or indirectly through a peptidic linker. A fusion protein may be in particular fully encoded by a single nucleic acid sequence.
[0038] “Gene” typically refers to a DNA region encoding a protein (i.e., a coding region). The term may also include DNA regions which do not per se encode a protein (i.e., a non-coding region). The latter include, e.g., regions transcribed into functional non-coding RNA molecules (e.g., transfer RNA, ribosomal RNA, regulatory RNA, etc.). Other non-coding regions regulate the transcription and translation of coding regions (i.e., regulatory elements), or serve as architectural elements (e.g., scaffold/matrix attachment region), as origins of DNA replication, as centromeres or telomeres, etc. Regulatory elements include, without limitations, promoter sequences, terminators, translational regulatory sequences (e.g., ribosome binding sites [RBS] and internal ribosome entry sites [IRES]), enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
[0039] “Identity” or “identical”, when used in a relationship between the sequences of two or more amino acid sequences, or of two or more nucleic acid sequences, refers to the degree of sequence relatedness between amino acid sequences or nucleic acid sequences, as determined by the number of matches between strings of two or more amino acid residues or nucleic acid residues. “Identity” measures the percent of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model or computer program (i.e., “algorithms”). Identity of related amino acid sequences or nucleic acid sequences can be readily calculated by known methods. Such methods include, but are not limited to, those described in Lesk A. M. (1988). Computational molecular biology: Sources and methods for sequence analysis. New York, NY: Oxford University Press; Smith D. W. (1993). Biocomputing: Informatics and genome projects. San Diego, CA: Academic Press; Griffin A. M. & Griffin H. G. (1994). Computer analysis of sequence data, Part 1. Totowa, NJ: Humana Press; von Heijne G. (1987). Sequence analysis in molecular biology: treasure trove or trivial pursuit. San Diego, CA: Academic Press; Gribskov M. R. & Devereux J. (1991). Sequence analysis primer. New York, NY: Stockton Press; Carillo et al., 1988. SIAM J Appl Math. 48(5):1073-82. Preferred methods for determining identity are designed to give the largest match between the sequences tested. Methods of determining identity are described in publicly available computer programs. Preferred computer program methods for determining identity between two sequences include the GCG program package, including GAP (Genetics Computer Group, University of Wisconsin, Madison, WI; Devereux et al., 1984. Nucleic Acids Res. 12(1 Pt 1):387-95), BLASTP, BLASTN, and FASTA (Altschul et al., 1990. J Mol Biol. 215(3):403-10). The BLASTX program is publicly available from the National Center for Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul et al. NCB/NLM/NIH Bethesda, Md. 20894). The well-known Smith Waterman algorithm may also be used to determine identity.
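By way of illustration only, the identity measure defined above (percent of identical matches relative to the smaller of the sequences, given an alignment) can be sketched as follows. This is a minimal sketch, assuming the two sequences have already been aligned by one of the cited programs and are supplied as equal-length strings with "-" marking gaps. The function name percent_identity and the toy sequences are hypothetical and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Assumption: both inputs come from an existing alignment, so they have
    the same length and use `gap` for gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same (non-gap) residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Normalize by the ungapped length of the smaller sequence.
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Toy example (hypothetical sequences): 4 matching columns, shorter length 5.
example_identity = percent_identity("ACGT-A", "ACCTTA")
```

Normalizing by the shorter ungapped length is one reading of "the smaller of two or more sequences"; alignment programs may report identity over the aligned length instead, so values are only comparable when the same convention is used.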
[0040] “Insertion” and “integration” refer to the addition of a nucleic acid sequence into a second nucleic acid sequence or into a genome or a portion thereof. Insertion may be “specific”, “site-specific”, “targeted” or “on-targeted”: these adjectives define the insertion of a nucleic acid into a specific site of a second nucleic acid or into a specific site of a genome or a portion thereof (i.e., a site that has been purposely selected for insertion). Conversely, the adjectives “random”, “non-targeted” or “off-targeted” refer to non-specific and/or unintended insertion of a nucleic acid into an unwanted site. The terms “total” or “overall” refer to the total number of insertions.
[0041] “Linker” refers to a chemical group or a molecule linking two adjacent molecules or moieties.
[0042] “Modified” refers to a protein or nucleic acid sequence that is different than a corresponding unmodified protein or nucleic acid sequence.
[0043] “Mutated”, in connection with a sequence (e.g., an amino acid sequence or a nucleic acid sequence) means that the sequence is different than a reference sequence, such as a wild-type sequence. Typically, a mutated sequence comprises at least one of a substitution, an addition or a deletion of one or several residues by comparison to a reference sequence, such as a corresponding wild-type sequence.
[0044] “Mutation” refers to a substitution of a residue within a sequence, e.g., a nucleic acid or amino acid sequence, with another residue; and/or to a deletion or insertion of one or more residues within a nucleic acid or amino acid sequence. Mutations are typically described herein by identifying the original residue followed by the position of the residue within the sequence, then the identity of the newly substituted residue. Various methods for making amino acid substitutions (mutations) provided herein are well known in the art, and are provided by, for example, Green & Sambrook, 2012 (Molecular cloning: a laboratory manual (4th Ed.). Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).
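The notation described above (original residue, position, newly substituted residue, as in M194V) can be sketched in code as follows. This is an illustrative sketch only: the names Substitution, parse_substitution and apply_substitution are hypothetical, positions are assumed to be 1-based, and the short peptide used in the example is a toy sequence, not a sequence of the disclosure.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str      # one-letter code of the original residue
    position: int      # 1-based position in the reference sequence
    replacement: str   # one-letter code of the substituted residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'M194V' into its three components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, label: str) -> str:
    """Return a copy of `sequence` carrying the substitution in `label`."""
    sub = parse_substitution(label)
    index = sub.position - 1  # convert 1-based position to 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]


# Toy peptide (hypothetical): replace D at position 3 with N.
mutated = apply_substitution("MKDR", "D3N")
```

Checking the original residue before substituting guards against applying a label to a sequence with a different numbering.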
[0045] “Nuclease” refers to an enzyme catalyzing the hydrolysis of nucleic acids within a nucleic acid sequence. Nuclease activity can result in single-stranded or double-stranded breaks in nucleic acid molecules, wherein the nucleic acid molecule can be DNA or RNA.
[0046] “Nucleic acid (molecule/sequence)” and “nucleotide sequence” may be used interchangeably to refer to any molecule composed of, or comprising, monomeric nucleotides. A nucleic acid may be an oligonucleotide or a polynucleotide; it can be a DNA, an RNA, or a mix thereof. It can be chemically modified or artificial; e.g., it encompasses peptide nucleic acids (PNA), morpholinos and locked nucleic acids (LNA), as well as glycol nucleic acids (GNA) and threose nucleic acids (TNA). Each of these nucleic acids is distinguished from naturally occurring DNA or RNA by changes in the backbone of the molecule. Also, phosphorothioate nucleotides may be used. Other deoxynucleotide analogs include, without limitation, methylphosphonates, phosphoramidates, phosphorodithioates, N3'P5' phosphoramidates and oligoribonucleotide phosphorothioates and their 2’O-allyl analogs and 2’O-methylribonucleotide methylphosphonates which may be used in a nucleic acid of the disclosure.
[0047] “Polypeptide”, “peptide”, “protein” and “amino acid sequence” are used interchangeably to refer to a polymer of amino acid residues. Unless specified, a polymer of amino acid residues can be any length. The term also applies to amino acid polymers in which one or more amino acids are chemical analogues or modified derivatives of corresponding naturally-occurring amino acids.
[0048] “Prevention” and any declension thereof refers to prophylactic and preventative measures, wherein the object is to reduce the chances that a subject will develop a given pathologic condition or disorder over a given period of time. Such a reduction may be reflected, e.g., in a delayed onset of at least one symptom of the pathologic condition or disorder in the subject.
[0049] “Specificity” refers to the ability to selectively bind a sequence which shares a degree of sequence identity to a selected sequence.
[0050] “Subject” refers to a mammal, preferably a human. A subject may be a “patient”, i.e., a warm-blooded animal, more preferably a human, who/which is awaiting the receipt of, or is receiving medical care or was/is/will be the object of a medical procedure, or is monitored for the development of a disease. The term “mammal” refers here to any mammal, including humans, domestic and farm animals, and zoo, sports, or pet animals, such as dogs, cats, cattle, horses, sheep, pigs, goats, rabbits, etc. Preferably, the mammal is a primate, more preferably a human.
[0051] “Transduction” and any declension thereof refers to the introduction of one or more nucleic acid molecules (DNA and/or RNA) into one or more cells using a viral vector carrier, e.g., a virus, a viral particle or a viral vector, including without limitation, retroviruses (including lentiviruses), adenoviruses, adeno-associated viruses (AAV), and vectors derived thereof.
[0052] “Transfection” and any declension thereof refers to the introduction of one or several nucleic acid molecules (DNA and/or RNA) into one or more cells by non-viral means, whether in vitro or in vivo. In other words, “transfection” refers to any method, technique or vehicle known in the art that facilitates or increases cytoplasmic transduction of a nucleic acid molecule or cargo. Methods for transfection are well known in the art and include, e.g., lipofection, PEI and electroporation.
[0053] “Transgene” refers to an exogenous nucleic acid sequence, in particular an exogenous DNA or cDNA encoding a gene product. The gene product may be an RNA, peptide or protein. In addition to the coding region for the gene product (CDS), the transgene may include or be associated with one or more operational sequences to facilitate or enhance expression, such as a promoter, enhancer(s), response element(s), reporter element(s), insulator element(s), polyadenylation signal(s) and/or other functional elements. Embodiments of the disclosure may utilize any known suitable promoter, enhancer(s), response element(s), reporter element(s), insulator element(s), polyadenylation signal(s) and/or other functional elements, unless specified otherwise. Suitable elements and sequences will be well known to those skilled in the art.
[0054] “Transposase” refers to an enzyme that binds to the end of a transposon and catalyzes its movement to another part of the genome by a cut-and-paste mechanism, or a replicative transposition mechanism. Within the scope of the present invention, the transposase may be a fragment of a transposase, e.g., an ITR-binding domain or a functional domain, preferably an ITR-binding domain.
[0055] “Treatment”, “alleviation”, “curation” and any declensions thereof refer to a therapeutic treatment, excluding prophylactic or preventative measures; wherein the object is to slow down, lessen, stop or even reverse (either partially or totally) the evolution of a targeted pathologic condition or disorder. Those in need of treatment include those already with the disorder as well those suspected to have the disorder. A subject is successfully “treated” for the targeted pathologic condition or disorder if, after receiving treatment, they show observable and/or measurable reduction in or absence of one or more symptoms associated with the pathologic condition or disorder; relief to some extent; reduced morbidity and/or mortality; and/or improvement in quality-of-life issues. The above parameters for assessing successful treatment and improvement in the disease are readily measurable by routine procedures familiar to a physician.
[0056] “Vector”, as used herein, refers to any polynucleotide that can carry, e.g., a second polynucleotide of interest, and e.g., which can transfer gene sequences to target cells. Thus, the term includes cloning and expression vehicles, as well as integrating vectors.
DETAILED DESCRIPTION
[0057] When attempting to express a transgene of interest in a population of cells, one of the main obstacles that may arise is a poor expression level, which can be caused by the difficulty that the DNA molecule carrying the transgene of interest has in reaching the nucleus. The present invention addresses this problem.
[0058] The present invention thus relates to a method of increasing the expression of a nucleic acid molecule encoding at least one transgene of interest in a cell population.
[0059] It also relates to a method of increasing nuclear localization of a nucleic acid molecule encoding at least one transgene of interest in a cell population.
[0060] It also relates to a method of editing the genome of a cell population, in particular by inserting a nucleic acid molecule encoding at least one transgene of interest in the genome of the cell population.
[0061] It also relates to a method of transfecting a nucleic acid molecule encoding at least one transgene of interest in a cell population, preferably a non-dividing cell population.
[0062] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is transfected in the cell population by at least one technique selected from the group comprising or consisting of lipofection, electroporation, sonication, nanoparticles, microinjection, PEI, and viral vector infection, including non-integrative and integrative viral vector infection.
[0063] The invention is based on the observation by the Inventors that a cytoplasmic transposase was capable of delivering a nucleic acid to the nucleus of a cell, thereby increasing its nuclear localization. In particular, a cytoplasmic transposase is capable of translocating, or of promoting translocation, of a nucleic acid to the nucleus of a cell, including a quiescent cell. Increased nuclear localization thus induces increased expression levels of the nucleic acid. Interestingly, the Inventors observed that increased nuclear localization by the transposase does not rely on its enzymatic activity.
[0064] It is to be understood that a DNA molecule encoding the at least one transgene of interest, when delivered to the cytoplasm of a cell, will be sensed by the immune system of the cell that will induce a response against the presence of a foreign DNA (e.g., inflammatory response). Besides, the DNA molecule encoding the at least one transgene of interest will also have to pass the nuclear envelope in order to reach the nucleus. These problems are overcome when the DNA molecule encoding the at least one transgene of interest is translocated by a cytoplasmic transposase.
[0065] According to the invention, these methods comprise contacting a cell population with: the nucleic acid molecule encoding the at least one transgene of interest, and a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same.
[0066] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is delivered to the cell population by transfection. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is transfected in the cell population by at least one technique selected from the group comprising or consisting of lipofection, electroporation, sonication, nanoparticles, microinjection, PEI, and viral vector infection, including non-integrative and integrative viral vector infection.
[0067] In some embodiments, the cell population may be contacted with the nucleic acid molecule encoding the at least one transgene of interest, and the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same, in any order.
[0068] In some embodiments, the cell population is contacted with the nucleic acid molecule encoding the at least one transgene of interest concomitantly with the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same.
[0069] In one embodiment, the cell population is contacted with the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same prior to being contacted with the nucleic acid molecule encoding the at least one transgene of interest. In another embodiment, the cell population is contacted with the nucleic acid molecule encoding the at least one transgene of interest prior to being contacted with the protein or polypeptide comprising a transposase or a fragment thereof or a nucleic acid encoding the same.
[0070] The term “nucleic acid molecule encoding the at least one transgene of interest” refers to a nucleic acid sequence to be inserted in the genome of a cell, preferably of a eukaryotic cell, more preferably of a mammalian cell (including of a human cell), that encodes at least one product of interest. The product of interest may be a protein or a fragment thereof; in this case, the transgene of interest is said to be a coding nucleic acid sequence. However, the term also encompasses non-coding nucleic acid sequences, i.e., nucleic acid sequences that do not encode a protein or a fragment thereof, but rather express an “RNA gene” (or “non-coding RNA”), such as, e.g., a transfer RNA, a ribosomal RNA, a small RNA, a long non-coding RNA, etc.
[0071] In some embodiments, the transgene of interest is a nucleic acid sequence encoding a peptide or protein (e.g., without limitation, an enzyme, a transcription factor, a growth factor, a trophic factor, a hormone, a cytokine, an antibody, an antigen, a receptor, an immune regulator, a differentiation factor, a suicide protein, a cell-cycle modifying protein, an anti-proliferative protein, an angiogenic factor, an anti-angiogenic factor, a genome editor, a nuclease, a recombinase, a transposase, a neurotransmitter, and a reporter, including any precursor thereof, as well as fusion proteins). In this case, the sequence of interest may typically be (or be derived from) an mRNA, a cDNA, a gDNA, a synthetic nucleic acid, or any combinations thereof.
[0072] In some embodiments, the transgene of interest is alternatively a nucleic acid sequence of a non-coding RNA.
[0073] Examples of non-coding RNAs include, but are not limited to, transfer RNAs (tRNAs), ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), SmY RNAs, small Cajal body-specific RNAs (scaRNAs), guide RNAs (gRNAs), Y RNAs, telomerase RNA component (TERC), spliced leader RNAs (SL RNAs), catalytic RNAs (i.e., ribozymes; such as, e.g., ribonuclease P, ribonuclease MRP, and the like), antisense RNAs (aRNAs), cis-natural antisense transcript (cis-NAT), CRISPR RNAs (crRNAs), long non-coding RNAs (lncRNAs), microRNAs (miRNAs), piwi-interacting RNAs (piRNAs), small interfering RNAs (siRNAs), short hairpin RNAs (shRNAs), trans-acting siRNAs (tasiRNAs), repeat-associated siRNAs (rasiRNAs), 7SK RNAs (7SK), enhancer RNAs (eRNAs), and RNA aptamers.
[0074] Examples of transgene of interest include any nucleic acid sequence encoding a molecule of therapeutic interest, such as any nucleic acid sequence encoding a peptide or protein, or a non-coding RNA, that is lacking, deficient and/or non-functional in a subject.
[0075] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest comprises the at least one transgene and at least one regulatory element. Examples of regulatory elements include, without limitation, promoters, enhancers, silencers, insulators and the like. Preferably, the at least one regulatory element is located upstream, i.e., in 5’, of the at least one transgene of interest.
[0076] The nucleic acid molecule encoding the at least one transgene of interest may be double-stranded or single-stranded.
[0077] The nucleic acid molecule encoding the at least one transgene of interest may be a deoxyribonucleic acid (DNA) molecule, a ribonucleic acid (RNA) molecule, or a mix thereof; preferably, the nucleic acid molecule encoding the at least one transgene of interest is a DNA molecule.
[0078] The nucleic acid molecule encoding the at least one transgene of interest typically comprises natural nucleotides. It may however also comprise non-natural nucleotides. As used herein, a “natural nucleotide” refers to adenine (A), guanine (G), cytosine (C), thymine (T) and uracil (U). As used herein, the term “non-natural nucleotides” refers to chemically modified A, T, U, C or G nucleotides.
[0079] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest has a length of at least 10 base pairs (bp), such as at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1 000, 2 000, 3 000, 4 000, 5 000, 6 000, 7 000, 8 000, 9 000, 10 000, 20 000, 30 000, 40 000, 50 000, 60 000, 70 000, 80 000, 90 000, 100 000 bp or more.
[0080] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is a DNA molecule selected from the group comprising or consisting of a complementary DNA (cDNA) and a genomic DNA (gDNA). In some embodiments, the DNA molecule is a cDNA. In some embodiments, the DNA molecule is a gDNA.
[0081] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest further comprises at least one Inverted Terminal Repeat (ITR) sequence.
[0082] As used herein, an “ITR sequence” refers to a nucleic acid sequence naturally found at the extremities of eukaryotic transposable elements (or transposons). They are referred to as “5’ ITR” and “3’ ITR”. Typically, the 5’ ITR and the 3’ ITR are complementary and capable of forming a hairpin structure. ITR sequences are useful for the recognition of the transposon by a transposase enzyme. In some embodiments, the ITR sequences are palindromic.
[0083] Some non-limiting examples of ITR sequences include SEQ ID NOs: 55 to 60, as well as complementary sequences thereof.
SEQ ID NO: 55 ccctagaaagatagtctgcgtaaaattgacgcatg
SEQ ID NO: 56 ccctagaaagatagtctgcgtaaaattgacgcatgagataatcaatattgtgacgtacgttaa
SEQ ID NO: 57 ccctagaaagatagtctgcgtaaaattgacgcatgagataatcaatattgtgacgtacgttaaagataatcatgcgtaaaattgacgcatg
SEQ ID NO: 58 gattatctttctaggg
SEQ ID NO: 59 cacaatatgattatctttctaggg
SEQ ID NO: 60 catgcgtcaattttacgcatgattatctttaacgtacgtacgtcacaatatgattatctttctaggg
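Because the ITR examples above are given together with their complementary sequences, a short sketch of how a reverse complement may be computed and compared can be useful. The code below is illustrative only: the helper names reverse_complement and count_mismatches are hypothetical, and comparing SEQ ID NO: 58 against the start of SEQ ID NO: 55 is an assumption made for the example, not a pairing stated in the text.

```python
# Complement table for lower- and upper-case DNA bases.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Sequences copied from the listing above.
SEQ_ID_55 = "ccctagaaagatagtctgcgtaaaattgacgcatg"
SEQ_ID_58 = "gattatctttctaggg"

# Assumption for illustration: compare the reverse complement of
# SEQ ID NO: 58 with the equally long start of SEQ ID NO: 55.
rc_58 = reverse_complement(SEQ_ID_58)
prefix_55 = SEQ_ID_55[: len(rc_58)]
mismatches = count_mismatches(rc_58, prefix_55)
```

A low mismatch count in such a comparison indicates that the two ends are near-inverted copies of each other, which is the property the term “inverted terminal repeat” describes.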
[0084] In some embodiments, the at least one ITR sequence interacts with, or non-covalently binds to, the protein or polypeptide comprising a transposase or a fragment thereof. In some embodiments, the protein or polypeptide comprising a transposase or a fragment thereof specifically recognizes and binds to the at least one ITR sequence.
[0085] In some embodiments, the at least one ITR sequence is adjacent to the transgene of interest. By “adjacent”, it is meant that no nucleotide separates the transgene of interest from the at least one ITR sequence. Alternatively, the at least one ITR sequence can be separated from the transgene of interest by at least one nucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 nucleotides or more.
[0086] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest comprises two ITR sequences. Preferably, the two ITR sequences flank the transgene of interest. In some embodiments, the two ITR sequences are directly flanking the transgene of interest, i.e., no nucleotide separates the transgene of interest from the two flanking ITR sequences. Alternatively, the transgene of interest can be separated from at least one of the two ITR sequences, or from both ITR sequences, by at least one nucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 nucleotides or more.
[0087] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is either stably or transiently expressed by the cell population, preferably stably expressed.
[0088] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is transiently expressed. In a preferred embodiment, the nucleic acid molecule encoding the at least one transgene of interest is transfected. In some embodiments, the transfection comprises the use of at least one vehicle and/or method that facilitates or increases cytoplasmic transduction of the nucleic acid molecule encoding at least one transgene of interest. Examples of transfection techniques include, without limitation, lipofection, electroporation, sonication, nanoparticles, microinjection, PEI, and viral vector infection, including non-integrative and integrative viral vector infection.
[0089] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is a DNA molecule which is inserted into the genome of the cells of the cell population. It is to be understood that the term “inserted in the genome” implies that the DNA molecule is replicated alongside the genome of the cell during mitosis.
[0090] The transgene of interest may be an exogenous gene or an endogenous gene.
[0091] In some embodiments, the transgene of interest is an exogenous gene. As used herein, “exogenous gene” refers to a gene that is not naturally present in a cell, but can be introduced into the cell by one or more genetic, biochemical or other methods. Natural presence in the cell may be determined with respect to the particular developmental stage and environmental conditions of the cell. For instance, a molecule that is present only in a cell during embryonic development is exogenous with respect to an adult stage cell. Similarly, a molecule induced by heat-shocking a cell is exogenous with respect to a non-heat-shocked cell.
[0092] In some embodiments, the transgene of interest is an endogenous gene. As used herein, “endogenous gene” refers to a gene of which at least one copy is naturally present in the genome of the population of cells. In some embodiments, the at least one copy may be an allele, i.e., a version of the gene comprising at least one mutation. In certain embodiments, expression of a transgenic endogenous gene increases the overall expression levels (i.e., transcripts) of the gene.
[0093] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a vector, such as, without limitation, a plasmid, a fosmid, a cosmid, an artificial chromosome or a viral vector.
[0094] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a plasmid. The plasmid may be circular or linear, preferably circular. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a fosmid. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a cosmid. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in an artificial chromosome (e.g., human artificial chromosome).
[0095] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest is comprised in a viral vector.
[0096] The viral vector may be a DNA virus-based vector. Some non-limiting examples of such DNA virus-based vectors include vectors derived from viruses from the Duplodnaviria, Monodnaviria or Varidnaviria realms.
[0097] In some embodiments, the DNA virus-based vector is derived from a virus from the Duplodnaviria realm. Viruses from the Duplodnaviria realm include viruses of the Herpesvirales order. In some embodiments, the DNA virus-based vector is therefore a vector derived from Herpesviridae, such as, e.g., from a herpes simplex virus.
[0098] In some embodiments, the DNA virus-based vector is derived from a virus from the Monodnaviria realm. Viruses from the Monodnaviria realm include viruses of the Papillomaviridae and Polyomaviridae families. In some embodiments, the DNA virus-based vector is therefore a vector derived from a Papillomaviridae or a Polyomaviridae.
[0099] In some embodiments, the DNA virus-based vector is derived from a virus from the Varidnaviria realm. Viruses from the Varidnaviria realm include viruses of the Adenoviridae and Poxviridae families. In some embodiments, the DNA virus-based vector is therefore a vector derived from an Adenoviridae, such as, e.g., from an adenovirus; or derived from Poxviridae, such as, e.g., from a vaccinia virus.
[0100] In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest can be excised from the vector. In some embodiments, the nucleic acid molecule encoding the at least one transgene of interest can be excised from the vector by the at least one ITR sequence, typically by a transposase, or variant or fragment thereof.
[0101] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof increases the nuclear localization of the nucleic acid molecule encoding the at least one transgene of interest.
[0102] As used herein, “increases the nuclear localization” means that the nuclear localization of the nucleic acid molecule encoding the at least one transgene of interest is increased at least 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 60-fold, 70-fold, 80-fold, 90-fold, 100-fold or more in presence of the protein or polypeptide comprising the transposase or a fragment thereof, compared to a condition in absence of the protein or polypeptide comprising the transposase or a fragment thereof.
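The fold-increase criterion above compares a measurement made in presence of the transposase with the same measurement made in its absence. A minimal sketch of that comparison follows. It assumes that nuclear localization has been quantified as a single positive number per condition (for example, a nuclear signal intensity), which is an assumption of this example and not a measurement method stated in the text. The function names are hypothetical.

```python
def fold_increase(with_transposase: float, without_transposase: float) -> float:
    """Ratio of the measurement with transposase to the one without.

    Assumption: both values are the same kind of nuclear localization
    readout, and the control value is strictly positive.
    """
    if without_transposase <= 0:
        raise ValueError("control measurement must be positive")
    return with_transposase / without_transposase


def meets_threshold(
    with_transposase: float,
    without_transposase: float,
    threshold: float = 1.2,
) -> bool:
    """True if the fold increase reaches the chosen threshold.

    The default of 1.2 is the lowest fold value listed in the text.
    """
    return fold_increase(with_transposase, without_transposase) >= threshold


# Hypothetical readouts in arbitrary units: 30 with transposase, 10 without.
example_fold = fold_increase(30.0, 10.0)
```

Any of the listed fold values (1.2-fold up to 100-fold or more) can be passed as the threshold argument.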
[0103] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof translocates, or promotes translocation of, the nucleic acid molecule encoding the at least one transgene of interest to the nucleus. In some embodiments, the transposase or a fragment thereof (i) interacts and/or non-covalently binds with the nucleic acid molecule encoding the at least one transgene of interest, and (ii) translocates to the nucleus.
[0104] In some embodiments, the transposase or a fragment thereof does not insert and/or transpose the nucleic acid molecule encoding the at least one transgene of interest into the genome.
[0105] In some embodiments, the transposase or a fragment thereof increases at least 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 60-fold, 70-fold, 80-fold, 90-fold, 100-fold or more, the passage of the nucleic acid molecule encoding the at least one transgene of interest through the nuclear envelope.
[0106] In some embodiments, the transposase or a fragment thereof protects the nucleic acid molecule encoding the at least one transgene of interest from the cell’s sensing mechanisms and/or defense mechanisms. In some embodiments, the transposase or a fragment thereof does not induce a proinflammatory and/or inflammatory response.
[0107] In some embodiments, the transposase is selected from the group consisting of hyperactive PiggyBac transposase, PiggyBac transposase, Sleeping Beauty transposase, SB11 transposase, Tol2 transposase, Mos1 transposase, and Frog Prince transposase.
[0108] In some embodiments, the transposase is selected from the group consisting of hyperactive PiggyBac transposase and Sleeping Beauty transposase.
[0109] In some embodiments, the transposase is a hyperactive PiggyBac transposase.
[0110] In some embodiments, the transposase is a hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1.
[Figure imgf000025_0001: amino acid sequence of SEQ ID NO: 1]
[0111] In some embodiments, the transposase is a modified hyperactive PiggyBac transposase. By “modified hyperactive PiggyBac transposase”, it is meant a transposase comprising one or more amino acid substitutions, typically no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions, as compared to the hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1. More specifically, a modified hyperactive PiggyBac transposase may comprise (i) one or more amino acid substitutions to increase excision activity as compared to the hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1, and/or (ii) one or more amino acid substitutions to decrease DNA binding activity as compared to the hyperactive PiggyBac transposase with the amino acid sequence of SEQ ID NO: 1.
[0112] In some embodiments, the modified hyperactive PiggyBac transposase comprises an amino acid sequence at least 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, or 99 % identical to the sequence set forth in SEQ ID NO: 1.
[0113] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity.
[0114] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity selected among the amino acid mutations within the regions defined by the amino acid position numbers [194-200], [214-222], [434-442] or [446-456]; for example, an amino acid substitution at position D198, D201, R202, M212 and/or S213; said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0115] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity selected among the amino acid mutations at positions 450, 560, 564, 573, 589, 592, and/or 594; said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0116] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity selected among the amino acid mutations at position of M194 and/or D450, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1; preferably the amino acid substitution is M194V and/or D450N.
[0117] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity.
[0118] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among the amino acid mutations at positions 254, 275, 277, 347, 372, 375, and/or 465; said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0119] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among R275, N347, R372, K375, R376, E377, and E380, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0120] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among R372, K375, R376, E377, and E380, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1; preferably the amino acid substitution is R372A, K375A, R376A, E377A, and/or E380A.
[0121] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to decrease DNA binding activity selected among N347, R372, and K375, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1; preferably the amino acid substitution is N347S, N347A, R372A, and/or K375A; more preferably the amino acid substitution is N347S or N347A.
[0122] In some embodiments, the modified hyperactive PiggyBac transposase comprises one or more amino acid mutations to increase excision activity, as defined above; and one or more amino acid mutations to decrease DNA binding activity, as defined above.
[0123] In some embodiments, the modified hyperactive PiggyBac transposase includes at least one amino acid substitution to increase excision activity at position D450, and at least two amino acid substitutions to decrease DNA binding activity at positions N347, R372 and K375; preferably said modified transposase of hyperactive PiggyBac comprises the double mutations N347S and D450N or triple mutations D450N, R372A and K375A, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0124] In some embodiments, the modified transposase of hyperactive PiggyBac comprises the double mutations N347S and D450N, said position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0125] In some embodiments, the modified hyperactive PiggyBac transposase as disclosed in the previous embodiments further comprises at least one mutation in the region defined by the amino acid position numbers [158-169], for example A166S; and/or at least one mutation at position Y527, R518, K525, N463.
[0126] In some embodiments, the modified hyperactive PiggyBac transposase further comprises one or more amino acid substitution at positions 34, 43, 117, 202, 230, 245, 268, 275, 277, 287, 290, 315, 325, 341, 346, 347, 350, 351, 356, 357, 388, 409, 411, 412, 432, 447, 460, 461, 465, 517, 560, 564, 571, 573, 576, 586, 587, 589, 592, and/or 594, the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0127] In some embodiments, the modified hyperactive PiggyBac transposase comprises one of the following amino acid substitution or combination of amino acid substitutions: V34M, T43I, Y177H, R202K, S230N, R245A, D268N, K287A, K290A, K287A/K290A, R315A, G325A, R341A, D346N, N347A, N347S, T350A, S351E, S351P, S351A, K356E, N357A, R388A, K409A, A411T, K412A, K432A, D447A, D447N, D450N, R460A, K461A, W465A, S517A, T560A, S564P, S571N, S573A, K576A, H586A, I587A, M589V, S592G, F594L, D450N/R372A/K375A, R275A/R277A, K409A/K412A, R460A/K461A,
R275A/R277A/N347S/K375A/T560A/S573A/M589V/S592G or
R245A/R275A/R277A/R372A/W465A.
[0128] In some embodiments, the modified hyperactive PiggyBac transposase comprises one of the following amino acid substitution or combination of amino acid substitutions:
R372A/K375A/D450N,
R372A/K375A/R376A/D450N,
K375A/R376A/E377A/E380A/D450N,
R372A/K375A/R376A/E377A/E380A/D450N,
M194V,
M194V/R372A/K375A,
S351A/R372A/K375A/R388A/D450N/W465A/S573A/M589V/S592G/F594L,
R245A/R275A/R277A/R372A/W465A/M589V,
R275A/325A/R372A/T560A,
N347A/D450N,
N347S/D450N/T560A/S573A/F594L,
R202K/R275A/N347S/R372A/D450N/T560A/F594L,
R275A/N347S/K375A/D450N/S592G,
R275A/N347S/R372A/D450N/T560A/F594L,
R275A/R277A/N347S/R372A/D450N/T560A/S564P/F594L,
R245A/N347S/R372A/D450N/T560A/S564P/S573A/S592G,
R277A/G325A/N347A/K375A/D450N/T560A/S564P/S573A/S592G/F594L,
V34M/R275A/G325A/N347S/S351A/R372A/K375A/D450N/T560A/S564P,
G325A/N347S/K375A/D450N/S573A/M589V/S592G,
S230N/R277A/N347S/K375A/D450N,
T43I/R372A/K375A/A411T/D450N,
G325A/N347S/S351A/K375A/D450N/S573A/M589V/S592G, or
Y177H/R275A/G325A/K375A/D450N/T560A/S564P/S592G, the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0129] Some preferred modified hyperactive PiggyBac transposases include modified hyperactive PiggyBac comprising one of the following combinations of amino acid substitutions:
R372A/K375A/D450N,
S351A/R372A/K375A/R388A/D450N/W465A/S573A/M589V/S592G/F594L,
R245A/R275A/R277A/R372A/W465A/M589V,
N347A/D450N,
N347S/D450N/T560A/S573A/F594L,
R202K/R275A/N347S/R372A/D450N/T560A/F594L,
R275A/N347S/K375A/D450N/S592G,
R275A/N347S/R372A/D450N/T560A/F594L,
R275A/R277A/N347S/R372A/D450N/T560A/S564P/F594L,
R245A/N347S/R372A/D450N/T560A/S564P/S573A/S592G,
R277A/G325A/N347A/K375A/D450N/T560A/S564P/S573A/S592G/F594L,
V34M/R275A/G325A/N347S/S351A/R372A/K375A/D450N/T560A/S564P,
G325A/N347S/K375A/D450N/S573A/M589V/S592G,
S230N/R277A/N347S/K375A/D450N,
T43I/R372A/K375A/A411T/D450N,
G325A/N347S/S351A/K375A/D450N/S573A/M589V/S592G,
Y177H/R275A/G325A/K375A/D450N/T560A/S564P/S592G, or
R275A/325A/R372A/T560A, the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0130] In some embodiments, the modified hyperactive PiggyBac transposase comprises one of the following combinations of amino acid substitutions:
R245A/R275A/R277A/R372A/W465A/M589V,
R275A/325A/R372A/T560A,
N347A/D450N,
N347S/D450N/T560A/S573A/F594L,
R202K/R275A/N347S/R372A/D450N/T560A/F594L,
R275A/N347S/K375A/D450N/S592G,
R275A/N347S/R372A/D450N/T560A/F594L,
R275A/R277A/N347S/R372A/D450N/T560A/S564P/F594L,
R245A/N347S/R372A/D450N/T560A/S564P/S573A/S592G,
R277A/G325A/N347A/K375A/D450N/T560A/S564P/S573A/S592G/F594L,
G325A/N347S/K375A/D450N/S573A/M589V/S592G,
S230N/R277A/N347S/K375A/D450N,
G325A/N347S/S351A/K375A/D450N/S573A/M589V/S592G, or
Y177H/R275A/G325A/K375A/D450N/T560A/S564P/S592G, the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0131] In some preferred embodiments, the modified hyperactive PiggyBac transposase comprises one of the following combinations of amino acid substitutions:
N347A/D450N,
N347S/D450N/T560A/S573A/F594L,
R202K/R275A/N347S/R372A/D450N/T560A/F594L,
R275A/N347S/K375A/D450N/S592G,
R275A/N347S/R372A/D450N/T560A/F594L,
R275A/R277A/N347S/R372A/D450N/T560A/S564P/F594L,
R245A/N347S/R372A/D450N/T560A/S564P/S573A/S592G,
R277A/G325A/N347A/K375A/D450N/T560A/S564P/S573A/S592G/F594L,
G325A/N347S/K375A/D450N/S573A/M589V/S592G,
S230N/R277A/N347S/K375A/D450N,
G325A/N347S/S351A/K375A/D450N/S573A/M589V/S592G, or
Y177H/R275A/G325A/K375A/D450N/T560A/S564P/S592G, the position number corresponding to the amino acid number of the hyperactive PiggyBac with SEQ ID NO: 1.
[0132] In some embodiments, the modified hyperactive PiggyBac transposase comprises the R372A/K375A/D450N substitutions, said position numbers corresponding to the amino acid numbers of the hyperactive PiggyBac with SEQ ID NO: 1. Said modified transposase has an amino acid sequence of SEQ ID NO: 4.
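The substitution notation used in the preceding paragraphs (e.g., R372A/K375A/D450N) can be read mechanically: each token names the residue of the reference sequence, its 1-based position, and the replacing residue. The following is a minimal Python sketch of that reading, given for illustration only. The function names `parse_substitutions` and `apply_substitutions` are hypothetical, and the short reference string in the usage note is a toy sequence, not SEQ ID NO: 1.

```python
import re

# One token such as "D450N": reference residue, 1-based position, new residue.
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(combo):
    """Split a combination like 'R372A/K375A/D450N' into (ref, position, new) tuples."""
    parsed = []
    for token in combo.split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution token: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_substitutions(reference, combo):
    """Return a copy of `reference` carrying every substitution in `combo`.

    Positions are numbered from 1 against the reference sequence. A token
    whose stated reference residue does not match the sequence is rejected,
    which catches numbering mistakes early.
    """
    residues = list(reference)
    for ref_aa, position, new_aa in parse_substitutions(combo):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference")
        if residues[position - 1] != ref_aa:
            raise ValueError(
                f"expected {ref_aa} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new_aa
    return "".join(residues)
```

With the toy reference `"MKRDN"`, calling `apply_substitutions("MKRDN", "R3A/D4N")` changes positions 3 and 4 and leaves the rest untouched. Applied to the actual reference sequence, the same reading would be used for the combinations listed above.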
[0133] In some embodiments, the modified hyperactive PiggyBac transposase has an amino acid sequence selected from the group comprising or consisting of SEQ ID NOs: 5-26.
[0134] In some embodiments, the modified hyperactive PiggyBac transposase has an amino acid sequence selected from the group comprising or consisting of SEQ ID NOs: 5-13.
[0135] In some embodiments, the modified hyperactive PiggyBac transposase has an amino acid sequence selected from the group comprising or consisting of SEQ ID NOs: 14-26.
[0136] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in the conserved catalytic triad, e.g., at amino acid 268 and/or 346 (e.g., D268N and/or D346N) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0137] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are critical for excision, e.g., at amino acid 287, 287/290 and/or 460/461 (e.g., K287A, K287A/K290A, and/or R460A/K461A) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0138] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in target joining, e.g., at amino acid 351, 356, and/or 379 (e.g., S351E, S351P, S351A, and/or K356E) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0139] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are critical for integration, e.g., at amino acid 560, 564, 571, 573, 589, 592, and/or 594 (e.g., T560A, S564P, S571N, S573A, M589V, S592G, and/or F594L) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0140] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in alignment, e.g., at amino acid 325, 347, 350, 357 and/or 465 (e.g., G325A, N347A, N347S, T350A and/or W465A) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0141] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are well conserved, e.g., at amino acid 576 and/or 587 (e.g., K576A and/or I587A) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0142] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in Zn2+ binding, e.g., 586 (e.g., H586A) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0143] In some embodiments, the modified hyperactive PiggyBac transposase may comprise one or more mutations relative to hyperactive PiggyBac transposase that are involved in integration, e.g., 315, 341, 372, and/or 375 (e.g., R315A, R341A, R372A, and/or K375A) corresponding to the amino acid numbering of SEQ ID NO: 1.
[0144] In some embodiments, the modified hyperactive PiggyBac is selected for its high specificity of DNA integration into a genome compared to hyperactive PiggyBac.
[0145] In some embodiments, the modified hyperactive PiggyBac comprises an amino acid sequence having one or more of the modifications disclosed herein relative to SEQ ID NO: 1, and retains at least 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 % or more sequence identity with any of SEQ ID NOs: 5-26.
[0146] In some embodiments, the modified hyperactive PiggyBac transposase may comprise a mutation of one or more of amino acids selected from amino acid 245, 275, 277, 325, 347, 351, 372, 375, 388, 450, 465, 560, 564, 573, 589, 592, and 594, corresponding to the amino acid numbering of SEQ ID NO: 1.
[0147] In some embodiments, the modified hyperactive PiggyBac transposase mutation may comprise one or more of the amino acid substitutions selected from R245A, R275A, R277A, R275A/R277A, G325A, N347A, N347S, S351E, S351P, S351A, R372A, K375A, R388A, D450N, W465A, T560A, S564P, S573A, M589V, S592G, and F594L corresponding to the amino acid numbering of SEQ ID NO: 1.
[0148] In some embodiments, the modified hyperactive PiggyBac transposase comprises an amino acid substitution D450N corresponding to the amino acid numbering of SEQ ID NO: 1.
[0149] In some embodiments, the modified hyperactive PiggyBac transposase comprises the amino acid substitutions R245A and D450, corresponding to the amino acid numbering of SEQ ID NO: 1.
[0150] In some embodiments, the modified hyperactive PiggyBac transposase comprises the amino acid substitutions R245A, G325A, and S573P, corresponding to the amino acid numbering of SEQ ID NO: 1.
[0151] In some embodiments, the modified hyperactive PiggyBac transposase comprises the amino acid substitutions R245A, G325A, D450 and S573P, corresponding to the amino acid numbering of SEQ ID NO: 1.
[0152] In some embodiments, the modified hyperactive PiggyBac transposase comprises the amino acid substitution N347S or N347A, corresponding to the amino acid numbering of SEQ ID NO: 1.
[0153] In some embodiments, the modified hyperactive PiggyBac transposase comprises the amino acid substitutions N347S and D450N, corresponding to the amino acid numbering of SEQ ID NO: 1.
[0154] In some embodiments, the modified hyperactive PiggyBac transposase comprises the amino acid substitutions N347A and D450N, corresponding to the amino acid numbering of SEQ ID NO: 1. This modified hyperactive PiggyBac transposase comprises the amino acid sequence of SEQ ID NO: 14.
[0155] In some embodiments, the modified hyperactive PiggyBac transposase comprises an amino acid sequence with SEQ ID NO: 1, wherein: amino acid residue at position 34 is V or M, amino acid residue at position 43 is T or I, amino acid residue at position 177 is Y or H, amino acid residue at position 202 is R or K, amino acid residue at position 230 is S or N, amino acid residue at position 245 is A, amino acid residue at position 268 is D or N, amino acid residue at position 277 is R or A, amino acid residue at position 275 is R or A, amino acid residue at position 277 is R or A, amino acid residue at position 325 is A or G, amino acid residue at position 347 is S, or A, amino acid residue at position 351 is E, P or A, amino acid residue at position 372 is R or A, amino acid residue at position 375 is K or A, amino acid residue at position 388 is R or A, amino acid residue at position 409 is K or A, amino acid residue at position 411 is A or T, amino acid residue at position 412 is K or A, amino acid residue at position 450 is D or N, amino acid residue at position 460 is R or A, amino acid residue at position 465 is W or A, amino acid residue at position 517 is S or A, amino acid residue at position 560 is T or A, amino acid residue at position 564 is P or S, amino acid residue at position 571 is S or N, amino acid residue at position 573 is S or A, amino acid residue at position 576 is K or A, amino acid residue at position 586 is H or A, amino acid residue at position 587 is I or A, amino acid residue at position 589 is M or V, amino acid residue at position 592 is G or S, and/or, amino acid residue at position 594 is L or F.
[0156] In some embodiments, the modified hyperactive PiggyBac transposase comprises or consists of an amino acid sequence having at least 75 %, 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 %, 99.5 %, 99.9 % or more sequence identity with the amino acid sequence selected from the group comprising or consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26.
[0157] In some embodiments, the modified hyperactive PiggyBac transposase comprises or consists of an amino acid sequence having at least 75 %, 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 %, 99.5 %, 99.9 % or more sequence identity with the amino acid sequence selected from the group comprising or consisting of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26.
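As an illustration of the percentage thresholds recited above, the following Python sketch computes percent identity for two sequences that are already aligned without gaps, which is the situation for variants that differ from a reference only by substitutions. This is a simplifying assumption: sequence identity is usually determined with an alignment algorithm that also handles insertions and deletions, and nothing here is meant to restate how identity is determined for the sequences of this disclosure. The function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions carrying the same residue in two gap-free,
    equal-length sequences (a simplification; see the assumptions above)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, threshold_percent):
    """True when the two sequences share at least `threshold_percent` identity."""
    return percent_identity(seq_a, seq_b) >= threshold_percent
```

For two toy 10-residue sequences differing at a single position, `percent_identity` returns 90.0, so `meets_threshold(..., 90)` is true while `meets_threshold(..., 95)` is false.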
[0158] In some embodiments, the modified transposase is not a Himar1C9 mutant.
[0159] In some embodiments, the transposase is a Sleeping Beauty transposase.
[0160] In some embodiments, the transposase is a hyperactive Sleeping Beauty SB 100 transposase with the amino acid sequence of SEQ ID NO: 61.
[Figure: amino acid sequence of SEQ ID NO: 61, provided as an image in the original publication]
[0161] In some embodiments, the transposase is a modified hyperactive Sleeping Beauty SB 100 transposase, comprising at least one amino acid substitution compared with the amino acid sequence of an unmodified Sleeping Beauty transposase, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more substitutions.
[0162] In some embodiments, the modified hyperactive Sleeping Beauty SB 100 transposase comprises one or more of the amino acid substitutions at a position selected from C176, H187, I212, P247 and K248, corresponding to the amino acid numbering of SEQ ID NO: 61.
[0163] In some embodiments, the modified hyperactive Sleeping Beauty SB 100 transposase comprises one or more of the amino acid substitutions selected from C176S, H187V/P, I212S, P247R/S, and K248A/C/I/L/M/N/R/S/T/V, corresponding to the amino acid numbering of SEQ ID NO: 61.
[0164] In some embodiments, the transposase or the fragment thereof has decreased catalytic activity. As used herein, “decreased catalytic activity” means a 1.2-fold, 1.3-fold, 1.4-fold, 1.5-fold, 1.6-fold, 1.7-fold, 1.8-fold, 1.9-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold decrease, or more, compared to a wild-type transposase and/or to a hyperactive transposase.
[0165] In some embodiments, the transposase or the fragment thereof is catalytically dead. In some embodiments, the transposase or the fragment thereof does not exert any catalytic activity. In some embodiments, catalytically dead transposases retain their ability to bind to other proteins, polypeptides and/or nucleic acid molecules.
[0166] In some embodiments, the catalytically dead transposase has at least 75 %, 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 %, 99.5 %, 99.9 % or more sequence identity with the catalytically dead hyperactive PiggyBac transposase (dead hyPB) with SEQ ID NO: 3. In some embodiments, the dead hyPB comprises or consists of the amino acid sequence with SEQ ID NO: 3.
[Figure: amino acid sequence of SEQ ID NO: 3, provided as an image in the original publication]
[0167] In some embodiments, the protein or polypeptide comprises a transposase fragment.
[0168] In some embodiments, the transposase fragment comprises or consists of at least one transposase functional domain.
[0169] In some embodiments, the transposase fragment comprises or consists of an ITR-binding domain.
[0170] Examples of ITR-binding domains include, without limitation, SEQ ID NOs: 62 to 64, wherein SEQ ID NOs: 62 and 63 are ITR-binding domains of the hyperactive PiggyBac transposase and SEQ ID NO: 64 is an ITR-binding domain of the Sleeping Beauty transposase (also called “N57 targeting domain”).
SEQ ID NO: 62
ILPKEVPGTSDDSTEEPVMKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCQSCF
SEQ ID NO: 63
MKKRTYCTYCPSKIRRKASASCKKCKKVICREHNIDMCQSCF
SEQ ID NO: 64
MGKSKEISQDLRKRIVDLHKSGSSLGAISKRLAVPRSSVQTIVRKYKHHGTTQPSYR
[0171] In some embodiments, the transposase fragment is a fragment of a hyperactive PiggyBac transposase or of a Sleeping Beauty transposase.
[0172] In some embodiments, the transposase fragment is a fragment of a hyperactive PiggyBac transposase, preferably comprising an ITR-binding domain.
[0173] In some embodiments, the transposase fragment is a fragment of a Sleeping Beauty transposase, preferably selected from the group consisting of the SB 100 domain and the N57 domain of Sleeping Beauty transposase.
[0174] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least one additional polypeptide or protein.
[0175] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least two additional polypeptides or proteins.
[0176] In some embodiments, the fusion protein comprises a single, contiguous polypeptide chain.
[0177] In some embodiments, the fusion protein further comprises at least one linker, in particular at least one peptide linker, between two polypeptides or proteins of the fusion protein.
[0178] Exemplary linkers include, without limitation, (G)n, (GS)n, (GGS)n, (GGGS)n (with SEQ ID NO: 49), (GGGGS)n (with SEQ ID NO: 50), (EAAAK)n (with SEQ ID NO: 51), XTEN linkers, and (XP)n linkers, as well as any combinations thereof, wherein n is an integer between 1 and 50.
[0179] In some embodiments, the peptide linker is a glycine/serine-rich linker. In some embodiments, the peptide linker has an amino acid sequence selected from the group comprising or consisting of SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45 and SEQ ID NO: 46.
SEQ ID NO: 41 GGSGGGSGGG
SEQ ID NO: 42 GGSGGSGGSGGS
SEQ ID NO: 43 GGSGGSGGSGGSGGS
SEQ ID NO: 44 GGSGGSGGSGGSGGSG
SEQ ID NO: 45 GGSGGSGGSGGSGGSGGSGGS
SEQ ID NO: 46 GGSGGSGGSGGSGGSGGSGGSGGS
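The repeat notation used for the exemplary linkers, such as (GGS)n with n an integer between 1 and 50, can be expanded as in the following Python sketch. It is given for illustration only, and the function name `repeat_linker` is hypothetical. Only linkers that are exact repeats of one unit are covered; for example, SEQ ID NO: 42 and SEQ ID NO: 43 correspond to (GGS)4 and (GGS)5, whereas SEQ ID NO: 41 and SEQ ID NO: 44 are not exact multiples of GGS.

```python
def repeat_linker(unit, n):
    """Expand a repeat-type linker such as (GGS)n into its amino acid sequence.

    `unit` is the repeated motif in one-letter code and `n` the number of
    repeats, restricted here to the 1-50 range recited in the text.
    """
    if not unit or not unit.isalpha() or not unit.isupper():
        raise ValueError("unit must be a non-empty one-letter-code motif")
    if not isinstance(n, int) or not 1 <= n <= 50:
        raise ValueError("n must be an integer between 1 and 50")
    return unit * n
```

For instance, `repeat_linker("GGS", 4)` yields the 12-residue sequence listed as SEQ ID NO: 42, and `repeat_linker("GGS", 8)` yields the 24-residue sequence listed as SEQ ID NO: 46.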
[0180] In some embodiments, the peptide linker is an XTEN linker comprising or consisting of SEQ ID NO: 47.
SEQ ID NO: 47: SGSETPGTSESATPES
[0181] In some embodiments, the peptide linker comprises or consists of SEQ ID NO: 48.
SEQ ID NO: 48 GSAGSAAGSGEF
[0182] In some embodiments, the peptide linker comprises or consists of SEQ ID NO: 55.
SEQ ID NO: 55: GIHGVPAA
[0183] Methods which are well-known to those skilled in the art can be used to construct expression vectors containing the coding sequence of a fusion protein along with appropriate transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. See, for example, the techniques described in Sambrook et al., 2012 (Molecular cloning: A laboratory manual (4th ed.). Cold Spring Harbor Laboratory Press). The expression vector can be part of a plasmid, virus, or may be a nucleic acid fragment. The expression vector includes an expression cassette into which the polynucleotide encoding the fusion protein (i.e., the coding region) is cloned in operable association with a promoter and/or other transcription or translation control elements. As used herein, a “coding region” is a portion of nucleic acid which consists of codons translated into amino acids. Although a “stop codon” (TAG, TGA, or TAA) is not translated into an amino acid, it may be considered to be part of a coding region, if present, but any flanking sequences, for example promoters, ribosome binding sites, transcriptional terminators, introns, 5’- and 3’-untranslated regions, and the like, are not part of a coding region. Two or more coding regions can be present in a single polynucleotide construct, e.g., on a single vector, or in separate polynucleotide constructs, e.g., on separate (different) vectors. Furthermore, any vector may contain a single coding region, or may comprise two or more coding regions, e.g., a vector of the present invention may encode one or more polypeptides, which are post- or co-translationally separated into the final proteins via proteolytic cleavage.
In addition, a vector, polynucleotide, or nucleic acid of the invention may encode heterologous coding regions, either fused or unfused to a polynucleotide encoding the fusion protein (fragment) of the invention, or variant or derivative thereof. Heterologous coding regions include without limitation specialized elements or motifs, such as a secretory signal peptide or a heterologous functional domain. An operable association is when a coding region for a gene product, e.g., a polypeptide, is associated with one or more regulatory sequences in such a way as to place expression of the gene product under the influence or control of the regulatory sequence(s). Two DNA fragments (such as a polypeptide coding region and a promoter associated therewith) are “operably associated” if induction of promoter function results in the transcription of mRNA encoding the desired gene product and if the nature of the linkage between the two DNA fragments does not interfere with the ability of the expression regulatory sequences to direct the expression of the gene product or interfere with the ability of the DNA template to be transcribed. Thus, a promoter region would be operably associated with a nucleic acid encoding a polypeptide if the promoter was capable of effecting transcription of that nucleic acid. The promoter may be a cell-specific promoter that directs substantial transcription of the DNA only in predetermined cells. Other transcription control elements, besides a promoter, for example enhancers, operators, repressors, and transcription termination signals, can be operably associated with the polynucleotide to direct cell-specific transcription. Suitable promoters and other transcription control regions are disclosed herein. A variety of transcription control regions are known to those skilled in the art. 
These include, without limitation, transcription control regions, which function in vertebrate cells, such as, but not limited to, promoter and enhancer segments from cytomegaloviruses (e.g., the immediate early promoter, in conjunction with intron-A), simian virus 40 (e.g., the early promoter), and retroviruses (such as, e.g., Rous sarcoma virus). Other transcription control regions include those derived from vertebrate genes such as actin, heat shock protein, bovine growth hormone and rabbit a-globin, as well as other sequences capable of controlling gene expression in eukaryotic cells. Additional suitable transcription control regions include tissue-specific promoters and enhancers as well as inducible promoters (e.g., promoters inducible by tetracyclines). Similarly, a variety of translation control elements are known to those of ordinary skill in the art. These include, but are not limited to, ribosome binding sites, translation initiation and termination codons, and elements derived from viral systems (particularly an internal ribosome entry site, or IRES, also referred to as a CITE sequence). The expression cassette may also include other features such as an origin of replication, and/or chromosome integration elements such as retroviral long terminal repeats (LTRs), or adeno-associated viral (AAV) inverted terminal repeats (ITRs).
[0184] Fusion proteins prepared as described herein may be purified by art-known techniques such as high-performance liquid chromatography, ion exchange chromatography, gel electrophoresis, affinity chromatography, size exclusion chromatography, and the like. The actual conditions used to purify a particular protein will depend, in part, on factors such as net charge, hydrophobicity, hydrophilicity etc., and will be apparent to those having skill in the art. For affinity chromatography purification an antibody, ligand, receptor or antigen can be used to which the fusion protein binds. The purity of the fusion protein can be determined by any of a variety of well-known analytical methods including gel electrophoresis, high pressure liquid chromatography, and the like.
[0185] In some embodiments, the nucleic acid encoding the fusion protein may be expressed as a single nucleic acid molecule that encodes the entire fusion protein or as multiple (e.g., two or more) nucleic acid molecules that are co-expressed. Polypeptides encoded by nucleic acid molecules that are co-expressed may associate through, e.g., disulfide bonds or other means, to form a functional fusion protein.
[0186] In some embodiments, the fusion protein comprises or consists of the transposase or the fragment thereof fused at the C-terminal end of the at least one protein or polypeptide, either directly or indirectly via a linker. In other embodiments, the fusion protein comprises or consists of the transposase or a fragment thereof fused at the N-terminal end of the at least one protein or polypeptide, either directly or indirectly via a linker.
[0187] In some embodiments, the fusion protein is a triple fusion protein, i.e., the fusion protein comprises the transposase or the fragment thereof, and at least two additional polypeptides or proteins.
[0188] In some embodiments, the fusion protein further comprises a nuclear localization sequence (NLS). In general, one or more NLS should have sufficient strength to drive the accumulation of the fusion protein in the nucleus of the cell. In general, the strength of the nuclear localization activity is determined by the number and position of NLSs, and one or more specific NLSs used in the fusion protein.
[0189] In some embodiments, the NLSs may be located at the N-terminus and/or the C-terminus of the fusion protein. In some embodiments, the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the N-terminus. In some embodiments, the fusion protein comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the C-terminus. In some embodiments, the fusion protein comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus.
[0190] Where there is more than one NLS, each NLS may be selected independently of the other NLSs. In some embodiments, the fusion protein comprises two NLSs, for example, the two NLSs are located at the N-terminus and the C-terminus, respectively.
[0191] In general, an NLS consists of one or more short sequences of positively charged lysine or arginine residues exposed on the surface of a protein, but other types of NLS are also known in the art. Non-limiting examples of NLSs include (M)KKRKV (with SEQ ID NO: 52), (M)PKKKRKV (with SEQ ID NO: 53), or (M)SGGSPKKKRKV (with SEQ ID NO: 54), wherein (M) denotes an initiator methionine which may be present, or be post-translationally removed, when the NLS is located at the N-terminus, and which is absent when the NLS is located at the C-terminus.
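The “(M)” convention for NLS sequences and the joining of fusion protein segments through a linker can be sketched in Python as follows. This is an illustrative reading only: the function names `nls_variant` and `join_segments` are hypothetical, the placeholder segment in the usage note is a toy sequence, and the arrangement shown is not one of the constructs of this disclosure.

```python
def nls_variant(core, at_n_terminus, keep_initiator_met=True):
    """Return the NLS as it would appear at the chosen end of a protein.

    `core` is the NLS without the optional initiator methionine, e.g.
    'PKKKRKV'. The methionine is only considered at the N-terminus, where
    it may be present or removed; it is never added at the C-terminus.
    """
    if at_n_terminus and keep_initiator_met:
        return "M" + core
    return core


def join_segments(segments, linker=""):
    """Concatenate segments from N- to C-terminus, placing `linker`
    between each adjacent pair (an empty linker means direct fusion)."""
    if not segments:
        raise ValueError("at least one segment is required")
    return linker.join(segments)
```

As a usage example, `join_segments([nls_variant("PKKKRKV", True), "AAA"], linker="GGS")` places an N-terminal NLS with its initiator methionine, a GGS linker, and the toy segment `AAA` in a single contiguous chain.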
[0192] In addition, the fusion protein may also include other localization sequences, such as cytoplasmic localization sequences, chloroplast localization sequences, mitochondrial localization sequences, and the like, depending on the desired localization of the fusion protein in a cell.
[0193] In some embodiments, the at least one additional polypeptide or protein is a nuclease. In some embodiments, the nuclease is an endonuclease or an exonuclease. In some embodiments, the nuclease is a deoxyribonuclease or a ribonuclease.
[0194] In some embodiments, the at least one additional polypeptide or protein is an RNA-guided nuclease.
[0195] In some embodiments, the at least one additional polypeptide or protein is a Cas nuclease.
[0196] In some embodiments, the Cas nuclease is selected from the group comprising or consisting of Cas9, Cas12a (Cpf1), Cas12b, Cas12f, and CasX. It shall be understood that variants and functional fragments thereof are also encompassed, such as nickase Cas (nCas) or dead Cas (dCas) variants.
[0197] Examples of Cas9 nucleases include, without limitation, Streptococcus pyogenes Cas9 (SpCas9), Staphylococcus haemolyticus Cas9 (ShCas9), and Campylobacter jejuni Cas9 (CjCas9).
[0198] In some embodiments, the Cas nuclease has at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the sequence of SpCas9 with SEQ ID NO: 27, ShCas9 with SEQ ID NO: 28, Cpf1 with SEQ ID NO: 29, CjCas9 with SEQ ID NO: 30, nCas9 with SEQ ID NO: 31, and/or nCas9 with SEQ ID NO: 32.
Figure imgf000044_0001
LASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNK
HRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYE TRIDLSQLGGD
SEQ ID NO: 28
MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRR
RRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGV
HNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYV
KEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLM
GHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKK
KPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKI
LTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQI
AIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIEL
AREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS
LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDS
KISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTRYATRGL
MNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFI
FKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSH
RVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYH
HDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLN
AHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSK
CYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYL
ENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKG
SEQ ID NO: 29
SNAMSKLEKFTNCYSLSKTLRFKAIPVGKTQENIDNKRLLVEDEKRAEDYKGVKKLLD
RYYLSFINDVLHSIKLKNLNNYISLFRKKTRTEKENKELENLEINLRKEIAKAFKGNEGY
KSLFKKDIIETILPEFLDDKDEIALVNSFNGFTTAFTGFFDNRENMFSEEAKSTSIAFRCIN
ENLTRYISNMDIFEKVDAIFDKHEVQEIKEKILNSDYDVEDFFEGEFFNFVLTQEGIDVY
NAIIGGFVTESGEKIKGLNEYINLYNQKTKQKLPKFKPLYKQVLSDRESLSFYGEGYTSD
EEVLEVFRNTLNKNSEIFSSIKKLEKLFKNFDEYSSAGIFVKNGPAISTISKDIFGEWNVIR
DKWNAEYDDIHLKKKAVVTEKYEDDRRKSFKKIGSFSLEQLQEYADADLSVVEKLKEI
IIQKVDEIYKVYGSSEKLFDADFVLEKSLKKNDAVVAIMKDLLDSVKSFENYIKAFFGE
GKETNRDESFYGDFVLAYDILLKVDHIYDAIRNYVTQKPYSKDKFKLYFQNPQFMGGW
DKDKETDYRATILRYGSKYYLAIMDKKYAKCLQKIDKDDVNGNYEKINYKLLPGPNK
MLPKVFFSKKWMAYYNPSEDIQKIYKNGTFKKGDMFNLNDCHKLIDFFKDSISRYPKW SNAYDFNFSETEKYKDIAGFYREVEEQGYKVSFESASKKEVDKLVEEGKLYMFQIYNK DFSDKSHGTPNLHTMYFKLLFDENNHGQIRLSGGAELFMRRASLKKEELVVHPANSPIA NKNPDNPKKTTTLSYDVYKDKRFSEDQYELHIPIAINKCPKNIFKINTEVRVLLKHDDNP YVIGIDRGERNLLYIVVVDGKGNIVEQYSLNEIINNFNGIRIKTDYHSLLDKKEKERFEA RQNWTSIENIKELKAGYISQVVHKICELVEKYDAVIALEDLNSGFKNSRVKVEKQVYQ KFEKMLIDKLNYMVDKKSNPCATGGALKGYQITNKFESFKSMSTQNGFIFYIPAWLTSK IDPSTGFVNLLKTKYTSIADSKKFISSFDRIMYVPEEDLFEFALDYKNFSRTDADYIKKW KLYSYGNRIRIFRNPKKNNVFDWEEVCLTSAYKELFNKYGINYQQGDIRALLCEQSDK AFYSSFMALMSLMLQMRNSITGRTDVDFLISPVKNSDGIFYDSRNYEAQENAILPKNAD ANGAYNIARKVLWAIGQFKKAEDEKLDKVKIAISNKEWLEYAQTSVKH
SEQ ID NO: 30
MARILAFDIGISSIGWAFSENDELKDCGVRIFTKVENPKTGESLALPRRLARSARKRLAR RKARLNHLKHLIANEFKLNYEDYQSFDESLAKAYKGSLISPYELRFRALNELLSKQDFA RVILHIAKRRGYDDIKNSDDKEKGAILKAIKQNEEKLANYQSVGEYLYKEYFQKFKENS KEFTNVRNKKESYERCIAQSFLKDELKLIFKKQREFGFSFSKKFEEEVLSVAFYKRALKD FSHLVGNCSFFTDEKRAPKNSPLAFMFVALTRIINLLNNLKNTEGILYTKDDLNALLNEV LKNGTLTYKQTKKLLGLSDDYEFKGEKGTYFIEFKKYKEFIKALGEHNLSQDDLNEIAK DITLIKDEIKLKKALAKYDLNQNQIDSLSKLEFKDHLNISFKALKLVTPLMLEGKKYDE ACNELNLKVAINEDKKDFLPAFNETYYKDEVTNPVVLRAIKEYRKVLNALLKKYGKV HKINIELAREVGKNHSQRAKIEKEQNENYKAKKDAELECEKLGLKINSKNILKLRLFKE QKEFCAYSGEKIKISDLQDEKMLEIDHIYPYSRSFDDSYMNKVLVFTKQNQEKLNQTPF EAFGNDSAKWQKIEVLAKNLPTKKQKRILDKNYKDKEQKNFKDRNLNDTRYIARLVL NYTKDYLDFLPLSDDENTKLNDTQKGSKVHVEAKSGMLTSALRHTWGFSAKDRNNHL HHAIDAVIIAYANNSIVKAFSDFKKEQESNSAELYAKKISELDYKNKRKFFEPFSGFRQK VLDKIDEIFVSKPERKKPSGALHEETFRKEEEFYQSYGGKEGVLKALELGKIRKVNGKIV KNGDMFRVDIFKHKKTNKFYAVPIYTMDFALKVLPNKAVARSKKGEIKDWILMDENY
EFCFSLYKDSLILIQTKDMQEPEFVYYNAFTSSTVSLIVSKHDNKFETLSKNQKILFKNA NEKEVIAKSIGIQNLKVFEKYIVSALGEVTKAEFRQREDFKK
SEQ ID NO: 31
MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETA EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIF GNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDN SDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLF GNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNL
Figure imgf000047_0001
Figure imgf000047_0002
Figure imgf000048_0001
[0199] In some embodiments, the Cas nuclease has at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of ShCas9 with SEQ ID NO: 28 or SpCas9 with SEQ ID NO: 27.
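For illustration, a percentage of amino acid sequence identity of the kind recited above could be evaluated along the following lines. This is a minimal sketch, not a method defined in the specification: it assumes that the two sequences have already been aligned (gaps written as "-") by any standard alignment tool, and it takes the number of aligned columns as the denominator, which is only one of several conventions in use. The function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1  # identical residues (neither can be a gap here)
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment: 7 identical columns out of 8.
print(percent_identity("MKRNY-LG", "MKRNYILG"))
```

With the toy alignment shown, the candidate would satisfy an "at least 85 %" threshold but not an "at least 90 %" threshold.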
[0200] In some embodiments, the Cas nuclease is ShCas9 with SEQ ID NO: 28. In some embodiments, the Cas nuclease is SpCas9 with SEQ ID NO: 27.
[0201] In some embodiments, the at least one additional polypeptide or protein is a variant or a functional fragment of a Cas9 nuclease. A Cas9 variant typically comprises one or several amino acid substitutions as compared to the wild-type amino acid sequence of said Cas9. In some embodiments, the Cas9 variant is humanized Cas9 (hCas9) or a functional fragment thereof. As used herein, the term “humanized Cas9” or “hCas9” refers to an optimized Cas9 protein sequence for expression in human cells. In some embodiments, hCas9 has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 33. In some embodiments, hCas9 has an amino acid sequence consisting of SEQ ID NO: 33.
Figure imgf000048_0002
Figure imgf000049_0001
[0202] In some embodiments, the at least one additional polypeptide or protein is CasX. In some embodiments, CasX has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 34. In some embodiments, CasX has an amino acid sequence consisting of SEQ ID NO: 34.
Figure imgf000049_0002
Figure imgf000050_0001
[0203] In some embodiments, the at least one additional polypeptide or protein is a dead Cas9 protein (dCas9). In some embodiments, dCas9 has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 35. In some embodiments, dCas9 has an amino acid sequence consisting of SEQ ID NO: 35.
Figure imgf000050_0002
[0204] In some embodiments, the at least one additional polypeptide or protein is TnpB (Transposase B from transposon PsiTn554). In some embodiments, TnpB has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 36. In some embodiments, TnpB has an amino acid sequence consisting of SEQ ID NO: 36.
Figure imgf000051_0001
[0205] In some embodiments, the at least one additional polypeptide or protein is Cas12f. In some embodiments, Cas12f protein is from the bacterium Acidibacillus sulfuroxidans (AsCas12f). In some embodiments, Cas12f has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 37. In some embodiments, Cas12f has an amino acid sequence consisting of SEQ ID NO: 37.
Figure imgf000051_0002
[0206] In some embodiments, the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and a nuclease as described hereinabove.
[0207] In some embodiments, the fusion protein has at least 75 %, 80 %, 85 %, 90 %, 95 %, 96 %, 97 %, 98 %, 99 % or more amino acid sequence identity with SEQ ID NO: 2. In some embodiments, the fusion protein comprises or consists of the amino acid sequence of SEQ ID NO: 2.
Figure imgf000052_0001
Figure imgf000053_0001
[0208] In some embodiments, the at least one additional polypeptide or protein is an aptamer-binding protein.
[0209] In some embodiments, the at least one additional polypeptide or protein is an aptamer-binding protein selected from the group comprising or consisting of MS2 bacteriophage coat protein (MCP), PP7 coat protein (PCP), λN22 peptide and COM.
[0210] In some embodiments, the at least one additional polypeptide or protein is the MS2 bacteriophage coat protein (MCP). In some embodiments, MCP has an amino acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more amino acid sequence identity with the amino acid sequence of SEQ ID NO: 39.
Figure imgf000053_0002
[0211] MCP is capable of binding to an MS2 RNA tetraloop binding sequence (or “MS2 aptamer”). In some embodiments, the MS2 aptamer has a nucleic acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more nucleic acid sequence identity with the nucleic acid sequence of SEQ ID NO: 40.
Figure imgf000053_0003
[0212] In some embodiments, the fusion protein interacts, or is capable of interacting, covalently or non-covalently through MCP with a guide RNA (gRNA) molecule comprising at least one MS2 aptamer.
[0213] Accordingly, the method of the invention may further comprise contacting the cell population with a guide RNA (gRNA) fused to at least one aptamer.
[0214] In some embodiments, the aptamer is an RNA sequence comprising a tetraloop. The term “tetraloop” refers to a four-base hairpin loop motif. The term is used interchangeably herein with the terms “stem loop” or “hairpin loop”.
[0215] In some embodiments, the aptamer is an MS2 RNA tetraloop sequence (or MS2 aptamer). In some embodiments, the MS2 aptamer has a nucleic acid sequence having at least 80 %, 85 %, 90 %, 95 %, 99 % or more nucleic acid sequence identity with the nucleic acid sequence of SEQ ID NO: 40. In some embodiments, the aptamer comprises or consists of the nucleic acid sequence with SEQ ID NO: 40.
[0216] In some embodiments, the gRNA is capable of forming a complex with an RNA-guided nuclease as described hereinabove, e.g., a Cas protein or fusion protein comprising a Cas protein. In some embodiments, the gRNA molecule interacts, or is capable of interacting, covalently or non-covalently through the at least one MS2 aptamer with an RNA-guided nuclease.
[0217] In some embodiments, the gRNA is capable of targeting the RNA-guided nuclease as described hereinabove to a specific sequence or region of the genome of a cell, preferably a mammalian cell, more preferably a human cell. In some embodiments, the specific sequence targeted by the gRNA is adjacent to a protospacer adjacent motif (PAM) specific for the Cas protein.
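As a non-limiting illustration of PAM-adjacent targeting, the following sketch lists candidate target sequences on the forward strand of a DNA string. It assumes the 5'-NGG-3' PAM commonly associated with SpCas9 and a 20-nucleotide target length; neither value is fixed by the paragraph above, other Cas proteins use different PAMs, and a complete search would also scan the reverse complement. The function name `find_protospacers` is hypothetical.

```python
import re


def find_protospacers(dna, spacer_len=20):
    """Return (start, protospacer, pam) tuples for forward-strand NGG sites.

    Assumes an SpCas9-like 5'-NGG-3' PAM immediately 3' of the target.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Toy sequence: twenty A's followed by a TGG PAM and three trailing bases.
toy = "A" * 20 + "TGG" + "CCC"
for start, spacer, pam in find_protospacers(toy):
    print(start, spacer, pam)
```

Restricting the input string to a chosen genomic region would limit the candidates to that region; off-target assessment is outside the scope of this sketch.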
[0218] In some embodiments, the specific sequence targeted by the gRNA may be within a safe harbor locus in a cell’s genome. A “safe harbor locus” refers to a region of a cell’s genome, where the integrated material can be adequately expressed without perturbing endogenous gene structure or function. Safe harbor loci include, but are not limited to, AAVS1 (intron 1 of PPP1R12C), HPRT, H11, hRosa26, albumin and F-A region. The safe harbor locus may be an exon or an intron of a ubiquitously expressed gene and/or of a gene with tissue specific expression (e.g., muscle). Safe harbor loci can be selected from the group consisting of: exon 1, intron 1 or exon 2 of PPP1R12C; exon 1, intron 1 or exon 2 of HPRT; exon 1, intron 1 or exon 2 of hRosa26; or intron 1 of the albumin gene. A safe harbor locus may also include a region of the genome devoid of endogenous genes and with open chromatin that allows for the expression of the inserted transgene without perturbing the genome structure or function.
[0219] In some embodiments, the gRNA comprises 20, 25, 30, 35, 40, 45, 50 nucleotides or more.
[0220] In some embodiments, the protein or polypeptide comprising the transposase or the fragment thereof is delivered to the cell in the form of a protein.
[0221] In some embodiments, the protein or polypeptide comprising the transposase or the fragment thereof is delivered to the cell in the form of a nucleic acid encoding said transposase or fragment thereof. The nucleic acid may be a ribonucleic acid (RNA) or a deoxyribonucleic acid (DNA).
[0222] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is an RNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a messenger RNA (mRNA) molecule.
[0223] Alternatively, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof may be a DNA molecule. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is a complementary DNA (cDNA). In some embodiments, the DNA molecule may be comprised within a vector, such as, e.g., a plasmid.
[0224] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is linear or circular. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is double-stranded or single-stranded.
[0225] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transfection or transformation.
[0226] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transfection. Examples of transfection techniques include, without limitation, lipofection, electroporation, sonication, nanoparticles, microinjection and viral vector infection, including non-integrative and integrative viral vector infection.
[0227] In some embodiments, transfection is performed by lipofection. Means of performing lipofection are known in the art and comprise using a lipofection reagent such as, e.g., Lipofectamine™. In some embodiments, transfection is performed by electroporation. In some embodiments, transfection is performed by sonication. In some embodiments, transfection is performed by nanoparticles (e.g., polymeric nanoparticles such as JetPEI). In some embodiments, transfection is performed by microinjection. In some embodiments, transfection is performed by viral vector infection. In some embodiments, the viral vector is integrative or non-integrative, preferably non-integrative. Non-limitative examples of non-integrative viral vectors include adenoviral vectors, adeno-associated virus (AAV) vectors, poxviral vectors, herpes simplex virus vectors and the like.
[0228] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population by transformation. Examples of transformation techniques include, without limitation, the use of an integrative vector selected from the group comprising or consisting of integrating viral vectors, integrating plasmids, enzymes or genome editing methods. In some embodiments, transformation comprises using an integrative viral vector or a modified integrative virus.
[0229] In some embodiments, the integrative viral vector or a modified integrative virus is an integrating virus or variant thereof or mutant thereof selected from the group comprising or consisting of the Retroviridae, Adenoviridae, Flaviviridae, Herpesviridae, Hepadnaviridae, Papillomaviridae, Polyomaviridae, Parvoviridae, Arenaviridae, Bornaviridae, Bunyaviridae, Filoviridae and Paramyxoviridae families of viruses, preferably selected from the group comprising or consisting of the Retroviridae, Adenoviridae and Flaviviridae families of viruses. In some embodiments, the integrative viral vector belongs to the family of Retroviridae . In some embodiments, the integrative viral vector is a lentivirus, or a modified lentivirus.
[0230] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population together with the nucleic acid molecule encoding the at least one transgene of interest. In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof, and the nucleic acid molecule encoding the at least one transgene of interest, are delivered in the same vector. Alternatively, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof, and the nucleic acid molecule encoding the at least one transgene of interest, are delivered concomitantly in separate vectors.
[0231] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population prior to the nucleic acid molecule encoding the at least one transgene of interest, such as, e.g., 1, 2, 3, 4, 5, 10, 30, 60 minutes, 2, 3, 4, 5, 6, 12, 24, 36, 48, 60, 72 hours, 4, 5, 6, 7 days or more before the nucleic acid molecule encoding the at least one transgene of interest.
[0232] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 72 hours before the nucleic acid molecule encoding the at least one transgene of interest.
[0233] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 2 hours to 72 hours, from 3 hours to 72 hours, from 4 hours to 72 hours, from 5 hours to 72 hours, from 6 hours to 72 hours, from 7 hours to 72 hours, from 8 hours to 72 hours, from 9 hours to 72 hours, from 10 hours to 72 hours, from 11 hours to 72 hours, from 12 hours to 72 hours, from 24 hours to 72 hours, from 36 hours to 72 hours, from 48 hours to 72 hours before the nucleic acid molecule encoding the at least one transgene of interest.
[0234] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 60 hours, from 1 hour to 48 hours, from 1 hour to 36 hours, from 1 hour to 24 hours, from 1 hour to 12 hours, from 1 hour to 11 hours, from 1 hour to 10 hours, from 1 hour to 9 hours, from 1 hour to 8 hours, from 1 hour to 7 hours, from 1 hour to 6 hours, from 1 hour to 5 hours, from 1 hour to 4 hours, from 1 hour to 3 hours, from 1 hour to 2 hours before the nucleic acid molecule encoding the at least one transgene of interest.
[0235] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population from 1 hour to 12 hours, from 2 hours to 8 hours, from 3 hours to 6 hours before the nucleic acid molecule encoding the at least one transgene of interest.
[0236] In some embodiments, the nucleic acid encoding the protein or polypeptide comprising a transposase or a fragment thereof is delivered to the cell population about 4 hours before the nucleic acid molecule encoding the at least one transgene of interest.
[0237] In some embodiments, the protein or polypeptide comprising a transposase or a fragment thereof, or the nucleic acid encoding the same, is comprised in a pharmaceutical composition further comprising at least one pharmaceutically acceptable excipient.
[0238] As used herein, the term “pharmaceutically acceptable excipient” refers to an excipient that does not produce an adverse, allergic or other untoward reaction when administered to an animal, preferably a human, or to a population of cells. It includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like. An acceptable excipient refers to a non-toxic solid, semi-solid or liquid filler, diluent, encapsulating material or formulation auxiliary of any type. For human administration, preparations should meet sterility, pyrogenicity, general safety and purity standards as required by EMA or FDA Office of Biologics standards.
[0239] In some embodiments, the acceptable excipient is selected from a group comprising or consisting of a solvent, a diluent, a carrier, a dispersion medium, a coating, an antibacterial agent, an antifungal agent, an isotonic agent, an absorption delaying agent and any combinations thereof. The excipient must be “acceptable” in the sense of being compatible with the protein, polypeptide or nucleic acid molecule of the method, and not be deleterious upon being administered to an individual or to a population of cells. Typically, the excipient does not produce an adverse, allergic or other untoward reaction when administered to an individual, preferably a human individual, or to a population of cells.
[0240] Acceptable excipients for therapeutic use are well known in the art, and are described, for example, in Remington’s Pharmaceutical Sciences, Mack Publishing Co. (A. R. Gennaro ed. 1985). The choice of a suitable excipient can be made with regard to the intended route of administration and standard practice.
[0241] In some embodiments, the cell population is a population of eukaryotic cells or of prokaryotic cells. In some embodiments, the cell population is a eukaryotic cell population. In some embodiments, the cell population is a prokaryotic cell population.
[0242] In some embodiments, the cell population is isolated from a donor. As used herein, “donor” refers to an animal, preferably a mammal, more preferably a human. The donor may be alive or dead when the cell population is isolated, preferably alive. In some embodiments, the cell population is isolated from at least one organ or tissue of the donor, optionally from more than one organ or tissue of the donor. In some embodiments, the cell population is homogeneous (i.e., it contains a single type of cells) or heterogeneous (i.e., it contains more than one type of cells), preferably homogeneous.
[0243] In some embodiments, the cell population is cultured in vitro. In some embodiments, the cell population is comprised in a primary cell culture.
[0244] In some embodiments, the cell population is from an immortalized cell line.
[0245] In some embodiments, the method is performed in vitro. In some embodiments, the cell population is an in vitro-cultured population of cells.
[0246] Methods to culture cells in vitro are well known in the art and are routine practice. Typically, cells are maintained in a controlled atmosphere (e.g., 37°C, 5 % CO2) and maintained in a suitable culture medium (e.g., Dulbecco’s Modified Eagle Medium).
[0247] In some embodiments, the cell population is cultured from 2 hours to 72 hours, from 3 hours to 72 hours, from 4 hours to 72 hours, from 5 hours to 72 hours, from 6 hours to 72 hours, from 7 hours to 72 hours, from 8 hours to 72 hours, from 9 hours to 72 hours, from 10 hours to 72 hours, from 11 hours to 72 hours, from 12 hours to 72 hours, from 24 hours to 72 hours, from 36 hours to 72 hours, from 48 hours to 72 hours. In some embodiments, the cell population is cultured from 1 hour to 60 hours, from 1 hour to 48 hours, from 1 hour to 36 hours, from 1 hour to 24 hours, from 1 hour to 12 hours, from 1 hour to 11 hours, from 1 hour to 10 hours, from 1 hour to 9 hours, from 1 hour to 8 hours, from 1 hour to 7 hours, from 1 hour to 6 hours, from 1 hour to 5 hours, from 1 hour to 4 hours, from 1 hour to 3 hours, from 1 hour to 2 hours.
[0248] In one embodiment, the cell population is a non-dividing, non-replicating, terminally differentiated or quiescent, cell population. The terms “non-dividing cell population”, “non-replicating cell population” “terminally differentiated” and “quiescent cell population” are used interchangeably herein.
[0249] As used herein, “non-dividing” cell, “non-replicating” cell or “quiescent” cell means that the cell remains out of the cell cycle but retains the capacity to divide, in other words, the cell is in a state of reversible growth arrest. Quiescence may be induced by, non-limitatively, contact inhibition, chemical or pharmacological agents, signaling proteins, hormones, inhibitors, and the like, or, alternatively, by the lack of a signal such as a growth factor or the contact with one or more cell types. In some embodiments, the cell population is quiescent in the presence of one or more growth inhibiting agent in the culture medium. In some embodiments, the one or more growth inhibiting agent is selected from the group comprising or consisting of chemical agents, pharmacological agents, signaling proteins, hormones, growth factors, and inhibitors. In some embodiments, the cell population is quiescent in the absence of one or more growth stimulating agent in the culture medium. In some embodiments, the one or more growth stimulating agent is selected from the group comprising or consisting of chemical agents, pharmacological agents, signaling proteins, hormones, growth factors, amino acids, sugars, lipids, and fatty acids. As used herein, “terminally differentiated” cell means that the cell remains out of the cell cycle and has lost the capacity to divide, in other words, the cell is in a state of irreversible growth arrest.
[0250] In some embodiments, the non-dividing or quiescent cell population is a confluent cell culture (i.e., quiescence is induced by contact inhibition).
[0251] In some embodiments, the non-dividing or quiescent cell population consists of at least one cell type that does not replicate in culture.
[0252] In some embodiments, the non-dividing or quiescent cell population is senescent.
[0253] In some embodiments, the non-dividing or terminally differentiated cell population consists of at least one cell type that has completely lost the capacity to perform cell division.
[0254] In some embodiments, the cell population is not actively dividing.
[0255] In another embodiment, the cell population is actively dividing, replicating, or proliferating. In some embodiments, the dividing cell population is in exponential phase of cell culture.
[0256] In some embodiments, the method is performed ex vivo.
[0257] In some embodiments, the method is performed in vivo. In some embodiments, the method is performed in vivo in an animal subject, preferably a mammal subject. In certain embodiments, the method is performed in vivo in a human subject. In some embodiments, the cell population is comprised in a tissue or an organ of a living organism. In some embodiments, the organism is an animal. In some embodiments, the organism is a mammal. In some embodiments, the organism is a human.
[0258] The present invention further relates to a cell or cell population comprising at least one copy of a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and/or at least one copy of at least one transgene of interest. In some embodiments, the cell or cell population comprises both at least one copy of a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and at least one copy of at least one transgene of interest.
[0259] In some embodiments, the protein or polypeptide comprising a transposase or a fragment thereof, or the nucleic acid encoding the same, the at least one transgene of interest, and the cell or cell population, further comprise the features as disclosed hereinabove.
[0260] The present invention further relates to a method of treating a genetic disease in a subject in need thereof, comprising administering to said subject: the nucleic acid molecule encoding the at least one transgene of interest, as described hereinabove, and a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, as described hereinabove.
[0261] It also relates to the nucleic acid molecule encoding the at least one transgene of interest, as described hereinabove, and the protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, as described hereinabove, for use in treating a genetic disease in a subject in need thereof.
[0262] It also relates to the use of the nucleic acid molecule encoding the at least one transgene of interest, as described hereinabove, and the protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, as described hereinabove, for the manufacture of a medicament for treating a genetic disease in a subject in need thereof.
[0263] In some embodiments, the transgene of interest compensates for a gene defect responsible for the genetic disease in the subject.
[0264] In some embodiments, the method further comprises administering to the subject a guide RNA (gRNA) fused to at least one aptamer, as described hereinabove.
[0265] In some embodiments, the method may comprise administering to the subject in need thereof at least one additional therapeutic agent. In some embodiments, the at least one therapeutic agent is for treating the genetic disease.
[0266] In some embodiments, the genetic disease is characterized in that at least one gene is mutated in the genome of the subject, so that the protein encoded by the at least one gene has impaired function or is not functional, or is degraded by the cellular protein quality control (e.g., proteasome), or is not produced; in other words, the function of the at least one gene is at least partially, if not completely lost.
[0267] Non-limitative examples of genetic diseases include sickle cell anemia, cystic fibrosis, Huntington disease, congenital muscular dystrophy, Duchenne muscular dystrophy, Fabry disease, Marfan syndrome, thalassemia, cystinosis, familial hypercholesterolemia, hemochromatosis and the like.
[0268] In some embodiments, the transgene of interest restores partially or completely, preferably completely, the function of the at least one gene that is mutated. In some embodiments, the transgene of interest encodes the same protein as the at least one gene that is mutated. In some embodiments, the transgene of interest encodes a different protein than the one encoded by the at least one gene that is mutated, but that has similar or identical biological function.
BRIEF DESCRIPTION OF THE DRAWINGS
[0269] Figure 1 is a set of photographs illustrating the viability and transfection efficiency of GFP expressing transposon and transposase in Embryonic stem cells H9 electroporated with transposases (either hyPB alone or fused with a Cas9 nuclease). BF: Bright Field; GFP: Green Fluorescent Protein.
[0270] Figure 2 is a histogram showing the transcription efficiency of a Red Fluorescent Protein (RFP) reporter in HepG2 cells expressing or not hyPB.
[0271] Figure 3 is a histogram showing the transcription efficiency of a RFP reporter in Huh7 cells expressing or not hyPB.
[0272] Figures 4 and 5 are graphs showing the transcription efficiency over time (at 3h, 6h, 9h, and 12h in Figure 4; at 8h, 10h, 12h, 14h and 16h in Figure 5) of a GFP reporter in Huh7 cells expressing or not hyPB.
[0273] Figure 6 is a histogram showing the transcription efficiency of a GFP reporter in Huh7 cells expressing or not a catalytically dead hyPB.
[0274] Figure 7 is a histogram showing the transcription efficiency of a GFP reporter in Huh7 cells expressing a fusion protein comprising a Cas9 nuclease and either a SB100 transposase or a N57 transposase.
[0275] Figures 8A-8B is a set of histograms comparing the transcription efficiency of a GFP reporter in Huh7 cells expressing or not hyPB, wherein the GFP DNA transposon comprises either one Inverted Terminal Repeat (ITR) (Figure 8A) or two ITRs (Figure 8B).
[0276] Figure 9 is a set of photographs showing the transcription efficiency of a luciferase reporter in vivo in mice, one day or 4 weeks after injection, with either hyPB mRNA or a catalytically dead hyPB mRNA.
[0277] Figure 10 is a set of photographs showing the transcription efficiency of a luciferase reporter in vivo, one day or one week after injection, in mice transfected or not with hyPB DNA using a polymeric nanoparticle vector (jetPEI).
[0278] Figure 11 is a set of photographs showing the transcription efficiency of a luciferase reporter in vivo, one day after injection, in mice transfected or not with hyPB and wherein the DNA payload (luciferase reporter) and the hyPB mRNA are co-transfected using the same vector (lipid nanoparticles).
[0279] Figure 12 is a histogram showing that non-dividing NIH-3T3 cells transfected with hyPB show higher capacity for uptake and expression of DNA transposon compared to cells that do not express hyPB. Results are represented as mean ± s.e.m, n=4.
[0280] Figures 13A-13B are a set of graphs showing nuclear DNA translocation in replicating (Fig. 13A) and non-dividing (Fig. 13B) NIH/3T3 cells, measured over a 20-hour interval post-transfection. Results are represented as mean ± s.e.m, n=2. Statistical significance is represented as *** for p-value<0.001, and ** for p-value<0.005.
[0281] Figure 14 is a histogram showing that non-dividing primary mouse hepatocytes in culture transfected with hyPB show higher capacity for uptake and expression of a luciferase-encoding DNA transposon compared to cells that do not express hyPB. Data represented as mean ± SEM, n=3.
EXAMPLES
[0282] The present invention is further illustrated by the following examples.
Example 1: Nuclear localization driven by transposases
Materials and Methods
Electroporation of H9 stem cells
[0283] Embryonic Stem Cells H9 were electroporated (CRG cell and tissue facility) with 1.8 µg of minicircle GFP transposon alone or together with 1 µg hyPB mRNA or 1.5 µg FiCAT mRNA and 1.75 µg of sgRNA targeting the AAVS1 locus. Cells were imaged by fluorescence and bright-field microscopy, and viability and the percentage of GFP signal were estimated.
Lipofection of HepG2 cells
[0284] Day 0: 400k cells/well were seeded in p6 well plates.
[0285] Day 1: 1.25 µg of either hyPB plasmid or pUC19 mock plasmid were transfected with lipofectamine 3000 according to the manufacturer’s protocol. Cells were kept for 2 hours with Optimem + transfection mix, and then full media was added (Dulbecco’s modified Eagle medium, 10 % fetal bovine serum, 2 mM glutamine and 100 U penicillin/0.1 mg/ml streptomycin).
[0286] Day 2: Transfected cells were lifted and seeded in p12 well plates at 150k cells/well.
[0287] Day 3: Cells were transfected with 0.75 µg RFP DNA transposon and 0.25 µg pUC19 filling DNA with lipofectamine 3000 according to the manufacturer’s protocol. Cells were kept for 2 hours with Optimem + transfection mix, and then full media was added.
[0288] Day 4: Fluorescence was measured 24 hours after the Day 3 transfection.
[0289] Alternatively, the cells were transfected on day 1 with Cas9_SB100 plasmid and Cas9_N57 plasmid instead of hyPB. On day 3, cells were transfected with GFP transposon containing SB100 ITRs and with TCR1 gRNA plasmid.
Lipofection of Huh7 cells
[0290] 400,000 cells were seeded 24h before transfection in p6 wells. Transfection was performed using 1.8 µg transposon DNA and 1.95 µg hyPB or mock mRNA of similar size with lipofectamine 3000 according to the manufacturer’s protocol. Cells were kept for 2 hours with Optimem + transfection mix, and then full media was added. Cells were lifted every 3 hours or every 14 hours post-transfection and fluorescence was measured by flow cytometry.
[0291] Alternatively, the cells were transfected on day 1 with Cas9_SB100 plasmid and Cas9_N57 plasmid instead of hyPB. On day 3, cells were transfected with GFP transposon containing SB100 ITRs and with TCR1 gRNA plasmid.
In vivo hydrodynamic injection
[0292] A total of 10 to 10.2 µg of nucleic acids were injected into 6-7-week-old mice (3.2 µg MC-luciferase transposon, 1.42 µg hyPB/hyPB_dead mRNA). Nucleic acids were diluted in PBS, and a volume in ml equal to 7 % of the animal's body weight was injected in less than 7 seconds via retro-orbital systemic injection.
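As an illustration only, the volume rule stated above (7 % of body weight, expressed in ml) can be worked out as follows. The function name and the convention that 1 g of body weight corresponds to 1 ml are assumptions made for this sketch, not features of the described protocol.

```python
def hydrodynamic_volume_ml(body_weight_g: float, fraction: float = 0.07) -> float:
    """Return the injection volume in ml for a body weight given in grams.

    Assumption (illustrative): 1 g of body weight is treated as 1 ml,
    so a fraction of 0.07 corresponds to 0.07 ml per gram.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_g * fraction


if __name__ == "__main__":
    # Hypothetical body weights, chosen only to show the arithmetic.
    for weight_g in (18.0, 20.0, 25.0):
        print(f"{weight_g:.0f} g -> {hydrodynamic_volume_ml(weight_g):.2f} ml")
```

Under these assumptions a 20 g animal would receive 1.4 ml.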
JetPEI in vivo transfection
[0293] With DNA payload and hyPB with distinct delivery methods: 35.6 µg of MC-luciferase DNA transposon were intravenously injected into 5-week-old mice using in vivo JetPEI according to the manufacturer’s instructions at N/P ratio 7. LNPs loaded with 2.5 µg hyPB mRNA were injected intravenously 4 hours later. LNP manufacturing was performed using a Benchtop NanoAssemblr (Precision Nanosystems).
[0294] With DNA payload and hyPB with the same delivery method: LNPs loaded with 15 µg of MC-luciferase DNA transposon were intravenously injected into 5-week-old mice with or without LNPs loaded with 2.5 µg hyPB. LNP manufacturing was performed using a Benchtop NanoAssemblr (Precision Nanosystems).
[0295] Lipid proportions were 50 % Dlin-MC3, 10 % DSPC helper lipid, 38.5 % cholesterol and 1.5 % PEG-2000, with an N/P ratio of 6. Nanoparticle size was 100-130 nm and encapsulation efficiencies were higher than 90 %.
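The relationship between the stated lipid proportions and the N/P ratio can be sketched numerically. The following is a minimal illustration, not the formulation procedure actually used: it assumes that the percentages are molar, that each Dlin-MC3 molecule contributes one ionizable amine, and that a nucleic acid carries one phosphate per nucleotide with an average nucleotide mass of about 330 g/mol. All function and variable names are hypothetical.

```python
# Lipid proportions as stated in the text (assumed here to be mol %).
LIPID_MOL_PERCENT = {
    "Dlin-MC3": 50.0,
    "DSPC": 10.0,
    "cholesterol": 38.5,
    "PEG-2000": 1.5,
}

# Assumption: average mass per nucleotide, one phosphate per nucleotide.
AVG_NUCLEOTIDE_G_PER_MOL = 330.0


def phosphate_nmol(nucleic_acid_ug: float) -> float:
    """Approximate nmol of phosphate in a nucleic acid mass given in micrograms."""
    # ug / (g/mol) gives umol; multiply by 1000 to obtain nmol.
    return nucleic_acid_ug / AVG_NUCLEOTIDE_G_PER_MOL * 1000.0


def lipid_amounts_nmol(nucleic_acid_ug: float, np_ratio: float = 6.0) -> dict:
    """Return nmol of each lipid needed to reach the requested N/P ratio.

    Assumes one ionizable amine (N) per Dlin-MC3 molecule.
    """
    ionizable_nmol = np_ratio * phosphate_nmol(nucleic_acid_ug)
    total_nmol = ionizable_nmol / (LIPID_MOL_PERCENT["Dlin-MC3"] / 100.0)
    return {name: total_nmol * pct / 100.0 for name, pct in LIPID_MOL_PERCENT.items()}


if __name__ == "__main__":
    # Hypothetical payload mass, used only to show the calculation.
    for lipid, amount in lipid_amounts_nmol(2.5).items():
        print(f"{lipid}: {amount:.1f} nmol")
```

The four proportions sum to 100 %, so fixing the amount of ionizable lipid through the N/P ratio determines the amounts of the three other components.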
Whole-body imaging
[0296] Whole body imaging of luciferase expression was performed at different timepoints after FiCAT-gRNA-transposon or transposon control administration with IVIS spectrum imaging system (Caliper Life Sciences). Images were taken 5 min after intraperitoneal injection of D-Luciferin potassium salt (Gold Biotechnology) according to the manufacturer’s instructions. FiCAT: Cas9-hyPB fusion.
Results
[0297] Embryonic Stem Cells H9 electroporated with transposases (either hyPB or FiCAT, FiCAT being the Cas9-hyPB fusion protein) show GFP signal (Figure 1), meaning that the GFP transposon DNA has entered the nucleus and has been transcribed and translated. While the episomal signal of the GFP transposon without hyPB or FiCAT is low (20 % efficiency), hyPB yields a stronger GFP signal (70 % transfection efficiency).
[0298] HepG2 hepatic cell line previously expressing hyPB shows better internalization of RFP DNA transposon than cells which do not express hyPB (Figure 2). This result was also obtained with Huh7 hepatic cell line (Figure 3). In addition, Huh7 cells transfected with hyPB mRNA show higher capacity for uptake and expression of DNA transposon compared to cells transfected with mock mRNA (Figures 4 & 5).
[0299] Huh7 hepatic cell line transfected with hyPB catalytically dead mRNA shows higher capacity for nuclear uptake and expression of DNA transposon compared to cells transfected with mock mRNA (Figure 6).
[0300] Interestingly, the above results, which show higher nuclear uptake of the transposon in cells expressing hyPB transposase than in cells without hyPB, can be generalized to other transposases such as SB100 and N57 (Figure 7).
[0301] The ITR requirement on the DNA payload was then tested. Huh7 hepatic cell line previously expressing hyPB shows better internalization of GFP DNA transposon than cells which do not express hyPB (previously transfected with pUC19 plasmid DNA), both with a GFP transposon with 1 ITR and with 2 ITRs (Figure 8A-8B). These data demonstrate that hyPB promotes nuclear internalization of payload DNA even if only 1 ITR is present (so that the hyPB molecule can bind to its payload).
[0302] The ability of hyPB to improve transfection was further tested in in vivo conditions (Figure 9). hyPB transposases show advantages in DNA payload internalization and expression, even if their catalytic activity is compromised (dead hyPB). 1 out of 3 mice shows luciferase expression 24h after hydrodynamic injection of the payload alone, while this increases to 2/3 or 3/3 mice when the payload is co-injected with hyPB or catalytically dead hyPB.
[0303] hyPB transposases show advantages in DNA payload internalization and expression when payload DNA and hyPB mRNA are administered using different delivery methods (Figure 10). When injecting DNA alone, 50 % of mice show luciferase signal 24h later, while 100 % of mice show signal when hyPB is co-administered using LNPs.
[0304] hyPB transposases show advantages in DNA payload internalization and expression when payload DNA and hyPB mRNA are administered using the same delivery method (Figure 11). When injecting DNA alone, mice do not show luciferase signal, while signal is recovered when co-administering hyPB using LNPs in both cases. This result shows that hyPB helps internalize DNA into the nucleus so that it can be expressed.
Example 2: Nuclear localization in non-dividing cells
Materials and Methods
NIH/3T3 culture
[0305] NIH/3T3 cells (ATCC) were cultured in DMEM supplemented with 10 or 1% heat-inactivated fetal bovine serum, 100 U/ml penicillin and 0.1 mg/ml streptomycin in a humidified CO2 (5%) incubator at 37°C.
Dose-dependent effect assay on quiescent NIH-3T3 cells
[0306] 60,000 NIH-3T3 cells were seeded, and senescence was induced by culturing them in DMEM supplemented with 1% FBS for 48h after reaching confluency. Then, cells were transfected. Transfection was performed using 0.2 µg transposon DNA alone (Episomal) or in combination with the indicated amounts of hyPB-wt or a catalytically dead hyPB (dead-hyPB) with lipofectamine 3000 according to the manufacturer’s protocol. Cells were incubated overnight with the transfection mix in Optimem, and then full media was added. 1 hour after DMEM addition, cells were trypsinized and GFP expression was assessed by flow cytometry. (Represented as mean ± SEM, n=4)
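For reference, the "mean ± SEM" summary used for these measurements can be computed as sketched below. This is a generic illustration using the sample standard deviation divided by the square root of n; the function name and the replicate values are hypothetical and are not data from the experiments.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates.

    SEM is computed as the sample standard deviation (n - 1 denominator)
    divided by the square root of the number of replicates.
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are needed to estimate the SEM")
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical % GFP-positive cells from n=4 replicate wells.
    replicates = [10.0, 12.0, 14.0, 16.0]
    avg, sem = mean_sem(replicates)
    print(f"{avg:.2f} ± {sem:.2f} (n={len(replicates)})")
```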
Comparison of transfection in dividing and non-dividing NIH-3T3 cells
[0307] For replicating cells, 60,000 cells were seeded the day prior to transfection. For non-dividing 3T3, senescence was induced by culturing them in DMEM supplemented with 1% FBS for 48h once they reached confluency. Then, cells were transfected. Transfection was performed using 0.25 µg transposon DNA alone (Episomal) or in combination with 0.25 µg hyPB-wt or a catalytically dead hyPB (dead-hyPB) with lipofectamine 3000 according to the manufacturer’s protocol, and cells were trypsinized at intervals of 3h for GFP assessment by flow cytometry.
[0308] Differences between groups were analyzed for statistical significance using a two-way analysis of variance (ANOVA) test.
Primary hepatocytes isolation
[0309] Liver was perfused using liver perfusion buffer (HBSS KCl 0.4 g.L⁻¹, glucose 1 g.L⁻¹, NaHCO3 2.1 g.L⁻¹, EDTA 0.2 g.L⁻¹) and then digested using liver digest buffer (DMEM-GlutaMAX 1 g.L⁻¹ glucose, HEPES 15 mM pH 7.4, penicillin/streptomycin 1%, 5 mg per mouse Collagenase IV (C5138 Sigma)). After excision, liver was placed on ice in plating media (M199, fetal bovine serum 10%, penicillin/streptomycin 1%, sodium pyruvate 1%, L-glutamine 1%, 1 nM insulin, 1 mM dexamethasone, 2 mg.mL⁻¹ bovine serum albumin (BSA)). Tissue was homogenized using forceps and then filtered in plating media. Cells were then washed three times in plating media and then plated on collagen-coated plates (Thermo Fisher Scientific) in plating media. Media was changed the next morning and cells were incubated for 48 hours prior to transfection.
Culture and transfection of primary hepatocytes
[0310] Primary hepatocytes were isolated from a wild-type 8-week-old mouse, and 75,000 cells were seeded per well. After 48h in culture, cells were transfected. Transfection was performed using 0.2 µg transposon DNA alone (Episomal) or in combination with 0.25 µg of hyPB-wt or a catalytically dead hyPB (dead-hyPB) with lipofectamine 3000 according to the manufacturer’s protocol. Cells were incubated for 5 hours with the transfection mix in Optimem, and then full media was added. 24 hours after transfection, cells were washed once with PBS and lysed with the appropriate buffer. Luciferase activities were quantified 24h post-transfection using the Luciferase assay system (Promega, Madison, WI, USA), performed in duplicate. Data are shown as fold change over the luciferase values of non-transfected hepatocytes, and expressed in arbitrary units (AU).
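The normalization described above, in which duplicate luciferase readings are expressed as fold change over non-transfected cells, can be sketched as follows. The function name and all readings are hypothetical; averaging the duplicates before taking the ratio is an assumption of this sketch, since the text does not specify the order of operations.

```python
from statistics import mean


def fold_change(sample_readings, untransfected_readings):
    """Return the mean sample signal divided by the mean baseline signal.

    Both arguments are lists of raw luminescence readings (for example,
    technical duplicates). Duplicates are averaged first (an assumption).
    """
    baseline = mean(untransfected_readings)
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    return mean(sample_readings) / baseline


if __name__ == "__main__":
    # Hypothetical raw luminescence values, for illustration only.
    untransfected = [100.0, 110.0]
    conditions = {
        "Episomal": [200.0, 220.0],
        "hyPB-wt": [840.0, 840.0],
    }
    for name, readings in conditions.items():
        print(f"{name}: {fold_change(readings, untransfected):.1f}-fold")
```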
[0311] Differences between groups were analyzed for statistical significance using a one- or two-way analysis of variance (ANOVA) test. Pairwise comparisons were analysed with two-sided Student’s t-test when applicable. Results are represented as mean ± s.e.m.
Results
[0312] To further support the role of DNA transposases such as PiggyBac transposase in the DNA transposon nuclear import, experiments were conducted in non-dividing cells.
[0313] Two different models were chosen: in Figure 12 and Figure 13A-13B, NIH/3T3 fibroblastic cell line was senescence-induced, and in Figure 14, isolated primary hepatocytes from mouse, which do not replicate when in culture, were utilized to test the PiggyBac (hyPB) DNA nuclear translocation activity in non-dividing (quiescent) cells.
[0314] Increasing amounts of PiggyBac transposase (hyPB) were tested to probe its dose-dependent effect; furthermore, a catalytically dead version of the transposase (dead-hyPB) was added as a control of DNA translocation activity alone, without the insertion activity of the transposase. Figure 12 shows that uptake and expression of the DNA transposon (GFP) are increased in cells expressing hyPB or dead-hyPB, even at the lowest dose.
[0315] In addition, a side-by-side comparison of transduction and expression efficiency of DNA payloads delivered into the cytoplasm of actively dividing and non-dividing NIH/3T3 cells was conducted, in order to support the role of the transposase protein in the nuclear localization of these payloads (Fig. 13A-13B). Figure 13B shows a significant increase in expression efficiency of the GFP payload in non-dividing NIH-3T3 cells expressing hyPB or dead-hyPB, compared to the condition without transposase.
[0316] The results with primary hepatocytes also support this conclusion and show that the presence of PiggyBac transposase, whether catalytically active (hyPB) or inactive (dead-hyPB), increases the expression of the transposon (Fig. 14), which is a result of its improved nuclear trafficking.
[0317] Overall, these results demonstrate that DNA transposases increase DNA transposon nuclear import, independently of their insertion activity, in particular in non-dividing cells.

Claims

1. An in vitro method of increasing nuclear localization of a nucleic acid molecule encoding at least one transgene of interest in a cell population, comprising contacting the cell population with: a protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, and the nucleic acid molecule encoding the at least one transgene of interest.
2. The in vitro method according to claim 1, wherein the nucleic acid molecule encoding the at least one transgene of interest further comprises at least one Inverted Terminal Repeat (ITR) sequence.
3. The in vitro method according to claim 1 or 2, wherein the protein or polypeptide comprising the transposase or a fragment thereof is a fusion protein comprising the transposase or a fragment thereof, and at least one additional polypeptide or protein.
4. The in vitro method according to claim 3, wherein the at least one additional polypeptide or protein is a nuclease.
5. The in vitro method according to claim 3 or 4, wherein the fusion protein has at least 75 % amino acid sequence identity with SEQ ID NO: 2.
6. The in vitro method according to any one of claims 1 to 5, further comprising contacting the cell population with a guide RNA (gRNA).
7. The in vitro method according to any one of claims 3 to 6, wherein the at least one additional polypeptide or protein is an aptamer-binding protein.
8. The in vitro method according to claim 7, wherein the aptamer-binding protein is the MS2 bacteriophage coat protein (MCP), preferably with SEQ ID NO: 39.
9. The in vitro method according to claim 8, wherein the fusion protein interacts, or is capable of interacting, covalently or non-covalently through MCP with the gRNA molecule, wherein the gRNA comprises at least one MS2 aptamer, preferably the MS2 aptamer has a ribonucleic acid sequence with SEQ ID NO: 40.
10. The method according to any one of claims 1 to 9, wherein the transposase is selected from the group consisting of hyperactive PiggyBac transposase, PiggyBac transposase, Sleeping Beauty transposase, SB11 transposase, Tol2 transposase, Mos1 transposase, and Frog Prince transposase.
11. The method according to any one of claims 1 to 10, wherein the transposase is a modified hyperactive PiggyBac transposase, comprising at least one amino acid mutation compared to the amino acid sequence of the hyperactive PiggyBac transposase with SEQ ID NO: 1.
12. The method according to any one of claims 1 to 11, wherein the transposase or the fragment thereof is catalytically dead.
13. The method according to claim 12, wherein the catalytically dead transposase has at least 75 % amino acid sequence identity with SEQ ID NO: 3.
14. A protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, for use in treating a genetic disease in a subject in need thereof, wherein the subject in need thereof is further to be administered with a nucleic acid molecule encoding at least one transgene of interest, wherein expression of the transgene of interest in at least one cell of the subject in need thereof compensates a gene defect responsible for the genetic disease.
15. The protein or polypeptide comprising a transposase or a fragment thereof, or a nucleic acid encoding the same, for use according to claim 14, said use further comprising the features according to any one or several of claims 2 to 13.
PCT/EP2023/062736 2022-05-13 2023-05-12 Use of transposases for improving transgene expression and nuclear localization WO2023218021A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202263341619P 2022-05-13 2022-05-13
US63/341,619 2022-05-13
EP22191966 2022-08-24
EP22191966.5 2022-08-24

Publications (1)

Publication Number Publication Date
WO2023218021A1 true WO2023218021A1 (en) 2023-11-16

Family

ID=86605260

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2023/062736 WO2023218021A1 (en) 2022-05-13 2023-05-12 Use of transposases for improving transgene expression and nuclear localization

Country Status (1)

Country Link
WO (1) WO2023218021A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008027384A1 (en) * 2006-08-28 2008-03-06 University Of Hawaii Methods and compositions for transposon-mediated transgenesis
US20190153440A1 (en) * 2017-11-21 2019-05-23 Casebia Therapeutics Llp Materials and methods for treatment of autosomal dominant retinitis pigmentosa
WO2020127332A1 (en) * 2018-12-17 2020-06-25 Consejo Superior De Investigaciones Científicas (Csic) Method for the introduction of genetic information in cell by site-specific integration system
WO2020243085A1 (en) 2019-05-24 2020-12-03 The Trustees Of Columbia University In The City Of New York Engineered cas-transposon system for programmable and site-directed dna transpositions
WO2021216625A1 (en) * 2020-04-20 2021-10-28 Integrated Dna Technologies, Inc. Optimized protein fusions and linkers
WO2021248023A2 (en) * 2020-06-05 2021-12-09 The Regents Of The University Of California Compositions and methods for epigenome editing
WO2022012758A1 (en) * 2020-07-17 2022-01-20 Probiogen Ag Hyperactive transposons and transposases
WO2022129438A1 (en) 2020-12-16 2022-06-23 Universitat Pompeu Fabra Programmable transposases and uses thereof

Non-Patent Citations (19)

* Cited by examiner, † Cited by third party
Title
"Remington's Pharmaceutical Sciences", 1985, MACK PUBLISHING CO
ALTSCHUL ET AL., J MOL BIOL., vol. 215, no. 3, 1990, pages 403 - 10
ALTSCHUL ET AL., NCB/NLM/NIH, pages 20894
BAI ET AL.: "Cytoplasmic transport and nuclear import of plasmid DNA.", BIOSCIENCE REPORTS, vol. 37, 2017, pages 6
BRIAN E HEW ET AL: "RNA-guided piggyBac transposition in human cells", SYNTHETIC BIOLOGY, vol. 4, no. 1, 2 July 2019 (2019-07-02), pages ysz018, XP055737951, DOI: 10.1093/synbio/ysz018 *
CARILLO ET AL., SIAM J APPL MATH., vol. 48, no. 5, 1988, pages 1073 - 82
DEVEREUX ET AL., NUCLEIC ACIDS RES., vol. 12, 1984, pages 387 - 95
GRIBSKOV M. RDEVEREUX J: "Sequence analysis primer", 1991, STOCKTON PRESS
GRIFFIN A. MGRIFFIN H. G: "Computer analysis of sequence data", 1994, HUMANA PRESS
HEW ET AL.: "RNA-guided piggyBac transposition in human cells.", SYNTHETIC BIOLOGY, vol. 4, 2019, pages 1
KEITH ET AL.: "Analysis of the piggyBac transposase reveals a functional nuclear targeting signal in the 94 c-terminal residues.", BMC MOLECULAR BIOLOGY, vol. 9, 2008, pages 72, XP021042424, DOI: 10.1186/1471-2199-9-72
KEITH JAMES H ET AL: "Analysis of the piggyBac transposase reveals a functional nuclear targeting signal in the 94 c-terminal residues", BMC MOLECULAR BIOLOGY, BIOMED CENTRAL LTD, GB, vol. 9, no. 1, 11 August 2008 (2008-08-11), pages 72, XP021042424, ISSN: 1471-2199, DOI: 10.1186/1471-2199-9-72 *
SAMBROOK ET AL.: "Molecular cloning: A laboratory manual", 2012, COLD SPRING HARBOR LABORATORY PRESS
SEMENOVA, NUCLEIC ACIDS RESEARCH, vol. 47, 2019, pages 19
SMITH D. W: "Biocomputing: Informatics and genome projects", 1993, ACADEMIC PRESS
VANDENBROUCKE ET AL., NUCLEIC ACIDS RESEARCH, vol. 35, 2007, pages 12
VON HEIJNE G: "Sequence analysis in molecular biology: treasure trove or trivial pursuit", 1987, ACADEMIC PRESS
ZHAO ET AL.: "PiggyBac transposon vectors: the tools of the human gene encoding.", TRANSLATIONAL LUNG CANCER RESEARCH, vol. 5, 2016, pages 1, XP055923508, DOI: 10.3978/j.issn.2218-6751.2016.01.05
ZHAO SHUANG ET AL: "PiggyBac transposon vectors: the tools of the human gene encoding", TRANSLATIONAL LUNG CANCER RESEARCH, vol. 5, no. 1, 1 February 2016 (2016-02-01), Hong Kong, pages 120 - 125, XP055923508, ISSN: 2218-6751, DOI: 10.3978/j.issn.2218-6751.2016.01.05 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23726939

Country of ref document: EP

Kind code of ref document: A1