WO2020127990A1 - System for production of crispr-based pharmaceutical compositions - Google Patents

System for production of CRISPR-based pharmaceutical compositions

Info

Publication number
WO2020127990A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
rna
pathogen
protein
patient
Prior art date
Application number
PCT/EP2019/086708
Other languages
French (fr)
Inventor
Theodore Anastasius PRAMER
Martin Ekenberg
Corinne Kay
Original Assignee
Biomedrex Ab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Biomedrex Ab
Publication of WO2020127990A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10: Sequence alignment; Homology search

Definitions

  • This invention relates to a system for providing CRISPR-related therapies for infections, for example a virus infection.
  • Endonucleases are enzymes that cleave polynucleotides.
  • Clustered regularly interspaced short palindromic repeat (CRISPR)-type proteins are endonucleases that use an RNA guide strand to cut DNA or RNA at specific sites.
  • CRISPR-type proteins can be used for editing of DNA or for RNA knock down.
  • Cpf1 and Cas13 are two recently identified CRISPR-type proteins. The use of Cpf1 for genome editing is described in Zetsche et al (Zetsche et al., 2015, Cell 163, 759-771). The use of Cas13 for knockdown has been reported in Abudayyeh et al (Abudayyeh et al (2017) Nature,
  • Pathogens such as viruses, bacteria and eukaryotic parasites are still a major cause of suffering and death. There is a need for improved therapies for infections.
  • a method involving a system comprising at least one sequencing device, preferably at least two sequencing devices, and at least one RNA synthesis device, where the sequencing device comprises a first computer and the nucleotide synthesis device comprises a second computer, where the computer of the RNA synthesis machine and the computer of the nucleotide sequencing device communicate via a wide area network, the method involving the following steps: a) providing at least one biological sample from a patient to the sequencing device, where the at least one biological sample comprises human genetic material comprising the genome of the patient and genetic material from a pathogen that has infected the patient, said genetic material being in the form of polynucleotides, b) the sequencing device determining the polynucleotide sequence of the genetic material where the genomic sequence of the patient is determined and at least a part of the sequence of the genetic material of the pathogen is determined, where the determined polynucleotide sequence is stored as sequence data, c
  • Step c) can be carried out as follows: storing the determined polynucleotide sequences as sequence data in a computer memory of a computer of the system, where the sequence data is stored as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen,
  • the method provides personalized medicine for the treatment of an infection by a pathogen with the use of CRISPR technology.
  • CRISPR technology uses RNA guide strands.
  • RNA is much more sensitive to degradation than DNA.
  • RNA degradation by RNases is a problem when handling RNA.
  • RNA synthesis and handling must therefore be done in a controlled environment.
  • the algorithm for selecting guide strands can be kept at a central computer, which provides security and easy updates.
  • pharmaceuticals can be produced in a sterile fashion.
  • the method of claim 1 where the system comprises at least two sequencing devices connected to the RNA synthesis device. This provides central and RNase-free synthesis of RNA guide strands.
  • identifying the target sequence in step d) comprises identifying a sequence that comprises a Cpf1 PAM sequence motif.
  • the pathogen is a virus selected from HIV, HPV, Hepatitis B and Herpes.
  • the system may have stored sequence information for a number of predetermined target sequences and where step d) comprises finding, in the pathogen data set, a target sequence that is comprised in the sequence of at least one of the predetermined target sequences.
  • the predetermined target sequences may be ranked according to expected efficacy, and the target sequence with the highest expected efficacy may then be selected.
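The selection among ranked predetermined targets can be illustrated with a minimal sketch. It is an assumption-laden simplification, not the method of the application: the function name `select_target`, the numeric efficacy scores and the exact-substring test are all hypothetical, and a real system would also have to consider both strands and near matches.

```python
from typing import List, Optional, Tuple


def select_target(pathogen_seq: str,
                  ranked_targets: List[Tuple[str, float]]) -> Optional[str]:
    """Return the predetermined target with the highest expected efficacy
    that occurs in the pathogen sequence, or None if none occurs.

    ranked_targets holds (sequence, expected_efficacy) pairs; the scores
    are hypothetical placeholders for whatever ranking the system stores.
    """
    pathogen_seq = pathogen_seq.upper()
    # Keep only predetermined targets found verbatim in the pathogen data.
    present = [(seq, score) for seq, score in ranked_targets
               if seq.upper() in pathogen_seq]
    if not present:
        return None
    # Pick the entry with the highest expected efficacy.
    return max(present, key=lambda item: item[1])[0]


if __name__ == "__main__":
    targets = [("GGATCC", 0.9), ("TTTCGG", 0.7), ("CCCCCC", 0.99)]
    print(select_target("AAATTTCGGATCCGTA", targets))
```

Keeping the ranked list on the central synthesis computer, as the text suggests, means the scores can be updated in one place without touching the sequencing devices.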
  • Step d) may comprise identifying a sequence that comprises a Cpf1 PAM sequence motif.
  • the RNA synthesis device additionally has a protein-RNA complex formation station that is configured to use the RNA guide strand synthesised in step f) and to use an aliquot of a CRISPR-type protein to form a complex between the RNA guide strand and the protein.
  • the method may comprise the step of mixing the RNA guide strand or the protein-RNA guide complex with suitable components to obtain a pharmaceutical composition, and then packing the pharmaceutical composition in a suitable container and sealing the container.
  • a system comprising at least one sequencing device, preferably at least two sequencing devices, and at least one RNA synthesis device, where the sequencing device comprises a first computer and the RNA synthesis device comprises a second computer, where the sequencing device is configured to a) receive at least one biological sample comprising genetic material in the form of polynucleotides,
  • RNA synthesis device configured to receive the sequence data
  • the sequencing device or the RNA synthesis device is configured to store the sequence data as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen
  • the RNA synthesis device is configured to: d) find, in the pathogen sequence data set, at least one polynucleotide target sequence with a length from 18 to 60 nucleotides, where said target sequence is not present in the patient genetic sequence data set,
  • RNA guide strand for a CRISPR-type protein that is complementary to the target sequence of step e), the RNA guide strand additionally comprising a predetermined handle sequence that mediates binding to the CRISPR-type protein.
  • the second computer has stored sequence information for a number of predetermined target sequences and where step d) comprises finding, in the pathogen data set, a target sequence that is comprised in the sequence of at least one of the predetermined target sequences.
  • Step d) may comprise identifying a sequence that comprises a Cpf1 PAM sequence motif.
  • the predetermined target sequences may be ranked according to expected efficacy and the system may then be configured to select the target sequence with the highest efficacy.
  • the RNA synthesis device of the system may additionally have a protein-RNA complex formation station that is configured to use the RNA guide strand synthesised in step e) and to use an aliquot of a CRISPR-type protein to form a complex between the RNA guide strand and the protein.
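Step d) of the system, finding a pathogen sequence of 18 to 60 nucleotides that is absent from the patient data set, can be pictured as a set difference over fixed-length windows. The following is a minimal sketch under stated assumptions rather than the application's algorithm: `candidate_targets` is a hypothetical name, only exact matches on one strand are checked, and a whole human genome would need an indexed search instead of an in-memory set.

```python
from typing import List


def candidate_targets(pathogen: str, patient: str, length: int = 20) -> List[str]:
    """List windows of the pathogen sequence that never occur in the
    patient sequence (exact match, single strand, one window length)."""
    if not 18 <= length <= 60:
        raise ValueError("target length must be from 18 to 60 nucleotides")
    pathogen, patient = pathogen.upper(), patient.upper()
    # All windows of the requested length present in the patient data set.
    patient_windows = {patient[i:i + length]
                       for i in range(len(patient) - length + 1)}
    seen = set()
    targets = []
    for i in range(len(pathogen) - length + 1):
        window = pathogen[i:i + length]
        if window not in patient_windows and window not in seen:
            seen.add(window)
            targets.append(window)
    return targets


if __name__ == "__main__":
    patient = "ACGT" * 10
    pathogen = "ACGT" * 5 + "TTTTGGGGCCCCAAAA"
    for target in candidate_targets(pathogen, patient, length=18):
        print(target)
```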
  • Fig. 1 is a schematic diagram showing an RNA guide strand.
  • Fig. 2 is a schematic diagram showing a system.
  • Fig. 3 is a schematic diagram of a polynucleotide sequencing device.
  • Fig. 4 is a schematic diagram of an RNA synthesis device.
  • Figs. 5-6 are flowcharts showing methods.
  • a recombinant CRISPR type protein in complex with an RNA guide strand (protein-RNA complex) that targets the CRISPR type protein to the DNA or the RNA of a pathogen is used for treating an infection in a patient.
  • the protein-RNA complex specifically cuts polynucleotides of the pathogen that causes the infection.
  • the patient may be a human or an animal, preferably a mammalian animal. In a preferred embodiment the patient is a human.
  • the protein-RNA complex may be used to cause double strand breaks in the DNA of pathogen-infected cells, or to knock down RNA in a pathogen-infected cell.
  • the protein-RNA complex may be directed to target a genomic locus of interest of the pathogen.
  • the disease being treated is an infection caused by a pathogen such as a virus, a bacterium or a eukaryotic parasite, such as a fungus.
  • the infection is caused by a bacterium or a virus, most preferably a virus, and most preferably a virus chosen from the group consisting of HIV (Human Immunodeficiency Virus), HPV (Human Papilloma Virus), Herpes type 1, Herpes type 2 and Hepatitis B.
  • the virus infection being treated may be an HIV infection, preferably HIV-1.
  • the virus is an ssRNA virus, in particular when Cas13 is used, for example influenza type A, B or C, respiratory syncytial virus, MERS coronavirus, hepatitis A virus, hepatitis C virus, Lassa fever virus, Marburg virus, Ebola virus, Nipah virus, Zika virus, West Nile virus or Rift Valley fever virus.
  • CRISPR-type proteins use an RNA guide strand to cut DNA or RNA.
  • a guide strand is an RNA molecule that binds to the CRISPR-type protein and guides the CRISPR-type protein to a certain polynucleotide target sequence.
  • the guide strand is able to hybridize with the target strand (Watson-Crick base pairing).
  • a complex between a CRISPR-type protein and an RNA guide strand is referred to as a protein-RNA complex herein.
  • "handle motif" and "handle sequence" refer to an RNA sequence that interacts with a CRISPR-type protein, for example by mediating binding between an RNA guide strand and a CRISPR-type protein. Examples of handle sequences for Cpf1 and CAS13 are given below.
  • there are many different useful CRISPR-type proteins.
  • the CRISPR-type protein has endonuclease activity that causes double strand breaks.
  • the most studied CRISPR protein is CRISPR/Cas9 which cuts DNA, leaving blunt ends.
  • CRISPR/Cas9 has been used for editing of eukaryotic genomes (Cong et al, Science 339, (2013) 819-823, Mali et al, Science (2013) 823-826).
  • CRISPR/Cas9 uses a 42-nucleotide RNA guide strand and in addition a second strand (the so-called tracrRNA strand) which may be 89 nucleotides long.
  • one of the CRISPR-type proteins Cpf1 or Cas13 is used.
  • Cpf1 for genome editing is described in Zetsche et al (Zetsche et al 2015, Cell 163, 759-771).
  • Cas13 for RNA knockdown has been reported in Abudayyeh et al (Abudayyeh et al (2017) Nature, 550, 280-284).
  • Cpf1 (Zetsche et al., 2015, Cell 163, 759-771) cuts DNA in a staggered manner, leaving sticky ends with a 4 or 5 nucleotide 5'-overhang. This makes it more difficult for the DNA repair system to repair the cut than if a blunt end is created.
  • the unligated DNA may inhibit the pathogen in different ways including but not limited to: 1) triggering apoptosis of a virus-infected cell, 2) causing the death of a pathogen, for example a bacterium.
  • the pathogen is preferably a pathogen that has its genomic material in the form of DNA during at least some part of its life cycle. Many virus genomes become integrated into the host genomic DNA. For example, HIV becomes integrated into the genomic DNA of infected T-lymphocytes.
  • the Cpf1 protein may be a Cpf1 protein from Francisella novicida, Acidaminococcus sp. BV3L6 or Lachnospiraceae bacterium ND2006, in particular Acidaminococcus sp. BV3L6 or Lachnospiraceae bacterium ND2006.
  • a useful variant of Cpf1 is Alt-R Cas12a.
  • CRISPR/Cas13 cuts RNA and can be used for knockdown of pathogen RNA. This may limit pathogen survival, replication or activation, or may cause the death of the pathogen.
  • the Cas13 protein may be Cas13 from Leptotrichia wadei (Abudayyeh et al (2017) Nature, 550, 280-284).
  • Useful variants of Cas13 include PspCas13b, LwaCas13a, LbuCas13a and LshCas13a, LwaCAS13 and
  • CRISPR-type proteins edit polynucleotides by inserting an extra base in the polynucleotides, for example mRNA, leading to a frameshift and premature stop of translation.
  • references to CRISPR/Cas9, Cpf1 and CAS13 also include functional equivalents and homologues of these proteins.
  • modified or truncated proteins are included, provided that they have the same or comparable nuclease activity as the endogenous CRISPR/Cas9, Cpf1 and CAS13 proteins.
  • a homologue may have an amino acid identity with the original protein sequence of at least 70%, more preferably at least 80%, even more preferably at least 90%, even more preferably at least 95% and most preferably at least 99%, using amino acid sequence alignment in BLAST (for example BLAST2 sequences) using the following settings: word size: 3, gap costs: 11, 1, Matrix: BLOSUM62, Filter string: F, Window Size 40, Threshold 11.
  • the guide strand for Cpf1 preferably has a length of from 40 to 44, more preferably 41 to 44 nucleotides and comprises a 5' constant motif (handle sequence) which may be 5'-AAUUUCUACUCUUGUAGAU-3' or 5'-UAAUUUCUACUCUUGUAGAU-3'.
  • the handle sequence interacts with Cpf1 and may be important for complexing with Cpf1 or for Cpf1 activity.
  • the guide segment is 21-24 nucleotides long and is located 3'-terminal to the handle sequence.
  • the RNA is provided as single stranded RNA but parts, in particular parts of the handle sequence, may form a secondary structure.
  • the guide strand hybridizes with a target strand of double stranded DNA.
  • the opposite strand is referred to as the "displaced strand”.
  • the guide strand for Cpf1 may have an unspecific 5' extension of from 3 to 59 or more nucleotides as described in Park et al., Nature Communications, (2016) 9:3313 DOI:
  • the 5' extension is preferably not homologous to the human genome. For example, it may be a scrambled sequence. It has been hypothesized that such a 5' extension increases efficacy by providing a negative charge.
  • Some CRISPR-type proteins use a PAM (Protospacer Adjacent Motif) to recognise target sequences.
  • the minimal PAM motif for Cpf1 is TTN.
  • the TTN motif used for Cpf1 is preferably TTT, even more preferably TTTV where V is any nucleotide except T.
  • the PAM motif is localized on the displaced strand and is not recognized by the guide strand of the RNA-protein complex but by the interaction between the TTN nucleotides and amino acid residues of the Cpf1 protein.
  • Cpf1 cuts the displaced strand with a 4-5 nucleotide overhang approximately 18-19 nucleotides from the PAM TTTN motif and cuts the target strand approximately 24-25 nucleotides from the TTTN motif.
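The preferred TTTV motif lends itself to a simple scan of the displaced strand. The sketch below is illustrative only: `find_cpf1_sites` is a hypothetical helper, the 20-nucleotide window after the PAM is an assumption taken from the worked example later in the text, and the demo sequence is invented rather than one of the listed SEQ ID NOs.

```python
import re
from typing import List, Tuple


def find_cpf1_sites(displaced_strand: str,
                    protospacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (position, PAM, downstream sequence) for every TTTV motif
    (V = A, C or G) followed by protospacer_len nucleotides.

    A zero-width lookahead is used so that overlapping sites are reported.
    """
    pattern = re.compile(r"(?=(TTT[ACG])([ACGT]{%d}))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(displaced_strand.upper())]


if __name__ == "__main__":
    # Invented demo sequence: 2 flanking nt, a TTTC PAM, 20 nt, 2 flanking nt.
    demo = "GG" + "TTTC" + "ACGTACGTACGTACGTACGT" + "CC"
    for position, pam, protospacer in find_cpf1_sites(demo):
        print(position, pam, protospacer)
```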
  • the protein-RNA complex is preferably directed to a sequence selected from SEQ ID NO 1 to SEQ ID NO 20 shown in Table 1a, even more preferred sequences selected from SEQ ID NO 41-80 shown in Table 1b.
  • These sequences represent the displaced strands of various suitable targets. These sequences have the following properties: 1) they include the TTT PAM important for Cpf1 binding to the displaced strand, 2) the sequences are conserved over a large number of HIV strains, 3) the sequences are present in sequences that are important for the HIV virus, and 4) the sequences are not present in the human consensus genome, making it safe to target the protein-RNA complex to these sequences. These sequences ensure that the endonuclease activity of the protein-RNA complex will only be targeted to DNA in HIV-infected cells.
  • Table 1b. SEQ ID NO 1-20 and 41-80 show the sequences of the displaced strands, including the PAM motif (TTTV), although the TTTV motif is not actually displaced but remains hybridized to the target strand.
  • SEQ ID NO 1 is:
  • SEQ ID NO 1 less the TTTN motif is: 5'-TCTATGCCATCTAAAAATAA-3'
  • Since RNA has U instead of T, the guide sequence strand is 5'-UCUAUGCCAUCUAAAAAUAA-3'
  • the guide sequence strand has a 5' "handle" sequence that makes the guide RNA bind to the Cpf1 protein.
  • the handle sequence may be 5'-UAAUUUCUACUCUUGUAGAU-3'.
  • the guide strand sequence is 5'-UAAUUUCUACUCUUGUAGAU-3' + 5'-UCUAUGCCAUCUAAAAAUAA-3' which is 5'-UAAUUUCUACUCUUGUAGAUUCUAUGCCAUCUAAAAAUAA-3'
  • the handle sequence may be 5'-AAUUUCUACUCUUGUAGAUG-3'.
  • the guide strand will be a part of SEQ ID NO 1-20 and SEQ ID NO 41-80.
  • the target sequence will be the reverse complement of each of SEQ ID NO 1-20 and 41-80.
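The worked example for SEQ ID NO 1 (strip the PAM, write T as U, prepend the handle) can be reproduced in a few lines. This is a minimal sketch; the function name `cpf1_guide_strand` is hypothetical, and the input is assumed to be the displaced-strand sequence with the TTTN motif already removed, as in the text.

```python
# Handle sequence quoted in the text for Cpf1.
CPF1_HANDLE = "UAAUUUCUACUCUUGUAGAU"


def cpf1_guide_strand(displaced_without_pam: str,
                      handle: str = CPF1_HANDLE) -> str:
    """Build a Cpf1 guide strand: 5' handle followed by the guide segment.

    The guide segment is the displaced-strand sequence (PAM removed)
    written as RNA, i.e. with every T replaced by U.
    """
    guide_segment = displaced_without_pam.upper().replace("T", "U")
    return handle + guide_segment


if __name__ == "__main__":
    # SEQ ID NO 1 less the TTTN motif, as given in the text.
    print(cpf1_guide_strand("TCTATGCCATCTAAAAATAA"))
```

For this input the handle and the guide segment are 20 nucleotides each, so the assembled strand is 40 nucleotides long, the lower end of the 40 to 44 nucleotide range stated for Cpf1 guide strands.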
  • suitable target RNA sequences for targeting CAS13 to HIV RNA, for example HIV mRNA, include SEQ ID NO 21-40, and even more preferably SEQ ID NO 81-120, shown in Tables 2a and 2b.
  • sequences have the following properties: 1) the sequences are conserved over a large number of HIV strains, 2) the sequences are present in sequences that are important for the HIV virus, and 3) the sequences are not present in the human consensus genome, making it safe to target the protein-RNA complex to these sequences. Protein-RNA complexes with these RNA guide strands cut crucial HIV mRNA.
  • SEQ ID NO 21 is 5'-AACAUAGUAACAGACUCACAAUAUGCAUUA-3'.
  • the guide sequence will be the reverse complement of this sequence, which is
  • the guide strand comprises a so-called "direct repeat sequence" (DRS) ("handle sequence") that is specific for the CAS13 protein used, and which interacts with the CAS13 protein, and may mediate binding of the guide sequence to the CAS13 protein.
  • for some CAS13 proteins the DRS is located 5' of the guide segment and for others the DRS is located 3' of the guide segment.
  • PspCas13b 5'-GUUGUGGAAGGUCCAGUUUUGAGGGGCUAUUACAAC-3' (located 3' of the guide segment)
  • LwaCas13a 5'-GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC-3' (located 5' of the guide segment) (Freije et al, 2019, Molecular Cell 76, 826-837).
  • a suitable guide strand sequence for the Lwa CAS13 protein for targeting SEQ ID NO 21 may be 5'-GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC-3'
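The construction described for CAS13 guides (reverse complement of the target RNA, with the DRS placed 5' of the guide segment for LwaCas13a) can be sketched as follows. The function names are hypothetical and this is an illustration under those stated rules, not a validated guide design; any PFS requirement is ignored.

```python
# Direct repeat sequence quoted in the text for LwaCas13a (placed 5' of the guide).
LWA_CAS13A_DRS = "GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC"

_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence (A-U and G-C pairing)."""
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def lwa_cas13a_guide(target_rna: str, drs: str = LWA_CAS13A_DRS) -> str:
    """DRS followed by the guide segment that base-pairs with target_rna."""
    return drs + reverse_complement_rna(target_rna)


if __name__ == "__main__":
    # SEQ ID NO 21 as given in the text.
    seq_id_21 = "AACAUAGUAACAGACUCACAAUAUGCAUUA"
    print(lwa_cas13a_guide(seq_id_21))
```

For a protein such as PspCas13b, whose DRS sits 3' of the guide segment, the concatenation order would be reversed.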
  • the guide strands for some Cas13 proteins may in addition need a protospacer flanking site (PFS), see for example Abudayyeh et al (2017) Nature,
  • the protein-RNA complexes, or the plasmids or the virus are preferably administered to the patient in the form of a pharmaceutical composition.
  • a pharmaceutical composition comprises an effective amount of the protein-RNA complexes, plasmids or virus ("active component"), and a pharmaceutically acceptable carrier, which typically is an aqueous solution optionally comprising a variety of different pharmacologically acceptable compounds.
  • the formulation is made to suit the mode of administration. There is a wide variety of possible formulations.
  • the formulation may be adapted to increase the uptake or stability of the active component or to improve the pharmacokinetics or pharmacodynamics of the active component, or to enhance other desirable properties of the formulation.
  • the pharmaceutical composition, the complexes and the virus and plasmids described herein are preferably non-naturally occurring or engineered.
  • a protein-RNA complex is delivered. Delivery of the protein-RNA complex can be made in any suitable way. Two reviews that describe useful methods of delivery are: Glass, Lee, Li and Xu, Trends in Biotechnology, 2017, and Liu, Zhang, Liu and Cheng, Journal of Controlled Release, 266 (2017) 17-26.
  • Suitable methods include nanoparticles, for example gold particles, or polymeric carriers, such as polymers obtained from chitosan or poly-caprolactone or poly-lactic/glycolic acid-copolymers.
  • the use of gold particles is a preferred method of delivery (Mout et al (2017) ACS Nano 11, 2452-2458, and Lee et al, Nature Biomedical Engineering volume 1, pages 889-901 (2017)).
  • lipid nanoparticles for example as described in Wang et al., PNAS March 15, 2016 vol. 113 no. 11 2868-2873, and Li et al., Biomaterials 178 (2016) 652 - 662.
  • a plasmid or plasmids encoding the protein and/or the guide RNA is administered to the patient, as is known in the art.
  • the plasmids are preferably adapted for expression of the protein and transcription of the RNA in the cell type of interest which may be a mammalian cell, preferably a human cell.
  • the protein gene and the guide strand gene are preferably under control of suitable promoters that induce expression in these cells.
  • the route of administration, formulation and dose can be as in US Patent No 5,846,946 and as in clinical studies involving plasmids.
  • the guide strand is delivered (as RNA) together with a plasmid that encodes the CRISPR-type protein, or the other way around.
  • a suitable promoter for expression in humans is chosen when the pathogen is a virus.
  • the promoter is preferably chosen to suit the internal transcription system of the pathogen.
  • delivery of the CRISPR type protein or the RNA guide strand is carried out with the use of a virus.
  • the CRISPR-type protein and the guide RNA can be delivered using adeno-associated virus (AAV), lentivirus, adenovirus or other plasmid or viral vector types, in particular, using formulations and doses from, for example, US Patents Nos. 8,454,972 (formulations, doses for adenovirus), 8,404,658 (formulations, doses for AAV) and 5,846,946 (formulations, doses for DNA plasmids) and from clinical trials and publications regarding the clinical trials involving lentivirus, AAV and adenovirus.
  • the route of administration, formulation and dose can be as in US Patent No. 8,454,972 and as in clinical trials involving AAV.
  • the route of administration, formulation and dose can be as in US Patent No. 8,404,658 and as in clinical trials involving adenovirus.
  • each of the sequences encoding the CRISPR-type protein and the guide strand is adapted for expression of the protein in the cell, and adapted for transcription of RNA.
  • the coding sequences are preferably under control of a regulatory element, which typically is a DNA sequence that controls the transcription of the gene of interest.
  • the regulatory element may comprise one or more promoters, enhancers or the like.
  • the regulatory element is chosen to suit the cell in which expression is to be achieved.
  • the regulatory element may be operably linked to the sequences.
  • Each of the sequences encoding the CRISPR-type protein and the guide strand may be operably linked to a separate regulatory element.
  • the genes for the CRISPR-type protein may be codon-optimized for expression in the cells of interest, for example human cells.
  • the CRISPR-type protein and/or the guide strand may be targeted to the nucleus with the addition of nucleus targeting sequences.
  • Formulations may be adapted for parenteral administration, such as intraarticular, intravenous, intradermal, intraperitoneal or subcutaneous administration, and include aqueous and non-aqueous injection solutions.
  • Formulations for injection may be in unit dosage forms, for example ampules, or in multidosage forms.
  • the formulation can be for administration topically, systemically or locally.
  • the formulation can also be provided as an aerosol.
  • the formulations may contain nuclease inhibitors (such as RNase inhibitors), antioxidants, buffers, antibiotics, salts, solutes that render the formulation isotonic, lipids, carriers, diluents, emulsifiers, chelating agents, excipients, fillers, drying agents, binding agents, solubilizers, stabilizers, antimicrobial agents, preservatives and the like.
  • the protein-RNA complex, the plasmids or the virus may be administered to the subject in any suitable manner.
  • the protein-RNA complexes, the plasmids or the virus can be administered by a number of routes including intravenous injection, intraperitoneal, intramuscular, transdermal, subcutaneous, topical, sublingual, or rectal means. Suitable modes of administration include injection or infusion. Intravenous administration is a preferred mode of administration.
  • an effective amount of the protein-RNA complex, the plasmids or the virus is administered to the subject.
  • An effective amount is an amount that is able to treat one or more symptoms of a disease, halt or reverse the progression of a disease.
  • Administration may be carried out at a single time point or repeatedly over a time period or from an implanted slow-release matrix.
  • Other delivery systems include bolus injections, time-release, delayed release, sustained release or controlled release systems.
  • Dosage and administration regimens may be determined by methods known in the art, for example by testing in appropriate in vitro or in vivo models, such as animal models, to analyse efficacy, pharmacokinetics, pharmacodynamics, excretion, tissue uptake and the like.
  • a suitable way of finding a suitable dose is starting with a low amount and gradually increasing the amount.
  • the CRISPR-type protein for use in protein-RNA complexes is preferably produced in a suitable expression system.
  • Production of protein with the use of expression systems is well known in the art. In general, Current protocols in Molecular Biology (John Wiley & sons) provides guidance for polynucleotide handling and manipulation, and protein expression and handling.
  • CRISPR-type protein, in particular Cpf1 and CAS13, can be produced in any suitable manner.
  • Suitable expression systems include eukaryotic cells, such as CHO cells or insect cells, and bacteria. Often, E. coli is the preferred expression system because of its ease of use, and because the CRISPR-type proteins are of bacterial origin.
  • the production of protein involves cloning of the coding sequence for the protein into a plasmid suitable for expression.
  • the plasmid preferably has a promoter that drives expression.
  • the T7 promoter may be useful.
  • the CMV promoter may be useful.
  • the plasmid is introduced into the cells with the use of well-known transfection protocols, and stable or transient expressing cells are generated. Suitable transfection techniques may be the use of electroporation or the use of liposomes, such as Lipofectamine®, or virus-based methods. Clones stably expressing the protein may be selected, expanded and propagated.
  • Expression plasmids for Cpfl are described in Zetsche et al and expression plasmids for CAS13 are described in Abudayyeh et al (see above).
  • the proteins may be expressed with a suitable tag for purification of the protein, such as poly-His tag.
  • Purification of protein may be carried out as is known in the art and may include steps such as: cell lysis, centrifugation, gel filtration, affinity chromatography and dialysis.
  • the protein is preferably purified and endotoxin-free.
  • Useful plasmids for expression of Cpf1 include pTE4396, pTE4396, pAsCpf1(TYCV)(BB) (pY211) and pY010 (pcDNA3.1-hAsCpf1).
  • Useful plasmids for expression of Cas13 include: pC0046-EF1a-PspCas13b-NES-HIV and pC0056-LwCas13a-msfGFP-NES (eukaryotic expression) and p2CT-His-MBP-Lwa_Cas13a_WT (expression in bacteria).
  • RNA guide strand can be produced in any suitable way.
  • a preferred way is chemical synthesis. Methods for synthesis of RNA are well known to a person skilled in the art. RNA synthesis is preferably done in a controlled environment to avoid degradation of RNA by, for example, RNases.
  • the conditions for complexing guide RNA with protein are known.
  • the protein is incubated with the guide RNA in a suitable buffer. Incubation time may be 10 minutes to 30 minutes.
  • the above-mentioned methods for administration of protein-RNA complexes, plasmids and virus can be used to introduce double strand breaks with the use of Cpf1 in pathogenic virus-infected cells (for example HIV-infected cells) or pathogen cells (such as bacteria), or to knock down pathogen (such as viral) RNA using Cas13, in vivo, ex vivo or in vitro. In one embodiment this is done in vitro.
  • the virus-infected cells may be a subpopulation of a larger population of cells, where not all cells are infected with the pathogenic virus.
  • There are known in vitro methods for assessing the efficacy of treatment and delivery. Examples include the methods used in Ueda et al., Microbiology and Immunology, Volume 60, Issue 7, July 2016, 483-496.
  • system 1 automatically determines a suitable RNA guide sequence based on the patient's genome and the genome of the particular pathogen that has infected the patient, and then synthesizes the guide strand.
  • the user provides a biological sample associated with a unique identity to the system 1 and the system 1 then produces an RNA molecule, preferably in complex with a CRISPR-type protein, for treatment of the patient.
  • a ready-to-use pharmaceutical composition is produced.
  • the identity of the sample may be associated with a patient ID and an address for delivery of the pharmaceutical and other useful information, such as sample type, sample date etc.
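The association between a sample identity, a patient ID and a delivery address could be carried as a small record passed from the sequencing computer to the synthesis computer. The sketch below is purely illustrative: the class name `SampleRecord` and all field names are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical record tying a unique sample identity to its metadata."""
    sample_id: str          # unique identity of the biological sample
    patient_id: str         # patient the sample was taken from
    delivery_address: str   # where the finished pharmaceutical is sent
    sample_type: str        # for example "blood"
    sample_date: date       # date the sample was taken
    extra: Dict[str, str] = field(default_factory=dict)  # other useful information


if __name__ == "__main__":
    record = SampleRecord(
        sample_id="S-0001",
        patient_id="P-0042",
        delivery_address="Example Clinic, Example Street 1",
        sample_type="blood",
        sample_date=date(2019, 12, 20),
    )
    print(record)
```

Making the record immutable (`frozen=True`) is a design choice of this sketch, intended to keep the sample-to-patient association from being altered once an order has been placed.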
  • Preferably, system 1, in particular RNA synthesis device 3, is able to handle a large number of samples per time unit in an automated fashion.
  • the system 1 comprises at least one sequencing device 2 and at least one RNA synthesis device 3.
  • the at least one sequencing device 2 and the RNA synthesis device 3 are connected to wide area network 4, for example the internet so that they are able to communicate as is known in the art of computer networking. In this manner the sequencing device 2 and the RNA synthesis devices 3 can be placed at different sites.
  • one RNA synthesis device 3 can be used together with a plurality of nucleotide sequencing devices 2 at different sites, for example as point of care devices.
  • the sequencing devices 2 comprises a sequencing com puter 6 and RNA synthesis device 3 comprises a synthesis computer 7.
  • Sequencing computer 6 may be a client in relation to the synthesis computer 7 of the RNA synthesis device 3.
  • System 1 may comprise functionality for ordering, payment, and shipping of pharmaceuticals. This is preferably handled by sequencing computer 6 of the nucleotide sequencing device 2 and synthesis computer 7 of the RNA synthesis device 3.
  • the sequencing device 2 comprises a nucleotide sequencing unit 5 that is arranged to determine the sequence of polynucleotides in a biological sample.
  • a user, which may be a healthcare professional or the patient, provides a biological sample from the patient to the sequencing device 2, preferably together with data that identifies the user in a unique manner, and the sequencing device 2 then automatically determines the polynucleotide sequence of genetic material in the sample.
  • the sample may comprise blood, serum, mucus, urine, liquor, skin, ascites fluid, liquid from an infected wound, etc.
  • the sample is a liquid sample, such as blood.
  • the volume of the liquid sample may be in the range of 0.1 µl to 20 ml.
  • a non-liquid sample may be used such as cells from the inside of the cheek of the patient obtained with a cotton swab.
  • the at least one sample comprises human genetic material comprising the genome of the patient and also genetic material from the pathogen that has infected the patient.
  • the genetic material that comprises the genome of the patient and the genetic material from the pathogen may be provided in the same sample or in different samples.
  • one sample is a sample that can be presumed to include the pathogen of interest.
  • For HIV, a blood sample is suitable since HIV infects T-lymphocytes.
  • For MRSA bacteria, a sample from an infected wound may be suitable.
  • the polynucleotides in the sample may undergo various procedures including but not limited to 1) extraction and/or purification of polynucleotides from the sample, 2) digestion of the polynucleotides to pieces of smaller length, 3) labelling of polynucleotides, 4) amplification such as amplification using PCR, which is well known in the art, before 5) determination of polynucleotide sequence.
  • US 2018/0169658 and WO 2012/012779 disclose devices that are able to carry out these steps and which may be used as sequencing unit 5.
  • Examples of automated DNA/RNA purification and sample preparation workstations include the Perkin Elmer chemagic™ 360 Nucleic Acid Extractor or the QIAcube HT System from Qiagen. Sequencing may be done as is known in the art, for example using Sanger sequencing or next generation sequencing, for example Illumina-type sequencing. Suitable machinery may include the Illumina MiSeqDx System or the NextSeq 550Dx. Handling of samples between workstations in nucleotide sequencing device 2 may be done using robots under control of sequencing computer 6. Suitable automated liquid handling systems may be provided for example by Agilent, ThermoFisher or PerkinElmer. Perkin Elmer Janus G3 may be a suitable liquid handling robot. A skilled person would know how to program robots to handle samples and to move them from the extraction station to the sequencing station. Identification of samples may for example be done with bar codes and bar code readers.
  • Sequencing computer 6 may control various aspects of the nucleotide sequencing unit 5, for example sample identification, sample handling, and timing of procedures of purification, digestion, amplification and sequencing. Sequencing computer 6 also comprises communication means for communication with RNA synthesis device 3. The sequencing computer 6 is able to store polynucleotide sequence information as sequence data and transfer this data to the synthesis computer 7 of the RNA synthesis device 3 via wide area network 4.
  • the RNA synthesis device 3 comprises a synthesis computer 7 that is able to communicate with the sequencing computer 6 of the sequencing device 2, with the use of a wide area network 4, for example with the use of internet. Communication may be encrypted.
  • Each of sequencing computer 6, and synthesis computer 7 has a memory and a processor and a bus and a communication interface for communication with the other computer, for example a networking port.
  • the sequencing computer 6 and the synthesis computer 7 have suitable means for making input and output.
  • the sequencing computer 6 and the synthesis computer 7 may have suitable means for making input for example a keyboard and may have a display for showing text, numbers and images.
  • Each of sequencing computer 6 and synthesis computer 7 may have a suitable operating system such as for example Windows or Linux.
  • the methods described herein can be implemented using any suitable programming language, such as for example C++, Java or Perl.
  • a suitable computer model, mentioned as an example only, may be a Lenovo X1.
  • a user may be able to change parameters using a user interface, for example to time synthesis (so that it is carried out during night-time) or to decide the amount of RNA guide to be synthesized.
  • Synthesis computer 7 of RNA synthesis device 3 has bioinformatics software able to carry out parts of the methods described herein.
  • Operations of the bioinformatics software may include various operations for polynucleotide sequence data known in the art such as sequence storage, sequence retrieval, sequence modification, sequence alignment, sequence assembly, similarity determination, applying windows, etc.
  • the EMBOSS and/or the UGENE software packages may be used for some of these operations.
  • BLAST is a frequently used tool for alignment of sequences.
  • Synthesis computer 7 is able to determine the sequence of a suitable RNA guide strand to be synthesised and provide the sequence to the RNA synthesis station 8. Sequencing computer 6 may also have such bioinformatics software when genome assembly from reads is carried out by sequencing computer 6.
  • Synthesis computer 7 may also have stored in its memory sequence information ("hot list") for a plurality of suitable predetermined polynucleotide target sequences.
  • sequences that are suitable to be included in a hot list for the treatment of HIV are those of Tables 1b and 2b.
  • the predetermined target sequences of the hot list have been selected so that they are known to be crucial for pathogen viability.
  • Sequences of interest are sequences that are known to be crucial for the pathogen to maintain the pathogen infection in the patient.
  • the sequences in the hot list may be crucial for pathogen virulence, pathogen survival, pathogen replication, pathogen activation (such as the activation of a virus from a latent phase to a lysogenic phase), or similar.
  • Sequences of the hot list may for example be protein coding genes, but may also be regulatory sequences such as promoters.
  • the sequences in the hot list may be ranked according to how efficient they are for treatment, and a sequence may be selected by the system based on how efficient targeting the sequence is. Hence one particular sequence may be selected over other sequences.
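The ranking-based selection from the hot list can be illustrated with a minimal Python sketch. The `HotListEntry` type, its `efficacy` score and the use of exact substring matching in place of a real alignment tool are assumptions made for illustration only; they are not taken from the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HotListEntry:
    name: str        # hypothetical label for the predetermined target
    sequence: str    # DNA target sequence, upper case
    efficacy: float  # hypothetical ranking score; higher means more efficient


def select_from_hot_list(pathogen_seq: str,
                         hot_list: Sequence[HotListEntry]) -> Optional[HotListEntry]:
    """Return the highest-ranked hot-list entry found in the pathogen sequence.

    Exact substring matching stands in for a sequence alignment step.
    """
    present = [e for e in hot_list if e.sequence in pathogen_seq.upper()]
    if not present:
        return None
    return max(present, key=lambda e: e.efficacy)


hot_list = [
    HotListEntry("a", "TTTAGGC", 0.5),
    HotListEntry("b", "ACGTACG", 0.9),
    HotListEntry("c", "GGGGCCCC", 0.99),  # best score, but absent from the sample
]
best = select_from_hot_list("CCTTTAGGCAAACGTACGTT", hot_list)
```

Here entry "c" has the best score but is not found in the pathogen data, so the sketch falls back to the best entry that is actually present.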
  • Sequencing computer 6 or synthesis computer 7 has functionality for assembling sequence reads to longer sequences, and optionally for identifying sequence reads as belonging to the human genome or to a virus.
  • RNA synthesis device 3 comprises a RNA synthesis station 8 that is able to synthesize RNA.
  • RNA synthesis chemistry is known in the art. Synthesis may for example be carried out by adding and covalently attaching one base at a time to a growing RNA chain. Examples of useful RNA synthesis machines include Oligo Synthesizer 192 from Oligomaker APS and ABI 3900 from Biolytic Lab Performance Inc. WO200364026 describes a useful polynucleotide synthesis machine.
  • RNA synthesis station 8 may also provide functionality for purification, adding buffers etc.
  • RNA synthesis station 8 may also have a packaging unit for packing the RNA, the protein-RNA complex or the pharmaceutical product in a container.
  • RNA synthesis station 8 is able to handle each type of RNA molecule in a separate manner, for example in a separate container such as microtubes, 96-well plates, or another suitable format.
  • the RNA synthesis station is able to provide RNA with a high degree of purity, i.e. free of polynucleotides with other sequences than the desired sequence, and free of remaining reagents.
  • RNA synthesis device 3 may also comprise a complex formation station 9 that is able to add the RNA guide strand that has been synthesised to purified CRISPR-type protein to form protein-RNA complexes.
  • Complex formation may for example be induced by incubating the protein with the RNA guide strand in a suitable buffer for a certain time. The incubation time may be for example from 10 to 30 minutes.
  • the complex formation station 9 is preferably also implemented by robots as discussed herein.
  • RNA synthesis station 8 and complex formation station 9 may be under control of synthesis computer 7 which may be able to control for example addition of reagents, reaction timing, and purification steps.
  • a robot may be used for handling samples and the robot may be under control of synthesis computer 7 in a similar manner as in sequencing device 2.
  • Parts of system 1 may be centralized, such as on a server, data store or database, in particular storage of sequences and algorithms for assembling genomes, for alignment, for identification of target sequences or for the in-silico design of RNA guide strands.
  • Sequencing computer 6 or synthesis computer 7 may query the server or database for such services, for example by uploading sequence information for storage and/or analysis.
  • a server or a database may provide sequence information and analysis results to the sequencing computer or synthesis computer.
  • the services may be provided in synthesis computer 7 which, again, may act as a server.
  • the synthesis computer 7 or a server may be a single, standalone computer server, or may be distributed across several physical or virtual computer servers or data stores, as the case may be. Hence, various components and functionalities of synthesis computer 7 may be carried out by different computers.
  • both sequencing computer 6 and synthesis computer 7 are clients in relation to a server which may or may not be physically distant from either or both of these computers.
  • a central server may hence serve as a datastore for centralized storage and analysis of sequences.
  • a method may comprise the following steps:
  • a biological sample is provided to the nucleotide sequencing device 2.
  • the at least one sample comprises human genetic material comprising the genome of the patient and at least some genetic material from the pathogen.
  • the genetic material is in the form of polynucleotides.
  • the genetic material is preferably DNA but may be RNA in the case of the pathogen.
  • genetic material is isolated from the sample. This is done as is known in the art, for example with the use of extraction. The actual intervention with the patient when the sample is taken is not necessarily a part of the invention.
  • in step 102 the genetic material is sequenced, meaning that the nucleotide sequence of the genetic material is determined.
  • the sequence information is stored in the memory as sequence data by the first sequencing computer 6.
  • the sequence data is preferably associated with the identity of the sample and/or patient.
  • Sequencing is typically DNA sequencing but may be sequencing of RNA, for example when using a CRISPR-type protein that has RNA endonuclease activity such as Cas13, and it is desirable to know the mRNA sequence.
  • mRNA sequencing may be done using reverse transcriptase to convert mRNA to DNA.
  • Sequencing coverage for the patient genome is preferably high such that the sequence of the entire genome of the patient is determined. There may be a cut-off for a minimal coverage of the patient genome. The cut-off may be 99%, 99.5% or 99.99% or higher.
  • the method may include the step of discarding the operation if sequencing coverage is below a threshold. Sequencing is preferably carried out to a minimum sequencing read depth, which may be for example 2, 3 or 4 times, or more.
  • the method may include the step of discarding the operation if sequencing depth is below a predetermined threshold. Discarding increases patient safety, because it decreases the risk that the protein-RNA complex is targeted to patient sequences.
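The coverage and depth gate can be sketched as a small check in Python. How coverage and depth are combined is not specified in the source; this sketch assumes one possible reading, in which a position only counts as covered if it was read at least `min_depth` times. The function name and the per-base depth list are hypothetical.

```python
from typing import Sequence


def passes_patient_qc(per_base_depth: Sequence[int],
                      min_coverage: float = 0.99,
                      min_depth: int = 2) -> bool:
    """Return True if enough of the genome was read at sufficient depth.

    per_base_depth holds the number of reads covering each genome position.
    Default thresholds follow the example values in the text (99 %, depth 2).
    """
    if not per_base_depth:
        return False
    covered = sum(1 for d in per_base_depth if d >= min_depth)
    return covered / len(per_base_depth) >= min_coverage


good_run = [3] * 995 + [0] * 5   # 99.5 % of positions at depth 3
poor_run = [3] * 900 + [1] * 100  # only 90 % of positions reach depth 2

proceed_good = passes_patient_qc(good_run)
proceed_poor = passes_patient_qc(poor_run)
```

A run that fails this check would be discarded rather than passed on to guide design.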
  • Sequence coverage for the pathogen genetic material is preferably high but does not need to be as high as for the patient genetic material.
  • the cut-off may be the same as for the patient but may also be 90% or 95%.
  • a very short stretch corresponding to one target sequence which may be as short as 18 nucleotides, more preferably at least 19, 20 or 21, 22, 23 or 24 nucleotides of the pathogen genetic material is determined.
  • sequencing depth may be the same as for the patient genetic material.
  • Short sequencing reads typically have to be assembled in silico into complete sequences. When assembling sequence information, it may have to be determined which reads relate to the patient's genome and which reads relate to the pathogen genome, for example when genetic material from both the patient and the pathogen is likely to be present in the sample, or when sequence reads end up mixed together for other reasons. This may be done using methods known in the art including mapping of reads to a reference genome, preferably a reference human genome and one or more reference pathogen genomes. For example, sequence reads may be connected by overlapping alignments and mapped to a known reference genome sequence. Software for assembly of reads to genomes is provided by for example Illumina Inc, and Illumina also provides sequencing equipment.
  • Separation of sequence information may also be done based on biochemical properties and subsequent separate sequencing operations.
  • human nuclear genomic material (which is in the form of chromosomes) may be separated from pathogen genetic material in the form of plasmid DNA or RNA.
  • the sequence information is stored as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genome of the pathogen.
  • the sequence information is then, in step 103, provided to the RNA synthesis device 3.
  • sequence information is provided as non-assembled reads by sequencing computer 6 to synthesis computer 7 of RNA synthesis device 3, and synthesis computer 7 carries out assembly.
  • assembly of reads may be done by sequencing computer 6 of the client sequencing device 2 or by synthesis computer 7 of the RNA synthesis device.
  • the sequence information, as assembled genetic information or as reads, is provided to the RNA synthesis device 3.
  • the sequence information is provided together with the identity of the sample.
  • in step 104 at least one suitable polynucleotide target sequence with a suitable length is found in the pathogen sequence data set.
  • the length of the target sequence may be from 18 to 60 nucleotides, more preferably 19-50 and most preferably 20-35 nucleotides.
  • the target sequence is from 20-24 nucleotides, in particular when Cpf1 is used as the CRISPR-type protein.
  • a suitable RNA guide strand that targets the target sequence is then designed in silico by synthesis computer 7 as described below.
  • the target sequence should preferably not be present in the patient sequence data set (see below). It may be required that the target sequence is present in the hot list of sequences stored in synthesis computer 7.
  • Step 104 may comprise the step of identification of the pathogen as described below.
  • the RNA guide strand is synthesized by RNA synthesis station 8.
  • the synthesis computer 7 instructs the RNA synthesis station 8 to synthesize the desired RNA molecule.
  • RNA synthesis is carried out as is known in the art.
  • constant parts of the sequence, such as the handle sequence, are not added in silico, but can be added in a "hardware" form during the RNA synthesis step 105.
  • the RNA guide strand is provided with a high degree of purity. The RNA guide strand may need to be purified to get rid of reagents remaining from the synthesis.
  • RNA molecule is provided in a form that is suitable for further processing, in particular for inclusion in a pharmaceutical composition.
  • the RNA guide strand may be provided in a suitable form, for example in a microtube or in a 96-well plate format, or other suitable format.
  • RNA is preferably handled under controlled conditions to prevent degradation by RNases.
  • RNase inhibitors may be used to prevent degradation.
  • Low temperature around +4°C
  • RNA synthesis station 8 and protein complex formation station 9 may be equipped with a cooling element or other cooling means.
  • the RNA guide strand may then preferably be combined with other substances to form a pharmaceutical composition in step 106.
  • the RNA molecule is complexed with the CRISPR-type protein, preferably Cpf1 or Cas13. This is done by complex formation station 9.
  • the RNA molecule is incubated with an aliquot of the protein for a certain time under suitable conditions.
  • the complexes may then be used in a pharmaceutical composition suitable for the purpose.
  • the method may include the step of mixing an aliquot of the RNA guide strand, or the protein-RNA guide strand with suitable components to obtain a pharmaceutical composition.
  • the method may comprise the step of then packing the pharmaceutical composition or the RNA guide strand or the protein-RNA complex in a suitable container, such as a vial, and sealing the container.
  • the pharmaceutical composition is preferably manufactured in a controlled environment, such as a clean room, to avoid contamination by microbes, RNases and other contaminants.
  • the controlled environment may be provided with air purification systems such as air filters, air locks, staff dress requirements, entry permits, UV light, disinfectant and sterilization procedures, etc. as is known in the art.
  • the pharmaceutical composition may then be distributed to the patient or a caregiver for use by the patient.
  • the patient may be a patient that is suspected of being infected with a pathogen.
  • the method may comprise the step of using the pathogen sequence data set to determine a) that the patient has been infected with a pathogen and b) the type of pathogen that has infected the patient.
  • the system may be used for screening and/or diagnosis of the patient.
  • sequence reads from the sample may be aligned to (tested against) consensus genome sequences of known pathogens or pathogen strains using a sequence alignment tool, such as for example BLAST.
  • sequences of a number of pathogens may be included in a dataset and the sample sequences are then aligned (or attempted to be aligned) to the sequences of the dataset.
  • the sample may be deemed to contain genetic material from the pathogen. It may be required that a minimal number of genes or loci of a pathogen genome should be present, such as 2, 3, 4 or more. A minimum sequencing depth may be required.
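The minimum-loci rule for deeming a pathogen present can be sketched in Python as follows. Exact substring matching stands in for a real alignment tool such as BLAST, and the reference dictionary layout, locus names and function name are all hypothetical.

```python
from typing import Dict, Iterable, List


def identify_pathogens(reads: Iterable[str],
                       references: Dict[str, Dict[str, str]],
                       min_loci: int = 2) -> Dict[str, List[str]]:
    """Return pathogens for which at least min_loci reference loci are hit.

    references maps pathogen name -> {locus name -> reference sequence}.
    A locus counts as hit if any read occurs in it (toy alignment stand-in).
    """
    reads = [r.upper() for r in reads]
    found: Dict[str, List[str]] = {}
    for pathogen, loci in references.items():
        hit = sorted(locus for locus, ref in loci.items()
                     if any(r in ref.upper() for r in reads))
        if len(hit) >= min_loci:
            found[pathogen] = hit
    return found


references = {
    "virusX": {"gag": "ACGTACGTTTGACC", "pol": "GGGCCCAAATTTGG"},
    "virusY": {"env": "TTTTTTTTCCCCCC", "nef": "AGAGAGAGAGAGAG"},
}
reads = ["CGTTTGAC", "CCAAATTT", "TTTTCCCC"]
detected = identify_pathogens(reads, references, min_loci=2)
```

In this toy data set only one locus of "virusY" is hit, so it is not reported, while "virusX" is reported with two loci.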
  • examples of pathogens include HIV (Human Immunodeficiency Virus), HPV (Human Papilloma Virus), Herpes type 1, Herpes type 2, Hepatitis A, Hepatitis B, Hepatitis C, Streptococcus, Salmonella, Klebsiella, E. coli, Shigella, Malaria, influenza type A, B or C, respiratory syncytial virus, MERS coronavirus, Lassa fever virus, Marburg virus, Ebola virus, Nipah virus, Zika virus, West Nile virus or Rift Valley fever virus.
  • Information about the identified pathogen may be stored together with information about the identity of the sample or the patient.
  • Step 104 may be carried out as in Fig. 6.
  • in step 200, sequences in the pathogen sequence data set that are present in the hot list are identified. This can be carried out using sequence alignment tools. This step is preferably carried out when the protein-RNA complex comprises a CRISPR-type protein that has RNA endonuclease activity, for example Cas13.
  • in step 201 at least one PAM motif for a CRISPR-type protein is identified in the pathogen sequence data set. If step 200 is carried out, only sequence data that remains in the pathogen sequence data set after filtering in step 200 is considered for step 201.
  • the PAM motif may be a PAM motif for Cpf1.
  • the Cpf1 PAM motif may be TTN, more preferably TTT and most preferably TTTV, where V is any nucleotide except T.
  • Step 201 is necessary where the CRISPR-type protein requires a PAM sequence for binding to a target sequence and can be left out when a PAM sequence is not necessary.
  • a CRISPR-type protein putative target sequence with appropriate length is selected, see above.
  • the target sequence is positioned in relation to the PAM sequence depending on which CRISPR-type protein is used.
  • the TTN PAM motif is located on the displaced strand and is located 5' of the sequence that is complementary to the target sequence (see Fig. 2A of Yamano et al).
  • the putative target sequence is for Cpf1.
  • a preferred length for a Cpf1 target sequence is 20-24 nucleotides.
  • a target sequence of 20-24 nucleotides which is complementary to a sequence 3' of PAM is selected.
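The PAM search and the selection of the adjacent sequence can be sketched in Python. This is a minimal illustration under stated assumptions: it scans only the strand given (a fuller version would also scan the reverse complement), it uses the most preferred TTTV motif, and the function name and fixed `length` parameter are hypothetical.

```python
import re
from typing import List, Tuple

# TTTV, where V is any nucleotide except T; the lookahead allows overlapping hits.
_PAM = re.compile(r"(?=(TTT[ACG]))")


def candidate_sites(seq: str, length: int = 22) -> List[Tuple[int, str]]:
    """Return (PAM position, sequence immediately 3' of the PAM) pairs.

    Only candidates with the full requested length are kept; the text gives
    20-24 nucleotides as the preferred range for Cpf1.
    """
    seq = seq.upper()
    sites = []
    for m in _PAM.finditer(seq):
        start = m.start() + 4          # first base after the 4-nt PAM
        site = seq[start:start + length]
        if len(site) == length:
            sites.append((m.start(), site))
    return sites


demo = "GG" + "TTTA" + "ACGT" * 5 + "AC" + "GG"
sites = candidate_sites(demo, length=22)
```

Each returned sequence lies on the PAM-containing strand; the target sequence in the sense of the text is its complement.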
  • in step 203 the synthesis computer 7 checks that the putative target sequence is not present in the genome of the patient. This is done by attempting to align the sequence of the putative target strand (or the guide strand) with the patient genome data set using suitable cut-offs. For example, the BLAST algorithm can be used. This should be done with both strands of the human genome. If no alignment takes place, the putative guide strand sequence is not present in the human genome and the sequence can be used for creating an RNA guide strand in silico.
  • Steps 200, 201, 202 and 203 may be carried out in any order, but it may be useful to carry out step 203 last as this reduces processing time. Step 203 can also be carried out after step 204.
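The patient-genome check of step 203 can be sketched in Python. Exact matching is used here as a simplified stand-in for an alignment with suitable cut-offs (a real system would use a tool such as BLAST and tolerate mismatches); the function names are hypothetical. Searching for the reverse complement on one strand is equivalent to searching for the sequence on the opposite strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def absent_from_patient(candidate: str, patient_genome: str) -> bool:
    """Return True if the candidate occurs on neither strand of the genome."""
    candidate = candidate.upper()
    genome = patient_genome.upper()
    return (candidate not in genome
            and reverse_complement(candidate) not in genome)


# The candidate is absent from the first genome but present on the
# opposite strand of the second one.
safe = absent_from_patient("AAACCG", "GGGGGGGGGG")
unsafe = absent_from_patient("AAACCG", "GGCGGTTTGG")
```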
  • a sequence for an RNA guide strand is designed in silico.
  • the guide strand should fulfil the following criteria: 1) it should selectively bind to the target sequence, i.e. be complementary to the target sequence, i.e. able to hybridize to the target sequence, 2) it may comprise a handle sequence which interacts with the CRISPR-type protein.
  • the handle sequence may have the sequence UAAUUUCUACUCUUGUAGAU or AAUUUCUACUCUUGUAGAU for Cpf1.
  • the sequence of the guide strand will be identical to a part of the sequence 3' of the TTTV motif.
  • the guide strand is designed by taking the 5'end of the sequence following the TTTV motif and connecting it to the 3'end of the handle sequence.
  • the guide strand sequence can be used for testing against the human genome.
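The in silico guide design can be sketched in Python: the DNA sequence following the TTTV motif is transcribed to RNA letters and joined to the 3' end of the handle. The handle sequence is the one given in the text; the function name and input validation are assumptions made for illustration.

```python
HANDLE = "UAAUUUCUACUCUUGUAGAU"  # Cpf1 handle sequence given in the text


def design_guide(site_after_pam: str, handle: str = HANDLE) -> str:
    """Build a guide strand: 5'-handle followed by the site as RNA-3'.

    site_after_pam is the DNA sequence immediately 3' of the TTTV motif.
    """
    site = site_after_pam.upper()
    if set(site) - set("ACGT"):
        raise ValueError("site must contain only A, C, G and T")
    return handle + site.replace("T", "U")


guide = design_guide("ACGTACGTACGTACGTACGT")
```

The resulting string is what would be handed to the synthesis station; the same sequence (as DNA) can be reused for the check against the human genome.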
  • Sequencing computer 6 and synthesis computer 7 and communication between these parts are implemented using digital computer technology for storing and handling digital information and signals as well as suitable hardware and software, including suitable digital processors, digital memories, input means, output means, buses and communications interfaces.
  • a user may be able to make input using for example a keyboard, a mouse or a touch screen.
  • Output to a user, if necessary, may be provided on for example a display.
  • a computer such as a PC or a server may have an operating system.
  • Communication between digital devices such as sequencing computer 6 and synthesis computer 7 may be implemented using suitable networking technologies and protocols, including cellular communication such as 3G, 4G and 5G, Wi-Fi or Bluetooth, or Ethernet.
  • Data communication can be wireless, or wire bound.
  • Information may be exchanged over a wide area net such as internet 4.
  • Communication in system 1 and update of data in system 1, such as communication between sequencing computer 6 and synthesis computer 7, may be carried out using any suitable schedule, such as when needed or on a predetermined schedule, for example every second, every minute or every ten minutes.
  • the methods herein are preferably carried out automatically by system 1.
  • a user may provide input in the form of for example patient data and the required volume of pharmaceutical to be produced, and the system 1 then automatically produces an active component such as a guide strand or a protein-RNA complex or a pharmaceutical.
  • the system 1 can provide an RNA guide strand or a pharmaceutical or an RNA-protein complex (or other product such as plasmids or viruses) automatically after a biological sample has been provided to the system 1.
  • Genomes for a large number of HIV subtypes (approx. 3000 subtypes) were downloaded from www.hiv.lanl.gov. Data was imported into a table in a relational database.
  • An algorithm was used to search in the database of Example 1 for target sequences that comprise a PAM sequence for Cpf1. Target sequences were scored for how many of the HIV virus genomes have them. The most conserved sequences were selected for further processing. It was checked that none of the selected sequences were present in the consensus human genome. The sequences were then selected on the basis that they should be present in sequences that are important for virus survival, replication or activation. The results are shown in Table 1a and 1b. The score shows the percentage of strains that carry the sequence. In Table 1b it is shown how many strains carry the sequence. EXAMPLE 3
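The conservation scoring described above can be sketched in Python. The actual algorithm and database schema are not given in the source; this sketch assumes genomes are held as plain strings, that a candidate is a TTTV motif plus the bases that follow it, and that the score is the percentage of genomes containing the exact candidate. The function name and `length` parameter are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Sequence


def conservation_scores(genomes: Sequence[str], length: int = 24) -> Dict[str, float]:
    """Score each PAM-containing candidate by the percentage of genomes carrying it.

    A candidate is a TTTV motif followed by (length - 4) further bases.
    Each genome is counted at most once per candidate.
    """
    pattern = re.compile(r"(?=(TTT[ACG][ACGT]{%d}))" % (length - 4))
    counts: Counter = Counter()
    for genome in genomes:
        counts.update({m.group(1) for m in pattern.finditer(genome.upper())})
    return {seq: 100.0 * n / len(genomes) for seq, n in counts.items()}


toy_genomes = ["GGTTTACCGGAA", "AATTTACCGGTT", "CCCCCCCCCCCC"]
scores = conservation_scores(toy_genomes, length=8)
most_conserved = max(scores, key=scores.get)
```

With the toy data, the single candidate is carried by two of the three genomes; in practice the top-scoring candidates would then go through the human-genome check and the functional filtering described in the text.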
  • RNA guide strands for targeting each of SEQ ID NO 1 - 20 and 41-80 are synthesized.
  • Each guide strand consists of the 5' handle sequence UAAUUUCUACUCUUGUAGAU followed by a sequence that is complementary to one of SEQ ID NO 1 to 20 less the 5' TTTN motif.
  • RNA-protein complexes with each of the RNA guide strands and Cpf1 protein are formed.
  • Gold particles are formed as described in Lee et al., Nature Biomedical Engineering, volume 1, pages 889-901 (2017).
  • Each of the sixty different complexes is tested in a suitable in vitro model, for example the model used in Ueda et al., Microbiology and Immunology, Volume 60, Issue 7, July 2016, 483-496.
  • Cas13 protein will be produced as in Abudayyeh et al. Sixty different guide RNAs that comprise sequences complementary to each of sequences SEQ ID NO 21 to 40 and 81 to 120 are synthesized. RNA-protein complexes with each of the RNA guide strands and Cas13 protein are formed. Gold particles are formed as described in Lee et al., Nature Biomedical Engineering, volume 1, pages 889-901 (2017).
  • the gold particles with the RNA protein complexes are provided to HIV infected T-lymphocytes in culture.
  • Each of the sixty different complexes is tested in a suitable in vitro model, for example the model used in Ueda et al., Microbiology and Immunology, Volume 60, Issue 7, July 2016, 483-496. While the invention has been described with reference to specific exemplary embodiments, the description is in general only intended to illustrate the inventive concept and should not be taken as limiting the scope of the invention. The invention is generally defined by the claims.

Abstract

There is provided a system comprising at least one sequencing device and at least one RNA synthesis device, where the sequencing device comprises a first computer and the RNA synthesis device comprises a second computer, where the sequencing device is configured to a) receive at least one biological sample comprising genetic material in the form of polynucleotides, b) determine the polynucleotide sequence of the genetic material and store it as sequence data, c) provide the sequence data to the RNA synthesis device; where the RNA synthesis device is configured to receive the sequence data, where the sequencing device or the RNA synthesis device is configured to store the sequence data as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen, where the RNA synthesis device is configured to: d) find, in the pathogen sequence data set, at least one polynucleotide target sequence with a length from 18 to 60 nucleotides, where said target sequence is not present in the patient sequence data set, e) synthesize an RNA guide strand for a CRISPR-type protein that is complementary to the target sequence of step d), the RNA guide strand additionally comprising a predetermined handle sequence that mediates binding to the CRISPR-type protein.

Description

System for production of CRISPR-based pharmaceutical compositions
FIELD OF TECHNOLOGY
This invention relates to a system for providing CRISPR-related type therapies of infections, for example a virus infection.
BACKGROUND
Endonucleases are enzymes that cleave polynucleotides. Clustered regularly interspaced short palindromic repeat (CRISPR)-type proteins are endonucleases that use an RNA guide strand to cut DNA or RNA at specific sites. CRISPR-type proteins can be used for editing of DNA or for RNA knockdown. Cpf1 and Cas13 are two recently identified CRISPR-type proteins. The use of Cpf1 for genome editing is described in Zetsche et al. (Zetsche et al., 2015, Cell 163, 759-771). The use of Cas13 for knockdown has been reported in Abudayyeh et al. (Abudayyeh et al. (2017) Nature, 550, 280-284).
Pathogens, such as viruses, bacteria and eukaryotic parasites, are still a major cause of suffering and death. There is a need for improved therapies for infections.
SUMMARY OF INVENTION
In a first aspect of the invention there is provided a method involving a system, said system comprising at least one sequencing device, preferably at least two sequencing devices, and at least one RNA synthesis device, where the sequencing device comprises a first computer and the RNA synthesis device comprises a second computer, where the computer of the RNA synthesis device and the computer of the nucleotide sequencing device communicate via a wide area network, the method involving the following steps:
a) providing at least one biological sample from a patient to the sequencing device, where the at least one biological sample comprises human genetic material comprising the genome of the patient and genetic material from a pathogen that has infected the patient, said genetic material being in the form of polynucleotides,
b) the sequencing device determining the polynucleotide sequence of the genetic material, where the genomic sequence of the patient is determined and at least a part of the sequence of the genetic material of the pathogen is determined, and where the determined polynucleotide sequence is stored as sequence data,
c) the sequencing device providing the sequence data to the RNA synthesis device, and the RNA synthesis device receiving the sequence data from the sequencing device, where the sequencing device or the RNA synthesis device stores the sequence data as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen,
d) the RNA synthesis device finding, in the pathogen sequence data set, at least one polynucleotide target sequence with a length from 18 to 60 nucleotides, where said target sequence is not present in the patient sequence data set,
e) the RNA synthesis device determining an RNA guide strand for a CRISPR-type protein that is complementary to the target sequence of step d),
f) the RNA synthesis device synthesizing the RNA guide strand, where the RNA guide strand comprises a predetermined handle sequence that interacts with the CRISPR-type protein.
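As an illustration only, step d) can be sketched as an exact-match window scan. The function names below are hypothetical and not part of the described system, and the sketch assumes that "not present in the patient sequence data set" means that the window does not occur on either strand of the patient sequence. A practical implementation would probably also have to tolerate mismatches and work on indexed genome data rather than plain strings.

```python
# Hypothetical sketch of step d): find pathogen windows absent from the patient data set.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all windows of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def find_targets(pathogen_seq, patient_seq, length=20):
    """Return (position, window) pairs from pathogen_seq that do not occur
    on either strand of patient_seq (exact matching only)."""
    if not 18 <= length <= 60:
        raise ValueError("target length must be from 18 to 60 nucleotides")
    patient_kmers = kmers(patient_seq, length) | kmers(
        reverse_complement(patient_seq), length
    )
    hits = []
    for i in range(len(pathogen_seq) - length + 1):
        window = pathogen_seq[i:i + length]
        if window not in patient_kmers:
            hits.append((i, window))
    return hits
```

Precomputing the patient windows as a set keeps each lookup constant-time, which matters because the patient data set is far larger than the pathogen data set.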
Step c) can be carried out as follows: storing the determined polynucleotide sequences as sequence data in a computer memory of a computer of the system, where the sequence data is stored as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen.
The method provides personalized medicine for the treatment of an infection by a pathogen with the use of CRISPR technology. CRISPR technology uses RNA guide strands. Importantly, RNA is much more sensitive to degradation than DNA. For example, RNA degradation by RNases is a problem when handling RNA. RNA synthesis and handling must therefore be done in a controlled environment. Furthermore, the algorithm for selecting guide strands can be kept at a central computer, which provides security and easy updates. Moreover, pharmaceuticals can be produced in a sterile fashion.
In a preferred embodiment the system comprises at least two sequencing devices connected to the RNA synthesis device. This provides centralized and RNase-free synthesis of RNA guide strands.
In one embodiment identifying the target sequence in step d) comprises identifying a sequence that comprises a Cpf1 PAM sequence motif.
In one embodiment the pathogen is a virus selected from HIV, HPV, Hepatitis B and Herpes. The system may have stored sequence information for a number of predetermined target sequences, in which case step d) comprises finding, in the pathogen data set, a target sequence that is comprised in the sequence of at least one of the predetermined target sequences. The predetermined target sequences may be ranked according to expected efficacy, and the target sequence with the highest expected efficacy may then be selected.
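The selection among ranked predetermined target sequences could look roughly like the sketch below. The function name and the representation of the ranking as (sequence, expected efficacy) pairs are assumptions made for illustration, and the sketch simplifies the matching to an exact substring test against the pathogen data set.

```python
# Hypothetical sketch: pick the best-ranked predetermined target found in the pathogen data.
def select_ranked_target(pathogen_seq, ranked_targets):
    """ranked_targets is a list of (sequence, expected_efficacy) pairs.
    Return the sequence with the highest expected efficacy that occurs in
    pathogen_seq, or None if no predetermined target is found."""
    present = [(seq, score) for seq, score in ranked_targets if seq in pathogen_seq]
    if not present:
        return None
    best_seq, _best_score = max(present, key=lambda pair: pair[1])
    return best_seq
```

Returning None when nothing matches leaves it to the caller to fall back on a de novo search of the pathogen data set.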
Step d) may comprise identifying a sequence that comprises a Cpf1 PAM sequence motif.
In one embodiment the RNA synthesis device additionally has a protein-RNA complex formation station that is configured to use the RNA guide strand synthesised in step f) and to use an aliquot of a CRISPR-type protein to form a complex between the RNA guide strand and the protein.
The method may comprise the step of mixing the RNA guide strand or the protein-RNA guide complex with suitable components to obtain a pharmaceutical composition, and then packing the pharmaceutical composition in a suitable container and sealing the container.
In a second aspect of the invention there is provided a system comprising at least one sequencing device, preferably at least two sequencing devices, and at least one RNA synthesis device, where the sequencing device comprises a first computer and the RNA synthesis device comprises a second computer, where the sequencing device is configured to
a) receive at least one biological sample comprising genetic material in the form of polynucleotides,
b) determine the polynucleotide sequence of the genetic material and store it as sequence data,
c) provide the sequence data to the RNA synthesis device;
where the RNA synthesis device is configured to receive the sequence data, where the sequencing device or the RNA synthesis device is configured to store the sequence data as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen, where the RNA synthesis device is configured to:
d) find, in the pathogen sequence data set, at least one polynucleotide target sequence with a length from 18 to 60 nucleotides, where said target sequence is not present in the patient sequence data set,
e) synthesize an RNA guide strand for a CRISPR-type protein that is complementary to the target sequence of step d), the RNA guide strand additionally comprising a predetermined handle sequence that mediates binding to the CRISPR-type protein.
The second computer may have stored sequence information for a number of predetermined target sequences, in which case step d) comprises finding, in the pathogen data set, a target sequence that is comprised in the sequence of at least one of the predetermined target sequences. Step d) may comprise identifying a sequence that comprises a Cpf1 PAM sequence motif. The predetermined target sequences may be ranked according to expected efficacy, and the system may then be configured to select the target sequence with the highest expected efficacy.
The RNA synthesis device of the system may additionally have a protein-RNA complex formation station that is configured to use the RNA guide strand synthesised in step e) and to use an aliquot of a CRISPR-type protein to form a complex between the RNA guide strand and the protein.
BRIEF DESCRIPTION OF DRAWINGS
Fig. 1 is a schematic diagram showing an RNA guide strand.
Fig. 2 is a schematic diagram showing a system.
Fig. 3 is a schematic diagram of a polynucleotide sequencing device.
Fig. 4 is a schematic diagram of an RNA synthesis device.
Figs. 5-6 are flowcharts showing methods.
DETAILED DESCRIPTION
In brief, a recombinant CRISPR-type protein in complex with an RNA guide strand (protein-RNA complex) that targets the CRISPR-type protein to the DNA or the RNA of a pathogen is used for treating an infection in a patient. The protein-RNA complex specifically cuts polynucleotides of the pathogen that causes the infection. The patient may be a human or an animal, preferably a mammalian animal. In a preferred embodiment the patient is a human.
In a more general sense, the protein-RNA complex may be used to cause double strand breaks in the DNA of pathogen-infected cells, or to knock down RNA in a pathogen-infected cell. The protein-RNA complex may be directed to target a genomic locus of interest of the pathogen.
In a preferred embodiment the disease being treated is an infection caused by a pathogen such as a virus, a bacterium or a eukaryotic parasite, such as a fungus. In a preferred embodiment the infection is caused by a bacterium or a virus, most preferably a virus, and most preferably a virus chosen from the group consisting of HIV (Human Immunodeficiency Virus), HPV (Human Papilloma Virus), Herpes type 1, Herpes type 2 and Hepatitis B. In a preferred embodiment the virus infection being treated is an HIV infection, preferably HIV-1.
In one embodiment the virus is an ssRNA virus, in particular when Cas13 is used, for example influenza type A, B or C, respiratory syncytial virus, MERS coronavirus, hepatitis A virus, hepatitis C virus, Lassa fever virus, Marburg virus, Ebola virus, Nipah virus, Zika virus, West Nile virus or Rift Valley fever virus.
CRISPR-type proteins use an RNA guide strand to cut DNA or RNA. A guide strand is an RNA molecule that binds to the CRISPR-type protein and guides the CRISPR-type protein to a certain polynucleotide target sequence. The guide strand is able to hybridize with the target strand (Watson-Crick base pairing). A complex between a CRISPR-type protein and an RNA guide strand is referred to as a protein-RNA complex herein.
As used herein, "handle motif" and "handle sequence" refer to an RNA sequence that interacts with a CRISPR-type protein, for example by mediating binding between an RNA guide strand and a CRISPR-type protein. Examples of handle sequences for Cpf1 and Cas13 are given below.
There are many different useful CRISPR-type proteins. Preferably the CRISPR-type protein has endonuclease activity that causes double strand breaks. The most studied CRISPR protein is CRISPR/Cas9, which cuts DNA, leaving blunt ends. CRISPR/Cas9 has been used for editing of eukaryotic genomes (Cong et al, Science 339, (2013) 819-823, Mali et al, Science (2013) 823-826).
CRISPR/Cas9 uses a 42-nucleotide RNA guide strand and in addition a second strand (the so-called tracrRNA strand), which may be 89 nucleotides long.
In a preferred embodiment, one of the CRISPR-type proteins Cpf1 or Cas13 is used. The use of Cpf1 for genome editing is described in Zetsche et al (Zetsche et al., 2015, Cell 163, 759-771). The use of Cas13 for RNA knockdown has been reported in Abudayyeh et al (Abudayyeh et al (2017) Nature, 550, 280-284).
In contrast to CRISPR/Cas9, Cpf1 (Zetsche et al., 2015, Cell 163, 759-771) cuts DNA in a staggered manner, leaving sticky ends with a 4 or 5 nucleotide 5' overhang. This makes it more difficult for the DNA repair system to repair the cut than if a blunt end is created. The unligated DNA may inhibit the pathogen in different ways including but not limited to: 1) triggering apoptosis of a virus-infected cell, 2) causing the death of a pathogen, for example a bacterium. The pathogen is preferably a pathogen that has its genomic material in the form of DNA during at least some part of its life cycle. Many virus genomes become integrated into the host genomic DNA. For example, HIV becomes integrated into the genomic DNA of infected T-lymphocytes.
The Cpf1 protein may be a Cpf1 protein from Francisella novicida, Acidaminococcus sp. BV3L6 or Lachnospiraceae bacterium ND2006, in particular Acidaminococcus sp. BV3L6 or Lachnospiraceae bacterium ND2006. A useful variant of Cpf1 is Alt-R Cas12a.
CRISPR/Cas13 cuts RNA and can be used for knockdown of pathogen RNA. This may limit pathogen survival, replication or activation, or may cause the death of the pathogen. The Cas13 protein may be Cas13 from Leptotrichia wadei (Abudayyeh et al (2017) Nature, 550, 280-284). Useful variants of Cas13 include PspCas13b, LwaCas13a, LbuCas13a, LshCas13a, LwaCas13 and PsmCas13.
Other useful CRISPR-type proteins edit polynucleotides by inserting an extra base in the polynucleotides, for example mRNA, leading to a frameshift and premature stop of translation.
References herein to CRISPR/Cas9, Cpf1 and Cas13 also include functional equivalents and homologues of these proteins. Thus, modified or truncated proteins are included, provided that they have the same or comparable nuclease activity as the endogenous CRISPR/Cas9, Cpf1 and Cas13 proteins. A homologue may have an amino acid identity with the original protein sequence of at least 70%, more preferably at least 80%, even more preferably at least 90%, even more preferably at least 95% and most preferably at least 99%, using amino acid sequence alignment in BLAST (for example BLAST2 sequences) using the following settings: word size: 3, gap costs: 11, 1, Matrix: BLOSUM62, Filter string: F, Window Size 40, Threshold 11.
With reference to Fig 1, the guide strand for Cpf1 preferably has a length of from 40 to 44, more preferably 41 to 44 nucleotides and comprises a 5' constant motif (handle sequence) which may be 5'-AAUUUCUACUCUUGUAGAU-3' or 5'-UAAUUUCUACUCUUGUAGAU-3'. The handle sequence interacts with Cpf1 and may be important for complexing with Cpf1 or for Cpf1 activity.
The guide segment is 21-24 nucleotides long and is located 3'-terminal to the handle sequence. The RNA is provided as single stranded RNA but parts, in particular parts of the handle sequence, may form a secondary structure. The guide strand hybridizes with a target strand of double stranded DNA. The opposite strand is referred to as the "displaced strand".
In addition, the guide strand for Cpf1 may have an unspecific 5' extension of from 3 to 59 or more nucleotides, as described in Park et al., Nature Communications, (2018) 9:3313, DOI: 10.1038/s41467-018-05641-3, in order to increase the efficacy. The 5' extension is preferably not homologous to the human genome. For example, it may be a scrambled sequence. It has been hypothesized that such a 5' extension increases efficacy by providing a negative charge.
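A scrambled 5' extension could, for instance, be generated as in the following sketch. The function name, the default length of 9 nucleotides and the use of a seeded pseudo-random generator are assumptions for illustration. The sketch does not check the extension for homology to the human genome, which would need a separate search.

```python
# Hypothetical sketch: prepend a random (scrambled) 5' extension to a Cpf1 guide strand.
import random


def add_5prime_extension(guide, length=9, seed=None):
    """Return guide with a random RNA extension of the given length at its 5' end."""
    if length < 3:
        raise ValueError("the 5' extension should be at least 3 nucleotides long")
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    extension = "".join(rng.choice("ACGU") for _ in range(length))
    return extension + guide
```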
Some CRISPR-type proteins, including Cpf1, use a PAM (Protospacer Adjacent Motif) to recognise target sequences. The minimal PAM for Cpf1 is TTN. The TTN motif used for Cpf1 is preferably TTT, even more preferably TTTV where V is any nucleotide except T. The PAM is located on the displaced strand and is not recognized by the guide strand of the protein-RNA complex but by the interaction between the TTN nucleotides and amino acid residues of the Cpf1 protein. Cpf1 cuts the displaced strand approximately 18-19 nucleotides from the TTTN PAM and cuts the target strand approximately 24-25 nucleotides from the TTTN PAM, leaving a 4-5 nucleotide overhang.
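Locating candidate sites by their PAM can be sketched as a simple pattern scan. The function name and the default protospacer length of 20 nucleotides are assumptions for illustration, and the sketch scans only the strand it is given, so a full search would also scan the reverse complement.

```python
# Hypothetical sketch: find TTTV PAM sites and the protospacer that follows each one.
import re


def find_cpf1_sites(displaced_strand, guide_len=20):
    """Return (pam_position, pam, protospacer) for every TTTV motif that is
    followed by at least guide_len nucleotides on displaced_strand."""
    sites = []
    # The lookahead lets overlapping motifs be reported as separate matches.
    for match in re.finditer(r"(?=(TTT[ACG]))", displaced_strand):
        start = match.start()
        protospacer = displaced_strand[start + 4:start + 4 + guide_len]
        if len(protospacer) == guide_len:
            sites.append((start, match.group(1), protospacer))
    return sites
```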
For treatment of HIV in humans with a protein-RNA complex comprising Cpf1 protein, the protein-RNA complex is preferably directed to a sequence selected from SEQ ID NO 1 to SEQ ID NO 20 shown in Table 1a, even more preferably a sequence selected from SEQ ID NO 41-80 shown in Table 1b. These sequences represent the displaced strands of various suitable targets. These sequences have the following properties: 1) they include the TTT PAM important for Cpf1 binding to the displaced strand, 2) the sequences are conserved over a large number of HIV strains, 3) the sequences are present in sequences that are important for the HIV virus, and 4) the sequences are not present in the human consensus genome, making it safe to target the protein-RNA complex to these sequences. These sequences ensure that the endonuclease activity of the protein-RNA complex will only be targeted to DNA in HIV-infected cells.
Figure imgf000009_0001
Table 1a.
Figure imgf000010_0001
Table 1b.
SEQ ID NO 1-20 and 41-80 show the sequences of the displaced strands, including the PAM motif (TTTV), although the TTTV motif is not actually displaced but remains hybridized to the target strand. For obtaining a suitable RNA guide sequence from SEQ ID NO 1-20 and 41-80 the following operations are performed, shown with the sequence 5'-TTTATCTATGCCATCTAAAAATAA-3' as example:
1. Remove PAM.
SEQ ID NO 1 is:
5'-TTTATCTATGCCATCTAAAAATAA-3'
SEQ ID NO 1 less the TTTN motif is:
5'-TCTATGCCATCTAAAAATAA-3'
2. Replace T with U:
Since RNA has U instead of T, the guide sequence strand is 5'-UCUAUGCCAUCUAAAAAUAA-3'
3. Add "handle sequence"
The guide sequence strand has a 5' "handle" sequence that makes the guide RNA bind to the Cpf1 protein. The handle sequence may be 5'-UAAUUUCUACUCUUGUAGAU-3'.
Thus, the guide strand sequence is 5'-UAAUUUCUACUCUUGUAGAU-3' + 5'-UCUAUGCCAUCUAAAAAUAA-3', which is 5'-UAAUUUCUACUCUUGUAGAUUCUAUGCCAUCUAAAAAUAA-3'.
For use with Cpf1 from Acidaminococcus sp. the handle sequence may be 5'-AAUUUCUACUCUUGUAGAUG-3'.
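The three operations above (remove the PAM, replace T with U, add the handle sequence) can be expressed as a short function. This is a minimal sketch: the function and constant names are hypothetical, and the sketch assumes a 4-nucleotide TTTN PAM at the 5' end of the displaced-strand sequence, as in the SEQ ID NO 1 example.

```python
# Hypothetical sketch of the worked example: displaced strand -> Cpf1 guide strand.
CPF1_HANDLE = "UAAUUUCUACUCUUGUAGAU"  # handle sequence given in the text


def cpf1_guide(displaced_strand, handle=CPF1_HANDLE, pam_len=4):
    """Build a Cpf1 guide strand from a displaced-strand DNA sequence."""
    if not displaced_strand.startswith("TTT"):
        raise ValueError("sequence does not start with a TTTN PAM")
    guide_segment = displaced_strand[pam_len:]        # 1. remove the PAM
    guide_segment = guide_segment.replace("T", "U")   # 2. replace T with U
    return handle + guide_segment                     # 3. add the handle sequence


if __name__ == "__main__":
    print(cpf1_guide("TTTATCTATGCCATCTAAAAATAA"))
```

For SEQ ID NO 1 this reproduces the 40-nucleotide guide strand given in the text.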
Note that, because the TTN motif is on the opposite strand of the target strand, the guide segment will be a part of SEQ ID NO 1-20 and SEQ ID NO 41-80 (with U in place of T). The target sequence will be the reverse complement of each of SEQ ID NO 1-20 and 41-80.
Examples of suitable target RNA sequences for targeting Cas13 to HIV RNA, for example HIV mRNA, include SEQ ID NO 21-40, and even more preferably SEQ ID NO 81-120, shown in Tables 2a and 2b. These sequences have the following properties: 1) the sequences are conserved over a large number of HIV strains, 2) the sequences are present in sequences that are important for the HIV virus, and 3) the sequences are not present in the human consensus genome, making it safe to target the protein-RNA complex to these sequences. Protein-RNA complexes with these RNA guide strands cut crucial HIV mRNA.
Figure imgf000012_0001
Table 2a.
Figure imgf000013_0001
Table 2b.
Similar operations as described above for Cpf1 can be done with SEQ ID NO 21-40 and SEQ ID NO 81-120. However, because these sequences are targeted by Cas13, which targets RNA, the guide strand will comprise a guide segment that is the reverse complement of one of SEQ ID NO 21-40 and SEQ ID NO 81-120. SEQ ID NO 21 will be used as an example:
SEQ ID NO 21 is 5'-AACAUAGUAACAGACUCACAAUAUGCAUUA-3'. The guide sequence will be the reverse complement of this sequence, which is
5'-UAAUGCAUAUUGUGAGUCUGUUACUAUGUU-3'
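The reverse complement operation used for SEQ ID NO 21 can be sketched as follows. The function name is hypothetical, and the sketch assumes an uppercase RNA sequence containing only A, C, G and U.

```python
# Hypothetical sketch: reverse complement of an RNA target sequence.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(rna):
    """Return the reverse complement of an RNA string (A<->U, C<->G)."""
    return rna.translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(rna_reverse_complement("AACAUAGUAACAGACUCACAAUAUGCAUUA"))
```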
The guide strand comprises a so-called "direct repeat sequence" (DRS) ("handle sequence") that is specific for the Cas13 protein used, and which interacts with the Cas13 protein, and may mediate binding of the guide sequence to the Cas13 protein. For some Cas13 proteins, the DRS is located 5' of the guide segment and for others the DRS is located 3' of the guide segment.
Below are some examples of DRS for Cas13 proteins:
PspCas13b: 5'-GUUGUGGAAGGUCCAGUUUUGAGGGGCUAUUACAAC-3' (located 3' of the guide segment)
LwaCas13a: 5'-GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC-3' (located 5' of the guide segment) (Freije et al, 2019, Molecular Cell 76, 826-837).
PsmCas13b: 5'-GUUGUAGAAGCUUAUCGUUUGGAUAGGUAUGACAAC-3' (Freije et al, 2019, Molecular Cell 76, 826-837).
Thus, a suitable guide strand sequence for the LwaCas13a protein for targeting SEQ ID NO 21 may be 5'-GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC-3' + 5'-UAAUGCAUAUUGUGAGUCUGUUACUAUGUU-3', which is
5'-GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAACUAAUGCAUAUUGUGAGUCUGUUACUAUGUU-3'.
The guide strands for some Cas13 proteins (but not Cas13a from Lwa or Cas13b from Psp) may in addition need a protospacer flanking site (PFS), see for example Abudayyeh et al (2017) Nature, 550, 280-284, where the PFS for Leptotrichia shahii Cas13a is discussed. The PFS may be a preference for H (= not G). Reference is made to Smargon, Cox, Pyzocha et al., Molecular Cell 2017, and Cox, Gootenberg, Abudayyeh et al., Science 2017.
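Assembly of a Cas13 guide strand, with the DRS placed on the side stated above for each protein, might be sketched as below. The dictionary layout and function names are assumptions for illustration, and only the two proteins for which the text states the DRS position are included. The PFS helper simply encodes "H (= not G)" for an RNA base, without assuming where the flanking base sits.

```python
# Hypothetical sketch: assemble a Cas13 guide strand from a target RNA sequence.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

# DRS sequence and its position relative to the guide segment, as given in the text.
CAS13_DRS = {
    "LwaCas13a": ("GAUUUAGACUACCCCAAAAACGAAGGGGACUAAAAC", "5prime"),
    "PspCas13b": ("GUUGUGGAAGGUCCAGUUUUGAGGGGCUAUUACAAC", "3prime"),
}


def cas13_guide(target_rna, protein):
    """Return DRS plus guide segment (reverse complement of target_rna)."""
    drs, side = CAS13_DRS[protein]
    guide_segment = target_rna.translate(_RNA_COMPLEMENT)[::-1]
    return drs + guide_segment if side == "5prime" else guide_segment + drs


def pfs_allows(flanking_base):
    """True if the flanking base satisfies an H (A, C or U, i.e. not G) preference."""
    return flanking_base in ("A", "C", "U")
```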
The protein-RNA complexes, or the plasmids or the virus are preferably administered to the patient in the form of a pharmaceutical composition. Such a pharmaceutical composition comprises an effective amount of the protein-RNA complexes, plasmids or virus ("active component"), and a pharmaceutically acceptable carrier, which typically is an aqueous solution optionally comprising a variety of different pharmacologically acceptable compounds. The formulation is made to suit the mode of administration. There is a wide variety of possible formulations. The formulation may be adapted to increase the uptake or stability of the active component or to improve the pharmacokinetics or pharmacodynamics of the active component, or to enhance other desirable properties of the formulation. The pharmaceutical composition, the complexes and the virus and plasmids described herein are preferably non-naturally occurring or engineered.
In certain embodiments a protein-RNA complex is delivered. Delivery of the protein-RNA complex can be made in any suitable way. Two reviews that describe useful methods of delivery are: Glass, Lee, Li and Xu, Trends in Biotechnology, 2017, and Liu, Zhang, Liu and Cheng, Journal of Controlled Release, 266 (2017) 17-26.
Suitable methods include nanoparticles, for example gold particles, or polymeric carriers, such as polymers obtained from chitosan or poly-caprolactone or poly-lactic/glycolic acid copolymers. The use of gold particles is a preferred method of delivery (Mout et al (2017) ACS Nano 11, 2452-2458, and Lee et al, Nature Biomedical Engineering volume 1, pages 889-901 (2017)).
Another preferred method of delivery is lipid nanoparticles, for example as described in Wang et al., PNAS March 15, 2016 vol. 113 no. 11 2868-2873, and Li et al., Biomaterials 178 (2018) 652 - 662.
In other embodiments, a plasmid or plasmids encoding the protein and/or the guide RNA is administered to the patient, as is known in the art. The plasmids are preferably adapted for expression of the protein and transcription of the RNA in the cell type of interest, which may be a mammalian cell, preferably a human cell. For example, the protein gene and the guide strand gene are preferably under control of suitable promoters that induce expression in these cells. A skilled person knows how to achieve expression in mammalian cells. For plasmid delivery, the route of administration, formulation and dose can be as in US Patent No 5,846,946 and as in clinical studies involving plasmids. In some embodiments, the guide strand is delivered (as RNA) together with a plasmid that encodes the CRISPR-type protein, or the other way around. A suitable promoter for expression in humans is chosen when the pathogen is a virus. When the plasmid or plasmids are delivered to pathogens that are bacteria or eukaryotes for expression in the pathogen, the promoter is preferably chosen to suit the internal transcription system of the pathogen.
In other embodiments delivery of the CRISPR-type protein or the RNA guide strand is carried out with the use of a virus. The CRISPR-type protein and the guide RNA can be delivered using adeno-associated virus (AAV), lentivirus, adenovirus or other plasmid or viral vector types, in particular using formulations and doses from, for example, US Patents Nos. 8,454,972 (formulations, doses for adenovirus), 8,404,658 (formulations, doses for AAV) and 5,846,946 (formulations, doses for DNA plasmids) and from clinical trials and publications regarding the clinical trials involving lentivirus, AAV and adenovirus. For example, for AAV, the route of administration, formulation and dose can be as in US Patent No. 8,454,972 and as in clinical trials involving AAV. For adenovirus, the route of administration, formulation and dose can be as in US Patent No. 8,404,658 and as in clinical trials involving adenovirus.
When a virus or a plasmid is used, each of the sequences encoding the CRISPR-type protein and the guide strand is adapted for expression of the protein in the cell, and adapted for transcription of RNA. Thus, the coding sequences are preferably under control of a regulatory element, which typically is a DNA sequence that controls the transcription of the gene of interest. The regulatory element may comprise one or more promoters, enhancers or the like. The regulatory element is chosen to suit the cell in which expression is to be achieved. The regulatory element may be operably linked to the sequences. Each of the sequence encoding the CRISPR-type protein and the sequence encoding the guide strand may be operably linked to a separate regulatory element. The genes for the CRISPR-type protein may be codon-optimized for expression in the cells of interest, for example human cells. The CRISPR-type protein and/or the guide strand may be targeted to the nucleus with the addition of nucleus targeting sequences.
Multiple guide strands that each target one separate sequence may be delivered simultaneously, for example with the use of a plasmid that encodes separate guide strands or one long RNA that is broken up into a plurality of guide strands with the use of a nuclease activity.
Formulations may be adapted for parenteral administration, such as for example intraarticular, intravenous, intradermal, intraperitoneal, or subcutaneous administration, and include aqueous and non-aqueous injection solutions. Formulations for injection may be in unit dosage forms, for example ampules, or in multidosage forms. The formulation can be for administration topically, systemically or locally. The formulation can also be provided as an aerosol.
The formulations may contain nuclease inhibitors (such as RNase inhibitors), antioxidants, buffers, antibiotics, salts, solutes that render the formulation isotonic, lipids, carriers, diluents, emulsifiers, chelating agents, excipients, fillers, drying agents, binding agents, solubilizers, stabilizers, antimicrobial agents, preservatives and the like.
The protein-RNA complex, the plasmids or the virus may be administered to the subject in any suitable manner. The protein-RNA complexes, the plasmids or the virus can be administered by a number of routes including intravenous injection, intraperitoneal, intramuscular, transdermal, subcutaneous, topical, sublingual, or rectal means. Suitable modes of administration include injection or infusion. Intravenous administration is a preferred mode of administration.
Preferably an effective amount of the protein-RNA complex, the plasmids or the virus is administered to the subject. An effective amount is an amount that is able to treat one or more symptoms of a disease, or to halt or reverse the progression of a disease.
Administration may be carried out at a single time point or repeatedly over a time period or from an implanted slow-release matrix. Other delivery systems include bolus injections, time-release, delayed release, sustained release or controlled release systems.
Dosage and administration regimens may be determined by methods known in the art, for example with testing in appropriate in vitro or in vivo models, such as animal models to analyse efficacy, pharmacokinetics, pharmacodynamics, excretion, tissue uptake and the like by methods known in the art. A suitable way of finding a suitable dose is starting with a low amount and gradually increasing the amount.
The CRISPR-type protein for use in protein-RNA complexes is preferably produced in a suitable expression system. Production of protein with the use of expression systems is well known in the art. In general, Current Protocols in Molecular Biology (John Wiley & Sons) provides guidance for polynucleotide handling and manipulation, and protein expression and handling. CRISPR-type protein, in particular Cpf1 and Cas13, can be produced in any suitable manner. Suitable expression systems include eukaryotic cells, such as CHO cells or insect cells, and bacteria. Often, E. coli is the preferred expression system because of its ease of use, and because the CRISPR-type proteins are of bacterial origin. Typically, the production of protein involves cloning of the coding sequence for the protein into a plasmid suitable for expression. The plasmid preferably has a promoter that drives expression. For expression in E. coli the T7 promoter may be useful. For expression in mammalian cells, the CMV promoter may be useful. The plasmid is introduced into the cells with the use of well-known transfection protocols, and stably or transiently expressing cells are generated. Suitable transfection techniques may be the use of electroporation or the use of liposomes, such as Lipofectamine®, or virus-based methods. Clones stably expressing the protein may be selected, expanded and propagated.
Expression plasmids for Cpf1 are described in Zetsche et al and expression plasmids for Cas13 are described in Abudayyeh et al (see above). The proteins may be expressed with a suitable tag for purification of the protein, such as a poly-His tag.
Purification of protein may be carried out as is known in the art and may include steps such as: cell lysis, centrifugation, gel filtration, affinity chromatography and dialysis. The protein is preferably purified and endotoxin-free. Useful plasmids for expression of Cpf1 include pTE4396, pTE4396, pAsCpf1(TYCV)(BB) (pY211) and pY010 (pcDNA3.1-hAsCpf1). Useful plasmids for expression of Cas13 include: pC0046-EF1a-PspCas13b-NES-HIV and pC0056-LwCas13a-msfGFP-NES (eukaryotic expression) and p2CT-His-MBP-Lwa_Cas13a_WT (expression in bacteria).
The RNA guide strand can be produced in any suitable way. A preferred way is chemical synthesis. Methods for synthesis of RNA are well known to a person skilled in the art. RNA synthesis is preferably done in a controlled environment to avoid degradation of RNA by, for example, RNases.
The conditions for complexing guide RNA with protein are known. Typically, the protein is incubated with the guide RNA in a suitable buffer. Incubation time may be 10 minutes to 30 minutes.
The above-mentioned methods for administration of protein-RNA complexes, plasmids and virus can be used to introduce double strand breaks with the use of Cpf1 in cells infected with a pathogenic virus (for example HIV) or in pathogen cells (such as bacteria), or to knock down pathogen (such as viral) RNA using Cas13, in vivo, ex vivo or in vitro. In one embodiment this is done in vitro. The virus-infected cells may be a subpopulation of a larger population of cells, where not all cells are infected with the pathogenic virus. There are known in vitro methods for assessing the efficacy of treatment and delivery. Examples include the methods used in Ueda et al, Microbiology and Immunology, Volume 60, Issue 7, July 2016, 483-496.
SYSTEM
With reference to Figs. 2 to 6, system 1 automatically determines a suitable RNA guide sequence based on the patient's genome and the genome of the particular pathogen that has infected the patient, and then synthesizes the guide strand. The user provides a biological sample associated with a unique identity to the system 1, and the system 1 then produces an RNA molecule, preferably in complex with a CRISPR-type protein, for treatment of the patient. Preferably a ready-to-use pharmaceutical composition is produced. The identity of the sample may be associated with a patient ID and an address for delivery of the pharmaceutical and other useful information, such as sample type, sample date etc. Preferably system 1, in particular RNA synthesis device 3, is able to handle a large number of samples per time unit in an automated fashion.
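The association between a sample identity and the accompanying information could be held in a simple record such as the one sketched below. The class name, the field names and the use of a random UUID as the unique identity are assumptions for illustration only.

```python
# Hypothetical sketch: a record linking a unique sample identity to patient and delivery data.
import uuid
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class SampleRecord:
    patient_id: str
    delivery_address: str
    sample_type: str
    sample_date: date
    # A random hex identifier stands in for the unique sample identity.
    sample_id: str = field(default_factory=lambda: uuid.uuid4().hex)
```

Making the record immutable is one way to keep the identity from being altered as the sample moves between the sequencing and synthesis devices.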
The system 1 comprises at least one sequencing device 2 and at least one RNA synthesis device 3. Preferably there is one RNA synthesis device 3 and preferably there is a plurality of sequencing devices 2, such that there are 2, 3, 4, 5, or more sequencing devices 2. The at least one sequencing device 2 and the RNA synthesis device 3 are connected to wide area network 4, for example the internet, so that they are able to communicate as is known in the art of computer networking. In this manner the sequencing device 2 and the RNA synthesis device 3 can be placed at different sites. In particular, one RNA synthesis device 3 can be used together with a plurality of nucleotide sequencing devices 2 at different sites, for example as point of care devices. This provides local sequence determination and analysis (for example at point of care) combined with centralized RNA synthesis. The sequencing device 2 comprises a sequencing computer 6 and the RNA synthesis device 3 comprises a synthesis computer 7. Sequencing computer 6 may be a client in relation to the synthesis computer 7 of the RNA synthesis device 3. System 1 may comprise functionality for ordering, payment, and shipping of pharmaceuticals. This is preferably handled by sequencing computer 6 of the nucleotide sequencing device 2 and synthesis computer 7 of the RNA synthesis device 3.
The sequencing device 2 comprises a nucleotide sequencing unit 5 that is arranged to determine the sequence of polynucleotides in a biological sample. A user, which may be a healthcare professional or the patient, provides a biological sample from the patient to the sequencing device 2, preferably together with data that identifies the user in a unique manner, and the sequencing device 2 then automatically determines the polynucleotide sequence of genetic material in the sample.
Any suitable type of sample can be used. For example, the sample may comprise blood, serum, mucus, urine, liquor, skin, ascites fluid, liquid from an infected wound, etc. Preferably the sample is a liquid sample, such as blood. The volume of the liquid sample may be in the range of 0.1 µl to 20 ml. Alternatively, a non-liquid sample may be used, such as cells from the inside of the cheek of the patient obtained with a cotton swab.
The at least one sample comprises human genetic material comprising the genome of the patient and also genetic material from the pathogen that has infected the patient. The genetic material that comprises the genome of the patient and the genetic material from the pathogen may be provided in the same sample or in different samples. Preferably one sample is a sample that can be presumed to include the pathogen of interest. For HIV a blood sample is suitable since HIV infects T-lymphocytes. For MRSA bacteria a sample from an infected wound may be suitable.
The polynucleotides in the sample may undergo various procedures including but not limited to 1) extraction and/or purification of polynucleotides from the sample, 2) digestion of the polynucleotides to pieces of smaller length, 3) labelling of polynucleotides, 4) amplification such as amplification using PCR, which is well known in the art, before 5) determination of polynucleotide sequence.
US 2018/0169658 and WO 2012/012779 disclose devices that are able to carry out these steps and which may be used as sequencing unit 5.
Examples of automated DNA/RNA purification and sample preparation workstations that may be used include the PerkinElmer chemagic™ 360 Nucleic Acid Extractor or the QIAcube HT System from Qiagen. Sequencing may be done as is known in the art, for example using Sanger sequencing or next generation sequencing, for example Illumina-type sequencing. Suitable machinery may include the Illumina MiSeqDx System or the NextSeq 550Dx.
Handling of samples between workstations in nucleotide sequencing device 2 may be done using robots under control of sequencing computer 6. Suitable automated liquid handling systems may be provided for example by Agilent, ThermoFisher or PerkinElmer. The PerkinElmer Janus G3 may be a suitable liquid handling robot. A skilled person would know how to program robots to handle samples and to move them from the extraction station to the sequencing station. Identification of samples may for example be done with bar codes and bar code readers.
Sequencing computer 6 may control various aspects of the nucleotide sequencing unit 5, for example sample identification, sample handling, and timing of procedures of purification, digestion, amplification and sequencing. Sequencing computer 6 also comprises communication means for communication with RNA synthesis device 3. The sequencing computer 6 is able to store polynucleotide sequence information as sequence data and transfer this data to the synthesis computer 7 of the RNA synthesis device 3 via wide area network 4.
The RNA synthesis device 3 comprises a synthesis computer 7 that is able to communicate with the sequencing computer 6 of the sequencing device 2, with the use of a wide area network 4, for example with the use of internet. Communication may be encrypted.
Each of sequencing computer 6 and synthesis computer 7 has a memory, a processor, a bus and a communication interface for communication with the other computer, for example a networking port. The sequencing computer 6 and the synthesis computer 7 have suitable means for making input and output, and may have for example a keyboard for input and a display for showing text, numbers and images. Each of sequencing computer 6 and synthesis computer 7 may have a suitable operating system such as for example Windows or Linux. The methods described herein can be implemented using any suitable programming language, such as for example C++, Java or Perl. A suitable computer model, mentioned as an example only, may be a Lenovo X1.
A user may be able to change parameters using a user interface, for example to time synthesis (so that it is carried out during night-time) or to decide the amount of RNA guide to be synthesized.
Synthesis computer 7 of RNA synthesis device 3 has bioinformatics software able to carry out parts of the methods described herein. Operations of the bioinformatics software may include various operations for polynucleotide sequence data known in the art such as sequence storage, sequence retrieval, sequence modification, sequence alignment, sequence assembly, similarity determination, applying windows, etc. For example, the EMBOSS and/or the UGENE software packages may be used for some of these operations. BLAST is a frequently used tool for alignment of sequences. Synthesis computer 7 is able to determine the sequence of a suitable RNA guide strand to be synthesised and provide the sequence to the RNA synthesis station 8. Sequencing computer 6 may also have such bioinformatics software when genome assembly from reads is carried out by sequencing computer 6.
Synthesis computer 7 may also have stored in its memory sequence information ("hot list") for a plurality of suitable predetermined polynucleotide target sequences. Examples of sequences that are suitable to be included in a hot list for the treatment of HIV are those of Tables 1b and 2b. The predetermined target sequences of the hot list have been selected so that they are known to be crucial for pathogen viability. Sequences of interest are sequences that are known to be crucial for the pathogen to maintain the pathogen infection in the patient. The sequences in the hot list may be crucial for pathogen virulence, pathogen survival, pathogen replication, pathogen activation (such as the activation of a virus from a latent phase to a lysogenic phase), or similar. Sequences of the hot list may for example be protein coding genes, but may also be regulatory sequences such as promoters. The sequences in the hot list may be ranked according to how efficient they are for treatment, and a sequence may be selected by the system based on how efficient targeting the sequence is. Hence one particular sequence may be selected over other sequences.
Sequencing computer 6 or synthesis computer 7 has functionality for assembling sequence reads into longer sequences, and optionally for identifying sequence reads as belonging to the human genome or to a virus.
RNA synthesis device 3 comprises an RNA synthesis station 8 that is able to synthesize RNA. RNA synthesis chemistry is known in the art. Synthesis may for example be carried out by adding and covalently attaching one base at a time to a growing RNA chain. Examples of useful RNA synthesis machines include Oligo Synthesizer 192 from Oligomaker APS and ABI 3900 from Biolytic Lab Performance Inc. WO 2003/064026 describes a useful polynucleotide synthesis machine. RNA synthesis station 8 may also provide functionality for purification, adding buffers etc. RNA synthesis station 8 may also have a packaging unit for packing the RNA, the protein-RNA complex or the pharmaceutical product in a container.
RNA synthesis station 8 is able to handle each type of RNA molecule in a separate manner, for example in a separate container such as microtubes, 96-well plates, or another suitable format. Preferably the RNA synthesis station is able to provide RNA with a high degree of purity, i.e. free of polynucleotides with other sequences than the desired sequence, and free of remaining reagents.
RNA synthesis device 3 may also comprise a complex formation station 9 that is able to add the RNA guide strand that has been synthesised to purified CRISPR-type protein to form protein-RNA complexes. Complex formation may for example be induced by incubating the protein with the RNA guide strand in a suitable buffer for a certain time. The incubation time may be for example from 10 to 30 minutes. The complex formation station 9 is preferably also implemented by robots as discussed herein.
RNA synthesis station 8 and complex formation station 9 may be under control of synthesis computer 7, which may be able to control for example addition of reagents, reaction timing, and purification steps. A robot may be used for handling samples and the robot may be under control of synthesis computer 7 in a similar manner as in sequencing device 2.
Parts of system 1 may be centralized, such as on a server, data store or database, in particular storage of sequences and algorithms for assembling genomes, for alignment, for identification of target sequences or for the in-silico design of RNA guide strands. Sequencing computer 6 or synthesis computer 7 may query the server or database for such services, for example by uploading sequence information for storage and/or analysis. Hence a server or a database may provide sequence information and analysis results to the sequencing computer or synthesis computer. The services may be provided in synthesis computer 7 which, again, may act as a server. The synthesis computer 7 or a server may be a single, standalone computer server, or may be distributed across several physical or virtual computer servers or data stores, as the case may be. Hence, various components and functionalities of synthesis computer 7 may be carried out by different computers.
In one embodiment, both sequencing computer 6 and synthesis computer 7 are clients in relation to a server which may or may not be physically distant from either or both of these computers. A central server may hence serve as a datastore for centralized storage and analysis of sequences.
With reference to Fig. 5, a method may comprise the following steps: In step 100 a biological sample is provided to the nucleotide sequencing device 2. The at least one sample comprises human genetic material comprising the genome of the patient and at least some genetic material from the pathogen. The genetic material is in the form of polynucleotides. The genetic material is preferably DNA but may be RNA in the case of the pathogen. In optional step 101 genetic material is isolated from the sample. This is done as is known in the art, for example with the use of extraction. The actual intervention with the patient when the sample is taken is not necessarily a part of the invention.
In step 102 the genetic material is sequenced, meaning that the nucleotide sequence of the genetic material is determined. The sequence information is stored in the memory as sequence data by the sequencing computer 6. The sequence data is preferably associated with the identity of the sample and/or patient. Sequencing is typically DNA sequencing but may be sequencing of RNA, for example when using a CRISPR-type protein that has RNA endonuclease activity such as CAS13 and it is desirable to know the mRNA sequence. mRNA sequencing may be done using reverse transcriptase to convert mRNA to DNA.
Sequencing coverage for the patient genome is preferably high, such that the sequence of the entire genome of the patient is determined. There may be a cut-off for a minimal coverage of the patient genome. The cut-off may be 99%, 99.5% or 99.99% or higher. The method may include the step of discarding the operation if sequencing coverage is below a threshold. Sequencing is preferably carried out to a minimum sequencing read depth, which may be for example 2, 3, or 4 times, or more. The method may include the step of discarding the operation if sequencing depth is below a predetermined threshold. Discarding increases patient safety, because it decreases the risk that the protein-RNA complex is targeted to patient sequences.
Sequence coverage for the pathogen genetic material is preferably high but does not need to be as high as for the patient genetic material. The cut-off may be the same as for the patient but may also be 90% or 95%. As a minimal requirement, a very short stretch of the pathogen genetic material corresponding to one target sequence is determined, which may be as short as 18 nucleotides, more preferably at least 19, 20, 21, 22, 23 or 24 nucleotides. There may also be a requirement for sequencing depth, which may be the same as for the patient genetic material.
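The coverage and depth checks described above could, as a minimal sketch, be expressed as follows. The class name `SequencingQC`, the function `passes_qc` and the chosen threshold constants are illustrative assumptions; the threshold values are picked from the example ranges given in the text and are not prescribed values.

```python
from dataclasses import dataclass


@dataclass
class SequencingQC:
    """Summary statistics for one sequence data set (hypothetical structure)."""
    coverage: float  # fraction of the reference covered, 0.0 to 1.0
    min_depth: int   # lowest read depth over the covered positions


# Assumed thresholds, chosen from the example values in the description.
PATIENT_MIN_COVERAGE = 0.99
PATHOGEN_MIN_COVERAGE = 0.90
MIN_DEPTH = 2


def passes_qc(patient: SequencingQC, pathogen: SequencingQC) -> bool:
    """Return False when the operation should be discarded."""
    if patient.coverage < PATIENT_MIN_COVERAGE or patient.min_depth < MIN_DEPTH:
        return False
    if pathogen.coverage < PATHOGEN_MIN_COVERAGE or pathogen.min_depth < MIN_DEPTH:
        return False
    return True
```

In this sketch a run is discarded as soon as either data set fails its own cut-off, which reflects the safety rationale given above for the patient genome.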
Short sequencing reads typically have to be assembled in silico into complete sequences. When assembling sequence information, it may have to be determined which reads relate to the patient's genome and which reads relate to the pathogen genome. This is the case, for example, when genetic material from both the patient and the pathogen is present in the sample or when sequence reads end up mixed together for other reasons. This may be done using methods known in the art including mapping of reads to a reference genome, preferably a reference human genome and one or more reference pathogen genomes. For example, sequence reads may be connected by overlapping alignments and mapped to a known reference genome sequence. Software for assembly of reads to genomes is provided by for example Illumina Inc, which also provides sequencing equipment.
Separation of sequence information may also be done based on biochemical properties and subsequent separate sequencing operations. For example, human nuclear genomic material (which is in the form of chromosomes) may be separated from pathogen genetic material in the form of plasmid DNA or RNA.
The sequence information is stored as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genome of the pathogen. The sequence information is then, in step 103, provided to the RNA synthesis device 3.
In an alternative embodiment, sequence information is provided as non-assembled reads by sequencing computer 6 to synthesis computer 7 of the RNA synthesis device 3, and synthesis computer 7 carries out assembly. Thus, assembly of reads may be done by sequencing computer 6 of the client sequencing device 2 or by synthesis computer 7 of the RNA synthesis device 3. In step 103 the sequence information, as assembled genetic information or as reads, is provided to the RNA synthesis device 3. The sequence information is provided together with the identity of the sample.
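As an illustration only, the sequence information passed in step 103 could be packaged as a simple message that carries the sample identity and a flag stating whether the data are assembled. The structure `SequenceSubmission`, its field names and the use of JSON are assumptions made for this sketch; the description does not prescribe a data format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SequenceSubmission:
    """Hypothetical message from sequencing computer 6 to synthesis computer 7."""
    sample_id: str            # identity of the sample
    assembled: bool           # True: assembled sequences, False: raw reads
    patient_sequences: list   # patient sequence data set
    pathogen_sequences: list  # pathogen sequence data set


def to_message(submission: SequenceSubmission) -> str:
    """Serialize the submission for transfer over the network."""
    return json.dumps(asdict(submission))


def from_message(message: str) -> SequenceSubmission:
    """Rebuild the submission on the receiving side."""
    return SequenceSubmission(**json.loads(message))
```

The `assembled` flag lets the receiving side decide whether it still has to carry out assembly, corresponding to the two embodiments described above.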
In step 104 at least one suitable polynucleotide target sequence with a suitable length is found in the pathogen sequence data set. The length of the target sequence may be from 18 to 60 nucleotides, more preferably 19-50 and most preferably 20-35 nucleotides. In one embodiment the target sequence is from 20-24 nucleotides, in particular when Cpf1 is used as the CRISPR-type protein. A suitable RNA guide strand that targets the target sequence is then designed in silico by synthesis computer 7 as described below. The target sequence should preferably not be present in the patient sequence data set (see below). It may be required that the target sequence is present in the hot list of sequences stored in synthesis computer 7. Step 104 may comprise the step of identification of the pathogen as described below.
The designed sequence is provided to the RNA synthesis station 8. In step 105 an RNA guide strand is synthesized by RNA synthesis station 8. The synthesis computer 7 instructs the RNA synthesis station 8 to synthesize the desired RNA molecule. RNA synthesis is carried out as is known in the art. Alternatively, constant parts of the sequence, such as the handle sequence, are not added in silico but are added in a "hardware" form during the RNA synthesis step 105.
Preferably the RNA guide strand is provided with a high degree of purity. The RNA guide strand may need to be purified to get rid of reagents remaining from the synthesis. Preferably the RNA molecule is provided in a form that is suitable for further processing, in particular for inclusion in a pharmaceutical composition. The RNA guide strand may be provided in a suitable form, for example in a microtube or in a 96-well plate format, or another suitable format.
RNA is preferably handled under controlled conditions to prevent degradation by RNases. RNase inhibitors may be used to prevent degradation. Low temperature (around +4°C) also prevents degradation of RNA (and also of protein). Hence RNA synthesis station 8 and protein complex formation station 9 may be equipped with a cooling element or other cooling means.
The RNA guide strand may then preferably be combined with other substances to form a pharmaceutical composition in step 106. In a preferred embodiment the RNA molecule is complexed with the CRISPR-type protein, preferably Cpf1 or CAS13. This is done by complex formation station 9. For example, the RNA molecule is incubated with an aliquot of the protein for a certain time under suitable conditions. The complexes may then be used in a pharmaceutical composition suitable for the purpose. The method may include the step of mixing an aliquot of the RNA guide strand, or of the protein-RNA guide strand complex, with suitable components to obtain a pharmaceutical composition. The method may comprise the step of then packing the pharmaceutical composition, the RNA guide strand or the protein-RNA complex in a suitable container, such as a vial, and sealing the container. The pharmaceutical composition is preferably manufactured in a controlled environment, such as a clean room, to avoid contamination by microbes, RNases and other contaminants. The controlled environment may be provided with air purification systems such as air filters, air locks, staff dress requirements, entry permits, UV light, disinfectant and sterilization procedures, etc., as is known in the art.
The pharmaceutical composition may then be distributed to the patient or a caregiver for use by the patient.
The patient may be a patient that is suspected of being infected with a pathogen. The method may comprise the step of using the pathogen sequence data set to determine a) that the patient has been infected with a pathogen and b) the type of pathogen that has infected the patient. Thus, the system may be used for screening and/or diagnosis of the patient. For example, sequence reads from the sample may be aligned to (tested against) consensus genome sequences of known pathogens or pathogen strains using a sequence alignment tool, such as for example BLAST. The genomes of a number of pathogens may be included in a dataset and the sample sequences are then aligned (or attempted to be aligned) to the sequences of the dataset. If the alignment is good enough (for example having a low E-value in BLAST), the sample may be deemed to contain genetic material from the pathogen. It may be required that a minimal number of genes or loci of a pathogen genome should be present, such as 2, 3, 4 or more. A minimum sequencing depth may be required.
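The decision rule described above (a sufficiently low E-value and a minimal number of distinct loci per pathogen) might be sketched as below. The function name `identify_pathogens`, the input format (tuples of pathogen name, locus name and E-value, for example parsed from the output of an alignment tool) and the default cut-offs are assumptions for illustration; the alignment itself is not performed here.

```python
from collections import defaultdict


def identify_pathogens(hits, max_evalue=1e-10, min_loci=2):
    """Return the pathogens deemed present in the sample.

    hits: iterable of (pathogen_name, locus_name, e_value) tuples
          (hypothetical format, e.g. parsed alignment results).
    A pathogen is reported only when at least `min_loci` distinct loci
    have a hit with an E-value at or below `max_evalue`.
    """
    loci_found = defaultdict(set)
    for pathogen, locus, evalue in hits:
        if evalue <= max_evalue:
            loci_found[pathogen].add(locus)
    return sorted(p for p, loci in loci_found.items() if len(loci) >= min_loci)
```

Counting distinct loci rather than hits means that many reads mapping to the same gene do not by themselves satisfy the minimal-loci requirement.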
Examples of pathogens that can be identified include HIV (Human Immunodeficiency Virus), HPV (Human Papilloma Virus), Herpes type 1, Herpes type 2, Hepatitis A, Hepatitis B, Hepatitis C, Streptococcus, Salmonella, Klebsiella, E. coli, Shigella, Malaria, influenza type A, B or C, respiratory syncytial virus, MERS coronavirus, Lassa fever virus, Marburg virus, Ebola virus, Nipah virus, Zika virus, West Nile virus or Rift Valley fever virus.
Information about the identified pathogen may be stored together with information about the identity of the sample or the patient.
Step 104 may be carried out as in Fig. 6. In optional step 200, sequences in the pathogen sequence data set that are present in the hot list are identified. This can be carried out using sequence alignment tools. This step is preferably carried out when the protein-RNA complex comprises a CRISPR-type protein that has RNA endonuclease activity, for example Cas13.
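A minimal sketch of step 200 combined with the ranking of the hot list is given below. It uses exact substring matching as a simplified stand-in for a sequence alignment tool, and the function name `select_hot_list_target` and the representation of the hot list as (sequence, rank) pairs, with rank 1 being the most efficient target, are assumptions for illustration.

```python
def select_hot_list_target(pathogen_contigs, hot_list):
    """Pick the best-ranked hot list sequence found in the pathogen data set.

    pathogen_contigs: list of DNA strings (pathogen sequence data set).
    hot_list: list of (sequence, rank) pairs; lower rank = more efficient.
    Returns the selected sequence, or None when no hot list entry is found.
    """
    contigs = [c.upper() for c in pathogen_contigs]
    found = [
        (rank, seq)
        for seq, rank in hot_list
        if any(seq.upper() in contig for contig in contigs)
    ]
    # min() on (rank, seq) tuples selects the lowest rank number.
    return min(found)[1] if found else None
```

A real implementation would likely tolerate mismatches through alignment; the exact-match version only shows the filter-then-rank logic.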
In optional step 201 at least one PAM motif for a CRISPR-type protein is identified in the pathogen sequence data set. If step 200 is carried out, only sequence data that remains in the pathogen sequence data set after filtering in step 200 is considered for step 201. The PAM motif may be a PAM motif for Cpf1. The Cpf1 PAM motif may be TTN, more preferably TTT and most preferably TTTV, where V is any nucleotide except T. Step 201 is necessary where the CRISPR-type protein requires a PAM sequence for binding to a target sequence and can be left out when a PAM sequence is not necessary.
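The search for TTTV motifs in step 201 can be illustrated with a regular expression, as in the following sketch. The function names are hypothetical, and scanning both strands is an assumption made here because the pathogen data set may represent double-stranded DNA; positions are reported as 0-based indices in the 5' to 3' orientation of the respective strand.

```python
import re

# Lookahead so that overlapping motifs (e.g. in TTTTA) are all reported.
PAM_TTTV = re.compile(r"(?=(TTT[ACG]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pam_sites(seq):
    """Return (strand, start) for every TTTV motif on both strands."""
    seq = seq.upper()
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        sites.extend((strand, m.start()) for m in PAM_TTTV.finditer(s))
    return sites
```

For other CRISPR-type proteins only the pattern would need to change, and for proteins that need no PAM this step is skipped, as stated above.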
In step 202 a CRISPR-type protein putative target sequence with appropriate length is selected, see above. When a PAM sequence is used, the target sequence is positioned in relation to the PAM sequence depending on which CRISPR-type protein is used. For Cpf1, the TTN PAM motif is located on the displaced strand and is located 5' of the sequence that is complementary to the target sequence (see Fig. 2A of Yamano et al). Preferably the putative target sequence is for Cpf1. A preferred length for a Cpf1 target sequence is 20-24 nucleotides. Thus, in a preferred embodiment a target sequence of 20-24 nucleotides which is complementary to a sequence 3' of PAM is selected.
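As a sketch of step 202 for Cpf1, the stretch of 20-24 nucleotides directly 3' of each TTTV motif can be collected as follows. The function name `candidate_targets` and the default length are assumptions; only the strand given as input is scanned, so the opposite strand would be handled by passing its reverse complement.

```python
import re


def candidate_targets(seq, length=23):
    """Return (pam_start, sequence) for each stretch 3' of a TTTV motif.

    pam_start is the 0-based position of the motif; the returned sequence
    is the `length` nucleotides immediately following the 4-nt motif.
    Motifs too close to the end of `seq` are skipped.
    """
    if not 20 <= length <= 24:
        raise ValueError("preferred Cpf1 target length is 20-24 nucleotides")
    seq = seq.upper()
    candidates = []
    for m in re.finditer(r"(?=TTT[ACG])", seq):
        start = m.start() + 4  # first nucleotide after the PAM
        stretch = seq[start:start + length]
        if len(stretch) == length:
            candidates.append((m.start(), stretch))
    return candidates
```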
In step 203 the synthesis computer 7 checks that the putative target sequence is not present in the genome of the patient. This is done by attempting to align the sequence of the putative target strand (or the guide strand) with the patient genome data set using suitable cut-offs. For example, the BLAST algorithm can be used. This should be done with both strands of the human genome. If no alignment takes place, the putative guide strand sequence is not present in the human genome and the sequence can be used for creating an RNA guide strand in silico.
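The both-strand check of step 203 could be sketched as below. Note that this is a simplified stand-in: it tests for exact matches only, whereas an alignment tool such as BLAST with suitable cut-offs would also detect near matches. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_absent_from_patient(target, patient_contigs):
    """True when neither the target nor its reverse complement occurs
    in any sequence of the patient data set (exact matching only)."""
    target = target.upper()
    rc = reverse_complement(target)
    for contig in patient_contigs:
        contig = contig.upper()
        if target in contig or rc in contig:
            return False
    return True
```

Searching for the reverse complement of the target in the stored strand is equivalent to searching for the target on the opposite strand, which is how both strands are covered here.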
Steps 200, 201, 202 and 203 may be carried out in any order, but it may be useful to carry out step 203 last as this reduces processing time. Step 203 can also be carried out after step 204.
In step 204 a sequence for an RNA guide strand is designed in silico. The guide strand should fulfil the following criteria: 1) it should selectively bind to the target sequence, i.e. be complementary to the target sequence and thus able to hybridize to it, and 2) it may comprise a handle sequence which interacts with the CRISPR-type protein. For example, for Cpf1 the handle sequence may be UAAUUUCUACUCUUGUAGAU or AAUUUCUACUCUUGUAGAU.
It should be noted that for Cpf1 a simplified procedure may be used because the sequence of the guide strand will be identical to a part of the sequence 3' of the TTTV motif. Thus, there is no need to find the reverse complement. Instead the guide strand is designed by taking the 5' end of the sequence following the TTTV motif and connecting it to the 3' end of the handle sequence. Also, the guide strand sequence can be used for testing against the human genome.
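The simplified Cpf1 procedure described above can be sketched as follows: the DNA sequence that follows the TTTV motif is transcribed to RNA letters (T to U) and appended to the 3' end of the handle sequence. The function name `design_cpf1_guide` and the default spacer length are assumptions for illustration; the handle sequence is the one given in the text.

```python
CPF1_HANDLE = "UAAUUUCUACUCUUGUAGAU"  # handle sequence given in the description


def design_cpf1_guide(seq_after_pam, length=20, handle=CPF1_HANDLE):
    """Build an RNA guide strand from the DNA sequence following TTTV.

    seq_after_pam: DNA string starting at the first nucleotide 3' of the PAM.
    length: number of nucleotides to use (20-24 for Cpf1).
    Returns the guide written 5' to 3' as handle followed by spacer.
    """
    if not 20 <= length <= 24:
        raise ValueError("length must be 20-24 nucleotides")
    spacer = seq_after_pam.upper()[:length]
    if len(spacer) < length or set(spacer) - set("ACGT"):
        raise ValueError("need at least `length` unambiguous DNA nucleotides")
    # Same sequence as the DNA stretch, written in RNA letters.
    return handle + spacer.replace("T", "U")
```

For the 20-nucleotide handle and a 20-nucleotide spacer this yields a 40-nucleotide guide strand.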
The methods herein can be implemented using any suitable combination of software and hardware. Sequencing computer 6 and synthesis computer 7 and communication between these parts are implemented using digital computer technology for storing and handling digital information and signals as well as suitable hardware and software, including suitable digital processors, digital memories, input means, output means, buses and communications interfaces. A user may be able to make input using for example a keyboard, a mouse or a touch screen. Output to a user, if necessary, may be provided on for example a display. A computer such as a PC or a server may have an operating system.
Communication between digital devices such as sequencing computer 6 and synthesis computer 7 may be implemented using suitable networking technologies and protocols, including cellular communication such as 3G, 4G and 5G, Wi-Fi or Bluetooth, or Ethernet. Data communication can be wireless or wire bound. Information may be exchanged over a wide area network such as internet 4.
Communication in system 1 and update of data in system 1, such as communication between sequencing computer 6 and synthesis computer 7, may be carried out using any suitable schedule, such as when needed or on a predetermined schedule, for example every second, every minute or every ten minutes.
The methods herein are preferably carried out automatically by system 1. A user may provide input in the form of for example patient data and the required volume of pharmaceutical to be produced, and the system 1 then automatically produces an active component such as a guide strand or a protein-RNA complex or a pharmaceutical. Preferably the system 1 can provide an RNA guide strand or a pharmaceutical or an RNA-protein complex (or other product such as plasmids or viruses) automatically after a biological sample has been provided to the system 1.
EXAMPLE 1
Genomes for a large number of HIV subtypes (approx. 3000 subtypes) were downloaded from www.hiv.lanl.gov. Data was imported into a table in a relational database.
EXAMPLE 2
An algorithm was used to search in the database of Example 1 for target sequences that comprise a PAM sequence for Cpf1. Target sequences were scored for how many of the HIV virus genomes have them. The most conserved sequences were selected for further processing. It was checked that none of the selected sequences were present in the consensus human genome. The sequences were then selected on the basis that they should be present in sequences that are important for virus survival, replication or activation. The results are shown in Tables 1a and 1b. The score shows the percentage of strains that carry the sequence. Table 1b shows how many strains carry the sequence.
EXAMPLE 3
An algorithm was used to search in the database for sequences that are likely to be transcribed to RNA, as being suitable targets for CAS13. Target sequences were scored for how many of the HIV virus genomes have them. The most conserved sequences were selected for further processing. It was checked that none of the selected sequences were present in the consensus human genome. The sequences were then selected on the basis that they should be present in sequences that are important for virus survival, replication or activation. The results are shown in Tables 2a and 2b. The score shows the percentage of strains that carry the sequence. Table 2b shows how many strains carry the sequence.
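The conservation scoring used in Examples 2 and 3 (number and percentage of genomes that carry a candidate sequence) can be sketched as follows. The actual algorithm and database schema are not disclosed; the function names and the use of exact substring matching over in-memory strings are assumptions for illustration.

```python
def conservation_scores(candidates, genomes):
    """Return {candidate: (count, percent)} over the given genomes.

    count   = number of genomes that contain the candidate sequence
    percent = count as a percentage of all genomes
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    scores = {}
    for cand in candidates:
        count = sum(1 for genome in genomes if cand in genome)
        scores[cand] = (count, 100.0 * count / len(genomes))
    return scores


def most_conserved(candidates, genomes, top_n=1):
    """Return the `top_n` candidates carried by the most genomes."""
    scores = conservation_scores(candidates, genomes)
    ranked = sorted(scores, key=lambda c: scores[c][0], reverse=True)
    return ranked[:top_n]
```

The count corresponds to the figure reported in Tables 1b and 2b and the percentage to the score, as described in the Examples.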
EXAMPLE 4
Cpf1 protein will be produced as in Zetsche et al. Sixty different RNA guide strands for targeting each of SEQ ID NO 1-20 and 41-80 are synthesized. Each guide strand consists of the 5' handle sequence UAAUUUCUACUCUUGUAGAU followed by a sequence that is complementary to each one of SEQ ID NO 1 to 20 less the 5' TTTN motif.
RNA-protein complexes with each of the RNA guide strands and Cpf1 protein are formed. Gold particles are formed as described in Lee et al., Nature Biomedical Engineering, volume 1, pages 889-901 (2017). Each of the sixty different complexes is tested in a suitable in vitro model, for example the model used in Ueda et al., Microbiology and Immunology, Volume 60, Issue 7, July 2016, 483-496.
EXAMPLE 5
CAS13 protein will be produced as in Abudayyeh et al. Sixty different guide RNAs that comprise sequences complementary to each of sequences SEQ ID NO 21 to 40 and 81 to 120 are synthesized. RNA-protein complexes with each of the RNA guide strands and CAS13 protein are formed. Gold particles are formed as described in Lee et al., Nature Biomedical Engineering, volume 1, pages 889-901 (2017).
The gold particles with the RNA-protein complexes are provided to HIV infected T-lymphocytes in culture. Each of the sixty different complexes is tested in a suitable in vitro model, for example the model used in Ueda et al., Microbiology and Immunology, Volume 60, Issue 7, July 2016, 483-496.
While the invention has been described with reference to specific exemplary embodiments, the description is in general only intended to illustrate the inventive concept and should not be taken as limiting the scope of the invention. The invention is generally defined by the claims.

Claims

1. A method involving a system, said system comprising at least two sequencing devices and one RNA synthesis device, where the sequencing devices each comprises a first computer and the nucleotide synthesis device comprises a second computer, where the computer of the RNA synthesis machine and the computers of the nucleotide sequencing device communicate via a wide area network, said method involving the following steps:
a) providing at least one biological sample from a patient to one of the sequencing devices, where the at least one biological sample comprises human genetic material comprising the genome of the patient and genetic material from a pathogen that has infected, or is suspected to have infected, the patient, said genetic material being in the form of polynucleotides,
b) the sequencing device determining the polynucleotide sequence of the genetic material where the genomic sequence of the patient is determined and at least a part of the sequence of the genetic material of the pathogen is determined,
c) storing the determined polynucleotide sequences as sequence data in a computer memory of a computer of the system, where the sequence data is stored as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen,
d) the RNA synthesis device finding, in the pathogen sequence data set, at least one polynucleotide target sequence with a length from 18 to 60 nucleotides, where said target sequence is not present in the patient sequence data set,
e) the RNA synthesis device determining an RNA guide strand for a CRISPR-type protein that is complementary to the target sequence of step d),
f) the RNA synthesis device synthesizing the RNA guide strand, where the RNA guide strand comprises a predetermined handle sequence that interacts with the CRISPR-type protein.
2. The method of claim 1 where the system has stored sequence information for a number of predetermined target sequences and where step d) comprises finding, in the pathogen data set, a target sequence that is comprised in the sequence of at least one of the predetermined target sequences.
3. The method of claim 2 where the number of predetermined target sequences are ranked according to expected efficacy, and the target sequence with the highest expected efficacy is selected.
4. The method of any of claims 1 to 3 where step d) comprises identifying a sequence that comprises a Cpf1 PAM sequence motif.
5. The method according to any one of claims 1 to 4 where the pathogen is a virus selected from HIV, HPV, Hepatitis B and Herpes.
6. The method of any one of claims 1 to 5 where the RNA synthesis device additionally has a protein-RNA complex formation station that is configured to use the RNA guide strand synthesised in step f) and to use an aliquot of a CRISPR-type protein to form a complex between the RNA guide strand and the protein.
7. The method of any one of claims 1 to 6 comprising the step of mixing the RNA guide strand or the protein-RNA guide complex with suitable components to obtain a pharmaceutical composition, and then packing the pharmaceutical composition in a suitable container and sealing the container.
8. A system comprising at least two sequencing devices and one RNA synthesis device, where the sequencing device comprises a first computer and the RNA synthesis device comprises a second computer, where the sequencing devices are configured to
a) receive at least one biological sample comprising genetic material in the form of polynucleotides,
b) determine the polynucleotide sequence of the genetic material and store it as sequence data,
c) provide the sequence data to the RNA synthesis device;
where the RNA synthesis device is configured to receive the sequence data, where the sequencing device or the RNA synthesis device is configured to store the sequence data as a patient sequence data set that represents the genome of the patient and a pathogen sequence data set that represents at least a part of the genetic material of the pathogen, where the RNA synthesis device is configured to:
d) find, in the pathogen sequence data set, at least one polynucleotide target sequence with a length from 18 to 60 nucleotides, where said target sequence is not present in the patient genetic sequence data set,
e) synthesize an RNA guide strand for a CRISPR-type protein that is complementary to the target sequence of step d), the RNA guide strand additionally comprising a predetermined handle sequence that mediates binding to the CRISPR-type protein.
9. The system of claim 8 where the second computer has stored sequence information for a number of predetermined target sequences and where step d) comprises finding, in the pathogen data set, a target sequence that is comprised in the sequence of at least one of the predetermined target sequences.
10. The system of claim 9 where the number of predetermined target sequences are ranked according to expected efficacy and where the system is configured to select the target sequence with the highest efficacy.
11. The system of any of claims 8 to 10 where step d) comprises identifying a sequence that comprises a Cpf1 PAM sequence motif.
12. The system of any one of claims 8 to 11 where the RNA synthesis device additionally has a protein-RNA complex formation station that is configured to use the RNA guide strand synthesised in step e) and to use an aliquot of a CRISPR-type protein to form a complex between the RNA guide strand and the protein.
PCT/EP2019/086708 2018-12-20 2019-12-20 System for production of crispr-based pharmaceutical compositions WO2020127990A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE1851634 2018-12-20
SE1851634-4 2018-12-20

Publications (1)

Publication Number Publication Date
WO2020127990A1 (en) 2020-06-25

Family

ID=69156392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2019/086708 WO2020127990A1 (en) 2018-12-20 2019-12-20 System for production of crispr-based pharmaceutical compositions

Country Status (1)

Country Link
WO (1) WO2020127990A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5846946A (en) 1996-06-14 1998-12-08 Pasteur Merieux Serums Et Vaccins Compositions and methods for administering Borrelia DNA
WO2003064026A1 (en) 2002-01-31 2003-08-07 Nimblegen Systems Llc Pre-patterned substrate, device and method for optical synthesis of dna probes
WO2012012779A2 (en) 2010-07-23 2012-01-26 Beckman Coulter Inc. System and method including analytical units
US8404658B2 (en) 2007-12-31 2013-03-26 Nanocor Therapeutics, Inc. RNA interference for the treatment of heart failure
US8454972B2 (en) 2004-07-16 2013-06-04 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Method for inducing a multiclade immune response against HIV utilizing a multigene and multiclade immunogen
US20160350476A1 (en) * 2015-05-29 2016-12-01 Agenovir Corporation Antiviral methods and compositions
US20180169658A1 (en) 2016-12-21 2018-06-21 Quandx Inc. Systems and methods for molecular diagnostics

Non-Patent Citations (17)

* Cited by examiner, † Cited by third party
Title
ABUDAYYEH ET AL., NATURE, vol. 550, 2017, pages 280 - 284
BERND ZETSCHE ET AL: "Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System", CELL, vol. 163, no. 3, 1 October 2015 (2015-10-01), AMSTERDAM, NL, pages 759 - 771, XP055553375, ISSN: 0092-8674, DOI: 10.1016/j.cell.2015.09.038 *
COX, GOOTENBERG, ABUDAYYEH ET AL., SCIENCE, 2017
DE BUHR HENDRIK ET AL: "Harnessing CRISPR to combat human viral infections", CURRENT OPINION IN IMMUNOLOGY, ELSEVIER, OXFORD, GB, vol. 54, 26 July 2018 (2018-07-26), pages 123 - 129, XP085510271, ISSN: 0952-7915, DOI: 10.1016/J.COI.2018.06.002 *
DUO PENG ET AL: "EuPaGDT: a web tool tailored to design CRISPR guide RNAs for eukaryotic pathogens", MICROBIAL GENOMICS, vol. 1, no. 4, 13 October 2015 (2015-10-13), XP055681566, DOI: 10.1099/mgen.0.000033 *
FREIJE ET AL., MOLECULAR CELL, vol. 76, 2019, pages 826 - 837
GLASS, LEE, LI, XU, TRENDS IN BIOTECHNOLOGY, 2017
LEE ET AL., NATURE BIOMEDICAL ENGINEERING, vol. 1, 2017, pages 889 - 901
LI ET AL., BIOMATERIALS, vol. 178, 2018, pages 652 - 662
LIU, ZHANG, LIU, CHENG, JOURNAL OF CONTROLLED RELEASE, vol. 266, 2017, pages 17 - 26
MALI ET AL., SCIENCE, vol. 339, 2013, pages 823 - 826
MOUT ET AL., ACS NANO, vol. 11, 2017, pages 2452 - 2458
PARK ET AL., NATURE COMMUNICATIONS, vol. 9, 2018, pages 3313
SMARGON, COX, PYZOCHA ET AL., MOLECULAR CELL, 2017
UEDA, MICROBIOLOGY AND IMMUNOLOGY, vol. 60, no. 7, July 2016 (2016-07-01), pages 483 - 496
WANG ET AL., PNAS, vol. 113, no. 11, 15 March 2016 (2016-03-15), pages 2868 - 2873
ZETSCHE ET AL., CELL, vol. 163, 2015, pages 759 - 771

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19835295; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 19835295; Country of ref document: EP; Kind code of ref document: A1)