CN117545852A - Solid support and method for depleting and/or enriching library fragments prepared from biological samples - Google Patents


Publication number
CN117545852A
Authority
CN
China
Prior art keywords
library
rna
sequence
solid support
unwanted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280042563.XA
Other languages
Chinese (zh)
Inventor
R·S·库尔斯滕
J·科布尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inmair Ltd
Original Assignee
Inmair Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inmair Ltd
Priority claimed from PCT/US2022/077221 (WO2023056328A2)
Publication of CN117545852A

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Described herein are solid supports and methods for depleting and/or enriching library fragments prepared from unwanted RNA sequences. These methods may be combined with microfluidics and flow cells for ease of use. Libraries enriched or depleted by the methods of the invention can be used for sequencing, for example by bridge sequencing. Probes specific for unwanted RNA and methods of enzymatically depleting ribosomal RNA from human microbiome samples using RNase H are also described, whereby DNase I can be used to degrade the probes and magnetic beads can be used to bind the wanted RNA.

Description

Solid support and method for depleting and/or enriching library fragments prepared from biological samples
Cross Reference to Related Applications
The present application claims the benefit of priority from U.S. provisional application No. 63/250,563, filed in September 2021, and U.S. provisional application No. 63/351,170, filed in June 2022, the contents of each of which are incorporated herein by reference in their entirety for any purpose.
Sequence listing
The present application contains a Sequence Listing that has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. The XML copy was created on September 6, 2022, is named "01243-0028-00PCT_ST26", and is 1,014,000 bytes in size.
Technical Field
The present disclosure relates to solid supports and methods for depleting and/or enriching library fragments prepared from unwanted RNA sequences. Libraries enriched or depleted by the methods of the invention can be used to generate sequencing data. Probes and methods for enzymatically depleting ribosomal RNA from a human microbiome sample are also described.
Background
Samples containing RNA typically have a high abundance of RNA that is not of interest to the user. For example, ribosomal RNAs (rRNAs) typically make up the majority (about 80%-95%) of the RNA molecules in total RNA. One challenge with RNA sequencing for gene expression analysis is that, after RNA extraction, the extracted material is dominated by a small number of highly abundant transcripts such as non-coding ribosomal ribonucleic acid (rRNA). In total RNA samples from human blood, globin messenger RNA (mRNA) may also be present at a predominant level. As a result, sequencing RNA transcripts (RNA-Seq) is often inefficient and cost prohibitive for many users and applications. It is therefore desirable to deplete abundant transcripts, such as rRNA and globin mRNA, from the sample prior to RNA sequencing.
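The cost argument above can be made concrete with a short calculation. The following is only a sketch: the function name and the target of 20 million informative reads are illustrative assumptions and do not come from this disclosure; the 80% and 95% values are the range of rRNA abundance quoted above.

```python
def total_reads_needed(target_informative_reads: int, unwanted_percent: int) -> int:
    """Total reads to sequence so that the expected number of reads that are
    NOT from unwanted RNA reaches the target.

    unwanted_percent is an integer percentage (0-99) so that the arithmetic
    stays exact; this is a simplification chosen for the illustration.
    """
    if not 0 <= unwanted_percent < 100:
        raise ValueError("unwanted_percent must be between 0 and 99")
    wanted_percent = 100 - unwanted_percent
    # Ceiling division using integers only.
    return -(-target_informative_reads * 100 // wanted_percent)


# Hypothetical target: 20 million informative (non-rRNA) reads.
target = 20_000_000
for percent in (80, 95):
    print(percent, total_reads_needed(target, percent))
```

Under these assumptions, a library that is 95% rRNA needs four times as many total reads as one that is 80% rRNA to reach the same number of informative reads, which is the inefficiency that depletion is meant to address.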
To avoid the obstacle posed by large amounts of unwanted RNA, several solutions have emerged, including RNase H-mediated depletion. This method involves hybridization of DNA probes complementary to known rRNA sequences, followed by RNase H cleavage specific to DNA:RNA hybrids, followed by a wash step. This methodology is implemented as part of the current Illumina stranded total RNA library preparation workflow, the New England Biolabs NEBNext rRNA depletion kits, and the RNA depletion methods described in U.S. patent Nos. 9,745,570 and 9,005,891. While these methods are effective, their disadvantages include the need for upfront depletion, increased cost, and increased hands-on time (HOT).
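The probe hybridization step can be illustrated in silico. The sketch below tiles DNA probes that are complementary to a target RNA; the function names, the toy sequence, and the probe length and spacing are all hypothetical choices for illustration and are not the probe design used by any of the kits or patents cited above.

```python
def reverse_complement_dna(rna: str) -> str:
    """Return the DNA strand that is complementary (antiparallel) to an RNA sequence."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def tile_probes(rna: str, probe_len: int, step: int) -> list[str]:
    """Design DNA probes complementary to consecutive windows of the target RNA.

    probe_len and step are illustrative parameters; with step == probe_len the
    probes tile the target end to end without overlap.
    """
    probes = []
    for start in range(0, len(rna) - probe_len + 1, step):
        window = rna[start:start + probe_len]
        probes.append(reverse_complement_dna(window))
    return probes


# Hypothetical 12-nt stand-in for an rRNA sequence.
toy_rrna = "GGAUCCUUAGCA"
print(tile_probes(toy_rrna, probe_len=6, step=6))
```

Each probe forms a DNA:RNA hybrid with its window of the target, which is the substrate that RNase H cleaves in the depletion methods described above.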
Methods for depleting RNA from microbiome samples need to be improved. The microbiome plays a vital role in human health and disease (Cho et al., Nat. Rev. Genet. 13:260-70 (2012)). In the last decade, analyses based on next-generation sequencing have provided insight into the composition of microbiomes across body sites and life stages, and have begun to reveal the relevance of microbial taxa or microbial function to disease states (see, e.g., Gilbert, J.A. et al., Nat. Med. 24:392-400 (2018); Durack and Lynch, J. Exp. Med. 216(1):20-40 (2019); Lloyd-Price et al., Genome Med. 8:51 (2016)). In addition to genomic analysis of microbiome composition, multi-omic datasets also incorporate microbiota-associated transcriptome, proteome, or metabolome measurements to provide further insight into microbiome activity and function. Although metagenomic and metatranscriptomic profiles generally tend to agree, the functional profile of microorganisms derived from DNA sequencing is more conserved between donors than the more donor-specific transcriptional profile (Franzosa, E.A. et al., Proc. Natl. Acad. Sci. U.S.A. 111(22):E2329-38 (2014)). Importantly, many widely encoded metagenomic pathways are expressed by only a small number of organisms, highlighting the utility of metatranscriptomics in identifying functional activity (Abu-Ali, G.S. et al., Nat. Microbiol. 3(3):356-366 (2018)). In particular, transcriptomic measurements of the human gut-associated microbiome have been used to study microbial carbohydrate metabolism (Turnbaugh, P.J. et al., Proc. Natl. Acad. Sci. U.S.A. 107:7503-7508 (2010)), to provide functional information about gut diseases such as inflammatory bowel disease (IBD; Lloyd-Price, J. et al., Nature 569:655-662 (2019)), and to investigate mechanisms of drug metabolism (Haiser, H.J. et al., Science 341(6143):295-298 (2013)).
Microbiota that colonize the human gut and other tissues are dynamic, varying in composition and functional state between individuals and over time. In studying the mechanisms of human microbiome function and microbiome-mediated phenotypes, gene expression measurements provide insight beyond DNA-based measurements of microbiome composition. However, efficient and unbiased removal of microbial ribosomal RNA (rRNA) presents an obstacle to acquiring metatranscriptomic data, as rRNA typically represents > 90% of the total RNA in microbial cells.
In particular, the fact that most microbially derived RNA molecules correspond to ribosomal RNA (rRNA; as described in Giannoukos, G. et al., Genome Biol. 13(3):R23 (2012)) hinders the acquisition of metatranscriptomic data. In eukaryotes, non-ribosomal RNAs can be easily and efficiently enriched by selective reverse transcription or targeted pulldown via the poly-A tail, or rRNA molecules can be specifically bound by probes and then removed by capture or enzymatic digestion (Hrdlickova et al., Wiley Interdiscip. Rev. RNA 8(1): 10.1002/wrna.1364 (2017) and Zhao et al., Sci. Rep. 8(1):4781 (2018)). Although poly-A polymerase was first isolated from E. coli (August et al., J. Biol. Chem. 237:3786-3793 (1962) and Modak and Srinivasan, J. Biol. Chem. 248(19):6904-6910 (1973)), bacterial mRNA transcripts are not normally polyadenylated, and when polyadenylation does occur it is associated with RNA degradation (Mohanty and Kushner, Mol. Microbiol. 34:1094-1108 (1999) and O'Hara et al., Proc. Natl. Acad. Sci. U.S.A. 92:1807-1811 (1995)). Thus, for bacterial samples, selective enrichment of mRNA is not easily achieved, and depletion of rRNA must be accomplished by other means.
While extensive research has produced effective methods for depleting rRNA in a single bacterial species using probe-based capture (Culviner et al., mBio 11(2):e00010-20 (2020)), enzymatic depletion (Huang et al., Nucleic Acids Res. 48(4):e20 (2020)), or CRISPR-based methods (Prezza, G. et al., RNA 26:1069-1078 (2020) and Gu et al., Genome Biol. 17:1-13 (2016)), depleting rRNA in complex human microbiome samples that may contain hundreds of species presents significant technical challenges. Furthermore, the composition of the microbiota differs substantially between body sites and life stages, which further expands the taxonomic coverage required for robust depletion of rRNA in human microbiome samples. Probe-based sequence capture methods, such as those used with Illumina's RiboZero Gold kit, can provide robust rRNA depletion in a variety of sample types, including human gut microbiome samples (Reck, M. et al., BMC Genomics 16(1):494 (2015)). However, such probes are expensive, difficult to manufacture, and tend to perform best with high-quality RNA samples. Furthermore, capture-based rRNA depletion methods can produce different results depending on operator skill. These factors led to the discontinuation of the capture-based bacterial RiboZero Gold depletion kit.
Described herein is the development of a pan-human microbiome probe set for efficient and consistent enzymatic (RNase H) depletion of microbial rRNA. Through an iterative design process, the probes were designed to efficiently deplete rRNA found in human oral, vaginal, and adult and infant gut microbiome samples, thereby significantly increasing the rate of mapping to databases of microbial coding genes. Using defined spike-ins, the rRNA depletion process was shown not to introduce significant bias into the metatranscriptomic profile. In addition, the resulting metatranscriptomic data allow the user to refine informatics pipelines for rRNA and host mapping and to examine gene expression and functional pathways across human microbiome sites. Thus, the methods described herein circumvent the limitations of sequence capture methods and represent an efficient rRNA depletion option for metatranscriptomic studies of human-associated microbial communities.
For example, a major limitation of metatranscriptomic studies (i.e., sequencing microbial communities in a particular environmental sample without culturing the microorganisms) is overcoming the abundance of ribosomal RNA (rRNA). Highly abundant rRNA is usually of limited interest to the user (i.e., it is an unwanted transcript), but it can significantly reduce the sequencing coverage of mRNA (i.e., the wanted transcripts). In metatranscriptomic sequencing, rRNA depletion is typically performed either by hybridization with 16S and 23S rRNA probes followed by separation, or by methods based on probe binding followed by exonuclease treatment. After rRNA depletion, library preparation can be performed.
An iterative probe design strategy is described herein for developing a probe set for efficient enzymatic removal of rRNA from human-associated microbiota. This strategy resulted in a custom probe set that effectively depletes rRNA from a range of human microbiome samples (including adult gut, infant gut, oral, and vaginal communities). Successful rRNA depletion allows characterization of taxonomic and functional changes during the development of the gut microbiome. Furthermore, the rRNA depletion process does not introduce significant quantitative error into the resulting transcriptomic profile. The pan-human microbiome enzymatic rRNA depletion probes described herein provide a powerful tool for studying the transcriptional dynamics and function of the human microbiome.
In some assays, "upfront depletion" approaches (including RNase H depletion) may present problems for users with limited total RNA input for the assay. For example, if insufficient RNA remains after upfront depletion, downstream biochemical reactions may be inefficient, resulting in poor assay performance and results. Furthermore, upfront RNase H depletion involves wash steps (which may result in loss of the desired RNA) and high-temperature incubation (which may result in degradation of the desired RNA), which may be a problem for certain samples.
Described herein are differentiated solutions that use solid supports (such as flow cell-like devices) with immobilized oligonucleotides that can bind library fragments prepared from unwanted RNAs. For example, library fragments prepared from rRNA sequences can be captured by oligonucleotides attached to the flow cell, while library fragments lacking these sequences can be collected by siphoning. After collection of the undepleted library fragments, only a rapid quality control step to check the concentration and size of the non-rRNA sequencing library needs to be performed before standard sequencing. This approach is advantageous because rRNA can act as a "carrier molecule" for low-abundance RNA molecules throughout the library preparation process, allowing for robust, sensitive assays. In addition to removing unwanted library fragments (such as those prepared from rRNA), the method can be extended to replace traditional PCR amplification in a thermocycler with a bridge amplification-like process, further reducing HOT and demonstrating additional library preparation functions performed by the sequencer fluidics. Similar methods can also be used for other unwanted RNAs, for example to deplete host-derived RNA transcripts when a user wants to specifically evaluate microbiome RNAs from a host.
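The capture-and-collect idea in the preceding paragraph can be sketched as a toy model. In the sketch below, exact substring matching stands in for hybridization, fragments are treated as single strands, and all names and sequences are hypothetical; it is meant only to show the logic of splitting a library into bound and unbound fractions, not the behavior of a real flow cell.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def deplete(library: list[str], immobilized_oligos: list[str]) -> tuple[list[str], list[str]]:
    """Split library fragments into (collected, captured).

    A fragment counts as captured when it contains the reverse complement of an
    immobilized oligonucleotide, i.e. a stretch that could base-pair with it.
    Exact matching is a simplifying assumption standing in for hybridization.
    """
    targets = [reverse_complement(oligo) for oligo in immobilized_oligos]
    collected, captured = [], []
    for fragment in library:
        if any(target in fragment for target in targets):
            captured.append(fragment)   # stays bound to the solid support
        else:
            collected.append(fragment)  # flows through and is collected
    return collected, captured


# Hypothetical oligo and fragments; "GCCTT" is the stretch that pairs with "AAGGC".
oligos = ["AAGGC"]
library = ["TTGCCTTAA", "ACGTACGTA", "CCGCCTTGG"]
collected, captured = deplete(library, oligos)
print(collected)
print(captured)
```

In this model the collected list plays the role of the depleted library that proceeds to quality control and sequencing, while the captured list corresponds to fragments retained on the support.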
Furthermore, disclosed herein are methods designed for enriching library fragments prepared from a desired RNA. Both the depletion and enrichment methods can generate libraries with fewer unwanted library fragments, allowing cheaper and/or deeper sequencing of the wanted library fragments.
Disclosure of Invention
According to the present specification, methods of depleting library fragments prepared from unwanted RNA and methods of enriching library fragments prepared from desired RNA are described herein. These methods can be performed with standard laboratory equipment, such as a flow cell included in a sequencer. In some embodiments, standard sequencing consumables and platforms (i.e., sequencers) can be used as microfluidic devices for enriching or depleting library fragments. In some embodiments, depletion or enrichment occurs after cDNA synthesis and amplification.
Probes useful for enzymatically depleting rRNA from a human microbiome sample are also described.
Embodiment 1. A method of depleting unwanted cDNA library fragments from a library of cDNA fragments prepared from RNA, wherein the unwanted library fragments comprise those prepared from unwanted RNA sequences, the method comprising: (a) preparing a solid support comprising at least one immobilized oligonucleotide, wherein each immobilized oligonucleotide comprises a nucleic acid sequence corresponding to the unwanted RNA sequence or its complement; (b) adding the library of fragments to the solid support and hybridizing the library fragments to the at least one immobilized oligonucleotide to allow binding of unwanted library fragments to the at least one immobilized oligonucleotide; and (c) collecting library fragments not bound to the at least one immobilized oligonucleotide.
Embodiment 2. The method according to embodiment 1, wherein at least one unwanted RNA sequence has at least 90%, at least 95% or at least 99% homology to the high abundance RNA sequence in the sample used to prepare the library of fragments.
Embodiment 3. The method of embodiment 2 wherein all unwanted sequences have at least 90%, at least 95% or at least 99% homology to high abundance RNA sequences in the sample used to prepare the fragment library.
Embodiment 4. The method of embodiment 1, wherein the at least one unwanted RNA sequence is a high abundance RNA sequence.
Embodiment 5. The method of any one of embodiments 2 to 4, wherein the high abundance RNA sequence is a ribosomal RNA (rRNA) sequence.
Embodiment 6. The method of any one of embodiments 1 to 5, wherein the unwanted RNA sequence is comprised in a host transcriptome.
Embodiment 7. The method of any one of embodiments 1 to 6, wherein the unwanted RNA sequence is a globin mRNA or 28S, 23S, 18S, 5.8S, 5S, 16S, 12S, HBA-A1, HBA-A2, HBB-B1, HBB-B2, HBG1 or HBG2 RNA or a fragment thereof.
Embodiment 8. The method of any one of embodiments 1 to 7, wherein the unwanted RNA sequences are from human, rat, mouse, or bacteria.
Embodiment 9. The method of embodiment 8, wherein the unwanted RNA sequence is 28S, 18S, 5.8S, 5S, 16S, or 12S RNA from a human, or a fragment thereof.
Embodiment 10. The method of embodiment 8 wherein the unwanted RNA sequence is rat 16S, rat 28S, mouse 16S, or mouse 28S RNA.
Embodiment 11. The method of embodiment 8, wherein the bacterium is an archaebacterium, E. coli, or B. subtilis.
Embodiment 12. The method of embodiment 8, wherein the unwanted RNA sequence is contained in a 23S, 16S or 5S RNA from a gram positive or gram negative bacterium.
Embodiment 13. The method of embodiment 8, wherein the unwanted RNA sequences are from organisms in a human microbiome.
Embodiment 14. The method of embodiment 13, wherein the at least one immobilized oligonucleotide comprises a sequence comprising any one of SEQ ID NOs: 1-1131 or a complement thereof.
Embodiment 15. The method of any one of embodiments 1 to 14, wherein the at least one immobilized oligonucleotide comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
Embodiment 16. The method of embodiment 15, wherein the at least one immobilized oligonucleotide comprises 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
Embodiment 17. The method of any one of embodiments 14 to 16, wherein the at least one immobilized oligonucleotide comprises at least one sequence comprising any one of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130, or a complement thereof.
Embodiment 18. The method of embodiment 17, wherein the at least one immobilized oligonucleotide comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130, or complements thereof.
Embodiment 19. The method of embodiment 18, wherein the at least one immobilized oligonucleotide comprises sequences comprising SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130, or their complements.
Embodiment 20. The method of any one of embodiments 17 to 19, wherein the at least one immobilized oligonucleotide further comprises at least one sequence comprising any one of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131, or a complement thereof.
Embodiment 21. The method of embodiment 20, wherein the oligonucleotide library comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131, or complements thereof.
Embodiment 22. The method of embodiment 21, wherein the pool of oligonucleotides comprises sequences comprising SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131, or their complements.
Embodiment 23. The method of any one of embodiments 14 to 16, wherein the pool of oligonucleotides comprises at least one sequence comprising any one of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131, or a complement thereof.
Embodiment 24. The method of embodiment 23, wherein the pool of oligonucleotides comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131, or complements thereof.
Embodiment 25. The method of embodiment 24, wherein the at least one immobilized oligonucleotide comprises sequences comprising SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131, or their complements.
Embodiment 26. The method of any one of embodiments 17 to 25, wherein the at least one immobilized oligonucleotide further comprises at least one sequence comprising any one of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127, or a complement thereof.
Embodiment 27. The method of embodiment 26, wherein the at least one immobilized oligonucleotide comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127, or complements thereof.
Embodiment 28. The method of embodiment 27, wherein the at least one immobilized oligonucleotide comprises sequences comprising SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127, or their complements.
Embodiment 29. The method of any one of embodiments 1 to 28, wherein the unwanted RNA sequence is selected by determining the most abundant sequence in the sample comprising RNA.
Embodiment 30. The method of embodiment 29 wherein the most abundant sequences comprise 100 most abundant sequences, 1,000 most abundant sequences, or 10,000 most abundant sequences.
Embodiment 31. The method of any one of embodiments 1 to 30, wherein the unwanted RNA sequence comprises a sequence having at least 90%, at least 95%, or at least 99% homology to the most abundant sequence in the sample comprising RNA.
Embodiment 32. The method of any one of embodiments 1 to 31, wherein the collected library fragments comprise a library depleted of unwanted library fragments.
Embodiment 33. The method of embodiment 32, wherein the unwanted library fragments serve as carrier molecules for other library fragments.
Embodiment 34. The method of any one of embodiments 1 to 33, wherein the library of fragments added to the solid support is prepared from RNA using a stranded method of cDNA preparation.
Embodiment 35. The method of any one of embodiments 1 to 34, wherein the library fragment comprises library adaptors and the solid support further comprises immobilized oligonucleotides comprising solid support adaptor sequences capable of binding to library adaptors.
Embodiment 36. The method of embodiment 35, wherein the solid support adapter sequence comprises a P5 sequence (SEQ ID NO: 1132), a P7 sequence (SEQ ID NO: 1133), and/or their complements.
Embodiment 37. The method of embodiment 35 or embodiment 36, wherein an adapter complement sequence that is wholly or partially complementary to the solid support adapter sequence binds to the solid support adapter sequence.
Embodiment 38. The method of embodiment 37, wherein the binding of the adaptor complement to the solid support adaptor sequence is reversible.
Embodiment 39. The method of embodiment 37 or embodiment 38, wherein binding of the adaptor complement to the solid support adaptor sequence generates a double-stranded immobilized oligonucleotide.
Embodiment 40. The method of embodiment 39 wherein the solid support adapter sequences that bind to the adapter complement are not capable of binding to library adapters.
Embodiment 41. The method of embodiment 39 or embodiment 40, further comprising denaturing the library fragments and/or adaptor complement that hybridize to the immobilized oligonucleotides.
Embodiment 42. The method of embodiment 41, wherein the denaturing is performed with a denaturing agent and/or heat.
Embodiment 43. The method of embodiment 42, wherein the denaturant is NaOH, optionally wherein the NaOH concentration is 0.2N.
Embodiment 44. The method of embodiment 42, wherein the heat is from 95 ℃ to 98 ℃.
Embodiment 45. The method of any one of embodiments 41 to 44, wherein the denatured library fragments and/or adaptor complement are siphoned to a waste compartment.
Embodiment 46. The method of any one of embodiments 41 to 45, wherein the steps of adding a sample, collecting, and denaturing are repeated, wherein the collected library fragments are added back to the solid support after the denaturing.
Embodiment 47. The method of any one of embodiments 1 to 46, wherein the collected library fragments are collected in a reservoir comprised in a sequencer comprising a flow cell.
Embodiment 48. The method of any one of embodiments 1 to 47, wherein the library fragment hybridized to the immobilized oligonucleotide comprises a library fragment prepared from rRNA.
Embodiment 49. The method of any one of embodiments 1 to 48, wherein the library depleted of unwanted library fragments comprises fewer library fragments prepared from unwanted RNA sequences than the same library prior to addition to the solid support.
Embodiment 50. The method of any one of embodiments 1 to 49, wherein the unwanted library fragments hybridized to the immobilized oligonucleotides comprise library fragments prepared from host RNA contained in a sample comprising host RNA and non-host RNA.
Embodiment 51. The method of embodiment 50, wherein the non-host RNA is microbial.
Embodiment 52. The method of embodiment 51 wherein the microorganism is a bacterium, a virus, and/or a fungus.
Embodiment 53. The method of embodiment 52, wherein the microorganism is a pathogen.
Embodiment 54. The method of embodiment 52, wherein the microorganism is an organism in a host microbiome.
Embodiment 55. The method of any one of embodiments 50 to 54, wherein the host is a human.
Embodiment 56. The method of any one of embodiments 29 to 55, further comprising adding the collected library fragments to the solid support after denaturing the hybridized library fragments and/or adaptor complement.
Embodiment 57. The method of any one of embodiments 1 to 56, wherein the sequences comprised in the library fragments specifically bind to solid support adapter sequences comprising P5 (SEQ ID NO: 1132), P7 (SEQ ID NO: 1133), and/or their complements.
Embodiment 58. The method of any one of embodiments 1 to 57, wherein library adaptor sequences are added to the collected library fragments.
Embodiment 59. The method of embodiment 58, wherein the library adaptor sequence is added by ligation.
Embodiment 60. The method of any one of embodiments 1 to 59, wherein the library of fragments added to the solid support is prepared by a method comprising incorporating one or more library adaptors that specifically bind to solid support adaptor sequences comprising P5 (SEQ ID NO: 1132), P7 (SEQ ID NO: 1133), and/or their complements.
Embodiment 61. The method of embodiment 60, wherein the method comprising incorporating one or more library adaptors is tagmentation, or fragmentation followed by adaptor ligation.
Embodiment 62. The method of any one of embodiments 1 to 61, wherein the method does not require degradation of RNA.
Embodiment 63. The method of any one of embodiments 1 to 62, wherein the library size and/or concentration of the library depleted of unwanted library fragments is assessed.
Embodiment 64. The method of any one of embodiments 1 to 63, wherein the library depleted of unwanted library fragments is sequenced.
Embodiment 65. The method of any one of embodiments 1 to 64, further comprising amplifying the library depleted of unwanted library fragments prior to sequencing.
Embodiment 66. The method of embodiment 65, wherein the amplification is by PCR amplification.
Embodiment 67. The method of embodiment 65, wherein the amplification is by bridge amplification.
Embodiment 68. The method of embodiment 67, wherein bridge amplification is performed after adding the collected library fragments to the solid support and binding the library adaptors contained in the collected library fragments to the solid support adaptor sequences, wherein the adding is performed after denaturing the hybridized library fragments and/or adaptor complement sequences.
Embodiment 69. The method of embodiment 64, 65, 67, or 68, wherein the sequencing is performed without PCR amplification.
Embodiment 70. The method of any one of embodiments 64, 65, or 67 to 69, wherein the amplifying does not require a thermocycler.
Embodiment 71. The method of any one of embodiments 1 to 70, wherein the method is performed entirely in a sequencer.
Embodiment 72. A method of enriching a library of desired cDNA fragments from a library of cDNA fragments prepared from RNA, wherein the desired library fragments comprise those prepared from the desired RNA sequences, comprising (a) preparing a solid support comprising at least one immobilized oligonucleotide, wherein each immobilized oligonucleotide comprises a nucleic acid sequence corresponding to the desired RNA sequence or its complement, (b) adding the library of fragments to the solid support and hybridizing the library fragments to the at least one immobilized oligonucleotide to allow binding of the desired library fragments to the at least one immobilized oligonucleotide, and (c) collecting library fragments bound to the at least one immobilized oligonucleotide.
Embodiment 73. The method of embodiment 72, wherein the library of fragments has undergone a method of depleting unwanted cDNA library fragments according to any one of embodiments 1 to 71 prior to said adding.
Embodiment 74. The method of embodiment 72 or 73, wherein at least one desired RNA sequence has at least 90%, at least 95%, or at least 99% homology to a target RNA sequence in a sample used to prepare the library of fragments.
Embodiment 75. The method of embodiment 74, wherein all desired RNA sequences have at least 90%, at least 95% or at least 99% homology to the target RNA sequences in the sample used to prepare the library of fragments.
Embodiment 76. The method of any one of embodiments 72 to 75, wherein said at least one desired RNA sequence is a target RNA sequence.
Embodiment 77. The method of any one of embodiments 72 to 76, wherein said desired RNA sequence is an exome sequence.
Embodiment 78. The method of any one of embodiments 72 to 77, wherein the desired RNA sequence is from human, rat, mouse, and/or bacteria.
Embodiment 79. The method of embodiment 78, wherein the desired RNA sequence is from an organism in a human microbiome.
Embodiment 80. The method of any one of embodiments 72 to 79, wherein the collected library fragments comprise a library enriched in the desired library fragments.
Embodiment 81. The method of any one of embodiments 72 to 80, wherein the library of fragments added to the solid support is prepared from RNA using a stranded method of cDNA preparation.
Embodiment 82. The method of any one of embodiments 72 to 81, wherein said collecting comprises denaturing the library fragments hybridized to the at least one immobilized oligonucleotide, and then collecting the library enriched for the desired fragments in a reservoir comprised in a sequencer comprising the solid support.
Embodiment 83. The method of embodiment 82, wherein the denaturing is performed with a denaturing agent and/or heat.
Embodiment 84. The method of embodiment 83, wherein the heat is from 95 ℃ to 98 ℃.
Embodiment 85. The method of embodiment 83 wherein the denaturant is NaOH, optionally wherein the NaOH concentration is 0.2N.
Embodiment 86. The method of any one of embodiments 82 to 85, wherein the steps of adding the library, denaturing, and collecting are repeated, wherein the collected library fragments are added to the solid support after the denaturing.
Embodiment 87. The method of any one of embodiments 82 to 86, wherein the library enriched in desired library fragments comprises a greater percentage of library fragments prepared from desired RNA sequences than the library prior to addition to the solid support.
Embodiment 88. The method of any one of embodiments 82 to 87, wherein the library size and/or concentration of the library enriched in desired library fragments is assessed.
Embodiment 89. The method of any one of embodiments 82 to 88, wherein the library enriched in desired library fragments is sequenced.
Embodiment 90. The method of any one of embodiments 82 to 89, further comprising amplifying the library enriched in desired library fragments prior to sequencing.
Embodiment 91. The method of any one of embodiments 1 to 90, wherein the at least one immobilized oligonucleotide is 20 to 100 bases in length, optionally wherein the at least one immobilized oligonucleotide is 45 to 55 bases in length.
Embodiment 92. The method of any one of embodiments 1 to 91, wherein the at least one immobilized oligonucleotide is single stranded.
Embodiment 93. The method of any one of embodiments 1 to 92, wherein single-stranded library fragments are prepared prior to adding the fragment library to the solid support.
Embodiment 94. The method of any one of embodiments 1 to 93, wherein the solid support is a flow cell.
Embodiment 95. A solid support having two libraries of immobilized oligonucleotides on its surface, wherein a first library comprises immobilized oligonucleotides, each comprising a nucleic acid sequence corresponding to an unwanted RNA sequence or a complement thereof, and a second library comprises immobilized oligonucleotides, each comprising a solid support adaptor sequence capable of binding to a library adaptor comprised in a library fragment.
Embodiment 96. The solid support of embodiment 95, wherein at least one unwanted RNA sequence has at least 90%, at least 95%, or at least 99% homology to high abundance RNA sequences in a sample used to prepare the library of fragments.
Embodiment 97. The solid support of embodiment 96, wherein all unwanted sequences have at least 90%, at least 95%, or at least 99% homology to high abundance RNA sequences in a sample used to prepare the library of fragments.
Embodiment 98. The solid support of any one of embodiments 95 to 97, wherein the at least one unwanted RNA sequence is a high abundance RNA sequence.
Embodiment 99. The solid support of any one of embodiments 96 to 98, wherein the high abundance RNA sequence is a ribosomal RNA (rRNA) sequence.
Embodiment 100. The solid support of any one of embodiments 95 to 99, wherein the unwanted RNA sequences are comprised in a host transcriptome.
Embodiment 101. The solid support of any one of embodiments 95 to 100, wherein the unwanted RNA sequence is a globin mRNA or 28S, 23S, 18S, 5.8S, 5S, 16S, 12S, HBA-A1, HBA-A2, HBB-B1, HBB-B2, HBG1, or HBG2 RNA, or a fragment thereof.
Embodiment 102. The solid support of any one of embodiments 95-101, wherein the unwanted RNA sequences are from human, rat, mouse, or bacteria.
Embodiment 103. The solid support of embodiment 102 wherein the unwanted RNA sequence is 28S, 18S, 5.8S, 5S, 16S, or 12S RNA from a human, or a fragment thereof.
Embodiment 104. The solid support of embodiment 102, wherein the unwanted RNA sequences are contained in rat 16S, rat 28S, mouse 16S, or mouse 28S RNA.
Embodiment 105. The solid support of embodiment 102, wherein the bacteria are archaebacteria, E. coli, or B. subtilis.
Embodiment 106. The solid support of embodiment 102, wherein the unwanted RNA sequences are contained in 23S, 16S, or 5S RNA from a gram positive or gram negative bacterium.
Embodiment 107. The solid support of embodiment 102 wherein the unwanted RNA sequences are from organisms contained in a human microbiome.
Embodiment 108. The solid support of any one of embodiments 95 to 107, wherein the unwanted RNA sequence comprises SEQ ID NO: 1-1131.
Embodiment 109. The solid support of any one of embodiments 95 to 108, wherein the unwanted RNA sequence comprises a sequence having at least 90%, at least 95%, or at least 99% homology to the most abundant sequence in the sample comprising RNA, wherein the most abundant sequence comprises 100 most abundant sequences, 1,000 most abundant sequences, or 10,000 most abundant sequences.
Embodiment 110. The solid support according to any one of embodiments 95 to 109, wherein the solid support adapter sequence comprises a P5 sequence (SEQ ID NO: 1132), a P7 sequence (SEQ ID NO: 1133) and/or their complements.
Embodiment 111. The solid support of any one of embodiments 95 to 110, wherein an adapter complement sequence that is wholly or partially complementary to the solid support adapter sequence is bound to the solid support adapter sequence.
Embodiment 112. The solid support of embodiment 111, wherein the binding of the adaptor complement sequence to the solid support adaptor sequence is reversible.
Embodiment 113. The solid support of embodiment 111 or embodiment 112, wherein the solid support adaptor sequence and the adaptor complement sequence produce a double-stranded immobilized oligonucleotide.
Embodiment 114. The solid support of any one of embodiments 95 to 113, wherein the at least one immobilized oligonucleotide is 20 to 100 bases in length, optionally wherein the at least one immobilized oligonucleotide is 45 to 55 bases in length.
Embodiment 115. The solid support of any one of embodiments 95 to 114, wherein the solid support is a flow cell.
Embodiment 116. The solid support of any one of embodiments 95 to 115, wherein the at least one immobilized oligonucleotide is single stranded.
Embodiment 117. A composition comprising single stranded library fragments comprising cDNA prepared from a sample comprising RNA hybridized to the solid support according to any one of embodiments 95 to 116.
Embodiment 118. The composition of embodiment 117, wherein said cDNA is complementary to RNA contained in said sample.
Embodiment 119. A method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, the method comprising (a) sequencing a plurality of probe development microbiome samples to determine from sequencing data at least one unwanted RNA molecule comprising a bacterial ribosomal RNA (rRNA) sequence; (b) preparing a probe set comprising at least one DNA probe complementary to the at least one unwanted RNA molecule; (c) contacting the patient microbiome sample with the probe set to produce DNA:RNA hybrids; and (d) contacting the DNA:RNA hybrids with a ribonuclease that degrades the RNA from the DNA:RNA hybrids, thereby degrading said unwanted RNA molecules in said patient microbiome sample to form a degradation mixture.
Embodiment 120. A method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, the method comprising (a) contacting the patient microbiome sample with a probe set comprising at least one sequence comprising the sequence of SEQ ID NO: 1-1131 to produce DNA:RNA hybrids; and (b) contacting the DNA:RNA hybrids with a ribonuclease that degrades the RNA from the DNA:RNA hybrids, thereby degrading said unwanted RNA molecules in said patient microbiome sample to form a degradation mixture.
Embodiment 121. The method of embodiment 119 or embodiment 120, further comprising (a) degrading any remaining DNA probes by contacting the degradation mixture with a DNA-digesting enzyme, optionally wherein the DNA-digesting enzyme is DNase I, to form a DNA degradation mixture; and (b) isolating the degraded RNA from the degradation mixture or the DNA degradation mixture.
Embodiment 122. The method of any one of embodiments 119 to 121, wherein said contacting with said probe set comprises treating said nucleic acid sample with a destabilizing agent.
Embodiment 123. The method of embodiment 122, wherein the destabilizing agent is heat and/or a nucleic acid destabilizing chemical.
Embodiment 124. The method of embodiment 123, wherein the nucleic acid destabilizing chemical comprises betaine, DMSO, formamide, glycerol, or derivatives thereof, or mixtures thereof.
Embodiment 125. The method of embodiment 124, wherein the nucleic acid destabilizing chemical comprises formamide.
Embodiment 126. The method of embodiment 125, wherein the formamide is present at a concentration of about 10% to 45% by volume during contact with the probe set.
Embodiment 127. The method of any one of embodiments 123 to 126, wherein treating the sample with heat comprises applying heat above the melting temperature of the at least one DNA:RNA hybrid.
Embodiment 128. The method of any one of embodiments 119 to 127, wherein the ribonuclease is RNase H or Hybridase.
Embodiment 129. The method of any one of embodiments 119 to 128, wherein the patient is a human.
Embodiment 130. The method of any one of embodiments 119 to 129, wherein the microbiome sample is oral, vaginal, or from the gut.
Embodiment 131. The method of any one of embodiments 119 to 130, wherein the sample from the gut is a fecal sample.
Embodiment 132. The method of embodiment 131, wherein the oral sample is a sample from the tongue.
Embodiment 133. The method of any one of embodiments 119 to 132, wherein said at least one DNA probe comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
Embodiment 134. The method of embodiment 133, wherein the at least one DNA probe comprises 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
Embodiment 135. The method of any one of embodiments 119 to 134, wherein the patient is at least 12 months, at least 15 months, at least 24 months, or at least 36 months old.
Embodiment 136. The method of any one of embodiments 119 to 135, wherein the microbiome sample comprises at least one unwanted RNA molecule from the genus fecal, trichomonadaceae, and/or Clostridium.
Embodiment 137. The method of any one of embodiments 119 to 136, wherein the microbiome sample is vaginal and comprises at least one unwanted RNA molecule from Gardnerella, Lactobacillus, and/or euler.
Embodiment 138. The method of any one of embodiments 119 to 136, wherein the microbiome sample is from the tongue and comprises at least one unwanted RNA molecule from Veillonella, Rothia, Streptococcus, and/or Prevotella.
Embodiment 139. The method of any one of embodiments 120 to 138, wherein the at least one DNA probe comprises at least one sequence comprising the sequence of SEQ ID NO: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 140. The method of embodiment 139, wherein the at least one DNA probe comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NO: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 141. The method of embodiment 140, wherein the at least one DNA probe comprises a sequence comprising the sequence of SEQ ID NO: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 142. The method of any one of embodiments 139 to 141, wherein the patient is 3 months or less, 6 months or less, 12 months or less, 18 months or less, 24 months or less, or 36 months or less in age.
Embodiment 143. The method of embodiment 142, wherein the microbiome sample comprises at least one unwanted RNA molecule from Bifidobacterium bifidum and/or Blautia.
Embodiment 144. The method of any one of embodiments 139 to 143, wherein said at least one DNA probe further comprises at least one sequence comprising the sequence of SEQ ID NO: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 145. The method of embodiment 144, wherein the at least one DNA probe comprises 10 or more, 20 or more, or 30 or more sequences comprising the sequence of SEQ ID NO: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 146. The method of embodiment 145, wherein said at least one DNA probe comprises a sequence comprising the sequence of SEQ ID NO: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
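Because the SEQ ID NO lists in these embodiments mix single numbers and hyphenated ranges, it can help to expand them into explicit integers before comparing or counting them. The following is a minimal sketch; the helper name `expand_seq_ids` is hypothetical, and rejecting descending ranges is an assumption made here to flag likely transcription errors.

```python
import re


def expand_seq_ids(text):
    """Expand a list such as '1-3, 5, 7-8' into a sorted list of integers.

    Assumption: every number in the text is a SEQ ID NO, and ranges are
    written as 'start-end' with start <= end. A descending range is
    treated as a transcription error and rejected.
    """
    ids = set()
    for start, end in re.findall(r"(\d+)(?:-(\d+))?", text):
        lo = int(start)
        hi = int(end) if end else lo
        if hi < lo:
            raise ValueError(f"descending range: {start}-{end}")
        ids.update(range(lo, hi + 1))
    return sorted(ids)


# Example using the first few entries of the list in embodiment 144.
subset = expand_seq_ids("19, 74, 76, 85, 104, and 108")
mixed = expand_seq_ids("1-3, 5, 7-8")
```

With the lists expanded this way, set operations can be used to check that two lists do not overlap or that a probe set contains at least a given number of the listed sequences.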
Embodiment 147. The method of any one of embodiments 120 to 138, wherein the at least one immobilized oligonucleotide comprises at least one sequence comprising the sequence of at least one of SEQ ID NO: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 148. The method of embodiment 147, wherein the at least one immobilized oligonucleotide comprises 100 or more, 500 or more, or 1000 or more sequences comprising the sequence of at least one of SEQ ID NO: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 149. The method of embodiment 148, wherein the at least one immobilized oligonucleotide comprises a sequence comprising the sequence of each of SEQ ID NO: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 150. The method of any one of embodiments 139 to 149, wherein the at least one DNA probe further comprises at least one sequence comprising the sequence of SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 151. The method of embodiment 150, wherein the at least one DNA probe comprises 10 or more, 20 or more, or 30 or more sequences comprising the sequence of SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 152. The method of embodiment 151, wherein said at least one DNA probe comprises a sequence comprising the sequence of SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 153. The method of any one of embodiments 119-152, wherein the method depletes 70% or more, 80% or more, 90% or more, or 95% or more of the bacterial rRNA contained in the microbiome sample.
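The depletion thresholds in embodiment 153 can be illustrated with a short calculation. The following is a minimal sketch only: the text does not state how bacterial rRNA is quantified, so the two inputs (an rRNA measurement before and after depletion, in the same units) and the function names are assumptions made for illustration.

```python
from typing import List, Sequence


def percent_depleted(rrna_before: float, rrna_after: float) -> float:
    """Percentage of the starting rRNA signal that was removed.

    Both values are hypothetical measurements in the same units
    (for example, rRNA-assigned read counts normalized the same way).
    """
    if rrna_before <= 0:
        raise ValueError("rrna_before must be positive")
    if rrna_after < 0:
        raise ValueError("rrna_after must not be negative")
    return 100.0 * (rrna_before - rrna_after) / rrna_before


def thresholds_met(pct: float, thresholds: Sequence[int] = (70, 80, 90, 95)) -> List[int]:
    """Return the recited thresholds (in percent) that the value reaches."""
    return [t for t in thresholds if pct >= t]


# Hypothetical numbers: 1000 units of rRNA signal before, 80 after.
pct = percent_depleted(1000, 80)
met = thresholds_met(pct)
```

With these made-up inputs, 92% of the rRNA signal is removed, which satisfies the "70% or more", "80% or more", and "90% or more" alternatives but not "95% or more".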
Embodiment 154. A composition comprising (a) a probe set comprising at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131; and (b) a ribonuclease capable of degrading RNA in a DNA:RNA hybrid.
Embodiment 155. The composition of embodiment 154, wherein said ribonuclease is RNase H.
Embodiment 156. A kit comprising (a) a probe set comprising at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131; and (b) a ribonuclease capable of degrading RNA in a DNA:RNA hybrid.
Embodiment 157. The kit of embodiment 156, comprising (a) a probe set comprising at least one DNA probe comprising a sequence of any one of SEQ ID NOs: 1-1131; (b) a ribonuclease; (c) a DNase; and (d) RNA purification beads.
Embodiment 158. The kit of embodiment 157, wherein the ribonuclease is RNase H.
Embodiment 159. The kit of embodiment 157 or 158, further comprising an RNA depletion buffer, a probe depletion buffer, and a probe removal buffer.
Embodiment 160. The kit of any one of embodiments 157 to 159, further comprising a nucleic acid destabilizing chemical.
Embodiment 161. The kit of embodiment 160, wherein the nucleic acid destabilizing chemical comprises betaine, DMSO, formamide, glycerol, or derivatives thereof, or mixtures thereof.
Embodiment 162. The kit of embodiment 161, wherein the nucleic acid destabilizing chemical comprises formamide.
Embodiment 163. The composition or kit of any one of embodiments 154 to 162, wherein the at least one DNA probe comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131.
Embodiment 164. The composition or kit of embodiment 163, wherein the at least one DNA probe comprises 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131.
Embodiment 165. The composition or kit of any one of embodiments 154 to 164, wherein the at least one DNA probe comprises at least one sequence comprising at least one of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 166. The composition or kit of embodiment 165, wherein the at least one DNA probe comprises 100 or more, 500 or more, or 1000 or more sequences comprising at least one of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 167. The composition or kit of embodiment 166, wherein the at least one DNA probe comprises sequences comprising each of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 168. The composition or kit of any one of embodiments 165 to 167, wherein the at least one DNA probe further comprises at least one sequence comprising at least one of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 169. The composition or kit of embodiment 168, wherein the at least one DNA probe comprises 10 or more, 20 or more, or 30 or more sequences comprising at least one of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 170. The composition or kit of embodiment 169, wherein the at least one DNA probe comprises sequences comprising each of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
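The SEQ ID NO lists recited in these embodiments mix single numbers and ranges, which makes the "10 or more, 20 or more, or 30 or more" alternatives hard to check by eye. The following is a minimal sketch, not part of the disclosure, of how such a list could be expanded and counted; the function name `expand_seq_ids` is hypothetical, and the example string is the list recited in embodiments 168 to 170.

```python
import re
from typing import List

# A token is either a single number ("19") or an inclusive range ("36-38").
_TOKEN = re.compile(r"(\d+)(?:-(\d+))?")


def expand_seq_ids(spec: str) -> List[int]:
    """Expand a recited SEQ ID NO list into a sorted list of unique integers."""
    ids = set()
    for match in _TOKEN.finditer(spec):
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) else low
        if low > high:
            raise ValueError(f"descending range: {match.group(0)}")
        ids.update(range(low, high + 1))
    return sorted(ids)


# The list recited in embodiments 168 to 170.
RECITED = (
    "19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, "
    "520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, "
    "882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131"
)

recited_ids = expand_seq_ids(RECITED)
count = len(recited_ids)
```

A descending range such as "821-736" raises an error, so the same helper can flag transcription damage in the longer lists.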
Embodiment 171. The composition or kit of any one of embodiments 154 to 164, wherein the at least one immobilized oligonucleotide comprises at least one sequence comprising at least one of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 172. The composition or kit of embodiment 171, wherein the at least one immobilized oligonucleotide comprises 100 or more, 500 or more, or 1000 or more sequences comprising at least one of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 173. The composition or kit of embodiment 172, wherein the at least one immobilized oligonucleotide comprises sequences comprising each of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 174. The composition or kit of any one of embodiments 165 to 173, wherein the at least one DNA probe further comprises at least one sequence comprising at least one of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 175. The composition or kit of embodiment 174, wherein the at least one DNA probe comprises 10 or more, 20 or more, or 30 or more sequences comprising at least one of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 176. The composition or kit of embodiment 175, wherein the at least one DNA probe comprises sequences comprising each of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 177. A method of selecting cDNA library fragments from a library of cDNA fragments prepared from RNA, comprising (a) preparing a solid support comprising a pool of immobilized oligonucleotides, wherein each immobilized oligonucleotide in the pool comprises a nucleic acid sequence corresponding to an RNA sequence or a complement thereof; (b) adding the library of fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow binding of the library fragments to the at least one immobilized oligonucleotide; and (c) collecting library fragments that are bound, or that are not bound, to the at least one immobilized oligonucleotide.
Embodiment 178. The method of embodiment 177, wherein (a) the selecting is depleting unwanted cDNA library fragments, wherein the RNA sequences comprise unwanted RNA sequences, the unwanted library fragments comprise those prepared from unwanted RNA sequences, and the collecting comprises collecting library fragments that are not bound to the at least one immobilized oligonucleotide; or (b) the selecting is enriching for desired cDNA library fragments, wherein the RNA sequences comprise desired RNA sequences, the desired library fragments comprise those prepared from the desired RNA sequences, and the collecting comprises collecting library fragments bound to the at least one immobilized oligonucleotide.
Embodiment 179. The method of embodiment 178, wherein the library of fragments is subjected to depletion of unwanted cDNA library fragments, followed by enrichment for desired cDNA library fragments among the collected library fragments that did not bind to the at least one immobilized oligonucleotide.
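The logic of embodiments 177 to 179 (collect unbound fragments to deplete, collect bound fragments to enrich, or do one after the other) can be modeled in a few lines. The following is a toy sketch under strong assumptions: "hybridization" is reduced to an exact match between a fragment and the reverse complement of an immobilized oligonucleotide, whereas real hybridization depends on temperature, buffer conditions, and tolerated mismatches. All names and sequences are hypothetical.

```python
from typing import Iterable, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def binds(fragment: str, oligos: Iterable[str]) -> bool:
    """Toy binding rule: the fragment contains an exact complement of an oligo."""
    return any(reverse_complement(oligo) in fragment for oligo in oligos)


def select(fragments: Iterable[str], oligos: List[str]) -> Tuple[List[str], List[str]]:
    """Split fragments into (bound, unbound) relative to a pool of oligos."""
    bound, unbound = [], []
    for fragment in fragments:
        (bound if binds(fragment, oligos) else unbound).append(fragment)
    return bound, unbound


def deplete_then_enrich(fragments, unwanted_oligos, desired_oligos):
    """Depletion keeps the unbound fraction; enrichment then keeps the bound one."""
    _, flow_through = select(fragments, unwanted_oligos)
    enriched, _ = select(flow_through, desired_oligos)
    return enriched


# Hypothetical example: one unwanted, one desired, one unrelated fragment.
library = ["ACGTTTAC", "TTTCGGAA", "GGGGCCCC"]
result = deplete_then_enrich(library, unwanted_oligos=["AAAC"], desired_oligos=["CCGA"])
```

Here the first fragment is removed in the depletion pass, the third is left behind in the enrichment pass, and only the second fragment is collected.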
Embodiment 180. A solid support having two pools of immobilized oligonucleotides on its surface, wherein the first pool of oligonucleotides comprises immobilized oligonucleotides, each immobilized oligonucleotide comprising a nucleic acid sequence corresponding to an unwanted RNA sequence or its complement; and the second pool of oligonucleotides comprises immobilized oligonucleotides, each immobilized oligonucleotide comprising a solid support adapter sequence capable of binding to a library adapter comprised in a library fragment.
Embodiment 181. The method of any one of embodiments 177 to 179 or the solid support of embodiment 180, wherein at least one unwanted RNA sequence has at least 90%, at least 95%, or at least 99% homology to a high abundance RNA sequence in a sample used to prepare the library of fragments.
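The percentage thresholds in embodiment 181 can be illustrated with a simple identity calculation. The following is a minimal sketch only: this passage does not say how homology is computed, so the example assumes two sequences of equal length that are already aligned without gaps, and counts matching positions. In practice an alignment tool would be used; the function name and sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences.

    Assumes equal length and no gaps, which is a simplification.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 10-nucleotide RNA sequences that differ at the last position.
identity = percent_identity("ACGUACGUAC", "ACGUACGUAA")
meets_90 = identity >= 90
meets_95 = identity >= 95
```

With one mismatch in ten positions, the pair satisfies the "at least 90%" alternative but not "at least 95%".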
Embodiment 182. The method or solid support of embodiment 181 wherein the high abundance RNA sequence is a ribosomal RNA (rRNA) sequence.
Embodiment 183. The method or solid support of embodiment 182, wherein the unwanted RNA sequence is a globin mRNA or 28S, 23S, 18S, 5.8S, 5S, 16S, 12S, HBA-A1, HBA-A2, HBB-B1, HBB-B2, HBG1, or HBG2 RNA, or a fragment thereof.
Embodiment 184. The method or solid support of any one of embodiments 177-183, wherein each pool of immobilized oligonucleotides comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, or 1100 or more oligonucleotides.
Embodiment 185. The solid support of any one of embodiments 180-184, wherein an adapter complement sequence that is wholly or partially complementary to the solid support adapter sequence is bound to the solid support adapter sequence of the second pool, and wherein the binding of the adapter complement sequence to the solid support adapter sequence is reversible.
Embodiment 186. A method of amplifying desired cDNA library fragments from a library of cDNA fragments prepared from RNA, comprising (a) providing a solid support according to embodiment 185; (b) adding the library of fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow unwanted library fragments to bind to the first pool of oligonucleotides; (c) collecting library fragments that do not bind to the first pool of oligonucleotides to prepare collected library fragments; (d) denaturing and removing the library fragments bound to the first pool of oligonucleotides and the adapter complement sequences bound to the solid support adapter sequences of the second pool of oligonucleotides; (e) adding the collected library fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow the desired library fragments to bind to the second pool of oligonucleotides; and (f) amplifying the bound desired library fragments by bridge amplification on the solid support.
Embodiment 187. A method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, the method comprising (a) sequencing a plurality of probe development microbiome samples to determine, from the sequencing data, at least one unwanted RNA molecule comprising a bacterial ribosomal RNA (rRNA) sequence; (b) preparing a probe set comprising at least one DNA probe complementary to the at least one unwanted RNA molecule; (c) contacting the patient microbiome sample with the probe set to produce DNA:RNA hybrids; and (d) contacting the DNA:RNA hybrids with a ribonuclease that degrades RNA from the DNA:RNA hybrids, thereby degrading the unwanted RNA molecules in the patient microbiome sample to form a degradation mixture.
Embodiment 188. A method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, the method comprising (a) contacting the patient microbiome sample with a probe set comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131 to produce DNA:RNA hybrids; and (b) contacting the DNA:RNA hybrids with a ribonuclease that degrades RNA from the DNA:RNA hybrids, thereby degrading the unwanted RNA molecules in the patient microbiome sample to form a degradation mixture.
Embodiment 189. The method of embodiment 187 or embodiment 188, further comprising (a) degrading any remaining DNA probes by contacting the degradation mixture with a DNA-digesting enzyme, optionally wherein the DNA-digesting enzyme is DNase I, to form a DNA degradation mixture; and (b) isolating the degraded RNA from the degradation mixture or the DNA degradation mixture.
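Step (b) of embodiment 187 calls for DNA probes complementary to the unwanted RNA, so that the two strands form the DNA:RNA hybrids that the ribonuclease acts on. The following is a minimal sketch of that complementarity rule only, not the probe design procedure of the disclosure; the function names and the example sequence are hypothetical.

```python
# Watson-Crick partner in DNA for each RNA base.
_RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def dna_probe_for(rna_target: str) -> str:
    """Return the DNA strand, written 5' to 3', that is fully complementary
    to an RNA target written 5' to 3'."""
    try:
        return "".join(_RNA_TO_DNA[base] for base in reversed(rna_target.upper()))
    except KeyError as err:
        raise ValueError(f"not an RNA base: {err.args[0]}") from None


def forms_perfect_hybrid(dna_probe: str, rna_target: str) -> bool:
    """True if the probe pairs with the target at every position."""
    return dna_probe.upper() == dna_probe_for(rna_target)


# Hypothetical 5-nucleotide stretch of an unwanted rRNA.
probe = dna_probe_for("AUGGC")
paired = forms_perfect_hybrid(probe, "AUGGC")
```

The probe is the reverse complement of the target with T in place of U, which is why the target is read backwards when the probe is built.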
Embodiment 190. A composition comprising (a) a probe set comprising at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131; and (b) a ribonuclease capable of degrading RNA in a DNA:RNA hybrid.
Embodiment 191. A kit comprising (a) a probe set comprising at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131; and (b) a ribonuclease capable of degrading RNA in a DNA:RNA hybrid.
Embodiment 192. The kit of embodiment 191, comprising (a) a probe set comprising at least one DNA probe comprising a sequence of any one of SEQ ID NOs: 1-1131; (b) a ribonuclease; (c) a DNase; and (d) RNA purification beads.
Embodiment 193. The method of any one of embodiments 177 to 179, 181 or 186 to 189, the solid support of any one of embodiments 180 to 185, the composition of embodiment 190, or the kit of embodiment 191 or 192, wherein the oligonucleotide pool or the probe set comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131.
Embodiment 194. The method of any one of embodiments 177-179, 181-184, or 186-189, the solid support of any one of embodiments 180-185, the composition of embodiment 190, or the kit of embodiment 191 or 192, wherein the oligonucleotide pool or the probe set comprises at least one sequence comprising at least one of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 195. The method, solid support, composition, or kit of embodiment 194, wherein the oligonucleotide pool or the probe set comprises 100 or more, 500 or more, or 1000 or more sequences comprising at least one of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 196. The method, solid support, composition, or kit of embodiment 195, wherein the oligonucleotide pool or the probe set comprises sequences comprising each of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
Embodiment 197. The method, solid support, composition, or kit of any one of embodiments 194 to 196, wherein the oligonucleotide pool or the probe set further comprises at least one sequence comprising at least one of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 198. The method, solid support, composition, or kit of embodiment 197, wherein the oligonucleotide pool or the probe set comprises 10 or more, 20 or more, or 30 or more sequences comprising at least one of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 199. The method, solid support, composition, or kit of embodiment 198, wherein the oligonucleotide pool or the probe set comprises sequences comprising each of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
Embodiment 200. The method of any one of embodiments 177 to 179, 181 to 184, or 186 to 189, the solid support of any one of embodiments 180 to 185, the composition of embodiment 190, or the kit of embodiment 191 or 192, wherein the oligonucleotide pool or the probe set comprises at least one sequence comprising at least one of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 201. The method, solid support, composition, or kit of embodiment 200, wherein the oligonucleotide pool or the probe set comprises 100 or more, 500 or more, or 1000 or more sequences comprising at least one of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 202. The method, solid support, composition, or kit of embodiment 201, wherein the oligonucleotide pool or the probe set comprises sequences comprising each of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
Embodiment 203. The method, solid support, composition, or kit of any one of embodiments 194-202, wherein the oligonucleotide pool or the probe set further comprises at least one sequence comprising at least one of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 204. The method, solid support, composition, or kit of embodiment 203, wherein the oligonucleotide library or the probe set comprises 10 or more, 20 or more, or 30 or more sequences comprising the sequences of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Embodiment 205. The method, solid support, composition, or kit of embodiment 204, wherein the oligonucleotide library or the set of probes comprises sequences comprising each of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Additional objects and advantages will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice. These objects and advantages will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the claims.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate one (or more) embodiments and, together with the description, serve to explain the principles described herein.
Drawings
FIG. 1 provides an overview of a method of depleting unwanted library fragments derived from rRNA transcripts. The solid support (such as a flow cell) comprises at least one immobilized oligonucleotide comprising a tether that links the oligonucleotide to the solid support. The immobilized oligonucleotides may comprise sequences complementary to sequences contained in library fragments comprising inserts of cDNA prepared from rRNA (e.g., labeled "rRNA complementary sequences").
The library fragments may be flowed over the solid support, where fragments prepared from rRNA (i.e., library fragments comprising "rRNA library" sequences) hybridize to immobilized oligonucleotides each comprising a sequence complementary to rRNA. Library fragments that do not bind to the immobilized oligonucleotides can be collected by siphoning, and the hybridized library fragments (i.e., unwanted library fragments) can then be denatured and siphoned into a waste container. The collected library fragments may then be flowed over the solid support again to allow binding of any additional unwanted library fragments. The steps of (1) hybridizing the unwanted library fragments, (2) collecting unbound library fragments, and (3) denaturing the hybridized library fragments may be repeated until a final set of unbound library fragments is collected, representing a library depleted of unwanted library fragments prepared from rRNA. A similar method can be used for enrichment, wherein desired library fragments are bound to immobilized oligonucleotides comprising sequences complementary to these desired library fragments, except that the bound library fragments are used for sequencing and the unbound library fragments are siphoned off and discarded.
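The hybridize, collect, and denature cycle described for FIG. 1 can be sketched in silico. The following is only an illustrative toy model, not the method itself: fragments and probes are plain DNA strings, hybridization is approximated as a fragment containing the exact reverse complement of a probe, and `capacity` (the number of fragments the support can bind per pass) and `max_passes` are hypothetical parameters introduced for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def hybridizes(fragment, probes):
    """Toy hybridization rule: the fragment contains the exact
    reverse complement of at least one immobilized probe."""
    return any(reverse_complement(p) in fragment for p in probes)


def deplete(fragments, probes, capacity, max_passes=10):
    """Repeat (1) hybridize, (2) collect unbound, (3) denature bound to waste
    until no further fragments bind or max_passes is reached."""
    collected = list(fragments)
    waste = []
    for _ in range(max_passes):
        bound, unbound = [], []
        for frag in collected:
            # Assumption: the support binds at most `capacity` fragments per pass.
            if len(bound) < capacity and hybridizes(frag, probes):
                bound.append(frag)
            else:
                unbound.append(frag)
        if not bound:
            break  # nothing else binds; the collected pool is the depleted library
        waste.extend(bound)   # denature and siphon to waste
        collected = unbound   # flow the collected fragments over the support again
    return collected, waste
```

With a limited binding capacity, several passes are needed before no further fragments bind, which mirrors the reason for repeating steps (1) to (3).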
FIG. 2 shows an overview of a method for depleting unwanted library fragments and performing bridge amplification on the same solid support. The solid support used in the method comprises immobilized oligonucleotides each comprising an rRNA complementary sequence, as well as immobilized oligonucleotides comprising an adapter sequence that can bind to an adapter comprised in a library fragment. Such adapters contained in the immobilized oligonucleotides may be referred to as "solid support adapter sequences," and the library fragments may comprise "library adapter sequences" that are fully or partially complementary to the solid support adapter sequences. The solid support adapter may comprise the P5 adapter sequence (SEQ ID NO: 1132) or the P7 adapter sequence (SEQ ID NO: 1133) and/or their complements.
The immobilized oligonucleotide comprising a solid support adapter sequence may bind to an adapter complementary sequence that is wholly or partially complementary to the solid support adapter sequence, wherein the adapter complementary sequence hybridizes to the solid support adapter sequence to form a double stranded nucleic acid. Such hybridization inhibits binding of the immobilized oligonucleotides comprising the adapter sequences to the library fragments (i.e., inhibits binding of the solid support adapter sequences to the library adapter sequences).
Fragments prepared from rRNA can bind to immobilized oligonucleotides each comprising a sequence complementary to rRNA, as described in the legend of FIG. 1. After collection of the desired library fragments (those not bound to the immobilized oligonucleotides each comprising the rRNA complementary sequence), the unwanted library fragments and the adapter complementary sequences can be denatured and siphoned to waste. The collected library fragments (comprising the desired library fragments) may then be flowed over the flow cell and bound to immobilized oligonucleotides comprising solid support adapter sequences by hybridization of the library adapter sequences to the solid support adapter sequences. Bridge amplification of the bound library fragments can then be performed. The resulting amplified, depleted library may be sequenced, optionally after quantification and quality control.
FIG. 3 shows the results of depleting human gut microbiome rRNA using the RiboZero method with RNase and either the standard probes (DP1) or the human microbiome probes (HM, comprising HMv1 and HMv2 probes) described herein. Significantly more rRNA depletion was observed with the HM probes.
FIG. 4 shows rRNA levels in wastewater samples after "mock" depletion or after RiboZero depletion with HM probes. Significantly more rRNA depletion was observed with the HM probes. Bac = bacterial rRNA; Arc = archaeal rRNA; Euk = eukaryotic rRNA; Rfam = non-coding RNA as defined by the Rfam database.
FIG. 5 shows the results for a skin microbiome whole cell mix (ATCC MSA-2005™). The experiment compares results obtained with the RiboZero RNase protocol (using standard DP1 probes or human microbiome HM probes) with results obtained with RiboZero-act, which uses a probe-based hybridization method to capture and deplete bacterial rRNA from E. coli and Bacillus subtilis.
Sequence description
Table 1 provides a list of certain sequences cited herein.
Detailed Description
I. Solid support for enrichment or depletion
In some embodiments, a solid support for enriching for desired library fragments or consuming undesired library fragments may be prepared, wherein at least one oligonucleotide is immobilized to the solid support. In some embodiments, the solid support is a flow cell.
In some embodiments, at least one immobilized oligonucleotide comprises a nucleic acid comprising SEQ ID NO: 1-1131.
In some embodiments, the at least one immobilized oligonucleotide comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements. In some embodiments, the at least one immobilized oligonucleotide comprises 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
In some embodiments, the at least one immobilized oligonucleotide comprises at least one sequence from bacterial ribosomal RNA (rRNA) or its complement. In some embodiments, at least one immobilized oligonucleotide comprises at least one sequence comprising SEQ ID NO: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130 or the complement thereof.
In some embodiments, at least one immobilized oligonucleotide comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130 or the complement thereof.
In some embodiments, at least one immobilized oligonucleotide comprises sequences comprising each of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130 or their complements.
In some embodiments, the at least one immobilized oligonucleotide comprises at least one sequence of Bifidobacterium bifidum rRNA or a complement thereof. In some embodiments, the at least one immobilized oligonucleotide further comprises at least one sequence comprising SEQ ID NO: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131 or a complement thereof. In some embodiments, at least one immobilized oligonucleotide comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131 or a complement thereof. In some embodiments, at least one immobilized oligonucleotide comprises sequences comprising each of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131 or a complement thereof.
In some embodiments, at least one immobilized oligonucleotide comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NOs: or at least one of its complementary sequences. In some embodiments, at least one immobilized oligonucleotide comprises a sequence comprising SEQ ID NO: or each of its complementary sequences.
In some embodiments, the at least one immobilized oligonucleotide further comprises at least one sequence comprising SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127. In some embodiments, at least one immobilized oligonucleotide comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127. In some embodiments, at least one immobilized oligonucleotide comprises sequences comprising each of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
Also disclosed herein are compositions comprising library fragments bound to immobilized oligonucleotides on a solid support. In some embodiments, single-stranded library fragments comprising cDNA prepared from a sample comprising RNA are hybridized to a solid support comprising immobilized oligonucleotides. In some embodiments, the cDNA contained in the composition is complementary to RNA contained in the sample.
Kits for depleting or enriching libraries are also disclosed herein. In some embodiments, the kit comprises a solid support as disclosed herein and instructions for using the solid support. Such kits may also include reagents for preparing a cDNA library from RNA, such as reagents for a strand process for preparing cDNA from a sample comprising RNA, as described below.
A. Type of solid support
A variety of solid supports can be used to immobilize oligonucleotides for consumption or enrichment as described herein, including those described in WO 2014/108810, which is incorporated herein in its entirety.
The composition and geometry of the solid support may vary with its use. In some embodiments, the solid support is a planar structure, such as a slide, chip, microchip, and/or array. Thus, the surface of the substrate may be in the form of a planar layer. In some embodiments, the solid support comprises one or more surfaces of a flow cell. As used herein, the term "flow cell" refers to a chamber that includes a solid surface through which one or more fluidic reagents can flow. Examples of flow cells and related fluidic systems and detection platforms that can be readily used in the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008); WO 04/018497; US 7,057,026; WO 91/06678; WO 07/123744; US 7,329,492; US 7,211,414; US 7,315,019; US 7,405,281; and US 2008/0108082, each of which is incorporated herein by reference.
In some embodiments, the flow cell is contained within an apparatus or device for sequencing nucleic acids (which may be referred to as a sequencer). In some embodiments, the sequencer may also include a reservoir or conduit for collecting the sample (such as for collecting the sample in a reservoir otherwise used for draining waste). In some embodiments, one or more reservoirs are separate from the flow cell and contained in the sequencer. In some embodiments, standard sequencers are modified to adapt the fluidics protocols and/or hardware for use of reservoirs in the methods of the present invention.
As used herein, a "flow cell" may include flow cell-like devices that are not intended to be imaged. While standard flow cells for imaging may be used in the methods of the invention, flow cells may also be designed differently from flow cells for imaging. In some embodiments, the flow cell may have a density of immobilized oligonucleotides so high that imaging infrastructure would have difficulty resolving the distinct bridge-amplified clusters associated with different immobilized oligonucleotides. In some embodiments, a high density of immobilized oligonucleotides increases hybridization efficiency. In some embodiments, standard transparent glass may be used in the flow cell. In other embodiments, a hard plastic may be used in the flow cell. The use of glass may allow standard flow cells to be used without further optimization, whereas the use of hard plastic may reduce the cost of manufacturing the flow cell and/or increase its stability. Different materials may be used, depending on the desired advantages. In some embodiments, the immobilized oligonucleotides are embedded in a matrix other than that of a standard flow cell (i.e., in a matrix other than PAZAM) to improve immobilization of longer oligonucleotides.
B. Unwanted RNA
As used herein, "unwanted RNA" or "unwanted RNA sequence" refers to any RNA that a user does not wish to analyze. As used herein, unwanted RNAs include complementary sequences to unwanted RNA sequences. When RNA is converted to cDNA and the cDNA is prepared into a library, the user will sequence library fragments prepared from all RNA transcripts without enrichment or depletion. Thus, the methods described herein for consuming library fragments prepared from unwanted RNA can save user time and consumables associated with sequencing and analyzing sequencing data prepared from unwanted RNA.
As used herein, "unwanted RNAs" or "unwanted RNA sequences" also include fragments of such RNAs. For example, the unwanted RNA may comprise a partial sequence of the unwanted RNA. In some embodiments, the unwanted RNA sequences are from human, rat, mouse, or bacteria. In some embodiments, the bacteria are an archaeal species, Escherichia coli, or Bacillus subtilis.
As used herein, "unwanted library fragments" refers to library fragments prepared from cDNA prepared from unwanted RNA.
In some embodiments, the unwanted RNA sequence comprises SEQ ID NO: 1-1131.
In some embodiments, unwanted RNA sequences (or their complements) are immobilized to a solid support. A range of different types of RNA may be unwanted.
1. High abundance RNA
In some embodiments, the unwanted RNA is high abundance RNA. High abundance RNA is RNA that is very abundant in many samples and that the user does not want to sequence, but it may or may not be present in a given sample. In some embodiments, the high abundance RNA sequence is a ribosomal RNA (rRNA) sequence. Exemplary high abundance RNAs are disclosed in WO 2021/127191 and WO 2020/132304, each of which is incorporated herein by reference in its entirety.
In some embodiments, the high abundance RNA sequence is the most abundant RNA sequence determined in the sample. In some embodiments, the high abundance RNA sequences are the most abundant RNA sequences in multiple samples, although they may not be the most abundant in a given sample. In some embodiments, the user utilizes the method of determining the most abundant RNA sequences in a sample as described herein.
In some embodiments, the most abundant sequences in a given sample are the 100 most abundant sequences. In some embodiments, in addition to depleting the 100 most abundant sequences, the method can deplete the 1,000 most abundant sequences or the 10,000 most abundant sequences in the sample. In some embodiments, the unwanted RNA sequence comprises a sequence having at least 90%, at least 95%, or at least 99% homology to the most abundant sequence in the sample comprising RNA. In some embodiments, the unwanted RNA sequence comprises a sequence having at least 90%, at least 95%, or at least 99% homology to the most abundant sequence in the sample comprising RNA, wherein the most abundant sequence comprises the 100 most abundant sequences. In some embodiments, homology is measured against the 1,000 most abundant sequences or the 10,000 most abundant sequences.
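As a rough illustration of how the most abundant sequences and a 90% threshold might be evaluated, the following minimal sketch ranks sequences by count and scores a candidate against them. It rests on assumptions not stated in the text: "homology" is approximated as ungapped percent identity between equal-length strings, and the function names `most_abundant`, `percent_identity`, and `is_unwanted` are hypothetical.

```python
from collections import Counter


def most_abundant(sequences, n=100):
    """Return the n most frequently observed sequences, most common first."""
    return [seq for seq, _count in Counter(sequences).most_common(n)]


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences
    (a simplifying stand-in for the homology measure in the text)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def is_unwanted(candidate, abundant, threshold=90.0):
    """True if the candidate meets the identity threshold against
    any equal-length sequence in the abundant set."""
    return any(
        len(candidate) == len(ref) and percent_identity(candidate, ref) >= threshold
        for ref in abundant
    )
```

A real workflow would normally use an alignment tool that handles gaps and unequal lengths; the fixed-length comparison here only illustrates the thresholding step.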
In some embodiments, the high abundance RNA sequences are contained in RNAs that are known to be highly abundant in a range of samples.
In some embodiments, the unwanted RNA sequence is a globin mRNA or 28S, 23S, 18S, 5.8S, 5S, 16S, 12S, HBA-A1, HBA-A2, HBB-B1, HBB-B2, HBG1 or HBG2 RNA or a fragment thereof.
In some embodiments, the unwanted RNA sequence is 28S, 18S, 5.8S, 5S, 16S, or 12S RNA from a human, or a fragment thereof. In some embodiments, the unwanted RNA sequence is rat 16S, rat 28S, mouse 16S, or mouse 28S RNA.
In some embodiments, the unwanted RNA sequences are contained in mRNA associated with one or more "housekeeping" genes. For example, a housekeeping gene may be a housekeeping gene that is normally expressed in a sample from a tumor or other oncology-related sample, but not involved in tumorigenesis or progression.
In some embodiments, the unwanted RNA sequence is contained in a 23S, 16S or 5S RNA from a gram positive or gram negative bacterium. In some embodiments, the unwanted RNA sequences are from organisms in the human microbiome.
2. Host RNA
In some embodiments, the unwanted RNA sequences are contained in a host transcriptome. For example, a user may wish to study library fragments made from RNAs from organisms contained in a human microbiome, rather than analyze library fragments made from human RNAs.
C. RNA of interest
As used herein, "desired RNA" or "desired RNA sequence" refers to any RNA that a user wants to analyze. As used herein, a desired RNA includes a complement of a desired RNA sequence. The desired RNA may be one from which the user wants to collect sequencing data after cDNA and library preparation. In some cases, the desired RNA is mRNA (or messenger RNA). In some cases, the desired RNA is part of the mRNA in the sample. For example, a user may want to analyze RNA transcribed from a cancer-related gene, and thus this is a desired RNA. In another example, a user may wish to analyze RNA from an organism contained in a human microbiome, and thus RNA from an organism contained in a human microbiome is a desired RNA, while human RNA is an undesired RNA.
As used herein, "desired library fragment" refers to library fragments prepared from cDNA prepared from desired RNA.
In some embodiments, the desired RNA sequence is an exome sequence. In some embodiments, the methods of the invention are used for exome enrichment.
In some embodiments, the desired RNA sequence is from human, rat, mouse, and/or bacteria. In some embodiments, the desired RNA sequence is from an organism in a human microbiome.
D. Immobilized oligonucleotides for enrichment or depletion
In some embodiments, the oligonucleotides for enrichment or depletion are immobilized to a solid support. Such immobilized oligonucleotides may be referred to as being attached to a solid support. In some embodiments, the oligonucleotide may be immobilized to a solid support through a linker molecule. When referring to the immobilization of an oligonucleotide to a solid support, the terms "immobilized" and "attached" are used interchangeably herein and are intended to encompass direct or indirect, covalent or non-covalent attachment unless otherwise indicated explicitly or by context. In certain embodiments of the invention, covalent attachment may be preferred, but generally all that is required is that the at least one immobilized oligonucleotide remains immobilized or attached to the support under the conditions in which the support is intended to be used, e.g., for enrichment or depletion.
As used herein, "tether" refers to any means of immobilizing an oligonucleotide to a solid support. In some embodiments, a solid support (such as a flow cell) is coated with a covalently linked polymer. In some embodiments, the flow cell contains a polymer coating. In some embodiments, the covalently attached polymer is PAZAM. In some embodiments, the polymeric coating comprises reactive sites for reacting with an oligonucleotide (such as the oligonucleotides described herein). Such covalently attached polymers are described in WO 2013/184796, which is incorporated herein by reference in its entirety. In some embodiments, ultraviolet light is used to crosslink polymers such as PAZAM.
In some embodiments, the immobilized oligonucleotides may be designed to comprise cleavage sites. In some embodiments, the method may include the step of cleaving the immobilized oligonucleotides to remove them from the solid support. In some embodiments, after cleavage of the immobilized oligonucleotide, the resulting fragments from the immobilized oligonucleotide are collected in a waste container included in a sequencer. In some embodiments, the tether may comprise a cleavage site. In this way, some or all of the immobilized oligonucleotides on the solid surface may be removed at the discretion of the user, potentially avoiding the need to transfer the sample to a different solid support.
In some embodiments, the immobilized oligonucleotides described herein are single stranded. In this way, the immobilized oligonucleotides can be used to hybridize to single-stranded library fragments, which are each partially complementary to the sequences contained in the immobilized oligonucleotides. The skilled artisan can design the length of the immobilized oligonucleotides to allow for their preferred level of affinity for interaction between the immobilized oligonucleotides and the fully or partially complementary library fragments (i.e., longer immobilized oligonucleotides are expected to exhibit higher affinity binding to the fully or partially complementary single-stranded library fragments).
In some embodiments, the sequences contained in the immobilized oligonucleotides may be partially or fully complementary to the sequences of library fragments prepared from unwanted RNAs for consumption. In some embodiments, the sequences contained in the immobilized oligonucleotides may be partially or fully complementary to the sequences of library fragments prepared from the desired RNAs for enrichment.
In some embodiments, each immobilized oligonucleotide is 10 to 100 nucleotides long, 20 to 80 nucleotides long, 40 to 60 nucleotides long, 45 to 55 nucleotides long, or 50 nucleotides long. In some embodiments, at least one immobilized oligonucleotide is 45-55 bases in length, optionally wherein the at least one immobilized oligonucleotide is 50 bases in length. In some embodiments, the immobilized oligonucleotide has a molecular weight (m.w.) of 15,000 to 15,500 daltons.
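The stated molecular weight range is consistent with a 50-nucleotide single-stranded DNA oligonucleotide of roughly balanced base composition. The sketch below uses a commonly cited approximation for unmodified single-stranded DNA without a 5' phosphate; the per-nucleotide masses, the correction term, and the example sequence are assumptions for illustration, and a tether or modified bases would change the value.

```python
# Commonly cited average nucleotide masses (g/mol) within a DNA strand; assumed values.
NT_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def ssdna_molecular_weight(seq):
    """Approximate molecular weight (Da) of an unmodified single-stranded
    DNA oligonucleotide without a 5' phosphate."""
    return sum(NT_MASS[base] for base in seq.upper()) - 61.96


# Hypothetical 50-nt oligonucleotide with near-equal base composition.
oligo = "ACGT" * 12 + "AC"
mw = ssdna_molecular_weight(oligo)
print(len(oligo), round(mw, 2))
```

For this example sequence the estimate is about 15,370 Da, which falls inside the 15,000 to 15,500 dalton range given above; strongly G-rich or C-rich 50-mers would fall outside it.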
In some embodiments, a plurality of different oligonucleotides comprising sequences that are wholly or partially complementary to unwanted or desired RNAs may be immobilized on a solid support. In some embodiments, these plurality of different oligonucleotides are fully or partially complementary to different sequences contained in unwanted or desired RNAs. For example, if a user wants to consume a given rRNA, the user can prepare a variety of oligonucleotides having overlapping or non-overlapping sequences corresponding to that rRNA. In some embodiments, having multiple immobilized oligonucleotides corresponding to different sequences from a given unwanted RNA can increase the efficiency of consuming library fragments prepared from the RNA. In some embodiments, having multiple immobilized oligonucleotides corresponding to different sequences from a given desired RNA can increase the efficiency of enriching library fragments prepared from that RNA. In part, this increased efficiency may be due to the fact that library fragments may be randomly generated from the cDNA prepared from a given RNA, and that the user cannot predict the specific insert of the cDNA contained in a given fragment.
In some embodiments, the sequence contained in the immobilized oligonucleotide may be fully or partially complementary to a specific location (i.e., target location) on the RNA to be consumed or enriched, e.g., the sequence contained in the immobilized oligonucleotide may be at least 80%, 85%, 90%, 95% or 100% or any range therebetween complementary to the target location on the RNA transcript to be consumed or enriched.
In some embodiments, the immobilized oligonucleotides can bind to a set of different sequences contained in the RNA to be consumed. In some embodiments, a plurality of immobilized oligonucleotides may be designed to tile across the RNA sequence to be consumed, such as the tiling described in WO 2020132304, which is incorporated herein in its entirety. In some embodiments, a plurality of immobilized oligonucleotides designed for a target sequence may increase the likelihood that fragments generated from the target sequence bind to at least one immobilized oligonucleotide. For example, a library insert contained in a library fragment may comprise about 150 bp, and an immobilized oligonucleotide as described herein may comprise 50-80 nucleotides. In this case, if a fragmentation event occurs within a target sequence and disrupts the binding of a given immobilized oligonucleotide to a fragment (e.g., if fragmentation occurs within a sequence that can bind a given immobilized oligonucleotide), an immobilized oligonucleotide designed to bind an adjacent target sequence may be able to hybridize to the fragment. In this way, tiling of sequences can increase the likelihood of successful consumption or enrichment of fragments made from RNA sequences.
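The tiling rationale above can be sketched in Python under simplifying assumptions: a toy 300-nucleotide target, 50-nucleotide antisense probes tiled end to end, and a 150-nucleotide insert that may start anywhere in the target. A probe is counted only if its full binding site lies within the insert, which is a conservative stand-in for hybridization. All names and parameter values are illustrative and are not taken from WO 2020132304.

```python
import random

_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna(rna: str) -> str:
    """DNA reverse complement of an RNA sequence."""
    return rna.upper().translate(_COMPLEMENT)[::-1]


def tile_probes(target_rna: str, probe_len: int = 50, step: int = 50):
    """Return (start, probe) pairs for antisense probes tiled along the target."""
    probes = []
    for start in range(0, len(target_rna) - probe_len + 1, step):
        window = target_rna[start:start + probe_len]
        probes.append((start, antisense_dna(window)))
    return probes


def probes_within_fragment(probes, frag_start: int, frag_len: int,
                           probe_len: int = 50):
    """Start positions of probes whose whole binding site lies inside the fragment."""
    frag_end = frag_start + frag_len
    return [s for s, _ in probes
            if s >= frag_start and s + probe_len <= frag_end]


random.seed(0)
target = "".join(random.choice("ACGU") for _ in range(300))  # toy target RNA
probes = tile_probes(target)

# Slide a 150-nt insert across every possible start position in the target.
coverage = [len(probes_within_fragment(probes, s, 150)) for s in range(0, 151)]
print("probes designed:", len(probes))
print("fewest intact binding sites in any insert:", min(coverage))
```

In this toy setting every possible insert retains at least one intact binding site regardless of where fragmentation occurred, which is the effect the tiling is intended to achieve.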
In some embodiments, the oligonucleotides of the invention comprise modified or unmodified nucleic acids.
As used herein, "modified nucleic acid" refers to any substitution from a naturally occurring nucleic acid. For example, the modified nucleic acid may comprise one or more modifications to the sugar-phosphate backbone or pendant bases. Such modifications may increase the stability of the immobilized oligonucleotides.
In some embodiments, one, at least one, or each of the one or more immobilized nucleic acids comprises RNA, deoxyribonucleic acid (DNA), a xeno nucleic acid (XNA), or a combination thereof. The XNA may include 1,5-anhydrohexitol nucleic acid (HNA), cyclohexene nucleic acid (CeNA), threose nucleic acid (TNA), glycol nucleic acid (GNA), locked nucleic acid (LNA), peptide nucleic acid (PNA), fluoroarabinonucleic acid (FANA), or a combination thereof.
In some embodiments, the immobilized nucleic acid consists of a modified nucleic acid. In some embodiments, a percentage of the nucleic acids contained in the immobilized nucleic acids are modified nucleic acids, e.g., every third nucleotide may be a modified nucleic acid.
In some embodiments, the at least one immobilized oligonucleotide comprises a sequence of an unwanted RNA or a complementary sequence. Solid supports having such immobilized oligonucleotides comprising sequences of unwanted RNAs or complementary sequences can be used to consume library fragments prepared from unwanted RNAs using the methods described herein.
In some embodiments, the at least one immobilized oligonucleotide comprises a sequence of a desired RNA or a complementary sequence. Solid supports having such immobilized oligonucleotides comprising sequences of or complementary to a desired RNA can be used to enrich for library fragments prepared from the desired RNA using the methods described herein.
1. Immobilized oligonucleotides for consumption
In some embodiments, the oligonucleotide for consumption comprises one or more unwanted RNA sequences.
In some embodiments, the immobilized oligonucleotides are designed to consume unwanted library fragments from the library. In some embodiments, the unwanted library fragments include library fragments prepared from unwanted RNAs. A representative example of a solid support with immobilized oligonucleotides for consuming unwanted library fragments is shown in fig. 1.
In some embodiments, the immobilized oligonucleotides are designed to consume each of the most abundant species identified from the sample.
Various types of unwanted RNA (such as rRNA) are well known in the literature. Ribozero+ probes and nuclease-based consumption of abundant transcripts using the Ribozero+ probes are described in WO 2020/132304A1, the contents of which are incorporated by reference in their entirety.
In some embodiments, the immobilized oligonucleotides are designed to consume the abundant transcripts described in WO 2020/132304A 1.
In some embodiments, unwanted RNA sequences are determined by evaluating sequencing results to determine abundant sequences in a sample comprising RNA. In some embodiments, the unwanted RNA sequence is selected by determining the most abundant sequence in the sample comprising RNA. In some embodiments, the most abundant sequences include 100 most abundant sequences, 1,000 most abundant sequences, or 10,000 most abundant sequences. In some embodiments, the unwanted RNA sequence comprises a sequence having at least 90%, at least 95%, or at least 99% homology to the most abundant sequence in the sample comprising RNA. In some embodiments, the unwanted RNA sequence comprises a sequence having at least 90%, at least 95%, or at least 99% homology to the most abundant sequence in the sample comprising RNA, wherein the most abundant sequence comprises 100 most abundant sequences, 1,000 most abundant sequences, or 10,000 most abundant sequences.
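The selection of unwanted sequences by abundance can be illustrated with a minimal Python sketch. It assumes that sequencing results are already available as a list of sequence strings and uses a simple ungapped identity between equal-length sequences as a stand-in for homology; the function names and toy data are hypothetical, and a real analysis would likely use a classifier or aligner on actual sequencing reads.

```python
from collections import Counter


def most_abundant(sequences, n):
    """Return the n most frequently observed sequences, most abundant first."""
    return [seq for seq, _ in Counter(sequences).most_common(n)]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def is_unwanted(candidate: str, abundant, min_identity: float = 90.0) -> bool:
    """True if the candidate meets the identity threshold against any abundant sequence."""
    return any(
        len(candidate) == len(ref) and percent_identity(candidate, ref) >= min_identity
        for ref in abundant
    )


if __name__ == "__main__":
    # Toy "sequencing results": one dominant sequence, one moderate, one rare.
    observed = ["ACGTACGTAC"] * 5 + ["GGGGCCCCAA"] * 3 + ["TTTTTTTTTT"]
    top = most_abundant(observed, 2)
    print(top)
    print(is_unwanted("ACGTACGTAA", top))  # one mismatch in ten positions
    print(is_unwanted("TTTTTTTTTT", top))
```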
WO 2021/127191, incorporated herein in its entirety, describes a method of selecting abundant regions from a sample comprising RNA. The immobilized oligonucleotides can be designed using the methods from WO 2021/127191, which use standard publicly available software to identify abundant regions. In some embodiments, the method of identifying abundant regions can avoid bias toward known samples within the environmental sample.
Exemplary types of immobilized oligonucleotides for consuming library fragments prepared from unwanted RNA (i.e., unwanted library fragments) are shown in FIG. 1. In some embodiments, unwanted library fragments are prepared from rRNA and may be referred to as an "rRNA library." In some embodiments, the rRNA library comprises library fragments prepared from the first strand of cDNA (prepared from RNA). When the unwanted library fragment is an rRNA library, the immobilized oligonucleotide may be an "rRNA complement" that can bind to the rRNA library. An immobilized oligonucleotide comprising an rRNA complementary sequence is one representative type of immobilized oligonucleotide for consumption, and one skilled in the art can design such an oligonucleotide for consumption of library fragments prepared from any type of unwanted RNA. In some embodiments, unwanted RNAs may be contained in some immobilized oligonucleotides, and complementary sequences of unwanted RNAs may be contained in other immobilized oligonucleotides.
2. Representative sequences contained in immobilized oligonucleotides for consumption
Table 1 describes a set of sequences that may be included in an immobilized oligonucleotide. The immobilized oligonucleotides listed in table 2 or their complementary sequences may have particular utility in the study of microbiome samples.
The immobilized oligonucleotides listed in Table 2 were designed by sequencing total RNA derived from human feces and identifying the abundant rRNA sequences detected using the publicly available rRNA classifier SortMeRNA (as described in Kopylova et al., Bioinformatics 28(24):3211-3217 (2012)). The most abundant transcripts were identified, and DNA probes were designed for these transcripts. Consumption was tested with fecal, skin, oral, and vaginal samples using the total RNA strand kit and with samples derived from various soil types, with much better results than standard consumption probe sets (data not shown). The oligonucleotides listed in Table 2 are designed to remove rRNA sequences from metatranscriptomic samples (such as stool) and are antisense to the rRNA sequences they target. In some embodiments, the at least one immobilized oligonucleotide comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from the group consisting of SEQ ID NOs: 1-1131 or their complements. In some embodiments, the at least one immobilized oligonucleotide comprises 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from the group consisting of SEQ ID NOs: 1-1131 or their complements.
In some embodiments, the at least one immobilized oligonucleotide comprises at least one sequence that is contained in the HMv sequences and comprises the sequence of SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130 or their complements.
In some embodiments, at least one immobilized oligonucleotide comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130 or the complements thereof.
In some embodiments, at least one immobilized oligonucleotide comprises a sequence comprising SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130 or their complements.
In some embodiments, the at least one immobilized oligonucleotide further comprises at least one sequence that is contained in the HMv sequences and comprises the sequence of SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131 or a complement thereof.
In some embodiments, at least one immobilized oligonucleotide comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131 or a complement thereof.
In some embodiments, at least one immobilized oligonucleotide comprises a sequence comprising SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131 or a complement thereof.
In some embodiments, the at least one immobilized oligonucleotide further comprises at least one sequence that is contained in the HM sequence (comprising both HMv1 and HMv probes) and comprises the sequence set forth in SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131 or at least one of their complements.
In some embodiments, at least one immobilized oligonucleotide comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131 or at least one of their complements.
In some embodiments, at least one immobilized oligonucleotide comprises a sequence comprising SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131 or the complements thereof.
In some embodiments, the at least one immobilized oligonucleotide further comprises at least one sequence that is contained in the DP1 sequence and comprises the sequence set forth in SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127 or at least one of their complements.
In some embodiments, at least one immobilized oligonucleotide comprises 10 or more, 20 or more, or 30 or more sequences comprising SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127 or at least one of their complements.
In some embodiments, at least one immobilized oligonucleotide comprises a sequence comprising SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127, or the complement thereof.
3. Immobilized oligonucleotides for enrichment
In some embodiments, the immobilized oligonucleotides are designed to enrich for desired library fragments. In some embodiments, the oligonucleotides for enrichment comprise one or more desired RNA sequences. The user can design the oligonucleotides for enrichment in a manner similar to the selection of probes for consumption described above. For example, a user can prepare immobilized oligonucleotides comprising desired RNA sequences from target organisms contained in a human microbiome in order to enrich for library fragments prepared from these desired RNA sequences. Likewise, the user can prepare immobilized oligonucleotides comprising desired mRNA sequences from a target organism.
In some embodiments, the desired RNA may be contained in some of the immobilized oligonucleotides, and the complementary sequence of the desired RNA may be contained in other immobilized oligonucleotides.
E. Immobilized oligonucleotides comprising an adapter sequence and library fragments comprising an adapter sequence
In some embodiments, the solid support comprises an immobilized oligonucleotide comprising an adapter sequence. In some embodiments, the adaptor sequences contained in the immobilized oligonucleotides are solid support adaptor sequences. As used herein, "solid support adaptor sequence" refers to an adaptor sequence contained in an oligonucleotide immobilized to a solid support. In some embodiments, the solid support adapter sequences bind to library adapter sequences. As used herein, "library adaptor sequence" refers to an adaptor sequence incorporated into a library fragment, wherein the library adaptor sequence may bind to a solid support adaptor sequence. In some embodiments, the solid support adapter sequences may be used to immobilize the library fragments to the solid support, wherein such immobilization is not due to the cDNA sequences contained in the library fragments, but rather due to binding to the library adapters contained in the library fragments. In some embodiments, binding of the library adaptor sequences contained in the library fragments to the solid support adaptor sequences contained in the immobilized oligonucleotides is used to immobilize the library fragments to the solid support.
In some embodiments, the library adaptor sequences are incorporated into the library fragments during library preparation. In some embodiments, the library of fragments added to the solid support is prepared by a method comprising incorporating one or more library adaptors that specifically bind to solid support adaptor sequences comprising P5 (SEQ ID NO: 1132), P7 (SEQ ID NO: 1133), and/or their complements. Such methods for incorporating one or more library adaptors may be tagmentation, or fragmentation followed by adaptor ligation.
In some embodiments, the library adaptor sequences are incorporated into the library fragments after enrichment or depletion as described herein. In other words, enrichment or depletion can be performed, and then library adaptors can be added to the enriched or depleted library. In some embodiments, library adaptor sequences are added to the collected library fragments. In some embodiments, library adaptor sequences are added to the collected library fragments by ligation.
In some embodiments, the library fragments comprise library adaptors and the solid support comprises immobilized oligonucleotides comprising solid support adaptor sequences capable of binding to library adaptors.
In some embodiments, the solid support adapter sequence comprises the P5 sequence (SEQ ID NO: 1132), the P7 sequence (SEQ ID NO: 1133), and/or their complements. In some embodiments, the library adaptor sequence comprises a sequence complementary to a P5 sequence or a P7 sequence. In some embodiments, the library adaptor sequence comprises a P5 sequence or a P7 sequence.
In some embodiments, the solid support comprises an immobilized oligonucleotide comprising P5 and/or its complement. In some embodiments, the solid support comprises an immobilized oligonucleotide comprising P7 and/or its complement. In some embodiments, the solid support comprises more than one library of immobilized oligonucleotides, wherein one or more libraries may comprise immobilized oligonucleotides comprising P5 sequences, P7 sequences, and/or their complements.
In some embodiments, the library adaptor sequences contained in the library fragments specifically bind to solid support adaptor sequences comprising P5 (SEQ ID NO: 1132), P7 (SEQ ID NO: 1133), and/or their complements.
F. Adaptor complement sequences for binding to solid support adaptor sequences
In some embodiments, the adaptor complement sequence may be bound to a solid support adaptor sequence. As used herein, an "adapter complementary sequence" is an oligonucleotide that can bind to a solid support adapter sequence. In some embodiments, the solid support adapter sequence is single stranded and the adapter complement sequence is single stranded. In some embodiments, an adapter complement sequence that is wholly or partially complementary to a solid support adapter sequence is bound to the solid support adapter sequence. In some embodiments, binding of the adaptor complementary sequence to the solid support adaptor sequence is used to prevent binding of library adaptor sequences contained in the library fragments to the solid support adaptor sequence. In this way, a user can control when library fragments can bind to the solid support adapter sequences contained in the immobilized oligonucleotides. For example, a user may block binding of library adapter sequences (using adapter complementary sequences) to solid support adapter sequences during enrichment or depletion steps.
In some embodiments, the adaptor complementary sequence that binds to the solid support adaptor sequence generates a double-stranded immobilized oligonucleotide. In some embodiments, the solid support adapter sequences that bind to the adapter complement are not capable of binding to library adapters. In some embodiments, a double-stranded immobilized oligonucleotide comprising a solid support adapter sequence and an adapter complementary sequence is not capable of binding to a library adapter sequence.
In some embodiments, the binding of the adaptor complement sequence to the solid support adaptor sequence is reversible. In some embodiments, an increase in temperature or a denaturing agent may be used to denature the adaptor complement sequences from the solid support adaptor sequences. After denaturation of the adaptor complement, the solid support adaptor sequences contained in the immobilized oligonucleotides can be used to bind to the library adaptor sequences.
G. Solid support comprising more than one library of immobilized oligonucleotides
In some embodiments, the solid support comprises more than one library of immobilized oligonucleotides on its surface.
For example, the solid support may comprise a first library of immobilized oligonucleotides for consumption and a second library of immobilized oligonucleotides for enrichment. In some embodiments, a library of immobilized oligonucleotides may be blocked (such as with complementary nucleic acid sequences) to avoid binding to complementary library fragments during certain steps of the method using the solid support. For example, blocking may be used to inhibit binding of the P5/P7 sequences until the user wishes to perform bridge amplification after depletion/enrichment (as shown in FIG. 2).
In some embodiments, the solid support has two libraries of immobilized oligonucleotides on its surface, wherein a first library comprises immobilized oligonucleotides each comprising an unwanted RNA sequence, and a second library comprises immobilized oligonucleotides each comprising a solid support adaptor sequence capable of binding to a library adaptor contained in a library fragment (as shown in FIG. 2). In some embodiments, the solid support adaptor sequences are bound by an adaptor complement sequence, wherein the adaptor complement sequence can be denatured during the method to allow the solid support adaptor sequences to bind to library adaptors in the library fragments. Such solid supports can be used in methods for preparing a depleted library and amplifying the depleted library on the same solid support (such as described in Example 2).
In some embodiments, at least one unwanted RNA sequence has at least 90%, at least 95%, or at least 99% homology to the high abundance RNA sequence in the sample used to prepare the fragment library. In some embodiments, all unwanted sequences have at least 90%, at least 95%, or at least 99% homology to high abundance RNA sequences in the sample used to prepare the fragment library.
Methods for enriching or depleting library fragments using immobilized oligonucleotides
In some embodiments, the method selects cDNA library fragments from a cDNA fragment library prepared from RNA. Such selection may be to deplete unwanted library fragments by removing them, or such selection may be to enrich for wanted library fragments and collect them. In some embodiments, selecting includes consuming unwanted library fragments and enriching for wanted library fragments.
In some embodiments, a method of selecting cDNA library fragments from a library of cDNA fragments prepared from RNA comprises (a) preparing a solid support comprising a library of immobilized oligonucleotides, wherein each immobilized oligonucleotide in the library comprises a nucleic acid sequence corresponding to an RNA sequence or a complement thereof, (b) adding the library of fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow binding of the library fragments to the at least one immobilized oligonucleotide, and (c) collecting library fragments that are bound or not bound to the at least one immobilized oligonucleotide.
In some embodiments, the selecting is to consume unwanted cDNA library fragments, wherein the RNA sequences comprise unwanted RNA sequences, the unwanted library fragments comprise those prepared from unwanted RNA sequences, and the collecting comprises collecting library fragments that are not bound to the at least one immobilized oligonucleotide.
In some embodiments, the selecting is to enrich for desired cDNA library fragments, wherein the RNA sequence comprises a desired RNA sequence, the desired library fragments comprise those prepared from the desired RNA sequence, and the collecting comprises collecting library fragments bound to at least one immobilized oligonucleotide.
In some embodiments, the fragment library is depleted of unwanted cDNA library fragments, and then the collected library fragments that are not bound to the at least one immobilized oligonucleotide are enriched for the wanted cDNA library fragments.
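The selection logic of steps (a) to (c), including depletion followed by enrichment, can be sketched in Python. This is a simplified model rather than the claimed method: hybridization is approximated as the presence of a perfectly complementary site within a single-stranded fragment, and the sequences, probe lengths, and function names are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridize(fragments, immobilized_oligos):
    """Split fragments into (bound, unbound) for a set of immobilized oligos.

    A fragment counts as bound if it contains a site perfectly complementary
    to at least one immobilized oligo (a deliberately strict toy criterion).
    """
    sites = [revcomp(oligo) for oligo in immobilized_oligos]
    bound, unbound = [], []
    for frag in fragments:
        (bound if any(site in frag for site in sites) else unbound).append(frag)
    return bound, unbound


def deplete(fragments, unwanted_oligos):
    """Collect the fragments that do NOT bind the oligos (depletion)."""
    return hybridize(fragments, unwanted_oligos)[1]


def enrich(fragments, wanted_oligos):
    """Collect the fragments that DO bind the oligos (enrichment)."""
    return hybridize(fragments, wanted_oligos)[0]


if __name__ == "__main__":
    library = ["TTTACGTACGTTT",   # toy fragment standing in for an rRNA-derived fragment
               "GGGCCCAAATTT",    # toy fragment from a wanted transcript
               "CATCATCATCAT"]    # toy fragment from another transcript
    unwanted_oligos = [revcomp("ACGTACG")]
    wanted_oligos = [revcomp("CCCAAA")]

    depleted = deplete(library, unwanted_oligos)
    print(depleted)
    print(enrich(depleted, wanted_oligos))  # depletion followed by enrichment
```

Running the two steps in sequence mirrors the embodiment in which the unbound fraction from depletion is carried forward as the input to enrichment.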
In some embodiments, the library fragment is prepared from a sample comprising RNA. In some embodiments, the library fragments are prepared from cdnas prepared from RNA in the sample. Such samples may be of any type comprising RNA, and any method of cDNA and library preparation may be combined with the methods of the invention.
In some embodiments, the methods of the invention using solid supports reduce library preparation costs and operating time compared to prior art methods that consume unwanted RNA and then perform library preparation. In some embodiments, the methods of the invention reduce the degradation and/or loss of rare RNA transcripts that can be observed with RNase H-mediated depletion methods performed prior to library preparation. The methods described herein can be used to consume unwanted rRNA transcripts as well as unwanted non-rRNA transcripts (such as for consuming host transcriptomes in evaluating microbiome samples).
In some embodiments, the methods of depleting or enriching library fragments as described herein increase the yield of the resulting enriched or depleted library compared to methods in which the RNA is depleted or enriched prior to library preparation. This increase in yield may arise because library preparation itself may be limited when the RNA concentration of the starting RNA sample is very low. Performing the enrichment or depletion of the invention after library preparation can avoid or reduce the effect of low RNA concentration in the starting sample on library yield.
The depletion and enrichment methods of the present invention can be flexibly used with any upstream method of cDNA and library preparation that is user-preferred. In other words, a user may select the best cDNA preparation method and the best library preparation method for their particular sample, and then the user may consume or enrich the resulting library fragments using the methods described herein.
In some embodiments, the cDNA is prepared using a stranded method. In some embodiments, library preparation comprises incorporating one or more adaptor sequences into library fragments. Alternatively, one or more adaptor sequences may be incorporated into the fragments following the enrichment or depletion methods of the invention.
In some embodiments, single-stranded library fragments are prepared prior to adding the fragment library to the solid support. In this way, single stranded library fragments can be bound to single stranded immobilized oligonucleotides on the surface of a solid support.
In some embodiments, the method is performed after preparing a library from cDNA prepared from RNA. In some embodiments, the method does not require degradation of RNA.
In some embodiments, the library size and/or concentration of the library depleted of unwanted library fragments or enriched for wanted library fragments is assessed. Libraries depleted of unwanted library fragments or enriched for wanted library fragments can also be amplified and/or sequenced.
In some embodiments, the method includes the steps of depleting unwanted library fragments and enriching for the wanted library fragments. For example, a depletion flow cell can be used to deplete unwanted library fragments, and then an enrichment flow cell can be used to enrich the depleted library for the desired library fragments. This workflow involving depletion and enrichment may be particularly useful for generating data from relatively rare desired library fragments in a sample. For example, the data for library fragments generated from a particular microorganism contained in a metatranscriptomic sample can be improved by depletion followed by enrichment.
In some embodiments, the method comprises performing amplification and/or sequencing on the same flow cell used for depletion and/or enrichment. Such a method comprising amplification and/or sequencing on the same flow cell used for depletion and/or enrichment may be referred to as a "one-pot" or "single flow cell" method.
In some embodiments, amplification and sequencing are not performed on the flow cell used for depletion and/or enrichment. For example, the collected library may be amplified in a thermocycler and the amplified library fragments may then be sequenced on a different flow cell than that used for depletion and/or enrichment.
In some embodiments, amplification is performed on the flow cell used for depletion and/or enrichment, and then the amplified library fragments are sequenced on a different flow cell than the flow cell used for depletion and/or enrichment. Such methods may include bridge amplification (as described below) on the flow cell used for depletion and/or enrichment, followed by sequencing of the amplified library fragments on a different flow cell.
A. Depletion method
In some embodiments, after reverse transcription of the RNA transcripts to produce complementary DNA (cDNA) and preparation of library fragments from the cDNA, library fragments prepared from one or more abundant RNA transcripts, sequences thereof, or subsequences thereof are depleted from the sample using a plurality of immobilized oligonucleotides. In some embodiments, the library fragments are sequenced after depletion to generate a plurality of sequence reads. In some embodiments, the one or more abundant RNA transcripts may be ribosomal RNA transcripts and/or globin mRNA transcripts.
In some embodiments, a method of depleting unwanted cDNA library fragments from a cDNA fragment library prepared from RNA comprises (a) preparing a solid support comprising at least one immobilized oligonucleotide, wherein each immobilized oligonucleotide comprises a nucleic acid sequence corresponding to an unwanted RNA sequence or its complement, (b) adding the fragment library to the solid support and hybridizing the library fragments to the at least one immobilized oligonucleotide to allow binding of the unwanted library fragments to the at least one immobilized oligonucleotide, and (c) collecting library fragments that do not bind to the at least one immobilized oligonucleotide. In some embodiments, the solid support for depletion comprises a library of oligonucleotides. In some embodiments, the oligonucleotide library comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, or 1100 or more oligonucleotides.
In some embodiments, unwanted library fragments comprise those prepared from unwanted RNA sequences. In some embodiments, the library fragments hybridized to the immobilized oligonucleotides comprise library fragments prepared from unwanted RNA sequences. In some embodiments, the unwanted RNA sequences include rRNA.
In some embodiments, the collected library fragments comprise a library depleted of unwanted library fragments. In some embodiments, the collected library fragments are collected in a reservoir contained in a sequencer comprising a flow cell. The collected library fragments may then be removed from the reservoir, and the user may perform any other step of interest, such as quantification, amplification, quality control, or sequencing.
In some embodiments, all unwanted sequences have at least 90%, at least 95%, or at least 99% homology to high abundance RNA sequences in the sample used to prepare the fragment library.
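The homology thresholds above (at least 90%, 95%, or 99%) can be illustrated with a short sketch. The disclosure does not state how homology is computed; this example assumes, purely for illustration, ungapped percent identity over two sequences of equal length. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences (assumed metric)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 90.0) -> bool:
    """True if the two sequences reach the given percent identity threshold."""
    return percent_identity(a, b) >= threshold


# One mismatch in ten positions gives 90% identity.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

A practical implementation would more likely rely on a gapped alignment; the ungapped comparison is used here only to keep the example self-contained.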
In some embodiments, a library depleted of unwanted library fragments comprises fewer library fragments prepared from unwanted RNA sequences than the same library prior to addition to the solid support. In other words, the depletion method of the present invention can reduce the number of library fragments prepared from unwanted RNA sequences contained in the collected library.
1. Denaturation in the depletion method
In some embodiments, the depletion method further comprises the step of denaturing the one or more nucleic acids bound to the immobilized oligonucleotides.
In some embodiments, the method further comprises denaturing the library fragments hybridized to the immobilized oligonucleotides. In some embodiments, the denatured library fragments are unwanted library fragments. In some embodiments, unwanted library fragments are denatured from the immobilized oligonucleotides, and the unwanted library fragments are siphoned to a waste container.
In some embodiments, the method further comprises denaturing the adaptor complement that hybridizes to the immobilized oligonucleotide. In some embodiments, the adaptor complementary sequence is denatured from the immobilized oligonucleotide, and the adaptor complementary sequence is siphoned to a waste container.
In some embodiments, a single step denatures the adaptor complement and unwanted library fragments. In some embodiments, both the adaptor complement and the unwanted library fragments are siphoned to a waste container.
In some embodiments, denaturation is performed with a denaturant or heat. In some embodiments, the denaturant is NaOH.
In some embodiments, the method comprises repeating the steps. In some embodiments, the steps of adding the sample, collecting and denaturing are repeated, wherein the collected library fragments are added back to the solid support after denaturing. In this way, multiple rounds of unwanted library fragment depletion (by binding unwanted library fragments to immobilized oligonucleotides) can be performed. Multiple rounds of depletion can increase the percentage of unwanted fragments depleted from the library.
In some embodiments, the method further comprises adding the collected library fragments to a solid support after denaturing the hybridized library fragments and/or adaptor complementary sequences.
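The effect of the repeated depletion rounds described above can be sketched numerically. The sketch assumes that each round captures the same fraction of the unwanted fragments still present; this constant per-round capture efficiency is an illustrative assumption, not a value or model given in the disclosure.

```python
def fraction_remaining(capture_efficiency: float, rounds: int) -> float:
    """Fraction of unwanted fragments left after repeated depletion rounds.

    Assumes (hypothetically) that every round removes the same fraction
    `capture_efficiency` of the unwanted fragments still in the library.
    """
    if not 0.0 <= capture_efficiency <= 1.0:
        raise ValueError("capture_efficiency must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return (1.0 - capture_efficiency) ** rounds


def fraction_depleted(capture_efficiency: float, rounds: int) -> float:
    """Cumulative fraction of unwanted fragments removed after `rounds` rounds."""
    return 1.0 - fraction_remaining(capture_efficiency, rounds)


# With a hypothetical 50% capture per round, three rounds remove 87.5%.
after_three_rounds = fraction_depleted(0.5, 3)
```

Under this assumption each additional round removes a smaller absolute amount, so the benefit of further rounds diminishes as the unwanted fraction shrinks.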
2. Depletion of host RNA
In some embodiments, the depletion method is for depletion of library fragments prepared from host RNAs. In some embodiments, the host RNA is an unwanted RNA sequence and the non-host RNA is a wanted RNA sequence.
In some embodiments, unwanted library fragments hybridized to the immobilized oligonucleotides comprise library fragments prepared from host RNAs contained in a sample comprising host RNAs and non-host RNAs. In other words, the depletion method can be used to deplete library fragments made from host RNAs from a sample comprising library fragments made from host RNAs and library fragments made from non-host RNAs. Representative samples that may contain host RNA and non-host RNA (and are used for library preparation) include samples for evaluating a patient's microbiome or evaluating an infectious organism (such as a virus, fungus, or bacterium) from a patient's body fluid.
In some embodiments, the non-host RNA is microbial. In some embodiments, the microorganism is a bacterium, a virus, and/or a fungus. In some embodiments, the microorganism is a pathogen. In some embodiments, the microorganism is an organism in a host microbiome. In some embodiments, the host is a human.
B. Enrichment method
In some embodiments, a method of enriching a desired cDNA library fragment from a cDNA fragment library prepared from RNA comprises (a) preparing a solid support comprising at least one immobilized oligonucleotide, wherein each immobilized oligonucleotide comprises a nucleic acid sequence corresponding to the desired RNA sequence or its complement, (b) adding the fragment library to the solid support and hybridizing the library fragment to the at least one immobilized oligonucleotide to allow binding of the desired library fragment to the at least one immobilized oligonucleotide, and (c) collecting the library fragment bound to the at least one immobilized oligonucleotide. In some embodiments, the desired library fragments comprise those prepared from the desired RNA sequences.
In some embodiments, the solid support for enrichment comprises a pool of oligonucleotides. In some embodiments, the oligonucleotide library comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, or 1100 or more oligonucleotides.
In some embodiments, the desired RNA sequence has homology to the RNA sequence that the user wishes to study (i.e., the target RNA sequence). In some embodiments, at least one desired RNA sequence has at least 90%, at least 95%, or at least 99% homology to a target RNA sequence in a sample used to prepare the fragment library. In some embodiments, all desired RNA sequences have at least 90%, at least 95%, or at least 99% homology to the target RNA sequence in the sample used to prepare the fragment library. In some embodiments, the at least one desired RNA sequence is a target RNA sequence.
In some embodiments, the collected library fragments comprise a library enriched in the desired library fragments. In some embodiments, the library of fragments added to the solid support is prepared from RNA using a stranded method of cDNA preparation.
In some embodiments, the collection comprises denaturing the library fragments hybridized to the immobilized oligonucleotides, and then collecting the library enriched for the desired fragments in a reservoir included in a sequencer comprising a solid support. In other words, the library fragments bound to the immobilized oligonucleotides may comprise the desired library fragments, and these desired library fragments may be denatured and then collected.
In some embodiments, denaturation is performed with a denaturant or heat. In some embodiments, the denaturant is NaOH.
In some embodiments, the steps of adding the library, denaturing and collecting library fragments that are not bound to the solid support are repeated, wherein the collected library fragments that are not bound to the solid support are then added back to the solid support after denaturation. Multiple rounds of these steps can result in greater enrichment of the desired library fragments, as more undesired library fragments can be removed.
In some embodiments, the library enriched for the desired library fragment comprises a greater percentage of library fragments prepared from the desired RNA sequence than the library prior to addition to the solid support. This enrichment may be due to the removal of unwanted library fragments that do not bind to the immobilized oligonucleotides comprising the desired RNA sequences.
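The comparison described above, between the percentage of desired fragments before and after enrichment, can be expressed as a simple calculation. The counts below are invented for illustration and are not results from the disclosure; the function names are hypothetical.

```python
def percent_wanted(wanted_count: int, total_count: int) -> float:
    """Percentage of library fragments that derive from desired RNA sequences."""
    if total_count <= 0 or not 0 <= wanted_count <= total_count:
        raise ValueError("counts must satisfy 0 <= wanted_count <= total_count, total_count > 0")
    return 100.0 * wanted_count / total_count


def fold_enrichment(before_wanted, before_total, after_wanted, after_total) -> float:
    """Ratio of the desired-fragment percentage after enrichment to that before."""
    before = percent_wanted(before_wanted, before_total)
    if before == 0:
        raise ValueError("no desired fragments in the starting library")
    return percent_wanted(after_wanted, after_total) / before


# Hypothetical counts: 5 of 100 fragments desired before, 4 of 8 after.
fold = fold_enrichment(5, 100, 4, 8)
```

In these made-up numbers one desired fragment is lost during capture, yet the percentage of desired fragments still rises, because far more unwanted fragments are removed.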
Once the enriched library is prepared (i.e., the bound desired library fragments are denatured and collected), additional steps may be performed. In some embodiments, library sizes and/or concentrations of libraries enriched in desired library fragments are assessed. In some embodiments, the library enriched for the desired library fragment is sequenced. In some embodiments, the method further comprises amplifying the library enriched in the desired library fragment prior to sequencing.
C. Samples
The methods of the invention are not limited to a particular type of sample comprising RNA, and these methods may be used with libraries prepared from any sample comprising RNA. Described below are several exemplary types of samples comprising RNA, wherein sequencing of library fragments prepared from the RNA can be improved by enrichment or depletion.
In some embodiments, the sample comprises a microbial sample, a microbiome sample, a bacterial sample, a yeast sample, a plant sample, an animal sample, a patient sample, an epidemiological sample, an environmental sample, a soil sample, a water sample, a metatranscriptomic sample, or a combination thereof. In some embodiments, the sample comprises an organism of a species that is not predetermined, an unknown species, or a combination thereof. As used herein, a "non-predetermined species" means a species present in a sample that the user has not characterized in advance. For example, the spectrum of bacterial species present in a sample from soil or a gut microbiome may not be predetermined, even though the bacterial species later identified in the sample may be generally known in the art. As used herein, "unknown species" refers to a species that has not been previously characterized.
In some embodiments, the sample comprises organisms of at least two species.
1. Metatranscriptome and microbiome samples
In some embodiments, the methods are used to evaluate RNA from a metatranscriptomic sample. As used herein, a "metatranscriptomic sample" refers to a sample used to generate transcriptome information for culturable and non-culturable microorganisms by large-scale, high-throughput sequencing of transcripts from all microbial communities in a particular environmental sample. Metatranscriptomic sequencing allows the user to randomly sequence RNA to learn about complex microbial communities. Methods that avoid microbial culture can avoid the biases introduced by methods that rely on isolating and culturing individual bacteria.
In some embodiments, the metatranscriptomic sample is a "microbiome sample" from a patient. As used herein, a microbiome sample refers to microorganisms present in one or more parts of a patient's body.
In some embodiments, the patient is a human. In some embodiments, the microbiome sample is oral, vaginal, or from the intestinal tract. In some embodiments, the sample from the intestinal tract is a fecal sample. In some embodiments, the oral sample is a sample from the tongue.
In some embodiments, the patient is at least 12 months, at least 15 months, at least 24 months, or at least 36 months old. In some embodiments, the microbiome sample comprises at least one unwanted RNA molecule from the genus bacillus, the family trichomonadaceae, and/or the genus clostridium. In some embodiments, the microbiome sample is vaginal and comprises at least one unwanted RNA molecule from gardnerella, lactobacillus and/or euglehnia. In some embodiments, the microbiome sample is from the tongue and comprises at least one unwanted RNA molecule from the genus veillonella, rogowski, streptococcus and/or prasuvorexa.
The spectrum of species present in a sample from, for example, soil or a gut microbiome may not be predetermined. Furthermore, the sample may contain hundreds or possibly thousands of different species. Thus, depletion schemes designed for only two representative species may not be sufficient to meet the needs of the metatranscriptomics field. The methods described herein can be used with immobilized oligonucleotides designed to deplete abundant sequences (e.g., abundant transcripts such as rRNA and globin mRNA) from a sample (such as a complex sample, including a metatranscriptomic sample).
Metatranscriptomic analysis has many applications. In some embodiments, the user wants to evaluate the population of microorganisms in the patient because the specific bacteria contained in the patient's microbiome are associated with a positive or negative impact on the patient. For example, a user may want to evaluate the microbiome of a patient exhibiting symptoms of an overactive immune response. In some embodiments, the user may wish to use a metatranscriptomic analysis to assess the effect of a treatment on the microbiome of the patient.
The metatranscriptomic sample may comprise a wide range of organisms. In some embodiments, the immobilized oligonucleotides used in the methods of the invention are designed in an unbiased manner. In other words, the methods of the invention can be used to prepare enriched libraries from a wide range of organisms, including those that may not be identifiable, without biasing the library towards known organisms.
In some embodiments, the methods of the invention can be used to deplete known sequences (in which case the known sequences would be unwanted RNA sequences) from a metatranscriptomic sample to prepare libraries having a greater percentage of library fragments from unknown sequences. When a greater percentage of library fragments are from unknown sequences, the user can sequence these library fragments more deeply.
In some embodiments, the sample comprises an organism of a species that is not predetermined, an unknown or unidentified species, or a combination thereof. In some embodiments, the sample comprises organisms of about, at least, or up to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 species, or a number or range of species between any two of these values. The one or more enriched RNA transcripts may include RNA transcripts from organisms of about, at least, or up to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 species, or a number or range of species between any two of these values. The sample may comprise, comprise about, comprise at least, or comprise up to 1ng, 2ng, 3ng, 4ng, 5ng, 6ng, 7ng, 8ng, 9ng, 10ng, 20ng, 30ng, 40ng, 50ng, 60ng, 70ng, 80ng, 90ng, 100ng, 200ng, 300ng, 400ng, 500ng, 600ng, 700ng, 800ng, 900ng, or 1000ng of the RNA transcript.
2. Oncology samples
In some embodiments, the sample may be from a cancer patient (i.e., an oncology sample). For example, oncology samples may be used to assess changes in RNA expression in tumor cells, and possibly to monitor these changes over time or during treatment. In this case, the RNA associated with a tumor marker may be the desired RNA. In the methods of the invention, RNA from known tumor markers can be used as the desired RNA to design oligonucleotides for immobilization onto a solid support to enrich library fragments associated with cancer markers. Alternatively or in conjunction with the enrichment methods described herein, oncology samples may be depleted of rRNA and/or mRNA associated with other "housekeeping" genes that are not involved in tumorigenesis or progression.
D. Unwanted library fragments that function as carrier molecules
In some embodiments, unwanted RNA can function as a carrier nucleic acid. In some embodiments, unwanted RNAs are used as carrier molecules for other library fragments. In some embodiments, unwanted RNAs are used as carrier molecules for the desired library fragments.
It is well known that samples with low nucleic acid concentrations perform poorly in a variety of biochemical reactions, for example by giving limited percentage yields in purification methods (see, e.g., Higgins et al., Forensic Sci Med Pathol 10:56-61 (2014)). Low input concentrations may be associated with low library complexity and may lead to difficulties in cDNA conversion or other aspects of library construction. Thus, a "pre-depletion" method (such as a depletion method using RNase) may result in RNA samples that produce low library yields, which reduces downstream data quality (such as poor sequencing results). In some embodiments, the depletion methods described herein have the advantage that unwanted RNA acts as a carrier nucleic acid for the desired RNA during cDNA and library preparation. In some embodiments, the methods of the present invention that deplete library fragments increase the yield of desired library fragments compared to prior art methods that deplete RNA prior to library preparation.
In some embodiments, the yield of library fragments after consumption of unwanted library fragments by the methods of the invention is greater than the yield of library fragments after consumption of unwanted RNA followed by library preparation in prior art methods.
In some embodiments, when aliquots of the same sample containing RNA are used in the methods of the present invention and the prior art, the sequencing results after library preparation and consumption with the methods of the present invention (consumption of unwanted library fragments downstream after library preparation) can be improved compared to sequencing results using prior art methods (consumption of unwanted RNA upstream prior to library preparation).
Prior art depletion methods that rely on depleting unwanted RNA from samples prior to library preparation can show performance problems with low input (e.g., less than 100ng of starting RNA). As used herein, "starting RNA" refers to RNA that is present in a biological sample prior to depletion and library preparation. In some embodiments, when the starting sample contains less than 100ng RNA, the methods of the invention produce a sequenceable library after depletion. In some embodiments, the starting sample comprises less than 100ng RNA, less than 50ng RNA, less than 20ng RNA, less than 10ng RNA, or less than 1ng RNA.
E. Stranded cDNA preparation
Various methods are known in the art that allow sequencing data to identify the mRNA strand from which a library fragment originated. The use of such "stranded" methods may allow a user to use the sequence of the first strand of a cDNA to determine the sequence of the original mRNA strand (without the data being confounded by the second strand of the cDNA).
In some embodiments, the library of fragments added to the solid support is prepared from RNA using a stranded method of cDNA preparation.
In the method of the invention, using a stranded method of cDNA preparation means that most of the library fragments after the amplification step will correspond to the complementary sequences of the unwanted RNA. In this way, unwanted fragments after amplification can generally be depleted by the immobilized oligonucleotides corresponding to unwanted RNA.
In some embodiments, the user may prefer a non-stranded method of cDNA preparation. When cDNA is prepared by a non-stranded method and the user wants to deplete unwanted RNA, the user may prefer to immobilize oligonucleotides corresponding to both the unwanted RNA sequence and its complement to improve the efficiency of depletion. When cDNA is prepared by a non-stranded method and the user wants to enrich for desired RNA, the user may prefer to immobilize oligonucleotides corresponding to both the desired RNA sequence and its complement to increase the enrichment efficiency.
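The choice of which oligonucleotides to immobilize, as discussed above, can be sketched as follows. This is an illustration only: sequences are written in the DNA alphabet, "complement" is taken to mean the reverse complement, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def oligos_to_immobilize(target_sequences, stranded: bool):
    """Build the oligonucleotide set for a stranded or non-stranded library.

    For a stranded library, fragments derive mainly from one cDNA strand, so
    one orientation per target is kept. For a non-stranded library, both the
    target sequence and its reverse complement are included.
    """
    oligos = []
    for seq in target_sequences:
        oligos.append(seq)
        if not stranded:
            rc = reverse_complement(seq)
            if rc != seq:  # skip palindromic sequences to avoid duplicates
                oligos.append(rc)
    return oligos


stranded_set = oligos_to_immobilize(["AACG"], stranded=True)
non_stranded_set = oligos_to_immobilize(["AACG"], stranded=False)
```

The non-stranded set is up to twice as large, which is the cost of capturing fragments that derive from either cDNA strand.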
Exemplary methods of stranded cDNA preparation are outlined in "TruSeq Stranded Total RNA Reference Guide," Illumina (2017). Using reverse transcriptase in a first strand synthesis actinomycin D mixture to copy mRNA into the first strand of cDNA allows RNA-dependent synthesis and prevents unwanted DNA-dependent synthesis. The first strand synthesis actinomycin D mixture may therefore improve strand specificity when the first strand of the cDNA is generated. Second strand cDNA synthesis is performed using DNA polymerase I and RNase H in a second strand marking mix, in which dTTP has been replaced with dUTP. When a uracil-intolerant DNA polymerase is used, incorporation of dUTP in the second strand of the cDNA can quench amplification of that strand.
In some embodiments, the nucleoside triphosphates included in the composition for first strand cDNA synthesis comprise dCTP, dATP, dGTP and dTTP.
In some embodiments, dUTP is substituted for dTTP in the second strand cDNA synthesis reaction for strand specificity. In some embodiments, the composition for second strand cDNA synthesis comprises dCTP, dATP, dGTP and dUTP. In some embodiments, incorporation of dUTP in the second strand of the cDNA inhibits amplification of the second strand of the cDNA in an index PCR reaction during library preparation. In some embodiments, inhibiting amplification of the second strand of the cDNA allows for a strand-specific method.
In some embodiments, a uracil-intolerant DNA polymerase can be used in a stranded method of cDNA preparation that includes amplification. In some embodiments, when a uracil-intolerant DNA polymerase is used, the presence of uracil in the second strand of cDNA prepared from RNA in the sample can quench the amplification of the second strand. In this way, the amplified cDNA is limited to cDNA generated from the first strand of the cDNA prepared from the RNA contained in the sample.
In some embodiments, cDNA preparation is performed by a non-stranded method that does not preserve strand information from mRNA.
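The strand-marking principle described above can be illustrated with a toy model in which the second strand is written with U wherever dUTP replaced dTTP, and a uracil-intolerant polymerase is modeled as amplifying only strands that contain no U. The function names are hypothetical and the model ignores all reaction kinetics.

```python
# Complement map for second-strand synthesis with dUTP in place of dTTP:
# every position opposite an A in the first strand receives U instead of T.
_SECOND_STRAND_BASE = {"A": "U", "C": "G", "G": "C", "T": "A"}


def second_strand_with_dutp(first_strand: str) -> str:
    """Reverse complement of the first strand, with U written in place of T."""
    return "".join(_SECOND_STRAND_BASE[base] for base in reversed(first_strand))


def amplifiable_by_uracil_intolerant_polymerase(strand: str) -> bool:
    """Toy rule: a strand containing any uracil is not amplified."""
    return "U" not in strand


first = "ATGC"
second = second_strand_with_dutp(first)
amplified = [s for s in (first, second) if amplifiable_by_uracil_intolerant_polymerase(s)]
```

Note that in this simplified model a second strand is marked only if the first strand contains at least one A, which holds for any fragment of realistic length.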
F. Library preparation
Libraries prepared by any method can be used with the enrichment or depletion methods of the invention. In some embodiments, the library preparation method prepares double-stranded library fragments and denatures the double-stranded library fragments prior to addition to the solid support. In this way, a library fragment may be single stranded when it can hybridize to an immobilized oligonucleotide comprising a sequence that is wholly or partially complementary to the library fragment. Similarly, in some embodiments, the immobilized oligonucleotides are single stranded to allow hybridization and capture of complementary single stranded library fragments. In some embodiments, specific binding of the single stranded library fragment to the immobilized oligonucleotide results in a double stranded oligonucleotide. The immobilized oligonucleotides that specifically bind to the library fragments can bind with sufficiently high affinity to avoid denaturation of the double stranded oligonucleotides during standard washing steps. In this way, library fragments that specifically bind to the immobilized oligonucleotides can remain bound during the washing step and removal of unbound library fragments.
G. Library adaptor sequences
In some embodiments, one or more adaptor sequences are incorporated into the library fragments. Such adaptor sequences comprised in the library fragments may be referred to as "library adaptors". In some embodiments, a given library adaptor sequence may be universal, meaning that all or most of the library fragments comprise the library adaptor sequence.
In some embodiments, the library adaptor sequences are incorporated into the library fragments during library preparation. In some embodiments, the library adaptor sequences are incorporated into the library fragments after depletion or enrichment methods as described herein.
The adaptor sequence may be any adaptor sequence known in the art, and one skilled in the art may select an adaptor sequence based on any downstream method (such as sequencing) and what platform the downstream method will use (such as a particular sequencer). In addition, the library adaptor sequences may be designed to bind to solid support adaptor sequences contained in immobilized oligonucleotides on a solid support.
In some embodiments, the library fragment comprises one or more adaptor sequences in addition to the library adaptor sequences for binding to the solid support adaptors. In some embodiments, the adapter sequence comprises a primer sequence, an index tag sequence, a capture sequence, a barcode sequence, a cleavage sequence, or a sequencing-related sequence, or a combination thereof. As used herein, a sequencing-related sequence may be any sequence that is related to a subsequent sequencing step. Sequencing related sequences can be used to simplify downstream sequencing steps. For example, a sequencing related sequence may be a sequence that is incorporated by the step of ligating an adapter to a nucleic acid fragment. In some embodiments, the adapter sequence comprises a P5 (SEQ ID NO: 1132) or P7 sequence (SEQ ID NO: 1133) and/or their complements to facilitate binding to the flow cell in certain sequencing methods. The present disclosure is not limited to the type of adapter sequences that can be used, and the skilled artisan will recognize additional sequences that can be used for library preparation and next generation sequencing.
In some embodiments, the adapter comprises a region for cluster amplification. In some embodiments, the adapter comprises a region for priming a sequencing reaction.
In some embodiments, the adapter comprises an A14 primer binding sequence (SEQ ID NO: 1134). In some embodiments, the adapter comprises a B15 primer binding sequence (SEQ ID NO: 1135).
H. Amplification
In some embodiments, the methods described herein comprise one or more amplification steps. In some embodiments, the library fragments are amplified prior to addition to the solid support. In some embodiments, the library fragments are amplified following the enrichment or depletion methods described herein. In some embodiments, the amplification is performed by PCR amplification.
1. Amplification with uracil intolerant polymerase
In some embodiments, the library fragments are amplified prior to addition to the solid support. In some embodiments, amplification of library fragments is included in the library preparation method. For example, in a stranded method of cDNA preparation, amplification with a uracil-intolerant DNA polymerase is used to selectively amplify the cDNA strand prepared as the first strand from RNA (without amplifying the uracil-containing second strand). Thus, library fragments added to a solid support may comprise predominantly fragments comprising sequences complementary to the desired or undesired RNA. In other words, the library fragments may comprise mainly fragments prepared from the first strand of the cDNA. In some embodiments, more than 70%, more than 80%, more than 90%, or more than 95% of the library fragments comprise cDNA from the first strand of the cDNA.
2. Amplification after depletion or enrichment
In some embodiments, the collected library fragments are amplified after the depletion or enrichment method. In some embodiments, the depleted library is amplified. In some embodiments, the enriched library is amplified.
In some embodiments, amplification is performed with a thermocycler. In some embodiments, the amplification is performed by PCR amplification.
In some embodiments, the amplification is performed without PCR amplification. In some embodiments, the amplification does not require a thermocycler. In some embodiments, enrichment/depletion and amplification after enrichment/depletion are performed in a sequencer.
In some embodiments, the amplification is performed without a thermocycler. In some embodiments, the amplification is performed by bridge or cluster amplification. As shown in fig. 2, library fragments comprising library adaptor sequences may be bound to immobilized oligonucleotides comprising solid support adaptor sequences. This binding may allow standard bridge amplification. In some embodiments, bridge amplification is performed on the same solid support used for enrichment or depletion.
In some embodiments, bridge amplification is performed after adding the collected library fragments to a solid support and binding library adaptors contained in the collected library fragments to the solid support adaptor sequences, wherein the addition is performed after denaturing the hybridized library fragments and/or adaptor complementary sequences. Such a method is described in fig. 2 and example 2 herein.
In some embodiments, a method of amplifying a desired cDNA library fragment from a cDNA fragment library prepared from RNA comprises:
a. providing a solid support having two pools of immobilized oligonucleotides on its surface, wherein a first pool comprises immobilized oligonucleotides each comprising an unwanted RNA sequence, and a second pool comprises immobilized oligonucleotides each comprising a solid support adaptor sequence capable of binding to library adaptors comprised in library fragments, wherein adaptor complementary sequences are reversibly bound to the solid support adaptor sequences;
b. adding the library of fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow unwanted library fragments to bind to the first pool of oligonucleotides;
c. collecting library fragments that do not bind to the first pool of oligonucleotides to prepare collected library fragments;
d. denaturing and removing the library fragments bound to the first pool of oligonucleotides and the adaptor complementary sequences bound to the solid support adaptor sequences of the second pool of oligonucleotides;
e. adding the collected library fragments to the solid support and hybridizing them to at least one immobilized oligonucleotide to allow the desired library fragments to bind to the second pool of oligonucleotides; and
f. amplifying the bound desired library fragments by bridge amplification on the solid support.
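The order of steps a through f can be pictured with a toy model. The following Python sketch is only an illustration under stated assumptions: fragments are represented as dictionaries with a hypothetical `source` label, hybridization to the first pool is idealized as an exact label match, and bridge amplification is modeled as a fixed copy number per bound fragment. None of these names or numbers come from the disclosure.

```python
from collections import Counter

# Hypothetical labels for RNA species whose sequences are on the first pool.
UNWANTED = {"rRNA"}

def deplete_then_amplify(fragments, copies_per_cluster=1000):
    """Toy model of steps a-f; hybridization is idealized as label matching."""
    # Steps b-c: unwanted fragments bind the first pool; the rest are collected.
    bound = [f for f in fragments if f["source"] in UNWANTED]
    collected = [f for f in fragments if f["source"] not in UNWANTED]
    # Step d: denaturing removes the bound fragments (only counted here).
    removed = len(bound)
    # Steps e-f: collected fragments bind the second pool and are amplified,
    # modeled as an assumed fixed copy number per cluster.
    clusters = Counter({f["id"]: copies_per_cluster for f in collected})
    return removed, clusters

library = [
    {"id": "frag1", "source": "rRNA"},
    {"id": "frag2", "source": "mRNA"},
    {"id": "frag3", "source": "rRNA"},
    {"id": "frag4", "source": "lncRNA"},
]
removed, clusters = deplete_then_amplify(library)
print(removed, sorted(clusters))
```

In this model the same support is reused: the first pool removes unwanted fragments, and only after denaturation does the second pool capture the collected fragments for amplification.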
For example, in some embodiments, immobilized DNA fragments can be amplified using a cluster amplification method, as exemplified by the disclosures of U.S. Pat. Nos. 7,985,565 and 7,115,400, the contents of each of which are incorporated herein by reference in their entirety. The incorporated materials of U.S. Pat. Nos. 7,985,565 and 7,115,400 describe methods of solid-phase nucleic acid amplification that allow the amplification products to be immobilized on a solid support to form arrays of clusters or "colonies" of immobilized nucleic acid molecules. Each cluster or colony on such an array is formed from a plurality of identical immobilized polynucleotide strands and a plurality of identical immobilized complementary polynucleotide strands. The arrays so formed are generally referred to herein as "clustered arrays". The products of solid-phase amplification reactions, such as those described in U.S. Pat. Nos. 7,985,565 and 7,115,400, are so-called "bridged" structures formed by annealing pairs of immobilized polynucleotide strands and immobilized complementary strands (both strands, in some embodiments, being immobilized on the solid support via covalent attachment at the 5' end). The cluster amplification method is an example of a method in which immobilized library fragments are used to generate immobilized amplicons.
I. Sequencing of depleted or enriched libraries
In some embodiments, the library depleted of unwanted library fragments is sequenced. In some embodiments, the library enriched for the desired library fragments is sequenced.
Following the depletion or enrichment methods described herein, the library collected may contain less than 15%, 13%, 11%, 9%, 7%, 5%, 3%, 2%, or 1% or any range therebetween of unwanted RNA species. In some embodiments, the library collected after enrichment or depletion comprises at least 99%, 98%, 97%, 95%, 93%, 91%, 89% or 87% or any range therebetween of the desired RNA. In other words, the library used for sequencing after enrichment or depletion mainly comprises library fragments prepared from the target RNA.
In some embodiments, the sequencing data generated after depletion of unwanted library fragments has fewer sequences corresponding to unwanted RNA than data from the same library sequenced without depletion.
In some embodiments, the sequencing data generated after enrichment of the desired library fragment has a higher percentage of sequences corresponding to the desired RNA than the same library that was sequenced without enrichment.
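The comparison between a depleted library and the same library sequenced without depletion reduces to a ratio of read counts. The following is a minimal sketch, assuming reads have already been assigned to RNA classes; the class names and the read counts are made-up illustrative values, not data from the disclosure.

```python
def unwanted_fraction(read_counts, unwanted=("rRNA",)):
    """Return the fraction of reads assigned to unwanted RNA classes."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads to evaluate")
    return sum(read_counts.get(name, 0) for name in unwanted) / total

# Illustrative, invented read counts per RNA class.
undepleted = {"rRNA": 8000, "mRNA": 1500, "other": 500}
depleted = {"rRNA": 300, "mRNA": 7200, "other": 2500}

before = unwanted_fraction(undepleted)
after = unwanted_fraction(depleted)
print(f"unwanted before: {before:.1%}, after: {after:.1%}")
```

With these invented counts the depleted library falls below the 15% threshold for unwanted RNA species mentioned above, and the desired fraction is simply one minus the unwanted fraction.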
The depleted or enriched library prepared by the methods of the invention can be used with any type of RNA sequencing, such as RNA-seq, small RNA sequencing, long non-coding RNA (lncRNA) sequencing, circular RNA (circRNA) sequencing, targeted RNA sequencing, exosomal RNA sequencing, and degradome sequencing.
For example, for circRNA sequencing, a user can deplete linear RNA by digesting it, then prepare a library by the methods described herein and deplete rRNA. Thus, the methods of the invention can be readily combined with other steps in known protocols associated with RNA sequencing.
The depleted or enriched library may be sequenced according to any suitable sequencing method, such as direct sequencing, including sequencing-by-synthesis, sequencing-by-ligation, sequencing-by-hybridization, nanopore sequencing, and the like. In some embodiments, the library is sequenced on a solid support used for depletion or enrichment. In some embodiments, the solid support used for sequencing is the same as the solid support on which enrichment or depletion is performed. In some embodiments, the solid support used for sequencing is the same as the solid support on which amplification occurs after enrichment or depletion.
A flow cell provides a convenient solid support for sequencing. One or more library fragments on the solid support (or amplicons generated from the library fragments) may be subjected to SBS or another detection technique that involves repeated delivery of reagents in cycles. For example, to initiate a first SBS cycle, one or more labeled nucleotides, DNA polymerase, etc., may be flowed into/through a flow cell containing one or more amplified nucleic acid molecules. Those sites where primer extension causes a labeled nucleotide to be incorporated can be detected. Optionally, the nucleotides may also include a reversible termination property that terminates further primer extension once a nucleotide has been added to the primer. For example, a nucleotide analog with a reversible terminator moiety may be added to the primer such that subsequent extension cannot occur until a deblocking agent is delivered to remove the moiety. Thus, for embodiments using reversible termination, a deblocking reagent may be delivered to the flow cell (before or after detection occurs). Washes may be performed between the various delivery steps. The cycle may then be repeated n times to extend the primer by n nucleotides, thereby detecting a sequence of length n. Exemplary SBS procedures, fluidic systems, and detection platforms that can be readily adapted for use with amplicons produced by the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008), WO 04/018497, US 7,057,026, WO 91/06678, WO 07/123744, US 7,329,492, US 7,211,414, US 7,315,019, US 7,405,281, and US 2008/0108082, each of which is incorporated herein by reference.
Sequencing and optionally amplification on the same solid support used for depletion and/or enrichment can reduce the number of manual steps by the user and the loss of sample associated with transferring the sample from one solid support to another.
Methods for depleting rRNA from a patient microbiome sample using DNA probes and RNase
Creating a nucleic acid library from RNA for sequencing is often difficult because large numbers of unwanted transcripts, such as ribosomal RNA (rRNA), can dominate the sample and mask the target RNA sequences. Analysis of the transcriptome may be compromised if unwanted transcripts are not removed. Thus, depleting unwanted RNA from a microbiome sample comprising nucleic acids prior to analysis (such as sequencing) or other downstream applications can increase the specificity and accuracy of the desired analysis. An exemplary method of depleting rRNA is described in WO 202013304 A1, which is incorporated herein in its entirety.
The present disclosure describes methods and materials that can be used to deplete rRNA species from nucleic acid samples so that important RNAs can be studied and not lost in numerous unwanted RNA transcripts. The nucleic acid sample may be any nucleic acid sample described herein, such as a metatranscriptomic sample.
The microbiome sample may contain RNA or DNA or both, including both unwanted (off-target or undesired) and wanted (target) nucleic acids. The DNA or RNA in the sample may be unmodified or modified and includes, but is not limited to, single- or double-stranded DNA or RNA or derivatives thereof (e.g., some regions of the DNA or RNA are double-stranded while other regions are single-stranded), and the like. The microbiome sample may also contain cells from the host. For example, an intestinal microbiome sample from a human patient (i.e., a "host") can comprise microorganisms present in the intestinal tract as well as host cells, such that the sample comprises nucleic acids from both the host and the microorganisms.
The microbiome sample can include any chemically, enzymatically, and/or metabolically modified form of nucleic acid, any unmodified form of nucleic acid, or a combination thereof. The microbiome sample can contain both desired and unwanted nucleic acids. Unwanted nucleic acids include those from the host and rRNA from the microorganism. The nucleic acids needed or desired are those that are the basis or focus of the study, i.e., the target nucleic acids. For example, a researcher may wish to study mRNA expression analysis from microorganisms contained in a microbiome, where rRNA from the microorganisms would be considered an unwanted nucleic acid, while other RNAs from the microorganisms are target nucleic acids. In some embodiments, the unwanted RNA is rRNA.
For example, a microbiome sample may contain desired RNAs (such as mRNA) from a microorganism, while also containing unwanted rRNA. General methods for extracting RNA from samples such as blood, tissue, cells, fixed tissue, etc. are well known in the art, as found in Current Protocols in Molecular Biology (John Wiley and Sons) and numerous handbooks of molecular biology methods. RNA isolation can be performed with commercially available purification kits, such as Qiagen RNeasy Mini columns, the MasterPure Complete DNA and RNA Purification Kit (Epicentre), or the Paraffin Block RNA Isolation Kit (Ambion), or with RNA-Stat-60 (Tel-Test) or cesium chloride density gradient centrifugation. The present methods are not limited by the method used to isolate RNA from the sample prior to RNA depletion.
In some embodiments, the method comprises using probes to unwanted RNAs of the host and/or unwanted RNAs of the microorganisms. For example, the methods described herein can include using probes to non-microbial RNAs (such as the DP1 probe set described herein) and probes to microbial rRNA (such as the HMv1 and/or HMv2 probe sets described herein), as described in example 5.
In some embodiments, a method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, comprises (a) sequencing a plurality of probe development microbiome samples to determine from the sequencing data at least one unwanted RNA molecule comprising a bacterial ribosomal RNA (rRNA) sequence; (b) preparing a probe set comprising at least one DNA probe complementary to the at least one unwanted RNA molecule; (c) contacting the patient microbiome sample with the probe set to produce DNA:RNA hybrids; and (d) contacting the DNA:RNA hybrids with a ribonuclease that degrades the RNA from the DNA:RNA hybrids, thereby degrading said unwanted RNA molecules in said patient microbiome sample to form a degradation mixture.
In some embodiments, a method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, comprises:
(a) contacting the patient microbiome sample with a probe set comprising at least one DNA probe comprising a sequence of any one of SEQ ID NOs: 1-1131 to produce DNA:RNA hybrids; and
(b) contacting the DNA:RNA hybrids with a ribonuclease that degrades the RNA from the DNA:RNA hybrids, thereby degrading the unwanted RNA molecules in the patient microbiome sample to form a degradation mixture.
In some embodiments, the method further comprises (a) degrading any remaining DNA probes by contacting the degradation mixture with a DNA-digesting enzyme, optionally wherein the DNA-digesting enzyme is DNase I, to form a DNA degradation mixture; and (b) separating the degraded RNA from the degradation mixture or the DNA degradation mixture.
In some embodiments, the addition of a destabilizing agent (such as formamide) aids in the removal of some unwanted RNAs that are more difficult to deplete in the absence of formamide. In some embodiments, formamide can be used to relax structural barriers in unwanted RNAs (such as rRNA) so that DNA probes can bind more efficiently. Furthermore, the addition of formamide has shown the additional benefit of potentially improving detection of some non-targeted transcripts by denaturing/relaxing regions of RNA that have, for example, very stable secondary or tertiary structures and that often perform poorly in other library preparation methods.
In some embodiments, contacting with the probe set comprises treating the nucleic acid sample with a destabilizing agent. In some embodiments, the destabilizing agent is heat and/or a nucleic acid destabilizing chemical. In some embodiments, the nucleic acid destabilizing chemical comprises betaine, DMSO, formamide, glycerol, or derivatives thereof, or mixtures thereof. In some embodiments, the nucleic acid destabilizing chemical comprises formamide. In some embodiments, the formamide is present at a concentration of about 10% to 45% by volume during contact with the probe set. In some embodiments, treating the sample with heat comprises applying heat above the melting temperature of at least one DNA:RNA hybrid. In some embodiments, the ribonuclease is RNase H or Hybridase.
In some embodiments, unwanted RNA is converted to DNA:RNA hybrids by hybridizing partially or fully complementary DNA probes to the unwanted RNA molecules. Methods for hybridizing nucleic acid probes to nucleic acids are well established; whether a probe is partially or fully complementary to its partner sequence, the fact that a DNA probe remains hybridized to the unwanted RNA species after washing and other manipulations of the sample indicates that the probe may be used in the methods of the present disclosure. If the target used in the study is RNA, DNA may also be considered an unwanted nucleic acid, in which case DNA may also be removed by depletion.
In some embodiments, the RNA sample is denatured in the presence of the DNA probes. In some embodiments, the DNA probes are added to a denatured RNA sample (denatured at 95 °C for 2 minutes), and the reaction is then cooled to 37 °C for 15-30 minutes so that the DNA probes hybridize to their corresponding target RNA sequences, thereby producing DNA:RNA hybrid molecules.
In some embodiments, contacting with the probe set comprises treating the nucleic acid sample with a destabilizing agent. In some embodiments, the destabilizing agent is heat or a nucleic acid destabilizing chemical. In some embodiments, the nucleic acid destabilizing chemical is betaine, DMSO, formamide, glycerol, or a derivative thereof, or a mixture thereof. In some embodiments, the nucleic acid destabilizing chemical is formamide or a derivative thereof, optionally wherein the formamide or derivative thereof is present at a concentration of about 10% to 45% of the total hybridization reaction volume. In some embodiments, treating the sample with heat comprises applying heat above the melting temperature of at least one DNA:RNA hybrid.
In some embodiments, formamide is added to the hybridization reaction regardless of the source of the RNA sample (e.g., human, mouse, rat, etc.). For example, in some embodiments, hybridization to the DNA probes is performed in the presence of at least 3%, 5%, 10%, 20%, 25%, 30%, 35%, 40%, or 45% formamide. In one embodiment, the hybridization reaction for RNA depletion includes about 25% to 45% formamide by volume.
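The formamide percentages above are volume fractions of the final hybridization reaction, so the volume to add follows directly from the total reaction volume. The following is a minimal arithmetic sketch; the function name and the 20 µL reaction volume are hypothetical and chosen only for illustration.

```python
def formamide_volume_ul(total_volume_ul: float, percent_v_v: float) -> float:
    """Volume of formamide (uL) giving the target % (v/v) in the final reaction."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    if not 0 <= percent_v_v <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return total_volume_ul * percent_v_v / 100.0

# Hypothetical 20 uL hybridization reaction; percentages from the ranges above.
for pct in (10, 25, 45):
    print(f"{pct}% v/v -> {formamide_volume_ul(20.0, pct):.1f} uL formamide")
```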
After the hybridization reaction, a ribonuclease that degrades RNA from DNA:RNA hybrids is added to the reaction. In some embodiments, the ribonuclease is RNase H or Hybridase. RNase H (NEB) and Hybridase (Lucigen) are examples of enzymes that degrade RNA from DNA:RNA hybrids. Degradation by a ribonuclease (such as RNase H or Hybridase) breaks the RNA into small molecules that can then be removed. For example, RNase H is reported to digest RNA from DNA:RNA hybrids approximately every 7 to 21 bases (Schultz et al., J. Biol. Chem. 2006, 281:1943-1955; Champoux and Schultz, FEBS J. 2009, 276:1506-1516). In some embodiments, digestion of the RNA of the DNA:RNA hybrids can be performed at 37 °C for about 30 minutes.
In some embodiments, after digestion of the DNA:RNA hybrid molecules, the remaining DNA probes and any unwanted DNA in the nucleic acid sample are degraded. Thus, in some embodiments, the method comprises contacting the ribonuclease-degraded mixture with a DNA-digesting enzyme, thereby degrading the DNA in the mixture. In some embodiments, the digested sample is exposed to a DNA-digesting enzyme that degrades the DNA probes, such as DNase I. The DNase digestion reaction is incubated at, for example, 37 °C for 30 minutes, after which the DNase may be denatured at 75 °C for a period of time necessary to denature the DNase, for example, up to 20 minutes.
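The temperatures and incubation times given above for denaturation, hybridization, ribonuclease digestion, and DNase treatment can be collected into a single schedule. The following sketch is one way to tabulate them and is not a validated protocol: the `Step` type and step names are hypothetical, the lower bound of the DNase inactivation step is set to 0 because the text gives only "up to 20 minutes", and ramp times and bead cleanup are ignored.

```python
from typing import NamedTuple

class Step(NamedTuple):
    name: str
    temp_c: int
    min_minutes: int
    max_minutes: int

# Values taken from the example conditions in the text; names are illustrative.
PROTOCOL = [
    Step("denature RNA", 95, 2, 2),
    Step("hybridize DNA probes", 37, 15, 30),
    Step("RNase H digestion", 37, 30, 30),
    Step("DNase I digestion", 37, 30, 30),
    Step("DNase heat inactivation", 75, 0, 20),  # "up to 20 minutes"
]

def total_minutes(steps):
    """Return (shortest, longest) total incubation time, excluding ramp times."""
    shortest = sum(s.min_minutes for s in steps)
    longest = sum(s.max_minutes for s in steps)
    return shortest, longest

print(total_minutes(PROTOCOL))
```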
In some embodiments, the depletion method comprises separating the degraded RNA from the degradation mixture. In some embodiments, separating includes purifying target RNA from degraded RNA (and degraded DNA, if present) using, for example, a nucleic acid purification medium such as RNA capture beads, for example RNAClean XP beads (Beckman Coulter). Thus, in some embodiments, after enzymatic digestion, the target RNA can be enriched by retaining the desired, longer RNA targets while removing degradation products. Suitable enrichment methods include treating the degradation mixture with magnetic beads, spin columns, or the like that bind RNA targets of the desired fragment size. In some embodiments, magnetic beads (such as AMPure XP beads, SPRIselect beads, or RNAClean XP beads (Beckman Coulter)) may be used, provided that the beads are free of RNase (e.g., quality controlled to be RNase-free). These beads offer different size selection options for nucleic acid binding; e.g., RNAClean XP beads target nucleic acid fragments of 100 nucleotides or longer, and SPRIselect beads target nucleic acid fragments of 150 to 800 nucleotides, without targeting shorter nucleic acid sequences (such as the degraded RNA and DNA resulting from RNase H and DNase digestion). If mRNA is the target RNA to be studied, the mRNA can be further enriched by capture using, for example, beads containing oligo(dT) sequences that capture the polyadenylated tail of the mRNA. mRNA capture methods are well known to the skilled artisan.
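The effect of size selection on a digested sample can be sketched as a simple length filter. The following is an idealized illustration only: real bead chemistries have gradual rather than hard cutoffs, the function name is hypothetical, and the fragment lengths are invented. The thresholds (100 nucleotides or longer; 150 to 800 nucleotides) are the ones stated above.

```python
def retained_by_beads(lengths_nt, min_len, max_len=None):
    """Keep fragment lengths within [min_len, max_len]; hard cutoffs are an idealization."""
    return [
        n for n in lengths_nt
        if n >= min_len and (max_len is None or n <= max_len)
    ]

# Invented fragment lengths: short degradation products plus longer target RNA.
lengths = [15, 40, 120, 300, 950, 2000]

rnaclean_like = retained_by_beads(lengths, min_len=100)
spriselect_like = retained_by_beads(lengths, min_len=150, max_len=800)
print(rnaclean_like, spriselect_like)
```

In both cases the short fragments that stand in for RNase H and DNase digestion products are excluded, which is the purpose of the separation step.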
Once the target RNA has been purified from the reaction components, including undesirable degraded nucleic acids, additional sample manipulations can be performed. In some embodiments, the enriched target total RNA is carried into a library preparation workflow typical for subsequent sequencing on, for example, an Illumina sequencer. However, it should be understood that such workflows are merely exemplary, and the skilled artisan will appreciate that the enriched RNA may be used, directly or after additional operations such as converting RNA to cDNA using established and understood protocols, in a variety of additional applications such as PCR, qPCR, microarray analysis, and the like.
The RNA depletion methods described herein can produce a sample enriched in target RNA molecules. For example, the methods described herein result in a depleted RNA sample comprising less than 15%, 13%, 11%, 9%, 7%, 5%, 3%, 2%, or 1%, or any range therebetween, of unwanted RNA species. The enriched RNA sample then comprises at least 99%, 98%, 97%, 95%, 93%, 91%, 89%, or 87%, or any range therebetween, of target total RNA. Once the sample is enriched, it may be used for library preparation or other downstream operations.
In some embodiments, the DNA probes do not hybridize to the entire continuous length of the RNA species to be depleted. In some embodiments, the full-length sequence of an RNA species targeted for depletion need not be targeted with a full-length DNA probe or with a probe set tiled contiguously across the entire RNA sequence. In some embodiments, the DNA probes described herein leave gaps such that the DNA:RNA hybrids formed are discontinuous. In some embodiments, gaps of at least 5 nucleotides, 10 nucleotides, 15 nucleotides, or 20 nucleotides between the DNA:RNA hybrids provide efficient RNA depletion. Furthermore, probe sets that include gaps can hybridize to unwanted RNA more efficiently, because probes that cover the entire RNA sequence targeted for depletion, or probes that overlap one another, can interfere with hybridization of adjacent probes.
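Gapped tiling of this kind can be illustrated by computing probe windows along a target. The following is a sketch under stated assumptions, not the probe design procedure of the disclosure: the function name, the 300-nucleotide target, and the 50-nucleotide probe length are hypothetical, while the 15-nucleotide gap is one of the gap sizes listed above.

```python
def gapped_probe_windows(target_len, probe_len, gap):
    """Return half-open (start, end) windows for probes separated by `gap` nt."""
    if probe_len <= 0 or gap < 0:
        raise ValueError("probe_len must be positive and gap non-negative")
    windows = []
    start = 0
    # Place probes left to right; stop when a full probe no longer fits.
    while start + probe_len <= target_len:
        windows.append((start, start + probe_len))
        start += probe_len + gap
    return windows

windows = gapped_probe_windows(target_len=300, probe_len=50, gap=15)
covered = sum(end - start for start, end in windows)
print(windows, covered / 300)
```

Under these assumed numbers four probes cover two thirds of the target, leaving uncovered stretches between adjacent DNA:RNA hybrids rather than a contiguous tiling.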
In some embodiments, the at least one DNA probe comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
In some embodiments, the at least one DNA probe comprises 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs: 1-1131 or their complements.
In some embodiments, the at least one DNA probe comprises at least one HMv1 sequence selected from SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
In some embodiments, the at least one DNA probe comprises 100 or more, 500 or more, or 1000 or more sequences selected from SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
In some embodiments, the at least one DNA probe comprises sequences comprising SEQ ID NOs: 1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1058, 1060-1062, 1064-1067, 1069-1075, 1080-1094, 1096, 1099-1105, 1107-1110, 1112, 1113, 1115, 1116, 1118-1126, 1129, and 1130.
In some embodiments, the at least one DNA probe further comprises at least one HMv2 sequence selected from SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131. In some embodiments, the at least one DNA probe comprises 10 or more, 20 or more, or 30 or more sequences selected from SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131. In some embodiments, the at least one DNA probe comprises sequences comprising SEQ ID NOs: 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
In some embodiments, the at least one DNA probe comprises at least one HMv1 sequence or HMv2 sequence selected from SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
In some embodiments, the at least one DNA probe comprises 100 or more, 500 or more, or 1000 or more sequences selected from SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
In some embodiments, the at least one DNA probe comprises sequences comprising each of SEQ ID NOs: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
In some embodiments, the at least one DNA probe further comprises at least one DP1 sequence comprising a sequence set forth in SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127. In some embodiments, at least one DNA probe comprises 10 or more, 20 or more, or 30 or more sequences comprising at least one of SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127. In some embodiments, at least one DNA probe comprises a sequence comprising each of SEQ ID NO: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
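The SEQ ID NO lists in the embodiments above are written as comma-separated numbers and ranges. As an illustration only, and not as part of the disclosed methods, the following Python sketch shows one way such a list could be expanded into an explicit set of identifiers so that two lists can be checked for overlap. The function name `expand_seq_ids` is hypothetical, and the short lists in the usage example are excerpts chosen for brevity rather than the full lists.

```python
import re


def expand_seq_ids(text: str) -> set[int]:
    """Expand a string such as '1-3, 5, and 7-8' into a set of integers.

    Hypothetical helper for illustration; it is not part of the disclosure.
    """
    ids: set[int] = set()
    # Each match is a number optionally followed by '-<number>'.
    for low, high in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", text):
        first = int(low)
        last = int(high) if high else first
        if last < first:
            # A descending range is treated as a likely transcription error.
            raise ValueError(f"descending range {low}-{high}")
        ids.update(range(first, last + 1))
    return ids


# Usage with short excerpts of the two kinds of list described above.
hm_excerpt = expand_seq_ids("1-10, 12-19, 21, 22")
dp1_excerpt = expand_seq_ids("11, 20, 23")
print(hm_excerpt.isdisjoint(dp1_excerpt))
print(sorted(hm_excerpt | dp1_excerpt) == list(range(1, 24)))
```

Rejecting descending ranges is a design choice of this sketch: it makes a mistyped range fail loudly instead of silently expanding to an empty set.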
In some embodiments, the method depletes 70% or more, 80% or more, 90% or more, or 95% or more of the bacterial rRNA contained in the microbiome sample.
A. Kits and compositions
In some embodiments, at least one probe is contained in a kit or composition. The at least one probe may be any combination of probes disclosed herein.
In some embodiments, a composition comprising a set of probes comprises at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131; and a ribonuclease capable of degrading RNA in DNA:RNA hybrids. In some embodiments, the ribonuclease is RNase H.
In some embodiments, a kit comprising a set of probes comprises at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs: 1-1131; and a ribonuclease capable of degrading RNA in DNA:RNA hybrids. In some embodiments, the kit comprises a probe set comprising at least one DNA probe comprising at least one of SEQ ID NOs: 1-1131; a ribonuclease; a DNase; and RNA purification beads. In some embodiments, the ribonuclease is RNase H.
In some embodiments, the kit further comprises an RNA depletion buffer, a probe depletion buffer, and a probe removal buffer. In some embodiments, the kit further comprises a nucleic acid destabilizing chemical. In some embodiments, the nucleic acid destabilizing chemical comprises betaine, DMSO, formamide, glycerol, or derivatives thereof, or mixtures thereof. In some embodiments, the nucleic acid destabilizing chemical comprises formamide.
Examples
Example 1. rRNA depletion method using a flow cell
An rRNA depletion method followed by amplification in a thermocycler can be performed. The method would use a flow cell of the kind currently used for sequencing, with an inlet port through which the sequencer's fluidic system pumps buffers and reagents onto the flow cell and siphons them to a waste container. As with flow cells currently used for sequencing, oligonucleotides would be tethered (i.e., immobilized) to the surface of the flow cell, and these immobilized oligonucleotides would comprise rRNA sequences. The user loads an RNA library (i.e., library fragments prepared from cDNA prepared from RNA) onto the sequencer platform or inside the sequencer cooler so that the fluidic system loads the library onto the flow cell. The user may use a commercially available stranded cDNA preparation method, such as the method described in "TruSeq Stranded Total RNA Reference Guide," Illumina, 2017.
FIG. 1 outlines a representative depletion method. As the library molecules flow in solution, library fragments generated from rRNA transcripts hybridize to complementary sequences attached to the flow cell, while library fragments generated from non-rRNA transcripts continue to flow unimpeded to a reservoir for collection. After the hybridization step is complete, the user discards the flow cell and collects the siphoned non-rRNA library fragments for PCR amplification, clean-up, quantification, quality control, and sequencing.
This method would take advantage of current flow cell and sequencer capabilities to provide a user-friendly way to deplete unwanted library fragments, such as those prepared from rRNA.
Example 2. Depletion and bridge amplification on the same flow cell
The method can also be designed to deplete library fragments prepared from rRNA and amplify library fragments prepared from non-rRNA on the same solid support. Such a solid support, for example a flow cell, will comprise a pool of immobilized oligonucleotides comprising rRNA sequences. The solid support will also comprise a second pool of oligonucleotides comprising double-stranded P5 and/or P7 oligonucleotides immobilized on its surface. The double-stranded P5 and/or P7 oligonucleotides will comprise an adaptor complementary sequence, which is an oligonucleotide that binds reversibly to the P5 and/or P7 adaptor sequence (i.e., the solid support adaptor sequence).
A representative method is shown in FIG. 2. After cDNA is prepared from a sample comprising RNA, library fragments can be prepared by standard methods. These library fragments may be prepared by incorporating library adaptor sequences that bind to P5 and/or P7. Library fragments generated from rRNA transcripts will bind to the surface of the flow cell by hybridizing to the immobilized oligonucleotides comprising rRNA sequences, whereas library fragments prepared from non-rRNA transcripts will flow unimpeded and be siphoned into a reservoir for temporary storage.
After this step, a denaturing reagent (such as NaOH) will be pumped into the flow cell, causing the hybridized library fragments prepared from rRNA and the untethered strands of the double-stranded P5 and/or P7 oligonucleotides to dissociate from the flow cell and pass into the waste reservoir. The collected library fragments (including library fragments prepared from non-rRNA) are then reintroduced into the flow cell from the temporary storage reservoir to bind to the now single-stranded immobilized oligonucleotides comprising P5 and/or P7. Once the fragments are bound, bridge amplification chemistry can amplify them. After bridge amplification has generated enough library fragments, cleavage steps can be performed as in current sequencing chemistry to release the forward and reverse strands for subsequent collection, quantification, and quality control prior to sequencing.
Example 3. Enrichment of desired cDNA library fragments
A solid support for enrichment, such as a flow cell, may be prepared. The user can prepare oligonucleotides corresponding to the desired RNA and immobilize these oligonucleotides to the solid support. For example, a user may want to enrich for RNA sequences (i.e., desired RNAs) associated with cancer markers for assessing therapeutic response, tumor progression, or other assessment means, and the user may immobilize oligonucleotides comprising sequences from such RNAs to a solid support. Flow-through cells with such immobilized oligonucleotides may be referred to as enrichment flow-through cells.
The user can then prepare a cDNA library, as described in Example 1 above, from a patient sample comprising RNA. The library fragments may then be added to the enrichment flow cell. Library fragments prepared from the desired RNA will bind to the enrichment flow cell, and the user can siphon the unbound fluid (including library fragments not prepared from the desired RNA) to a waste container. The user can then denature the bound library fragments, collect them, and sequence them (optionally after amplification). In this way, the library being sequenced will be enriched for library fragments prepared from the desired RNA.
Example 4. Preparation of depletion probes for human microbiome samples
To improve enzymatic depletion using the Ribo-Zero Plus kit, additional probe sets that specifically target human gut microbiome samples were developed using an iterative design process. The goal was to develop probes for enzymatic rRNA depletion of human-associated microbiomes to enable metatranscriptomic analysis.
In addition to bacterial RNAs (such as rRNA), some human-associated microbiome samples can contain a large amount of host (human) RNA. For example, skin, oral, and vaginal samples are expected to contain many human cells, so probes directed against both human and bacterial unwanted sequences can provide optimal results for depleting unwanted sequences from human microbiome samples.
Sequencing data from stool samples depleted with Ribo-Zero Plus were used to determine the most abundant rRNA sequences that were not efficiently depleted in 9 healthy adult stool RNA samples. For these experiments, total RNA from gut microbiome samples from 9 donors (Petersen et al. Microbiome 5(1):98 (2017)) was treated in triplicate using the Ribo-Zero Plus rRNA depletion kit, converted to RNA-seq libraries using the TruSeq Stranded Total RNA kit, and sequenced on a NextSeq (PE 76), producing 11 to 36 million reads per sample. The FASTQ files (as described in Cock et al. Nucleic Acids Res. 38(6):1767-71 (2010)) of each donor were then aligned to SILVA (v119, see Quast et al. Nucleic Acids Res. 41:D590-6 (2013)) using SortMeRNA (see Kopylova et al. Bioinformatics 28:3211-3217 (2012)) to determine the rRNA sequences to target for depletion. Any closely spaced aligned sequence regions (1-3 nucleotides apart) were combined and sorted by depth of coverage, and then filtered to remove any sequence regions with less than 500x coverage. The top 50 most abundant regions were collected from each sample (donor) and combined to create a list of abundant regions. Any overlapping regions were then merged, and the list was converted to a FASTA file. To identify and remove redundancy, the regions were aligned to one another, any regions exhibiting 80% or more identity were flagged, and only one of them was selected for probe design. The existing RiboZero Plus probes (designated DP1) were then aligned to the selected non-redundant regions, and any region with 80% or greater identity to a probe was eliminated. The remaining regions were collected, probe positions were determined, and antisense probe sequences were created for the HMv1 probe set. In addition, the HMv1 probe set also included probes designed directly for the rRNA sequences of E. coli and B. subtilis for all 38 species present in ATCC mock community samples (MSA-2002, -2005, and -2006).
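The region-selection steps described in this example (combining closely spaced regions, filtering by coverage, keeping the most abundant regions, and creating antisense probe sequences) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the pipeline actually used: the function names are hypothetical, all regions are assumed to lie on a single reference sequence, a combined region is assigned the maximum coverage of its parts, and sequences are written in the DNA alphabet. The identity-based redundancy filtering and the comparison against the DP1 probes are not shown.

```python
from typing import List, Tuple

# (start, end, coverage) on one reference sequence; end is exclusive.
Region = Tuple[int, int, float]


def merge_regions(regions: List[Region], max_gap: int) -> List[Region]:
    """Combine regions that overlap or lie within max_gap nucleotides."""
    merged: List[Region] = []
    for start, end, cov in sorted(regions):
        if merged and start - merged[-1][1] <= max_gap:
            p_start, p_end, p_cov = merged[-1]
            # Assumption: a combined region keeps the higher coverage.
            merged[-1] = (p_start, max(p_end, end), max(p_cov, cov))
        else:
            merged.append((start, end, cov))
    return merged


def select_abundant(regions: List[Region], min_cov: float = 500.0,
                    top_n: int = 50, max_gap: int = 3) -> List[Region]:
    """Combine close regions, drop those below min_cov, keep the top_n."""
    kept = [r for r in merge_regions(regions, max_gap) if r[2] >= min_cov]
    kept.sort(key=lambda r: r[2], reverse=True)  # most abundant first
    return kept[:top_n]


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Usage with made-up coordinates and coverage values.
example = [(0, 100, 800.0), (102, 200, 600.0),
           (500, 600, 100.0), (1000, 1100, 900.0)]
print(select_abundant(example))
print(antisense("AACG"))
```

Calling `merge_regions` with `max_gap=0` on the combined per-donor lists would correspond to the later step in which overlapping regions are merged.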
Example 5. Preparation of additional probes to improve rRNA depletion of infant fecal microbiome samples
The human gut microbiome profile is known to change rapidly during the first few years of life (see, e.g., Stewart et al. Nature 562:583-588 (2018)). In young infants, the gut microbiota differs significantly from that of adult samples and is often dominated by different taxa, such as Bifidobacterium (see Turroni et al. PLoS One 7(5):e36957 (2012)). Experiments with the Ribo-Zero Plus HMv1 probe set showed that it was effective in removing rRNA from most infant fecal samples, with, on average, fewer than 26% of reads mapping to bacterial rRNA (data not shown). Interestingly, rRNA depletion was less efficient for a subgroup of donors 9 to 15 months old. Taxonomic analysis showed that these samples had high levels of Bifidobacterium bifidum. The poor depletion suggests that the HMv1 probe set targets rRNA from this particular species relatively poorly.
Additional probes targeting Bifidobacterium bifidum were designed using the same iterative process and added to the HMv1 probe pool to create a second human microbiome pool (HMv2). Further experiments were performed with the HM probe set containing the HMv1 probes and the HMv2 probes.
Example 6. Evaluation of depletion probes for human microbiome samples
A set of human microbiome samples was analyzed using standard RiboZero Plus probes (called DP1), human microbiome probes (HM, including HMv1 + HMv2 probes), or a combination of HM probes and DP1 probes (HM+DP1). Experiments were performed according to the standard RiboZero protocol. The results are shown in FIG. 3, which shows a significant decrease in the percentage of rRNA reads when the HM probes were used alone or in combination with the DP1 probes, as compared to the DP1 probes alone. Thus, the use of HM probes can significantly reduce the amount of unwanted rRNA sequenced.
Experiments with wastewater also showed that the RiboZero protocol using HM probes significantly reduced the amount of rRNA sequenced compared to "mock" samples not subjected to the RiboZero protocol (FIG. 4). Although more than 90% of the reads from the mock samples were rRNA, rRNA was reduced to less than 15% in samples subjected to the RiboZero protocol with the HM probes.
Experiments were also performed to evaluate rRNA depletion of an ATCC mock community sample (Skin Microbiome Whole Cell Mix, ATCC MSA-2005™). The experiment compared the results obtained using the RiboZero RNase protocol (with standard DP1 probes or with human microbiome HM probes) with the results obtained using the RiboZero-act kit, which uses a probe-based hybridization method to capture and deplete bacterial rRNA from E. coli and Bacillus subtilis. The RiboZero-act probes are included in the commercial RiboZero Plus rRNA depletion kit (Illumina).
As shown in FIG. 5, without rRNA depletion, more than 90% of the reads in the skin microbiome samples were rRNA reads. The RiboZero-act kit reduced the level of rRNA, but there was significant variation between samples. The RiboZero standard method (with DP1 probes) reduced rRNA reads by only about 50%. In contrast, RiboZero Human Microbiome (HM) treatment reduced rRNA reads to less than 10% of the total reads. These results indicate that the RiboZero RNase method using the HM probes improved the depletion of rRNA from human microbiome samples compared to the RiboZero standard method (using DP1 probes) or the RiboZero-act kit, which uses probe-based hybridization and probes designed to deplete rRNA from E. coli and Bacillus subtilis. Thus, HM probes are well suited for depleting rRNA from human microbiome samples.
Equivalents
The foregoing written specification is considered sufficient to enable one skilled in the art to practice the embodiments. The foregoing description and examples detail certain embodiments and describe the best mode contemplated by the inventors. It should be understood, however, that no matter how detailed the foregoing may appear in text, the embodiments may be practiced in many ways and should be construed in accordance with the appended claims and any equivalents thereof.
As used herein, the term "about" refers to a numerical value, including, for example, integers, fractions, and percentages, whether or not explicitly indicated. The term "about" generally refers to a range of numerical values (e.g., +/-5-10% of the recited range) that one of ordinary skill in the art would consider equivalent to the recited value (e.g., having the same function or result). When a term such as "at least" or "about" precedes a list of numerical values or ranges, the term modifies all of the values or ranges provided in the list. In some cases, the term "about" may include numerical values rounded to the nearest significant figure.

Claims (29)

1. A method of selecting a cDNA library fragment from a library of cDNA fragments prepared from RNA, the method comprising:
a. preparing a solid support comprising a library of immobilized oligonucleotides, wherein each immobilized oligonucleotide in the library comprises a nucleic acid sequence corresponding to an RNA sequence or its complement,
b. adding the library of fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow binding of the library fragments to the at least one immobilized oligonucleotide, and
c. collecting library fragments that are bound or not bound to at least one immobilized oligonucleotide.
2. The method according to claim 1, wherein:
a. the selecting is to deplete unwanted cDNA library fragments, wherein the RNA sequences comprise unwanted RNA sequences, the unwanted library fragments comprise those prepared from unwanted RNA sequences, and the collecting comprises collecting library fragments that are not bound to at least one immobilized oligonucleotide; or
b. the selecting is to enrich for desired cDNA library fragments, wherein the RNA sequence comprises a desired RNA sequence, the desired library fragments comprise those prepared from the desired RNA sequence, and the collecting comprises collecting library fragments bound to at least one immobilized oligonucleotide.
3. The method of claim 2, wherein the library of fragments is depleted of unwanted cDNA library fragments, and the collected library fragments not bound to the at least one immobilized oligonucleotide are then enriched for the desired cDNA library fragments.
4. A solid support having on its surface two pools of immobilized oligonucleotides, wherein a first pool of oligonucleotides comprises immobilized oligonucleotides, each immobilized oligonucleotide comprising a nucleic acid sequence corresponding to an unwanted RNA sequence or a complement thereof; and a second pool of oligonucleotides comprises immobilized oligonucleotides, each immobilized oligonucleotide comprising a solid support adapter sequence capable of binding to a library adapter comprised in a library fragment.
5. The method of any one of claims 1 to 3 or the solid support of claim 4, wherein at least one unwanted RNA sequence has at least 90%, at least 95% or at least 99% homology to a high abundance RNA sequence in a sample used to prepare the fragment library.
6. The method or solid support of claim 5, wherein the high abundance RNA sequence is a ribosomal RNA (rRNA) sequence.
7. The method or solid support according to claim 5, wherein the unwanted RNA sequence is a globin mRNA or 28S, 23S, 18S, 5.8S, 5S, 16S, 12S, HBA-A1, HBA-A2, HBB-B1, HBB-B2, HBG1 or HBG2 RNA or a fragment thereof.
8. The method or solid support of any one of claims 1-7, wherein each immobilized oligonucleotide library comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, or 1100 or more oligonucleotides.
9. The solid support of any one of claims 4-8, wherein an adapter complement that is wholly or partially complementary to the solid support adapter sequence is bound to the solid support adapter sequence of the second pool, and wherein the binding of the adapter complement to the solid support adapter sequence is reversible.
10. A method of amplifying a desired cDNA library fragment from a cDNA fragment library prepared from RNA, the method comprising:
a. providing a solid support according to claim 9;
b. adding the library of fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow unwanted library fragments to bind to the first pool of oligonucleotides;
c. collecting library fragments that do not bind to the first pool of oligonucleotides to prepare a collected library fragment;
d. denaturing and removing library fragments bound to the first pool of oligonucleotides and adaptor complementary sequences bound to the adaptor sequences of the second pool of oligonucleotides;
e. adding the collected library fragments to the solid support and hybridizing the library fragments to at least one immobilized oligonucleotide to allow binding of the desired library fragments to the second pool of oligonucleotides; and
f. amplifying the bound desired library fragments by bridge amplification on the solid support.
11. A method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, the method comprising:
a. sequencing a plurality of probe development microbiome samples to determine from the sequencing data at least one unwanted RNA molecule comprising a bacterial ribosomal RNA (rRNA) sequence;
b. preparing a probe set comprising at least one DNA probe complementary to the at least one unwanted RNA molecule;
c. contacting the patient microbiome sample with the probe set to produce a DNA: RNA hybrid; and
d. contacting the DNA: RNA hybrid with a ribonuclease that degrades the RNA from the DNA: RNA hybrid, thereby degrading the unwanted RNA molecules in the patient microbiome sample to form a degradation mixture.
12. A method for depleting unwanted RNA molecules contained in a patient microbiome sample, wherein the patient microbiome sample comprises at least one target RNA or DNA sequence and at least one unwanted RNA molecule, the method comprising:
a. contacting the patient microbiome sample with a probe set comprising at least one sequence comprising at least one of SEQ ID NOs 1-1131 to produce DNA: RNA hybrids; and
b. contacting the DNA: RNA hybrid with a ribonuclease that degrades the RNA from the DNA: RNA hybrid, thereby degrading the unwanted RNA molecules in the patient microbiome sample to form a degradation mixture.
13. The method of claim 11 or claim 12, further comprising:
a. degrading any remaining DNA probes by contacting the degradation mixture with a DNA-digesting enzyme, optionally wherein the DNA-digesting enzyme is DNase I, to form a DNA degradation mixture; and
b. isolating the degraded RNA from the degradation mixture or the DNA degradation mixture.
14. A composition comprising a set of probes, the composition comprising:
a. at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs 1-1131; and
b. a ribonuclease capable of degrading RNA in DNA:RNA hybrids.
15. A kit comprising a set of probes, the kit comprising:
a. at least one DNA probe comprising at least one sequence comprising at least one of SEQ ID NOs 1-1131; and
b. a ribonuclease capable of degrading RNA in DNA:RNA hybrids.
16. The kit of claim 15, comprising:
a. a probe set comprising at least one DNA probe comprising at least one of SEQ ID NOs 1-1131;
b. a ribonuclease;
c. a DNase; and
d. RNA purification beads.
17. The method of any one of claims 1 to 3, 5 to 8 or 10 to 13, the solid support of any one of claims 4 to 9, the composition of claim 14, or the kit of claim 15 or 16, wherein the oligonucleotide library or the set of probes comprises 2 or more, 5 or more, 10 or more, 25 or more, 50 or more, 100 or more, 200 or more, 300 or more, 400 or more, 500 or more, 600 or more, 700 or more, 800 or more, 900 or more, 1000 or more, 1100 or more, or 1131 sequences selected from SEQ ID NOs 1-1131.
18. The method of any one of claims 1 to 3, 5 to 8 or 10 to 13, the solid support of any one of claims 4 to 9, the composition of claim 14, or the kit of claim 15 or 16, wherein the oligonucleotide library or the probe set comprises at least one sequence comprising the sequence of seq id NO:1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 303-290, 292-301, 303-321, 323-331 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, and the like, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1017-1110, 1110-1110, 1123-865, 867-872, 874-881, 883-892, 884-940, 1063-898, 1063-968, 1066-1096, 1096-1095, and one of the others.
19. The method, solid support, composition, or kit of claim 18, wherein the oligonucleotide library or the set of probes comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NO:1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349 351-355, 357-359, 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673 675. 676, 678, 680-682, 684, 685, 688-692, 694, 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 836-1025, 1027-1029, 1031-1044, 1046-1058, 1110-1110, 1064-1096-1110, 1064, 1066-1096-1095, 1126-1095, 1125, 1115 and one of the others.
20. The method, solid support, composition or kit of claim 19, wherein the oligonucleotide library or the set of probes comprises a sequence comprising the sequence of SEQ ID NO:1-10, 12-18, 21, 22, 24-33, 35, 39-43, 45-48, 50-73, 75, 77, 78, 81-84, 86-103, 105-107, 109-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 160-165, 168-174, 176-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225, 227-246, 248-265, 269, 270, 272-277, 279, 281, 282, 284-290, 292-301, 303-321, 323-331, 333-336, 338, 340-342, 344, 346-349, 351-355, 357-359 361-372, 374-380, 383-386, 388, 390, 391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-512, 514, 516, 518, 519, 521-524, 526-529, 531, 532, 535-539, 541-545, 547-552, 555-577, 580-608, 610, 612-616, 618-622, 624-630, 632-636, 638-640, 643, 646-649, 652-659, 663-673, 675, 676, 678, 680-682, 684, 685, 688-692, 694. 696-705, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-779, 781-796, 798, 801-819, 821-826, 828, 830-832, 834, 836-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-881, 883-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1021, 1023-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1094, 1096-1096, 1115, 1123-1112, 1123-1115, 1123-1126, 1123-1115, 1123-1125, 1123-1115, and 1115.
21. The method, solid support, composition or kit of any one of claims 18 to 20, wherein the oligonucleotide library or the probe set further comprises at least one sequence comprising at least one of SEQ ID NOs 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128 and 1131.
22. The method, solid support, composition, or kit of claim 21, wherein the oligonucleotide library or the probe set comprises 10 or more, 20 or more, or 30 or more sequences comprising at least one of SEQ ID NOs 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
23. The method, solid support, composition, or kit of claim 22, wherein the oligonucleotide library or probe set comprises a sequence comprising each of SEQ ID NOs 19, 74, 76, 85, 104, 108, 158, 175, 226, 278, 322, 339, 389, 513, 517, 520, 546, 553, 609, 611, 650, 662, 677, 683, 686, 706, 780, 827, 835, 882, 1022, 1059, 1077, 1078, 1098, 1106, 1111, 1114, 1128, and 1131.
24. The method of any one of claims 1 to 3, 5 to 8 or 10 to 13, the solid support of any one of claims 4 to 9, the composition of claim 14, or the kit of claim 15 or 16, wherein the oligonucleotide library or the probe set comprises at least one sequence comprising at least one of SEQ ID NO: 1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
25. The method, solid support, composition, or kit of claim 24, wherein the oligonucleotide library or the set of probes comprises 100 or more, 500 or more, or 1000 or more sequences comprising SEQ ID NO:1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798. 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126 and 1128-1131.
26. The method, solid support, composition or kit of claim 25, wherein the oligonucleotide library or the set of probes comprises a sequence comprising the sequence of SEQ ID NO:1-10, 12-19, 21, 22, 24-33, 35, 39-43, 45-48, 50-78, 81-127, 129-134, 136-140, 142, 143, 148, 149, 151-155, 157, 158, 160-165, 168-182, 184-187, 189-194, 196-204, 206, 208-215, 217-223, 225-246, 248-265, 269, 270, 272-279, 281, 282, 284-290, 292-301, 303-331, 333-336, 338-342, 344, 346-349, 351-355, 357-359, 361-372, 374-380, 383-386, 388-391, 393, 395-401, 403-406, 408-420, 422-439, 441, 443, 444, 446-460, 462-466, 468, 469, 471, 473-477, 479-502, 504-514, 516-524, 526-529, 531, 532, 535-539, 541-553, 555-577, 580-616, 618-622, 624-630, 632-636, 638-640, 643, 646-650, 652-659, 662-673, 675-678, 680-686, 688-692, 694, 696-706, 708-715, 717-731, 733, 735, 736, 739-763, 765, 767-796, 798, 801-819, 821-828, 830-832, 834-847, 849, 851, 852, 854-861, 863-865, 867-872, 874-892, 894-898, 900-909, 911, 913-921, 923-925, 927-935, 938-940, 942-948, 950-965, 967-969, 971-979, 981-984, 986-994, 997-1010, 1012-1015, 1017, 1019-1025, 1027-1029, 1031-1044, 1046-1062, 1064-1067, 1069-1075, 1077, 1078, 1080-1094, 1096, 1098-1116, 1118-1126, and 1128-1131.
27. The method, solid support, composition or kit of any one of claims 18 to 26, wherein the oligonucleotide library or the set of probes further comprises at least one sequence comprising the sequence of SEQ ID NO:11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117 and 1127.
28. The method, solid support, composition, or kit of claim 27, wherein the oligonucleotide library or the set of probes comprises 10 or more, 20 or more, or 30 or more sequences comprising the sequences of SEQ ID NOs: 11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117 and 1127.
29. The method, solid support, composition or kit of claim 28, wherein the oligonucleotide library or the set of probes comprises a sequence comprising the sequence of SEQ ID NO:11, 20, 23, 34, 36-38, 44, 49, 79, 80, 128, 135, 141, 144-147, 150, 156, 159, 166, 167, 183, 188, 195, 205, 207, 216, 224, 247, 266-268, 271, 280, 283, 291, 302, 332, 337, 343, 345, 350, 356, 360, 373, 381, 382, 387, 392, 394, 402, 407, 421, 440, 442, 445, 461, 467, 470, 472, 478, 503, 515, 525, 530, 533, 534, 540, 554, 578, 579, 617, 623, 631, 637, 641, 642, 644, 645, 651, 660, 661, 674, 679, 687, 693, 695, 707, 716, 732, 734, 737, 738, 764, 766, 797, 799, 800, 820, 829, 833, 848, 850, 853, 862, 866, 873, 893, 899, 910, 912, 922, 926, 936, 937, 941, 949, 966, 970, 980, 985, 995, 996, 1011, 1016, 1018, 1026, 1030, 1045, 1063, 1068, 1076, 1079, 1095, 1097, 1117, and 1127.
CN202280042563.XA 2021-09-30 2022-09-29 Solid support and method for depleting and/or enriching library fragments prepared from biological samples Pending CN117545852A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/250563 2021-09-30
US202263351170P 2022-06-10 2022-06-10
US63/351170 2022-06-10
PCT/US2022/077221 WO2023056328A2 (en) 2021-09-30 2022-09-29 Solid supports and methods for depleting and/or enriching library fragments prepared from biosamples

Publications (1)

Publication Number Publication Date
CN117545852A true CN117545852A (en) 2024-02-09

Family

ID=89794348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280042563.XA Pending CN117545852A (en) 2021-09-30 2022-09-29 Solid support and method for depleting and/or enriching library fragments prepared from biological samples

Country Status (1)

Country Link
CN (1) CN117545852A (en)

Legal Events

Date Code Title Description
PB01 Publication