WO2023230563A2 - Novel crispr systems - Google Patents

Novel crispr systems

Info

Publication number
WO2023230563A2
Authority
WO
WIPO (PCT)
Prior art keywords
strain
cas endonuclease
seqid
plant
cas
Prior art date
Application number
PCT/US2023/067482
Other languages
French (fr)
Other versions
WO2023230563A3 (en)
Inventor
Betsy ALFORD
Hong Zhu
Stephen BOLARIS
Original Assignee
Bioconsortia, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bioconsortia, Inc.
Publication of WO2023230563A2
Publication of WO2023230563A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Definitions

  • sequence listing is submitted electronically as an XML formatted sequence listing with a file named created on 25 April 2023 and having a size of 687,882 bytes and is filed concurrently with the specification.
  • sequence listing comprised in this XML formatted document is part of the specification and is herein incorporated by reference in its entirety.
  • the instant disclosure relates generally to the field of biology, specifically genome-editing molecules derived from microbes, and microbial compositions for the improvement of traits in both eukaryotes and prokaryotes.
  • Nucleotide-editing technology has enabled the modification of genomes across a wide variety of species, for the improvement of different traits of interest. Spanning both eukaryotes and prokaryotes, techniques including TAL Effector Nucleases, Meganucleases, Zinc Finger Nucleases, and CRISPR endonucleases provide ways of targeting and editing within the genomes of cells.
  • CRISPR-Cas systems are extensively leveraged in the treatment of human and animal diseases, curing cancer and genetic conditions, producing various compositions for industrial and pharmaceutical use, and improving the quality and quantity of food crops.
  • Cas endonucleases can be precisely developed for specific site modifications. A large number of Cas endonucleases have been described, each with different properties, such as PAM recognition site preferences.
  • Despite the advances during the past few years, there is a need for a larger variety of Cas endonucleases for target-specific modification of polynucleotides.
  • the microbes provided herein comprise at least one CRISPR-Cas system that is capable of creating a double strand break in, or adjacent to, a target polynucleotide that comprises an appropriate PAM, and to which it is directed by a guide polynucleotide, in any prokaryotic or eukaryotic cell.
  • the cell is a plant cell or an animal cell or a fungal cell.
  • single-effector endonucleases capable of recognizing, binding to, and optionally nicking (on a single strand of two strands) or cleaving (a single strand or a double strand) a target polynucleotide, for example double-stranded DNA.
  • sequence descriptions and sequence listing attached hereto comply with the rules governing nucleotide and amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §§ 1.821 and 1.825.
  • sequence descriptions comprise the three letter codes for amino acids as defined in 37 C.F.R. §§ 1.821 and 1.825, which are incorporated herein by reference.
  • Sequence sources are given as <Strain #_Contig/Locus ID>.
  • SEQ ID NO: 1 is the Strain 10022_03091 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 2 is the Strain 10078_01326 Cas endonuclease DNA sequence from Microbacterium sp. TPU 3598.
  • SEQ ID NO: 3 is the Strain 10082_02958 Cas endonuclease DNA sequence from Microbacterium testaceum.
  • SEQ ID NO: 4 is the Strain 100944_02101 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 5 is the Strain 100949_02456 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 6 is the Strain 10099_01758 Cas endonuclease DNA sequence from Rathayibacter tritici.
  • SEQ ID NO: 7 is the Strain 10105_01169 Cas endonuclease DNA sequence from Microbacterium sp. TPU 3598.
  • SEQ ID NO: 8 is the Strain 10107_02044 Cas endonuclease DNA sequence from Rathayibacter tritici.
  • SEQ ID NO: 9 is the Strain 101119_04896 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 10 is the Strain 101168_00758 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 11 is the Strain 101169_03904 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 12 is the Strain 101395_04849 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
  • SEQ ID NO: 13 is the Strain 101632_00966 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 14 is the Strain 101632_03946 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 15 is the Strain 1017_03442 Cas endonuclease DNA sequence from
  • SEQ ID NO: 16 is the Strain 1019_01381 Cas endonuclease DNA sequence from Ottowia sp. KADR8-3.
  • SEQ ID NO: 17 is the Strain 102002_02944 Cas endonuclease DNA sequence from Bacillus simplex.
  • SEQ ID NO: 18 is the Strain 102413_04815 Cas endonuclease DNA sequence from Paenibacillus sp. JDR-2.
  • SEQ ID NO: 19 is the Strain 102958_01637 Cas endonuclease DNA sequence from Bacillus sp. Y-01.
  • SEQ ID NO: 20 is the Strain 103102_04419 Cas endonuclease DNA sequence from Paenibacillus sp. IHBB 10380.
  • SEQ ID NO: 21 is the Strain 103112_04355 Cas endonuclease DNA sequence from Paenibacillus sp. IHBB 10380.
  • SEQ ID NO: 22 is the Strain 103113_04392 Cas endonuclease DNA sequence from Paenibacillus sp. IHBB 10380.
  • SEQ ID NO: 23 is the Strain 1038_02403 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
  • SEQ ID NO: 24 is the Strain 10387_01168 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 25 is the Strain 1040_01689 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
  • SEQ ID NO: 26 is the Strain 104168_03005 Cas endonuclease DNA sequence from Paenibacillus crassostreae.
  • SEQ ID NO: 27 is the Strain 10419_02480 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 28 is the Strain 10420_00522 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 29 is the Strain 10426_00468 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 30 is the Strain 10455_02553 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 31 is the Strain 104624_02918 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 32 is the Strain 10466_00895 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 33 is the Strain 1047_02160 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 34 is the Strain 10494_01762 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 35 is the Strain 10504_03293 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 36 is the Strain 10506_03105 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 37 is the Strain 10507_00398 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 38 is the Strain 105094_04656 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 39 is the Strain 1051_01436 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
  • SEQ ID NO: 40 is the Strain 10511_03526 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 41 is the Strain 105149_04104 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 42 is the Strain 105149_04649 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 43 is the Strain 105194_03162 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 44 is the Strain 105195_04607 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 45 is the Strain 105195_04745 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 46 is the Strain 105210_04468 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 47 is the Strain 10522_02429 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 48 is the Strain 10533_01275 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 49 is the Strain 1063_00998 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
  • SEQ ID NO: 50 is the Strain 10631_01157 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 51 is the Strain 10634_01684 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 52 is the Strain 10635_01773 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 53 is the Strain 10638_03764 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 54 is the Strain 10647_02458 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 55 is the Strain 10660_01480 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 56 is the Strain 10669_01791 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 57 is the Strain 10671_00807 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 58 is the Strain 1078_01481 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
  • SEQ ID NO: 59 is the Strain 10785_00266 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 60 is the Strain 1080_02464 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
  • SEQ ID NO: 61 is the Strain 10805_00375 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 62 is the Strain 10806_03196 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 63 is the Strain 10808_01266 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 64 is the Strain 10817_00887 Cas endonuclease DNA sequence from Unknown.
  • SEQ ID NO: 65 is the Strain 1082_03198 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 66 is the Strain 10822_00582 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 67 is the Strain 1083_01536 Cas endonuclease DNA sequence from
  • SEQ ID NO: 68 is the Strain 10839_02372 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 69 is the Strain 10842_01772 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 70 is the Strain 10844_02570 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 71 is the Strain 1087_01291 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
  • SEQ ID NO: 72 is the Strain 1094_00967 Cas endonuclease DNA sequence from
  • SEQ ID NO: 73 is the Strain 1098_00553 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
  • SEQ ID NO: 74 is the Strain 1109_01133 Cas endonuclease DNA sequence from
  • SEQ ID NO: 75 is the Strain 1113_01742 Cas endonuclease DNA sequence from Microbacterium sp. TPU 3598.
  • SEQ ID NO: 76 is the Strain 1115_01841 Cas endonuclease DNA sequence from Microbacterium aurum.
  • SEQ ID NO: 77 is the Strain 11199_02635 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
  • SEQ ID NO: 78 is the Strain 1124_01292 Cas endonuclease DNA sequence from Glutamicibacter arilaitensis.
  • SEQ ID NO: 79 is the Strain 11345_00502 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 80 is the Strain 11364_01125 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 81 is the Strain 11393_00661 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 82 is the Strain 11425_02393 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 83 is the Strain 11456_02753 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 84 is the Strain 1155_02473 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
  • SEQ ID NO: 85 is the Strain 1165_01871 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
  • SEQ ID NO: 86 is the Strain 11716_05148 Cas endonuclease DNA sequence from Unknown.
  • SEQ ID NO: 87 is the Strain 11723_04575 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 88 is the Strain 1176_01531 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 89 is the Strain 11762_04354 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 90 is the Strain 11773_01698 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 91 is the Strain 1178_01705 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 92 is the Strain 11784_03035 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 93 is the Strain 11787_03445 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 94 is the Strain 1198_01200 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 95 is the Strain 1212_01737 Cas endonuclease DNA sequence from Glutamicibacter arilaitensis.
  • SEQ ID NO: 96 is the Strain 12148_02440 Cas endonuclease DNA sequence from Paenibacillus naphthalenovorans.
  • SEQ ID NO: 97 is the Strain 12193_04094 Cas endonuclease DNA sequence from Paenibacillus naphthalenovorans.
  • SEQ ID NO: 98 is the Strain 12301_00979 Cas endonuclease DNA sequence from Unknown.
  • SEQ ID NO: 99 is the Strain 1253_02264 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 100 is the Strain 1271_01237 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 101 is the Strain 1286_01295 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 102 is the Strain 12917_02996 Cas endonuclease DNA sequence from Curtobacterium pusillum.
  • SEQ ID NO: 103 is the Strain 13445_04630 Cas endonuclease DNA sequence from Azospirillum lipoferum.
  • SEQ ID NO: 104 is the Strain 1396_03114 Cas endonuclease DNA sequence from Paenibacillus sp. CAA11.
  • SEQ ID NO: 105 is the Strain 14053_00313 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
  • SEQ ID NO: 106 is the Strain 14166_02703 Cas endonuclease DNA sequence from Arthrobacter sp. QXT-31.
  • SEQ ID NO: 107 is the Strain 14167_01587 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 108 is the Strain 14171_02048 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 109 is the Strain 14186_01112 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
  • SEQ ID NO: 110 is the Strain 14193_00438 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 111 is the Strain 14196_00143 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 112 is the Strain 14202_02580 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 113 is the Strain 14229_00612 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
  • SEQ ID NO: 114 is the Strain 1431_00729 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 115 is the Strain 1438_01477 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 116 is the Strain 14544_01404 Cas endonuclease DNA sequence from Bacillus megaterium.
  • SEQ ID NO: 117 is the Strain 14596_01694 Cas endonuclease DNA sequence from Bacillus sp. Y-01.
  • SEQ ID NO: 118 is the Strain 14627_01883 Cas endonuclease DNA sequence from Microbacterium testaceum.
  • SEQ ID NO: 119 is the Strain 14650_03793 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 120 is the Strain 14658_03699 Cas endonuclease DNA sequence from Arthrobacter sp. QXT-31.
  • SEQ ID NO: 121 is the Strain 1471_00680 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 122 is the Strain 1472_00433 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 123 is the Strain 14727_02881 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 124 is the Strain 14743_03592 Cas endonuclease DNA sequence from Leifsonia xyli.
  • SEQ ID NO: 125 is the Strain 14779_01378 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 126 is the Strain 14808_02364 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
  • SEQ ID NO: 127 is the Strain 14817_01476 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
  • SEQ ID NO: 128 is the Strain 14824_01629 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
  • SEQ ID NO: 129 is the Strain 14881_01920 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
  • SEQ ID NO: 130 is the Strain 14945_00031 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
  • SEQ ID NO: 131 is the Strain 14968_01911 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
  • SEQ ID NO: 132 is the Strain 14970_00612 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
  • SEQ ID NO: 133 is the Strain 14977_00931 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 134 is the Strain 15010_02299 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 135 is the Strain 15062_00931 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 136 is the Strain 15158_01222 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 137 is the Strain 15306_00078 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
  • SEQ ID NO: 138 is the Strain 15309_00706 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
  • SEQ ID NO: 139 is the Strain 15353_01923 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
  • SEQ ID NO: 140 is the Strain 15393_03167 Cas endonuclease DNA sequence from Paenibacillus sp. Y412MC10.
  • SEQ ID NO: 141 is the Strain 15407_04378 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
  • SEQ ID NO: 142 is the Strain 15448_04948 Cas endonuclease DNA sequence from Sinorhizobium meliloti.
  • SEQ ID NO: 143 is the Strain 15469_04140 Cas endonuclease DNA sequence from Mitsuaria sp. 7.
  • SEQ ID NO: 144 is the Strain 15531_01961 Cas endonuclease DNA sequence from Mitsuaria sp. 7.
  • SEQ ID NO: 145 is the Strain 15546_00078 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 146 is the Strain 15832_01937 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 147 is the Strain 15859_02500 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 148 is the Strain 15875_01814 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 149 is the Strain 15939_01790 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
  • SEQ ID NO: 150 is the Strain 15940_03638 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
  • SEQ ID NO: 151 is the Strain 16_02923 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 152 is the Strain 16023_03049 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 153 is the Strain 16060_02050 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 154 is the Strain 16064_03513 Cas endonuclease DNA sequence from Methylorubrum extorquens.
  • SEQ ID NO: 155 is the Strain 16095_00628 Cas endonuclease DNA sequence from Sinorhizobium fredii.
  • SEQ ID NO: 156 is the Strain 16107_02971 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
  • SEQ ID NO: 157 is the Strain 16130_03182 Cas endonuclease DNA sequence from Arthrobacter sp. U41.
  • SEQ ID NO: 158 is the Strain 16135_03880 Cas endonuclease DNA sequence from Bacillus sp. Y-01.
  • SEQ ID NO: 159 is the Strain 16157_00325 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
  • SEQ ID NO: 160 is the Strain 16158_01725 Cas endonuclease DNA sequence from Arthrobacter sp. QXT-31.
  • SEQ ID NO: 161 is the Strain 16194_01685 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
  • SEQ ID NO: 162 is the Strain 16216_02576 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
  • SEQ ID NO: 163 is the Strain 16233_01732 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 164 is the Strain 16237_03292 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 165 is the Strain 16248_01824 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 166 is the Strain 1625_02362 Cas endonuclease DNA sequence from Chryseobacterium glaciei.
  • SEQ ID NO: 167 is the Strain 16274_00773 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 168 is the Strain 16288_06493 Cas endonuclease DNA sequence from Azospirillum thiophilum.
  • SEQ ID NO: 169 is the Strain 16288_06718 Cas endonuclease DNA sequence from Azospirillum thiophilum.
  • SEQ ID NO: 170 is the Strain 16299_00539 Cas endonuclease DNA sequence from Rathayibacter tritici.
  • SEQ ID NO: 171 is the Strain 16333_02521 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 172 is the Strain 16334_02912 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
  • SEQ ID NO: 173 is the Strain 16349_00705 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
  • SEQ ID NO: 174 is the Strain 16351_01257 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
  • SEQ ID NO: 175 is the Strain 16369_02319 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
  • SEQ ID NO: 176 is the Strain 16370_00277 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
  • SEQ ID NO: 177 is the Strain 16372_00908 Cas endonuclease DNA sequence from Microbacterium sp. No. 7.
  • SEQ ID NO: 178 is the Strain 16396_02202 Cas endonuclease DNA sequence from Arthrobacter crystallopoietes.
  • SEQ ID NO: 179 is the Strain 16404_01792 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
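
The source convention stated above, <Strain #_Contig/Locus ID>, makes each entry in the sequence list machine-readable. As a minimal sketch (not part of the disclosure), the Python below parses one entry into its SEQ ID number, strain number, contig/locus ID, and source organism. The names `SeqEntry` and `parse_entry` are hypothetical, and the pattern assumes entries follow the wording used in this list; it also tolerates a space in place of the underscore and a missing organism name.

```python
import re
from typing import NamedTuple, Optional


class SeqEntry(NamedTuple):
    """One parsed sequence-description entry (hypothetical helper type)."""
    seq_id: int
    strain: str
    locus: str               # kept as text so leading zeros survive
    organism: Optional[str]  # None when the entry names no organism


# Assumed entry wording: "SEQ ID NO: N is the Strain S_L Cas endonuclease
# DNA sequence from ORGANISM." The separator between S and L may be "_" or " ".
_ENTRY = re.compile(
    r"SEQ\s*ID\s*NO:\s*(?P<seq_id>\d+)\s+is\s+the\s+Strain\s+"
    r"(?P<strain>\d+)[_ ](?P<locus>\d+)\s+"
    r"Cas\s+endonuclease\s+DNA\s+sequence\s+from"
    r"(?:\s+(?P<organism>.+?))?\.?\s*$"
)


def parse_entry(line: str) -> Optional[SeqEntry]:
    """Return the parsed entry, or None if the line does not match."""
    m = _ENTRY.search(line)
    if m is None:
        return None
    return SeqEntry(int(m["seq_id"]), m["strain"], m["locus"], m["organism"])


if __name__ == "__main__":
    example = ("SEQ ID NO: 12 is the Strain 101395_04849 Cas endonuclease "
               "DNA sequence from Paenibacillus xylanexedens.")
    print(parse_entry(example))
```

Keeping the contig/locus ID as a string rather than an integer preserves its leading zeros, which are part of the identifier.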
  • “Microorganism” or “microbe” should be taken broadly. These terms are used interchangeably and include, but are not limited to, the two prokaryotic domains, Bacteria and Archaea, as well as eukaryotic Fungi and Protists.
  • the disclosure refers to the “microbes” of Table 1, or the “microbes” of various other tables or paragraphs present in the disclosure. This characterization can refer to not only the identified taxonomic bacterial genera of the tables, but also the identified taxonomic species, as well as the various novel and newly identified bacterial strains of said tables.
  • microbe refers to any species or taxon of microorganism, including, but not limited to, archaea, bacteria, microalgae, fungi (including mold and yeast species), mycoplasmas, microspores, nanobacteria, oomycetes, and protozoa.
  • a microbe or microorganism encompasses individual cells (e.g., unicellular microorganisms) or more than one cell (e.g., multi-cellular microorganism).
  • a “population of microorganisms” may thus refer to multiple cells of a single microorganism, in which the cells share common genetic derivation.
  • bacterium refers in general to any prokaryotic organism, and may reference an organism from either Kingdom Eubacteria (Bacteria), Kingdom Archaebacteria (Archaea), or both.
  • bacterial genera or other taxonomic classifications have been reassigned due to various reasons (such as but not limited to the evolving field of whole genome sequencing), and it is understood that such nomenclature reassignments are within the scope of any claimed taxonomy.
  • certain species of the genus Erwinia have been described in the literature as belonging to genus Pantoea (Zhang, Y., Qiu, S. Examining phylogenetic relationships of Erwinia and Pantoea species using whole genome sequence data. Antonie van Leeuwenhoek 108, 1037-1046 (2015)).
  • 16S refers to the DNA sequence of the 16S ribosomal RNA (rRNA) sequence of a bacterium. 16S rRNA gene sequencing is a well-established method for studying phylogeny and taxonomy of bacteria.
  • “Fungus” or “fungi” refers in general to any organism from Kingdom Fungi. Historical taxonomic classification of fungi has been according to morphological presentation. Beginning in the mid-1800s, it was recognized that some fungi have a pleomorphic life cycle, and that different nomenclature designations were being used for different forms of the same fungus.
  • ITS refers to the Internal Transcribed Spacer, the spacer DNA situated between the small-subunit ribosomal RNA (rRNA) and large-subunit (LSU) rRNA genes in the chromosome or the corresponding transcribed region in the polycistronic rRNA precursor transcript.
  • ITS gene sequencing is a well-established method for studying phylogeny and taxonomy of fungi.
  • LSU: Large SubUnit
  • LSU gene sequencing is a well-established method for studying phylogeny and taxonomy of fungi.
  • Some fungal microbes of the present invention may be described by an ITS sequence and some may be described by an LSU sequence. Both are understood to be equally descriptive and accurate for determining taxonomy.
  • “Microbial consortia” or “microbial consortium” refers to a subset of a microbial community of individual microbial species, or strains of a species, which can be described as carrying out a common function, or can be described as participating in, or leading to, or correlating with, a recognizable parameter or plant phenotypic trait.
  • the community may comprise one or more species, or strains of a species, of microbes. In some instances, the microbes coexist within the community symbiotically.
  • microbial community means a group of microbes comprising two or more species or strains. Unlike microbial consortia, a microbial community does not have to be carrying out a common function, or does not have to be participating in, or leading to, or correlating with, a recognizable parameter or plant phenotypic trait.
  • AMS: accelerated microbial selection
  • DMS: directed microbial selection
  • As used herein, “isolate,” “isolated,” “isolated microbe,” and like terms, are intended to mean that the one or more microorganisms has been separated from at least one of the materials with which it is associated in a particular environment (for example soil, water, plant tissue).
  • an “isolated microbe” does not exist in its naturally occurring environment; rather, it is through the various techniques described herein that the microbe has been removed from its natural setting and placed into a non-naturally occurring state of existence.
  • the isolated strain may exist as, for example, a biologically pure culture, or as spores (or other forms of the strain) in association with an agricultural carrier.
  • the isolated microbes exist as isolated and biologically pure cultures. It will be appreciated by one of skill in the art, that an isolated and biologically pure culture of a particular microbe, denotes that said culture is substantially free (within scientific reason) of other living organisms and contains only the individual microbe in question. The culture can contain varying concentrations of said microbe.
  • the disclosure provides for certain quantitative measures of the concentration, or purity limitations, that must be found within an isolated and biologically pure microbial culture.
  • the presence of these purity values is a further attribute that distinguishes the presently disclosed microbes from those microbes existing in a natural state. See, e.g., Merck & Co. v. Olin Mathieson Chemical Corp., 253 F.2d 156 (4th Cir. 1958) (discussing purity limitations for vitamin B12 produced by microbes), incorporated herein by reference.
  • “individual isolates” should be taken to mean a composition, or culture, comprising a predominance of a single genus, species, or strain, of microorganism, following separation from one or more other microorganisms. The phrase should not be taken to indicate the extent to which the microorganism has been isolated or purified. However, “individual isolates” can comprise substantially only one genus, species, or strain, of microorganism.
  • a “growth medium” is any medium which is suitable to support growth of a plant.
  • the media may be natural or artificial including, but not limited to: soil, potting mixes, bark, vermiculite, hydroponic solutions alone and applied to solid plant support systems, and tissue culture gels. It should be appreciated that the media may be used alone or in combination with one or more other media. It may also be used with or without the addition of exogenous nutrients and physical support systems for roots and foliage.
  • the growth medium is a naturally occurring medium such as soil, sand, mud, clay, humus, regolith, rock, or water.
  • the growth medium is artificial.
  • Such an artificial growth medium may be constructed to mimic the conditions of a naturally occurring medium; however, this is not necessary.
  • Artificial growth media can be made from one or more of any number and combination of materials including sand, minerals, glass, rock, water, metals, salts, and nutrients.
  • the growth medium is sterile. In another embodiment, the growth medium is not sterile.
  • the medium may be amended or enriched with additional compounds or components, for example, a component which may assist in the interaction and/or selection of specific groups of microorganisms with the plant and each other.
  • antibiotics such as penicillin
  • sterilants for example, quaternary ammonium salts and oxidizing agents
  • the physical conditions, such as salinity, plant nutrients (for example, organic and inorganic minerals such as phosphorus, nitrogenous salts, ammonia, potassium, and micronutrients such as cobalt and magnesium), pH, and/or temperature, could be amended.
  • plant generically includes whole plants, plant organs, plant tissues, seeds, plant cells, and progeny of the same.
  • Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores.
  • a “plant element” is intended to reference either a whole plant or a plant component, which may comprise differentiated and/or undifferentiated tissues, for example but not limited to plant tissues, parts, and cell types.
  • a plant element is one of the following: whole plant, seedling, meristematic tissue, ground tissue, vascular tissue, dermal tissue, seed, leaf, root, shoot, stem, flower, fruit, stolon, bulb, tuber, corm, keiki, bud, tumor tissue, and various forms of cells and culture (e.g., single cells, protoplasts, embryos, callus tissue).
  • plant organ refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant.
  • a “plant part” is synonymous to a “portion” of a plant, and refers to any part of the plant, and can include distinct tissues and/or organs, and may be used interchangeably with the term “tissue” throughout.
  • “Progeny” comprises any subsequent generation of an organism, produced via sexual or asexual reproduction.
  • plant element refers to plant cells, plant protoplasts, plant cell tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruit, kernels, ears, cobs, husks, stalks, roots, root tips, anthers, and the like, as well as the parts themselves. Grain is intended to mean the mature seed produced by commercial growers for purposes other than growing or reproducing the species. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced polynucleotides.
  • a “plant reproductive element” is intended to generically reference any part of a plant that is able to initiate other plants via either sexual or asexual reproduction of that plant, for example but not limited to: seed, seedling, root, shoot, cutting, scion, graft, stolon, bulb, tuber, corm, keiki, or bud.
  • the plant element may be in plant or in a plant organ, tissue culture, or cell culture.
  • the term “monocotyledonous” or “monocot” refers to the subclass of angiosperm plants also known as “monocotyledoneae”, whose seeds typically comprise only one embryonic leaf, or cotyledon.
  • the term includes references to whole plants, plant elements, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same.
  • dicotyledonous or “dicot” refers to the subclass of angiosperm plants also known as “dicotyledoneae”, whose seeds typically comprise two embryonic leaves, or cotyledons.
  • the term includes references to whole plants, plant elements, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same.
  • the term “cultivar” refers to a variety, strain, or race, of plant that has been produced by horticultural or agronomic techniques and is not normally found in wild populations.
  • “improved” should be taken broadly to encompass improvement of a characteristic of a plant, as compared to a control plant, or as compared to a known average quantity associated with the characteristic in question.
  • “improved” plant biomass associated with application of a beneficial microbe, or consortia, of the disclosure can be demonstrated by comparing the biomass of a plant treated by the microbes taught herein to the biomass of a control plant not treated.
  • “improved” does not necessarily demand that the data be statistically significant (e.g., p < 0.05); rather, any quantifiable difference demonstrating that one value (e.g., the average treatment value) is different from another (e.g., the average control value) can rise to the level of “improved.”
  • “inhibiting and suppressing” and like terms should not be construed to require complete inhibition or suppression, although this may be desired in some embodiments.
  • the term “genotype” refers to the genetic makeup of an individual cell, cell culture, tissue, organism (e.g., a plant), or group of organisms.
  • compositions and methods herein may provide for an improved “agronomic trait” or “trait of agronomic importance” or “trait of agronomic interest” to a plant, which may include, but not be limited to, the following: disease resistance, drought tolerance, heat tolerance, cold tolerance, salinity tolerance, metal tolerance, herbicide tolerance, improved water use efficiency, improved nitrogen utilization, improved nitrogen fixation, pest resistance, herbivore resistance, pathogen resistance, yield improvement, health enhancement, vigor improvement, growth improvement, photosynthetic capability improvement, nutrition enhancement, altered protein content, altered oil content, increased biomass, increased shoot length, increased root length, improved root architecture, modulation of a metabolite, modulation of the proteome, increased seed weight, altered seed carbohydrate composition, altered seed oil composition, altered seed protein composition, altered seed nutrient composition, as compared to an isoline plant not comprising a modification derived from the methods or compositions herein
  • Agronomic trait potential is intended to mean a capability of a plant element for exhibiting a phenotype, preferably an improved agronomic trait, at some point during its life cycle, or conveying said phenotype to another plant element with which it is associated in the same plant.
  • the term “molecular marker”, “marker”, or “genetic marker” refers to an indicator that is used in methods for visualizing differences in characteristics of nucleic acid sequences.
  • Examples of such indicators are restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), insertion mutations, microsatellite markers (SSRs), sequence-characterized amplified regions (SCARs), cleaved amplified polymorphic sequence (CAPS) markers or isozyme markers or combinations of the markers described herein which define a specific genetic and chromosomal location.
  • the term “trait” refers to a characteristic or phenotype.
  • yield of a crop relates to the amount of marketable biomass produced by a plant (e.g., fruit, fiber, grain).
  • Desirable traits may also include other plant characteristics, including but not limited to: water use efficiency, nutrient use efficiency, production, mechanical harvestability, fruit maturity, shelf life, pest/disease resistance, early plant maturity, tolerance to stresses, etc.
  • a trait may be inherited in a dominant or recessive manner, or in a partial or incomplete-dominant manner.
  • a trait may be monogenic (i.e., determined by a single locus) or polygenic (i.e., determined by more than one locus) or may also result from the interaction of one or more genes with the environment.
  • phenotype refers to the observable characteristics of an individual cell, cell culture, organism (e.g., a plant), or group of organisms which results from the interaction between that individual’s genetic makeup (i.e., genotype) and the environment.
  • a “synthetic nucleotide sequence” or “synthetic polynucleotide sequence” is a nucleotide sequence that is not known to occur in nature or that is not naturally occurring. Generally, such a synthetic nucleotide sequence will comprise at least one nucleotide difference when compared to any other naturally occurring nucleotide sequence.
  • nucleic acid refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides, or analogs thereof. This term refers to the primary structure of the molecule, and thus includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It also includes modified nucleic acids such as methylated and/or capped nucleic acids, nucleic acids containing modified bases, backbone modifications, and the like. The terms “nucleic acid” and “nucleotide sequence” are used interchangeably.
  • genes refers to any segment of DNA associated with a biological function.
  • genes include, but are not limited to, coding sequences and/or the regulatory sequences required for their expression.
  • Genes can also include non-expressed DNA segments that, for example, form recognition sequences for other proteins.
  • Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
  • homologous or “homologue”, “homolog”, or “ortholog” is known in the art and refers to related sequences that share a common ancestor or family member and are determined based on the degree of sequence identity.
  • the terms “homology,” “homologous,” “substantially similar” and “corresponding substantially” are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype.
  • a functional relationship may be indicated in any one of a number of ways, including, but not limited to: (a) degree of sequence identity and/or (b) the same or similar biological function. Preferably, both (a) and (b) are indicated.
  • Homology can be determined using software programs readily available in the art, such as those discussed in Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30, section 7.718, Table 7.71. Some alignment programs are MacVector (Oxford Molecular Ltd, Oxford, U.K.), ALIGN Plus (Scientific and Educational Software, Pennsylvania) and AlignX (Vector NTI, Invitrogen, Carlsbad, CA). Another alignment program is Sequencher (Gene Codes, Ann Arbor, Michigan), using default parameters.
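As a rough illustration of the "degree of sequence identity" criterion, the sketch below computes percent identity from two sequences that have already been aligned by a program such as those named above. The function name, the gap character, and the choice to count only gap-free columns in the denominator are assumptions made for this example; alignment programs differ in how they define the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    The gap-free-column denominator is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```

The same function works for aligned amino acid sequences, since it only compares characters column by column.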
  • nucleotide change refers to, e.g., nucleotide substitution, deletion, insertion, chemical alteration, or any of the preceding, as is well understood in the art.
  • protein modification refers to, e.g., amino acid substitution, amino acid modification, deletion, and/or insertion, as is well understood in the art.
  • the term “at least a portion” or “fragment” of a nucleic acid or polypeptide means a portion having the minimal size characteristics of such sequences, or any larger fragment of the full length molecule, up to and including the full length molecule.
  • a fragment of a polynucleotide of the disclosure may encode a biologically active portion of a genetic regulatory element.
  • a biologically active portion of a genetic regulatory element can be prepared by isolating a portion of one of the polynucleotides of the disclosure that comprises the genetic regulatory element and assessing activity as described herein.
  • a portion of a polypeptide may be 4 amino acids, 5 amino acids, 6 amino acids, 7 amino acids, and so on, going up to the full length polypeptide.
  • the length of the portion to be used will depend on the particular application.
  • a portion of a nucleic acid useful as a hybridization probe may be as short as 12 nucleotides; in some embodiments, it is 20 nucleotides.
  • a portion of a polypeptide useful as an epitope may be as short as 4 amino acids.
  • a portion of a polypeptide that performs the function of the full-length polypeptide would generally be longer than 4 amino acids.
  • primer refers to an oligonucleotide which is capable of annealing to the amplification target allowing a DNA polymerase to attach, thereby serving as a point of initiation of DNA synthesis when placed under conditions in which synthesis of primer extension product is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase and at a suitable temperature and pH.
  • the (amplification) primer is preferably single stranded for maximum efficiency in amplification.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization.
  • primers will depend on many factors, including temperature and composition (A/T vs. G/C content) of primer.
  • a pair of bi-directional primers consists of one forward and one reverse primer as commonly used in the art of DNA amplification such as in PCR amplification.
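A minimal sketch of a bi-directional primer pair, assuming the simplest possible design: the forward primer copies the 5' end of the amplification target and the reverse primer is the reverse complement of its 3' end. The function names and the default primer length are hypothetical; real primer design also weighs the temperature and composition factors mentioned above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(template: str, primer_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers flanking the whole template.

    The template is the top strand, 5' to 3'. Both primers are returned 5' to 3'.
    """
    if primer_len <= 0 or len(template) < 2 * primer_len:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:primer_len].upper()
    reverse = reverse_complement(template[-primer_len:])
    return forward, reverse


if __name__ == "__main__":
    print(primer_pair("ATGCAAAATTTTGACC", primer_len=4))
```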
  • stringency or “stringent hybridization conditions” refer to hybridization conditions that affect the stability of hybrids, e.g., temperature, salt concentration, pH, formamide concentration and the like. These conditions are empirically optimized to maximize specific binding and minimize non-specific binding of primer or probe to its target nucleic acid sequence.
  • the terms as used include reference to conditions under which a probe or primer will hybridize to its target sequence, to a detectably greater degree than other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures.
  • stringent conditions are selected to be about 5° C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe or primer.
  • stringent conditions will be those in which the salt concentration is less than about 1.0 M Na+ ion, typically about 0.01 to 1.0 M Na+ ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes or primers (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes or primers (e.g., greater than 50 nucleotides).
  • Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
  • exemplary low stringent conditions or “conditions of reduced stringency” include hybridization with a buffer solution of 30% formamide, 1 M NaCl, 1% SDS at 37° C and a wash in 2xSSC at 40° C.
  • Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C, and a wash in 0.1xSSC at 60° C. Hybridization procedures are well known in the art and are described by e.g., Ausubel et al., 1998 and Sambrook et al., 2001.
  • stringent conditions are hybridization in 0.25 M Na2HPO4 buffer (pH 7.2) containing 1 mM Na2EDTA, 0.5-20% sodium dodecyl sulfate at 45°C, such as 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19% or 20%, followed by a wash in 5xSSC, containing 0.1% (w/v) sodium dodecyl sulfate, at 55°C to 65°C.
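The relationship between Tm and a stringent hybridization temperature (about 5° C below Tm) can be sketched numerically. The text above does not give a formula for Tm, so this example assumes the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligonucleotides and ignores salt and formamide. The function names are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Rough Tm estimate in degrees C using the Wallace rule (assumed here).

    Intended only for short DNA oligonucleotides; it does not model ionic
    strength, pH, or destabilizing agents such as formamide.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(oligo: str, offset_c: float = 5.0) -> float:
    """Temperature about `offset_c` degrees below the estimated Tm."""
    return wallace_tm(oligo) - offset_c


if __name__ == "__main__":
    probe = "ACGTACGTACGT"
    print(wallace_tm(probe), stringent_temperature(probe))
```

In practice the empirical buffer recipes listed above, not a single formula, determine the conditions; the sketch only shows how the 5° C offset is applied once a Tm estimate is available.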
  • the cell or organism has at least one heterologous trait
  • heterologous trait refers to a phenotype imparted to a cell or organism by an exogenous molecule or other organism (e.g., a microbe), DNA segment, heterologous polynucleotide, or heterologous nucleic acid.
  • Various changes in phenotype are of interest to the present disclosure, including but not limited to modifying the fatty acid composition in a plant, altering the amino acid content of a plant, altering a plant's pathogen defense mechanism, increasing a plant’s yield of an economically important trait (e.g., grain yield, forage yield, etc.) and the like. These results can be achieved by providing expression of heterologous products or increased expression of endogenous products in plants using the methods and compositions of the present disclosure.
  • a “synthetic combination” can include a combination of a plant and a microbe of the disclosure.
  • the combination may be achieved, for example, by coating the surface of a seed of a plant, such as an agricultural plant, or host plant tissue (root, stem, leaf, etc.), with a microbe of the disclosure.
  • a “synthetic combination” can include a combination of microbes of various strains or species. Synthetic combinations have at least one variable that distinguishes the combination from any combination that occurs in nature. That variable may be, inter alia, a concentration of microbe on a seed or plant tissue that does not occur naturally, or a combination of microbe and plant that does not naturally occur, or a combination of microbes or strains that do not occur naturally together. In each of these instances, the synthetic combination demonstrates the hand of man and possesses structural and/or functional attributes that are not present when the individual elements of the combination are considered in isolation.
  • a microbe can be “endogenous” to a seed or plant.
  • a microbe is considered “endogenous” to a plant or seed, if the microbe is derived from the plant specimen from which it is sourced. That is, if the microbe is naturally found associated with said plant.
  • an endogenous microbe is applied to a plant, then the endogenous microbe is applied in an amount that differs from the levels found on the plant in nature.
  • a microbe that is endogenous to a given plant can still form a synthetic combination with the plant, if the microbe is present on said plant at a level that does not occur naturally.
  • a composition (such as a microbe) can be “heterologous” (also termed “exogenous”) to another composition (such as a seed or plant), and in some aspects is referred to herein as a “heterologous composition”.
  • a microbe is considered “heterologous” to a plant or seed, if the microbe is not derived from the plant specimen from which it is sourced. That is, if the microbe is not naturally found associated with said plant
  • a microbe that is normally associated with leaf tissue of a maize plant is considered exogenous to a leaf tissue of another maize plant that naturally lacks said microbe.
  • a microbe that is normally associated with a maize plant is considered exogenous to a wheat plant that naturally lacks said microbe.
  • a composition is “heterologously disposed” when mechanically or manually applied, artificially inoculated, associated with, or disposed onto or into a plant element, seedling, plant or onto or into a plant growth medium or onto or into a treatment formulation so that the treatment exists on or in the plant element, seedling, plant, plant growth medium, or formulation in a manner not found in nature prior to the application of the treatment, e.g., said combination which is not found in nature in that plant variety, at that stage in plant development, in that plant tissue, in that abundance, or in that growth environment (for example, drought).
  • such a manner is contemplated to be selected from the group consisting of: the presence of the microbe; presence of the microbe in a different number of cells, concentration, or amount; the presence of the microbe in a different plant element, tissue, cell type, or other physical location in or on the plant; the presence of the microbe at different time period, e.g., developmental phase of the plant or plant element, time of day, time of season, and combinations thereof.
  • “heterologously disposed” means that the microbe is being applied to a different tissue or cell type of the plant element than that in which the microbe is naturally found.
  • heterologously disposed means that the microbe is applied to a developmental stage of the plant element, seedling, or plant in which said microbe is not naturally associated, but may be associated at other stages. For example, if a microbe is normally found at the flowering stage of a plant and no other stage, a microbe applied at the seedling stage may be considered to be heterologously disposed. In some embodiments, a microbe is heterologously disposed if the microbe is normally found in the root tissue of a plant element but not in the leaf tissue, and the microbe is applied to the leaf.
  • heterologously disposed means that the native plant element, seedling, or plant does not contain detectable levels of the microbe in that same plant element, seedling, or plant.
  • “heterologously disposed” means that the microbe being applied is at a greater concentration, number, or amount in or on the plant element, seedling, or plant, than that which is naturally found in said plant element, seedling, or plant.
  • a microbe is heterologously disposed when present at a concentration that is at least 1.5 times greater, between 1.5 and 2 times greater, 2 times greater, between 2 and 3 times greater, 3 times greater, between 3 and 5 times greater, 5 times greater, between 5 and 7 times greater, 7 times greater, between 7 and 10 times greater, 10 times greater, or even greater than 10 times higher number, amount, or concentration than the concentration that was present prior to the disposition of said microbe.
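The fold-change criterion above can be expressed as a simple check. This is only a sketch: the function name, the units (any consistent measure of number, amount, or concentration), and the handling of a zero baseline are assumptions made for illustration, and the 1.5-fold default is the lowest threshold listed above.

```python
def exceeds_fold_threshold(before: float, after: float, min_fold: float = 1.5) -> bool:
    """True if `after` is at least `min_fold` times the level present `before`.

    A zero baseline (no detectable microbe before application) is treated as
    satisfied by any positive level afterwards; that choice is an assumption.
    """
    if before < 0 or after < 0:
        raise ValueError("levels must be non-negative")
    if before == 0:
        return after > 0
    return after / before >= min_fold


if __name__ == "__main__":
    print(exceeds_fold_threshold(100.0, 150.0))
    print(exceeds_fold_threshold(100.0, 140.0))
```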
  • a microbe that is naturally found in a tissue of a cupressaceous tree would be considered heterologous to tissue of a maize, wheat, cotton, or soybean plant.
  • a microbe that is naturally found in leaf tissue of a maize, spring wheat, cotton, or soybean plant is considered heterologous to a leaf tissue of another maize, spring wheat, cotton, or soybean plant that naturally lacks said microbe, or comprises the microbe in a different quantity.
  • Microbes can also be “heterologously disposed” on a given plant tissue. This means that the microbe is placed upon a plant tissue that it is not naturally found upon. For instance, if a given microbe only naturally occurs on the roots of a given plant, then that microbe could be exogenously applied to the above-ground tissue of a plant and would thereby be “heterologously disposed” upon said plant tissue. As such, a microbe is deemed heterologously disposed, when applied on a plant that does not naturally have the microbe present or does not naturally have the microbe present in the number that is being applied.
  • compositions and methods herein may provide for a “modulated” “agronomic trait” or “trait of agronomic importance” to a host plant, which may include, but not be limited to, the following: altered oil content, altered protein content, altered seed carbohydrate composition, altered seed oil composition, and altered seed protein composition, chemical tolerance, cold tolerance, delayed senescence, disease resistance, drought tolerance, ear weight, growth improvement, health enhancement, heat tolerance, herbicide tolerance, herbivore resistance, improved nitrogen fixation, improved nitrogen utilization, improved root architecture, improved water use efficiency, increased biomass, increased root length, increased seed weight, increased shoot length, increased yield, increased yield under water-limited conditions, kernel mass, kernel moisture content, metal tolerance, number of ears, number of kernels per ear, number of pods, nutrition enhancement, pathogen resistance, pest resistance, photosynthetic capability improvement, salinity tolerance, stay-green, vigor improvement, increased dry weight of mature seeds, increased fresh weight of mature seeds, increased number of mature seeds per plant, increased chlorophyll content
  • modulated it is intended to refer to a change in an agronomic trait that is changed by virtue of the presence of the microbe(s), exudate, broth, metabolite, etc.
  • the modulation provides for the imparting of a beneficial trait
  • CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
  • a CRISPR locus can consist of a CRISPR array, comprising short direct repeats (CRISPR repeats) separated by short variable DNA sequences (called spacers), which can be flanked by diverse Cas (CRISPR-associated) genes.
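The repeat/spacer layout of a CRISPR array can be illustrated with a short sketch that recovers the spacers when the repeat sequence is already known. The function name and the toy sequences are hypothetical, and exact repeat matching is a simplifying assumption; natural arrays often contain degenerate repeats that this approach would miss.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the spacers found between consecutive exact copies of `repeat`.

    Sequence before the first repeat and after the last repeat is treated as
    flanking DNA (for example leader sequence or Cas genes), not as a spacer.
    """
    if not repeat:
        raise ValueError("repeat must be non-empty")
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] precedes the first repeat; parts[-1] follows the last repeat.
    return [p for p in parts[1:-1] if p]


if __name__ == "__main__":
    repeat = "GTTTTAGA"  # toy repeat, not taken from the disclosure
    array = "AA" + repeat + "ACGTACGT" + repeat + "TTGGCCAA" + repeat + "CC"
    print(extract_spacers(array, repeat))
```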
  • an “effector” or “effector protein” is a protein that encompasses an activity including recognizing, binding to, and/or cleaving or nicking a polynucleotide target
  • An effector, or effector protein may also be an endonuclease.
  • the “effector complex” of a CRISPR system includes Cas proteins involved in crRNA and target recognition and binding. Some of the component Cas proteins may additionally comprise domains involved in target polynucleotide cleavage.
  • Cas protein refers to a polypeptide encoded by a Cas (CRISPR-associated) gene.
  • a Cas protein includes proteins encoded by a gene in a cas locus, and include adaptation molecules as well as interference molecules.
  • An interference molecule of a bacterial adaptive immunity complex includes endonucleases.
  • a Cas endonuclease described herein comprises one or more nuclease domains.
  • a Cas endonuclease includes but is not limited to: a Cas9 protein, a Cpf1 (Cas12) protein, a C2c1 protein, a C2c2 protein, a C2c3 protein, Cas3, Cas3-HD, Cas5, Cas7, Cas8, Cas10, or combinations or complexes of these.
  • a Cas protein may be a “Cas endonuclease” or “Cas effector protein”, that when in complex with a suitable polynucleotide component, is capable of recognizing, binding to, and optionally nicking or cleaving all or part of a specific polynucleotide target sequence.
  • Cas protein is further defined as a functional fragment or functional variant of a native Cas protein, or a protein that shares at least 50%, between 50% and 55%, at least 55%, between 55% and 60%, at least 60%, between 60% and 65%, at least 65%, between 65% and 70%, at least 70%, between 70% and 75%, at least 75%, between 75% and 80%, at least 80%, between 80% and 85%, at least 85%, between 85% and 90%, at least 90%, between 90% and 95%, at least 95%, between 95% and 96%, at least 96%, between 96% and 97%, at least 97%, between 97% and 98%, at least 98%, between 98% and 99%, at least 99%, between 99% and 100%, or 100% sequence identity with at least 50, between 50 and 100, at least 100, between 100 and 150, at least 150, between 150 and 200, at least 200, between 200 and 250, at least 250, between 250 and 300, at least 300, between 300 and 350, at least 350, between 350 and 400, at least
  • a “functional fragment”, “fragment that is functionally equivalent”, and “functionally equivalent fragment” of a Cas endonuclease are used interchangeably herein, and refer to a portion or subsequence of the Cas endonuclease of the present disclosure in which the ability to recognize, bind to, and optionally unwind, nick or cleave (introduce a single or double strand break in) the target site is retained.
  • the portion or subsequence of the Cas endonuclease can comprise a complete or partial (functional) peptide of any one of its domains such as for example, but not limiting to a complete or functional part of a Cas3 HD domain, a complete or functional part of a Cas3 Helicase domain, a complete or functional part of a protein (such as but not limiting to a Cas5, Cas5d, Cas7 and Cas8b1).
  • a Cas endonuclease may also include a multifunctional Cas endonuclease.
  • multifunctional Cas endonuclease and “multifunctional Cas endonuclease polypeptide” are used interchangeably herein and include reference to a single polypeptide that has Cas endonuclease functionality (comprising at least one protein domain that can act as a Cas endonuclease) and at least one other functionality, such as but not limited to, the functionality to form a complex (comprises at least a second protein domain that can form a complex with other proteins).
  • the multifunctional Cas endonuclease comprises at least one additional protein domain relative (either internally, upstream (5’), downstream (3'), or both internally 5’ and 3’, or any combination thereof) to those domains typical of a Cas endonuclease.
  • cascade and “cascade complex” are used interchangeably herein and include reference to a multi-subunit protein complex that can assemble with a polynucleotide forming a polynucleotide-protein complex (PNP).
  • Cascade is a PNP that relies on the polynucleotide for complex assembly and stability, and for the identification of target nucleic acid sequences.
  • Cascade functions as a surveillance complex that finds and optionally binds target nucleic acids that are complementary to a variable targeting domain of the guide polynucleotide.
  • cleavage-ready Cascade refers to a multi-subunit protein complex that can assemble with a polynucleotide forming a polynucleotide-protein complex (PNP), wherein one of the cascade proteins is a Cas endonuclease capable of recognizing, binding to, and optionally unwinding, nicking, or cleaving all or part of a target sequence.
  • single guide RNA and “sgRNA” are used interchangeably herein and relate to a synthetic fusion of two RNA molecules, a crRNA (CRISPR RNA) comprising a variable targeting domain (linked to a tracr mate sequence that hybridizes to a tracrRNA), fused to a tracrRNA (trans-activating CRISPR RNA).
  • the single guide RNA can comprise a crRNA or crRNA fragment and a tracrRNA or tracrRNA fragment of the type II CRISPR/Cas system that can form a complex with a type II Cas endonuclease, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, optionally bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site.
  • variable targeting domain or “VT domain” is used interchangeably herein and includes a nucleotide sequence that can hybridize (is complementary) to one strand (nucleotide sequence) of a double strand DNA target site.
  • the percent complementation between the first nucleotide sequence domain (VT domain) and the target sequence can be at least 50%, 51%,
  • variable targeting domain can be at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length. In some embodiments, the variable targeting domain comprises a contiguous stretch of 12 to 30 nucleotides.
  • the variable targeting domain can be composed of a DNA sequence, a RNA sequence, a modified DNA sequence, a modified RNA sequence, or any combination thereof.
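A small sketch of the two quantities discussed for the variable targeting domain: its percent complementarity to one strand of the target site, and a length check against the 12 to 30 nucleotide range given in some embodiments. The function names are hypothetical, and the example assumes an ungapped, antiparallel pairing of equal-length sequences with U in the guide treated as T.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def percent_complementarity(vt_domain: str, target_strand: str) -> float:
    """Percent of VT-domain positions that base-pair with `target_strand`.

    Both sequences are written 5' to 3' and must have equal length. The guide
    pairs antiparallel to the target strand, so it is compared against the
    reverse complement of that strand.
    """
    guide = vt_domain.upper().replace("U", "T")
    target = target_strand.upper()
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    perfect_guide = target.translate(_COMPLEMENT)[::-1]
    matches = sum(g == p for g, p in zip(guide, perfect_guide))
    return 100.0 * matches / len(guide)


def vt_length_in_range(vt_domain: str, min_len: int = 12, max_len: int = 30) -> bool:
    """Check the VT domain length against an assumed inclusive range."""
    return min_len <= len(vt_domain) <= max_len


if __name__ == "__main__":
    target = "AAACCCGGGTTTACGTACGT"
    print(percent_complementarity("ACGUACGUAAACCCGGGUUU", target))
    print(vt_length_in_range("ACGUACGUAAACCCGGGUUU"))
```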
  • CER domain of a guide polynucleotide
  • a CER domain comprises a (trans-acting) tracrNucleotide mate sequence followed by a tracrNucleotide sequence.
  • the CER domain can be composed of a DNA sequence, a RNA sequence, a modified DNA sequence, a modified RNA sequence (see for example US20150059010A1, published 26 February 2015), or any combination thereof.
  • the terms “guide polynucleotide/Cas endonuclease complex”, “guide polynucleotide/Cas endonuclease system”, “guide polynucleotide/Cas complex”, “guide polynucleotide/Cas system”, “guided Cas system”, and “polynucleotide-guided endonuclease” are used interchangeably herein and refer to at least one guide polynucleotide and at least one Cas endonuclease, that are capable of forming a complex, wherein said guide polynucleotide/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, bind to, and optionally nick or cleave (introduce a single or
  • guide RNA/Cas endonuclease complex refers to at least one RNA component and at least one Cas endonuclease that are capable of forming a complex, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site.
  • target site refers to a polynucleotide sequence such as, but not limited to, a nucleotide sequence on a chromosome, episome, a locus, or any other DNA molecule in the genome (including chromosomal, chloroplastic, mitochondrial DNA, plasmid DNA) of a cell, at which a guide polynucleotide/Cas endonuclease complex can recognize, bind to, and optionally nick or cleave
  • the target site can be an endogenous site in the genome of a cell, or alternatively, the target site can be heterologous to the cell and thereby not be naturally occurring in the genome of the cell, or the target site can be found in a heterologous genomic location compared to where it occurs in nature.
  • a “protospacer adjacent motif” (PAM) herein refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that is recognized (targeted) by a guide polynucleotide/Cas endonuclease system described herein.
  • the Cas endonuclease may not successfully recognize a target DNA sequence if the target DNA sequence is not followed by a PAM sequence.
  • the sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used.
  • the PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long.
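  • by way of non-limiting illustration, the requirement that a target sequence be followed by a PAM can be sketched as a simple scan. The sketch below is an assumption-laden example, not a method of the disclosure: the function name `find_target_sites`, the 20-nucleotide protospacer length, and the example PAM `NGG` are hypothetical choices made only for illustration, and only the strand supplied is scanned.

```python
import re

# IUPAC nucleotide codes, so that degenerate PAMs such as "NGG" can be written compactly.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_target_sites(dna, pam, protospacer_len=20):
    """Return (start, protospacer, pam_found) for every protospacer that is
    immediately followed by a sequence matching `pam` on the given strand.

    `protospacer_len` and the PAM pattern are illustrative assumptions.
    """
    dna = dna.upper()
    pam_re = re.compile("".join(IUPAC[base] for base in pam.upper()))
    hits = []
    last_start = len(dna) - protospacer_len - len(pam)
    for start in range(last_start + 1):
        pam_start = start + protospacer_len
        candidate = dna[pam_start:pam_start + len(pam)]
        if pam_re.fullmatch(candidate):
            hits.append((start, dna[start:pam_start], candidate))
    return hits


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
    for hit in find_target_sites(example, "NGG"):
        print(hit)
```

  A sliding window is used rather than a single regular-expression search so that overlapping candidate sites are all reported.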
  • an “altered target site”, “altered target sequence”, “modified target site”, “alterations”, and “modified target sequence” are used interchangeably herein and refer to a target sequence as disclosed herein that comprises at least one alteration when compared to a non-altered target sequence.
  • Such “alterations” include, for example: (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, (iv) a chemical alteration of at least one nucleotide, and/or (v) any combination of the preceding.
  • a “modified nucleotide” or “edited nucleotide” refers to a nucleotide sequence of interest that comprises at least one alteration when compared to its non-modified nucleotide sequence.
  • donor polynucleotide is a polynucleotide construct (e.g., DNA) that comprises a polynucleotide of interest to be inserted into the target site of a Cas endonuclease.
  • polynucleotide modification template includes a polynucleotide that comprises at least one nucleotide modification when compared to the nucleotide sequence to be edited.
  • a nucleotide modification can be at least one nucleotide substitution, addition or deletion.
  • the polynucleotide modification template can further comprise homologous nucleotide sequences flanking the at least one nucleotide modification, wherein the flanking homologous nucleotide sequences provide sufficient homology to the desired nucleotide sequence to be edited.
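  • by way of non-limiting illustration, a polynucleotide modification template with flanking homologous sequences can be assembled as sketched below. The function name `build_modification_template`, the 0-based half-open coordinates, and the arm length are hypothetical; the disclosure does not specify how much homology is sufficient, so `arm_len` is purely a placeholder.

```python
def build_modification_template(sequence, edit_start, edit_end, new_bases, arm_len):
    """Return left homology arm + modification + right homology arm.

    `sequence[edit_start:edit_end]` is the stretch to be edited (0-based,
    half-open). An empty `new_bases` models a deletion; `edit_start == edit_end`
    models an insertion. `arm_len` is an illustrative assumption.
    """
    if not 0 <= edit_start <= edit_end <= len(sequence):
        raise ValueError("edit coordinates fall outside the sequence")
    if edit_start - arm_len < 0 or edit_end + arm_len > len(sequence):
        raise ValueError("not enough flanking sequence for the requested arms")
    left_arm = sequence[edit_start - arm_len:edit_start]
    right_arm = sequence[edit_end:edit_end + arm_len]
    return left_arm + new_bases + right_arm


if __name__ == "__main__":
    locus = "AAAACCCCGGGGTTTT"
    # Substitute the single base at index 8 (a G) with a T, keeping 4-base arms.
    print(build_modification_template(locus, 8, 9, "T", 4))
```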
  • microorganism should be taken broadly. It includes, but is not limited to, prokaryotic Bacteria and Archaea, as well as eukaryotic Fungi and Protists.
  • the microorganisms may include: Proteobacteria (such as Pseudomonas, Enterobacter, Stenotrophomonas, Burkholderia, Rhizobium, Herbaspirillum, Pantoea, Serratia, Rahnella, Azospirillum, Azorhizobium, Azotobacter, Duganella, Delftia, Bradyrhizobium, Sinorhizobium, Variovorax and Halomonas), Firmicutes (such as Bacillus, Paenibacillus, Lactobacillus, Mycoplasma, and Acetobacterium), Actinobacteria (such as Brevibacterium, Janibacter, Streptomyces, Rhodococcus, Microbacterium, Curtobacterium, Cellulomonas, and Nocardioides), and the fungi Ascomycota (such as Trichoderma, Ampelomyces, Coniothyrium, Pae
  • the microorganism is an endophyte, or an epiphyte, or a microorganism inhabiting the plant rhizosphere or rhizosheath. That is, the microorganism may be found present in the soil material adhered to the roots of a plant or in the area immediately adjacent a plant’s roots.
  • the microorganism is an endophyte.
  • Endophytes may benefit host plants by preventing pathogenic organisms from colonizing them. Extensive colonization of the plant tissue by endophytes creates a “barrier effect,” where the local endophytes outcompete and prevent pathogenic organisms from taking hold. Endophytes may also produce chemicals which inhibit the growth of competitors, including pathogenic organisms.
  • the microorganism is unculturable. This should be taken to mean that the microorganism is not known to be culturable or is difficult to culture using methods known to one skilled in the art.
  • Microorganisms of the present disclosure may be collected or obtained from any source or contained within and/or associated with material collected from any source.
  • the microorganisms are obtained from any general terrestrial environment, including its soils, plants, fungi, animals (including invertebrates) and other biota, including the sediments, water and biota of lakes and rivers; from the marine environment, its biota and sediments (for example sea water, marine muds, marine plants, marine invertebrates (for example sponges), marine vertebrates (for example, fish)); the terrestrial and marine geosphere (regolith and rock, for example crushed subterranean rocks, sand and clays); the cryosphere and its meltwater; the atmosphere (for example, filtered aerial dusts, cloud and rain droplets); urban, industrial and other man-made environments (for example, accumulated organic and mineral matter on concrete, roadside gutters, roof surfaces, road surfaces).
  • the microorganisms are collected from a source likely to favor the selection of appropriate microorganisms.
  • the source may be a particular environment in which it is desirable for other plants to grow, or which is thought to be associated with terroir.
  • the source may be a plant having one or more desirable traits, for example a plant which naturally grows in a particular environment or under certain conditions of interest
  • a certain plant may naturally grow in sandy soil or sand of high salinity, or under extreme temperatures, or with little water, or it may be resistant to certain pests or disease present in the environment and it may be desirable for a commercial crop to be grown in such conditions, particularly if they are, for example, the only conditions available in a particular geographic location.
  • the microorganisms may be collected from commercial crops grown in such environments, or more specifically from individual crop plants best displaying a trait of interest amongst a crop grown in any specific environment, for example the fastest-growing plants amongst a crop grown in saline-limiting soils, or the least damaged plants in crops exposed to severe insect damage or disease epidemic, or plants having desired quantities of certain metabolites and other compounds, including fiber content, oil content and the like, or plants displaying desirable colors, taste, or smell.
  • the microorganisms may be collected from a plant of interest or any material occurring in the environment of interest including fungi and other animal and plant biota, soil, water, sediments, and other elements of the environment as referred to previously.
  • the microorganisms are individual isolates separated from different environments.
  • a microorganism or a combination of microorganisms, of use in the methods of the disclosure may be selected from a pre-existing collection of individual microbial species or strains based on some knowledge of their likely or predicted benefit to a plant.
  • the microorganism may be predicted to: improve nitrogen fixation; release phosphate from the soil organic matter; release phosphate from the inorganic forms of phosphate (e.g., rock phosphate); “fix carbon” in the root microsphere; live in the rhizosphere of the plant thereby assisting the plant in absorbing nutrients from the surrounding soil and then providing these more readily to the plant; increase the number of nodules on the plant roots and thereby increase the number of symbiotic nitrogen fixing bacteria (e.g., Rhizobium species) per plant and the amount of nitrogen fixed by the plant; elicit plant defensive responses such as ISR (induced systemic resistance) or SAR (systemic acquired resistance) which help the plant resist the invasion and spread of pathogenic microorganisms; compete with microorganisms deleterious to plant growth or health by antagonism, or competitive utilization of resources such as nutrients or space; change the color of one or more part of the plant, or change the chemical profile of the plant, its smell, taste or one or more other quality.
  • a microorganism or combination of microorganisms is selected from a pre-existing collection of individual microbial species or strains that provides no knowledge of their likely or predicted benefit to a plant. For example, a collection of unidentified microorganisms isolated from plant tissues without any knowledge of their ability to improve plant growth or health, or a collection of microorganisms collected to explore their potential for producing compounds that could lead to the development of pharmaceutical drugs.
  • the microorganisms are acquired from the source material (for example, soil, rock, water, air, dust, plant or other organism) in which they naturally reside.
  • the microorganisms may be provided in any appropriate form, having regard to its intended use in the methods of the disclosure. However, by way of example only, the microorganisms may be provided as an aqueous suspension, gel, homogenate, granule, powder, slurry, live organism or dried material.
  • the microorganisms of the disclosure may be isolated in substantially pure or mixed cultures. They may be concentrated, diluted, or provided in the natural concentrations in which they are found in the source material.
  • microorganisms from saline sediments may be isolated for use in this disclosure by suspending the sediment in fresh water and allowing the sediment to fall to the bottom.
  • the water containing the bulk of the microorganisms may be removed by decantation after a suitable period of settling and either applied directly to the plant growth medium, or concentrated by filtering or centrifugation, diluted to an appropriate concentration and applied to the plant growth medium with the bulk of the salt removed.
  • microorganisms from mineralized or toxic sources may be similarly treated to recover the microbes for application to the plant growth material to minimize the potential for damage to the plant.
  • the microorganisms are used in a crude form, in which they are not isolated from the source material in which they naturally reside.
  • the microorganisms are provided in combination with the source material in which they reside; for example, as soil, or the roots, seed or foliage of a plant.
  • the source material may include one or more species of microorganisms.
  • a mixed population of microorganisms is used in the methods of the disclosure.
  • any one or a combination of a number of standard techniques which will be readily known to skilled persons may be used.
  • these in general employ processes by which a solid or liquid culture of a single microorganism can be obtained in a substantially pure form, usually by physical separation on the surface of a solid microbial growth medium or by volumetric dilutive isolation into a liquid microbial growth medium.
  • These processes may include isolation from dry material, liquid suspension, slurries or homogenates in which the material is spread in a thin layer over an appropriate solid gel growth medium, or serial dilutions of the material made into a sterile medium and inoculated into liquid or solid culture media.
  • the material containing the microorganisms may be pre-treated prior to the isolation process in order to either multiply all microorganisms in the material, or select portions of the microbial population, either by enriching the material with microbial nutrients (for example, by pasteurizing the sample to select for microorganisms resistant to heat exposure (for example, bacilli), or by exposing the sample to low concentrations of an organic solvent or sterilant (for example, household bleach) to enhance the survival of spore-forming or solvent-resistant microorganisms). Microorganisms can then be isolated from the enriched materials or materials treated for selective survival, as above.
  • endophytic or epiphytic microorganisms are isolated from plant material. Any number of standard techniques known in the art may be used and the microorganisms may be isolated from any appropriate tissue in the plant, including for example root, stem and leaves, and plant reproductive tissues.
  • conventional methods for isolation from plants typically include the sterile excision of the plant material of interest (e.g., root or stem lengths, leaves), surface sterilization with an appropriate solution (e.g., 2% sodium hypochlorite), after which the plant material is placed on nutrient medium for microbial growth (See, for example, Strobel G and Daisy B (2003) Microbiology and Molecular Biology Reviews 67 (4): 491-502; Zinniel DK et al. (2002) Applied and Environmental Microbiology 68 (5): 2198-2208).
  • the microorganisms are isolated from root tissue. Further methodology for isolating microorganisms from plant material is detailed hereinafter. In one embodiment, the microbial population is exposed (prior to the method or at any stage of the method) to a selective pressure. For example, exposure of the microorganisms to pasteurization before their addition to a plant growth medium (preferably sterile) is likely to enhance the probability that the plants selected for a desired trait will be associated with spore-forming microbes that can more easily survive in adverse conditions, in commercial storage, or if applied to seed as a coating, in an adverse environment.
  • the microorganism(s) may be used in crude form and need not be isolated from a plant or a media.
  • plant material or growth media which includes the microorganisms identified to be of benefit to a selected plant may be obtained and used as a crude source of microorganisms for the next round of the method or as a crude source of microorganisms at the conclusion of the method.
  • whole plant material could be obtained and optionally processed, such as mulched or crushed.
  • individual tissues or parts of selected plants may be separated from the plant and optionally processed, such as mulched or crushed.
  • one or more part of a plant which is associated with the second set of one or more microorganisms may be removed from one or more selected plants and, where any successive repeat of the method is to be conducted, grafted on to one or more plant used in any step of the plant breeding methods.
  • the present disclosure provides isolated microbes, including novel strains of identified microbial species, presented in Table 1. In other aspects, the present disclosure provides isolated whole microbial cultures of the species and strains identified in Table 1. These cultures may comprise microbes at various concentrations.
  • the disclosure provides for utilizing a microbe selected from Table 1 in agriculture.
  • a microbe from the genus Bacillus is utilized in agriculture to impart one or more beneficial properties to a plant species.
  • the disclosure relates to microbes having characteristics substantially similar to that of a microbe identified in Table 1.
  • the isolated microbial species, and novel strains of said species, identified in the present disclosure, are able to impart beneficial properties or traits, such as a trait of agronomic importance, to target plant species.
  • the isolated microbes described in Table 1, or consortia of said microbes are able to improve plant health and vitality.
  • the improved plant health and vitality can be quantitatively measured, for example, by measuring the effect that said microbial application has upon a plant phenotypic or genotypic trait.
  • microbes of the present disclosure were obtained, among other places, at various locales in New Zealand and the United States
  • microbes of Table 1 were identified by utilizing standard techniques to characterize the microbes’ phenotype, which was then utilized to identify the microbe to a taxonomically recognized species. Alternatively, the microbes of Table 1 were sequenced (16S and/or Whole Genome Sequencing, according to methods known in the art) to determine taxonomy.
  • the isolation, identification, and culturing of the microbes of the present disclosure can be effected using standard microbiological techniques. Examples of such techniques may be found in Gerhardt, P. (ed.) Methods for General and Molecular Microbiology. American Society for Microbiology, Washington, D.C. (1994) and Lennette, E. H. (ed.) Manual of Clinical Microbiology, Third Edition. American Society for Microbiology, Washington, D.C. (1980), each of which is incorporated by reference.
  • Isolation can be effected by streaking the specimen on a solid medium (e.g., nutrient agar plates) to obtain a single colony, which is characterized by the phenotypic traits described hereinabove (e.g., Gram positive/negative, capable of forming spores aerobically/anaerobically, cellular morphology, carbon source metabolism, acid/base production, enzyme secretion, metabolic secretions, etc.) and to reduce the likelihood of working with a culture which has become contaminated.
  • biologically pure isolates can be obtained through repeated subculture of biological samples, each subculture followed by streaking onto solid media to obtain individual colonies.
  • Methods of preparing, thawing, and growing lyophilized bacteria are commonly known, for example, Gherna, R. L. and C. A. Reddy. 2007. Culture Preservation, p 1019-1033. In C. A. Reddy, T. J. Beveridge, J. A. Breznak, G. A. Marzluf, T. M. Schmidt, and L. R. Snyder, eds. American Society for Microbiology, Washington, D.C., 1033 pages; herein incorporated by reference.
  • freeze-dried liquid formulations and cultures stored long term at -70°C in solutions containing glycerol are contemplated for use in providing formulations of the present inventions.
  • the bacteria of the disclosure can be propagated in a liquid medium under aerobic conditions.
  • Medium for growing the bacterial strains of the present disclosure includes a carbon source, a nitrogen source, and inorganic salts, as well as specially required substances such as vitamins, amino acids, nucleic acids and the like.
  • suitable carbon sources which can be used for growing the bacterial strains include, but are not limited to, starch, peptone, yeast extract, amino acids, sugars such as glucose, arabinose, mannose, glucosamine, maltose, and the like; salts of organic acids such as acetic acid, fumaric acid, adipic acid, propionic acid, citric acid, gluconic acid, malic acid, pyruvic acid, malonic acid and the like; alcohols such as ethanol and glycerol and the like; oil or fat such as soybean oil, rice bran oil, olive oil, corn oil, sesame oil.
  • the amount of the carbon source added varies according to the kind of carbon source and is typically between 1 and 100 grams per liter of medium.
  • glucose, starch, and/or peptone is contained in the medium as a major carbon source, at a concentration of 0.1-5% (W/V).
  • suitable nitrogen sources which can be used for growing the bacterial strains of the present invention include, but are not limited to, amino acids, yeast extract, tryptone, beef extract, peptone, potassium nitrate, ammonium nitrate, ammonium chloride, ammonium sulfate, ammonium phosphate, ammonia or combinations thereof.
  • the amount of nitrogen source varies according to the type of nitrogen source and is typically between 0.1 and 30 grams per liter of medium.
  • the inorganic salts potassium dihydrogen phosphate, dipotassium hydrogen phosphate, disodium hydrogen phosphate, magnesium sulfate, magnesium chloride, ferric sulfate, ferrous sulfate, ferric chloride, ferrous chloride, manganous sulfate, manganous chloride, zinc sulfate, zinc chloride, cupric sulfate, calcium chloride, sodium chloride, calcium carbonate, sodium carbonate can be used alone or in combination.
  • the amount of inorganic salt varies according to the kind of inorganic salt and is typically between 0.001 and 10 grams per liter of medium.
  • specially required substances include, but are not limited to, vitamins, nucleic acids, yeast extract, peptone, meat extract, malt extract, dried yeast and combinations thereof. Cultivation can be effected at a temperature, which allows the growth of the bacterial strains, essentially, between 20°C and 46°C. In some aspects, a temperature range is 30°C-37°C.
  • the medium can be adjusted to pH 7.0-7.4. It will be appreciated that commercially available media may also be used to culture the bacterial strains, such as Nutrient Broth or Nutrient Agar available from Difco, Detroit, MI. It will be appreciated that cultivation time may differ depending on the type of culture medium used and the concentration of sugar as a major carbon source.
  • cultivation lasts between 24-96 hours.
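  • by way of non-limiting illustration, the typical culture ranges recited above can be collected into a simple check on a proposed recipe. The parameter names and the function `out_of_range` are hypothetical labels introduced for this sketch; the numeric bounds are those recited in the text, and a value outside them is merely atypical rather than excluded.

```python
# Typical ranges recited in the text: grams per liter of medium, degrees
# Celsius, pH units, and hours. The dictionary keys are hypothetical labels.
TYPICAL_RANGES = {
    "carbon_g_per_l": (1.0, 100.0),
    "nitrogen_g_per_l": (0.1, 30.0),
    "inorganic_salt_g_per_l": (0.001, 10.0),
    "temperature_c": (20.0, 46.0),
    "ph": (7.0, 7.4),
    "cultivation_h": (24.0, 96.0),
}


def out_of_range(recipe):
    """Return {parameter: value} for each supplied parameter that falls
    outside its typical range. Parameters absent from `recipe` are skipped."""
    problems = {}
    for name, (low, high) in TYPICAL_RANGES.items():
        if name in recipe and not (low <= recipe[name] <= high):
            problems[name] = recipe[name]
    return problems


if __name__ == "__main__":
    recipe = {"carbon_g_per_l": 20, "nitrogen_g_per_l": 5,
              "temperature_c": 50, "ph": 7.2}
    print(out_of_range(recipe))
```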
  • Bacterial cells thus obtained are isolated using methods, which are well known in the art. Examples include, but are not limited to, membrane filtration and centrifugal separation. The pH may be adjusted using sodium hydroxide and the like and the culture may be dried using a freeze dryer, until the water content becomes equal to 4% or less.
  • Microbial co-cultures may be obtained by propagating each strain as described hereinabove. It will be appreciated that the microbial strains may be cultured together when compatible culture conditions can be employed.
  • Microbes can be distinguished into a genus based on polyphasic taxonomy, which incorporates all available phenotypic and genotypic data into a consensus classification (Vandamme et al. 1996. Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol Rev 1996, 60:407-438).
  • One accepted genotypic method for defining species is based on overall genomic relatedness, such that strains which share approximately 70% or more relatedness using DNA-DNA hybridization, with 5°C or less ΔTm (the difference in the melting temperature between homologous and heterologous hybrids), under standard conditions, are considered to be members of the same species. Thus, populations that share greater than the aforementioned 70% threshold can be considered to be variants of the same species.
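  • by way of non-limiting illustration, the DNA-DNA hybridization criterion can be written as a two-part test. The function name `same_species_by_ddh` is hypothetical, and because the text describes the relatedness cutoff as approximate, the hard comparison against 70% is a simplifying assumption.

```python
def same_species_by_ddh(relatedness_pct, delta_tm_c,
                        min_relatedness_pct=70.0, max_delta_tm_c=5.0):
    """Return True when two strains meet both recited criteria:
    DNA-DNA hybridization relatedness of about 70% or more, and a melting
    temperature difference between homologous and heterologous hybrids
    of 5 degrees Celsius or less."""
    return relatedness_pct >= min_relatedness_pct and delta_tm_c <= max_delta_tm_c


if __name__ == "__main__":
    print(same_species_by_ddh(82.0, 3.1))
    print(same_species_by_ddh(82.0, 6.4))
    print(same_species_by_ddh(55.0, 2.0))
```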
  • the 16S rRNA sequences are often used for determining taxonomy and making distinctions between species, in that if a 16S rRNA sequence shares less than a specified % sequence identity from a reference sequence, then the two organisms from which the sequences were obtained are said to be of different species.
  • microbes could be of the same species, if they share at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity across the 16S or 16S rRNA or rDNA sequence. In some aspects, a microbe could be considered to be the same species only if it shares at least 95% identity.
  • microbial strains of a species can be defined as those that share at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity across the 16S rRNA sequence.
  • Comparisons may also be made with 23S rRNA sequences against reference sequences.
  • a microbe could be considered to be the same strain only if it shares at least 95% identity.
  • substantially similar genetic characteristics means a microbe sharing at least 95% identity.
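  • by way of non-limiting illustration, a percent sequence identity comparison against such a threshold can be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length with `-` as the gap character, counts a gap opposite a base as a mismatch, and ignores columns that are gaps in both sequences; other identity definitions exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def shares_identity(aligned_a, aligned_b, threshold_pct=95.0):
    """True when the pair meets or exceeds the chosen identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


if __name__ == "__main__":
    query = "ACGTACGTAC"
    reference = "ACGTACGTAA"
    print(percent_identity(query, reference))
    print(shares_identity(query, reference))
```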
  • the internal transcribed spacer (ITS) region has the highest probability of successful identification for the broadest range of fungi, with the most clearly defined barcode gap between inter- and intraspecific variation, and has been proposed as the formal fungal identification sequence (Schoch et al., PNAS April 17, 2012 109 (16) 6241-6246).
  • microbial strains of the present disclosure include those that comprise polynucleotide sequences that share at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
  • microbes of the present disclosure include those that comprise polynucleotide sequences that share at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
  • microbial consortia of the present disclosure include two or more microbes that comprise polynucleotide sequences that share at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
  • microbial consortia of the present disclosure include two or more microbial strains, wherein at least one of those comprises a polynucleotide sequence that shares at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
  • microbial consortia of the present disclosure include two or more microbial strains, wherein at least one of those comprises a polynucleotide sequence that shares at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179, and wherein at least one of the microbes is optionally selected from Table 1.
  • MLSA has been used successfully to explore clustering patterns among large numbers of strains assigned to very closely related species by current taxonomic methods, to look at the relationships between small numbers of strains within a genus, or within a broader taxonomic grouping, and to address specific taxonomic questions. More generally, the method can be used to ask whether bacterial species exist - that is, to observe whether large populations of similar strains invariably fall into well-resolved clusters, or whether in some cases there is a genetic continuum in which clear separation into clusters is not observed.
  • a determination of phenotypic traits such as morphological, biochemical, and physiological characteristics is made for comparison with a reference genus archetype.
  • the colony morphology can include color, shape, pigmentation, production of slime, etc.
  • Features of the cell are described as to shape, size, Gram reaction, extracellular material, presence of endospores, flagella presence and location, motility, and inclusion bodies.
  • Biochemical and physiological features describe growth of the organism at different ranges of temperature, pH, salinity and atmospheric conditions, growth in presence of different sole carbon and nitrogen sources.
  • agar e.g., YMA
  • bacterial microbes taught herein were identified utilizing 16S rRNA gene sequences. It is known in the art that 16S rRNA contains hypervariable regions that can provide species/strain-specific signature sequences useful for bacterial identification. In the present disclosure, many of the microbes were identified via partial (500 - 1200 bp) 16S rRNA sequence signatures.
  • each strain represents a pure colony isolate that was selected from an agar plate. Selections were made to represent the diversity of organisms present based on any defining morphological characteristics of colonies on agar medium. The medium used, in embodiments, was R2A, PDA, Nitrogen-free semi-solid medium, or MRS agar. Colony descriptions of each of the ‘picked’ isolates were made after 24-hour growth and then entered into our database. Sequence data was subsequently obtained for each of the isolates.
  • CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
  • Short repeats of an infective virus’ DNA were found in an array pattern within the target bacterium, with the repeats interspersed with spacer sequences unique to that bacterium. It was determined that future infections in that bacterium by the same virus were able to be combatted, as the bacterium produced an endonuclease (Cas endonuclease) that was able to specifically target that virus, using the repeat sequences stored in the array.
  • a CRISPR-Cas system comprises, at a minimum, a CRISPR RNA (crRNA) molecule and at least one CRISPR-associated (Cas) protein to form a crRNA ribonucleoprotein (crRNP) effector complex.
  • CRISPR-Cas loci comprise an array of identical repeats interspersed with DNA-targeting spacers that encode the crRNA components and an operon-like unit of cas genes encoding the Cas protein components.
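  • by way of non-limiting illustration, the layout of a CRISPR array as identical repeats interspersed with spacers can be sketched by splitting the array on the repeat. The sketch assumes perfectly identical repeats and a known repeat sequence; real arrays may contain degenerate repeats, and the function name and toy sequences are hypothetical.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers lying between consecutive copies of `repeat`.

    Sequence before the first repeat and after the last repeat is treated
    as flanking sequence and discarded. Fewer than two repeats yields an
    empty list, because no spacer is bounded on both sides.
    """
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")
    parts = array_seq.upper().split(repeat.upper())
    return parts[1:-1]


if __name__ == "__main__":
    repeat = "GTTTTAGAGCTA"          # toy repeat, chosen only for illustration
    spacer_1 = "AAACCCGGGTTT"
    spacer_2 = "TTTGGGCCCAAA"
    array_seq = "CC" + repeat + spacer_1 + repeat + spacer_2 + repeat + "GG"
    print(extract_spacers(array_seq, repeat))
```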
  • the resulting ribonucleoprotein complex recognizes a polynucleotide in a sequence-specific manner (Jore et al., Nature Structural & Molecular Biology 18, 529-536 (2011)).
  • the crRNA serves as a guide RNA for sequence specific binding of the effector (protein or complex) to double strand DNA sequences, by forming base pairs with the complementary DNA strand while displacing the noncomplementary strand to form a so-called R-loop.
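  • by way of non-limiting illustration, the base pairing between the crRNA spacer and the complementary DNA strand can be sketched with plain string operations. The sketch assumes perfect Watson-Crick pairing with no mismatches or bulges, and all function names are hypothetical.

```python
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Allowed (RNA base, DNA base) pairs in an RNA-DNA hybrid.
RNA_DNA_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def reverse_complement(dna):
    """Complementary DNA strand, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def spacer_rna_for(displaced_strand):
    """RNA spacer carrying the same sequence as the displaced
    (noncomplementary) DNA strand, with U in place of T."""
    return displaced_strand.upper().replace("T", "U")


def pairs_with(rna, dna_strand):
    """True if `rna` (5' to 3') base-pairs antiparallel with `dna_strand`
    (5' to 3') along its full length."""
    if len(rna) != len(dna_strand):
        return False
    return all((r, d) in RNA_DNA_PAIRS
               for r, d in zip(rna.upper(), reversed(dna_strand.upper())))


if __name__ == "__main__":
    displaced = "ATGCA"
    complementary = reverse_complement(displaced)
    guide = spacer_rna_for(displaced)
    print(guide, complementary, pairs_with(guide, complementary))
```

  The guide pairs with the complementary strand and not with the displaced strand, which is the geometry that leaves the displaced strand looped out.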
  • CRISPR-Cas systems have been classified according to sequence and structural analysis of genomic loci and the associated encoded protein(s). Multiple CRISPR/Cas systems have been described including Class 1 systems, with multi-subunit effector complexes (comprising type I, type III, and type IV), and Class 2 systems, with single protein effectors (comprising type II, type V, and type VI) (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13: 1-15; Zetsche et al., 2015, Cell 163, 1-13; Shmakov et al., 2015, Molecular Cell 60, 1-13; Haft et al., 2005, Computational Biology, PLoS Comput Biol 1(6):e60; and Koonin et al. 2017, Curr Opinion Microbiology 37:67-78).
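  • by way of non-limiting illustration, the two-class scheme (Class 1 with types I, III, and IV; Class 2 with types II, V, and VI) can be held in a small lookup table. The dictionary layout and the function name `class_of` are hypothetical, and subtypes are not modeled.

```python
CRISPR_CLASSES = {
    "Class 1": {
        "effector": "multi-subunit effector complex",
        "types": ("I", "III", "IV"),
    },
    "Class 2": {
        "effector": "single protein effector",
        "types": ("II", "V", "VI"),
    },
}


def class_of(system_type):
    """Return the class name for a type given as a Roman numeral string."""
    key = system_type.strip().upper()
    for class_name, info in CRISPR_CLASSES.items():
        if key in info["types"]:
            return class_name
    raise KeyError("unknown CRISPR-Cas type: " + repr(system_type))


if __name__ == "__main__":
    for system_type in ("I", "II", "III", "IV", "V", "VI"):
        name = class_of(system_type)
        print(system_type, name, CRISPR_CLASSES[name]["effector"])
```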
  • Class I CRISPR-Cas Systems comprise Types I, III, and IV.
  • a characteristic feature of Class I systems is the presence of an effector endonuclease complex instead of a single protein.
  • a Cascade complex comprises an RNA recognition motif (RRM) and a nucleic acid-binding domain that is the core fold of the diverse RAMP (Repeat-Associated Mysterious Proteins) protein superfamily (Makarova et al. 2013, Biochem Soc Trans 41, 1392-1400; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
  • RAMP protein subunits include Cas5 and Cas7 (which comprise the skeleton of the crRNA-effector complex), wherein the Cas5 subunit binds the 5' handle of the crRNA and interacts with the large subunit, and often includes Cas6 which is loosely associated with the effector complex and typically functions as the repeat-specific RNase in the pre-crRNA processing (Charpentier et al., FEMS Microbiol Rev 2015, 39:428-441; Niewoehner et al., RNA 2016, 22:318-329).
  • Type I CRISPR-Cas systems comprise a complex of effector proteins, termed Cascade (CRISPR-associated complex for antiviral defense) comprising at a minimum Cas5 and Cas7.
  • the effector complex functions together with a single CRISPR RNA (crRNA) and Cas3 to defend against invading viral DNA (Brouns, S. J. J. et al. Science 321:960-964; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
  • Type I CRISPR-Cas loci comprise the signature gene cas3 (or a variant cas3' or cas3"), which encodes a metal-dependent nuclease that possesses a single-stranded DNA (ssDNA)-stimulated superfamily 2 helicase with a demonstrated capacity to unwind double stranded DNA (dsDNA) and RNA-DNA duplexes (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
  • the Cas3 endonuclease is recruited to the Cascade-crRNA-target DNA complex to cleave and degrade the DNA target (Westra, E. R. et al. (2012) Molecular Cell 46:595-605, Sinkunas, T. et al. (2011) EMBO J. 30: 1335-1342, and Sinkunas, T. et al. (2013) EMBO J.
  • Cas6 can be the active endonuclease that is responsible for crRNA processing, and Cas5 and Cas7 function as non-catalytic RNA-binding proteins; although in type I-C systems, crRNA processing can be catalyzed by Cas5 (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). Type I systems are divided into seven subtypes (Makarova et al. 2011, Nat Rev Microbiol. 2011 9(6):467-477; Koonin et al. 2017, Curr Opinion Microbiology 37:67-78).
  • a modified type I CRISPR-associated complex for adaptive antiviral defense comprising at least the protein subunits Cas7, Cas5 and Cas6, wherein one of these subunits is synthetically fused to a Cas3 endonuclease or a modified restriction endonuclease, FokI, has been described (WO2013098244, published Jul. 4, 2013).
  • Type III CRISPR-Cas systems, comprising a plurality of cas7 genes, target either ssRNA or ssDNA, and function as both an RNase and a target RNA-activated DNA nuclease (Tamulaitis et al., Trends in Microbiology 25(10)49-61, 2017).
  • Csm (Type III-A) and Cmr (Type III-B) complexes function as RNA-activated single-stranded (ss) DNases that couple the target RNA binding/cleavage with ssDNA degradation.
  • the CRISPR RNA (crRNA)-guided binding of the Csm or Cmr complex to the emerging transcript recruits Cas10 DNase to the actively transcribed phage DNA, resulting in degradation of both the transcript and phage DNA, but not the host DNA.
  • the Cas10 HD-domain is responsible for the ssDNase activity, and Csm3/Cmr4 subunits are responsible for the endoribonuclease activity of the Csm/Cmr complex.
  • the 3'-flanking sequence of the target RNA is critical for the ssDNase activity of Csm/Cmr: the basepairing with the 5'-handle of crRNA protects host DNA from degradation.
  • Type IV systems, although comprising typical type I cas5 and cas7 domains in addition to a cas8-like domain, may lack the CRISPR array that is characteristic of most other CRISPR-Cas systems.
  • Class II CRISPR-Cas systems comprise Types II, V, and VI.
  • a characteristic feature of Class II systems is the presence of a single Cas effector protein instead of an effector complex.
  • Types II and V Cas proteins comprise an RuvC endonuclease domain that adopts the RNase H fold.
  • Type II CRISPR/Cas systems employ a crRNA and tracrRNA (trans-activating CRISPR RNA) to guide the Cas endonuclease to its DNA target.
  • the crRNA comprises a spacer region complementary to one strand of the double strand DNA target and a region that base pairs with the tracrRNA (trans-activating CRISPR RNA), forming an RNA duplex that directs the Cas endonuclease to cleave the DNA target, leaving a blunt end. Spacers are acquired through a not fully understood process involving Cas1 and Cas2 proteins.
  • Type II CRISPR/Cas loci typically comprise cas1 and cas2 genes in addition to the cas9 gene (Chylinski et al., 2013, RNA Biology 10:726-737; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
  • Type II CRISPR-Cas loci can encode a tracrRNA, which is partially complementary to the repeats within the respective CRISPR array, and can comprise other proteins such as Csn1 and Csn2.
  • the presence of cas9 in the vicinity of cas1 and cas2 genes is the hallmark of type II loci (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
  • Type V CRISPR/Cas systems comprise a single Cas endonuclease, including Cpf1 (Cas12) (Koonin et al, Curr Opinion Microbiology 37:67-78, 2017), that is an active RNA-guided endonuclease that does not necessarily require the additional trans-activating CRISPR (tracr) RNA for target cleavage, unlike Cas9.
  • Type VI CRISPR-Cas systems comprise a cas13 gene that encodes a nuclease with two HEPN (Higher Eukaryotes and Prokaryotes Nucleotide-binding) domains but no HNH or RuvC domains, and are not dependent upon tracrRNA activity.
  • the majority of HEPN domains comprise conserved motifs that constitute a metal-independent endoRNase active site (Anantharam et al., Biol Direct 8: 15, 2013). Because of this feature, it is thought that type VI systems may act on RNA targets instead of the DNA targets that are common to other CRISPR-Cas systems.
  • the proteins encoded in a Cas locus within a bacterial genome include endonucleases that are responsible for effecting cleavage of a nucleotide target (e.g., double-stranded DNA cleavage, DNA nicking, ssDNA cleavage, RNA cleavage), and are variously referred to as “Cas Endonucleases”, “Effector Proteins”, and “Effector Complexes”.
  • Other genes within the locus encode proteins of other functions that may be required for complete activity, as discussed below.
  • Endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain, and include restriction endonucleases that cleave DNA at specific sites without damaging the bases.
  • endonucleases include restriction endonucleases, meganucleases, TAL effector nucleases (TALENs), zinc finger nucleases, and Cas (CRISPR-associated) effector endonucleases.
  • Cas endonucleases, either as single effector proteins or in an effector complex with other components, unwind the DNA duplex at the target sequence and optionally cleave at least one DNA strand, as mediated by recognition of the target sequence by a polynucleotide (such as, but not limited to, a crRNA or guide RNA) that is in complex with the Cas effector protein.
  • Such recognition and cutting of a target sequence by a Cas endonuclease typically occurs if the correct protospacer-adjacent motif (PAM) is located at or adjacent to the 3' end of the DNA target sequence.
  • a Cas endonuclease herein may lack DNA cleavage or nicking activity, but can still specifically bind to a DNA target sequence when complexed with a suitable RNA component.
  • Cas endonucleases may occur as individual effectors (Class 2 CRISPR systems) or as part of larger effector complexes (Class 1 CRISPR systems).
  • Cas endonucleases that have been described include, but are not limited to, for example: Cas3 (a feature of Class 1 type I systems), Cas9 (a feature of Class 2 type II systems) and Cas12 (Cpf1) (a feature of Class 2 type V systems).
  • Cas3 (and its variants Cas3' and Cas3") functions as a single-stranded DNA nuclease (HD domain) and an ATP-dependent helicase.
  • a variant of the Cas3 endonuclease can be obtained by disabling the functional activity of one or both domains of the Cas3 endonuclease polypeptide. Disabling the ATPase dependent helicase activity (by deletion, knockout of the Cas3-helicase domain, or through mutagenesis of critical residues or by assembling the reaction in the absence of ATP as described previously (Sinkunas, T. et al., 2013, EMBO J.
  • cleavage ready Cascade comprising the modified Cas3 endonuclease into a nickase (as the HD domain is still functional).
  • Disabling the HD endonuclease activity, which can be accomplished by any method known in the art, such as, but not limited to, mutagenesis of critical residues of the HD domain, can convert the cleavage-ready Cascade comprising the modified Cas3 endonuclease into a helicase.
  • Disabling both the Cas3 helicase and Cas3 HD endonuclease activities, which can be accomplished by any method known in the art, such as, but not limited to, mutagenesis of critical residues of both the helicase and HD domains, can convert the cleavage-ready Cascade comprising the modified Cas3 endonuclease into a binder protein that binds to a target sequence.
  • Cas9 (formerly referred to as Cas5, Csn1, or Csx12) is a Cas endonuclease that forms a complex with a crNucleotide and a tracrNucleotide, or with a single guide polynucleotide, for specifically recognizing and cleaving all or part of a DNA target sequence.
  • Cas9 recognizes a 3' GC-rich PAM sequence on the target dsDNA.
  • a Cas9 protein comprises a RuvC nuclease with an HNH (H-N-H) nuclease adjacent to the RuvC-II domain.
  • the RuvC nuclease and HNH nuclease each can cleave a single DNA strand at a target sequence (the concerted action of both domains leads to DNA double-strand cleavage, whereas activity of one domain leads to a nick).
  • the RuvC domain comprises subdomains I, II and III, where subdomain I is located near the N-terminus of Cas9 and subdomains II and III are located in the middle of the protein, flanking the HNH domain (Hsu et al., 2013, Cell 157:1262-1278).
  • Cas9 endonucleases are typically derived from a type II CRISPR system, which includes a DNA cleavage system utilizing a Cas9 endonuclease in complex with at least one polynucleotide component.
  • a Cas9 can be in complex with a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA).
  • a Cas9 can be in complex with a single guide RNA (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
  • Cas12 (formerly referred to as Cpf1, and variants C2c1, C2c3, CasX, and CasY) comprises a RuvC nuclease domain and produces staggered 5' overhangs on the dsDNA target. Some variants do not require a tracrRNA, unlike Cas9. Cas12 and its variants recognize a 5' AT-rich PAM sequence on the target dsDNA.
  • An insert domain of the Cas12a protein, called Nuc, has been demonstrated to be responsible for target strand cleavage (Yamano et al., Cell 2016, 165:949-962). Additional mutation studies in other Cas12 proteins demonstrated that the Nuc domain contributes to guide and target binding, with the RuvC domain responsible for cleavage (Swarts et al., Mol Cell 2017, 66:221-233 e224).
  • Cas endonucleases and effector proteins can be used for targeted genome editing (via simplex and multiplex double-strand breaks and nicks) and targeted genome regulation (via tethering of epigenetic effector domains to either the Cas protein or sgRNA).
  • a Cas endonuclease can also be engineered to function as an RNA-guided recombinase, and via RNA tethers could serve as a scaffold for the assembly of multiprotein and nucleic acid complexes (Mali et al., 2013, Nature Methods Vol. 10:957-963).
  • the Cas endonuclease (or effector complex) forms a complex with a guide polynucleotide, and recognizes a sequence on a target sequence, called a Protospacer Adjacent Motif (PAM).
  • a “protospacer adjacent motif” (PAM) herein refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that can be recognized (targeted) by a guide polynucleotide/Cas endonuclease system.
  • the Cas endonuclease may not successfully recognize a target DNA sequence if the target DNA sequence is not followed by a PAM sequence.
  • the sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used.
  • the PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long.
  • a guide polynucleotide/Cas endonuclease complex described herein is capable of recognizing, binding to, and optionally nicking, unwinding, or cleaving all or part of a target sequence.
  • a guide polynucleotide/Cas endonuclease complex that can cleave both strands of a DNA target sequence typically comprises a Cas protein that has all of its endonuclease domains in a functional state (e.g., wild type endonuclease domains or variants thereof retaining some or all activity in each endonuclease domain).
  • a wild type Cas protein (e.g., a Cas protein disclosed herein), or a variant thereof retaining some or all activity in each endonuclease domain of the Cas protein, is a suitable example of a Cas endonuclease that can cleave both strands of a DNA target sequence.
  • a guide polynucleotide/Cas endonuclease complex that can cleave one strand of a DNA target sequence can be characterized herein as having nickase activity (e.g., partial cleaving capability).
  • a Cas nickase typically comprises one functional endonuclease domain that allows the Cas to cleave only one strand (i.e., make a nick) of a DNA target sequence.
  • a Cas9 nickase may comprise (i) a mutant, dysfunctional RuvC domain and (ii) a functional HNH domain (e.g., wild type HNH domain).
  • a Cas9 nickase may comprise (i) a functional RuvC domain (e.g., wild type RuvC domain) and (ii) a mutant, dysfunctional HNH domain.
  • Cas9 nickases suitable for use herein are disclosed in US20140189896 published on 3 Jul. 2014.
  • a pair of Cas nickases can be used to increase the specificity of DNA targeting. In general, this can be done by providing two Cas nickases that, by virtue of being associated with RNA components with different guide sequences, target and nick nearby DNA sequences on opposite strands in the region for desired targeting.
  • Each nick in these embodiments can be at least about 5, between 5 and 10, at least 10, between 10 and 15, at least 15, between 15 and 20, at least 20, between 20 and 30, at least 30, between 30 and 40, at least 40, between 40 and 50, at least 50, between 50 and 60, at least 60, between 60 and 70, at least 70, between 70 and 80, at least 80, between 80 and 90, at least 90, between 90 and 100, or 100 or greater (or any integer between 5 and 100) bases apart from each other, for example.
  • Cas nickase proteins herein can be used in a Cas nickase pair.
  • a Cas9 nickase with a mutant RuvC domain, but functioning HNH domain (i.e., Cas9 HNH+/RuvC-), can be used (e.g., Streptococcus pyogenes Cas9 HNH+/RuvC-).
  • Each Cas9 nickase e.g., Cas9 HNH+/RuvC-
  • a guide polynucleotide/Cas endonuclease complex in certain embodiments can bind to a DNA target site sequence, but does not cleave any strand at the target site sequence.
  • Such a complex may comprise a Cas protein in which all of its nuclease domains are mutant, dysfunctional.
  • a Cas9 protein that can bind to a DNA target site sequence, but does not cleave any strand at the target site sequence may comprise both a mutant, dysfunctional RuvC domain and a mutant, dysfunctional HNH domain.
  • a Cas protein herein that binds, but does not cleave, a target DNA sequence can be used to modulate gene expression, for example, in which case the Cas protein could be fused with a transcription factor (or portion thereof) (e.g., a repressor or activator).
  • the disclosed guide polynucleotides, Cas endonucleases, polynucleotide modification templates, donor DNAs, guide polynucleotide/Cas endonuclease systems disclosed herein, and any one combination thereof, optionally further comprising one or more polynucleotide(s) of interest, can be introduced into a cell.
  • Cells include, but are not limited to, human, non-human, animal, bacterial, fungal, insect, yeast, non-conventional yeast, and plant cells, as well as progeny and/or derivatives of the cells produced by the methods described herein.
  • Vectors and constructs include circular plasmids, and linear polynucleotides, comprising a polynucleotide of interest and optionally other components including linkers, adapters, regulatory or analysis.
  • a recognition site and/or target site can be comprised within an intron, coding sequence, 5' UTRs, 3' UTRs, and/or regulatory regions.
  • the disclosure further provides expression constructs for expressing in a prokaryotic or eukaryotic cell/organism a guide RNA/Cas system that is capable of recognizing, binding to, and optionally nicking, unwinding, or cleaving all or part of a target sequence.
  • the expression constructs of the disclosure comprise a promoter operably linked to a nucleotide sequence encoding a Cas gene (or optimized gene) and a promoter operably linked to a guide RNA of the present disclosure.
  • the promoter is capable of driving expression of an operably linked nucleotide sequence in a prokaryotic or eukaryotic cell/organism.
  • a Cas endonuclease provided herein is introduced to a target polynucleotide (e.g., in vitro or in a cell) to effect a single-strand nick or a double-strand break into the target polynucleotide.
  • the nick or break may be leveraged to introduce an edit into the polynucleotide, for example but not limited to the insertion of a heterologous polynucleotide (e.g., a polynucleotide of interest), the deletion of a particular sequence, or the introduction of one or more modified nucleotides.
  • Modification of a target sequence may be in the form of a nucleotide insertion, a nucleotide deletion, a nucleotide substitution, the addition of an atom or molecule to an existing nucleotide, a nucleotide modification, or the binding of a heterologous polynucleotide or polypeptide to said target sequence.
  • the insertion of one or more nucleotides may be accomplished by the inclusion of a donor polynucleotide in the reaction mixture: said donor polynucleotide is inserted into a double-strand break created by said Cas-alpha ortholog polypeptide.
  • the insertion may be via non-homologous end joining or via homologous recombination.
  • Cells include, but are not limited to, human, non-human, animal, mammalian, bacterial, fungal, insect, yeast, non-conventional yeast, and plant cells as well as plants and seeds produced by the methods described herein. Any plant can be used with the compositions and methods described herein, including monocot and dicot plants, and plant elements.
  • Animal cells can include, but are not limited to: an organism of a phylum including chordates, arthropods, mollusks, annelids, cnidarians, or echinoderms; or an organism of a class including mammals, insects, birds, amphibians, reptiles, or fishes.
  • the animal is human, mouse, C. elegans, rat, fruit fly (Drosophila spp.), zebrafish, chicken, dog, cat, guinea pig, hamster, Japanese ricefish, sea lamprey, pufferfish, tree frog (e.g., Xenopus spp.), monkey, or chimpanzee.
  • Particular cell types that are contemplated include haploid cells, diploid cells, reproductive cells, neurons, muscle cells, endocrine or exocrine cells, epithelial cells, tumor cells, embryonic cells, hematopoietic cells, bone cells, germ cells, somatic cells, stem cells, pluripotent stem cells, induced pluripotent stem cells, progenitor cells, meiotic cells, and mitotic cells.
  • a plurality of cells from an organism may be used.
  • Genome modification via a Cas endonuclease described herein may be used to effect a genotypic and/or phenotypic change on the target organism.
  • a change is preferably related to an improved phenotype of interest or a physiologically-important characteristic, the correction of an endogenous defect, or the expression of some type of expression marker.
  • the phenotype of interest or physiologically-important characteristic is related to the overall health, fitness, or fertility of the organism, the ecological fitness of the organism, or the relationship or interaction of the organism with other organisms or abiotic factors in its environment.
  • the cas genes provided herein may be used to improve the phenotype of an organism via polynucleotide modification.
  • Example 1 Identification and Characterization of CRISPR Proteins in Bacterial Systems [0352] A selection of microbes from the collection were sequenced according to methods known in the art, and assigned taxonomy based on either 16S sequence similarity or whole genome sequencing BLAST.
  • bacterial genomes were annotated using Prokka (Seemann, Bioinformatics, Volume 30, Issue 14, 15 July 2014, Pages 2068-2069).
  • CRISPR-associated genes were identified in each genome and assigned nomenclature based on percent identity matches to closest-relative sequences in public databases.
  • Isolates of interest were grown to mid-log phase in R2D media.
  • DNA was extracted with the Qiagen Powersoil DNA extraction kit and sequencing libraries were constructed with the iGenomix RipTide kit as per manufacturer instructions. Sequencing was performed on an Illumina HiSeq with PE150. Raw Illumina reads were trimmed to Q15 with Trimmomatic v38 (Bolger AM, Lohse M, and Usadel B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170) and assembled with SPAdes (Prjibelski A, Antipov D, Meleshko D, Lapidus A, and Korobeynikov A.
  • Each microbe may have a single or multiple different types of CRISPR systems.
  • a particular microbe may have a Class 2 Type II endonuclease system (such as Cas9) and a Class 1 Type I endonuclease system.
  • Table 1 lists selected microbes comprising identified cas genes, including those encoding Cas endonucleases. Each microbe is listed by a reference ID, unique Strain number, and taxonomy. [0357] Table 2 shows the identity of each particular cas and CRISPR-system related gene discovered in the microbes of Table 1, listed by reference ID. The key for the genes listed by number as Table 2 column headings is given in Table 3.
  • a Cas endonuclease identified herein is used to edit a target polynucleotide.
  • a different cas gene or Cas protein identified herein is used to effect an improvement to a system, target polynucleotide, or cell comprising such.
  • PAM preferences for each of the endonucleases discovered in Example 1 may be determined by methods known in the art (e.g., Karvelis et al., Methods, Volumes 121-122, 15 May 2017, Pages 3-8).
  • a Cas endonuclease may be engineered to accommodate different PAM sequences than that which it naturally prefers (e.g., Leenay and Beisel, J Mol Bio, Volume 429, Issue 2, 20 January 2017, Pages 177-191).
  • a Cas endonuclease (or effector complex) is identified from its source bacterium, such as those disclosed in Table 1, and either expressed and isolated or synthesized de novo from its corresponding DNA sequence.
  • the compositions disclosed herein may be utilized outside of a typical cellular environment for in vitro modification of one or more target polynucleotides.
  • the target polynucleotide is isolated and purified from a genomic source.
  • the target polynucleotide is on a circularized or linearized plasmid.
  • the target polynucleotide is a PCR product.
  • the target polynucleotide is a synthesized oligonucleotide.
  • said modification includes binding to, nicking, and/or cleaving a target polynucleotide.
  • Creation of a guide polynucleotide-endonuclease complex is achieved by methods known in the art. Delivery may be accomplished to an isolated target polynucleotide in vitro, or to a target polynucleotide within a cell (see, e.g., Wilbie et al., Acc. Chem. Res. 2019, 52, 6, 1555-1564).
  • the target polynucleotide is selected for its capability to accommodate and bind to a particular Cas endonuclease or effector complex.
  • a Cas endonuclease or effector complex is selected based on its capability to bind to a particular target polynucleotide.
  • Genome modification of the target polynucleotide includes recognition of the target by the endonuclease complex, with or without subsequent nicking or cleaving.
  • when a nick or a double strand break (DSB) is created, it is repaired either by simple re-annealing, or using mechanisms in the cell (NHEJ or HR).
  • a donor polynucleotide is introduced and inserted into the double strand break.
  • a polynucleotide modification template is provided to the break site, whereby the break is repaired according to the template.
  • the resulting repair of the nick or break results in perfect repair (no edit observed), the insertion of at least one nucleotide, the deletion of at least one nucleotide, the substitution of at least one nucleotide, and/or the molecular alteration of at least one nucleotide.
  • the edit of the target polynucleotide creates a beneficial phenotype to the organism that contains the target, or to an organism to which the edited polynucleotide is introduced.
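
The PAM requirement described in the items above (a target sequence is typically recognized only when a suitable PAM sits adjacent to it) can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not a method taken from the disclosure: the function names, the default PAM `NGG`, the 20-nucleotide protospacer length, and the placement of the PAM at the 3' end of the protospacer are all hypothetical choices. The PAM preferences of the endonucleases identified herein would have to be determined experimentally.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(dna, pam="NGG", protospacer_len=20):
    """List candidate protospacers that are immediately followed by a 3' PAM.

    Both the PAM and the protospacer length are illustrative assumptions.
    Returns tuples of (strand, start, protospacer, pam), where start is the
    0-based position on the strand that was scanned.
    """
    dna = dna.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_regex))
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits
```

For a Cas endonuclease that recognizes a 5' PAM, the same approach would apply with the PAM pattern placed before the protospacer group rather than after it.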

Abstract

The disclosure relates to novel CRISPR-Cas systems identified in a variety of bacterial species. The compositions identified herein may be used to edit a heterologous polynucleotide, for example in a eukaryotic or prokaryotic cell.

Description

NOVEL CRISPR SYSTEMS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit under 35 U.S.C. 119(e) of U.S. Provisional Patent Application Serial No. 63/346,322 filed 26 May 2022, herein incorporated by reference in its entirety.
REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY
[0002] The official copy of the sequence listing is submitted electronically as an XML formatted sequence listing with a file named created on 25 April 2023 and having a size of 687,882 bytes and is filed concurrently with the specification. The sequence listing comprised in this XML formatted document is part of the specification and is herein incorporated by reference in its entirety.
FIELD
[0003] The instant disclosure relates generally to the field of biology, specifically genome-editing molecules derived from microbes, and microbial compositions for the improvement of traits in both eukaryotes and prokaryotes.
BACKGROUND
[0004] Nucleotide-editing technology has enabled the modification of genomes across a wide variety of species, for the improvement of different traits of interest. Spanning both eukaryotes and prokaryotes, techniques including TAL Effector Nucleases, Meganucleases, Zinc Finger Nucleases, and CRISPR endonucleases provide ways of targeting and editing within the genomes of cells.
[0005] CRISPR-Cas systems are extensively leveraged in the treatment of human and animal diseases, curing cancer and genetic conditions, producing various compositions for industrial and pharmaceutical use, and improving the quality and quantity of food crops.
[0006] Cas endonucleases can be precisely developed for specific site modifications. A large number of Cas endonucleases have been described, each with different properties, such as PAM recognition site preferences.
[0007] Despite the advances during the past few years, there is a need for a larger variety of Cas endonucleases for target-specific modification of polynucleotides.
SUMMARY
[0008] The microbes provided herein comprise at least one CRISPR-Cas system that is capable of creating a double strand break in, or adjacent to, a target polynucleotide that comprises an appropriate PAM, and to which it is directed by a guide polynucleotide, in any prokaryotic or eukaryotic cell. In some cases, the cell is a plant cell or an animal cell or a fungal cell.
[0009] Included are single-effector endonucleases, capable of recognizing, binding to, and optionally nicking (on a single strand of two strands) or cleaving (a single strand or a double strand) a target polynucleotide, for example double-stranded DNA.
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
[0010] The sequence descriptions and sequence listing attached hereto comply with the rules governing nucleotide and amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §§ 1.821 and 1.825. The sequence descriptions comprise the three letter codes for amino acids as defined in 37 C.F.R. §§ 1.821 and 1.825, which are incorporated herein by reference.
[0011] Sequence sources are given as <Strain #_Contig/Locus ID>.
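For readers who handle these identifiers programmatically, the following Python sketch splits a sequence source of the form <Strain #_Contig/Locus ID> into its two parts. It is an illustration only: the function name is hypothetical, and the assumption that both parts are purely numeric is drawn from the entries listed in this section rather than stated by the disclosure.

```python
def parse_sequence_source(source):
    """Split a 'Strain#_ContigOrLocusID' string into (strain, locus).

    Hypothetical helper. Both parts are kept as strings so that leading
    zeros in the contig/locus ID are preserved.
    """
    strain, separator, locus = source.strip().partition("_")
    if not separator or not strain.isdigit() or not locus.isdigit():
        raise ValueError(
            "not in <Strain #_Contig/Locus ID> form: %r" % (source,)
        )
    return strain, locus
```

For example, `parse_sequence_source("101169_03904")` yields the strain number `"101169"` and the locus ID `"03904"`.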
[0012] SEQID NO:1 is the Strain 10022_03091 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0013] SEQID NO:2 is the Strain 10078_01326 Cas endonuclease DNA sequence from Microbacterium sp. TPU 3598.
[0014] SEQID NO:3 is the Strain 10082_02958 Cas endonuclease DNA sequence from Microbacterium testaceum.
[0015] SEQID NO:4 is the Strain 100944_02101 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
[0016] SEQID NO:5 is the Strain 100949_02456 Cas endonuclease DNA sequence from Bacillus sp. X1(2014).
[0017] SEQID NO:6 is the Strain 10099_01758 Cas endonuclease DNA sequence from Rathayibacter tritici.
[0018] SEQID NO:7 is the Strain 10105_01169 Cas endonuclease DNA sequence from Microbacterium sp. TPU 3598.
[0019] SEQID NO:8 is the Strain 10107_02044 Cas endonuclease DNA sequence from Rathayibacter tritici.
[0020] SEQID NO:9 is the Strain 101119_04896 Cas endonuclease DNA sequence from Bacillus megaterium.
[0021] SEQID NO:10 is the Strain 101168_00758 Cas endonuclease DNA sequence from Bacillus megaterium.
[0022] SEQID NO:11 is the Strain 101169_03904 Cas endonuclease DNA sequence from Bacillus megaterium.
[0023] SEQID NO:12 is the Strain 101395_04849 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
[0024] SEQID NO:13 is the Strain 101632_00966 Cas endonuclease DNA sequence from Bacillus megaterium.
[0025] SEQID NO:14 is the Strain 101632_03946 Cas endonuclease DNA sequence from Bacillus megaterium.
[0026] SEQID NO:15 is the Strain 1017_03442 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0027] SEQID NO:16 is the Strain 1019_01381 Cas endonuclease DNA sequence from Ottowia sp. KADR8-3.
[0028] SEQID NO:17 is the Strain 102002_02944 Cas endonuclease DNA sequence from Bacillus simplex.
[0029] SEQID NO:18 is the Strain 102413_04815 Cas endonuclease DNA sequence from Paenibacillus sp. JDR-2.
[0030] SEQID NO:19 is the Strain 102958_01637 Cas endonuclease DNA sequence from Bacillus sp. Y-01.
[0031] SEQID NO:20 is the Strain 103102_04419 Cas endonuclease DNA sequence from Paenibacillus sp. IHBB 10380.
[0032] SEQID NO:21 is the Strain 103112_04355 Cas endonuclease DNA sequence from Paenibacillus sp. IHBB 10380.
[0033] SEQID NO:22 is the Strain 103113_04392 Cas endonuclease DNA sequence from Paenibacillus sp. IHBB 10380.
[0034] SEQID NO:23 is the Strain 1038_02403 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0035] SEQID NO:24 is the Strain 10387_01168 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0036] SEQID NO:25 is the Strain 1040_01689 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0037] SEQID NO:26 is the Strain 104168_03005 Cas endonuclease DNA sequence from Paenibacillus crassostreae.
[0038] SEQID NO:27 is the Strain 10419_02480 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0039] SEQID NO:28 is the Strain 10420_00522 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0040] SEQID NO:29 is the Strain 10426_00468 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0041] SEQID NO:30 is the Strain 10455_02553 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0042] SEQID NO:31 is the Strain 104624_02918 Cas endonuclease DNA sequence from Bacillus megaterium.
[0043] SEQID NO:32 is the Strain 10466_00895 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0044] SEQID NO:33 is the Strain 1047_02160 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0045] SEQID NO:34 is the Strain 10494_01762 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0046] SEQID NO:35 is the Strain 10504_03293 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0047] SEQID NO:36 is the Strain 10506_03105 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0048] SEQID NO:37 is the Strain 10507_00398 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0049] SEQID NO:38 is the Strain 105094_04656 Cas endonuclease DNA sequence from Bacillus megaterium.
[0050] SEQID NO:39 is the Strain 1051_01436 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0051] SEQID NO:40 is the Strain 10511_03526 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0052] SEQID NO:41 is the Strain 105149_04104 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0053] SEQID NO:42 is the Strain 105149_04649 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0054] SEQID NO:43 is the Strain 105194_03162 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0055] SEQID NO:44 is the Strain 105195_04607 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0056] SEQID NO:45 is the Strain 105195_04745 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0057] SEQID NO:46 is the Strain 105210_04468 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0058] SEQID NO:47 is the Strain 10522 02429 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0059] SEQID NO:48 is the Strain 10533 01275 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0060] SEQID NO:49 is the Strain 1063 00998 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0061] SEQID NO:50 is the Strain 10631 01157 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0062] SEQID NO:51 is the Strain 10634 01684 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0063] SEQID NO:52 is the Strain 10635 01773 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0064] SEQID NO: 53 is the Strain 10638 03764 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0065] SEQID NO: 54 is the Strain 10647 02458 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0066] SEQID NO:55 is the Strain 10660 01480 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0067] SEQID NO:56 is the Strain 10669_01791 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0068] SEQID NO:57 is the Strain 10671_00807 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0069] SEQID NO:58 is the Strain 1078_01481 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0070] SEQID NO: 59 is the Strain 10785_00266 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0071] SEQID NO: 60 is the Strain 1080_02464 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0072] SEQID NO:61 is the Strain 10805 00375 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0073] SEQID NO:62 is the Strain 10806 03196 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0074] SEQID NO:63 is the Strain 10808 01266 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0075] SEQID NO:64 is the Strain 10817 00887 Cas endonuclease DNA sequence from Unknown.
[0076] SEQID NO: 65 is the Strain 1082 03198 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0077] SEQID NO:66 is the Strain 10822 00582 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0078] SEQID NO: 67 is the Strain 1083 01536 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0079] SEQID NO:68 is the Strain 10839 02372 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0080] SEQID NO:69 is the Strain 10842 01772 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0081] SEQID NO:70 is the Strain 10844 02570 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0082] SEQID NO: 71 is the Strain 1087_01291 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
[0083] SEQID NO: 72 is the Strain 1094 00967 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0084] SEQID NO: 73 is the Strain 1098 00553 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
[0085] SEQID NO:74 is the Strain 1109 01133 Cas endonuclease DNA sequence from Microbacterium sp. 1.5R.
[0086] SEQID NO: 75 is the Strain 1113 01742 Cas endonuclease DNA sequence from Microbacterium sp. TPU 3598.
[0087] SEQID NO:76 is the Strain 1115 01841 Cas endonuclease DNA sequence from Microbacterium aurum.
[0088] SEQID NO:77 is the Strain 11199 02635 Cas endonuclease DNA sequence from Bacillus sp. X1 (2014).
[0089] SEQID NO:78 is the Strain 1124 01292 Cas endonuclease DNA sequence from Glutamicibacter arilaitensis.
[0090] SEQID NO:79 is the Strain 11345 00502 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0091] SEQID NO:80 is the Strain 11364 01125 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0092] SEQID NO: 81 is the Strain 11393 00661 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0093] SEQID NO: 82 is the Strain 11425 02393 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0094] SEQID NO:83 is the Strain 11456 02753 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0095] SEQID NO: 84 is the Strain 1155 02473 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
[0096] SEQID NO: 85 is the Strain 1165 01871 Cas endonuclease DNA sequence from Microbacterium sp. CGR1.
[0097] SEQID NO:86 is the Strain 11716_05148 Cas endonuclease DNA sequence from Unknown.
[0098] SEQID NO: 87 is the Strain 11723_04575 Cas endonuclease DNA sequence from Bacillus megaterium.
[0099] SEQID NO: 88 is the Strain 1176_01531 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0100] SEQID NO:89 is the Strain 11762 04354 Cas endonuclease DNA sequence from Bacillus megaterium.
[0101] SEQID NO:90 is the Strain 11773 01698 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0102] SEQID NO: 91 is the Strain 1178 01705 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0103] SEQID NO:92 is the Strain 11784 03035 Cas endonuclease DNA sequence from Bacillus megaterium.
[0104] SEQID NO:93 is the Strain 11787 03445 Cas endonuclease DNA sequence from Bacillus megaterium.
[0105] SEQID NO: 94 is the Strain 1198 01200 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0106] SEQID NO:95 is the Strain 1212 01737 Cas endonuclease DNA sequence from Glutamicibacter arilaitensis.
[0107] SEQID NO:96 is the Strain 12148 02440 Cas endonuclease DNA sequence from Paenibacillus naphthalenovorans.
[0108] SEQID NO:97 is the Strain 12193 04094 Cas endonuclease DNA sequence from Paenibacillus naphthalenovorans.
[0109] SEQID NO:98 is the Strain 12301 00979 Cas endonuclease DNA sequence from Unknown.
[0110] SEQID NO: 99 is the Strain 1253 02264 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0111] SEQID NO: 100 is the Strain 1271 01237 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0112] SEQID NO: 101 is the Strain 1286_01295 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0113] SEQID NO:102 is the Strain 12917_02996 Cas endonuclease DNA sequence from Curtobacterium pusillum.
[0114] SEQID NO:103 is the Strain 13445_04630 Cas endonuclease DNA sequence from Azospirillum lipoferum.
[0115] SEQID NO: 104 is the Strain 1396 03114 Cas endonuclease DNA sequence from Paenibacillus sp. CAA11.
[0116] SEQID NO:105 is the Strain 14053 00313 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
[0117] SEQID NO:106 is the Strain 14166_02703 Cas endonuclease DNA sequence from Arthrobacter sp. QXT-31.
[0118] SEQID NO:107 is the Strain 14167 01587 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0119] SEQID NO:108 is the Strain 14171 02048 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0120] SEQID NO:109 is the Strain 14186 01112 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
[0121] SEQID NO:110 is the Strain 14193 00438 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0122] SEQID NO: 111 is the Strain 14196 00143 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0123] SEQID NO:112 is the Strain 14202 02580 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0124] SEQID NO:113 is the Strain 14229 00612 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
[0125] SEQID NO: 114 is the Strain 1431 00729 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0126] SEQID NO: 115 is the Strain 1438 01477 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0127] SEQID NO:116 is the Strain 14544_01404 Cas endonuclease DNA sequence from Bacillus megaterium.
[0128] SEQID NO:117 is the Strain 14596_01694 Cas endonuclease DNA sequence from Bacillus sp. Y-01.
[0129] SEQID NO:118 is the Strain 14627_01883 Cas endonuclease DNA sequence from Microbacterium testaceum.
[0130] SEQID NO:119 is the Strain 14650 03793 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0131] SEQID NO:120 is the Strain 14658 03699 Cas endonuclease DNA sequence from Arthrobacter sp. QXT-31.
[0132] SEQID NO: 121 is the Strain 1471 00680 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0133] SEQID NO: 122 is the Strain 1472 00433 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0134] SEQID NO:123 is the Strain 14727 02881 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0135] SEQID NO:124 is the Strain 14743 03592 Cas endonuclease DNA sequence from Leifsonia xyli.
[0136] SEQID NO:125 is the Strain 14779 01378 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0137] SEQID NO:126 is the Strain 14808 02364 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
[0138] SEQID NO:127 is the Strain 14817 01476 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
[0139] SEQID NO:128 is the Strain 14824 01629 Cas endonuclease DNA sequence from Paenibacillus xylanexedens.
[0140] SEQID NO:129 is the Strain 14881 01920 Cas endonuclease DNA sequence from Microbacterium sp. XT11.
[0141] SEQID NO:130 is the Strain 14945 00031 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
[0142] SEQID NO:131 is the Strain 14968_01911 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
[0143] SEQID NO:132 is the Strain 14970_00612 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
[0144] SEQID NO:133 is the Strain 14977_00931 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0145] SEQID NO: 134 is the Strain 15010_02299 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0146] SEQID NO:135 is the Strain 15062 00931 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0147] SEQID NO:136 is the Strain 15158 01222 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0148] SEQID NO: 137 is the Strain 15306 00078 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
[0149] SEQID NO:138 is the Strain 15309_00706 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
[0150] SEQID NO:139 is the Strain 15353 01923 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
[0151] SEQID NO:140 is the Strain 15393 03167 Cas endonuclease DNA sequence from Paenibacillus sp. Y412MC10.
[0152] SEQID NO: 141 is the Strain 15407 04378 Cas endonuclease DNA sequence from Paenibacillus yonginensis.
[0153] SEQID NO: 142 is the Strain 15448 04948 Cas endonuclease DNA sequence from Sinorhizobium meliloti.
[0154] SEQID NO: 143 is the Strain 15469 04140 Cas endonuclease DNA sequence from Mitsuaria sp. 7.
[0155] SEQID NO: 144 is the Strain 15531 01961 Cas endonuclease DNA sequence from Mitsuaria sp. 7.
[0156] SEQID NO: 145 is the Strain 15546 00078 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0157] SEQID NO: 146 is the Strain 15832_01937 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0158] SEQID NO: 147 is the Strain 15859_02500 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0159] SEQID NO:148 is the Strain 15875_01814 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0160] SEQID NO:149 is the Strain 15939_01790 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
[0161] SEQID NO: 150 is the Strain 15940 03638 Cas endonuclease DNA sequence from Arthrobacter sp. FB24.
[0162] SEQID NO: 151 is the Strain 16 02923 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0163] SEQID NO: 152 is the Strain 16023_03049 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0164] SEQID NO: 153 is the Strain 16060_02050 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0165] SEQID NO:154 is the Strain 16064 03513 Cas endonuclease DNA sequence from Methylorubrum extorquens.
[0166] SEQID NO: 155 is the Strain 16095_00628 Cas endonuclease DNA sequence from Sinorhizobium fredii.
[0167] SEQID NO: 156 is the Strain 16107 02971 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
[0168] SEQID NO: 157 is the Strain 16130 03182 Cas endonuclease DNA sequence from Arthrobacter sp. U41.
[0169] SEQID NO: 158 is the Strain 16135 03880 Cas endonuclease DNA sequence from Bacillus sp. Y-01.
[0170] SEQID NO: 159 is the Strain 16157 00325 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
[0171] SEQID NO:160 is the Strain 16158 01725 Cas endonuclease DNA sequence from Arthrobacter sp. QXT-31.
[0172] SEQID NO:161 is the Strain 16194_01685 Cas endonuclease DNA sequence from Arthrobacter sp. ATCC 21022.
[0173] SEQID NO: 162 is the Strain 16216_02576 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
[0174] SEQID NO:163 is the Strain 16233_01732 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0175] SEQID NO: 164 is the Strain 16237 03292 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0176] SEQID NO: 165 is the Strain 16248_01824 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0177] SEQID NO: 166 is the Strain 1625_02362 Cas endonuclease DNA sequence from Chryseobacterium glaciei.
[0178] SEQID NO: 167 is the Strain 16274 00773 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0179] SEQID NO: 168 is the Strain 16288_06493 Cas endonuclease DNA sequence from Azospirillum thiophilum.
[0180] SEQID NO: 169 is the Strain 16288_06718 Cas endonuclease DNA sequence from Azospirillum thiophilum.
[0181] SEQID NO: 170 is the Strain 16299 00539 Cas endonuclease DNA sequence from Rathayibacter tritici.
[0182] SEQID NO: 171 is the Strain 16333 02521 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0183] SEQID NO: 172 is the Strain 16334 02912 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
[0184] SEQID NO: 173 is the Strain 16349 00705 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
[0185] SEQID NO: 174 is the Strain 16351 01257 Cas endonuclease DNA sequence from Pseudarthrobacter phenanthrenivorans.
[0186] SEQID NO: 175 is the Strain 16369 02319 Cas endonuclease DNA sequence from Arthrobacter sp. ZXY-2.
[0187] SEQID NO: 176 is the Strain 16370_00277 Cas endonuclease DNA sequence from Arthrobacter sp. PGP41.
[0188] SEQID NO: 177 is the Strain 16372_00908 Cas endonuclease DNA sequence from Microbacterium sp. No. 7.
[0189] SEQID NO: 178 is the Strain 16396_02202 Cas endonuclease DNA sequence from Arthrobacter crystallopoietes.
[0190] SEQID NO: 179 is the Strain 16404_01792 Cas endonuclease DNA sequence from Pseudarthrobacter chlorophenolicus.
DETAILED DESCRIPTION
[0191] While the following terms are believed to be well understood by one of ordinary skill in the art, the following are set forth to facilitate explanation of the presently disclosed subject matter.
[0192] The term “a” or “an” refers to one or more of that entity, i.e., can refer to a plural referent. As such, the terms “a” or “an”, “one or more” and “at least one” are used interchangeably herein. In addition, reference to “an element” by the indefinite article “a” or “an” does not exclude the possibility that more than one of the elements is present, unless the context clearly requires that there is one and only one of the elements.
[0193] As used herein the terms “microorganism” or “microbe” should be taken broadly. These terms are used interchangeably and include, but are not limited to, the two prokaryotic domains, Bacteria and Archaea, as well as eukaryotic Fungi and Protists. In some embodiments, the disclosure refers to the “microbes” of Table 1, or the “microbes” of various other tables or paragraphs present in the disclosure. This characterization can refer to not only the identified taxonomic bacterial genera of the tables, but also the identified taxonomic species, as well as the various novel and newly identified bacterial strains of said tables.
[0194] As used herein, the term "microbe" or "microorganism" refers to any species or taxon of microorganism, including, but not limited to, archaea, bacteria, microalgae, fungi (including mold and yeast species), mycoplasmas, microspores, nanobacteria, oomycetes, and protozoa. In some embodiments, a microbe or microorganism encompasses individual cells (e.g., unicellular microorganisms) or more than one cell (e.g., multi-cellular microorganism). A "population of microorganisms" may thus refer to multiple cells of a single microorganism, in which the cells share common genetic derivation.
[0195] As used herein, the term "bacterium" or "bacteria" refers in general to any prokaryotic organism, and may reference an organism from either Kingdom Eubacteria (Bacteria), Kingdom Archaebacteria (Archaea), or both. In some cases, bacterial genera or other taxonomic classifications have been reassigned due to various reasons (such as but not limited to the evolving field of whole genome sequencing), and it is understood that such nomenclature reassignments are within the scope of any claimed taxonomy. For example, certain species of the genus Erwinia have been described in the literature as belonging to genus Pantoea (Zhang, Y., Qiu, S. Examining phylogenetic relationships of Erwinia and Pantoea species using whole genome sequence data. Antonie van Leeuwenhoek 108, 1037-1046 (2015).).
[0196] The term “16S” refers to the DNA sequence of the 16S ribosomal RNA (rRNA) sequence of a bacterium. 16S rRNA gene sequencing is a well-established method for studying phylogeny and taxonomy of bacteria.
[00166] As used herein, the term "fungus" or "fungi" refers in general to any organism from Kingdom Fungi. Historical taxonomic classification of fungi has been according to morphological presentation. Beginning in the mid-1800s, it was recognized that some fungi have a pleomorphic life cycle, and that different nomenclature designations were being used for different forms of the same fungus. In 1981, the Sydney Congress of the International Mycological Association laid out rules for the naming of fungi according to their status as anamorph, teleomorph, or holomorph (Taylor, J.W. One Fungus = One Name: DNA and fungal nomenclature twenty years after PCR. IMA Fungus 2, 113-120 (2011).). With the development of genomic sequencing, it became evident that taxonomic classification based on molecular phylogenetics did not align with morphological-based nomenclature (Shenoy, B.D., Jeewon, R. and Hyde, K.D. (2007). Impact of DNA sequence-data on the taxonomy of anamorphic fungi. Fungal Diversity 26: 1-54.). As a result, in 2011 the International Botanical Congress adopted a resolution approving the International Code of Nomenclature for Algae, Fungi, and Plants (Melbourne Code) (2012), with the stated outcome of designating "One Fungus = One Name" (Hawksworth, D.L. Managing and coping with names of pleomorphic fungi in a period of transition. IMA Fungus 3, 15-24 (2012)).
[0197] The term "Internal Transcribed Spacer" (“ITS”) refers to the spacer DNA (non-coding DNA) situated between the small-subunit ribosomal RNA (rRNA) and large-subunit (LSU) rRNA genes in the chromosome or the corresponding transcribed region in the polycistronic rRNA precursor transcript. ITS gene sequencing is a well-established method for studying phylogeny and taxonomy of fungi. In some cases, the "Large SubUnit" (“LSU”) sequence is used to identify fungi. LSU gene sequencing is a well-established method for studying phylogeny and taxonomy of fungi. Some fungal microbes of the present invention may be described by an ITS sequence and some may be described by an LSU sequence. Both are understood to be equally descriptive and accurate for determining taxonomy.
[0198] The term “microbial consortia” or “microbial consortium” refers to a subset of a microbial community of individual microbial species, or strains of a species, which can be described as carrying out a common function, or can be described as participating in, or leading to, or correlating with, a recognizable parameter or plant phenotypic trait. The community may comprise one or more species, or strains of a species, of microbes. In some instances, the microbes coexist within the community symbiotically.
[0199] The term “microbial community” means a group of microbes comprising two or more species or strains. Unlike microbial consortia, a microbial community does not have to be carrying out a common function, or does not have to be participating in, or leading to, or correlating with, a recognizable parameter or plant phenotypic trait.
[0200] The term “accelerated microbial selection” or “AMS” is used interchangeably with the term “directed microbial selection” or “DMS” and refers to the iterative selection methodology that was utilized, in some embodiments of the disclosure, to derive the claimed microbial species or consortia of said species.
[0201] As used herein, “isolate,” “isolated,” “isolated microbe,” and like terms, are intended to mean that the one or more microorganisms has been separated from at least one of the materials with which it is associated in a particular environment (for example soil, water, plant tissue).
[0202] Thus, an “isolated microbe” does not exist in its naturally occurring environment; rather, it is through the various techniques described herein that the microbe has been removed from its natural setting and placed into a non-naturally occurring state of existence. Thus, the isolated strain may exist as, for example, a biologically pure culture, or as spores (or other forms of the strain) in association with an agricultural carrier.
[0203] In certain aspects of the disclosure, the isolated microbes exist as isolated and biologically pure cultures. It will be appreciated by one of skill in the art that an isolated and biologically pure culture of a particular microbe denotes that said culture is substantially free (within scientific reason) of other living organisms and contains only the individual microbe in question. The culture can contain varying concentrations of said microbe. The present disclosure notes that isolated and biologically pure microbes often “necessarily differ from less pure or impure materials.” See, e.g., In re Bergstrom, 427 F.2d 1394 (CCPA 1970) (discussing purified prostaglandins), see also, In re Bergy, 596 F.2d 952 (CCPA 1979) (discussing purified microbes), see also, Parke-Davis & Co. v. H.K. Mulford & Co., 189 F. 95 (S.D.N.Y. 1911) (Learned Hand discussing purified adrenaline), aff’d in part, rev’d in part, 196 F. 496 (2d Cir. 1912), each of which is incorporated herein by reference. Furthermore, in some aspects, the disclosure provides for certain quantitative measures of the concentration, or purity limitations, that must be found within an isolated and biologically pure microbial culture. The presence of these purity values, in certain embodiments, is a further attribute that distinguishes the presently disclosed microbes from those microbes existing in a natural state. See, e.g., Merck & Co. v. Olin Mathieson Chemical Corp., 253 F.2d 156 (4th Cir. 1958) (discussing purity limitations for vitamin B12 produced by microbes), incorporated herein by reference.
[0204] As used herein, “individual isolates” should be taken to mean a composition, or culture, comprising a predominance of a single genus, species, or strain, of microorganism, following separation from one or more other microorganisms. The phrase should not be taken to indicate the extent to which the microorganism has been isolated or purified. However, “individual isolates” can comprise substantially only one genus, species, or strain, of microorganism.
[0205] The term “growth medium” as used herein, is any medium which is suitable to support growth of a plant. By way of example, the media may be natural or artificial including, but not limited to: soil, potting mixes, bark, vermiculite, hydroponic solutions alone and applied to solid plant support systems, and tissue culture gels. It should be appreciated that the media may be used alone or in combination with one or more other media. It may also be used with or without the addition of exogenous nutrients and physical support systems for roots and foliage.
[0206] In one embodiment, the growth medium is a naturally occurring medium such as soil, sand, mud, clay, humus, regolith, rock, or water. In another embodiment, the growth medium is artificial. Such an artificial growth medium may be constructed to mimic the conditions of a naturally occurring medium; however, this is not necessary. Artificial growth media can be made from one or more of any number and combination of materials including sand, minerals, glass, rock, water, metals, salts, and nutrients. In one embodiment, the growth medium is sterile. In another embodiment, the growth medium is not sterile.
[0207] The medium may be amended or enriched with additional compounds or components, for example, a component which may assist in the interaction and/or selection of specific groups of microorganisms with the plant and each other. For example, antibiotics (such as penicillin) or sterilants (for example, quaternary ammonium salts and oxidizing agents) could be present and/or the physical conditions (such as salinity, plant nutrients (for example, organic and inorganic minerals such as phosphorus, nitrogenous salts, ammonia, potassium, and micronutrients such as cobalt and magnesium), pH, and/or temperature) could be amended.
[0208] The term “plant” generically includes whole plants, plant organs, plant tissues, seeds, plant cells, and progeny of the same. Plant cells include, without limitation, cells from seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores. A “plant element” is intended to reference either a whole plant or a plant component, which may comprise differentiated and/or undifferentiated tissues, for example but not limited to plant tissues, parts, and cell types. In one embodiment, a plant element is one of the following: whole plant, seedling, meristematic tissue, ground tissue, vascular tissue, dermal tissue, seed, leaf, root, shoot, stem, flower, fruit, stolon, bulb, tuber, corm, keiki, bud, tumor tissue, and various forms of cells and culture (e.g., single cells, protoplasts, embryos, callus tissue). The term “plant organ” refers to plant tissue or a group of tissues that constitute a morphologically and functionally distinct part of a plant. As used herein, a “plant part” is synonymous to a “portion” of a plant, and refers to any part of the plant, and can include distinct tissues and/or organs, and may be used interchangeably with the term “tissue” throughout.
[0209] “Progeny” comprises any subsequent generation of an organism, produced via sexual or asexual reproduction.
[0210] As used herein, the term “plant element” refers to plant cells, plant protoplasts, plant cell tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruit, kernels, ears, cobs, husks, stalks, roots, root tips, anthers, and the like, as well as the parts themselves. Grain is intended to mean the mature seed produced by commercial growers for purposes other than growing or reproducing the species. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced polynucleotides.
[0211] Similarly, a “plant reproductive element” is intended to generically reference any part of a plant that is able to initiate other plants via either sexual or asexual reproduction of that plant, for example but not limited to: seed, seedling, root, shoot, cutting, scion, graft, stolon, bulb, tuber, corm, keiki, or bud. The plant element may be in plant or in a plant organ, tissue culture, or cell culture.
[0212] The term “monocotyledonous” or “monocot” refers to the subclass of angiosperm plants also known as “monocotyledoneae”, whose seeds typically comprise only one embryonic leaf, or cotyledon. The term includes references to whole plants, plant elements, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same.
[0213] The term “dicotyledonous” or “dicot” refers to the subclass of angiosperm plants also known as “dicotyledoneae”, whose seeds typically comprise two embryonic leaves, or cotyledons. The term includes references to whole plants, plant elements, plant organs (e.g., leaves, stems, roots, etc.), seeds, plant cells, and progeny of the same.
[0214] As used herein, the term “cultivar” refers to a variety, strain, or race, of plant that has been produced by horticultural or agronomic techniques and is not normally found in wild populations.
[0215] As used herein, “improved” should be taken broadly to encompass improvement of a characteristic of a plant, as compared to a control plant, or as compared to a known average quantity associated with the characteristic in question. For example, “improved” plant biomass associated with application of a beneficial microbe, or consortia, of the disclosure can be demonstrated by comparing the biomass of a plant treated by the microbes taught herein to the biomass of a control plant not treated. Alternatively, one could compare the biomass of a plant treated by the microbes taught herein to the average biomass normally attained by the given plant, as represented in scientific or agricultural publications known to those of skill in the art. In the present disclosure, “improved” does not necessarily demand that the data be statistically significant (e.g., p < 0.05); rather, any quantifiable difference demonstrating that one value (e.g., the average treatment value) is different from another (e.g., the average control value) can rise to the level of “improved.”
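By way of illustration only, the comparison of an average treatment value to an average control value described above could be sketched as follows. This is a minimal sketch and not part of the disclosed methods: the function name `is_improved`, the `higher_is_better` flag, and the sample biomass values are hypothetical, and the assumption that a difference must lie in the favorable direction is an interpretive choice made for the example.

```python
from statistics import mean


def is_improved(treated, control, higher_is_better=True):
    """Compare the average treated value with the average control value.

    Hypothetical helper: no significance test is applied, so any
    quantifiable difference in the favorable direction counts.
    """
    diff = mean(treated) - mean(control)
    # Assumption: "improved" means a difference in the favorable direction.
    return diff > 0 if higher_is_better else diff < 0


# Hypothetical biomass measurements (grams) for treated and control plants.
treated_biomass = [10.2, 11.0, 10.7]
control_biomass = [9.8, 10.1, 10.0]
print(is_improved(treated_biomass, control_biomass))
```

A known average quantity taken from a publication could be supplied in place of the control measurements by passing it as a one-element list.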
[0216] As used herein, “inhibiting and suppressing” and like terms should not be construed to require complete inhibition or suppression, although this may be desired in some embodiments.
[0217] As used herein, the term “genotype” refers to the genetic makeup of an individual cell, cell culture, tissue, organism (e.g., a plant), or group of organisms.
[0218] The compositions and methods herein may provide for an improved “agronomic trait” or “trait of agronomic importance” or “trait of agronomic interest” to a plant, which may include, but not be limited to, the following: disease resistance, drought tolerance, heat tolerance, cold tolerance, salinity tolerance, metal tolerance, herbicide tolerance, improved water use efficiency, improved nitrogen utilization, improved nitrogen fixation, pest resistance, herbivore resistance, pathogen resistance, yield improvement, health enhancement, vigor improvement, growth improvement, photosynthetic capability improvement, nutrition enhancement, altered protein content, altered oil content, increased biomass, increased shoot length, increased root length, improved root architecture, modulation of a metabolite, modulation of the proteome, increased seed weight, altered seed carbohydrate composition, altered seed oil composition, altered seed protein composition, altered seed nutrient composition, as compared to an isoline plant not comprising a modification derived from the methods or compositions herein.
[0219] “Agronomic trait potential” is intended to mean a capability of a plant element for exhibiting a phenotype, preferably an improved agronomic trait, at some point during its life cycle, or conveying said phenotype to another plant element with which it is associated in the same plant.
[0220] As used herein, the term “molecular marker”, “marker”, or “genetic marker” refers to an indicator that is used in methods for visualizing differences in characteristics of nucleic acid sequences. Examples of such indicators are restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), insertion mutations, microsatellite markers (SSRs), sequence- characterized amplified regions (SCARs), cleaved amplified polymorphic sequence (CAPS) markers or isozyme markers or combinations of the markers described herein which defines a specific genetic and chromosomal location. Mapping of molecular markers in the vicinity of an allele is a procedure which can be performed by the average person skilled in molecular- biological techniques.
[0221] As used herein, the term “trait” refers to a characteristic or phenotype. For example, in the context of some embodiments of the present disclosure, yield of a crop relates to the amount of marketable biomass produced by a plant (e.g., fruit, fiber, grain). Desirable traits may also include other plant characteristics, including but not limited to: water use efficiency, nutrient use efficiency, production, mechanical harvestability, fruit maturity, shelf life, pest/disease resistance, early plant maturity, tolerance to stresses, etc. A trait may be inherited in a dominant or recessive manner, or in a partial or incomplete-dominant manner. A trait may be monogenic (i.e., determined by a single locus) or polygenic (i.e., determined by more than one locus) or may also result from the interaction of one or more genes with the environment.
[0222] As used herein, the term “phenotype” refers to the observable characteristics of an individual cell, cell culture, organism (e.g., a plant), or group of organisms which results from the interaction between that individual’s genetic makeup (i.e., genotype) and the environment.
[0223] As used herein, a “synthetic nucleotide sequence” or “synthetic polynucleotide sequence” is a nucleotide sequence that is not known to occur in nature or that is not naturally occurring. Generally, such a synthetic nucleotide sequence will comprise at least one nucleotide difference when compared to any other naturally occurring nucleotide sequence.
[0224] As used herein, the term “nucleic acid” refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides, or analogs thereof. This term refers to the primary structure of the molecule, and thus includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It also includes modified nucleic acids such as methylated and/or capped nucleic acids, nucleic acids containing modified bases, backbone modifications, and the like. The terms “nucleic acid” and “nucleotide sequence” are used interchangeably.
[0225] As used herein, the term “gene” refers to any segment of DNA associated with a biological function. Thus, genes include, but are not limited to, coding sequences and/or the regulatory sequences required for their expression. Genes can also include non-expressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
[0226] As used herein, the term “homologous” or “homologue”, “homolog”, or “ortholog” is known in the art and refers to related sequences that share a common ancestor or family member and are determined based on the degree of sequence identity. The terms “homology,” “homologous,” “substantially similar” and “corresponding substantially” are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the nucleic acid fragments of the instant disclosure such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment. It is therefore understood, as those skilled in the art will appreciate, that the disclosure encompasses more than the specific exemplary sequences. These terms describe the relationship between a gene found in one species, subspecies, variety, cultivar or strain and the corresponding or equivalent gene in another species, subspecies, variety, cultivar or strain. For purposes of this disclosure homologous sequences are compared. “Homologous sequences” or “homologues” or “orthologs” are thought, believed, or known to be functionally related. A functional relationship may be indicated in any one of a number of ways, including, but not limited to: (a) degree of sequence identity and/or (b) the same or similar biological function. Preferably, both (a) and (b) are indicated. Homology can be determined using software programs readily available in the art, such as those discussed in Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30, section 7.718, Table 7.71.
Some alignment programs are MacVector (Oxford Molecular Ltd, Oxford, U.K.), ALIGN Plus (Scientific and Educational Software, Pennsylvania) and AlignX (Vector NTI, Invitrogen, Carlsbad, CA). Another alignment program is Sequencher (Gene Codes, Ann Arbor, Michigan), using default parameters.
[0227] As used herein, the term “nucleotide change” refers to, e.g., nucleotide substitution, deletion, insertion, chemical alteration, or any combination of the preceding, as is well understood in the art.
[0228] As used herein, the term “protein modification” refers to, e.g., amino acid substitution, amino acid modification, deletion, and/or insertion, as is well understood in the art.
[0229] As used herein, the term “at least a portion" or “fragment” of a nucleic acid or polypeptide means a portion having the minimal size characteristics of such sequences, or any larger fragment of the full length molecule, up to and including the full length molecule. A fragment of a polynucleotide of the disclosure may encode a biologically active portion of a genetic regulatory element. A biologically active portion of a genetic regulatory element can be prepared by isolating a portion of one of the polynucleotides of the disclosure that comprises the genetic regulatory element and assessing activity as described herein. Similarly, a portion of a polypeptide may be 4 amino acids, 5 amino acids, 6 amino acids, 7 amino acids, and so on, going up to the full length polypeptide. The length of the portion to be used will depend on the particular application. A portion of a nucleic acid useful as a hybridization probe may be as short as 12 nucleotides; in some embodiments, it is 20 nucleotides. A portion of a polypeptide useful as an epitope may be as short as 4 amino acids. A portion of a polypeptide that performs the function of the full-length polypeptide would generally be longer than 4 amino acids.
[0230] The term “primer” as used herein refers to an oligonucleotide which is capable of annealing to the amplification target allowing a DNA polymerase to attach, thereby serving as a point of initiation of DNA synthesis when placed under conditions in which synthesis of primer extension product is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase and at a suitable temperature and pH. The (amplification) primer is preferably single stranded for maximum efficiency in amplification. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact lengths of the primers will depend on many factors, including temperature and composition (A/T vs. G/C content) of primer. A pair of bi-directional primers consists of one forward and one reverse primer as commonly used in the art of DNA amplification such as in PCR amplification.
[0231] The terms “stringency” or “stringent hybridization conditions” refer to hybridization conditions that affect the stability of hybrids, e.g., temperature, salt concentration, pH, formamide concentration and the like. These conditions are empirically optimized to maximize specific binding and minimize non-specific binding of primer or probe to its target nucleic acid sequence. The terms as used include reference to conditions under which a probe or primer will hybridize to its target sequence, to a detectably greater degree than other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures.
Generally, stringent conditions are selected to be about 5° C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe or primer. Typically, stringent conditions will be those in which the salt concentration is less than about 1.0 M Na+ ion, typically about 0.01 to 1.0 M Na+ ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes or primers (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes or primers (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringent conditions or “conditions of reduced stringency” include hybridization with a buffer solution of 30% formamide, 1 M NaCl, 1% SDS at 37° C and a wash in 2xSSC at 40° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C, and a wash in 0.1xSSC at 60° C. Hybridization procedures are well known in the art and are described by e.g., Ausubel et al., 1998 and Sambrook et al., 2001. In some embodiments, stringent conditions are hybridization in 0.25 M Na2HPO4 buffer (pH 7.2) containing 1 mM Na2EDTA, 0.5-20% sodium dodecyl sulfate at 45°C, such as 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19% or 20%, followed by a wash in 5xSSC, containing 0.1% (w/v) sodium dodecyl sulfate, at 55°C to 65°C.
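By way of non-limiting illustration, the “typical” ranges recited above (salt below about 1.0 M Na+, pH 7.0 to 8.3, at least about 30° C for probes of 10 to 50 nucleotides and at least about 60° C for longer probes) and the rule of selecting a temperature about 5° C below the Tm can be sketched in Python. The function names and the strict numeric cut-offs are assumptions made for this sketch only; the word “about” in the text is not modeled, and destabilizing agents such as formamide are ignored.

```python
def stringent_temperature(tm_c: float, offset_c: float = 5.0) -> float:
    """Return a hybridization temperature about `offset_c` degrees C below the Tm."""
    return tm_c - offset_c


def meets_typical_stringent_conditions(
    na_molar: float, ph: float, temp_c: float, probe_len: int
) -> bool:
    """Check the 'typical' ranges quoted in the text (hypothetical helper).

    Assumptions: thresholds are applied as hard cut-offs, and formamide or
    other destabilizing agents are not taken into account.
    """
    salt_ok = na_molar < 1.0                    # less than about 1.0 M Na+
    ph_ok = 7.0 <= ph <= 8.3                    # pH 7.0 to 8.3
    # Short probes (up to 50 nt) need at least ~30 C, longer ones at least ~60 C.
    min_temp_c = 30.0 if probe_len <= 50 else 60.0
    return salt_ok and ph_ok and temp_c >= min_temp_c
```

For example, a 20-nucleotide probe at 0.5 M Na+, pH 7.5 and 42° C falls within these ranges, whereas a 100-nucleotide probe under the same conditions does not, because the temperature is below 60° C.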
[0232] In some embodiments, the cell or organism has at least one heterologous trait. As used herein, the term “heterologous trait” refers to a phenotype imparted to a cell or organism by an exogenous molecule or other organism (e.g., a microbe), DNA segment, heterologous polynucleotide or heterologous nucleic acid.
[0233] Various changes in phenotype are of interest to the present disclosure, including but not limited to modifying the fatty acid composition in a plant, altering the amino acid content of a plant, altering a plant's pathogen defense mechanism, increasing a plant’s yield of an economically important trait (e.g., grain yield, forage yield, etc.) and the like. These results can be achieved by providing expression of heterologous products or increased expression of endogenous products in plants using the methods and compositions of the present disclosure.
[0234] A “synthetic combination” can include a combination of a plant and a microbe of the disclosure. The combination may be achieved, for example, by coating the surface of a seed of a plant, such as an agricultural plant, or host plant tissue (root, stem, leaf, etc.), with a microbe of the disclosure. Further, a “synthetic combination” can include a combination of microbes of various strains or species. Synthetic combinations have at least one variable that distinguishes the combination from any combination that occurs in nature. That variable may be, inter alia, a concentration of microbe on a seed or plant tissue that does not occur naturally, or a combination of microbe and plant that does not naturally occur, or a combination of microbes or strains that do not occur naturally together. In each of these instances, the synthetic combination demonstrates the hand of man and possesses structural and/or functional attributes that are not present when the individual elements of the combination are considered in isolation.
[0235] In some embodiments, a microbe can be “endogenous” to a seed or plant. As used herein, a microbe is considered “endogenous” to a plant or seed, if the microbe is derived from the plant specimen from which it is sourced. That is, if the microbe is naturally found associated with said plant. In embodiments in which an endogenous microbe is applied to a plant, then the endogenous microbe is applied in an amount that differs from the levels found on the plant in nature. Thus, a microbe that is endogenous to a given plant can still form a synthetic combination with the plant, if the microbe is present on said plant at a level that does not occur naturally.
[0236] In some embodiments, a composition (such as a microbe) can be “heterologous” (also termed “exogenous”) to another composition (such as a seed or plant), and in some aspects is referred to herein as a “heterologous composition”. As used herein, a microbe is considered “heterologous” to a plant or seed, if the microbe is not derived from the plant specimen from which it is sourced. That is, if the microbe is not naturally found associated with said plant. For example, a microbe that is normally associated with leaf tissue of a maize plant is considered exogenous to a leaf tissue of another maize plant that naturally lacks said microbe. In another example, a microbe that is normally associated with a maize plant is considered exogenous to a wheat plant that naturally lacks said microbe.
[0237] A composition is “heterologously disposed” when mechanically or manually applied, artificially inoculated, associated with, or disposed onto or into a plant element, seedling, plant or onto or into a plant growth medium or onto or into a treatment formulation so that the treatment exists on or in the plant element, seedling, plant, plant growth medium, or formulation in a manner not found in nature prior to the application of the treatment, e.g., said combination which is not found in nature in that plant variety, at that stage in plant development, in that plant tissue, in that abundance, or in that growth environment (for example, drought). In some embodiments, such a manner is contemplated to be selected from the group consisting of: the presence of the microbe; presence of the microbe in a different number of cells, concentration, or amount; the presence of the microbe in a different plant element, tissue, cell type, or other physical location in or on the plant; the presence of the microbe at a different time period, e.g., developmental phase of the plant or plant element, time of day, time of season, and combinations thereof. In some embodiments, “heterologously disposed” means that the microbe is applied to a different tissue or cell type of the plant element than that in which the microbe is naturally found. In some embodiments, “heterologously disposed” means that the microbe is applied to a developmental stage of the plant element, seedling, or plant in which said microbe is not naturally associated, but may be associated at other stages. For example, if a microbe is normally found at the flowering stage of a plant and no other stage, a microbe applied at the seedling stage may be considered to be heterologously disposed. In some embodiments, a microbe is heterologously disposed when the microbe is normally found in the root tissue of a plant element but not in the leaf tissue, and the microbe is applied to the leaf.
In another non-limiting example, if a microbe is naturally found in the mesophyll layer of leaf tissue but is being applied to the epithelial layer, the microbe would be considered to be heterologously disposed. In some embodiments, “heterologously disposed” means that the native plant element, seedling, or plant does not contain detectable levels of the microbe in that same plant element, seedling, or plant. In some embodiments, “heterologously disposed” means that the microbe being applied is at a greater concentration, number, or amount of the plant element, seedling, or plant, than that which is naturally found in said plant element, seedling, or plant. For example, a microbe is heterologously disposed when present at a concentration that is at least 1.5 times greater, between 1.5 and 2 times greater, 2 times greater, between 2 and 3 times greater, 3 times greater, between 3 and 5 times greater, 5 times greater, between 5 and 7 times greater, 7 times greater, between 7 and 10 times greater, 10 times greater, or even greater than 10 times higher number, amount, or concentration than the concentration that was present prior to the disposition of said microbe. In another non-limiting example, a microbe that is naturally found in a tissue of a cupressaceous tree would be considered heterologous to tissue of a maize, wheat, cotton, or soybean plant. In another example, a microbe that is naturally found in leaf tissue of a maize, spring wheat, cotton, or soybean plant is considered heterologous to a leaf tissue of another maize, spring wheat, cotton, or soybean plant that naturally lacks said microbe, or comprises the microbe in a different quantity.
[0238] Microbes can also be “heterologously disposed” on a given plant tissue. This means that the microbe is placed upon a plant tissue that it is not naturally found upon. For instance, if a given microbe only naturally occurs on the roots of a given plant, then that microbe could be exogenously applied to the above-ground tissue of a plant and would thereby be “heterologously disposed” upon said plant tissue. As such, a microbe is deemed heterologously disposed, when applied on a plant that does not naturally have the microbe present or does not naturally have the microbe present in the number that is being applied.
[0239] The compositions and methods herein may provide for a “modulated” “agronomic trait” or “trait of agronomic importance” to a host plant, which may include, but not be limited to, the following: altered oil content, altered protein content, altered seed carbohydrate composition, altered seed oil composition, and altered seed protein composition, chemical tolerance, cold tolerance, delayed senescence, disease resistance, drought tolerance, ear weight, growth improvement, health enhancement, heat tolerance, herbicide tolerance, herbivore resistance, improved nitrogen fixation, improved nitrogen utilization, improved root architecture, improved water use efficiency, increased biomass, increased root length, increased seed weight, increased shoot length, increased yield, increased yield under water-limited conditions, kernel mass, kernel moisture content, metal tolerance, number of ears, number of kernels per ear, number of pods, nutrition enhancement, pathogen resistance, pest resistance, photosynthetic capability improvement, salinity tolerance, stay-green, vigor improvement, increased dry weight of mature seeds, increased fresh weight of mature seeds, increased number of mature seeds per plant, increased chlorophyll content, increased number of pods per plant, increased length of pods per plant, reduced number of wilted leaves per plant, reduced number of severely wilted leaves per plant, and increased number of non-wilted leaves per plant, a detectable modulation in the level of a metabolite, a detectable modulation in the level of a transcript, and a detectable modulation in the proteome, compared to an isoline plant grown from a seed without said seed treatment formulation. By the term “modulated”, it is intended to refer to a change in an agronomic trait that is changed by virtue of the presence of the microbe(s), exudate, broth, metabolite, etc. In aspects, the modulation provides for the imparting of a beneficial trait.
[0240] “CRISPR” (Clustered Regularly Interspaced Short Palindromic Repeats) loci refers to certain genetic loci encoding components of DNA cleavage systems, for example, used by bacterial and archaeal cells to destroy foreign DNA (Horvath and Barrangou, 2010, Science 327:167-170; WO2007025097, published 01 March 2007). A CRISPR locus can consist of a CRISPR array, comprising short direct repeats (CRISPR repeats) separated by short variable DNA sequences (called spacers), which can be flanked by diverse Cas (CRISPR-associated) genes.
[0241] As used herein, an “effector” or “effector protein” is a protein that encompasses an activity including recognizing, binding to, and/or cleaving or nicking a polynucleotide target. An effector, or effector protein, may also be an endonuclease. The “effector complex” of a CRISPR system includes Cas proteins involved in crRNA and target recognition and binding. Some of the component Cas proteins may additionally comprise domains involved in target polynucleotide cleavage.
[0242] The term “Cas protein” refers to a polypeptide encoded by a Cas (CRISPR-associated) gene. A Cas protein includes proteins encoded by a gene in a cas locus, and includes adaptation molecules as well as interference molecules. An interference molecule of a bacterial adaptive immunity complex includes endonucleases. A Cas endonuclease described herein comprises one or more nuclease domains. A Cas endonuclease includes but is not limited to: a Cas9 protein, a Cpf1 (Cas12) protein, a C2c1 protein, a C2c2 protein, a C2c3 protein, Cas3, Cas3-HD, Cas5, Cas7, Cas8, Cas10, or combinations or complexes of these. A Cas protein may be a “Cas endonuclease” or “Cas effector protein”, that when in complex with a suitable polynucleotide component, is capable of recognizing, binding to, and optionally nicking or cleaving all or part of a specific polynucleotide target sequence. Cas protein is further defined as a functional fragment or functional variant of a native Cas protein, or a protein that shares at least 50%, between 50% and 55%, at least 55%, between 55% and 60%, at least 60%, between 60% and 65%, at least 65%, between 65% and 70%, at least 70%, between 70% and 75%, at least 75%, between 75% and 80%, at least 80%, between 80% and 85%, at least 85%, between 85% and 90%, at least 90%, between 90% and 95%, at least 95%, between 95% and 96%, at least 96%, between 96% and 97%, at least 97%, between 97% and 98%, at least 98%, between 98% and 99%, at least 99%, between 99% and 100%, or 100% sequence identity with at least 50, between 50 and 100, at least 100, between 100 and 150, at least 150, between 150 and 200, at least 200, between 200 and 250, at least 250, between 250 and 300, at least 300, between 300 and 350, at least 350, between 350 and 400, at least 400, between 400 and 450, at least 500, or greater than 500 contiguous amino acids of a native Cas protein, and retains at least partial activity of the native sequence.
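As a non-limiting illustration of the sequence identity criterion recited above, the following Python sketch computes percent identity over a stretch of contiguous amino acids. It is a simplified, ungapped comparison and does not stand in for the alignment programs named elsewhere herein; the function names `percent_identity` and `best_window_identity` and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(native: str, candidate: str, window: int) -> float:
    """Highest ungapped identity between any `window`-residue stretch of the
    native protein and any `window`-residue stretch of the candidate.

    Assumption: no gaps are allowed, so insertions or deletions lower the score.
    """
    if window <= 0 or window > min(len(native), len(candidate)):
        raise ValueError("window must fit inside both sequences")
    best = 0.0
    for i in range(len(native) - window + 1):
        reference = native[i:i + window]
        for j in range(len(candidate) - window + 1):
            best = max(best, percent_identity(reference, candidate[j:j + window]))
    return best
```

A candidate would satisfy, for example, an “at least 90% identity with at least 50 contiguous amino acids” threshold under this simplified measure if `best_window_identity(native, candidate, 50) >= 90.0`. The exhaustive window search is quadratic in sequence length and is chosen here for clarity rather than speed.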
[0243] A “functional fragment”, “fragment that is functionally equivalent”, and “functionally equivalent fragment” of a Cas endonuclease are used interchangeably herein, and refer to a portion or subsequence of the Cas endonuclease of the present disclosure in which the ability to recognize, bind to, and optionally unwind, nick or cleave (introduce a single or double strand break in) the target site is retained. The portion or subsequence of the Cas endonuclease can comprise a complete or partial (functional) peptide of any one of its domains such as for example, but not limiting to a complete or functional part of a Cas3 HD domain, a complete or functional part of a Cas3 Helicase domain, a complete or functional part of a protein (such as but not limiting to a Cas5, Cas5d, Cas7 and Cas8b1).
[0244] The terms “functional variant”, “variant that is functionally equivalent”, and “functionally equivalent variant” of a Cas endonuclease or Cas effector protein are used interchangeably herein, and refer to a variant of the Cas effector protein disclosed herein in which the ability to recognize, bind to, and optionally unwind, nick or cleave all or part of a target sequence is retained.
[0245] A Cas endonuclease may also include a multifunctional Cas endonuclease. The term “multifunctional Cas endonuclease” and “multifunctional Cas endonuclease polypeptide” are used interchangeably herein and includes reference to a single polypeptide that has Cas endonuclease functionality (comprising at least one protein domain that can act as a Cas endonuclease) and at least one other functionality, such as but not limited to, the functionality to form a complex (comprises at least a second protein domain that can form a complex with other proteins). In one aspect, the multifunctional Cas endonuclease comprises at least one additional protein domain relative (either internally, upstream (5’), downstream (3'), or both internally 5’ and 3’, or any combination thereof) to those domains typical of a Cas endonuclease.
[0246] The terms “cascade” and “cascade complex” are used interchangeably herein and include reference to a multi-subunit protein complex that can assemble with a polynucleotide forming a polynucleotide-protein complex (PNP). Cascade is a PNP that relies on the polynucleotide for complex assembly and stability, and for the identification of target nucleic acid sequences. Cascade functions as a surveillance complex that finds and optionally binds target nucleic acids that are complementary to a variable targeting domain of the guide polynucleotide.
[0247] The terms “cleavage-ready Cascade”, “crCascade”, “cleavage-ready Cascade complex”, “crCascade complex”, “cleavage-ready Cascade system”, “CRC”, and “crCascade system” are used interchangeably herein and include reference to a multi-subunit protein complex that can assemble with a polynucleotide forming a polynucleotide-protein complex (PNP), wherein one of the cascade proteins is a Cas endonuclease capable of recognizing, binding to, and optionally unwinding, nicking, or cleaving all or part of a target sequence.
[0248] The terms “single guide RNA" and “sgRNA” are used interchangeably herein and relate to a synthetic fusion of two RNA molecules, a crRNA (CRISPR RNA) comprising a variable targeting domain (linked to a tracr mate sequence that hybridizes to a tracrRNA), fused to a tracrRNA (trans-activating CRISPR RNA). The single guide RNA can comprise a crRNA or crRNA fragment and a tracrRNA or tracrRNA fragment of the type II CRISPR/Cas system that can form a complex with a type II Cas endonuclease, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, optionally bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site.
[0249] The term “variable targeting domain” or “VT domain” is used interchangeably herein and includes a nucleotide sequence that can hybridize (is complementary) to one strand (nucleotide sequence) of a double strand DNA target site. The percent complementation between the first nucleotide sequence domain (VT domain) and the target sequence can be at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%.
[0250] The variable targeting domain can be at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length. In some embodiments, the variable targeting domain comprises a contiguous stretch of 12 to 30 nucleotides. The variable targeting domain can be composed of a DNA sequence, a RNA sequence, a modified DNA sequence, a modified RNA sequence, or any combination thereof.
[0251] The term “Cas endonuclease recognition domain” or “CER domain” (of a guide polynucleotide) is used interchangeably herein and includes a nucleotide sequence that interacts with a Cas endonuclease polypeptide. A CER domain comprises a (trans-acting) tracrNucleotide mate sequence followed by a tracrNucleotide sequence. The CER domain can be composed of a DNA sequence, a RNA sequence, a modified DNA sequence, a modified RNA sequence (see for example US20150059010A1, published 26 February 2015), or any combination thereof.
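As a non-limiting illustration of the percent complementation between a variable targeting domain and a target sequence discussed above, the following Python sketch counts Watson-Crick pairs between a VT domain and one strand of a DNA target site. The function name `percent_complementarity`, the antiparallel 5'-to-3' convention, and the example sequences are assumptions of this sketch; wobble pairs and modified nucleotides are not modeled.

```python
# Watson-Crick partners in the DNA alphabet; RNA input is converted U -> T first.
_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(vt_domain: str, target_strand: str) -> float:
    """Percent of VT-domain positions that base-pair with the target strand.

    Both sequences are written 5'->3' and are assumed to be of equal length;
    the VT domain pairs with the target strand in antiparallel orientation.
    """
    vt = vt_domain.upper().replace("U", "T")
    target = target_strand.upper().replace("U", "T")
    if not vt or len(vt) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        _DNA_COMPLEMENT.get(v) == t for v, t in zip(vt, reversed(target))
    )
    return 100.0 * paired / len(vt)
```

For a 12-nucleotide VT domain `AAGGCCUUAGCA` and the target strand `TGCTAAGGCCTT`, every position pairs, so the function returns 100.0; a single mismatched position lowers the value to 11/12 of 100%.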
[0252] As used herein, the terms “guide polynucleotide/Cas endonuclease complex”, “guide polynucleotide/Cas endonuclease system”, “guide polynucleotide/Cas complex”, “guide polynucleotide/Cas system”, “guided Cas system”, and “polynucleotide-guided endonuclease” are used interchangeably herein and refer to at least one guide polynucleotide and at least one Cas endonuclease, that are capable of forming a complex, wherein said guide polynucleotide/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site. A guide polynucleotide/Cas endonuclease complex herein can comprise Cas protein(s) and suitable polynucleotide component(s) of any of the CRISPR systems known in the art or described herein.
[0253] The terms “guide RNA/Cas endonuclease complex”, “guide RNA/Cas endonuclease system”, “guide RNA/Cas complex”, “guide RNA/Cas system”, “gRNA/Cas complex”, “gRNA/Cas system”, and “RNA-guided endonuclease” are used interchangeably herein and refer to at least one RNA component and at least one Cas endonuclease that are capable of forming a complex, wherein said guide RNA/Cas endonuclease complex can direct the Cas endonuclease to a DNA target site, enabling the Cas endonuclease to recognize, bind to, and optionally nick or cleave (introduce a single or double-strand break) the DNA target site.
[0254] The terms “target site”, “target sequence”, “target site sequence”, “target DNA”, “target locus”, “genomic target site”, “genomic target sequence”, and “genomic target locus” are used interchangeably herein and refer to a polynucleotide sequence such as, but not limited to, a nucleotide sequence on a chromosome, episome, a locus, or any other DNA molecule in the genome (including chromosomal, chloroplastic, mitochondrial DNA, plasmid DNA) of a cell, at which a guide polynucleotide/Cas endonuclease complex can recognize, bind to, and optionally nick or cleave. The target site can be an endogenous site in the genome of a cell, or alternatively, the target site can be heterologous to the cell and thereby not be naturally occurring in the genome of the cell, or the target site can be found in a heterologous genomic location compared to where it occurs in nature. In some aspects, the target site is simply referred to as “target” when discussed with respect to a polynucleotide sequence. A “target cell” or “target organism” is one that comprises a target polynucleotide.
[0255] A “protospacer adjacent motif” (PAM) herein refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that is recognized (targeted) by a guide polynucleotide/Cas endonuclease system described herein. The Cas endonuclease may not successfully recognize a target DNA sequence if the target DNA sequence is not followed by a PAM sequence. The sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used. The PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long.
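As a non-limiting illustration of a target sequence being “followed by” a PAM, the following Python sketch scans one strand of a DNA sequence for candidate protospacers immediately followed by a PAM written in IUPAC nucleotide codes. The function name `find_protospacers`, the default protospacer length of 20 nucleotides, and the example PAM `NGG` are assumptions chosen for illustration only; they are not asserted to be the PAM or protospacer length of any Cas endonuclease disclosed herein, and the opposite strand is not searched.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_protospacers(dna: str, pam: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam_match) for each site on the given strand
    where a protospacer of `protospacer_len` nt is immediately followed by the PAM.
    """
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile("(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_regex))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]
```

For the sequence `TTTTACGTACGTACGTACGTACGTAGGCCC` and the illustrative PAM `NGG`, the only hit is the 20-nucleotide protospacer starting at index 4, followed by `AGG`.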
[0256] An “altered target site”, “altered target sequence”, “modified target site”, “alterations”, and “modified target sequence” are used interchangeably herein and refer to a target sequence as disclosed herein that comprises at least one alteration when compared to non-altered target sequence. Such “alterations” include, for example: (i) replacement of at least one nucleotide, (ii) a deletion of at least one nucleotide, (iii) an insertion of at least one nucleotide, (iv) a chemical alteration of at least one nucleotide, and/or (v) any combination of the preceding.
[0257] A “modified nucleotide” or “edited nucleotide” refers to a nucleotide sequence of interest that comprises at least one alteration when compared to its non-modified nucleotide sequence.
[0258] As used herein, “donor polynucleotide” is a polynucleotide construct (e.g., DNA) that comprises a polynucleotide of interest to be inserted into the target site of a Cas endonuclease.
[0259] The term “polynucleotide modification template” includes a polynucleotide that comprises at least one nucleotide modification when compared to the nucleotide sequence to be edited. A nucleotide modification can be at least one nucleotide substitution, addition or deletion. Optionally, the polynucleotide modification template can further comprise homologous nucleotide sequences flanking the at least one nucleotide modification, wherein the flanking homologous nucleotide sequences provide sufficient homology to the desired nucleotide sequence to be edited.
Microbes and Microorganisms
[0260] As used herein the term “microorganism” should be taken broadly. It includes, but is not limited to, prokaryotic Bacteria and Archaea, as well as eukaryotic Fungi and Protists.
[0261] By way of example, the microorganisms may include: Proteobacteria (such as Pseudomonas, Enterobacter, Stenotrophomonas, Burkholderia, Rhizobium, Herbaspirillum, Pantoea, Serratia, Rahnella, Azospirillum, Azorhizobium, Azotobacter, Duganella, Delftia, Bradyrhizobium, Sinorhizobium, Variovorax and Halomonas), Firmicutes (such as Bacillus, Paenibacillus, Lactobacillus, Mycoplasma, and Acetobacterium), Actinobacteria (such as Brevibacterium, Janibacter, Streptomyces, Rhodococcus, Microbacterium, Curtobacterium, Cellulomonas, and Nocardioides), and the fungi Ascomycota (such as Trichoderma, Ampelomyces, Coniothyrium, Paecilomyces, Penicillium, Cladosporium, Hypocrea, Beauveria, Metarhizium, Verticillium, Cordyceps, Pichia, and Candida), Basidiomycota (such as Coprinus, Corticium, and Agaricus) and Oomycota (such as Pythium), and Mucoromycota (such as Mucor, and Mortierella); as well as Orbilia/Arthrobotrys, Lysinibacillus, Microbacterium, Talaromyces, Arthrobacter, Kosakonia, Massilia, Novosphingobium, and Tumebacillus.
[0262] In a particular embodiment, the microorganism is an endophyte, or an epiphyte, or a microorganism inhabiting the plant rhizosphere or rhizosheath. That is, the microorganism may be found present in the soil material adhered to the roots of a plant or in the area immediately adjacent a plant’s roots.
[0263] In one embodiment, the microorganism is an endophyte. Endophytes may benefit host plants by preventing pathogenic organisms from colonizing them. Extensive colonization of the plant tissue by endophytes creates a “barrier effect,” where the local endophytes outcompete and prevent pathogenic organisms from taking hold. Endophytes may also produce chemicals which inhibit the growth of competitors, including pathogenic organisms.
[0264] In certain embodiments, the microorganism is unculturable. This should be taken to mean that the microorganism is not known to be culturable or is difficult to culture using methods known to one skilled in the art.
[0265] Microorganisms of the present disclosure may be collected or obtained from any source or contained within and/or associated with material collected from any source.
[0266] In an embodiment, the microorganisms are obtained from any general terrestrial environment, including its soils, plants, fungi, animals (including invertebrates) and other biota, including the sediments, water and biota of lakes and rivers; from the marine environment, its biota and sediments (for example sea water, marine muds, marine plants, marine invertebrates (for example sponges), marine vertebrates (for example, fish)); the terrestrial and marine geosphere (regolith and rock, for example crushed subterranean rocks, sand and clays); the cryosphere and its meltwater; the atmosphere (for example, filtered aerial dusts, cloud and rain droplets); urban, industrial and other man-made environments (for example, accumulated organic and mineral matter on concrete, roadside gutters, roof surfaces, road surfaces).
[0267] In another embodiment the microorganisms are collected from a source likely to favor the selection of appropriate microorganisms. By way of example, the source may be a particular environment in which it is desirable for other plants to grow, or which is thought to be associated with terroir. In another example, the source may be a plant having one or more desirable traits, for example a plant which naturally grows in a particular environment or under certain conditions of interest. By way of example, a certain plant may naturally grow in sandy soil or sand of high salinity, or under extreme temperatures, or with little water, or it may be resistant to certain pests or disease present in the environment, and it may be desirable for a commercial crop to be grown in such conditions, particularly if they are, for example, the only conditions available in a particular geographic location. By way of further example, the microorganisms may be collected from commercial crops grown in such environments, or more specifically from individual crop plants best displaying a trait of interest amongst a crop grown in any specific environment, for example the fastest-growing plants amongst a crop grown in saline-limiting soils, or the least damaged plants in crops exposed to severe insect damage or disease epidemic, or plants having desired quantities of certain metabolites and other compounds, including fiber content, oil content and the like, or plants displaying desirable colors, taste, or smell. The microorganisms may be collected from a plant of interest or any material occurring in the environment of interest, including fungi and other animal and plant biota, soil, water, sediments, and other elements of the environment as referred to previously. In certain embodiments, the microorganisms are individual isolates separated from different environments.
[0268] In one embodiment a microorganism or a combination of microorganisms, of use in the methods of the disclosure may be selected from a pre-existing collection of individual microbial species or strains based on some knowledge of their likely or predicted benefit to a plant. For example, the microorganism may be predicted to: improve nitrogen fixation; release phosphate from the soil organic matter; release phosphate from the inorganic forms of phosphate (e.g., rock phosphate); “fix carbon” in the root microsphere; live in the rhizosphere of the plant thereby assisting the plant in absorbing nutrients from the surrounding soil and then providing these more readily to the plant; increase the number of nodules on the plant roots and thereby increase the number of symbiotic nitrogen fixing bacteria (e.g., Rhizobium species) per plant and the amount of nitrogen fixed by the plant; elicit plant defensive responses such as ISR (induced systemic resistance) or SAR (systemic acquired resistance) which help the plant resist the invasion and spread of pathogenic microorganisms; compete with microorganisms deleterious to plant growth or health by antagonism, or competitive utilization of resources such as nutrients or space; change the color of one or more part of the plant, or change the chemical profile of the plant, its smell, taste or one or more other quality.
[0269] In one embodiment a microorganism or combination of microorganisms is selected from a pre-existing collection of individual microbial species or strains that provides no knowledge of their likely or predicted benefit to a plant. For example, a collection of unidentified microorganisms isolated from plant tissues without any knowledge of their ability to improve plant growth or health, or a collection of microorganisms collected to explore their potential for producing compounds that could lead to the development of pharmaceutical drugs.
[0270] In one embodiment, the microorganisms are acquired from the source material (for example, soil, rock, water, air, dust, plant or other organism) in which they naturally reside. The microorganisms may be provided in any appropriate form, having regard to its intended use in the methods of the disclosure. However, by way of example only, the microorganisms may be provided as an aqueous suspension, gel, homogenate, granule, powder, slurry, live organism or dried material.
[0271] The microorganisms of the disclosure may be isolated in substantially pure or mixed cultures. They may be concentrated, diluted, or provided in the natural concentrations in which they are found in the source material. For example, microorganisms from saline sediments may be isolated for use in this disclosure by suspending the sediment in fresh water and allowing the sediment to fall to the bottom. The water containing the bulk of the microorganisms may be removed by decantation after a suitable period of settling and either applied directly to the plant growth medium, or concentrated by filtering or centrifugation, diluted to an appropriate concentration and applied to the plant growth medium with the bulk of the salt removed. By way of further example, microorganisms from mineralized or toxic sources may be similarly treated to recover the microbes for application to the plant growth material to minimize the potential for damage to the plant.
[0272] In another embodiment, the microorganisms are used in a crude form, in which they are not isolated from the source material in which they naturally reside. For example, the microorganisms are provided in combination with the source material in which they reside; for example, as soil, or the roots, seed or foliage of a plant. In this embodiment, the source material may include one or more species of microorganisms.
[0273] In some embodiments, a mixed population of microorganisms is used in the methods of the disclosure.
[0274] In embodiments of the disclosure where the microorganisms are isolated from a source material (for example, the material in which they naturally reside), any one or a combination of a number of standard techniques which will be readily known to skilled persons may be used. However, by way of example, these in general employ processes by which a solid or liquid culture of a single microorganism can be obtained in a substantially pure form, usually by physical separation on the surface of a solid microbial growth medium or by volumetric dilutive isolation into a liquid microbial growth medium. These processes may include isolation from dry material, liquid suspension, slurries or homogenates in which the material is spread in a thin layer over an appropriate solid gel growth medium, or serial dilutions of the material made into a sterile medium and inoculated into liquid or solid culture media.
[0275] Whilst not essential, in one embodiment, the material containing the microorganisms may be pre-treated prior to the isolation process in order to either multiply all microorganisms in the material, or select portions of the microbial population, either by enriching the material with microbial nutrients or by applying a selective treatment (for example, by pasteurizing the sample to select for microorganisms resistant to heat exposure (for example, bacilli), or by exposing the sample to low concentrations of an organic solvent or sterilant (for example, household bleach) to enhance the survival of spore-forming or solvent-resistant microorganisms). Microorganisms can then be isolated from the enriched materials or materials treated for selective survival, as above.
[0276] In an embodiment of the disclosure, endophytic or epiphytic microorganisms are isolated from plant material. Any number of standard techniques known in the art may be used and the microorganisms may be isolated from any appropriate tissue in the plant, including for example root, stem and leaves, and plant reproductive tissues. By way of example, conventional methods for isolation from plants typically include the sterile excision of the plant material of interest (e.g., root or stem lengths, leaves), surface sterilization with an appropriate solution (e.g., 2% sodium hypochlorite), after which the plant material is placed on nutrient medium for microbial growth (See, for example, Strobel G and Daisy B (2003) Microbiology and Molecular Biology Reviews 67 (4): 491-502; Zinniel DK et al. (2002) Applied and Environmental Microbiology 68 (5): 2198-2208).
[0277] In one embodiment of the disclosure, the microorganisms are isolated from root tissue. Further methodology for isolating microorganisms from plant material is detailed hereinafter.
[0278] In one embodiment, the microbial population is exposed (prior to the method or at any stage of the method) to a selective pressure. For example, exposure of the microorganisms to pasteurization before their addition to a plant growth medium (preferably sterile) is likely to enhance the probability that the plants selected for a desired trait will be associated with spore-forming microbes that can more easily survive in adverse conditions, in commercial storage, or if applied to seed as a coating, in an adverse environment.
[0279] In certain embodiments, as mentioned herein before, the microorganism(s) may be used in crude form and need not be isolated from a plant or a media. For example, plant material or growth media which includes the microorganisms identified to be of benefit to a selected plant may be obtained and used as a crude source of microorganisms for the next round of the method or as a crude source of microorganisms at the conclusion of the method. For example, whole plant material could be obtained and optionally processed, such as mulched or crushed. Alternatively, individual tissues or parts of selected plants (such as leaves, stems, roots, and seeds) may be separated from the plant and optionally processed, such as mulched or crushed. In certain embodiments, one or more part of a plant which is associated with the second set of one or more microorganisms may be removed from one or more selected plants and, where any successive repeat of the method is to be conducted, grafted on to one or more plant used in any step of the plant breeding methods.
Exemplary Microbes
[0280] In aspects, the present disclosure provides isolated microbes, including novel strains of identified microbial species, presented in Table 1.
[0281] In other aspects, the present disclosure provides isolated whole microbial cultures of the species and strains identified in Table 1. These cultures may comprise microbes at various concentrations.
[0282] In aspects, the disclosure provides for utilizing a microbe selected from Table 1 in agriculture.
[0283] In some embodiments, a microbe from the genus Bacillus is utilized in agriculture to impart one or more beneficial properties to a plant species.
[0284] Furthermore, the disclosure relates to microbes having characteristics substantially similar to that of a microbe identified in Table 1.
[0285] The isolated microbial species, and novel strains of said species, identified in the present disclosure, are able to impart beneficial properties or traits, such as a trait of agronomic importance, to target plant species.
[0286] For instance, the isolated microbes described in Table 1, or consortia of said microbes, are able to improve plant health and vitality. The improved plant health and vitality can be quantitatively measured, for example, by measuring the effect that said microbial application has upon a plant phenotypic or genotypic trait.
Sourcing of Microbes
[0287] The microbes of the present disclosure were obtained, among other places, at various locales in New Zealand and the United States.
Isolation and Culturing of Microbes
[0288] The microbes of Table 1 were identified by utilizing standard techniques to characterize the microbes’ phenotype, which was then utilized to identify the microbe to a taxonomically recognized species. Alternatively, the microbes of Table 1 were sequenced (16S and/or Whole Genome Sequencing, according to methods known in the art) to determine taxonomy.
[0289] The isolation, identification, and culturing of the microbes of the present disclosure can be effected using standard microbiological techniques. Examples of such techniques may be found in Gerhardt, P. (ed.) Methods for General and Molecular Microbiology. American Society for Microbiology, Washington, D.C. (1994) and Lennette, E. H. (ed.) Manual of Clinical Microbiology, Third Edition. American Society for Microbiology, Washington, D.C. (1980), each of which is incorporated by reference.
[0290] Isolation can be effected by streaking the specimen on a solid medium (e.g., nutrient agar plates) to obtain a single colony, which is characterized by the phenotypic traits described hereinabove (e.g., Gram positive/negative, capable of forming spores aerobically/anaerobically, cellular morphology, carbon source metabolism, acid/base production, enzyme secretion, metabolic secretions, etc.) and to reduce the likelihood of working with a culture which has become contaminated.
[0291] For example, for isolated bacteria of the disclosure, biologically pure isolates can be obtained through repeated subculture of biological samples, each subculture followed by streaking onto solid media to obtain individual colonies. Methods of preparing, thawing, and growing lyophilized bacteria are commonly known, for example, Gherna, R. L. and C. A. Reddy. 2007. Culture Preservation, p 1019-1033. In C. A. Reddy, T. J. Beveridge, J. A. Breznak, G. A. Marzluf, T. M. Schmidt, and L. R. Snyder, eds. American Society for Microbiology, Washington, D.C., 1033 pages; herein incorporated by reference. Thus freeze-dried liquid formulations and cultures stored long term at -70°C in solutions containing glycerol are contemplated for use in providing formulations of the present inventions.
[0292] The bacteria of the disclosure can be propagated in a liquid medium under aerobic conditions. Medium for growing the bacterial strains of the present disclosure includes a carbon source, a nitrogen source, and inorganic salts, as well as specially required substances such as vitamins, amino acids, nucleic acids and the like. Examples of suitable carbon sources which can be used for growing the bacterial strains include, but are not limited to, starch, peptone, yeast extract, amino acids, sugars such as glucose, arabinose, mannose, glucosamine, maltose, and the like; salts of organic acids such as acetic acid, fumaric acid, adipic acid, propionic acid, citric acid, gluconic acid, malic acid, pyruvic acid, malonic acid and the like; alcohols such as ethanol and glycerol and the like; oil or fat such as soybean oil, rice bran oil, olive oil, corn oil, sesame oil. The amount of the carbon source added varies according to the kind of carbon source and is typically between 1 and 100 grams per liter of medium. Preferably, glucose, starch, and/or peptone is contained in the medium as a major carbon source, at a concentration of 0.1-5% (W/V). Examples of suitable nitrogen sources which can be used for growing the bacterial strains of the present invention include, but are not limited to, amino acids, yeast extract, tryptone, beef extract, peptone, potassium nitrate, ammonium nitrate, ammonium chloride, ammonium sulfate, ammonium phosphate, ammonia or combinations thereof. The amount of nitrogen source varies according to the type of nitrogen source, typically between 0.1 and 30 grams per liter of medium.
The inorganic salts potassium dihydrogen phosphate, dipotassium hydrogen phosphate, disodium hydrogen phosphate, magnesium sulfate, magnesium chloride, ferric sulfate, ferrous sulfate, ferric chloride, ferrous chloride, manganous sulfate, manganous chloride, zinc sulfate, zinc chloride, cupric sulfate, calcium chloride, sodium chloride, calcium carbonate, and sodium carbonate can be used alone or in combination. The amount of inorganic salt varies according to the kind of the inorganic salt, typically between 0.001 and 10 grams per liter of medium. Examples of specially required substances include, but are not limited to, vitamins, nucleic acids, yeast extract, peptone, meat extract, malt extract, dried yeast and combinations thereof. Cultivation can be effected at a temperature which allows the growth of the bacterial strains, essentially between 20°C and 46°C. In some aspects, a temperature range is 30°C-37°C. For optimal growth, in some embodiments, the medium can be adjusted to pH 7.0-7.4. It will be appreciated that commercially available media may also be used to culture the bacterial strains, such as Nutrient Broth or Nutrient Agar available from Difco, Detroit, MI. It will be appreciated that cultivation time may differ depending on the type of culture medium used and the concentration of sugar as a major carbon source.
[0293] In aspects, cultivation lasts between 24-96 hours. Bacterial cells thus obtained are isolated using methods, which are well known in the art. Examples include, but are not limited to, membrane filtration and centrifugal separation. The pH may be adjusted using sodium hydroxide and the like and the culture may be dried using a freeze dryer, until the water content becomes equal to 4% or less. Microbial co-cultures may be obtained by propagating each strain as described hereinabove. It will be appreciated that the microbial strains may be cultured together when compatible culture conditions can be employed.
Identification of Microbes
[0294] Microbes can be distinguished into a genus based on polyphasic taxonomy, which incorporates all available phenotypic and genotypic data into a consensus classification (Vandamme et al. 1996. Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol Rev 1996, 60:407-438). One accepted genotypic method for defining species is based on overall genomic relatedness, such that strains which share approximately 70% or more relatedness using DNA-DNA hybridization, with 5°C or less ΔTm (the difference in the melting temperature between homologous and heterologous hybrids), under standard conditions, are considered to be members of the same species. Thus, populations that share greater than the aforementioned 70% threshold can be considered to be variants of the same species.
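The two-part hybridization criterion described above can be written as a simple check. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and the default cutoffs simply restate the approximate values given in the preceding paragraph (70% relatedness and a 5°C difference in melting temperature). Because the 70% figure is described as approximate, borderline values would in practice call for expert judgment rather than a hard cutoff.

```python
def same_species_by_ddh(relatedness_pct, delta_tm_c,
                        min_relatedness_pct=70.0, max_delta_tm_c=5.0):
    """Apply the DNA-DNA hybridization species criterion.

    relatedness_pct: percent relatedness measured by DNA-DNA hybridization.
    delta_tm_c: difference in melting temperature (in degrees C) between
        homologous and heterologous hybrids.
    Returns True only when both conditions are satisfied.
    """
    return (relatedness_pct >= min_relatedness_pct
            and delta_tm_c <= max_delta_tm_c)
```

Under these assumed defaults, a pair of strains with 82% relatedness and a 3°C difference would be treated as the same species, whereas a pair with 82% relatedness and an 8°C difference would not.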
[0295] For bacterial microbes, the 16S rRNA sequences are often used for determining taxonomy and making distinctions between species, in that if a 16S rRNA sequence shares less than a specified % sequence identity from a reference sequence, then the two organisms from which the sequences were obtained are said to be of different species.
[0296] Thus, one could consider microbes to be of the same species, if they share at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity across the 16S or 16S rRNA or rDNA sequence. In some aspects, a microbe could be considered to be the same species only if it shares at least 95% identity.
[0297] Further, one could define microbial strains of a species, as those that share at least 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity across the 16S rRNA sequence.
[0298] Comparisons may also be made with 23S rRNA sequences against reference sequences. In some aspects, a microbe could be considered to be the same strain only if it shares at least 95% identity. In some embodiments, “substantially similar genetic characteristics” means a microbe sharing at least 95% identity.
[0299] For fungal microbes, the ITS (Internal Transcribed Spacer) is often used for identification of taxonomy. Among the regions of the ribosomal cistron, the internal transcribed spacer (ITS) region has the highest probability of successful identification for the broadest range of fungi, with the most clearly defined barcode gap between inter- and intraspecific variation, and has been proposed as the formal fungal identification sequence (Schoch et al., PNAS April 17, 2012 109 (16) 6241-6246).
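By way of a hedged illustration, the percent identity comparisons described above can be sketched in Python as follows. The sketch assumes that the two rRNA sequences have already been aligned to equal length by a separate alignment tool, with "-" marking gaps, and it counts a column containing a gap in only one sequence as a mismatch. Several conventions for computing percent identity exist, and this disclosure does not specify one; the function names and the 97% default threshold are hypothetical choices for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are skipped. A column
    with a gap in only one sequence counts as a mismatch (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold_pct=97.0):
    """True when the sequences share at least `threshold_pct` identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct
```

For instance, two aligned 10-nucleotide fragments that differ at one position share 90% identity, so they would satisfy a 90% threshold but not a 95% threshold.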
[0300] In one embodiment, microbial strains of the present disclosure include those that comprise polynucleotide sequences that share at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
[0301] In one embodiment, microbes of the present disclosure include those that comprise polynucleotide sequences that share at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
[0302] In one embodiment, microbial consortia of the present disclosure include two or more microbes that comprise polynucleotide sequences that share at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
[0303] In one embodiment, microbial consortia of the present disclosure include two or more microbial strains, wherein at least one of those comprises a polynucleotide sequence that shares at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179.
[0304] In one embodiment, microbial consortia of the present disclosure include two or more microbial strains, wherein at least one of those comprises a polynucleotide sequence that shares at least 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any one of SEQ ID NOs: 1-179, and wherein at least one of the microbes is optionally selected from Table 1.
[0305] Unculturable microbes often cannot be assigned to a definite species in the absence of a phenotype determination; the microbes can be given a Candidatus designation within a genus provided their 16S rRNA sequences subscribe to the principles of identity with known species.
[0306] One approach is to observe the distribution of a large number of strains of closely related species in sequence space and to identify clusters of strains that are well resolved from other clusters. This approach has been developed by using the concatenated sequences of multiple core (house-keeping) genes to assess clustering patterns, and has been called multilocus sequence analysis (MLSA) or multilocus sequence phylogenetic analysis. MLSA has been used successfully to explore clustering patterns among large numbers of strains assigned to very closely related species by current taxonomic methods, to look at the relationships between small numbers of strains within a genus, or within a broader taxonomic grouping, and to address specific taxonomic questions. More generally, the method can be used to ask whether bacterial species exist - that is, to observe whether large populations of similar strains invariably fall into well-resolved clusters, or whether in some cases there is a genetic continuum in which clear separation into clusters is not observed.
[0307] In order to more accurately make a determination of genera, a determination of phenotypic traits, such as morphological, biochemical, and physiological characteristics, is made for comparison with a reference genus archetype. The colony morphology can include color, shape, pigmentation, production of slime, etc. Features of the cell are described as to shape, size, Gram reaction, extracellular material, presence of endospores, flagella presence and location, motility, and inclusion bodies. Biochemical and physiological features describe growth of the organism at different ranges of temperature, pH, salinity and atmospheric conditions, growth in presence of different sole carbon and nitrogen sources. One of ordinary skill in the art would be reasonably apprised as to the phenotypic traits that define the genera of the present disclosure. For instance, colony color, form, and texture on a particular agar (e.g., YMA) was used to identify species of Rhizobium.
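The multilocus approach described above (concatenating several house-keeping gene sequences per strain and then looking for well-resolved clusters) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method used in this disclosure: each locus is assumed to be pre-aligned across strains, distance is the simple proportion of differing sites, and clusters are formed by single-linkage grouping under a hypothetical distance cutoff. The function names are invented for illustration, and published MLSA studies typically use phylogenetic tree inference rather than threshold clustering.

```python
from itertools import combinations


def concatenate_loci(strain_loci, locus_order):
    """Concatenate each strain's aligned locus sequences in a fixed order."""
    return {strain: "".join(loci[name] for name in locus_order)
            for strain, loci in strain_loci.items()}


def p_distance(a, b):
    """Proportion of differing sites between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_strains(concatenated, max_distance):
    """Single-linkage clusters: strains joined when distance <= max_distance."""
    parent = {strain: strain for strain in concatenated}

    def find(s):
        # Follow parent links to the cluster representative.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(concatenated, 2):
        if p_distance(concatenated[a], concatenated[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups = {}
    for strain in concatenated:
        groups.setdefault(find(strain), []).append(strain)
    return sorted(sorted(members) for members in groups.values())
```

In use, a mapping from strain name to per-locus sequences is concatenated and then clustered. Whether the resulting groups are well resolved, or instead merge into a continuum as the cutoff is varied, corresponds to the question posed at the end of the paragraph above.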
[0308] In one embodiment, bacterial microbes taught herein were identified utilizing 16S rRNA gene sequences. It is known in the art that 16S rRNA contains hypervariable regions that can provide species/strain-specific signature sequences useful for bacterial identification. In the present disclosure, many of the microbes were identified via partial (500 - 1200 bp) 16S rRNA sequence signatures. In aspects, each strain represents a pure colony isolate that was selected from an agar plate. Selections were made to represent the diversity of organisms present based on any defining morphological characteristics of colonies on agar medium. The medium used, in embodiments, was R2A, PDA, Nitrogen-free semi-solid medium, or MRS agar. Colony descriptions of each of the ‘picked’ isolates were made after 24-hour growth and then entered into our database. Sequence data was subsequently obtained for each of the isolates.
[0309] Phylogenetic analysis using the 16S rRNA gene was used to define “substantially similar” species belonging to common genera and also to define “substantially similar” strains of a given taxonomic species. Further, we recorded physiological and/or biochemical properties of the isolates that can be utilized to highlight both minor and significant differences between strains that could lead to advantageous behavior on plants.
CRISPR-Cas Systems
[0310] CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) endonucleases were first identified in bacteria as an example of an “adaptive immunity” system, by which certain bacteria were able to survive viral infection utilizing an immune-type response. Short repeats of an infective virus’ DNA were found in an array pattern within the target bacterium, with the repeats interspersed with spacer sequences unique to that bacterium. It was determined that future infections in that bacterium by the same virus were able to be combatted, as the bacterium produced an endonuclease (Cas endonuclease) that was able to specifically target that virus, using the repeat sequences stored in the array.
[0311] A CRISPR-Cas system comprises, at a minimum, a CRISPR RNA (crRNA) molecule and at least one CRISPR-associated (Cas) protein to form a crRNA ribonucleoprotein (crRNP) effector complex. CRISPR-Cas loci comprise an array of identical repeats interspersed with DNA-targeting spacers that encode the crRNA components and an operon-like unit of cas genes encoding the Cas protein components. The resulting ribonucleoprotein complex recognizes a polynucleotide in a sequence-specific manner (Jore et al., Nature Structural & Molecular Biology 18, 529-536 (2011)). The crRNA serves as a guide RNA for sequence-specific binding of the effector (protein or complex) to double-stranded DNA sequences, by forming base pairs with the complementary DNA strand while displacing the noncomplementary strand to form a so-called R-loop.
[0312] CRISPR-Cas systems have been classified according to sequence and structural analysis of genomic loci and the associated encoded protein(s). Multiple CRISPR/Cas systems have been described including Class 1 systems, with multi-subunit effector complexes (comprising type I, type III, and type IV), and Class 2 systems, with single protein effectors (comprising type II, type V, and type VI) (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13: 1-15; Zetsche et al., 2015, Cell 163, 1-13; Shmakov et al., 2015, Molecular Cell 60, 1-13; Haft et al., 2005, Computational Biology, PLoS Comput Biol 1(6):e60; and Koonin et al. 2017, Curr Opinion Microbiology 37:67-78).
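The array organization described above, in which identical repeats are interspersed with spacers, can be illustrated with a short Python sketch that recovers the spacers lying between consecutive copies of a known repeat. The function name and the assumption of exactly identical, non-overlapping repeats are simplifications made for illustration; naturally occurring arrays can contain degenerate repeats, and dedicated array-detection software is normally used for that reason.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacer sequences found between consecutive exact repeats.

    array_seq: DNA sequence containing the CRISPR array.
    repeat: the repeat unit, assumed here to occur as exact copies.
    Sequence before the first repeat and after the last repeat is ignored.
    """
    array_seq = array_seq.upper()
    repeat = repeat.upper()
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")

    # Record the start of each non-overlapping repeat occurrence.
    starts = []
    pos = array_seq.find(repeat)
    while pos != -1:
        starts.append(pos)
        pos = array_seq.find(repeat, pos + len(repeat))

    # A spacer is whatever lies between the end of one repeat and the
    # start of the next.
    return [array_seq[left + len(repeat):right]
            for left, right in zip(starts, starts[1:])]
```

An array containing three copies of the repeat yields two spacers, while a sequence containing fewer than two copies yields an empty list.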
[0313] Methods are now known in the art for identifying and classifying Cas proteins, including Cas endonucleases, in bacteria.
CRISPR System Taxonomy
[0314] Many CRISPR systems have been identified in the literature and described, based upon the native bacterial genomic locus architecture, composition, and the domain structures of the effector protein(s).
Class I CRISPR-Cas Systems [0315] Class I CRISPR-Cas systems comprise Types I, III, and IV. A characteristic feature of Class I systems is the presence of an effector endonuclease complex instead of a single protein. A Cascade complex comprises a RNA recognition motif (RRM) and a nucleic acid-binding domain that is the core fold of the diverse RAMP (Repeat- Associated Mysterious Proteins) protein superfamily (Makarova et al. 2013, Biochem Soc Trans 41, 1392-1400; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). RAMP protein subunits include Cas5 and Cas7 (which comprise the skeleton of the crRNA-effector complex), wherein the Cas5 subunit binds the 5' handle of the crRNA and interacts with the large subunit, and often includes Cas6 which is loosely associated with the effector complex and typically functions as the repeatspecific RNase in the pre-crRNA processing (Charpentier et al., FEMS Microbiol Rev 2015, 39:428-441; Niewoehner et al., RNA 2016, 22:318-329).
[0316] Type I CRISPR-Cas systems comprise a complex of effector proteins, termed Cascade (CRISPR-associated complex for antiviral defense), comprising at a minimum Cas5 and Cas7. The effector complex functions together with a single CRISPR RNA (crRNA) and Cas3 to defend against invading viral DNA (Brouns, S. J. J. et al. Science 321:960-964; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). Type I CRISPR-Cas loci comprise the signature gene cas3 (or a variant cas3' or cas3"), which encodes a metal-dependent nuclease that possesses a single-stranded DNA (ssDNA)-stimulated superfamily 2 helicase with a demonstrated capacity to unwind double stranded DNA (dsDNA) and RNA-DNA duplexes (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). Following target recognition, the Cas3 endonuclease is recruited to the Cascade-crRNA-target DNA complex to cleave and degrade the DNA target (Westra, E. R. et al. (2012) Molecular Cell 46:595-605, Sinkunas, T. et al. (2011) EMBO J. 30:1335-1342, and Sinkunas, T. et al. (2013) EMBO J. 32:385-394). In some type I systems, Cas6 can be the active endonuclease that is responsible for crRNA processing, and Cas5 and Cas7 function as non-catalytic RNA-binding proteins; although in type I-C systems, crRNA processing can be catalyzed by Cas5 (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). Type I systems are divided into seven subtypes (Makarova et al. 2011, Nat Rev Microbiol. 9(6):467-477; Koonin et al. 2017, Curr Opinion Microbiology 37:67-78). A modified type I CRISPR-associated complex for adaptive antiviral defense (Cascade) comprising at least the protein subunits Cas7, Cas5 and Cas6, wherein one of these subunits is synthetically fused to a Cas3 endonuclease or a modified restriction endonuclease, FokI, has been described (WO2013098244 published 4 Jul. 2013).
[0317] Type III CRISPR-Cas systems, comprising a plurality of cas7 genes, target either ssRNA or ssDNA, and function as an RNase as well as a target RNA-activated DNA nuclease (Tamulaitis et al., Trends in Microbiology 25(10)49-61, 2017). Csm (Type III-A) and Cmr (Type III-B) complexes function as RNA-activated single-stranded (ss) DNases that couple the target RNA binding/cleavage with ssDNA degradation. Upon foreign DNA infection, the CRISPR RNA (crRNA)-guided binding of the Csm or Cmr complex to the emerging transcript recruits Cas10 DNase to the actively transcribed phage DNA, resulting in degradation of both the transcript and phage DNA, but not the host DNA. The Cas10 HD-domain is responsible for the ssDNase activity, and Csm3/Cmr4 subunits are responsible for the endoribonuclease activity of the Csm/Cmr complex. The 3'-flanking sequence of the target RNA is critical for the ssDNase activity of Csm/Cmr: base pairing with the 5'-handle of crRNA protects host DNA from degradation.
[0318] Type IV systems, although comprising typical type I cas5 and cas7 domains in addition to a cas8-like domain, may lack the CRISPR array that is characteristic of most other CRISPR-Cas systems.
Class II CRISPR-Cas Systems
[0319] Class II CRISPR-Cas systems comprise Types II, V, and VI. A characteristic feature of Class II systems is the presence of a single Cas effector protein instead of an effector complex. Types II and V Cas proteins comprise an RuvC endonuclease domain that adopts the RNase H fold.
[0320] Type II CRISPR/Cas systems employ a crRNA and tracrRNA (trans-activating CRISPR RNA) to guide the Cas endonuclease to its DNA target. The crRNA comprises a spacer region complementary to one strand of the double strand DNA target and a region that base pairs with the tracrRNA, forming an RNA duplex that directs the Cas endonuclease to cleave the DNA target, leaving a blunt end. Spacers are acquired through a not fully understood process involving Cas1 and Cas2 proteins. Type II CRISPR/Cas loci typically comprise cas1 and cas2 genes in addition to the cas9 gene (Chylinski et al., 2013, RNA Biology 10:726-737; Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15). Type II CRISPR-Cas loci can encode a tracrRNA, which is partially complementary to the repeats within the respective CRISPR array, and can comprise other proteins such as Csn1 and Csn2. The presence of cas9 in the vicinity of cas1 and cas2 genes is the hallmark of type II loci (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
[0321] Type V CRISPR/Cas systems comprise a single Cas endonuclease, including Cpf1 (Cas12) (Koonin et al., Curr Opinion Microbiology 37:67-78, 2017), that is an active RNA-guided endonuclease that does not necessarily require the additional trans-activating CRISPR (tracr) RNA for target cleavage, unlike Cas9.
[0322] Type VI CRISPR-Cas systems comprise a cas13 gene that encodes a nuclease with two HEPN (Higher Eukaryotes and Prokaryotes Nucleotide-binding) domains but no HNH or RuvC domains, and are not dependent upon tracrRNA activity. The majority of HEPN domains comprise conserved motifs that constitute a metal-independent endoRNase active site (Anantharaman et al., Biol Direct 8:15, 2013). Because of this feature, it is thought that type VI systems may act on RNA targets instead of the DNA targets that are common to other CRISPR-Cas systems.
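The classification described in the preceding paragraphs can be summarized as a small lookup table. The following is a minimal sketch only: the dictionary layout, the field names, and the helper `types_in_class` are illustrative assumptions, and the `gene_named_in_text` field simply records the gene that the text above associates with each type (no gene is recorded for type IV).

```python
# Illustrative summary of the taxonomy described above.
# Field names and structure are assumptions made for this sketch.
CRISPR_TYPES = {
    "I":   {"class": 1, "effector": "multi-subunit complex", "gene_named_in_text": "cas3"},
    "III": {"class": 1, "effector": "multi-subunit complex", "gene_named_in_text": "cas10"},
    "IV":  {"class": 1, "effector": "multi-subunit complex", "gene_named_in_text": None},
    "II":  {"class": 2, "effector": "single protein", "gene_named_in_text": "cas9"},
    "V":   {"class": 2, "effector": "single protein", "gene_named_in_text": "cas12"},
    "VI":  {"class": 2, "effector": "single protein", "gene_named_in_text": "cas13"},
}


def types_in_class(class_number):
    """Return the CRISPR types that belong to the given class (1 or 2)."""
    return [t for t, info in CRISPR_TYPES.items() if info["class"] == class_number]


if __name__ == "__main__":
    print(types_in_class(1))
    print(types_in_class(2))
```

A table of this kind may be convenient when sorting annotated cas genes into system types, although real loci can require manual curation.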
Cas Proteins
[0323] The proteins encoded in a Cas locus within a bacterial genome include endonucleases that are responsible for effecting cleavage of a nucleotide target (e.g., double-stranded DNA cleavage, DNA nicking, ssDNA cleavage, RNA cleavage), and are variously referred to as “Cas Endonucleases”, “Effector Proteins”, and “Effector Complexes”. Other genes within the locus encode proteins of other functions that may be required for complete activity, as discussed below.
Cas Endonucleases, Effector Proteins, and Effector Complexes
[0324] Endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain, and include restriction endonucleases that cleave DNA at specific sites without damaging the bases. Examples of endonucleases include restriction endonucleases, meganucleases, TAL effector nucleases (TALENs), zinc finger nucleases, and Cas (CRISPR-associated) effector endonucleases.
[0325] Cas endonucleases, either as single effector proteins or in an effector complex with other components, unwind the DNA duplex at the target sequence and optionally cleave at least one DNA strand, as mediated by recognition of the target sequence by a polynucleotide (such as, but not limited to, a crRNA or guide RNA) that is in complex with the Cas effector protein. Such recognition and cutting of a target sequence by a Cas endonuclease typically occurs if the correct protospacer-adjacent motif (PAM) is located at or adjacent to the 3' end of the DNA target sequence. Alternatively, a Cas endonuclease herein may lack DNA cleavage or nicking activity, but can still specifically bind to a DNA target sequence when complexed with a suitable RNA component. (See also U.S. Patent Application US20150082478 published 19 Mar. 2015 and US20150059010 published 26 Feb. 2015).
[0326] Cas endonucleases may occur as individual effectors (Class 2 CRISPR systems) or as part of larger effector complexes (Class 1 CRISPR systems).
[0327] Cas endonucleases that have been described include, but are not limited to, for example: Cas3 (a feature of Class 1 type I systems), Cas9 (a feature of Class 2 type II systems) and Cas12 (Cpf1) (a feature of Class 2 type V systems).
[0328] Cas3 (and its variants Cas3' and Cas3") functions as a single-stranded DNA nuclease (HD domain) and an ATP-dependent helicase. A variant of the Cas3 endonuclease can be obtained by disabling the functional activity of one or both domains of the Cas3 endonuclease polypeptide. Disabling the ATP-dependent helicase activity (by deletion or knockout of the Cas3 helicase domain, through mutagenesis of critical residues, or by assembling the reaction in the absence of ATP, as described previously (Sinkunas, T. et al., 2013, EMBO J. 32:385-394)) can convert the cleavage-ready Cascade comprising the modified Cas3 endonuclease into a nickase (as the HD domain is still functional). Disabling the HD endonuclease activity, which can be accomplished by any method known in the art, such as but not limited to mutagenesis of critical residues of the HD domain, can convert the cleavage-ready Cascade comprising the modified Cas3 endonuclease into a helicase. Disabling both the Cas3 helicase and Cas3 HD endonuclease activities, which can be accomplished by any method known in the art, such as but not limited to mutagenesis of critical residues of both the helicase and HD domains, can convert the cleavage-ready Cascade comprising the modified Cas3 endonuclease into a binder protein that binds to a target sequence.
[0329] Cas9 (formerly referred to as Cas5, Csn1, or Csx12) is a Cas endonuclease that forms a complex with a crNucleotide and a tracrNucleotide, or with a single guide polynucleotide, for specifically recognizing and cleaving all or part of a DNA target sequence. Cas9 recognizes a 3' GC-rich PAM sequence on the target dsDNA. A Cas9 protein comprises a RuvC nuclease with an HNH (H-N-H) nuclease adjacent to the RuvC-II domain. The RuvC nuclease and HNH nuclease each can cleave a single DNA strand at a target sequence (the concerted action of both domains leads to DNA double-strand cleavage, whereas activity of one domain leads to a nick). In general, the RuvC domain comprises subdomains I, II and III, where subdomain I is located near the N-terminus of Cas9 and subdomains II and III are located in the middle of the protein, flanking the HNH domain (Hsu et al., 2014, Cell 157:1262-1278). Cas9 endonucleases are typically derived from a type II CRISPR system, which includes a DNA cleavage system utilizing a Cas9 endonuclease in complex with at least one polynucleotide component. For example, a Cas9 can be in complex with a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA). In another example, a Cas9 can be in complex with a single guide RNA (Makarova et al. 2015, Nature Reviews Microbiology Vol. 13:1-15).
[0330] Cas12 (formerly referred to as Cpf1, and variants C2c1, C2c3, CasX, and CasY) comprises an RuvC nuclease domain and produces staggered, 5' overhangs on the dsDNA target. Some variants do not require a tracrRNA, unlike Cas9. Cas12 and its variants recognize a 5' AT-rich PAM sequence on the target dsDNA. An insert domain, called Nuc, of the Cas12a protein has been demonstrated to be responsible for target strand cleavage (Yamano et al., Cell 2016, 165:949-962). Additional mutation studies in other Cas12 proteins demonstrated the Nuc domain contributes to guide and target binding, with the RuvC domain responsible for cleavage (Swarts et al., Mol Cell 2017, 66:221-233 e224).
[0331] Cas endonucleases and effector proteins can be used for targeted genome editing (via simplex and multiplex double-strand breaks and nicks) and targeted genome regulation (via tethering of epigenetic effector domains to either the Cas protein or sgRNA). A Cas endonuclease can also be engineered to function as an RNA-guided recombinase, and via RNA tethers could serve as a scaffold for the assembly of multiprotein and nucleic acid complexes (Mali et al., 2013, Nature Methods Vol. 10:957-963).
[0332] To effect cleavage, the Cas endonuclease (or effector complex) forms a complex with a guide polynucleotide, and recognizes a sequence on a target sequence, called a Protospacer Adjacent Motif (PAM). A “protospacer adjacent motif” (PAM) herein refers to a short nucleotide sequence adjacent to a target sequence (protospacer) that can be recognized (targeted) by a guide polynucleotide/Cas endonuclease system. The Cas endonuclease may not successfully recognize a target DNA sequence if the target DNA sequence is not followed by a PAM sequence. The sequence and length of a PAM herein can differ depending on the Cas protein or Cas protein complex used. The PAM sequence can be of any length but is typically 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides long.
[0333] A guide polynucleotide/Cas endonuclease complex described herein is capable of recognizing, binding to, and optionally nicking, unwinding, or cleaving all or part of a target sequence.
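The requirement that a target sequence be followed by a PAM can be illustrated with a short sequence scan. This is a sketch under stated assumptions, not a method taken from the disclosure: the function names, the default PAM pattern "NGG", and the 20-nucleotide protospacer length are illustrative choices only, and the scan covers the supplied strand alone.

```python
import re

# Subset of IUPAC nucleotide codes, sufficient for simple PAM patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def pam_to_regex(pam):
    """Translate a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_targets(seq, pam="NGG", spacer_len=20):
    """Return (start, protospacer, pam) for each protospacer that is
    immediately followed by a matching PAM on the supplied strand.

    The default PAM and spacer length are assumptions for illustration.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_to_regex(pam)))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    for hit in find_targets("ACGTAGGT", pam="NGG", spacer_len=4):
        print(hit)
```

For an endonuclease whose PAM precedes the protospacer, the pattern order would need to be reversed; that case is not handled here.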
[0334] A guide polynucleotide/Cas endonuclease complex that can cleave both strands of a DNA target sequence typically comprises a Cas protein that has all of its endonuclease domains in a functional state (e.g., wild type endonuclease domains or variants thereof retaining some or all activity in each endonuclease domain). Thus, a wild type Cas protein (e.g., a Cas protein disclosed herein), or a variant thereof retaining some or all activity in each endonuclease domain of the Cas protein, is a suitable example of a Cas endonuclease that can cleave both strands of a DNA target sequence.
[0335] A guide polynucleotide/Cas endonuclease complex that can cleave one strand of a DNA target sequence can be characterized herein as having nickase activity (e.g., partial cleaving capability). A Cas nickase typically comprises one functional endonuclease domain that allows the Cas to cleave only one strand (i.e., make a nick) of a DNA target sequence. For example, a Cas9 nickase may comprise (i) a mutant, dysfunctional RuvC domain and (ii) a functional HNH domain (e.g., wild type HNH domain). As another example, a Cas9 nickase may comprise (i) a functional RuvC domain (e.g., wild type RuvC domain) and (ii) a mutant, dysfunctional HNH domain. Non-limiting examples of Cas9 nickases suitable for use herein are disclosed in US20140189896 published on 3 Jul. 2014. A pair of Cas nickases can be used to increase the specificity of DNA targeting. In general, this can be done by providing two Cas nickases that, by virtue of being associated with RNA components with different guide sequences, target and nick nearby DNA sequences on opposite strands in the region for desired targeting. Such nearby cleavage of each DNA strand creates a double-strand break (i.e., a DSB with single-stranded overhangs), which is then recognized as a substrate for non-homologous end joining, NHEJ (prone to imperfect repair leading to mutations), or homologous recombination, HR. Each nick in these embodiments can be at least about 5, between 5 and 10, at least 10, between 10 and 15, at least 15, between 15 and 20, at least 20, between 20 and 30, at least 30, between 30 and 40, at least 40, between 40 and 50, at least 50, between 50 and 60, at least 60, between 60 and 70, at least 70, between 70 and 80, at least 80, between 80 and 90, at least 90, between 90 and 100, or 100 or greater (or any integer between 5 and 100) bases apart from each other, for example. One or two Cas nickase proteins herein can be used in a Cas nickase pair. For example, a Cas9 nickase with a mutant RuvC domain, but functioning HNH domain (i.e., Cas9 HNH+/RuvC-), can be used (e.g., Streptococcus pyogenes Cas9 HNH+/RuvC-). Each Cas9 nickase (e.g., Cas9 HNH+/RuvC-) can be directed to specific DNA sites nearby each other (up to 100 base pairs apart) by using suitable RNA components herein with guide RNA sequences targeting each nickase to each specific DNA site.
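The paired-nickase spacing described above can be sketched as a simple filter over candidate nick positions. The function name, argument names, and the default window of 5 to 100 bases are assumptions chosen to mirror one of the ranges recited in the text; nick coordinates are taken as given rather than computed from guide sequences.

```python
def nickase_pairs(top_strand_nicks, bottom_strand_nicks, min_gap=5, max_gap=100):
    """Pair nick positions on opposite strands whose separation falls
    within [min_gap, max_gap] bases.

    Returns a list of (top_position, bottom_position, gap) tuples.
    The default gap window is an illustrative assumption.
    """
    pairs = []
    for top in top_strand_nicks:
        for bottom in bottom_strand_nicks:
            gap = abs(top - bottom)
            if min_gap <= gap <= max_gap:
                pairs.append((top, bottom, gap))
    return pairs


if __name__ == "__main__":
    # One top-strand nick and three candidate bottom-strand nicks.
    print(nickase_pairs([100], [103, 140, 250]))
```

In this usage example only the nick at position 140 lies within the window; the nicks at 103 and 250 are 3 and 150 bases away, respectively.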
[0336] A guide polynucleotide/Cas endonuclease complex in certain embodiments can bind to a DNA target site sequence, but does not cleave any strand at the target site sequence. Such a complex may comprise a Cas protein in which all of its nuclease domains are mutant, dysfunctional. For example, a Cas9 protein that can bind to a DNA target site sequence, but does not cleave any strand at the target site sequence, may comprise both a mutant, dysfunctional RuvC domain and a mutant, dysfunctional HNH domain. A Cas protein herein that binds, but does not cleave, a target DNA sequence can be used to modulate gene expression, for example, in which case the Cas protein could be fused with a transcription factor (or portion thereof) (e.g., a repressor or activator).
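The relationship between Cas9 nuclease domain status and the resulting activity, as laid out in the three preceding paragraphs, can be sketched as follows. The function name and the returned labels are hypothetical and serve only to restate the text.

```python
def cas9_activity(ruvc_functional, hnh_functional):
    """Map the functional state of the RuvC and HNH domains to the
    expected activity of the guide polynucleotide/Cas9 complex."""
    if ruvc_functional and hnh_functional:
        return "double-strand cleavage"   # both domains cut, one strand each
    if ruvc_functional or hnh_functional:
        return "nickase"                  # exactly one strand is cut
    return "binding only"                 # target is bound but not cut


if __name__ == "__main__":
    for ruvc in (True, False):
        for hnh in (True, False):
            print(ruvc, hnh, cas9_activity(ruvc, hnh))
```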
[0337] Other proteins involved in adaptation, expression and interference are encoded by other cas genes located in the genomic neighborhood of the CRISPR arrays. Some core proteins are found to be universally present across most Types and Classes of CRISPR systems, while others are specific to a particular Type or Class. See, for example, Zhang et al., Nucleic Acids Research, Volume 42, Issue 4, 1 February 2014, Pages 2448-2459; Hille et al., Cell, Volume 172, Issue 6, 8 March 2018, Pages 1239-1259; and Makarova and Koonin, Methods Mol Biol. 2015; 1311: 47-75.
Recombinant Constructs and Cell Transformation
[0338] The disclosed guide polynucleotides, Cas endonucleases, polynucleotide modification templates, donor DNAs, guide polynucleotide/Cas endonuclease systems disclosed herein, and any one combination thereof, optionally further comprising one or more polynucleotide(s) of interest, can be introduced into a cell. Cells include, but are not limited to, human, non-human, animal, bacterial, fungal, insect, yeast, non-conventional yeast, and plant cells, as well as progeny and/or derivatives of the cells produced by the methods described herein.
[0339] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook et al., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989). Transformation methods are well known to those skilled in the art.
[0340] Vectors and constructs include circular plasmids, and linear polynucleotides, comprising a polynucleotide of interest and optionally other components including linkers, adapters, regulatory or analysis. In some examples a recognition site and/or target site can be comprised within an intron, coding sequence, 5' UTRs, 3' UTRs, and/or regulatory regions.
[0341] The disclosure further provides expression constructs for expressing in a prokaryotic or eukaryotic cell/organism a guide RNA/Cas system that is capable of recognizing, binding to, and optionally nicking, unwinding, or cleaving all or part of a target sequence.
[0342] In one embodiment, the expression constructs of the disclosure comprise a promoter operably linked to a nucleotide sequence encoding a Cas gene (or optimized gene) and a promoter operably linked to a guide RNA of the present disclosure. The promoter is capable of driving expression of an operably linked nucleotide sequence in a prokaryotic or eukaryotic cell/organism.
[0343] In one embodiment, a Cas endonuclease provided herein is introduced to a target polynucleotide (e.g., in vitro or in a cell) to effect a single-strand nick or a double-strand break in the target polynucleotide. The nick or break may be leveraged to introduce an edit into the polynucleotide, for example but not limited to the insertion of a heterologous polynucleotide (e.g., a polynucleotide of interest), the deletion of a particular sequence, or the introduction of one or more modified nucleotides.
[0344] Other uses are contemplated and within the scope of the disclosure, for example but not limited to gene targeting, and use of the endonuclease as a targeting molecule without leveraging its capability of nicking or cleaving.
[0345] Modification of a target sequence may be in the form of a nucleotide insertion, a nucleotide deletion, a nucleotide substitution, the addition of an atom or molecule to an existing nucleotide, a nucleotide modification, or the binding of a heterologous polynucleotide or polypeptide to said target sequence. The insertion of one or more nucleotides may be accomplished by the inclusion of a donor polynucleotide in the reaction mixture: said donor polynucleotide is inserted into a double-strand break created by said Cas-alpha ortholog polypeptide. The insertion may be via non-homologous end joining or via homologous recombination.
Uses in Microbiology, Agriculture, Pharmaceuticals, and Medical Research
[0346] The presently disclosed polynucleotides and polypeptides can be introduced into a cell. Cells include, but are not limited to, human, non-human, animal, mammalian, bacterial, fungal, insect, yeast, non-conventional yeast, and plant cells as well as plants and seeds produced by the methods described herein. Any plant can be used with the compositions and methods described herein, including monocot and dicot plants, and plant elements.
[0347] The presently disclosed polynucleotides and polypeptides can be introduced into an animal cell. Animal cells can include, but are not limited to: an organism of a phylum including chordates, arthropods, mollusks, annelids, cnidarians, or echinoderms; or an organism of a class including mammals, insects, birds, amphibians, reptiles, or fishes. In some aspects, the animal is human, mouse, C. elegans, rat, fruit fly (Drosophila spp.), zebrafish, chicken, dog, cat, guinea pig, hamster, Japanese ricefish, sea lamprey, pufferfish, tree frog (e.g., Xenopus spp.), monkey, or chimpanzee. Particular cell types that are contemplated include haploid cells, diploid cells, reproductive cells, neurons, muscle cells, endocrine or exocrine cells, epithelial cells, tumor cells, embryonic cells, hematopoietic cells, bone cells, germ cells, somatic cells, stem cells, pluripotent stem cells, induced pluripotent stem cells, progenitor cells, meiotic cells, and mitotic cells. In some aspects, a plurality of cells from an organism may be used.
[0348] Genome modification via a Cas endonuclease described herein may be used to effect a genotypic and/or phenotypic change on the target organism. Such a change is preferably related to an improved phenotype of interest or a physiologically-important characteristic, the correction of an endogenous defect, or the expression of some type of expression marker. In some aspects, the phenotype of interest or physiologically-important characteristic is related to the overall health, fitness, or fertility of the organism, the ecological fitness of the organism, or the relationship or interaction of the organism with other organisms or abiotic factors in its environment.
[0349] While the invention has been particularly shown and described with reference to a preferred embodiment and various alternate embodiments, it will be understood by persons skilled in the relevant art that various changes in form and details can be made therein without departing from the spirit and scope of the invention. Various alterations, modifications, and improvements of the present disclosure that readily occur to those skilled in the art, including certain alterations, modifications, substitutions, and improvements are also part of this disclosure. For instance, while the particular examples below may illustrate the methods and embodiments described herein using a specific plant, the principles in these examples may be applied to any plant. Therefore, it will be appreciated that the scope of this invention is encompassed by the embodiments recited herein rather than solely by the specific examples that are exemplified below.
[0350] All cited patents and publications referred to in this application are herein incorporated by reference in their entirety, for all purposes, to the same extent as if each were individually and specifically incorporated by reference.
EXAMPLES
[0351] The cas genes provided herein may be used to improve the phenotype of an organism via polynucleotide modification.
Example 1: Identification and Characterization of CRISPR Proteins in Bacterial Systems
[0352] A selection of microbes from the collection were sequenced according to methods known in the art, and assigned taxonomy based on either 16S sequence similarity or whole genome sequencing BLAST.
[0353] Briefly, bacterial genomes were annotated using Prokka (Seemann, Bioinformatics, Volume 30, Issue 14, 15 July 2014, Pages 2068-2069). CRISPR-associated genes were identified in each genome and assigned nomenclature based on percent identity matches to closest-relative sequences in public databases.
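The notion of a percent identity match can be made concrete with a short calculation over two pre-aligned sequences. This is a minimal sketch, assuming identity is defined as the number of identical, non-gap aligned positions divided by the alignment length; other definitions exist, and the disclosure does not specify which was used. The function name and the "-" gap symbol are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Gap characters ('-') never count as matches. The denominator is the
    full alignment length, which is one of several common conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

In practice the alignment itself would come from a dedicated tool; this sketch covers only the final scoring step.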
[0354] Isolates of interest were grown to mid-log phase in R2D media. DNA was extracted with the Qiagen Powersoil DNA extraction kit and sequencing libraries were constructed with the iGenomix RipTide kit as per manufacturer instructions. Sequencing was performed on an Illumina HiSeq with PE150. Raw Illumina reads were trimmed to Q15 with Trimmomatic v38 (Bolger AM, Lohse M, and Usadel B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170) and assembled with SPAdes (Prjibelski A, Antipov D, Meleshko D, Lapidus A, and Korobeynikov A. (2020) Using SPAdes de novo assembler. Curr. Protoc. Bioinform. 70, e102) using default parameters. Assembled contigs were analyzed with BinSanity 0.5.4 (Graham ED, Heidelberg JF, and Tully BJ. (2017) BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation. PeerJ 5:e3035) for purity with a contamination cutoff of < 5%. The largest bin was extracted and annotated with Prokka 1.8 (Seemann T. (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics 30(14):2068-9). Taxonomy was assigned using GTDB-tk. All genomes containing the cas9 annotation were parsed and the cas9 sequence was exported.
[0355] Each microbe may have a single or multiple different types of CRISPR systems. In one example, a particular microbe may have a Class 2 Type II endonuclease system (such as Cas9) and a Class 1 Type I endonuclease system.
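The final step of the workflow above, parsing annotated genomes and exporting the cas9 sequences, can be sketched as a filter over FASTA records. The record layout (an identifier followed by a product description in the header line), the example identifiers, and the function names are assumptions for illustration; the actual annotation files and parsing approach used are not specified in the text.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def export_annotated(records, keyword="cas9"):
    """Keep only the records whose header mentions the keyword (case-insensitive)."""
    return [(h, s) for h, s in records if keyword.lower() in h.lower()]


# Hypothetical annotation output, used only to exercise the functions.
EXAMPLE = """>GENE_00001 hypothetical protein
ATGAAA
>GENE_00002 CRISPR-associated endonuclease Cas9
ATGGAT
AAGTAA
"""

if __name__ == "__main__":
    for header, sequence in export_annotated(parse_fasta(EXAMPLE)):
        print(header, sequence)
```

Keyword matching on the product description is a simplification; it would miss a Cas9 homolog annotated under a different name.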
[0356] Table 1 lists selected microbes comprising identified cas genes, including those encoding Cas endonucleases. Each microbe is listed by a reference ID, unique Strain number, and taxonomy.
[0357] Table 2 shows the identity of each particular cas and CRISPR-system related gene discovered in the microbes of Table 1, listed by reference ID. The key for the genes listed by number as Table 2 column headings is given in Table 3.
[0358] In some cases, a Cas endonuclease identified herein is used to edit a target polynucleotide. In some cases, a different cas gene or Cas protein identified herein is used to effect an improvement to a system, target polynucleotide, or cell comprising such.
Example 2: Use of Novel Cas Endonucleases for Polynucleotide Editing
[0359] PAM preferences for each of the endonucleases discovered in Example 1 may be determined by methods known in the art (e.g., Karvelis et al., Methods, Volumes 121-122, 15 May 2017, Pages 3-8). Alternatively, a Cas endonuclease may be engineered to accommodate different PAM sequences than that which it naturally prefers (e.g., Leenay and Beisel, J Mol Bio, Volume 429, Issue 2, 20 January 2017, Pages 177-191).
[0360] A Cas endonuclease (or effector complex) is identified from its source bacterium, such as those disclosed in Table 1, and either expressed and isolated or synthesized de novo from its corresponding DNA sequence. The compositions disclosed herein may be utilized outside of a typical cellular environment for in vitro modification of one or more target polynucleotides. In some aspects, the target polynucleotide is isolated and purified from a genomic source. In some aspects, the target polynucleotide is on a circularized or linearized plasmid. In some aspects, the target polynucleotide is a PCR product. In some aspects, the target polynucleotide is a synthesized oligonucleotide. In some aspects, said modification includes binding to, nicking, and/or cleaving a target polynucleotide.
[0361] Creation of a guide polynucleotide-endonuclease complex is achieved by methods known in the art. Delivery may be accomplished to an isolated target polynucleotide in vitro, or to a target polynucleotide within a cell (see, e.g., Wilbie et al., Acc. Chem. Res. 2019, 52, 6, 1555-1564).
[0362] The target polynucleotide is selected for its capability to accommodate and bind to a particular Cas endonuclease or effector complex. Alternatively, a Cas endonuclease or effector complex is selected based on its capability to bind to a particular target polynucleotide.
[0363] Genome modification of the target polynucleotide includes recognition of the target by the endonuclease complex, with or without subsequent nicking or cleaving. When a nick or a double strand break (DSB) is created, it is repaired either by simple re-annealing, or using mechanisms in the cell (NHEJ or HR).
[0364] In some examples, a donor polynucleotide is introduced and inserted into the double strand break.
[0365] In some examples, a polynucleotide modification template is provided to the break site, whereby the break is repaired according to the template.
[0366] Regardless of the mechanism of repair or the provision of additional heterologous polynucleotides, the resulting repair of the nick or break results in perfect repair (no edit observed), the insertion of at least one nucleotide, the deletion of at least one nucleotide, the substitution of at least one nucleotide, and/or the molecular alteration of at least one nucleotide.
[0367] The edit of the target polynucleotide creates a beneficial phenotype to the organism that contains the target, or to an organism to which the edited polynucleotide is introduced.
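The repair outcomes listed above can be told apart, in a first approximation, by comparing a reference sequence with the sequence observed after repair. The following is a deliberately simple sketch: it relies on sequence length alone, so it cannot detect combined edits (for example, an insertion and a deletion of equal size) or chemical alterations of a nucleotide. The function name and labels are illustrative assumptions.

```python
def classify_edit(reference, edited):
    """Classify a repair outcome by comparing the edited sequence with
    the reference. This is a length-based heuristic only."""
    reference, edited = reference.upper(), edited.upper()
    if edited == reference:
        return "perfect repair"
    if len(edited) > len(reference):
        return "insertion"
    if len(edited) < len(reference):
        return "deletion"
    return "substitution"


if __name__ == "__main__":
    ref = "ACGTACGT"
    for observed in ("ACGTACGT", "ACGTTACGT", "ACGACGT", "ACGTTCGT"):
        print(observed, classify_edit(ref, observed))
```

A full assessment would normally align the sequencing reads to the reference and report each difference individually.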
[0368] Examples of CRISPR proteins found in bacterial strains described in Table 1 are presented in Table 2, with the edit type descriptions of the Table 2 columns defined in Table 3.
Table 1: Microbial strains analyzed for the presence of CRISPR system proteins
[Table content is provided as images in the original publication (Figures imgf000059_0001 through imgf000092_0001).]

Claims

IT IS CLAIMED:
1. A synthetic composition comprising a Cas endonuclease identified from the microbes given in Table 1, and a heterologous polynucleotide.
2. The synthetic composition of Claim 1, wherein the Cas endonuclease is encoded by a polynucleotide sharing at least 95% identity with a sequence selected from the group consisting of SEQ ID NOs: 1-179.
3. The synthetic composition of Claim 1, further comprising a target polynucleotide.
4. The synthetic composition of Claim 1, further comprising a guide polynucleotide.
5. The synthetic composition of Claim 1, wherein the heterologous polynucleotide is an expression element, a transgene, a donor DNA molecule, or a polynucleotide modification template.
6. The synthetic composition of Claim 1, further comprising a deaminase.
7. The synthetic composition of Claim 1, further comprising a heterologous nuclease.
8. The synthetic composition of Claim 1, further comprising a eukaryotic cell.
9. The synthetic composition of Claim 1, further comprising a prokaryotic cell.
10. A nucleotide construct comprising the polynucleotide of Claim 2.
11. A method of introducing a targeted edit in a target polynucleotide, comprising providing to the target polynucleotide the synthetic composition of Claim 1, incubating the target polynucleotide and synthetic composition under conditions suitable for forming a complex, and assessing the target polynucleotide for the presence of the targeted edit, wherein the targeted edit is selected from the group consisting of: the insertion of at least one nucleotide, the deletion of at least one nucleotide, the replacement of at least one nucleotide, the chemical alteration of at least one nucleotide, and any plurality and/or combination of the preceding.
12. A kit comprising the synthetic composition of Claim 1.
PCT/US2023/067482 2022-05-27 2023-05-25 Novel crispr systems WO2023230563A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263346322P 2022-05-27 2022-05-27
US63/346,322 2022-05-27

Publications (2)

Publication Number Publication Date
WO2023230563A2 WO2023230563A2 (en) 2023-11-30
WO2023230563A3 WO2023230563A3 (en) 2024-02-01

Family

ID=88920060

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/067482 WO2023230563A2 (en) 2022-05-27 2023-05-25 Novel crispr systems

Country Status (1)

Country Link
WO (1) WO2023230563A2 (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2843556T3 (en) * 2015-11-06 2021-07-19 Pioneer Hi Bred Int Improved Plant Transformation Compositions and Methods
DK3387134T3 (en) * 2015-12-11 2020-12-21 Danisco Us Inc Methods and compositions for enhanced nuclease-mediated genome modification and reduced off-target site effects

Also Published As

Publication number Publication date
WO2023230563A3 (en) 2024-02-01

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23812775

Country of ref document: EP

Kind code of ref document: A2