WO2021098709A1 - Gene editing *** derived from Flavobacterium - Google Patents
- Publication number
- WO2021098709A1 (PCT/CN2020/129665)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- guide rna
- sequence
- cas12a protein
- ribozyme
- seq
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/10—Cells modified by introduction of foreign genetic material
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/09—Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/12—Type of nucleic acid catalytic nucleic acids, e.g. ribozymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/30—Chemical structure
- C12N2310/35—Nature of the modification
- C12N2310/351—Conjugate
- C12N2310/3519—Fusion with another nucleic acid
Definitions
- The invention belongs to the field of genetic engineering. Specifically, it relates to a gene editing system derived from Flavobacterium and its application.
- Genome editing is a genetic engineering technology based on targeted modification of the genome by artificial nucleases, and it plays an increasingly important role in agricultural and medical research.
- The clustered regularly interspaced short palindromic repeats and CRISPR-associated system (CRISPR/Cas) is an RNA-guided genome editing tool.
- Guided by RNA, the Cas protein can be targeted to any position in the genome, where it produces a double-strand break (DSB) at the targeted sequence and activates non-homologous end joining (NHEJ) or homology-directed repair (HDR); mutations are introduced through these two pathways.
- The most commonly used Cas protein is Cas9 from Streptococcus pyogenes, which belongs to the type II-A subtype of the Class 2 CRISPR systems.
- Cong et al., Multiplex Genome Engineering Using CRISPR/Cas Systems, Science, 2013.
- Mali et al., RNA-Guided Human Genome Engineering via Cas9, Science, 2013.
- Both the CRISPR/Cas12a system and the CRISPR/Cas9 system belong to the Class 2 CRISPR systems.
- Zetsche et al. applied Cas12a proteins (formerly known as Cpf1) derived from Acidaminococcus and Lachnospiraceae to gene editing in animal cells (Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System, Cell, 2015).
- the CRISPR/Cas12a system belongs to Type V, which has a shorter crRNA sequence and higher specificity.
- the 5'-TTTN PAM sequence is complementary to the 3'-NGG of Cas9, and it is easier to produce sticky ends, etc. Advantages, further expanding the gene editing toolbox of the CRISPR system.
- CRISPR/Cas9 and CRISPR/Cas12 have been applied successfully and widely in animal cell lines, individual animals, plant cells, individual plants, and microorganisms. Because they are highly efficient and simple to use, they have caused a worldwide revolution in the field of gene editing.
- the working efficiency of the CRISPR/Cas12a system varies greatly between target sites, and is low at certain sites in the plant genome. This may be because existing Cas12a systems are derived mainly from pathogenic bacteria of humans or animals, whose suitable working temperature is higher than that of plants. Therefore, it is necessary to identify and develop a CRISPR/Cas12a system that works stably at temperatures suitable for plants.
- The inventors identified a previously unreported FbCas12a protein in a plant symbiotic bacterium through homology and similarity comparison, predicted the mature form of its own crRNA, and compared that crRNA with the crRNA of LbCas12a in vivo. FbCas12a was found to work in plant cells and to have higher editing efficiency when used with the LbCas12a crRNA.
- FIG. 1 Schematic diagram of the carrier used in the embodiment.
- FIG. 1 Editing of rice endogenous gene OsEPSPS by the combination of FbCas12a and FbcrRNA.
- FIG. Editing results of rice endogenous genes by the combination of FbCas12a and FbcrRNA or LbcrRNA.
- the term “and/or” encompasses all combinations of items connected by the term, and should be treated as if each combination has been individually listed herein.
- “A and/or B” encompasses “A”, “A and B”, and “B”.
- “A, B, and/or C” encompasses "A”, “B”, “C”, “A and B”, “A and C”, “B and C”, and "A and B and C”.
- the protein or nucleic acid may consist of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still has the activity described in the present invention.
- methionine encoded by the start codon at the N-terminus of the polypeptide will be retained under certain actual conditions (for example, when expressed in a specific expression system), but does not substantially affect the function of the polypeptide.
- “Genome” as used herein covers not only chromosomal DNA present in the nucleus, but also organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).
- organism includes any organism suitable for genome editing, preferably eukaryotes.
- organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; and plants, including monocots and dicots, for example rice, corn, wheat, sorghum, barley, soybeans, peanuts, and Arabidopsis.
- Genetically modified organism or “genetically modified cell” means an organism or cell that contains exogenous polynucleotides or modified genes or expression control sequences in its genome.
- exogenous polynucleotides can be stably integrated into the genome of organisms or cells, and inherited for successive generations.
- the exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct.
- the modified gene or expression control sequence contains single or multiple deoxynucleotide substitutions, deletions and additions in the organism or cell genome.
- “Exogenous” in reference to a sequence means a sequence from a foreign species, or, if from the same species, a sequence that has undergone significant changes in composition and/or locus from its natural form through deliberate human intervention.
- Terms such as “nucleic acid sequence” are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural, or altered nucleotide bases.
- Nucleotides are referred to by their single-letter names as follows: “A” is adenosine or deoxyadenosine (for RNA or DNA, respectively), “C” is cytidine or deoxycytidine, “G” is guanosine or deoxyguanosine, “U” is uridine, “T” is deoxythymidine, “R” is a purine (A or G), “Y” is a pyrimidine (C or T), “K” is G or T, “H” is A or C or T, “I” is inosine, and “N” is any nucleotide.
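The single-letter codes above can be applied mechanically when a motif such as a PAM is written with ambiguity letters. The following is a minimal sketch, assuming a DNA alphabet (T rather than U) and covering only the codes listed in the definition; the names `AMBIGUITY`, `motif_to_regex` and `matches` are illustrative and do not come from the patent.

```python
import re

# Ambiguity codes as listed in the definition above (DNA alphabet assumed).
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a motif written with ambiguity codes into a regular expression."""
    parts = []
    for code in motif.upper():
        bases = AMBIGUITY[code]  # KeyError for letters outside the table
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return re.compile("".join(parts))


def matches(motif: str, sequence: str) -> bool:
    """Return True when `sequence` is one concrete instance of `motif`."""
    return motif_to_regex(motif).fullmatch(sequence.upper()) is not None
```

For instance, `matches("TTTN", "TTTG")` is true, while `matches("TTTN", "TCTG")` is false because the second position must be T.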
- “Polypeptide”, “peptide”, and “protein” are used interchangeably in the present invention and refer to a polymer of amino acid residues.
- the term applies to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally-occurring amino acids, as well as to naturally-occurring amino acid polymers.
- the terms “polypeptide”, “peptide”, “amino acid sequence” and “protein” may also include modified forms, including but not limited to glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation, and ADP-ribosylation.
- Sequence "identity” has the art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the entire length of a polynucleotide or polypeptide or along a region of the molecule.
- The term “identity” is well known to the skilled person (Carrillo, H. & Lipman, D., SIAM J Applied Math 48:1073 (1988)).
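As an illustration of how a percentage of sequence identity can be computed along a region, the sketch below assumes that the two sequences have already been aligned (for example by an external alignment program) and are supplied as equal-length strings with "-" for gaps. Counting a single-sided gap as a mismatch is only one of several conventions and is an assumption of this example, not a definition taken from the patent.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared
```

With this convention, `percent_identity("ACGT", "ACGA")` gives 75.0, since three of the four aligned positions are identical.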
- Suitable conservative amino acid substitutions are known to those skilled in the art and can generally be made without changing the biological activity of the resulting molecule.
- those skilled in the art recognize that a single amino acid substitution in a non-essential region of a polypeptide does not substantially change the biological activity (see, for example, Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
- expression construct refers to a vector suitable for expression of a nucleotide sequence of interest in an organism, such as a recombinant vector.
- “Expression” refers to the production of a functional product.
- the expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
- the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be an RNA (such as mRNA) that can be translated.
- the "expression construct" of the present invention may comprise regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a manner different from those normally occurring in nature.
- “Regulatory sequence” and “regulatory element” are used interchangeably and refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within, or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
- Promoter refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
- a promoter is a promoter capable of controlling gene transcription in a cell, regardless of whether it is derived from the cell.
- the promoter can be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
- “Tissue-specific promoter” and “tissue-preferred promoter” are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in a specific cell or cell type.
- “Developmentally regulated promoter” refers to a promoter whose activity is determined by developmental events.
- inducible promoters selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
- “Operably linked” refers to the connection of regulatory elements (for example, but not limited to, promoter sequences, transcription termination sequences, etc.) to nucleic acid sequences (for example, coding sequences or open reading frames) such that transcription of the nucleotide sequence is controlled and regulated by the transcription regulatory elements.
- “Introducing” a nucleic acid molecule (such as a plasmid, linear nucleic acid fragment, or RNA) or a protein into an organism refers to transforming the cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
- the "transformation” used in the present invention includes stable transformation and transient transformation.
- “Stable transformation” refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in the stable inheritance of the exogenous nucleotide sequence. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
- Transient transformation refers to the introduction of nucleic acid molecules or proteins into cells to perform functions without stable inheritance of exogenous nucleotide sequences. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
- “Traits” refer to the physiological, morphological, biochemical or physical characteristics of cells or organisms.
- “Agronomic traits” refer in particular to measurable index parameters of crop plants, including but not limited to: leaf greenness, grain yield, growth rate, total biomass or accumulation rate, fresh weight at maturity, dry weight at maturity, fruit yield, seed yield, total plant nitrogen content, fruit nitrogen content, seed nitrogen content, nitrogen content of plant vegetative tissue, total plant free amino acid content, fruit free amino acid content, seed free amino acid content, free amino acid content of plant vegetative tissue, total plant protein content, fruit protein content, seed protein content, protein content of plant vegetative tissue, herbicide resistance, drought resistance, nitrogen uptake, root lodging, harvest index, stem lodging, plant height, ear height, ear length, disease resistance, cold resistance, salt resistance and tiller number.
- Genome editing system based on Flavobacterium Cas12a protein
- the present invention provides a new Cas12a protein, which
- “Cas12a protein”, “Cas12a nuclease” and “Cas12a” are used interchangeably herein, and refer to RNA-guided nucleases or variants thereof including Cas12a protein or fragments thereof.
- Cas12a is a component of the CRISPR-Cas12a genome editing system, which can target and/or cleave DNA target sequences to form DNA double-strand breaks (DSB) under the guidance of guide RNA (crRNA).
- the Cas12a protein of the present invention is derived from plant symbiotic bacteria, and therefore, is particularly suitable for genome editing in plants.
- the Cas12a protein is derived from a species of the genus Flavobacterium. In some embodiments, the Cas12a protein is derived from Flavobacterium branchiophilum. Those skilled in the art will understand that the Cas12a protein of different strains of the same bacterial species may have certain differences in amino acid sequence, but can achieve substantially the same function.
- the Cas12a protein is produced recombinantly.
- the Cas12a protein further contains a fusion tag, for example, a tag used for the separation and/or purification of the Cas12a protein.
- Methods of recombinantly producing proteins are known in the art.
- tags that can be used to separate and/or purify proteins are known in the art, including but not limited to His tags, GST tags, and the like. Generally speaking, these tags do not change the activity of the target protein.
- the Cas12a protein is also fused with other functional proteins, such as deaminase, transcription activation/repressor protein, etc., so as to realize base editing or transcription regulation functions.
- the Cas12a protein of the present invention further comprises a nuclear localization sequence (NLS), for example, connected to the nuclear localization sequence via a linker.
- the linker can be a non-functional amino acid sequence, without secondary or higher structure, that is 1-50 (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or 20-25, 25-50) or more amino acids in length.
- the linker may be a flexible linker, such as SGGS (SEQ ID NO: 3).
- one or more NLS in the Cas12a protein should have sufficient strength to drive the Cas12a protein to accumulate in the nucleus in an amount that can achieve its genome editing function.
- the strength of nuclear localization activity is determined by the number and location of NLS in the Cas12a protein, one or more specific NLS used, or a combination of these factors.
- Exemplary nuclear localization sequences include, but are not limited to, SV40 nuclear localization signal sequence (for example, shown in SEQ ID NO: 4), and nucleoplasmin nuclear localization signal sequence (for example, shown in SEQ ID NO: 5).
- the Cas12a protein of the present invention may also include other localization sequences, such as a cytoplasmic localization sequence, chloroplast localization sequence, mitochondrial localization sequence, etc.
- the multiple localization sequences may be connected by a linker.
- the Cas12a protein comprises the amino acid sequence shown in SEQ ID NO:6.
- the present invention provides the use of the Cas12a protein of the present invention in genome editing of cells, preferably eukaryotic cells, and more preferably plant cells.
- the present invention provides a genome editing system for site-directed modification of a target nucleic acid sequence in a cell genome, which comprises the Cas12a protein of the present invention and/or an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention.
- “Genome editing system” and “gene editing system” are used interchangeably and refer to the combination of components required for genome editing of the genome of an organism's cells, wherein the various components of the system, for example the Cas12a protein, gRNA, or corresponding expression constructs, may exist independently of each other or in the form of a composition in any combination.
- the genome editing system further includes at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
- the guide RNA of the CRISPR-Cas12a genome editing system is usually composed of only a crRNA molecule, where the crRNA contains a sequence that has sufficient identity with the target sequence to hybridize with the complementary sequence of the target sequence and to direct sequence-specific binding of the CRISPR complex (Cas12a + crRNA) to the target sequence.
- the guide RNA is crRNA.
- the guide RNA includes the crRNA backbone sequence shown in SEQ ID NO: 10 or 11.
- the crRNA backbone sequence is SEQ ID NO: 11.
- the crRNA sequence further includes a sequence (i.e., a spacer sequence), located 3' of the crRNA backbone sequence, that specifically hybridizes with the complementary sequence of the target sequence.
- the crRNA comprises the following sequence:
- sequence N x (spacer sequence) can specifically hybridize to the complementary sequence of the target sequence.
- the 5'end of the target sequence targeted by the genome editing system of the present invention needs to include a protospacer adjacent motif (PAM).
- the PAM may be, for example, 5'-TTTN, where N represents A, G, C, or T.
- different PAM sequences can also be used.
- those skilled in the art can easily determine the target sequence in the genome that can be used for targeting and optionally editing, and design a suitable guide RNA accordingly. For example, if there is a PAM sequence 5'-TTTG-3' in the genome, about 18 to about 35, preferably 20, 21, 22 or 23, consecutive nucleotides immediately 3' of it can be used as the target sequence.
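The target selection rule just described (a 5'-TTTN PAM followed immediately on its 3' side by the target sequence) can be sketched as a simple scan. This is only an illustration: the function name `find_targets` is hypothetical, the default length of 23 is one of the preferred values named above, only the supplied strand is scanned, and no off-target check is made.

```python
import re


def find_targets(genome: str, pam: str = "TTTN", length: int = 23):
    """Yield (pam_start, pam_seq, target_seq) for each PAM on the given strand.

    Only the supplied strand is scanned; a practical design tool would also
    scan the reverse complement and screen candidates for off-target matches.
    """
    genome = genome.upper()
    # "N" is the only ambiguity code handled here (A, G, C, or T).
    pam_regex = "".join("[ACGT]" if c == "N" else c for c in pam.upper())
    # The lookahead lets overlapping PAM occurrences all be reported.
    pattern = re.compile(f"(?=({pam_regex})([ACGT]{{{length}}}))")
    for m in pattern.finditer(genome):
        yield m.start(), m.group(1), m.group(2)
```

Each tuple gives the 0-based position of the PAM, the PAM as found, and the nucleotides immediately 3' of it that would serve as the target sequence for spacer design.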
- the at least one guide RNA is encoded by different expression constructs. In some embodiments, the at least one guide RNA is encoded by the same expression construct. In some embodiments, the at least one guide RNA and the Cas12a protein of the invention are encoded by the same expression construct.
- the genome editing system may comprise any one selected from the following:
- the Cas12a protein of the present invention and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
- an expression construct comprising a nucleotide sequence encoding the Cas12a protein of the present invention, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA;
- the nucleotide sequence encoding the Cas12a protein is codon-optimized for the organism from which the cell to be genome edited is derived.
- Codon optimization refers to modifying a nucleic acid sequence to enhance expression in the host cell of interest by replacing at least one codon of the natural sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with a codon that is used more frequently or most frequently in the genes of the host cell, while maintaining the natural amino acid sequence.
- Different species display a particular preference for certain codons of a specific amino acid. Codon preference (the difference in codon usage between organisms) is often related to the translation efficiency of messenger RNA (mRNA), which in turn is considered to depend on, among other things, the nature of the codons being translated and the availability of specific transfer RNA (tRNA) molecules.
- Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in various ways. See Nakamura Y. et al., “Codon usage tabulated from the international DNA sequence databases: status for the year 2000”, Nucl. Acids Res., 28:292 (2000).
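The idea of codon optimization described above can be illustrated with a minimal sketch that swaps every codon for the most frequently used synonymous codon while leaving the encoded amino acid sequence unchanged. The table `TOY_USAGE` is a hypothetical placeholder covering only four amino acid symbols; its frequencies are invented for illustration and are not values for rice or any real organism. Practical tools also weigh factors such as GC content and restriction sites, which this sketch ignores.

```python
# Hypothetical toy table: amino acid -> {codon: relative frequency}.
# Placeholder numbers only; a real table would come from a codon usage database.
TOY_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAG": 0.70, "AAA": 0.30},
    "L": {"CTC": 0.30, "CTG": 0.28, "TTG": 0.15,
          "CTT": 0.14, "CTA": 0.07, "TTA": 0.06},
    "*": {"TGA": 0.45, "TAG": 0.30, "TAA": 0.25},
}


def _lookups(usage):
    """Build codon -> amino acid and amino acid -> preferred codon maps."""
    codon_to_aa = {c: aa for aa, codons in usage.items() for c in codons}
    preferred = {aa: max(codons, key=codons.get) for aa, codons in usage.items()}
    return codon_to_aa, preferred


def translate(cds: str, usage=TOY_USAGE) -> str:
    """Translate a coding sequence using the codons present in `usage`."""
    codon_to_aa, _ = _lookups(usage)
    cds = cds.upper()
    return "".join(codon_to_aa[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize(cds: str, usage=TOY_USAGE) -> str:
    """Replace each codon with the most frequent synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codon_to_aa, preferred = _lookups(usage)
    # A KeyError here means the codon is outside the toy table.
    return "".join(preferred[codon_to_aa[cds[i:i + 3]]]
                   for i in range(0, len(cds), 3))
```

Under the toy table, `optimize("ATGAAATTATAA")` returns `"ATGAAGCTCTGA"`, and both sequences translate to the same product, `"MKL*"`.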
- the organism from which the cells that can be genome edited by the Cas12a protein or genome editing system of the present invention are derived is preferably a eukaryote, including but not limited to mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; and plants, including monocotyledonous and dicotyledonous plants such as rice, corn, wheat, sorghum, barley, soybeans, peanuts, and Arabidopsis.
- the Cas12a protein or genome editing system of the present invention is particularly suitable for genome editing in plants.
- the nucleotide sequence encoding the Cas12a protein is codon-optimized for plants such as rice.
- the nucleotide sequence encoding the Cas12a protein is selected from SEQ ID NO: 2 and SEQ ID NO: 7.
- the nucleotide sequence encoding the Cas12a protein and/or the nucleotide sequence encoding the at least one guide RNA are operably linked to an expression control element such as a promoter.
- Examples of promoters that can be used include, but are not limited to, polymerase (pol) I, pol II, or pol III promoters.
- Examples of pol I promoters include the chicken RNA pol I promoter.
- Examples of pol II promoters include, but are not limited to, the cytomegalovirus immediate early (CMV) promoter, the Rous sarcoma virus long terminal repeat (RSV-LTR) promoter, and the simian virus 40 (SV40) immediate early promoter.
- Examples of pol III promoters include the U6 and H1 promoters.
- An inducible promoter such as a metallothionein promoter can be used.
- promoters include T7 phage promoter, T3 phage promoter, β-galactosidase promoter, and Sp6 phage promoter.
- the promoter can be cauliflower mosaic virus 35S promoter, maize Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, maize U3 promoter, rice actin promoter.
- the 5' end of the guide RNA coding sequence is connected to the 3' end of a first ribozyme coding sequence. The first ribozyme is designed to cut the first ribozyme-guide RNA fusion produced by transcription in the cell at the 5' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 5' end.
- the 3' end of the guide RNA coding sequence is connected to the 5' end of a second ribozyme coding sequence. The second ribozyme is designed to cut the guide RNA-second ribozyme fusion produced by transcription in the cell at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 3' end.
- the 5' end of the guide RNA coding sequence is connected to the 3' end of the first ribozyme coding sequence, and the 3' end of the guide RNA coding sequence is connected to the 5' end of the second ribozyme coding sequence. The first ribozyme is designed to cut the first ribozyme-guide RNA-second ribozyme fusion produced by transcription in the cell at the 5' end of the guide RNA, and the second ribozyme is designed to cut the same fusion at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at the 5' and 3' ends.
- The design of the first or second ribozyme is within the abilities of those skilled in the art. For example, see Gao et al., JIPB, Apr, 2014; Vol 56, Issue 4, 343-349.
- the first ribozyme is encoded by the following sequence: 5'-(N)6CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC-3' (SEQ ID NO: 31), wherein each N is independently selected from A, G, C, and T, and (N)6 represents the reverse complement of the first 6 nucleotides at the 5' end of the guide RNA.
- the second ribozyme is encoded by the following sequence: 5'-GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGAC-3' (SEQ ID NO: 32).
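The arrangement described above (first ribozyme, then guide RNA, then second ribozyme) can be sketched at the level of the DNA coding sequence. The two ribozyme sequences below are copied from SEQ ID NO: 31 and SEQ ID NO: 32 as given in the text; the function names and the example guide sequence used in the note below are hypothetical and serve only to show how (N)6 is derived from the guide.

```python
# Portion of SEQ ID NO: 31 that follows (N)6, as given in the text.
FIRST_RIBOZYME_CORE = "CTGATGAGTCCGTGAGGACGAAACGAGTAAGCTCGTC"
# SEQ ID NO: 32, as given in the text.
SECOND_RIBOZYME = ("GGCCGGCATGGTCCCAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGC"
                   "TTCGGCATGGCGAATGGGAC")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def ribozyme_flanked_guide(guide_dna: str) -> str:
    """Assemble first ribozyme + guide RNA + second ribozyme coding sequence.

    (N)6 is the reverse complement of the first 6 nucleotides of the guide,
    as stated for SEQ ID NO: 31.
    """
    guide_dna = guide_dna.upper()
    if len(guide_dna) < 6 or set(guide_dna) - set("ACGT"):
        raise ValueError("guide must be at least 6 nt of A, C, G, T")
    n6 = reverse_complement(guide_dna[:6])
    return n6 + FIRST_RIBOZYME_CORE + guide_dna + SECOND_RIBOZYME
```

For a hypothetical guide coding sequence beginning `AATTTC`, (N)6 evaluates to `GAAATT`, so the assembled construct starts with `GAAATT` followed by the first ribozyme core.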
- the 5' end of the guide RNA coding sequence is connected to the 3' end of a first tRNA coding sequence. The first tRNA is designed so that the first tRNA-guide RNA fusion produced by transcription in the cell is cleaved at the 5' end of the guide RNA by the precise tRNA processing mechanism present in the cell (which precisely removes the additional 5' and 3' sequences of a precursor tRNA to form a mature tRNA), thereby forming a guide RNA that carries no additional nucleotides at the 5' end.
- the 3' end of the guide RNA coding sequence is connected to the 5' end of a second tRNA coding sequence. The second tRNA is designed so that the guide RNA-second tRNA fusion produced by transcription in the cell is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that carries no additional nucleotides at the 3' end.
- the 5' end of the guide RNA coding sequence is connected to the 3' end of the first tRNA coding sequence, and the 3' end of the guide RNA coding sequence is connected to the 5' end of the second tRNA coding sequence. The first tRNA is designed so that the first tRNA-guide RNA-second tRNA fusion produced by transcription in the cell is cleaved at the 5' end of the guide RNA, and the second tRNA is designed so that the same fusion is cleaved at the 3' end of the guide RNA, thereby forming a guide RNA that carries no additional nucleotides at the 5' and 3' ends.
- The design of tRNA-guide RNA fusions is within the ability of those skilled in the art. For example, see Xie et al., PNAS, Mar 17, 2015; vol. 112, no. 11, 3570-3575.
- the present invention provides a method for site-directed modification of a target nucleic acid sequence in a cell genome, including introducing the genome editing system of the present invention into the cell.
- the introduction of the genome editing system results in a double-strand break (DSB) in the target nucleic acid sequence. Subsequently, through the repair function of the cell, the substitution, deletion and/or addition of one or more nucleotides in the target nucleic acid sequence or its nearby sequence is realized.
- the present invention also provides a method for producing genetically modified cells, including introducing the genome editing system of the present invention into the cells.
- the present invention also provides a genetically modified organism, which comprises a genetically modified cell or its progeny cells produced by the method of the present invention.
- the target sequence to be modified can be located anywhere in the genome, for example, in a functional gene such as a protein-coding gene, or, for example, can be located in a gene expression regulatory region such as a promoter region or an enhancer region, so as to achieve Modification of gene function or modification of gene expression.
- the modification in the cell target sequence can be detected by T7EI, PCR/RE or sequencing methods.
- the gene editing system can be introduced into cells by various methods well known to those skilled in the art.
- Methods that can be used to introduce the gene editing system of the present invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus, adeno-associated virus, lentivirus and other viruses), gene gun bombardment, PEG-mediated transformation of protoplasts, and Agrobacterium-mediated transformation.
- the methods of the invention are performed in vitro.
- the cell is an isolated cell, or a cell in an isolated tissue or organ.
- the method of the present invention can also be performed in vivo.
- the cell is a cell in an organism, and the system of the present invention can be introduced into the cell in vivo by a method mediated by, for example, a virus or Agrobacterium.
- Cells that can be genome edited by the method of the present invention can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows, and cats; poultry such as chickens, ducks, and geese; and plants, including monocotyledonous and dicotyledonous plants such as rice, corn, wheat, sorghum, barley, soybean, peanut, and Arabidopsis.
- the Cas12a protein or genome editing system of the present invention is particularly suitable for genome editing in plants.
- the present invention provides a method for producing a genetically modified plant, comprising introducing the genome editing system of the present invention into at least one plant, thereby causing a modification in the genome of the at least one plant.
- the modification includes substitution, deletion and/or addition of one or more nucleotides.
- the genome editing system can be introduced into plants by various methods well known to those skilled in the art.
- Methods that can be used to introduce the genome editing system of the present invention into plants include, but are not limited to: gene gun bombardment, PEG-mediated transformation of protoplasts, Agrobacterium-mediated transformation, plant virus-mediated transformation, the pollen tube pathway method, and the ovary injection method.
- the target sequence can be modified simply by introducing the Cas12a protein and guide RNA into plant cells, or producing them there, and the modification can be stably inherited without the need to stably transform the genome editing system into the plant.
- This avoids the potential off-target effects of a stably present genome editing system, and also avoids the integration of exogenous nucleotide sequences into the plant genome, thereby providing higher biological safety.
- the introduction is performed in the absence of selective pressure, so as to avoid the integration of foreign nucleotide sequences in the plant genome.
- the introduction includes transforming the genome editing system of the present invention into an isolated plant cell or tissue, and then regenerating the transformed plant cell or tissue into a whole plant.
- the regeneration is performed in the absence of selective pressure, that is, no selective agent for the selective gene carried on the expression vector is used during the tissue culture process. Not using selection agents can improve plant regeneration efficiency and obtain herbicide-resistant plants without exogenous nucleotide sequences.
- the genome editing system of the present invention can be transformed to specific parts on the whole plant, such as leaves, stem tips, pollen tubes, young ears or hypocotyls. This is particularly suitable for the transformation of plants that are difficult to regenerate from tissue culture.
- the protein expressed in vitro and/or the RNA molecule transcribed in vitro is directly transformed into the plant.
- the protein and/or RNA molecule can realize genome editing in plant cells and then be degraded by the cell, avoiding the integration of foreign nucleotide sequences in the plant genome.
- genetic modification of plants using the method of the present invention can obtain plants whose genomes have no exogenous polynucleotide integration, that is, transgene-free modified plants.
- the modification is related to plant traits such as agronomic traits
- the modification causes the plant to have an altered (preferably improved) trait, such as agronomic trait, relative to a wild-type plant.
- the method further includes the step of screening for plants with desired modifications and/or desired traits such as agronomic traits.
- the method further includes obtaining progeny of the genetically modified plant.
- the genetically modified plant or its progeny have desired modifications and/or desired traits such as agronomic traits.
- the present invention also provides a genetically modified plant or its progeny or part thereof, wherein the plant is obtained by the above-mentioned method of the present invention.
- the genetically modified plant or progeny or part thereof is non-transgenic.
- the genetically modified plant or its progeny have desired genetic modification and/or desired traits such as agronomic traits.
- the present invention also provides a plant breeding method, comprising crossing a genetically modified first plant obtained by the above-mentioned method of the present invention with a second plant that does not contain the modification, thereby introducing the modification into the second plant.
- the genetically modified first plant has desired traits such as agronomic traits.
- the present invention also includes a kit used in the method of the present invention, which includes the genome editing system of the present invention, and instructions for use.
- the kit generally includes a label indicating the intended use and/or method of use of the contents of the kit.
- the term label includes any written or recorded material provided on or with the kit or otherwise provided with the kit.
- Example 1 Using homology and similarity comparison to find the CRISPR/Cas12a system in plant symbiotic bacteria
- the size of the protein is 1318 aa (SEQ ID NO: 1), but there are no other Cas protein sequences near the sequence.
- the CRISPR repeat sequence appears 1509 bp downstream in the genome. There are 37 spacer sequences in total.
- Its Direct Repeat is GTTTAAAACCACTTTAAAATTTCTACTATTGTAGAT (SEQ ID NO: 9). A comparison with the Direct Repeats of the commonly used Cas12a proteins FnCas12a, LbCas12a, and AsCas12a is shown in Figure 1a.
- the protein sequence alignment results of this protein with the commonly used FnCas12a, LbCas12a, and AsCas12a are shown in Figure 1b.
- the sequence similarity alignment uses the NCBI blastp program.
- the coding sequence of FbCas12a derived from Flavobacterium branchiophilum was codon-optimized, two nuclear localization signals (NLS) were added to its 3' end, and BamHI/SmaI restriction sites were added at the two ends.
- the optimized FbCas12a protein can be better expressed and localized in rice.
- the nucleotide coding sequence of FbCas12a with NLS added and codon optimized is shown in SEQ ID NO: 7 in the sequence table.
- Positions 1-3966 are the coding sequence of the FbCas12a protein.
- Positions 3967-3987 are the SV40 nuclear localization signal sequence.
- Positions 3988-3999 are the SGGS linker between the two nuclear localization signal sequences.
- Positions 4000-4047 are the nucleoplasmin nuclear localization signal sequence.
- SEQ ID NO: 7 encodes the protein shown in SEQ ID NO: 6, that is, the FbCas12a nuclease with nuclear localization signal.
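The coordinate layout of SEQ ID NO: 7 described above can be sanity-checked with a short script. This is only an illustrative sketch: the coordinates are taken from the text, while the names `SEGMENTS`, `segment_lengths` and `is_contiguous` are hypothetical helpers introduced here, not part of the application.

```python
# Segment coordinates of SEQ ID NO: 7 as stated in the text
# (1-based, inclusive). The dictionary and helper names are illustrative.
SEGMENTS = {
    "FbCas12a coding sequence": (1, 3966),
    "SV40 NLS": (3967, 3987),
    "SGGS linker": (3988, 3999),
    "nucleoplasmin NLS": (4000, 4047),
}


def segment_lengths(segments):
    """Return {name: (nucleotides, codons)}; fail if a segment is not whole codons."""
    lengths = {}
    for name, (start, end) in segments.items():
        nucleotides = end - start + 1
        if nucleotides % 3 != 0:
            raise ValueError(f"{name} is not a whole number of codons")
        lengths[name] = (nucleotides, nucleotides // 3)
    return lengths


def is_contiguous(segments):
    """True if the segments tile the sequence with no gaps or overlaps."""
    ordered = sorted(segments.values())
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(ordered, ordered[1:]))


LENGTHS = segment_lengths(SEGMENTS)

for name, (nucleotides, codons) in LENGTHS.items():
    print(f"{name}: {nucleotides} nt, {codons} codons")
print("contiguous:", is_contiguous(SEGMENTS))
```

The linker segment spans 12 nucleotides, which matches the four residues of SGGS, and the segments are adjacent with no gaps.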
- the DNA sequence of LbCas12a commonly used in laboratory genome editing was ligated into the pJIT163 vector to obtain the pJIT163-UBI-LbCas12a vector.
- the construction method of the vector is similar to that of pJIT163-UBI-FbCas12a.
- the nucleotide coding sequence of the codon-optimized LbCas12a is shown in SEQ ID NO: 8 in the sequence list.
- the pJIT163-FbCas12a and pJIT163-LbCas12a vectors contain the UBI promoter, the plant codon-optimized FbCas12a or LbCas12a coding sequence, the 3' SV40 nuclear localization signal coding sequence, and the nucleoplasmin nuclear localization signal coding sequence; schematic diagrams of their structures are shown in Figure 2a and Figure 2c.
- Cas12a has a crRNA self-maturation function.
- When the full-length FbCas12a crRNA backbone sequence (the full-length Direct Repeat) was used, genome editing was not achieved. Therefore, it seems that FbCas12a cannot mature its natural crRNA backbone, and the crRNA of FbCas12a needs to be explored further to determine whether it can support genome editing.
- the recognition sequence of the target sequence to be mutated in rice can be ligated into the vector pJIT163-FbcrRNA through two restriction sites; positions 89-156 are the HDV ribozyme sequence, and positions 157-162 are the SmaI restriction site sequence.
- the synthetic DNA fragment of SEQ ID NO: 14 was ligated into the expression vector pJIT163 to obtain the pJIT163-FbcrRNA vector.
- the vector contains UBI promoter, HH ribozyme, truncated FbcrRNA sequence, HDV ribozyme and CaMV terminator. Its structure is shown in Figure 2b.
- the pJIT163-FbcrRNA vector uses a ribozyme-based crRNA maturation strategy to obtain precisely processed crRNA sequences.
- Example 4 Site-directed mutation of rice endogenous gene EPSPS by FbCas12a system and mutation of four endogenous gene targets in rice using FbCas12a protein and LbCas12a crRNA
- target-EPSPS05 TTTG GTACTAAATATACAATCCCTTGG (SEQ ID NO: 16; nucleotides 956-982 of the OsEPSPS gene, LOC_Os06g04280.1; the first four nucleotides, underlined in the original, are the PAM sequence).
- target-OsCDC48 TTTA TTCAGATTACATATGGTTAG (SEQ ID NO: 17; nucleotides 582-605 of the OsCDC48 gene, LOC_Os03g05730; the first four nucleotides, underlined in the original, are the PAM sequence).
- target-OsDEP1T3 TTTC AAATGGATCTAAACAGGGCCTTA (SEQ ID NO: 18; nucleotides 1919-1945 of the OsDEP1 gene, LOC_Os09g26999; the first four nucleotides, underlined in the original, are the PAM sequence).
- target-OsPDS TTTG GAGTGAAATCTCTTGTCTTA (SEQ ID NO: 19; nucleotides 136-159 of the OsPDS gene, LOC_Os03g08570; the first four nucleotides, underlined in the original, are the PAM sequence).
- target-OsEpspsC02 TTTA TGAAAATATGTATGGAATTCATG (SEQ ID NO: 20; nucleotides 1294-1320 of the OsEPSPS gene, LOC_Os06g04280.1; the first four nucleotides, underlined in the original, are the PAM sequence).
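The five target sites listed above can be checked programmatically. The sketch below splits each target into its PAM and protospacer and confirms that the stated gene coordinates match the sequence length. The `TTT[ACG]` pattern reflects the TTTV PAM commonly reported for Cas12a nucleases; applying it to FbCas12a is an assumption made here for illustration, and `TARGETS`, `PAM_PATTERN` and `check_target` are hypothetical names.

```python
import re

# (PAM, protospacer, first nucleotide, last nucleotide) as listed in the text.
TARGETS = {
    "target-EPSPS05": ("TTTG", "GTACTAAATATACAATCCCTTGG", 956, 982),
    "target-OsCDC48": ("TTTA", "TTCAGATTACATATGGTTAG", 582, 605),
    "target-OsDEP1T3": ("TTTC", "AAATGGATCTAAACAGGGCCTTA", 1919, 1945),
    "target-OsPDS": ("TTTG", "GAGTGAAATCTCTTGTCTTA", 136, 159),
    "target-OsEpspsC02": ("TTTA", "TGAAAATATGTATGGAATTCATG", 1294, 1320),
}

# Assumption: a TTTV PAM (V = A, C or G), as commonly reported for Cas12a.
PAM_PATTERN = re.compile(r"TTT[ACG]")


def check_target(pam, protospacer, start, end):
    """True if the PAM fits TTTV and PAM + protospacer spans start..end inclusive."""
    pam_ok = PAM_PATTERN.fullmatch(pam) is not None
    length_ok = len(pam) + len(protospacer) == end - start + 1
    return pam_ok and length_ok


RESULTS = {name: check_target(*fields) for name, fields in TARGETS.items()}

for name, ok in RESULTS.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```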
- SP1 is the DNA encoding the RNA that can bind target-EPSPS05 by complementary base pairing.
- SP1-F AGAT GTACTAAATATACAATCCCTTGG (SEQ ID NO: 21)
- SP1-R AAAC CCAAGGGATTGTATATTTAGTAC (SEQ ID NO: 22)
- double-stranded DNA with sticky ends is formed, which is inserted between the two BsaI restriction sites of pJIT163-FbcrRNA to obtain the pJIT163-FbcrRNA plasmid containing SP1.
- the plasmid is verified as a positive plasmid by sequencing.
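The relationship between a protospacer and its oligo pair can be sketched as follows: the forward oligo is the AGAT overhang followed by the spacer, and the reverse oligo is the AAAC overhang followed by the reverse complement of the spacer. The function names `reverse_complement` and `design_oligos` are hypothetical; the overhangs and the SP1 sequences are those given in the text.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(spacer, fwd_overhang="AGAT", rev_overhang="AAAC"):
    """Build the forward and reverse oligos that anneal into a sticky-ended insert."""
    forward = fwd_overhang + spacer.upper()
    reverse = rev_overhang + reverse_complement(spacer)
    return forward, reverse


# Protospacer of target-EPSPS05 (the target sequence without its PAM).
SP1_F, SP1_R = design_oligos("GTACTAAATATACAATCCCTTGG")

print("SP1-F", SP1_F)
print("SP1-R", SP1_R)
```

Annealing the two oligos leaves 4-nt single-stranded overhangs at each end, which is what allows directional insertion between the two BsaI sites.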
- SP2 to SP5 are the DNAs encoding the RNAs that can bind the targets target-OsCDC48, target-OsDEP1T3, target-OsPDS and target-OsEpspsC02, respectively, by complementary base pairing.
- SP2-F AGAT TTCAGATTACATATGGTTAG (SEQ ID NO: 23)
- SP2-R AAAC CTAACCATATGTAATCTGAA (SEQ ID NO: 24)
- SP3-F AGAT AAATGGATCTAAACAGGGCCTTA (SEQ ID NO: 25)
- SP3-R AAAC ATTGGCCCTGTTTAGATCCATTT (SEQ ID NO: 26)
- SP4-F AGAT GAGTGAAATCTCTTGTCTTA (SEQ ID NO: 27)
- SP4-R AAAC TAAGACAAGAGATTTCACTC (SEQ ID NO: 28)
- SP5-F AGAT TGAAAATATGTATGGAATTCATG (SEQ ID NO: 29)
- SP5-R AAAC CATGAATTCCATACATATTTTCA (SEQ ID NO: 30)
- double-stranded DNA with sticky ends is formed and inserted between the two BsaI restriction sites of pJIT163-FbcrRNA and pJIT163-LbcrRNA to obtain pJIT163-FbcrRNA and pJIT163-LbcrRNA plasmids containing SP1 to SP5.
- the plasmid was sequenced and verified as a positive plasmid.
- the plasmids pJIT163-UBI-FbCas12a and pJIT163-UBI-LbCas12a, together with the pJIT163-FbcrRNA and pJIT163-LbcrRNA plasmids containing SP1 to SP5, were transformed into protoplasts of rice cultivar Nipponbare in the respective combinations.
- the specific process of rice protoplast transformation follows Shan, Q. et al., Rapid and efficient gene modification in rice and Brachypodium using TALENs. Molecular Plant (2013).
- the genomic DNA was extracted 48 hours after the transformation of rice protoplasts, and the DNA was used as a template to conduct amplicon high-throughput sequencing experiments to analyze its editing efficiency.
- the specific process of amplicon high-throughput sequencing follows Zhang et al., Perfectly matched 20-nucleotide guide RNA sequences enable robust genome editing using high-fidelity SpCas9 nucleases. Genome Biology, 2017.
- The results of the high-throughput sequencing experiments for artificially matured FbcrRNA:FbCas12a are shown in Figure 3a.
- the results show that compared with the control group, the FbCas12a treatment group has a mutation at the target site of the OsEPSPS gene, and the mutation efficiency is about 6.25%.
- Figure 3b shows the ratio of mutation types obtained from high-throughput sequencing data analysis. The results show that most of the mutation types generated by FbCas12a at the target site are deletions of DNA fragments.
- Fb means FbcrRNA:FbCas12a
- FbLb means LbcrRNA:FbCas12a
- Lb means LbcrRNA:LbCas12a.
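How a mutation efficiency such as the reported 6.25% is derived from amplicon read counts can be sketched as below. The read counts and sequences used here are hypothetical and are not data from the application; `editing_efficiency` and `classify_read` are illustrative helpers, and the length-based classification is a deliberate simplification of what a real amplicon analysis pipeline does.

```python
def editing_efficiency(mutant_reads, total_reads):
    """Percentage of amplicon reads that carry a mutation at the target site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mutant_reads <= total_reads:
        raise ValueError("mutant_reads must lie between 0 and total_reads")
    return 100.0 * mutant_reads / total_reads


def classify_read(read, reference):
    """Crude length-based classification of a read against the reference amplicon."""
    if len(read) < len(reference):
        return "deletion"
    if len(read) > len(reference):
        return "insertion"
    return "unchanged" if read == reference else "substitution"


# Hypothetical counts chosen only to reproduce the order of magnitude in the text.
EXAMPLE_EFFICIENCY = editing_efficiency(mutant_reads=625, total_reads=10000)
print(f"editing efficiency: {EXAMPLE_EFFICIENCY:.2f}%")

# Hypothetical reads: one shortened relative to the reference, one identical.
print(classify_read("ACGTACGT", "ACGTAACGT"))
print(classify_read("ACGTAACGT", "ACGTAACGT"))
```

Comparing the treated sample with an untreated control processed the same way helps separate genuine edits from sequencing errors.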
Claims (16)
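- A Cas12a protein which (i) comprises an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or even 100% sequence identity to SEQ ID NO: 1, or (ii) comprises an amino acid sequence having one or more, for example 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, amino acid substitutions, deletions or additions relative to SEQ ID NO: 1.
- The Cas12a protein of claim 1, wherein the Cas12a protein is derived from a species of the genus Flavobacterium, for example from Flavobacterium branchiophilum.
- The Cas12a protein of claim 1 or 2, wherein the Cas12a protein further comprises a nuclear localization sequence (NLS).
- The Cas12a protein of claim 3, which comprises the amino acid sequence shown in SEQ ID NO: 6.
- Use of the Cas12a protein of any one of claims 1-4 for genome editing of a cell, preferably a eukaryotic cell, more preferably a plant cell.
- A genome editing system for site-directed modification of a target nucleic acid sequence in the genome of a cell, comprising the Cas12a protein of any one of claims 1-4 and/or an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4.
- The genome editing system of claim 6, further comprising at least one guide RNA (gRNA) and/or an expression construct comprising a nucleotide sequence encoding the at least one guide RNA.
- The genome editing system of claim 7, wherein the guide RNA is a crRNA and comprises the crRNA backbone sequence shown in SEQ ID NO: 10 or 11.
- The genome editing system of claim 7 or 8, comprising any one selected from the following i) to v): i) the Cas12a protein of any one of claims 1-4 and the at least one guide RNA, wherein optionally the Cas12a protein and the at least one guide RNA form a complex; ii) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4, and the at least one guide RNA; iii) the Cas12a protein of any one of claims 1-4, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA; iv) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4, and an expression construct comprising a nucleotide sequence encoding the at least one guide RNA; v) an expression construct comprising a nucleotide sequence encoding the Cas12a protein of any one of claims 1-4 and a nucleotide sequence encoding the at least one guide RNA.
- The genome editing system of any one of claims 6-9, wherein the nucleotide sequence encoding the Cas12a protein is codon-optimized for a plant such as rice.
- The genome editing system of claim 10, wherein the nucleotide sequence encoding the Cas12a protein is selected from SEQ ID NO: 2 and SEQ ID NO: 7.
- The genome editing system of any one of claims 7-11, wherein the nucleotide sequence encoding the Cas12a protein and/or the nucleotide sequence encoding the at least one guide RNA is operably linked to an expression regulatory element such as a promoter.
- The genome editing system of any one of claims 7-11, wherein the 5' end of the guide RNA coding sequence is linked to the 3' end of a first ribozyme coding sequence, and the 3' end of the guide RNA coding sequence is linked to the 5' end of a second ribozyme coding sequence; the first ribozyme is designed to cleave, at the 5' end of the guide RNA, the first ribozyme-guide RNA-second ribozyme fusion produced by transcription in the cell, and the second ribozyme is designed to cleave that fusion at the 3' end of the guide RNA, thereby forming a guide RNA that carries no extra nucleotides at its 5' and 3' ends.
- The genome editing system of claim 13, wherein the first ribozyme is encoded by the sequence shown in SEQ ID NO: 31 and the second ribozyme is encoded by the sequence shown in SEQ ID NO: 32.
- A method of producing a genetically modified cell, comprising introducing the genome editing system of any one of claims 6-14 into the cell.
- The method of claim 15, wherein the cell is from a mammal such as human, mouse, rat, monkey, dog, pig, sheep, cattle or cat; poultry such as chicken, duck or goose; or a plant, including monocotyledons and dicotyledons, for example rice, maize, wheat, sorghum, barley, soybean, peanut or Arabidopsis thaliana.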
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/777,936 US20230002453A1 (en) | 2019-11-18 | 2020-11-18 | Gene editing system derived from flavobacteria |
BR112022009584A BR112022009584A2 (pt) | 2019-11-18 | 2020-11-18 | Gene editing system derived from Flavobacterium |
CN202080080579.0A CN115052980A (zh) | 2019-11-18 | 2020-11-18 | Gene editing system derived from Flavobacterium |
EP20890516.6A EP4063500A4 (en) | 2019-11-18 | 2020-11-18 | GENE EDITING SYSTEM DERIVED FROM BACTERIA OF THE GENUS FLAVOBACTERIUM |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911126348.4 | 2019-11-18 | ||
CN201911126348 | 2019-11-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021098709A1 (zh) | 2021-05-27 |
Family
ID=75981339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/129665 WO2021098709A1 (zh) | 2019-11-18 | 2020-11-18 | Gene editing system derived from Flavobacterium |
Country Status (5)
Country | Link |
---|---|
US (1) | US20230002453A1 (zh) |
EP (1) | EP4063500A4 (zh) |
CN (1) | CN115052980A (zh) |
BR (1) | BR112022009584A2 (zh) |
WO (1) | WO2021098709A1 (zh) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108513582A (zh) * | 2015-06-18 | 2018-09-07 | The Broad Institute, Inc. | Novel CRISPR enzymes and systems |
WO2019051428A1 (en) * | 2017-09-11 | 2019-03-14 | The Regents Of The University Of California | CAS9 ANTIBODY MEDIA ADMINISTRATION TO MAMMALIAN CELLS |
CN110117621A (zh) * | 2019-05-24 | 2019-08-13 | Qingdao Agricultural University | Base editor and preparation method and application thereof |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10513711B2 (en) * | 2014-08-13 | 2019-12-24 | Dupont Us Holding, Llc | Genetic targeting in non-conventional yeast using an RNA-guided endonuclease |
EA038321B1 (ru) * | 2014-11-06 | 2021-08-09 | E.I. du Pont de Nemours and Company | Peptide-mediated delivery of RNA-guided endonuclease into cells |
US10648020B2 (en) * | 2015-06-18 | 2020-05-12 | The Broad Institute, Inc. | CRISPR enzymes and systems |
US9896696B2 (en) * | 2016-02-15 | 2018-02-20 | Benson Hill Biosystems, Inc. | Compositions and methods for modifying genomes |
US20200263190A1 (en) * | 2016-04-19 | 2020-08-20 | The Broad Institute, Inc. | Novel crispr enzymes and systems |
WO2017223538A1 (en) * | 2016-06-24 | 2017-12-28 | The Regents Of The University Of Colorado, A Body Corporate | Methods for generating barcoded combinatorial libraries |
US20190330659A1 (en) * | 2016-07-15 | 2019-10-31 | Zymergen Inc. | Scarless dna assembly and genome editing using crispr/cpf1 and dna ligase |
JP2020507312A (ja) * | 2017-02-10 | 2020-03-12 | Zymergen Inc. | Modular universal plasmid design strategy for the assembly and editing of multiple DNA constructs for multiple hosts |
WO2018226972A2 (en) * | 2017-06-09 | 2018-12-13 | Vilmorin & Cie | Compositions and methods for genome editing |
US20210071174A1 (en) * | 2018-05-09 | 2021-03-11 | Dsm Ip Assets B.V. | Crispr transient expression construct (ctec) |
-
2020
- 2020-11-18 EP EP20890516.6A patent/EP4063500A4/en active Pending
- 2020-11-18 CN CN202080080579.0A patent/CN115052980A/zh active Pending
- 2020-11-18 WO PCT/CN2020/129665 patent/WO2021098709A1/zh unknown
- 2020-11-18 BR BR112022009584A patent/BR112022009584A2/pt unknown
- 2020-11-18 US US17/777,936 patent/US20230002453A1/en active Pending
Non-Patent Citations (20)
Title |
---|
"Biocomputing: Informatics and Genome Projects", 1993, ACADEMIC PRESS |
"Computer Analysis of Sequence Data", 1994, HUMANA PRESS |
"GenBank", Database accession no. CCB70584.1 |
"NCBI", Database accession no. FQ859183.1 |
"Sequence Analysis Primer", 1991, M STOCKTON PRESS |
BAI ET AL., FUNCTIONAL OVERLAP OF THE ARABIDOPSIS LEAF AND ROOT MICROBIOTA |
CARRILLO, H.; LIPMAN, D., SIAM J APPLIED MATH, vol. 48, 1988, pages 1073 |
CONG ET AL.: "Multiplex Genome Engineering Using CRISPR/Cas Systems", SCIENCE, 2013 |
DATABASE Protein GenPept; ANONYMOUS: "type V CRISPR-associated protein Cas12a/Cpf1 [Flavobacterium branchiopphilum]", XP055822487, retrieved from NCBI * |
GAO ET AL., JIPB, vol. 56, no. 4, April 2014 (2014-04-01), pages 343 - 349 |
LEVY ET AL., GENOMIC FEATURES OF BACTERIAL ADAPTATION TO PLANTS |
MALI ET AL.: "RNA-guided human genome engineering via Cas9", SCIENCE, 2013 |
NAKAMURA Y. ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: 10.1093/nar/28.1.292 |
PLANT MOLECULAR BIOLOGY, vol. 18, 1992, pages 815 - 818 |
SAMBROOK, J.; FRITSCH, E.F.; MANIATIS, T.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS |
See also references of EP4063500A4 |
SHAN, Q. ET AL.: "Rapid and efficient gene modification in rice and Brachypodium using TALENs", MOLECULAR PLANT, 2013 |
WATSON ET AL.: "Sequence Analysis in Molecular Biology", 1987, THE BENJAMIN/CUMMINGS PUB. CO., pages: 224 |
XIE ET AL., PNAS, vol. 112, no. 11, 17 March 2015 (2015-03-17), pages 3570 - 3575 |
ZHANG ET AL.: "Perfectly matched 20-nucleotide guide RNA sequences enable robust genome editing using high-fidelity SpCas9 nucleases", GENOME BIOLOGY, 2017 |
Also Published As
Publication number | Publication date |
---|---|
EP4063500A1 (en) | 2022-09-28 |
US20230002453A1 (en) | 2023-01-05 |
EP4063500A4 (en) | 2023-12-27 |
CN115052980A (zh) | 2022-09-13 |
BR112022009584A2 (pt) | 2022-10-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11702643B2 (en) | System and method for genome editing | |
WO2019120310A1 (en) | Base editing system and method based on cpf1 protein | |
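JP7138712B2 (ja) | System and method for genome editing | |
CN113373130A (zh) | Cas12 protein, gene editing system containing Cas12 protein, and applications | |
WO2023169454A1 (zh) | Adenine deaminase and its use in base editing | |
WO2020224611A1 (en) | Improved gene editing system | |
JP7361109B2 (ja) | System and method for genome editing based on C2c1 nuclease | |
CA3228222A1 (en) | Class ii, type v crispr systems | |
CN116555237A (zh) | Cytosine deaminase and its use in base editing | |
CN113025597A (zh) | Improved genome editing system | |
WO2021004456A1 (zh) | Improved genome editing system and its use | |
WO2021098709A1 (zh) | Gene editing system derived from Flavobacterium | |
WO2023030534A1 (zh) | Improved prime editing system | |
WO2021175288A1 (zh) | Improved cytosine base editing system | |
WO2022188816A1 (zh) | Improved CG base editing system | |
WO2024051850A1 (zh) | Genome editing system and method based on DNA polymerase | |
WO2023232109A1 (zh) | Novel CRISPR gene editing system | |
CN117327679A (zh) | Base editing tool and its use |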
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20890516 Country of ref document: EP Kind code of ref document: A1 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112022009584 Country of ref document: BR |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2020890516 Country of ref document: EP Effective date: 20220620 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01E Ref document number: 112022009584 Country of ref document: BR Free format text: SUBMIT NEW ELECTRONIC CONTENT OF THE BIOLOGICAL SEQUENCE LISTING, SINCE THE ONE SUBMITTED DIVERGES FROM THE APPLICATION IN THE NATIONAL PHASE WITH RESPECT TO MANDATORY FIELDS (FIELD 110). |
|
ENP | Entry into the national phase |
Ref document number: 112022009584 Country of ref document: BR Kind code of ref document: A2 Effective date: 20220517 |