RU2778156C1

RU2778156C1 - DNA-CUTTING AGENT BASED ON THE Cas9 PROTEIN FROM THE BACTERIUM CAPNOCYTOPHAGA OCHRACEA

Info

Publication number: RU2778156C1
Application number: RU2021130308A
Authority: RU
Inventors: Константин Викторович Северинов; Александра Андреевна Васильева; Полина Анатольевна Селькова; Анатолий Николаевич Арсениев; Михаил Алексеевич Ходорковский; Яна Витальевна Федорова
Filing date: 2021-10-19
Publication date: 2022-08-15

Abstract

FIELD: biotechnology.

SUBSTANCE: group of inventions relates to biotechnology and describes a new bacterial nuclease of the CRISPR-Cas9 system from the bacterium Capnocytophaga ochracea, as well as an application thereof for forming strictly specific double-stranded breaks in the DNA molecule. The nuclease herein has unusual properties and can be used as a tool for introducing alterations at strictly defined places in the sequence of genomic DNA of unicellular or multicellular organisms.

EFFECT: invention increases the versatility of available CRISPR-Cas9 systems, thereby allowing for the use of Cas9 nucleases from various organisms for cutting genomic or plasmid DNA at a greater number of specific sites and in different settings.

5 cl, 11 dwg, 3 ex

Description

Technical field

The invention relates to biotechnology, namely to new enzymes, the Cas nucleases of CRISPR-Cas systems, used for cutting DNA and editing the genomes of various organisms. In the future, this technology may be used for gene therapy of hereditary human diseases, as well as for editing the genomes of other organisms.

State of the art

Altering DNA sequences is one of the pressing tasks of biotechnology today. Editing and modifying the genomes of eukaryotic and prokaryotic organisms, as well as manipulating DNA in vitro, requires the targeted introduction of double-strand breaks into the DNA sequence.

The following techniques are currently used to solve this problem: artificial nuclease systems containing zinc finger domains (ZFN), transcription activator-like effector nucleases (TALEN systems), and bacterial CRISPR-Cas systems. The first two require labor-intensive optimization of the nuclease amino acid sequence to recognize a particular DNA sequence. In CRISPR-Cas systems, by contrast, the structures that recognize the target DNA are not proteins but short guide RNAs. Cutting a specific target DNA does not require de novo synthesis of the nuclease or its gene; it is achieved by using guide RNAs complementary to the target sequence. This makes CRISPR-Cas systems convenient and efficient tools for cutting various DNA sequences. The technique allows DNA to be cut at several sites simultaneously by using guide RNAs of different sequences. This approach is used, among other things, to modify several genes simultaneously in eukaryotic organisms.

By their nature, CRISPR-Cas systems are prokaryotic immune systems capable of introducing highly specific breaks into the genetic material of viruses (Mojica F. J. M., Díez-Villaseñor C., García-Martínez J. & Soria E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements // Journal of Molecular Evolution, 2005, vol. 60, no. 2, pp. 174-182). The abbreviation CRISPR-Cas stands for "Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated genes" (Jansen R., Embden J. D., Gaastra W. & Schouls L. M. Identification of genes that are associated with DNA repeats in prokaryotes // Molecular Microbiology, 2002, vol. 43, no. 6, pp. 1565-1575). All CRISPR-Cas systems consist of CRISPR cassettes and genes encoding various Cas proteins (Jansen R. et al., Molecular Microbiology, 2002, vol. 43, no. 6, pp. 1565-1575). CRISPR cassettes consist of spacers, each of which has a unique nucleotide sequence, and palindromic repeats (Jansen R. et al., Molecular Microbiology, 2002, vol. 43, no. 6, pp. 1565-1575). Transcription of the CRISPR cassettes and subsequent processing of the transcripts yield guide crRNAs, which together with Cas proteins form an effector complex (Brouns S. J., Jore M. M., Lundgren M., Westra E. R., Slijkhuis R. J., Snijders A. P., Dickman M. J., Makarova K. S., Koonin E. V. & van der Oost J. Small CRISPR RNAs guide antiviral defense in prokaryotes // Science, 2008, vol. 321, no. 5891, pp. 960-964). Through complementary pairing of the crRNA with a target DNA region called the protospacer, the Cas nuclease recognizes the target DNA and introduces a break into it with high specificity.

CRISPR-Cas systems represented by a single effector protein are divided into six different types (I to VI) depending on the Cas proteins that make up the systems. In 2013, it was first proposed to use the type II CRISPR-Cas9 system for editing the genomic DNA of human cells (Cong L., Ran F. A., Cox D., Lin S., Barretto R., Habib N., Hsu P. D., Wu X., Jiang W., Marraffini L. A. & Zhang F. Multiplex genome engineering using CRISPR/Cas systems // Science, 2013, vol. 339, no. 6121, pp. 819-823). The type II CRISPR-Cas9 system is notable for its simple composition and mechanism of operation: it functions through an effector complex consisting of only one Cas9 protein and two short RNAs, the crRNA and the tracer RNA (tracrRNA). The tracer RNA pairs complementarily with the region of the crRNA derived from the CRISPR repeat, forming a secondary structure necessary for binding of the guide RNAs to the Cas effector. Determining the sequence of the guide RNAs is an important step in characterizing previously unstudied Cas orthologs. The effector protein Cas9 is an RNA-dependent DNA endonuclease with two nuclease domains (HNH and RuvC) that introduce breaks into the complementary strands of the target DNA, thereby forming a double-strand break (Deltcheva E., Chylinski K., Sharma C. M., Gonzales K., Chao Y., Pirzada Z. A., Eckert M. R., Vogel J. & Charpentier E. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III // Nature, 2011, vol. 471, no. 7340, p. 602).

CRISPR-Cas9 technology is one of the most modern and rapidly developing techniques for introducing breaks into the DNA of various organisms, from bacterial strains to human cells, as well as in vitro (Song M. The CRISPR/Cas9 system: Their delivery, in vivo and ex vivo applications and clinical development by startups // Biotechnology Progress, 2017, vol. 33, no. 4, pp. 1035-1045).

To recognize and subsequently hydrolyze DNA, the effector ribonucleoprotein complex, consisting of Cas9 and a duplex of crRNA and tracer RNA, requires not only complementarity between the crRNA spacer and the protospacer but also the presence of a PAM (protospacer adjacent motif) on the target DNA (Mojica F. J. M. et al., Journal of Molecular Evolution, 2005, vol. 60, no. 2, pp. 174-182). A PAM is a strictly defined sequence of several nucleotides which, in type II systems, is located immediately adjacent to or a few nucleotides away from the 3' end of the protospacer on the non-target strand. In the absence of a PAM, the DNA bonds are not hydrolyzed and no double-strand break is formed. The requirement for a PAM sequence on the target increases the specificity of recognition, but at the same time restricts the choice of target DNA regions into which a break can be introduced. Thus, the need for the required PAM sequence flanking the target DNA at the 3' end is a characteristic that prevents CRISPR-Cas systems from being applied to arbitrary DNA regions.

To date, several CRISPR-Cas nucleases are known that can introduce double-strand breaks into DNA in a targeted and specific manner. Examples include NmeCas9 from Neisseria meningitidis strain 8013 (Esvelt K. M., Mali P., Braff J. L., Moosburner M., Yaung S. J. & Church G. M. Orthogonal Cas9 proteins for RNA-guided gene regulation and editing // Nature Methods, 2013, vol. 10, pp. 1116-1121), Nme2Cas9 from Neisseria meningitidis strain De11444 (Edraki A., Mir A., Ibraheim R., Gainetdinov I., Yoon Y., Song C.-Q., Cao Y., Gallant J., Xue W., Rivera-Perez J. A. & Sontheimer E. J. A compact, high-accuracy Cas9 with a dinucleotide PAM for in vivo genome editing // Molecular Cell, 2019, vol. 73, pp. 714-726), CjCas9 from Campylobacter jejuni (Kim E., Koo T., Park S. W., Kim D., Kim K., Cho H. Y., Song D. W., Lee K. J., Jung M. H., Kim S., Kim J. H., Kim J. H. & Kim J. S. In vivo genome editing with a small Cas9 orthologue derived from Campylobacter jejuni // Nature Communications, 2017, vol. 8, p. 14500), CdCas9 from Corynebacterium diphtheriae (Hirano S., Abudayyeh O. O., Gootenberg J. S., Horii T., Ishitani R., Hatada I., Zhang F., Nishimasu H. & Nureki O. Structural basis for the promiscuous PAM recognition by Corynebacterium diphtheriae Cas9 // Nature Communications, 2019, vol. 10, p. 1968), GeoCas9 from Geobacillus stearothermophilus (Harrington L. B., Paez-Espino D., Staahl B. T., Chen J. S., Ma E., Kyrpides N. C. & Doudna J. A. A thermostable Cas9 with increased lifetime in human plasma // Nature Communications, 2017, vol. 8, p. 1424), SaCas9 from Staphylococcus aureus (Ran F. A., Cong L., Yan W. X., Scott D. A., Gootenberg J. S., Kriz A. J., Zetsche B., Shalem O., Wu X., Makarova K. S., Koonin E. V., Sharp P. A. & Zhang F. In vivo genome editing using Staphylococcus aureus Cas9 // Nature, 2015, vol. 520, pp. 186-191), and SauriCas9 from Staphylococcus auricularis (Hu Z., Wang S., Zhang C., Gao N., Li M., Wang D., Wang D., Liu D., Liu H., Ong S.-G., Wang H., Wang Y. A compact Cas9 ortholog from Staphylococcus Auricularis (SauriCas9) expands the DNA targeting scope // PLOS Biology, 2020, vol. 18, e3000686).

Different CRISPR-Cas proteins rely on different, distinctive PAM sequences. However, most of those described so far are characterized by a long and complex PAM sequence.

CRISPR-Cas proteins with new and diverse PAM sequences are needed to make it possible to modify any DNA region, both in vitro and in the genome of living organisms. Modifying eukaryotic genomes also requires small nucleases, so that CRISPR-Cas systems can be delivered into cells by means of AAV viruses.

Although a number of methods for cutting DNA and altering the sequence of genomic DNA are known, there remains a need for new effective tools for modifying DNA in various organisms and at strictly defined positions in the DNA sequence.

Summary of the invention

The objective of the present invention is to create new tools, based on CRISPR-Cas9 systems, for altering the sequence of genomic DNA of unicellular or multicellular organisms. Currently existing systems are of limited use because of the specific PAM sequence that must be present at the 3' end of the DNA region to be modified. The search for new Cas9 enzymes with other PAM sequences will expand the arsenal of tools available for forming a double-strand break at the required, strictly defined positions in the DNA molecules of different organisms. To solve this problem, the authors characterized a previously bioinformatically predicted system from the bacterium Capnocytophaga ochracea.

The type II CRISPR nuclease CoCas9 can be used to introduce targeted changes into the genome of this and other organisms. An essential feature that distinguishes the present invention is a PAM sequence that differs from other known PAM sequences.

This problem is solved by using a protein comprising the amino acid sequence of SEQ ID NO: 1, or comprising an amino acid sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and differs from SEQ ID NO: 1 only in non-conserved amino acid residues, to form a double-strand break in a DNA molecule, located immediately before the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3' in said DNA molecule. In some embodiments of the invention, this use is characterized in that the double-strand break in the DNA molecule is formed at a temperature of 35°C to 45°C.
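The PAM consensus 5'-N(A/G)(A/G)(A/T)C-3' stated above corresponds to NRRWC in IUPAC nucleotide notation. As an illustration only, the following minimal Python sketch lists candidate target sites on one strand of a DNA sequence. The function names, the assumption that the PAM lies immediately adjacent to the 3' end of the protospacer, and the protospacer length of 20 nucleotides are hypothetical choices made for this sketch; they are not values taken from this description.

```python
import re

# PAM consensus 5'-N(A/G)(A/G)(A/T)C-3' written as a regular expression.
# The lookahead makes overlapping PAM occurrences visible to finditer().
PAM_PATTERN = re.compile(r"(?=([ACGT][AG][AG][AT]C))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_pam_sites(seq: str, protospacer_len: int = 20):
    """Return (protospacer_start, protospacer, pam) tuples for one strand.

    protospacer_len is a hypothetical value used for illustration only.
    The PAM is assumed to sit immediately 3' of the protospacer.
    """
    seq = seq.upper()
    sites = []
    for match in PAM_PATTERN.finditer(seq):
        pam_start = match.start()
        start = pam_start - protospacer_len
        if start < 0:
            # Not enough sequence upstream of the PAM for a full protospacer.
            continue
        sites.append((start, seq[start:pam_start], match.group(1)))
    return sites
```

To cover both strands, the same function can be applied to `reverse_complement(seq)`; coordinates returned for that call refer to the reverse-complemented sequence.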

This problem is also solved by providing a method for altering the genomic DNA sequence of a unicellular or multicellular organism, comprising introducing into at least one cell of the organism an effective amount of: a) either a protein comprising the amino acid sequence of SEQ ID NO: 1, or a nucleic acid encoding a protein comprising the amino acid sequence of SEQ ID NO: 1, and b) either a guide RNA comprising a sequence that forms a duplex with the nucleotide sequence of a region of the organism's genomic DNA immediately adjacent to the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3', and that interacts with said protein after the duplex is formed, or a DNA sequence encoding said guide RNA; wherein the interaction of said protein with the guide RNA and the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3' results in the formation of a double-strand break in the genomic DNA sequence immediately adjacent to the sequence 5'-N(A/G)(A/G)(A/T)C-3'. In some embodiments of the invention, this method further comprises introducing an exogenous DNA sequence simultaneously with the guide RNA.

A mixture of crRNA and tracer RNA (tracrRNA) capable of forming a complex with the target DNA region and the CoCas9 protein can be used as the guide RNA. In preferred embodiments of the invention, a hybrid RNA constructed from the crRNA and the tracer RNA can be used as the guide RNA. Methods for constructing a hybrid guide RNA are known to those skilled in the art (Hsu P. D., Scott D. A., Weinstein J. A., Ran F. A., Konermann S., Agarwala V., Li Y., Fine E. J., Wu X., Shalem O., Cradick T. J., Marraffini L. A., Bao G. & Zhang F. DNA targeting specificity of RNA-guided Cas9 nucleases // Nature Biotechnology, 2013, vol. 31, no. 9, pp. 827-832). One way of constructing the hybrid RNA is disclosed in the Examples below.

The invention can be used both for cutting target DNA in vitro and for modifying the genome of a living organism. The genome can be modified directly, by cutting it at the appropriate site, or by inserting an exogenous DNA sequence through homology-directed repair.

Any region of double-stranded or single-stranded DNA from the genome of an organism other than the organism into which it is introduced (or a mixture of such regions with one another and with other DNA fragments) can be used as the exogenous DNA sequence; this region (or mixture of regions) is intended for integration at the site of the double-strand break in the target DNA formed by the CoCas9 nuclease. In some embodiments of the invention, the exogenous DNA sequence may be a region of double-stranded DNA from the genome of the organism into which the CoCas9 protein is introduced, but altered by mutations (nucleotide substitutions) or by insertions or deletions of one or more nucleotides.

The technical result of the present invention is an increase in the versatility of available CRISPR-Cas9 systems, making it possible to use the Cas9 nuclease to cut genomic or plasmid DNA at a greater number of specific sites and under specific conditions.

Detailed disclosure of the invention

In the description of the present invention, the terms "comprises" and "comprising" are interpreted as meaning "includes, among other things". These terms are not intended to be construed as "consists only of". Unless otherwise defined, the technical and scientific terms in this application have the standard meanings generally accepted in the scientific and technical literature.

As used herein, the term "percent homology of two sequences" is equivalent to the term "percent identity of two sequences". Sequence identity is determined relative to a reference sequence. Algorithms for sequence analysis are known in the art, such as BLAST as described in Altschul et al. (Basic local alignment search tool // Journal of Molecular Biology. - 1990. - Vol. 215. - P. 403-410). For the purposes of the present invention, the level of identity and similarity between nucleotide sequences and between amino acid sequences can be determined by comparing them with the BLAST software package provided by the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/blast), using gapped alignment with default parameters. The percent identity of two sequences is determined by the number of positions with identical amino acids in the two sequences, taking into account the number of gaps and the length of each gap that must be introduced to align the two sequences optimally. The percent identity equals the number of identical amino acids at the aligned positions, divided by the total number of positions and multiplied by 100.
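The percent identity formula above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings with "-" marking gaps; the function name `percent_identity` and the toy sequences are illustrative assumptions and do not come from the description.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Identical residues are counted over all alignment columns, so every
    gap position that had to be introduced lowers the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Columns in which both sequences carry a gap hold no information; skip them.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Toy alignment (hypothetical sequences): 7 of the 9 columns are identical.
print(percent_identity("MKT-AYIAK", "MKTQAYLAK"))
```

The value reported by a particular alignment program may differ slightly from this sketch, depending on how that program counts alignment length.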

The term "specifically hybridizes" refers to an association between two single-stranded nucleic acid molecules whose sequences are sufficiently complementary to permit such hybridization under predetermined conditions commonly used in the art.

The phrase "double-strand break located immediately before the PAM nucleotide sequence" means that the double-strand break in the target DNA sequence is made at a distance of 0 to 25 nucleotides before the PAM nucleotide sequence.

An exogenous DNA sequence introduced simultaneously with a guide RNA is to be understood as a DNA sequence prepared specifically for the specific modification of a double-stranded target DNA at the break site determined by the specificity of the guide RNA. Such a modification may be, for example, the insertion or deletion of certain nucleotides at the break site in the target DNA. The exogenous DNA can be either a DNA region from another organism or a DNA region from the same organism as the target DNA.

A protein containing a specific amino acid sequence is to be understood as a protein whose amino acid sequence is composed of the specified amino acid sequence and, possibly, other sequences joined to it by peptide bonds. Examples of such other sequences are a nuclear localization signal (NLS) sequence or other sequences that provide additional functionality for the specified amino acid sequence.

An effective amount of protein and RNA introduced into a cell is to be understood as an amount of protein and RNA that, once inside said cell, is able to form a functional complex, that is, a complex that specifically binds to the target DNA and produces a double-strand break in it at the location determined by the guide RNA and the PAM sequence on the DNA. The efficiency of this process can be assessed by analyzing the target DNA isolated from said cell using standard methods known to those skilled in the art.

The protein and RNA can be delivered into the cell in various ways. For example, the protein can be delivered as a DNA plasmid that encodes the gene for this protein, as an mRNA for translation of this protein in the cytoplasm of the cell, or as a ribonucleoprotein complex that includes this protein and a guide RNA. Delivery can be accomplished by various methods known to those skilled in the art.

The nucleic acid encoding the components of the system can be introduced into the cell directly or indirectly: by transfection or transformation of cells by methods known to those skilled in the art, by the use of a recombinant virus, or by manipulation of the cell, such as DNA microinjection, etc.

Delivery of the ribonucleoprotein complex consisting of the nuclease, the guide RNAs and, if necessary, exogenous DNA can be carried out by transfection of the complexes into the cell or by mechanical introduction of the complex into the cell, for example by microinjection.

The nucleic acid molecule encoding the protein to be introduced into the cell may be integrated into a chromosome or may be extrachromosomally replicating DNA. In some embodiments, to ensure efficient expression of the protein gene from the DNA introduced into the cell, the sequence of this DNA must be changed according to the cell type in order to optimize the codons for expression, because the frequencies of synonymous codons in the coding regions of the genome differ between organisms. Codon optimization is needed to increase expression in animal, plant, fungal or microbial cells.

For a protein having a sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 to function in a eukaryotic cell, the protein must be located in the nucleus of that cell. Therefore, in some embodiments of the invention, double-strand breaks in the target DNA are formed using a protein that has a sequence at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and that is additionally modified at one or both ends by the addition of one or more nuclear localization signals. For example, the nuclear localization signal from the SV40 virus can be used. For efficient delivery to the nucleus, the nuclear localization signal can be separated from the main protein sequence by a spacer sequence, such as that described in Shen B, et al. (Generation of gene-modified mice via Cas9/RNA-mediated gene targeting // Cell Research. - 2013. - Vol. 23. - No. 5. - P. 720-723). In other embodiments, a different nuclear localization signal, or an alternative method of delivering said protein to the cell nucleus, may also be used.

The present invention encompasses the use of a protein from the organism Capnocytophaga ochracea, homologous to previously characterized Cas9 proteins, to introduce double-strand breaks into DNA molecules at strictly defined positions. The use of CRISPR nucleases to introduce targeted changes into a genome has a number of advantages. First, the specificity of the system is determined by the crRNA sequence, which makes it possible to use one type of nuclease for all target loci. Second, the technique allows several guide RNAs complementary to different target genes to be delivered into the cell at once, so that several genes can be modified simultaneously.

CoCas9 is a Cas nuclease found in the bacterium Capnocytophaga ochracea DSM 7271, an opportunistic human pathogen found in the oral cavity. The Capnocytophaga ochracea CRISPR-Cas9 system (hereinafter CRISPR CoCas9) belongs to the type II-C CRISPR-Cas systems and consists of a CRISPR cassette carrying five direct repeats (DR) with the sequence 5'-GTTGTGAATTGCTTTCAAATTTTGTAGTTTTGCGATTGATAACAAC-3', separated by unique spacer sequences. None of the spacers of the system matches the sequence of any currently known bacteriophage or plasmid, so the PAM required by CoCas9 cannot be determined by bioinformatic analysis. The gene for the effector Cas9 protein, CoCas9, is adjacent to the CRISPR cassette. Next to the Cas gene, a sequence was found that is partially complementary to the direct repeats and folds into a characteristic secondary structure: the putative tracer RNA (tracrRNA) (Fig. 1).

Knowledge of the characteristic architecture of the RNA-Cas protein complex of type II-C systems made it possible to predict the direction of transcription of the CRISPR cassette: the pre-crRNA is transcribed in the direction opposite to that of the Cas genes (Fig. 1).

Thus, sequence analysis of the CoCas9 locus made it possible to predict the sequences of the tracer RNA and the guide RNAs (Table 1).

Table 1. Guide RNA sequences of the CRISPR CoCas9 system determined by bioinformatic methods. The direct repeat (DR) sequence is shown in bold. The symbol "x" denotes nucleotides of the variable spacer part of the crRNA.

Name | Sequence
CoCas9 tracrRNA | 5'-GUCGCACAAUUUGAAAGCAAUUCACAAUAAGGAUUAUUCCGUUGUGAAAACAUUUAAAGGAGCCCUAUCAUUAUAUUAGUGAUAGGGUUCUUUUUU-3' (SEQ ID NO: 2)
CoCas9 crRNA | 5'-xxxxxxxxxxxxxxxxxxxxGUUGUGAAUUGCUUUCAAAUUUUGUAGUUUUGCGAUUGAUAACAAC-3' (SEQ ID NO: 3)
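The crRNA layout in Table 1 (a 20-nucleotide variable spacer followed by the direct repeat) can be sketched in Python as follows. The helper names `to_rna` and `build_crrna` are hypothetical; the repeat is the DR sequence given above, and the example spacer is the recognition site used in the cutting experiments of this description.

```python
# Direct repeat of the CoCas9 CRISPR cassette, written as DNA (from the description).
DIRECT_REPEAT_DNA = "GTTGTGAATTGCTTTCAAATTTTGTAGTTTTGCGATTGATAACAAC"


def to_rna(dna: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet (T -> U), upper-cased."""
    return dna.upper().replace("T", "U")


def build_crrna(spacer: str, repeat_dna: str = DIRECT_REPEAT_DNA,
                spacer_len: int = 20) -> str:
    """Concatenate a spacer and the direct repeat, as laid out in Table 1."""
    spacer_rna = to_rna(spacer)
    if len(spacer_rna) != spacer_len:
        raise ValueError(f"spacer must be {spacer_len} nt long")
    if set(spacer_rna) - set("ACGU"):
        raise ValueError("spacer contains non-nucleotide characters")
    return spacer_rna + to_rna(repeat_dna)


# Spacer taken from the recognition site 5'-tatctcctttcattgagcac-3'.
print(build_crrna("tatctcctttcattgagcac"))
```

This sketch always appends the full 46-nucleotide repeat; a crRNA used in practice may carry a shorter portion of the repeat.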

To test the activity of the CoCas9 nuclease and to determine the PAM motif it requires, experiments were carried out to reconstitute the DNA cutting reaction in vitro. The PAM sequence of the CoCas9 protein was determined by in vitro cutting of double-stranded PAM libraries. This required all components of the CoCas9 effector complex: the guide RNAs and the nuclease in recombinant form. Once the sequences of the guide RNAs had been determined, the crRNA and tracrRNA molecules were synthesized in vitro using the NEB HiScribe T7 RNA synthesis kit. The double-stranded DNA libraries were fragments of 374 base pairs (bp) containing the protospacer sequence flanked at the 3' end by seven randomized nucleotides (5'-NNNNNNN-3'): 5'-

To cut this target, guide RNAs of the following sequences were used: as tracrRNA, the sequence of SEQ ID NO: 2; as crRNA:

5'-uaucuccuuucauugagcacGUUGUGAAUUGCUUUCAAAUUUUGUAGUUUUGCGAUUGAUAACAA-3' (SEQ ID NO: 5).

The crRNA sequence complementary to the protospacer (the target DNA sequence) is highlighted in bold.

To obtain the recombinant CoCas9 protein, its gene was cloned into the pET21a plasmid. The DNA encoding the gene was amplified from genomic DNA of Capnocytophaga ochracea DSM 7271 ordered from the DSMZ collection (Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures GmbH). E. coli Rosetta cells were transformed with the resulting pET21a-6xHis-CoCas9 plasmid. 500 µl of the overnight culture was diluted in 500 ml of LB medium, and the cells were grown at 37°C to an optical density of 0.6. Synthesis of the target protein was induced by adding IPTG to a concentration of 1 mM, after which the cells were incubated at 16°C for 16 hours. The cells were then centrifuged at 5000 g for 30 minutes, and the resulting cell pellets were frozen at -20°C. The pellets were thawed on ice for 30 minutes, resuspended in 15 ml of lysis buffer (50 mM Tris-HCl pH 8, 500 mM NaCl, 1 mM β-mercaptoethanol, 10 mM imidazole) supplemented with 15 mg of lysozyme, and incubated on ice for another 30 minutes. The cells were then disrupted by sonication for 30 minutes and centrifuged for 40 minutes at 16000 g. The resulting supernatant was passed through a 0.2 µm filter and applied to a HisTrap HP 1 mL column (GE Healthcare) at a flow rate of 1 ml/min.

Chromatography was performed on an AKTA FPLC chromatograph (GE Healthcare) at a flow rate of 1 ml/min. The column with the bound protein was washed with 20 ml of lysis buffer supplemented with 10 mM imidazole, after which the protein was eluted with lysis buffer supplemented with 300 mM imidazole.

The protein fraction obtained by affinity chromatography was then passed through a Superdex 200 10/300 GL gel filtration column (24 ml) equilibrated with the following buffer: 50 mM Tris-HCl pH 8, 500 mM NaCl, 1 mM DTT. Using an Amicon concentrator (with a 30 kDa filter), the fractions corresponding to the monomeric form of the CoCas9 protein were concentrated to 1.6 mg/ml, after which the purified protein was stored at -80°C in a buffer containing 10% glycerol.

The in vitro cutting reaction of the linear PAM libraries was carried out in a volume of 20 µl under the following conditions. The reaction mixture consisted of 1X CutSmart buffer (NEB), 5 mM DTT, 100 nM PAM library, 2 µM tracrRNA/crRNA and 400 nM CoCas9 protein. Samples without RNA were prepared in the same way as controls. The samples were incubated at different temperatures and analyzed by electrophoresis in a 2% agarose gel. If the CoCas9 protein recognizes and specifically cuts the DNA correctly, two DNA fragments of about 326 and 48 base pairs should be formed (see Fig. 2).
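The absolute amounts implied by these concentrations follow from simple arithmetic, sketched below in Python. The concentrations and the 20 µl volume are taken from the paragraph above; the function name `picomoles` is an illustrative assumption.

```python
def picomoles(concentration_nM: float, volume_ul: float) -> float:
    """Amount of substance in picomoles.

    nM multiplied by microliters gives femtomoles, hence the division by 1000.
    """
    return concentration_nM * volume_ul / 1000.0


REACTION_VOLUME_UL = 20.0
FINAL_CONCENTRATIONS_NM = {
    "PAM library DNA": 100.0,
    "tracrRNA/crRNA": 2000.0,  # 2 µM
    "CoCas9 protein": 400.0,
}

for component, conc in FINAL_CONCENTRATIONS_NM.items():
    print(f"{component}: {picomoles(conc, REACTION_VOLUME_UL):g} pmol")

# The two expected cleavage products add up to the uncut library fragment.
LIBRARY_FRAGMENT_BP = 374
EXPECTED_PRODUCTS_BP = (326, 48)
print(sum(EXPECTED_PRODUCTS_BP) == LIBRARY_FRAGMENT_BP)
```

Under these conditions the protein is in fourfold molar excess over the DNA, and the guide RNA in fivefold excess over the protein.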

The results of the experiment showed that CoCas9 has nuclease activity and cuts a portion of the PAM library fragments (Fig. 3).

The library cutting reaction was repeated under the selected conditions. The reaction products were loaded onto a 2% agarose gel and subjected to electrophoresis. Uncut DNA fragments of 374 bp were extracted from the gel and prepared for high-throughput sequencing using the NEB NextUltra II kit. The samples were sequenced on the Illumina platform, and the sequences were then analyzed by bioinformatic methods: the difference in the representation of nucleotides at individual positions of the PAM (NNNNNNN) was determined in comparison with the control sample. To analyze the results, a PAM logo (Fig. 4) and a PAM wheel (Fig. 5) were constructed.
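The comparison of nucleotide representation between the uncut fraction and the control can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the scoring (log2 ratio of per-position frequencies with a pseudocount), the function names and the toy PAM lists are all assumptions. A positive score means the nucleotide is depleted from the uncut pool at that position, that is, fragments carrying it were preferentially cut.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGT"


def position_frequencies(pams, length=7, pseudocount=1.0):
    """Per-position nucleotide frequencies of a list of PAM reads."""
    if any(len(p) != length for p in pams):
        raise ValueError(f"every PAM must be {length} nt long")
    total = len(pams) + len(NUCLEOTIDES) * pseudocount
    freqs = []
    for pos in range(length):
        counts = Counter(p[pos] for p in pams)
        freqs.append({nt: (counts.get(nt, 0) + pseudocount) / total
                      for nt in NUCLEOTIDES})
    return freqs


def depletion_scores(control, uncut, length=7):
    """log2(control frequency / uncut frequency) per position and nucleotide."""
    f_control = position_frequencies(control, length)
    f_uncut = position_frequencies(uncut, length)
    return [{nt: math.log2(f_control[i][nt] / f_uncut[i][nt])
             for nt in NUCLEOTIDES}
            for i in range(length)]


# Toy data (hypothetical reads): CAAACCC is absent from the uncut pool.
control_reads = ["CAAACCC", "CAAAGCC", "CTTTGGG", "GCCGTTA"]
uncut_reads = ["CAAAGCC", "CTTTGGG", "GCCGTTA"]
scores = depletion_scores(control_reads, uncut_reads)
print(scores[4]["C"] > 0)  # C at PAM position 5 is depleted from the uncut pool
```

With real sequencing data the read counts are large enough that the pseudocount has little effect; it is there only to avoid division by zero for nucleotides that are never observed.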

Analysis of the data indicates that positions 2, 3, 4 and 5 of the PAM are significant. Thus, the in vitro analysis made it possible to establish a putative PAM sequence for CoCas9: 5'-N(A/G)A(A/T)(C/A)-3'. This sequence is putative because screening approaches to PAM determination give imprecise results.

To confirm the nuclease activity of CoCas9, cutting reactions were carried out on DNA fragments containing the DNA target 5'-atctcctttcattgagcac-3' flanked by PAM sequences selected from the results of the PAM screening (5'-AACAACG-3', 5'-CAAACCC-3', 5'-CAAACAA-3', 5'-CAAACTA-3', 5'-CAAACAC-3', 5'-AAATCCA-3', 5'-AAAACCC-3', 5'-CAAACCG-3', 5'-AAAACTC-3', 5'-CAAACAG-3', 5'-AAAAACG-3', 5'-CAAAACC-3'; the PAM sequences are listed in order of decreasing recognition efficiency according to the results of the analysis).

CoCas9 cut most of the targets flanked by the selected PAMs; it cut 5'-AAAAACG-3' and 5'-CAAAACC-3', which had shown a lower fraction of cutting in the PAM screening, less efficiently (Fig. 6).

Next, to refine the PAM sequence, the significance of individual nucleotide positions was tested.

For this experiment, the PAM sequence 5'-CAAACCC-3' was chosen, on which CoCas9 had demonstrated high nuclease activity (Fig. 6).

The cutting reactions (Fig. 7 and Fig. 8) were carried out in vitro using DNA fragments containing the DNA target 5'-atctcctttcattgagcac-3' flanked by the PAM sequence 5'-CAAACCC-3' (or its derivatives): 5'-

All DNA cutting reactions were carried out under the following conditions:

1x CutSmart buffer

400 nM CoCas9

40 nM DNA

2 µM crRNA

2 µM tracrRNA

Incubation time: 30 minutes; reaction temperature: 37°C.

Substitution of the nucleotide at each PAM position (purine for pyrimidine and vice versa) showed that only positions 2, 3, 4 and 5 are significant, confirming the results of the PAM screening (Fig. 7).

Next, each nucleotide of the PAM was replaced with every possible nucleotide variant (Fig. 8). Replacing the insignificant positions 1, 6 and 7 of the PAM with any of the four possible nucleotides did not affect the efficiency of the protein (Fig. 8).

In positions 2 and 3, CoCas9 requires adenine or guanine; in position 4, it requires adenine or thymine. When the cytosine in position 5 was replaced, the protein practically ceased to function.

These data are consistent with the results of the PAM screening (Fig. 5).

The studies led to the following conclusion: the PAM recognized by the CoCas9 nuclease corresponds to the formula 5'-N(A/G)(A/G)(A/T)C-3' (5'-NRRWC-3').
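The 5'-NRRWC-3' formula can be applied to sequence data with a short Python sketch. It expands the IUPAC codes N, R and W into a regular expression, checks a flanking sequence against the PAM, and lists candidate sites consisting of a 20-nucleotide protospacer immediately followed by the PAM. The function names are illustrative assumptions, and only the strand passed in is scanned; searching the opposite strand would require running the same scan on the reverse complement.

```python
import re

# IUPAC codes needed for the NRRWC formula, expanded to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "W": "[AT]", "N": "[ACGT]"}


def pam_regex(pam: str) -> str:
    """Translate an IUPAC PAM formula such as 'NRRWC' into a regex fragment."""
    return "".join(IUPAC[code] for code in pam.upper())


PAM_PATTERN = re.compile(pam_regex("NRRWC"))


def matches_pam(flank: str) -> bool:
    """True if the first five nucleotides of the 3' flank fit 5'-NRRWC-3'."""
    return PAM_PATTERN.match(flank.upper()) is not None


def find_target_sites(dna: str, protospacer_len: int = 20):
    """List (start, protospacer, pam) for every site on the given strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    site = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex('NRRWC')}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in site.finditer(dna.upper())]


print(matches_pam("CAAACCC"))
print(find_target_sites("tatctcctttcattgagcac" + "CAAACCC"))
```

The formula is a consensus: as the cutting experiments above show, cleavage efficiency still varies between PAMs, so a match is best treated as a candidate site rather than a guarantee of cutting.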

In addition, the temperature optimum of the nuclease activity of the CoCas9 protein was studied (Fig. 9). The protein was shown to be active in the temperature range of 35-45°C.

The following examples of the implementation of the method are given to disclose the characteristics of the present invention and should not be construed as limiting the scope of the invention in any way.

Example 1. Use of a hybrid guide RNA to cut a DNA target.

sgRNA is a form of guide RNA in which the tracrRNA (tracer RNA) and the crRNA are fused into a single molecule. To select the optimal sgRNA, two variants of this sequence were constructed, differing in the length of the tracrRNA-crRNA duplex. The RNAs were synthesized in vitro and used in experiments on cutting the DNA target.

The following RNA sequences were used as hybrid RNAs:

1 - sgRNA1 28DR: UAUCUCCUUUCAUUGAGCACGUUGUGAAUUGCUUUCAAAUUUUGUAGUGAAAGUCGCACAAUUUGAAAGCAAUUCACAAUAAGGAUUAUUCCGUUGUGAAAACAUUUAAAGGAGCCCUAUCAUUAUAUUAGUGAUAGGGUUCUUUUUU (SEQ ID NO: 7);

2 - sgRNA2 35DR:

UAUCUCCUUUCAUUGAGCACGUUGUGAAUUGCUUUCAAAUUUUGUAGUUUUGCGAGAAAGUCGCACAAUUUGAAAGCAAUUCACAAUAAGGAUUAUUCCGUUGUGAAAACAUUUAAAGGAGCCCUAUCAUUAUAUUAGUGAUAGGGUUCUUUUUU (SEQ ID NO: 8).

The 20-nucleotide sequence that pairs with the DNA target (the variable part of the sgRNA) is shown in bold. In addition, the experiment included a control sample without RNA and a positive control in which the target was cut using crRNA + tracrRNA.

The target DNA was a sequence containing the recognition site 5'-tatctcctttcattgagcac-3' followed by a PAM matching the consensus, CAAACCC: 5'-

The recognition site is marked in bold, the PAM in capital letters.

The reaction was carried out under the following conditions: concentration of the DNA sequence containing the PAM (CAAACCC), 40 nM; protein concentration, 400 nM; RNA concentration, 2 μM; incubation time, 30 minutes; incubation temperature, 37°C.
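The stated concentrations imply a fixed molar excess of protein and RNA over the DNA target. The following minimal sketch works through that arithmetic. The reaction volume of 20 μL is a hypothetical value chosen only for illustration, since the description does not state one, and the function names are likewise illustrative.

```python
# Final concentrations stated in the text, all expressed in nM (2 uM = 2000 nM).
CONC_NM = {"target_dna": 40.0, "cocas9": 400.0, "sgrna": 2000.0}


def molar_ratios(conc_nm, reference="target_dna"):
    """Return each component's molar excess relative to the reference component."""
    ref = conc_nm[reference]
    return {name: value / ref for name, value in conc_nm.items()}


def picomoles(conc_nm, volume_ul):
    """Amount of each component in pmol for a given reaction volume.

    nM * uL = 1e-9 mol/L * 1e-6 L = 1e-15 mol, i.e. 1e-3 pmol.
    """
    return {name: value * volume_ul / 1000.0 for name, value in conc_nm.items()}


ratios = molar_ratios(CONC_NM)
# ASSUMPTION: 20 uL is a hypothetical reaction volume, not taken from the text.
amounts = picomoles(CONC_NM, volume_ul=20.0)

print(ratios)   # molar excess over the DNA target
print(amounts)  # pmol of each component in the assumed volume
```

Under these conditions the DNA : protein : RNA molar ratio is 1 : 10 : 50, so both the nuclease and the guide RNA are in large excess over the target.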

The selected sgRNA1 and sgRNA2 proved to be as effective as the native tracrRNA and crRNA sequences (Fig. 10).

These fusion RNA variants can be used to cut any other target DNA by changing the sequence that pairs directly with the target DNA.
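Retargeting therefore amounts to swapping the first 20 nucleotides of the sgRNA. The sketch below illustrates this under one assumption that should be read as an inference rather than a statement of the description: comparing SEQ ID NO: 7 and 8 with SEQ ID NO: 2 and 3 suggests that each sgRNA consists of the 20-nt spacer, the first 28 or 35 nt of the crRNA repeat, a GAAA linker, and the full tracrRNA. The function name `build_sgrna` is hypothetical.

```python
import re

# tracrRNA (SEQ ID NO: 2) and the crRNA repeat (SEQ ID NO: 3 without its
# 20 leading n's), copied from the sequence listing in 10-nt groups.
TRACR = (
    "gucgcacaau" "uugaaagcaa" "uucacaauaa" "ggauuauucc" "guugugaaaa"
    "cauuuaaagg" "agcccuauca" "uuauauuagu" "gauaggguuc" "uuuuuu"
).upper()
REPEAT = ("guugugaauu" "gcuuucaaau" "uuuguaguuu" "ugcgauugau" "aacaac").upper()
# ASSUMPTION: linker inferred by comparing SEQ ID NO: 7/8 with SEQ ID NO: 2/3.
LINKER = "GAAA"


def build_sgrna(spacer, repeat_len=28):
    """Join spacer + truncated repeat + linker + tracrRNA into one sgRNA."""
    spacer = spacer.upper().replace("T", "U")  # accept DNA or RNA letters
    if not re.fullmatch(r"[ACGU]{20}", spacer):
        raise ValueError("spacer must be exactly 20 nt of A/C/G/T/U")
    return spacer + REPEAT[:repeat_len] + LINKER + TRACR


# Spacer used in Example 1, in the sgRNA1 (28DR) layout.
original = build_sgrna("tatctcctttcattgagcac", repeat_len=28)
# Same scaffold in the sgRNA2 (35DR) layout, retargeted to the grin2b site
# given in Example 3.
retargeted = build_sgrna("CAGCTGAAGTAATGTTAGAG", repeat_len=35)

print(len(original), original)
print(len(retargeted), retargeted)
```

Only the spacer argument changes between targets; the scaffold (repeat, linker, tracrRNA) stays constant.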

Example 2. Cas9 proteins from organisms closely related to Capnocytophaga ochracea.

To date, no enzyme of the CRISPR-Cas9 system has been characterized in Capnocytophaga ochracea. The Cca1 protein from Capnocytophaga canis, which is comparable in size and is also a Cas9 protein, is 66.34% identical to CoCas9 (Fig. 11; the degree of identity was calculated with BLASTp using default parameters). A considerable share of the differences between the nucleases falls in the domain that interacts with the PAM sequence (the two domains are 67% identical). The differences in these domains cause the nucleases to interact with different PAMs (the Cca1 PAM is 5'-BRTTTTT-3').
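For readers unfamiliar with how an identity figure such as 66.34% arises, the sketch below computes percent identity from an already aligned pair of sequences as identical columns divided by alignment length. This is a simplified stand-in and not the BLASTp procedure itself, which also builds the alignment. The second sequence in the usage example is hypothetical; the first is the N-terminus of SEQ ID NO: 1.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical columns divided by alignment length, as a percentage."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


cocas9_nterm = "MKNILGLDLG"   # first 10 residues of SEQ ID NO: 1
toy_homolog = "MKKILG-DIG"    # HYPOTHETICAL aligned sequence, for illustration only

identity = percent_identity(cocas9_nterm, toy_homolog)
print(f"{identity:.2f}% identical")
```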

Thus, the CoCas9 protein differs substantially in amino acid sequence from the other Cas9 proteins studied to date.

It is obvious to a person skilled in genetic engineering that the CoCas9 protein sequence variant obtained and characterized in this Description can be altered without changing the function of the protein itself, for example by directed mutagenesis of amino acid residues that do not directly affect functional activity (Sambrook et al., Molecular Cloning: A Laboratory Manual, (1989), CSH Press, pp. 15.3-15.108). In particular, the skilled person knows that non-conserved amino acid residues can be changed without affecting the residues that determine the functionality of the protein (its function or structure). Examples of such changes are substitutions of non-conserved amino acid residues with homologous ones. In some embodiments of the invention, a protein comprising an amino acid sequence that is at least 95% identical to SEQ ID NO: 1 and differs from SEQ ID NO: 1 only in non-conserved amino acid residues can be used to form a double-strand break in a DNA molecule, located immediately before the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3' in that DNA molecule. Homologous proteins can be obtained by mutagenesis (e.g., site-directed or PCR-mediated mutagenesis) of the corresponding nucleic acid molecules, followed by testing of the encoded modified Cas9 protein for retention of its functions using the functional assays described herein.

Example 3. The CoCas9 system described in the present invention, in complex with guide RNAs, can be used to change the genomic DNA sequence of a multicellular organism, including a eukaryotic one. Various approaches known to those skilled in the art can be applied to introduce the CoCas9 system in complex with guide RNAs into the cells of this organism (into all cells or into some of them). For example, methods for delivering CRISPR-Cas9 systems into the cells of organisms are disclosed in (Liu C et al., Delivery strategies of the CRISPR-Cas9 gene-editing system for therapeutic applications. J Control Release. 2017 Nov 28;266:17-26; Lino CA et al., Delivering CRISPR: a review of the challenges and approaches. Drug Deliv. 2018 Nov;25(1):1234-1257) and in the references cited therein.

For efficient expression of the CoCas9 nuclease in eukaryotic cells, it is desirable to perform codon optimization for the amino acid sequence of the CoCas9 protein by methods known to those skilled in the art (e.g., the IDT codon optimization tool).

For the CoCas9 nuclease to work efficiently in eukaryotic cells, the protein must be imported into the nucleus of the eukaryotic cell. This can be achieved using the nuclear localization signal from the SV40 T antigen (Lanford et al., Cell, 1986, 46: 575-582), joined to the CoCas9 sequence with or without the spacer sequence described in Shen B, et al. "Generation of gene-modified mice via Cas9/RNA-mediated gene targeting", Cell Res. 2013 May;23(5):720-3. The complete amino acid sequence of the nuclease transported into the nucleus of a eukaryotic cell is thus the following: MAPKKKRKVGIHGVPAA-CoCas9-KRPAATKKAGQAKKKK (hereinafter CoCas9 NLS). At least two approaches can be used to deliver a protein with this amino acid sequence.
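The layout of the CoCas9 NLS construct can be sketched as a simple concatenation. The snippet below converts three-letter residues from the sequence listing into one-letter code and attaches the N- and C-terminal segments given above. It is illustrative only: the helper names are hypothetical, and only the first 16 residues of SEQ ID NO: 1 are used in place of the full 1426-residue protein.

```python
N_TERM_TAG = "MAPKKKRKVGIHGVPAA"  # N-terminal segment given in the text
C_TERM_TAG = "KRPAATKKAGQAKKKK"   # C-terminal segment given in the text

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing):
    """Convert 'Met Lys Asn ...' to 'MKN...', skipping position numbers."""
    return "".join(THREE_TO_ONE[tok] for tok in listing.split() if tok in THREE_TO_ONE)


def add_nls(core):
    """Flank a core protein sequence with the two segments from the text."""
    return N_TERM_TAG + core + C_TERM_TAG


# First line of SEQ ID NO: 1 only; the full protein would be passed the same way.
first_line = "Met Lys Asn Ile Leu Gly Leu Asp Leu Gly Thr Thr Ser Ile Gly Phe 16"
core_fragment = three_to_one(first_line)
fused = add_nls(core_fragment)

print(core_fragment)
print(fused)
```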

Delivery as a gene is accomplished by constructing a plasmid carrying the CoCas9 NLS gene under the control of a promoter (e.g., the CMV promoter) and a sequence encoding the guide RNAs under the control of the U6 promoter. DNA sequences flanked by 5'-N(A/G)(A/G)(A/T)C-3' are used as the DNA targets, for example the following sequence of the human grin2b gene:

5'-CAGCTGAAGTAATGTTAGAG-3'

Thus, the sgRNA expression cassette is as follows:

gagggcctatttcccatgattccttcatatttgcatatacgatacaaggctgttagagagataattggaattaatttgactgtaaacacaaagatattagtacaaaatacgtgacgtagaaagtaataatttcttgggtagtttgcagttttaaaattatgttttaaaatggactatcatatgcttaccgtaacttgaaagtatttcgatttcttggctttatatatcttgtggaaaggacgaaacaccg CAGCTGAAGTAATGTTAGAGGTTGTGAATTGCTTTCAAATTTTGTAGTGAAAGTCGCACAATTTGAAAGCAATTCACAATAAGGATTATTCCGTTGTGAAAACATTTAAAGGAGCCCTATCATTATATTAGTGATAGGGTTCTTTTTT (SEQ ID NO: 10).

The U6 promoter sequence is shown in bold, followed by the sequence required for recognition of the target DNA (capital letters) and then by the sequence that forms the sgRNA structure (capital letters in bold).

Plasmid DNA is purified and transfected into human HEK293 cells using Lipofectamine 2000 reagent (Thermo Fisher Scientific). The cells are incubated for 72 hours, after which genomic DNA is isolated from them using genomic DNA purification columns (Thermo Fisher Scientific). The target DNA site is analyzed by sequencing on the Illumina platform to determine the number of insertions and deletions (indels) that arise at the target site as a result of the directed double-strand break and its subsequent repair.

To amplify the target fragments, primers flanking the expected break site are used.

After amplification, the samples are prepared for high-throughput sequencing according to the protocol of the Ultra II DNA Library Prep Kit for Illumina (NEB). Sequencing is then performed on the Illumina platform (300 cycles, forward read). The sequencing results are analyzed by bioinformatic methods. An insertion or deletion of several nucleotides in the target DNA sequence is taken as evidence of cleavage.
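The description does not specify the bioinformatic method. As a deliberately simplified illustration of the criterion above, the sketch below classifies amplicon reads by comparing their length with the unedited reference. It assumes that every read spans the whole amplicon and has been trimmed to the primer boundaries, and it cannot see substitutions or edits that preserve length; a real analysis would align the reads to the reference. The reference and reads are hypothetical toy data.

```python
from collections import Counter


def indel_frequency(reference, reads):
    """Classify full-length amplicon reads by length difference from the reference.

    Returns a Counter of categories and the percentage of reads with an indel.
    """
    counts = Counter()
    for read in reads:
        delta = len(read) - len(reference)
        if delta == 0:
            counts["unmodified"] += 1
        elif delta > 0:
            counts["insertion"] += 1
        else:
            counts["deletion"] += 1
    total = sum(counts.values())
    edited = counts["insertion"] + counts["deletion"]
    percent = 100.0 * edited / total if total else 0.0
    return counts, percent


# HYPOTHETICAL toy amplicon and reads, for illustration only.
reference = "GGTACCAGCTGAAGTAATGTTAGAGCAAACGGT"
reads = [
    reference,                               # unmodified
    reference,                               # unmodified
    reference[:12] + reference[15:],         # 3-nt deletion
    reference[:14] + "A" + reference[14:],   # 1-nt insertion
]

counts, percent = indel_frequency(reference, reads)
print(dict(counts), f"{percent:.1f}% reads with indels")
```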

Delivery as a ribonucleoprotein complex is accomplished by incubating the recombinant form of CoCas9 NLS with the guide RNAs in CutSmart buffer (NEB). The recombinant protein is obtained from bacterial producer cells and purified by affinity chromatography (NiNTA, Qiagen) and size-exclusion chromatography (Superdex 200).

The protein is mixed with the RNA at a ratio of 1:2 (CoCas9 NLS : sgRNA) and incubated for 10 minutes at room temperature, after which the mixture is transfected into the cells.

The DNA extracted from the cells is then analyzed for indels at the target DNA site (as described above).

The CoCas9 nuclease from the bacterium Capnocytophaga ochracea characterized in the present invention has a number of advantages over previously characterized Cas9 proteins.

CoCas9 has a short PAM motif, required for the system to function, that differs from those of other known Cas nucleases.

Most of the Cas nucleases known to date that can introduce double-strand breaks in DNA have complex multi-letter PAM sequences, which limit the choice of sequences suitable for cutting. Among the studied Cas nucleases that recognize short PAMs, only CoCas9 can recognize sequences flanked by the NRRWC motif.
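To make the NRRWC constraint concrete, the following sketch expands the motif using the standard IUPAC codes (N = any base, R = A/G, W = A/T) and scans one strand of a DNA sequence for 20-nt sites immediately followed by a matching PAM. The function name is hypothetical, and the scan covers only the strand as given; sites on the opposite strand would require scanning the reverse complement as well.

```python
import re

# Standard IUPAC nucleotide codes needed for the NRRWC motif.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "W": "[AT]"}
PAM = "NRRWC"


def find_targets(dna, spacer_len=20, pam=PAM):
    """Return (start, protospacer, pam) for every site on the given strand
    where a spacer_len-nt protospacer is immediately followed by the PAM."""
    pam_regex = "".join(IUPAC[c] for c in pam)
    # A lookahead keeps the match zero-width so overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


# Recognition site and PAM region from Example 1.
example = "tatctcctttcattgagcac" + "CAAACCC"
hits = find_targets(example)
for start, protospacer, pam_seq in hits:
    print(start, protospacer, pam_seq)
```

In the Example 1 target, the first five bases of the PAM region, CAAAC, satisfy NRRWC, so the scan reports the recognition site at position 0.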

CoCas9 is a new Cas nuclease with an easy-to-use PAM that differs from the PAM sequences of other nucleases known to date. The CoCas9 protein cuts various DNA targets with high efficiency, including at 37°C, and can become the basis of a new genome editing tool.

Although the invention has been described with reference to the disclosed embodiments, it should be apparent to those skilled in the art that the specific cases described in detail are given solely to illustrate the present invention and should not be construed as limiting its scope in any way. It should be understood that various modifications are possible without departing from the essence of the present invention.


<110> Federal State Budgetary Institution of Science Institute of Gene Biology of the Russian Academy of Sciences (Institute of Gene Biology Russian Academy of Sciences)

<120> DNA-cutting agent based on the Cas9 protein from the bacterium Capnocytophaga ochracea

<160> 10

<210> 1

<211> 1426

<212> PRT

<213> Capnocytophaga ochracea

<400> 1

Met Lys Asn Ile Leu Gly Leu Asp Leu Gly Thr Thr Ser Ile Gly Phe 16

5 10 15

Ala His Ile Val Glu Asp Glu Asn Lys Glu Lys Ser Glu Ile Lys Glu 32

20 25 30

Leu Gly Val Arg Ile Val Ser Leu Thr Thr Asp Glu Gln Ser Asp Phe 48

35 40 45

Glu Lys Gly Lys Ser Ile Thr Thr Asn Ala Asn Arg Thr Leu Lys His 64

50 55 60

Gly Ala Arg Leu Asn Leu Asp Arg Tyr Gln Gln Arg Arg Lys Tyr Leu 80

65 70 75 80

Ile Asp Leu Leu Gln Lys Ala Asn Leu Ile Thr Pro Ser Ser Ile Leu 96

85 90 95

Ala Glu Asn Gly Lys Asn Thr Thr His Ser Thr Trp Gln Leu Arg Ala 112

100 105 110

Lys Ala Val Thr Glu Arg Ile Glu Lys Glu Glu Phe Ala Arg Ile Leu 128

115 120 125

Leu Ala Ile Asn Lys Lys Arg Gly Tyr Lys Ser Ser Arg Lys Ala Lys 144

130 135 140

Thr Glu Asp Glu Gly Gln Ala Ile Asp Gly Met Ala Ile Ala Lys Arg 160

145 150 155 160

Leu Tyr Asp Glu Asn Leu Thr Pro Gly Gln Leu Ser Leu Gln Leu Leu 176

165 170 175

Gln Gln Asn Lys Lys Leu Leu Pro Asp Phe Tyr Arg Ser Asp Leu Gln 192

180 185 190

Lys Glu Phe Asp Leu Val Trp Asn Phe Gln Lys Gln Phe Tyr Pro Asp 208

195 200 205

Ile Leu Thr Asp Ile Phe Tyr Lys Glu Leu Gln Gly Lys Gly Lys Asp 224

210 215 220

Ala Thr Ser Lys Ala Phe Ser Lys Arg Tyr His Phe Asp Thr Thr Glu 240

225 230 235 240

Asn Lys Gly Ser Lys Glu Ser Val Arg Leu Gln Ala Tyr Gln Trp Arg 256

245 250 255

Ala Glu Ala Ile Ser Lys Gln Leu Ser Lys Glu Glu Val Ala Tyr Val 272

260 265 270

Leu Thr Glu Ile Asn Asn Asn Leu Asn Asn Ala Ser Gly Tyr Leu Gly 288

275 280 285

Ala Ile Ser Asp Arg Ser Lys Glu Leu Tyr Phe Asn Arg Gln Thr Val 304

290 295 300

Gly Gln Tyr Leu Tyr Ala Lys Leu Gln Glu Asn Arg His Asn Ser Leu 320

305 310 315 320

Lys Asn Lys Val Phe Tyr Arg Gln Asp Tyr Leu Asp Glu Phe Glu Arg 336

325 330 335

Ile Trp Glu Thr Gln Ala Ser Phe His Lys Glu Leu Thr Asp Glu Leu 352

340 345 350

Lys Lys Gln Ile Arg Asp Val Val Ile Phe Tyr Gln Arg Lys Pro Lys 368

355 360 365

Ser Gln Lys Gly Leu Ile Ser Phe Cys Glu Phe Glu Ser Lys Glu Ile 384

370 375 380

Glu Ile Glu Lys Asp Gly Lys Thr Ile Thr Lys Asn Ile Gly Ala Arg 400

385 390 395 400

Val Val Pro Lys Ser Ser Pro Leu Phe Gln Glu Phe Lys Ile Trp Gln 416

405 410 415

Ile Leu Asn Asn Val Ile Cys Lys Arg Lys Gly Ile Arg Lys Lys Lys 432

420 425 430

Ile Ser Ala Lys Thr Thr Gln Leu Asp Leu Leu Asn Glu Ser Ser Gln 448

435 440 445

Thr Ile Phe Ser Leu Asp Met Glu Cys Lys Gln Leu Leu Phe Asp Glu 464

450 455 460

Leu Asn Leu Lys Gly Asp Leu Lys Ser Asp Lys Val Leu Lys Leu Leu 480

465 470 475 480

Gly Tyr Ser Pro Gln Glu Trp Glu Ile Asn Tyr Asn Gln Leu Glu Gly 496

485 490 495

Asn Arg Thr Gln Lys Ala Leu Tyr Glu Ala Tyr Leu Lys Ile Val Glu 512

500 505 510

Met Glu Ala His Asp Val Lys Asp Ile Leu Gln Ile Lys Ser Ala Lys 528

515 520 525

Asp Asp Trp Ser Leu Asp Glu Ser Pro Leu Ser Ala Ser Glu Ile Arg 544

530 535 540

Glu Lys Val Lys Ala Ile Phe Gln Thr Leu Gly Ile Cys Thr Lys Ile 560

545 550 555 560

Leu Tyr Phe Asp Pro Leu Leu Pro Val Lys Glu Phe Glu Glu Gln Asp 576

565 570 575

Ser Tyr Gln Leu Trp His Leu Leu Tyr Ser Tyr Glu Ser Asp Asp Ser 592

580 585 590

Thr Ser Gly Asn Glu Thr Leu Tyr Arg Ile Leu Glu Lys Lys Tyr Ala 608

595 600 605

Phe Lys Arg Glu His Ala Arg Ile Leu Ala Asn Val Ala Leu Gln Asp 624

610 615 620

Asp Tyr Gly Ser Leu Ser Thr Lys Ala Ile Arg Lys Ile Tyr Pro Asn 640

625 630 635 640

Ile Lys Glu Asn Gln Tyr Ser Thr Ala Cys Glu Lys Ala Gly Tyr Lys 656

645 650 655

His Ser Lys Leu Ser Leu Thr Thr Glu Glu Leu Glu Ala Arg Glu Leu 672

660 665 670

Lys Asn Ile Ile Pro Leu Leu Lys Lys Asn Ala Leu Arg Asn Pro Val 688

675 680 685

Val Glu Lys Ile Leu Asn Gln Met Ile Asn Val Val Asn Ala Leu Ile 704

690 695 700

Glu Lys Asn Ser Glu Arg Asp Ala Glu Gly Lys Ile Thr Lys Tyr Phe 720

705 710 715 720

His Phe Asp Glu Ile Arg Ile Glu Leu Ala Arg Glu Leu Lys Lys Asn 736

725 730 735

Ala Gln Lys Arg Tyr Glu Met Thr Gln Asn Ile Asn Lys Ala Lys Leu 752

740 745 750

Glu His Gln Lys Ile Ser Glu Ile Leu Gln Lys Glu Phe Gly Ile Lys 768

755 760 765

Asn Pro Thr Lys Ser Asp Ile Ile Arg Tyr Arg Leu Tyr Gln Glu Leu 784

770 775 780

Glu His Asn Gly Tyr Lys Glu Leu Tyr Thr Asn Ala Pro Ile Ala Arg 800

785 790 795 800

Asp Met Leu Phe Ser Lys Asn Ile Glu Ile Glu His Ile Val Pro Lys 816

805 810 815

Ala Arg Val Phe Asp Asp Ser Phe Ser Asn Lys Thr Leu Thr Phe His 832

820 825 830

Arg Ile Asn Ser Asp Lys Gly Glu Tyr Thr Ala Phe Asp Tyr Ile Thr 848

835 840 845

Ser Leu Asn Ser Glu Glu Glu Leu Asn Gln Tyr Leu Thr Arg Val Glu 864

850 855 860

Asn Ala Tyr Lys Thr Lys Ser Ile Ser Pro Thr Lys Tyr Lys Asn Leu 880

865 870 875 880

Leu Lys Lys Ala Ser Glu Ile Gly Asp Asp Phe Ile Asn Arg Asp Leu 896

885 890 895

Arg Asp Thr Gln Tyr Ile Ala Lys Lys Ala Lys Glu Ile Leu Phe Gln 912

900 905 910

Val Thr Lys Asn Val Leu Ser Thr Ser Gly Ser Ile Thr Asp Arg Leu 928

915 920 925

Arg Glu Asp Trp Gly Leu Val Asp Val Met Lys Glu Leu Asn Met Pro 944

930 935 940

Lys Tyr Gln Ser Leu Gly Leu Thr Glu Val Glu Glu Arg Lys Asp Gly 960

945 950 955 960

Asn Lys Val Thr Val Ile Lys Asn Trp Thr Lys Arg Asn Asp His Arg 976

965 970 975

His His Ala Met Asp Ala Leu Thr Val Ala Phe Thr Lys Pro Ser Tyr 992

980 985 990

Ile Gln Tyr Leu Asn His Leu Asn Ala Arg Lys Asp Glu Asn Asn Lys 1008

995 1000 1005

Asn Tyr Ser Val Ile Leu Ala Ile Glu Glu Lys Glu Thr Ile Lys Val 1024

1010 1015 1020

Pro Thr Asn Asn Gly Lys Asn Lys Arg Val Phe Ile Glu Pro Ile Pro 1040

1025 1030 1035 1040

Asn Phe Arg Gln Val Ala Lys Lys His Leu Glu Glu Ile Phe Ile Ser 1056

1045 1050 1055

His Lys Ala Lys Asn Lys Val Val Thr Lys Asn Thr Asn Lys Pro Ala 1072

1060 1065 1070

Gly Thr Asp Lys Gln Gln Ile Thr Leu Thr Pro Arg Gly Gln Leu His 1088

1075 1080 1085

Lys Glu Thr Ile Tyr Gly Lys Tyr Gln Tyr Tyr Ile Asn Lys Glu Glu 1104

1090 1095 1100

Lys Ile Gly Val Lys Phe Asp Glu Arg Thr Ile Ala Lys Val Ser Asn 1120

1105 1110 1115 1120

Pro Val Tyr Arg Glu Ala Leu Leu Lys Arg Leu Gln Ala Asn Asp Asn 1136

1125 1130 1135

Asp Pro Lys Lys Ala Phe Ala Gly Lys Asn Ala Leu Ser Lys Asn Pro 1152

1140 1145 1150

Ile Tyr Leu Asp Glu Ser Lys Thr Lys Thr Leu Pro Glu Lys Val Asn 1168

1155 1160 1165

Leu Thr Tyr Leu Glu Glu Asp Phe Ser Ile Arg Lys Asp Ile Ser Pro 1184

1170 1175 1180

Asp Asn Phe Lys Asp Leu Lys Ser Ile Glu Lys Val Ile Asp Gln Gly 1200

1185 1190 1195 1200

Val Lys Arg Ile Leu Ile Lys Arg Leu Gln Ala Tyr Asp Asn Asp Pro 1216

1205 1210 1215

Lys Lys Ala Phe Val Asp Leu Glu Lys Asn Pro Ile Trp Leu Asn Lys 1232

1220 1225 1230

Glu Lys Gly Ile Ala Ile Lys Arg Val Thr Ile Ser Gly Val Asn Asn 1248

1235 1240 1245

Ala Gln Pro Leu His Ile Gly Lys Asp His Leu Gly Lys Thr Thr Leu 1264

1250 1255 1260

Asn Lys Glu Gly Lys Glu Ile Pro Val Asp Tyr Val Ser Thr Gly Asn 1280

1265 1270 1275 1280

Asn His His Val Ala Ile Tyr Arg Asp Lys Glu Gly Asn Leu Gln Glu 1296

1285 1290 1295

Gln Ile Val Ser Phe Phe Asp Ala Val Val Arg Ala Gln Gln Gly Ile 1312

1300 1305 1310

Pro Ile Ile Asp Lys Thr Tyr Lys Gln Ala Glu Gly Trp Gln Phe Leu 1328

1315 1320 1325

Phe Thr Met Lys Gln Asn Glu Met Phe Val Phe Pro Asn Ala Thr Thr 1344

1330 1335 1340

Gly Phe Asn Pro Ala Glu Ile Asp Leu Leu Asp Pro Lys Asn Lys Lys 1360

1345 1350 1355 1360

Leu Ile Ser Pro Asn Leu Phe Arg Val Gln Lys Ile Ala Thr Lys Asp 1376

1365 1370 1375

Tyr Phe Phe Arg His His Leu Glu Thr Asn Val Glu Thr Asp Asn Ile 1392

1380 1385 1390

Leu Lys Asn Val Thr Trp Lys Arg Glu Gly Leu Ser Gly Leu Lys Asp 1408

1395 1400 1405

Ile Val Lys Val Arg Ile Asn His Leu Gly Asp Ile Val Ser Ile Gly 1424

1410 1415 1420

Glu Tyr 1426

1425

<210> 2

<211> 96

<212> RNA

<213> artificial sequence

<220>

<223> CoCas9 tracrRNA

<400> 2

gucgcacaau uugaaagcaa uucacaauaa ggauuauucc guugugaaaa cauuuaaagg 60

agcccuauca uuauauuagu gauaggguuc uuuuuu 96

<210> 3

<211> 66

<212> RNA

<213> artificial sequence

<220>

<223> CoCas9 crRNA

<400> 3

nnnnnnnnnn nnnnnnnnnn guugugaauu gcuuucaaau uuuguaguuu ugcgauugau 60

aacaac 66

<210> 4

<211> 374

<212> DNA

<213> artificial sequence

<220>

<223> DNA library

<400> 4

cccggggtac cacggagaga tggtggaaat catctttctc gtgggcatcc ttgatggcca 60

cctcgtcgga agtgcccacg aggatgacag caatgccaat gctggggggg ctcttctgag 120

aacgagctct gctgcctgac acggccagga cggccaacac caaccagaac ttgggagaac 180

agcactccgc tctgggcttc atcttcaact cgtcgactcc ctgcaaacac aaagaaagag 240

catgttaaaa taggatctac atcacgtaac ctgtcttaga agaggctaga tactgcaatt 300

caaggacctt atctcctttc attgagcacN NNNNNNaact ccatctacca gcctactctc 360

ttatctctgg tatt 374

<210> 5

<211> 65

<212> RNA

<213> artificial sequence

<220>

<223> crRNA

<400> 5

uaucuccuuu cauugagcac guugugaauu gcuuucaaau uuuguaguuu ugcgauugau 60

aacaa 65

<210> 6

<211> 374

<212> DNA

<213> artificial sequence

<220>

<223> DNA library containing the target DNA 5'-atctcctttcattgagcac-3' flanked by the PAM sequence 5'-CAAACCC-3'

<400> 6

caaggacctt atctcctttc attgagcacC AAACCCaact ccatctacca gcctactctc 360

ttatctctgg tatt 374

<210> 7

<211> 148

<212> RNA

<213> artificial sequence

<220>

<223> sgRNA1 28DR

<400> 7

uaucuccuuu cauugagcac guugugaauu gcuuucaaau uuuguaguga aagucgcaca 60

auuugaaagc aauucacaau aaggauuauu ccguugugaa aacauuuaaa ggagcccuau 120

cauuauauua gugauagggu ucuuuuuu 148

<210> 8

<211> 155

<212> RNA

<213> artificial sequence

<220>

<223> sgRNA2 35DR

<400> 8

uaucuccuuu cauugagcac guugugaauu gcuuucaaau uuuguaguuu ugcgagaaag 60

ucgcacaauu ugaaagcaau ucacaauaag gauuauuccg uugugaaaac auuuaaagga 120

gcccuaucau uauauuagug auaggguucu uuuuu 155

<210> 9

<211> 375

<212> DNA

<213> artificial sequence

<220>

<223> DNA library containing the recognition site 5'-tatctcctttcattgagcac-3' with the consensus-matching PAM CAAACCC

<400> 9

caaggacctt atctcctttc attgagcacC AAACCCcaac tccatctacc agcctactct 360

cttatctctg gtatt 375

<210> 10

<211> 398

<212> DNA

<213> artificial sequence

<220>

<223> sgRNA expression cassette

<400> 10

gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60

ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120

aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180

atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240

cgaaacaccg cagctgaagt aatgttagag gttgtgaatt gctttcaaat tttgtagtga 300

aagtcgcaca atttgaaagc aattcacaat aaggattatt ccgttgtgaa aacatttaaa 360

ggagccctat cattatatta gtgatagggt tctttttt 398

<---
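As an illustrative cross-check that is not part of the filed listing, the two single guide RNAs (SEQ ID NO: 7 and SEQ ID NO: 8) appear to be composed of elements listed separately above: the 20-nt spacer of SEQ ID NO: 5, the first 28 or 35 nucleotides of the direct repeat from SEQ ID NO: 3, a four-nucleotide gaaa linker, and the 96-nt tracrRNA of SEQ ID NO: 2. The gaaa linker is inferred here by comparing the sequences and is not named in the listing; the helper names `clean` and `build_sgrna` are hypothetical.

```python
def clean(blocks: str) -> str:
    """Join the space-separated 10-nt blocks of a listing line into one string."""
    return "".join(blocks.split())


# SEQ ID NO: 2 (tracrRNA), copied from the listing.
TRACR = clean(
    "gucgcacaau uugaaagcaa uucacaauaa ggauuauucc guugugaaaa cauuuaaagg "
    "agcccuauca uuauauuagu gauaggguuc uuuuuu"
)

# Direct-repeat part of SEQ ID NO: 3 (everything after the 20 'n' positions).
DIRECT_REPEAT = clean("guugugaauu gcuuucaaau uuuguaguuu ugcgauugau aacaac")

# Spacer shared by SEQ ID NO: 5, 7 and 8.
SPACER = clean("uaucuccuuu cauugagcac")

# Assumption: the linker is inferred by sequence comparison, not stated in the listing.
LINKER = "gaaa"


def build_sgrna(spacer: str, repeat_len: int) -> str:
    """Concatenate spacer, truncated direct repeat, linker and tracrRNA."""
    return spacer + DIRECT_REPEAT[:repeat_len] + LINKER + TRACR


SGRNA1 = build_sgrna(SPACER, 28)  # compare with SEQ ID NO: 7 (sgRNA1 28DR)
SGRNA2 = build_sgrna(SPACER, 35)  # compare with SEQ ID NO: 8 (sgRNA2 35DR)

print(len(SGRNA1), len(SGRNA2))
```

Under these assumptions the assembled lengths are 148 and 155 nucleotides, which agrees with the <211> fields of SEQ ID NO: 7 and SEQ ID NO: 8.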

Claims

1. Use of a protein comprising the amino acid sequence of SEQ ID NO: 1, or comprising an amino acid sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and differs from SEQ ID NO: 1 only in non-conserved amino acid residues, for forming a double-strand break in a DNA molecule, the break being located immediately before the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3' in said DNA molecule.

2. The use according to claim 1, characterized in that the double-strand break in the DNA molecule is formed at a temperature of 35°C to 45°C.

3. The use of the protein according to claim 1, wherein the protein comprises the amino acid sequence of SEQ ID NO: 1.

4. A method for altering the genomic DNA sequence of a unicellular or multicellular organism, comprising introducing into at least one cell of said organism an effective amount of: a) either a protein comprising the amino acid sequence of SEQ ID NO: 1, or a nucleic acid encoding a protein comprising the amino acid sequence of SEQ ID NO: 1, and b) either a guide RNA comprising a sequence that forms a duplex with the nucleotide sequence of a region of the organism's genomic DNA immediately adjacent to the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3', the guide RNA interacting with said protein after formation of the duplex, or a DNA sequence encoding said guide RNA; wherein the interaction of said protein with the guide RNA and the nucleotide sequence 5'-N(A/G)(A/G)(A/T)C-3' results in the formation of a double-strand break in the genomic DNA sequence immediately adjacent to the sequence 5'-N(A/G)(A/G)(A/T)C-3'.

5. The method according to claim 4, further comprising introducing an exogenous DNA sequence simultaneously with the guide RNA.
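The PAM requirement recited in claims 1 and 4 can be illustrated with a short sketch that scans one strand of a DNA string for 5'-N(A/G)(A/G)(A/T)C-3' and reports the nucleotides immediately upstream as a candidate target. The function name `find_candidate_sites`, the 20-nt target length (taken from the spacer length in SEQ ID NO: 3) and the restriction to a single strand are assumptions made for illustration; they are not part of the claims.

```python
import re

# PAM from claims 1 and 4: 5'-N(A/G)(A/G)(A/T)C-3'.
# A lookahead is used so that overlapping PAM occurrences are all reported.
PAM_PATTERN = re.compile(r"(?=([ACGT][AG][AG][AT]C))")

# Assumption: 20-nt target, matching the 20 'n' positions of SEQ ID NO: 3.
TARGET_LEN = 20


def find_candidate_sites(dna: str, target_len: int = TARGET_LEN):
    """Return (target_start, target, pam) for each PAM with enough upstream sequence.

    Only the given strand is scanned; the reverse complement is not considered.
    """
    dna = dna.upper()
    sites = []
    for match in PAM_PATTERN.finditer(dna):
        pam_start = match.start()
        if pam_start >= target_len:
            target = dna[pam_start - target_len:pam_start]
            sites.append((pam_start - target_len, target, match.group(1)))
    return sites


# Fragment of SEQ ID NO: 6 (listing positions 301-360), spaces removed.
FRAGMENT = "".join(
    "caaggacctt atctcctttc attgagcacC AAACCCaact ccatctacca gcctactctc".split()
)

for start, target, pam in find_candidate_sites(FRAGMENT):
    print(start, target, pam)
```

Among the reported sites is the target TATCTCCTTTCATTGAGCAC followed by CAAAC, the first five nucleotides of the 5'-CAAACCC-3' PAM described for SEQ ID NO: 6. A practical search would also scan the reverse-complement strand.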