CN116751763B - A Cpf1 protein, type V gene editing system, and uses thereof
- Publication number: CN116751763B
- Application number: CN202310510289.0A
- Authority: CN (China)
- Prior art keywords: gene editing, sequence, protein, cpf1, crispr
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
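The invention belongs to the technical field of genetic engineering and discloses a Cpf1 protein, a type V CRISPR/Cas12a gene editing system, and uses thereof. The invention provides a Cpf1 protein and a nucleotide sequence encoding the Cpf1 protein. The invention provides a type V CRISPR/Cas12a gene editing system comprising the Cpf1 protein, auxiliary proteins, and a CRISPR array. The invention applies the type V CRISPR/Cas12a gene editing system to gene editing in prokaryotes or eukaryotes and to the preparation of gene editing formulations. The gene editing system of the invention expands the range of gene editing tools, enriches the PAM diversity of existing Cpf1-based gene editing tools, provides more tool options for the clinical use of Cpf1, and plays an important role in advancing the clinical application of gene editing.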
Description
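Technical Field
The invention relates to the technical field of genetic engineering, and in particular to a Cpf1 protein, a type V CRISPR/Cas12a gene editing system, and uses thereof.
Background
The microbial adaptive immune system CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated proteins system) helps bacteria and archaea defend against invading foreign nucleic acids. A CRISPR/Cas system contains direct repeats (DR), which are separated by unique spacers derived from foreign DNA. The CRISPR array is transcribed into a long transcript (pre-crRNA, the precursor of CRISPR RNA), which is then processed into small mature CRISPR RNAs (crRNAs), each consisting of a spacer and part of the adjacent direct repeat. The crRNA forms a complex with a Cas endonuclease, and in some cases also with auxiliary Cas proteins, and serves as a guide for targeting and cleaving foreign nucleic acids, thereby achieving interference. DNA recognition by the Cas-crRNA complex requires a protospacer adjacent motif (PAM) near the target site, which helps distinguish self from non-self. CRISPR/Cas systems are broadly divided into two classes according to the number of effector proteins: Class 1 systems use a complex of multiple Cas proteins, such as Cascade, whereas Class 2 systems use a single effector enzyme, such as Cas9 or Cas12.
However, Cas9 proteins in CRISPR/Cas9 systems are generally large, at about 1300 amino acids, while the AAV vectors commonly used for in vivo delivery can package at most about 4.5 kb. Besides the Cas protein, the vector must also carry tracrRNA and other functional elements that are indispensable for gene editing. This constraint means that most Cas9 proteins cannot be packaged and delivered, which makes them difficult to apply. In addition, Cas9 recognizes G-rich PAMs, so some specific target sites cannot be edited and the targeting range in mammalian genomes is limited. Developing new genome editing systems that open up more possibilities for gene editing applications is therefore of great significance.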
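The packaging constraint described above is simple arithmetic, and a small Python sketch makes it concrete. The 4.5 kb limit and the roughly 1300 amino acid protein size come from the text; the function names and the choice to count one stop codon are assumptions made for illustration, and promoters, guide RNA cassettes, and other regulatory elements are deliberately left out.

```python
# Rough payload check for delivering a Cas nuclease in a single AAV vector.
# AAV_CAPACITY_BP follows the approximate 4.5 kb limit given in the text.
AAV_CAPACITY_BP = 4500


def coding_length_bp(n_amino_acids: int) -> int:
    """Open reading frame length in bp, counting one stop codon (assumption)."""
    return 3 * (n_amino_acids + 1)


def remaining_capacity_bp(n_amino_acids: int,
                          capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Space left for promoters, guide RNA cassettes and other elements."""
    return capacity_bp - coding_length_bp(n_amino_acids)


# A protein of about 1300 amino acids, as cited for Cas9 in the text.
cas9_like_orf = coding_length_bp(1300)
cas9_like_left = remaining_capacity_bp(1300)
print(f"ORF: {cas9_like_orf} bp, remaining: {cas9_like_left} bp")
```

The remaining space is what every other element of the expression cassette has to fit into, which is why the text treats protein size and the number of required RNA components as limiting factors.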
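Summary of the Invention
The object of the invention is to overcome the shortcomings of the prior art by providing a Cpf1 protein, a type V CRISPR/Cas12a gene editing system, and uses thereof.
To achieve this object, the invention adopts the following technical solutions:
In a first aspect, the invention provides a Cpf1 protein whose amino acid sequence is shown in any one of SEQ ID NO.1 to 6.
The Cpf1 proteins of the invention recognize a variety of PAM sequences and need a PAM of only 3, 2, or even 1 base, so the targetable range is broader. This enriches the PAM diversity of existing Cpf1 proteins, greatly increases the likelihood that a given gene locus can be edited precisely, provides a reference for discovering further novel Cpf1 nucleases with C-rich PAMs, and expands the existing range of gene editing tools.
In a second aspect, the invention provides a nucleic acid encoding the Cpf1 protein, whose base sequence is shown in any one of SEQ ID NO.16 to 21.
In a third aspect, the invention provides a type V CRISPR/Cas12a gene editing system comprising the Cpf1 protein, auxiliary proteins, and a CRISPR array.
The gene editing systems of the invention each recognize their own distinct PAM sequence and, guided by crRNA, can perform gene editing in vitro and in eukaryotic cells. This further expands the range of gene editing tools, enriches the PAM diversity of existing Cpf1-based gene editing tools, provides more tool options for the clinical use of Cpf1, and plays an important role in advancing the clinical application of gene editing.
As a preferred embodiment of the type V CRISPR/Cas12a gene editing system of the invention, the CRISPR array comprises direct repeats and spacers, arranged alternately.
Further, the nucleotide sequence of the direct repeat is shown in any one of SEQ ID NO.10 to 15.
As a preferred embodiment of the type V CRISPR/Cas12a gene editing system of the invention, the amino acid sequence of the auxiliary protein is shown in any one of SEQ ID NO.7 to 8.
In a fourth aspect, the invention provides use of the type V CRISPR/Cas12a gene editing system in gene editing of prokaryotes or eukaryotes.
In a fifth aspect, the invention provides use of the type V CRISPR/Cas12a gene editing system in the preparation of gene editing formulations.
Compared with the prior art, the beneficial effects of the invention are:
(1) Through metagenomic bioinformatic analysis, the invention identifies for the first time six entirely new type V CRISPR/Cas12a gene editing systems and predicts the direct repeat corresponding to each. The Cpf1 proteins of the six new editing systems recognize a variety of PAM sequences and need a PAM of only 3, 2, or even 1 base, so the targetable range is broader and the likelihood of precisely editing a given gene locus is greatly increased. This also provides a reference for discovering further novel Cpf1 nucleases with C-rich PAMs.
(2) The invention demonstrates experimentally that the six CRISPR/Cas12a gene editing systems each recognize their own distinct PAM sequence and, guided by crRNA, perform gene editing in vitro and in eukaryotic cells. The discovery of these six new gene editing systems further expands the range of gene editing tools, enriches the PAM diversity of existing Cpf1-based gene editing tools, provides more tool options for the clinical use of Cpf1, and plays an important role in advancing the clinical application of gene editing.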
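Brief Description of the Drawings
Fig. 1 is a schematic diagram of the composition of the CRISPR/Cas12a gene editing system of the invention, consisting of Cas proteins and a CRISPR array.
Fig. 2 shows the predicted secondary structures of the crRNAs of the six CRISPR/Cas12a gene editing systems of the invention.
Fig. 3 shows the PAMs of the six CRISPR/Cas12a gene editing systems of the invention.
Fig. 4 shows the in vitro cleavage experiments for the six CRISPR/Cas12a gene editing systems of the invention.
Fig. 5 shows PCR detection of dsODN insertion after gene editing by the six CRISPR/Cas12a gene editing systems of the invention in eukaryotic cells.
Detailed Description
To better illustrate the object, technical solutions, and advantages of the invention, the invention is further described below with reference to specific examples. Those skilled in the art will understand that the specific examples described here serve only to explain the invention and do not limit it.
In the invention, Cas12a is also called Cpf1 and is assigned by homology to Class 2, type V of the CRISPR systems. The effector protein Cpf1 can be applied to genome editing through complementarity with a guide RNA. Cas9 and Cpf1 differ in how they recognize and cleave target DNA, and Cpf1 has the following features: (1) Cpf1 is guided by a single crRNA, whereas Cas9 uses a crRNA and a second small RNA, the trans-activating crRNA (tracrRNA); (2) Cpf1 recognizes T-rich PAMs, in contrast to the G-rich PAMs favored by Cas9; (3) Cpf1 produces staggered ends at a target site distal to the PAM, whereas Cas9 produces blunt ends at a target site proximal to the PAM, and the sticky ends left by Cpf1 cleavage more readily undergo homologous recombination repair; (4) Cpf1 contains a RuvC domain but lacks a detectable second endonuclease domain, whereas Cas9 uses its HNH and RuvC endonuclease domains to cleave the target and non-target DNA strands, respectively.
Unless otherwise specified, the test methods used in the examples are conventional methods, and the materials and reagents used are commercially available.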
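Example 1: Six type V CRISPR/Cas12a gene editing systems
In this example, metagenome annotation was performed with CRISPRCasFinder and crRNA structures were predicted with NUPACK. With these and other bioinformatic methods, the proteins and elements associated with type V CRISPR/Cas12a systems were analyzed, predicted, and screened to obtain the novel editing systems of the invention, as shown in Fig. 1. The gene editing system of the invention consists of the following elements: the gene-encoded endonuclease Cpf1; the auxiliary proteins Cas1, Cas2, and Cas4; and a CRISPR array. The invention identifies six new Cpf1 proteins, named 28c2, 28c6, 28c12, 28c13, 28c15, and 30c9. The 28c2 protein has 1262 amino acids (SEQ ID NO.1); the 28c6 protein has 1253 amino acids (SEQ ID NO.2); the 28c12 protein has 1265 amino acids (SEQ ID NO.3); the 28c13 protein has 1274 amino acids (SEQ ID NO.4); the 28c15 protein has 1260 amino acids (SEQ ID NO.5); and the 30c9 protein has 1251 amino acids (SEQ ID NO.6).
The auxiliary proteins Cas1 (SEQ ID NO.7), Cas2 (SEQ ID NO.8), and Cas4 (SEQ ID NO.9) take part in the capture of foreign genes and in crRNA maturation.
The CRISPR array comprises direct repeats and spacers arranged alternately, with one spacer between every two repeats. Within a single bacterium the base composition and length of the repeats are relatively conserved, while they differ slightly between bacteria. The direct repeats (DR sequences) of the six novel CRISPR/Cas12a systems of the invention are shown in SEQ ID NO.10 to 15. The CRISPR array is transcribed into pre-crRNA, whose spacer is the sequence that base-pairs and hybridizes with the target sequence. The pre-crRNA is then processed into mature crRNA with the repeat at the 5' end and the spacer at the 3' end; the crRNA is complementary to the targeted gene and guides the Cpf1 protein to edit the target sequence complementary to the spacer.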
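The layout of a mature crRNA described above (repeat at the 5' end, spacer at the 3' end) can be sketched in a few lines of Python. The direct repeat is SEQ ID NO.10 from this document; the spacer `EXAMPLE_SPACER` and the helper names are hypothetical and serve only to illustrate the arrangement.

```python
def dna_to_rna(seq: str) -> str:
    """Write a DNA-alphabet sequence in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def mature_crrna(direct_repeat_dna: str, spacer_dna: str) -> str:
    """Mature crRNA as described in the text: 5'-repeat + spacer-3'."""
    return dna_to_rna(direct_repeat_dna) + dna_to_rna(spacer_dna)


DR_28C2 = "GAATTTCTACTGTTGTAGAT"            # SEQ ID NO.10 (28c2 direct repeat)
EXAMPLE_SPACER = "GACGTTAAGGAAGTGATCAGCCT"  # hypothetical 23-nt spacer

crrna = mature_crrna(DR_28C2, EXAMPLE_SPACER)
print(crrna, len(crrna))
```

Swapping in another direct repeat from SEQ ID NO.10 to 15 gives the corresponding guide for that system, with the spacer chosen to match the intended target.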
The sequences are as follows:
SEQ ID NO.1 (28c2 protein sequence):
MVNGTKNYFDCFTGFYPINKTLRFELKPIGKTNALIDEFKKGYVDSIVSLDEKRAESRKKVIEVLDNYYEYFINCVLSKEVLLVNDINEAYKLYKDFKADKKDKNFKSYKVKMRTKISEKFQSEKIKFALKDYKDLFGKKRLQESLLYEWYKQKLNNEEINNEAFEDIVKTLSYFIGFTTSLKDYQENRNNFFVPDEKSTSIAYRIIDENMIRYFDNCIRFETFIENKIDLFESLKQWEEYFKPENYIKYFTQDGIDNYNQIIGRKGKDIYSKGINQLINEYRQINKIKNKNLPTMNQLYKQLLSKHNSEELIVGFKDEKDMLQKIETTYYEYSEIVSKLVSFLSESLADDINLYIRSDSLTNLSNSMFGRWDFINDAIYSYTSGFSEKDKLKYEKDVKEVISLVKLQKVIDTYVSSLDIDEKGKYISNSSIYKYLLSINDLNLKNAYSEAKPILVLNEIDNERTNDSERIQQINKIKSLLDAMLEIMHFYKPLYLYKNGKSLVEVEKDEVFYSEFDYLYSQLMPITKLYDKVRNHITKKPYSKDKFKIYFNKPTLLDGWDLNKENSNLGVLLTKNNNYYLGIMNGKYNTSFDTTVAEVKNQINESSTAIGYLKMEYKQVSGANKMFPKVFFAESNKHIYKPSKEILNIRENKLYTKGADDVESRIKWIDFCKHCIKLHPEWNKYFNFKFKPTTEYEDVNTFYEDADAQMYNVSFISFNESYINELVNEGKLYLFQIYNKDFSPNSKGKPNLHTMYWKMIFEDSNITNINNTGLPVFKLNGEAEIFYRKASLNKKVTHEKNLPIKNKNRNNPKEESIFSYDLYKDKRFMADKFFLHCPITINYRTKPLSSSEFNKKINCIVENNKDISILGVDRGERHLLYYSLINQKGEILKQGSLNSLSTSYERDGQEISVLTDYNSILQGREDERDDARKNWGTIQNIKEIKDGYMSHIVHQLSKILIDNNAVLVLENLNSGFKRGRFKIEKQVYQKFEKAMIEKLNYLVFKDRNSTSPGYYLNGYQLTAPFEGFKNLYSQSGIIYYVWPSYTSKICPRTGFVNLLKLNYENIEKSKEIFNNFDIISYNKAKDYFEFGLDYRRFGKDAGKSKWLICTYGNERYFYNSKLKKFECIDITNKIKELFKSNNIDYLNEKDLRNKITNVNSKDFFNSLLFYLRITLQLRYTNGGNLDENDYILSPINDGSDKFFDSRCASESEPKNCDANGAYHIALKGLRLIHSIEDGTTSKIGNETTDWFTFAQNKNKLVE;
SEQ ID NO.2 (28c6 protein sequence):
MSKGKIWENFINQYSVSKTLRFELKPVGKTLENINAKGLIEEDEQRAEDYKKAKKIIDEYHKYFIEGALGSCSLDLNILNEFLQLYNKAQKTDADKKEYEKIQTTLRKNIAESFGKNADKKTKEQYENLFKKELLRNDLPDWVEDEEDAKIIERFKTFTTYFTGFHENRKNIYDNEEKSTAIGYRIVHENLPKFIDNMNAFEKISKALDLSEIDRDFQSELGEIKAEEFFTIEFFNQCLNQFGIDRYNTLLGGISEGENIKKKQGLNERINLYNQQLKGERKKERLPKLKVLYKQILSDSSSHSFSIDEFENDNELLESLEIFYKNELIGFNHSGVDSNIFDLVKDLLLKIDESEQSSIYLKNDKGLTEISQRIFGDWNIIKSALEEYYDEHYPPKKDTFNKKELDERSRWLKENHSIGVIEKALANYENEIVREHLKQNSAPIVSYFKSLEVDGENLIDKIYSAYGNISDLLNSSYPDEKKLVSDRTSKDKIKVFLDSLMSLLHFLKPLDVKDLGNKDSAFYGDYDFIVEQLSKLVRLYNKTRNYLTRKPYSIEKIKLNFENSTLLAGWDVNKERDNNCVIFKRQDGDRELFYLGIMDKSHNKIFTKIEEAKSDDVYQKMNYKLLPGPNKMLPKVFFSKKSIDFYAPGEELLKNYKNGTHKKGENFNLQHCHELIDFFKRSINKHEDWSQFNFKFSDTSEYEDTSFFFKEVSQQGYSITFKNIDRETIEKFVDEGKLYLFQIYNKDFSPKSKGRPNLHTLYWKMLFDERNLANTVYQLNGEAEVFYRKKSISEKDRVVHRADEPIGLKNSENSAQKSLFPYDIVKDRRFTVDKFQFHVPITLNFKSEGNERLNISVNKFLKDNPDVNIIGLDRGERHLIYLTLINQKGEILHQESLNEVMGVNYQQKLHRVEKDRTEERRNWDRIENIKELKSGYLSQVVHKISQLMVEYNAIVVMEDLNFGFKRGRIKVEKQVYQKFEKTLIDKLNYLVFKDREPEEPAGVLNALQLTNKFESFKKLGKQCGFLFYVTSDYTSKIDPATGFVNLLYPKYESVEKSQNFFRKFDNICFNSGAGYFEFDFDYSNFTDRADGTRTRWKVCTVGNERFGYNPKTKASETVNVTESLKELLLQHEIAFENGESLVESISKNTTKYFHKSLLNFLRLTLTLRHSKTGTDIDYILSPVANEEGVFFDSRNASDKMPKDADANGAYNVALKGLMVLERINAAEDLSQFKFKDMSIKNKDWLKFVQDRQG;
SEQ ID NO.3 (28c12 protein sequence):
MIEYTNFIGLYPLSKTLRFKLLPIGKTLENITRNGILTDDKHRAQSYQEVKKLIDEYHKEFIEHTLETFNLELLSTNKQNSLEEYHQLYLKEKNESELKNFTKTQENLRKQIAKTLQNEAKKASLFDKDMIKKNLPDFIQQHPDLKDKENLVKEFDEFTTYFTGFHENRRNMYSDEEKSTAIGYRIIHQNLPKFIDNMIVFSRIQSELQGELNLIAADFKDLLVVNNLDEMFTLPYFNQVLTQSQIDLYNMVIGGKSEEGKIKKQGLNEYINLYNQNHKEQKLPLFKPLFKQILSDRQSLSWLPQQFEEDQELLNAVRECFYSLNDSQCNLKHLQALLVSLADYNLNGIYLTNGPAITTISQQMFNDWNLINRAIIERMSRDIKASSKQKSEAKLEEEIRKRMDSTESFSIQYLNECIETSEIEDIKNAADKRIESAHFARLMICNKKTNEQENLFERIYTAYNEAQTLLNTPYPENQNLIQDQENVARIKYLLDTVKDLQLFVKPLLGKGYEIGKDDTFYGILTRLWTVIDQLTPLYDKVRNYLTRKPYSDKKIKLNFKNSTLLNGWDKNKEADNTAIIMRKEGLFYLGIMNKDIKGYKRMFEKCPQCSEEEAYYEKMEYKLLPGPNKMLPKVFFAKNNIELFKPSERIMAIRENETFKKGDKFNLADCHAFIDFYKESIAKHPEWKDFDFHFSETQLYNDISGFYREVEHQGYKMSFRKIPATYIDQLVENNELYLFQIYNKDFSEYSKGTPNMHTLYWKMLFDERNLADVVYKLNGQAELFYRPASLNYNRPTHPKNEPITNKNKNNPKKESIFKYDLTKDKRYTQDTFLLHVPITLNFKGTNNGNINQQVNSYLQTADNTHIIGIDRGERHLLYLVVIDMKGNIKEQFSLNEIANQNKGIEYRTNYHQLLENREKERVEARVNWQNIENIKDLKEGYLSQVIHLITQLMLKYHAIVVLEDLNFGFMKGRQKVEKSVYQKFEKQLIDKLNYLVNKQIDAEKPGGLLKAYQLAKPFESFQKMGKQSGFLFYIPAWMTSKIDPVTGFVNLLNTNYVNVKESQKFFSNFDRIAYNPEKDWLEWDIDYNKFTTKAKNSRHNWTICTQGERIENHRNEKNGQWNSQNVNLTEEFKKLFALYDIDLAQDLKKYIIQQNDAKFFKELHRILKLTLQMRNSQINSDIDYLVSPVANAEGCFYNSQTANATLPANADANGAYNIARKGLYLLQQIKKAPDLAKLKLTISNEEWLKFAQEKTYQND*;
SEQ ID NO.4 (28c13 protein sequence):
MFNQFTNLYPVIKTLRFELKSIGNTMDTIESNQVIHNDEKRADAYAKLKVTLDAYHKDIIEKVLSRARLTGLEDYAIAVNNLKTSKGNAAYGKELTKNKEQLRKQIAGFFKQPEFAPIFKDLFKEGVIKKDVKAWIDTQPNPSDYFYSDDFANFTGYFGNYNLIRQNLYSPEAKHGTIAYRLIDENLPKFIDNLSILQNIQNKNPDLFDQLSDQYQQYFSELLPSKPTLADFVSLDTFNDLLTQKGLDAYQQIIGGIKTENQLIQGINVLINLHNQQHPEQSKTPKLKPLYKQLLSDRGTFKLPRKFEDDAEMIQANRQYFEEVLGNNTLFETGETPTEAMNQLFLSIENYDLSKIFIESPLLVTSISQKIYGSYAVIPQALEYYHDNHVNPSYAAKFNKAKSDKSRETMEKAKAAWVKGVHAVSVIHQAVIAYNDVLPDDAKLTDTQPVISYYKDIQYSEKTGESQQIFDALMRRYHQAKGMLNTDYPKGSKQILNNKSSFAIVKNLLDVSKAYVNAARDLTIKKPEGLDLDLLFYERLAKTYTYLQDLHALYDTTRNYVTQKPFSTDKIKLNFDCAQLLAGWDFNVIDAKRGVFLVKNGRYYLVIIDNKHKKAMNNLPAPITNNCYDKYNMRLSKDAHMALPKKLFTKDNLKIPAIAEMERRCRDKNGGHHLRKSPDFDKDFMHQMIDTFKDIIKKDKDFDVFGFQFKPTHQYEDINEFYADFNEQALVTWYDKVDSDVIDSLVAEGKIYLFEVYSKDFSDKSTGTPNQQSLILQYLFSQDNLAKRHFKLNGEAEVFYRKASIDKDKAVVHKKGSLLENKNPARPNSKIAKFDIVKDRHYTEDKLFLHIPITLNNNAADMKSYAMNSKVLNTLKTNGGVNVIGIDRGERNLLKITVINSAGEILHQESLNKITSGQDMVTDYHELLDKKEQSRAESRLNWQEVESIKEIKQGYLSQVVYRLSQLMLQYKAIVVLEDLNIGFKRGRFKIEKQVYQNFEKALINKLNYLVLKQLEATEVGGTAHGYQLTAPFESFQKLGKQSGWLFYVPAWNTSHIDPTTGFVNLHHFKYESVAQATDIIDKLSNIRYNPEKDYFEFAIDYNEFTFKGGDSQKYWVVCSTPYKRYVFDKKANMGRGGTKAVDVNAELKALFAAHGVDYASGEDLRPQIKAKANKELLSQLLFLLKTLTAMRYTNASSYEDYILSPVVNKAGEFFDSRKGDATLPLDADSNGSYHIALKGLCLLQRVYDWRGEEFKGLDLFISNNDWLKFAQDRH*;
SEQ ID NO.5 (28c15 protein sequence):
MSNTKDNIFNNFTGIYPINKTLRFELRPVGKTYDLIKDFKNGYVESIVAIDEKRSEARKRIIEIIDEYYEEFINTVLSKKVFYSDDIWQTYTSYKAYKSDKRNKEFVTQKAIMRKKISDAFQNEKTKFNLKDFKDLFGKKSNLKESPLYKWYKNKLDIGEITGEDFEDIIKIITYFIGFTTSLKDYQENRNNLFVAEEQSTAISHRIIDVNMIRYFENCIRFENMKDSELLEDMGKWEKYFVPANYDNFFTQEGIDNYNEIIGRKSKDLYYKGVNQLINEYRQKNKIKNKDMPTMNQLYKQHISKNGDNEINNDFSNEKEMLEQIEQAYITSLDKINRIVSFINENITEGNKIFIRKDFVTNISNRLFGEWNFINNALYSYLSGLSAKNKELFVKQTEEVIKISELQNIIDLYINNLDEDEKEKYLKTDAIYTHFCSFDVCGVQNAYYEAKTVLAVDEINKDREKEEEGAKQISKVKKLLDEILEAVHFYKPLYLYKNGKEIDEIEKDEIFYSEFDYLYSQLMLVTELYDRVRNYLTKKPYSKDKFKIYFNKPTLLDGWDLNKEKNNLSVLLIKDGFYYLGIMDSKYNSVFDVSADDVKINTTELSEEATFLKMEYKQVSGASKMFPKVFFAASNKDMFKPSEEILNIRENKQYLKGANNREAVIKWIDFCKDCLKIHPEWNRYFNFNFRHSDEYENVNSFYEDADTQMYYINFVKFKETYINDLVEEGKLFLFQIYNKDFSEYSKGKPNLHTVYWKMLFDENNVRNINDNTGKPVFKLNGEAEIFYRKASLDKKVTHKKNYPIKNKNKHNNKTESIFEYDLYKDKRFMDDKFFFHCPITINYRAKNILSSEFNKKFNLHIKNSDNMNILGVDRGERHLLYYSLINIKGGIIKQGSLNTIYDSYEKDGINIPVITDYKSILKDREDERMDSRKNWGTIKNIKEMKEGYLSHVVHQVSKLLIDNNAILVLENLNSGFKRRRLKIEKQVYQNFEKSLINKLNYLVLKDADNKDVGHFLKGYQLTAPFEGFQRLNNQSGIIYYVWPSYTSKICPRTGFVSLLHINYENIEKSKEFFNKFDKISYNKDKDYFEFHLDYTRFGKNAGKNKWVICTYGKDRYFFNQKLKKYEYIDITEKIKELLSNNGIDFINENDMRKSIVENNSKNFFGSLLFYLKVVMQLRYTNSNDGCRNENDYILSPVADINGMFFDSRHACDNEPENADANGAYHIALKGLRMIQFIENGVITKQGNETTDWFKFAQNKL*;
SEQ ID NO.6 (30c9 protein sequence):
MSAQSALSTLINKYSLSKTLRFELIPIGKTKESIDRKGLLSQDVKRAQSYKEVKKIIDEYHKEFIEKSLINAKLKGLEEFSKLYYKLQKEDKDKKNIKKMQDNLREQISDLFKNNKKDKWNILFKEDLIKKELPLFAKDDKQKNLINEFNKFTTYFTGFHKNRKNMYAEEEKSTSIPYRIIHQNLPKFLDNIRIFEKIKKNKINTDVIEKELSLFLNGIKINDIFSINFFNDVLNQKGITFYNTILGGVSEKDRTKIKGINEYVNTEYNQKQLDKKSKIPKLKQLYKQILSDTETASFVLEQFENDNQLLEKIEQFYNTELINYETEGKTQSVFLQFEQLFKNMQNYDASKIYISNLSIANISKIIFGDWSIICNALAEWYDKHNTKGKKINEYKKENFLKQDFSIQQIEDAVLEYKNDTLNKEINFLLNYFASFLNEKSKKNIIQRIETEYSKVKDLLNTDYPEKKKLASDKDNVSKIKAFLDSLMDFLHFVKPFNIKKDTGLEKEENFYSIYVPLFEQIDKIIPLYNKVRNYLTKKPYSTEKIKLNFENSTLLDGWDLNKESDNTSVVLRKDDLYYLGIMDKKHNRIFKELPSQNGNESSYEKMIYKLLPGPNKMLPKVFFSKKGKKQFKPSKKLLKKYEDGTHLKGDNFNINDCHNLIDFFKESIAEHEDWKQFDFKFSSTSSYKDLSNFYKEVEKQGYKITFQNISENYINQLIDEGKLYLFQIYNKDFSKYSKGTPNLHTLYWKMLFDNDNLKNIVYKLNGKAEVFYRKSSLILGDNIVHKAGEAIINKNPDNEKKHSTFDYDLIKDKRFTLDKFQFHVPITLNFKSEGRQNLNEDVRKFLKNNPDINIIGIDRGERHLLYLTLINQKGKILFQKSLNEITNEYNNKNGKSQIKSTNYHSLLDKKEKKRDEARKNWGIIENIKELKEGYMSQIVHYISKLMIEKNAILSLEDLNFGFKRGRQKVEKQVYQKFEKMMIDKLNYLVFKDKKANETGGLLNALQLTNKFESFAKLYNQSGFIFYVPAWNTSKIDPITGFVNLLKPYYENLNKSQEFFKKFNNIKYNPKQEYFEFNFDYKNFTNKAEGSKNVWEICTTNNERFMWDKTLNSGKGAQKAVDVTQELKKLFDSSKINYLNGNDIKEDIINQNSADFFRKLMKLLSVVLSLRHNNGLKGKDEKDFILSPVEPFFNSLNAKMEEPKDADANGAYNIALKGLLILKQINESEDLRKIKFNLSNKEWLKFAQSKSF;
SEQ ID NO.7 (Cas1 protein sequence):
MNQLVTGGISVLNKGEFIKKQILVYEPFLGDKMSYKNDNMVIRDGNGKIKYQVSCYRIFMVLIVGDVTITTGILRRQQKFGFRLCFLTLGLKVYSVIGPQLQGNTLLHCKQYAYDELTVGKSIIINKILNQRAALTRLRSKTEDVWECISLLEQYSKRLQNDSLNLQEIIGIEGMASKIYFPRIFSNTQWIGRKPRIKFDYINTLLDIGYNALFNFIDAILQVFGFDVYYGVLHTCFYMRKSLVCDIMEPMRPIVDWQIRKSINLKQFKQDDFVQVGKQYQLKYKKSTQYLQVFLEAILNYKEEIFVYVRDYYRSFMKNNPIEAYPVFKLEEL;
SEQ ID NO.8 (Cas2 protein sequence):
MIIVSYDISDDKLRTKFSKYLSRFGHRIQYSMFEIDNSERILNNIICDIHNQFEKKFSQEDSIYIFNLSKWCKIERFGYAKNETNDLLVLTGCKPRP;
SEQ ID NO.9 (Cas4 protein sequence):
MEDIILITELNDFIFCPASIYFHHLYGSRDPVLFQSEAQIKGTKAHEAVDSGCYSKKSSILQSLDVYCEKYRLLGKIDIYDGKKKILRERKRQIKQVYDGYIFQLYGQYFSLIEMGYEVDKMELYSMIDNKKYPIELPHNNINMLMKFEMLIHEMREFRLDDRFIQENANKCKNCIYEPACDRGNIGAK;
SEQ ID NO.10 (28c2 direct repeat):
GAATTTCTACTGTTGTAGAT;
SEQ ID NO.11 (28c6 direct repeat):
AAATTTCTACTTCTGTAGAT;
SEQ ID NO.12 (28c12 direct repeat):
TAATTTCTACTATTGTAGAT;
SEQ ID NO.13 (28c13 direct repeat):
AATTTCTACTATGTGTAGAT;
SEQ ID NO.14 (28c15 direct repeat):
AAATTTCTACTGTTGTAGAT;
SEQ ID NO.15 (30c9 direct repeat):
TAATTTCTACTATTGTAGAT.
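Because the six direct repeats listed above are short and closely related, a pairwise comparison is an easy way to see how conserved they are. The following Python sketch copies the sequences from SEQ ID NO.10 to 15 and counts mismatches position by position; the use of Hamming distance is an illustrative choice, not a method taken from the patent.

```python
from itertools import combinations

# Direct repeats copied from SEQ ID NO.10 to 15 above.
DIRECT_REPEATS = {
    "28c2": "GAATTTCTACTGTTGTAGAT",
    "28c6": "AAATTTCTACTTCTGTAGAT",
    "28c12": "TAATTTCTACTATTGTAGAT",
    "28c13": "AATTTCTACTATGTGTAGAT",
    "28c15": "AAATTTCTACTGTTGTAGAT",
    "30c9": "TAATTTCTACTATTGTAGAT",
}


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Pairwise distances, keyed by (name_a, name_b) in listing order.
distances = {
    (m, n): hamming(DIRECT_REPEATS[m], DIRECT_REPEATS[n])
    for m, n in combinations(DIRECT_REPEATS, 2)
}

for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, d)
```

As listed, the 28c12 and 30c9 repeats are identical, and all six repeats are 20 nt long and share the same 3' end, which is consistent with the statement that repeat composition and length are relatively conserved.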
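Example 2: Predicting the secondary structure of the crRNAs with which the gene editing systems recognize target sequences
This example predicts the secondary structure of the crRNAs with which the six type V CRISPR/Cas12a gene editing systems of Example 1 recognize target sequences.
The procedure is as follows:
NUPACK was used to simulate the behavior of the repeat sequence in vitro at 37 °C and to predict the secondary structure of the mature crRNA. Under the action of the Cpf1 nuclease, the sequence upstream of the repeat is removed from the pre-crRNA, leaving a 20-nt repeat that, together with a 23-nt spacer, forms the mature crRNA; this crRNA binds the Cpf1 protein to form the crRNA-Cpf1 complex. The complex first scans for a suitable PAM and then continues scanning for DNA complementary to the spacer; once such a sequence is found, the nuclease activity of Cpf1 is activated. The results are shown in Fig. 2.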
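The two-stage search described above (find a PAM first, then check the adjacent sequence) can be illustrated with a short Python sketch that lists candidate target sites in a DNA sequence. Several things here are assumptions rather than facts from this document: the motif `TTTV` is only a placeholder for a T-rich PAM (the PAMs actually determined are given in Fig. 3), the PAM is assumed to lie immediately 5' of the 23-nt protospacer on the strand being read, and all function names and the demo sequence are hypothetical.

```python
# IUPAC nucleotide codes used in PAM notation (subset sufficient for this sketch).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "V": "ACG", "N": "ACGT",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written in the ACGT alphabet."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def pam_matches(pam: str, site: str) -> bool:
    """True if every base of `site` is allowed by the IUPAC code in `pam`."""
    return len(pam) == len(site) and all(
        base in IUPAC[code] for code, base in zip(pam, site)
    )


def find_targets(dna: str, pam: str, spacer_len: int = 23):
    """List (strand, pam_start, protospacer) for every PAM followed by a
    full-length protospacer. Positions on the '-' strand are given in
    reverse-complement coordinates."""
    hits = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        last_start = len(seq) - len(pam) - spacer_len
        for i in range(last_start + 1):
            if pam_matches(pam, seq[i:i + len(pam)]):
                start = i + len(pam)
                hits.append((strand, i, seq[start:start + spacer_len]))
    return hits


# Hypothetical demo sequence: flank + placeholder PAM + 23-nt protospacer + flank.
DEMO_PROTOSPACER = "GACGTTAAGGAAGTGATCAGCCT"
DEMO_DNA = "CCGG" + "TTTA" + DEMO_PROTOSPACER + "GGCC"

for hit in find_targets(DEMO_DNA, "TTTV"):
    print(hit)
```

Each returned protospacer is a candidate spacer sequence for a crRNA; shortening the PAM argument to two bases or one base shows directly how the number of candidate sites grows as the PAM requirement relaxes.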
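Example 3: In vitro PAM depletion assay
This example uses an in vitro PAM depletion assay to determine the PAM sequences that the Cas nucleases of the six type V CRISPR/Cas12a gene editing systems of Example 1 require to recognize the spacer.
The procedure is as follows:
(1) For each of the six type V CRISPR/Cas12a gene editing systems of Example 1, the nucleotide sequence encoding the Cas protein was inserted by homologous recombination into the pSUMO protein expression vector. The sequence-verified recombinant plasmid was transformed into E. coli Rosetta 2 competent cells (TransGen Biotech, CD811-02), which were recovered and plated on kanamycin plates (50 μg/mL). The next day a single colony was picked and grown in large-scale liquid culture, and the recombinant protein was purified by Ni-column affinity chromatography followed by molecular-sieve (size-exclusion) chromatography and stored at -80 °C until use.
The nucleotide sequences of 28c2, 28c6, 28c12, 28c13, 28c15, and 30c9 are shown in SEQ ID NO.16 to 21, respectively.
(2) Six random bases NNNNNN (N = A, G, C, or T; 4096 insert fragments in total) were added at the 3' end of the library spacer (sequence shown in SEQ ID NO.22). The library was built into the backbone vector by overlap PCR, giving a mixed Spacer-PAM plasmid pool with 4096 different PAM combinations but the same spacer on the 5' side. Next-generation sequencing gave a Gini value below 0.1 for the abundance of the random bases at the six positions, indicating that the random bases are fairly evenly distributed.
(3) Using the repeat sequence of each type V CRISPR/Cas12a gene editing system described in Example 2 and the corresponding library spacer, a template of the form 5'-T7 promoter + repeat + spacer-3' was constructed, and crRNA was obtained by in vitro transcription (NEB, E2040S) and RNA purification (NEB, T2040S).
(4) 10 pmol of purified Cpf1 protein from step (1) was mixed with 10 pmol of crRNA and incubated at room temperature to form the protein-crRNA complex. The complex was then mixed thoroughly with 200 ng of the mixed Spacer-PAM plasmid and incubated at 37 °C for 30 min, during which the Cpf1 protein recognizes suitable PAMs among the 4096 PAM combinations and cleaves the spacer complementary to the crRNA. An appropriate amount of proteinase K was added and incubated at room temperature for 15 min to digest excess Cpf1 protein, followed by 10 min at 98 °C to inactivate the proteinase K.
(5) Primers flanking the random bases on the mixed Spacer-PAM plasmid were designed to PCR-amplify and purify the region containing the spacer and PAM combinations. Adapters were added to both ends of the product for next-generation sequencing (commercial Illumina adapter primers: Hieff NGS384 Dual Index Primer Kit for Set1, Cat. No. 12613ES02; I5 primer: TAAGATTA; I7 primer: GAGATTCC). Taking the PAM depletion threshold of the negative control group as the baseline, the depletion of the six random bases was analyzed with WebLogo 3, and the PAM sequence recognized by each Cpf1 protein was obtained by negative selection.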
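Steps (2) and (3) of the procedure above lend themselves to a short Python sketch: enumerating the 4096-member PAM library, checking how evenly the library is represented with a Gini coefficient, and assembling the transcription template. Several details are assumptions: `T7_PROMOTER` is the commonly used minimal T7 promoter sequence rather than one stated in the patent, the Gini formula is the standard one for sorted values (the patent does not say how its value was computed), and `EXAMPLE_SPACER` is hypothetical because SEQ ID NO.22 is not reproduced in this section.

```python
from itertools import product

# Commonly used minimal T7 promoter (assumption; not given in the patent text).
T7_PROMOTER = "TAATACGACTCACTATAG"


def pam_library(n_random: int = 6) -> list:
    """All sequences of n_random bases; 4**6 = 4096 for NNNNNN."""
    return ["".join(p) for p in product("ACGT", repeat=n_random)]


def gini(counts) -> float:
    """Gini coefficient of read counts: 0 means perfectly even abundance."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def ivt_template(repeat: str, spacer: str) -> str:
    """Template in the layout of step (3): 5'-T7 promoter + repeat + spacer-3'."""
    return T7_PROMOTER + repeat + spacer


library = pam_library()
even_counts = [5] * len(library)             # idealised, perfectly even library
EXAMPLE_SPACER = "GACGTTAAGGAAGTGATCAGCCT"   # hypothetical 23-nt spacer
template = ivt_template("GAATTTCTACTGTTGTAGAT", EXAMPLE_SPACER)  # SEQ ID NO.10

print(len(library), gini(even_counts), len(template))
```

With real sequencing data, `counts` would hold the read count of each of the 4096 PAM variants in the uncut library, and a value below 0.1 would correspond to the evenness criterion used in step (2).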
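The PAM analysis results are shown in Fig. 3. The six gene editing systems recognize a variety of PAM sequences: 28c2, 28c6, 28c12, and 28c15 recognize 3-base PAMs, while 30c9 and 28c13 need a PAM of only 2 or even 1 base. Simpler PAMs occur more frequently in a genome, which means that the novel Cpf1 proteins of the invention can target a broader range and greatly increase the likelihood of precisely editing a given gene locus.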
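The logic of a depletion readout, and the claim that shorter PAMs occur more often, can both be sketched in Python. This is a toy model under stated assumptions, not the patent's analysis pipeline (which used WebLogo 3): the read counts are invented, the pseudocount and the 0.1 depletion threshold are arbitrary illustrative choices, the function names are hypothetical, and the frequency estimate assumes a random sequence with equal base composition.

```python
from collections import Counter


def depletion_ratios(control: dict, treated: dict, pseudocount: float = 1.0) -> dict:
    """Treated/control frequency ratio per PAM; low values mean the PAM
    was cleaved (depleted) in the treated sample."""
    n = len(control)
    c_total = sum(control.values()) + pseudocount * n
    t_total = sum(treated.values()) + pseudocount * n
    ratios = {}
    for pam, c in control.items():
        c_freq = (c + pseudocount) / c_total
        t_freq = (treated.get(pam, 0) + pseudocount) / t_total
        ratios[pam] = t_freq / c_freq
    return ratios


def depleted_pams(ratios: dict, threshold: float = 0.1) -> list:
    """PAMs whose ratio falls below the (illustrative) threshold."""
    return sorted(p for p, r in ratios.items() if r < threshold)


def position_frequencies(pams: list) -> list:
    """Per-position base frequencies of the depleted PAMs (logo input)."""
    out = []
    for i in range(len(pams[0])):
        counts = Counter(p[i] for p in pams)
        out.append({b: counts.get(b, 0) / len(pams) for b in "ACGT"})
    return out


def expected_sites_per_kb(n_fixed_positions: int) -> float:
    """Expected PAM occurrences per 1000 positions on one strand of random DNA."""
    return 1000 / 4 ** n_fixed_positions


# Invented toy counts for four 4-base PAMs.
control = {"TTTA": 100, "TTTC": 100, "GGGA": 100, "CCCA": 100}
treated = {"TTTA": 2, "TTTC": 3, "GGGA": 100, "CCCA": 95}

hits = depleted_pams(depletion_ratios(control, treated))
freqs = position_frequencies(hits)
print(hits, freqs[0], freqs[3])
print([expected_sites_per_kb(k) for k in (3, 2, 1)])
```

In this toy case the depleted PAMs share T at the first three positions, and the last line shows why a PAM with fewer fixed positions offers more candidate sites per kilobase.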
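In particular, most Cpf1 proteins in the prior art recognize T-rich PAMs, which limits their use on gene sequences with low T content. The 30c9 protein of the invention recognizes a C-rich PAM and shows editing activity both in vitro and in eukaryotic cells. On the one hand, this enriches the PAM diversity of existing Cpf1 proteins and makes 30c9 an ideal tool for editing gene sequences with high GC content; on the other hand, the protein sequence and editing properties of the 30c9 nuclease provide a reference for discovering further novel Cpf1 nucleases with C-rich PAMs, expanding the existing range of gene editing tools.
The sequences are as follows:
SEQ ID NO.16 (28c2 gene sequence):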
ATGGTGAACGGCACCAAGAACTACTTCGACTGTTTCACCGGGTTCTACCCCATCAACAAGACCCTGCGGTTCGAGCTGAAGCCGATCGGGAAAACCAACGCCCTCATCGACGAGTTCAAGAAGGGCTACGTGGACTCCATCGTGAGCCTGGACGAGAAGCGGGCCGAGTCCAGGAAGAAAGTGATCGAGGTGCTGGACAACTACTATGAGTACTTCATCAACTGCGTGCTGAGCAAGGAGGTCCTGCTGGTGAACGACATCAACGAGGCCTACAAGCTATACAAGGACTTCAAGGCCGACAAGAAGGACAAGAACTTCAAGTCCTATAAGGTGAAGATGAGGACCAAGATCTCCGAGAAGTTCCAGTCCGAGAAGATCAAGTTCGCCCTGAAAGACTACAAGGACCTCTTCGGCAAGAAGCGCCTGCAGGAGTCCCTGCTGTACGAGTGGTACAAGCAGAAGCTGAACAACGAGGAGATCAACAACGAGGCCTTTGAGGACATCGTGAAAACCCTGAGCTACTTCATCGGCTTCACCACCAGCCTGAAGGACTACCAGGAGAACAGGAACAACTTCTTCGTGCCCGACGAGAAGAGCACCTCCATCGCTTACCGCATCATCGACGAGAACATGATCCGGTACTTCGATAACTGCATCCGGTTCGAGACCTTCATCGAGAATAAGATTGACCTGTTTGAGAGCCTGAAGCAGTGGGAGGAGTACTTTAAGCCCGAGAATTACATCAAGTACTTTACACAGGACGGGATCGACAACTACAACCAGATCATCGGGCGGAAGGGGAAGGACATCTACTCCAAGGGAATCAACCAACTGATCAACGAGTACCGGCAGATTAACAAGATCAAAAATAAGAACCTGCCGACCATGAATCAGCTCTACAAGCAGCTCCTGAGCAAGCACAACAGCGAAGAGCTGATCGTCGGCTTCAAGGACGAGAAGGACATGCTGCAGAAGATCGAGACCACTTACTACGAGTACTCCGAAATCGTGTCCAAGCTGGTGAGCTTCCTGAGCGAGTCCCTGGCCGACGACATCAACCTTTACATCCGCTCCGACAGCCTGACTAATCTGAGCAACAGTATGTTTGGCC
GCTGGGACTTTATCAACGACGCCATCTACTCTTACACCAGCGGATTCTCTGAGAAAGACAAGCTGAAGTA
CGAGAAGGACGTTAAGGAAGTGATCAGCCTCGTGAAGCTGCAGAAGGTTATCGACACCTATGTGAGCAG
CCTCGATATAGACGAGAAGGGGAAGTACATCTCCAATTCAAGTATCTACAAGTACCTGCTGTCCATCAAT
GACCTGAACCTGAAGAACGCCTACTCCGAGGCAAAGCCTATCCTCGTTCTCAACGAGATCGATAACGAGA
GGACAAATGACAGCGAGCGCATCCAGCAGATCAATAAGATCAAGTCCCTGCTGGACGCCATGCTGGAGA
TTATGCACTTCTATAAGCCCCTGTACCTGTATAAGAACGGCAAGAGCCTCGTCGAGGTGGAGAAGGACGA
GGTGTTCTATTCCGAGTTTGACTACCTCTACAGCCAGCTCATGCCAATCACAAAACTGTACGATAAGGTGC
GGAACCACATCACAAAGAAGCCCTACAGCAAGGACAAGTTCAAGATCTACTTCAATAAGCCCACTCTCCT
CGATGGCTGGGACCTTAATAAGGAAAACTCAAACTTGGGGGTGCTGCTTACCAAGAACAACAACTACTAC
CTGGGCATCATGAACGGGAAGTATAACACTTCCTTCGATACAACAGTGGCCGAGGTGAAAAACCAGATTA
ACGAGAGCTCTACAGCTATCGGGTATCTGAAGATGGAGTACAAGCAGGTCTCCGGGGCCAACAAGATGTT
CCCTAAGGTGTTCTTCGCCGAGTCCAATAAGCACATCTACAAGCCCTCCAAGGAGATACTGAACATCAGA
GAGAACAAGCTCTACACTAAGGGCGCTGACGATGTGGAGTCTCGCATCAAGTGGATTGACTTCTGCAAGC
ACTGTATCAAGCTGCACCCTGAGTGGAACAAATACTTCAACTTCAAGTTCAAGCCCACCACCGAGTACGA
GGACGTTAACACATTTTATGAAGATGCTGACGCCCAGATGTATAACGTGTCTTTTATCTCTTTCAACGAGA
GTTACATCAACGAGCTCGTCAATGAGGGGAAACTGTACCTGTTTCAGATCTATAATAAGGATTTTTCCCCA
AACAGCAAGGGCAAGCCAAATCTGCACACCATGTATTGGAAGATGATCTTCGAGGATAGCAATATTACTA
ACATCAACAATACCGGCCTCCCAGTGTTTAAGCTGAACGGCGAGGCCGAGATCTTCTACCGCAAGGCCAG
CCTGAATAAGAAGGTGACACACGAGAAGAACTTGCCCATCAAGAACAAGAACCGCAACAACCCCAAGGA
GGAGAGCATCTTCTCCTACGACCTCTACAAGGACAAGCGCTTCATGGCCGACAAGTTTTTCCTGCACTGTC
CTATCACCATCAACTATCGGACAAAGCCCCTCAGCAGTAGCGAGTTTAACAAGAAAATCAATTGCATCGT
GGAGAATAATAAGGACATCAGCATCCTGGGCGTGGATAGAGGCGAGCGCCATCTGCTGTACTATTCCCTG
ATCAATCAGAAGGGGGAGATCCTGAAGCAGGGCAGCCTGAACTCCCTTAGCACAAGTTACGAGCGTGAC
GGCCAGGAAATCAGCGTGCTCACCGACTACAACTCCATCCTGCAGGGCAGGGAGGACGAGCGCGACGAT
GCTAGGAAAAACTGGGGGACCATCCAGAATATCAAAGAGATCAAAGACGGCTACATGTCCCACATTGTG
CACCAACTGAGTAAGATCCTCATTGACAACAACGCCGTGCTCGTGCTCGAAAACCTGAACAGCGGCTTTA
AGCGGGGCCGGTTCAAGATCGAGAAGCAGGTCTACCAGAAGTTTGAGAAGGCCATGATCGAGAAGCTGA
ACTACCTAGTCTTTAAGGACCGGAACAGCACCAGCCCAGGCTACTATCTGAACGGCTACCAGCTCACCGC
CCCGTTCGAGGGCTTCAAGAACCTGTATAGCCAGAGTGGCATCATCTACTACGTGTGGCCATCCTACACCT
CTAAGATCTGTCCACGCACCGGCTTTGTCAACCTCCTGAAGCTGAATTACGAGAACATCGAGAAGTCCAA
GGAGATCTTTAACAACTTTGACATCATCTCCTACAATAAGGCAAAGGACTATTTCGAGTTTGGCCTCGACT
ATCGCAGATTTGGGAAGGACGCAGGCAAGTCAAAGTGGCTGATCTGCACCTATGGAAATGAGAGGTACTT
CTACAACAGCAAGCTGAAGAAGTTCGAGTGCATCGACATCACCAACAAGATCAAGGAGTTGTTTAAGTCC
AACAACATCGACTACCTGAACGAGAAGGACCTGCGGAACAAGATCACCAACGTGAACAGCAAAGATTTC
TTCAACTCCCTGCTGTTCTACCTGCGCATCACCCTGCAGCTCCGCTACACCAATGGGGGAAACCTGGATGA
GAACGACTATATCCTGAGCCCCATCAACGACGGATCTGATAAGTTCTTCGACTCCCGGTGCGCCTCCGAG
AGCGAGCCTAAGAACTGCGACGCCAACGGGGCCTACCACATCGCTCTGAAGGGCCTGCGTCTGATCCACA
GCATCGAGGACGGCACTACCAGCAAAATCGGCAATGAAACCACCGATTGGTTCACCTTCGCCCAGAACAAGAACAAGCTGGTGGAG;
SEQ ID NO.17 (28c6 gene sequence):
ATGAGCAAGGGCAAGATCTGGGAGAACTTCATCAACCAGTATAGCGTGAGCAAGACCCTGAGGTTCGAGCTGAAGCCCGTGGGCAAGACCCTGGAGAACATTAACGCTAAGGGGCTGATTGAGGAGGACGAGCAGCGGGCCGAGGATTACAAGAAGGCTAAGAAGATCATCGATGAGTACCATAAGTACTTTATCGAGGGGGCTCTGGGAAGCTGCAGCCTGGACCTGAACATCCTGAACGAGTTTCTGCAGCTCTACAACAAGGCCCAGAAAACCGACGCCGACAAGAAGGAGTACGAGAAGATCCAGACCACCCTGCGGAAGAATATCGCCGAGAGCTTTGGCAAGAACGCCGATAAAAAGACCAAGGAGCAGTATGAGAACCTGTTCAAAAAGGAGCTCCTGCGGAACGATCTGCCTGACTGGGTGGAGGACGAGGAGGACGCCAAAATCATCGAGCGCTTCAAGACTTTCACCACCTATTTTACCGGGTTCCACGAGAACAGGAAGAACATCTACGACAACGAGGAGAAGTCCACCGCCATTGGGTATCGGATCGTCCACGAGAACCTCCCCAAGTTCATTGACAATATGAACGCTTTCGAGAAGATCAGCAAGGCCCTGGATCTGTCCGAGATCGACCGGGACTTCCAGAGCGAGCTGGGGGAGATCAAGGCCGAGGAGTTCTTTACCATTGAGTTCTTCAACCAGTGTCTGAACCAGTTCGGCATCGATCGCTACAATACTCTGCTCGGCGGCATCTCCGAGGGCGAGAATATCAAGAAGAAGCAGGGGCTGAATGAGAGGATCAACCTGTATAACCAGCAGTTGAAGGGAGAGAGGAAGAAGGAGAGGCTGCCCAAGCTGAAGGTGCTCTACAAGCAGATTCTCAGCGACAGCTCCAGCCACTCCTTTAGCATCGACGAGTTCGAGAACGACAACGAGCTGCTGGAGTCCCTGGAAATCTTTTACAAGAATGAGCTGATCGGCTTTAATCACAGCGGCGTGGACTCTAACATCTTTGACCTCGTGAAGGACCTGCTGCTGAAGATCGACGAGTCCGAGCAGTCCTCAATCTACCTGAAGAACGATAAGGGACTGACAGAGATCTCTCAGCGGATCTTTGGCGACTGGAACATTATCAAGAGCGCCCTGGAGGAGTACTATGACGAGCACTACCCTCCAAAGAAGGACACATTCAACAAGAAGGAGCTGGATGAGCGCTCACGGTGGCTGAAGGAGAACCACAG
CATCGGCGTCATCGAGAAGGCCTTGGCCAACTACGAGAACGAAATTGTGAGGGAGCATCTGAAACAGAA
CTCCGCCCCCATCGTGAGCTATTTCAAGTCCCTGGAGGTGGACGGCGAGAACCTGATCGATAAGATCTAC
AGCGCCTACGGCAACATCAGCGATCTCCTGAATAGCAGCTACCCTGACGAGAAGAAGCTGGTGAGCGATC
GGACCAGCAAGGACAAGATTAAGGTGTTCCTGGACAGCCTCATGTCCCTGCTGCACTTTCTCAAGCCTCTG
GACGTTAAAGACCTGGGGAATAAGGACAGCGCATTTTACGGCGACTACGATTTTATCGTGGAGCAACTGT
CCAAGCTGGTGCGGCTCTACAATAAGACAAGGAATTATCTGACCAGAAAACCCTACAGCATCGAGAAAA
TCAAACTGAACTTCGAGAACAGCACCTTGCTGGCCGGATGGGATGTGAACAAGGAACGGGACAACAACT
GCGTGATCTTTAAGAGGCAGGACGGCGACCGCGAGCTGTTCTACCTGGGAATCATGGACAAATCCCACAA
TAAGATCTTCACTAAGATTGAAGAGGCTAAGTCCGACGATGTGTACCAGAAGATGAATTATAAGCTGCTG
CCAGGGCCTAACAAGATGCTGCCCAAGGTCTTTTTCTCTAAGAAATCCATCGACTTTTACGCACCTGGGGA
GGAACTGCTGAAGAACTACAAGAATGGGACCCATAAGAAGGGCGAAAACTTCAACCTCCAGCACTGCCA
CGAGCTGATTGACTTCTTTAAGCGGTCCATCAATAAGCACGAGGACTGGTCTCAGTTCAACTTCAAGTTTT
CTGACACCAGCGAGTACGAGGACACCTCCTTCTTCTTCAAGGAAGTGTCCCAGCAGGGCTACAGTATCAC
ATTCAAGAATATTGATAGGGAAACAATCGAGAAGTTCGTGGACGAGGGGAAGCTGTATCTGTTCCAGATC
TATAACAAAGATTTCAGCCCCAAGAGCAAGGGCAGACCCAACCTGCACACCCTGTACTGGAAGATGCTGT
TCGATGAGCGGAATCTGGCCAACACCGTGTACCAGCTCAATGGGGAGGCCGAGGTGTTTTACCGCAAGAA
GAGCATCAGCGAGAAAGATAGGGTGGTGCACAGGGCCGACGAGCCTATTGGCCTGAAGAACTCCGAGAA
CAGTGCCCAGAAGAGCCTTTTTCCTTATGACATCGTGAAGGATCGCCGGTTCACCGTGGACAAGTTTCAGT
TCCATGTGCCCATCACTCTGAACTTCAAGAGCGAGGGGAACGAGCGGCTGAATATTAGCGTGAACAAGTT
CCTGAAGGACAACCCCGACGTTAACATCATCGGCCTGGACAGAGGCGAGCGGCACCTGATCTACCTGACC
CTGATCAATCAGAAGGGTGAAATCCTTCACCAGGAGTCCCTGAACGAGGTCATGGGAGTGAACTACCAGC
AGAAGCTGCACAGAGTTGAGAAGGACAGGACAGAAGAGAGGCGGAACTGGGACCGGATCGAGAACATA
AAGGAGCTGAAGTCTGGATACCTGAGCCAGGTGGTCCATAAGATTAGCCAGCTCATGGTGGAGTACAATG
CCATCGTGGTCATGGAGGATCTGAATTTTGGCTTCAAGCGGGGCCGAATCAAGGTGGAGAAGCAGGTGTA
TCAGAAGTTCGAAAAGACCCTGATCGACAAGCTGAATTATCTGGTGTTCAAGGACCGGGAACCTGAAGAA
CCTGCCGGAGTGCTCAACGCCCTGCAGCTCACCAACAAATTTGAGTCCTTCAAGAAGCTGGGCAAGCAGT
GCGGCTTCCTGTTCTACGTGACAAGTGACTACACTAGCAAGATCGACCCCGCCACCGGCTTCGTCAACCTG
CTGTACCCTAAGTATGAGTCAGTGGAGAAGTCCCAGAACTTCTTCAGAAAATTCGACAACATCTGCTTCA
ACTCCGGCGCAGGCTACTTCGAGTTCGACTTCGACTACTCCAACTTCACCGATAGAGCCGATGGGACCCG
CACCCGCTGGAAGGTGTGCACCGTGGGCAACGAGAGGTTCGGCTACAATCCAAAGACCAAGGCCAGCGA
GACCGTGAATGTGACCGAGTCCCTGAAGGAGCTGCTGCTGCAGCACGAGATCGCCTTCGAGAATGGCGAA
TCTCTGGTGGAGTCCATCAGCAAGAACACTACCAAATACTTCCACAAGTCCCTGCTGAATTTTCTGAGGCT
GACCCTGACCCTGAGACATAGCAAGACCGGCACCGACATCGATTACATCCTGAGCCCTGTGGCCAACGAG
GAGGGCGTGTTCTTCGACTCCCGGAATGCCAGCGATAAGATGCCAAAGGACGCCGACGCCAACGGAGCC
TACAACGTGGCCCTGAAGGGCCTGATGGTGCTGGAGAGGATTAACGCCGCCGAGGACCTGAGCCAGTTCAAGTTTAAGGACATGAGCATCAAGAACAAGGACTGGCTGAAGTTCGTGCAGGACAGGCAGGGC;
SEQ ID NO.18 (28c12 gene sequence):
ATGATCGAGTACACCAACTTCATCGGCCTGTACCCCCTGTCCAAGACCCTGAGATTCAAGCTGCTGCCCATCGGCAAGACTCTGGAGAATATCACCCGCAACGGCATCCTGACAGATGACAAGCACCGCGCCCAGAGCTATCAGGAGGTGAAGAAGCTGATCGATGAGTACCACAAGGAGTTCATCGAGCACACCCTGGAGACCTTTAACCTGGAACTGCTTAGCACCAACAAGCAGAACTCCCTGGAGGAGTACCACCAGCTTTACCTGAAGGAGAAGAACGAGTCCGAGCTGAAGAACTTCACCAAGACACAGGAGAACCTGCGCAAGCAGATCGCCAAAACCCTGCAGAACGAGGCCAAGAAGGCTAGTCTGTTCGACAAGGATATGATTAAGAAGAACCTGCCCGACTTTATTCAGCAGCACCCCGACCTGAAGGACAAGGAAAACCTCGTGAAGGAGTTCGATGAGTTCACCACATACTTTACAGGCTTCCATGAGAACCGGAGGAACATGTATAGCGACGAGGAGAAGAGCACCGCCATCGGCTATCGGATTATCCACCAGAACCTGCCCAAGTTCATTGACAATATGATCGTCTTTAGCCGCATCCAGTCCGAGCTGCAGGGCGAGCTGAACCTGATCGCCGCTGACTTCAAGGACCTGCTGGTGGTCAACAACCTGGATGAGATGTTTACCCTGCCCTACTTCAACCAAGTGCTGACCCAGAGCCAGATCGACCTCTATAACATGGTAATTGGCGGGAAGAGCGAGGAGGGAAAGATTAAGAAGCAGGGACTGAACGAGTACATAAACCTGTATAACCAGAACCATAAGGAGCAGAAGCTGCCCCTGTTCAAGCCACTCTTCAAGCAGATCCTGAGCGATCGGCAGAGCCTGTCCTGGCTGCCCCAGCAGTTTGAGGAGGACCAGGAGCTGCTGAACGCCGTGAGGGAGTGCTTCTACTCCCTGAACGACTCCCAGTGCAACCTGAAGCACCTGCAGGCTCTGCTGGTTAGCCTGGCCGATTATAACCTGAATGGGATCTACCTGACCAATGGCCCCGCCATCACCACCATTAGCCAGCAGATGTTTAACGACTGGAACCTGATTAACCGCGCCATCATCGAGCGGATGAGCCGGGACATCAAGGCCAGCTCCAAGCAGAAGAGCGAGGCCAAACTGGAGGAGGAGATCAGGAAGCGGATGGACAGCACTGAGTCTTTCTCCATCCAGTACCTGAACGAATGCATCGAGACCAGCGAGATCGAGGACATCAAAAATGCCGCCGACAAGCGCATCGAAAGCGCCCACTTTGCCAGGCTGATGATCTGCAACAAGAAAACCAACGAGCAGGAGAATCTCTTCGAAAGGATCTACACCGCCTACAACGAGGCCCAGACCCTGCTGAATACCCCCTACCCAGAAAATCAGAATCTGATCCAGGACCAGGAGAACGTG
GCCCGGATCAAGTACCTGCTAGACACCGTAAAGGACCTCCAGCTTTTCGTTAAGCCACTGCTGGGGAAGG
GCTACGAAATCGGAAAGGATGACACCTTTTATGGTATACTGACCCGGCTGTGGACTGTGATCGACCAGCT
CACCCCCCTGTACGATAAGGTGCGAAATTACCTGACCCGCAAGCCTTACAGCGATAAGAAAATCAAGCTG
AATTTTAAGAACTCTACTCTGCTGAACGGCTGGGATAAAAATAAGGAGGCAGATAACACTGCCATCATCA
TGCGCAAGGAGGGACTGTTTTACCTGGGCATCATGAACAAGGACATTAAGGGGTATAAGAGGATGTTCGA
GAAGTGCCCTCAGTGCAGCGAGGAGGAGGCCTACTACGAGAAGATGGAGTACAAGCTCCTGCCTGGGCC
AAACAAGATGCTCCCTAAGGTGTTTTTCGCCAAGAACAACATTGAGCTGTTCAAACCCTCCGAGAGGATC
ATGGCAATCCGGGAGAACGAGACCTTTAAGAAAGGCGACAAGTTCAACCTCGCTGACTGCCACGCCTTCA
TCGACTTCTACAAGGAAAGCATCGCCAAACACCCCGAGTGGAAGGACTTTGACTTTCACTTTTCCGAAAC
CCAGCTCTACAATGACATTTCCGGGTTCTATCGCGAGGTGGAACACCAGGGATATAAGATGAGCTTTAGA
AAGATCCCAGCCACCTACATTGATCAGCTCGTGGAGAACAATGAACTGTACCTGTTCCAGATCTATAACA
AGGACTTTAGTGAATATAGCAAGGGCACCCCTAACATGCATACCCTGTACTGGAAGATGCTGTTTGACGA
GAGAAACCTGGCTGATGTTGTGTATAAGCTGAACGGCCAGGCTGAGCTGTTTTACCGACCCGCCAGCCTG
AACTACAACCGGCCCACTCACCCTAAGAACGAGCCCATCACCAACAAGAACAAGAACAACCCCAAAAAG
GAGTCTATCTTCAAGTACGACCTGACTAAGGATAAGCGGTACACCCAGGATACCTTCCTGCTGCACGTTCC
CATTACCCTGAACTTCAAAGGCACTAATAATGGCAATATCAACCAGCAAGTCAACAGCTACCTGCAGACT
GCTGATAATACACACATCATCGGCATCGACAGGGGCGAACGCCACCTGCTGTACCTCGTCGTCATCGACA
TGAAGGGGAACATCAAGGAGCAGTTCTCCCTGAATGAGATCGCCAACCAGAACAAGGGGATTGAGTACC
GGACAAACTACCACCAGCTCCTGGAGAACAGGGAGAAGGAGCGGGTGGAGGCACGGGTGAATTGGCAG
AACATCGAGAACATTAAGGACCTGAAGGAGGGCTACCTGAGCCAAGTGATCCACCTGATTACCCAGCTCA
TGCTGAAGTATCACGCCATCGTGGTGCTCGAAGATCTCAACTTTGGCTTCATGAAGGGGAGACAGAAGGT
GGAGAAGTCCGTGTACCAGAAGTTCGAGAAGCAGCTCATCGATAAACTGAACTATCTCGTGAATAAGCAG
ATCGACGCCGAGAAGCCTGGAGGCCTGCTCAAGGCCTACCAGCTCGCCAAGCCTTTTGAGAGCTTTCAGA
AGATGGGCAAGCAGTCCGGCTTCCTGTTCTACATCCCCGCTTGGATGACATCCAAGATCGATCCTGTGACC
GGCTTCGTCAATCTGCTGAACACCAACTACGTCAACGTTAAGGAGTCCCAGAAGTTTTTCAGCAACTTCGA
CCGGATCGCCTACAATCCAGAGAAGGACTGGCTGGAGTGGGATATTGACTACAATAAGTTCACCACTAAG
GCCAAGAATAGCAGGCACAACTGGACCATCTGTACCCAGGGCGAGCGGATCGAGAATCACAGGAATGAG
AAGAACGGCCAGTGGAACAGCCAGAACGTCAACCTGACCGAGGAGTTTAAGAAGCTGTTCGCACTCTAT
GACATCGACCTGGCCCAGGATCTGAAGAAGTACATCATCCAGCAGAATGACGCTAAGTTCTTTAAAGAGC
TGCACAGAATCCTGAAGCTGACCCTGCAGATGAGGAACTCCCAGATCAACAGCGACATTGACTACCTCGT
GAGCCCCGTGGCCAACGCCGAGGGCTGCTTCTACAATTCCCAGACCGCTAACGCCACCCTGCCAGCCAAC
GCCGACGCCAACGGGGCCTACAATATCGCCCGCAAGGGCCTGTACCTGCTGCAGCAGATCAAGAAGGCC
CCTGACCTGGCCAAGCTGAAGCTCACCATCTCTAACGAGGAGTGGCTGAAGTTCGCCCAGGAGAAAACCTACCAGAATGAC;
SEQ ID NO.19 (28c13 gene sequence):
ATGTTTAACCAGTTCACCAACCTGTACCCAGTGATTAAGACCCTGAGATTCGAGCTGAAGAGCATCGGCAACACTATGGACACTATCGAGAGCAATCAGGTCATCCACAATGACGAGAAGAGGGCCGACGCCTACGCCAAGCTGAAGGTGACCCTCGATGCCTACCACAAGGATATTATTGAGAAGGTGCTGAGCCGCGCCAGACTGACCGGCCTGGAGGACTACGCCATCGCTGTGAACAACCTGAAAACCTCTAAGGGCAACGCCGCTTACGGCAAAGAGCTGACCAAGAACAAGGAGCAGTTGAGAAAGCAGATCGCAGGATTCTTCAAGCAGCCCGAGTTCGCCCCAATTTTCAAAGATCTGTTCAAGGAGGGCGTGATCAAGAAAGACGTTAAGGCCTGGATCGACACCCAGCCTAACCCTAGCGATTACTTCTACTCCGATGACTTCGCCAATTTCACCGGCTACTTCGGCAACTATAACCTGATCCGGCAGAACCTGTATAGCCCTGAGGCTAAGCACGGCACCATCGCCTATCGGCTGATTGACGAGAACCTGCCCAAGTTCATCGACAATCTGAGCATTCTGCAGAACATTCAGAATAAGAATCCCGACCTGTTCGACCAGTTGAGCGACCAGTACCAGCAGTACTTCAGCGAGCTGCTGCCTTCTAAGCCTACACTGGCCGACTTCGTGAGCCTGGACACCTTCAATGATCTGCTGACCCAGAAAGGCCTGGACGCCTACCAGCAGATCATCGGCGGCATCAAGACTGAGAACCAACTGATCCAGGGCATTAATGTGCTGATCAATCTGCACAACCAGCAGCACCCCGAGCAGAGCAAGACCCCCAAACTGAAGCCCCTCTATAAGCAGCTCCTGTCCGACCGCGGCACTTTCAAGCTCCCACGGAAGTTTGAGGATGACGCTGAAATGATCCAGGCCAACCGCCAGTACTTCGAGGAGGTGCTGGGCAACAACACTCTGTTCGAGACCGGCGAAACACCCACCGAAGCCATGAACCAGCTTTTCCTGAGCATCGAGAATTACGATCTGAGCAAGATCTTCATCGAGTCCCCCCTGCTGGTGACCTCCATCTCCCAGAAGATCTATGGCTCCTATGCCGTGATTCCCCAGGCCCTGGAGTACTACCACGATAATCACGTTAACCCCTCTTACGCCGCCAAGTTCAATAAGGCCAAGTCCGACAAGAGCAGGGAGACTATGGAAAAGGCCAAAGCCGCCTGGGTGAAAGGCGTGCACGCCGTGAGTGTGATCCACCAGGCTGTGATCGCATACAATGATGTGCTGCCTGATGACGCAAAGCTGACAGATACCCAGCCCGTGATTAGCTACTACAAGGACATCCAGTACTCCGAAAAGACTGGCGAGTCCCAGCAGATCTTCGATGCCCTGATGCGCCGCTACCACCAGGCCAAAGGCATGCTGAATACTGATTACCCAAAGGGCTCCAAGCAGATCCTGAACAACAAGTCTAGCTTCGCCATCGTGAAAAACCTGCTGGATGTGTCCAAGGCCTACGTGAACGCCGCCCGCGATCTGACAATCAAAAAGCCCGAAGGCCTTGACCTGGACCTGCTGTT
CTACGAGAGGCTCGCCAAAACTTACACATACCTGCAGGACCTGCACGCACTGTACGACACCACGAGAAAC
TACGTGACCCAGAAACCTTTCTCCACCGATAAGATCAAGCTGAATTTTGACTGCGCTCAGCTCCTGGCCGG
GTGGGACTTTAATGTGATCGATGCCAAGAGGGGCGTGTTTCTGGTCAAGAATGGGCGGTATTACCTCGTC
ATCATCGATAATAAGCATAAGAAGGCCATGAATAACCTGCCCGCTCCTATCACTAATAACTGCTACGACA
AATATAACATGAGACTGAGTAAGGACGCCCACATGGCCCTGCCTAAAAAGCTCTTTACCAAGGATAACCT
CAAGATCCCTGCCATTGCCGAGATGGAGCGCAGGTGTCGGGACAAAAATGGCGGCCACCACCTGAGGAA
GAGTCCCGACTTTGATAAGGACTTTATGCACCAGATGATTGACACCTTTAAGGACATTATCAAGAAGGAC
AAGGACTTCGACGTTTTCGGCTTCCAGTTTAAGCCCACTCACCAGTACGAGGACATCAATGAGTTTTACGC
CGACTTCAATGAGCAGGCCTTAGTGACTTGGTACGATAAGGTTGATAGCGATGTGATTGATAGCCTGGTG
GCCGAGGGGAAGATCTACCTGTTCGAAGTGTACTCCAAAGATTTTAGCGACAAGAGTACCGGGACTCCCA
ACCAGCAGAGCCTGATCCTGCAGTACCTGTTCTCTCAGGATAATCTGGCCAAAAGGCACTTTAAGCTGAA
CGGCGAAGCCGAAGTGTTCTACCGGAAGGCCTCTATTGATAAGGACAAGGCCGTGGTGCATAAGAAGGG
CTCCCTGCTGGAGAACAAAAACCCTGCACGGCCCAATTCTAAGATCGCTAAGTTCGACATTGTGAAGGAT
AGACACTACACCGAAGATAAGCTGTTCCTGCATATCCCAATCACACTGAACAACAATGCCGCCGACATGA
AATCCTACGCTATGAATAGCAAGGTGCTGAACACCCTGAAAACAAACGGAGGCGTGAACGTGATCGGCA
TTGACAGAGGGGAAAGAAATCTGCTGAAGATCACCGTGATTAATAGTGCCGGGGAGATCTTGCATCAGG
AGTCCCTGAATAAGATCACTAGCGGGCAGGACATGGTGACTGATTACCATGAGCTTCTGGACAAGAAGGA
GCAGAGCCGCGCTGAGTCTAGGCTGAATTGGCAGGAGGTCGAATCCATTAAGGAGATCAAGCAGGGCTA
CCTGTCCCAGGTGGTGTATAGACTGTCCCAACTGATGCTGCAGTATAAAGCCATCGTGGTGCTGGAAGAT
CTGAATATCGGCTTTAAGCGCGGGAGGTTTAAGATCGAGAAACAGGTGTACCAGAATTTCGAGAAGGCCC
TCATCAACAAGTTAAATTACCTCGTGCTGAAGCAGTTGGAGGCTACCGAGGTGGGGGGCACTGCTCATGG
ATACCAGCTCACAGCCCCCTTTGAGAGCTTTCAGAAGCTGGGGAAGCAGTCTGGCTGGCTCTTTTACGTCC
CCGCCTGGAATACATCCCATATTGACCCCACCACAGGCTTCGTGAACCTGCACCACTTCAAATACGAGAG
CGTCGCCCAGGCAACAGACATCATCGACAAACTGAGCAATATCCGCTACAATCCAGAGAAGGACTACTTC
GAGTTCGCCATTGACTACAACGAGTTCACTTTTAAGGGGGGCGACAGCCAGAAGTACTGGGTGGTGTGCT
CAACCCCTTACAAGAGGTACGTGTTTGATAAAAAAGCCAACATGGGCAGAGGCGGCACCAAGGCCGTGG
ATGTGAACGCCGAGCTGAAGGCCCTCTTTGCAGCCCACGGCGTGGATTATGCAAGCGGAGAGGATCTGAG
GCCCCAGATTAAGGCCAAGGCCAACAAGGAGCTGCTGAGTCAACTGCTGTTTCTGCTGAAAACCCTGACC
GCCATGCGGTACACCAACGCCAGCTCCTACGAGGACTACATCCTGTCTCCAGTGGTGAATAAGGCCGGAG
AGTTCTTTGACAGCAGGAAGGGCGACGCCACCCTGCCACTGGACGCCGACTCTAACGGGTCCTACCACAT
CGCCCTGAAGGGACTGTGCCTGCTGCAGAGGGTGTACGACTGGCGCGGCGAGGAGTTTAAGGGCCTGGACCTGTTCATCTCCAATAATGACTGGCTGAAGTTCGCCCAGGACCGGCAC;
SEQ ID NO.20 (28c15 gene sequence):
ATGAGCAACACTAAGGACAACATCTTTAACAACTTCACCGGCATCTACCCCATCAACAAGACCCTGCGGTTCGAGCTGCGGCCCGTGGGCAAGACCTACGACCTGATCAAGGACTTCAAGAACGGGTACGTGGAGTCCATTGTGGCCATCGACGAGAAGCGGTCCGAGGCCCGGAAGCGGATCATCGAGATCATCGACGAGTACTACGAGGAGTTCATCAACACCGTGCTGAGCAAGAAGGTGTTCTACTCCGACGACATCTGGCAGACCTACACCAGCTACAAGGCCTACAAGAGTGACAAGCGGAACAAGGAGTTTGTCACACAAAAGGCCATCATGCGGAAGAAGATCAGCGATGCCTTCCAGAACGAGAAAACCAAGTTTAACCTGAAGGACTTCAAAGACCTGTTCGGCAAGAAGAGCAATCTGAAGGAGTCCCCCCTGTATAAGTGGTACAAGAACAAGCTGGACATCGGGGAGATCACGGGCGAGGATTTCGAGGACATCATCAAGATAATCACCTACTTCATCGGCTTCACCACCTCCCTGAAGGATTACCAGGAGAACCGGAACAACCTGTTCGTGGCCGAGGAGCAGAGCACCGCCATCAGCCACAGGATTATCGATGTGAACATGATTCGCTACTTCGAGAATTGTATCAGATTCGAGAATATGAAGGACTCCGAACTGCTGGAGGACATGGGGAAGTGGGAGAAGTACTTCGTGCCAGCTAACTACGACAATTTCTTCACTCAGGAGGGTATCGATAACTACAATGAGATTATTGGCCGGAAGTCCAAAGATCTCTACTATAAAGGCGTGAACCAGTTGATCAATGAGTATAGGCAGAAGAACAAGATCAAAAATAAGGATATGCCAACGATGAACCAGCTCTACAAACAGCACATCAGCAAGAACGGCGACAACGAAATCAACAACGACTTCTCCAACGAGAAAGAGATGCTGGAGCAGATCGAGCAAGCCTACATCACCAGCCTCGATAAGATCAATAGGATCGTGTCCTTCATCAATGAGAACATTACCGAAGGAAATAAGATCTTCATTAGGAAGGACTTCGTGACTAATATCAGTAACCGCCTGTTCGGGGAGTGGAACTTCATTAACAACGCCCTCTACAGCTACCTGAGCGGCCTGAGCGCAAAGAACAAGGAGCTGTTCGTGAAGCAGACAGAGGAGGTCATCAAGATCAGCGAGCTCCAGAACATCATCGACCTCTACATCAACAATCTGGATGAGGATGAGAAAGAGAAGTACCTCAAGACCGACGCCATCTACACCCACTTCTGCTCCTTCGATGTGTGCGGGGTGCAGAACGCATACTATGAGGCCAAGACCGTGCTCGCCGTGGACGAGATCAATAAGGACCGGGAGAAAGAGGAAGAGGGAGCCAAGCAGATTTCTAAGGTGAAGAAGCTGCTCGACGAGATCCTCGAAGCCGTCCACTTCTACAAGCCCCTTTACCTCTACAAGAACGGGAAGGAGATCGACGAGATTGAGAAGGATGAGATTTTCTACAGCGAGTTCGACTACCTGTATTCCCAGCTCATGCTGGTGACCGAGCTGTACGACAGGGTGCGCAACTACCTGACCAAGAAACCCTATAGCAAGGATAAATTCAAGATCTACTTTAACAAGCCTACACTGCTCGACGGCTGGGATCTGAACAAGGAGAAAAACAATCTGTCCGTGCTCCTCATCAAGGACGGCTTCTATTATCT
CGGCATCATGGACTCCAAGTACAATAGCGTGTTCGATGTGTCCGCAGACGATGTGAAGATCAACACCACC
GAGCTGTCCGAGGAGGCTACCTTCCTGAAGATGGAGTATAAGCAGGTGAGCGGAGCTTCCAAGATGTTCC
CCAAGGTGTTCTTCGCCGCCTCCAACAAGGACATGTTCAAGCCAAGCGAGGAGATTTTGAACATCCGGGA
GAATAAGCAGTACCTCAAGGGGGCCAATAACAGGGAGGCTGTAATCAAGTGGATCGATTTCTGCAAGGA
CTGTCTCAAGATCCATCCAGAATGGAACCGCTACTTTAACTTCAACTTCCGCCACAGCGACGAGTATGAG
AACGTGAATAGCTTCTATGAGGACGCCGATACTCAGATGTACTACATCAACTTCGTGAAGTTCAAGGAGA
CTTACATCAATGATCTGGTGGAGGAGGGGAAGCTGTTCCTGTTTCAGATCTACAACAAGGACTTCTCCGA
GTACTCCAAGGGCAAGCCCAACCTCCACACCGTGTATTGGAAGATGCTGTTCGACGAGAATAACGTGCGG
AACATCAATGACAATACCGGCAAGCCCGTGTTCAAGCTGAACGGCGAGGCTGAGATCTTTTATCGGAAGG
CCAGCCTGGATAAGAAGGTGACTCACAAGAAAAACTACCCTATCAAAAACAAGAATAAGCACAATAACA
AGACTGAGAGTATCTTTGAGTACGACCTCTACAAGGACAAGCGGTTCATGGATGACAAGTTCTTCTTCCAT
TGCCCCATCACCATCAACTACCGGGCCAAGAATATCCTGTCCAGCGAGTTCAATAAGAAGTTCAACTTGC
ACATCAAAAACAGCGATAACATGAACATTCTGGGCGTGGACAGAGGCGAAAGGCATCTGCTGTACTACTC
CCTGATCAACATTAAGGGAGGAATCATCAAGCAGGGGAGTCTGAACACCATCTACGATTCCTACGAAAAG
GACGGCATCAATATCCCCGTGATTACCGACTACAAGTCCATTCTGAAGGACCGCGAGGACGAGCGGATGG
ACTCCAGGAAGAACTGGGGCACCATCAAGAACATCAAGGAGATGAAGGAGGGCTATCTGAGCCATGTGG
TGCATCAGGTCAGCAAGCTCCTCATCGACAACAATGCCATCCTGGTCCTGGAGAACCTGAACAGCGGCTT
CAAGCGGCGCAGACTGAAGATCGAGAAGCAGGTGTACCAGAACTTCGAGAAAAGCCTGATCAACAAGCT
GAACTACCTCGTCCTGAAGGATGCCGATAACAAGGATGTGGGGCACTTCCTGAAGGGCTACCAGCTCACC
GCTCCTTTCGAGGGGTTCCAGCGCCTGAACAACCAGTCCGGCATCATCTACTACGTGTGGCCCAGCTATAC
CAGCAAGATCTGCCCCCGCACCGGTTTCGTGAGCCTCCTGCACATCAACTACGAGAACATCGAGAAGTCC
AAGGAGTTCTTTAACAAGTTTGACAAGATCTCATATAACAAGGACAAGGACTACTTCGAGTTCCACCTGG
ATTACACCCGGTTCGGGAAGAACGCTGGCAAGAACAAGTGGGTCATCTGCACTTACGGCAAGGATCGCTA
CTTCTTCAACCAGAAGCTGAAGAAGTACGAGTACATCGACATCACAGAGAAGATCAAGGAGCTGCTGAG
CAACAACGGGATCGACTTCATCAACGAGAACGACATGCGCAAGTCCATCGTGGAGAACAACTCCAAGAA
CTTCTTCGGCTCCCTGCTGTTTTACCTCAAGGTCGTGATGCAGTTGCGCTACACCAACAGCAACGACGGGT
GCCGGAATGAGAACGACTACATCCTGAGCCCCGTGGCCGACATTAACGGCATGTTCTTCGACTCCCGGCA
CGCCTGCGACAACGAGCCCGAGAACGCCGACGCCAACGGGGCCTACCACATCGCTCTGAAGGGCCTGCG
CATGATCCAGTTCATCGAGAACGGCGTGATCACCAAGCAGGGCAACGAGACCACCGACTGGTTCAAGTTCGCCCAGAATAAGCTG;
SEQ ID NO.21 (30c9 gene sequence):
ATGAGCGCCCAGAGCGCCCTGAGCACCCTGATCAACAAGTACAGCCTGAGCAAGACCCTGCGCTTCGAGCTGATCCCCATCGGCAAGACCAAGGAGAGCATCGACCGGAAAGGCCTGCTGAGCCAGGATGTGAAGCGAGCCCAGTCCTACAAGGAGGTGAAGAAGATCATCGACGAGTACCACAAGGAGTTCATCGAGAAGTCCCTGATCAACGCCAAGCTGAAGGGCCTCGAAGAGTTCAGCAAGCTGTACTACAAGCTGCAGAAGGAGGACAAGGATAAGAAGAATATCAAGAAGATGCAGGATAACCTGCGCGAGCAGATCTCCGACCTCTTCAAGAACAACAAAAAGGACAAGTGGAACATCCTGTTTAAGGAGGACCTGATCAAGAAGGAGCTGCCACTGTTTGCGAAGGATGATAAGCAGAAGAACCTGATCAATGAGTTCAACAAGTTCACCACATACTTCACCGGCTTCCACAAGAACCGGAAGAACATGTACGCCGAGGAAGAGAAGTCCACCTCTATTCCCTACCGGATCATTCACCAGAATCTGCCTAAGTTTCTGGATAACATCAGGATTTTCGAGAAGATTAAGAAGAACAAGATCAACACTGACGTAATCGAGAAGGAGCTGAGTCTGTTCCTGAACGGAATCAAGATCAACGATATTTTCAGCATTAACTTTTTCAACGATGTGCTGAACCAGAAGGGCATCACCTTCTATAACACCATCCTGGGCGGAGTGAGCGAGAAGGACCGCACCAAGATCAAGGGCATTAATGAGTATGTGAACACCGAGTACAACCAGAAGCAACTGGACAAGAAGAGCAAGATCCCCAAGCTGAAGCAGCTCTACAAGCAGATCCTGAGCGACACCGAGACCGCCAGCTTCGTGCTGGAGCAGTTCGAGAACGACAACCAGCTCCTGGAGAAGATCGAGCAGTTCTACAACACAGAGCTCATCAATTACGAGACCGAGGGCAAGACCCAGTCCGTGTTCCTGCAGTTTGAGCAACTGTTTAAAAACATGCAGAATTACGACGCCTCCAAGATCTACATTAGCAATCTCTCCATCGCTAACATCAGCAAGATCATCTTCGGCGACTGGTCCATCATCTGCAACGCCCTGGCCGAGTGGTACGACAAGCACAACACAAAGGGGAAGAAGATTAACGAGTATAAGAAGGAAAACTTCCTGAAGCAGGATTTCAGCATCCAGCAGATTGAGGACGCCGTGCTGGAGTACAAGAACGACACCTTGAACAAGGAGATCAACTTCCTCCTGAACTACTTCGCCAGCTTCCTCAACGAGAAGTCCAAGAAAAACATCATCCAGCGCATCGAGACCGAGTACTCCAAGGTGAAGGACCTCCTGAACACCGATTACCCCGAGAAGAAGAAGCTGGCCAGCGACAAGGACAACGTGAGCAAGATCAAGGCCTTCCTGGACTCGCTGATGGACTTTCTGCACTTCGTGAAACCCTTCAATATTAAGAAGGACACAGGGCTGGAGAAGGAGGAGAACTTCTACTCCATCTACGTGCCCCTGTTCGAGCAGATCGACAAGATCATCCCCCTTTACAACAAGGTGCGCAACTACCTGACCAAGAAGCCCTATAGCACCGAAAAGATCAAGCTGAACTTCGAGAACAGCACCCTGCTTGACGGCTGGGACCTGAACAAGGAGTCCGACAACACTAGCGTGGTGCTGCGCAAGGACGACCTCTACTACCTGGGCATTATGGATAAGAAGCACAATCGGATCTTCAAAGAACTGCCCAGCCAGAACGGCAATGAGAGTAGCTATGAGAAGATGATCTACAAGCTGCTGCCGGGGCCAAATAAGATGCTGCCCAAGGTGTTCTTCTCCAAAAAGGGCAAGAAGCAGTTCAAGCCCTCCAAGAAACTTCTGAAGAAGTACGAGGACGGGACCCACCTGAAGGGCGATAACTTTAATATCAATGACTGCCACAACCTGATCGACTTCTTTAAGGAGTC
CATCGCCGAGCACGAGGACTGGAAGCAGTTCGACTTCAAGTTTAGCAGCACAAGTAGCTACAAGGACCTGTCAAATTTCTATAAGGAGGTGGAGAAACAGGGCTACAAGATCACATTCCAGAACATCTCTGAGAACTATATCAACCAGCTCATCGACGAGGGCAAGCTCTACCTGTTCCAGATCTACAATAAGGACTTCAGCAAGTACAGCAAGGGGACCCCCAACCTGCACACCCTGTACTGGAAGATGCTGTTTGATAACGACAACCTGAAGAACATTGTGTATAAGCTGAATGGCAAGGCCGAGGTGTTCTACCGCAAGTCCTCCCTGATCCTGGGGGACAACATCGTGCACAAGGCTGGCGAGGCAATCATCAACAAGAACCCCGACAACGAGAAAAAGCACAGTACCTTCGATTACGACCTGATTAAGGACAAACGCTTCACCCTCGACAAGTTTCAGTTCCATGTGCCCATTACCCTGAACTTCAAGAGCGAGGGGAGGCAGAACCTGAACGAGGATGTGAGGAAGTTCCTGAAGAACAACCCTGACATAAACATCATCGGTATCGACCGGGGGGAGCGGCACCTCCTGTACCTGACCCTCATCAACCAGAAGGGAAAGATCCTCTTCCAGAAAAGCCTGAACGAGATCACCAACGAGTACAATAACAAGAACGGTAAATCCCAGATCAAGAGCACCAACTACCACTCCCTGCTCGACAAGAAGGAGAAGAAGCGCGATGAGGCCCGCAAGAACTGGGGCATAATCGAGAACATCAAGGAGCTGAAGGAGGGCTACATGAGCCAGATCGTCCACTATATCAGCAAGCTGATGATCGAGAAAAACGCCATTCTGAGCCTTGAGGACCTGAACTTCGGGTTCAAGCGCGGACGCCAGAAGGTCGAGAAGCAGGTGTACCAGAAGTTCGAAAAGATGATGATTGACAAGCTCAACTACCTTGTGTTCAAGGACAAGAAGGCCAACGAGACCGGCGGCCTGCTCAATGCCCTGCAATTGACTAACAAGTTCGAGTCCTTCGCCAAGCTGTATAACCAGTCCGGGTTCATCTTCTACGTCCCAGCTTGGAACACCAGCAAGATCGACCCAATCACCGGCTTTGTGAACCTCCTGAAGCCTTACTACGAGAACCTGAATAAGAGCCAGGAGTTTTTCAAGAAGTTCAACAACATCAAGTACAACCCTAAGCAGGAGTACTTCGAGTTCAACTTCGACTACAAGAACTTCACCAACAAAGCCGAGGGCAGCAAGAACGTCTGGGAGATCTGCACCACTAACAATGAGCGGTTCATGTGGGACAAGACCCTGAACAGCGGCAAGGGCGCTCAGAAGGCCGTGGATGTGACACAGGAGCTGAAGAAGCTGTTTGACAGCAGCAAGATCAACTACCTGAACGGAAACGACATCAAGGAGGACATTATCAATCAGAACTCCGCCGACTTCTTTCGGAAGCTGATGAAGCTGCTGTCCGTGGTGCTGAGCCTGCGGCACAACAACGGCCTGAAGGGGAAGGACGAGAAGGACTTCATCCTGAGCCCCGTGGAGCCCTTCTTTAACAGCCTGAACGCTAAGATGGAGGAGCCTAAGGACGCCGACGCTAACGGCGCATACAACATCGCCCTGAAGGGCCTGCTGATCCTGAAGCAGATTAACGAGAGTGAGGACCTGCGCAAGATCAAGTTCAACCTGAGCAATAAGGAGTGGCTGAAGTTCGCCCAGTCTAAGAGCTTC;
SEQ ID NO.22 (Library spacer sequence):
ATGGCGAATACTTTTAAAGTCAT;
The primer sequences are:
library-NGS-F:
ACACTCTTTCCCTACACGACGCTCTTCCGATCTgtctacaatcggctcgatcga;
library-NGS-R:
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTgcgcagaccaaaacgatctc.
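Example 4: In vitro cleavage assay
This example uses an in vitro cleavage assay to verify that the six type V CRISPR/Cas12a gene editing systems of Example 1 have cleavage activity in vitro, and to confirm that the PAMs identified in Example 3 are correct.
The procedure is as follows:
(1) Based on the PAM results identified in Example 3 above, suitable target sites were sought in the CDKN2A gene. DNA fragments having the same target sequence but different PAMs were obtained by an annealing reaction, and primers were designed on both sides of the site to obtain a DNA fragment 2000 bp in length. The target sequences selected for the six CRISPR/Cas12a systems of the invention and the PAM sequences to be tested are listed in Table 1.
Table 1. Target sequences and PAM sequences for in vitro cleavage
(2) Using the repeat sequences of the type V CRISPR/Cas12a gene editing systems of Example 1 above, crRNAs containing the repeat sequence and the sequence corresponding to the CDKN2A target site were obtained by in vitro transcription. The purified Cpf1 protein described in Example 3 was incubated with the crRNA to form a complex, which was then mixed with the CDKN2A DNA fragment and incubated at 37 °C for 30 min. An appropriate amount of proteinase K was then added, followed by incubation at room temperature for 15 min and inactivation at 98 °C for 10 min. The in vitro cleavage results were examined on a 1.5% agarose gel.
The results are shown in Figure 4. All six CRISPR/Cas12a gene editing systems of the invention show good cleavage activity in vitro: each cuts the 2000 bp CDKN2A DNA fragment into two fragments of 500 bp and 1500 bp. This indicates that the expressed and purified Cpf1 proteins are biologically active and that, guided by the crRNA, each of the six Cpf1 proteins recognizes the target complementary to the spacer sequence and cleaves it correctly.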
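The band pattern reported for Example 4 follows from simple arithmetic on the cut position. The following is a minimal Python sketch, assuming a single blunt cut; Cpf1 in fact leaves staggered ends, which changes fragment lengths by only a few bases and is not resolvable on an agarose gel. The function name and the cut position of 500 are illustrative assumptions chosen to match the reported 500 bp and 1500 bp bands; the patent text does not state the exact cut coordinate.

```python
def cleavage_fragments(substrate_len: int, cut_pos: int) -> tuple[int, int]:
    """Return the two fragment lengths produced by one cut in a linear substrate.

    cut_pos is the number of bases to the left of the cut (hypothetical value).
    """
    if not 0 < cut_pos < substrate_len:
        raise ValueError("cut position must lie inside the substrate")
    return cut_pos, substrate_len - cut_pos


# 2000 bp CDKN2A amplicon, assumed cut 500 bp from one end
fragments = cleavage_fragments(2000, 500)
print(fragments)  # (500, 1500)
```

An uncut control lane would show only the 2000 bp band, so the presence of both smaller bands is what indicates cleavage.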
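Example 5: dsODN insertion assay
This example uses a dsODN insertion assay to verify that the six type V CRISPR/Cas12a gene editing systems of Example 1 can edit a target gene in mammalian eukaryotic cells.
The procedure is as follows:
(1) The six Cpf1 proteins of Example 1 were human codon-optimized, and the corresponding nucleotide sequences were cloned into the PX330 eukaryotic expression vector (Addgene, 59909) to obtain PX330-protein eukaryotic expression plasmids.
(2) In mammalian cells, taking HEK293T cells as an example, the endogenous CDKN2A gene was selected, and suitable target sites were identified using the PAM sequences from Example 3 that support recognition and cleavage in vitro. Sequences in the format 5'-direct repeat bound by the respective Cpf1 protein-crRNA spacer sequence-3' were cloned into the PXZ vector (Addgene, 160229) by the Gibson method to construct PXZ-CDKN2A target plasmids that target different sites with different PAMs. The PX330-protein eukaryotic expression plasmid and the PXZ-CDKN2A target plasmid were co-transfected; LtCpf1 served as the positive control, and transfection of the PX330-protein eukaryotic expression plasmid alone served as the negative control. The CDKN2A target sites selected in this example and their corresponding PAMs are shown in Table 2.
Table 2. Target sites, PAM sequences and detection primers used in the eukaryotic experiments
(3) HEK293T cells in good growth condition in a 24-well plate were co-transfected with the PX330-protein eukaryotic plasmid, the PXZ-Cpf1 protein-CDKN2A target plasmid and 1.2 μL of dsODN; after 72 h the cells were harvested and DNA was extracted.
(4) Primers were designed upstream of the CDKN2A target site and on the dsODN sequence (see Table 2) for dsODN-PCR amplification, and the products were run on an agarose gel to check for the expected band, which indicates whether the dsODN was inserted. Detecting dsODN insertion in this way verifies whether the type V CRISPR/Cas12a gene editing systems of the invention have editing activity in a eukaryotic cell environment.
The dsODN-PCR electrophoresis results are shown in Figure 5. PCR bands of the expected length are marked with red triangles; the results show that all six CRISPR/Cas12a gene editing systems of Example 1 have cleavage activity in eukaryotic cells.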
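The target-site selection and the 5'-direct repeat-spacer-3' layout used in Example 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PAM is taken to lie immediately 5' of the protospacer, as is typical for Cas12a/Cpf1; only the forward strand is scanned; and the locus, repeat, PAM and spacer length below are placeholders, not sequences or values from the patent. The function names `find_targets` and `crrna_template` are likewise hypothetical.

```python
def find_targets(seq: str, pam: str, spacer_len: int) -> list[tuple[int, str]]:
    """Scan the forward strand for PAM matches followed by a protospacer.

    'N' in the PAM matches any base. Returns (PAM start index, protospacer).
    """
    seq = seq.upper()
    pam = pam.upper()
    hits = []
    for i in range(len(seq) - len(pam) - spacer_len + 1):
        window = seq[i:i + len(pam)]
        if all(p in ("N", b) for p, b in zip(pam, window)):
            start = i + len(pam)
            hits.append((i, seq[start:start + spacer_len]))
    return hits


def crrna_template(direct_repeat: str, spacer: str) -> str:
    """Join repeat and spacer in the 5'-direct repeat-spacer-3' order (DNA level)."""
    return direct_repeat.upper() + spacer.upper()


# Placeholder inputs for illustration only (not from the patent).
toy_locus = "ACGTTTAGCATGCATGCAACG"
toy_repeat = "GGGAAACCC"
hits = find_targets(toy_locus, pam="TTTN", spacer_len=8)
print(hits)                                     # [(3, 'GCATGCAT')]
print(crrna_template(toy_repeat, hits[0][1]))   # GGGAAACCCGCATGCAT
```

In practice each of the six systems would be scanned with its own PAM and paired with its own direct repeat, and candidate sites on the reverse strand would also be considered.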
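In summary, through metagenomic bioinformatic analysis, the present invention identifies for the first time six entirely new type V CRISPR/Cas12a gene editing systems and predicts their respective direct repeat sequences. The Cpf1 proteins of the six new editing systems are named 28c2, 28c6, 28c12, 28c13, 28c15 and 30c9. As a single RNA-guided endonuclease, Cpf1 requires only a crRNA for targeting and is smaller overall than Cas9, which makes in vivo delivery more convenient; guide RNA design for Cpf1 is also simpler than for Cas9. Experiments demonstrate that the six CRISPR/Cas12a gene editing systems each recognize their own distinct PAM sequence and, guided by crRNA, perform gene editing both in vitro and in eukaryotic cells. The discovery of these six new gene editing systems further expands the range of gene editing tools, enriches the PAM diversity available for Cpf1 as a gene editing tool, provides more tool choices for the clinical therapeutic use of Cpf1, and helps advance the application of gene editing to clinical therapy.
Finally, it should be noted that the above examples are intended only to illustrate the technical solutions of the present invention and not to limit its scope of protection. Although the invention has been described in detail with reference to preferred examples, a person of ordinary skill in the art will understand that the technical solutions of the invention may be modified or replaced with equivalents without departing from their spirit and scope.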
Claims (5)
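1. A Cpf1 protein, characterized in that the amino acid sequence of the Cpf1 protein is as shown in any one of SEQ ID NO.1 to 6.
2. A nucleic acid encoding the Cpf1 protein of claim 1, characterized in that the base sequence of the nucleic acid is as shown in any one of SEQ ID NO.16 to 21.
3. A type V CRISPR/Cas12a gene editing system, characterized in that it comprises the Cpf1 protein of claim 1, an accessory protein and a CRISPR array; the CRISPR array comprises direct repeat sequences and spacer sequences; one spacer sequence is sandwiched between two direct repeat sequences; the direct repeat sequences and the spacer sequences are arranged alternately; the nucleotide sequence of the direct repeat sequence is as shown in any one of SEQ ID NO.10 to 15; and the amino acid sequence of the accessory protein is as shown in any one of SEQ ID NO.7 to 8.
4. Use of the type V CRISPR/Cas12a gene editing system of claim 3 in gene editing of prokaryotes, or in gene editing of eukaryotes for purposes other than the diagnosis or treatment of disease.
5. Use of the type V CRISPR/Cas12a gene editing system of claim 3 in the preparation of a formulation for gene editing in organisms.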
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310510289.0A CN116751763B (zh) | 2023-05-08 | 2023-05-08 | Cpf1 protein, type V gene editing system, and uses thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310510289.0A CN116751763B (zh) | 2023-05-08 | 2023-05-08 | Cpf1 protein, type V gene editing system, and uses thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116751763A CN116751763A (zh) | 2023-09-15 |
CN116751763B CN116751763B (zh) | 2024-02-13 |
Family
ID=87948550
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310510289.0A Active CN116751763B (zh) | Cpf1 protein, type V gene editing system, and uses thereof | 2023-05-08 | 2023-05-08 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116751763B (zh) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016205711A1 (en) * | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
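CN109312316A (zh) * | 2016-02-15 | 2019-02-05 | Benson Hill Biosystems, Inc. | Compositions and methods for modifying genomes |
CN111757889A (zh) * | 2018-10-29 | 2020-10-09 | China Agricultural University | Novel CRISPR/Cas12f enzymes and systems |
CN111836894A (zh) * | 2017-11-21 | 2020-10-27 | Korea Research Institute of Bioscience and Biotechnology | Genome editing composition using the CRISPR/Cpf1 system and use thereof |
CN112331264A (zh) * | 2020-09-11 | 2021-02-05 | The First Affiliated Hospital of Sun Yat-sen University | Method for constructing a homologous type 2 CRISPR/Cas gene editing system |
CN112703250A (zh) * | 2018-08-15 | 2021-04-23 | Zymergen Inc. | Use of CRISPRi in high-throughput metabolic engineering |
CN113234701A (zh) * | 2020-10-20 | 2021-08-10 | Zhuhai Shutong Medical Technology Co., Ltd. | Cpf1 protein and gene editing system |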
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230056843A1 (en) * | 2019-08-19 | 2023-02-23 | Southern Medical University | Construction of high-fidelity crispr/ascpf1 mutant and uses thereof |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016205711A1 (en) * | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
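CN109312316A (zh) * | 2016-02-15 | 2019-02-05 | Benson Hill Biosystems, Inc. | Compositions and methods for modifying genomes |
CN111836894A (zh) * | 2017-11-21 | 2020-10-27 | Korea Research Institute of Bioscience and Biotechnology | Genome editing composition using the CRISPR/Cpf1 system and use thereof |
CN112703250A (zh) * | 2018-08-15 | 2021-04-23 | Zymergen Inc. | Use of CRISPRi in high-throughput metabolic engineering |
CN111757889A (zh) * | 2018-10-29 | 2020-10-09 | China Agricultural University | Novel CRISPR/Cas12f enzymes and systems |
CN112331264A (zh) * | 2020-09-11 | 2021-02-05 | The First Affiliated Hospital of Sun Yat-sen University | Method for constructing a homologous type 2 CRISPR/Cas gene editing system |
WO2022052211A1 (zh) * | 2020-09-11 | 2022-03-17 | The First Affiliated Hospital of Sun Yat-sen University | Homologous type 2 CRISPR/Cas9 gene editing system and construction method thereof |
CN113234701A (zh) * | 2020-10-20 | 2021-08-10 | Zhuhai Shutong Medical Technology Co., Ltd. | Cpf1 protein and gene editing system |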
Non-Patent Citations (1)
Title |
---|
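Application and comparison of the CRISPR-Cas9 system and the CRISPR-Cpf1 system in multiplex genome editing; Guo Ting et al.; Chinese Journal of Cell Biology; Vol. 41 (No. 11); pp. 2234-2244 * |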
Also Published As
Publication number | Publication date |
---|---|
CN116751763A (zh) | 2023-09-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11713471B2 (en) | Class II, type V CRISPR systems | |
Murray et al. | Nucleotide sequences of transcription and translation initiation regions in Bacillus phage phi 29 early genes. | |
WO2022199511A1 (zh) | Lt1Cas13d protein and gene editing system | |
CN113234701B (zh) | Cpf1 protein and gene editing system | |
CN112430586B (zh) | Type VI-B CRISPR/Cas13 gene editing system and use thereof | |
CN114075559A (zh) | Type 2 CRISPR/Cas9 gene editing system and use thereof | |
Fitzgerald et al. | Rapid shotgun cloning utilizing the two base recognition endonuclease CviJI | |
CN116751764B (zh) | Cas9 protein, type II CRISPR/Cas9 gene editing system, and uses thereof | |
US20040091886A1 (en) | Method for generating recombinant polynucleotides | |
CN116751763B (zh) | Cpf1 protein, type V gene editing system, and uses thereof | |
EP3676396B1 (en) | Transposase compositions, methods of making and methods of screening | |
CN113549650B (zh) | CRISPR-SaCas9 gene editing system and use thereof | |
RU2804422C1 (ru) | System for editing the genomic DNA of a eukaryotic cell based on a nucleotide sequence encoding the SuCas9NLS protein | |
RU2712497C1 (ru) | DNA-cutting agent based on the Cas9 protein from the biotechnologically significant bacterium Clostridium cellulolyticum | |
RU2712492C1 (ru) | DNA-cutting agent based on the Cas9 protein from Defluviimonas sp. | |
RU2788197C1 (ru) | DNA-cutting agent based on the Cas9 protein from the bacterium Streptococcus uberis NCTC3858 | |
CN116179513B (zh) | Cpf1 protein and its use in gene editing | |
WO2024119052A2 (en) | Genomic cryptography | |
CN116004762A (zh) | In vitro cleavage efficiency kit based on CRISPR-Cas9 technology and use thereof | |
CN118006584A (zh) | Programmable nuclease whose CRISPR locus completely lacks Cas1, Cas2 and Cas4, and use thereof | |
JP2024509047A (ja) | CRISPR-associated transposon systems and methods of use thereof | |
JP2024509048A (ja) | CRISPR-associated transposon systems and methods of use thereof | |
EA042517B1 (ru) | DNA-cutting agent | |
CN117866924A (zh) | Multi-sgRNA-mediated EXPERTplus prime editing system and use thereof | |
KR20040036371A (ko) | Linear DNA fragment for preparing a microbial mutant strain with a specific chromosomal region deleted, and method for preparing such a mutant strain using the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||