CN116829946A - Methods and compositions relating to GLP1R variants - Google Patents

Methods and compositions relating to GLP1R variants

Publication number
CN116829946A
Authority
CN
China
Prior art keywords
antibody
cases
seq
nos
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180073480.2A
Other languages
Chinese (zh)
Inventor
Aaron Sato
Pankaj Garg
Qiang Liu
Fumiko Axelrod
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Twist Bioscience Corp
Original Assignee
Twist Bioscience Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Twist Bioscience Corp filed Critical Twist Bioscience Corp
Priority claimed from PCT/US2021/047616 external-priority patent/WO2022046944A2/en
Publication of CN116829946A publication Critical patent/CN116829946A/en
Pending legal-status Critical Current

Landscapes

  • Peptides Or Proteins (AREA)

Abstract

Provided herein are methods and compositions relating to glucagon-like peptide-1 receptor (GLP1R) libraries having nucleic acids encoding immunoglobulins that bind to GLP1R. Libraries described herein include diverse libraries comprising nucleic acids, each nucleic acid encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. Also described herein are libraries of proteins generated when translating a nucleic acid library. Also described herein are libraries of cells expressing the diverse nucleic acid libraries described herein.

Description

Methods and compositions relating to GLP1R variants
Cross reference
The present application claims the benefit of U.S. provisional patent application No. 63/070,734, filed on August 26, 2020, and U.S. provisional patent application No. 63/081,801, filed on September 22, 2020, each of which is incorporated herein by reference in its entirety.
Background
G protein-coupled receptors (GPCRs) are involved in a variety of diseases. Producing antibodies against GPCRs has been difficult because a suitable antigen is hard to obtain: GPCRs are typically expressed at low levels in cells and are very unstable when purified. Thus, there is a need for improved agents for therapeutic intervention targeting GPCRs.
Incorporation by reference
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
Disclosure of Invention
Provided herein are antibodies or antibody fragments that bind GLP1R, comprising an immunoglobulin heavy chain and an immunoglobulin light chain: (a) wherein the immunoglobulin heavy chain comprises an amino acid sequence that is at least about 90% identical to an amino acid sequence shown in Table 9; and (b) wherein the immunoglobulin light chain comprises an amino acid sequence that is at least about 90% identical to an amino acid sequence shown in Table 10. Also provided herein are antibodies or antibody fragments, wherein the antibodies are monoclonal antibodies, polyclonal antibodies, bispecific antibodies, multispecific antibodies, grafted antibodies, human antibodies, humanized antibodies, synthetic antibodies, chimeric antibodies, camelized antibodies, single chain Fv (scFv), single chain antibodies, Fab fragments, F(ab')2 fragments, Fd fragments, Fv fragments, single domain antibodies, isolated complementarity determining regions (CDRs), diabodies, fragments consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibodies (intrabodies), anti-idiotype (anti-Id) antibodies, or antigen-binding fragments thereof. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments thereof are chimeric or humanized. Also provided herein are antibodies or antibody fragments, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 25 nanomolar. Also provided herein are antibodies or antibody fragments, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 20 nanomolar. Also provided herein are antibodies or antibody fragments, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 10 nanomolar. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments are agonists of GLP1R.
Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments are antagonists of GLP1R. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments are allosteric modulators of GLP1R. Also provided herein are antibodies or antibody fragments, wherein the allosteric modulator of GLP1R is a negative allosteric modulator.
Provided herein are methods of treating a metabolic disease or disorder comprising administering an antibody or antibody fragment that binds GLP1R, wherein the antibody or antibody fragment comprises the sequences shown in Tables 7-13. Also provided herein are methods wherein the antibody is a monoclonal antibody, polyclonal antibody, bispecific antibody, multispecific antibody, grafted antibody, human antibody, humanized antibody, synthetic antibody, chimeric antibody, camelized antibody, single chain Fv (scFv), single chain antibody, Fab fragment, F(ab')2 fragment, Fd fragment, Fv fragment, single domain antibody, isolated complementarity determining region (CDR), diabody, fragment consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibody (intrabody), anti-idiotype (anti-Id) antibody, or antigen-binding fragment thereof. Also provided herein are methods wherein the antibodies or antibody fragments thereof are chimeric or humanized. Also provided herein are methods wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 25 nanomolar. Also provided herein are methods wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 20 nanomolar. Also provided herein are methods wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 10 nanomolar. Also provided herein are methods wherein the antibody or antibody fragment is an agonist of GLP1R. Also provided herein are methods wherein the antibody or antibody fragment is an antagonist of GLP1R. Also provided herein are methods wherein the antibody or antibody fragment is an allosteric modulator of GLP1R. Also provided herein are methods wherein the allosteric modulator of GLP1R is a negative allosteric modulator. Also provided herein are methods wherein the antibody or antibody fragment is an allosteric modulator.
Also provided herein are methods wherein the antibody or antibody fragment is a negative allosteric modulator. Also provided herein are methods wherein the metabolic disease or disorder is type II diabetes or obesity.
Provided herein are antibodies or antibody fragments comprising a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein the VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein the VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs: 441-619; (b) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs: 620-798; (c) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs: 799-977; (d) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs: 978-1156; (e) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs: 1157-1168; and (f) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs: 1169-1347. Also provided herein are antibodies or antibody fragments, wherein the antibodies are monoclonal antibodies, polyclonal antibodies, bispecific antibodies, multispecific antibodies, grafted antibodies, human antibodies, humanized antibodies, synthetic antibodies, chimeric antibodies, camelized antibodies, single chain Fv (scFv), single chain antibodies, Fab fragments, F(ab')2 fragments, Fd fragments, Fv fragments, single domain antibodies, isolated complementarity determining regions (CDRs), diabodies, fragments consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibodies (intrabodies), anti-idiotype (anti-Id) antibodies, or antigen-binding fragments thereof. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments thereof are chimeric or humanized. Also provided herein are antibodies or antibody fragments, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 25 nanomolar. Also provided herein are antibodies or antibody fragments, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 20 nanomolar.
Also provided herein are antibodies or antibody fragments, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 10 nanomolar. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments are agonists of GLP1R. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments are antagonists of GLP1R. Also provided herein are antibodies or antibody fragments, wherein the antibodies or antibody fragments are allosteric modulators of GLP1R. Also provided herein are antibodies or antibody fragments, wherein the allosteric modulator of GLP1R is a negative allosteric modulator. Also provided herein are antibodies or antibody fragments, wherein the VH comprises a sequence at least about 90% identical to any one of SEQ ID NOs: 58-77. Also provided herein are antibodies or antibody fragments, wherein the VH comprises the sequence of any one of SEQ ID NOs: 58-77. Also provided herein are antibodies or antibody fragments, wherein the VL comprises a sequence at least about 90% identical to any one of SEQ ID NOs: 92-111. Also provided herein are antibodies or antibody fragments, wherein the VL comprises the sequence of any one of SEQ ID NOs: 92-111.
Provided herein are methods of treating a metabolic disease or disorder comprising administering an antibody or antibody fragment that binds GLP1R, the antibody or antibody fragment comprising a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein the VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein the VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs: 441-619; (b) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs: 620-798; (c) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs: 799-977; (d) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs: 978-1156; (e) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs: 1157-1168; and (f) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs: 1169-1347. Also provided herein are methods wherein the antibody is a monoclonal antibody, polyclonal antibody, bispecific antibody, multispecific antibody, grafted antibody, human antibody, humanized antibody, synthetic antibody, chimeric antibody, camelized antibody, single chain Fv (scFv), single chain antibody, Fab fragment, F(ab')2 fragment, Fd fragment, Fv fragment, single domain antibody, isolated complementarity determining region (CDR), diabody, fragment consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibody (intrabody), anti-idiotype (anti-Id) antibody, or antigen-binding fragment thereof. Also provided herein are methods wherein the antibodies or antibody fragments thereof are chimeric or humanized. Also provided herein are methods wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 25 nanomolar. Also provided herein are methods wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 20 nanomolar.
Also provided herein are methods wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 10 nanomolar. Also provided herein are methods wherein the antibody or antibody fragment is an agonist of GLP1R. Also provided herein are methods wherein the antibody or antibody fragment is an antagonist of GLP1R. Also provided herein are methods wherein the antibody or antibody fragment is an allosteric modulator of GLP1R. Also provided herein are methods wherein the allosteric modulator of GLP1R is a negative allosteric modulator. Also provided herein are methods wherein the antibody or antibody fragment is an allosteric modulator. Also provided herein are methods wherein the antibody or antibody fragment is a negative allosteric modulator. Also provided herein are methods wherein the VH comprises a sequence at least about 90% identical to any one of SEQ ID NOs: 58-77. Also provided herein are methods wherein the VH comprises the sequence of any one of SEQ ID NOs: 58-77. Also provided herein are methods wherein the VL comprises a sequence at least about 90% identical to any one of SEQ ID NOs: 92-111. Also provided herein are methods wherein the VL comprises the sequence of any one of SEQ ID NOs: 92-111. Also provided herein are methods wherein the metabolic disease or disorder is type II diabetes or obesity.
Provided herein are nucleic acid compositions comprising: a) a first nucleic acid encoding a variable domain heavy chain region (VH) comprising complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein (i) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs: 441-619; (ii) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs: 620-798; and (iii) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs: 799-977; and b) a second nucleic acid encoding a variable domain light chain region (VL) comprising complementarity determining regions CDRL1, CDRL2, and CDRL3, wherein (i) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs: 978-1156; (ii) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs: 1157-1168; and (iii) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs: 1169-1347.
Provided herein are nucleic acid compositions comprising: a) a first nucleic acid encoding a variable domain heavy chain region (VH) comprising an amino acid sequence at least about 90% identical to a sequence as set forth in any one of SEQ ID NOs: 58-77; b) a second nucleic acid encoding a variable domain light chain region (VL) comprising an amino acid sequence at least about 90% identical to a sequence as set forth in any one of SEQ ID NOs: 92-111; and c) an excipient. Also provided herein are nucleic acid compositions wherein the VH comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 58-77. Also provided herein are nucleic acid compositions wherein the VL comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 92-111. Also provided herein are nucleic acid compositions wherein the VH comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 58-77, and wherein the VL comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 92-111.
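Several of the statements above turn on a sequence being "at least about 90% identical" to a reference sequence. This passage does not say how identity is computed, so the following Python sketch is only an illustration under stated assumptions: the two sequences are taken to be already aligned to equal length, identity is counted position by position, and the function names and toy strings are hypothetical and are not sequences from the tables.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences.

    Assumption: alignment was done beforehand; '-' marks a gap and never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length first")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(candidate: str, reference: str, threshold: float = 90.0) -> bool:
    """True when the candidate reaches the identity threshold against the reference."""
    return percent_identity(candidate, reference) >= threshold


# Toy strings for illustration only (not taken from any SEQ ID NO).
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"  # differs from the reference at the last position
print(percent_identity(candidate, reference))
print(meets_identity_threshold(candidate, reference))
```

In practice an alignment tool would produce the aligned strings, and the denominator (alignment length versus reference length) is a convention that should be fixed before comparing values.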
Drawings
Fig. 1A depicts a first schematic of an immunoglobulin.
Fig. 1B depicts a second schematic of an immunoglobulin.
Figure 2 depicts a schematic of motifs placed in immunoglobulins.
FIG. 3 presents a step diagram illustrating an exemplary process workflow for gene synthesis as disclosed herein.
Fig. 4 shows an example of a computer system.
Fig. 5 is a block diagram illustrating the architecture of a computer system.
Fig. 6 is a diagram illustrating a network configured to consolidate multiple computer systems, multiple mobile phones and personal data assistants, and Network Attached Storage (NAS).
FIG. 7 is a block diagram of a multiprocessor computer system using shared virtual address memory space.
Fig. 8A depicts a schematic of an immunoglobulin comprising a VH domain attached to a VL domain using a linker.
Fig. 8B depicts a schematic of an immunoglobulin full domain architecture comprising a VH domain attached to a VL domain using a linker, a leader sequence, and a pIII sequence.
Fig. 8C depicts a schematic of the 4 framework elements (FW1, FW2, FW3, FW4) and 3 variable CDR elements (L1, L2, L3) of a VL or VH domain.
FIG. 9A depicts the structure of glucagon-like peptide 1 (GLP-1, cyan) complexed with the GLP-1 receptor (GLP-1R, gray), PDB entry 5VAI.
Fig. 9B depicts the crystal structure of CXCR4 chemokine receptor (grey) complexed with the cyclic peptide antagonist CVX15 (blue), PDB entry 3OR0.
Fig. 9C depicts the crystal structure of the human Smoothened receptor containing a transmembrane domain (TMD, grey) and an extracellular domain (ECD, orange), PDB entry 5L7D. The ECD contacts the TMD through extracellular loop 3 (ECL3).
FIG. 9D depicts the structure of GLP-1R (gray) complexed with Fab (magenta), PDB entry 6LN2.
Fig. 9E depicts the crystal structure of CXCR4 (grey) complexed with the viral chemokine antagonist viral macrophage inflammatory protein 2 (vMIP-II, green), PDB entry 4RWS.
FIG. 10 depicts a schematic of a GPCR focused library design, with 2 germline heavy chains (VH1-69 and VH3-30) and 4 germline light chains (IGKV1-39, IGKV3-15, IGLV1-51, and IGLV2-14).
FIG. 11 depicts the HCDR3 length profile in a GPCR focused library compared to the HCDR3 length profile in a B cell population from 3 healthy adult donors. A total of 2,444,718 unique VH sequences from the GPCR library and 2,481,511 unique VH sequences from the human B cell repertoire were analyzed to generate the length profiles.
FIG. 12A depicts the design of GLP-1R overexpressing CHO cells for phage antibody library selection. GLP-1R expression was confirmed by dual detection of GFP green fluorescence and gating of surface expression of cell surface Flag tags.
Fig. 12B depicts a cell-based panning process.
FIG. 13 depicts a graph of the percentage of unique HCDR3 in the output pool of five rounds of GLP-1R panning.
FIG. 14 depicts binding profiles of 13 unique GLP-1R hits compared to parental CHO cells.
FIG. 15 depicts the HCDR3 loop sequence of 13 unique GLP1R binders. 6 of the clones had a GLP-1 motif, 4 of the clones had a GLP-2 motif, and 3 clones did not have GLP-1 or GLP-2 motifs. For clones with GLP-1 or GLP-2 motifs, residues similar to the GLP-1 or GLP-2 sequence are colored in black, while the different residues are colored in red. Functional antagonists in the cAMP assay are highlighted in yellow.
FIG. 16A depicts an orthotopic inhibition plot of GLP1R-3 binding in the absence and presence of GLP-1 (7-36).
FIG. 16B depicts a graph of the effect of GLP1R-3 on GLP-1 activation in a cAMP assay.
FIG. 16C depicts a graph of the effect of GLP1R-3 on GLP-1-induced β-arrestin recruitment.
FIG. 17 depicts the design of GLP1R-59-2. The GLP-1(7-36) peptide is linked to the N-terminus of the light chain of the functionally inactive GLP-1R-binding antibody GLP1R-2.
FIG. 18A depicts a graph of GLP1R-59-2 binding specifically to GLP-1R with an EC50 of 15.5 nM.
FIG. 18B depicts a graph of GLP1R-59-2 in a cAMP assay, with an EC50 similar to that of the GLP-1(7-36) peptide.
FIG. 18C depicts a graph of GLP1R-59-2 inducing β-arrestin recruitment in GLP-1R-expressing cells.
FIGS. 19A-19B depict in vivo pharmacokinetic (PK) and pharmacodynamic (PD) effects of GLP1R-3 and GLP1R-59-2. GLP1R-3 had a half-life of 1 week in rats, calculated from the beta phase (FIG. 19A). GLP1R-59-2 had a half-life of 2 days in rats (FIG. 19B).
FIG. 20A depicts a graph of GLP1R-59-2 versus glucose after glucose challenge.
Fig. 20B depicts a plot of area under the curve (AUC) in a Glucose Tolerance Test (GTT).
FIG. 21A depicts a graph of a 19+2 hour dosing regimen for GLP1R-3 and GLP-1 peptide exendin 9-39 treatment.
Fig. 21B depicts a plot of area under the curve (AUC) in an insulin tolerance test (ITT).
FIG. 22A depicts a graph of GLP1R-3 treatment in a single 6 hour dosing regimen after insulin challenge, compared to the GLP-1 peptide exendin 9-39 (1.0 or 0.23 mg/kg dose) or controls.
FIG. 22B depicts a plot of area under the curve (AUC) for 6 hours GLP1R-3 (20 mg/kg) treatment in ITT.
FIG. 23A depicts a graph of GLP1R-3 treatment, a single 6 hour dosing regimen after insulin challenge compared to GLP1R-226-1, GLP1R-226-2 or controls.
FIG. 23B depicts the area under the curve (AUC) of a single 6 hour dosing regimen of GLP1R-3 treatment following insulin challenge as compared to GLP1R-226-1, GLP1R-226-2 or controls.
FIGS. 24A-24B are schematic illustrations of panning strategies for GLP1R-221 and GLP1R-222 variants.
FIGS. 25A-25B are graphs of competition data for GLP1R-221 and GLP1R-222 variants.
FIG. 26 is a diagram of GLP1R-221 and GLP1R-222 variants in a cAMP assay.
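Several of the figures listed above (FIGS. 20B, 21B, 22B, and 23B) report an area under the curve (AUC) for glucose or insulin tolerance tests. The figure descriptions do not state how AUC was computed; the Python sketch below assumes the trapezoidal rule, which is a common choice, and uses invented time points and readings that are not data from this document. The function name `trapezoid_auc` is hypothetical.

```python
from typing import Sequence


def trapezoid_auc(times: Sequence[float], values: Sequence[float]) -> float:
    """Area under a sampled curve by the trapezoidal rule.

    Assumption: `times` is strictly increasing and paired one-to-one with `values`.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired (time, value) samples")
    auc = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Each interval contributes the area of one trapezoid.
        auc += dt * (values[i] + values[i - 1]) / 2.0
    return auc


# Invented readings for illustration only: minutes after challenge vs. a glucose value.
minutes = [0, 30, 60]
glucose = [100, 200, 100]
print(trapezoid_auc(minutes, glucose))  # 9000.0 (value x minutes)
```

Whether a baseline is subtracted before integrating is a separate convention and is not addressed in this sketch.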
Detailed Description
Unless otherwise indicated, the present invention employs conventional molecular biology techniques within the skill of the art. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.
Definitions
Throughout the present invention, various embodiments are presented in a range format. It should be understood that the description of the range format is merely for convenience and brevity and should not be construed as a rigid limitation on the scope of any embodiment. Accordingly, unless the context clearly indicates otherwise, the description of a range should be considered to have specifically disclosed all possible sub-ranges and individual values within the range to one tenth of the unit of the lower limit. For example, a description of a range (e.g., 1 to 6) should be considered to have the specifically disclosed subranges (e.g., 1 to 3, 1 to 4, 1 to 5, 2 to 4, 2 to 6, 3 to 6, etc.), as well as individual values within the range (e.g., 1.1, 2, 2.3, 5, and 5.9). This applies regardless of the width of the range. The upper and lower limits of these intermediate ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention, unless the context clearly dictates otherwise.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of any embodiments. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, components, and/or groups, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Unless specifically stated or apparent from the context, as used herein, the term "about" in reference to a number or range of numbers is understood to mean the stated number and numbers +/-10% thereof, or, for values listed for a range, 10% below the listed lower limit and 10% above the listed upper limit.
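As a worked illustration of this definition, the following Python sketch computes the interval implied by "about" for a single value or for a range. The function name `about_bounds` and its `tolerance` parameter are hypothetical, and non-negative quantities are assumed.

```python
from typing import Optional, Tuple


def about_bounds(lower: float, upper: Optional[float] = None,
                 tolerance: float = 0.10) -> Tuple[float, float]:
    """Interval implied by "about": 10% below the lower limit to 10% above the upper limit.

    A single value is treated as a range whose lower and upper limits coincide.
    Assumption: quantities are non-negative.
    """
    if upper is None:
        upper = lower
    if lower < 0 or lower > upper:
        raise ValueError("expected 0 <= lower <= upper")
    return (lower * (1 - tolerance), upper * (1 + tolerance))


# "about 25 nanomolar" spans roughly 22.5 to 27.5 nanomolar.
print(about_bounds(25))
# "about 10 to 20" spans roughly 9 to 22.
print(about_bounds(10, 20))
```

The printed values may show small floating-point rounding in the last digits.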
As used herein, the term "nucleic acid" encompasses double-stranded or triple-stranded nucleic acids as well as single-stranded molecules, unless specifically indicated. In double-stranded or triple-stranded nucleic acids, the nucleic acid strands need not be coextensive (i.e., a double-stranded nucleic acid need not be double-stranded along the entire length of both strands). When provided, nucleic acid sequences are listed in the 5' to 3' direction unless otherwise indicated. The methods described herein provide for the production of isolated nucleic acids. The methods described herein additionally provide for the production of isolated and purified nucleic acids. A "nucleic acid" as referred to herein may be at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000 or more bases in length. In addition, provided herein are methods for synthesizing any number of nucleotide sequences encoding polypeptide segments, including sequences encoding non-ribosomal peptides (NRPs), sequences encoding non-ribosomal peptide synthetase (NRPS) modules and synthetic variants, polypeptide segments of other modular proteins (e.g., antibodies), and polypeptide segments from other protein families, as well as non-coding DNA or RNA, such as regulatory sequences (e.g., promoters, transcription factors, enhancers), siRNA, shRNA, RNAi, miRNA, small nucleolar RNA derived from microRNA, or any functional or structural DNA or RNA unit of interest.
The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment; intergenic DNA; loci defined by linkage analysis; exons; introns; messenger RNA (mRNA); transfer RNA; ribosomal RNA; short interfering RNA (siRNA); short hairpin RNA (shRNA); microRNA (miRNA); small nucleolar RNA; ribozymes; complementary DNA (cDNA), which is a DNA representation of mRNA, typically obtained by reverse transcription of mRNA or by amplification; DNA molecules produced synthetically or by amplification; genomic DNA; recombinant polynucleotides; branched polynucleotides; plasmids; vectors; isolated DNA of any sequence; isolated RNA of any sequence; nucleic acid probes; and primers. A cDNA encoding a gene or gene fragment referred to herein may comprise at least one region encoding exon sequences without the intervening intron sequence found in the genomic equivalent sequence.
GPCR libraries for GLP1 receptors
Provided herein are methods and compositions relating to G protein-coupled receptor (GPCR) binding libraries for glucagon-like peptide-1 receptor (GLP1R) comprising nucleic acids encoding immunoglobulins comprising a GPCR binding domain. Immunoglobulins as described herein may stably support a GPCR binding domain. GPCR binding domains may be designed based on the surface interactions between a GLP1R ligand and GLP1R. The library as described herein may be further diversified to provide a library of variants comprising nucleic acids, each nucleic acid encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. Also described herein are libraries of proteins that can be generated when translating a nucleic acid library. In some cases, a nucleic acid library as described herein is transferred into cells to generate a cell library. Also provided herein are downstream applications of the libraries synthesized using the methods described herein. Downstream applications include the identification of variant nucleic acid or protein sequences with enhanced biologically relevant functions (e.g., improved stability, affinity, binding, or functional activity) and the treatment or prevention of disease states associated with GPCR signaling.
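The idea of a library in which each nucleic acid encodes a predetermined variant of a predetermined reference sequence can be sketched in a few lines of Python. This is only an illustration of the concept, not the synthesis method of this disclosure; the function name, the toy reference sequence, and the chosen replacement codons are all hypothetical.

```python
from typing import List


def single_codon_variants(reference: str, codon_index: int, codons: List[str]) -> List[str]:
    """Enumerate predetermined variants of an in-frame reference at one codon position.

    Each returned sequence equals the reference except that the codon at
    `codon_index` (0-based) is replaced by one of the listed codons.
    """
    if len(reference) % 3 != 0:
        raise ValueError("reference must be a whole number of codons")
    if not 0 <= codon_index < len(reference) // 3:
        raise ValueError("codon_index out of range")
    if any(len(c) != 3 for c in codons):
        raise ValueError("each replacement must be exactly one codon")
    start = codon_index * 3
    original = reference[start:start + 3]
    # Skip the original codon so every library member is a true variant.
    return [reference[:start] + c + reference[start + 3:] for c in codons if c != original]


# Toy reference (Met-Ala-Lys); vary the second codon to a predetermined set.
library = single_codon_variants("ATGGCTAAA", 1, ["GCT", "GGT", "TCT"])
print(library)  # ['ATGGGTAAA', 'ATGTCTAAA']
```

Because the variant set is enumerated up front, the library content is fixed by design rather than by random mutagenesis, which is the property the paragraph above describes.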
Provided herein are libraries comprising nucleic acids encoding immunoglobulins. In some cases, the immunoglobulin is an antibody. As used herein, the term antibody is understood to include proteins having the characteristic two-arm Y-shape of a typical antibody molecule as well as one or more fragments of an antibody that retain the ability to specifically bind to an antigen. Exemplary antibodies include, but are not limited to, monoclonal antibodies, polyclonal antibodies, bispecific antibodies, multispecific antibodies, grafted antibodies, human antibodies, humanized antibodies, synthetic antibodies, chimeric antibodies, camelized antibodies, single chain Fv (scFv) (including fragments in which the VL and VH are joined, using recombinant methods, by a synthetic or natural linker that enables them to be made as a single protein chain in which the VL and VH regions pair to form monovalent molecules, including single chain Fab and scFab), single chain antibodies, Fab fragments (including monovalent fragments comprising the VL, VH, CL, and CH1 domains), F(ab')2 fragments (including bivalent fragments comprising two Fab fragments linked by a disulfide bridge at the hinge region), Fd fragments (including fragments comprising the VH and CH1 domains), Fv fragments (including fragments comprising the VL and VH domains of a single arm of an antibody), single domain antibodies (dAb or sdAb) (including fragments comprising a VH domain), isolated complementarity determining regions (CDRs), diabodies (including fragments comprising two VL and VH domains that bind to each other and recognize two different antigens), fragments consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibodies (intrabodies), anti-idiotype (anti-Id) antibodies, or antigen-binding fragments thereof. In some cases, libraries disclosed herein comprise nucleic acids encoding immunoglobulins, wherein the immunoglobulins are Fv antibodies, including Fv antibodies consisting of the minimum antibody fragment that contains a complete antigen-recognition and antigen-binding site.
In some embodiments, Fv antibodies consist of a dimer of one heavy chain variable domain and one light chain variable domain in tight, non-covalent association, and the three hypervariable regions of each variable domain interact to define an antigen-binding site on the surface of the VH-VL dimer. In some embodiments, the six hypervariable regions confer antigen-binding specificity to the antibody. In some embodiments, a single variable domain (or half of an Fv comprising only three hypervariable regions specific for an antigen, including single domain antibodies isolated from camelid animals comprising one heavy chain variable domain, such as VHH antibodies or nanobodies) has the ability to recognize and bind antigen. In some cases, libraries disclosed herein comprise nucleic acids encoding immunoglobulins, wherein the immunoglobulins are single chain Fv or scFv, including antibody fragments comprising a VH, a VL, or both a VH and VL domain, wherein both domains are present in a single polypeptide chain. In some embodiments, the Fv polypeptide further comprises a polypeptide linker between the VH and VL domains, allowing the scFv to form the structure required for antigen binding. In some cases, the scFv is linked to an Fc fragment, or a VHH is linked to an Fc fragment (including minibodies). In some cases, the antibodies comprise immunoglobulin molecules and immunologically active fragments of immunoglobulin molecules, e.g., molecules that contain an antigen-binding site. Immunoglobulin molecules are of any type (e.g., IgG, IgE, IgM, IgD, IgA, and IgY), class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2), or subclass.
In some embodiments, the library comprises immunoglobulins adapted to the species of the intended therapeutic target. Typically, these methods include "mammalization", which comprises methods of transferring donor antigen-binding information to a less immunogenic mammalian antibody recipient to generate useful therapeutic treatments. In some cases, the mammal is a mouse, rat, horse, sheep, cow, primate (e.g., chimpanzee, baboon, gorilla, orangutan, monkey), dog, cat, pig, donkey, rabbit, or human. In some cases, provided herein are libraries and methods for felinization and caninization of antibodies.
A "humanized" form of a non-human antibody may be a chimeric antibody that contains minimal sequences derived from the non-human antibody. Humanized antibodies are typically human antibodies (recipient antibodies) in which residues from one or more CDRs are replaced by residues from one or more CDRs of a non-human antibody (donor antibody). The donor antibody may be any suitable non-human antibody, such as a mouse, rat, rabbit, chicken or non-human primate antibody having the desired specificity, affinity or biological effect. In some cases, selected framework region residues of the recipient antibody are substituted with corresponding framework region residues from the donor antibody. Humanized antibodies may also comprise residues that are not present in the recipient antibody or the donor antibody. In some cases, these modifications are made to further improve antibody performance.
"caninisation" may include a method of transferring non-canine antigen binding information from a donor antibody to a less immunogenic canine antibody recipient to generate a treatment useful as a canine therapeutic agent. In some cases, the caninized forms of the non-canine antibodies provided herein are chimeric antibodies that contain minimal sequences derived from the non-canine antibodies. In some cases, the caninized antibody is a canine antibody sequence ("recipient" or "recipient" antibody) in which the hypervariable region residues of the recipient are replaced with hypervariable region residues from a non-canine species ("donor" antibody), such as mice, rats, rabbits, cats, dogs, goats, chickens, cattle, horses, llamas, camels, dromedaries, sharks, non-human primates, humans, humanized, recombinant sequences, or engineered sequences having desired properties. In some cases, the Framework Region (FR) residues of the canine antibodies are substituted with corresponding non-canine FR residues. In some cases, a caninized antibody includes residues that are not present in the recipient antibody or the donor antibody. In some cases, these modifications are made to further improve antibody performance. The caninized antibody may further comprise at least a portion of a canine immunoglobulin constant region (Fc).
"Cat-derived" may include methods of transferring non-cat antigen binding information from a donor antibody to a less immunogenic cat antibody recipient to generate a treatment useful as a cat therapeutic. In some cases, the feline-derived form of the non-feline antibody provided herein is a chimeric antibody that contains minimal sequences derived from the non-feline antibody. In some cases, the catylated antibody is a feline antibody sequence ("acceptor" or "recipient" antibody) in which the hypervariable region residues of the acceptor are replaced with hypervariable region residues from a non-feline species ("donor" antibody), such as a mouse, rat, rabbit, cat, dog, goat, chicken, cow, horse, llama, camel, dromedary, shark, non-human primate, human, humanized, recombinant sequence, or an engineered sequence having the desired property. In some cases, framework Region (FR) residues of the feline antibody are substituted with corresponding non-feline FR residues. In some cases, a catylated antibody includes residues that are not present in the acceptor antibody or the donor antibody. In some cases, these modifications are made to further improve antibody performance. The feline-derived antibody can further comprise at least a portion of a feline antibody immunoglobulin constant region (Fc).
Provided herein are libraries comprising nucleic acids encoding non-immunoglobulins. For example, the non-immunoglobulins are antibody mimetics. Exemplary antibody mimetics include, but are not limited to, anticalins, affilins, affibody molecules, affimers, affitins, alphabodies, avimers, atrimers, DARPins, fynomers, Kunitz domain-based proteins, monobodies, knottins, armadillo repeat protein-based proteins, and bicyclic peptides.
The libraries described herein comprise nucleic acids encoding immunoglobulins comprising variation in at least one region thereof. Exemplary regions of antibody variation include, but are not limited to, complementarity determining regions (CDRs), variable domains, or constant domains. In some cases, the CDR is CDR1, CDR2, or CDR3. In some cases, the CDRs are heavy chain CDRs including, but not limited to, CDRH1, CDRH2, and CDRH3. In some cases, the CDRs are light chain CDRs including, but not limited to, CDRL1, CDRL2, and CDRL3. In some cases, the variable domain is a variable domain light chain (VL) or a variable domain heavy chain (VH). In some cases, the VL domain comprises a kappa or lambda chain. In some cases, the constant domain is a constant domain light chain (CL) or a constant domain heavy chain (CH).
The methods described herein provide for the synthesis of libraries comprising immunoglobulin-encoding nucleic acids, wherein each nucleic acid encodes a predetermined variant of at least one predetermined reference nucleic acid sequence. In some cases, the predetermined reference sequence is a nucleic acid sequence encoding a protein, and the library of variants comprises sequences encoding variations of at least a single codon such that a plurality of different variants of a single residue in a subsequent protein encoded by the synthetic nucleic acid are generated by standard translation processes. In some cases, the library of variants comprises different nucleic acids that collectively encode a variation at multiple positions. In some cases, the library of variants comprises a sequence encoding a variation of at least a single codon of a CDRH1, CDRH2, CDRH3, CDRL1, CDRL2, CDRL3, VL, or VH domain. In some cases, the library of variants comprises a sequence encoding a variation of a plurality of codons of a CDRH1, CDRH2, CDRH3, CDRL1, CDRL2, CDRL3, VL, or VH domain. In some cases, the library of variants comprises a sequence encoding a variation of a plurality of codons of framework element 1 (FW 1), framework element 2 (FW 2), framework element 3 (FW 3), or framework element 4 (FW 4). Exemplary numbers of codons for variation include, but are not limited to, at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons.
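The single-codon variation described above can be illustrated with a minimal sketch. It assumes a short in-frame reference and enumerates every sequence that differs from it at exactly one chosen codon. The function name, the 4-codon reference, and the choice to exclude stop codons are illustrative assumptions, not details taken from this disclosure.

```python
from itertools import product

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def single_codon_variants(reference: str, codon_index: int,
                          exclude_stops: bool = True) -> list[str]:
    """Return every sequence that differs from `reference` at exactly one codon."""
    if len(reference) % 3 != 0:
        raise ValueError("reference length must be a multiple of 3")
    start = 3 * codon_index
    if not 0 <= start < len(reference):
        raise IndexError("codon_index out of range")
    original = reference[start:start + 3]
    variants = []
    for codon in ("".join(p) for p in product(BASES, repeat=3)):
        if codon == original:
            continue  # skip the reference codon itself
        if exclude_stops and codon in STOP_CODONS:
            continue  # assumption: stop codons are not wanted in the library
        variants.append(reference[:start] + codon + reference[start + 3:])
    return variants


# Hypothetical 4-codon reference (GCT AGC TAT GGT); not a sequence from this disclosure.
reference = "GCTAGCTATGGT"
library = single_codon_variants(reference, codon_index=2)
print(len(library))
```

Because the enumeration is at the nucleotide level, several variants may encode the same amino acid; a library restricted to one codon per residue would be correspondingly smaller.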
In some cases, at least one region of immunoglobulin variation is from a heavy chain V gene family, a heavy chain D gene family, a heavy chain J gene family, a light chain V gene family, or a light chain J gene family. In some cases, the light chain V gene family comprises an immunoglobulin kappa (IGK) gene or an immunoglobulin lambda (IGL). Exemplary genes include, but are not limited to, IGHV1-18, IGHV1-69, IGHV1-8, IGHV3-21, IGHV3-23, IGHV3-30/33rn, IGHV3-28, IGHV1-69, IGHV3-74, IGHV4-39, IGHV4-59/61, IGKV1-39, IGKV1-9, IGKV2-28, IGKV3-11, IGKV3-15, IGKV3-20, IGKV4-1, IGLV1-51, IGLV2-14, IGLV1-40, and IGLV3-1. In some cases, the gene is IGHV1-69, IGHV3-30, IGHV3-23, IGHV3, IGHV1-46, IGHV3-7, IGHV1, or IGHV1-8. In some cases, the genes are IGHV1-69 and IGHV3-30. In some cases, the gene is IGHJ3, IGHJ6, IGHJ4, IGHJ5, IGHJ2, or IGH1. In some cases, the gene is IGHJ3, IGHJ6, IGHJ or IGHJ4.
Provided herein are libraries comprising immunoglobulin-encoding nucleic acids, wherein the libraries are synthesized with different numbers of fragments. In some cases, the fragments comprise a CDRH1, CDRH2, CDRH3, CDRL1, CDRL2, CDRL3, VL, or VH domain. In some cases, the fragments comprise framework element 1 (FW 1), framework element 2 (FW 2), framework element 3 (FW 3), or framework element 4 (FW 4). In some cases, the immunoglobulin library is synthesized with at least or about 2 fragments, 3 fragments, 4 fragments, 5 fragments, or more than 5 fragments. The length of each of the nucleic acid fragments or the average length of the synthesized nucleic acids can be at least or about 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600, or more than 600 base pairs. In some cases, the length is about 50 to 600, 75 to 575, 100 to 550, 125 to 525, 150 to 500, 175 to 475, 200 to 450, 225 to 425, 250 to 400, 275 to 375, or 300 to 350 base pairs.
Libraries comprising immunoglobulin-encoding nucleic acids as described herein comprise amino acids of various lengths when translated. In some cases, the length of each of the amino acid fragments or the average length of the synthetic amino acids can be at least or about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, or more than 150 amino acids. In some cases, the amino acids are about 15 to 150, 20 to 145, 25 to 140, 30 to 135, 35 to 130, 40 to 125, 45 to 120, 50 to 115, 55 to 110, 60 to 110, 65 to 105, 70 to 100, or 75 to 95 amino acids in length. In some cases, the amino acids are about 22 amino acids to about 75 amino acids in length. In some cases, the immunoglobulin comprises at least or about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, or more than 5000 amino acids.
A plurality of variant sequences of at least one region of immunoglobulin variation are synthesized de novo using the methods as described herein. In some cases, a number of variant sequences of CDRH1, CDRH2, CDRH3, CDRL1, CDRL2, CDRL3, VL, VH, or a combination thereof are synthesized de novo. In some cases, a number of variant sequences of the framework element 1 (FW 1), the framework element 2 (FW 2), the framework element 3 (FW 3) or the framework element 4 (FW 4) are synthesized de novo. The number of variant sequences may be at least or about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, or more than 500 sequences. In some cases, the number of variant sequences is at least or about 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, or more than 8000 sequences. In some cases, the number of variant sequences is about 10 to 500, 25 to 475, 50 to 450, 75 to 425, 100 to 400, 125 to 375, 150 to 350, 175 to 325, 200 to 300, 225 to 375, 250 to 350, or 275 to 325 sequences.
In some cases, the variant sequences of at least one region of the immunoglobulin differ in length or sequence. In some cases, the at least one region synthesized de novo is CDRH1, CDRH2, CDRH3, CDRL1, CDRL2, CDRL3, VL, VH, or a combination thereof. In some cases, the at least one region synthesized de novo is framework element 1 (FW 1), framework element 2 (FW 2), framework element 3 (FW 3), or framework element 4 (FW 4). In some cases, the variant sequence comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, or more than 50 variant nucleotides or amino acids as compared to the wild type. In some cases, the variant sequence comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 additional nucleotides or amino acids as compared to the wild type. In some cases, the variant sequence comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 fewer nucleotides or amino acids than the wild type. In some cases, the library comprises at least or about 10^1, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, or more than 10^10 variants.
After synthesis of the libraries described herein, the libraries can be used for screening and analysis. For example, the libraries are assayed for displayability and by panning. In some cases, displayability is assayed using a selectable tag. Exemplary tags include, but are not limited to, radiolabels, fluorescent labels, enzymes, chemiluminescent labels, colorimetric labels, affinity tags, or other labels or tags known in the art. In some cases, the tag is histidine, polyhistidine, myc, hemagglutinin (HA), or FLAG. In some cases, libraries are assayed by sequencing using various methods, including, but not limited to, single molecule real-time (SMRT) sequencing, polony sequencing, sequencing by ligation, reversible terminator sequencing, proton detection sequencing, ion semiconductor sequencing, nanopore sequencing, electronic sequencing, pyrosequencing, Maxam-Gilbert sequencing, chain termination (e.g., Sanger) sequencing, +S sequencing, or sequencing by synthesis.
In some cases, the library is assayed for functional activity, structural stability (e.g., thermostable or pH stable), expression, specificity, or a combination thereof. In some cases, immunoglobulins (e.g., antibodies) capable of folding in the library are assayed. In some cases, the functional activity, structural stability, expression, specificity, folding, or a combination thereof of the antibody region is determined. For example, the functional activity, structural stability, expression, specificity, folding, or a combination thereof of the VH region or the VL region is determined.
GLP1R libraries
Provided herein are GLP1R binding libraries comprising nucleic acids encoding immunoglobulins (e.g., antibodies) that bind to GLP1R. In some cases, the immunoglobulin sequence of the GLP1R binding domain is determined by the interaction between the GLP1R binding domain and GLP1R.
Provided herein are libraries comprising nucleic acids encoding immunoglobulins comprising GLP1R binding domains, wherein the GLP1R binding domains are designed based on surface interactions on GLP1R. In some cases, GLP1R comprises a sequence as defined in SEQ ID NO: 1. In some cases, the GLP1R binding domain interacts with the amino or carboxy terminus of GLP1R. In some cases, the GLP1R binding domain interacts with at least one transmembrane domain, including but not limited to transmembrane domain 1 (TM1), transmembrane domain 2 (TM2), transmembrane domain 3 (TM3), transmembrane domain 4 (TM4), transmembrane domain 5 (TM5), transmembrane domain 6 (TM6), and transmembrane domain 7 (TM7). In some cases, the GLP1R binding domain interacts with the intracellular surface of GLP1R. For example, the GLP1R binding domain interacts with at least one intracellular loop, including but not limited to intracellular loop 1 (ICL1), intracellular loop 2 (ICL2), and intracellular loop 3 (ICL3). In some cases, the GLP1R binding domain interacts with the extracellular surface of GLP1R. For example, the GLP1R binding domain interacts with at least one extracellular domain (ECD) or extracellular loop (ECL) of GLP1R. Extracellular loops include, but are not limited to, extracellular loop 1 (ECL1), extracellular loop 2 (ECL2), and extracellular loop 3 (ECL3).
Described herein are GLP1R binding domains, wherein the GLP1R binding domain is designed based on surface interactions between GLP1R ligands and GLP1R. In some cases, the ligand is a peptide. In some cases, the ligand is glucagon, glucagon-like peptide 1-(7-36) amide, glucagon-like peptide 1-(7-37), liraglutide, exendin-4, lixisenatide, T-0632, GLP1R0017, or BETP. In some cases, the ligand is a GLP1R agonist. In some cases, the ligand is a GLP1R antagonist. In some cases, the ligand is a GLP1R allosteric modulator. In some cases, the allosteric modulator is a negative allosteric modulator. In some cases, the allosteric modulator is a positive allosteric modulator.
Various methods are used to analyze the sequence of the GLP1R binding domain based on the surface interaction between the GLP1R ligand and GLP1R. For example, multispecies computational analysis is performed. In some cases, structural analysis is performed. In some cases, sequence analysis is performed. Sequence analysis may be performed using databases known in the art. Non-limiting examples of databases include NCBI BLAST (blast.ncbi.nlm.nih.gov/Blast.cgi), UCSC Genome Browser (genome.ucsc.edu/), UniProt (www.uniprot.org/), and the IUPHAR/BPS Guide to PHARMACOLOGY (org /).
GLP1R binding domains designed based on sequence analysis between various organisms are described herein. For example, sequence analysis is performed to identify homologous sequences in different organisms. Exemplary organisms include, but are not limited to, mice, rats, horses, sheep, cattle, primates (e.g., chimpanzees, baboons, gorillas, orangutans, monkeys), dogs, cats, pigs, donkeys, rabbits, fish, flies, and humans.
After identifying the GLP1R binding domain, a library comprising nucleic acids encoding the GLP1R binding domain may be generated. In some cases, the library of GLP1R binding domains comprises sequences of GLP1R binding domains designed based on conformational ligand interactions, peptide ligand interactions, small molecule ligand interactions, extracellular domains of GLP1R, or antibodies targeting GLP1R. In some cases, the library of GLP1R binding domains comprises sequences of GLP1R binding domains designed based on peptide ligand interactions. The library of GLP1R binding domains can be translated to generate a protein library. In some cases, the library of GLP1R binding domains is translated to generate a peptide library, an immunoglobulin library, derivatives thereof, or combinations thereof. In some cases, the library of GLP1R binding domains is translated to generate a library of proteins that are further modified to generate a library of peptide mimetics. In some cases, the library of GLP1R binding domains is translated to generate a library of proteins that are used to generate small molecules.
The methods described herein provide for the synthesis of libraries of GLP1R binding domains comprising nucleic acids, each nucleic acid encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. In some cases, the predetermined reference sequence is a nucleic acid sequence encoding a protein, and the library of variants comprises sequences encoding variations of at least a single codon such that a plurality of different variants of a single residue in a subsequent protein encoded by the synthetic nucleic acid are generated by standard translation processes. In some cases, the library of GLP1R binding domains comprises different nucleic acids that collectively encode a variation at multiple positions. In some cases, the library of variants comprises a sequence encoding a variation of at least a single codon in the GLP1R binding domain. In some cases, the library of variants comprises sequences encoding variations of multiple codons in the GLP1R binding domain. Exemplary numbers of codons for variation include, but are not limited to, at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons.
The methods described herein provide for the synthesis of a library comprising nucleic acids encoding a GLP1R binding domain, wherein the library comprises sequences encoding variation in the length of the GLP1R binding domain. In some cases, the library comprises sequences encoding a length variation of at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons fewer than the predetermined reference sequence. In some cases, the library comprises sequences encoding a length variation of at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, or more than 300 codons more than the predetermined reference sequence.
After identifying the GLP1R binding domain, the GLP1R binding domain can be placed in an immunoglobulin as described herein. In some cases, the GLP1R binding domain is placed in the CDRH3 region. GPCR binding domains that can be placed in immunoglobulins can also be referred to as motifs. Immunoglobulins comprising GLP1R binding domains may be designed based on binding, specificity, stability, expression, folding or downstream activity. In some cases, an immunoglobulin comprising a GLP1R binding domain is capable of contacting GLP1R. In some cases, an immunoglobulin comprising a GLP1R binding domain is capable of binding GLP1R with high affinity. Exemplary amino acid sequences of GLP1R binding domains are described in Table 1.
TABLE 1 GLP1R amino acid sequence
Provided herein are immunoglobulins comprising a GLP1R binding domain, wherein the sequence of the GLP1R binding domain supports interaction with GLP1R. The sequence may be homologous or identical to the sequence of a GLP1R ligand. In some cases, the GLP1R binding domain sequence comprises at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 1. In some cases, the GLP1R binding domain sequence comprises at least or about 95% homology to SEQ ID NO: 1. In some cases, the GLP1R binding domain sequence comprises at least or about 97% homology to SEQ ID NO: 1. In some cases, the GLP1R binding domain sequence comprises at least or about 99% homology to SEQ ID NO: 1. In some cases, the GLP1R binding domain sequence comprises at least or about 100% homology to SEQ ID NO: 1. In some cases, the GLP1R binding domain sequence comprises at least a portion of SEQ ID NO: 1 having at least or about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, or more than 400 amino acids.
The term "sequence identity" refers to two polynucleotide sequences that are identical over a comparison window (i.e., on a nucleotide-by-nucleotide basis). The term "percent sequence identity" is calculated by comparing two optimally aligned sequences over a comparison window, determining the number of positions in the two sequences at which the same nucleobase (e.g., A, T, C, G, U or I) occurs to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window (i.e., window size), and multiplying the result by 100 to yield the percent sequence identity. The alignment used to determine the percent amino acid sequence identity can be accomplished in a variety of ways within the skill of the art, for example, using publicly available computer software such as EMBOSS MATCHER, EMBOSS WATER, EMBOSS STRETCHER, EMBOSS NEEDLE, EMBOSS LALIGN, BLAST-2, ALIGN, or Megalign (DNASTAR) software. One skilled in the art can determine appropriate parameters for measuring the alignment, including any algorithms needed to achieve maximum alignment over the entire length of the sequences being compared.
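The percent sequence identity calculation described above can be sketched as follows. The sketch assumes the two sequences have already been optimally aligned by one of the listed tools and are supplied as equal-length strings, so that the string length is the comparison window. The function name and the rule that a gap character never counts as a match are illustrative assumptions.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned sequences over the whole window."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("comparison window is empty")
    # Count positions where the same nucleobase occurs in both sequences.
    # Assumption: a gap character ("-") never counts as a matched position.
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    # Matched positions divided by the window size, multiplied by 100.
    return 100.0 * matches / len(seq_a)


# Hypothetical 10-nucleotide window with a single mismatch at position 5.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

The alignment step itself is deliberately left out; the result depends on the alignment and parameters chosen, as the paragraph above notes.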
Where ALIGN-2 is used for amino acid sequence comparison, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which may alternatively be expressed as a given amino acid sequence A having or comprising a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: 100 times the fraction X/Y, where X is the number of amino acid residues scored as identical matches by the sequence alignment program ALIGN-2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that when the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A. Unless specifically stated otherwise, all % amino acid sequence identity values used herein are obtained using the ALIGN-2 computer program as described above.
The term "homology" or "similarity" between two proteins is determined by comparing the amino acid sequence of one protein and its conservative amino acid substitutions to the sequence of the second protein. Similarity can be determined by methods well known in the art, such as the BLAST program (Basic Local Alignment Search Tool at the National Center for Biotechnology Information).
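The asymmetry of the X/Y calculation can be seen in a small sketch. It does not run ALIGN-2; it assumes the alignment has already been produced and is supplied as two gapped strings of equal length. The function name and the two short sequences are hypothetical.

```python
def percent_identity_to(aligned_a: str, aligned_b: str) -> float:
    """% amino acid identity of A to B, computed as 100 * X / Y."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # X: aligned columns in which A and B carry the same residue.
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    # Y: total number of residues in B (gap characters are not residues).
    y = sum(1 for b in aligned_b if b != "-")
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical alignment: A has 6 residues, B has 10, and all 6 residues of A match.
aligned_a = "ACDEFG----"
aligned_b = "ACDEFGHIKL"
print(percent_identity_to(aligned_a, aligned_b))  # identity of A to B
print(percent_identity_to(aligned_b, aligned_a))  # identity of B to A
```

Swapping the arguments changes only the denominator Y, which is why the two values differ whenever A and B have different lengths.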
The terms "complementarity determining region" and "CDR," synonymous with "hypervariable region" or "HVR," are known in the art to refer to non-contiguous sequences of amino acids within the variable region of an antibody that confer antigen specificity and/or binding affinity. Typically, there are three CDRs (CDRH1, CDRH2, CDRH3) in each heavy chain variable region, and three CDRs (CDRL1, CDRL2, CDRL3) in each light chain variable region. "Framework region" and "FR" are known in the art to refer to the non-CDR portions of the heavy and light chain variable regions. Typically, there are four FRs (FR-H1, FR-H2, FR-H3 and FR-H4) in each full-length heavy chain variable region, and four FRs (FR-L1, FR-L2, FR-L3 and FR-L4) in each full-length light chain variable region. The exact amino acid sequence boundaries of a given CDR or FR can be readily determined using any of a number of well-known schemes, including Kabat et al. (1991), "Sequences of Proteins of Immunological Interest," 5th Ed., Public Health Service, National Institutes of Health, Bethesda, MD ("Kabat" numbering scheme); Al-Lazikani et al. (1997), JMB 273, 927-948 ("Chothia" numbering scheme); MacCallum et al., "Antibody-antigen interactions: Contact analysis and binding site topography," J. Mol. Biol. 262:732-745 (1996) ("Contact" numbering scheme); Lefranc MP et al., "IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains," Dev Comp Immunol, 2003 Jan; 27(1):55-77 ("IMGT" numbering scheme); Honegger A and Plückthun A, "Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis tool," J Mol Biol, 2001 Jun 8; 309(3):657-70 ("Aho" numbering scheme); and Whitelegg NR and Rees AR, "WAM: an improved algorithm for modelling antibodies on the WEB," Protein Eng. 2000 Dec; 13(12):819-24 ("AbM" numbering scheme).
In certain embodiments, the CDRs of the antibodies described herein can be defined by a method selected from Kabat, Chothia, IMGT, Aho, AbM, or a combination thereof.
The boundaries of a given CDR or FR may vary depending on the scheme used for identification. For example, the Kabat scheme is based on sequence alignments, while the Chothia scheme is based on structural information. Numbering for both the Kabat and Chothia schemes is based on the most common antibody region sequence lengths, with insertions accommodated by insertion letters (e.g., "30a") and deletions appearing in some antibodies. The two schemes place certain insertions and deletions ("indels") at different positions, resulting in different numbering. The Contact scheme is based on analysis of complex crystal structures and is similar in many respects to the Chothia numbering scheme.
Provided herein are GLP1R binding libraries comprising nucleic acids encoding immunoglobulins comprising GLP1R binding domains comprising variations in domain type, domain length or residue variation. In some cases, the domain is a region in an immunoglobulin comprising a GLP1R binding domain. For example, the region is a VH, CDRH3 or VL domain. In some cases, the domain is a GLP1R binding domain.
The methods described herein provide for the synthesis of a GLP1R binding library of nucleic acids, each encoding a predetermined variant of at least one predetermined reference nucleic acid sequence. In some cases, the predetermined reference sequence is a nucleic acid sequence encoding a protein, and the library of variants comprises sequences encoding variations of at least a single codon such that a plurality of different variants of a single residue in a subsequent protein encoded by the synthetic nucleic acid are generated by standard translation processes. In some cases, the GLP1R binding library comprises different nucleic acids that collectively encode a variation at multiple positions. In some cases, the library of variants comprises a sequence encoding a variation of at least a single codon of a VH, CDRH3, or VL domain. In some cases, the library of variants comprises a sequence encoding a variation of at least a single codon in the GLP1R binding domain. For example, at least one single codon of a GLP1R binding domain as set forth in table 1 is varied. In some cases, the library of variants comprises a sequence encoding a variation of a plurality of codons of a VH, CDRH3, or VL domain. In some cases, the library of variants comprises sequences encoding variations of multiple codons in the GLP1R binding domain. Exemplary numbers of codons for variation include, but are not limited to, at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons.
The methods described herein provide for the synthesis of a GLP1R binding library of nucleic acids, each nucleic acid encoding a predetermined variant of at least one predetermined reference nucleic acid sequence, wherein the GLP1R binding library comprises sequences encoding variation in domain length. In some cases, the domain is a VH, CDRH3, or VL domain. In some cases, the domain is a GLP1R binding domain. In some cases, the library comprises sequences encoding a length variation of at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 225, 250, 275, 300, or more than 300 codons fewer than the predetermined reference sequence. In some cases, the library comprises sequences encoding a length variation of at least or about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, or more than 300 codons more than the predetermined reference sequence.
Provided herein are GLP1R binding libraries comprising nucleic acids encoding immunoglobulins comprising GLP1R binding domains, wherein the GLP1R binding libraries are synthesized with different numbers of fragments. In some cases, the fragment comprises a VH, CDRH3, or VL domain. In some cases, the GLP1R binding library is synthesized with at least or about 2 fragments, 3 fragments, 4 fragments, 5 fragments, or more than 5 fragments. The length of each of the nucleic acid fragments or the average length of the synthesized nucleic acids can be at least or about 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600, or more than 600 base pairs. In some cases, the length is about 50 to 600, 75 to 575, 100 to 550, 125 to 525, 150 to 500, 175 to 475, 200 to 450, 225 to 425, 250 to 400, 275 to 375, or 300 to 350 base pairs.
When translated, a GLP1R binding library comprising nucleic acids encoding an immunoglobulin comprising a GLP1R binding domain as described herein comprises amino acids of various lengths. In some cases, the length of each of the amino acid fragments or the average length of the synthetic amino acids can be at least or about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, or more than 150 amino acids. In some cases, the amino acids are about 15 to 150, 20 to 145, 25 to 140, 30 to 135, 35 to 130, 40 to 125, 45 to 120, 50 to 115, 55 to 110, 60 to 110, 65 to 105, 70 to 100, or 75 to 95 amino acids in length. In some cases, the amino acids are about 22 to about 75 amino acids in length.
A GLP1R binding library comprising de novo synthesized variant sequences encoding immunoglobulins comprising GLP1R binding domains comprises a number of variant sequences. In some cases, a number of variant sequences of CDRH1, CDRH2, CDRH3, CDRL1, CDRL2, CDRL3, VL, VH, or a combination thereof are synthesized de novo. In some cases, a number of variant sequences of the framework element 1 (FW 1), the framework element 2 (FW 2), the framework element 3 (FW 3) or the framework element 4 (FW 4) are synthesized de novo. In some cases, many variant sequences of GPCR binding domains are synthesized de novo. For example, the number of variant sequences is about 1 to about 10 sequences of the VH domain, about 10 of the GLP1R binding domain 8 Sequences, and about 1 to about 44 sequences of the VK domain. The number of variant sequences may be at least or about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, or more than 500 sequences. In some cases, the number of variant sequences is about 10 to 300, 25 to 275, 50 to 250, 75 to 225, 100 to 200, or 125 to 150 sequences.
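As a rough illustration of how the per-region counts in the example above combine, the sketch below multiplies them to give a theoretical upper bound on library diversity. It reads the GLP1R binding domain count as about 10^8 and assumes that every VH variant can pair with every binding domain variant and every VK variant; that fully combinatorial assembly is an assumption made for illustration and is not stated in the text. The function name is hypothetical.

```python
from math import prod


def theoretical_diversity(variants_per_region: dict[str, int]) -> int:
    """Upper bound on distinct constructs if all regions combine freely."""
    if any(n < 1 for n in variants_per_region.values()):
        raise ValueError("each region needs at least one variant")
    return prod(variants_per_region.values())


# Upper ends of the example ranges: 10 VH, about 10**8 binding domains, 44 VK.
upper = theoretical_diversity(
    {"VH": 10, "GLP1R binding domain": 10**8, "VK": 44}
)
print(f"{upper:.1e}")
```

The number of variants actually synthesized or recovered after cloning and selection would generally be lower than this bound.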
Described herein are antibodies or antibody fragments thereof that bind GLP1R. In some embodiments, the antibody or antibody fragment thereof comprises a sequence as set forth in Tables 7-13. In some embodiments, an antibody or antibody fragment thereof comprises a sequence having at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a sequence as set forth in Tables 7-13.
In some cases, an antibody or antibody fragment described herein comprises the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 80% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 85% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 90% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 95% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, an antibody or antibody fragment described herein comprises the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 80% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 85% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 90% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 95% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, an antibody or antibody fragment described herein comprises the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 80% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977. 
In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 85% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 90% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 95% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977.
In some cases, an antibody or antibody fragment described herein comprises the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 80% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 85% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 90% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 95% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, an antibody or antibody fragment described herein comprises the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 80% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 85% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 90% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 95% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, an antibody or antibody fragment described herein comprises the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 80% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. 
In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 85% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 90% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, an antibody or antibody fragment described herein comprises a sequence that is at least 95% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347.
In some embodiments, an antibody or antibody fragment comprises a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs 441-619; (b) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs 620-798; (c) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs 799-977; (d) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs 978-1156; (e) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs 1157-1168; and (f) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs 1169-1347. In some embodiments, an antibody or antibody fragment comprises a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 441-619; (b) the amino acid sequence of CDRH2 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 620-798; (c) the amino acid sequence of CDRH3 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 799-977; (d) the amino acid sequence of CDRL1 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 978-1156; (e) the amino acid sequence of CDRL2 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 1157-1168; and (f) the amino acid sequence of CDRL3 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 1169-1347.
In some embodiments, described herein are antibodies or antibody fragments comprising a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises an amino acid sequence at least about 90% identical to the sequence set forth in any one of SEQ ID NOs 58-77, and wherein VL comprises an amino acid sequence at least about 90% identical to the sequence set forth in any one of SEQ ID NOs 92-111. In some cases, an antibody or antibody fragment comprises a VH comprising at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any of SEQ ID NOs 58-77 and a VL comprising at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any of SEQ ID NOs 92-111.
The term "sequence identity" means that two polynucleotide sequences are identical over a comparison window (i.e., on a nucleotide-by-nucleotide basis). The term "percent sequence identity" is calculated by comparing two optimally aligned sequences over a comparison window, determining the number of positions in the two sequences at which the same nucleobase (e.g., A, T, C, G, U or I) occurs to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window (i.e., window size), and multiplying the result by 100 to yield the percent sequence identity. In general, techniques for determining sequence identity include comparing two nucleotide or amino acid sequences and determining their percent identity. Sequence comparisons, for example, to assess identity, can be made by any suitable alignment algorithm, including, but not limited to, the Needleman-Wunsch algorithm (see, e.g., the EMBOSS Needle aligner available at www.ebi.ac.uk/Tools/psa/emboss_needle/, optionally with default settings), the BLAST algorithm (see, e.g., the BLAST alignment tool available at blast.ncbi.nlm.nih.gov/blast.cgi, optionally with default settings), and the Smith-Waterman algorithm (see, e.g., the EMBOSS Water aligner available at www.ebi.ac.uk/Tools/psa/emboss_water/, optionally with default settings). Any suitable parameters of the selected algorithm, including default parameters, may be used to evaluate the optimal alignment. The "percent identity" between two sequences, also referred to as "percent homology", can be calculated as the number of exact matches between the two optimally aligned sequences divided by the length of the reference sequence and multiplied by 100. The percent identity can also be determined, for example, by comparing sequence information using an advanced BLAST computer program (including version 2.2.9 available from the National Institutes of Health). 
The BLAST program is based on the alignment method of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 87:2264-2268 (1990), and is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990); Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993); and Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997). Briefly, the BLAST program defines identity as the number of identical aligned symbols (i.e., nucleotides or amino acids) divided by the total number of symbols in the shorter of the two sequences. The program can be used to determine percent identity over the full length of the sequences being compared. Default parameters are provided to optimize searches with short query sequences (e.g., with the blastp program). The program also allows the use of an SEG filter to mask segments of the query sequence as determined by the SEG program of Wootton and Federhen, Computers and Chemistry 17:149-163 (1993). High sequence identity typically includes a range of about 80% to 100% sequence identity and integer values therebetween.
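By way of non-limiting illustration, the percent identity calculations described above may be sketched as follows. The sketch assumes the two sequences have already been optimally aligned by one of the algorithms named above and that gaps are written as "-". The function name `percent_identity` and the `denominator` parameter are hypothetical and are introduced here only for illustration. The three denominator options correspond to the comparison window size, the length of the reference sequence (assumed here to be the second sequence), and the length of the shorter sequence.

```python
def percent_identity(aligned_a, aligned_b, denominator="window"):
    """Percent identity of two pre-aligned sequences (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # A position matches only when both symbols are identical and not a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")

    # Ungapped lengths of the two original sequences.
    ungapped_a = len(aligned_a.replace("-", ""))
    ungapped_b = len(aligned_b.replace("-", ""))

    if denominator == "window":
        total = len(aligned_a)               # total positions in the comparison window
    elif denominator == "reference":
        total = ungapped_b                   # assumption: second sequence is the reference
    elif denominator == "shorter":
        total = min(ungapped_a, ungapped_b)  # shorter of the two sequences
    else:
        raise ValueError("denominator must be 'window', 'reference', or 'shorter'")

    if total == 0:
        raise ValueError("cannot compute identity for empty sequences")
    return 100.0 * matches / total


if __name__ == "__main__":
    # Four matched positions in a six-position window.
    print(percent_identity("ACGT-A", "ACCTTA", "window"))
    print(percent_identity("ACGT-A", "ACCTTA", "shorter"))
```

Because the choice of denominator changes the reported value for the same alignment, a stated identity threshold (e.g., at least 90%) is best read together with the convention used to compute it.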
A GLP1R binding library comprising de novo synthesized variant sequences encoding immunoglobulins comprising GLP1R binding domains comprises improved diversity. For example, variants are generated by placing a GLP1R binding domain variant into an immunoglobulin comprising an N-terminal CDRH3 variation and a C-terminal CDRH3 variation. In some cases, the variant comprises an affinity maturation variant. Alternatively or in combination, variants include variants in other regions of the immunoglobulin including, but not limited to, CDRH1, CDRH2, CDRL1, CDRL2, and CDRL3. In some cases, the number of variants of the GLP1R binding library is at least or about 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15, 10^16, 10^17, 10^18, 10^19, 10^20, or more than 10^20 different sequences. For example, a library comprising about 10 VH region variant sequences, about 237 CDRH3 region variant sequences, and about 43 VL and CDRL3 region variant sequences comprises about 10^5 different sequences (10 × 237 × 43).
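By way of non-limiting illustration, the combinatorial arithmetic in the example above may be sketched as follows. The sketch assumes that every variant of one region can be paired with every variant of the others, so that the theoretical number of different sequences is the product of the per-region variant counts. The function name `theoretical_diversity` and the dictionary keys are hypothetical labels.

```python
from math import prod


def theoretical_diversity(variant_counts):
    """Product of per-region variant counts, assuming free combination of regions."""
    return prod(variant_counts.values())


if __name__ == "__main__":
    # Counts taken from the example in the text.
    counts = {"VH": 10, "CDRH3": 237, "VL_and_CDRL3": 43}
    total = theoretical_diversity(counts)
    print(total)  # 101910, i.e., on the order of 10^5
```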
In some cases, at least one region of antibody variation is from a heavy chain V gene family, a heavy chain D gene family, a heavy chain J gene family, a light chain V gene family, or a light chain J gene family. In some cases, the light chain V gene family comprises an immunoglobulin kappa (IGK) gene or an immunoglobulin lambda (IGL). Exemplary regions of antibody variation include, but are not limited to, IGHV1-18, IGHV1-69, IGHV1-8, IGHV3-21, IGHV3-23, IGHV3-30/33rn, IGHV3-28, IGHV1-69, IGHV3-74, IGHV4-39, IGHV4-59/61, IGKV1-39, IGKV1-9, IGKV2-28, IGKV3-11, IGKV3-15, IGKV3-20, IGKV4-1, IGLV1-51, IGLV2-14, IGLV1-40, and IGLV3-1. In some cases, the gene is IGHV1-69, IGHV3-30, IGHV3-23, IGHV3, IGHV1-46, IGHV3-7, IGHV1, or IGHV1-8. In some cases, the genes are IGHV1-69 and IGHV3-30. In some cases, the region of antibody variation is IGHJ3, IGHJ6, IGHJ4, IGHJ5, IGHJ2, or IGH1. In some cases, the region of antibody variation is IGHJ3, IGHJ6, IGHJ or IGHJ4. In some cases, at least one region of antibody variation is IGHV1-69, IGHV3-23, IGKV3-20, IGKV1-39, or a combination thereof. In some cases, at least one region of antibody variation is IGHV1-69 and IGKV3-20. In some cases, at least one region of antibody variation is IGHV1-69 and IGKV1-39. In some cases, at least one region of antibody variation is IGHV3-23 and IGKV3-20. In some cases, at least one region of antibody variation is IGHV3-23 and IGKV1-39.
Provided herein are libraries comprising nucleic acids encoding GLP1R antibodies comprising variations in at least one region of the antibody, wherein the region is a CDR region. In some cases, the GLP1R antibody is a single domain antibody comprising one heavy chain variable domain, e.g., a VHH antibody. In some cases, a VHH antibody comprises a variation in one or more CDR regions. In some cases, a library described herein comprises at least or about 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2400, 2600, 2800, 3000, or more than 3000 CDR1, CDR2, or CDR3 sequences. In some cases, a library described herein comprises at least or about 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15, 10^16, 10^17, 10^18, 10^19, 10^20, or more than 10^20 CDR1, CDR2, or CDR3 sequences. For example, the library comprises at least 2000 CDR1 sequences, at least 1200 CDR2 sequences, and at least 1600 CDR3 sequences. In some cases, each sequence is not identical.
In some cases, CDR1, CDR2, or CDR3 belongs to a variable domain light chain (VL). CDR1, CDR2, or CDR3 of the variable domain light chain (VL) may be referred to as CDRL1, CDRL2, or CDRL3, respectively. In some cases, a library described herein comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2400, 2600, 2800, 3000, or more than 3000 CDR1, CDR2, or CDR3 sequences of the VL. In some cases, a library described herein comprises at least or about 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15, 10^16, 10^17, 10^18, 10^19, 10^20, or more than 10^20 CDR1, CDR2, or CDR3 sequences of the VL. For example, the library comprises at least 20 CDR1 sequences of the VL, at least 4 CDR2 sequences of the VL, and at least 140 CDR3 sequences of the VL. In some cases, the library comprises at least 2 CDR1 sequences of the VL, at least 1 CDR2 sequence of the VL, and at least 3000 CDR3 sequences of the VL. In some cases, VL is IGKV1-39, IGKV1-9, IGKV2-28, IGKV3-11, IGKV3-15, IGKV3-20, IGKV4-1, IGLV1-51, IGLV2-14, IGLV1-40 or IGLV3-1. In some cases, VL is IGKV2-28. In some cases, VL is IGLV1-51.
In some cases, CDR1, CDR2, or CDR3 belongs to a variable domain heavy chain (VH). CDR1, CDR2, or CDR3 of the variable domain heavy chain (VH) may be referred to as CDRH1, CDRH2, or CDRH3, respectively. In some cases, a library described herein comprises at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, 2000, 2400, 2600, 2800, 3000, or more than 3000 CDR1, CDR2, or CDR3 sequences of the VH. In some cases, a library described herein comprises at least or about 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15, 10^16, 10^17, 10^18, 10^19, 10^20, or more than 10^20 CDR1, CDR2, or CDR3 sequences of the VH. For example, the library comprises at least 30 CDR1 sequences of the VH, at least 570 CDR2 sequences of the VH, and at least 10^8 CDR3 sequences of the VH. In some cases, the library comprises at least 30 CDR1 sequences of the VH, at least 860 CDR2 sequences of the VH, and at least 10^7 CDR3 sequences of the VH. In some cases, VH is IGHV1-18, IGHV1-69, IGHV1-8, IGHV3-21, IGHV3-23, IGHV3-30/33rn, IGHV3-28, IGHV3-74, IGHV4-39, or IGHV4-59/61. In some cases, VH is IGHV1-69, IGHV3-30, IGHV3-23, IGHV3, IGHV1-46, IGHV3-7, IGHV1, or IGHV1-8. In some cases, VH is IGHV1-69 or IGHV3-30. In some cases, VH is IGHV3-23.
In some embodiments, the library as described herein comprises CDRL1, CDRL2, CDRL3, CDRH1, CDRH2, or CDRH3 of different lengths. In some cases, the length of CDRL1, CDRL2, CDRL3, CDRH1, CDRH2, or CDRH3 comprises at least or about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, or more than 90 amino acids. For example, the length of CDRH3 comprises at least or about 12, 15, 16, 17, 20, 21, or 23 amino acids. In some cases, the length of CDRL1, CDRL2, CDRL3, CDRH1, CDRH2, or CDRH3 comprises a range of about 1 to about 10, about 5 to about 15, about 10 to about 20, or about 15 to about 30 amino acids.
When translated, libraries comprising nucleic acids encoding antibodies having variant CDR sequences as described herein comprise amino acids of various lengths. In some cases, the length of each of the amino acid fragments or the average length of the synthetic amino acids can be at least or about 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, or more than 150 amino acids. In some cases, the amino acids are about 15 to 150, 20 to 145, 25 to 140, 30 to 135, 35 to 130, 40 to 125, 45 to 120, 50 to 115, 55 to 110, 60 to 110, 65 to 105, 70 to 100, or 75 to 95 amino acids in length. In some cases, the amino acids are about 22 amino acids to about 75 amino acids in length. In some cases, an antibody comprises at least or about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, or more than 5000 amino acids.
In the libraries described herein, the length ratios of CDRL1, CDRL2, CDRL3, CDRH1, CDRH2, or CDRH3 can vary. In some cases, a CDRL1, CDRL2, CDRL3, CDRH1, CDRH2, or CDRH3 comprising at least or about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, or more than 90 amino acids in length comprises about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or more than 90% of the library. For example, CDRH3 comprising about 23 amino acids in length is present in the library at 40%, CDRH3 comprising about 21 amino acids in length is present in the library at 30%, CDRH3 comprising about 17 amino acids in length is present in the library at 20%, and CDRH3 comprising about 12 amino acids in length is present in the library at 10%. In some cases, CDRH3 comprising about 20 amino acids in length is present in the library at 40%, CDRH3 comprising about 16 amino acids in length is present in the library at 30%, CDRH3 comprising about 15 amino acids in length is present in the library at 20%, and CDRH3 comprising about 12 amino acids in length is present in the library at 10%.
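By way of non-limiting illustration, the first CDRH3 length distribution given above may be sketched as follows. The sketch assumes the stated percentages are the only length classes in the library (they sum to 100%). The library size of 1,000 members is an arbitrary value chosen for the arithmetic, and the names `expected_counts` and `mean_length` are hypothetical.

```python
def expected_counts(length_percent, library_size):
    """Expected number of library members for each CDRH3 length class."""
    if sum(length_percent.values()) != 100:
        raise ValueError("length percentages must sum to 100")
    return {length: round(library_size * pct / 100)
            for length, pct in length_percent.items()}


def mean_length(length_percent):
    """Average CDRH3 length weighted by the fraction of each length class."""
    return sum(length * pct for length, pct in length_percent.items()) / 100


if __name__ == "__main__":
    # CDRH3 length (amino acids) -> percent of library, from the example in the text.
    distribution = {23: 40, 21: 30, 17: 20, 12: 10}
    print(expected_counts(distribution, 1000))
    print(mean_length(distribution))
```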
Libraries encoding VHH antibodies as described herein comprise variant CDR sequences that are shuffled to generate a library having a theoretical diversity of at least or about 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15, 10^16, 10^17, 10^18, 10^19, 10^20, or more than 10^20 sequences. In some cases, the final library diversity of the library is at least or about 10^7, 10^8, 10^9, 10^10, 10^11, 10^12, 10^13, 10^14, 10^15, 10^16, 10^17, 10^18, 10^19, 10^20, or more than 10^20 sequences.
Provided herein are GLP1R binding libraries encoding immunoglobulins. In some cases, the GLP1R immunoglobulin is an antibody. In some cases, the GLP1R immunoglobulin is a VHH antibody. In some cases, the GLP1R immunoglobulin comprises a binding affinity (e.g., KD) to GLP1R of less than 1 nM, less than 1.2 nM, less than 2 nM, less than 5 nM, less than 10 nM, less than 11 nM, less than 13.5 nM, less than 15 nM, less than 20 nM, less than 25 nM, or less than 30 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 1 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 1.2 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 2 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 5 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 10 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 13.5 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 15 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 20 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 25 nM. In some cases, the GLP1R immunoglobulin comprises a KD of less than 30 nM.
In some cases, the GLP1R immunoglobulin is a GLP1R agonist. In some cases, the GLP1R immunoglobulin is a GLP1R antagonist. In some cases, the GLP1R immunoglobulin is a GLP1R allosteric modulator. In some cases, the allosteric modulator is a negative allosteric modulator. In some cases, the allosteric modulator is a positive allosteric modulator. In some cases, the GLP1R immunoglobulin results in an agonistic, antagonistic, or allosteric effect at a concentration of at least or about 1 nM, 2 nM, 4 nM, 6 nM, 8 nM, 10 nM, 20 nM, 30 nM, 40 nM, 50 nM, 60 nM, 70 nM, 80 nM, 90 nM, 100 nM, 120 nM, 140 nM, 160 nM, 180 nM, 200 nM, 300 nM, 400 nM, 500 nM, 600 nM, 700 nM, 800 nM, 900 nM, 1000 nM, or more than 1000 nM. In some cases, the GLP1R immunoglobulin is a negative allosteric modulator. In some cases, the GLP1R immunoglobulin is a negative allosteric modulator at a concentration of at least or about 0.001 nM, 0.005 nM, 0.01 nM, 0.05 nM, 0.1 nM, 0.5 nM, 1 nM, 2 nM, 4 nM, 6 nM, 8 nM, 10 nM, 20 nM, 30 nM, 40 nM, 50 nM, 60 nM, 70 nM, 80 nM, 90 nM, 100 nM, or more than 100 nM. In some cases, the GLP1R immunoglobulin is a negative allosteric modulator at a concentration ranging from about 0.001 to about 100 nM, about 0.01 to about 90 nM, about 0.1 to about 80 nM, about 1 to about 50 nM, about 10 to about 40 nM, or about 1 to about 10 nM. In some cases, the GLP1R immunoglobulin comprises an EC50 or IC50 of at least or about 0.001, 0.0025, 0.005, 0.01, 0.025, 0.05, 0.06, 0.07, 0.08, 0.9, 0.1, 0.5, 1, 2, 3, 4, 5, 6, or more than 6 nM. In some cases, the GLP1R immunoglobulin comprises an EC50 or IC50 of at least or about 1 nM, 2 nM, 4 nM, 6 nM, 8 nM, 10 nM, 20 nM, 30 nM, 40 nM, 50 nM, 60 nM, 70 nM, 80 nM, 90 nM, 100 nM, or more than 100 nM.
Provided herein are GLP1R binding libraries encoding immunoglobulins, wherein the immunoglobulins comprise a long half-life. In some cases, the half-life of the GLP1R immunoglobulin is at least or about 12 hours, 24 hours, 36 hours, 48 hours, 60 hours, 72 hours, 84 hours, 96 hours, 108 hours, 120 hours, 140 hours, 160 hours, 180 hours, 200 hours, or more than 200 hours. In some cases, the half-life of the GLP1R immunoglobulin ranges from about 12 hours to about 300 hours, from about 20 hours to about 280 hours, from about 40 hours to about 240 hours, or from about 60 hours to about 200 hours.
GLP1R immunoglobulins as described herein may comprise improved properties. In some cases, the GLP1R immunoglobulin is monomeric. In some cases, GLP1R immunoglobulins are not prone to aggregation. In some cases, at least or about 70%, 75%, 80%, 85%, 90%, 95%, or 99% of the GLP1R immunoglobulin is monomeric. In some cases, the GLP1R immunoglobulin is thermostable. In some cases, GLP1R immunoglobulin results in reduced non-specific binding.
After synthesis of a GLP1R binding library comprising nucleic acids encoding immunoglobulins comprising GLP1R binding domains, the library can be used for screening and analysis. For example, library displayability and panning of the library are assayed. In some cases, displayability is assayed using a selectable tag. Exemplary tags include, but are not limited to, radiolabels, fluorescent labels, enzymes, chemiluminescent labels, colorimetric labels, affinity labels, or other labels or tags known in the art. In some cases, the tag is histidine, polyhistidine, myc, hemagglutinin (HA), or FLAG. In some cases, the GLP1R binding library comprises nucleic acid encoding an immunoglobulin comprising a GPCR binding domain having multiple tags (e.g., GFP, FLAG, and Lucy) and a DNA barcode. In some cases, libraries are assayed by sequencing using various methods, including, but not limited to, single-molecule real-time (SMRT) sequencing, polony sequencing, sequencing by ligation, reversible terminator sequencing, proton detection sequencing, ion semiconductor sequencing, nanopore sequencing, electronic sequencing, pyrosequencing, Maxam-Gilbert sequencing, chain termination (e.g., Sanger) sequencing, +S sequencing, or sequencing by synthesis.
Expression system
Provided herein are libraries comprising nucleic acids encoding immunoglobulins comprising GLP1R binding domains, wherein the libraries have improved specificity, stability, expression, folding or downstream activity. In some cases, the libraries described herein are used for screening and analysis.
Provided herein are libraries comprising nucleic acids encoding immunoglobulins comprising GLP1R binding domains, wherein the nucleic acid libraries are used in screening and analysis. In some cases, screening and analysis includes in vitro, in vivo, or ex vivo assays. Cells for screening include primary cells taken from a living subject or cell lines. The cells may be from prokaryotes (e.g., bacteria and fungi) or eukaryotes (e.g., animals and plants). Exemplary animal cells include, but are not limited to, those from mice, rabbits, primates, and insects. In some cases, the cells used for screening include cell lines, including but not limited to Chinese Hamster Ovary (CHO) cell lines, Human Embryonic Kidney (HEK) cell lines, or Baby Hamster Kidney (BHK) cell lines. In some cases, the nucleic acid libraries described herein can also be delivered to a multicellular organism. Exemplary multicellular organisms include, but are not limited to, plants, mice, rabbits, primates, and insects.
The nucleic acid libraries described herein, or the libraries of proteins encoded thereby, can be screened for various pharmacological or pharmacokinetic properties. In some cases, the library is screened using an in vitro assay, an in vivo assay, or an ex vivo assay. For example, in vitro pharmacological or pharmacokinetic properties screened include, but are not limited to, binding affinity, binding specificity, and binding avidity. Exemplary in vivo pharmacological or pharmacokinetic properties of the libraries described herein that are screened include, but are not limited to, therapeutic efficacy, activity, preclinical toxicity profile, clinical efficacy profile, clinical toxicity profile, immunogenicity, potency, and clinical safety profile.
The pharmacological or pharmacokinetic properties that can be screened include, but are not limited to, cell binding affinity and cell activity. For example, a cell binding affinity assay or cell activity assay is performed to determine agonism, antagonism, or allosteric effects of the libraries described herein. In some cases, the cellular activity assay is a cAMP assay. In some cases, the cell binding or cell activity of a library as described herein is compared to that of a GLP1R ligand.
Libraries as described herein can be screened in a cell-based assay or a non-cell-based assay. Examples of non-cell-based assays include, but are not limited to, use of viral particles, use of in vitro translated proteins, and use of proteoliposomes with GLP1R.
The nucleic acid library as described herein may be screened by sequencing. In some cases, next generation sequencing is used to determine sequence enrichment of GLP1R binding variants. In some cases, V gene distribution, J gene distribution, V gene family, CDR3 count per length, or a combination thereof is determined. In some cases, clonal frequency, clonal accumulation, lineage accumulation, or a combination thereof is determined. In some cases, the number of sequences, sequences with VH clones, clones greater than 1, clonotypes greater than 1, lineages, Simpson's index, or combinations thereof is determined. In some cases, the percentage of non-identical CDR3s is determined. For example, the percentage of non-identical CDR3s is calculated as the number of non-identical CDR3s in the sample divided by the total number of sequences with a CDR3 in the sample.
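By way of non-limiting illustration, the percentage of non-identical CDR3s described above may be sketched as follows. The sketch assumes that CDR3 amino acid sequences have already been extracted from the sequencing reads by an upstream annotation step that is not shown, and that reads lacking a CDR3 are represented as empty strings or `None`. The function name `percent_nonidentical_cdr3` and the example sequences are hypothetical.

```python
def percent_nonidentical_cdr3(cdr3_calls):
    """Number of distinct CDR3s divided by number of sequences that have a CDR3, times 100."""
    # Keep only reads for which a CDR3 was identified.
    with_cdr3 = [seq for seq in cdr3_calls if seq]
    if not with_cdr3:
        raise ValueError("no sequences with a CDR3 in the sample")
    return 100.0 * len(set(with_cdr3)) / len(with_cdr3)


if __name__ == "__main__":
    # Made-up CDR3 calls: four reads with a CDR3 (three distinct) and one read without.
    sample = ["ARDYW", "ARDYW", "ARGGSW", None, "AKDLFDY"]
    print(percent_nonidentical_cdr3(sample))
```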
Provided herein are nucleic acid libraries, wherein the nucleic acid libraries may be expressed in a vector. Expression vectors for inserting the nucleic acid libraries disclosed herein may comprise eukaryotic or prokaryotic expression vectors. Exemplary expression vectors include, but are not limited to, mammalian expression vectors: pSF-CMV-NEO-NH2-PPT-3XFLAG, pSF-CMV-NEO-COOH-3XFLAG, pSF-CMV-PURO-NH2-GST-TEV, pSF-OXB20-COOH-TEV-FLAG(R)-6His, pCEP4 pDEST27, pSF-CMV-Ub-KrYFP, pSF-CMV-FMDV-daGFP, pEF1a-mCherry-N1 vector, pEF1a-tdTomato vector, pSF-CMV-FMDV-Hygro, pSF-CMV-PGK-PURO, pMCP-tag(m), and pSF-CMV-PURO-NH2-CMYC; bacterial expression vectors: pSF-OXB20-BetaGal, pSF-OXB20-Fluc, pSF-OXB20, and pSF-Tac; plant expression vectors: pRI 101-AN DNA and pCambia2301; yeast expression vectors: pTYB21 and pKLAC2; and insect vectors: pAc5.1/V5-His A and pDEST8. In some cases, the vector is pcDNA3 or pcDNA3.1.
Described herein are nucleic acid libraries expressed in vectors to generate constructs comprising immunoglobulins comprising the sequence of a GLP1R binding domain. In some cases, the constructs are of different sizes. In some cases, the construct comprises at least or about 500, 600, 700, 800, 900, 1000, 1100, 1300, 1400, 1500, 1600, 1700, 1800, 2000, 2400, 2600, 2800, 3000, 3200, 3400, 3600, 3800, 4000, 4200, 4400, 4600, 4800, 5000, 6000, 7000, 8000, 9000, 10000, or more than 10000 bases. In some cases, the construct comprises a range of about 300 to 1,000, 300 to 2,000, 300 to 3,000, 300 to 4,000, 300 to 5,000, 300 to 6,000, 300 to 7,000, 300 to 8,000, 300 to 9,000, 300 to 10,000, 1,000 to 2,000, 1,000 to 3,000, 1,000 to 4,000, 1,000 to 5,000, 1,000 to 6,000, 1,000 to 7,000, 1,000 to 8,000, 1,000 to 9,000, 1,000 to 10,000, 2,000 to 3,000, 2,000 to 4,000, 2,000 to 5,000, 2,000 to 6,000, 2,000 to 7,000, 2,000 to 8,000, 2,000 to 9,000, 2,000 to 10,000, 3,000 to 4,000, 3,000 to 5,000, 3,000 to 6,000, 3,000 to 7,000, 3,000 to 8,000, 3,000 to 9,000, 3,000 to 10,000, 4,000 to 5,000, 4,000 to 6,000, 4,000 to 7,000, 4,000 to 8,000, 4,000 to 9,000, 4,000 to 10,000, 5,000 to 6,000, 5,000 to 7,000, 5,000 to 8,000, 5,000 to 9,000, 5,000 to 10,000, 6,000 to 7,000, 6,000 to 8,000, 6,000 to 9,000, 6,000 to 10,000, 7,000 to 8,000, 7,000 to 9,000, 7,000 to 10,000, 8,000 to 9,000, 8,000 to 10,000, or 9,000 to 10,000 bases.
Provided herein are libraries comprising nucleic acids encoding immunoglobulins comprising a GPCR binding domain, wherein the nucleic acid library is expressed in cells. In some cases, the library is synthesized to express a reporter gene. Exemplary reporter genes include, but are not limited to, acetohydroxyacid synthase (AHAS), alkaline phosphatase (AP), beta-galactosidase (LacZ), beta-glucuronidase (GUS), chloramphenicol acetyltransferase (CAT), green fluorescent protein (GFP), red fluorescent protein (RFP), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), cerulean fluorescent protein, citrine fluorescent protein, orange fluorescent protein, cherry fluorescent protein, turquoise fluorescent protein, blue fluorescent protein, horseradish peroxidase (HRP), luciferase (Luc), nopaline synthase (NOS), octopine synthase (OCS), luciferase, and derivatives thereof. Methods for determining reporter gene modulation are well known in the art and include, but are not limited to, fluorometric methods (e.g., fluorescence spectroscopy, fluorescence-activated cell sorting (FACS), fluorescence microscopy) and antibiotic resistance assays.
Diseases and disorders
Provided herein are GLP1R binding libraries comprising nucleic acids encoding immunoglobulins (e.g., antibodies) that comprise GLP1R binding domains that may have therapeutic effects. In some cases, the GLP1R binding library, when translated, produces a protein for treating a disease or disorder. In some cases, the protein is an immunoglobulin. In some cases, the protein is a peptidomimetic.
The GLP1R library as described herein may comprise modulators of GLP1R. In some cases, the modulator of GLP1R is an inhibitor. In some cases, the modulator of GLP1R is an activator. In some cases, the GLP1R inhibitor is a GLP1R antagonist. In some cases, the GLP1R antagonist is GLP1R-3. In some cases, modulators of GLP1R are used to treat various diseases or disorders.
Exemplary diseases include, but are not limited to, cancer, inflammatory diseases or disorders, metabolic diseases or disorders, cardiovascular diseases or disorders, respiratory diseases or disorders, pain, digestive diseases or disorders, reproductive diseases or disorders, endocrine diseases or disorders, or neurological diseases or disorders. In some cases, the cancer is a solid cancer or a hematologic cancer. In some cases, modulators of GLP1R as described herein are used to treat weight gain (or to induce weight loss), treat obesity, or treat type II diabetes. In some cases, GLP1R modulators are used to treat hypoglycemia. In some cases, GLP1R modulators are used to treat post-bariatric hypoglycemia. In some cases, GLP1R modulators are used to treat severe hypoglycemia. In some cases, GLP1R modulators are used to treat hyperinsulinemia. In some cases, GLP1R modulators are used to treat congenital hyperinsulinemia.
In some cases, the subject is a mammal. In some cases, the subject is a mouse, rabbit, dog, or human. The subject treated by the methods described herein may be an infant, adult, or child. Pharmaceutical compositions comprising an antibody or antibody fragment as described herein may be administered intravenously or subcutaneously.
Described herein are pharmaceutical compositions comprising antibodies or antibody fragments thereof that bind GLP1R. In some embodiments, the antibody or antibody fragment thereof comprises the sequences as set forth in Tables 7-13. In some embodiments, an antibody or antibody fragment thereof comprises a sequence having at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a sequence as set forth in Tables 7-13.
In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 80% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 85% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 90% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 95% identical to the CDRH1 sequence of any one of SEQ ID NOs 441-619. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 80% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 85% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 90% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. 
In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 95% identical to the CDRH2 sequence of any one of SEQ ID NOs 620-798. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 80% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 85% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 90% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 95% identical to the CDRH3 sequence of any one of SEQ ID NOs 799-977.
In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 80% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 85% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 90% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 95% identical to the CDRL1 sequence of any one of SEQ ID NOs 978-1156. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 80% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 85% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 90% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. 
In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 95% identical to the CDRL2 sequence of any one of SEQ ID NOs 1157-1168. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 80% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 85% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 90% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347. In some cases, the pharmaceutical composition comprises an antibody or antibody fragment described herein comprising a sequence that is at least 95% identical to the CDRL3 sequence of any one of SEQ ID NOs 1169-1347.
In some embodiments, an antibody or antibody fragment comprises a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs 441-619; (b) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs 620-798; (c) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs 799-977; (d) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs 978-1156; (e) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs 1157-1168; and (f) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs 1169-1347. In some embodiments, an antibody or antibody fragment comprises a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 441-619; (b) the amino acid sequence of CDRH2 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 620-798; (c) the amino acid sequence of CDRH3 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 799-977; (d) the amino acid sequence of CDRL1 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 978-1156; (e) the amino acid sequence of CDRL2 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 1157-1168; and (f) the amino acid sequence of CDRL3 is at least or about 80%, 85%, 90%, or 95% identical to any one of SEQ ID NOs 1169-1347.
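The percent-identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the simplest possible definition, an ungapped position-by-position comparison of two equal-length sequences; the text does not specify an alignment method, and a gapped alignment could give different values. The function names and the two reference strings are hypothetical placeholders and are not taken from the sequence listing.

```python
def percent_identity(query: str, reference: str) -> float:
    """Ungapped percent identity between two equal-length amino acid sequences."""
    if len(query) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)


def meets_threshold(query: str, references: list[str], threshold: float) -> bool:
    """True if the query is at least `threshold` percent identical to any reference."""
    return any(
        len(query) == len(ref) and percent_identity(query, ref) >= threshold
        for ref in references
    )


# Hypothetical CDR-like strings for illustration only (assumption, not SEQ ID entries).
reference_cdrs = ["GFTFSSYAMS", "GYTFTSYYMH"]
candidate = "GFTFSSYGMS"  # differs from the first reference at one position

print(percent_identity(candidate, reference_cdrs[0]))    # 90.0
print(meets_threshold(candidate, reference_cdrs, 90.0))  # True
print(meets_threshold(candidate, reference_cdrs, 95.0))  # False
```

For a 10-residue sequence, a single substitution already lowers identity to 90%, so the 95% threshold is only met by an exact match at that length.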
In some embodiments, described herein are antibodies or antibody fragments comprising a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises an amino acid sequence at least about 90% identical to the sequence set forth in any one of SEQ ID NOs 58-77, and wherein VL comprises an amino acid sequence at least about 90% identical to the sequence set forth in any one of SEQ ID NOs 92-111. In some cases, an antibody or antibody fragment comprises a VH comprising at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any of SEQ ID NOs 58-77 and a VL comprising at least or about 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any of SEQ ID NOs 92-111.
Described herein are pharmaceutical compositions comprising an antibody or antibody fragment thereof that binds GLP1R, comprising various doses of the antibody or antibody fragment. In some cases, the dose is in the range of about 1 to about 80mg/kg, about 1 to about 100mg/kg, about 5 to about 80mg/kg, about 5 to about 60mg/kg, about 5 to about 50mg/kg, or about 5 to about 500mg/kg, which may be administered in a single dose or multiple doses. In some cases, the dose is about 0.01mg/kg, about 0.05mg/kg, about 0.10mg/kg, about 0.25mg/kg, about 0.5mg/kg, about 1mg/kg, about 5mg/kg, about 10mg/kg, about 15mg/kg, about 20mg/kg, about 25mg/kg, about 30mg/kg, about 35mg/kg, about 40mg/kg, about 45mg/kg, about 50mg/kg, about 55mg/kg, about 60mg/kg, about 65mg/kg, about 70mg/kg, about 75mg/kg, about 80mg/kg, about 85mg/kg, about 90mg/kg, about 95mg/kg, about 100mg/kg, about 105mg/kg, about 110mg/kg, about 115mg/kg, about 120mg/kg, about 125mg/kg, about 130mg/kg, about 135mg/kg, about 140mg/kg, about 145mg/kg, about 150mg/kg, about 155mg/kg, about 160mg/kg, about 165mg/kg, about 170mg/kg, about 175mg/kg, about 180mg/kg, about 185mg/kg, about 190mg/kg, about 195mg/kg, about 200mg/kg, about 205mg/kg, about 210mg/kg, about 215mg/kg, about 220mg/kg, about 225mg/kg, about 230mg/kg, about 240mg/kg, about 250mg/kg, about 260mg/kg, about 270mg/kg, about 275mg/kg, about 280mg/kg, about 290mg/kg, about 300mg/kg, about 310mg/kg, about 320mg/kg, about 330mg/kg, about 340mg/kg, about 350mg/kg, about 360mg/kg, about 370mg/kg, about 380mg/kg, about 390mg/kg, about 400mg/kg, about 410mg/kg, about 420mg/kg, about 430mg/kg, about 440mg/kg, about 450mg/kg, about 460mg/kg, about 470mg/kg, about 480mg/kg, about 490mg/kg, or about 500mg/kg.
Variant libraries
Codon variation
The library of variant nucleic acids described herein may comprise a plurality of nucleic acids, wherein each nucleic acid encodes a variant codon sequence as compared to a reference nucleic acid sequence. In some cases, each nucleic acid of the first population of nucleic acids contains a variant at a single variant site. In some cases, the first nucleic acid population contains multiple variants at a single variant site, such that the first nucleic acid population contains more than one variant at the same variant site. The first population of nucleic acids may comprise nucleic acids that collectively encode a plurality of codon variants at the same variant site. The first population of nucleic acids may comprise nucleic acids that collectively encode up to 19 or more codons at the same location. The first population of nucleic acids may comprise nucleic acids that collectively encode up to 60 variant triplets at the same location, or the first population of nucleic acids may comprise nucleic acids that collectively encode up to 61 different codon triplets at the same location. During translation, each variant may encode a codon that produces a different amino acid. Table 2 provides a list of each possible codon (and representative amino acids) for a variant site.
TABLE 2 codon and amino acid lists
The population of nucleic acids may comprise different nucleic acids that collectively encode up to 20 codon variations at multiple positions. In this case, each nucleic acid in the population comprises codon variations at more than one position in the same nucleic acid. In some cases, each nucleic acid in the population comprises a codon variation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more codons in a single nucleic acid. In some cases, each variant long nucleic acid comprises a codon variation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more codons in a single long nucleic acid. In some cases, the population of variant nucleic acids comprises a codon variation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more codons in a single nucleic acid. In some cases, the population of variant nucleic acids comprises a codon variation at at least about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more codons in a single long nucleic acid.
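The codon counts given above (61 coding triplets, up to 60 variant triplets at a site, and 19 alternative amino acids) can be reproduced with a short Python sketch built on the standard genetic code. This is an illustration only, not the synthesis method of the disclosure; the function names and the nine-base reference sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
SENSE_CODONS = [codon for codon, aa in CODON_TABLE.items() if aa != "*"]


def variants_at_site(reference: str, codon_index: int) -> list[str]:
    """Replace one codon of `reference` with every other non-stop codon."""
    start = 3 * codon_index
    original = reference[start:start + 3]
    return [
        reference[:start] + codon + reference[start + 3:]
        for codon in SENSE_CODONS
        if codon != original
    ]


def translate(sequence: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3))


reference = "ATGGCTTGG"  # hypothetical reference encoding Met-Ala-Trp
site_library = variants_at_site(reference, codon_index=1)

print(len(SENSE_CODONS))                                   # 61
print(len(site_library))                                   # 60
print(len({translate(v)[1] for v in site_library} - {"A"}))  # 19
```

Because several codons encode the same amino acid, the 60 variant triplets at the alanine site collapse to 19 distinct amino acid substitutions plus synonymous alanine codons.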
Highly parallel nucleic acid synthesis
Provided herein is a platform approach that utilizes miniaturization, parallelization, and vertical integration of the end-to-end process from polynucleotide synthesis to gene assembly within nanowells on silicon to create a revolutionary synthesis platform. The devices described herein provide a silicon synthesis platform with the same footprint as a 96-well plate, which is capable of increasing throughput by up to 1,000-fold or more as compared to traditional synthesis methods, producing up to about 1,000,000 or more polynucleotides or 10,000 or more genes in a single highly parallelized run.
With the advent of next generation sequencing, high resolution genomic data has become an important factor in studies of the biological roles of various genes in both normal biology and disease pathogenesis. At the core of this research is the central dogma of molecular biology and the concept of "residue-by-residue transfer of sequential information". Genomic information encoded in DNA is transcribed into a message, which is then translated into the protein that is the active product within a given biological pathway.
Another exciting area of research is the discovery, development and manufacture of therapeutic molecules focused on highly specific cellular targets. Highly diverse DNA sequence libraries are at the core of development pipelines for targeted therapeutics. Gene mutants are used to express proteins in a design, build, and test protein engineering cycle that ideally culminates in an optimized gene for high expression of a protein with high affinity for its therapeutic target. As an example, consider the binding pocket of a receptor. The ability to test all sequence permutations of all residues within the binding pocket simultaneously would allow for thorough exploration, increasing the chance of success. Saturation mutagenesis, in which a researcher attempts to generate all possible mutations at a specific site within the receptor, represents one approach to this development challenge. Although it is costly, time consuming, and labor intensive, it enables each variant to be introduced at each position. In contrast, combinatorial mutagenesis, in which a few selected positions or a short stretch of DNA may be modified extensively, generates an incomplete repertoire of variants with biased representation.
To accelerate the drug development pipeline, a library in which the desired variants are available at the intended frequency at the right position for testing (in other words, a precision library) can reduce the cost and turnaround time of screening. Provided herein are methods for synthesizing nucleic acid synthetic variant libraries that provide for precise introduction of each intended variant at the desired frequency. For the end user, this means that sequence space can be not only thoroughly sampled but also queried in an efficient manner, reducing cost and screening time. Genome-wide editing can elucidate important pathways, libraries can be built in which each variant and sequence permutation is tested for optimal functionality, and thousands of genes can be used to reconstruct entire pathways and genomes to re-engineer biological systems for drug discovery.
In a first example, the drug itself may be optimized using the methods described herein. For example, to improve a particular function of an antibody, a library of variant polynucleotides encoding a portion of the antibody is designed and synthesized. A library of variant nucleic acids for the antibody can then be generated by the processes described herein (e.g., PCR mutagenesis followed by insertion into a vector). The antibody is then expressed in a production cell line and screened for enhanced activity. Exemplary screens include examining modulation of binding affinity to an antigen, stability, or effector function (e.g., ADCC, complement, or apoptosis). Exemplary regions of an antibody to optimize include, but are not limited to, the Fc region, the Fab region, the variable region of the Fab region, the constant region of the Fab region, the variable domain of the heavy chain or light chain (VH or VL), and specific complementarity determining regions (CDRs) of VH or VL.
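The scale difference between single-site saturation and testing every sequence permutation of a binding pocket can be made concrete with a small Python calculation. It is a sketch under stated assumptions: 20 canonical amino acids, variants counted at the protein level, and pocket sizes chosen arbitrarily for illustration. The function names are hypothetical.

```python
def single_site_saturation_size(n_positions: int, n_amino_acids: int = 20) -> int:
    """Protein variants when each position is mutated one at a time to every other residue."""
    return n_positions * (n_amino_acids - 1)


def full_permutation_size(n_positions: int, n_amino_acids: int = 20) -> int:
    """All residue combinations across the positions, including the reference sequence."""
    return n_amino_acids ** n_positions


for n in (1, 3, 6):  # illustrative pocket sizes (assumption)
    print(n, single_site_saturation_size(n), full_permutation_size(n))
# 1 19 20
# 3 57 8000
# 6 114 64000000
```

The single-site count grows linearly with the number of positions, while the full permutation count grows exponentially, which is why precise control over which variants are synthesized matters as the number of varied positions increases.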
The nucleic acid libraries synthesized by the methods described herein can be expressed in a variety of cells associated with a disease state. Cells associated with a disease state include cell lines, tissue samples, primary cells from a subject, cultured cells expanded from a subject, or cells in a model system. Exemplary model systems include, but are not limited to, plant and animal models of disease states.
To identify variant molecules associated with the prevention, alleviation or treatment of a disease state, the library of variant nucleic acids described herein is expressed in cells associated with the disease state, or in cells in which the disease state can be induced. In some cases, an agent is used to induce a disease state in the cells. Exemplary tools for inducing a disease state include, but are not limited to, a Cre/Lox recombination system, LPS-induced inflammation, and streptozotocin to induce hyperglycemia. The cells associated with a disease state may be cells from a model system or cultured cells, as well as cells from a subject having a particular disease condition. Exemplary disease conditions include bacterial, fungal, viral, autoimmune, or proliferative disorders (e.g., cancer). In some cases, the library of variant nucleic acids is expressed in a model system, a cell line, or primary cells derived from a subject, and screened for a change in at least one cellular activity. Exemplary cellular activities include, but are not limited to, proliferation, cell cycle progression, cell death, adhesion, migration, replication, cell signaling, energy production, oxygen utilization, metabolic activity, aging, response to free radical damage, or any combination thereof.
Substrate
Devices used as polynucleotide synthesis surfaces may be in the form of substrates including, but not limited to, homogeneous array surfaces, patterned array surfaces, channels, beads, gels, and the like. Provided herein are substrates comprising a plurality of clusters, wherein each cluster comprises a plurality of loci that support attachment and synthesis of polynucleotides. In some cases, the substrate comprises a homogeneous array surface. For example, the homogeneous array surface is a homogeneous plate. The term "locus" as used herein refers to a discrete region on a structure that provides support for polynucleotides encoding a single predetermined sequence to extend from the surface. In some cases, the locus is on a two-dimensional surface, e.g., a substantially flat surface. In some cases, the locus is on a three-dimensional surface, e.g., a well, microwell, channel, or post. In some cases, the surface of the locus comprises a material that is actively functionalized to attach to at least one nucleotide for polynucleotide synthesis, or preferably to a population of identical nucleotides for synthesis of a population of polynucleotides. In some cases, a polynucleotide refers to a population of polynucleotides encoding the same nucleic acid sequence. In some cases, the surface of the substrate includes one or more surfaces of the substrate. The average error rate of polynucleotides synthesized within the libraries described herein using the provided systems and methods is typically less than 1/1000, less than about 1/2000, less than about 1/3000, or less, typically without error correction.
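To give a sense of what the stated error rates imply for full-length products, the following Python sketch estimates the fraction of polynucleotides expected to be error-free. It assumes that errors occur independently at each base with a uniform per-base rate, which is a simplification not stated in the text; the function name and the lengths chosen are illustrative.

```python
def error_free_fraction(error_rate: float, length: int) -> float:
    """Expected fraction of error-free molecules, assuming independent per-base errors."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    return (1.0 - error_rate) ** length


# Error rates from the text; polynucleotide lengths are illustrative (assumption).
for rate in (1 / 1000, 1 / 2000, 1 / 3000):
    for length in (100, 200, 300):
        fraction = error_free_fraction(rate, length)
        print(f"rate 1/{round(1 / rate)}, {length} bases: {fraction:.3f} error-free")
```

Under this model, a 200-base polynucleotide synthesized at an error rate of 1/1000 is error-free roughly 82% of the time, and the fraction rises as the error rate falls or the length shortens.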
Provided herein are surfaces that support parallel synthesis of multiple polynucleotides having different predetermined sequences at addressable locations on a common support. In some cases, the substrate provides support for synthesizing more than 50, 100, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2,000, 5,000, 10,000, 20,000, 50,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,200,000, 1,400,000, 1,600,000, 1,800,000, 2,000,000, 2,500,000, 3,000,000, 3,500,000, 4,000,000, 4,500,000, 5,000,000, 10,000,000 or more different polynucleotides. In some cases, the surface provides support for synthesizing more than 50, 100, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2,000, 5,000, 10,000, 20,000, 50,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,200,000, 1,400,000, 1,600,000, 1,800,000, 2,000,000, 2,500,000, 3,000,000, 3,500,000, 4,000,000, 4,500,000, 5,000,000, 10,000,000 or more polynucleotides encoding different sequences. In some cases, at least a portion of the polynucleotide has the same sequence or is configured to be synthesized with the same sequence. In some cases, the substrate provides a surface environment for growth of a polynucleotide having at least 80, 90, 100, 120, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500 bases or more.
Provided herein are methods of synthesizing polynucleotides at different loci on a substrate, wherein each locus supports synthesis of a population of polynucleotides. In some cases, each locus supports synthesis of a population of polynucleotides having a different sequence than a population of polynucleotides grown at another locus. In some cases, each polynucleotide sequence is synthesized with 1, 2, 3, 4, 5, 6, 7, 8, 9 or more redundancies across different loci within the same cluster of loci on the surface used for polynucleotide synthesis. In some cases, the loci of the substrate are located within a plurality of clusters. In some cases, the substrate comprises at least 10, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 11000, 12000, 13000, 14000, 15000, 20000, 30000, 40000, 50000, or more clusters. In some cases, the substrate comprises more than 2,000, 5,000, 10,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,100,000, 1,200,000, 1,300,000, 1,400,000, 1,500,000, 1,600,000, 1,700,000, 1,800,000, 1,900,000, 2,000,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,200,000, 1,400,000, 1,600,000, 1,800,000, 2,000,000, 2,500,000, 3,000,000, 3,500,000, 4,000,000, 4,500,000, 5,000,000, or 10,000,000 or more different loci. In some cases, the substrate comprises about 10,000 different loci. The number of loci within a single cluster varies in different cases. In some cases, each cluster comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 130, 150, 200, 300, 400, 500, or more loci. In some cases, each cluster includes about 50-500 loci. In some cases, each cluster includes about 100-200 loci. In some cases, each cluster includes about 100-150 loci. In some cases, each cluster includes about 109, 121, 130, or 137 loci. In some cases, each cluster includes about 19, 20, 61, 64, or more loci. 
Alternatively or in combination, polynucleotide synthesis occurs on a homogeneous array surface.
In some cases, the number of different polynucleotides synthesized on the substrate depends on the number of different loci available in the substrate. In some cases, the density of loci within a cluster or surface of the substrate is at least or about 1, 10, 25, 50, 65, 75, 100, 130, 150, 175, 200, 300, 400, 500, 1,000 or more loci per mm². In some cases, the substrate comprises 10-500, 25-400, 50-500, 100-500, 150-500, 10-250, 50-250, 10-200, or 50-200 mm². In some cases, the distance between the centers of two adjacent loci within a cluster or surface is about 10-500, about 10-200, or about 10-100 μm. In some cases, the distance between the centers of two adjacent loci is greater than about 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 μm. In some cases, the distance between the centers of two adjacent loci is less than about 200, 150, 100, 80, 70, 60, 50, 40, 30, 20, or 10 μm. In some cases, the width of each locus is about 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 μm. In some cases, the width of each locus is about 0.5-100, 0.5-50, 10-75, or 0.5-50 μm.
In some cases, the density of clusters within the substrate is at least or about 1 cluster per 100 mm², 1 cluster per 10 mm², 1 cluster per 5 mm², 1 cluster per 4 mm², 1 cluster per 3 mm², 1 cluster per 2 mm², 1 cluster per 1 mm², 2 clusters per 1 mm², 3 clusters per 1 mm², 4 clusters per 1 mm², 5 clusters per 1 mm², 10 clusters per 1 mm², 50 clusters per 1 mm², or more. In some cases, the substrate comprises from about 1 cluster per 10 mm² to about 10 clusters per 1 mm². In some cases, the distance between the centers of two adjacent clusters is at least or about 50, 100, 200, 500, 1000, 2000, or 5000 μm. In some cases, the distance between the centers of two adjacent clusters is about 50-100, 50-200, 50-300, 50-500, or 100-2000 μm. In some cases, the distance between the centers of two adjacent clusters is about 0.05-50, 0.05-10, 0.05-5, 0.05-4, 0.05-3, 0.05-2, 0.1-10, 0.2-10, 0.3-10, 0.4-10, 0.5-5, or 0.5-2 mm. In some cases, each cluster has a cross-section of about 0.5 to about 2, about 0.5 to about 1, or about 1 to about 2 mm. In some cases, the cross-section of each cluster is about 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2 mm. In some cases, the internal cross-section of each cluster is about 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.15, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2 mm.
In some cases, the substrate is about the size of a standard 96-well plate, e.g., about 100 to about 200 mm by about 50 to about 150 mm. In some cases, the diameter of the substrate is less than or equal to about 1000, 500, 450, 400, 300, 250, 200, 150, 100, or 50 mm. In some cases, the diameter of the substrate is about 25-1000, 25-800, 25-600, 25-500, 25-400, 25-300, or 25-200 mm. In some cases, the substrate has a planar surface area of at least about 100, 200, 500, 1,000, 2,000, 5,000, 10,000, 12,000, 15,000, 20,000, 30,000, 40,000, 50,000 mm², or more. In some cases, the thickness of the substrate is about 50-2000, 50-1000, 100-1000, 200-1000, or 250-1000 mm.
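The relationship between cluster counts, loci per cluster, and center-to-center spacing can be sketched in Python as follows. The calculation assumes loci arranged on a regular square grid, which the text does not specify, and the particular combination of 6,000 clusters with 121 loci each is picked from the recited lists purely for illustration. The function names are hypothetical.

```python
def total_loci(n_clusters: int, loci_per_cluster: int) -> int:
    """Total number of loci when every cluster holds the same number of loci."""
    return n_clusters * loci_per_cluster


def square_grid_density(pitch_um: float) -> float:
    """Loci per mm^2 for a square grid with the given center-to-center pitch in micrometers."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    per_mm = 1000.0 / pitch_um  # loci along one millimeter of one axis
    return per_mm ** 2


print(total_loci(6000, 121))       # 726000 loci on the substrate
print(square_grid_density(100.0))  # 100.0 loci per mm^2
print(square_grid_density(50.0))   # 400.0 loci per mm^2
```

Halving the pitch quadruples the density under this model, so small changes in spacing have a large effect on how many different polynucleotides a substrate of fixed area can support.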
Surface material
The substrates, devices, and reactors provided herein are made of any of a variety of materials suitable for the methods, compositions, and systems described herein. In some cases, the substrate material is fabricated to exhibit low levels of nucleotide binding. In some cases, the substrate material is modified to create distinct surfaces that exhibit high levels of nucleotide binding. In some cases, the substrate material is transparent to visible and/or UV light. In some cases, the substrate material is sufficiently conductive, e.g., capable of forming a uniform electric field across all or a portion of the substrate. In some cases, the conductive material is connected to electrical ground. In some cases, the substrate is thermally conductive or thermally insulating. In some cases, the material is chemically and thermally resistant to support chemical or biochemical reactions, such as polynucleotide synthesis reaction processes. In some cases, the substrate comprises a flexible material. For flexible materials, the materials may include, but are not limited to: modified and unmodified nylon, nitrocellulose, polypropylene, and the like. In some cases, the substrate comprises a rigid material. For rigid materials, the materials may include, but are not limited to: glass; fused quartz; silicon; plastics (e.g., polytetrafluoroethylene, polypropylene, polystyrene, polycarbonate, mixtures thereof, etc.); and metals (e.g., gold, platinum, etc.). The substrate, solid support or reactor may be made of a material selected from the group consisting of silicon, polystyrene, agarose, dextran, cellulose polymer, polyacrylamide, polydimethylsiloxane (PDMS) and glass. The substrate/solid support or microstructures/reactors therein may be made from a combination of the materials listed herein or any other suitable material known in the art.
Surface architecture
Provided herein are substrates for use in the methods, compositions, and systems described herein, wherein the substrates have a surface architecture suitable for the methods, compositions, and systems described herein. In some cases, the substrate includes raised and/or recessed features. One benefit of having this feature is the increased surface area that supports polynucleotide synthesis. In some cases, the substrate having raised and/or recessed features is referred to as a three-dimensional substrate. In some cases, the three-dimensional substrate comprises one or more channels. In some cases, one or more seats comprise a channel. In some cases, the channels may be reagent deposited via a deposition device (e.g., a material deposition device). In some cases, the reagents and/or fluids are collected in larger pores in fluid communication with one or more channels. For example, the substrate includes a plurality of channels corresponding to a plurality of seats within the cluster, and the plurality of channels are in fluid communication with one aperture of the cluster. In some methods, the polynucleotide library is synthesized in a plurality of loci of a cluster.
Provided herein are substrates for use in the methods, compositions, and systems described herein, wherein the substrates are configured for polynucleotide synthesis. In some cases, the substrate is structured to allow controlled flow and mass transfer pathways for polynucleotide synthesis on the surface. In some cases, the construction of the substrate allows for a controlled and uniform distribution of mass transfer pathways, chemical exposure times, and/or wash efficacy during polynucleotide synthesis. In some cases, the construction of the substrate allows for improved cleaning efficiency, for example, by providing sufficient volume for the growing polynucleotide such that the volume excluded by the growing polynucleotide does not account for more than 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1% or less of the initial volume that is available to, or suitable for, the growing polynucleotide. In some cases, the three-dimensional structure allows for controlled flow of fluid, allowing for rapid exchange of chemical exposures.
Provided herein are substrates for use in the methods, compositions, and systems described herein, wherein the substrates comprise structures suitable for the methods, compositions, and systems described herein. In some cases, isolation is achieved by physical structure. In some cases, isolation is achieved by differential functionalization of the surface, which generates activated and deactivated regions for polynucleotide synthesis. In some cases, differential functionalization is achieved by alternating hydrophobicity across the substrate surface, resulting in a water contact angle effect that causes the deposited reagent to bead or wet. The use of larger structures can reduce splashing and cross-contamination of distinct polynucleotide synthesis sites by reagents from adjacent sites. In some cases, the reagents are deposited at different polynucleotide synthesis locations using a device (e.g., a material deposition device). Substrates having three-dimensional features are configured in a manner that allows synthesis of large numbers of polynucleotides (e.g., greater than about 10,000) with low error rates (e.g., less than about 1:500, 1:1000, 1:1500, 1:2,000, 1:3,000, 1:5,000, or 1:10,000). In some cases, the substrate comprises features at a density of about or greater than about 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, or 500 features per mm².
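To give a sense of what the error rates listed above imply for a library, the following is a rough illustrative sketch rather than a method from this disclosure. It assumes that errors occur independently at each position with a uniform per-base rate, which is a simplification, and the function name `error_free_fraction` and the 200-nucleotide length are hypothetical choices made for the example.

```python
def error_free_fraction(error_rate: float, length: int) -> float:
    """Expected fraction of polynucleotides of the given length with no errors.

    Assumes independent, uniform per-base errors (an illustrative
    simplification, not a model taken from the source text).
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    # Every one of the `length` positions must be error-free.
    return (1.0 - error_rate) ** length


if __name__ == "__main__":
    # Error rates quoted in the text, written as 1:N.
    for denominator in (500, 1000, 1500, 2000, 3000, 5000, 10000):
        fraction = error_free_fraction(1 / denominator, length=200)
        print(f"1:{denominator} -> {fraction:.3f} error-free at 200 nt")
```

Under this assumption the error-free fraction falls geometrically with length, which is one way to see why a lower per-base error rate matters more as the synthesized polynucleotides get longer.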
A well of the substrate may have the same or different width, height, and/or volume as another well of the substrate. A channel of the substrate may have the same or different width, height, and/or volume as another channel of the substrate. In some cases, the diameter of the clusters or the diameter of the wells comprising the clusters, or both, is about 0.05-50, 0.05-10, 0.05-5, 0.05-4, 0.05-3, 0.05-2, 0.05-1, 0.05-0.5, 0.05-0.1, 0.1-10, 0.2-10, 0.3-10, 0.4-10, 0.5-5, or 0.5-2 mm. In some cases, the clusters or wells, or both, have a diameter of less than or about 5, 4, 3, 2, 1, 0.5, 0.1, 0.09, 0.08, 0.07, 0.06, or 0.05 mm. In some cases, the clusters or wells, or both, are about 1.0 to 1.3 mm in diameter. In some cases, the diameter of the clusters or wells, or both, is about 1.150 mm. In some cases, the diameter of the clusters or wells, or both, is about 0.08 mm. The diameter of a cluster refers to a cluster within a two-dimensional or three-dimensional substrate.
In some cases, the height of the wells is about 20-1000, 50-1000, 100-1000, 200-1000, 300-1000, 400-1000, or 500-1000 µm. In some cases, the height of the wells is less than about 1000, 900, 800, 700, or 600 µm.
In some cases, the substrate comprises a plurality of channels corresponding to a plurality of loci within a cluster, wherein the height or depth of the channels is 5-500, 5-400, 5-300, 5-200, 5-100, 5-50, or 10-50 µm. In some cases, the height of the channel is less than 100, 80, 60, 40, or 20 µm.
In some cases, the channel, the locus (e.g., in a substantially planar substrate), or both the channel and the locus (e.g., in a three-dimensional substrate in which the locus corresponds to the channel) have diameters of about 1-1000, 1-500, 1-200, 1-100, 5-100, or 10-100 µm, e.g., about 90, 80, 70, 60, 50, 40, 30, 20, or 10 µm. In some cases, the diameter of the channel, the locus, or both the channel and the locus is less than about 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 µm. In some cases, the distance between the centers of two adjacent channels, two adjacent loci, or an adjacent channel and locus is about 1-500, 1-200, 1-100, 5-200, 5-100, 5-50, or 5-30 µm, e.g., about 20 µm.
Surface modification
Provided herein are methods for synthesizing polynucleotides on a surface, wherein the surface comprises various surface modifications. In some cases, surface modification is employed to chemically and/or physically alter the surface by an addition process or a subtraction process to alter one or more chemical and/or physical properties of the substrate surface or selected sites or regions of the substrate surface. For example, surface modification includes, but is not limited to, (1) altering the wetting characteristics of a surface, (2) functionalizing the surface, i.e., providing, modifying, or replacing surface functional groups, (3) defunctionalizing the surface, i.e., removing surface functional groups, (4) altering the chemical composition of the surface in other ways (e.g., by etching), (5) increasing or decreasing the surface roughness, (6) providing a coating on the surface, e.g., a coating that exhibits wetting characteristics that are different from the surface wetting characteristics, and/or (7) depositing particles on the surface.
In some cases, the addition of a chemical layer (referred to as an adhesion promoter) on top of the surface facilitates structured patterning of loci on the substrate surface. Exemplary surfaces for applying adhesion promoters include, but are not limited to, glass, silicon, silicon dioxide, and silicon nitride. In some cases, the adhesion promoter is a chemical with high surface energy. In some cases, a second chemical layer is deposited on the surface of the substrate. In some cases, the second chemical layer has a low surface energy. In some cases, the surface energy of the chemical layer coated on the surface supports the positioning of droplets on the surface. The proximity of the loci and/or the fluid contact area at the loci may be varied according to the selected pattern arrangement.
In some cases, the substrate surface or resolved loci on which nucleic acids or other moieties (e.g., for polynucleotide synthesis) are deposited are smooth or substantially planar (e.g., two-dimensional) or have irregularities, such as raised or recessed features (e.g., three-dimensional features). In some cases, the substrate surface is modified with one or more layers of different compounds. Such modifying layers of interest include, but are not limited to, inorganic and organic layers, such as metals, metal oxides, polymers, small organic molecules, and the like.
In some cases, the resolved loci of the substrate are functionalized with one or more moieties that increase and/or decrease the surface energy. In some cases, the moiety is chemically inert. In some cases, the moiety is configured to support a desired chemical reaction, e.g., one or more processes in a polynucleotide synthesis reaction. The surface energy or hydrophobicity of a surface is a factor that determines the affinity of a nucleotide for attachment to the surface. In some cases, a method for substrate functionalization includes: (a) providing a substrate having a surface comprising silica; and (b) silylating the surface using a suitable silylating agent (e.g., an organofunctional alkoxysilane molecule) as described herein or known in the art. Methods and functionalizing agents are described in U.S. patent No. 5474796, which is incorporated herein by reference in its entirety.
In some cases, the substrate surface is functionalized by contact with a derivatizing composition comprising a mixture of silanes, typically via reactive hydrophilic moieties present on the substrate surface, under reaction conditions effective to couple the silane to the substrate surface. Silylation typically covers the surface by self-assembly with organofunctional alkoxysilane molecules. As is currently known in the art, a variety of siloxane functionalizing agents may also be used, for example, to reduce or increase the surface energy. Organofunctional alkoxysilanes are classified according to their organofunctional groups.
Polynucleotide synthesis
The methods of the invention for polynucleotide synthesis may include processes involving phosphoramidite chemistry. In some cases, polynucleotide synthesis includes coupling a base with a phosphoramidite. Polynucleotide synthesis may include coupling a base by depositing phosphoramidite under coupling conditions, wherein the same base is optionally deposited with phosphoramidite more than once, i.e., double coupling. Polynucleotide synthesis may include capping of unreacted sites. In some cases, capping is optional. Polynucleotide synthesis may also include one or more oxidation steps. Polynucleotide synthesis may include deblocking, detritylation, and sulfurization. In some cases, polynucleotide synthesis includes oxidation or sulfurization. In some cases, the device is washed between one or each step of the polynucleotide synthesis reaction, for example using tetrazole or acetonitrile. The time frame of any one of the steps of the phosphoramidite synthesis process may be less than about 2 min, 1 min, 50 sec, 40 sec, 30 sec, 20 sec, or 10 sec.
Polynucleotide synthesis using the phosphoramidite approach may include sequential addition of phosphoramidite building blocks (e.g., nucleoside phosphoramidites) to the growing polynucleotide chain to form phosphite triester linkages. Phosphoramidite polynucleotide synthesis proceeds in the 3' to 5' direction. Phosphoramidite polynucleotide synthesis allows for the controlled addition of one nucleotide to a growing nucleic acid strand during each synthesis cycle. In some cases, each synthesis cycle includes a coupling step. Phosphoramidite coupling involves the formation of a phosphite triester linkage between an activated nucleoside phosphoramidite and a nucleoside bound to a substrate, for example via a linker. In some cases, nucleoside phosphoramidites are provided to the device in activated form. In some cases, nucleoside phosphoramidites are provided to the device along with an activator. In some cases, nucleoside phosphoramidites are provided to the device in an amount that exceeds the substrate-bound nucleosides by a factor of 1.5, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100 or more. In some cases, the addition of nucleoside phosphoramidites is performed in an anhydrous environment, e.g., in anhydrous acetonitrile. After the addition of the nucleoside phosphoramidite, the device is optionally washed. In some cases, the coupling step is repeated one or more additional times, optionally with a washing step between the additions of nucleoside phosphoramidite to the substrate. In some cases, the polynucleotide synthesis methods used herein comprise 1, 2, 3, or more consecutive coupling steps. In many cases, the nucleoside bound to the device is deprotected prior to coupling by removal of a protecting group, where the protecting group functions to prevent polymerization. A common protecting group is 4,4'-dimethoxytrityl (DMT).
After coupling, the phosphoramidite polynucleotide synthesis method optionally includes a capping step. In the capping step, the growing polynucleotide is treated with a capping reagent. The capping step may be used to block unreacted substrate-bound 5'-OH groups from further chain extension after coupling, thereby preventing the formation of polynucleotides with internal base deletions. In addition, phosphoramidites activated with 1H-tetrazole can react, to a small extent, with the O6 position of guanosine. Without being bound by theory, upon oxidation with I₂/water, this by-product (possibly via O6-N7 migration) may undergo depurination. The apurinic sites may end up being cleaved during the final deprotection of the polynucleotide, thereby reducing the yield of full-length product. The O6 modifications can be removed by treatment with the capping reagent prior to oxidation with I₂/water. In some cases, including a capping step during polynucleotide synthesis reduces the error rate compared to synthesis without capping. As an example, the capping step includes treating the substrate-bound polynucleotide with a mixture of acetic anhydride and 1-methylimidazole. After the capping step, the device is optionally washed.
In some cases, the growing nucleic acid bound to the device is oxidized after the addition of the nucleoside phosphoramidite, and optionally after capping and one or more washing steps. The oxidation step includes oxidizing the phosphite triester to a tetra-coordinated phosphotriester, which is a protected precursor of the naturally occurring phosphodiester internucleoside linkage. In some cases, oxidation of the growing polynucleotide is achieved by treatment with iodine and water, optionally in the presence of a weak base (e.g., pyridine, lutidine, collidine). The oxidation can also be carried out under anhydrous conditions using, for example, tert-butyl hydroperoxide or (1S)-(+)-(10-camphorsulfonyl)-oxaziridine (CSO). In some methods, a capping step is performed after oxidation. This second capping step allows the device to dry, because residual water from oxidation, which may persist, can inhibit subsequent coupling. After oxidation, the device and growing polynucleotide are optionally washed. In some cases, the oxidation step is replaced with a sulfurization step to yield a polynucleotide phosphorothioate, wherein any capping step may be performed after sulfurization. Many reagents are capable of effecting sulfur transfer including, but not limited to, 3-((dimethylaminomethylene)amino)-3H-1,2,4-dithiazole-3-thione (DDTT), 3H-1,2-benzodithiol-3-one 1,1-dioxide (also known as Beaucage reagent), and N,N,N',N'-tetraethylthiuram disulfide (TETD).
For subsequent cycles of nucleoside incorporation by coupling, the protecting group at the 5' end of the growing polynucleotide bound to the device is removed, allowing the primary hydroxyl group to react with the next nucleoside phosphoramidite. In some cases, the protecting group is DMT and deblocking is performed with trichloroacetic acid in methylene chloride. Detritylation carried out for an extended time, or with acid solutions stronger than those recommended, can lead to increased depurination of the polynucleotides bound to the solid support and thus reduced yields of the desired full-length product. The methods and compositions of the invention described herein provide controlled deblocking conditions that limit undesirable depurination reactions. In some cases, the polynucleotide bound to the device is washed after deblocking. In some cases, efficient washing after deblocking contributes to low error rate polynucleotide synthesis.
Methods of synthesizing polynucleotides generally involve an iterative sequence of the following steps: applying a protected monomer to an activated functionalized surface (e.g., a locus) to attach to the activated surface, a linker, or a previously deprotected monomer; deprotecting the applied monomer so that it can react with a subsequently applied protected monomer; and applying another protected monomer for attachment. One or more intermediate steps include oxidation or sulfurization. In some cases, one or more washing steps precede or follow one or all of the steps.
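The iterative sequence described above can be sketched as a simple step generator. This is a minimal illustration under stated assumptions, not the control logic of any actual synthesizer: the function name `synthesis_steps`, the step labels, and the choice to treat every cycle identically (including the first, where the monomer attaches to the surface or linker) are all hypothetical. The sequence is written 5' to 3' by convention, and because the chain grows 3' to 5' the 3'-terminal base is coupled first.

```python
from typing import Iterator, Tuple


def synthesis_steps(
    sequence_5to3: str,
    cap: bool = True,
    sulfurize: bool = False,
    wash: bool = True,
) -> Iterator[Tuple[int, str, str]]:
    """Yield (cycle, base, step) tuples for a phosphoramidite-style cycle.

    Hypothetical sketch: step names are labels only, and real protocols
    differ in ordering details, repeats, and timings.
    """
    bases = sequence_5to3.upper()
    if not bases or any(b not in "ACGT" for b in bases):
        raise ValueError("sequence must be a non-empty string of A, C, G, T")

    # The chain grows in the 3' to 5' direction, so iterate from the 3' end.
    for cycle, base in enumerate(reversed(bases), start=1):
        steps = ["deblock", "couple"]
        if cap:
            steps.append("cap")  # capping is optional in the text
        # Oxidation may be replaced by sulfurization.
        steps.append("sulfurize" if sulfurize else "oxidize")
        for step in steps:
            yield cycle, base, step
            if wash:
                yield cycle, base, "wash"  # optional wash after each step


if __name__ == "__main__":
    for cycle, base, step in synthesis_steps("ACG"):
        print(cycle, base, step)
```

Making capping, sulfurization, and washing switchable mirrors the way the text presents them as optional or interchangeable steps.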
The method of phosphoramidite-based polynucleotide synthesis comprises a series of chemical steps. In some cases, one or more steps of the synthetic method involve reagent circulation, wherein one or more steps of the method include applying reagents to the device that are useful for the step. For example, the reagents are circulated through a series of liquid deposition and vacuum drying steps. For substrates comprising three-dimensional features (e.g., pores, microwells, channels, etc.), reagents optionally pass through one or more regions of the device via the pores and/or channels.
The methods and systems described herein relate to polynucleotide synthesis devices for synthesizing polynucleotides. The synthesis may be parallel. For example, at least or about at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 1000, 10000, 50000, 75000, 100000, or more polynucleotides may be synthesized in parallel. The total number of polynucleotides that can be synthesized in parallel may be 2-100000, 3-50000, 4-10000, 5-1000, 6-900, 7-850, 8-800, 9-750, 10-700, 11-650, 12-600, 13-550, 14-500, 15-450, 16-400, 17-350, 18-300, 19-250, 20-200, 21-150, 22-100, 23-50, 24-45, 25-40, 30-35. Those skilled in the art will appreciate that the total number of polynucleotides synthesized in parallel may fall within any range defined by any of these values, e.g., 25-100. The total number of polynucleotides synthesized in parallel may fall within any range defined by any of the values serving as the endpoints of the range. The total molar mass of the polynucleotides synthesized within the device or the molar mass of each of the polynucleotides can be at least or at least about 10, 20, 30, 40, 50, 100, 250, 500, 750, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 25000, 50000, 75000, 100000 picomoles, or more. The length of each of the polynucleotides or the average length of the polynucleotides within the device may be at least or about at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 300, 400, 500 nucleotides or more. The length of each of the polynucleotides or the average length of the polynucleotides within the device may be at most or about at most 500, 400, 300, 200, 150, 100, 50, 45, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10 nucleotides or less. 
The length of each of the polynucleotides or the average length of the polynucleotides within the device may fall within 10-500, 9-400, 11-300, 12-200, 13-150, 14-100, 15-50, 16-45, 17-40, 18-35, 19-25. It will be appreciated by those skilled in the art that the length of each of the polynucleotides or the average length of the polynucleotides within the device may fall within any range defined by any of these values, for example 100-300. The length of each of the polynucleotides within the device or the average length of the polynucleotides may fall within any range defined by any of the values that serve as endpoints of the range.
The methods of synthesizing polynucleotides on a surface provided herein allow for rapid synthesis. As an example, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200 nucleotides or more are synthesized per hour. Nucleotides include adenine, guanine, thymine, cytosine, uridine building blocks or analogues/modified forms thereof. In some cases, libraries of polynucleotides are synthesized in parallel on a substrate. For example, a device comprising about or at least about 100, 1,000, 10,000, 30,000, 75,000, 100,000, 1,000,000, 2,000,000, 3,000,000, 4,000,000, or 5,000,000 resolved loci can support synthesis of at least the same number of different polynucleotides, wherein polynucleotides encoding different sequences are synthesized at the resolved loci. In some cases, a library of polynucleotides is synthesized on a device with a low error rate as described herein in less than about three months, two months, one month, three weeks, 15 days, 14 days, 13 days, 12 days, 11 days, 10 days, 9 days, 8 days, 7 days, 6 days, 5 days, 4 days, 3 days, 2 days, 24 hours, or less. In some cases, larger nucleic acids assembled from a polynucleotide library synthesized at low error rates using the substrates and methods described herein are prepared in less than about three months, two months, one month, three weeks, 15 days, 14 days, 13 days, 12 days, 11 days, 10 days, 9 days, 8 days, 7 days, 6 days, 5 days, 4 days, 3 days, 2 days, 24 hours, or less.
In some cases, the methods described herein provide for the generation of a nucleic acid library comprising variant nucleic acids that differ at multiple codon sites. In some cases, the nucleic acid can have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, or more variant codon sites.
In some cases, one or more of the variant codon sites may be adjacent. In some cases, one or more of the variant codon sites may not be adjacent, but separated by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more codons.
In some cases, the nucleic acid may comprise multiple variant codon sites, wherein all variant codon sites are adjacent to each other, forming a stretch of variant codon sites. In some cases, the nucleic acid may comprise multiple variant codon sites, wherein none of the variant codon sites are adjacent to each other. In some cases, the nucleic acid may comprise multiple variant codon sites, some of which are adjacent to each other, forming a stretch of variant codon sites, and some of which are not adjacent to each other.
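As an illustration of a library whose members differ at several predetermined codon sites, adjacent or not, the sketch below enumerates every combination of listed codons at listed sites of a reference sequence. The function name `codon_variants`, the 0-based site indexing, and the example sequence and codons are assumptions made for the example; they are not taken from the sequences of this disclosure.

```python
from itertools import product
from typing import Dict, Iterator, Sequence


def codon_variants(
    reference: str, variants: Dict[int, Sequence[str]]
) -> Iterator[str]:
    """Yield each sequence formed by substituting the listed codons.

    `variants` maps a 0-based codon index to the codons allowed at that
    site. Hypothetical helper for illustration only.
    """
    if len(reference) % 3 != 0:
        raise ValueError("reference length must be a multiple of 3")
    codons = [reference[i:i + 3] for i in range(0, len(reference), 3)]

    sites = sorted(variants)
    for site in sites:
        if not 0 <= site < len(codons):
            raise ValueError(f"codon site {site} is out of range")
        if any(len(codon) != 3 for codon in variants[site]):
            raise ValueError("every variant codon must have 3 nucleotides")

    # Cartesian product over the allowed codons at each variant site.
    for choice in product(*(variants[site] for site in sites)):
        new_codons = list(codons)
        for site, codon in zip(sites, choice):
            new_codons[site] = codon
        yield "".join(new_codons)


if __name__ == "__main__":
    # Two non-adjacent variant sites, two codons each: four library members.
    for member in codon_variants(
        "ATGGCTAAAGGT", {1: ["GCT", "GAT"], 3: ["GGT", "TGG"]}
    ):
        print(member)
```

The library size is the product of the number of codons allowed at each site, so the number of variant sites and the codons per site together set how many distinct polynucleotides need to be synthesized.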
Referring to the drawings, FIG. 3 shows an exemplary process workflow for synthesizing nucleic acids (e.g., genes) from shorter nucleic acids. The workflow is generally divided into the following stages: (1) de novo synthesis of a single-stranded nucleic acid library, (2) joining of nucleic acids to form larger fragments, (3) error correction, (4) quality control, and (5) shipment. The desired nucleic acid sequence or set of nucleic acid sequences is preselected prior to de novo synthesis. For example, a set of genes is preselected for generation.
Once the large nucleic acids are selected for generation, a predetermined nucleic acid library is designed for de novo synthesis. Various suitable methods for generating high density polynucleotide arrays are known. In the workflow example, a device surface layer is provided. In this example, the surface chemistry is altered to improve the polynucleotide synthesis process. Low surface energy regions are created to repel liquid, while high surface energy regions are created to attract liquid. The surface itself may be flat or may contain variations in shape, such as protrusions or microwells that increase the surface area. In the workflow example, the selected high surface energy molecules serve the dual function of supporting DNA chemistry, as disclosed in international patent application publication WO/2015/021080, which is incorporated herein by reference in its entirety.
In situ preparation of polynucleotide arrays is performed on a solid support and multiple oligomers are extended in parallel using a single nucleotide extension process. The deposition device (e.g., a material deposition device) is designed to release reagents in a stepwise manner such that multiple polynucleotides extend one residue at a time in parallel to produce an oligomer 302 having a predetermined nucleic acid sequence. In some cases, at this stage, the polynucleotide is cleaved from the surface. Cleavage includes, for example, gas cleavage with ammonia or methylamine.
The resulting polynucleotide library is placed in a reaction chamber. In this exemplary workflow, the reaction chamber (also referred to as a "nanoreactor") is a silicon-coated well that contains PCR reagents and is lowered onto the polynucleotide library 303. Reagents are added to release the polynucleotides from the substrate either before or after sealing 304 of the polynucleotides. In the exemplary workflow, the polynucleotides are released after sealing of the nanoreactor 305. Once released, fragments of single-stranded polynucleotides hybridize to span an entire long-range sequence of DNA. Partial hybridization 305 is possible because each synthetic polynucleotide is designed to have a small overlap with at least one other polynucleotide in the pool.
After hybridization, the PCA reaction is started. During the polymerase cycles, the polynucleotides anneal to complementary fragments and the gaps are filled in by the polymerase. Each cycle randomly increases the length of individual fragments, depending on which polynucleotides find each other. The complementarity between the fragments allows the formation of a complete, large span of double-stranded DNA 306.
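The idea that each synthesized polynucleotide shares a small overlap with a neighbor, so that the pool can span a longer sequence, can be sketched as follows. This is a deliberately simplified, single-strand illustration: it ignores strand alternation, annealing thermodynamics, and polymerase extension, and it simply tiles a target and rejoins the tiles by their shared overlap. The function names `split_with_overlap` and `assemble`, and the fragment length and overlap values, are hypothetical.

```python
from typing import List


def split_with_overlap(target: str, fragment_len: int, overlap: int) -> List[str]:
    """Tile `target` into fragments that overlap their neighbor by `overlap`."""
    if not 0 < overlap < fragment_len:
        raise ValueError("overlap must be positive and shorter than fragment_len")
    step = fragment_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(target[start:start + fragment_len])
        if start + fragment_len >= len(target):
            break  # this fragment reaches the end of the target
        start += step
    return fragments


def assemble(fragments: List[str], overlap: int) -> str:
    """Rejoin ordered fragments, checking that each shares the expected overlap."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        if assembled[-overlap:] != fragment[:overlap]:
            raise ValueError("fragments do not share the expected overlap")
        assembled += fragment[overlap:]
    return assembled


if __name__ == "__main__":
    target = "ATGGCTAAAGGTCCGTTAGCATCGGATTACA"
    pool = split_with_overlap(target, fragment_len=12, overlap=4)
    print(pool)
    print(assemble(pool, overlap=4) == target)
```

In practice the overlap length would be chosen with hybridization specificity in mind; the fixed value here is only a placeholder.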
After the PCA is completed, the nanoreactor is separated 307 from the device and positioned for interaction with a device having primers for PCR 308. After sealing, the nanoreactor is subjected to PCR 309 and the larger nucleic acid is amplified. After PCR 310, the nanochamber is opened 311, error correction reagents are added 312, the chamber is sealed 313, and an error correction reaction is performed to remove mismatched base pairs and/or strands with poor complementarity from the double-stranded PCR amplification product 314. The nanoreactor is opened and separated 315. The error-corrected product is then subjected to additional processing steps, such as PCR and molecular barcoding, and then packaged 322 for shipment 323.
In some cases, quality control measures are taken. After error correction, the quality control steps include, for example, interaction with a wafer having sequencing primers for amplifying the error-corrected product 316, sealing the wafer to a chamber containing the error-corrected amplification product 317, and performing an additional round of amplification 318. The nanoreactor is opened 319, and the products are pooled 320 and sequenced 321. After an acceptable quality control result is obtained, the packaged product 322 is approved for shipment 323.
In some cases, the nucleic acid generated by, for example, the workflow in fig. 3 is subjected to mutagenesis using the overlapping primers disclosed herein. In some cases, a library of primers is generated by in situ preparation on a solid support and multiple oligomers are extended in parallel using a single nucleotide extension process. The deposition device (e.g., a material deposition device) is designed to release reagents in a stepwise manner such that multiple polynucleotides extend one residue at a time in parallel to produce an oligomer 302 having a predetermined nucleic acid sequence.
Computer system
Any of the systems described herein may be operably connected to a computer and may be automated by the computer, either locally or remotely. In various cases, the methods and systems of the present invention may also include software programs on a computer system and uses thereof. Accordingly, computer control for synchronizing dispense/vacuum/refill functions (e.g., coordinating and synchronizing material deposition device movements, dispense actions, and vacuum actuation) is within the scope of the present invention. The computer system may be programmed to interface between the user-specified base sequence and the position of the material deposition device so as to deliver the correct reagent to the specified area of the substrate.
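One way to picture the interface between user-specified base sequences and deposition positions is a per-cycle schedule that groups loci by the base to be delivered there. The sketch below is hypothetical throughout: the `(row, column)` locus addressing, the function name `deposition_schedule`, and the data layout are assumptions for illustration and do not describe the control software of any actual device. Sequences are given 5' to 3' and are built from the 3' end, consistent with the synthesis direction stated in the text.

```python
from typing import Dict, List, Tuple

Locus = Tuple[int, int]  # hypothetical (row, column) address of a locus


def deposition_schedule(designs: Dict[Locus, str]) -> List[Dict[str, List[Locus]]]:
    """For each synthesis cycle, map each base to the loci that receive it.

    `designs` maps a locus to its user-specified sequence written 5' to 3'.
    Illustrative sketch only.
    """
    n_cycles = max((len(seq) for seq in designs.values()), default=0)
    schedule: List[Dict[str, List[Locus]]] = []
    for cycle in range(n_cycles):
        by_base: Dict[str, List[Locus]] = {}
        for locus, seq in sorted(designs.items()):
            growth_order = seq.upper()[::-1]  # 3'-terminal base is added first
            if cycle < len(growth_order):
                by_base.setdefault(growth_order[cycle], []).append(locus)
        schedule.append(by_base)
    return schedule


if __name__ == "__main__":
    designs = {(0, 0): "ACG", (0, 1): "TG"}
    for cycle, by_base in enumerate(deposition_schedule(designs), start=1):
        print(cycle, by_base)
```

Grouping by base within a cycle reflects the idea that one reagent can be delivered to every locus that needs it before moving on to the next reagent; shorter designs simply drop out of later cycles.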
The computer system 400 shown in fig. 4 may be understood as a logical device capable of reading instructions from the medium 411 and/or the network port 405, which may optionally be connected to a server 409 having a fixed medium 412. A system such as that shown in fig. 4 may include a CPU 401, a disk drive 403, optional input devices (e.g., a keyboard 415 and/or a mouse 416), and an optional display 407. Data communication with a local or remote location server may be accomplished through the communication medium shown. A communication medium may include any means for transmitting and/or receiving data. For example, the communication medium may be a network connection, a wireless connection, or an internet connection. Such a connection may provide communication over the world wide web. It is contemplated that data related to the present invention may be transmitted over such a network or connection for receipt and/or examination by party 422 shown in fig. 4.
FIG. 5 is a block diagram illustrating a first example architecture of a computer system 500 that may be used in connection with example cases of the present invention. As shown in FIG. 5, an example computer system may include a processor 502 for processing instructions. Non-limiting examples of processors include: an Intel Xeon™ processor, an AMD Opteron™ processor, a Samsung 32-bit RISC ARM 1176JZ(F)-S v1.0.0™ processor, an ARM Cortex-A8 Samsung S5PC100™ processor, an ARM Cortex-A8 Apple A4™ processor, a Marvell PXA 930™ processor, or a functionally equivalent processor. Multiple threads of execution may be used for parallel processing. In some cases, multiple processors or processors with multiple cores, whether in a single computer system, in a cluster, or distributed across systems on a network including multiple computers, mobile phones, and/or personal data assistant devices, may also be used.
As shown in FIG. 5, a cache 504 may be coupled to or incorporated into the processor 502 to provide high speed memory for instructions or data used recently or frequently by the processor 502. Processor 502 is connected to north bridge 506 through processor bus 508. North bridge 506 is coupled to Random Access Memory (RAM) 510 via memory bus 512 and manages access to RAM 510 by processor 502. North bridge 506 is also coupled to south bridge 514 via chipset bus 516. The south bridge 514 is in turn connected to a peripheral bus 518. The peripheral bus may be, for example, PCI, PCI-X, PCI Express, or another peripheral bus. The north and south bridges are commonly referred to as a processor chipset and manage the transfer of data between the processor, RAM, and peripheral components on the peripheral bus 518. In some alternative architectures, the functionality of the north bridge may be incorporated into the processor rather than using a separate north bridge chip. In some cases, system 500 may include an accelerator card 522 attached to peripheral bus 518. The accelerator may include a Field Programmable Gate Array (FPGA) or other hardware for accelerating certain processes. For example, accelerators may be used to perform adaptive data reorganization or to evaluate algebraic expressions used in extended set processing.
Software and data are stored in external memory 524 and may be loaded into RAM 510 and/or cache 504 for use by the processor. The system 500 includes an operating system for managing system resources; non-limiting examples of operating systems include: Linux, Windows™, MacOS™, BlackBerry OS™, iOS™, and other functionally equivalent operating systems, as well as application software running on the operating system for managing data storage and optimization in accordance with the exemplary aspects of this invention. In this example, system 500 also includes Network Interface Cards (NICs) 520 and 521 connected to the peripheral bus for providing a network interface to external memory, such as Network Attached Storage (NAS), and to other computer systems available for distributed parallel processing.
FIG. 6 is a diagram illustrating a network 600 having a plurality of computer systems 602a and 602b, a plurality of mobile phones and personal data assistants 602c, and Network Attached Storage (NAS) 604a and 604b. In an example case, the systems 602a, 602b, and 602c may manage data storage and optimize data access to data stored in Network Attached Storage (NAS) 604a and 604b. Mathematical models can be applied to this data and evaluated using distributed parallel processing on computer systems 602a and 602b and mobile phone and personal data assistant systems 602c. Computer systems 602a and 602b and mobile phone and personal data assistant systems 602c may also provide parallel processing for adaptive data reorganization of data stored in Network Attached Storage (NAS) 604a and 604b. FIG. 6 illustrates only an example, and a variety of other computer architectures and systems may be used in connection with various aspects of the invention. For example, a blade server may be used to provide parallel processing. The processor blades may be connected through a backplane to provide parallel processing. The memory may also be connected to the backplane through a separate network interface or as Network Attached Storage (NAS). In some example cases, the processors may maintain separate memory spaces and transfer data through a network interface, backplane, or other connector for parallel processing by other processors. In other cases, some or all of the processors may use a shared virtual address memory space.
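The pattern of partitioning data, evaluating a model on each partition in parallel, and recombining the results can be sketched on a single machine. This is only a stand-in under stated assumptions: worker threads in one process take the place of the networked systems of FIG. 6, and the function name `evaluate_in_parallel` and the round-robin partitioning are hypothetical choices for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


def evaluate_in_parallel(
    model: Callable[[Any], Any], data: List[Any], n_workers: int = 4
) -> List[Any]:
    """Evaluate `model` on every item of `data`, preserving input order.

    Threads stand in for separate systems here; a distributed setup would
    ship each partition to a different machine instead.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")

    # Round-robin partitioning: worker w receives items w, w + n, w + 2n, ...
    partitions = [data[w::n_workers] for w in range(n_workers)]

    def run(partition: List[Any]) -> List[Any]:
        return [model(item) for item in partition]

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(run, partitions))

    # Interleave the partial results back into the original order.
    results: List[Any] = [None] * len(data)
    for w, partial in enumerate(partial_results):
        results[w::n_workers] = partial
    return results


if __name__ == "__main__":
    print(evaluate_in_parallel(lambda x: x * x, list(range(10)), n_workers=3))
```

Keeping the partitioning and recombination explicit makes it straightforward to swap the thread pool for another executor or for a networked job queue without changing the calling code.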
FIG. 7 is a block diagram of a multiprocessor computer system 700 that uses shared virtual address memory space according to an example scenario. The system includes multiple processors 702a-f that can access a shared memory subsystem 704. The system incorporates a plurality of programmable hardware Memory Algorithm Processors (MAPs) 706a-f in the memory subsystem 704. Each MAP 706a-f may include memory 708a-f and one or more Field Programmable Gate Arrays (FPGAs) 710a-f. The MAPs provide configurable functional units and may provide FPGAs 710a-f with specific algorithms or portions of algorithms for processing in close coordination with the respective processors. For example, in an example case, the MAP may be used to evaluate algebraic expressions with respect to the data model and perform adaptive data reorganization. In this example, each MAP is globally accessible to all of the processors for these purposes. In one configuration, each MAP may access the associated memory 708a-f using Direct Memory Access (DMA), allowing it to perform tasks independently of and asynchronously with the corresponding microprocessor 702a-f. In this configuration, the MAP may feed results directly back to another MAP for pipelining and parallel execution of the algorithm.
The above-described computer architectures and systems are merely examples, and a variety of other computer, mobile telephone, and personal data assistant architectures and systems can be used in connection with the example cases, including systems using any combination of general purpose processors, co-processors, FPGAs and other programmable logic devices, systems on chips (SOCs), application-specific integrated circuits (ASICs), and other processing and logic elements. In some cases, all or a portion of the computer system may be implemented in software or hardware. Any kind of data storage medium may be used in connection with the example cases, including random access memory, hard drives, flash memory, tape drives, disk arrays, Network Attached Storage (NAS), and other local or distributed data storage devices and systems.
In an example case, the computer system may be implemented using software modules executing on any of the above or other computer architectures and systems. In other cases, the functionality of the system may be implemented partially or entirely in firmware, programmable logic devices (e.g., a Field Programmable Gate Array (FPGA) as mentioned in fig. 5), a system on a chip (SOC), an application-specific integrated circuit (ASIC), or other processing and logic elements. For example, the group processor and optimizer may be implemented by hardware acceleration using a hardware accelerator card (e.g., accelerator card 522 shown in fig. 5).
The following examples are set forth to more clearly illustrate the principles and practices of embodiments disclosed herein to those skilled in the art and are not to be construed as limiting the scope of any claimed embodiments. All parts and percentages are by weight unless otherwise indicated.
Examples
The following examples are given to illustrate various embodiments of the invention and are not meant to limit the invention in any way. The examples of the invention and the methods described herein are presently representative of the preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Those skilled in the art will recognize modifications and other applications which are within the spirit of the present invention, as defined by the scope of the claims.
Example 1: functionalization of device surfaces
The device was functionalized to support the attachment and synthesis of polynucleotide libraries. The device surface was first wet cleaned for 20 minutes using a piranha solution comprising 90% H2SO4 and 10% H2O2. The device was rinsed with DI water in several beakers, held under a DI water gooseneck faucet for 5 min, and dried with N2. The device was then soaked in NH4OH (1:100; 3 mL:300 mL), rinsed with DI water using a handgun, soaked in three successive beakers of DI water for 1 min each, and rinsed again with DI water using the handgun. The device was then plasma cleaned by exposing the device surface to O2. A SAMCO PC-300 instrument was used to plasma etch with O2 at 250 watts for 1 min in downstream mode.
The cleaned device surface was actively functionalized with a solution comprising N-(3-triethoxysilylpropyl)-4-hydroxybutyramide using a YES-1224P vapor deposition oven system with the following parameters: 0.5 to 1 Torr, 60 min, 70 °C, 135 °C vaporizer. The device surface was resist coated using a Brewer Science 200X spin coater. SPR™ 3612 photoresist was spin coated on the device at 2500 rpm for 40 sec. The device was pre-baked on a Brewer hotplate at 90 °C for 30 min. The device was subjected to photolithography using a Karl Suss MA6 mask aligner. The device was exposed for 2.2 sec and developed in MSF 26A for 1 min. The remaining developer was rinsed off with the handgun and the device was soaked in water for 5 min. The device was baked in an oven at 100 °C for 30 min, followed by visual inspection for lithography defects using a Nikon L200. A descum process was used to remove residual resist, using the SAMCO PC-300 instrument to O2 plasma etch at 250 watts for 1 min.
The device surface was passively functionalized with 100 µL of perfluorooctyltrichlorosilane solution mixed with 10 µL of light mineral oil. The device was placed in a chamber and pumped for 10 min; the valve to the pump was then closed and the device was left to stand for 10 min. The chamber was vented. The resist was stripped from the device by two 5 min soaks in 500 mL of NMP at 70 °C with sonication at maximum power (9 on the Crest system). The device was then soaked for 5 min in 500 mL of isopropanol at room temperature with sonication at maximum power. The device was dipped in 300 mL of 200 proof ethanol and blown dry with N2. The functionalized surface was activated to serve as a support for polynucleotide synthesis.
Example 2: synthesis of 50 mer sequences on an oligonucleotide Synthesis apparatus
A two-dimensional oligonucleotide synthesis device was assembled into a flow cell, which was connected to a synthesizer (Applied Biosystems ABI 394 DNA synthesizer). The two-dimensional oligonucleotide synthesis device was uniformly functionalized with N-(3-triethoxysilylpropyl)-4-hydroxybutyramide (Gelest) and was used to synthesize an exemplary 50 bp polynucleotide ("50-mer polynucleotide") using the polynucleotide synthesis methods described herein.
The sequence of the 50-mer is set forth in SEQ ID NO: 1348: 5'AGACAATCAACCATTTGGGGTGGACAGCCTTGACCTCTAGACTTCGGCAT#TTTTTTTTTT3' (SEQ ID NO: 1348), where # denotes thymidine-succinyl hexamide CED phosphoramidite (CLP-2244 from ChemGenes), a cleavable linker that enables release of the oligonucleotide from the surface during deprotection.
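As a quick consistency check on the construct described above, the following sketch splits the written sequence at the cleavable-linker symbol and counts bases on each side. The variable names are illustrative only, and the string is simply the sequence as printed in the text with the linker kept as `#`.

```python
# Sequence as written in the text; '#' marks the cleavable linker position.
construct = "AGACAATCAACCATTTGGGGTGGACAGCCTTGACCTCTAGACTTCGGCAT#TTTTTTTTTT"

# The portion 5' of the linker is the 50-mer; the portion 3' of it is the poly-T spacer.
oligo, spacer = construct.split("#")

oligo_length = len(oligo)
spacer_length = len(spacer)

# GC fraction of the 50-mer, a common sanity metric for synthetic oligos.
gc_fraction = (oligo.count("G") + oligo.count("C")) / oligo_length

print(f"oligo length:  {oligo_length} nt")
print(f"spacer length: {spacer_length} nt")
print(f"GC fraction:   {gc_fraction:.2f}")
```

A check of this kind is useful mainly for catching transcription damage, such as a stray space or a dropped base, before a sequence is reused.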
The synthesis was performed using standard DNA synthesis chemistry (coupling, capping, oxidation, and deblocking) according to the protocol in Table 3 on the ABI synthesizer.
Table 3: Synthesis protocol
(Table 3 is not reproduced in this text.)
The phosphoramidite/activator combination was delivered in a manner similar to the delivery of bulk reagents through the flow cell. No drying steps were performed, because the environment stays "wet" with reagent the entire time.
The flow restrictor was removed from the ABI 394 synthesizer to enable faster flow. Without the flow restrictor, the flow rates for amidites (0.1 M in ACN), activator (0.25 M benzoylthiotetrazole ("BTT"; 30-3070-xx from GlenResearch)), and Ox (0.02 M I2 in 20% pyridine, 10% water, and 70% THF) were approximately 100 µL/sec; for acetonitrile ("ACN") and capping reagents (a 1:1 mix of CapA and CapB, where CapA is acetic anhydride in THF/pyridine and CapB is 16% 1-methylimidazole in THF), approximately 200 µL/sec; and for the deblocking agent (3% dichloroacetic acid in toluene), approximately 300 µL/sec (compared with approximately 50 µL/sec for all reagents with the flow restrictor in place). The time required to completely push out the oxidizer was observed, the chemical flow times were adjusted accordingly, and an extra ACN wash was introduced between different chemicals. After polynucleotide synthesis, the chip was deprotected overnight in gaseous ammonia at 75 psi. Five drops of water were applied to the surface to recover the polynucleotides. The recovered polynucleotides were then analyzed on a BioAnalyzer small RNA chip.
Example 3: synthesis of 100 mer sequences on an oligonucleotide Synthesis apparatus
Using the same procedure as described in Example 2 for the synthesis of the 50-mer sequence, a 100-mer polynucleotide ("100-mer polynucleotide"; 5 'CGGGATCTTATCGTACGTACAGACGATGATGATGATGATGAGAACCCGCAT # TTTTTTTTTT', where # denotes thymidine-succinyl hexamide CED phosphoramidite (CLP-2244 from ChemGenes); SEQ ID NO: 1349) was synthesized on two chips: the first chip was uniformly functionalized with N-(3-triethoxysilylpropyl)-4-hydroxybutyramide, and the second chip was functionalized with a 5/95 mix of 11-acetoxyundecyltriethoxysilane and n-decyltriethoxysilane. The polynucleotides extracted from the surface were analyzed on a BioAnalyzer instrument.
All ten samples from both chips were further PCR amplified using the forward primer (5 'ATGCGGTTCTCTACTATCATC 3'; SEQ ID NO: 1350) and the reverse primer (5 'CGGGATCTTATCGTCTACTCG3'; SEQ ID NO: 1351) in a 50uL PCR mix (25 uL NEB Q5 premix, 2.5uL 10uM forward primer, 2.5uL 10uM reverse primer, 1uL polynucleotide extracted from the surface, water make up to 50 uL) using the following thermal cycling procedure:
98 °C, 30 sec
98 °C, 10 sec; 63 °C, 10 sec; 72 °C, 10 sec; repeat for 12 cycles
72 °C, 2 min
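The reaction setup and cycling program above lend themselves to a short arithmetic check. The sketch below is illustrative only: the variable names are made up for this example, and the time total counts programmed holds only, since ramp times depend on the instrument and are not given in the text.

```python
# Reaction components from the text, in microliters.
TOTAL_UL = 50.0
components_ul = {
    "NEB Q5 premix": 25.0,
    "forward primer (10 uM)": 2.5,
    "reverse primer (10 uM)": 2.5,
    "polynucleotide extracted from the surface": 1.0,
}

# Water makes up the remainder of the 50 uL reaction.
water_ul = TOTAL_UL - sum(components_ul.values())

# Final concentration of each primer after dilution (C1 * V1 = C2 * V2).
primer_final_um = 10.0 * 2.5 / TOTAL_UL

# Programmed hold time: initial denaturation, 12 three-step cycles, final extension.
initial_s, cycles, final_s = 30, 12, 120
cycle_s = 10 + 10 + 10
hold_time_s = initial_s + cycles * cycle_s + final_s

print(f"water to add: {water_ul} uL")
print(f"final primer concentration: {primer_final_um} uM each")
print(f"programmed hold time: {hold_time_s} s")
```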
The PCR products were also run on a BioAnalyzer, showing a sharp peak at the 100-mer position. The PCR-amplified samples were then cloned and Sanger sequenced. Table 4 summarizes the Sanger sequencing results for samples collected from spots 1-5 of chip 1 and samples collected from spots 6-10 of chip 2.
Table 4: sequencing results
Spot    Error rate    Cycle efficiency
1       1/763 bp      99.87%
2       1/824 bp      99.88%
3       1/780 bp      99.87%
4       1/429 bp      99.77%
5       1/1525 bp     99.93%
6       1/1615 bp     99.94%
7       1/531 bp      99.81%
8       1/1769 bp     99.94%
9       1/854 bp      99.88%
10      1/1451 bp     99.93%
Thus, the high quality and uniformity of the synthesized polynucleotides were reproduced on two chips with different surface chemistries. Overall, 89% of the 100-mers sequenced (233 out of 262) were perfect sequences with no errors.
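The relationship between the two data columns of Table 4 can be sketched as follows. The text does not state the formula, so this assumes that cycle efficiency is one minus the per-base error rate; that assumption does reproduce the tabulated percentages. The function and variable names are illustrative.

```python
# Error rates from Table 4, expressed as one error per N bases, keyed by spot.
bases_per_error = {1: 763, 2: 824, 3: 780, 4: 429, 5: 1525,
                   6: 1615, 7: 531, 8: 1769, 9: 854, 10: 1451}


def cycle_efficiency_percent(n_bases):
    """Efficiency implied by one error in n_bases (assumed: 1 - error rate)."""
    return 100.0 * (1.0 - 1.0 / n_bases)


efficiencies = {spot: round(cycle_efficiency_percent(n), 2)
                for spot, n in bases_per_error.items()}

# Share of error-free 100-mers among the sequenced clones (233 of 262).
perfect_fraction_percent = 100.0 * 233 / 262

for spot, eff in efficiencies.items():
    print(f"spot {spot}: {eff:.2f}%")
print(f"perfect sequences: {perfect_fraction_percent:.0f}%")
```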
Table 5 summarizes the error characteristics of the sequences obtained from the polynucleotide samples from spots 1-10.
Table 5: Error characteristics
(Table 5 is not reproduced in this text.)
Example 4: functional GLP-1R antibodies identified from synthetic GPCR focused libraries exhibit potent glycemic control.
This example describes the identification of antagonistic and agonistic GLP-1R antibodies with in vitro and in vivo functional activity.
Materials and methods
Stable cell line and phage library generation
The full-length human GLP-1R gene (UniProt P43220), with an N-terminal FLAG tag and a C-terminal GFP tag, was cloned into a pCDNA3.1(+) vector (ThermoFisher) and transfected into suspension Chinese hamster ovary (CHO) cells to generate a stable cell line expressing GLP-1R. Target expression was confirmed by FACS. Cell populations that were >80% GLP-1R-positive by GFP were then used directly for cell-based selection.
Combinations of the germline heavy chain frameworks IGHV1-69 and IGHV3-30 and the germline light chain frameworks IGKV1-39, IGKV3-15, IGLV1-51, and IGLV2-14 were used in the GPCR-focused phage display library, and diversity in all six CDRs was encoded by oligonucleotide pools synthesized as in Examples 1-3 above. The CDRs were also screened to ensure that they did not contain manufacturability liabilities, cryptic splice sites, or commonly used nucleotide restriction sites. The heavy chain variable region (VH) and the light chain variable region (VL) were connected by a (G4S)3 linker. The resulting scFv (VH-linker-VL) gene library was cloned into the pADL 22-2c (Antibody Design Labs) phage display vector by NotI restriction digestion and electroporated into TG1 competent E. coli cells (Lucigen). The final library had a diversity of 1.1 × 10^10, as validated by NGS.
Panning and screening strategies for isolating agonist GLP-1R scFv clones
Prior to panning on CHO cells expressing GLP-1R, phage particles were blocked with 5% BSA/PBS and depleted of non-specific binders on CHO parental cells. For the CHO parental cell depletion, an input phage aliquot was rotated at 14 rpm with 1 × 10^8 CHO parental cells for 1 hour at room temperature (RT). The cells were then pelleted by centrifugation at 1,200 rpm for 10 min in a bench top Eppendorf centrifuge 5920RS/4 x 1000 rotor to deplete the non-specific CHO cell binders. The phage supernatant, depleted of CHO cell binders, was then transferred to 1 × 10^8 CHO cells expressing GLP-1R. The phage supernatant and GLP-1R-expressing CHO cells were rotated at 14 rpm for 1 hour at RT to select for GLP-1R binders. After incubation, the cells were washed several times with 1× PBS/0.5% Tween to remove non-binding clones. To elute the phage bound to the GLP-1R cells, the cells were incubated with trypsin in PBS buffer for 30 minutes at 37 °C. The cells were pelleted by centrifugation at 1,200 rpm for 10 min. The supernatant, enriched in GLP-1R-binding clones, was removed and amplified in TG1 E. coli cells for use as input phage in the next round of selection. This selection strategy was carried out for five rounds, each of which included depletion against the CHO parental background. The amplified output phage from one round was used as the input phage for the next, and wash stringency was increased in each subsequent round by increasing the number of washes. After five rounds of selection, 500 clones from each of rounds 4 and 5 were Sanger sequenced to identify unique clones.
Next generation sequencing analysis
Phagemid DNA was miniprepped from the output bacterial stocks of all panning rounds. The variable heavy chain (VH) was PCR amplified from the phagemid DNA using the forward primer ACAGAATTCATTAAAGAGGAGAAATTAACC and the reverse primer TGAACCGCCTCCACCGCTAG. The PCR product was used directly for library preparation with the KAPA HyperPlus library preparation kit (Kapa Biosystems, product # KK8514). To increase diversity in the library, the samples were spiked with 15% PhiX control purchased from Illumina, Inc. (product # FC-110-3001). The library was then loaded onto an Illumina 600-cycle MiSeq reagent kit v3 (Illumina, product # MS-102-3003) and run on a MiSeq instrument.
Reformatting and High Throughput (HT) IgG purification
Expi293 cells were transfected with heavy chain and light chain DNA at a 2:1 ratio using ExpiFectamine (ThermoFisher, A14524), and supernatants were harvested 4 days post-transfection, before cell viability dropped below 80%. Purification was performed using either a KingFisher (ThermoFisher) with protein A magnetic beads or Phynexus protein A column tips (Hamilton). For large-scale production of the IgG clones evaluated in the in vivo mouse studies, an Akta HPLC purification system (GE) was used.
IgG characterization and quality control. Purified IgGs of the positive GLP-1R binders (hits) were characterized for purity using the LabChip GXII Touch HT protein express high-sensitivity assay. Dithiothreitol (DTT) was used to reduce the IgG into VH and VL. IgG concentrations were measured using a Lunatic (UnChain). For the in vivo mouse studies, IgG was further characterized by HPLC and tested for endotoxin levels (nexgen-PTS™ endotoxin test, Charles River), which were less than 5 EU per kg of dosing.
Binding assays and flow cytometry
GLP-1R IgG clones were tested in a binding assay coupled with flow cytometry analysis as follows: CHO cells expressing FLAG-GLP-1R-GFP (CHO-GLP-1R) and CHO parental cells were incubated with 100 nM IgG for 1 h on ice, washed three times, and incubated with Alexa 647-conjugated goat anti-human antibody (1:200) (Jackson ImmunoResearch Laboratories, 109-605-044) for 30 min on ice, followed by three washes, with centrifugation to pellet the cells between wash steps. All incubations and washes were performed in buffer containing PBS + 1% BSA. For titrations, IgG was serially diluted 1:3 from 100 nM down to 0.046 nM. Cells were analyzed by flow cytometry, and hits (IgGs that bind specifically to CHO-GLP-1R) were identified by measuring the GFP signal against the Alexa 647 signal. Flow cytometry data for the binding assays with 100 nM IgG are shown as dot plots. Binding assays performed with IgG titrations are presented as binding curves plotting IgG concentration against MFI (mean fluorescence intensity).
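The titration series described above can be sketched as a simple geometric dilution. The number of points is not stated in this paragraph; eight points is an assumption inferred from the stated endpoints (100 nM divided by 3 seven times is about 0.046 nM). The helper name is illustrative.

```python
def serial_dilution(top_nm, fold, points):
    """Concentrations (nM) of a serial dilution starting at top_nm."""
    return [top_nm / fold**i for i in range(points)]


# 1:3 series from 100 nM; eight points is an assumption, see the note above.
igg_series_nm = serial_dilution(top_nm=100.0, fold=3.0, points=8)

for conc in igg_series_nm:
    print(f"{conc:.3f} nM")
```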
Ligand competition assay
Ligand competition assays involved co-incubating the primary IgG with 1 µM GLP-1 (7-36). For each data point, IgG (600 nM) was prepared in running buffer (PBS + 1% BSA) and diluted 1:3 for 8 titration points. The peptide GLP-1 7-36 (2 µM) was similarly prepared in running buffer (PBS + 1% BSA). Each well contained 100,000 cells, to which 50 µL of IgG and either 50 µL of peptide (+) or 50 µL of buffer alone without peptide (-) were added. The cells and the IgG/peptide mixture were incubated on ice for 1 h; after washing, the secondary antibody (goat anti-human APC, Jackson ImmunoResearch Laboratories, product # 109-605-044), diluted 1:200 in PBS + 1% BSA, was added. This was incubated on ice for 30 min (50 µL per well), and the cells were then washed and resuspended in 60 µL of buffer. Finally, assay readouts were measured on an iQue3 screener at a rate of 4 seconds per well.
Cell-based functional assays
cAMP assay. The potential effect of the GLP-1R IgG clones on GLP-1R signaling was tested using a cAMP assay obtained from Eurofins DiscoverX. The technology used to detect cAMP levels is a no-wash, gain-of-signal competitive immunoassay based on enzyme fragment complementation. Experiments were designed to test the IgG clones for agonist or antagonist activity. To test for agonist activity, cells were stimulated for 30 min at 37 °C with either IgG (1:3 titration in PBS, starting at 100 nM and going down to 0.046 nM) or the known agonist GLP-1 7-36 peptide (MedChemExpress, cat. no. HY-P005) (1:6 titration in PBS, starting at 12.5 nM and going down to 0.003 nM). To test for antagonist activity, cells were incubated with a fixed concentration of 100 nM IgG for 1 h at room temperature to allow binding, followed by stimulation with GLP-1 7-36 peptide (1:6 titration in PBS, starting at 12.5 nM and going down to 0.003 nM) for 30 min at 37 °C. Intracellular cAMP levels were detected according to the assay kit instructions.
β-arrestin recruitment assay. The β-arrestin recruitment assay, which uses CHO-K1 cells overexpressing untagged GLP-1R, was obtained from Eurofins DiscoverX (cat# 93-0300E2). This experiment was performed to test whether GLP1R-3 has an effect on the β-arrestin recruitment that follows GLP-1R activation induced by the agonist GLP-1 7-36. Expanded cells were seeded into 96-well plates at 5,000 cells/well, and experiments were performed 48 hours after plating. IgG at 100 nM was pre-incubated with the plated cells in a 50 µL volume for 1 hour at room temperature, after which 5 µL of the ligand GLP-1 7-36 was added and the plate was incubated for a further 30 min at 37 °C. A 22.5 µL volume of detection solution was added to each well, and the plate was gently tapped and briefly centrifuged. Plates were then incubated at RT in the dark for 1 hour. The plates were read on a chemiluminescent plate reader (Molecular Devices SpectraMax M5), and the output relative light unit (RLU) data were analyzed using GraphPad Prism.
In vivo study
Animals. All animal procedures were approved by the Institutional Animal Care and Use Committee (IACUC) of the University of California, San Francisco, and were performed in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. Male C57BL/6NHsd littermates (Envigo RMS, LLC), 8-10 weeks old and weighing 20-28 grams, were used in all studies. Mice were housed in a room with controlled temperature (22-25 °C) and lighting (12 h:12 h light/dark cycle starting at 7 AM). While housed at the UCSF animal care facility, mice were fed a 9% fat diet (PicoLab mouse diet 20 (#5058), Lab Supply, Fort Worth, Texas, USA).
Monoclonal antibodies and reagents. Two anti-GLP-1R monoclonal antibodies (mAbs) in PBS buffer were tested in these studies: the agonist mAb GLP1R-59-2 and the antagonist mAb GLP1R-3. Mice were dosed prior to the glucose tolerance test (GTT) or insulin tolerance test (ITT) using the following regimens. The agonist GLP1R-59-2 mAb was administered at 5 or 10 mg/kg in three different dosing regimen groups prior to GTT and in four different dosing regimen groups prior to ITT:
1. Single dose of mAb 15 hours prior to GTT and 21 hours prior to ITT.
2. Two doses of mAb: the first 15 hours prior to GTT and 21 hours prior to ITT, and the second 2 hours prior to GTT and ITT.
3. Single dose of mAb 2 hours prior to GTT and ITT.
4. Single dose of mAb 6 hours prior to ITT only.
The antagonist GLP1R-3 mAb was administered at 20 mg/kg in four different dosing regimen groups:
1. Single dose of mAb 15 hours prior to GTT and 21 hours prior to ITT.
2. Two doses of mAb: the first 15 hours prior to GTT and 21 hours prior to ITT, and the second 2 hours prior to GTT and ITT.
3. Single dose of mAb 6 hours prior to GTT and ITT.
4. Single dose of mAb 2 hours prior to GTT and ITT.
Exendin 9-39 peptide (MedChemExpress, cat# HY-P0264) was administered at 1.0 or 0.23 mg/kg in three different dosing regimen groups:
1. Single dose of Exendin 21 hours prior to ITT.
2. Two doses of Exendin: the first 21 hours prior to ITT and the second 2 hours prior to ITT.
3. Single dose of mAb 6 hours prior to ITT.
Glucose tolerance test
The glucose tolerance test (GTT) was used to evaluate the effect of the two different anti-GLP-1R mAbs (agonist and antagonist) on glucose tolerance following acute glucose administration. An intraperitoneal glucose tolerance test (IP-GTT) was performed in 8- or 10-week-old male mice to assess glucose disposal after a glucose injection, with blood glucose levels measured after the mice had been fasted overnight (14-16 hours). To avoid circadian variation in mouse blood glucose levels, the tests were performed at a fixed time of day. Mice were weighed after the overnight fast, and baseline blood glucose levels were measured (before glucose injection; time 0 min). Mice were injected intraperitoneally with a single bolus of 30% dextrose solution (Hospira, Illinois) at 10 µL/g body weight, and blood glucose levels were measured 15, 30, 60, 120, and 180 minutes after glucose administration. Blood samples were obtained through a tail nick, and blood glucose levels were monitored using a OneTouch Ultra 2 glucose monitor (LifeScan, Inc.).
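The bolus described above translates into an injection volume and a glucose dose per animal. The sketch below assumes that "30%" means 30% weight/volume (30 g per 100 mL), which the text does not state explicitly; the function name and the 25 g example weight are illustrative.

```python
def glucose_bolus(body_weight_g, percent_w_v=30.0, ul_per_g=10.0):
    """Return (injection volume in uL, glucose in mg, dose in g/kg).

    Assumes percent_w_v is grams of dextrose per 100 mL of solution.
    """
    volume_ul = body_weight_g * ul_per_g
    # X% w/v = X g per 100 mL = 10*X mg per mL = X/100 mg per uL.
    mg_per_ul = percent_w_v / 100.0
    glucose_mg = volume_ul * mg_per_ul
    # mg of glucose per g of body weight is numerically equal to g per kg.
    dose_g_per_kg = glucose_mg / body_weight_g
    return volume_ul, glucose_mg, dose_g_per_kg


# Example: a hypothetical 25 g mouse, within the 20-28 g range used in the study.
volume_ul, glucose_mg, dose_g_per_kg = glucose_bolus(25.0)
print(f"inject {volume_ul:.0f} uL = {glucose_mg:.0f} mg glucose "
      f"({dose_g_per_kg:.1f} g/kg)")
```

Under this assumption the dose per kilogram is independent of body weight, because the injection volume scales with weight.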
Insulin tolerance test
An insulin tolerance test (ITT) was performed to evaluate the effect of the two different anti-GLP-1R mAbs (agonist and antagonist) on insulin sensitivity following acute insulin administration. Male mice, 8 or 10 weeks old, were fasted for 6 hours, and body weights were recorded before and after fasting. To avoid circadian variation in mouse blood glucose levels, the tests were performed at a fixed time of day. Blood samples were collected through a tail nick, and baseline glucose was measured prior to insulin injection. Mice were injected intraperitoneally with a single bolus of human insulin (Novolin, Novo Nordisk) at 0.75 U/kg body weight, and blood glucose levels were measured 15, 30, 45, 60, and 120 minutes after insulin injection. Blood glucose levels were monitored using a OneTouch Ultra 2 glucose monitor (LifeScan, Inc.).
ELISA for Pharmacokinetic (PK) studies.
Rat PK studies were performed at Charles River Laboratories (One Innovation Dr, 3 Biotech, Worcester, Massachusetts 01605). Groups of 5 male Sprague-Dawley rats were acclimated at the test facility for at least 3 days prior to dosing. GLP1R-3 and GLP1R-59-2 were administered intravenously (IV) at 10 mg/kg in a vehicle of 100 mM HEPES, 100 mM NaCl, 50 mM NaAc, pH 6.0. Serial blood samples of 250 µL were collected via jugular vein cannula at each time point (pre-dose and 0.0833, 0.25, 0.5, 1, 2, 4, 8, 24, 48, 72, 96, 168, 240, and 336 hours post-dose). Blood samples were collected into K2EDTA tubes and stored on wet ice until processed to plasma by centrifugation (3500 rpm for 10 minutes at 5 °C) within 30 minutes of collection. Plasma samples were then transferred into appropriate tubes containing DPP-4 (3.3 µL per 100 µL of plasma) and frozen on dry ice. To measure human IgG in the rat plasma samples by ELISA, sheep anti-human IgG (1 mg/mL) was used as the coating reagent (The Binding Site, lot AU003.M) and goat anti-human IgG, HRP (H&L) (1 mg/mL) as the detection reagent (Bethyl, cat# A80-319P). Stock solutions of the human IgG standards and QCs were prepared by spiking human IgG into rat plasma. Each study sample, QC, standard, and blank was analyzed in at least two wells. A 4-parameter logistic (4PL) model was used to fit the sigmoidal calibration curve. The semi-logarithmic sigmoidal calibration curve was obtained by plotting the absorbance response against concentration. The concentration of analyte in the test samples was determined by computer interpolation from the calibration curve.
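Interpolation from a 4PL calibration curve, as described above, amounts to inverting the fitted function. The following is a minimal sketch of the standard 4PL form and its algebraic inverse; the parameter values are hypothetical placeholders rather than fitted values from this study, and the fitting step itself (normally done by the plate-reader or statistics software) is not shown.

```python
def four_pl(conc, a, b, c, d):
    """Standard 4PL response.

    a: response at zero concentration, d: response at infinite concentration,
    c: inflection point (same units as conc), b: slope factor.
    """
    return d + (a - d) / (1.0 + (conc / c) ** b)


def inverse_four_pl(response, a, b, c, d):
    """Back-calculate concentration for a response strictly between a and d."""
    return c * ((a - d) / (response - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters (absorbance units, ng/mL); illustration only.
params = dict(a=0.05, b=1.2, c=50.0, d=2.5)

# Round trip: simulate the absorbance of a 20 ng/mL sample, then interpolate back.
absorbance = four_pl(20.0, **params)
recovered = inverse_four_pl(absorbance, **params)
print(f"absorbance {absorbance:.3f} -> {recovered:.2f} ng/mL")
```

Responses at or beyond the asymptotes `a` and `d` have no valid inverse, which is why samples reading outside the calibrated range are usually diluted and re-assayed rather than extrapolated.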
Results
Design of GPCR focused antibody library is based on GPCR binding motif and GPCR antibodies
All known GPCR interactions, including interactions of GPCRs with ligands, peptides, antibodies, endogenous extracellular loops, and small molecules, were analyzed to map the determinants of GPCR-binding molecules. The crystal structures of approximately 150 peptides, ligands, or antibodies bound to the ECDs of about 50 GPCRs (http://www.gpcrdb.org) were used to identify GPCR binding motifs. More than 1000 GPCR binding motifs were extracted from this analysis. In addition, by analyzing all resolved GPCR structures (zhanglab.ccmb.med.umich.edu/GPCR-EXP/), more than 2000 binding motifs from the endogenous extracellular loops of GPCRs were identified. Finally, by analyzing the structures of more than 100 small molecule ligands bound to GPCRs, a reduced amino acid library of 5 amino acids (Tyr, Phe, His, Pro, and Gly) was identified that may be able to recapitulate many of the structural contacts of these ligands. A sub-library with this reduced amino acid diversity was placed within a CxxxxxC motif. In total, more than 5000 GPCR binding motifs were identified (FIGS. 9A-9E). These binding motifs were placed in one of five different stem regions: CARDLRELECEEWTxxxxxSRGPCVDPRGVAGSFDVW, CARDMYYDFxxxxxEVVPADDAFDIW, CARDGRGSLPRPKGGPxxxxxYDSSEDSGGAFDIW, CARANQHFxxxxxGYHYYGMDVW, CAKHMSMQxxxxxRADLVGDAFDVW.
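The stem-and-motif layout above can be illustrated with a short sketch that measures the flanking segments of each stem and grafts a motif into the `xxxxx` placeholder. Treating the placeholder as a slot that is replaced wholesale by a motif of any length is an assumption made for illustration, and the motif `YFHPG` is a made-up example built from the five reduced-alphabet residues, not a sequence from the library.

```python
# The five stem regions as listed in the text; 'xxxxx' marks the motif position.
STEMS = [
    "CARDLRELECEEWTxxxxxSRGPCVDPRGVAGSFDVW",
    "CARDMYYDFxxxxxEVVPADDAFDIW",
    "CARDGRGSLPRPKGGPxxxxxYDSSEDSGGAFDIW",
    "CARANQHFxxxxxGYHYYGMDVW",
    "CAKHMSMQxxxxxRADLVGDAFDVW",
]
PLACEHOLDER = "xxxxx"


def flank_lengths(stem):
    """Lengths of the segments N-terminal and C-terminal to the motif slot."""
    n_term, c_term = stem.split(PLACEHOLDER)
    return len(n_term), len(c_term)


def graft(stem, motif):
    """Insert a motif into the stem (assumed: the placeholder is replaced whole)."""
    return stem.replace(PLACEHOLDER, motif)


lengths = [flank_lengths(stem) for stem in STEMS]
example_hcdr3 = graft(STEMS[3], "YFHPG")  # hypothetical motif, illustration only

for stem, (n_len, c_len) in zip(STEMS, lengths):
    print(f"{stem}: {n_len} aa + motif + {c_len} aa")
print(example_hcdr3, len(example_hcdr3))
```

The flank lengths computed this way fall into a shorter group (8 to 9 residues N-terminal, 10 to 12 C-terminal) and a longer group (14 to 16 N-terminal, 14 to 18 C-terminal).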
These stem regions were selected from structurally characterized antibodies with ultralong HCDR3s. Antibody germlines were specifically selected to tolerate these ultralong HCDR3s. Structural and sequence analysis of human antibodies with HCDR3s longer than 21 amino acids revealed a V gene bias in antibodies with long CDR3s. Finally, the germline IGHV (IGHV1-69 and IGHV3-30), IGKV (IGKV1-39 and IGKV3-15), and IGLV (IGLV1-51 and IGLV2-14) genes were selected based on this analysis.
In addition to the HCDR3 diversity, limited diversity was also introduced in the other 5 CDRs. There were 416 HCDR1 variants and 258 HCDR2 variants in the IGHV1-69 domain; 535 HCDR1 variants and 416 HCDR2 variants in the IGHV3-30 domain; 490 LCDR1 variants, 420 LCDR2 variants, and 824 LCDR3 variants in the IGKV1-39 domain; 490 LCDR1 variants, 265 LCDR2 variants, and 907 LCDR3 variants in the IGKV3-15 domain; 184 LCDR1 variants, 151 LCDR2 variants, and 824 LCDR3 variants in the IGLV1-51 domain; and 967 LCDR1 variants, 535 LCDR2 variants, and 922 LCDR3 variants in the IGLV2-14 domain (fig. 10). These CDR variants were selected by comparing the germline CDRs with the near-germline space of single, double, and triple mutations observed in the CDRs within the V gene repertoire of at least two of 12 human donors. All CDRs were pre-screened to remove manufacturability liabilities, cryptic splice sites, and nucleotide restriction sites. The CDRs were synthesized as oligonucleotide pools and incorporated into the selected antibody scaffolds. The heavy chain (VH) and light chain (VL) genes were connected by a (G4S)3 linker. The resulting scFv (VH-linker-VL) gene pool was cloned into a phagemid display vector at the N-terminus of the M13 gene-3 minor coat protein. The final size of the GPCR library was 1 × 10^10 in scFv format. The final phage library was subjected to next generation sequencing (NGS) to analyze the HCDR3 length distribution in the library for comparison with the HCDR3 length distribution in B cell populations from three healthy adult donors. The HCDR3 sequences from the three healthy donors were obtained from a publicly available database with over 37 million B cell receptor sequences (reference 31). The HCDR3s in the GPCR library were much longer than those observed in the B cell repertoire sequences.
On average, the median HCDR3 length in the GPCR library, which shows a biphasic distribution pattern, was two to three times longer (33 to 44 amino acids) than the lengths observed in the natural B cell repertoire sequences (15 to 17 amino acids) (fig. 11). The biphasic HCDR3 length distribution in the GPCR library is mainly caused by the two groups of stems (8 aa, 9 aa-xxxxx-10 aa, 12 aa) and (14 aa, 16 aa-xxxxx-18 aa, 14 aa) used to present the motifs within HCDR3.
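The per-framework CDR variant counts given above can be multiplied out to get a feel for the combinatorial space outside HCDR3. This is a sketch under the assumption that the CDR variants within a framework combine freely, which the text does not state, so the products are theoretical upper bounds and not reported library sizes.

```python
from math import prod

# CDR variant counts per germline framework, as listed in the text.
cdr_variants = {
    "IGHV1-69": {"HCDR1": 416, "HCDR2": 258},
    "IGHV3-30": {"HCDR1": 535, "HCDR2": 416},
    "IGKV1-39": {"LCDR1": 490, "LCDR2": 420, "LCDR3": 824},
    "IGKV3-15": {"LCDR1": 490, "LCDR2": 265, "LCDR3": 907},
    "IGLV1-51": {"LCDR1": 184, "LCDR2": 151, "LCDR3": 824},
    "IGLV2-14": {"LCDR1": 967, "LCDR2": 535, "LCDR3": 922},
}

# Upper bound on CDR combinations per framework, assuming free combination.
combinations = {framework: prod(counts.values())
                for framework, counts in cdr_variants.items()}

for framework, n in combinations.items():
    print(f"{framework}: {n:,} combinations")
```

Because these products, once paired across heavy and light chains and multiplied by the HCDR3 motif space, far exceed 10^10, a physical library of the stated size can only sample the theoretical space.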
Phage panning against GLP-1R overexpressing cell lines resulted in clonal enrichment
A stable CHO cell line overexpressing GLP-1R was generated, with a FLAG tag at the N-terminus of the receptor to detect cell surface expression and an EGFP tag at the C-terminus to track total receptor expression. Flow cytometry analysis of these cells confirmed that most of the receptor (>80%) was expressed on the cell surface (fig. 12A). These GLP-1R-expressing CHO cells were used for five rounds of phage panning with the GPCR-focused library. The selection scheme is shown in fig. 12B. The variable heavy chain (VH) output of each panning round was PCR amplified and sequenced by MiSeq. Significant clonal enrichment was observed from round 1 to round 5, as shown by the decreasing percentage of unique HCDR3s in the NGS data of each round's output pool (fig. 13), indicating target-specific clonal selection during panning. A total of about 1000 clones (from rounds 4 and 5) were picked for single-clone NGS sequencing, and approximately 100 unique VH-VL pairs were selected for reformatting and expression as full-length human IgG2 at 1 mL scale.
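The enrichment metric mentioned above, the percentage of unique HCDR3s in each round's output, can be computed directly from a list of HCDR3 reads. The reads below are toy placeholder strings invented for illustration, not sequences or results from this study, and the function name is likewise illustrative.

```python
def percent_unique(hcdr3_reads):
    """Percentage of distinct HCDR3 sequences among all reads in a pool."""
    return 100.0 * len(set(hcdr3_reads)) / len(hcdr3_reads)


# Toy read pools (hypothetical): an early round with no repeats and a late
# round dominated by one clone.
early_round = ["CARAW", "CARBW", "CARCW", "CARDW"]
late_round = ["CARAW", "CARAW", "CARAW", "CARBW"]

print(f"early round: {percent_unique(early_round):.0f}% unique")
print(f"late round:  {percent_unique(late_round):.0f}% unique")
```

A falling percentage across rounds indicates that fewer distinct clones account for more of the reads, which is the pattern described for rounds 1 through 5.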
IgG binders to GLP-1R contain GLP-1, GLP-2, or unique HCDR3 motifs
Purified IgG clones were tested for specific binding to CHO cells expressing GLP-1R. Single-point flow cytometry analysis at a 100 nM IgG concentration revealed that, of the 100 unique IgG clones tested, 13 bound specifically to GLP-1R-positive cells (GFP+) and not to parental CHO cells (GFP-). Binding of these 13 hits was then further assessed by an 8-point titration starting at 200 nM (30 µg/mL) for each IgG clone, and the cell binding affinities were determined to be in the double-digit nM range. The average background binding of all 13 IgG clones to CHO parental cells is represented by the black line and is minimal compared with the specific binding to GLP-1R-expressing cells (fig. 14). Complete saturation was not observed; the binding curves were approaching a plateau at 200 nM, the highest concentration used in the experiment. The HCDR3 amino acid sequences of these 13 IgG clones are shown in fig. 15. Of these, 6 were found to contain a GLP-1 motif, 4 a GLP-2 motif, and 3 an unknown motif.
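The equivalence quoted above between 200 nM and 30 µg/mL follows from the molecular weight of the antibody. The sketch below assumes a nominal IgG molecular weight of 150 kDa, a commonly used approximation that the text does not state; the function name is illustrative.

```python
IGG_MW_DA = 150_000.0  # assumed nominal molecular weight of a full-length IgG


def nm_to_ug_per_ml(conc_nm, mw_da=IGG_MW_DA):
    """Convert a molar concentration in nM to a mass concentration in ug/mL."""
    mol_per_l = conc_nm * 1e-9
    g_per_l = mol_per_l * mw_da
    return g_per_l * 1000.0  # 1 g/L = 1000 ug/mL


top_ug_per_ml = nm_to_ug_per_ml(200.0)
print(f"200 nM IgG = {top_ug_per_ml:.0f} ug/mL")
```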
8 of the 13 binding agents were negative antagonists in GLP-1R mediated cAMP signaling.
Next, the functional activity of the 13 IgG binding agents in the cAMP signaling pathway was assessed using GLP-1R overexpressing CHO-K1 cells purchased from DiscoverX, which were designed and validated for assessment of GLP-1R induced cAMP signaling. First, IgG clones were tested for agonist activity in dose titration compared to the peptide agonist GLP-1 7-36. Although GLP-1 7-36 stimulated cAMP signaling, no cAMP signaling was observed for the IgG clones, indicating that they do not activate the receptor. Subsequently, the panel of IgG clones was tested for antagonist activity by pre-incubating GLP-1R expressing cells with a fixed concentration of IgG to allow binding to occur, and then stimulating the cells with GLP-1 7-36 in a dose-dependent manner. This allows the effect of IgG on GLP-1 7-36 induced GLP-1R cAMP signaling to be examined, potentially revealing any competitive role of the IgG. In the presence of 8 of the 13 IgG clones, the GLP-1 7-36 dose response curve shifted to the right, indicating that they act as negative antagonists of the GLP-1 7-36 response (data not shown). Similar observations were made regarding the effect of the 13 IgG clones on the exendin-4 induced GLP-1R cAMP signaling response (data not shown). The remaining 5 IgG clones appeared to have no significant effect on GLP-1R cAMP signaling (data not shown).
Characterization of the mechanism of action of the antagonist IgG GLP1R-3
To determine the mechanism of action of these functional hits, subsequent studies focused on one of the IgG clones containing a GLP-1 motif that showed high binding affinity as well as functionality: GLP1R-3. Ligand competitive binding assays, assays of the IgG effect on the GLP-1 dose response in cAMP signaling, and beta-arrestin recruitment assays were performed, and GLP1R-3 was characterized as follows:
GLP1R-3 competes with the endogenous ligand in GLP-1R binding assays. To determine whether GLP1R-3 binds to the orthosteric site on the receptor, N-terminal FLAG-labeled and C-terminal GFP-labeled GLP-1R overexpressing CHO cells were incubated with dose-titrated GLP1R-3 starting at 100 nM in the presence or absence of a fixed concentration of the peptide agonist GLP-1 7-36 (1 μM). Flow cytometry analysis revealed a significant reduction in GLP1R-3 binding to GLP-1R (GFP+) in the presence of GLP-1 7-36. Although the presence of the GLP-1 7-36 peptide does not completely abrogate GLP1R-3 binding, this observation suggests that the antibody may bind to an overlapping epitope, or that GLP1R-3 has a binding affinity strong enough to compete with GLP-1 7-36 for binding (FIG. 16A).
GLP1R-3 antagonizes GLP-1 activated cAMP signaling. The next step was to determine whether GLP1R-3 exhibits competitive antagonism at GLP-1R in a dose-dependent manner. GLP-1 7-36-induced cAMP signaling was examined with GLP-1 7-36 titrated 3-fold down from 20 nM in the presence of a constant concentration (100 nM) of GLP1R-3, and a significant dose-dependent inhibition of cAMP signaling was observed. The EC50 of the GLP-1 7-36 peptide was 0.025 nM in the absence of GLP1R-3 and 0.11 nM in the presence of 100 nM GLP1R-3 (FIG. 16B), which supports GLP1R-3 as a competitive antagonist.
GLP1R-3 reduces β-arrestin recruitment following GLP-1R activation. When a GPCR is activated by an agonist, β-arrestin is recruited from the cytosol to the GPCR, thereby excluding the receptor from further G-protein interactions and resulting in a signal block, hence the term "arrestin". To determine whether GLP1R-3 has any effect on β-arrestin recruitment to activated GLP-1R, GLP-1R over-expressing CHO-K1 cells (DiscoverX) specifically designed and validated for assessment of GLP-1R β-arrestin recruitment were employed in the following manner. Cells were pre-incubated with a fixed concentration of GLP1R-3 (100 nM) for 1 hr at room temperature to allow binding to occur, followed by stimulation with GLP-1 7-36. GLP1R-3 inhibited GLP-1 7-36 peptide-induced recruitment of β-arrestin to GLP-1R, as evidenced by the right shift of the GLP-1 7-36 dose response curve for β-arrestin recruitment (FIG. 16C). This suggests that GLP1R-3 reduces β-arrestin recruitment to GLP-1R, consistent with the observed reduction in receptor activation. Thus, these cell-based assays indicate that GLP1R-3 is a competitive antagonist of GLP-1 7-36 at GLP-1R.
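The rightward shift reported for the cAMP assay can be converted into a rough affinity estimate. The following sketch is an illustration only and is not an analysis reported in this disclosure: it computes the dose ratio from the two EC50 values given above (0.025 nM and 0.11 nM) and then applies the Gaddum relation K_B = [B] / (DR - 1). It assumes simple reversible competitive antagonism at equilibrium with a Schild slope of 1, which cannot be verified from a single antagonist concentration. The function names are hypothetical.

```python
def dose_ratio(ec50_with_antagonist_nm, ec50_alone_nm):
    """Fold shift of the agonist EC50 caused by the antagonist."""
    if ec50_alone_nm <= 0 or ec50_with_antagonist_nm <= 0:
        raise ValueError("EC50 values must be positive")
    return ec50_with_antagonist_nm / ec50_alone_nm


def apparent_kb(antagonist_nm, ratio):
    """Gaddum estimate K_B = [B] / (DR - 1).

    Assumption: simple competitive antagonism with a Schild slope of 1.
    """
    if ratio <= 1:
        raise ValueError("a dose ratio above 1 is needed for this estimate")
    return antagonist_nm / (ratio - 1)


# Values taken from the cAMP experiment described above (FIG. 16B).
EC50_ALONE_NM = 0.025        # GLP-1 7-36 alone
EC50_WITH_IGG_NM = 0.11      # GLP-1 7-36 plus 100 nM GLP1R-3
GLP1R3_CONC_NM = 100.0

dr = dose_ratio(EC50_WITH_IGG_NM, EC50_ALONE_NM)
kb = apparent_kb(GLP1R3_CONC_NM, dr)
print(f"dose ratio = {dr:.1f}")          # dose ratio = 4.4
print(f"apparent K_B = {kb:.0f} nM")     # apparent K_B = 29 nM
```

Under these assumptions the 4.4-fold shift corresponds to an apparent K_B of roughly 29 nM. A full Schild analysis would require dose response curves at several antagonist concentrations.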
Design and characterization of GLP-1R agonist IgG GLP1R-59-2
Since none of the 13 IgG hits showed any agonist activity, a GLP-1R agonist antibody (GLP1R-59-2) was engineered by linking the native GLP-1 7-36 peptide to the N-terminus of the light chain of the functionally inactive but GLP-1R specific binding agent GLP1R-2 (fig. 17). GLP-1R binding assays, cAMP assays, and β-arrestin recruitment assays were performed to characterize GLP1R-59-2 as described herein:
GLP1R-59-2 specifically binds to CHO cells expressing GLP-1R. Flow cytometry analysis revealed that GLP1R-59-2 specifically bound to GLP-1R positive cells (GFP+) but not to parental CHO cells (GFP-). Specific binding was also confirmed by GLP1R-59-2 dose titration, yielding an apparent binding EC50 of 15.5 nM (FIG. 18A).
GLP1R-59-2 induces a GLP-1R cAMP response similar to GLP-1 7-36. GLP1R-59-2 was tested for agonist activity in GLP-1R over-expressing CHO-K1 cells (DiscoverX) compared to GLP-1 7-36, with separate dose titrations performed for the ligand and the antibody. The cAMP signaling profiles induced by the two were similar, and their dose response curves nearly overlapped, with EC50 values of 0.042 nM for GLP1R-59-2 and 0.085 nM for GLP-1 7-36 (FIG. 18B), which supports the hypothesis that GLP1R-59-2 can act as a potent agonist of GLP-1R.
GLP1R-59-2 is less effective at recruiting β-arrestin to GLP-1R than GLP-1 7-36. To determine whether GLP1R-59-2 was able to induce recruitment of β-arrestin to GLP-1R at levels similar to GLP-1 7-36, GLP-1R overexpressing CHO-K1 cells (DiscoverX) were stimulated with a dose titration of each. Stimulation with GLP1R-59-2 was found to result in less β-arrestin recruitment than stimulation with GLP-1 7-36 (FIG. 18C). Although GLP1R-59-2 produced lower maximal β-arrestin recruitment than GLP-1 7-36, the agonist IgG appeared to be slightly more potent, with an EC50 of 0.042 nM compared to 0.085 nM for GLP-1 7-36.
In vivo PK and PD assays for GLP1R-3 and GLP1R-59-2
Endogenous GLP-1 peptides have a very short serum half-life of only a few minutes, whereas GLP-1R antibodies can have a significantly longer half-life. This offers a considerable advantage over current GLP-1 peptide analogue therapeutics. In vivo PK rat studies were performed to assess the half-lives of the antagonist GLP1R-3 and the agonist GLP1R-59-2 in IgG form. In a 2 week PK study, GLP1R-3 showed an antibody-like in vivo half-life of ~1 week in rats, whereas the agonist GLP-1 peptide-antibody fusion GLP1R-59-2 showed a half-life of >2 days in rats (FIGS. 19A-19B). For comparison, the half-life of liraglutide, a GLP-1R agonist approved for the treatment of type II diabetes, is 13 hours.
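To put these half-lives side by side, the sketch below computes the fraction of a dose remaining after a given time. It assumes simple first-order, single-compartment elimination, which is a simplification and not a model fitted in this study. The half-life values are those quoted above, with ">2 days" for GLP1R-59-2 taken at its lower bound of 48 hours and "~1 week" for GLP1R-3 taken as 168 hours. The function name is hypothetical.

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of drug remaining under first-order elimination (assumed model)."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours_elapsed / half_life_hours)


# Half-lives quoted in the text, converted to hours.
HALF_LIVES_H = {
    "liraglutide": 13.0,
    "GLP1R-59-2 (lower bound)": 48.0,
    "GLP1R-3": 168.0,
}

for name, t_half in HALF_LIVES_H.items():
    frac = fraction_remaining(24.0, t_half)
    print(f"{name}: {frac:.0%} remaining after 24 h")
```

Under this simplified model, about 28% of a 13-hour half-life compound remains after 24 hours, compared with about 71% for a 48-hour half-life and about 91% for a 1-week half-life.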
The in vivo pharmacodynamic (PD) effect of the agonist GLP1R-59-2 was tested in a glucose tolerance test (GTT) using a wild-type C57BL/6NHsd mouse model and compared to vehicle control. At both therapeutic doses (5 mg/kg and 10 mg/kg) and under all dosing regimens (2 hrs, 13+2 hrs, and 15 hrs before glucose challenge), the agonist mAb GLP1R-59-2 significantly stabilized blood glucose even after glucose challenge (FIG. 20A). All GLP1R-59-2 treatments significantly reduced the area under the GTT curve (AUC) (p < 0.001) compared to control mice (fig. 20B). However, there was no significant difference between the individual treatment times or doses.
Treatment with the antagonist GLP1R-3 mAb or the GLP-1 peptide exendin 9-39, on a 19+2 hour dosing regimen prior to insulin challenge, significantly stabilized blood glucose at a higher level in wild-type C57BL/6NHsd mice in an insulin tolerance test (ITT) (FIG. 21A). Both GLP1R-3 mAb (20 mg/kg) and exendin (1 mg/kg) treatments significantly stabilized the area under the ITT curve (AUC) (p < 0.0001) compared to control mice (fig. 21B). However, there was no significant difference between GLP1R-3 and control and exendin (0.23 mg/kg) at 19+2 hours of treatment.
In another experiment with a single 6 hour dosing regimen, treatment with the antagonist GLP1R-3 mAb also significantly stabilized blood glucose at a higher level after insulin challenge compared to the GLP-1 peptide exendin 9-39 (1.0 or 0.23 mg/kg dose) or controls (fig. 22A). GLP1R-3 mAb (20 mg/kg) treatment for 6 hours significantly (p < 0.05) stabilized the area under the ITT curve (AUC) compared to control mice. However, there was no significant difference between the control and exendin (1.0 and 0.23 mg/kg) with a single 6 hour treatment (fig. 22B).
GLP1R-3 mAb treatment was also compared to the comparator antibodies GLP1R-226-1 and GLP1R-226-2. GLP1R-3 mAb treatment in a single 6 hour dosing regimen significantly stabilized blood glucose at a higher level after insulin challenge (at time 0) compared to GLP1R-226-1 (20 mg/kg) or controls (FIGS. 23A-23B). GLP1R-3 mAb (20 mg/kg) treatment for 6 hours significantly (p < 0.05) stabilized the area under the ITT curve (AUC) compared to control mice. There was no significant difference at the p < 0.05 level between control and GLP1R-226-1 or GLP1R-226-2 with a single 6 hour treatment.
Example 5: GLP1R variants
GLP1R-3 is optimized to generate additional GLP1R variants.
Panning strategies for GLP1R-221 and GLP1R-222 variants are shown in FIGS. 24A-24B. 768 clones from round 4 and round 5 were picked and sequenced on MiSeq. 95 unique clones were reformatted. The data for GLP1R-221 and GLP1R-222 variants are shown in Tables 6A-6H. The sequences of GLP1R-221 and GLP1R-222 variants are shown in Tables 9-13.
Table 6A.
IgG              MFI ratio    Subtraction method
GLP1R-3          993.31197    232201
GLP1R-221-065    914.54027    272235
GLP1R-221-075    1174.8495    241813
GLP1R-221-017    1484.8457    240383
GLP1R-221-033    1015.9153    239520
GLP1R-221-076    746.61867    235615.5
GLP1R-221-092    711.73926    231701
GLP1R-221-034    711.15764    222989.5
GLP1R-221-066    927.53542    222368.5
GLP1R-221-084    1067.8986    220848
GLP1R-221-009    1119.868     220417
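As a reading aid for Table 6A, the sketch below flags which variants score above the parent antibody GLP1R-3 on each of the two readouts. The values are copied from the table. Treating a higher value as stronger binding for both columns is an assumption, since the text does not define how the MFI ratio or the subtraction value is calculated. All variable names are hypothetical.

```python
# (MFI ratio, subtraction method) values copied from Table 6A, in table order.
TABLE_6A = {
    "GLP1R-3": (993.31197, 232201.0),
    "GLP1R-221-065": (914.54027, 272235.0),
    "GLP1R-221-075": (1174.8495, 241813.0),
    "GLP1R-221-017": (1484.8457, 240383.0),
    "GLP1R-221-033": (1015.9153, 239520.0),
    "GLP1R-221-076": (746.61867, 235615.5),
    "GLP1R-221-092": (711.73926, 231701.0),
    "GLP1R-221-034": (711.15764, 222989.5),
    "GLP1R-221-066": (927.53542, 222368.5),
    "GLP1R-221-084": (1067.8986, 220848.0),
    "GLP1R-221-009": (1119.868, 220417.0),
}

PARENT = "GLP1R-3"
parent_ratio, parent_sub = TABLE_6A[PARENT]
variants = {name: vals for name, vals in TABLE_6A.items() if name != PARENT}

# Assumption: a higher value on either readout indicates stronger binding.
higher_ratio = [n for n, (ratio, _) in variants.items() if ratio > parent_ratio]
higher_sub = [n for n, (_, sub) in variants.items() if sub > parent_sub]
higher_both = [n for n in higher_ratio if n in higher_sub]

print("Above parent on MFI ratio:", higher_ratio)
print("Above parent on subtraction:", higher_sub)
print("Above parent on both:", higher_both)
```

With the Table 6A values, three variants (GLP1R-221-075, GLP1R-221-017, and GLP1R-221-033) exceed the parent on both readouts.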
Table 6B.
Table 6C.
Table 6D.
Table 6E.
Table 6F.
Table 6G.
Table 6H.
GLP1R-221 and GLP1R-222 variants were assayed in a competition assay. The data are shown in FIGS. 25A-25B. Variants were also assayed in the cAMP assay. Briefly, cells were pre-incubated with 100 nM of anti-GLP1R antibody, followed by stimulation with a 3-fold agonist titration starting from 12.5 nM. The data are shown in fig. 26, with the modified variant highlighted in green.
Example 6: Sequences
Table 7. GLP1 sequences embedded in CDRH3
Table 8. GLP1R variant CDRH3 sequences
* Bold corresponds to GLP1 or GLP2 motifs
Table 9. Variable heavy chain sequences
Table 10. Variable light chain sequences
Table 11. GLP1R sequences
Table 12. Variable heavy chain CDRs
Table 13. Variable light chain CDRs
While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

Claims (36)

1. An antibody or antibody fragment comprising a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs 441-619; (b) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs 620-798; (c) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs 799-977; (d) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs 978-1156; (e) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs 1157-1168; and (f) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs 1169-1347.
2. The antibody or antibody fragment of claim 1, wherein the antibody is a monoclonal antibody, polyclonal antibody, bispecific antibody, multispecific antibody, grafted antibody, human antibody, humanized antibody, synthetic antibody, chimeric antibody, camelized antibody, single chain Fv (scFv), single chain antibody, Fab fragment, F(ab')2 fragment, Fd fragment, Fv fragment, single domain antibody, isolated complementarity determining region (CDR), diabody, fragment consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibody, anti-idiotype (anti-Id) antibody, or an antigen binding fragment thereof.
3. The antibody or antibody fragment of claim 1, wherein the antibody or antibody fragment thereof is chimeric or humanized.
4. The antibody or antibody fragment of claim 1, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 25 nanomolar.
5. The antibody or antibody fragment of claim 1, wherein the EC50 of the antibody or antibody fragment in cAMP assays is less than about 20 nanomolar.
6. The antibody or antibody fragment of claim 1, wherein the antibody or antibody fragment has an EC50 in cAMP assays of less than about 10 nanomolar.
7. The antibody or antibody fragment of claim 1, wherein the antibody or antibody fragment is an agonist of GLP1R.
8. The antibody or antibody fragment of claim 1, wherein the antibody or antibody fragment is an antagonist of GLP1R.
9. The antibody or antibody fragment of claim 1, wherein the antibody or antibody fragment is an allosteric modulator of GLP1R.
10. The antibody or antibody fragment of claim 9, wherein the allosteric modulator of GLP1R is a negative allosteric modulator.
11. The antibody or antibody fragment of claim 1, wherein the VH comprises a sequence at least about 90% identical to any one of SEQ ID NOs 58-77.
12. The antibody or antibody fragment of claim 1, wherein the VH comprises the sequence of any one of SEQ ID NOs 58-77.
13. The antibody or antibody fragment of claim 1, wherein the VL comprises a sequence at least about 90% identical to any one of SEQ ID NOs 92-111.
14. The antibody or antibody fragment of claim 1, wherein the VL comprises the sequence of any one of SEQ ID NOs 92-111.
15. A method of treating a metabolic disease or disorder, the method comprising administering an antibody or antibody fragment that binds GLP1R, the antibody or antibody fragment comprising a variable domain heavy chain region (VH) and a variable domain light chain region (VL), wherein VH comprises complementarity determining regions CDRH1, CDRH2, and CDRH3, wherein VL comprises complementarity determining regions CDRL1, CDRL2, and CDRL3, and wherein (a) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs 441-619; (b) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs 620-798; (c) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs 799-977; (d) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs 978-1156; (e) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs 1157-1168; and (f) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs 1169-1347.
16. The method of claim 15, wherein the antibody is a monoclonal antibody, polyclonal antibody, bispecific antibody, multispecific antibody, grafted antibody, human antibody, humanized antibody, synthetic antibody, chimeric antibody, camelized antibody, single chain Fv (scFv), single chain antibody, Fab fragment, F(ab')2 fragment, Fd fragment, Fv fragment, single domain antibody, isolated complementarity determining region (CDR), diabody, fragment consisting of only a single monomer variable domain, disulfide-linked Fv (sdFv), intracellular antibody, anti-idiotype (anti-Id) antibody, or an antigen binding fragment thereof.
17. The method of claim 15, wherein the antibody or antibody fragment thereof is chimeric or humanized.
18. The method of claim 15, wherein the antibody or antibody fragment has an EC50 in cAMP assays of less than about 25 nanomolar.
19. The method of claim 15, wherein the antibody or antibody fragment has an EC50 in cAMP assays of less than about 20 nanomolar.
20. The method of claim 15, wherein the antibody or antibody fragment has an EC50 in cAMP assays of less than about 10 nanomolar.
21. The method of claim 15, wherein the antibody or antibody fragment is an agonist of GLP1R.
22. The method of claim 15, wherein the antibody or antibody fragment is an antagonist of GLP1R.
23. The method of claim 15, wherein the antibody or antibody fragment is an allosteric modulator of GLP1R.
24. The method of claim 23, wherein the allosteric modulator of GLP1R is a negative allosteric modulator.
25. The method of claim 15, wherein the antibody or antibody fragment is an allosteric modulator.
26. The method of claim 15, wherein the antibody or antibody fragment is a negative allosteric modulator.
27. The method of claim 15, wherein the VH comprises a sequence at least about 90% identical to any one of SEQ ID NOs 58-77.
28. The method of claim 15, wherein the VH comprises the sequence of any one of SEQ ID NOs 58-77.
29. The method of claim 15, wherein the VL comprises a sequence at least about 90% identical to any one of SEQ ID NOs 92-111.
30. The method of claim 15, wherein the VL comprises a sequence of any one of SEQ ID NOs 92-111.
31. The method of claim 15, wherein the metabolic disease or disorder is type II diabetes or obesity.
32. A nucleic acid composition comprising:
a) a first nucleic acid encoding a variable domain heavy chain region (VH) comprising complementarity determining regions CDRH1, CDRH2 and CDRH3, and wherein (i) the amino acid sequence of CDRH1 is as shown in any one of SEQ ID NOs 441-619; (ii) the amino acid sequence of CDRH2 is as shown in any one of SEQ ID NOs 620-798; and (iii) the amino acid sequence of CDRH3 is as shown in any one of SEQ ID NOs 799-977; and
b) a second nucleic acid encoding a variable domain light chain region (VL) comprising complementarity determining regions CDRL1, CDRL2 and CDRL3, and wherein (i) the amino acid sequence of CDRL1 is as shown in any one of SEQ ID NOs 978-1156; (ii) the amino acid sequence of CDRL2 is as shown in any one of SEQ ID NOs 1157-1168; and (iii) the amino acid sequence of CDRL3 is as shown in any one of SEQ ID NOs 1169-1347.
33. A nucleic acid composition comprising: a) a first nucleic acid encoding a variable domain heavy chain region (VH) comprising an amino acid sequence at least about 90% identical to a sequence as set forth in any one of SEQ ID NOs 58-77; b) a second nucleic acid encoding a variable domain light chain region (VL) comprising an amino acid sequence at least about 90% identical to the sequence set forth in any one of SEQ ID NOs 92-111; and c) an excipient.
34. The nucleic acid composition of claim 33, wherein the VH comprises an amino acid sequence set forth in any one of SEQ ID NOs 58-77.
35. The nucleic acid composition of claim 33, wherein the VL comprises an amino acid sequence shown in any one of SEQ ID NOs 92-111.
36. The nucleic acid composition of claim 33, wherein the VH comprises an amino acid sequence set forth in any one of SEQ ID NOs 58-77, and wherein the VL comprises an amino acid sequence set forth in any one of SEQ ID NOs 92-111.
CN202180073480.2A 2020-08-26 2021-08-25 Methods and compositions relating to GLP1R variants Pending CN116829946A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/070,734 2020-08-26
US202063081801P 2020-09-22 2020-09-22
US63/081,801 2020-09-22
PCT/US2021/047616 WO2022046944A2 (en) 2020-08-26 2021-08-25 Methods and compositions relating to glp1r variants

Publications (1)

Publication Number Publication Date
CN116829946A true CN116829946A (en) 2023-09-29

Family

ID=88117086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180073480.2A Pending CN116829946A (en) 2020-08-26 2021-08-25 Methods and compositions relating to GLP1R variants

Country Status (1)

Country Link
CN (1) CN116829946A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40100779

Country of ref document: HK