Honors & Awards

  • Fellow, Damon-Runyon-Walter Winchell Cancer Research Foundation (1995)
  • Terman Fellow, Lucile Packard Charitable Trust (1998)
  • Young Innovator Award, MIT Technology Review Magazine (1999)
  • Searle Scholar, Chicago Community Trust (1999)
  • Young Investigator in the Pharmacological Sciences, Burroughs Wellcome Fund (2000)
  • Schering-Plough Award, ASBMB (2004)
  • Fellow, John D. and Catherine T. MacArthur Foundation (2005)
  • Director's Pioneer Award, NIH (2005)

Professional Education

  • Ph.D., Harvard Medical School, Biological Chemistry (1994)
  • B.A., Harvard University, Biochemistry (1987)

Research & Scholarship

Current Research and Scholarly Interests

Scientific breakthroughs often come on the heels of technological advances that expose hidden truths of nature and provide tools for engineering the world around us. Examples include the telescope (heliocentrism), the Michelson interferometer (relativity), and recombinant DNA (molecular evolution). Our lab explores innovative experimental approaches to problems in molecular biochemistry, focusing on technologies with the potential for broad impact.


All Publications

  • The solution structural ensembles of RNA kink-turn motifs and their protein complexes NATURE CHEMICAL BIOLOGY Shi, X., Huang, L., Lilley, D. M., Harbury, P. B., Herschlag, D. 2016; 12 (3): 146-152


    With the growing number of crystal structures of RNA and RNA-protein complexes, a critical next step is understanding the dynamic solution behavior of these entities in terms of conformational ensembles and energy landscapes. To this end, we have used X-ray scattering interferometry (XSI) to probe the ubiquitous RNA kink-turn motif and its complexes with the canonical kink-turn binding protein L7Ae. XSI revealed that the folded kink-turn is best described as a restricted conformational ensemble. The ions present in solution alter the nature of this ensemble, and protein binding can perturb the kink-turn ensemble without collapsing it to a unique state. This study demonstrates how XSI can reveal structural and ensemble properties of RNAs and RNA-protein complexes and uncovers the behavior of an important RNA-protein motif. This type of information will be necessary to understand, predict and engineer the behavior and function of RNAs and their protein complexes.

    View details for DOI 10.1038/NCHEMBIO.1997

    View details for Web of Science ID 000371377000006

    View details for PubMedID 26727239

  • Directed Chemical Evolution with an Outsized Genetic Code. PloS one Krusemark, C. J., Tilmans, N. P., Brown, P. O., Harbury, P. B. 2016; 11 (8): e0154765


    The first demonstration that macromolecules could be evolved in a test tube was reported twenty-five years ago. That breakthrough meant that billions of years of chance discovery and refinement could be compressed into a few weeks, and provided a powerful tool that now dominates all aspects of protein engineering. A challenge has been to extend this scientific advance into synthetic chemical space: to enable the directed evolution of abiotic molecules. The problem has been tackled in many ways. These include expanding the natural genetic code to include unnatural amino acids, engineering polyketide and polypeptide synthases to produce novel products, and tagging combinatorial chemistry libraries with DNA. Importantly, there is still no small-molecule analog of directed protein evolution, i.e. a substantiated approach for optimizing complex (≥ 10^9 diversity) populations of synthetic small molecules over successive generations. We present a key advance towards this goal: a tool for genetically-programmed synthesis of small-molecule libraries from large chemical alphabets. The approach accommodates alphabets that are one to two orders of magnitude larger than any in Nature, and facilitates evolution within the chemical spaces they create. This is critical for small molecules, which are built up from numerous and highly varied chemical fragments. We report a proof-of-concept chemical evolution experiment utilizing an outsized genetic code, and demonstrate that fitness traits can be passed from an initial small-molecule population through to the great-grandchildren of that population. The results establish the practical feasibility of engineering synthetic small molecules through accelerated evolution.

    View details for DOI 10.1371/journal.pone.0154765

    View details for PubMedID 27508294

  • Quantifying Nucleic Acid Ensembles with X-ray Scattering Interferometry. Methods in enzymology Shi, X., Bonilla, S., Herschlag, D., Harbury, P. 2015; 558: 75-97


    The conformational ensemble of a macromolecule is the complete description of the macromolecule's solution structures and can reveal important aspects of macromolecular folding, recognition, and function. However, most experimental approaches determine an average or predominant structure, or follow transitions between states that each can only be described by an average structure. Ensembles have been extremely difficult to characterize experimentally. We present the unique advantages and capabilities of a new biophysical technique, X-ray scattering interferometry (XSI), for probing and quantifying structural ensembles. XSI measures the interference of scattered waves from two heavy metal probes attached site-specifically to a macromolecule. A Fourier transform of the interference pattern gives the fractional abundance of different probe separations, directly representing the multiple conformational states populated by the macromolecule. These probe-probe distance distributions can then be used to define the structural ensemble of the macromolecule. XSI provides accurate, calibrated distances in a model-independent fashion with angstrom-scale sensitivity. XSI data can be compared in a straightforward manner to atomic coordinates determined experimentally or predicted by molecular dynamics simulations. We describe the conceptual framework for XSI and provide a detailed protocol for carrying out an XSI experiment.

    View details for DOI 10.1016/bs.mie.2015.02.001

    View details for PubMedID 26068738

  • From a structural average to the conformational ensemble of a DNA bulge. Proceedings of the National Academy of Sciences of the United States of America Shi, X., Beauchamp, K. A., Harbury, P. B., Herschlag, D. 2014; 111 (15): E1473-80


    Direct experimental measurements of conformational ensembles are critical for understanding macromolecular function, but traditional biophysical methods do not directly report the solution ensemble of a macromolecule. Small-angle X-ray scattering interferometry has the potential to overcome this limitation by providing the instantaneous distance distribution between pairs of gold-nanocrystal probes conjugated to a macromolecule in solution. Our X-ray interferometry experiments reveal an increasing bend angle of DNA duplexes with bulges of one, three, and five adenosine residues, consistent with previous FRET measurements, and further reveal an increasingly broad conformational ensemble with increasing bulge length. The distance distributions for the AAA bulge duplex (3A-DNA) with six different Au-Au pairs provide strong evidence against a simple elastic model in which fluctuations occur about a single conformational state. Instead, the measured distance distributions suggest a 3A-DNA ensemble with multiple conformational states predominantly across a region of conformational space with bend angles between 24 and 85 degrees and characteristic bend directions and helical twists and displacements. Additional X-ray interferometry experiments revealed perturbations to the ensemble from changes in ionic conditions and the bulge sequence, effects that can be understood in terms of electrostatic and stacking contributions to the ensemble and that demonstrate the sensitivity of X-ray interferometry. Combining X-ray interferometry ensemble data with molecular dynamics simulations gave atomic-level models of representative conformational states and of the molecular interactions that may shape the ensemble, and fluorescence measurements with 2-aminopurine-substituted 3A-DNA provided initial tests of these atomistic models. More generally, X-ray interferometry will provide powerful benchmarks for testing and developing computational methods.

    View details for DOI 10.1073/pnas.1317032111

    View details for PubMedID 24706812

  • Structural ensemble and microscopic elasticity of freely diffusing DNA by direct measurement of fluctuations. Proceedings of the National Academy of Sciences of the United States of America Shi, X., Herschlag, D., Harbury, P. A. 2013; 110 (16): E1444-51


    Precisely measuring the ensemble of conformers that a macromolecule populates in solution is highly challenging. Thus, it has been difficult to confirm or falsify the predictions of nanometer-scale dynamical modeling. Here, we apply an X-ray interferometry technique to probe the solution structure and fluctuations of B-form DNA on a length scale comparable to a protein-binding site. We determine an extensive set of intrahelix distance distributions between pairs of probes placed at distinct points on the surface of the DNA duplex. The distributions of measured distances reveal the nature and extent of the thermally driven mechanical deformations of the helix. We describe these deformations in terms of elastic constants, as is common for DNA and other polymers. The average solution structure and microscopic elasticity measured by X-ray interferometry are in striking agreement with values derived from DNA-protein crystal structures and measured by force spectroscopy, with one exception. The observed microscopic torsional rigidity of DNA is much lower than is measured by single-molecule twisting experiments, suggesting that torsional rigidity increases when DNA is stretched. Looking forward, molecular-level interferometry can provide a general tool for characterizing solution-phase structural ensembles.

    View details for DOI 10.1073/pnas.1218830110

    View details for PubMedID 23576725

  • Mesofluidic Devices for DNA-Programmed Combinatorial Chemistry PLOS ONE Weisinger, R. M., Marinelli, R. J., Wrenn, S. J., Harbury, P. B. 2012; 7 (3)


    Hybrid combinatorial chemistry strategies that use DNA as an information-carrying medium are proving to be powerful tools for molecular discovery. In order to extend these efforts, we present a highly parallel format for DNA-programmed chemical library synthesis. The new format uses a standard microwell plate footprint and is compatible with commercially available automation technology. It can accommodate a wide variety of combinatorial synthetic schemes with up to 384 different building blocks per chemical step. We demonstrate that fluidic routing of DNA populations in the highly parallel format occurs with excellent specificity, and that chemistry on DNA arrayed into 384 well plates proceeds robustly, two requirements for the high-fidelity translation and efficient in vitro evolution of small molecules.

    View details for Web of Science ID 000304523400004

    View details for PubMedID 22479318

  • Highly Parallel Translation of DNA Sequences into Small Molecules PLOS ONE Weisinger, R. M., Wrenn, S. J., Harbury, P. B. 2012; 7 (3)


    A large body of in vitro evolution work establishes the utility of biopolymer libraries comprising 10^10 to 10^15 distinct molecules for the discovery of nanomolar-affinity ligands to proteins. Small-molecule libraries of comparable complexity will likely provide nanomolar-affinity small-molecule ligands. Unlike biopolymers, small molecules can offer the advantages of cell permeability, low immunogenicity, metabolic stability, rapid diffusion and inexpensive mass production. It is thought that such desirable in vivo behavior is correlated with the physical properties of small molecules, specifically a limited number of hydrogen bond donors and acceptors, a defined range of hydrophobicity, and most importantly, molecular weights less than 500 Daltons. Creating a collection of 10^10 to 10^15 small molecules that meet these criteria requires the use of hundreds to thousands of diversity elements per step in a combinatorial synthesis of three to five steps. With this goal in mind, we have reported a set of mesofluidic devices that enable DNA-programmed combinatorial chemistry in a highly parallel 384-well plate format. Here, we demonstrate that these devices can translate DNA genes encoding 384 diversity elements per coding position into corresponding small-molecule gene products. This robust and efficient procedure yields small molecule-DNA conjugates suitable for in vitro evolution experiments.

    View details for DOI 10.1371/journal.pone.0028056

    View details for Web of Science ID 000304523400001

    View details for PubMedID 22479303

  • Structural and kinetic mapping of side-chain exposure onto the protein energy landscape PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Bernstein, R., Schmidt, K. L., Harbury, P. B., Marqusee, S. 2011; 108 (26): 10532-10537


    Identification and characterization of structural fluctuations that occur under native conditions is crucial for understanding protein folding and function, but such fluctuations are often rare and transient, making them difficult to study. Native-state hydrogen exchange (NSHX) has been a powerful tool for identifying such rarely populated conformations, but it generally reveals no information about the placement of these species along the folding reaction coordinate or the barriers separating them from the folded state and provides little insight into side-chain packing. To complement such studies, we have performed native-state alkyl-proton exchange, a method analogous to NSHX that monitors cysteine modification rather than backbone amide exchange, to examine the folding landscape of Escherichia coli ribonuclease H, a protein well characterized by hydrogen exchange. We have chosen experimental conditions such that the rate-limiting barrier acts as a kinetic partition: residues that become exposed only upon crossing the unfolding barrier are modified in the EX1 regime (alkylation rates report on the rate of unfolding), while those exposed on the native side of the barrier are modified predominantly in the EX2 regime (alkylation rates report on equilibrium populations). This kinetic partitioning allows for identification and placement of partially unfolded forms along the reaction coordinate. Using this approach we detect previously unidentified, rarely populated conformations residing on the native side of the barrier and identify side chains that are modified only upon crossing the unfolding barrier. Thus, in a single experiment under native conditions, both sides of the rate-limiting barrier are investigated.

    View details for DOI 10.1073/pnas.1103629108

    View details for Web of Science ID 000292251000036

    View details for PubMedID 21670244

  • Expedient Synthesis of a Modular Phosphate Affinity Reagent BIOCONJUGATE CHEMISTRY Tilmans, N. P., Krusemark, C. J., Harbury, P. A. 2010; 21 (6): 1010-1013


    Isolation and identification of phosphorylated macromolecules is essential for the deconvolution of most biological regulatory networks. Koike and co-workers recently reported the application of a dinuclear zinc-(pyridylmethyl)amine complex to phosphate-specific affinity purifications and gave it the shorthand name "phos-tag". This complex is valuable for studying phosphorylation because it binds selectively to phosphate dianion in the presence of acidic functional groups at physiological pH, and because the binding is largely independent of molecular context. These properties of phos-tag recommend it for applications in phosphoproteomics, metabolomics, and nucleic acid biology. The catch has been that the molecule is difficult to make and prohibitively expensive to buy. Here, we describe an efficient and inexpensive synthesis of a phos-tag derivative with a versatile alkyne handle. The alkyne handle allows for attachment of phos-tag to alkyl azides via the copper(I)-catalyzed azide-alkyne cycloaddition reaction ("click chemistry"). We characterize the phosphate binding behavior of the new phos-tag derivative in a variety of experimental assays, including its conjugation to a fluorescent reporter, to acrylamide gels, and to sepharose chromatography resin. The synthesis we report should enable a broader use of phos-tag for phosphate-related biochemistry, as both an analytical and a preparative reagent.

    View details for DOI 10.1021/bc900538b

    View details for Web of Science ID 000278734900002

    View details for PubMedID 20491467

  • Small molecule libraries generated by DNA-programmed combinatorial chemistry for the in vitro selection of protein ligands and protein kinase substrates Krusemark, C. J., Weisinger, R. M., Tilmans, N. P., Brown, P. O., Harbury, P. A. AMER CHEMICAL SOC. 2010
  • Response to comment on remeasuring the double helix Science Mathew-Fenn, R. S., Das, R., Fenn, T. D., Schneiders, M., Harbury, P. A. 2009; 325 (5940): 538-540
  • A Molecular Ruler for Measuring Quantitative Distance Distributions PLOS ONE Mathew-Fenn, R. S., Das, R., Silverman, J. A., Walker, P. A., Harbury, P. A. 2008; 3 (10)


    We report a novel molecular ruler for measurement of distances and distance distributions with accurate external calibration. Using solution X-ray scattering we determine the scattering interference between two gold nanocrystal probes attached site-specifically to a macromolecule of interest. Fourier transformation of the interference pattern provides a model-independent probability distribution for the distances between the probe centers-of-mass. To test the approach, we measure end-to-end distances for a variety of DNA structures. We demonstrate that measurements with independently prepared samples and using different X-ray sources are highly reproducible, we demonstrate the quantitative accuracy of the first and second moments of the distance distributions, and we demonstrate that the technique recovers complex distribution shapes. Distances measured with the solution scattering-interference ruler match the corresponding crystallographic values, but differ from distances measured previously with alternate ruler techniques. The X-ray scattering interference ruler should be a powerful tool for relating crystal structures to solution structures and for studying molecular fluctuations.

    View details for DOI 10.1371/journal.pone.0003229

    View details for Web of Science ID 000265122400001

    View details for PubMedID 18927606

  • Remeasuring the double helix SCIENCE Mathew-Fenn, R. S., Das, R., Harbury, P. A. 2008; 322 (5900): 446-449


    DNA is thought to behave as a stiff elastic rod with respect to the ubiquitous mechanical deformations inherent to its biology. To test this model at short DNA lengths, we measured the mean and variance of end-to-end length for a series of DNA double helices in solution, using small-angle x-ray scattering interference between gold nanocrystal labels. In the absence of applied tension, DNA is at least one order of magnitude softer than measured by single-molecule stretching experiments. Further, the data rule out the conventional elastic rod model. The variance in end-to-end length follows a quadratic dependence on the number of base pairs rather than the expected linear dependence, indicating that DNA stretching is cooperative over more than two turns of the DNA double helix. Our observations support the idea of long-range allosteric communication through DNA structure.

    View details for DOI 10.1126/science.1158881

    View details for Web of Science ID 000260094500048

    View details for PubMedID 18927394

  • BIOL 182-Small molecule substrates for in vivo imaging of protein kinase activity generated by DNA-programmed combinatorial synthesis Krusemark, C. J., Tilmans, N. P., Weisinger, R. M., Brown, P. O., Harbury, P. A. AMER CHEMICAL SOC. 2008
  • Design of protein-ligand binding based on the molecular-mechanics energy model JOURNAL OF MOLECULAR BIOLOGY Boas, F. E., Harbury, P. B. 2008; 380 (2): 415-424


    While the molecular-mechanics field has standardized on a few potential energy functions, computational protein design efforts are based on potentials that are unique to individual laboratories. Here we show that a standard molecular-mechanics potential energy function without any modifications can be used to engineer protein-ligand binding. A molecular-mechanics potential is used to reconstruct the coordinates of various binding sites with an average root-mean-square error of 0.61 Å and to reproduce known ligand-induced side-chain conformational shifts. Within a series of 34 mutants, the calculation can always distinguish between weak (Kd > 1 mM) and tight (Kd < 10 µM) binding sequences. Starting from partial coordinates of the ribose-binding protein lacking the ligand and the 10 primary contact residues, the molecular-mechanics potential is used to redesign a ribose-binding site. Out of a search space of 2 × 10^12 sequences, the calculation selects a point mutant of the native protein as the top solution (experimental Kd = 17 µM) and the native protein as the second best solution (experimental Kd = 210 nM). The quality of the predictions depends on the accuracy of the generalized Born electrostatics model, treatment of protonation equilibria, high-resolution rotamer sampling, a final local energy minimization step, and explicit modeling of the bound, unbound, and unfolded states. The application of unmodified molecular-mechanics potentials to protein design links two fields in a mutually beneficial way. Design provides a new avenue for testing molecular-mechanics energy functions, and future improvements in these energy functions will presumably lead to more accurate design results.

    View details for DOI 10.1016/j.jmb.2008.04.001

    View details for Web of Science ID 000257630000013

    View details for PubMedID 18514737

  • Synthetic ligands discovered by in vitro selection JOURNAL OF THE AMERICAN CHEMICAL SOCIETY Wrenn, S. J., Weisinger, R. M., Halpin, D. R., Harbury, P. B. 2007; 129 (43): 13137-13143


    The recognition and catalytic properties of biopolymers derive from an elegant evolutionary mechanism, whereby the genetic material encoding molecules with superior functional attributes survives a selective pressure and is propagated to subsequent generations. This process is routinely mimicked in vitro to generate nucleic-acid or peptide ligands and catalysts. Recent advances in DNA-programmed organic synthesis have raised the possibility that evolutionary strategies could also be used for small-molecule discovery, but the idea remains unproven. Here, using DNA-programmed combinatorial chemistry, a collection of 100 million distinct compounds is synthesized and subjected to selection for binding to the N-terminal SH3 domain of the proto-oncogene Crk. Over six generations, the molecular population converges to a small number of novel SH3 domain ligands. Remarkably, the hits bind with affinities similar to those of peptide SH3 ligands isolated from phage libraries of comparable complexity. The evolutionary approach has the potential to drastically simplify and accelerate small-molecule discovery.

    View details for DOI 10.1021/1073993a

    View details for Web of Science ID 000250818900048

    View details for PubMedID 17918937

  • Potential energy functions for protein design CURRENT OPINION IN STRUCTURAL BIOLOGY Boas, F. E., Harbury, P. B. 2007; 17 (2): 199-204


    Different potential energy functions have predominated in protein dynamics simulations, protein design calculations, and protein structure prediction. Clearly, the same physics applies in all three cases. The differences in potential energy functions reflect differences in how the calculations are performed. With improvements in computer power and algorithms, the same potential energy function should be applicable to all three problems. In this review, we examine energy functions currently used for protein design, and look to the molecular mechanics field for advances that could be used in the next generation of design algorithms. In particular, we focus on improved models of the hydrophobic effect, polarization and hydrogen bonding.

    View details for DOI 10.1016/

    View details for Web of Science ID 000246330900009

    View details for PubMedID 17387014

  • Accurate, conformation-dependent predictions of solvent effects on protein ionization constants PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Barth, P., Alber, T., Harbury, P. B. 2007; 104 (12): 4898-4903


    Predicting how aqueous solvent modulates the conformational transitions and influences the pKa values that regulate the biological functions of biomolecules remains an unsolved challenge. To address this problem, we developed FDPB_MF, a rotamer repacking method that exhaustively samples side chain conformational space and rigorously calculates multibody protein-solvent interactions. FDPB_MF predicts the effects on pKa values of various solvent exposures, large ionic strength variations, strong energetic couplings, structural reorganizations and sequence mutations. The method achieves high accuracy, with root mean square deviations within 0.3 pH unit of the experimental values measured for turkey ovomucoid third domain, hen lysozyme, Bacillus circulans xylanase, and human and Escherichia coli thioredoxins. FDPB_MF provides a faithful, quantitative assessment of electrostatic interactions in biological macromolecules.

    View details for DOI 10.1073/pnas.0700188104

    View details for Web of Science ID 000245256700026

    View details for PubMedID 17360348

  • Chemical evolution as a tool for molecular discovery ANNUAL REVIEW OF BIOCHEMISTRY Wrenn, S. J., Harbury, P. B. 2007; 76: 331-349


    In modern academic and industrial laboratories, evolutionary strategies are used routinely to identify biopolymers with novel activities. Large libraries of nucleic acids (approximately 10^15) or peptides and proteins (approximately 10^13) can be subjected to multiple rounds of selective pressure, amplification, and diversification, yielding individual sequences with desirable properties. Although the evolutionary approach is a powerful search tool, the chemical nature of biopolymers is not suited for all purposes. Application of evolutionary strategies to libraries of arbitrary chemical composition would overcome this problem, and radically change traditional small-molecule discovery. The chemical make-up of in vitro evolution libraries has necessarily been limited, because library synthesis relies on enzymes. A great deal of current research focuses on expanding the chemical repertoire of in vitro evolution by (a) broadening enzyme substrate specificities to include unnatural building blocks, or (b) developing methods to translate DNA sequences into multistep organic syntheses. We discuss the strengths and weaknesses of the approaches, review the successes, and consider the future of chemical evolution as a tool.

    View details for DOI 10.1146/annurev.biochem.76.062205.122741

    View details for Web of Science ID 000249336800015

    View details for PubMedID 17506635

  • Misincorporation proton-alkyl exchange (MPAX): engineering cysteine probes into proteins. Current protocols in protein science / editorial board, John E. Coligan ... [et al.] Burguete, A. S., Harbury, P. B., Pfeffer, S. R. 2005; Chapter 26: Unit26 1-?


    This unit describes a rapid and efficient method to screen a polypeptide for amino acid residues that contribute to protein-protein interaction interfaces. Cysteine residues are introduced as positional probes in a protein at random by co-expression in bacteria with specific cysteine misincorporator tRNAs. The protein is then purified as an ensemble of polypeptides containing cysteine at low frequency, at different positions in each molecule. The ability of the native protein structure to protect different cysteine residues from chemical modification by iodoacetamide is determined to obtain a protein surface map that reveals candidate surface residues that are likely to be important for protein-protein interaction. Cysteine mutants with altered ligand binding can also be selected simultaneously by affinity chromatography.

    View details for DOI 10.1002/0471140864.ps2601s42

    View details for PubMedID 18429287

  • In vitro selection and prediction of TIP47 protein-interaction interfaces NATURE METHODS Burguete, A. S., Harbury, P. B., Pfeffer, S. R. 2004; 1 (1): 55-60


    We present a new method for the rapid identification of amino acid residues that contribute to protein-protein interfaces. Tail-interacting protein of 47 kDa (TIP47) binds Rab9 GTPase and the cytoplasmic domains of mannose 6-phosphate receptors and is required for their transport from endosomes to the Golgi apparatus. Cysteine mutations were incorporated randomly into TIP47 by expression in Escherichia coli cells harboring specific misincorporator tRNAs. We made use of the ability of the native TIP47 protein to protect 48 cysteine probes from chemical modification by iodoacetamide as a means to obtain a surface map of TIP47, revealing the identity of surface-localized, hydrophobic residues that are likely to participate in protein-protein interactions. Direct mutation of predicted interface residues confirmed that the protein had altered binding affinity for the mannose 6-phosphate receptor. TIP47 mutants with enhanced or diminished affinities were also selected by affinity chromatography. These methods were validated in comparison with the protein's crystal structure, and provide a powerful means to predict protein-protein interaction interfaces.

    View details for DOI 10.1038/NMETH702

    View details for Web of Science ID 000226753700019

    View details for PubMedID 15782153

  • Structural test of the parameterized-backbone method for protein design JOURNAL OF MOLECULAR BIOLOGY Plecs, J. J., Harbury, P. B., Kim, P. S., Alber, T. 2004; 342 (1): 289-297


    Designing new protein folds requires a method for simultaneously optimizing the conformation of the backbone and the side-chains. One approach to this problem is the use of a parameterized backbone, which allows the systematic exploration of families of structures. We report the crystal structure of RH3, a right-handed, three-helix coiled coil that was designed using a parameterized backbone and detailed modeling of core packing. This crystal structure was determined using another rationally designed feature, a metal-binding site that permitted experimental phasing of the X-ray data. RH3 adopted the intended fold, which has not been observed previously in biological proteins. Unanticipated structural asymmetry in the trimer was a principal source of variation within the RH3 structure. The sequence of RH3 differs from that of a previously characterized right-handed tetramer, RH4, at only one position in each 11 amino acid sequence repeat. This close similarity indicates that the design method is sensitive to the core packing interactions that specify the protein structure. Comparison of the structures of RH3 and RH4 indicates that both steric overlap and cavity formation provide strong driving forces for oligomer specificity.

    View details for DOI 10.1016/j.jmb.2004.06.051

    View details for Web of Science ID 000223578800023

    View details for PubMedID 15313624

  • DNA display III. Solid-phase organic synthesis on unprotected DNA PLOS BIOLOGY Halpin, D. R., Lee, J. A., Wrenn, S. J., Harbury, P. B. 2004; 2 (7): 1031-1038


    DNA-directed synthesis represents a powerful new tool for molecular discovery. Its ultimate utility, however, hinges upon the diversity of chemical reactions that can be executed in the presence of unprotected DNA. We present a solid-phase reaction format that makes possible the use of standard organic reaction conditions and common reagents to facilitate chemical transformations on unprotected DNA supports. We demonstrate the feasibility of this strategy by comprehensively adapting solid-phase 9-fluorenylmethyloxycarbonyl-based peptide synthesis to be DNA-compatible, and we describe a set of tools for the adaptation of other chemistries. Efficient peptide coupling to DNA was observed for all 33 amino acids tested, and polypeptides as long as 12 amino acids were synthesized on DNA supports. Beyond the direct implications for synthesis of peptide-DNA conjugates, the methods described offer a general strategy for organic synthesis on unprotected DNA. Their employment can facilitate the generation of chemically diverse DNA-encoded molecular populations amenable to in vitro evolution and genetic manipulation.

    View details for PubMedID 15221029

  • DNA display I. Sequence-encoded routing of DNA populations PLOS BIOLOGY Halpin, D. R., Harbury, P. B. 2004; 2 (7): 1015-1021


    Recently reported technologies for DNA-directed organic synthesis and for DNA computing rely on routing DNA populations through complex networks. The reduction of these ideas to practice has been limited by a lack of practical experimental tools. Here we describe a modular design for DNA routing genes, and routing machinery made from oligonucleotides and commercially available chromatography resins. The routing machinery partitions nanomole quantities of DNA into physically distinct subpools based on sequence. Partitioning steps can be iterated indefinitely, with worst-case yields of 85% per step. These techniques facilitate DNA-programmed chemical synthesis, and thus enable a materials biology that could revolutionize drug discovery.

    View details for PubMedID 15221027

  • DNA display II. Genetic manipulation of combinatorial chemistry libraries for small-molecule evolution PLOS BIOLOGY Halpin, D. R., Harbury, P. B. 2004; 2 (7): 1022-1030


    Biological in vitro selection techniques, such as RNA aptamer methods and mRNA display, have proven to be powerful approaches for engineering molecules with novel functions. These techniques are based on iterative amplification of biopolymer libraries, interposed by selection for a desired functional property. Rare, promising compounds are enriched over multiple generations of a constantly replicating molecular population, and subsequently identified. The restriction of such methods to DNA, RNA, and polypeptides precludes their use for small-molecule discovery. To overcome this limitation, we have directed the synthesis of combinatorial chemistry libraries with DNA "genes," making possible iterative amplification of a nonbiological molecular species. By differential hybridization during the course of a traditional split-and-pool combinatorial synthesis, the DNA sequence of each gene is read out and translated into a unique small-molecule structure. This "chemical translation" provides practical access to synthetic compound populations 1 million-fold more complex than state-of-the-art combinatorial libraries. We carried out an in vitro selection experiment (iterated chemical translation, selection, and amplification) on a library of 10⁶ nonnatural peptides. The library converged over three generations to a high-affinity protein ligand. The ability to genetically encode diverse classes of synthetic transformations enables the in vitro selection and potential evolution of an essentially limitless collection of compound families, opening new avenues to drug discovery, catalyst design, and the development of a materials science "biology."

    View details for PubMedID 15221028

  • Automated design of specificity in molecular recognition NATURE STRUCTURAL BIOLOGY Havranek, J. J., Harbury, P. B. 2003; 10 (1): 45-52


    Specific protein-protein interactions are crucial in signaling networks and for the assembly of multi-protein complexes, and represent a challenging goal for protein design. Optimizing interaction specificity requires both positive design, the stabilization of a desired interaction, and negative design, the destabilization of undesired interactions. Currently, no automated protein-design algorithms use explicit negative design to guide a sequence search. We describe a multi-state framework for engineering specificity that selects sequences maximizing the transfer free energy of a protein from a target conformation to a set of undesired competitor conformations. To test the multi-state framework, we engineered coiled-coil interfaces that direct the formation of either homodimers or heterodimers. The algorithm identified three specificity motifs that have not been observed in naturally occurring coiled coils. In all cases, experimental results confirm the predicted specificities.

    View details for Web of Science ID 000180216100013

    View details for PubMedID 12459719

  • The equilibrium unfolding pathway of a (beta/alpha)(8) barrel JOURNAL OF MOLECULAR BIOLOGY Silverman, J. A., Harbury, P. B. 2002; 324 (5): 1031-1040


    The (beta/alpha)(8) barrel is the most commonly occurring fold among enzymes. A key step towards rationally engineering (beta/alpha)(8) barrel proteins is to understand their underlying structural organization and folding energetics. Using misincorporation proton-alkyl exchange (MPAX), a new tool for solution structural studies of large proteins, we have performed a native-state exchange analysis of the prototypical (beta/alpha)(8) barrel triosephosphate isomerase. Three cooperatively unfolding subdomains within the structure are identified, as well as two partially unfolded forms of the protein. The C-terminal domain coincides with domains reported to exist in four other (beta/alpha)(8) barrels, but the two N-terminal domains have not been observed previously. These partially unfolded forms may represent sequential intermediates on the folding pathway of triosephosphate isomerase. The methods reported here should be applicable to a variety of other biological problems involving protein conformational changes.

    View details for DOI 10.1016/S0022-2836(02)01100-2

    View details for Web of Science ID 000179960200011

    View details for PubMedID 12470957

  • Rapid mapping of protein structure, interactions, and ligand binding by misincorporation proton-alkyl exchange JOURNAL OF BIOLOGICAL CHEMISTRY Silverman, J. A., Harbury, P. B. 2002; 277 (34): 30968-30975


    Understanding protein conformation, interactions, and ligand binding is essential to all biological inquiry. We report a novel biochemical technique, called misincorporation proton-alkyl exchange (MPAX), that can be used to footprint protein structure at single amino acid resolution. MPAX exploits translational misincorporation of cysteine residues to generate probes for physical analysis. We apply MPAX to the triosephosphate isomerase (beta/alpha)(8) barrel, accurately determining its substrate-binding site, a protein-protein interaction surface, the solvent-accessible protein surface, and the stability of the barrel. Because MPAX requires only microgram quantities of material and is not limited by protein size, it is ideally suited for proteins not amenable to conventional structural methods, such as membrane proteins, partially folded or insoluble proteins, and large protein complexes.

    View details for DOI 10.1074/jbc.M203172200

    View details for Web of Science ID 000177579800073

    View details for PubMedID 12185208

  • Reverse engineering the (beta/alpha)(8) barrel fold PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Silverman, J. A., Balakrishnan, R., Harbury, P. B. 2001; 98 (6): 3092-3097


    The (beta/alpha)(8) barrel is the most commonly occurring fold among protein catalysts. To lay a groundwork for engineering novel barrel proteins, we investigated the amino acid sequence restrictions at 182 structural positions of the prototypical (beta/alpha)(8) barrel enzyme triosephosphate isomerase. Using combinatorial mutagenesis and functional selection, we find that turn sequences, alpha-helix capping and stop motifs, and residues that pack the interface between beta-strands and alpha-helices are highly mutable. Conversely, any mutation of residues in the central core of the beta-barrel, beta-strand stop motifs, and a single buried salt bridge between amino acids R189 and D227 substantially reduces catalytic activity. Four positions are effectively immutable: conservative single substitutions at these four positions prevent the mutant protein from complementing a triosephosphate isomerase knockout in Escherichia coli. At 142 of the 182 positions, mutation to at least one amino acid of a seven-letter amino acid alphabet produces a triosephosphate isomerase with wild-type activity. Consequently, it seems likely that (beta/alpha)(8) barrel structures can be encoded with a subset of the 20 amino acids. Such simplification would greatly decrease the computational burden of (beta/alpha)(8) barrel design.

    View details for Web of Science ID 000167521300031

    View details for PubMedID 11248037

  • Modular enzymes NATURE Khosla, C., Harbury, P. B. 2001; 409 (6817): 247-252


    Although modular macromolecular devices are encountered frequently in a variety of biological situations, their occurrence in biocatalysis has not been widely appreciated. Three general classes of modular biocatalysts can be identified: enzymes in which catalysis and substrate specificity are separable, multisubstrate enzymes in which binding sites for individual substrates are modular, and multienzyme systems that can catalyse programmable metabolic pathways. In the postgenomic era, the discovery of such systems can be expected to have a significant impact on the role of enzymes in synthetic and process chemistry.

    View details for Web of Science ID 000166316200056

    View details for PubMedID 11196653

  • Tanford-Kirkwood electrostatics for protein modeling PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Havranek, J. J., Harbury, P. B. 1999; 96 (20): 11145-11150


    Solvent plays a significant role in determining the electrostatic potential energy of proteins, most notably through its favorable interactions with charged residues and its screening of electrostatic interactions. These energetic contributions are frequently ignored in computational protein design and protein modeling methodologies because they are difficult to evaluate rapidly and accurately. To address this deficiency, we report a revised form of the original Tanford-Kirkwood continuum electrostatic model [Tanford, C. & Kirkwood, J. G. (1957) J. Am. Chem. Soc. 79, 5333-5339], which accounts for the effects of solvent polarization on charged atoms in proteins. The Tanford-Kirkwood model was modified to increase its speed and to improve its sensitivity to the details of protein structure. For the 37 electrostatic self-energies of the polar side-chains in bovine pancreatic trypsin inhibitor, and their 666 interaction energies, the modified Tanford-Kirkwood potential of mean force differs from a computationally intensive numerical potential (DelPhi) by root-mean-square errors of 0.6 kcal/mol and 0.08 kcal/mol, respectively. The Tanford-Kirkwood approach makes possible a realistic treatment of electrostatics in computationally demanding protein modeling calculations. For example, pH titration calculations for ovomucoid third domain that model polar side-chain relaxation (including >2 × 10²³ rotamer conformations of the protein) provide pKa values of unprecedented accuracy.

    View details for Web of Science ID 000082868500043

    View details for PubMedID 10500144

  • Springs and zippers: coiled coils in SNARE-mediated membrane fusion STRUCTURE Harbury, P. A. 1998; 6 (12): 1487-1491


    A conserved molecular machinery based on SNARE proteins catalyzes most, if not all, cellular membrane fusion events. A flurry of recent biophysical studies have established a detailed molecular picture of the core SNARE complex. Structural and biochemical analysis of the SNARE machinery is rapidly advancing our understanding of the specificity, regulation and protein catalysis of membrane fusion.

    View details for Web of Science ID 000077782900001

    View details for PubMedID 9862813

  • High-resolution protein design with backbone freedom SCIENCE Harbury, P. B., Plecs, J. J., Tidor, B., Alber, T., Kim, P. S. 1998; 282 (5393): 1462-1467


    Recent advances in computational techniques have allowed the design of precise side-chain packing in proteins with predetermined, naturally occurring backbone structures. Because these methods do not model protein main-chain flexibility, they lack the breadth to explore novel backbone conformations. Here the de novo design of a family of alpha-helical bundle proteins with a right-handed superhelical twist is described. In the design, the overall protein fold was specified by hydrophobic-polar residue patterning, whereas the bundle oligomerization state, detailed main-chain conformation, and interior side-chain rotamers were engineered by computational enumerations of packing in alternate backbone structures. Main-chain flexibility was incorporated through an algebraic parameterization of the backbone. The designed peptides form alpha-helical dimers, trimers, and tetramers in accord with the design goals. The crystal structure of the tetramer matches the designed structure in atomic detail.

    View details for Web of Science ID 000077110800046

    View details for PubMedID 9822371



    Progress in homology modeling and protein design has generated considerable interest in methods for predicting side-chain packing in the hydrophobic cores of proteins. Present techniques are not practically useful, however, because they are unable to model protein main-chain flexibility. Parameterization of backbone motions may represent a general and efficient method to incorporate backbone relaxation into such fixed main-chain models. To test this notion, we introduce a method for treating explicitly the backbone motions of alpha-helical bundles based on an algebraic parameterization proposed by Francis Crick in 1953 [Crick, F. H. C. (1953) Acta Crystallogr. 6, 685-689]. Given only the core amino acid sequence, a simple calculation can rapidly reproduce the crystallographic main-chain and core side-chain structures of three coiled coils (one dimer, one trimer, and one tetramer) to within 0.6-Å root-mean-square deviations. The speed of the predictive method [approximately 3 min per rotamer choice on a Silicon Graphics (Mountain View, CA) 4D/35 computer] permits it to be used as a design tool.

    View details for Web of Science ID A1995RR84400066

    View details for PubMedID 7667303

  • Crystal structure of an isoleucine-zipper trimer NATURE Harbury, P. B., Kim, P. S., Alber, T. 1994; 371 (6492): 80-83


    Subunit oligomerization in many proteins is mediated by short coiled-coil motifs. These motifs share a characteristic seven-amino-acid repeat containing hydrophobic residues at the first (a) and fourth (d) positions. Despite this common pattern, different sequences form two-, three- and four-stranded helical ropes. We have investigated the basis for oligomer choice by characterizing variants of the GCN4 leucine-zipper dimerization domain that adopt trimeric or tetrameric structures in response to mutations at the a and d positions. We now report the high-resolution X-ray crystal structure of an isoleucine-containing mutant that folds into a parallel three-stranded, alpha-helical coiled coil. In contrast to the dimer and tetramer structures, the interior packing of the trimer can accommodate beta-branched residues in the most preferred rotamer at both hydrophobic positions. Compatibility of the shape of the core amino acids with the distinct packing spaces in the two-, three- and four-stranded conformations appears to determine the oligomerization state of the GCN4 leucine-zipper variants.

    View details for Web of Science ID A1994PE38100056

    View details for PubMedID 8072533



    Coiled-coil sequences in proteins consist of heptad repeats containing two characteristic hydrophobic positions. The role of these buried hydrophobic residues in determining the structures of coiled coils was investigated by studying mutants of the GCN4 leucine zipper. When sets of buried residues were altered, two-, three-, and four-helix structures were formed. The x-ray crystal structure of the tetramer revealed a parallel, four-stranded coiled coil. In the tetramer conformation, the local packing geometry of the two hydrophobic positions in the heptad repeat is reversed relative to that in the dimer. These studies demonstrate that conserved, buried residues in the GCN4 leucine zipper direct dimer formation. In contrast to proposals that the pattern of hydrophobic and polar amino acids in a protein sequence is sufficient to determine three-dimensional structure, the shapes of buried side chains in coiled coils are essential determinants of the global fold.

    View details for Web of Science ID A1993MJ04600028

    View details for PubMedID 8248779

  • Amsacrine and etoposide hypersensitivity of yeast cells overexpressing DNA topoisomerase II CANCER RESEARCH Nitiss, J. L., Liu, Y. X., Harbury, P., Jannatipour, M., Wasserman, R., Wang, J. C. 1992; 52 (16): 4467-4472


    Increasing the cellular concentration of DNA topoisomerase II in yeast by constitutively expressing a plasmid-borne TOP2 gene encoding the enzyme greatly increases the sensitivity of the cells to amsacrine and etoposide (VP-16). This increased drug sensitivity at a higher intracellular DNA topoisomerase II level is observed in both RAD52+ repair-proficient strains and rad52 mutants that are defective in the repair of double-stranded breaks. These results provide strong support for the hypothesis that the cellular target of these drugs is DNA topoisomerase II, and that these drugs kill cells by converting DNA topoisomerase II into a DNA-damaging agent.

    View details for Web of Science ID A1992JJ83700026

    View details for PubMedID 1322791



    Actin is the major ATP- and ADP-binding protein in platelets (0.9-1.3 nmol/10⁸ cells), with 50-70% in the unpolymerized state. The goal of these experiments was to develop a method for extracting all protein-bound ATP and ADP from undisturbed platelets in plasma. Extraction of actin-bound ADP is routine, whereas extraction of actin-bound ATP from platelets in buffer has been unsuccessful. Prior to extraction, the platelets were exposed to ¹⁴C-adenine to label the metabolic and actin pools of ATP and ADP. The specific activity was determined from the actin-bound ADP in the 43% ethanol precipitate. Sequential ethanol and perchlorate extractions of platelet-rich plasma, and of the derived supernatants and precipitates, were performed. ATP concentrations were determined with the luciferase assay, and radioactive nucleotides were separated by TLC. A total of 1.18 nmol/10⁸ cells of protein-bound ATP and ADP was recovered, of which 52% was ATP (0.61 nmol). The recovery of protein-bound ADP was increased from 0.3 to 0.57 nmol/10⁸ cells. This approach successfully recovered, for the first time, protein-bound ATP and ADP from platelets at a concentration expected for actin.

    View details for Web of Science ID A1990DA61900025

    View details for PubMedID 2163554



    Although the yeast his3 promoter region contains two functional TATA elements, TR and TC, the GCN4 and GAL4 upstream activator proteins stimulate transcription only through TR. In combination with GAL4, an oligonucleotide containing the sequence TATAAA is fully sufficient for TR function, whereas almost all single-base-pair substitutions of this sequence abolish the ability of this element to activate transcription. Further analysis of these and other mutations of the TR element led to the following conclusions. First, sequences downstream of the TATAAA sequence are important for TR function. Second, a double mutant, TATTTA, can serve as a TR element even though the corresponding single mutation, TATTAA, is unable to do so. Third, three mutations have the novel property of being able to activate transcription in combination with GCN4 but not with GAL4; this finding suggests that activation by GCN4 and by GAL4 may not occur by identical mechanisms. From these observations, we address the question of whether there is a single TATA-binding factor required for the transcription of all genes.

    View details for Web of Science ID A1989CB55100004

    View details for PubMedID 2685558



    The affinity of the Escherichia coli phage 434 operator for phage 434 repressor is affected by changes in the sequence of the noncontacted base pairs near the operator's center. The results presented here show that base composition near the center of the operator affects the operator's affinity for repressor by altering the ease with which the operator can be overtwisted into the proper configuration for complex formation. We show that both DNA flexibility and repressor flexibility influence the strength of the repressor-operator interaction: an operator with a single-strand nick at its center has a higher affinity for repressor than does the intact operator, and a repressor bearing a mutation that results in a relaxed dimer interaction is less sensitive than is wild type to changes in the flexibility of the operator. We show that the effect of noncontacted base pairs on operator affinity is independent of the slight overall bend of the operator seen in the repressor-operator complex. Central sequence effects on affinity for repressor are independent of the identity of adjacent base pairs, suggesting that the structure of the individual base pairs, not interactions between them, is responsible for the different torsional rigidities of different operators.

    View details for Web of Science ID A1988P227700013

    View details for PubMedID 3387430


    View details for Web of Science ID A1988AP53900017

    View details for PubMedID 3151184