Honors & Awards

  • Gabino Barreda Award for best peer in Computer Science, Universidad Nacional Autonoma de Mexico (2007)

Education & Certifications

  • Licenciado, Universidad Nacional Autonoma de Mexico, Computer Science (2008)

Stanford Advisors

  • Euan Ashley, Doctoral Dissertation Co-Advisor (NonAC)
  • Rhiju Das, Doctoral Dissertation Advisor (AC)


Journal Articles

  • An RNA Mapping DataBase for curating RNA structure mapping experiments BIOINFORMATICS Cordero, P., Lucks, J. B., Das, R. 2012; 28 (22): 3006-3008


    We have established an RNA mapping database (RMDB) to enable structural, thermodynamic and kinetic comparisons across single-nucleotide-resolution RNA structure mapping experiments. The volume of structure mapping data has greatly increased since the development of high-throughput sequencing techniques, accelerated software pipelines and large-scale mutagenesis. For scientists wishing to infer relationships between RNA sequence/structure and these mapping data, there is a need for a database that is curated, tagged with error estimates and interfaced with tools for sharing, visualization, search and meta-analysis. Through its on-line front-end, the RMDB allows users to explore single-nucleotide-resolution mapping data in heat-map, bar-graph and colored secondary structure graphics; to leverage these data to generate secondary structure hypotheses; and to download the data in standardized and computer-friendly files, including the RDAT and community-consensus SNRNASM formats. At the time of writing, the database houses 53 entries, describing more than 2848 experiments of 1098 RNA constructs in several solution conditions and is growing rapidly. The database is freely available on the web. Supplementary data are available at Bioinformatics Online.

    View details for DOI 10.1093/bioinformatics/bts554

    View details for Web of Science ID 000311303500028

    View details for PubMedID 22976082

  • Quantitative Dimethyl Sulfate Mapping for Automated RNA Secondary Structure Inference BIOCHEMISTRY Cordero, P., Kladwang, W., VanLang, C. C., Das, R. 2012; 51 (36): 7037-7039


    For decades, dimethyl sulfate (DMS) mapping has informed manual modeling of RNA structure in vitro and in vivo. Here, we incorporate DMS data into automated secondary structure inference using an energy minimization framework developed for 2'-OH acylation (SHAPE) mapping. On six noncoding RNAs with crystallographic models, DMS-guided modeling achieves overall false negative and false discovery rates of 9.5% and 11.6%, respectively, comparable to or better than those of SHAPE-guided modeling, and bootstrapping provides straightforward confidence estimates. Integrating DMS-SHAPE data and including 1-cyclohexyl(2-morpholinoethyl) carbodiimide metho-p-toluene sulfonate (CMCT) reactivities provide small additional improvements. These results establish DMS mapping, an already routine technique, as a quantitative tool for unbiased RNA secondary structure modeling.

    View details for DOI 10.1021/bi3008802

    View details for Web of Science ID 000308833500001

    View details for PubMedID 22913637

  • Whole-Genome Sequencing in Personalized Therapeutics CLINICAL PHARMACOLOGY & THERAPEUTICS Cordero, P., Ashley, E. A. 2012; 91 (6): 1001-1009


    Eleven years since the initial drafts of the human genome were published, we have begun to see the first examples of the application of whole-genome sequencing to personalized diagnosis and therapeutics. The exponential decline in sequencing costs and the constant improvement in these technologies promise to further advance the use of a patient's full genetic profile in the clinic. However, realizing the potential benefit of such sequencing will require a concerted effort by science, medicine, law, and management. In this review, we discuss current approaches to decoding the 6 billion-letter genetic code of a whole genome in a clinical context, give current examples of translating this information into therapy-guiding knowledge, and list the challenges that will need to be surmounted before these powerful data can be fully exploited to forward the goals of personalized medicine.

    View details for DOI 10.1038/clpt.2012.51

    View details for Web of Science ID 000304245800018

    View details for PubMedID 22549284

  • Interpretome: a freely available, modular, and secure personal genome interpretation engine. Pacific Symposium on Biocomputing Karczewski, K. J., Tirrell, R. P., Cordero, P., Tatonetti, N. P., Dudley, J. T., Salari, K., Snyder, M., Altman, R. B., Kim, S. K. 2012: 339-350


    The decreasing cost of genotyping and genome sequencing has ushered in an era of genomic personalized medicine. More than 100,000 individuals have been genotyped by direct-to-consumer genetic testing services, which offer a glimpse into the interpretation and exploration of a personal genome. However, these interpretations, which require extensive manual curation, are subject to the preferences of the company and are not customizable by the individual. Academic institutions teaching personalized medicine, as well as genetic hobbyists, may prefer to customize their analysis and have full control over the content and method of interpretation. We present the Interpretome, a system for private genome interpretation, which contains all genotype information in client-side interpretation scripts, supported by server-side databases. We provide state-of-the-art analyses for teaching clinical implications of personal genomics, including disease risk assessment and pharmacogenomics. Additionally, we have implemented client-side algorithms for ancestry inference, demonstrating the power of these methods without excessive computation. Finally, the modular nature of the system allows for plugin capabilities for custom analyses. This system will allow for personal genome exploration without compromising privacy, facilitating hands-on courses in genomics and personalized medicine.

    View details for PubMedID 22174289

  • A two-dimensional mutate-and-map strategy for non-coding RNA structure NATURE CHEMISTRY Kladwang, W., VanLang, C. C., Cordero, P., Das, R. 2011; 3 (12): 954-962


    Non-coding RNAs fold into precise base-pairing patterns to carry out critical roles in genetic regulation and protein synthesis, but determining RNA structure remains difficult. Here, we show that coupling systematic mutagenesis with high-throughput chemical mapping enables accurate base-pair inference of domains from ribosomal RNA, ribozymes and riboswitches. For a six-RNA benchmark that has challenged previous chemical/computational methods, this 'mutate-and-map' strategy gives secondary structures that are in agreement with crystallography (helix error rates, 2%), including a blind test on a double-glycine riboswitch. Through modelling of partially ordered states, the method enables the first test of an interdomain helix-swap hypothesis for ligand-binding cooperativity in a glycine riboswitch. Finally, the data report on tertiary contacts within non-coding RNAs, and coupling to the Rosetta/FARFAR algorithm gives nucleotide-resolution three-dimensional models (helix root-mean-squared deviation, 5.7 Å) of an adenine riboswitch. These results establish a promising two-dimensional chemical strategy for inferring the secondary and tertiary structures that underlie non-coding RNA behaviour.

    View details for DOI 10.1038/NCHEM.1176

    View details for Web of Science ID 000297685800014

    View details for PubMedID 22109276

  • Understanding the Errors of SHAPE-Directed RNA Structure Modeling BIOCHEMISTRY Kladwang, W., VanLang, C. C., Cordero, P., Das, R. 2011; 50 (37): 8049-8056


    Single-nucleotide-resolution chemical mapping for structured RNA is being rapidly advanced by new chemistries, faster readouts, and coupling to computational algorithms. Recent tests have shown that selective 2'-hydroxyl acylation by primer extension (SHAPE) can give near-zero error rates (0-2%) in modeling the helices of RNA secondary structure. Here, we benchmark the method using six molecules for which crystallographic data are available: tRNA(phe) and 5S rRNA from Escherichia coli, the P4-P6 domain of the Tetrahymena group I ribozyme, and ligand-bound domains from riboswitches for adenine, cyclic di-GMP, and glycine. SHAPE-directed modeling of these highly structured RNAs gave an overall false negative rate (FNR) of 17% and a false discovery rate (FDR) of 21%, with at least one helix prediction error in five of the six cases. Extensive variations of data processing, normalization, and modeling parameters did not significantly mitigate modeling errors. Only one variation, filtering out data collected with deoxyinosine triphosphate during primer extension, gave a modest improvement (FNR = 12%, and FDR = 14%). The residual structure modeling errors are explained by the insufficient information content of these RNAs' SHAPE data, as evaluated by a nonparametric bootstrapping analysis. Beyond these benchmark cases, bootstrapping suggests a low level of confidence (<50%) in the majority of helices in a previously proposed SHAPE-directed model for the HIV-1 RNA genome. Thus, SHAPE-directed RNA modeling is not always unambiguous, and helix-by-helix confidence estimates, as described herein, may be critical for interpreting results from this powerful methodology.

    View details for DOI 10.1021/bi200524n

    View details for Web of Science ID 000294791100021

    View details for PubMedID 21842868

  • Phased Whole-Genome Genetic Risk in a Family Quartet Using a Major Allele Reference Sequence PLOS GENETICS Dewey, F. E., Chen, R., Cordero, S. P., Ormond, K. E., Caleshu, C., Karczewski, K. J., Whirl-Carrillo, M., Wheeler, M. T., Dudley, J. T., Byrnes, J. K., Cornejo, O. E., Knowles, J. W., Woon, M., Sangkuhl, K., Gong, L., Thorn, C. F., Hebert, J. M., Capriotti, E., David, S. P., Pavlovic, A., West, A., Thakuria, J. V., Ball, M. P., Zaranek, A. W., Rehm, H. L., Church, G. M., West, J. S., Bustamante, C. D., Snyder, M., Altman, R. B., Klein, T. E., Butte, A. J., Ashley, E. A. 2011; 7 (9)


    Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.

    View details for DOI 10.1371/journal.pgen.1002280

    View details for Web of Science ID 000295419100031

    View details for PubMedID 21935354

  • Sharing and archiving nucleic acid structure mapping data RNA-A PUBLICATION OF THE RNA SOCIETY Rocca-Serra, P., Bellaousov, S., Birmingham, A., Chen, C., Cordero, P., Das, R., Davis-Neulander, L., Duncan, C. D., Halvorsen, M., Knight, R., Leontis, N. B., Mathews, D. H., Ritz, J., Stombaugh, J., Weeks, K. M., Zirbel, C. L., Laederach, A. 2011; 17 (7): 1204-1212


    Nucleic acids are particularly amenable to structural characterization using chemical and enzymatic probes. Each individual structure mapping experiment reveals specific information about the structure and/or dynamics of the nucleic acid. Currently, there is no simple approach for making these data publicly available in a standardized format. We therefore developed a standard for reporting the results of single nucleotide resolution nucleic acid structure mapping experiments, or SNRNASMs. We propose a schema for sharing nucleic acid chemical probing data that uses generic public servers for storing, retrieving, and searching the data. We have also developed a consistent nomenclature (ontology) within the Ontology of Biomedical Investigations (OBI), which provides unique identifiers (termed persistent URLs, or PURLs) for classifying the data. Links to standardized data sets shared using our proposed format, along with a tutorial and links to templates, are available online.

    View details for DOI 10.1261/rna.2753211

    View details for Web of Science ID 000291683500002

    View details for PubMedID 21610212

  • A mutate-and-map strategy accurately infers the base pairs of a 35-nucleotide model RNA RNA-A PUBLICATION OF THE RNA SOCIETY Kladwang, W., Cordero, P., Das, R. 2011; 17 (3): 522-534


    We present a rapid experimental strategy for inferring base pairs in structured RNAs via an information-rich extension of classic chemical mapping approaches. The mutate-and-map method, previously applied to a DNA/RNA helix, systematically searches for single mutations that enhance the chemical accessibility of base-pairing partners distant in sequence. To test this strategy for structured RNAs, we have carried out mutate-and-map measurements for a 35-nt hairpin, called the MedLoop RNA, embedded within an 80-nt sequence. We demonstrate the synthesis of all 105 single mutants of the MedLoop RNA sequence and present high-throughput DMS, CMCT, and SHAPE modification measurements for this library at single-nucleotide resolution. The resulting two-dimensional data reveal visually clear, punctate features corresponding to RNA base pair interactions as well as more complex features; these signals can be qualitatively rationalized by comparison to secondary structure predictions. Finally, we present an automated, sequence-blind analysis that permits the confident identification of nine of the 10 MedLoop RNA base pairs at single-nucleotide resolution, while discriminating against all 1460 false-positive base pairs. These results establish the accuracy and information content of the mutate-and-map strategy and support its feasibility for rapidly characterizing the base-pairing patterns of larger and more complex RNA systems.

    View details for DOI 10.1261/rna.2516311

    View details for Web of Science ID 000287195900014

    View details for PubMedID 21239468
