David Quigley

Reproducible Results

Overview

Although I try to spend as much time as I can working on biology, I write a lot of software in R, Python, and C++. In C++ I make heavy use of the Boost libraries, and of wxWidgets for cross-platform development.

Source Code

Source code for my software, including CARMEN, is available at github.com/DavidQuigley.

Publications

Quigley et al. Nature 2009

DATA AND RESULTS
The analysis in this section was performed using the 71 microarrays described in the paper. The GEO archive also contains arrays for the parental strains; remove these if you want to reproduce the published results as closely as possible.

To reproduce the eQTL analysis from scratch, you will need either to perform the linear regressions yourself or to build the eqtl software, which I wrote in C++, from its source code. The command to call eqtl using four cores and 1000 permutations would be:
# Adjust these two paths to your own eqtl build and data directory.
BIN_EQTL=/notebook/code/src/eqtl/build/Release/eqtl
DIR_REP=/notebook/hiroki/tail_eQTL_paper/reproduce
$BIN_EQTL \
-d$DIR_REP/expr_above_mean_4.txt \
-f$DIR_REP/sample_attributes.txt \
-g$DIR_REP/gene_attributes_na31_above_mean_4.txt \
-e$DIR_REP/calls_GSE12248.txt \
-h$DIR_REP/sample_attributes_calls.txt \
-s$DIR_REP/gene_attributes_calls_GSE12248.txt \
-kT -lChr,loc_start,Chr,loc_start -carray_id -n1000 -w4 -o$DIR_REP/eqtl_chr_1000.txt 
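
If you would rather perform the linear regressions yourself, the following is a minimal Python sketch of one generic way to do it. It is not the eqtl implementation and may differ from it in the details. It assumes genotypes are coded numerically (for example 0/1), fits expression ~ genotype by simple linear regression for each marker, and estimates a p-value for the strongest marker by shuffling the expression values and recording the maximum statistic across markers. The function names, the genotype coding, and the max-statistic permutation scheme are all assumptions made for illustration.

```python
import math
import random


def slope_t_statistic(genotypes, expression):
    """t statistic for the slope in the regression expression ~ genotype."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    if n < 3 or sxx == 0:
        return 0.0  # too few samples, or marker has no variation
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(genotypes, expression))
    if rss == 0:
        # Perfect fit: statistic is unbounded unless the slope is zero.
        return 0.0 if slope == 0 else math.copysign(math.inf, slope)
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return slope / standard_error


def permutation_p_value(marker_genotypes, expression, n_perm=1000, seed=0):
    """Return (largest |t| over markers, permutation p-value for that value).

    marker_genotypes: list of per-marker genotype lists, one value per sample.
    expression: expression values for one transcript, in the same sample order.
    """
    rng = random.Random(seed)
    observed = max(abs(slope_t_statistic(g, expression))
                   for g in marker_genotypes)
    shuffled = list(expression)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype/expression pairing
        best = max(abs(slope_t_statistic(g, shuffled))
                   for g in marker_genotypes)
        if best >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

To use it, call permutation_p_value once per transcript, passing the genotype vectors for all markers and the expression values for that transcript in the same sample order.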

You can download the Cytoscape CYS file used to create Figure One of Quigley et al. Nature 2009: Q_et_al_figure_one.cys. You will need an up-to-date version of the Java runtime to load the file; I recommend the latest version of Cytoscape as well.

Quigley et al. Genome Biology 2011

DATA AND RESULTS
The raw expression data for this paper (Affymetrix CEL files) and array-CGH data are available in GEO in SuperSeries GSE21264. Note that this SuperSeries includes the data from the Nature paper.

Genotypes used for this analysis can be downloaded here.
aCGH data used for this analysis can be downloaded here.

eQTL Results
eQTL results for Tails, Papillomas, and Carcinomas can be downloaded here. eQTL results were calculated by linear regression using code that can be downloaded from my GitHub repository. Please see this README file for a general overview of the analysis.

To et al. Molecular Cancer Research 2011

DATA
The array-CGH data used in this paper can be downloaded from GEO at GSE29230. The data as used in the reproducible analysis can be downloaded directly as a zip archive.

Quigley et al. Molecular Oncology 2014

Code and data to reproduce. Note that I cannot post the genotype data used for this study online; please contact the corresponding author (Vessela Kristensen) to inquire about access to these genotypes.

Sjolund et al. PNAS 2014

You can download code and data to reproduce the correlation analysis found in this publication.

DATA
The code refers to two different microarray datasets: mouse skin data and mouse mammary tissue data. Raw data for both of these datasets are publicly available on GEO, at accessions GSE46077 and GSE12248. Processed data, as used by the code in the paper, are included in the data folder.

ANALYSIS
Please see the README file for details. Analysis was executed in R using code in the file "reproduce_hipk2.r".

Quigley et al., Breast tumor and normal tissue, submitted

Code and data to reproduce.




Updated January 2015