David Quigley, PhD

Genomic Hallmarks and Structural Variation in Metastatic Prostate Cancer
We recently performed whole genome DNA and RNA sequencing of 101 tumor biopsies from patients with metastatic castration-resistant prostate cancer. This study has been published in Cell: Quigley, Dang, Zhao et al. Cell 2018. These data were generated by the Stand Up to Cancer/AACR West Coast Dream Team. We received funding from many groups, and acknowledge in particular Stand Up To Cancer, the Prostate Cancer Foundation and The Movember Foundation. The whole genome study was led by Arul Chinnaiyan (U Michigan), Christopher Maher (Washington U), Eric Small (UCSF), and Felix Feng (UCSF). Requests for reagents or data associated with this paper should be directed to Felix Feng, although I am happy to field questions about the analytical work described in the paper.

Summary:
While mutations affecting protein-coding regions have been examined across many cancers, structural variants at the genome-wide level are still poorly defined. Through integrative deep whole genome and transcriptome analysis of 101 castration-resistant prostate cancer metastases (109X tumor / 38X normal coverage), we identified structural variants altering critical regulators of tumorigenesis and progression not detectable by exome approaches. Notably, we observed amplification of an intergenic enhancer region 624 kilobases upstream of the androgen receptor (AR) in 81% of patients, correlating with increased AR expression. Tandem duplication hotspots also occur near MYC, in lncRNAs associated with post-translational MYC regulation. Classes of structural variations were linked to distinct DNA repair deficiencies, suggesting their etiology, including associations of CDK12 mutation with tandem duplications, TP53 inactivation with inverted rearrangements and chromothripsis, and BRCA2 inactivation with deletions. Together, these observations provide a comprehensive view of how structural variations affect critical regulators in metastatic prostate cancer.

Scripts used in the analysis
Scripts employed during the analysis are available on GitHub: https://github.com/DavidQuigley/WCDT. These comprise Python, R, and shell scripts. If you have questions about particular analyses that cannot be answered by the paper's published methods or by reading the code, please contact me.

Whole genome DNA sequence data
mRNA data
Laser-capture microdissected tumor tissue was subjected to RNA-seq. RNA reads were aligned against HG38-decoy using STAR as described in the manuscript, producing per-gene count files (see below). RNA data were available for 99 of the 101 samples with DNA-seq data, so expect 99 columns in the matrix files for this study. A total of 26,485 transcripts were assessed for counts. Count files were then processed to calculate TPM values with the script at https://github.com/DavidQuigley/WCDT/scripts/calculate_RNA_tpm.R, which was adapted from https://gist.github.com/slowkow/c6ab0348747f86e2748b. This script marked a gene as absent if no sample had at least 100 counts for that gene and the mean number of counts across all samples was less than 100. After filtering, 16,844 genes with TPM calls were included in the analysis. You can re-process the count data to your own satisfaction using the raw data linked below.
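For readers who want to re-process the counts themselves, the two steps described above (counts to TPM, and the low-count filter) can be sketched roughly as follows. This is a minimal illustration in Python, not the R script from the repository: the function names, the gene lengths, and the toy count values are all hypothetical, and the TPM formula shown is the plain length-normalized version, which may differ from what calculate_RNA_tpm.R does (for example, if that script applies an effective-length correction).

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts to TPM, one sample (column) at a time.

    counts:     dict mapping gene -> list of per-sample read counts
    lengths_bp: dict mapping gene -> gene length in base pairs (assumed input;
                the source does not say which lengths were used)
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    tpm = {g: [0.0] * n_samples for g in genes}
    for j in range(n_samples):
        # Reads per base for each gene in this sample
        rates = {g: counts[g][j] / lengths_bp[g] for g in genes}
        total = sum(rates.values())
        for g in genes:
            # Scale so that each sample sums to one million
            tpm[g][j] = rates[g] / total * 1e6 if total > 0 else 0.0
    return tpm


def is_absent(gene_counts, min_count=100):
    """Apply the filter as worded in the text: a gene is absent if no sample
    reaches min_count AND the mean count is below min_count.
    (If no sample reaches min_count, the mean is necessarily below it too.)"""
    no_sample_reaches = max(gene_counts) < min_count
    low_mean = sum(gene_counts) / len(gene_counts) < min_count
    return no_sample_reaches and low_mean


# Hypothetical toy data: three genes, two samples
toy_counts = {"GENE_A": [500, 300], "GENE_B": [10, 20], "GENE_C": [1000, 0]}
toy_lengths = {"GENE_A": 1000, "GENE_B": 2000, "GENE_C": 4000}

toy_tpm = counts_to_tpm(toy_counts, toy_lengths)
kept_genes = [g for g in toy_counts if not is_absent(toy_counts[g])]
```

In this toy example GENE_B would be marked absent because neither sample reaches 100 counts, while GENE_C is kept because one sample does. Whether the real pipeline filters before or after computing TPM is not stated here, so check the R script if the order matters for your analysis.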
Updated September 2017