We recently performed whole-genome DNA sequencing and RNA sequencing of 101 tumor biopsies from patients with
metastatic castration-resistant prostate cancer. This study has been published in Cell:
Quigley, Dang, Zhao et al. Cell 2018
These data were generated by the Stand Up to Cancer/AACR West Coast Dream Team. We received funding from many groups, and
acknowledge in particular Stand Up To Cancer, the Prostate Cancer Foundation, and the Movember Foundation. The whole
genome study was led by Arul Chinnaiyan (U Michigan), Christopher (Washington U), Eric Small (UCSF), and
Felix Feng (UCSF). Requests for reagents or data
associated with this paper should be directed to Felix Feng, although I am happy to field questions
about the analytical work described in the paper.
- Deep whole genome and transcriptome sequencing of 101 prostate cancer metastases
- Tandem duplication affects intergenic regulatory loci upstream of AR and MYC
- Inactivation of CDK12, TP53, and BRCA2 affects distinct classes of structural variants
- Androgen receptor is affected by mutation or structural variation in 85% of mCRPC
While mutations affecting protein-coding regions have been examined across many cancers,
structural variants at the genome-wide level are still poorly defined. Through integrative
deep whole genome and transcriptome analysis of 101 castration-resistant prostate cancer
metastases (109X tumor / 38X normal coverage), we identified structural variants altering
critical regulators of tumorigenesis and progression not detectable by exome approaches.
Notably, we observed amplification of an intergenic enhancer region 624 kilobases upstream
of the androgen receptor (AR) in 81% of patients, correlating with increased AR expression.
Tandem duplication hotspots also occur near MYC, in lncRNAs associated with post-translational
MYC regulation. Classes of structural variations were linked to distinct DNA repair
deficiencies, suggesting their etiology, including associations of CDK12 mutation with
tandem duplications, TP53 inactivation with inverted rearrangements and chromothripsis,
and BRCA2 inactivation with deletions. Together, these observations provide a comprehensive
view of how structural variations affect critical regulators in metastatic prostate cancer.
Scripts employed during the analysis are available on GitHub.
These comprise Python, R, and shell scripts. If you have questions about particular analyses that
cannot be answered by the paper's published methods or by reading the code, please contact me.
Laser-capture microdissected tumor tissue was subjected to RNA-seq. RNA reads were then aligned
against HG38-decoy using STAR as described in the manuscript, producing per-gene count files (see below).
RNA data were available for 99 of the 101 samples with DNA-seq data, so please expect to see 99 columns in matrix files for this study.
A total of 26,485 transcripts were assessed for counts. Count files were then processed
to calculate TPM values using code adapted from https://gist.github.com/slowkow/c6ab0348747f86e2748b.
This script marked as absent any individual gene if no sample had at least 100 counts for
that gene and if the mean number of counts across all 101 samples was less than 100. After
filtering, 16,844 genes with TPM calls were included in the analysis. You can
re-process the count data to your own satisfaction using the raw data linked below.
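If you want to re-process the counts yourself, the filtering rule and the TPM calculation described above can be sketched roughly as follows. This is a minimal illustration, not the script used for the paper: the function names, the toy gene names, and the gene lengths are hypothetical, the TPM step uses plain gene length rather than the effective length computed in the linked gist, and applying the filter before the TPM step is an assumption about the order.

```python
def filter_absent_genes(counts, min_count=100):
    """Drop genes marked absent under the rule described in the text.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    A gene is marked absent when no sample reaches min_count AND the
    mean count across samples is below min_count.
    """
    kept = {}
    for gene, row in counts.items():
        no_sample_reaches = max(row) < min_count
        low_mean = (sum(row) / len(row)) < min_count
        if not (no_sample_reaches and low_mean):
            kept[gene] = row
    return kept


def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts to TPM using gene length in base pairs.

    Assumption: plain gene length is used here; the linked gist uses an
    effective length that also accounts for mean fragment length.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    # Reads per kilobase of gene length, per gene and sample.
    rpk = {g: [c / (lengths_bp[g] / 1000.0) for c in counts[g]] for g in genes}
    # Per-sample sum of RPK values, used to scale each sample to one million.
    totals = [sum(rpk[g][j] for g in genes) for j in range(n_samples)]
    return {
        g: [rpk[g][j] / totals[j] * 1e6 if totals[j] > 0 else 0.0
            for j in range(n_samples)]
        for g in genes
    }


# Toy input with hypothetical gene names, two samples each.
toy_counts = {
    "GENE_A": [500, 300],
    "GENE_B": [20, 60],    # never reaches 100 counts, so it is marked absent
    "GENE_C": [0, 150],
}
toy_lengths = {"GENE_A": 2000, "GENE_B": 1000, "GENE_C": 1500}

kept = filter_absent_genes(toy_counts)
tpm = counts_to_tpm(kept, toy_lengths)
```

Because a gene whose maximum count is below 100 necessarily has a mean below 100, the two conditions as stated reduce to the first one; both are kept in the sketch so that it mirrors the description. TPM values in each sample sum to one million over whichever genes are passed in, so filtering before or after the conversion gives different values.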
Updated September 2017