We held a webinar on identifying common aberrations with STAC and GISTIC, and on when to use one or the other. The webinar also covered downstream enrichment analysis. The recording has been split by topic and is now available to review on the Educational Videos page. I’ve provided the video links and descriptions below, followed by some of the questions and answers that arose during the webinar. You can also read about GISTIC and STAC on a blog post here.
What is it, why should you use it, what insights do you gain from this? When performing copy number variation analysis across a set of samples, we may ask “what are the most frequent alterations?” Another way to look at it is how to find the regions which have an aberration in at least x% of the samples and how to identify the peaks within this region. This presentation will go over how to identify these frequent aberrations and introduce two popular approaches (GISTIC and STAC) used to determine which of these common aberrations are statistically significant.
Reviews the method published by Diskin et al. at U. Penn (STAC: A method for
testing the significance of DNA copy number aberrations across multiple array-CGH
experiments). The method applies two statistics: frequency, which measures how often
a location is aberrant across the entire sample set, and footprint, which is based
on the interval lengths of overlapping aberrations.
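To make the frequency statistic concrete, here is a minimal Python sketch, not the published STAC implementation. It assumes each sample has already been reduced to a list of 0/1 calls per location (1 = aberrant). The function names and the null model are our own assumptions: each sample's profile is circularly rotated at random, which is a simplified stand-in for the permutation scheme described in the paper. The footprint statistic is not covered here.

```python
import random

def frequency(profiles):
    """Count, for each location, how many samples are aberrant there."""
    return [sum(col) for col in zip(*profiles)]

def rotate(row, k):
    """Circularly shift a 0/1 profile by k positions (keeps interval lengths)."""
    k %= len(row)
    return row[-k:] + row[:-k] if k else list(row)

def frequency_p_values(profiles, n_perm=1000, seed=0):
    """Permutation p-value per location for the frequency statistic.

    Assumption: the null is built by rotating each sample independently and
    recording the maximum frequency seen anywhere in the rotated data.
    """
    rng = random.Random(seed)
    observed = frequency(profiles)
    n_loc = len(observed)
    exceed = [0] * n_loc
    for _ in range(n_perm):
        shuffled = [rotate(row, rng.randrange(n_loc)) for row in profiles]
        null_max = max(frequency(shuffled))
        for i, f in enumerate(observed):
            if null_max >= f:
                exceed[i] += 1
    # Add-one correction so a p-value is never exactly zero.
    return [(e + 1) / (n_perm + 1) for e in exceed]

# Toy data: 6 samples, 20 locations, every sample aberrant at location 3 only.
profiles = [[1 if i == 3 else 0 for i in range(20)] for _ in range(6)]
p_values = frequency_p_values(profiles)
```

In this toy data set, location 3 is aberrant in all six samples, so random rotations almost never reproduce a stack that high and its p-value comes out small.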
Reviews the GISTIC (Genomic Identification of Significant Targets in Cancer) method
developed at the Broad Institute by Beroukhim et al. [Assessing the significance of
chromosomal aberrations in cancer: Methodology and application to glioma]. This
method identifies regions that are significantly gained or lost across a set of
samples, giving a greater weight to high-amplitude events which are less likely to
represent random aberrations.
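The amplitude weighting can be illustrated with a small Python sketch. It assumes a score of the form (fraction of samples gained) × (mean log2 ratio among those samples), computed per location for gains only. The function name and the 0.1 threshold are hypothetical choices for illustration, and the significance step (the q-values) is not shown.

```python
def g_score_gain(log_ratios, threshold=0.1):
    """Per-location gain score: frequency of gain times mean gain amplitude.

    log_ratios: one list of log2 ratios per sample, all of equal length.
    threshold:  assumed cutoff above which a value counts as a gain.
    """
    n_samples = len(log_ratios)
    scores = []
    for col in zip(*log_ratios):
        gains = [v for v in col if v > threshold]
        if not gains:
            scores.append(0.0)
            continue
        freq = len(gains) / n_samples        # fraction of samples gained here
        mean_amp = sum(gains) / len(gains)   # average amplitude of those gains
        scores.append(freq * mean_amp)
    return scores

# Two locations gained in the same 2 of 4 samples, but at different amplitudes.
log_ratios = [
    [0.3, 2.0, 0.0],
    [0.3, 2.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.05],
]
scores = g_score_gain(log_ratios)
```

Both gained locations have the same frequency, yet the high-amplitude one receives a much larger score, which is the behavior described above.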
Once significant common aberrations have been identified, we need to see what
biological implications these aberrations have. Enrichment analysis identifies GO
terms and the genes associated with these that are significantly over-represented in
the aberrant regions.
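One common way to test for over-representation is a one-sided hypergeometric test, sketched below in Python. This illustrates the general idea and is not necessarily the test Nexus applies. The function and parameter names are our own, and in practice the p-values would also need a multiple-testing correction across GO terms.

```python
from math import comb

def enrichment_p_value(n_genome, n_term, n_region, n_overlap):
    """P(overlap >= n_overlap) under a hypergeometric null.

    n_genome:  total number of annotated genes considered
    n_term:    genes annotated with the GO term
    n_region:  genes falling in the aberrant regions
    n_overlap: genes that are both in the regions and annotated with the term
    """
    upper = min(n_term, n_region)
    total = comb(n_genome, n_region)
    tail = sum(
        comb(n_term, k) * comb(n_genome - n_term, n_region - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 10 genes in total, 5 carry the term, 5 lie in aberrant
# regions, and all 5 of those carry the term.
p = enrichment_p_value(10, 5, 5, 5)
```

With these toy numbers there is only one way out of 252 to draw all five term genes, so the p-value is 1/252.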
Yes, you can view either as copy number or Q-bound by selecting from the View menu. The G-score will be displayed by hovering over the region in either view. If the Q-bound is the selected view, results will be shown based on the significance settings. You can use the vertical zoom tool to adjust the y-axis as necessary: right-click to zoom out, click and drag to zoom in.
You will have to export both results in text format (use the Export text button in Nexus) and combine them in Excel. Because GISTIC is run on raw data and STAC is run on processed data, the results can’t be combined within Nexus, but this is easy to do in Excel.
Yes and no. High copy gains and homozygous deletions are weighted more heavily than single copy gains and single copy losses, and that is taken into consideration in the analysis. However, high copy gains can have many levels (e.g., 6 copy gain, 20 copy gain), whereas deletions can only be a single copy loss or a homozygous deletion. In that respect, results for high copy gains can be slightly biased by the varying levels of amplification. Even so, we find the results to be quite accurate overall.
To learn more about copy number variation analysis, visit the Video Library on the BioDiscovery website.