Several customers have asked when and whether to use GC wave correction. What causes waviness? Should a correction be applied even if the data show no observable waviness? Should the correction use a linear, quadratic, or lowess model?
GC waviness in the intensity signal (LogR) depends on the amount of input DNA and on the GC content of the analyzed region. The precise mechanism is not fully understood, but the resulting change in signal can reduce a platform’s ability to detect copy number variants (CNVs). The bias that GC content introduces can be subtracted (corrected) using a wave-correction file specific to the array type and design. The wave correction (or ‘systematic correction’) is applied to the original signal using a linear, quadratic, or lowess regression model. For most arrays, a linear regression model is appropriate; for Affymetrix arrays, a quadratic model is recommended. Applying the systematic correction to an array that does not exhibit waviness should have no effect, so the correction can be applied uniformly across a data set. For a more thorough treatment of systematic correction, see this research on adjusting genomic waves in signal intensities from whole-genome SNP genotyping platforms.
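The regression step described above can be illustrated with a minimal sketch. It is not the implementation used by any particular analysis package. It assumes that a per-probe GC fraction is already available as an array (in practice this would come from the wave-correction file for the array design), and the function name `gc_wave_correct` and the synthetic data are hypothetical. The sketch fits LogR against GC content with a linear (`degree=1`) or quadratic (`degree=2`) polynomial and subtracts the fitted trend. A lowess fit is omitted here; it would need a smoother such as the one provided by statsmodels.

```python
import numpy as np


def gc_wave_correct(log_r, gc_content, degree=1):
    """Remove a polynomial GC trend from LogR values.

    degree=1 gives a linear model, degree=2 a quadratic model.
    Hypothetical helper for illustration only.
    """
    log_r = np.asarray(log_r, dtype=float)
    gc = np.asarray(gc_content, dtype=float)

    # Least-squares fit of LogR as a polynomial in GC content.
    coeffs = np.polyfit(gc, log_r, deg=degree)
    trend = np.polyval(coeffs, gc)

    # Subtract the GC-dependent trend, then add back its mean so the
    # overall level of the sample is left unchanged.
    return log_r - trend + trend.mean()


# Synthetic example (assumed values, not real array data).
rng = np.random.default_rng(0)
gc = rng.uniform(0.3, 0.6, size=5000)      # per-probe GC fraction
noise = rng.normal(0.0, 0.1, size=5000)    # probe-level noise
wavy = 0.8 * (gc - 0.45) + noise           # LogR with a GC-dependent bias

corrected = gc_wave_correct(wavy, gc, degree=1)
print("correlation with GC before:", np.corrcoef(gc, wavy)[0, 1])
print("correlation with GC after: ", np.corrcoef(gc, corrected)[0, 1])
```

In this sketch, a signal with no GC trend yields fitted coefficients close to zero apart from the intercept, so the correction leaves it nearly unchanged. That is consistent with applying the correction uniformly across a data set.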