A script for qPCR analysis in R

An R script for those who like to be close to their qPCR data and catch problems early. It takes export files (multicomponent data, text format, “across columns”) of Life Technologies StepOne machines.

A standard analysis can be done in less than 5 minutes. It consists of these steps:
– plotting of the raw signal (and saving of the result) to catch odd amplification and strong offsets
– baseline correction
– magnified plotting (and save) to check correction and drift in signal
– cycle estimation by using a threshold
– tabulation of the results

Having used this script for a long time, it never became necessary to use the background channel for corrections. Subtracting a second channel is a source of additional noise, therefore it is not used; the correction would be very easy to add though.

I found that the threshold method is straightforward and robust compared to derivation-based methods. I have spotted oddly amplifying samples several times using the diagnostic plots and they could be removed before they could cause confusion in data analysis.  The resulting data is ready for further processing like normalisation, use of a standard curve (using lm( ) ), or plotting.

I usually set up a folder for each qPCR run, and put in a copy of the script. So I can return at any future time to repeat and if necessary amend the analysis. All the customisations I might have made to the script are preserved.

# qPCR analysis for StepOne
# input file: StepOne export files (multicomponent data, text format, "across columns")
# David Molnar 2013-14

# load
input <- read.table("your_data.txt", skip=8, sep="\t", header=TRUE)

# remove empty cols, non-sybr - rows, then the Dye column
empties <-is.na(input[1,])
intermediate <- input[,!empties]
sybrs <- (intermediate[,"Dye"] == "SYBR")
d96 <- intermediate[sybrs,]
rownames(d96)<-d96[,"Cycle"]
d96 <- d96[,!(colnames(d96) %in% c("Dye", "Cycle"))]
#diagnostic plot
plotter <- function(x, y, xlim=c(min(x),max(x)), ylim=c(min(y), max(y)){
   cols = sample(length(x))
   plot(x=0, y=0, ylim=ylim, xlim = xlim, type="n", xlab="cycles", ylab="fluorescence/a.u.")
   i <- 0
   for ( column in y ) {
      i <- i+1
      points( x=x, y=column, type="l", col=cols[i])
   }
}
plotter(as.numeric(rownames(d96)), d96)
pdf(file="raw.pdf")
plotter(as.numeric(rownames(d96)), d96)
dev.off()

# baseline correct
baseline=lapply(d96[1:5,colnames(d96)], median)
baseline=unlist(baseline)
d96c <- t(t(d96)-baseline)
d96c <-data.frame(d96c)
# plot again
plotter(as.numeric(rownames(d96c)), d96c, xlim = c(0,50), ylim=c(0,5000))
pdf(file="bc_detail.pdf")
plotter(as.numeric(rownames(d96c)), d96c, xlim = c(0,50), ylim=c(0,5000))
dev.off()

# cycles for intensity > threshold
overThreshold <- d96c >= 5000 
thresholds=c()
for (column in colnames(overThreshold)){
   thresholds<-c(thresholds, min(as.numeric(row.names(d96c))[overThreshold[,column]]))
}
names(thresholds) <- colnames(d96c)

# remove inf-columns = failed pcr
thresholds <- thresholds[is.finite(thresholds)]

# tabulate
rows <- LETTERS[1:8]
cols <- 1:12
addresses <- outer(rows, cols, paste, sep="")
cycles <- matrix(thresholds[addresses], nrow=8, ncol=12)
# customise labels here
colnames(cycles) <- LETTERS[1:8]
row.names(cycles) <- 1:12
# cycles is a representation of the 96-well plate with cycle numbers or NA where
# the signal is undetectable
# basic processing done, custom code comes here

Leave a Reply

Your email address will not be published.


Please simplify: \(\frac{2^{7700}}{2^{7699}}=\)