Rsolid {Rsolid} | R Documentation |
This function performs quantile normalization of the color channel intensities in one panel of ABI SOLiD sequencing data. It operates on a single spch file, and writes color calls in the csfasta format.
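As a rough illustration of what quantile normalization does, the sketch below forces every column of a small matrix onto a common distribution. It is a generic textbook version, not the code used inside Rsolid; the function name `quantile_normalize`, and the assumption that columns stand for color channels with no missing values, are ours.

```r
# Generic quantile normalization sketch (not Rsolid's internal implementation).
# Assumption: x is a numeric matrix, one column per color channel, no NAs.
quantile_normalize <- function(x) {
  # rank of each value within its own column; ties broken by order of appearance
  ranks <- apply(x, 2, rank, ties.method = "first")
  # sort each column, then average across columns to get the target distribution
  target <- rowMeans(apply(x, 2, sort))
  # replace each value by the target quantile that matches its rank
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8), ncol = 3)
quantile_normalize(x)
```

After normalization every column contains the same set of values, only in a different order.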
Rsolid(intfile, outfile, ncycles = 50)
intfile |
Path to spch file containing color channel intensity data. |
outfile |
Path to the csfasta file to which color calls are written. |
ncycles |
Expected number of sequencing cycles. For SOLiD 3 data this would be 50; for earlier data, 35. It should be a multiple of 5 (the number of primer cycles normally used in SOLiD). |
The parser for spch files will warn if fewer than 5 primer cycles are stored in the file (this happens when SOLiD fails to properly read intensities in these primer cycles). However, normalization will proceed, and reads from spch files with fewer than 5 primer cycles will be written to the csfasta file with '.' where primer cycles are missing. The same happens for missing ligation cycles. This is closer to the behavior of the SOLiD platform.
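To see how many calls were left as '.' in each read, something like the following sketch can be applied to the color-call strings of a csfasta file. The helper name `count_missing_calls` and the example strings are hypothetical.

```r
# Count the '.' placeholders in each color-call string.
# Assumption: each element of `colors` is one sequence line from a csfasta file.
count_missing_calls <- function(colors) {
  # fixed = TRUE matches a literal dot rather than the regex wildcard
  hits <- gregexpr(".", colors, fixed = TRUE)
  lengths(regmatches(colors, hits))
}

calls <- c("T0123", "T01..2", "T.....")  # made-up reads for illustration
count_missing_calls(calls)
```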
Some reads may not appear in the csfasta file if too many intensity values are missing (NA in the spch file). Thus, you may see fewer reads in the csfasta file from panels with many NAs in the spch file. The read names generated by Rsolid should match the read names output by the SOLiD software, so reads can be matched by name.
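Because reads can be matched by name, a quick way to find which reads were dropped is to compare the header lines of the two csfasta files. The sketch below assumes the usual csfasta layout, in which each read name sits on a line starting with '>'; the helper names are hypothetical.

```r
# Extract read names from a csfasta file.
# Assumption: header lines start with '>'; comment and sequence lines do not.
csfasta_read_names <- function(path) {
  lines <- readLines(path)
  headers <- lines[startsWith(lines, ">")]
  sub("^>", "", headers)
}

# Names present in the reference (e.g. SOLiD software) output but absent
# from the csfasta file written by Rsolid.
missing_reads <- function(reference_csfasta, rsolid_csfasta) {
  setdiff(csfasta_read_names(reference_csfasta),
          csfasta_read_names(rsolid_csfasta))
}
```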
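We recommend calling this function through a cluster-aware library such as snow or Rmpi, processing one spch file per worker. The example at the end of this page shows the per-file call that each worker would run.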
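One possible way to distribute panels over workers is sketched below with the parallel package, which ships with R and provides snow-style clusters. The helpers `csfasta_path` and `normalize_panels`, the output directory, and the worker count are assumptions made for illustration; the usage lines are commented out because they need the Rsolid package and real spch files.

```r
library(parallel)

# Hypothetical helper: derive one csfasta output path per spch file.
csfasta_path <- function(spch_file, outdir) {
  file.path(outdir, sub("\\.spch$", ".csfasta", basename(spch_file)))
}

# Hypothetical helper: call worker(intfile, outfile) for every panel on
# cluster `cl` and collect the returned status codes.
normalize_panels <- function(cl, spch_files, outdir, worker) {
  outfiles <- csfasta_path(spch_files, outdir)
  status <- clusterMap(cl, worker, spch_files, outfiles, USE.NAMES = FALSE)
  data.frame(intfile = spch_files, outfile = outfiles,
             status = unlist(status), stringsAsFactors = FALSE)
}

# Usage sketch (not run):
# cl <- makeCluster(4)                      # worker count is an assumption
# clusterEvalQ(cl, library(Rsolid))         # load Rsolid on every worker
# files <- list.files("directory/with/spch/files", pattern = "spch$",
#                     full.names = TRUE)
# normalize_panels(cl, files, "csfasta_out",
#                  function(i, o) Rsolid(i, o, ncycles = 50))
# stopCluster(cl)
```

Giving every panel its own output file keeps workers from writing to the same path at the same time.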
0 if successful.
Hao Wu, Rafael A. Irizarry and Hector Corrada Bravo <hcorrada@gmail.com>
Wu, H., Irizarry, RA. and Corrada Bravo, H.:
Intensity normalization improves color calling in SOLiD
sequencing. Nature Methods, 7, 336-337.
http://rafalab.org/Rsolid
## Not run:
spchdir <- "directory/with/spch/files"
spch.files <- list.files(spchdir, pattern="spch$")
for (i in seq_along(spch.files)) {
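  # write one csfasta file per panel (assumes a shared outfile would be overwritten)
  Rsolid(file.path(spchdir, spch.files[i]), outfile=sub("spch$", "csfasta", spch.files[i]))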
}
## End(Not run)