BioMed Research International

Research Article

biomvRhsmm: Genomic Segmentation with Hidden Semi-Markov Model

Box 1

Code chunk.

# model run
> rhsmm <- biomvRhsmm(x=variosm, maxbp=100, prior.m=‘cluster’, maxgap=100)
> hiDiffgr <- rhsmm@res[mcols(rhsmm@res)[,‘STATE’]!=2
& mcols(rhsmm@res)[,‘SAMPLE’]==‘meth.diff’]
# check for direction of changes
> dirNo <- mcols(hiDiffgr)[,‘STATE’]==‘1’& mcols(hiDiffgr)[,‘AVG’]>0 ∣
mcols(hiDiffgr)[,‘STATE’]==‘3’ & mcols(hiDiffgr)[,‘AVG’]<0
> hiDiffgr <- hiDiffgr[!dirNo]
# locating low p.val regions
> loPgr <- rhsmm@res[mcols(rhsmm@res)[,‘STATE’]==1
& mcols(rhsmm@res)[,‘SAMPLE’]==‘p.val’]
# find common high difference and low p.val regions
> DMRs <- intersect(hiDiffgr, loPgr)
> idx <- findOverlaps(variosm, DMRs, type=‘within’)
> mcols(DMRs) <- DataFrame(cbind(TYPE=‘DMR’, aggregate(as.data.frame(mcols(variosm[idx@queryHits])),
by=list(DMR=idx@subjectHits), FUN=median)[,−1 ]))
> names(DMRs) <- paste0(‘DMRs’, seq_along(DMRs))
>DMRs
GRanges with 5 ranges and 3 metadata columns:
Seqnames Ranges Strand TYPE meth.diff p.val
<Rle> <IRanges> <Rle> <factor> <numeric> <numeric>
DMRs1 chr1 [875227, 875470] * ∣ DMR 0.31947418 6.677193e−06
DMRs2 chr1 [876807, 876958] * ∣ DMR −0.06108219 6.500328e−02
DMRs3 chr1 [877684, 877738] * ∣ DMR −0.06123008 2.844639e−02
DMRs4 chr2 [46126, 46280] * ∣ DMR 0.41008524 1.818530e−07
DMRs5 chr2 [46389, 46558] * ∣ DMR 0.44823172 1.890819e−06
- - -
Seqlengths:
chr1 chr2
NA NA
> plot (rhsmm, gmgr=DMRs)