R - Sum up cells in a matrix according to different hierarchical levels
I am using R to make a heatmap of binary interactions. The matrix looks like the following:

          9 401 562  68  71 569 700
    9     0   1   0   0   0   0   1
    401   0   0   1   0   0  NA   1
    562   0   1   0   1   1   0   1
    68    1   1   0   0   0   0   1
    71    1  NA   0   0  NA   0   1
    569   1   1   0   1   0   0   0
    700   0   0   0   0   0   0   0
I also have metadata corresponding to the IDs:

        compart group family category
    9   ex      prt   a      ps
    401 ex      prt   a      ps
    562 ex      prt   b      rh
    68  in      prt   c      en
    71  in      act   d      stp
    569 in      act   d      stp
    700 ex      act   e      aqua
I want to sum the cells at different levels, for example here according to family. The table would look like:

      a b c  d e
    a 1 1 0  0 1
    b 1 0 0 NA 1
    c 2 0 0  0 1
    d 3 0 1  0 0
    e 0 0 0  0 0
And the same at the compart level, and so on.

I am looking for a solution that saves me from doing this manually, which would take hours of work.
Your best bet is to flatten, or "stretch out", the matrix. Try the following:
    library(magrittr)
    library(data.table)
    library(reshape2)

    ## Let ids be the metadata data.frame
    dt_ids <- as.data.table(ids, keep.rownames = TRUE)
    # dt_ids[, rn := as.numeric(rn)]
    setkey(dt_ids, rn)

    ## Let m be the interactions matrix
    ## Reshape the interactions data into a tall data.table
    dt_interactions <- m %>%
      as.data.table(keep.rownames = TRUE) %>%
      melt(id.vars = "rn", value.name = "interaction")

    ## Clean up the column names
    setnames(dt_interactions, c("rn", "variable"), c("rn.rows", "rn.cols"))

    ## Add in 2 copies of the metadata:
    ## 1 for the "rows" of m and 1 for the "cols" of m
    dt_interactions[, paste0(names(dt_ids), ".rows") := dt_ids[.(rn.rows)]]
    dt_interactions[, paste0(names(dt_ids), ".cols") := dt_ids[.(rn.cols)]]

    ## Set the key of dt_interactions
    setkey(dt_interactions, rn.rows, rn.cols)

    ## Sum up
    dt_interactions[, sum(interaction), by = c("family.rows", "family.cols")]
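To see what the reshape step does in isolation, here is a minimal base R sketch on a toy 2 x 2 matrix. The IDs "x" and "y" and the names `toy` and `tall` are made up for illustration, and `as.data.frame(as.table(...))` is used here as a stand-in for the `melt()` call rather than the method from the answer.

```r
# Toy 2 x 2 interaction matrix with hypothetical IDs "x" and "y"
toy <- matrix(c(0, 1,
                1, NA),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("x", "y"), c("x", "y")))

# as.table() followed by as.data.frame() yields one row per matrix cell:
# the row ID, the column ID, and the cell value
tall <- as.data.frame(as.table(toy),
                      responseName = "interaction",
                      stringsAsFactors = FALSE)

# Use the same column names as the tall table in the answer
names(tall)[1:2] <- c("rn.rows", "rn.cols")

print(tall)
```

Each of the four cells becomes its own row, which is what makes it possible to attach metadata to the row ID and to the column ID separately before summing.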
I would wrap the last part in a nice function:
    sumbymeta <- function(..., na.rm = TRUE) {
      bycols_simple <- list(...) %>% unlist
      rows <- paste0(bycols_simple, ".rows")
      cols <- paste0(bycols_simple, ".cols")

      ## Row-side levels go on the left of the formula, column-side levels on the right
      formula <- paste(paste(rows, collapse = " + "), "~", paste(cols, collapse = " + "))

      dt_interactions[, sum(interaction, na.rm = na.rm), by = c(rows, cols)] %>%
        dcast.data.table(formula = as.formula(formula), value.var = "V1") %>%
        setnames(old = seq_along(bycols_simple), new = bycols_simple) %>%
        {.}
    }

    ## eg:
    sumbymeta("family")
    #    family a b c d e
    # 1:      a 1 1 0 0 2
    # 2:      b 1 0 1 1 1
    # 3:      c 2 0 0 0 1
    # 4:      d 3 0 1 0 1
    # 5:      e 0 0 0 0 0

    ## Try running these
    sumbymeta("family")
    sumbymeta("group")
    sumbymeta("family", "group")
    sumbymeta("family", "group", "compart")
    sumbymeta("family", "compart")
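As a cross-check that does not depend on data.table, the same per-level totals can be sketched in base R with `rowsum()`, applied once to the rows and once to the transposed result. This is a different technique from the answer above, not a restatement of it. The helper name `sum_by_level` is hypothetical, the data are typed in by hand from the question, and the family label "a" for IDs 9 and 401 is an assumption, since that label is blank in the text as posted.

```r
# Example data typed in from the question
id_order <- c("9", "401", "562", "68", "71", "569", "700")
m <- matrix(c(0,  1, 0, 0,  0,  0, 1,
              0,  0, 1, 0,  0, NA, 1,
              0,  1, 0, 1,  1,  0, 1,
              1,  1, 0, 0,  0,  0, 1,
              1, NA, 0, 0, NA,  0, 1,
              1,  1, 0, 1,  0,  0, 0,
              0,  0, 0, 0,  0,  0, 0),
            nrow = 7, byrow = TRUE,
            dimnames = list(id_order, id_order))

ids <- data.frame(
  compart  = c("ex", "ex", "ex", "in", "in", "in", "ex"),
  group    = c("prt", "prt", "prt", "prt", "act", "act", "act"),
  family   = c("a", "a", "b", "c", "d", "d", "e"),  # "a" is an assumption
  category = c("ps", "ps", "rh", "en", "stp", "stp", "aqua"),
  row.names = id_order,
  stringsAsFactors = FALSE
)

# Hypothetical helper: sum the cells of m within each (row level, column level) block
sum_by_level <- function(m, ids, level, na.rm = TRUE) {
  row_labels <- ids[rownames(m), level]
  col_labels <- ids[colnames(m), level]
  # rowsum() adds up all rows that share a label
  by_rows <- rowsum(m, row_labels, na.rm = na.rm)
  # Transpose so the columns can be collapsed the same way, then transpose back
  t(rowsum(t(by_rows), col_labels, na.rm = na.rm))
}

by_family <- sum_by_level(m, ids, "family")
print(by_family)
```

With `na.rm = TRUE` the result should agree with the `sumbymeta("family")` output shown above, for example 2 for row a, column e and 3 for row d, column a. Calling `sum_by_level(m, ids, "compart")` or `sum_by_level(m, ids, "group")` gives the other single-level tables; combining several levels at once is left to the data.table version.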