r - which(vector1 < vector2) -
let's make small example first, computes in r:
x<- c(1,3,1,4,2) max(which(x<2)) [1] 3 now, not 1 value 2, many values simultaneously. should give me that:
max(which(x<c(1,2,3,4,5,6))) [1] na 3 5 5 5 5 of course run for loop, slow:
for(i in c(1,2,3,4,5,6)){ test[i]<-max(which(x<i)) } is there fast way this?
find max index of each value seen in x:
xvals <- unique(x) xmaxindx <- length(x) - match(xvals,rev(x)) + 1l rearrange
xvals <- xvals[order(xmaxindx,decreasing=true)] xmaxindx <- xmaxindx[order(xmaxindx,decreasing=true)] # 2 4 1 3 # 5 4 3 2 select those:
xmaxindx[vapply(1:6,function(z){ ok <- xvals < z if(length(ok)) which(ok)[1] else na_integer_ },integer(1))] # <na> 1 2 2 2 2 # na 3 5 5 5 5 it handily reports values (in first row) along indices (second row).
the sapply way simpler , not slower:
xmaxindx[sapply(1:6,function(z) which(xvals < z)[1])] benchmarks. op's case not described, here benchmarks anyway:
# setup nicola <- function() max.col(outer(y,x,">"),ties.method="last")*na^(y<=min(x)) frank <- function(){ xvals <- unique(x) xmaxindx <- length(x) - match(xvals,rev(x)) + 1l xvals <- xvals[order(xmaxindx,decreasing=true)] xmaxindx <- xmaxindx[order(xmaxindx,decreasing=true)] xmaxindx[vapply(y,function(z){ ok <- xvals < z if(length(ok)) which(ok)[1] else na_integer_ },integer(1))] } beauvel <- function() vectorize(function(u) ifelse(length(which(x<u))==0,na,max(which(x<u))))(y) davida <- function() vapply(y, function(i) c(max(which(x < i)),na)[1], double(1)) hallo <- function(){ test <- vector("integer",length(y)) for(i in y){ test[i]<-max(which(x<i)) } test } josho <- function(){ xo <- sort(unique(x)) xi <- cummax(1l + length(x) - match(xo, rev(x))) xi[cut(y, c(xo, inf))] } require(microbenchmark) (@mrhallo's , @davidarenburg's throw bunch of warnings way have them written now, fixed.) here results:
> x <- sample(1:4,1e6,replace=true) > y <- 1:6 > microbenchmark(nicola(),frank(),beauvel(),davida(),hallo(),josho(),times=10) unit: milliseconds expr min lq mean median uq max neval nicola() 76.17992 78.01171 99.75596 98.43919 120.81776 127.63058 10 frank() 25.27245 25.44666 36.41508 28.44055 45.32306 73.66652 10 beauvel() 47.70081 59.47828 67.44918 68.93808 74.12869 95.20936 10 davida() 26.52582 26.55827 33.93855 30.00990 35.55436 57.24119 10 hallo() 26.58186 26.63984 32.68850 28.68163 33.54364 50.49190 10 josho() 25.69634 26.28724 37.95341 30.50828 47.90526 68.30376 10 there 20 warnings (use warnings() see them) > > > x <- sample(1:80,1e6,replace=true) > y <- 1:60 > microbenchmark(nicola(),frank(),beauvel(),davida(),hallo(),josho(),times=10) unit: milliseconds expr min lq mean median uq max neval nicola() 2341.96795 2395.68816 2446.60612 2481.14602 2496.77128 2504.8117 10 frank() 25.67026 25.81119 42.80353 30.41979 53.19950 123.7467 10 beauvel() 665.26904 686.63822 728.48755 734.04857 753.69499 784.7280 10 davida() 326.79072 359.22803 390.66077 397.50163 420.66266 456.8318 10 hallo() 330.10586 349.40995 380.33538 389.71356 397.76407 443.0808 10 josho() 26.06863 30.76836 35.04775 31.05701 38.84259 57.3946 10 there 20 warnings (use warnings() see them) > > > x <- sample(sample(1e5,1e1),1e6,replace=true) > y <- sample(1e5,1e4) > microbenchmark(frank(),josho(),times=10) unit: milliseconds expr min lq mean median uq max neval frank() 69.41371 74.53816 94.41251 89.53743 107.6402 134.01839 10 josho() 35.70584 37.37200 56.42519 54.13120 63.3452 90.42475 10 of course, comparisons might come out differently op's true case.
Comments
Post a Comment