c++ - How to effectively combine a list of NumericVectors into one large NumericVector?


I wrote the following Rcpp code. It compiles, but it is not as fast as I expected.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector combine_list_to_vec(const Rcpp::List& list) {
  int list_size = list.size();
  int large_vec_size = 0;
  IntegerVector start_index(list_size);
  IntegerVector end_index(list_size);
  // First pass: record where each vector starts and ends in the output
  for (int ii = 0; ii < list_size; ii++) {
    NumericVector vec = list[ii];
    start_index[ii] = large_vec_size;
    large_vec_size += vec.size();
    end_index[ii] = large_vec_size - 1;
  }
  NumericVector large_vec(large_vec_size);  // create the object after getting its size
  // Second pass: copy each vector element by element
  for (int ii = 0; ii < list_size; ii++) {
    int current_start_index = start_index[ii];
    NumericVector vec = list[ii];
    for (int jj = 0; jj < vec.size(); jj++) {
      large_vec[jj + current_start_index] = vec[jj];
    }
  }
  return large_vec;
}

The input variable 'list' contains a bunch of NumericVectors, and I want to combine them into one large vector with a '...tail - head - tail...' structure. The start_index and end_index variables are used to facilitate the copy.
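To make the index bookkeeping concrete, here is a minimal standalone sketch of how the start and end positions fall out of the vector lengths. It is plain C++ rather than Rcpp so that it compiles without R; the `Span` struct and the `compute_spans` helper are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inclusive [start, end] positions of one input vector inside the combined vector.
struct Span {
    std::ptrdiff_t start;
    std::ptrdiff_t end;  // equals start - 1 when the input vector is empty
};

// Compute where each input vector lands in the combined output.
// `sizes` holds the length of each input vector (hypothetical helper input).
std::vector<Span> compute_spans(const std::vector<std::size_t>& sizes) {
    std::vector<Span> spans;
    spans.reserve(sizes.size());
    std::ptrdiff_t running_total = 0;
    for (std::size_t size : sizes) {
        const std::ptrdiff_t start = running_total;
        running_total += static_cast<std::ptrdiff_t>(size);
        spans.push_back(Span{start, running_total - 1});
    }
    assert(spans.size() == sizes.size());
    return spans;
}
```

For input lengths 3, 2 and 4, this yields the spans [0, 2], [3, 4] and [5, 8], so each vector's head sits directly after the previous vector's tail.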

The microbenchmark test gives the following information for a specific example:

x = list();
x[[1]] = runif(1e6);  x[[2]] = runif(1e6);
x[[3]] = runif(1e6);  x[[4]] = runif(1e6);
x[[5]] = runif(1e6);  x[[6]] = runif(1e6);
x[[7]] = runif(1e6);  x[[8]] = runif(1e6);
x[[9]] = runif(1e6);  x[[10]] = runif(1e6);
microbenchmark(combine_list_to_vec(x) -> y)

# Unit: milliseconds
#                         expr       min        lq       mean    median        uq       max neval
#  y <- combine_list_to_vec(x) 84.166964 84.587516 89.9520601 84.728212 84.871673 349.33234   100

Another way I tried is to call the external R function do.call(c, x):

// [[Rcpp::export]]
List combine_list_to_vec(const Rcpp::List& list) {
  int list_size = list.size();
  int large_vec_size = 0;
  IntegerVector start_index(list_size);
  IntegerVector end_index(list_size);
  for (int ii = 0; ii < list_size; ii++) {
    NumericVector vec = list[ii];
    start_index[ii] = large_vec_size;
    large_vec_size += vec.size();
    end_index[ii] = large_vec_size - 1;
  }
  // Delegate the actual concatenation to the R function sub_do_call
  NumericVector large_vec = internal::convert_using_rfunction(list, "sub_do_call");
  List rtn = List::create(large_vec, start_index, end_index);
  return rtn;
}

# The following code lives in the R code instead of Rcpp
sub_do_call <- function (x) {
  return (do.call(c, x));
}

This is about 4 times faster than the previous code. Is there a way I can speed up the combination operation using pointers or other tools in Rcpp and/or RcppArmadillo, or code do.call(c, x) in Rcpp instead of calling it externally? Thank you.

If I understand correctly, you're basically asking, "How can I write base::unlist in Rcpp?" And, since base::unlist is a .Internal function (it has a C implementation), it's unlikely you'll be able to do better with Rcpp.

But, let's try anyway, for fun. Here's an implementation I would use that's similar to yours, but should be cheaper since we use std::copy rather than re-indexing on every iteration:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector combine(const List& list)
{
   std::size_t n = list.size();

   // Figure out the length of the output vector
   std::size_t total_length = 0;
   for (std::size_t i = 0; i < n; ++i)
      total_length += Rf_length(list[i]);

   // Allocate the vector (no_init skips zero-filling, since every slot is overwritten)
   NumericVector output = no_init(total_length);

   // Loop and fill
   std::size_t index = 0;
   for (std::size_t i = 0; i < n; ++i)
   {
      NumericVector el = list[i];
      std::copy(el.begin(), el.end(), output.begin() + index);

      // Update the index
      index += el.size();
   }

   return output;

}

/*** R
library(microbenchmark)
x <- replicate(10, runif(1e6), simplify = FALSE)
identical(unlist(x), combine(x))
microbenchmark(
   unlist(x),
   combine(x)
)
*/

Running this code gives me:

> Rcpp::sourceCpp('c:/users/kevin/scratch/combine.cpp')

> library(microbenchmark)

> x <- replicate(10, runif(1e6), simplify = FALSE)

> identical(unlist(x), combine(x))
[1] TRUE

> microbenchmark(
+    unlist(x),
+    combine(x)
+ )
Unit: milliseconds
       expr      min       lq     mean   median       uq      max neval
  unlist(x) 21.89620 22.43381 29.20832 23.14454 35.32135 68.09562   100
 combine(x) 20.96225 21.55827 28.13269 22.08985 24.13403 51.68660   100

So, pretty much the same. We gain a tiny bit of time because we don't do any type checking (which means this code blows up if we don't have a list containing only numeric vectors), but it should at least be illustrative of the fact that we really can't do much better here.
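The pattern behind combine() (measure the total length, allocate once, then block-copy each piece to its offset) does not depend on R. As a rough sketch, here is the same idea in plain C++, assuming std::vector stands in for both List and NumericVector; `combine_chunks` is a hypothetical name, not part of Rcpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Concatenate several vectors into one, preserving order.
// Assumption: std::vector replaces Rcpp's List/NumericVector so this builds without R.
std::vector<double> combine_chunks(const std::vector<std::vector<double>>& chunks) {
    // Pass 1: total output length.
    std::size_t total_length = 0;
    for (const auto& chunk : chunks) total_length += chunk.size();

    // Allocate once. Unlike Rcpp's no_init, std::vector zero-fills here.
    std::vector<double> output(total_length);

    // Pass 2: block-copy each chunk to its offset instead of indexing per element.
    std::size_t index = 0;
    for (const auto& chunk : chunks) {
        std::copy(chunk.begin(), chunk.end(), output.begin() + index);
        index += chunk.size();
    }
    assert(index == total_length);
    return output;
}
```

For trivially copyable element types such as double, standard library implementations typically lower std::copy to a single memmove-style call, which is the likely reason it beats an element-by-element loop.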

(The only exception, I guess, would be for huge vectors, where parallel processing might be helpful.)
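As a rough illustration of what the parallel variant could look like, here is a plain C++ sketch that precomputes each chunk's offset and then copies every chunk on its own thread. This is an assumption-laden sketch, not a measured improvement: std::vector stands in for the Rcpp types, `combine_chunks_parallel` is a hypothetical name, and one thread per chunk is chosen for brevity rather than performance. Since the copy is mostly limited by memory bandwidth, the gain may well be small.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Concatenate chunks using one thread per chunk.
// Assumption: std::vector replaces Rcpp's List/NumericVector so this builds without R.
std::vector<double> combine_chunks_parallel(const std::vector<std::vector<double>>& chunks) {
    // Serial pass: compute each chunk's start offset in the output.
    std::vector<std::size_t> offsets(chunks.size());
    std::size_t total_length = 0;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        offsets[i] = total_length;
        total_length += chunks[i].size();
    }

    std::vector<double> output(total_length);
    assert(output.size() == total_length);

    // Parallel pass: every thread writes a disjoint range, so no locking is needed.
    std::vector<std::thread> workers;
    workers.reserve(chunks.size());
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        workers.emplace_back([&chunks, &output, &offsets, i] {
            std::copy(chunks[i].begin(), chunks[i].end(), output.begin() + offsets[i]);
        });
    }
    for (std::thread& worker : workers) worker.join();

    return output;
}
```

In an actual Rcpp function the worker threads should not call into the R API, which is not thread-safe; they would need to operate on raw data pointers obtained beforehand, or go through a package such as RcppParallel.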

