Call data.frame columns inside of R functions?


What is the proper way to do this?

I have a function that works great on its own given a series of inputs, and I'd like to use it on a large dataset rather than on single values, by looping through the data row by row. I have tried to update the function to take data.frame columns rather than vector values, but have been unsuccessful.

A simple example of this is:

Let's say I have a data.frame with 4 columns: data$id, data$height, data$weight, data$gender. I want to write a function that will loop over each row (using apply) and calculate BMI (kg/m^2). I know this would be easy with dplyr, but I would like to learn how to do it without resorting to external packages, and I can't find a clear answer on how to reference the columns within the function.

I apologize in advance if this is a duplicate. I've been searching Stack Overflow pretty thoroughly in hopes of finding an existing example.

I think this is what you're looking for. The easiest way to refer to columns of a data frame functionally is to use quoted column names. In principle, what you're doing is this:

data[, "weight"] / data[, "height"]^2 

But inside a function you might want to let the user specify that the height or weight column is named differently, so you can write your function as

add_bmi = function(data, height_col = "height", weight_col = "weight") {
    # BMI is weight (kg) divided by height (m) squared
    data$bmi = data[, weight_col] / data[, height_col]^2
    return(data)
}

This function will assume that the columns to use are named "height" and "weight" by default, but the user can specify other names if necessary. You could write a similar solution using column indices instead, but using names tends to be easier to debug.
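As a rough sketch of how a function like this might be called, here it is with the default column names, with custom names, and with column indices. The data frames `patients` and `survey` and the column names `h_m` and `w_kg` are made up for illustration, and this version uses `[[` rather than `[, ]`, which gives the same result for a plain data.frame.

```r
# Hypothetical example data: heights in metres, weights in kilograms.
patients <- data.frame(
  id     = 1:3,
  height = c(1.60, 1.75, 1.82),
  weight = c(55, 70, 90)
)

# Column names are passed as strings, so `[[` is used instead of `$`.
add_bmi <- function(data, height_col = "height", weight_col = "weight") {
  data$bmi <- data[[weight_col]] / data[[height_col]]^2
  data
}

# Default column names; the arithmetic is vectorised over all rows,
# so no explicit row loop or apply() is needed.
patients <- add_bmi(patients)

# The same function on a data frame whose columns are named differently.
survey <- data.frame(h_m = c(1.60, 1.75), w_kg = c(55, 70))
survey <- add_bmi(survey, height_col = "h_m", weight_col = "w_kg")

# Index-based equivalent: it works, but breaks silently if the column
# order ever changes, which is why names are easier to debug.
patients$bmi_by_index <- patients[[3]] / patients[[2]]^2
```

Because the division and the power are applied to whole columns at once, the result has one BMI value per row without any looping.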

Functions this simple are rarely useful. If you're calculating BMI for a lot of datasets, maybe it is worth keeping this function around, but since it is a one-liner in base R you probably don't need it.

my_data$bmi = with(my_data, weight / height^2) 

One note is that using column names stored in variables means you can't use $. This is the price we pay for making things more programmatic, and it's a good habit to form for such applications. See fortunes::fortune(343):

Sooner or later most R beginners are bitten by this all too convenient shortcut. As an R newbie, think of R as your bank account: overuse of $-extraction can lead to undesirable consequences. It's best to acquire the '[[' and '[' habit early.

-- Peter Ehlers (about the use of $-extraction), R-help (March 2013)
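To make the point about $ concrete, here is a small sketch of what happens when the column name lives in a variable. The data frame `df` and the variable `col` are invented for the example.

```r
df <- data.frame(height = c(1.60, 1.75), weight = c(55, 70))
col <- "height"

# `$` does not look inside the variable: it searches for a column
# literally called "col", finds none, and returns NULL.
via_dollar <- df$col

# `[[` and `[` evaluate their argument, so the variable works as expected.
via_double <- df[[col]]   # numeric vector
via_single <- df[col]     # one-column data.frame

# `$` also partial-matches names on a data.frame, which can hide typos;
# `[[` matches exactly by default and returns NULL instead.
partial_dollar <- df$hei
partial_double <- df[["hei"]]
```

Here `via_dollar` and `partial_double` are both NULL, while `partial_dollar` quietly returns the height column.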

For fancier usage like dplyr's, where you don't have to quote column names and such (and can evaluate expressions), the lazyeval package makes things relatively painless and has nice vignettes.

The base function with can be used to do some lazy evaluating, e.g.,

with(mtcars, plot(disp, mpg))    # nice
plot(mtcars$disp, mtcars$mpg)    # same data, spelled out with $

But with is best used interactively and in straightforward scripts. If you are writing programmatic production code (e.g., your own R package), it's safer to avoid non-standard evaluation. See, for example, the warning in ?subset, a base R function that uses non-standard evaluation.
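One way the kind of problem that warning has in mind can show up is sketched below. The function names `keep_above_nse` and `keep_above_se` and the data frame are hypothetical; the trap is a column that happens to share its name with a function argument.

```r
# Hypothetical data: note the column that is also called `threshold`.
df <- data.frame(x = 1:5, threshold = 10)

# Non-standard evaluation: inside subset(), `threshold` is looked up in
# the data frame first, so the column (all 10s) shadows the argument.
keep_above_nse <- function(data, threshold) {
  subset(data, x > threshold)
}

# Standard evaluation: `threshold` can only mean the function argument.
keep_above_se <- function(data, threshold) {
  data[data$x > threshold, , drop = FALSE]
}

n_nse <- nrow(keep_above_nse(df, 2))  # compares x against the column
n_se  <- nrow(keep_above_se(df, 2))   # compares x against 2
```

The first version returns no rows because x is never greater than 10, while the second returns the three rows where x is greater than 2.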

