regex - Extract multiple instances of a pattern from a string in R


I have a character vector t as follows.

    t <- c("gid456 spk711", "gid456 gid667 vink", "gid45345 dnp990 gid2345",
           "gid895 gid895 k350")

I would like to extract all the strings starting with gid and followed by a sequence of digits.

This works, but it does not retrieve multiple instances.

    gsub(".*(gid\\d+).*", "\\1", t)

    ## [1] "gid456"  "gid667"  "gid2345" "gid895"
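The result above has exactly one value per input element because the pattern matches the entire string and replaces it with a single captured group. The leading `.*` is greedy, so the group ends up holding the last gid in each element. A minimal sketch of that behaviour, using a hypothetical vector `x` that is not part of the original question:

```r
# Hypothetical input, chosen to make it obvious which match survives
x <- c("gid1 gid2 gid3", "no match here")

# Greedy leading .* consumes as much as it can, so the LAST gid is captured;
# an element without any match is returned unchanged
last_only <- sub(".*(gid\\d+).*", "\\1", x)

# A lazy leading .*? (run with perl = TRUE here) keeps the FIRST gid instead
first_only <- sub(".*?(gid\\d+).*", "\\1", x, perl = TRUE)

print(last_only)
print(first_only)
```

Either way, a substitution can return at most one string per element, which is why an extraction function is a better fit for collecting every match.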

How can I extract all the strings in this case? The desired output is as follows.

    out <- c("gid456", "gid456", "gid667", "gid45345", "gid2345",
             "gid895", "gid895")

Here's an approach using a package I maintain, qdapRegex (I prefer this or stringi/stringr to base for consistency and ease of use). I also show a base approach. In any event, I'd look at this more as an "extraction" problem than a subbing problem.

    y <- c("gid456 spk711", "gid456 gid667 vink", "gid45345 dnp990 gid2345",
           "gid895 gid895 k350")

    library(qdapRegex)
    unlist(ex_default(y, pattern = "gid\\d+"))

    ## [1] "gid456"   "gid456"   "gid667"   "gid45345" "gid2345"  "gid895"   "gid895"

In base R:

unlist(regmatches(y, gregexpr("gid\\d+", y))) 
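To see what the base R one-liner is doing, here is a sketch that splits it into steps. The vector `y` is redefined so the block runs on its own, and the intermediate names `matches` and `per_element` are illustrative only:

```r
y <- c("gid456 spk711", "gid456 gid667 vink", "gid45345 dnp990 gid2345",
       "gid895 gid895 k350")

# gregexpr() records the position and length of every match in each element
matches <- gregexpr("gid\\d+", y)

# regmatches() pulls out the matched substrings, one character vector per element
per_element <- regmatches(y, matches)

# unlist() flattens the list into a single character vector
out <- unlist(per_element)
print(out)

# lengths() shows how many matches each input element contributed
print(lengths(per_element))
```

Keeping `per_element` as a list is useful when the matches need to stay tied to the element they came from. An element with no match yields `character(0)`, which `unlist()` silently drops.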
