regex - Extract multiple instances of a pattern from a string in R
I have a character vector t as follows.
t <- c("gid456 spk711", "gid456 gid667 vink", "gid45345 dnp990 gid2345", "gid895 gid895 k350")
I want to extract the strings starting with gid, followed by a sequence of digits.
This works, but it does not retrieve multiple instances: it returns only one match per element.
gsub(".*(gid\\d+).*", "\\1", t)
[1] "gid456" "gid667" "gid2345" "gid895"
How can I extract all of the strings in this case? The desired output is as follows.
out <- c("gid456", "gid456", "gid667", "gid45345", "gid2345", "gid895", "gid895")
Here's an approach using a package I maintain, qdapRegex (I prefer this or stringi/stringr to base for consistency and ease of use). I also show a base approach. In any event, I'd look at this more as an "extraction" problem than a subbing problem.
y <- c("gid456 spk711", "gid456 gid667 vink", "gid45345 dnp990 gid2345", "gid895 gid895 k350")
library(qdapRegex)
unlist(ex_default(y, pattern = "gid\\d+"))
## [1] "gid456" "gid456" "gid667" "gid45345" "gid2345" "gid895" "gid895"
In base R:
unlist(regmatches(y, gregexpr("gid\\d+", y)))
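To see what the base R one-liner is doing, it may help to split it into steps. This is only a sketch using the same y and pattern as above; the intermediate names m and per_element are mine, not part of the original answer.

```r
y <- c("gid456 spk711", "gid456 gid667 vink",
       "gid45345 dnp990 gid2345", "gid895 gid895 k350")

# gregexpr() locates every match in each element
# (regexpr() would stop at the first match per element).
m <- gregexpr("gid\\d+", y)

# regmatches() pulls out the matched substrings and returns a list
# with one character vector per element of y.
per_element <- regmatches(y, m)

# unlist() flattens that list into a single character vector.
out <- unlist(per_element)
out

# Number of matches found in each element of y.
lengths(per_element)
```

Keeping per_element around is handy if you need to know which input string each match came from. An element with no match yields character(0), which unlist() silently drops.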