regex - Using R, how do I test a data frame of phrases to see if it contains keywords?
I have two data frames. One contains search phrases, like this:

      search.phrases
    1 quick
    2 brown fox jumps
    3 on lazy
    5 dog
    6 why
    7 nobody knows
    ...

and another contains keywords:

      keywords
    1 quick
    2 lazy
    3 dog
    4 knows
    ...

Ideally, I want to find which search phrases contain one or more of the keywords (reported as either a boolean or a count):

      search.phrases   keyword.found
    1 quick            TRUE
    2 brown fox jumps  FALSE
    3 on lazy          TRUE
    5 dog              TRUE
    6 why              FALSE
    7 nobody knows     TRUE
    ...

I've been trying for a while and I'm stumped. Any help is appreciated.
lots of love
g
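You can use grepl(). Collapse the keywords into a single alternation pattern and test every phrase against it:

    # Build one pattern of the form "quick|lazy|dog|knows"
    rgx <- paste(as.character(df2$keywords), collapse = "|")
    # TRUE for each phrase that contains at least one keyword
    df$keyword.found <- grepl(rgx, df$search.phrases)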
Result:

       search.phrases keyword.found
    1       the quick          TRUE
    2 brown fox jumps         FALSE
    3       over lazy          TRUE
    5             dog          TRUE
    6             why         FALSE
    7    nobody knows          TRUE
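One caveat with a plain alternation pattern is that grepl() matches substrings, so a keyword like "dog" is also found inside a longer word. The sketch below shows one way to restrict matches to whole words with `\\b`. The vectors `keywords` and `phrases` are made up for illustration (the phrase "dogma" is not in the original data), and it assumes the keywords contain no regex metacharacters.

```r
# Hypothetical data for illustration; "dogma" is not part of the original post
keywords <- c("quick", "lazy", "dog", "knows")
phrases  <- c("the quick", "dogma", "nobody knows")

# Plain alternation matches substrings, so "dog" matches inside "dogma"
rgx_plain <- paste(keywords, collapse = "|")
plain <- grepl(rgx_plain, phrases)

# Wrapping the alternation in \\b( ... )\\b only matches whole words
rgx_word <- paste0("\\b(", paste(keywords, collapse = "|"), ")\\b")
whole <- grepl(rgx_word, phrases)

print(data.frame(phrases, plain, whole))
```

Here `plain` flags "dogma" while `whole` does not. Whether substring or whole-word matching is the right choice depends on the data.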
Data:

    df2 <- structure(list(keywords = structure(c(4L, 3L, 1L, 2L),
        .Label = c("dog", "knows", "lazy", "quick"), class = "factor")),
        .Names = "keywords", class = "data.frame",
        row.names = c("1", "2", "3", "4"))

    df <- structure(list(search.phrases = structure(c(5L, 1L, 4L, 2L, 6L, 3L),
        .Label = c("brown fox jumps", "dog", "nobody knows", "over lazy",
                   "the quick", "why"), class = "factor")),
        .Names = "search.phrases", class = "data.frame",
        row.names = c("1", "2", "3", "5", "6", "7"))
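The question also asks for a count rather than just a boolean. A minimal sketch of one way to get it is to test each keyword separately and sum the hits per phrase. This is not part of the original answer; it rebuilds the same data with plain character columns, and the column name `keyword.count` is an assumption chosen for illustration.

```r
# Same data as above, rebuilt with character columns for simplicity
df <- data.frame(
  search.phrases = c("the quick", "brown fox jumps", "over lazy",
                     "dog", "why", "nobody knows"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  keywords = c("quick", "lazy", "dog", "knows"),
  stringsAsFactors = FALSE
)

phrases  <- as.character(df$search.phrases)
keywords <- as.character(df2$keywords)

# Logical matrix: one row per phrase, one column per keyword.
# fixed = TRUE treats each keyword as a literal string, not a regex.
hits <- sapply(keywords, function(k) grepl(k, phrases, fixed = TRUE))

# Number of distinct keywords found in each phrase
df$keyword.count <- rowSums(hits)
# The boolean flag follows directly from the count
df$keyword.found <- df$keyword.count > 0

print(df)
```

Note that this counts how many distinct keywords appear in a phrase, not how many times each one occurs. It also matches substrings, as the single-pattern grepl() call does.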