I have a vector of words:
# the text
text <- c("R is a very essential tool for data analysis. While it
is regarded as domain specific, it is a very complete programming
language. Almost certainly, many people who would benefit from
using R, do not use it")
# split the text into a vector of words using the stringr package
text <- unlist( stringr::str_match_all(text , '\\w+\\b') )
text
[1] "R" "is" "a" "very" "essential" "tool" "for"
[8] "data" "analysis" "While" "it" "is" "regarded" "as"
[15] "domain" "specific" "it" "is" "a" "very" "complete"
[22] "programming" "language" "Almost" "certainly" "many" "people" "who"
[29] "would" "benefit" "from" "using" "R" "do" "not"
[36] "use" "it"
I want to find the word "using" in it:
text[text=="using"]
[1] "using"
That works and the word is found, but if I change the case slightly:
text[text=="Using"]
character(0)
the word is no longer found.
Question: how can I make the word search case-insensitive?
It is enough to convert the words in the vector to the same case:
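A minimal sketch of that idea, assuming base R's tolower() is applied to both the vector and the search term. The names words, target and found are illustrative, and words is a shortened stand-in for the text vector from the question:

```r
# Illustrative word vector (a shortened stand-in for `text` above)
words <- c("R", "is", "a", "very", "essential", "tool", "Almost",
           "benefit", "from", "using", "R", "do", "not", "use", "it")
target <- "Using"

# Lower-case both sides so that the comparison ignores case
found <- words[tolower(words) == tolower(target)]
found
# [1] "using"
```

toupper() would work just as well; what matters is that both sides of == are in the same case.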
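You can also use the function grep with the following arguments:
- ignore.case=TRUE: ignore case
- value=TRUE: return the values of the vector rather than the positions of the matched words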
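A short sketch of such a call, using a small illustrative vector (words, by_value and by_position are names chosen here, not taken from the answer):

```r
# Illustrative word vector
words <- c("benefit", "from", "using", "R", "do", "not", "use", "it")

# value = TRUE returns the matching elements themselves
by_value <- grep("Using", words, ignore.case = TRUE, value = TRUE)
by_value
# [1] "using"

# Without value = TRUE, grep() returns the positions of the matches
by_position <- grep("Using", words, ignore.case = TRUE)
by_position
# [1] 3
```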
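Update: grep looks for occurrences of the pattern anywhere within an element, so searching for "it" returns a result that is not quite correct, because words that merely contain "it" (such as "benefit") match as well. In that case the function regexpr works better: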
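The sketch below first reproduces the substring problem and then shows one possible way to use regexpr for a whole-word match. This is not necessarily the exact code the answer had in mind: the filtering on match position and length is an assumption made here, and the vector and variable names are illustrative.

```r
# Illustrative word vector
words <- c("it", "is", "specific", "benefit", "It", "using")

# grep() matches the pattern anywhere inside an element,
# so "benefit" is returned as well
substring_hits <- grep("it", words, ignore.case = TRUE, value = TRUE)
substring_hits
# [1] "it"      "benefit" "It"

# regexpr() returns the start of the first match in each element (-1 if none)
# and stores the length of that match in the "match.length" attribute
m <- regexpr("it", words, ignore.case = TRUE)
start <- as.integer(m)
len <- attr(m, "match.length")

# Keep only elements where the match starts at position 1
# and covers the whole word
whole_word_hits <- words[start == 1 & len == nchar(words)]
whole_word_hits
# [1] "it" "It"
```

An anchored pattern such as "^it$" passed to grep with ignore.case=TRUE and value=TRUE gives the same whole-word result with less code.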