Insert line breaks in long string -- word wrap
Here is a function I wrote to break a long string into lines not longer than a given length
strBreakInLines <- function(s, breakAt=90, prepend="") {
words <- unlist(strsplit(s, " "))
if (length(words)<2) return(s)
wordLen <- unlist(Map(nchar, words))
lineLen <- wordLen[1]
res <- words[1]
lineBreak <- paste("\n", prepend, sep="")
for (i in 2:length(words)) {
lineLen <- lineLen+wordLen[i]
if (lineLen < breakAt)
res <- paste(res, words[i], sep=" ")
else {
res <- paste(res, words[i], sep=lineBreak)
lineLen <- 开发者_Python百科0
}
}
return(res)
}
It works for the problem I had; but I wonder if I can learn something here. Is there a shorter or more efficient solution, especially can I get rid of the for loop?
How about this:
gsub('(.{1,90})(\\s|$)', '\\1\n', s)
It will break string "s" into lines with maximum 90 chars (excluding the line break character "\n", but including inter-word spaces), unless there is a word itself exceeding 90 chars, then that word itself will occupy a whole line.
By the way, your function seems broken --- you should replace
lineLen <- 0
with
lineLen <- wordLen[i]
For the sake of completeness, Karsten W.'s comment points at strwrap
, which is the easiest function to remember:
strwrap("Lorem ipsum... you know the routine", width=10)
and to match exactly the solution proposed in the question, the string has to be pasted afterwards:
paste(strwrap(s,90), collapse="\n")
This post is deliberately made community wiki since the honor of finding the function isn't mine.
For further completeness, there's:
stringi::stri_wrap
stringr::str_wrap
(which just ultimately callsstringi::stri_wrap
The stringi
version will deal with character sets better (it's built on the ICU library) and it's in C/C++ so it'll ultimately be faster than base::strwrap
. It's also vectorized over the str
parameter.
You can look at e.g. the write.dcf()
FUNCTION in R itself; it also uses a loop so nothing to be ashamed of here.
The first goal is to get it right --- see Chambers (2008).
精彩评论