Can I gracefully include formatted SQL strings in an R script?
I'm working in an R script that uses a long SQL string, and I would like to keep the query relatively free of other markup so as to allow copying and pasting between editors and applications. I'd also like the ability to split the query across lines for better readability.
In the RODBC documentation, the paste
function is used to build the query out of separate chunks, but I'd prefer something less kludgy and with fewer quotes and comma开发者_StackOverflow中文版s. Thanks for your help.
If you're an old C programmer from way back, as I am, you might enjoy just using sprintf().
Borrowing Ian's example:
y<-"y1"
x<-"somethingorother"
query <- sprintf(
'SELECT DISTINCT x AS %s,
y AS %s,
FROM tbl
WHERE id=%%s
AND num=%%d', x, y)
yields:
> cat(query,"\n")
SELECT DISTINCT x AS somethingorother,
y AS y1,
FROM tbl
WHERE id=%s
AND num=%d
you can override the %+% operator to have better string concatination syntax:
'%+%' <- function(x,y) paste(x,y,sep="")
y<-"y1"
x<-"somethingorother"
query<-
'SELECT DISTINCT x AS ' %+% x %+%',\n' %+%
' y AS ' %+% y %+% '\n' %+%
' FROM tbl
WHERE id=%s
AND num=%d'
cat(query,"\n")
yields:
> cat(query,"\n")
SELECT DISTINCT x AS somethingorother,
y AS y1
FROM tbl
WHERE id=%s
AND num=%d
A graceful way of "including" a long SQL query is to keep it in a separate .sql
file. Preferably somewhere it can be syntax highlighted, a text file in RStudio will do the job. You can then in your main R script read the file into a string and populate it with variables using one of the many "named" sprintf
-type solutions, such as infuser.
.sql
select *
from mytable
where id = {{a}}
and somevar = {{b}}
.R
library(readr)
library(infuser)
query <- read_file("query.sql") %>%
infuse(a = 1, b = 2)
I'd recommend just using a plain string, and not embedding variable values into it. Use placeholders instead.
sql <- "SELECT foo FROM bar
WHERE col1 = ?
AND col2 = ?
ORDER BY yomama"
I'm not sure if the double-quote is the best way to embed multi-line strings in R code (is there something like here-docs?), but it does work, unlike in Java.
Is there some reason you don't want to send "\n"
or "\t"
to your database? They should be fine in the SQL.
I've ended up simply hitting the sql string with sql <- gsub("\n","",sql)
and sql <- gsub("\t","",sql)
before running it. The string itself can be as long as it needs to be, but stays free of any concatenation markup.
You can now do this easily using glue, allowing you to write code like
glue_sql("
SELECT {`var`}
FROM {`tbl`}
WHERE {`tbl`}.sepal_length > {num}
AND {`tbl`}.species = {val}
", .con = con)
精彩评论