Retrieving GWAS information with R

I am trying to get specific disease-related information from the GWAS Catalog. This can be done directly from the website via a spreadsheet download, but I was wondering whether I could do it programmatically in R. Any suggestions will be greatly appreciated.

Thanks.

Avoks


Check out the function download.file() and the package RCurl (http://cran.r-project.org/web/packages/RCurl/index.html); these should do what you are looking for.
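As a rough sketch of the download.file() route: the helper name fetch_to_disk and the placeholder URL below are made up for illustration, and the real export link has to be copied from the GWAS Catalog downloads page.

```r
# Download a file to disk and return the local path.
# `fetch_to_disk` is a hypothetical helper name, not part of any package.
fetch_to_disk <- function(url, destfile = tempfile(fileext = ".tsv")) {
  # download.file() returns 0 (invisibly) on success.
  status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  destfile
}

# Usage (placeholder URL, replace with the link from the downloads page):
# tsv_path <- fetch_to_disk("https://example.org/path/to/gwas_catalog.tsv")
```

Writing to a temporary file by default keeps the working directory clean; pass destfile explicitly if you want to keep the download.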


A workaround is to download the .tsv file(s) first and edit them before loading. This is because GWAS Catalog files contain HTML character references, like &#x000A7 in "Behçet's disease" (encoding that special fourth letter). By default, read.table() treats # as the start of a comment and ignores the rest of the line, so the row comes up short and you get an error message, e.g.:

line 2028 did not have 34 elements

So you download it first, open it in a plain text editor, replace every # with an empty string, and only then load it into R with:

read.table("gwas_catalog_v1.0-associations_e91_r2018-02-21.tsv", sep = "\t", header = TRUE, stringsAsFactors = FALSE, quote = "")
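The same clean-up can probably be done without leaving R. The sketch below uses a toy file with made-up columns and rows (not real catalog data) and shows two options: deleting every # programmatically, or leaving the file as it is and passing comment.char = "" so that read.table() stops treating # as a comment marker.

```r
# Toy stand-in for a GWAS Catalog export (hypothetical columns and rows).
toy_lines <- c(
  "trait\tsnp\tpvalue",
  "Beh&#x000A7;et's disease\trs0000001\t1e-8",
  "Asthma\trs0000002\t5e-9"
)
raw_path <- tempfile(fileext = ".tsv")
writeLines(toy_lines, raw_path)

# Option 1: mirror the manual edit by deleting every "#" before parsing.
clean_path <- tempfile(fileext = ".tsv")
writeLines(gsub("#", "", readLines(raw_path), fixed = TRUE), clean_path)
gwas_clean <- read.table(clean_path, sep = "\t", header = TRUE,
                         stringsAsFactors = FALSE, quote = "")

# Option 2: leave the file untouched and switch off comment handling.
gwas_raw <- read.table(raw_path, sep = "\t", header = TRUE,
                       stringsAsFactors = FALSE, quote = "",
                       comment.char = "")

# Keep only the rows for one disease (pattern chosen for the toy data).
behcet <- gwas_raw[grepl("^Beh", gwas_raw$trait), ]
```

Option 2 keeps the character references intact, which matters if you later want to decode them back into the original letters.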