
Recommendations for database with R [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.


Closed 2 years ago.


I am using R to run simulations using time series data. I have been using arrays to store data, but I need a less memory-intensive solution for storing data at intermediate steps in order to document the process. I am not a programmer, so I am looking for something relatively easy to set up on multiple platforms if possible (Windows, Mac, Linux). I also need to be able to directly call the database from R, since learning another language is not feasible now. Ideally, I would like to be able to read and write frequently to the database in a manner similar to an array, though I don't know if that is realistic. I will gladly sacrifice speed for ease of use, but I am willing to work to learn open source solutions. Any suggestions would be appreciated.


Quick comments:

  • R is good at this: as a language for programming with data, it has plenty of database interfaces.
  • There is an entire manual devoted to data import/export, and it has a section on relational databases, so start there.
  • R has the widely-used DBI package, which provides a unified interface for many backends, among them SQLite, MySQL, PostgreSQL, Oracle, ... Use that, maybe with RSQLite to get something going quickly; you can still switch backends afterwards (a minimal sketch follows this list).
  • There is also RODBC, but I find ODBC tedious to work with.
  • R also has a specialised variant in the TSdbi package by Paul Gilbert, which brings a DBI-like abstraction to time series databases. It also supports multiple backends.
  • The data.table package was also written with this kind of workload in mind and is very fast at indexing and aggregation.
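
As a minimal sketch of the unified DBI interface (assuming a writable working directory; the database file, table, and column names here are made up):

library(DBI)
library(RSQLite)

# open (or create) an on-disk SQLite database file
con <- dbConnect(RSQLite::SQLite(), "simulations.sqlite")

# write an intermediate result (a made-up data.frame) as a table
step1 <- data.frame(t = 1:100, value = rnorm(100))
dbWriteTable(con, "step1", step1, overwrite = TRUE)

# read it back later with plain SQL
back <- dbGetQuery(con, "SELECT t, value FROM step1 WHERE t <= 10")

dbDisconnect(con)

Switching to MySQL or PostgreSQL later mostly means changing the dbConnect() line.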


I also need to be able to directly call the database from R

I suggest setting up MySQL with the RMySQL interface.

Once the DB connection is open, you can query the database and get the data into R, for example:

# 'con' is assumed to be an open connection, e.g. created with
# con <- dbConnect(MySQL(), user = "...", password = "...", dbname = "...")

# Run an SQL statement by first creating a result-set object
rs <- dbSendQuery(con, statement = paste(
                      "SELECT w.laser_id, w.wavelength, p.cut_off",
                      "FROM WL w, PURGE p",
                      "WHERE w.laser_id = p.laser_id",
                      "ORDER BY w.laser_id"))
# we now fetch records from the result set into a data.frame
data <- fetch(rs, n = -1)   # extract all rows
dbClearResult(rs)           # free the result set when done
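
For one-off queries, dbGetQuery(con, statement) performs the send/fetch/clear cycle in a single call.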

RMySQL: R interface to the MySQL database

Database interface and MySQL driver for R. This version complies with the database interface definition as implemented in the package DBI 0.2-2.

MySQL Database:

Available for all the platforms you cited in the question, and more; it can be downloaded from the MySQL website.


Do you really need a database solution for your purpose? You say you want a "solution for storing data at intermediate steps" -- how about simply saving the data array to disk, e.g. with saveRDS(), at the required time points?

Edit: to make the information retrievable later, you can embed meta-information, e.g. a trial index and/or timestamp, in the filename. Later you can locate and load the file via that filename, as in the sketch below.
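
A minimal sketch of that approach, assuming your simulation loop produces an array arr and a counter trial (the filename scheme is made up):

# encode the trial index and a timestamp in the filename
fname <- sprintf("sim_trial%03d_%s.rds", trial,
                 format(Sys.time(), "%Y%m%d_%H%M%S"))
saveRDS(arr, file = fname)

# ... later: locate and reload all intermediate steps of trial 7
files <- list.files(pattern = "^sim_trial007_.*\\.rds$")
arrs  <- lapply(files, readRDS)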


You can also take a look at the ff package, which keeps large arrays and data frames on disk while letting you index them much like ordinary R objects.
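
A rough sketch (the dimensions are arbitrary; ff stores the data in a temporary file by default):

library(ff)

# create a disk-backed numeric array; only the accessed chunks are held in RAM
x <- ff(vmode = "double", dim = c(10000, 100))

# read and write with ordinary array indexing
x[1, ] <- rnorm(100)
x[1, 1:5]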

