data frame usage in R

I have a .csv file which I have read into R as a data frame (say df). The first column is a date in mm/dd/yyyy format. The second column is a double number. What I want to do is to create a new data frame like:

df2<-data.frame(date=c(df[10,1],df[15,2]),num=c(111,222))

When I try to do this I get a very messy df2. Most probably I am doing it wrong because I do not understand the data frame concept.

Whenever I try to do df[10,1], the output is the 10th row and 1st column of df, including all the levels of column 1.


You can control how R interprets the classes of the data being read in by passing a vector of column classes to read.table through the colClasses argument. Otherwise R uses type.convert, which converts each character vector to whatever type it judges most appropriate (logical, integer, numeric, complex, or, for text that fits none of those, a factor or character vector). That has some potential quirks if you aren't familiar with the rules.
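As a minimal sketch of the colClasses approach, the snippet below reads a small inline table instead of a real file. The column names date and num and the sample values are assumptions made for illustration; they are not taken from the original data.

```r
# Hypothetical inline data standing in for the .csv file
csv_text <- "date,num
01/15/2011,1.5
02/20/2011,2.75
03/05/2011,3.25"

# Read the first column as character and the second as numeric,
# so no column is turned into a factor
df <- read.csv(text = csv_text, colClasses = c("character", "numeric"))

# Inspect the resulting column classes
str(df)
```

With a real file, the text argument would be replaced by the file name, and colClasses needs one entry per column.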

You can also prevent R from creating factors by specifying stringsAsFactors = FALSE as an argument to read.table; this is generally easier than specifying all of the colClasses. (Since R 4.0.0, stringsAsFactors = FALSE is the default.)
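To see why the levels show up when indexing a single cell, here is a small sketch comparing the two settings. The inline data and the column names are again assumptions for illustration only.

```r
# Hypothetical inline data standing in for the .csv file
csv_text <- "date,num
01/15/2011,1.5
02/20/2011,2.75
03/05/2011,3.25"

# With stringsAsFactors = TRUE the date column becomes a factor,
# so printing a single cell also lists every level of the column
df_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
df_factor[2, 1]

# With stringsAsFactors = FALSE the same cell is a plain character string
df_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)
df_char[2, 1]
```

Combining factor cells with c() is also a likely source of the messy df2: depending on the R version, the result can be the underlying integer codes rather than the date text.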

You can parse the date with strptime(). Taking all of this into consideration, I would recommend reading your data into R without turning character data into factors and then using strptime to convert the date column.

df <- read.csv("myFile.csv", stringsAsFactors = FALSE)
#Convert time to proper time format
df$time <- strptime(df$time, "%m/%d/%Y")
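Putting the pieces together, the new data frame from the question might be built roughly as follows. This is only a sketch: it uses as.Date rather than strptime (a swapped-in choice, since a Date column is simple to store in a data frame), it assumes the first column is named date, and it picks rows 1 and 3 of a three-row sample instead of rows 10 and 15.

```r
# Hypothetical inline data standing in for the .csv file
csv_text <- "date,num
01/15/2011,1.5
02/20/2011,2.75
03/05/2011,3.25"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Parse the mm/dd/yyyy strings into Date objects
df$date <- as.Date(df$date, format = "%m/%d/%Y")

# Take the dates from rows 1 and 3 and pair them with new numbers
df2 <- data.frame(date = df$date[c(1, 3)], num = c(111, 222))
df2
```

Indexing the column with a vector of row numbers keeps the Date class intact and avoids combining individual cells by hand.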


If you don't want to type out stringsAsFactors=FALSE each time you read in or construct a data frame, you can set the option once at the outset:

 options(stringsAsFactors=FALSE)