data frame usage in R
I have a .csv file which I have read into R as a data frame (say df).
The first column is a date in mm/dd/yyyy format. The second column is a double. What I want to do is create a new data frame like:
df2<-data.frame(date=c(df[10,1],df[15,2]),num=c(111,222))
When I try this, I get a very messy df2. Most probably I am doing it wrong because I do not understand the data frame concept.
Whenever I run df[10,1], the output is the value in the 10th row and 1st column of df, followed by all the levels of column 1.
You can control how R interprets the classes of the data being read in by passing a vector of column classes to read.table through the colClasses argument. Otherwise R uses type.convert, which converts each character vector in a "logical" fashion, according to R's own definition of logical. That has some quirks if you aren't familiar with them: in your case the date column has been turned into a factor, which is why the levels are printed along with the value.
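As a rough sketch of the colClasses approach, the snippet below reads a small inline CSV and forces the first column to character and the second to numeric. The CSV content and the column names date and num are made up for illustration; they are not taken from the original file.

```r
# Hypothetical two-column CSV, supplied inline so the example is self-contained
csv_text <- "date,num
01/15/2011,1.5
02/20/2011,2.25"

# Read the first column as character and the second as numeric,
# instead of letting type.convert decide
df <- read.csv(text = csv_text, colClasses = c("character", "numeric"))

# Inspect the resulting column classes
str(df)
```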
You can also prevent R from creating factors by specifying stringsAsFactors = FALSE as an argument to read.table. This is generally easier than specifying all of the colClasses. (From R 4.0.0 onward, stringsAsFactors = FALSE is already the default.)
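To see the difference this makes, here is a small sketch that reads the same made-up CSV text twice, once with factors and once without. The data and the column names date and num are assumptions for illustration only.

```r
# Hypothetical CSV content, supplied inline
csv_text <- "date,num
01/15/2011,1.5
02/20/2011,2.25"

as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

as_factor[2, 1]   # prints the value followed by a "Levels:" line
as_char[2, 1]     # prints only the character string

# If the column is already a factor, convert it explicitly before
# combining values with c(); older R versions return the integer
# codes of a factor there rather than the original strings
as.character(as_factor[2, 1])
```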
You can parse the date with strptime(). Taking all of this into consideration, I would recommend reading your data into R without turning character data into factors and then using strptime to convert the date column.
df <- read.csv("myFile.csv", stringsAsFactors = FALSE)
# Convert the time column from character to a proper date-time class
df$time <- strptime(df$time, "%m/%d/%Y")
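Once the dates are no longer factors, building the smaller data frame from the question becomes straightforward. The following is only a sketch: the CSV content, the column names date and num, and the row indices are invented, and as.Date() is swapped in for strptime() because it returns a plain Date vector that is simple to store in a data frame column.

```r
# Hypothetical CSV content, supplied inline
csv_text <- "date,num
01/15/2011,1.5
02/20/2011,2.25
03/05/2011,3.75"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# as.Date() is used here in place of strptime(); it parses the
# mm/dd/yyyy strings into a Date vector
df$date <- as.Date(df$date, format = "%m/%d/%Y")

# Take the dates from rows 1 and 3 and pair them with new numbers
df2 <- data.frame(date = df$date[c(1, 3)], num = c(111, 222))
df2
```

Note that both values in date come from the date column. Mixing a date with a value from the numeric column inside c() would coerce them to a common type and give a confusing result.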
If you don't want to type out stringsAsFactors=FALSE each time you read in or construct a data frame, you can specify this once at the outset:
options(stringsAsFactors=FALSE)