
Getting dataframe directly from JSON-file?

First, let me thank everybody who contributes to Stackoverflow and R! I'm one of those R-users who is not so good at programming, but bravely tries to use it for work, so the issue below is probably trivial...

Here's the problem. I need to import files in JSON-format to R:

# library(plyr)
# library(RJSONIO)
# lstJson <- fromJSON("JSON_test.json")        #This is the file I read
# dput(lstJson)                                              #What I did to get the txtJson below, for the benefit of testing.

txtJson <- structure(list(version = "1.1", result = structure(list(warnings = structure(list(), class = "AsIs"), fields = list(structure(list(info = "", rpl = 15, name = "time", type = "timeperiod"), .Names = c("info", "rpl", "name", "type")), structure(list(info = "", name = "object", type = "string"), .Names = c("info", "name", "type")), structure(list(info = "Counter1", name = "Counter1", type = "int"), .Names = c("info", "name", "type")), structure(list( info = "Counter2", name = "Counter2", type = "int"), .Names = c("info", "name", "type"))), timeout = 180, name = NULL, data = list( list(list("2011-05-01 17:00", NULL), list("Total", NULL), list(8051, NULL), list(44, NULL)), list(list("2011-05-01 17:15", NULL), list("Total", NULL), list(8362, NULL), list( 66, NULL))), type = "AbcDataSet"), .Names = c("warnings", "fields", "timeout", "name", "data", "type"))), .Names = c("version", "result"))

dfJson <- ldply(txtJson, data.frame)  

What I need is a data frame similar to this:

time  object  Counter1  Counter2  
2011-05-01 17:00  Total  8051  44  
2011-05-01 17:15  Total  8362  66 

But instead I get

"Error in data.frame("2011-05-01 17:00", NULL, check.names = FALSE, stringsAsFactors = TRUE) : 
  arguments imply differing number of rows: 1, 0"

I get the same error if I use the lstJson.

I'm not sure if RJSONIO is supposed to be "smart enough" to parse files like this, or if I have to manually read the first line of the file, set column-types etc. The reason I'm not using CSV is that I want to "automatically" get dates in date-format, etc.

Thanks, /Chris


Looking at the structure of txtJson you see that all of the useful bits are in txtJson$result$data:

> sapply( txtJson$result$data, unlist )
     [,1]               [,2]              
[1,] "2011-05-01 17:00" "2011-05-01 17:15"
[2,] "Total"            "Total"           
[3,] "8051"             "8362"            
[4,] "44"               "66"              
> t(sapply( txtJson$result$data, unlist ))
     [,1]               [,2]    [,3]   [,4]
[1,] "2011-05-01 17:00" "Total" "8051" "44"
[2,] "2011-05-01 17:15" "Total" "8362" "66"
> as.data.frame(t(sapply( txtJson$result$data, unlist )) )
                V1    V2   V3 V4
1 2011-05-01 17:00 Total 8051 44
2 2011-05-01 17:15 Total 8362 66

In the process of getting these as unlisted vectors and then passing them to 'as.data.frame', they all become class 'factor' (in R versions before 4.0.0, where stringsAsFactors defaults to TRUE), so there is probably additional effort needed to re-class() these values. You can instead use:

data.frame(t(sapply( txtJson$result$data, unlist )) ,stringsAsFactors=FALSE)

And they would all be 'character'
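
To get from the all-character data frame to the table the question asks for, the column names and types can be taken from txtJson$result$fields. The following is only a minimal sketch: it uses a trimmed copy of the question's structure so that it runs on its own, and the mapping from the declared types ("timeperiod", "int", "string") to R classes, as well as the choice of UTC as time zone, are assumptions rather than anything the JSON format guarantees.

```r
# Trimmed copy of the question's structure: only the parts used below.
txtJson <- list(result = list(
  fields = list(
    list(info = "", rpl = 15, name = "time", type = "timeperiod"),
    list(info = "", name = "object", type = "string"),
    list(info = "Counter1", name = "Counter1", type = "int"),
    list(info = "Counter2", name = "Counter2", type = "int")
  ),
  data = list(
    list(list("2011-05-01 17:00", NULL), list("Total", NULL),
         list(8051, NULL), list(44, NULL)),
    list(list("2011-05-01 17:15", NULL), list("Total", NULL),
         list(8362, NULL), list(66, NULL))
  )
))

# One row per record; unlist() drops the NULL placeholders.
m <- t(sapply(txtJson$result$data, unlist))
dfJson <- data.frame(m, stringsAsFactors = FALSE)

# Column names and declared types come from result$fields.
names(dfJson) <- sapply(txtJson$result$fields, function(f) f$name)
types <- sapply(txtJson$result$fields, function(f) f$type)

# Convert each column according to its declared type.
# The type-to-class mapping and the UTC time zone are assumptions.
for (i in seq_along(types)) {
  dfJson[[i]] <- switch(types[i],
    timeperiod = as.POSIXct(dfJson[[i]], format = "%Y-%m-%d %H:%M", tz = "UTC"),
    int        = as.integer(dfJson[[i]]),
    dfJson[[i]]  # "string" and anything unrecognised stay character
  )
}

str(dfJson)
```

With the real file, the same steps should apply to the lstJson returned by fromJSON(), provided it has the same result$fields and result$data layout.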

As far as importing CSV files, read.table()'s colClasses argument will accept "POSIXlt" or "POSIXct" as known types. The rule, I believe, is that there must be a matching as.* method available. Here's a minimal example:

> read.table(textConnection("2011-05-01 17:00"), sep=",", colClasses="POSIXct")
                   V1
1 2011-05-01 17:00:00
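
The same idea extends to several columns. As a rough sketch, assuming the data were available as CSV with a header row (the CSV text below is made up from the table in the question), each column can be given its own class:

```r
# Hypothetical CSV version of the table from the question.
csv <- "time,object,Counter1,Counter2
2011-05-01 17:00,Total,8051,44
2011-05-01 17:15,Total,8362,66"

# One colClasses entry per column; "POSIXct" is converted via as.POSIXct().
dfCsv <- read.table(text = csv, sep = ",", header = TRUE,
                    colClasses = c("POSIXct", "character", "integer", "integer"))

str(dfCsv)
```

Note that the times are interpreted in the local time zone here, since read.table() gives no way to pass a tz argument through colClasses.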
