What does the second column of `str` report in R, and what does `atomic` in this column mean?

2023-03-01 16:16

Using str(survey_OM) on my data frame indicates that all of my numerical data is atomic. If I use class(survey_OM$perc.OM) it returns numeric.

I have always thought that the second column of str showed the class of the data but it does not appear that simple... so my questions are:

What is the second column of str reporting?
What is atomic and how does it differ from numeric?
Why in this case would the data appear as atomic and not num or int?

Thank you.

dput(head(survey_OM, 20)) provides:

> dput(head(survey_OM, 20))
  structure(list(lake = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 
  3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("E-2", 
  "E-4", "E pond", "EX 1", "GTH 110", "GTH 112", "GTH 114", "GTH 156", 
  "GTH 91", "GTH 98", "N-1", "NE-10", "NE-11", "NE-3", "NE-8", 
  "NE-9", "NE-9b", "S-10", "S-11", "S-3", "S-6", "S-7"), class = "factor"), 
  date = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2007/06/15", 
  "2007/06/18", "2007/06/19", "2007/06/20", "2007/06/21", "2007/06/27", 
  "2007/06/29", "2007/07/07", "2007/07/19", "2007/07/20", "2008/07/26", 
  "2008/07/30", "2008/08/04", "2008/08/06"), class = "factor"), 
  depth = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("E", 
  "epi", "H", "hypo"), class = "factor"), 
  depth.m = structure(c(6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), .Label = c("", "10.9", "12.9", "1.5", "2", 
  "2.1", "2.2", "2.3", "2.4", "2.5", "2.6", "2.7", "3", "3.1", 
  "3.5", "4", "4.2", "4.8", "4.9", "5", "5.1", "5.5", "6", 
  "6.5", "7", "7.2", "9.9", "not recorded"), class = "factor"), 
  rep = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", 
  "B", "C"), class = "factor"), 
  sed = c(0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), 
  notes = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", 
  "col on SE side", "lg snail shell", "not collected", "very hard sediments"
  ), class = "factor"), 
  dry.mass = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
  perc.OM = c(47.1300248455119, 47.4260808104607, 47.7349307375515, 46.4501104675465, 44.1513415737111, 43.5608499678045, 42.9921259842519, 42.2674677347574, 39.6643311064039, 
  39.0968130690949, 46.7768514928267, 46.9211608642763, 46.7877013177158, 
  47.0709930313588, 44.3241152581706, 43.7905468025952, 41.706074101281, 
  36.5061097383474, 37.4329041152142, 37.7757939038389)), .Names = c("lake", 
  "date", "depth", "depth.m", "rep", "sed", "notes", "dry.mass", 
  "perc.OM"), comment = c("working data frame of the sediment char from the 2007 sed    survey       created:", "Wed Apr 27 14:23:33 2011"), row.names = c(NA, 20L), class = "data.frame")

and the complete output of str(survey_OM) is:

> str(survey_OM)
'data.frame':   780 obs. of  9 variables:
 $ lake    : Factor w/ 22 levels "E-2","E-4","E pond",..: 3 3 3 3 3 3 3 3 3 3 ...
  ..- attr(*, "comment")= chr "names of the lakes"
 $ date    : Factor w/ 14 levels "2007/06/15","2007/06/18",..: 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "comment")= chr "date that the cores were collected"
 $ depth   : Factor w/ 4 levels "E","epi","H",..: 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "comment")= chr "relative depth ID; epi = shallowest corable Z, hypo =   deepest Z, S, M, D = shallow, med, deep"
 $ depth.m : Factor w/ 28 levels "","10.9","12.9",..: 6 6 6 6 6 6 6 6 6 6 ...
  ..- attr(*, "comment")= chr "depth that core was collected in m"
 $ rep     : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "comment")= chr "replicate ID for core"
 $ sed     : atomic  0 1 2 3 4 5 6 7 8 9 ...
  ..- attr(*, "comment")= chr "depth of sample from sed/water interface in cm"
 $ notes   : Factor w/ 5 levels "","col on SE side",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "comment")= chr "comments on sample"
 $ dry.mass: atomic  0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "comment")= chr "dry mass of the sediment at that sed Z in g/m^2"
 $ perc.OM : atomic  47.1 47.4 47.7 46.5 44.2 ...
  ..- attr(*, "comment")= chr "percent OM of the samp. based on LOI at 550d C"
 - attr(*, "comment")= chr  "working data frame of the sediment char from the 2007 sed survey created:" "Wed Apr 27 14:23:33 2011"

Looking at utils:::str.default, we see that we get the usual output of int, num, etc., if the following if statement is true:

if (     is.vector(object) 
     || (is.array(object) && is.atomic(object))
     ||  is.vector(object, mode = "language") 
     || is.vector(object, mode = "symbol")
   )

We get atomic if this statement is false (and it would otherwise have been int, num, etc).

Looking at the help page for is.vector, we see that it returns true only if it's a vector with no attributes other than names. Here's a data frame where b has an extra attribute:

d <- data.frame(a=1:4, b=5:8)
attr(d$b, "tmp") <- "surprise!"

And calling str on it gives atomic for b instead of int.

> str(d)
'data.frame':   4 obs. of  2 variables:
 $ a: int  1 2 3 4
 $ b: atomic  5 6 7 8
  ..- attr(*, "tmp")= chr "surprise!"

I see in your edit that the columns of your data frame carry extra attributes (the comment attributes), and that str prints extra lines for those attributes as well, so this would seem to explain it.
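The is.vector behaviour described above can be checked directly, without relying on how a particular R version formats str output. This is a minimal sketch; the vectors plain, tagged, noted, named and stripped are hypothetical names made up for illustration and are not taken from the question's data.

```r
# Hypothetical vectors for illustration; not the question's data.
plain  <- 5:8
tagged <- 5:8
attr(tagged, "tmp") <- "surprise!"

is.vector(plain)     # TRUE: no attributes at all
is.vector(tagged)    # FALSE: carries an attribute other than names
is.atomic(tagged)    # TRUE: still an atomic (integer) vector
class(tagged)        # "integer"

# comment() stores its value as an ordinary attribute, so it has the same effect.
noted <- c(47.1, 47.4, 47.7)
comment(noted) <- "percent OM"
is.vector(noted)     # FALSE

# Names are the one attribute that is.vector() tolerates.
named <- c(a = 1, b = 2)
is.vector(named)     # TRUE

# as.vector() drops the attributes of an atomic vector.
stripped <- as.vector(tagged)
is.vector(stripped)  # TRUE
```

So a column can have class numeric or integer and still fail is.vector, which is the condition str.default tests.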

R divides data types into atomic and recursive. The things most people call vectors are all atomic (as mentioned by several people so far). Lists can have arbitrary levels of complexity, i.e. lists within lists, and will return FALSE from is.atomic(). Atomic vectors can have attributes without losing their 'atomicity'.
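As a rough illustration of the atomic/recursive split, the following sketch uses is.atomic() and is.recursive() on a few made-up objects (nums, nested and f are hypothetical names chosen for this example).

```r
# Hypothetical objects for illustration.
nums   <- c(1.5, 2, 2.1)
nested <- list(1, "a", list(TRUE, 2:3))

is.atomic(nums)       # TRUE
is.recursive(nums)    # FALSE
is.atomic(nested)     # FALSE: lists are recursive
is.recursive(nested)  # TRUE

# Adding an attribute does not change atomicity.
attr(nums, "units") <- "m"
is.atomic(nums)       # TRUE

# A factor is an integer vector with attributes, so it is atomic too.
f <- factor(c("epi", "hypo"))
is.atomic(f)          # TRUE
```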

I believe your three questions essentially boil down to one thing.

The second column of str() returns the mode of the object, and not the class. The instruction ?atomic redirects to ?vector where it states: "The atomic modes are "logical", "integer", "numeric" (synonym "double"), "complex", "character" and "raw"."

Thus numeric is one of the atomic modes.

mode describes the basic type of an object; storage.mode and typeof additionally distinguish integer from double. See ?mode for more details.
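To see how class, mode and typeof can disagree for the same object, here is a small sketch. The helper describe() is a hypothetical function written for this example, not part of base R.

```r
# describe() is a hypothetical helper, defined here only for illustration.
describe <- function(x) {
  c(class = class(x)[1], mode = mode(x), typeof = typeof(x))
}

describe(1:3)                 # class "integer",   mode "numeric",   typeof "integer"
describe(c(47.1, 47.4))       # class "numeric",   mode "numeric",   typeof "double"
describe(factor(c("A", "B"))) # class "factor",    mode "numeric",   typeof "integer"
describe("epi")               # class "character", mode "character", typeof "character"
```

Integers and doubles share the mode "numeric", and a factor reports mode "numeric" even though its class is "factor".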
