
How do I know if a table is an array?

I'm developing a simple optimized JSON function. Lua uses tables to represent both arrays and objects, but in JSON I need to distinguish between them. The code below is what I have so far:

t={
    a="hi",
    b=100
}

function table2json(t,formatted)
if type(t)~="table" then return nil,"Parameter is not a table. It is: "..type(t)    end

local ret=""--return value
local lvl=0 --indentation level
local INDENT="  " --OPTION: the characters put in front of every line for indentation
function addToRet(str) if formatted then ret=ret..string.rep(INDENT,lvl)..str.."\n" else ret=ret..str end end

addToRet("{")
lvl=1
for k,v in pairs(t) do
    local typeof=type(v)
    if typeof=="string" then
        addToRet(k..":\""..v.."\"")
    elseif typeof=="number" then
        addToRet(k..":"..v)
    end
end
lvl=0
addToRet("}")

return ret
end

print(table2json(t,true))

As the JSON reference shows, a JSON object corresponds to what Lua calls a table, and it is distinct from an array.

The question is how I can detect if a table is being used as an array?

  • One solution of course is to go through all pairs and see if they only have numerical consecutive keys but that's not fast enough.
  • Another solution is to put a flag in the table that says it is an array not an object.

Any simpler/smarter solution?


If you want a fast, simple, non-intrusive solution that will work most of the time, then I'd say just check index 1: if it exists, the table is an array. Sure, there's no guarantee, but in my experience, tables rarely have both numerical and other keys. Whether it's acceptable for you to mistake some objects for arrays, and how often you expect that to happen, depends on your usage scenario. I guess it's not good enough for a general JSON library.
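A minimal sketch of that heuristic, assuming a hypothetical helper name looks_like_array. It only inspects index 1, so mixed tables are misclassified and empty tables are reported as objects:

```lua
-- Heuristic only: any table with a non-nil value at index 1 counts as an array.
local function looks_like_array(t)
  return type(t) == "table" and t[1] ~= nil
end

print(looks_like_array({ "a", "b" }))          --> true
print(looks_like_array({ a = "hi", b = 100 })) --> false
print(looks_like_array({ "a", x = 1 }))        --> true (mixed table, misclassified)
print(looks_like_array({}))                    --> false (empty table counts as an object)
```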

Edit: For science, I went to see how Lua CJSON does things. It goes through all pairs and checks if all keys are integers while keeping track of the maximum key (the relevant function is lua_array_length). Then it decides whether to serialize the table as an array or an object depending on how sparse the table is (the ratio is user controlled), i.e. a table with indices 1,2,5,10 will probably be serialized as an array while a table with indices 1,2,1000000 will go as an object. I guess this is actually quite a good solution.
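The sparseness idea can be sketched in plain Lua roughly as follows. This is not Lua CJSON's code: the function name array_length and the parameters ratio and safe are assumptions made for illustration, and their defaults (2 and 10) are only my guess at sensible values.

```lua
-- Returns the highest index if t looks dense enough to be an array,
-- or nil if it should be serialized as an object.
-- ratio and safe are hypothetical tuning parameters for this sketch.
local function array_length(t, ratio, safe)
  ratio = ratio or 2
  safe = safe or 10
  local max, count = 0, 0
  for k in pairs(t) do
    -- any key that is not a positive integer rules out an array
    if type(k) ~= "number" or k < 1 or k % 1 ~= 0 then
      return nil
    end
    if k > max then max = k end
    count = count + 1
  end
  -- small tables are always accepted; large ones must not be too sparse
  if max > safe and max > count * ratio then
    return nil
  end
  return max
end

print(array_length({ [1] = "a", [2] = "b", [5] = "c", [10] = "d" })) --> 10
print(array_length({ [1] = "a", [2] = "b", [1000000] = "c" }))       --> nil
```

A serializer would then emit indices 1 through the returned value, writing null for the holes.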


The simplest algorithm to differentiate between arrays/non-arrays is this one:

local function is_array(t)
  local i = 0
  for _ in pairs(t) do
      i = i + 1
      if t[i] == nil then return false end
  end
  return true
end

Explanation here: https://web.archive.org/web/20140227143701/http://ericjmritz.name/2014/02/26/lua-is_array/

That said, you will still have issues with empty tables - are they "arrays" or "hashes"?

For the particular case of serializing JSON, what I do is mark arrays with a field in their metatable.

-- use this when deserializing
local function mark_as_array(t)
  setmetatable(t, {__isarray = true})
end

-- use this when serializing
local function is_array(t)
  local mt = getmetatable(t)
  -- getmetatable returns nil for tables that have no metatable
  return type(mt) == "table" and mt.__isarray == true
end
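A short usage sketch of the marker approach, showing how it resolves the empty-table ambiguity. The names ARRAY_MT and is_marked_array are illustrative only, and the check guards against tables that have no metatable at all:

```lua
-- Shared marker metatable (the name is an assumption for this sketch).
local ARRAY_MT = { __isarray = true }

-- Call this on every array produced while deserializing.
local function mark_as_array(t)
  return setmetatable(t, ARRAY_MT)
end

-- Call this while serializing; unmarked tables are treated as objects.
local function is_marked_array(t)
  local mt = getmetatable(t)
  return type(mt) == "table" and mt.__isarray == true
end

local empty_list = mark_as_array({})
local empty_object = {}
print(is_marked_array(empty_list))   --> true
print(is_marked_array(empty_object)) --> false
```

The trade-off is that every array has to pass through mark_as_array, so tables built by hand elsewhere in the program are serialized as objects unless they are marked too.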


Here is a simpler check based on Lua's length operator (#).

function is_array(t)
  if type(t) ~= 'table' then
    return false
  end

  -- a non-zero length means the table has at least one positive integer key
  -- (note that a mixed table such as { 1, x = 2 } also passes this test)
  if #t > 0 then
    return true
  end

  -- length 0 but at least one key: the table is being used as an object
  for _ in pairs(t) do
    return false
  end

  -- an empty table could be either; treat it as an array
  return true
end

local a = {} -- true
local b = { 1, 2, 3 } -- true
local c = { a = 1, b = 1, c = 1 } -- false


No, there is no built-in way to differentiate, because in Lua there is no difference.

There are existing JSON libraries which probably already do this (e.g. Lua CJSON).

Other options are

  • leave it up to the user to specify what type the argument is, or what type they want it treated as.
  • have arrays explicitly declared, by arranging the __newindex metamethod such that only the next consecutive numerical index can be added.

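The __newindex option could look something like the sketch below. The constructor name new_array is hypothetical. Keep in mind that __newindex only fires for keys that are not yet present, so overwriting an existing element (including setting it to nil) is not intercepted here.

```lua
-- Creates a table that only accepts the next consecutive index.
local function new_array()
  local mt = { __isarray = true }
  mt.__newindex = function(t, k, v)
    if k ~= #t + 1 then
      error("arrays only accept the next index " .. (#t + 1), 2)
    end
    rawset(t, k, v) -- bypass the metamethod for the actual store
  end
  return setmetatable({}, mt)
end

local a = new_array()
a[1] = "x"
a[2] = "y"
print(#a)                                    --> 2
print(pcall(function() a.name = "z" end))    -- false, followed by the error message
print(pcall(function() a[5] = "q" end))      -- false, followed by the error message
```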

@AlexStack

if not t[i] and type(t[i])~="nil" then return false end

This code is wrong; it fails when one of the elements is false.

> return  isArray({"one", "two"})
true
> return  isArray({false, true})
false

I think the whole expression can be changed to type(t[i]) == "nil" (or simply t[i] == nil), but it will still fail in some scenarios because it will not support nil values.

A good way to go, I think, is trying with ipairs or checking whether #t is equal to count, but #t returns 0 with objects and count will be zero with empty arrays, so it may need an extra check at the beginning of the function, something like: if not next(t) then return true.
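For illustration, here is one way the corrected check could be written. It is a sketch rather than the original author's code: it counts all keys with pairs, then requires a non-nil value at every index from 1 to that count, comparing against nil so that false is accepted as an element.

```lua
local function is_array(t)
  if type(t) ~= "table" then return false end
  -- explicit choice: an empty table is treated as an array
  if next(t) == nil then return true end

  local count = 0
  for _ in pairs(t) do
    count = count + 1
  end

  -- if indices 1..count are all present, there is no room left for other keys
  for i = 1, count do
    if t[i] == nil then return false end
  end
  return true
end

print(is_array({ "one", "two" }))      --> true
print(is_array({ false, true }))       --> true
print(is_array({ [1] = 1, [3] = 3 }))  --> false
print(is_array({ 1, a = 2 }))          --> false
```

This walks the table twice, so it is O(n), but it does not depend on the iteration order of pairs or on how # behaves for tables with holes.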

As a sidenote, I'm pasting another implementation, found in lua-cjson (by Mark Pulford):

-- Determine whether a Lua table can be treated as an array.
-- Explicitly returns "not an array" for very sparse arrays.
-- Returns:
-- -1   Not an array
-- 0    Empty table
-- >0   Highest index in the array
local function is_array(table)
    local max = 0
    local count = 0
    for k, v in pairs(table) do
        if type(k) == "number" then
            if k > max then max = k end
            count = count + 1
        else
            return -1
        end
    end
    if max > count * 2 then
        return -1
    end

    return max
end 


You can simply test this (assuming t is a table):

function isarray(t)
  return #t > 0 and next(t, #t) == nil
end

print(isarray{}) --> false
print(isarray{1, 2, 3}) --> true
print(isarray{a = 1, b = 2, c = 3}) --> false
print(isarray{1, 2, 3, a = 1, b = 2, c = 3}) --> false
print(isarray{1, 2, 3, nil, 5}) --> true

It tests if there is any value in the "array part" of the table, then checks if there is any value after that part, by using next with the last consecutive numeric index.

Note that Lua applies its own logic to decide when to use the "array part" and when to use the "hash part" of a table. That's why the table in the last example is detected as an array: it is dense enough to be considered an array despite the nil in the middle, or in other words, it is not sparse enough. As another answer here mentions, this is very useful in the context of data serialization, and you don't have to program it yourself; you can rely on Lua's underlying logic. To serialize that last example, you could use for i = 1, #t do ... end instead of ipairs, which would stop at the nil.
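A rough sketch of such a serialization loop, assuming a hypothetical helper encode_array that handles only numbers, booleans, strings and nil. Strings are quoted without any escaping and nested tables are not handled, so this is not a complete JSON encoder:

```lua
-- Serializes indices 1..#t as a JSON array, writing null for holes.
local function encode_array(t)
  local parts = {}
  for i = 1, #t do
    local v = t[i]
    if v == nil then
      parts[i] = "null"
    elseif type(v) == "string" then
      parts[i] = '"' .. v .. '"' -- no escaping: illustration only
    else
      parts[i] = tostring(v)
    end
  end
  return "[" .. table.concat(parts, ",") .. "]"
end

print(encode_array({ 1, "a", true })) --> [1,"a",true]
print(encode_array({ 1, 2, 3, nil, 5 }))
```

The output of the second call depends on what # returns for a table with a hole, which is why no expected result is shown for it.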

From my observation in Lua and LuaJIT implementation, the function next always looks up the array part of the table first, so any non array index will be found after the whole array part, even though after that it doesn't follow any particular order. I'm not sure if this is a consistent behaviour across different Lua versions, though.

Also, it's up to you to decide if empty tables should be treated as arrays as well. In this implementation, they are not treated as arrays. You could change it to return next(t) == nil or (#t > 0 and next(t, #t) == nil) to do the opposite.
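Written out, the variant that accepts empty tables might look like this. The name isarray_or_empty is made up for the sketch, and it inherits the same reliance on the traversal order of next described above:

```lua
local function isarray_or_empty(t)
  -- empty table, or a non-empty array part with nothing after it
  return next(t) == nil or (#t > 0 and next(t, #t) == nil)
end

print(isarray_or_empty({}))                       --> true
print(isarray_or_empty({ 1, 2, 3 }))              --> true
print(isarray_or_empty({ a = 1, b = 2, c = 3 }))  --> false
```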

Anyway, I guess this is the shortest you can get in terms of code lines and complexity, since it is lower bounded by next (which I believe is either O(1) or O(logn)).


Thanks. I developed the following code and it works:

---Checks if a table is used as an array. That is: the keys start with one and are sequential numbers
-- @param t table
-- @return nil,error string if t is not a table
-- @return true/false if t is an array/isn't an array
-- NOTE: it returns true for an empty table
function isArray(t)
    if type(t)~="table" then return nil,"Argument is not a table! It is: "..type(t) end
    --check if all the table keys are numerical and count their number
    local count=0
    for k,v in pairs(t) do
        if type(k)~="number" then return false else count=count+1 end
    end
    --all keys are numerical. now let's see if they are sequential and start with 1
    for i=1,count do
        --Hint: the VALUE might be "nil", in that case "not t[i]" isn't enough, that's why we check the type
        if not t[i] and type(t[i])~="nil" then return false end
    end
    return true
end


I wrote this function for pretty-printing Lua tables, and had to solve the same problem. None of the solutions here account for edge cases like some keys being numbers but others not. This tests every key to see if it's compatible with being an array index.

function pp(thing)
    if type(thing) == "table" then
        local strTable = {}
        local iTable = {}
        local iterable = true
        for k, v in pairs(thing) do
            --if the key is a string, we don't need to do "[key]"
            local key = (((not (type(k) == "string")) and "["..pp(k).."]") or k)
            --this tests if the index is compatible with being an array
            if (not (type(k) == "number")) or (k > #thing) or(k < 1) or not (math.floor(k) == k) then
                iterable = false
            end
            local val = pp(v)
            if iterable then iTable[k] = val end
            table.insert(strTable, (key.."="..val))
        end
        if iterable then strTable = iTable end
        return string.format("{%s}", table.concat(strTable,","))
    elseif type(thing) == "string" then
        return '"'..thing..'"'
    else
        return tostring(thing)
    end
end


This is not pretty, and depending on how large and cleverly-deceptive the table is, it might be slow, but in my tests it works in each of these cases:

  • empty table

  • array of numbers

  • array with repeating numbers

  • letter keys with number values

  • mixed array/non-array

  • sparse array (gaps in index sequence)

  • table of doubles

  • table with doubles as keys

    function isarray(tableT)

        --has to be a table in the first place of course
        if type(tableT) ~= "table" then return false end

        --not sure exactly what this does but piFace wrote it and it catches most cases all by itself
        local piFaceTest = #tableT > 0 and next(tableT, #tableT) == nil
        if piFaceTest == false then return false end

        --must have a value for 1 to be an array
        if tableT[1] == nil then return false end

        --all keys must be integers from 1 to #tableT for this to be an array
        for k, v in pairs(tableT) do
            if type(k) ~= "number" or (k > #tableT) or (k < 1) or math.floor(k) ~= k then return false end
        end

        --every numerical key except the last must have a key one greater
        for k, v in ipairs(tableT) do
            if tonumber(k) ~= nil and k ~= #tableT then
                if tableT[k+1] == nil then
                    return false
                end
            end
        end

        --otherwise we probably got ourselves an array
        return true
    end

Much credit to PiFace and Houshalter, whose code I based most of this on.


At least in LuaJIT 2.1.0-beta3 (where I tested it), pairs() reliably iterates numerical indices in sorted order, so the following should also work and could be a bit faster than https://stackoverflow.com/a/25709704/7787852, even though the Lua reference manual explicitly says that the iteration order of pairs cannot be relied upon.

local function is_array(t)
  local prev = 0
  for k in pairs(t) do
    if k ~= prev + 1 then
      return false
    end
    prev = prev + 1
  end
  return true
end