Lua Type Checking

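Many programming languages provide some form of static (compile-time) or dynamic (run-time) type checking, and each approach has its own merits [1]. Lua performs run-time type checking on its built-in operations. For example, this code triggers a run-time error: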

> x = 5 + "ok"
stdin:1: attempt to perform arithmetic on a string value

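However, unlike in languages such as C, Lua has no built-in mechanism for type checking the parameters and return values of function calls. The types are left unspecified: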

function abs(x)
  return x >= 0 and x or -x
end

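This allows a lot of flexibility. For example, a function like print can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You can call this function on something other than a number, and the operations inside the function will then trigger a somewhat cryptic error at run-time (in C this would be a compile-time error).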

> print(abs("hello"))
stdin:2: attempt to compare number with string
stack traceback:
        stdin:2: in function 'abs'
        stdin:1: in main chunk
        [C]: ?

Solution: Assertions at the Top of the Function

To improve the error reporting, one is typically told to do something like this:

function abs(x)
  assert(type(x) == "number", "abs expects a number")
  return x >= 0 and x or -x
end

> print(abs("hello"))
stdin:2: abs expects a number
stack traceback:
        [C]: in function 'assert'
        stdin:2: in function 'abs'
        stdin:1: in main chunk
        [C]: ?

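This is good advice, but one might complain that it imposes additional run-time overhead, that it only detects program errors in code that actually gets executed rather than in all code that is compiled, that the types of the function's values live entirely inside the function's implementation (not available by introspection), and that it can involve a lot of repetitive code (especially when named arguments are passed via a table).

Here is one way to proactively check named parameters: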

local Box = {}
local is_key = {x=true,y=true,width=true,height=true,color=true}
function create_box(t)
  local x = t.x or 0
  local y = t.y or 0
  local width = t.width or 0
  local height = t.height or 0
  local color = t.color
  assert(type(x) == "number", "x must be number or nil")
  assert(type(y) == "number", "y must be number or nil")
  assert(type(width) == "number", "width must be number or nil")
  assert(type(height) == "number", "height must be number or nil")
  assert(color == "red" or color == "blue", "color must be 'red' or 'blue'")
  for k,v in pairs(t) do
    assert(is_key[k], tostring(k) .. " not valid key")
  end
  return setmetatable({x1=x,y1=y,x2=x+width,y2=y+height,color=color}, Box)
end

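That is a relatively large amount of code. In fact, we might want to use error rather than assert so that we can supply an appropriate level argument for the stack trace.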
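The point about error and its level argument can be sketched as follows. This is a minimal illustration, not an established idiom from any library: the helper name check_arg and its message format are made up for this example. Passing level 3 asks Lua to attribute the error to the code that called the checked function, rather than to the line where the check itself lives.

```lua
-- Hypothetical helper (not part of any library): raise a type error that is
-- attributed to the caller of the function performing the check.
local function check_arg(value, expected, name)
  if type(value) ~= expected then
    -- Level 1 would be check_arg itself, level 2 the checked function,
    -- level 3 the code that called the checked function.
    error(("%s must be %s, got %s"):format(name, expected, type(value)), 3)
  end
end

local function abs(x)
  check_arg(x, "number", "x")
  return x >= 0 and x or -x
end

local ok, err = pcall(function()
  local result = abs("hello")  -- not a tail call, so this frame stays on the stack
  return result
end)
print(ok, err)
```

The position prefix in the message should then name the line of the abs("hello") call instead of a line inside abs.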

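Solution: Function Decorators

Another approach is to place the type checking code outside the original function, possibly with a "function decorator" (see DecoratorsAndDocstrings for background).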

is_typecheck = true

function typecheck(...)
  return function(f)
    return function(...)
      assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE")
      return f(...)
    end
  end
end

function notypecheck()
  return function(f) return f end
end

typecheck = is_typecheck and typecheck or notypecheck

sum = typecheck("number", "number", "->", "number")(
  function(a,b)
    return a + b
  end
)

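The advantage is that the type information lives outside the function implementation. We can disable all type checking by switching a single variable, and no added overhead remains when the functions execute (though there is a small added overhead when the functions are built). The typecheck function could also store the type information for later introspection.

This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).

One way to implement such a type checking decorator is: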
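As a rough idea of what the generic check and the stored type information might look like, here is a small sketch. It only compares against the names returned by type(), and the signatures table is an assumption of this example (a weak-keyed table so that recording a signature does not keep the wrapped function alive).

```lua
-- Hypothetical registry: wrapped function -> declared signature.
local signatures = setmetatable({}, {__mode = "k"})

local function typecheck(...)
  -- Split the declaration at "->" into parameter types and return types.
  local params, returns = {}, {}
  local target = params
  for _, t in ipairs({...}) do
    if t == "->" then target = returns else target[#target + 1] = t end
  end

  -- Compare each declared type name with the type of the matching value,
  -- then pass the values through unchanged.
  local function check(kind, expected, ...)
    for i, t in ipairs(expected) do
      local actual = type((select(i, ...)))
      if actual ~= t then
        error(("%s %d: expected %s, got %s"):format(kind, i, t, actual))
      end
    end
    return ...
  end

  return function(f)
    local wrapped = function(...)
      check("argument", params, ...)
      return check("return value", returns, f(...))
    end
    signatures[wrapped] = {params = params, returns = returns}
    return wrapped
  end
end

local sum = typecheck("number", "number", "->", "number")(function(a, b)
  return a + b
end)

print(sum(1, 2))                  --> 3
print(pcall(sum, 1, "x"))         -- fails: argument 2 is a string
print(signatures[sum].params[1])  --> number
```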

--- Odin Kroeger, 2022, released under the MIT license.
do
    local abbrs = {
        ['%*'] = 'boolean|function|number|string|table|thread|userdata',
        ['%?(.*)'] = '%1|nil'
    }

    local msg = 'expected %s, got %s.'

    --- Check whether a value is of a type.
    --
    -- Type declaration grammar:
    --
    -- Declare one or more Lua type names separated by '|' to require that
    -- the given value is of one of the given types (e.g., 'string|table'
    -- requires the value to be a string or a table). '*' is short for the
    -- list of all types but `nil`. '?T' is short for 'T|nil' (e.g.,
    -- '?table' is short for 'table|nil').
    --
    -- Extended Backus-Naur Form:
    --
    -- > Type = 'boolean' | 'function' | 'nil'    | 'number'   |
    -- >        'string'  | 'table'    | 'thread' | 'userdata'
    -- >
    -- > Type list = [ '?' ], type, { '|', type }
    -- >
    -- > Wildcard = [ '?' ], '*'
    -- >
    -- > Type declaration = type list | wildcard
    --
    -- Complex types:
    --
    -- You can check types of table or userdata fields by
    -- declaring a table that maps indices to declarations.
    --
    --    > type_match({1, '2'}, {'number', 'number'})
    --    nil    index 2: expected number, got string.
    --    > type_match({foo = 'bar'}, {foo = '?table'})
    --    nil    index foo: expected table or nil, got string.
    --    > type_match('foo', {foo = '?table'})
    --    nil    expected table or userdata, got string.
    --
    -- Wrong type names (e.g., 'int') do *not* throw an error.
    --
    -- @param val A value.
    -- @tparam string|table decl A type declaration.
    -- @treturn[1] bool `true` if the value matches the declaration.
    -- @treturn[2] nil `nil` otherwise.
    -- @treturn[2] string An error message.
    function type_match (val, decl, _seen)
        local t = type(decl)
        if t == 'string' then
            local t = type(val)
            for p, r in pairs(abbrs) do decl = decl:gsub(p, r) end
            for e in decl:gmatch '[^|]+' do if t == e then return true end end
            return nil, msg:format(decl:gsub('|', ' or '), t)
        elseif t == 'table' then
            local ok, err = type_match(val, 'table|userdata')
            if not ok then return nil, err end
            if not _seen then _seen = {} end
            assert(not _seen[val], 'cycle in data tree.')
            _seen[val] = true
            for k, t in pairs(decl) do
                ok, err = type_match(val[k], t, _seen)
                if not ok then return nil, string.format('index %s: %s', tostring(k), err) end
            end
            return true
        end
        error(msg:format('string or table', t))
    end
end

--- Type-check function arguments.
--
-- Type declaration grammar:
--
-- The type declaration syntax is that of @{type_match}, save for
-- that you can use '...' to declare that the remaining arguments
-- are of the same type as the previous one.
--
-- Obscure Lua errors may indicate that you forgot the quotes around '...'.
--
-- Caveats:
--
-- * Wrong type names (e.g., 'int') do *not* throw an error.
-- * Sometimes the stack trace is wrong.
--
-- @tparam string|table ... Type declarations.
-- @treturn func A function that adds type checks to a function.
--
-- @usage
-- store = type_check('?*', 'table', '?number', '...')(
--     function (val, tab, ...)
--          local indices = table.pack(...)
--          for i = 1, indices.n do tab[indices[i]] = val end
--     end
-- )
--
-- @function type_check
function type_check (...)
    local decls = table.pack(...)
    return function (func)
        return function (...)
            local args = table.pack(...)
            local decl, prev
            local n = math.max(decls.n, args.n)
            for i = 1, n do
                if     decls[i] == '...' then prev = true
                elseif decls[i]          then prev = false
                                              decl = decls[i]
                elseif not prev          then break
                end
                if args[i] == nil and prev and i >= decls.n then break end
                local ok, err = type_match(args[i], decl)
                if not ok then error(string.format('argument %d: %s', i, err), 2) end
            end
            return func(...)
        end
    end
end

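Solution: The checks Library

The preceding solutions have some limitations:

* They are rather verbose, and non-trivial verifications can be hard to read;

* The error messages are not as clear as those returned by Lua's primitives. Moreover, they report the error at the place inside the called function where the assert() failed, rather than in the calling function that passed the invalid arguments.

The checks library offers a concise, flexible, and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. In addition, arbitrary custom type checking functions can be registered in a dedicated checkers table. For instance, to check for an IP port number, which must be an integer between 0 and 0xffff, one can define a port type as follows: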

function checkers.port(x)
    return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==x
end

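To remove useless boilerplate code, checks() retrieves the parameters directly from the stack frame, so there is no need to repeat them; for instance, if a function f(num, str) needs a number and a string, it can be implemented as follows: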

function f(num, str)
    checks('number', 'string')
    --actual function body
end
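To give a feel for how arguments can be read from the caller's stack frame, here is a pure-Lua sketch built on debug.getlocal. It is only an illustration: the real checks library is implemented in C, and the name my_checks and the error message format below are assumptions of this example.

```lua
-- Hypothetical pure-Lua stand-in for checks(); not the library's implementation.
local function my_checks(...)
  for i = 1, select("#", ...) do
    local expected = select(i, ...)
    -- Level 2 is the function that called my_checks; its parameters are its
    -- first local variables, in declaration order.
    local name, value = debug.getlocal(2, i)
    if type(value) ~= expected then
      error(("bad argument #%d '%s' (%s expected, got %s)"):format(
        i, tostring(name), expected, type(value)), 3)
    end
  end
end

local function f(num, str)
  my_checks("number", "string")
  return str:rep(num)
end

print(f(3, "ab"))          --> ababab
print(pcall(f, "x", "y"))  -- fails: argument #1 'num' is a string
```

Because the error is raised with level 3, it is attributed to whoever called f, which addresses the second limitation listed earlier.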

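Types can be combined:

* The vertical bar allows several types to be accepted, e.g. checks('string|number') accepts either a string or a number as its first parameter.

* The prefix "?" makes a type optional, i.e. it also accepts nil. Functionally it is equivalent to a "nil|" prefix, although it is more readable and faster to check at run-time.

* The question mark can be combined with the union bar, e.g. checks('?string|number') accepts strings, numbers, and nil.

* Finally, the special "!" type accepts anything except nil.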
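The combination rules above, together with the __type field and the checkers table, can be approximated in plain Lua as follows. This is a sketch of the described behavior only, not the library's code: the function name matches and the local checkers table are made up for this example.

```lua
-- Hypothetical table of custom predicates, keyed by type name.
local checkers = {}

-- Hypothetical matcher: does `value` satisfy the type string `spec`?
local function matches(spec, value)
  if spec:sub(1, 1) == "?" then
    if value == nil then return true end  -- optional: nil is accepted
    spec = spec:sub(2)
  end
  local mt = getmetatable(value)
  local custom = type(mt) == "table" and mt.__type or nil
  for name in spec:gmatch("[^|]+") do
    if name == "!" then
      if value ~= nil then return true end
    elseif type(value) == name or custom == name then
      return true
    elseif checkers[name] and checkers[name](value) then
      return true
    end
  end
  return false
end

function checkers.port(x)
  return type(x) == "number" and 0 <= x and x <= 0xffff and math.floor(x) == x
end

print(matches("string|number", 42))    --> true
print(matches("?string|number", nil))  --> true
print(matches("!", nil))               --> false
print(matches("port", 8080))           --> true
print(matches("port", 70000))          --> false
```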

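A more detailed description of how the library works can be found in the header of its source code (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of Sierra Wireless' application framework, available here: https://github.com/SierraWireless/luasched. For convenience, it is also available as a standalone rock here: https://github.com/fab13n/checks

Hack: Boxed Values + Possible Values

As mentioned, run-time type checking will not detect program errors in code that does not get executed. An extensive test suite is particularly essential for programs in dynamically typed languages, so that all branches of the code are executed with all conceivable data sets (or at least a good representation of them) and the run-time assertions are sufficiently exercised. You cannot rely as much on the compiler to do some of this checking for you.

Perhaps we can mitigate this by carrying around more complete type information with the values. Below is one approach, although it is more of a novel proof of concept than something ready for production use.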

-- ExTypeCheck.lua ("ExTypeCheck")
-- Type checking for Lua.
--
-- In this type model, types are associated with values at run-time.
-- A type consists of the set of values the value could have
-- at run-time.  This set can be large or infinite, so we
-- store only a small representative subset of those values.
-- Typically one would want to include at least the boundary
-- values (e.g. max and min) in this set.
-- Type checking is performed by verifying that all values
-- in that set are accepted by a predicate function for that type.
-- This predicate function takes a value and returns true or false
-- whether the value is a member of that type.
--
-- As an alternative to representing types as a set of representative
-- values, we could represent types more formally, such as with
-- first-order logic, but then we get into theorem proving,
-- which is more involved.
--
-- DavidManura, 2007, licensed under the same terms as Lua itself.

local M = {}

-- Stored Expression design pattern
-- ( https://lua-users.dev.org.tw/wiki/StatementsInExpressions )
local StoredExpression
do
  local function call(self, ...)
    self.__index = {n = select('#', ...), ...}
    return ...
  end
  function StoredExpression()
    local self = {__call = call}
    return setmetatable(self, self)
  end
end
 
-- Whether to enable type checking (true/false).  Default true.
local is_typecheck = true

-- TypedValue is an abstract type for values that are typed.
-- This holds both the actual value and a subset of possible
-- values the value could assume at runtime.  That set should at least
-- include the min and max values (for bounds checking).
local TypedValue = {}

-- Check that value x satisfies type predicate function f.
function M.check_type(x, f)
  for _,v in ipairs(x) do
    assert(f(v))
  end
  return x.v
end


-- Type check function that decorates functions.
-- Example:
--   abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')(
--     function(x) return x >= 0 and x or -x end
--   )
function M.typecheck(...)
  local types = {...}
  return function(f)
    local function check(i, ...)
      -- Check types of return values.
      if types[i] == "->" then i = i + 1 end
      local j = i
      while types[i] ~= nil do
        M.check_type(select(i - j + 1, ...), types[i])
        i = i + 1
      end
      return ...
    end
    return function(...)
      -- Check types of input parameters.
      local i = 1
      while types[i] ~= nil and types[i] ~= "->" do
        M.check_type(select(i, ...), types[i])
        i = i + 1
      end
      return check(i, f(...))  -- call function
    end
  end
end


function M.notypecheck() return function(f) return f end end
function M.nounbox(x) return x end

M.typecheck = is_typecheck and M.typecheck or M.notypecheck
M.unbox = is_typecheck and M.unbox or M.nounbox

-- Return a boxed version of a binary operation function.
-- For the returned function,
--   Zero, one, or two of the arguments may be boxed.
--   The result value is boxed.
-- Example:
--   __add = boxed_op(function(a,b) return a+b end)
function M.boxed_op(op)
  return function(a, b)
    if getmetatable(a) ~= TypedValue then a = M.box(a) end
    if getmetatable(b) ~= TypedValue then b = M.box(b) end
    local t = M.box(op(M.unbox(a), M.unbox(b)))
    local seen = {[t[1]] = true}
    for _,a2 in ipairs(a) do
      for _,b2 in ipairs(b) do
        local c2 = op(a2, b2)
        if not seen[c2] then
          t[#t + 1] = op(a2, b2)
          seen[c2] = true
        end
      end
    end
    return t
  end
end

-- Return a boxed version of a unary operation function.
-- For the returned function,
--   The argument may optionally be boxed.
--   The result value is boxed.
-- Example:
--   __unm = boxed_uop(function(a) return -a end)
function M.boxed_uop(op)
  return function(a)
    if getmetatable(a) ~= TypedValue then a = M.box(a) end
    local t = M.box(op(M.unbox(a)))
    local seen = {[t[1]] = true}
    for _,a2 in ipairs(a) do
      local c2 = op(a2)
      if not seen[c2] then
        t[#t + 1] = op(a2)
        seen[c2] = true
      end
    end
    return t
  end
end

TypedValue.__index = TypedValue
TypedValue.__add = M.boxed_op(function(a, b) return a + b end)
TypedValue.__sub = M.boxed_op(function(a, b) return a - b end)
TypedValue.__mul = M.boxed_op(function(a, b) return a * b end)
TypedValue.__div = M.boxed_op(function(a, b) return a / b end)
TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end)
TypedValue.__mod = M.boxed_op(function(a, b) return a % b end)
TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end)
-- TypedValue.__le -- not going to work? (metafunction returns Boolean)
-- TypedValue.__lt -- not going to work? (metafunction returns Boolean)
-- TypedValue.__eq -- not going to work? (metafunction returns Boolean)
TypedValue.__tostring = function(self)
  local str = "[" .. tostring(self.v) .. " in "
  for i,v in ipairs(self) do
    if i ~= 1 then str = str .. ", " end
    str = str .. v
  end
  str = str .. "]"
  return str 
end
-- Convert a regular value into a TypedValue.  We call this "boxing".
function M.box(v, ...)
  local t = setmetatable({v = v, ...}, TypedValue)
  if #t == 0 then t[1] = v end
  return t
end
-- Convert a TypedValue into a regular value.  We call this "unboxing".
function M.unbox(x)
  assert(getmetatable(x) == TypedValue)
  return x.v
end


-- Returns a type predicate function for a given interval over the reals.
-- Example: ranged_real'[0,inf)'
-- Note: this function could be memoized.
function M.ranged_real(name, a, b)
  local ex = StoredExpression()

  if name == "(a,b)" then
    return function(x) return type(x) == "number" and x > a and x < b end
  elseif name == "(a,b]" then
    return function(x) return type(x) == "number" and x > a and x <= b end
  elseif name == "[a,b)" then
    return function(x) return type(x) == "number" and x >= a and x < b end
  elseif name == "[a,b]" then
    return function(x) return type(x) == "number" and x >= a and x <= b end
  elseif name == "(-inf,inf)" then
    return function(x) return type(x) == "number" end
  elseif name == "[a,inf)" then
    return function(x) return type(x) == "number" and x >= a end
  elseif name == "(a,inf)" then
    return function(x) return type(x) == "number" and x > a end
  elseif name == "(-inf,a]" then
    return function(x) return type(x) == "number" and x <= a end
  elseif name == "(-inf,a)" then
    return function(x) return type(x) == "number" and x < a end
  elseif name == "[0,inf)" then
    return function(x) return type(x) == "number" and x >= 0 end
  elseif name == "(0,inf)" then
    return function(x) return type(x) == "number" and x > 0 end
  elseif name == "(-inf,0]" then
    return function(x) return type(x) == "number" and x <= 0 end
  elseif name == "(-inf,0)" then
    return function(x) return type(x) == "number" and x < 0 end
  elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then
    local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4]
    if left == "(" and right == ")" then
      return function(x) return type(x) == "number" and x > a and x < b end
    elseif left == "(" and right == "]" then
      return function(x) return type(x) == "number" and x > a and x <= b end
    elseif left == "[" and right == ")" then
      return function(x) return type(x) == "number" and x >= a and x < b end
    elseif left == "[" and right == "]" then
      return function(x) return type(x) == "number" and x >= a and x <= b end
    else assert(false)
    end
  else
    error("invalid arg " .. name, 2)
  end
end


return M

Example usage:

-- type_example.lua
-- Test of ExTypeCheck.lua.

local TC = require "ExTypeCheck"
local typecheck = TC.typecheck
local ranged_real = TC.ranged_real
local boxed_uop = TC.boxed_uop
local box = TC.box

-- Checked sqrt function.
local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')(
  function(x)
    return boxed_uop(math.sqrt)(x)
  end
)

-- Checked random function.
local random = typecheck('->', ranged_real'[0,1)')(
  function()
    return box(math.random(), 0, 0.999999)
  end
)

print(box("a", "a", "b") .. "z")
print(box(3, 3,4,5) % 4)

math.randomseed(os.time())
local x = 0 + random() * 10 - 1 + (random()+1) * 0
print(x + 1); print(sqrt(x + 1)) -- ok
print(x); print(sqrt(x)) -- always asserts! (since x might be negative)

Example output:

[az in az, bz]
[3 in 3, 0, 1]
[8.7835848325787 in 8.7835848325787, 0, 9.99999]
[2.9637113274708 in 2.9637113274708, 0, 3.1622760790292]
[7.7835848325787 in 7.7835848325787, -1, 8.99999]
lua: ./ExTypeCheck.lua:50: assertion failed!
stack traceback:
        [C]: in function 'assert'
        ./ExTypeCheck.lua:50: in function 'check_type'
        ./ExTypeCheck.lua:78: in function 'sqrt'
        testt.lua:30: in main chunk
        [C]: ?

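Note: this approach of having values hold multiple values has some similarities to Perl 6 junctions (originally termed "quantum superpositions").

Solution: Metalua Run-time Type Checking

There is an example of run-time type checking in Metalua [2].

Solution: Dao

The Dao language, which is partly based on Lua, has built-in support for optional typing [3].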


--DavidManura

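Solution: Teal

The [Teal language] is a typed dialect of Lua that compiles to Lua.

Solution: TypeScriptToLua

[TypeScriptToLua] is a TypeScript to Lua transpiler that lets us write Lua code using TypeScript syntax and compile-time type checking.

I would argue that, except for very simple programs or programs that always receive the same input, disabling type checking in a script is a bad idea. --JohnBelmonte

The comment above was anonymously deleted with the note "this is not a forum for posting personal opinions". On the contrary, it is established wiki style. People can comment on a page, and for points of contention this is more polite than summarily changing the original text. The original author can decide whether to incorporate such comments into the original text. --JohnBelmonte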


Last edited July 12, 2022 8:38 am GMT