This is a programming article disguised as a Lua-centric piece or
at least so I think. I find this kind of clever well-oiled mechanisms
exciting and interesting. If you do too (or you don't know yet), I
encourage you to join me today even if you never wrote a line of code
in Lua.
Lua has four looping constructs:
- a numerical for,
- a generic for,
- a repeat–until, and
- a while loop.
We could use an iterator in each and one of them, but "iterator"—or
"iterator function"—has a special meaning in the context of the
generic for loop and I intend to focus on that. For those unfamiliar,
the syntax is:
generic-for ::= "for" name-list "in" expression-list "do" block "end"
Where the unbound name-list, expression-list,
and block are hopefully self-explanatory since they refer
to concepts that are not exclusive to Lua (although must be compliant
to its grammar). For example, we can use the generic for together with
pairs(t):
for key, value in pairs(sth) do
print(key, value)
end
Now, when pairs is called it returns three values:
- next(t
[, index]); an iterator function,
- the passed table as-is (above it's
sth); a state value, and
- a
nil.
The first variable from the name-list (above it's key)
is also known as control variable. The oddly specific
nil returned from pairs is used to initialise
this control variable before the first iteration of the loop. On each
iteration, state and control variable are fed into the iterator function
to assign values to the variables from the name-list (including the
control variable itself). We can unwind this into:
iterator, state, key = pairs(sth)
key, value = iterator(state, key)
print(key, value)
key, value = iterator(state, key)
print(key, value)
Of course, I simplified scoping rules and Lua implementation internals,
but the essence of the operation is preserved.
Here, we can note that the order of arguments passed to iterator
allows to leverage the Lua's syntactic sugar for methods, assume:
seq = {}
function seq:next (i)
i = i + 1
if i > self.last then
return nil
end
return i
end
function seq:new (last)
return self.next, {last=last}, 0
end
Simplified to avoid using metatables or third-party class/OOP library.
With this, we can seq:new in place of pairs:
for i in seq:new(5) do
print(i)
end
This will print numbers from 1 to 5 inclusive.
Similarly to pairs we returned a table, this time a
completely new one (although if the example was more object-oriented,
we could have returned self), but there is no rule that
says the state must be a table. In case of a simple sequence like above,
we can also:
function successor (max, i)
i = i + 1
if i > max then
return nil
end
return i
end
function sequence (max)
return successor, max, 0
end
Now, similarly to above we will get numbers 1-5 inclusive:
for i in sequence(5) do
print(i)
end
Of course, state may also be nil and we are free to ignore
it in our iterator function although it's hard to draw a solid example
for this. What if we were to look at control variable in the same light?
Ignoring just control variable while using state is pretty much a classic
OOP placed in Lua syntax. Try it. As for ignoring both, we could:
max = 0
counter = 0
function global_step ()
counter = counter + 1
if counter > max then
return nil
end
return counter
end
function sequence (new_max)
max = new_max
counter = 0
return global_step, nil, nil
end
Trailing nils are there for clarity.
But this raises a big red flag with the global state. Well, it does
illustrate the point as long as you are not distracted. Now, if we
consider a global state as simply unbounded state (sic! see my article
on environments for detailed clarification), then the next step is to
bind the state to the function. Lua has closures, so we could:
function sequence (max)
local i = 0
return function ()
i = i + 1
if i > max then
return nil
end
return i
end
end
This time I ignored trailing nils for state and initial control value.
This, once again, replaces previous functions of the same name and can
be plugged into generic for loop:
for i in sequence(5) do
print(i)
end
In general, we can attach a state—in wider programming sense of
the word, not the generic-for-specific one—to an iterating operation
in four ways:
- control variable with optional initial value,
- the "state" from loop initialisation,
- globals accessible from iterator function, and
- locals (upvalues) accessible from iterator function that is a closure.
Where the first two will inevitably end up getting bound as arguments
to iterator function on each step. Additionally, the iterator function
may be any callable; this includes table with metatable that implements
__call metamethod. Calling table via metatable binds
"self", so we get yet another way to slip some more information when
calculating next iteration step.
I ignored remaining variables from the name-list so far. Well,
that's because they are not significant here, but as a last remark today,
let's just note that while pairs has just two variables:
key (control) and associated value, it is again not a rule:
function next_two (t, i)
i = (i or 0) + 1
local a, b = t[i], t[i + 1]
if a == nil or b == nil then
return nil
end
return i, a, b
end
function twos (t)
return next_two, t
end
This will produce three values in each iteration:
for i, a, b in twos{"a", "b", "c"} do
print(i, a, b)
end
With three elements in input table we will get two iterations: first
1, "a", "b", second 2, "b", "c".
I find this mechanism charming. It gives you a number of full-pledged
tools and allows to craft whatever seems to be correct for the problem,
with all the context, constraints, data structures, and whatnot involved.
Yet, at the same time, it still allows making proper generic solutions.
How do you compose this things together? We'll see it next time.