About Lua Iterators

This is a programming article disguised as a Lua-centric piece or at least so I think. I find this kind of clever well-oiled mechanisms exciting and interesting. If you do too (or you don't know yet), I encourage you to join me today even if you never wrote a line of code in Lua.

Lua has four looping constructs:

We could use an iterator in each and one of them, but "iterator"—or "iterator function"—has a special meaning in the context of the generic for loop and I intend to focus on that. For those unfamiliar, the syntax is:

generic-for ::= "for" name-list "in" expression-list "do" block "end"

Where the unbound name-list, expression-list, and block are hopefully self-explanatory since they refer to concepts that are not exclusive to Lua (although must be compliant to its grammar). For example, we can use the generic for together with pairs(t):

for key, value in pairs(sth) do
	print(key, value)
end

Now, when pairs is called it returns three values:

  1. next(t [, index]); an iterator function,
  2. the passed table as-is (above it's sth); a state value, and
  3. a nil.

The first variable from the name-list (above it's key) is also known as control variable. The oddly specific nil returned from pairs is used to initialise this control variable before the first iteration of the loop. On each iteration, state and control variable are fed into the iterator function to assign values to the variables from the name-list (including the control variable itself). We can unwind this into:

iterator, state, key = pairs(sth)
key, value = iterator(state, key)
print(key, value)
key, value = iterator(state, key)
print(key, value)

Of course, I simplified scoping rules and Lua implementation internals, but the essence of the operation is preserved.

Here, we can note that the order of arguments passed to iterator allows to leverage the Lua's syntactic sugar for methods, assume:

seq = {}

function seq:next (i)
	i = i + 1
	if i > self.last then
		return nil
	end
	return i
end

function seq:new (last)
	return self.next, {last=last}, 0
end

Simplified to avoid using metatables or third-party class/OOP library. With this, we can seq:new in place of pairs:

for i in seq:new(5) do
	print(i)
end

This will print numbers from 1 to 5 inclusive.

what the hell is this

Similarly to pairs we returned a table, this time a completely new one (although if the example was more object-oriented, we could have returned self), but there is no rule that says the state must be a table. In case of a simple sequence like above, we can also:

function successor (max, i)
	i = i + 1
	if i > max then
		return nil
	end
	return i
end

function sequence (max)
	return successor, max, 0
end

Now, similarly to above we will get numbers 1-5 inclusive:

for i in sequence(5) do
	print(i)
end

Of course, state may also be nil and we are free to ignore it in our iterator function although it's hard to draw a solid example for this. What if we were to look at control variable in the same light? Ignoring just control variable while using state is pretty much a classic OOP placed in Lua syntax. Try it. As for ignoring both, we could:

max = 0
counter = 0

function global_step ()
	counter = counter + 1
	if counter > max then
		return nil
	end
	return counter
end

function sequence (new_max)
	max = new_max
	counter = 0
	return global_step, nil, nil
end

Trailing nils are there for clarity.

But this raises a big red flag with the global state. Well, it does illustrate the point as long as you are not distracted. Now, if we consider a global state as simply unbounded state (sic! see my article on environments for detailed clarification), then the next step is to bind the state to the function. Lua has closures, so we could:

function sequence (max)
	local i = 0
	return function ()
		i = i + 1
		if i > max then
			return nil
		end
		return i
	end
end

This time I ignored trailing nils for state and initial control value. This, once again, replaces previous functions of the same name and can be plugged into generic for loop:

for i in sequence(5) do
	print(i)
end

In general, we can attach a state—in wider programming sense of the word, not the generic-for-specific one—to an iterating operation in four ways:

Where the first two will inevitably end up getting bound as arguments to iterator function on each step. Additionally, the iterator function may be any callable; this includes table with metatable that implements __call metamethod. Calling table via metatable binds "self", so we get yet another way to slip some more information when calculating next iteration step.

I ignored remaining variables from the name-list so far. Well, that's because they are not significant here, but as a last remark today, let's just note that while pairs has just two variables: key (control) and associated value, it is again not a rule:

function next_two (t, i)
	i = (i or 0) + 1
	local a, b = t[i], t[i + 1]
	if a == nil or b == nil then
		return nil
	end
	return i, a, b
end

function twos (t)
	return next_two, t
end

This will produce three values in each iteration:

for i, a, b in twos{"a", "b", "c"} do
	print(i, a, b)
end

With three elements in input table we will get two iterations: first 1, "a", "b", second 2, "b", "c".

I find this mechanism charming. It gives you a number of full-pledged tools and allows to craft whatever seems to be correct for the problem, with all the context, constraints, data structures, and whatnot involved. Yet, at the same time, it still allows making proper generic solutions.

How do you compose this things together? We'll see it next time.