Laziness and ToArray()
2012-06-04
In C# and F# we have these wonderful lazy enumerations and sequence-generating methods that the compiler rewrites into state machines. Some people have even misused this mechanism to implement coroutines.
But as so often, and specifically in an imperative setting and in a world with side effects, these enumerations are a leaky abstraction: the developer needs to be aware of which enumerations are evaluated on demand and which are ready to use.
I try to differentiate these two types of collections by using IEnumerable<T> for the lazy ones and T[] (Array of T) for the ones that are ready to use without code being evaluated. In addition, I try to guarantee that any instance of IEnumerable<T> has no side effects.
But looking closer, there are some side effects that cannot be contained so easily:
- A sequence evaluation might throw an exception.
- A sequence evaluation may take a long time to compute its elements.
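To make the first point concrete, here is a minimal sketch. The method name Quotients and its input are made up for illustration. It shows that calling an iterator method runs none of its body, so the exception surfaces later, wherever the sequence happens to be enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Hypothetical iterator method. The compiler rewrites it into a state
    // machine, so this body does not run until enumeration starts.
    public static IEnumerable<int> Quotients(int[] divisors)
    {
        foreach (var d in divisors)
            yield return 100 / d; // integer division throws for d == 0
    }

    public static void Main()
    {
        // Nothing is evaluated here, so nothing can throw here.
        var seq = Quotients(new[] { 5, 0, 2 });
        Console.WriteLine("sequence created");

        try
        {
            foreach (var x in seq)
                Console.WriteLine(x); // prints 20, then the second element throws
        }
        catch (DivideByZeroException)
        {
            Console.WriteLine("failed during enumeration");
        }
    }
}
```

The call site of Quotients and the place where the failure shows up can be arbitrarily far apart, which is exactly what makes the abstraction leaky.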
Now, obviously, exceptions shouldn’t be caught in the enumerator method, and in a properly designed software stack they are not even handled inside a bounded context or domain. Still, sometimes it just feels wrong to pass around an IEnumerable<T> that runs a lot of code when it is enumerated. In that case I personally resort to a paranoia response (red lights turn on!!!): a ToArray() conversion, which converts the IEnumerable<T> to an array and thereby evaluates all its elements. This makes the evaluation of the sequence explicit and pins down the exact time and location where it happens.
Now that we have an array, we often pass the fully evaluated sequence on to other methods, which, as commonly recommended, also expect IEnumerable<T> instead of T[]. So the awareness of the nature of our sequence (whether it is already evaluated or not) gets lost again.
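A small sketch of what ToArray() buys us. The Squares method and the Evaluations counter are invented for this example; the counter only exists to make visible when the iterator body runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // Illustration only: counts how many elements the iterator has produced.
    public static int Evaluations = 0;

    // Hypothetical lazy sequence.
    public static IEnumerable<int> Squares(int count)
    {
        for (var i = 1; i <= count; i++)
        {
            Evaluations++;
            yield return i * i;
        }
    }

    public static void Main()
    {
        IEnumerable<int> lazy = Squares(3);
        Console.WriteLine(Evaluations); // 0: nothing has run yet

        int[] ready = lazy.ToArray();   // evaluation happens here, and only here
        Console.WriteLine(Evaluations); // 3

        Console.WriteLine(ready.Sum()); // 14
        Console.WriteLine(Evaluations); // still 3: the array runs no code

        Console.WriteLine(lazy.Sum());  // 14, but the iterator ran again
        Console.WriteLine(Evaluations); // 6
    }
}
```

The last two lines show the flip side: anyone who still holds the lazy reference re-runs the whole computation on every enumeration.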
I would prefer a more specific solution to this problem so that paranoia responses can be avoided in most situations. Any ideas?
Update: I found this solution that materializes sequences through ToArray() and “tags” them with a custom IEnumerable<T>-derived class. It’s a partial solution because it does not fix the paranoia response; nevertheless, it makes the code much more readable. I like it.
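I can't reproduce the linked code here, so the following is only my own rough sketch of what such a tagging class could look like, not the original. The names Materialized<T> and Materialize() are made up, and it assumes C# 6 or later for the expression-bodied members.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker type: an IEnumerable<T> that is known to be
// fully evaluated because it wraps an array.
public sealed class Materialized<T> : IEnumerable<T>
{
    private readonly T[] items;

    internal Materialized(T[] items) { this.items = items; }

    public int Count => items.Length;

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)items).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => items.GetEnumerator();
}

public static class MaterializedExtensions
{
    // Evaluates the sequence once via ToArray(), unless it is already tagged.
    public static Materialized<T> Materialize<T>(this IEnumerable<T> source)
        => source as Materialized<T> ?? new Materialized<T>(source.ToArray());
}

public static class TaggingDemo
{
    // The parameter type documents that no code runs during enumeration.
    static int Total(Materialized<int> values) => values.Sum();

    public static void Main()
    {
        var numbers = Enumerable.Range(1, 4).Select(n => n * 10).Materialize();
        Console.WriteLine(numbers.Count);  // 4
        Console.WriteLine(Total(numbers)); // 100
        // Materializing twice does not copy or re-evaluate anything.
        Console.WriteLine(ReferenceEquals(numbers, numbers.Materialize())); // True
    }
}
```

A method that takes Materialized<T> states in its signature that it expects an evaluated sequence, while the value can still be handed to anything that accepts IEnumerable<T>. The caller still has to decide where to call Materialize(), so the paranoia response itself remains.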