Monday, August 17, 2009 8:20 PM bart

LINQ to Ducks – Bringing Back The Duck-Typed foreach Statement To LINQ

I promise, it will be a (relatively) short post this time. You all know the foreach statement in C#, don’t you? Think twice before you answer and tell me exactly how the following works:

foreach (int x in src)
{
    // Do something with x.
}

Got an answer? Let me disappoint you: if you have the answer, you’re wrong. There’s no single answer to the question above as you need to know more about the type of src to make a final decision on how the above works…

You may say that clearly that object needs to implement IEnumerable or IEnumerable<T>, and maybe you’ll even mention that in the former case the compiler inserts a cast for you when it gets “x” back from the call to the IEnumerator’s Current property getter. In other words, the code gets translated like this:

var e = src.GetEnumerator();
while (e.MoveNext())
{
    var x = (int)e.Current; // without the cast if src was an IEnumerable<T>
    // Do something with x.
}

A worthy attempt at the translation but not quite right. First of all, the variable x is declared in an outer scope (causing some grief when talking about closures, but that’s a whole different topic…). Secondly, the enumerator may implement IDisposable, in which case the foreach-statement will ensure proper disposal a la “using”:

{
    int x;

    using (var e = src.GetEnumerator())
    {
        while (e.MoveNext())
        {
            x = (int)e.Current; // without the cast if src was an IEnumerable<T>
            // Do something with x.
        }
    }
}
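The closure grief hinted at above can be sketched by capturing the loop variable in a lambda. This is a hand-written illustration of the expansion, not compiler output, and the names ClosureDemo, SharedVariable and PerIterationCopy are made up for the example. (Starting with C# 5.0 the compiler scopes the foreach variable per iteration, so the foreach statement itself no longer shows the shared-variable behavior.)

```csharp
using System;
using System.Collections.Generic;

static class ClosureDemo
{
    // x lives outside the loop, as in the expansion above:
    // every lambda captures the same variable.
    public static List<int> SharedVariable(IEnumerable<int> src)
    {
        var funcs = new List<Func<int>>();
        int x;

        using (IEnumerator<int> e = src.GetEnumerator())
        {
            while (e.MoveNext())
            {
                x = e.Current;
                funcs.Add(() => x);
            }
        }

        return Invoke(funcs);
    }

    // A fresh variable per iteration gives each lambda its own copy.
    public static List<int> PerIterationCopy(IEnumerable<int> src)
    {
        var funcs = new List<Func<int>>();

        using (IEnumerator<int> e = src.GetEnumerator())
        {
            while (e.MoveNext())
            {
                int copy = e.Current;
                funcs.Add(() => copy);
            }
        }

        return Invoke(funcs);
    }

    // Runs the lambdas only after the loop has finished.
    private static List<int> Invoke(List<Func<int>> funcs)
    {
        var results = new List<int>();
        foreach (Func<int> f in funcs)
            results.Add(f());
        return results;
    }
}
```

For the input 1, 2, 3 the first method returns 3, 3, 3 (all lambdas see the last value written to x), while the second returns 1, 2, 3.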

That’s a bit more sane, but we’re missing out on another kind of source foreach can work with: any object, as long as it exposes the enumeration pattern of GetEnumerator in tandem with MoveNext and Current. Here’s a sample object that just works fine with the foreach-statement:

class Source
{
    public SourceEnumerator GetEnumerator()
    {
        return new SourceEnumerator();
    }
}

class SourceEnumerator
{
    private Random rand = new Random();

    public bool MoveNext()
    {
        return rand.Next(100) != 0;
    }

    public int Current
    {
        get
        {
            return rand.Next(100);
        }
    }
}

With its usage shown below:

foreach (int x in new Source())
    Console.WriteLine(x);

Okay, that’s flexible, isn’t it? In fact, the foreach-statement can be said to be duck typed: it’s not the nominal type that matters (i.e. whether Source is explicitly declared to be an IEnumerable, and SourceEnumerator an IEnumerator) but just the structure of the object that determines “compatibility” with the foreach-statement.

But whoever says foreach over a collection immediately starts thinking about LINQ, no? Say the consumer of Source looked like this:

List<int> res = new List<int>();
foreach (int x in new Source())
    if (x % 2 == 0)
        res.Add(x);

A great candidate for LINQ it seems, especially as we start adding more and more logic to the “query”. Nothing surprising about this conclusion, but trying to realize it fails miserably:

[Image: the attempt to query Source with LINQ fails]

Why? Because LINQ is statically typed (update: to be taken with a grain of salt, see comments below this post; agreed, it'd be more precise to write LINQ to Objects as the subject of this sentence), so it expects what I’ve referred to as a nominal enumerator implementation: something that has been explicitly stated to be an IEnumerable and not something that “accidentally” happens to look like that. Question of the day: how to morph an existing structural enumerator into a nominal one so it can be used with LINQ? Sure, we could write specialized code for the Source object above that essentially creates an iterator on top of Source:

static void Main()
{
    var res = from x in IterateOver(new Source())
              where x % 2 == 0
              select x;

    foreach (var x in res)
        Console.WriteLine(x);
}

static IEnumerable<int> IterateOver(Source s)
{
    foreach (int i in s)
        yield return i;
}

But maybe you’re in a scenario with plenty of those structural enumerator constructs around (e.g. some Office automation libraries expose GetEnumerator on types like Range, while the Range object itself doesn’t implement IEnumerable hence it’s not usable with LINQ), so you want to generalize the above. Essentially, given any object you’d like to provide a duck-typed iterator over it, a suitable task for another extension method and C# 4.0 dynamic:

static class DuckEnumerable
{
    public static IEnumerable<T> AsDuckEnumerable<T>(this object source)
    {
        dynamic src = source;

        var e = src.GetEnumerator();
        try
        {
            while (e.MoveNext())
                yield return e.Current;
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
            {
                d.Dispose();
            }
        }
    }
}

Question to the reader: why can’t we simply write a foreach-loop over the “source cast as dynamic” object? Tip: how would you implement the translation of foreach when encountering a dynamic object as its source?

Yes, you’re cluttering the apparent member list on System.Object, so use with caution or just use plain old method calls to do the “translation”. What matters more is the inside of the operator, using the dynamic type quite a bit to realize the enumeration pattern. Notice how easy on the eye dynamically typed code looks in C# 4.0. With many more explicit casts, it’d look like this:

static class DuckEnumerable
{
    public static IEnumerable<T> AsDuckEnumerable<T>(this object source)
    {
        dynamic src = (dynamic)source;

        dynamic e = src.GetEnumerator();
        try
        {
            while ((bool)e.MoveNext())
                yield return (T)e.Current;
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
            {
                d.Dispose();
            }
        }
    }
}

And now we can write:

var res = from x in new Source().AsDuckEnumerable<int>()
          where x % 2 == 0
          select x;

foreach (var x in res)
    Console.WriteLine(x);

Dynamic glue – why not? In fact, even objects from other languages (like Ruby or Python) that follow the pattern will now work with LINQ, and for existing compatible objects the operator call is harmless (but wasteful). Oh, and notice you can also have an IEnumerable of “dynamic” objects if you’re dealing with objects originating from dynamic languages...
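The "harmless (but wasteful)" remark suggests one possible refinement: hand back sources that already are an IEnumerable<T> and only fall back to dynamic dispatch for true ducks. The sketch below assumes C# 4.0 with a reference to Microsoft.CSharp; the names CheapDuckEnumerable, AsDuckEnumerableOrSelf, Countdown and CountdownEnumerator are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class CheapDuckEnumerable
{
    // Hypothetical variant of AsDuckEnumerable<T>: nominal sequences pass through untouched.
    public static IEnumerable<T> AsDuckEnumerableOrSelf<T>(this object source)
    {
        if (source == null)
            throw new ArgumentNullException("source");

        var typed = source as IEnumerable<T>;
        if (typed != null)
            return typed; // no extra iterator layer, no dynamic dispatch

        return DuckIterate<T>(source);
    }

    // Same dynamic enumeration pattern as in the post, used only for structural sources.
    private static IEnumerable<T> DuckIterate<T>(object source)
    {
        dynamic src = source;
        dynamic e = src.GetEnumerator();
        try
        {
            while ((bool)e.MoveNext())
                yield return (T)e.Current;
        }
        finally
        {
            var d = e as IDisposable;
            if (d != null)
                d.Dispose();
        }
    }
}

// Hypothetical duck-typed source: counts down from start to 1 without implementing IEnumerable.
public class Countdown
{
    private readonly int start;

    public Countdown(int start) { this.start = start; }

    public CountdownEnumerator GetEnumerator()
    {
        return new CountdownEnumerator(start);
    }
}

public class CountdownEnumerator
{
    private int next;
    private int current;

    public CountdownEnumerator(int start) { next = start; }

    public bool MoveNext()
    {
        if (next <= 0)
            return false;
        current = next--;
        return true;
    }

    public int Current
    {
        get { return current; }
    }
}
```

With this variant a List<int> comes back as the very same object, while new Countdown(3) is enumerated through the dynamic path and produces 3, 2, 1.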

Can you implement the AsDuckEnumerable operator in C# 3.0? Absolutely, if you limit yourself to reflection-based discovery methods (left as an exercise for the reader).
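For readers who want a starting point for that exercise, here is one possible reflection-based sketch that sticks to C# 3.0 features. It is only an illustration under a few assumptions: the pattern members are public, errors surface lazily during enumeration (as with the dynamic version), and the names ReflectionDuckEnumerable, AsReflectedEnumerable, Counter, CounterEnumerator and Demo are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ReflectionDuckEnumerable
{
    // Hypothetical C# 3.0 counterpart of AsDuckEnumerable<T>, using reflection instead of dynamic.
    public static IEnumerable<T> AsReflectedEnumerable<T>(this object source)
    {
        // Look up the public, parameterless GetEnumerator method on the source.
        MethodInfo getEnumerator = source.GetType().GetMethod("GetEnumerator", Type.EmptyTypes);
        if (getEnumerator == null)
            throw new InvalidOperationException("No public parameterless GetEnumerator method.");

        object e = getEnumerator.Invoke(source, null);
        if (e == null)
            throw new InvalidOperationException("GetEnumerator returned null.");

        try
        {
            // Discover MoveNext and Current on the enumerator's runtime type.
            MethodInfo moveNext = e.GetType().GetMethod("MoveNext", Type.EmptyTypes);
            PropertyInfo current = e.GetType().GetProperty("Current");
            if (moveNext == null || current == null)
                throw new InvalidOperationException("Enumerator lacks MoveNext or Current.");

            while ((bool)moveNext.Invoke(e, null))
                yield return (T)current.GetValue(e, null);
        }
        finally
        {
            IDisposable d = e as IDisposable;
            if (d != null)
                d.Dispose();
        }
    }
}

// Hypothetical duck-typed source: counts from 1 up to limit without implementing IEnumerable.
public class Counter
{
    private readonly int limit;

    public Counter(int limit) { this.limit = limit; }

    public CounterEnumerator GetEnumerator()
    {
        return new CounterEnumerator(limit);
    }
}

public class CounterEnumerator
{
    private readonly int limit;
    private int current;

    public CounterEnumerator(int limit) { this.limit = limit; }

    public bool MoveNext()
    {
        if (current >= limit)
            return false;
        current++;
        return true;
    }

    public int Current
    {
        get { return current; }
    }
}

public static class Demo
{
    // Same query shape as in the post, now over the reflection-based wrapper.
    public static List<int> Evens(int limit)
    {
        var res = from x in new Counter(limit).AsReflectedEnumerable<int>()
                  where x % 2 == 0
                  select x;
        return res.ToList();
    }
}
```

Under these assumptions Demo.Evens(5) returns 2 and 4. Each reflective call is considerably slower than a direct one, so caching the MethodInfo and PropertyInfo lookups per type would be a natural next step.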

Enjoy!


Comments

# Reflective Perspective - Chris Alcock » The Morning Brew #414

Pingback from Reflective Perspective - Chris Alcock » The Morning Brew #414

# re: LINQ to Ducks – Bringing Back The Duck-Typed foreach Statement To LINQ

Tuesday, August 18, 2009 1:22 AM by James Hart

"LINQ is statically typed, so it expects what I’ve referred to as a nominal enumerator implementation: something that has explicitly stated to be an IEnumerable"

You're oversimplifying, and you know it. -LINQ to Objects- is statically typed, in that its provision of the query pattern only applies to types that implement IEnumerable.

But -LINQ- isn't statically typed at all - it's just looking for a method called Where, which takes an appropriate lambda argument. One way to get it is to lean on LINQ to objects, but there's plenty of other ways - explicit implementation, extension method, etc.

If anything, it's even more liberally duck typed than foreach. Will foreach accept an extension GetEnumerator() method?

# Dew Drop – August 18, 2009 | Alvin Ashcraft's Morning Dew

Pingback from Dew Drop – August 18, 2009 | Alvin Ashcraft's Morning Dew

# re: LINQ to Ducks – Bringing Back The Duck-Typed foreach Statement To LINQ

Tuesday, August 18, 2009 10:31 AM by Craig Stuntz

But LINQ is also duck typed. So (if you were especially masochistic) you could also implement the LINQ query pattern (Select(), Where(), etc.) without having to use dynamic.

# re: LINQ to Ducks – Bringing Back The Duck-Typed foreach Statement To LINQ

Tuesday, August 18, 2009 10:54 AM by bart

Hi James, Craig,

Sure, those are valid points. Especially if you're reading the blog of someone who has created state machines of types to implement the query pattern and enforce ordering and cardinality of operator use... (See ExceLINQ for starters, and the C# query translation cheat sheet for "enders".)

But in this case, all of that doesn't help for the general case of having an object that happens to behave like an IEnumerable but isn't one. Sure, I could implement the whole of LINQ to Objects again and again on every single object I want to use LINQ over. But it's hard to imagine I wouldn't figure out that IterateOver is a better solution in such a case, and for the same reason it's useful to have a general-purpose mechanism to "promote" a duck-typed enumerator object into the IEnumerable world (or, dare I say, the query comprehension monad).

So, I kept my promise of "(relatively) short post this time", which implies there's lots of reading between the lines possible and sometimes desirable :-).

Thanks,

-Bart

# LINQ to Ducks – Bringing Back The Duck-Typed foreach Statement To LINQ

Wednesday, August 19, 2009 12:43 AM by progg.ru

Thank you for submitting this cool story - Trackback from progg.ru

# ContinousLearner: Links (8/20/2009) | Astha

Thursday, September 10, 2009 12:27 AM by ContinousLearner: Links (8/20/2009) | Astha

Pingback from  ContinousLearner: Links (8/20/2009) | Astha

# Top 9 Posts from 2009

Sunday, January 03, 2010 1:08 AM by B# .NET Blog

select top 9 [Subject] from dbo.cs_Posts where postlevel = 1 and usertime < '01/01/2010'