Thursday, July 05, 2007 8:39 PM
LINQ to SharePoint - Improving the parser == debugger visualizer fun
Welcome back to what's going to end up as "LINQ to SharePoint: The Cruel Sequel" :-). The last couple of days, LINQ to SharePoint has been a full-time job and the result is getting better and better build after build. In this post, I'd like to highlight another feature that was planned from the start but didn't make its way to the 0.2 release of last month: parser enhancements.
So, what's up? Simply stated, the query parser so far was a runtime parser only. When executing LINQ to SharePoint queries, the LINQ query expression tree gets parsed sooner or later, possibly throwing exceptions in case something can't be translated into CAML. A typical example is the following:
var res = from u in users where u.FirstName.EndsWith("t") select u;
The reason this can't be translated is the EndsWith call on the FirstName entity property. Since CAML doesn't support an equivalent in its query language, we can't provide a translation. There are many more such things that make the parse operation fail, due to the relatively limited expressiveness of CAML. The difficulty for developers, especially with big queries, is finding out exactly where the problem is located. In the previous alphas an InvalidOperationException is thrown with some message, possibly referring to something in the query that couldn't be translated (e.g. "Unsupported string filtering query expression detected: EndsWith. Only the methods Contains and StartsWith are supported."). Although this sample message is pretty easy to understand, there are more complex ones that deserve a better approach.
To put this in a broader context, you should be aware that LINQ lacks support for compile-time query validation by custom query providers. All that the LINQ-capable compilers (C# 3.0, VB 9.0) do is generate an expression tree representing the query. Therefore, the only way to find out about problems in the query is to execute the code, which triggers the custom query provider's expression tree parser (reached through IQueryable), which can signal issues in the query by throwing an exception. All LINQ providers exhibit this behavior. As an example, take a look at the following situation in LINQ to SQL:
Luckily the message is pretty clear in order to figure out what's going wrong. Also observe the time when and the place where the exception occurs: not at definition time of the query (the query - i.e. var res = ... in our case - remains an expression) but when the iteration statement is executed.
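The timing can be reproduced without any query provider at all. The sketch below is an illustration only: it uses plain LINQ to Objects, a hypothetical User type, and a predicate that throws on purpose as a stand-in for a provider's parser. It shows that defining the query is harmless and that the failure only surfaces once the results are iterated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type, for illustration only.
public class User
{
    public string FirstName { get; set; }
}

public static class DeferredExecutionDemo
{
    // Stand-in for a provider that cannot translate part of a query.
    static bool Untranslatable(User u)
    {
        throw new InvalidOperationException("Cannot translate this part of the query.");
    }

    // Returns true when the failure shows up during iteration rather than at definition time.
    public static bool FailsOnlyOnIteration()
    {
        var users = new List<User> { new User { FirstName = "Bart" } };

        // Defining the query does not run the predicate, so nothing is thrown here.
        var res = from u in users where Untranslatable(u) select u;

        try
        {
            foreach (var item in res) { }   // the predicate runs now
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FailsOnlyOnIteration());
    }
}
```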
Note: LINQ to SharePoint alpha 0.1 did produce parse errors at query expression definition time instead; this has been fixed in 0.2 so that the query parser isn't invoked before query execution time (i.e. iteration over the results).
So what's wrong with this? Not that much, except for the fact that we would be able to signal such problems at compile time if we had the appropriate infrastructure in place on the compiler's side. This would mean that the C# and VB compilers would have to pass the generated expression tree to the custom query provider's query parser (which could expose an interface for communication with a front-end compiler) as part of the compilation job. Our query parser could then feed a set of warnings and errors back to the compiler, which would be presented to the developer as regular compiler warnings or errors (albeit generated by the custom query provider instead of the compiler itself).
Since we don't have such a thing at this very moment, alternatives have to be invented. That's exactly what we've done in LINQ to SharePoint in order to help the developer spot the location of the problem in his/her query.
So, what's our approach? Of course we don't drop the NotSupportedException approach: if your query can't be translated, you're out of luck and we need to signal this in some way or another at runtime. However, when debugging we provide a debugger visualizer for LINQ to SharePoint queries that allows you to inspect the query, including the generated CAML. Essentially, the debugger visualizer triggers the parser albeit in a slightly different "parser run mode": instead of throwing exceptions for parse-time errors, all errors are collected and fed back to the visualizer with enough information to spot the problem. A picture is worth a thousand words, so take a look at this:
This is the debugger visualizer for LINQ to SharePoint that will become available in a later release (keep an eye on my blog). At the top of the dialog you can see the LINQ query. Admittedly, it's not in its original shape anymore, but it's the best we can do right now (the original LINQ query in either C# 3.0 or VB 9.0 has been eaten by the respective compiler at this stage of execution). The original query looks as follows:
var res = from t in lst where !(t.FirstName.Contains("Bart") && t.Age >= 24) || t.LastName.EndsWith("De Smet") && CamlMethods.DateRangesOverlap(t.Modified.Value) orderby 1 select t;
The LINQ query you can see in the dialog above is basically the query's expression tree ToString() call result. With a little knowledge about extension methods and expression trees, you can read such an expression string representation in just a few seconds (as a little exercise play the human compiler, translating a LINQ query to an expression tree followed by a mental ToString-call).
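For readers who want to try that exercise, here is a minimal sketch of how to get such a string yourself. It assumes a hypothetical Person type and uses AsQueryable over an empty array instead of a SharePoint list, so only the shape of the output (chained extension method calls with lambda arguments) is comparable to what the visualizer shows.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity type, for illustration only.
public class Person
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class ExpressionToStringDemo
{
    public static string Describe()
    {
        // AsQueryable yields an IQueryable<Person>, so the compiler emits
        // expression trees for the query instead of delegates.
        IQueryable<Person> lst = new Person[0].AsQueryable();

        var res = from t in lst where t.FirstName.Contains("Bart") && t.Age >= 24 select t;

        // The query is still just a tree at this point; ToString() renders it
        // as method calls with lambda arguments.
        return res.Expression.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```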
What the LINQ to SharePoint parser does when running in "debug mode" - in addition to its regular parsing job - is identify subexpressions that can't be translated while continuing to parse (instead of throwing an exception). All places where something went wrong are marked by <ParseError /> placeholders in the CAML query, and each of these has a unique identifier that's linked (bidirectionally) with the subexpression in the LINQ query that caused the problem. This way, developers can identify problems in a more visually attractive way.
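The difference between the two run modes can be sketched in a few lines. This is not the LINQ to SharePoint parser: the ParseIssue and CollectingChecker types are hypothetical, the checks are far simpler than real CAML translation, and the code relies on the public ExpressionVisitor class from later .NET versions. It only illustrates the idea of recording each offending subexpression under its own identifier rather than stopping at the first one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical entity type, for illustration only.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical error record; the real parser's error shape is not shown in this post.
public class ParseIssue
{
    public int Id;
    public string Subexpression;
    public string Message;
}

public class CollectingChecker : ExpressionVisitor
{
    private readonly bool _collect;
    public readonly List<ParseIssue> Issues = new List<ParseIssue>();

    public CollectingChecker(bool collect) { _collect = collect; }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType == typeof(string) && node.Method.Name == "EndsWith")
            Report(node, "EndsWith is not supported; only Contains and StartsWith are.");
        return base.VisitMethodCall(node);   // keep walking the rest of the tree
    }

    protected override Expression VisitUnary(UnaryExpression node)
    {
        if (node.NodeType == ExpressionType.Not)
            Report(node, "Boolean negation is not supported.");
        return base.VisitUnary(node);
    }

    private void Report(Expression node, string message)
    {
        // Regular mode: fail on the first problem. Debug mode: record it and continue.
        if (!_collect) throw new NotSupportedException(message);
        Issues.Add(new ParseIssue { Id = Issues.Count + 1, Subexpression = node.ToString(), Message = message });
    }
}

public static class ParserModesDemo
{
    public static List<ParseIssue> Check(Expression<Func<Person, bool>> predicate, bool collect)
    {
        var checker = new CollectingChecker(collect);
        checker.Visit(predicate);
        return checker.Issues;
    }

    public static void Main()
    {
        Expression<Func<Person, bool>> p =
            t => !t.FirstName.Contains("Bart") || t.LastName.EndsWith("De Smet");

        foreach (var issue in Check(p, true))
            Console.WriteLine(issue.Id + ": " + issue.Subexpression + " -> " + issue.Message);
    }
}
```

Calling Check with collect set to false on the same predicate throws a NotSupportedException at the first problem it meets, which mirrors the one-error-per-run experience described here.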
Even better, in the example above we can see four problems with the query at once. If we ran the application, we'd get only a single exception per run (which would result in at least four "run-crash-fix" iterations). The goal is to take this as far as possible, providing links from the debugger visualizer to specific help information about the parser issues that occurred (observe the unique identification code on the error, in this case SP0011). In case you're curious why you're seeing SP0011 in the fragment above: observe that the t.FirstName.Contains("Bart") expression is nested inside a Not expression. CAML doesn't have a Boolean negation operator in its query schema, so we can't express the !t.FirstName.Contains("Bart") expression as a whole.
Stay tuned for more LINQ to SharePoint fun soon!
Filed under: C# 3.0, Orcas, LINQ