Language-integrated queries

Update: language-integrated queries were the very first use case implemented with macros when macros first appeared in Scala 2.10.0-M3. The facility was prototyped as the underlying mechanism of direct embedding in Slick and has been used in various forms since then.

LINQ is a brilliant concept. Released five years ago, it remains state of the art even now. However, it has its weaknesses: 1) reliance on clever but rigid logic hardcoded into the compiler, 2) lack of composability, 3) unclear semantics of calls to external code.

The trick employed by LINQ is actually very neat. Whenever the compiler encounters a lambda expression of some function type, say, Func<T, Boolean>, in a context that requires an Expression<Func<T, Boolean>>, it automatically lifts the provided function into an AST that is accessible at run time. This alone solves a problem that plagued data access frameworks for a long time (we discuss that problem in detail in the "Advanced domain-specific languages" use case).
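To make the idea of lifting concrete, here is a minimal sketch in plain Scala, with no macros involved. The Expr mini-AST and the toSql function are hypothetical names invented for this illustration; they stand in for .NET expression trees and are not part of LINQ or of any library.

```scala
object LiftingDemo {
  // Hypothetical mini-AST, invented for illustration only.
  sealed trait Expr
  final case class Field(name: String) extends Expr
  final case class Const(value: BigDecimal) extends Expr
  final case class Gt(left: Expr, right: Expr) extends Expr

  type Row = Map[String, BigDecimal]

  // An ordinary function value: it can be applied, but its body cannot be inspected.
  val opaque: Row => Boolean = row => row("UnitPrice") > BigDecimal(30)

  // The lifted form of the same predicate: plain data that describes the computation.
  val lifted: Expr = Gt(Field("UnitPrice"), Const(BigDecimal(30)))

  // Because the lifted form is data, a framework can walk it and translate it,
  // here into a SQL fragment.
  def toSql(e: Expr): String = e match {
    case Field(name)  => name
    case Const(value) => value.toString
    case Gt(l, r)     => s"(${toSql(l)} > ${toSql(r)})"
  }
}
```

With these definitions, LiftingDemo.toSql(LiftingDemo.lifted) yields the fragment (UnitPrice > 30), whereas nothing comparable can be done with LiftingDemo.opaque short of calling it row by row.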

Although it is a step forward compared with traditional language tools for building eDSLs, LINQ is still not a very extensible solution. For example, LINQ only supports lifting expressions, but not statements, which rules out many useful language constructs, and the programmer cannot do anything about that.

Also, this particular lifting scheme brings an unfortunate composability restriction. Consider the following code (the example is taken from the blog post "Calling functions in LINQ queries" by Tomas Petricek; if you want to know more about the composability problem, be sure to take a look at that post):

// A helper predicate that we would like to reuse inside queries.
static bool MyPriceFunc(Nwind.Product p) {
  return p.ProductName.StartsWith("B");
}

var q =
  from p in db.Products
  where MyPriceFunc(p)  // an ordinary method call inside the query
  select p;

This code compiles with no errors, but when you execute it, DLINQ throws an exception along the lines of: "Static method System.Boolean MyPriceFunc(LINQTest.Nwind.Product) has no supported translation to SQL".

This happens because the lifting scheme is hardcoded to be shallow: lifting only processes the expression it sees and does not recursively generate ASTs for the functions called in the query. Several solutions to this problem exist (one is described in the aforementioned blog post), but they are not especially elegant.
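The effect of shallow lifting can be sketched in plain Scala as follows. This is a toy model under stated assumptions: the Expr node types and the toSql translator are hypothetical, and the error text merely imitates the message quoted above. The point is that a shallow lifter records only the call to the helper, so the translator never sees the helper's body.

```scala
object ShallowLiftingDemo {
  // Hypothetical mini-AST, invented for illustration only.
  sealed trait Expr
  final case class Field(name: String) extends Expr
  final case class Str(value: String) extends Expr
  final case class StartsWith(target: Expr, prefix: Expr) extends Expr
  // A call to a method whose body was not lifted: only its name and arguments are known.
  final case class Call(method: String, args: List[Expr]) extends Expr

  // Toy translator: Right(sql) on success, Left(error) when a node cannot be translated.
  def toSql(e: Expr): Either[String, String] = e match {
    case Field(name)                     => Right(name)
    case Str(value)                      => Right(s"'${value}'")
    case StartsWith(target, Str(prefix)) => toSql(target).map(t => s"${t} LIKE '${prefix}%'")
    case StartsWith(_, _)                => Left("only literal prefixes are supported in this sketch")
    case Call(method, _)                 => Left(s"Method ${method} has no supported translation to SQL")
  }

  // What a shallow lifter produces for a query that calls MyPriceFunc: just the call.
  val shallow: Expr = Call("MyPriceFunc", List(Field("p")))

  // What the translator would need instead: the body of MyPriceFunc itself.
  val deep: Expr = StartsWith(Field("ProductName"), Str("B"))
}
```

Translating ShallowLiftingDemo.shallow fails with a Left, while ShallowLiftingDemo.deep translates to ProductName LIKE 'B%'. The sketch does no quoting or escaping and is not meant as a real SQL generator.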

In our proposal, LINQ can be implemented as follows (we omit the LINQ infrastructure and focus on the lifting part of the implementation):

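class Queryable[T, Repr](query: Query) {
  // filter is a macro def: calls to it are expanded at compile time by Impl.filter
  def filter(p: T => Boolean): Repr = macro Impl.filter[T, Repr]
}

object Impl {
  // p arrives as a compiler AST (c.Expr), not as a compiled function value
  def filter[T: c.TypeTag, Repr: c.TypeTag](c: Context)(p: c.Expr[T => Boolean]) = reify {
    val b = c.prefix.value.newBuilder
    // turn the compiler AST of the predicate into a domain-specific AST and record it in the query
    b.query = Filter(c.prefix.value.query, c.reifyTree(p))
    b.result
  }
}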

Implemented as a macro def, the filter higher-order function receives its predicate in the form of an AST. With the AST in hand, the macro is free to perform arbitrary manipulations (e.g. to call c.reifyTree, which builds a domain-specific AST from the compiler AST provided in p) to accumulate and, eventually, process the query.
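The "accumulate and, eventually, process" part can be illustrated without macros. In the sketch below the predicate ASTs are built by hand, which is exactly the work the macro would take over by receiving the AST of an ordinary lambda from the compiler. All names here (Pred, Table, the Queryable stand-in, toSql) are hypothetical and chosen for this example only; Filter mirrors the node used in the code above, but its definition is assumed.

```scala
object QueryDemo {
  // Hypothetical predicate AST, built by hand in this sketch.
  sealed trait Pred
  final case class Gt(field: String, value: Int) extends Pred
  final case class StartsWith(field: String, prefix: String) extends Pred

  // Hypothetical query AST: a table wrapped in zero or more Filter nodes.
  sealed trait Query
  final case class Table(name: String) extends Query
  final case class Filter(source: Query, predicate: Pred) extends Query

  // Stand-in for Queryable: each filter call wraps the current query in another Filter node.
  final case class Queryable(query: Query) {
    def filter(p: Pred): Queryable = Queryable(Filter(query, p))
  }

  private def predSql(p: Pred): String = p match {
    case Gt(field, value)          => s"${field} > ${value}"
    case StartsWith(field, prefix) => s"${field} LIKE '${prefix}%'"
  }

  // Process the accumulated query in one go: unwrap the Filter nodes,
  // keeping the predicates in the order in which they were applied.
  def toSql(q: Query): String = {
    def unwrap(q: Query, acc: List[Pred]): (String, List[Pred]) = q match {
      case Table(name)       => (name, acc)
      case Filter(source, p) => unwrap(source, p :: acc)
    }
    val (table, preds) = unwrap(q, Nil)
    val where =
      if (preds.isEmpty) "" else preds.map(predSql).mkString(" WHERE ", " AND ", "")
    s"SELECT * FROM ${table}${where}"
  }
}
```

For instance, Queryable(Table("Products")).filter(Gt("UnitPrice", 30)).filter(StartsWith("ProductName", "B")) accumulates two Filter nodes, and passing its query to toSql produces a single SELECT statement with both conditions joined by AND.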

Also, there's a solution to the composability problem: if we declare MyPriceFunc as a macro def as well, then it will be expanded at the call site and, consequently, will be processed by the domain-specific query translator. An alternative approach is a macro that requests ASTs for all annotated functions in the compilation unit and stores those ASTs for future use (that macro could be implemented as a method-level macro annotation or even as a program-wide package annotation). The space of possible solutions is quite large, because macros are a core language feature rather than an ad hoc extension to the compiler.
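The second approach, storing ASTs of helper functions and substituting them into queries, might look roughly like this when modelled at run time. This is only a sketch under assumptions: the registry is filled by hand here, whereas in the proposal a macro annotation would populate it, and every name below (Expr, registry, inline, toSql) is hypothetical.

```scala
object InliningDemo {
  // Hypothetical mini-AST, invented for illustration only.
  sealed trait Expr
  final case class Param(name: String) extends Expr
  final case class Select(target: Expr, field: String) extends Expr
  final case class Str(value: String) extends Expr
  final case class StartsWith(target: Expr, prefix: Expr) extends Expr
  final case class Call(method: String, arg: Expr) extends Expr

  // Stored body of MyPriceFunc, expressed as a function of its argument's AST.
  val myPriceFuncBody: Expr => Expr =
    p => StartsWith(Select(p, "ProductName"), Str("B"))

  // Assumption: in the proposal this table would be filled by a macro annotation.
  val registry: Map[String, Expr => Expr] = Map("MyPriceFunc" -> myPriceFuncBody)

  // Replace every call to a registered function with its stored body.
  // (Recursive helper functions would make this loop forever; the sketch ignores that.)
  def inline(e: Expr): Expr = e match {
    case Call(method, arg) =>
      val inlinedArg = inline(arg)
      registry.get(method) match {
        case Some(body) => inline(body(inlinedArg))
        case None       => Call(method, inlinedArg)
      }
    case Select(target, field)      => Select(inline(target), field)
    case StartsWith(target, prefix) => StartsWith(inline(target), inline(prefix))
    case other                      => other
  }

  // Toy translator: calls that survive inlining cannot be translated.
  def toSql(e: Expr): Either[String, String] = e match {
    case Param(name)                     => Right(name)
    case Select(target, field)           => toSql(target).map(t => s"${t}.${field}")
    case Str(value)                      => Right(s"'${value}'")
    case StartsWith(target, Str(prefix)) => toSql(target).map(t => s"${t} LIKE '${prefix}%'")
    case StartsWith(_, _)                => Left("only literal prefixes are supported in this sketch")
    case Call(method, _)                 => Left(s"Method ${method} has no supported translation to SQL")
  }
}
```

Given the query fragment Call("MyPriceFunc", Param("p")), translating it directly fails, while translating the result of inline succeeds and gives p.ProductName LIKE 'B%'. Declaring MyPriceFunc as a macro def achieves a similar effect at compile time, since the body is already expanded at the call site when the query translator sees it.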