inforakesha

A fine WordPress.com site

How to Use Grouping in C# LINQ Syntax

Leave a comment

When you started using LINQ, did you think it looked like SQL? I did.

The more I learned LINQ, the more I realized it wasn’t anything like SQL. Take grouping, for example. Because LINQ has a group by statement, and it looks like SQL, I assumed that the syntax for grouping in LINQ would be just like SQL. Ha ha! Wrong! As soon as I tried to use it, I discovered that the LINQ syntax is not only nothing like the SQL equivalent, but the whole grouping concept in LINQ is completely different too.

At first glance, the two syntaxes look slightly similar.

SQL:

select ReportsTo, count(LastName) as NameCount
from Employees
group by ReportsTo

LINQ (C#):

from employee in Employees
group employee by employee.ReportsTo

Ignoring the omission of the select statement from LINQ, and the requirement of a range variable, they do look similar. But looks can be deceiving.

Let me explain why.

Grouping in SQL

In SQL, you typically group a set of entities using the following syntax:

select ReportsTo, count(LastName) as NameCount
from Employees
group by ReportsTo

This kind of query results in a simple table with two columns: ReportsTo and NameCount. The count of the names was grouped into a single column using an aggregate function.

Of course, this is a simple example; you could do much more complex grouping by using more expressions in the aggregation or the group by statement. Whatever the grouping though, the result is a flat table.

LINQ is different. Completely different.

Grouping in LINQ

Before you can understand the LINQ syntax, you have to understand how grouping works in LINQ.

When you use the group by statement, the C# compiler translates it to the LINQ Enumerable’s GroupBy extension method. The GroupBy method returns an object of type

IEnumerable<IGrouping>
The GroupBy method takes a parameter of a delegate which specifies the key of each grouping. The key is defined to be of type TKey, and is usually inferred by the compiler depending on the expression.

The key to understanding the C# query syntax is that you don’t get back a flat structure from group by, as you would with SQL. You actually get an “IEnumerable” of “IGrouping” objects.

The IGrouping interface has only one property, “Key”, and implements IEnumerable. Again, TElement is usually inferred. “Key” is typed as the generic type you define for TKey. For the minute, that’s not so important, so we’ll come back to the Key later.

One of the first indications that LINQ is different to SQL is that you discover it is possible to finish a LINQ statement with a group by statement without needing a Select. As IGrouping implements IEnumerable, group by actually returns an IEnumerable of IEnumerable. The result is a set of sets. It’s not a flat table. To use it, you have to iterate over it again or perform another operation on the set.

Let’s see if an example can make it clearer.

When grouping a set of employees, you might use the following code:

var emp1 = new { ReportsTo = 1, LastName = “Richard” };
var emp2 = new { ReportsTo = 1, LastName = “Christopher” };
var emp3 = new { ReportsTo = 2, LastName = “John” };
var emp4 = new { ReportsTo = 2, LastName = “Greg” };
var employees = new[] { emp1, emp2, emp3, emp4 };

var groups =
from employee in employees
group employee by employee.ReportsTo;

foreach (var group in groups)
Console.WriteLine(“{0} employees report to {1}.”,
group.Key, group.Count());

(Don’t worry about the anonymous class definitions for now. You just have to know that my employee objects are all treated as the same type internally, so this is valid C# code.)

The employees variable is of type IEnumerable, so it can be used in a LINQ statement. The groups variable is also an IEnumerable, as it is a set of results, as usually returned by a LINQ statement. Remember though, it’s not like the set you would get back in a SQL statement. It is actually an IEnumerable of IGrouping, or an IEnumerable of IEnumerable.

To do something with the grouping, you have to either treat it as another IEnumerable, and iterate over it, or use methods like Count() to do something with it.

foreach (var group in groups)
foreach (var employee in group)
Console.WriteLine(“{0} reports to {1}”,
employee.LastName, employee.ReportsTo);

This is how LINQPad shows what is contained in the groups variable:

IEnumerable<IGrouping>
Key=
1

IGrouping (2 items)
ReportsTo
LastName
1
Richard
1
Christopher
Key=
2
IGrouping (2 items)
ReportsTo
LastName
2
John
2
Greg
Notice that each IGrouping has an associated Key property of type Int32.

Grouping Syntax in LINQ

So now it should be a little easier to understand how to use the LINQ group by statement.

In LINQ, you always need to declare the range variable between group and by.

from employee in Employees

group employee by employee.ReportsTo

After by, you give an expression which returns the value for the Key. The enumeration is then split into groups of IGroupings where the key for each item in that IGrouping is the same.

So here, we specify the employee.ReportsTo variable as the key. Each employee with a different value for ReportsTo creates a new IGrouping with a Key of that value. The employee is then added to that new IGrouping as an item. The IGrouping is then added to the returned IEnumerable<IGrouping>.

The next time an item in the employees variable has the same value for ReportsTo as the Key property of an already-existing IGrouping object, the item is simply added to that IGrouping.

Of course, that’s probably not the way it works under the covers. But I’m not trying to specify the algorithm, just what happens in the end result.

In the end, an object implementing IEnumerable<IGrouping> is returned by the LINQ statement. As each item in the IEnumerable is an IGrouping, you can either use it’s Key or treat it as an IEnumerable.

So there you have it. Just remember, group by is nothing like SQL, and it returns an IEnumerable of IGrouping, and you should stay happy.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s