GINQ (Groovy-Integrated Query) lets you query in-memory collections using familiar SQL-like syntax. It also works with parsed XML, JSON, YAML, and other formats that produce collections.

The heart of GINQ is its query-like expressions, which can be very simple:

from n in [0, 1, 2]
select n

Or involve multiple clauses:

from s in sales
where s.amount > 50
groupby s.customer as customer into g
having g.count() > 1
orderby g.customer
select g.customer, g.count() as items, g.sum(s -> s.amount) as total

A GINQ expression can include the following clauses:

  • from — the data source (required). Accepts any Iterable, Stream, array, or GINQ result set.

  • join — combines data from additional sources (join, leftjoin, rightjoin, fulljoin, crossjoin), matched using an on condition.

  • where — filters rows before grouping, as s.amount > 50 does above.

  • groupby — groups rows by one or more expressions. The into clause binds the group to a variable (like g above) that supports aggregate methods such as count(), sum(), min(), max(), avg(), and toList(). Named keys (via as) can be accessed as properties, e.g. g.customer.

  • having — filters groups after aggregation, as g.count() > 1 does above.

  • orderby — sorts results in ascending (in asc, the default) or descending (in desc) order, with nullslast/nullsfirst control.

  • limit — restricts output to a given size, with optional offset for pagination.

  • select — the projection (required). Defines the output columns, with optional as aliases.

  • union, unionall, intersect, minus — set operations that combine two complete from…​select expressions.

GINQ also supports window functions and nested subqueries — see Advanced Topics for details.

How it works: Under the covers GINQ transforms the query into calls within a fluent API. For example, the above query is transformed into:

from(sales)
    .where(s -> s.amount > 50)
    .groupByInto(s -> s.customer, g -> g.count() > 1)
    .orderBy(new Order(g -> g.key, true))
    .select((g, q) -> Tuple.tuple(g.key, g.count(), g.sum(s -> s.amount)))

You normally never see this transformed code, but if you are debugging or looking at stacktraces, don’t be surprised to see fluent API calls instead of an exact match to the terms in the SQL-like syntax.

1. Basics

This section covers how to embed GINQ in your code and walks through each clause: data sources, projection, filtering, joining, grouping with aggregation, sorting, and pagination.

1.1. Integration

Other code in your scripts and classes may use the same names as the SQL-like syntax, such as from, select, where, etc. To avoid conflicts, GINQ expressions must be wrapped in a GQ block, which serves as a marker for the GINQ parser to recognize and transform the SQL-like syntax into the underlying fluent API calls.

A GINQ expression is wrapped in a GQ block and returns a lazy Queryable result:

def result = GQ {
    /* GINQ SQL-LIKE EXPRESSION */
}
def stream = result.stream() // get the stream from GINQ result
def list = result.toList() // get the list from GINQ result

Since the result is a standard Queryable/Stream, it composes naturally with other Groovy and JDK library calls:

import java.util.stream.Collectors

def numbers = [0, 1, 2]
assert '0#1#2' == GQ {
    from n in numbers
    select n
}.stream()
    .map(e -> String.valueOf(e))
    .collect(Collectors.joining('#'))

Two other markers, GQL and @GQ, are also supported. GQL is a shorthand that returns a List directly (equivalent to GQ {…​}.toList()):

def list = GQL {
    /* GINQ SQL-LIKE EXPRESSION */
}

GINQ can also be used as a method annotation — see <<`@GQ` Annotation>> in Advanced Topics:

@GQ
def queryMethod() {
    /* GINQ SQL-LIKE EXPRESSION */
}

1.2. Data Source

The data source for GINQ could be specified by from clause, which is equivalent to SQL’s FROM. Currently GINQ supports Iterable, Stream, array and GINQ result set as its data source:

1.2.1. Iterable Data Source

from n in [1, 2, 3] select n

1.2.2. Stream Data Source

from n in [1, 2, 3].stream() select n

1.2.3. Array Data Source

from n in new int[] {1, 2, 3} select n

1.2.4. GINQ Result Set Data Source

def vt = GQ {from m in [1, 2, 3] select m}
assert [1, 2, 3] == GQ {
    from n in vt select n
}.toList()

1.3. Projection

The column names could be renamed with as clause:

def result = GQ {
    from n in [1, 2, 3]
    select Math.pow(n, 2) as powerOfN
}
assert [[1, 1], [4, 4], [9, 9]] == result.stream().map(r -> [r[0], r.powerOfN]).toList()
The renamed column could be referenced by its new name, e.g. r.powerOfN. Also, it could be referenced by its index, e.g. r[0]
assert [[1, 1], [2, 4], [3, 9]] == GQ {
    from v in (
        from n in [1, 2, 3]
        select n, Math.pow(n, 2) as powerOfN
    )
    select v.n, v.powerOfN
}.toList()
select P1, P2, …​, Pn is a simplified syntax of select new NamedRecord(P1, P2, …​, Pn) when and only when n >= 2. Also, NamedRecord instance will be created if as clause is used. The values stored in the NamedRecord could be referenced by their names.

Construct new objects as column values:

@groovy.transform.EqualsAndHashCode
class Person {
    String name
    Person(String name) {
        this.name = name
    }
}
def persons = [new Person('Daniel'), new Person('Paul'), new Person('Eric')]
assert persons == GQ {
    from n in ['Daniel', 'Paul', 'Eric']
    select new Person(n)
}.toList()

1.3.1. Distinct

distinct is equivalent to SQL’s DISTINCT

def result = GQ {
    from n in [1, 2, 2, 3, 3, 3]
    select distinct(n)
}
assert [1, 2, 3] == result.toList()
def result = GQ {
    from n in [1, 2, 2, 3, 3, 3]
    select distinct(n, n + 1)
}
assert [[1, 2], [2, 3], [3, 4]] == result.toList()

1.4. Filtering

where is equivalent to SQL’s WHERE

from n in [0, 1, 2, 3, 4, 5]
where n > 0 && n <= 3
select n * 2

1.4.1. In

from n in [0, 1, 2]
where n in [1, 2]
select n
from n in [0, 1, 2]
where n in (
    from m in [1, 2]
    select m
)
select n
import static groovy.lang.Tuple.tuple
assert [0, 1] == GQ {
    from n in [0, 1, 2]
    where tuple(n, n + 1) in (
        from m in [1, 2]
        select m - 1, m
    )
    select n
}.toList()

1.4.2. Not In

from n in [0, 1, 2]
where n !in [1, 2]
select n
from n in [0, 1, 2]
where n !in (
    from m in [1, 2]
    select m
)
select n
import static groovy.lang.Tuple.tuple
assert [2] == GQ {
    from n in [0, 1, 2]
    where tuple(n, n + 1) !in (
        from m in [1, 2]
        select m - 1, m
    )
    select n
}.toList()

1.4.3. Exists

from n in [1, 2, 3]
where (
    from m in [2, 3]
    where m == n
    select m
).exists()
select n

1.4.4. Not Exists

from n in [1, 2, 3]
where !(
    from m in [2, 3]
    where m == n
    select m
).exists()
select n

1.5. Joining

More data sources for GINQ could be specified by join clauses.

from n1 in [1, 2, 3]
join n2 in [1, 3] on n1 == n2
select n1, n2
join is preferred over innerjoin and innerhashjoin as it has better readability, and it is smart enough to choose the correct concrete join(i.e. innerjoin or innerhashjoin) by its on clause.
from n1 in [1, 2, 3]
innerjoin n2 in [1, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
leftjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [2, 3, 4]
rightjoin n2 in [1, 2, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
fulljoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
crossjoin n2 in [3, 4, 5]
select n1, n2

hash join is especially efficient when data sources contain lots of objects

from n1 in [1, 2, 3]
innerhashjoin n2 in [1, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
lefthashjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [2, 3, 4]
righthashjoin n2 in [1, 2, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
fullhashjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
Only binary expressions(==, &&) are allowed in the on clause of hash join

1.6. Grouping

groupby is equivalent to SQL’s GROUP BY, and having is equivalent to SQL’s HAVING. The into clause binds the grouped result to a named variable, enabling direct method calls for key access and aggregates. The variable is a GroupResult which extends Queryable, so all aggregate methods (count(), sum(), toList(), etc.) are available as real method calls:

assert [[1, 2], [3, 2], [6, 3]] == GQ {
    from n in [1, 1, 3, 3, 6, 6, 6]
    groupby n into g
    select g.key, g.count()
}.toList()

The group variable supports all Queryable aggregate methods:

assert [[1, 2], [3, 6], [6, 18]] == GQ {
    from n in [1, 1, 3, 3, 6, 6, 6]
    groupby n into g
    select g.key, g.sum(n -> n)
}.toList()

For multi-key grouping, individual keys can be named with as and accessed as properties or via subscript:

def result = GQ {
    from e in employees
    groupby e.dept as department, e.role as role into g
    select g.department, g.role, g.count()
}.toList()

Subscript access (g['department']) and explicit g.get('department') are also supported.

having works with into — the group variable can be used directly in the condition:

assert [[6, 3]] == GQ {
    from n in [1, 1, 3, 3, 6, 6, 6]
    groupby n into g
    having g.count() > 2
    select g.key, g.count()
}.toList()
The where clause after groupby…​into is reserved for future use; use having for now.

1.6.1. Aggregate Functions

GINQ provides some built-in aggregate functions:

Function Argument Type(s) Return Type Description

count()

java.lang.Long

number of rows, similar to count(*) in SQL

count(expression)

any

java.lang.Long

number of rows for which the value of expression is not null

min(expression)

java.lang.Comparable

same as argument type

minimum value of expression across all non-null values

max(expression)

java.lang.Comparable

same as argument type

maximum value of expression across all non-null values

sum(expression)

java.lang.Number

java.math.BigDecimal

sum of expression across all non-null values

avg(expression)

java.lang.Number

java.math.BigDecimal

the average (arithmetic mean) of all non-null values

list(expression)

any

java.util.List

the aggregated list of all non-null values

median(expression)

java.lang.Number

java.math.BigDecimal

value such that the number of non-null values above and below it is the same ("middle" value, not necessarily same as average or mean)

stdev(expression)

java.lang.Number

java.math.BigDecimal

the statistical standard deviation of all non-null values

stdevp(expression)

java.lang.Number

java.math.BigDecimal

the statistical standard deviation for the population for all non-null values

var(expression)

java.lang.Number

java.math.BigDecimal

the statistical variance of all non-null values

varp(expression)

java.lang.Number

java.math.BigDecimal

the statistical variance for the population for all non-null values

agg(expression)

any

any

customizes the aggregation logic in expression and returns single value

from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, count()
from s in ['a', 'b', 'cd', 'ef']
groupby s.size() as length
select length, min(s)
from s in ['a', 'b', 'cd', 'ef']
groupby s.size() as length
select length, max(s)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, sum(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, avg(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, median(n)
assert [['A', ['APPLE', 'APRICOT']],
        ['B', ['BANANA']],
        ['C', ['CANTALOUPE']]] == GQL {
    from fruit in ['Apple', 'Apricot', 'Banana', 'Cantaloupe']
    groupby fruit[0] as firstChar
    select firstChar, list(fruit.toUpperCase()) as fruit_list
}
def persons = [new Person('Linda', 100, 'Female'),
               new Person('Daniel', 135, 'Male'),
               new Person('David', 122, 'Male')]
assert [['Male', ['Daniel', 'David']], ['Female', ['Linda']]] == GQL {
    from p in persons
    groupby p.gender
    select p.gender, list(p.name)
}
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add))
_g is an implicit variable for agg aggregate function, it represents the grouped Queryable object and its record(e.g. r) could reference the data source by alias(e.g. n)
from fruit in ['Apple', 'Apricot', 'Banana', 'Cantaloupe']
groupby fruit.substring(0, 1) as firstChar
select firstChar, agg(_g.stream().map(r -> r.fruit).toList()) as fruit_list

Also, we could apply the aggregate functions for the whole GINQ result, i.e. no groupby clause is needed:

assert [3] == GQ {
    from n in [1, 2, 3]
    select max(n)
}.toList()
assert [[1, 3, 2, 2, 6, 3, 3, 6]] == GQ {
    from n in [1, 2, 3]
    select min(n), max(n), avg(n), median(n), sum(n), count(n), count(),
            agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add))
}.toList()
assert [0.816496580927726] == GQ {
    from n in [1, 2, 3]
    select stdev(n)
}.toList()
assert [1] == GQ {
    from n in [1, 2, 3]
    select stdevp(n)
}.toList()
assert [0.6666666666666667] == GQ {
    from n in [1, 2, 3]
    select var(n)
}.toList()
assert [1] == GQ {
    from n in [1, 2, 3]
    select varp(n)
}.toList()

1.7. Sorting

orderby is equivalent to SQL’s ORDER BY

from n in [1, 5, 2, 6]
orderby n
select n
in asc is optional when sorting in ascending order
from n in [1, 5, 2, 6]
orderby n in asc
select n
from n in [1, 5, 2, 6]
orderby n in desc
select n
from s in ['a', 'b', 'ef', 'cd']
orderby s.length() in desc, s in asc
select s
from s in ['a', 'b', 'ef', 'cd']
orderby s.length() in desc, s
select s
from n in [1, null, 5, null, 2, 6]
orderby n in asc(nullslast)
select n
nullslast is equivalent to SQL’s NULLS LAST and applied by default. nullsfirst is equivalent to SQL’s NULLS FIRST.
from n in [1, null, 5, null, 2, 6]
orderby n in asc(nullsfirst)
select n
from n in [1, null, 5, null, 2, 6]
orderby n in desc(nullslast)
select n
from n in [1, null, 5, null, 2, 6]
orderby n in desc(nullsfirst)
select n

1.8. Pagination

limit is similar to the limit clause of MySQL, which could specify the offset(first argument) and size(second argument) for paginating, or just specify the only one argument as size

from n in [1, 2, 3, 4, 5]
limit 3
select n
from n in [1, 2, 3, 4, 5]
limit 1, 3
select n

1.9. Set Operations

Set operations combine the results of two complete GINQ expressions.

union returns distinct rows from both queries (duplicates removed):

assert [1, 2, 3, 4, 5] == GQ {
    from n in [1, 2, 3]
    select n
    union
    from m in [3, 4, 5]
    select m
}.toList()

unionall returns all rows from both queries (duplicates preserved):

assert [1, 2, 3, 3, 4, 5] == GQ {
    from n in [1, 2, 3]
    select n
    unionall
    from m in [3, 4, 5]
    select m
}.toList()

intersect returns only rows that appear in both queries:

assert [3, 4] == GQ {
    from n in [1, 2, 3, 4]
    select n
    intersect
    from m in [3, 4, 5, 6]
    select m
}.toList()

minus returns rows from the first query that do not appear in the second:

assert [1, 2] == GQ {
    from n in [1, 2, 3, 4]
    select n
    minus
    from m in [3, 4, 5, 6]
    select m
}.toList()

Each side of a set operation is a complete from…​select expression and can include where, groupby, orderby, limit, and any other clauses.

The set operation keyword must appear on its own line. Set operations are also available as methods on any Queryable result: q1.union(q2), q1.intersect(q2), q1.minus(q2).

2. Common Recipes

Recipes for everyday tasks: row numbering, list comprehensions, querying JSON, updating collections, and workarounds for SQL features not yet in GINQ.

2.1. Row Number

_rn is the implicit variable representing row number for each record in the result set. It starts with 0

from n in [1, 2, 3]
select _rn, n

2.2. List Comprehension

List comprehension is an elegant way to define and create lists based on existing lists:

assert [4, 16, 36, 64, 100] == GQ {from n in 1..<11 where n % 2 == 0 select n ** 2}.toList()
assert [4, 16, 36, 64, 100] == GQ {from n in 1..<11 where n % 2 == 0 select n ** 2} as List
assert [4, 16, 36, 64, 100] == GQL {from n in 1..<11 where n % 2 == 0 select n ** 2}
GQL {…​} is the abbreviation of GQ {…​}.toList()

GINQ could be used as list comprehension in the loops directly:

def result = []
for (def x : GQ {from n in 1..<11 where n % 2 == 0 select n ** 2}) {
    result << x
}
assert [4, 16, 36, 64, 100] == result

2.3. Query JSON

def json = new JsonSlurper().parseText('''
    {
        "fruits": [
            {"name": "Orange", "price": 11},
            {"name": "Apple", "price": 6},
            {"name": "Banana", "price": 4},
            {"name": "Mango", "price": 28},
            {"name": "Durian", "price": 32}
        ]
    }
''')

assert [['Mango', 28], ['Orange', 11], ['Apple', 6], ['Banana', 4]] == GQ {
    from f in json.fruits
    where f.price < 30
    orderby f.price in desc
    select f.name, f.price
}.toList()

2.4. Query & Update

This is like update statement in SQL

import groovy.transform.*
@TupleConstructor
@EqualsAndHashCode
@ToString
class Person {
    String name
    String nickname
}

def linda = new Person('Linda', null)
def david = new Person('David', null)
def persons = [new Person('Daniel', 'ShanFengXiaoZi'), linda, david]
def result = GQ {
    from p in persons
    where p.nickname == null
    select p
}.stream()
    .peek(p -> { p.nickname = 'Unknown' }) // update `nickname`
    .toList()

def expected = [new Person('Linda', 'Unknown'), new Person('David', 'Unknown')]
assert expected == result
assert ['Unknown', 'Unknown'] == [linda, david]*.nickname // ensure the original objects are updated

2.5. Alternative for with clause

GINQ does not support with clause for now, but we could define a temporary variable to workaround:

def v = GQ { from n in [1, 2, 3] where n < 3 select n }
def result = GQ {
    from n in v
    where n > 1
    select n
}
assert [2] == result.toList()

2.6. Alternative for case-when

case-when of SQL could be replaced with switch expression:

assert ['a', 'b', 'c', 'c'] == GQ {
    from n in [1, 2, 3, 4]
    select switch (n) {
        case 1 -> 'a'
        case 2 -> 'b'
        default -> 'c'
    }
}.toList()

3. Advanced Topics

This section covers window functions, nested subqueries, the classic (pre-into) groupby style, the @GQ annotation for declaring GINQ methods, parallel querying, and options for customizing and optimizing GINQ. It also includes the full syntax reference.

3.1. Window Functions

Window can be defined by partitionby, orderby, rows and range:

over(
    [partitionby <expression> (, <expression>)*]
    [orderby <expression> (, <expression>)*
       [rows <lower>, <upper> | range <lower>, <upper>]]
)
  • 0 used as bound of rows and range clause is equivalent to SQL’s CURRENT ROW, and negative means PRECEDING, positive means FOLLOWING

  • null used as the lower bound of rows and range clause is equivalent to SQL’s UNBOUNDED PRECEDING

  • null used as the upper bound of rows and range clause is equivalent to SQL’s UNBOUNDED FOLLOWING

Also, GINQ provides some built-in window functions:

Function Argument Type(s) Return Type Description

rowNumber()

java.lang.Long

number of the current row within its partition, counting from 0

rank()

java.lang.Long

rank of the current row with gaps

denseRank()

java.lang.Long

rank of the current row without gaps

percentRank()

java.math.BigDecimal

relative rank of the current row: (rank - 1) / (total rows - 1)

cumeDist()

java.math.BigDecimal

relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)

ntile(expression)

java.lang.Long

java.lang.Long

bucket index ranging from 0 to expression - 1, dividing the partition as equally as possible

lead(expression [, offset [, default]])

any [, java.lang.Long [, same as expression type]]

same as expression type

returns expression evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default (which must be of the same type as expression). Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null

lag(expression [, offset [, default]])

any [, java.lang.Long [, same as expression type]]

same as expression type

returns expression evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead return default (which must be of the same type as expression). Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null

firstValue(expression)

any

same type as expression

returns expression evaluated at the row that is the first row of the window frame

lastValue(expression)

any

same type as expression

returns expression evaluated at the row that is the last row of the window frame

nthValue(expression, n)

any, java.lang.Long

same type as expression

returns expression evaluated at the row that is the nth row of the window frame

count()

java.lang.Long

number of rows, similar to count(*) in SQL

count(expression)

any

java.lang.Long

number of rows for which the value of expression is not null

min(expression)

java.lang.Comparable

same as argument type

minimum value of expression across all non-null values

max(expression)

java.lang.Comparable

same as argument type

maximum value of expression across all non-null values

sum(expression)

java.lang.Number

java.math.BigDecimal

sum of expression across all non-null values

avg(expression)

java.lang.Number

java.math.BigDecimal

the average (arithmetic mean) of all non-null values

median(expression)

java.lang.Number

java.math.BigDecimal

value such that the number of non-null values above and below it is the same ("middle" value, not necessarily same as average or mean)

stdev(expression)

java.lang.Number

java.math.BigDecimal

the statistical standard deviation of all non-null values

stdevp(expression)

java.lang.Number

java.math.BigDecimal

the statistical standard deviation for the population for all non-null values

var(expression)

java.lang.Number

java.math.BigDecimal

the statistical variance of all non-null values

varp(expression)

java.lang.Number

java.math.BigDecimal

the statistical variance for the population for all non-null values

agg(expression)

any

any

INCUBATING: customizes the aggregation logic in expression and returns single value

3.1.1. rowNumber

assert [[2, 1, 1, 1], [1, 0, 0, 2], [null, 3, 3, 3], [3, 2, 2, 0]] == GQ {
    from n in [2, 1, null, 3]
    select n, (rowNumber() over(orderby n)),
              (rowNumber() over(orderby n in asc)),
              (rowNumber() over(orderby n in desc))
}.toList()
assert [[1, 0, 1, 2, 3], [2, 1, 2, 1, 2], [null, 3, 0, 3, 0], [3, 2, 3, 0, 1]] == GQ {
    from n in [1, 2, null, 3]
    select n, (rowNumber() over(orderby n in asc(nullslast))),
              (rowNumber() over(orderby n in asc(nullsfirst))),
              (rowNumber() over(orderby n in desc(nullslast))),
              (rowNumber() over(orderby n in desc(nullsfirst)))
}.toList()
The parentheses around the window function is required.

3.1.2. rank, denseRank, percentRank, cumeDist and ntile

assert [['a', 1, 1], ['b', 2, 2], ['b', 2, 2],
        ['c', 4, 3], ['c', 4, 3], ['d', 6, 4],
        ['e', 7, 5]] == GQ {
    from s in ['a', 'b', 'b', 'c', 'c', 'd', 'e']
    select s,
        (rank() over(orderby s)),
        (denseRank() over(orderby s))
}.toList()
assert [[60, 0, 0.4], [60, 0, 0.4], [80, 0.5, 0.8], [80, 0.5, 0.8], [100, 1, 1]] == GQ {
    from n in [60, 60, 80, 80, 100]
    select n,
        (percentRank() over(orderby n)),
        (cumeDist() over(orderby n))
}.toList()
assert [[1, 0], [2, 0], [3, 0],
        [4, 1], [5, 1],
        [6, 2], [7, 2],[8, 2],
        [9, 3], [10, 3]] == GQ {
    from n in 1..10
    select n, (ntile(4) over(orderby n))
}.toList()

3.1.3. lead and lag

assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n) over(orderby n))
}.toList()
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n) over(orderby n in asc))
}.toList()
assert [['a', 'bc'], ['ab', null], ['b', 'a'], ['bc', 'ab']] == GQ {
    from s in ['a', 'ab', 'b', 'bc']
    select s, (lead(s) over(orderby s.length(), s in desc))
}.toList()
assert [['a', null], ['ab', null], ['b', 'a'], ['bc', 'ab']] == GQ {
    from s in ['a', 'ab', 'b', 'bc']
    select s, (lead(s) over(partitionby s.length() orderby s.length(), s in desc))
}.toList()
assert [[2, 1], [1, null], [3, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (lag(n) over(orderby n))
}.toList()
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lag(n) over(orderby n in desc))
}.toList()
assert [['a', null], ['b', 'a'], ['aa', null], ['bb', 'aa']] == GQ {
    from s in ['a', 'b', 'aa', 'bb']
    select s, (lag(s) over(partitionby s.length() orderby s))
}.toList()
assert [[2, 3, 1], [1, 2, null], [3, null, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n) over(orderby n)), (lag(n) over(orderby n))
}.toList()

The offset can be specified other than the default offset 1:

assert [[2, null, null], [1, 3, null], [3, null, 1]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n, 2) over(orderby n)), (lag(n, 2) over(orderby n))
}.toList()

The default value can be returned when the index specified by offset is out of window, e.g. 'NONE':

assert [[2, 'NONE', 'NONE'], [1, 3, 'NONE'], [3, 'NONE', 1]] == GQ {
    from n in [2, 1, 3]
    select n, (lead(n, 2, 'NONE') over(orderby n)), (lag(n, 2, 'NONE') over(orderby n))
}.toList()

3.1.4. firstValue, lastValue and nthValue

assert [[2, 1], [1, 1], [3, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows -1, 1))
}.toList()
assert [[2, 3], [1, 2], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -1, 1))
}.toList()
assert [[2, 2], [1, 1], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows 0, 1))
}.toList()
assert [[2, 1], [1, null], [3, 1]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows -2, -1))
}.toList()
assert [[2, 1], [1, null], [3, 2]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -2, -1))
}.toList()
assert [[2, 3], [1, 3], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows 1, 2))
}.toList()
assert [[2, 3], [1, 2], [3, null]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows 1, 2))
}.toList()
assert [[2, 2], [1, 1], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -1, 0))
}.toList()
assert [[2, 1], [1, 1], [3, 1]] == GQ {
    from n in [2, 1, 3]
    select n, (firstValue(n) over(orderby n rows null, 1))
}.toList()
assert [[2, 3], [1, 3], [3, 3]] == GQ {
    from n in [2, 1, 3]
    select n, (lastValue(n) over(orderby n rows -1, null))
}.toList()
assert [['a', 'a', 'b'], ['aa', 'aa', 'bb'], ['b', 'a', 'b'], ['bb', 'aa', 'bb']] == GQ {
    from s in ['a', 'aa', 'b', 'bb']
    select s, (firstValue(s) over(partitionby s.length() orderby s)),
            (lastValue(s) over(partitionby s.length() orderby s))
}.toList()
assert [[1, 1, 2, 3, null], [2, 1, 2, 3, null], [3, 1, 2, 3, null]] == GQ {
    from n in 1..3
    select n, (nthValue(n, 0) over(orderby n)),
              (nthValue(n, 1) over(orderby n)),
              (nthValue(n, 2) over(orderby n)),
              (nthValue(n, 3) over(orderby n))
}.toList()

3.1.5. min, max, count, sum, avg, median, stdev, stdevp, var ,varp and agg

assert [['a', 'a', 'b'], ['b', 'a', 'b'], ['aa', 'aa', 'bb'], ['bb', 'aa', 'bb']] == GQ {
    from s in ['a', 'b', 'aa', 'bb']
    select s, (min(s) over(partitionby s.length())), (max(s) over(partitionby s.length()))
}.toList()
assert [[1, 2, 2, 2, 1, 1], [1, 2, 2, 2, 1, 1],
        [2, 2, 2, 4, 2, 2], [2, 2, 2, 4, 2, 2],
        [3, 2, 2, 6, 3, 3], [3, 2, 2, 6, 3, 3]] == GQ {
    from n in [1, 1, 2, 2, 3, 3]
    select n, (count() over(partitionby n)),
              (count(n) over(partitionby n)),
              (sum(n) over(partitionby n)),
              (avg(n) over(partitionby n)),
              (median(n) over(partitionby n))
}.toList()
assert [[2, 6, 3, 1, 3, 4], [1, 6, 3, 1, 3, 4],
        [3, 6, 3, 1, 3, 4], [null, 6, 3, 1, 3, 4]] == GQ {
    from n in [2, 1, 3, null]
    select n, (sum(n) over()),
              (max(n) over()),
              (min(n) over()),
              (count(n) over()),
              (count() over())
}.toList()
assert [[1, 1, 1], [2, 2, 3], [5, 2, 10], [5, 2, 10]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range -2, 0)),
              (sum(n) over(orderby n range -2, 0))
}.toList()
assert [[1, 2, 3], [2, 1, 2], [5, 2, 10], [5, 2, 10]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range 0, 1)),
              (sum(n) over(orderby n range 0, 1))
}.toList()
assert [[1, 2, 3], [2, 2, 3], [5, 2, 10], [5, 2, 10]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range -1, 1)),
              (sum(n) over(orderby n range -1, 1))
}.toList()
assert [[1, 1, 2], [2, 0, 0], [5, 0, 0], [5, 0, 0]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n in desc range 1, 2)),
              (sum(n) over(orderby n in desc range 1, 2))
}.toList()
assert [[1, 0, 0], [2, 1, 1], [5, 0, 0], [5, 0, 0]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n in desc range -2, -1)),
              (sum(n) over(orderby n in desc range -2, -1))
}.toList()
assert [[1, 3, 12], [2, 2, 10], [5, 0, 0], [5, 0, 0]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range 1, null)),
              (sum(n) over(orderby n range 1, null))
}.toList()
assert [[1, 2, 3], [2, 2, 3], [5, 4, 13], [5, 4, 13]] == GQ {
    from n in [1, 2, 5, 5]
    select n, (count() over(orderby n range null, 1)),
              (sum(n) over(orderby n range null, 1))
}.toList()
assert [[1, 0.816496580927726],
        [2, 0.816496580927726],
        [3, 0.816496580927726]] == GQ {
    from n in [1, 2, 3]
    select n, (stdev(n) over())
}.toList()
assert [[1, 1], [2, 1], [3, 1]] == GQ {
    from n in [1, 2, 3]
    select n, (stdevp(n) over())
}.toList()
assert [[1, 0.6666666666666667],
        [2, 0.6666666666666667],
        [3, 0.6666666666666667]] == GQ {
    from n in [1, 2, 3]
    select n, (var(n) over())
}.toList()
assert [[1, 1], [2, 1], [3, 1]] == GQ {
    from n in [1, 2, 3]
    select n, (varp(n) over())
}.toList()
assert [[1, 4], [2, 2], [3, 4]] == GQ {
    from n in [1, 2, 3]
    select n,
           (agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add)) over(partitionby n % 2))
}.toList()

3.2. Nested GINQ

3.2.1. Nested GINQ in from clause

from v in (
    from n in [1, 2, 3]
    select n
)
select v

3.2.2. Nested GINQ in where clause

from n in [0, 1, 2]
where n in (
    from m in [1, 2]
    select m
)
select n
from n in [0, 1, 2]
where (
    from m in [1, 2]
    where m == n
    select m
).exists()
select n

3.2.3. Nested GINQ in select clause

assert [null, 2, 3] == GQ {
    from n in [1, 2, 3]
    select (
        from m in [2, 3, 4]
        where m == n
        limit 1
        select m
    )
}.toList()
It’s recommended to use limit 1 to restrict the count of sub-query result because TooManyValuesException will be thrown if more than one values returned

We could use as clause to name the sub-query result

assert [[1, null], [2, 2], [3, 3]] == GQ {
    from n in [1, 2, 3]
    select n, (
        from m in [2, 3, 4]
        where m == n
        select m
    ) as sqr
}.toList()

3.3. Classic groupby style

GINQ also supports an older style without the into keyword which looks simpler for some cases but has some limitations — aggregate functions use a special syntax rather than real method calls, and the group cannot be accessed as a composable collection. Each field in any nonaggregate expression in the select clause must be included in the groupby clause:

from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, count(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
having n >= 3
select n, count(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
having count() < 3
select n, count()

The group columns could be renamed with as clause:

from s in ['ab', 'ac', 'bd', 'acd', 'bcd', 'bef']
groupby s.size() as length, s[0] as firstChar
select length, firstChar, max(s)
from s in ['ab', 'ac', 'bd', 'acd', 'bcd', 'bef']
groupby s.size() as length, s[0] as firstChar
having length == 3 && firstChar == 'b'
select length, firstChar, max(s)

3.4. Using the Queryable API directly

The groupByInto method is also available on the Queryable API directly, which can be useful when building queries programmatically:

def nums = [1, 2, 2, 3, 3, 4, 4, 5]
def result = from(nums)
    .groupByInto(e -> e, g -> g.count() > 1)
    .select((g, q) -> Tuple.tuple(g.key, g.count()))
    .toList()
assert [[2, 2], [3, 2], [4, 2]] == result

3.5. @GQ Annotation

GINQ could be written in a method marked with @GQ:

@GQ
def someGinqMethod() {
    /* GINQ CODE */
}

For example,

  • Mark the ginq method as a GINQ method with @GQ annotation:

@groovy.ginq.transform.GQ
def ginq(list, b, e) {
    from n in list
    where b < n && n < e
    select n
}

assert [3, 4] == ginq([1, 2, 3, 4, 5, 6], 2, 5).toList()
  • Specify the result type as List:

import groovy.ginq.transform.GQ

@GQ(List)
def ginq(b, e) {
    from n in [1, 2, 3, 4, 5, 6]
    where b < n && n < e
    select n
}

assert [3, 4] == ginq(2, 5)
GINQ supports many result types, e.g. List, Set, Collection, Iterable, Iterator, java.util.stream.Stream and array types.
  • Enable parallel querying:

import groovy.ginq.transform.GQ

@GQ(parallel=true)
def ginq(x) {
    from n in [1, 2, 3]
    where n < x
    select n
}

assert [1] == ginq(2).toList()

3.6. Parallel Querying

Parallel querying is especially efficient when querying big data sources. It is disabled by default, but we could enable it by hand:

assert [[1, 1], [2, 2], [3, 3]] == GQ(parallel: true) {
    from n1 in 1..1000
    join n2 in 1..10000 on n2 == n1
    where n1 <= 3 && n2 <= 5
    select n1, n2
}.toList()

As parallel querying will use a shared thread pool, the following code can release resources after all GINQ statements execution are completed, and it will wait util all tasks of threads are completed.

GQ {
    shutdown
}
Once shutdown is issued, parallel querying can not work anymore.

The following code is equivalent to the above code, in other words, immediate is optional:

GQ {
    shutdown immediate
}

Shutdown without waiting tasks to complete:

GQ {
    shutdown abort
}

3.7. Customize GINQ

For advanced users, you could customize GINQ behaviour by specifying your own target code generator. For example, we could specify the qualified class name org.apache.groovy.ginq.provider.collection.GinqAstWalker as the target code generator to generate GINQ method calls for querying collections, which is the default behaviour of GINQ:

assert [0, 1, 2] == GQ(astWalker: 'org.apache.groovy.ginq.provider.collection.GinqAstWalker') {
    from n in [0, 1, 2]
    select n
}.toList()

3.8. Optimize GINQ

GINQ optimizer is enabled by default for better performance. It will transform the GINQ AST to achieve better execution plan. We could disable it by hand:

assert [[2, 2]] == GQ(optimize: false) {
    from n1 in [1, 2, 3]
    join n2 in [1, 2, 3] on n1 == n2
    where n1 > 1 &&  n2 < 3
    select n1, n2
}.toList()

3.9. GINQ Syntax Reference

The full GINQ clause structure for reference:

GQ, i.e. abbreviation for GINQ
|__ <query>
|   |__ from
|   |   |__ <data_source_alias> in <data_source>
|   |__ [join/innerjoin/leftjoin/rightjoin/fulljoin/crossjoin]*
|   |   |__ <data_source_alias> in <data_source>
|   |   |__ on <condition> ((&& | ||) <condition>)* (NOTE: `crossjoin` does not need `on` clause)
|   |__ [where]
|   |   |__ <condition> ((&& | ||) <condition>)*
|   |__ [groupby]
|   |   |__ <expression> [as <alias>] (, <expression> [as <alias>])* [into <group_alias>]
|   |   |__ [having]
|   |       |__ <condition> ((&& | ||) <condition>)*
|   |__ [orderby]
|   |   |__ <expression> [in (asc|desc)] (, <expression> [in (asc|desc)])*
|   |__ [limit]
|   |   |__ [<offset>,] <size>
|   |__ select
|       |__ <expression> [as <alias>] (, <expression> [as <alias>])*
|__ [union/unionall/intersect/minus]*
    |__ <query>
[] means the related clause is optional, * means zero or more times, and + means one or more times. Also, the clauses within each query are order sensitive, so the order of clauses should be kept as the above structure. ONLY ONE from clause is required per query. Multiple data sources are supported through from and the related joins. Set operations chain left-to-right, so Q1 union Q2 minus Q3 is evaluated as (Q1 union Q2) minus Q3.

3.10. Known Limitations

Currently GINQ can not work well when STC is enabled.

4. GINQ Examples

Complete worked examples showing GINQ in action.

4.1. Generate Multiplication Table

from v in (
    from a in 1..9
    join b in 1..9 on a <= b
    select a as f, b as s, "$a * $b = ${a * b}".toString() as r
)
groupby v.s
select max(v.f == 1 ? v.r : '') as v1,
       max(v.f == 2 ? v.r : '') as v2,
       max(v.f == 3 ? v.r : '') as v3,
       max(v.f == 4 ? v.r : '') as v4,
       max(v.f == 5 ? v.r : '') as v5,
       max(v.f == 6 ? v.r : '') as v6,
       max(v.f == 7 ? v.r : '') as v7,
       max(v.f == 8 ? v.r : '') as v8,
       max(v.f == 9 ? v.r : '') as v9

4.2. More examples

link: the latest GINQ examples

Some examples in the above link require the latest SNAPSHOT version of Groovy to run.