One of the big things I missed with JOOQ was JSON aggregate fields in Postures. There were a number of times where it would be very useful to aggregate. But JSON is just an example of an aggregate field, there are much more in the ANSI SQL standard.

What is an Aggregate Field

At a base level, take count. It's a function that takes in a field and counts it. Based on this it will return an Int representation of the determined result set. But this is a very simple example.

If you take a more complicated example of json_agg, and yes this was my benchmark for this feature.

json_agg(
                json_build_object(
                    'orderPlaced', user_orders.order_placed,
                    'inventory', json_build_object(
                        'id', inventory.id,
                        'name', inventory.name,
                        'price', inventory.price
                    )
                )
            )

This takes a lot more than one field. It takes a number of nested fields, and maybe even another aggregate function. With that let's scope out the aggregate field.

Building a Generic Aggregate Field

In my parlance column is a field. Fields are wrapped in a sealed class of Fields. We build a new AggregateField type. This generic container allows us to

data class AggregateField<C : Any?, D : IDatabase, R : ADatabaseRecord, T : ITable<D, R>>(
        val aggregate: ISQLAggregate,
        val columns: List<Fields<*, D, *, *>> = listOf(),
        val multipleColumns: Boolean = false,
        val named: String = ""
)

There are other items common to the Fields parent class. But we're highlighting the new additions. ISQLAggregate allows us to easily build out a new aggregate field. It's not tied to a specific relational database type. So we can define ANSI and database specific aggregates.

  • aggregate The aggregate function to execute
  • columns The list of columns that fall under aggregate function.
  • multipleColumns If the aggregate function call takes multiple columns or just one.
interface ISQLAggregate {
    val functionName: String
}

Let's Start with Count


sealed class CommonAggregates : ISQLAggregate
object Count : CommonAggregates() {  override val functionName = "count"}

internal suspend fun <C : Any?, D : IDatabase, R : ADatabaseRecord, T : ITable<D, R>, AG : ISQLAggregate> buildAggregateForSingleField(
        aggregate: AG, field: Fields<C, D, R, T>) =
        AggregateField<C, D, R, T>(
                aggregate, field.columnName, field.columnType, field.schema, field.table, listOf(), false
        )

suspend fun <C : Any?, D : IDatabase, R : ADatabaseRecord, T : ITable<D, R>> count(field: Fields<C, D, R, T>) =
        buildAggregateForSingleField(Count, field)

With select, and other conditional operators are all built on the sealed class Fields. This means we don't have to do anything special you can now use count like the following.

select(count(Users.id), Users.firstName) from Users groupBy listOf(Users.firstName)

This is where the builder comes in. By and large the query module set is a type safe SQL builder. This also highlights the benefits of sealed classes. The parts that format the query are when expressions. When we add a new field type it throws an error and we just need to add that format.

Creating the Builder

This is from the postgres sql builder. But it has the same idea across databases.

override fun buildColumnString(schemaTable: SchemaTableName, friendlyTable: FriendlyTableName, field: Fields<*, *, *, *>): String {
     return when (field) {
     	is AggregateField -> {
        is Count, Sum, Maximum, Minimum, Average -> "${field.aggregate.functionName}($schemaTable.${field.columnName})"                    
        }
    }

With grouped pattern matching we've now vetted most of the ANSI SQL aggregate fields. Looking back at the interface we defined there is a property for function name. We enclose that in (). Then pass in the field name.

Json

I went over this DSL a number of times until I hit on a happy point.

Starting at the top we define a Postgres aggregate sealed class. This allows us to build out a group of additional aggregate fields in the future.

sealed class PostgresSQLAggregate : ISQLAggregate

data class JsonAggregate(val struct: List<JsonAggPair>) : PostgresSQLAggregate() {
    override val functionName = "json_agg"
}

data class JsonObject(val pairs: List<JsonAggPair>) : PostgresSQLAggregate() {
    override val functionName = "json_build_object"
}

The immediate difference is that these are stateful. That is because they will contain additional state outside of a base list of fields to aggregate. As before these are not exposed to the end user. We build an abstraction to ease creation of this. Trying to map closely to SQL.

suspend fun <D : IDatabase> jsonAgg(name: String = "", builder: suspend JsonAgg.() -> Unit): AggregateField<Any?, D, ADatabaseRecord, ITable<D, ADatabaseRecord>> {
    val state = JsonAgg()
    state.builder()
    return AggregateField<Any?, D, ADatabaseRecord, ITable<D, ADatabaseRecord>>(
            JsonAggregate(state.builder.toList()), name, "", "", "", named = name
    )
}

As in the last step we build out an AggregateField from a simple helper function. We provide a DSL to build out this aggregate statement. With our generics we bind it a data base, the rest are generic. Essentially we allow aggregates across fields but not databases.

The DSL

class JsonAgg {
    var builder: MutableList<JsonAggPair> = mutableListOf()
    suspend fun jsonObj(builder: suspend JsonObjectBuilder.() -> Unit) {
        val state = JsonObjectBuilder()
        state.builder()
        this.builder.addAll(state.pairs)
    }
}

At the third level we get to the meat and potatoes.

class JsonObjectBuilder {
    val pairs: MutableList<JsonAggPair> = mutableListOf()

    suspend infix fun <C : Any?, D : IDatabase, R : ADatabaseRecord, T : ITable<D, R>>
            String.to(field: Fields<C, D, R, T>) =
            pairs.add(Pair(this, field))

    suspend infix fun String.to(jsonObj: JsonObject) =
            pairs.add(Pair(this, AggregateField<Any?, IDatabase, ADatabaseRecord, ITable<IDatabase, ADatabaseRecord>>(
                    jsonObj, "", "", "", ""
            )))

    suspend fun jsonObj(builder: suspend JsonObjectBuilder.() -> Unit): JsonObject {
        val state = JsonObjectBuilder()
        state.builder()
        return JsonObject(state.pairs.toList())
    }
}

Looking at the initial sql example it was nested i.e.:

json_agg(json_build_object(..))

That is because it returns an array of objects.

[{id: 1}, {id: 2}]

Before showing the builder logic let's clarify the DSL.

            selectJ(Users.id, Users.firstName, Users.lastName, jsonAgg {
                jsonObj {
                    "orderPlaced" to UserOrders.orderPlaced
                    "inventory" to jsonObj {
                        "id" to Inventory.id
                        "name" to Inventory.item
                        "price" to Inventory.price
                    }
                }
            }) 

We overload/utilize the known infix function of to. Mapping a key to a field. This also shows we're not bound to one level. This should work at a very deep level. As `jsonObj` is just a field. The infix to is a String to a Fields object. This goes back to our sealed class and the flexibiltiy it gives.

There isn't much at this layer. We are building a series of structs.

typealias JsonAggPair = Pair<String, Fields<*, *, *, *>>

That type alias spells out a majority of the logic. The JsonAgg field takes in a list of JsonAggPair. Which that can be nested. So now we have a String key mapped to a Fields. Unfortunately I had to wild card via * due to type inference errors. One of the things I miss about F#.

Builder

This is where the hard part came in.

        return when (field) {
            is AggregateField -> {
                when (field.aggregate) {
                    is JsonAggregate -> {
                        val aggregate = field.aggregate as JsonAggregate
                        val jsonBuilder = aggregate.struct.joinToString(",") { jsonPair ->
                            val (key, jsonField) = jsonPair
                            val columnRepresentation = buildColumnString(jsonField.table, jsonField.table, jsonField)
                            "'$key',$columnRepresentation"
                        }
                        val named = if (field.named.isEmpty()) "" else " as ${field.named}"
                        "${aggregate.functionName}(json_build_object($jsonBuilder))$named"
                    }
                    is JsonObject -> {
                        val aggregate = field.aggregate as JsonObject
                        val jsonBuilder = aggregate.pairs.joinToString(",") { pair ->
                            val (key, jsonField) = pair
                            val columnRepresentation = buildColumnString(jsonField.table, jsonField.table, jsonField)
                            "'$key',$columnRepresentation"
                        }
                        val named = if (field.named.isEmpty()) "" else " as ${field.named}"
                        "${aggregate.functionName}($jsonBuilder)$named"
                    }
                    else -> "${field.aggregate.functionName}($schemaTable.${field.columnName})"
                }
            }
        }
    }

This is a tail recursive function, and the cost is high. The more you nest this aggregate the deeper the recursion.

Breaking down the example we had.

 jsonAgg {
                jsonObj {
                    "orderPlaced" to UserOrders.orderPlaced

We first hit the JsonAggregate type. This will de-structure the pair. We know on the left we have a key of string. On the right will be a column of some sort. Since this is a sealed class it can be a NamedField, AggregateField, or Field. To avoid assumptions we pass it back into itself. Of note is that we pass in the extracted fields table. A string of the field's table name is embedded into the field object.

In this example we would say:

  • Key of orderPlaced
  • To Field UserOrder.orderPlaced

This maps too:

object UserOrdersOrderPlaced : Field<String, Postgres, UserOrdersRecord, UserOrders>("order_placed",
    "timestamp without time zone", "jasync_example", "user_orders")

So we can translate this too.

'orderPlaced', user_orders.order_placed

The next part is where it get's more interesting.

"inventory" to jsonObj {

So now when we hit the pattern match. Passing back in the field representation. We now hit the is JsonObject -> {} logic. The loops then repeats like back and forth.

There be Problems

Much like the Titanic we're about to hit a problem, too soon? #SorryNotSorry.

There are two big problems with this.

  • First Is the tail recursive nature of this. Building that query over and over again will be very costly and have a performance impact.
  • Secondly We now have to de serialize JSON object.

Seriously at this point I feel like Kotlin poet is my hammer, and everything is a nail waiting to be whacked.

Looking at the above. When we build out a query we know.

  • The fields in the query that are to be returned.
  • The type of the field multi-platform and other wise.
  • The structure to be returned.

With these fields we can build out a custom type to be returned. We can also denote if it's a json agg field it will be a serializble type.

Next Up

Next up I will be discussing prepared queries. Then compilation should be after that.