For the past several years I have been using jooq for writing my SQL queries. Combined with Flyway. It allowed for close to bare metal almost raw SQL access.

However during my recent transition to more reactive micro services, and doubling down on Vertx. I switched it's use model. I was no longer using fetchInto, or having it return an object. Instead it was just a string builder for me. The jaSync driver being utilized took in a string as a query.

When Strings are Good and Bad

Having a litany of unchecked sql strings in your code can lead to a possible bug. With Jooq plus Flyway. If you altered a table, and queried a column that no longer existed. Or even changed type then it would result in an error.

If we just use a plain string for sending the query, prepared or not. Then there is no type checking at compile time. We lost that ability of allowing the compiler to ensure type safety.

Looking at almost any data source connection library. The earliest building block is a string. It may change as to whether the prepared template variable is a !, ?, or #. But it takes in a string at the most fundamenetal level.

This isn't just JVM, but JS. What if we could reflect on the schema, generate out a interface / object hierarchy. Of the SQL / data access. Why stop at SQL, if we can pull a schema we can build a query language.

As an example we reflect on the Postgres schema. Pulling all tables under a given schema/database. Then generate the requisite objects / types for accessing it. Now we just need methods/DSL to build out the query string. Irrespective of runtime (js or jvm). Then we can have an executor that calls JaSync, Node Postgres, insert connection layer here.

This brings us to when Strings are bad. When you are building a high performant platform. Where every millisecond counts. A query string builder is unecessary resource allocation. You're building a data class, lists, etc. Parsing strings. Why repeat this over and over at run time?

What if instead we could compile the above DSL into an object with a const string. Versioning that object. If we were to have a query of:

select user.id, user.first_name, user.last_name from users where last_login between '2018-01-01' and '2018-01-31';

It could fit into an object of

object GetUsersByLastLogin : CompiledQuery {
	const query = "$ThatSQLString"
    version = 2
}

Now we have a compiled final string, without having to go through the query building process. This is beneficial primarily for performance tailord applications. But lastly it is type safe.

Building A Query Language

            val query = sqlQuery<ExampleDatabase, Users> {
                select() from Users where {
                    (Users.id equal 1) and (Users.firstName equal "John") or (Users.lastName equal "Doe")
                }
            }

Jooq use a fluent or builder pattern. One call after the other. I am still finalizing the API calls. But the raw string builder will look like the above.

The entry point is:

select()

This builds out a MutableSelectQuery which holds all the information about the query. Columns to be queries, conditions, table, etc. This call returns that instance. Then we use infix functions based off that object to finish building the query. We can also do

select(Users.id, Users.firstName as "FIRSTNAME")

Taking in a vararg on a field type. That field type also has helper methods. Instead of

//Jooq
Users.ID.`as`("id")
// Kotlin FRM
Users.id AS "id"

We are chaining functions together to build out a type safe query string. The same over load is applied to the table and other areas. To seperate some calls, ala order of operations. We need to encapsulate them in (). But it looks strikingly like straight SQL.

What does this do though? So far it's just a string. That above snippet is from a test case.

            assertTrue {
                (query.toString()) ==
                        "SELECT * from public.example.users where public.example.users.id = '1' AND public.example.users.first_name = 'John' OR public.example.users.last_name = 'Doe'"
            }

We are verifying that it builds out the proper query. But going back to the inital statement. We have types representing our database now. Those types are multi platform enabled. This is where the concept of executors come in.

Executors

The project is broken into a number of different modules.

  • query
  • schema
  • executor

Query

Is utilized for building out the queries. There are modules under query sql(SQL), cql(cassandra), etc. Then there are also db specific sql/query-postgres, sql/query-mariadb. These take in a base type of what a table and field is. Then builds the query out for that.

Schema

This will read out the schema from an existing database. Connecting via a JVM connection agent. Then using kotlin poet to generate the respective type hiearchy.

This constructed type hiearchy can then be used with the query fields.

Executor

The executor is the last portion of this. It will establish a connection to the respective data source. Looking back the smalles building block is a string. Using the query modules to build a string. We can then pass it to whichever executor, on any given runtime we wish.

There will be syntactic sugar atop of it. But this is a structured building block and composition.

Gradle Plugin

In the schema section we referenced building out the type hiearchy automatically.

plugins {
    id "kotlin-platform-jvm"
    id "design.animus.datasource.sql.schema.postgresql" version "0.1.0"
}

dependencies {
    implementation "org.jetbrains.kotlin:kotlin-stdlib-jdk8:$KotlinVersion"
    implementation "org.jetbrains.kotlinx:kotlinx-coroutines-core:$CoroutineVersion"
    implementation "org.jetbrains.kotlinx:kotlinx-serialization-runtime:$SerializationVersion"
    compile project(":DataSourceDSL:sql:query:query-common:")
}
sourceSets {
    main.resources.srcDirs += "main/resources"
    main.kotlin.srcDirs += "main"
    main.kotlin.srcDirs += "generated"
    test.kotlin.srcDirs += "test"
    test.resources.srcDirs += "test/resources"
}
compileKotlin.kotlinOptions.jvmTarget = '1.8'
sourceCompatibility = 1.8
targetCompatibility = 1.8

postgresConfig {
    dataBases = ["example"]
    dataBaseHost = "localhost"
    dataBaseUser = "postgres"
    dataBasePassword  = "postgres"
    dataBasePort = 5432
    namespace  = "design.animus.datasource.sql.example.generated"
    outputDir =  "$projectDir/generated"
    dataBaseSchema = "public"
}

The above is a gradle snippet for building out the type hierarchy.

This will generate a structure like the following.

// Data Base
object ExampleDatabase : Database {
    override val database = "example"
    override val schema = "public"
}

// Table
object Users : ITable<ExampleDatabase> {
    val id: PrimaryKey<Long, ExampleDatabase, Users> = object : PrimaryKey<Long, ExampleDatabase, Users>("id", "bigint") {}
    val firstName: Field<String, ExampleDatabase, Users> = object : Field<String, ExampleDatabase, Users>("first_name", "varchar(24)") {}
    val lastName: Field<String, ExampleDatabase, Users> = object : Field<String, ExampleDatabase, Users>("last_name", "varchar(24)") {}
    override val name = "users"
    override val database = ExampleDatabase
}

The generics I've constructed so far are.

//Database

interface Database {
	val database: String // The database name
    val schema : String // The schema of the database
}

// Table

interace Itable<D:Database> {
   val database : Database // The data base object
   val name : String // The name of the table
}

// Fields
abstract class Field<C:Any, D:Database, T:ITable<D>(val columnName:String, val columnType:String)

We have a descendant tree pattern database -> table -> fields. This ensures strong type saftey top to bottom. Additionally by using generics and defining at the beggining of the query (`sqlQuery<ExampleDataBase, Users>`). We have set a scope the queries fall under that database, and table. This type safety carries over to conditional checks and columns. You will receive a type error if you try and access another table outside the anticipated boundary.

API Beta

I am still defining the core API syntax. I need to account for transactions, and other more advanced query mechanisms.

Functional?

( for( c <- coffees; if c.price < limit ) yield c.name ).result

Slick via scala gives the above example. The functional relational part speaks to that. How do we map this over in kotlin?

Right now I'm thinking something like.

// Builder Notation
Users.filter { Date(2018, 10, 9) <lastLogin < Date.now() }
     .map {record: UsersRecord -> 
        println(record)
     }
     .executor(Executor.JaSync)
     
Users.join(UserProfile, UserProfile.userId equals Users.id)
     .of(Users.id, Users.firstName, Users.lastName, UserProfile.email)
     .map { row:Map<String, Any> ->
        println(row)
     }
     .executor(Executor.JaSync)

A builder notation seems more discoverable than a custom DSL. The prior DSL is based off SQL and has a low barrier to discovery. Additionally Kotlin functional is built largely off these chains of filter, map, groupBy, etc.

Next Steps

I'm working to finish initial postgres support with a JaSync executor, and maybe JDBC. Once that is done I will post a guide on using the string builder pattern. Then work on functional aspects.