Part 2

I just finished the alpha API for the executor layer, but that will be part three. In this part we will go over generating the type analog of the database.

FRM, ORM, or raw SQL? Here it's a type-safe builder, and in this instance we reflect off the schema to build out types. As noted in the first part, we have a branching path of types (Database -> ITable -> Fields).

The difference from other systems is twofold. First, I prefer writing SQL over objects/classes, so we define the database schema in SQL files with a migration platform. This gets back to my still liking to tune a Linux kernel for optimal bare-metal performance, but that's another matter. Second, the query builder stays near bare SQL syntax. There is no magic; it's explicit, and you know exactly what you're asking the database for. But the query builder builds a lot of repetitive data holders (lists, maps, etc.), which is why we look to compile queries.

Reflecting the Schema

While the aim is to eventually go beyond Postgres, at this point I have built strictly against the PostgreSQL database. The goal is the same irrespective of the underlying database: reflect over the database, gather information on the schema, then build out the respective types.

Getting the information

SELECT information_schema.COLUMNS.table_catalog as database,
       information_schema.COLUMNS.table_schema as schema,
       information_schema.COLUMNS.table_name as table,
       information_schema.COLUMNS.column_name as column_name,
       information_schema.COLUMNS.ordinal_position as column_order,
       information_schema.COLUMNS.is_nullable as column_nullable,
       information_schema.COLUMNS.data_type as column_type,
       information_schema.COLUMNS.udt_name as column_udt_type,
       information_schema.table_constraints.constraint_type as column_key
from information_schema.COLUMNS
left join information_schema.key_column_usage
    on information_schema.key_column_usage.column_name = information_schema.COLUMNS.column_name
        and information_schema.key_column_usage.constraint_schema = '${config.dataBaseSchema}'
left join information_schema.table_constraints
    on information_schema.table_constraints.table_name = information_schema.COLUMNS.table_name
        and information_schema.table_constraints.constraint_type = 'PRIMARY KEY'
        and information_schema.key_column_usage.constraint_schema = '${config.dataBaseSchema}'
where information_schema.COLUMNS.table_schema = '${config.dataBaseSchema}'
    and information_schema.COLUMNS.table_catalog = 'postgres'
group by
    information_schema.COLUMNS.table_catalog,
    information_schema.COLUMNS.table_schema,
    information_schema.COLUMNS.table_name,
    information_schema.COLUMNS.column_name,
    information_schema.COLUMNS.ordinal_position,
    information_schema.COLUMNS.is_nullable,
    information_schema.COLUMNS.data_type,
    information_schema.COLUMNS.udt_name,
    information_schema.table_constraints.constraint_type

This query will give us the following information.

  • Schema
  • Database
  • Table
  • Column
    • Name
    • Type
    • Can it be Null
    • Primary Key
    • Order

This is then cast into the following data class representation.

data class RawInformationSchema(val map: Map<String, Any?>) : IDBSchema {
    val database: String by map
    val schema: String by map
    val table: String by map
    val columnName: String by map
    val columnOrder: Int by map
    val columnNullable: String by map
    val columnType: String by map
    val columnUdtType: String by map
    val columnKey: String? by map
}

internal val informationSchemaColumns = listOf(
        "database",
        "schema",
        "table",
        "columnName",
        "columnOrder",
        "columnNullable",
        "columnType",
        "columnUdtType",
        "columnKey"
)

We cast the raw rows into this class via Kotlin's delegation by map.

        val result = conn.sendQuery(query).get()
        return result.rows.map { r ->
            RawInformationSchema((informationSchemaColumns zip r.toList()).toMap())
        }
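
As a quick illustration, each delegated property simply reads its own key from the backing map at access time. A minimal sketch with hypothetical values:

// Hypothetical row: every property resolves against this map on access.
val row = RawInformationSchema(mapOf(
        "database" to "postgres", "schema" to "example", "table" to "users",
        "columnName" to "id", "columnOrder" to 1, "columnNullable" to "NO",
        "columnType" to "integer", "columnUdtType" to "int4", "columnKey" to "PRIMARY KEY"
))
println(row.columnName) // id
println(row.columnKey)  // PRIMARY KEY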

Enums

The above only gives information on table and column structure, not custom data types. Warning: I've only covered enums thus far.

SELECT pg_type.typname AS enumtype,
       pg_enum.enumlabel AS enumlabel
FROM pg_type
JOIN pg_enum ON pg_enum.enumtypid = pg_type.oid

internal val enumColumns = listOf("enumtype", "enumlabel")

data class EnumSchema(val map: Map<String, Any?>) : IEnumSchema {
    val enumtype: String by map
    val enumlabel: String by map
}
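
As with the table reflection, these rows are cast via delegation by map and can then be collapsed into one list of labels per enum type. A minimal sketch, assuming the result handling mirrors the earlier snippet (the mood enum is purely hypothetical):

// Build EnumSchema rows, then group the labels by their enum type.
val enums: Map<String, List<String>> = result.rows
        .map { r -> EnumSchema((enumColumns zip r.toList()).toMap()) }
        .groupBy({ it.enumtype }, { it.enumlabel })
// A hypothetical mood type with labels happy/sad would generate:
// enum class Mood { Happy, Sad }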

Code Generation

Now that we have an iterable collection of information, we can generate code from it. This is where KotlinPoet comes in. First we need to group the columns by table. While this could be done via json_agg or other SQL tools, I couldn't get it working in my allotted time window, so we use Kotlin's groupBy from the collections library.
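
A minimal sketch of that grouping (the schemaRows variable name is assumed):

// Group the reflected rows so each table's columns reach the generator together.
val columnsByTable: Map<String, List<RawInformationSchema>> =
        schemaRows.groupBy { it.table }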

Alpha API

Seeing as I've only done Postgres thus far, this API is subject to change. I was trying to make it as agnostic as possible.

interface ISChemaGeneratorConfig {
    var dataBases: List<String>
    var dataBaseHost: String
    var dataBaseUser: String
    var dataBasePassword: String
    var dataBasePort: Int
    var namespace: String
    var outputDir: String
    var logLevel: String
    var platforms: Set<Platform>
    var excludeTables: Set<String>
}

Any database we want to reflect on will need connection information; those properties are prefixed with dataBase. The other items are for the Gradle plugin. Schema is a term in Postgres but not in other databases, so it lives in the Postgres extension rather than the shared interface. Extending this for Postgres results in:

open class PostgresSQLConfiguration : ISChemaGeneratorConfig {
    override var dataBases: List<String> = listOf()
    override var dataBaseHost: String = "localhost"
    override var dataBaseUser: String = "postgres"
    override var dataBasePassword: String = "postgres"
    override var dataBasePort: Int = 5432
    override var namespace: String = "design.animus.datasource.sql.generated"
    override var outputDir: String = "./"
    override var logLevel = "DEBUG"
    var dataBaseSchema: String = "public"
    override var platforms: Set<Platform> = setOf(Common, JavaScript, JVM)
    override var excludeTables = setOf<String>()
}

Above we've added a few more fields for the Postgres connection, most notably dataBaseSchema.

Generating Code

The following snippet generates the database object. The database is simply a scoping mechanism for limiting queries across multiple databases; the generated table is parameterized by this custom type.

object DatabaseGenerator : IDatabaseGenerator<PostgresSQLConfiguration> {
    override suspend fun generateDatabaseCode(db: String, config: PostgresSQLConfiguration): ClassName {
        val codeWriter = CodeWriter<PostgresSQLConfiguration>()
        val packageName = "${config.namespace}.database.$db"
        val databaseClassName = renameClass(db)
        val dbType = TypeSpec.Companion.objectBuilder(databaseClassName)
                .addSuperinterface(IPostgresDatabase::class)
                .addProperty(
                        PropertySpec.builder("database", String::class)
                                .initializer("%S", db)
                                .addModifiers(KModifier.OVERRIDE)
                                .build()
                )
                .addProperty(
                        PropertySpec.builder("schema", String::class)
                                .initializer("%S", config.dataBaseSchema)
                                .addModifiers(KModifier.OVERRIDE)
                                .build()
                )
                .build()
        codeWriter.platformWriter(packageName, databaseClassName, config, dbType)
        return ClassName(packageName, databaseClassName)
    }
}

It will output the following code.

object Postgres : IPostgresDatabase {
  override val database: String = "postgres"
  override val schema: String = "example"
}

Any imports or dependencies are handled automatically. At the end we return a ClassName, which is passed into the next section to provide the type to the generated table.

From the plugin we have the following snippet.

val databaseClass = DatabaseGenerator.generateDatabaseCode(db, extension)
....
generateSQLTableCode(databaseClass, db, inTable, columns, tableKeys, extension)

Passing in the databaseClass, we can specify the database type via parameterization.
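
As a sketch of what that parameterization looks like in KotlinPoet (the supertype's package and the recordClass value here are assumptions):

import com.squareup.kotlinpoet.ClassName
import com.squareup.kotlinpoet.ParameterizedTypeName.Companion.parameterizedBy

// Parameterize the generated table's supertype with the database and record types.
val tableSuperType = ClassName("design.animus.datasource.sql", "IPostgresTable")
        .parameterizedBy(databaseClass, recordClass)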

Generating Table / Fields

This is where the difficult part comes in. Each target platform has a different type for a given column: for example, UUID is not available in JavaScript, and Date differs between platforms.
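
One way Kotlin multiplatform bridges such gaps is an expect/actual typealias. This is a general sketch of the mechanism, not necessarily this plugin's exact output:

// Common source set: declare a type each platform must provide.
expect class SQLDate

// JVM source set: satisfy the expectation with java.sql.Date.
actual typealias SQLDate = java.sql.Date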

Below is a snippet of the interface used to generate tables. It should be extensible enough for other platforms while providing an easy base to build on. In my design I've leaned toward passing variables rather than maintaining state.

interface ITableGenerator<C : ISChemaGeneratorConfig, S : IDBSchema, K : ISchemaKeyReflector> {
    suspend fun buildJVMTable(databaseClass: ClassName,
                              tableClassName: String,
                              rawTableName: String,
                              recordClassName: String,
                              recordClass: ClassName,
                              config: C,
                              cols: List<S>,
                              keys: List<K>,
                              multiPlatform: Boolean): TypeSpec

    suspend fun buildJVMRecord(databaseClass: ClassName,
                               tableClassName: String,
                               rawTableName: String,
                               recordClassName: String,
                               recordClass: ClassName,
                               config: C,
                               cols: List<S>,
                               keys: List<K>,
                               multiPlatform: Boolean): TypeSpec

    suspend fun buildJSTables(databaseClass: ClassName,
                              tableClassName: String,
                              rawTableName: String,
                              recordClassName: String,
                              recordClass: ClassName,
                              config: C,
                              cols: List<S>,
                              keys: List<K>,
                              multiPlatform: Boolean): TypeSpec

    suspend fun buildCommonTable(databaseClass: ClassName,
                                 tableClassName: String,
                                 rawTableName: String,
                                 recordClassName: String,
                                 recordClass: ClassName,
                                 config: C,
                                 multiPlatform: Boolean): TypeSpec
}

This pretty much boils down to:

  • Build $Platform Table
  • Build $Platform Record

The code is very similar to the aforementioned database generator. The main differentiator is getting the column type.

val colType = getColumnClassType(col.columnType, col, config)
....
enum class SQLTypes(val friendly: String) {
    ARRAY("ARRAY"),
    BigInt("bigint"),
    Boolean("boolean"),
    VarChar("character varying"),
    Date("date"),
    Integer("integer"),
    Numeric("numeric"),
    Text("text"),
    TimeStamp("timestamp without time zone"),
    UUID("uuid")
}
....
internal suspend fun getColumnClassType(dataType: String, col: RawInformationSchema, config: PostgresSQLConfiguration): TypeName {
    val type = getSQLTypeByName(dataType)
    return when (type) {
        ARRAY -> {
            val list = ClassName("kotlin.collections", "List")
            if (col.columnUdtType.startsWith("_")) {
                val packageName = "${config.namespace}.database.types.enum"
                val nestedType = ClassName(packageName, renameClass(col.columnUdtType.replace("_", "")))
                list.parameterizedBy(nestedType)
            } else {
                val nestedType = getColumnClassType(col.columnUdtType, col, config)
                list.parameterizedBy(nestedType)
            }
        }
        BigInt -> Long::class.asTypeName()
        SQLTypes.Boolean -> Boolean::class.asTypeName()
        VarChar -> String::class.asTypeName()
        Date -> SQLDate::class.asTypeName()
        Integer -> Int::class.asTypeName()
        Numeric -> Number::class.asTypeName()
        Text -> String::class.asTypeName()
        TimeStamp -> Timestamp::class.asTypeName()
        UUID -> JavaUUID::class.asTypeName()
        else -> String::class.asTypeName()
    }.copy(nullable = (col.columnNullable.toLowerCase() == "yes"))
}

Pulling the column type from Postgres via reflection, we specify what class it will be converted to. This removes the biggest bottleneck we have with multiplatform: ensuring proper type casting. Separating the methods per target platform lets us narrow down the corresponding type for each column.

A sample generated table

object Users : IPostgresTable<Postgres, UsersRecord> {
  override val name: String = "users"

  override val database: Postgres = Postgres

  override val columns: List<String> = listOf("id","firstName","lastName")

  override val recordClass: KClass<UsersRecord> = UsersRecord::class

  override val table: IPostgresTable<Postgres, UsersRecord> = this

  val id: PrimaryKey<Int, Postgres, UsersRecord, Users> = object : PrimaryKey <kotlin.Int, Postgres,
      UsersRecord, Users>("id", "", database.schema, name) {}

  val firstName: Field<String, Postgres, UsersRecord, Users> = object : Field <kotlin.String,
      Postgres, UsersRecord, Users>("first_name", "", database.schema, name) {}

  val lastName: Field<String, Postgres, UsersRecord, Users> = object : Field <kotlin.String,
      Postgres, UsersRecord, Users>("last_name", "", database.schema, name) {}
}

data class UsersRecord(
  val map: Map<String, Any>
) : ADatabaseRecord(map) {
  val id: Int by map

  val firstName: String by map

  val lastName: String by map
}

The data class is only good for a select * from users, so we have the option to provide a custom class to map results into, as sketched below.
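
A minimal sketch of such a custom class, following the shape of the generated UsersRecord (the projection itself is hypothetical):

// Hypothetical record for a narrower projection, e.g. select id, first_name from users.
data class UserNameRecord(val map: Map<String, Any>) : ADatabaseRecord(map) {
    val id: Int by map
    val firstName: String by map
}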

While there is a lot of redundant information in the ITable object, it is necessary due to the lack of reflection on some platforms. Overpopulating the data allows for more repeatable functions across platforms.

That being said, this object is only utilized when building a query. With compiled queries we avoid that performance overhead.

Gradle Plugin

To automatically generate this code, a Gradle plugin is provided; the snippet to configure it follows. Flyway is a database migration utility that lets you build and apply migrations.

With CI/CD, pulling the database information from the environment is recommended. For testing, it is provided as GitLab CI/CD secrets, allowing migrations to be performed during the CI process.

The second part is the configuration for the Gradle plugin. The differentiating factors are:

  • platforms A lot of users only need the JVM; if left blank, it will build a multiplatform project.
  • excludeTables Contains any tables that you do not want to generate code for. In this example we are excluding the Flyway migration table.

plugins {
    id("design.animus.kotlin.frm.sql.schema.postgresql") version "0.1.0-SNAPSHOT"
    id("org.flywaydb.flyway") version "6.0.7"
}

val dbHost = System.getenv("pgDataBaseHost") ?: "localhost"
val db = System.getenv("pgDataBase") ?: "postgres"
val dbPort = (System.getenv("pgDataBasePort") ?: "5432").toInt()
val dbUser = System.getenv("pgDataBaseUser") ?: "postgres"
val dbPassword = System.getenv("pgDataBasePassword") ?: "postgres"

flyway {
    url = "jdbc:postgresql://$dbHost/$db?user=$dbUser&password=$dbPassword"
    schemas = arrayOf("example")
    locations = arrayOf("filesystem:$projectDir/main/resources")
}

databaseGeneratorConfig {
    dataBases = listOf(db)
    dataBaseHost = dbHost
    dataBaseUser = dbUser
    dataBasePassword = dbPassword
    dataBasePort = dbPort
    namespace = "design.animus.kotlin.frm.sql.example.jvm.postgres.generated"
    outputDir = "$projectDir/generated"
    dataBaseSchema = "example"
    platforms = setOf(design.animus.kotlin.frm.sql.schema.common.JVM)
    excludeTables = setOf("flyway_schema_history")
}

Tasks

  • buildSchema Builds the types from the schema.
  • checkConnection Verifies that your connection works.

Separating Generated from Source Code

If you're using multiplatform, you may want the generated code in a completely different module. But assuming, like the above, it's strictly JVM:

kotlin {
    sourceSets {
        main {
            kotlin.setSrcDirs(
                    mutableListOf("main", "generated")
            )
        }
    }
}

This will include the generated directory, and thereby any built types, in the project.

Closing

This is an initial introduction to the Gradle plugin and how the code is generated. The more I use KotlinPoet, the more impressed I am with it. Generating types is just the tip of the iceberg.

Combining this with a Gradle plugin has made for an easy way to build out a type-safe query builder.