/

Advanced type mapping

Techniques for mapping messy, real-world data


These features describe techniques used when mapping data to semantic values when things don't neatly line up, or when enriching another representation of a model.

These are often useful when mapping data provided by a system outside your control, but can be used anywhere. Additionally, these techniques can be applied to incrementally expose existing data to the semantics of your type system.

Field accessors

Field accessors describe a way of reading data into a field from somewhere within a document.

By default, taxi assumes that the content that is being described read matches the structure of the content. However, for some content (such as CSV), or for external content, this isn't always possible.

Accessors are being replaced by expressions.

They still provide access for jsonPath(), column() and xpath(), but are deprecated for all other usages.

Describing CSV data - column()

CSV data can be described in taxi using by column() syntax, passing either the header name or the column index.

eg:

model Person {
    firstName : FirstName by column('firstName')
    lastName : LastName by column('lastName')
}

The value passed can either be:

  • the name of the column
  • the index of the column. Indexes are assumed to be 1-based. (That is, the first column is index 1, not index 0).

Describing Json data - jsonPath()

By default, json data can be parsed by Taxi without any special markup. However, if you wish to expose a value from elsewhere in a document, you can use a jsonPath expression:

model Person {
    firstName : FirstName by jsonPath('$.person.names.FirstName')
}

Parsers will read the firstName attribute from the specified jsonPath.

Relative JsonPath:

type Oscar inherits String
model Actor {
   name : ActorName inherits String
   awards : {
      mostRecentOscar: Oscar by jsonPath("oscars[0]")
   }
}

Given the following json:

[
   {
      "name" : "Tom Cruise",
       "awards" : {
         "oscars" : [ "Best Movie" ]
       }
   },
   {
      "name" : "Tom Jones",
       "awards" : {
         "oscars" : [ "Best Song" ]
       }
   }
]

This line: oscars: Oscar by jsonPath("oscars[0]") would flatten out the oscars property to the first value - ie:

[
   {
      "name" : "Tom Cruise",
       "awards" : {
         "mostRecentOscar" : "Best Movie"
       }
   },
   {
      "name" : "Tom Jones",
       "awards" : {
         "mostRecentOscar" :  "Best Song"
       }
   }
]

Describing Xml data - xpath()

By default, xml data can be described by Taxi and parsed by parsers without any special markup. However, if you wish to expose a value from elsewhere in a document, you can use an xpath expression:

model Person {
    firstName : FirstName by xpath('//person/names[@type = 'firstName]/name')
}

Parsers will read the firstName attribute from the specified xpath.

Conditional logic - when blocks

Conditional fields let you expose semantic data based on values from other fields within the model. This is useful when parsing semi-structured data into a semantic model.

model Person {
    homePhone : HomePhoneNumber by column("homePhone")
    workPhone : WorkPhoneNumber by column("workPhone")
    preferredPhone : String by column("preferredPhone")
    preferredPhoneNumber : PreferredPhoneNumber? by when (column("preferredPhone")) {
        "Home" -> this.homePhone
        "Work" -> this.workPhone
        else -> null
    }
}

When blocks are very flexible, and analogous to a series of if...then...else statements . They support matching either on a specific value:

// switching on a specific value
model Person {
    preferredPhone : String by column("preferredPhone")
    preferredPhoneNumber : PreferredPhoneNumber? by when (this.preferredPhone) {
        "Home" -> this.homePhone
        "Work" -> this.workPhone
        else -> null
    }
}

or matching on the first true result:

model Person {
    preferredPhoneNumber : PreferredPhoneNumber? by when {
        this.preferredPhone == 'Home' -> column("homePhone")
        this.preferredPhone == 'Work' -> column("workPhone")
        else -> null
    }
}

The conditions in a when block may make reference to other fields, and perform the following comparisons:

SymbolMeaning
==Equal to
!=Not equal to
>Greater than
>=Greater than or equal to
<Less than
<=Less than or equal to
&&And
||Or

Here's a more complex example:

enum OrderStatus {
    Unfilled,
    Rejected,
    PartialFill,
    FilledCompletely
}
model PurchaseOrder {
    sellerName: String?
    status: String?
    initialQuantity: Decimal?
    remainingQuantity: Decimal?
    quantityStatus: OrderStatus by when {
        this.initialQuantity = this.leavesQuantity -> OrderStatus.Unfilled
        this.trader = "Marty" || this.status = "Foo" -> OrderStatus.Rejected
        this.remainingQuantity > 0 && this.remainingQuantity < this.initialQuantity -> OrderStatus.PartialFill
        else -> OrderStatus.Unfilled
    }
}

Expressions

As with everything in Taxi, evaluation of expressions is left to a parser / runtime, like Orbital.

This document outlines the expected spec of expressions, but Taxi itself does not ship with an evaluation engine.

Expressions can be declared either on a type directly, or on a field. They can be simple statements, or include function calls.

As statements, expression can use the following operators:

SymbolMeaning
+Add
-Subtract
*Multiply
/Divide

Expressions can also perform comparisons, using the same operators described in conditional logic.

Expression types

type Height inherits Int
type Width inherits Int

// A simple expression, multiplying two fields
type Area inherits Int by Width * Height

// Expressions may mix references to other types, and constants
type WhackyArea inherits Int by (Width * Height) / 2

// Expressions can call functions
type FullName inherits String by concat(FirstName, ' ', LastName)

Expression on fields

Fields in a model may declare expressions inline.

In addition to standard syntax used in expression types, fields may also make reference to other fields, using this.fieldName syntax

eg:

model Purchase {
    price : PerItemPrice
    quantity : Quantity
    total : TotalCost by (PerItemPrice * Quantity)
    // alternatively:
    anotherTotal : TotalCost by (this.price * this.quantity)
}

Special calculation operations

Calculation operations aren't limited to just numeric types. The following special operation types are supported:

OperationResult typeEffectExample
String + String (Deprecated)StringConcatenationfullName : FullName by (this.firstName + this.lastName)
Date + TimeInstantCombinestimestamp : Instant by (this.recordDate + this.recordTime)

Concatenating strings using the + operator is deprecated. Consider using the concat function instead.

Note: Taxi does not perform the evaluations. This is left to parsers to handle.

As such, other parsers may provide additional operations. The above outlines the capabilities supported by Orbital.

Referencing other fields

When referencing other fields, either in functions, or when classes, you can reference the field in one of two ways:

MethodSyntaxExample
By field namethis.fieldNamewhen(this.firstName)
By typeFieldTypewhen(FirstName)

Type Extensions

Type extensions allow mixing in additional data to a type after it's been defined.

It's possible to extend a type by adding either annotations, or type refinements. Structural changes to types are not permitted (and the compiler will prevent them).

Type extensions can be defined either in the same file, or (more typically), in a different file from the original defining type.

// As defined in one file.
type Person {
    personId : Int
    firstName : String
    lastName : String
}

type alias FirstName as String
// Extending the type to provide additional context:
type extension Person {

    // Adding an annotation.  No additional type is defined, so the
    // underlying type remains the same -- an Int
    @Id
    personId

    // Refining the type.  Note that FirstName represents the same
    // type as the original String, so this is permitted.
    // If the types were incompatible, the compiler would alert an error
    firstName : FirstName
}

There are a few scenarios where this may be useful:

Adding documentation

If consuming a third party API via a format like Swagger, it's useful to mix in consumer specific documentation.

Type extensions allow this, by adding taxidoc to the types

Refining types with semantic types

Not all languages and spec tools are created equal - and some don't have great support for semantic types. Extensions allow for the narrowing (but not redefinition) of the type, through specifying more narrow types.

The compiler will prevent type extensions from introducing incompatible type changes.

The compiler also restricts type refinements to a single refinement per type

Mixing-in metadata

A consumer of an API may wish to leverage tooling that uses annotations - eg., persisting an entity into a database.

Type extensions allow providing tooling specific metadata that's only useful to the consumer - not the producer.

Functions

Taxi allows for the declaration of functions, which can be used within a type spec.

As with everything in Taxi, evaluation of functions is left to a parser / runtime, like Orbital.

As Taxi doesn't support evaluation, functions are declared simply by declaring their interface. There is no ability to express the implementation of a function within Taxi.

Declaring functions

Functions are declared with a name, a list of inputs, and a return type:

declare function uppercase(String):String

These can then be used within expressions:

model Person {
   title : Title by uppercase(Salutation)
}

Vararg parameters

A function may declare it's final parameter as vararg, which means that multiple arguments may be provided.

Vararg is only permitted in the final argument.

declare function concat(String...):String

Generics

A function may declare a generic type signature. The Taxi compiler will infer types, and ensure that passed parameters in usages adhere to contracts.

declare function <T,A> reduce(T[], (T,A) -> A): A
Edit on Github