/

Types and models

Types and Models are the foundation of describing data within Taxi


Types and Models form the core building blocks of data and API's. These describe the contracts of data that float about in our ecosystems - published by services, exposed via processes, and sent from 3rd party suppliers.

In Taxi, as part of building out an robust composable type system, we differentiate between xxxx

Types

Types declare a simple, resuable concept.

Basic syntax:

// types can have comments (ignored by the parser), or docs, as shown below:
[[ Any type of Name, used for refer to things ]]
type Name inherits String

[[ A first name of a person.  Use this to call them for cake. ]]
type FirstName inherits Name

Using inheritence to describe specificitiy

It's strongly recommended to inherit a type from another type - either one of Taxi's built-in primitives, or one of your own, to narrow the specificity of the type. This is discussed more in inheritence.

Fields on types

It is possible, (but discouraged) for types to contain fields:

// This is possible, but discouraged.  Use a Model instead
type Name {
    firstName : FirstName
    lastName : LastName
}

Generally, it is discouraged for Types to have fields. It limits the reusability of types, as all users of the type must satisfy the same contract. In general, we recommend using Types that don't have fields, and Models that do.

We encourage widely shared types, and narrowly shared models.

Semantic typing

Semantic typing simply means defining types that describe the actual information being shared. It differs from most type systems, which focus purely on the representation of the data.

TermMeaningExample
RepresentationHow data is transmitted or encoded within a systemString,Number,Date
InformationThe actual facts sitting inside the dataCustomerName,Age,DateOfBirth

The information is the most useful part of any data, and so Taxi is intended to build type systems that describe this. Taxi encourages using lots of small, descriptive and narrowly focussed types as the building blocks of a shared enterprise taxonomy.

This is one of the key features of Taxi, and allows tooling to make intelligent choices about the information our systems are exposing.

For example, consider two pieces of information - a customers name, and their email address. Both are represented as Strings when being sent between systems - but both are very differnet pieces of information. Semantic typing aims to describe the information within a field, not just it's representation.

type Name inherits String
type EmailAddress inherits String

By building out a rich vocabulary of semantic types, models and APIs can be explicit about the type of information they require and produce.

For example - consider a service that takes an EmailAddress, and returns the customers name.

Without semantic typing, this would look like this:

// Not semantically typed.  Technically correct, but lacks sufficient information to be descriptive, or automate any tooling around
operation getCustomerName(String):String

// Semantically typed.  The API is now much richer, and has hints that tooling can start leveraging
operation getCustomerName(EmailAddress):Name

Through the use of inheritence, we can further refine these concepts, to add richer semantic descriptions about the information. This allows operations to add extra levels of specificity to their contracts, and for models to be descriptive about the data they are producing:


type FirstName inherits Name
type PersonalEmaillAddress inherits EmailAddress
type WorkEmailAddress inherits EmailAddress

// This operation can work against any type of `EmailAddress`:
operation getCustomerName(EmailAddress):Name

// Whereas this operation only works if passed a `WorkEmailAddress`.
operation getCustomerName(WorkEmailAddress):Name

Models

Models describe the contract of data either produced or consumed by a system. Models contain fields, described as types.

type PersonId inherits Int
type FirstName inherits String
model Person {
   id : PersonId // refernece to another custom type
   firstName : FirstName
   lastName : String // You don't have to use microtypes if they don't add value
   friends : Person[] // Lists are supported
   spouse : Person? // '?' indicates nullable types
}

Models can use inheritence, but this is less common and generally less useful than with Types.

Models should be produced and owned by a single system, rather than shared widely across mutliple systems in an enterprise.
When models start getting shared too widely, it becomes difficult for them to change and evolve.

Instead, favour single-system models, composed of terms from a widely shared Taxonomy of types.

Parameter models

Parameter models are special models that indicate to tooling that it's safe to construct these at runtime. Declaring something as a Parameter type has no other impact within Taxi on the type's definition, but is used in other tooling. (Eg., Vyne).

service PeopleService {
    operation createPerson(CreatePersonRequest):Person
}

model Person {
    id : Id as Int
    firstName : FirstName as String
    lastName : LastName as String
}

parameter model CreatePersonRequest {
    firstName : FirstName
    lastName : LastName
}

Types vs Models - Which to use?

Types

Use Types to describe a single specific piece of information. Try to form agreement on the definition of Types across an organisation. Generally, Types should match the shared business terms used across an organsiation to describe the business. Because types are small (only describing a single attribute), and specific, it's generally easier to form consensus around the definition of a Type.

Types are owned globally across an organisation, so there should be a curation process surrounding their definition.

Types generally don't have fields. As soon as a types have fields, users must agree on both the meaning of the information, and the representation. The latter tends to provide much greater friction, and leads to less reusable types.

Models

Use Models to describe the contract and structure of data either produced or consumed by a single system.

Models should be owned by the owners of a system, who are free to evolve and grow their models autonomously. Models have fields and structure, described using Types.

Built-in types

Taxi is a strongly typed language, that is, a discrete attribute will ultimately inherit from a number of limited primitive types. If you recall, the our example taxonomy had “animal” at the top, these primitive types will replace “animal” in taxonomies you design.

Taxi currently supports the following primitive types. (See the Language Spec)

Taxi TypeCommentsDefault format
Booleantrue or false-
StringA list of characters-
IntA numeric type without decimal places-
DoubleA numeric type with decimal places, using floating-point precision-
DecimalA numeric type with decimal places-
DateRepresents a day, without time, or a time zone.yyyy-MM-dd
TimeA time, in the form of hh:mm:ss. Does not support dates, or time-zones.HH:mm:ss
DateTimeDescribes both a date and a time, but without a timezone.yyyy-MM-dd'T'HH:mm:ss.SSS
InstantA timestamp indicating an absolute point in time. Includes time-zone.yyyy-MM-dd'T'HH:mm:ss[.SSS]X
AnyCan be anything. Try to avoid using Any as it's not descriptive, favour a more specific type if possible

Array types

An Array (specifically, lang.taxi.Array) is a special type within Taxi, and can be represented in one of two ways - either as Array<Type> or Type[].

model Person {
    friends : Person[]
}

// or:

model Person {
    friends : Array<Person>
}

These two approaches are exactly the same. In the compiler, Person[] gets translated to Array<Person>.

It is possible to alias Arrays to another type:

type alias Friends as Person[]

model Person {
    // This is the same as friends : Person[]
    friends : Friends
}

However, be aware that this has a questionable effect on readability.

See also: Type aliases

Date / Time types

Taxi has four built-in date-time types (Date,Time,DateTime and Instant), which can specify additional format rules.

Currently, Taxi does not enforce or use these symbols, it simply makes them available to parsers and tooling to consume. As such, it's possible that other parsers may define their own set of formats for use. This is possible, but discouraged.

Date/Time formats

Formats can be specified using a (@format='xxxx') syntax after the type declaration:

type DateOfBirth inherits Date(@format = 'dd/MMM/yy')
type Timestamp inherits Instant(@format = 'yyyy-MM-dd HH:mm:ss.SSSZ')
SymbolMeaningPresentationExamples
yYear of erayear2018; 18
MMonth of yearnumber/text07; 7; July; Jul
dDay of monthnumber10
EDay of weektextTuesday; Tue; T
aAm/PM of daytextPM
HHour in day (0-23)number0
hclock hour of am/pm (1-12)number12
mMinute of hournumber30
sSecond of minutenumber55
SMillisecond (fraction of second)number978
zTimezone name (Formal spec)textPacific Standard Time; PST; GMT-08:00
ZTimezone (RFC 822 time zone) (Formal spec)text+0000; -0800; -08:00;
XTimezone offset, accepts 'Z' as zero (Formal spec)textZ; -08; -0830; -08:30; -083015; -08:30:15;
xTimezone offsettext+0000; -08; -0830; -08:30; -083015; -08:30:15;

The actual parsing of dates is left to parser implementations. Our reference implementation (Vyne) uses the Java SimpleDateFormat rules for parsing, assuming a lenient parser.

Sample patterns

Date and Time PatternWill match
dd/MM/yy04/09/19
dd MMM yyyy04 Sep 2019
yyyy-MM-dd2019-09-04
dd-MM-yyyy h:mm a04-09-2019 1:45 AM
dd-MM-yyyy hh:mm a, zzzz04-09-2019 01:45 AM, Singapore Time
dd-MM-yyyy HH:mm:ss04-09-2019 01:45:48
yyyy-MM-dd HH:mm:ss.SSS2019-09-04 01:45:48.616
yyyy-MM-dd HH:mm:ss.SSSZ2019-09-04 01:45:48.616+0800
yyyy-MM-dd HH:mm:ss.SSSXXX2019-09-04 01:45:48.616Z ; 2019-09-04 01:45:48.616554Z ; 2019-09-04 01:45:48.616+01:00
EEEE, dd MMMM yyyy HH:mm:ss.SSSZ Wednesday, 04 September 2019 01:45:48.616+0800 

More timezone examples

Timezone patternWill match
XXX
Timezone offset with a :, also accepts Z for zero.

eg:yyyy-MM-dd'T'HH:mm:ss.SSSXXX
1979-03-02T08:41:04.551Z ;
1979-03-02T08:41:04.551+01:00 ;
1979-03-02T08:41:04.551-01:00
XXXX
Timezone offset without a :, also accepts Z for zero.

eg: yyyy-MM-dd'T'HH:mm:ss.SSSXXXX
1979-03-02T08:41:04.551Z ;
1979-03-02T08:41:04.551+0100 ;
1979-03-02T08:41:04.551-0100

Optional and lenient matching.

Sections of a pattern can be marked optional by using [].

The reference implementation follows lenient parsing rules, so values like S and SSS are equivalent. More details on Lenient are here.

For example:

Date and Time PatternWill match
yyyy-MM-dd'T'HH:mm:ss[.S]XXX2021-06-07T08:41:04.551555+01:00 ;
2021-06-07T08:41:04.551+01:00 ;
2021-06-07T08:41:04.551Z ;
2021-06-07T08:41:04Z
yyyy-MM-dd HH:mm:ss.SSSZ2019-09-04 01:45:48.616+0800
yyyy-MM-dd HH:mm:ss.SSSXXX2019-09-04 01:45:48.616Z ;
2019-09-04 01:45:48.616554Z ;
2019-09-04 01:45:48.616+01:00

Date/Time offsets

A model can also specify restrictions around the offset from UTC that a date must be represented/interpreted in:

model Transaction {
    timestamp : Instant( @offset = 60 )
}

This specifies that the value written to or read from the timestamp value must have an offset of UTC +60 minutes. This is most useful for contracts of consumers of data, specifying that the data presented must be in a certain timezone.

eg:

model Transaction {
    timestamp : Instant( @offset = 0 ) // Whenever data is written to this attribute, it must be in UTC.
}

Note - taxi itself doesn't enforce or mutate these times, that's left to parsers and tooling.

Nullability

By default, all fields defined in a Model are mandatory.

To make a field optional, relax this by adding the ? operator:

model Person {
    id : PersonId // Mandatory
    spouse : Person? // Optional - may be null.
}

Note that Taxi does not enforce nullabillity constraints - that's left to the systems which interpret the data.

Inheritence

Taxi provides inheritence across both Types and Models. It's recommended that inheritence is used heavily in Types to increase the specificity of a concept, and sparingly in Models.


type Name inherits String

type PersonName inherits Name
type FirstName innherits PersonName
type LastName inherits PersonName

type CompanyName inherits Name

As with most languages, inheritence is one-way -- ie., in the above example, all Names are Strings, but not all Strings are names.

Likewise, all FirstNames are Names, but not all Names are FirstNames.

This is incredibly useful in building out a taxonomy that allows publishers and consumers to be more or less sepcific about the types of data they are providing.

Inline inheritance

It is possible to define a type with inheritance inline within a model definition.

model Person {
   // This declares a new type called FirstName, which inherits PersonName
   firstName : FirstName inherits PersonName
   lastName : LastName inherits PersonName
}

// is exactly the same as writing:

type FirstName inherits PersonName
type LastName inherits PersonName

model Person {
   firstName : FirstName
   lastName : LastName
}
Inline inheritance is a great way to quickly start building out your taxonomy. However, it can make it harder to see where types are, and can lead to errors by accidentally declaring duplicate types (which will cause a compilation error).

Annotations

Annotations can be defined against any of the following:

  • Types
  • Fields
  • Enums
  • Services
  • Operations

Annotations may have parameters, or be left 'naked':

@ValidAnnotation
@AnotherAnnotation(stringParam = 'hello', boolValue = true, intValue = 123)
type Foo {}

Annotation parameters may be values of type string,int or boolean

Annotations do not have any direct impact on the types. However, they can be used to power tooling - such as generators, or to provide hints to consumers

Compiler support for annotations in Taxi has been improving recently. Originally, there was no checking of validity of annotations, other than simply syntax checking.

In the current version, there's partial support for type checking around annotations, beyond basic syntax checking. If an annotation has been defined as a type, then it's contract is enforced by the compiler (ie., attributes must match).

However, it's also possible to declare annotations that don't have an associated Annotation Type. This has been left for backward compatibility. If you are building out custom annotations, you're encouraged to define a corresponding Annotation type to go with it.

In a future release, all annotations will be required to have a corresponding annotation type.

See also: Annotation types

Annotation types

Annotation Types allow declaring new annotations. Annotations can have models, which have types associated with them.

Eg:

enum Quality {
    HIGH, MEDIUM, BAD
}
annotation DataQuality {
    quality : Quality
}

This defines a new annotation - @DataQuality, which has a single mandatory attribute -quality, read from an enum.

Here's how it might be used:

@DataQuality(quality = Quality.HIGH)
model MeterReading {}

When an annotation has an associated type, then it's contract is checked by the compiler.

For example, given the above sample, the following would be invalid:

@DataQuality(quality = Quality.Foo) // Invalid, Foo is not a member of Quality
model MeterReading {}

@DataQuality(qty = Quality.High) // Invalid, qty is not an attribute of the DataQuality annotation
model MeterReading {}

@DataQuality(quality = "High") // Invalid, as quality needs to be popualted with a value from the `Quality` enum.
model MeterReading {}

Type aliases

Type aliases provide a way of declaring two concepts within the taxonomy are exactly the same. This is useful for mapping concepts between two independent taxonomies.

For example:

type FirstName inherits String

type alias GivenName as FirstName

This states that anywhere FirstName is used, GivenName may also be used.

Unlike Inheritence, aliases are bi-directional. That is all FirstNames are GivenNames, and all GivenNames are FirstNames.

Inheritence vs Type Aliases - Which to use?

Type aliases were an early language feature in Taxi, and have since been replaced by inheritence, which has stricter rules and is more expressive.

Generally speaking, we encouarge the use of Inheritence where a relationship is one-way. Type aliases are really only useful when mapping between two different taxonomies, which both expose the same concept with different terminology.

Enums

Enum types are defined as follows:

enum BookClassification {
    @Annotation // Enums may have annotations
    FICTION,
    NON_FICTION
}

Enum synonyms

Synonyms may be declared between values in multiple sets of enums. This is useful when two different systems publish the same concept in slightly different formats.

namespace acme {
   enum Country {
      NEW_ZEALAND,
      AUSTRALIA,
      UNITED_KINGDOM
   }
}

namespace foo {

   enum Country {
      NZ synonym of acme.Country.NEW_ZEALAND,
      AUS synonym of acme.Country.AUSTRALIA,
      UK synonym of acme.Country.UNITED_KINGDOM
   }
}

Synonyms are bi-directional, so whenever foo.Country.NZ is used, acme.Country.NEW_ZEALAND could also be used.

Enum values

Enums may declare values:

enum Country {
      NEW_ZEALAND("NZ"),
      AUSTRALIA("AUS"),
      UNITED_KINGDOM("UK")
}

Parsers will match inputs against either the name or the value of the enum.

ie:

{
    "countryOfBirth" : "NZ", // Matches Country.NEW_ZEALAND
    "countryOfResidence" : "UNITED_KINGDOM" // Matches Country.UNITED_KINGDOM
}

Lenient enums

Lenient enums instruct parsers to be case insensitive when matching values.

Enums already support matching on either the Name of the enum or the value. Adding "lenient" to the start of the enum declaration means that these matches will be case insensitive.

eg - Without Lenient:

enum Country {
   NZ("New Zealand"),
   AUS("Australia")
}

An ingested value of either "NZ" or "New Zealand" would match the NZ enum. However, an ingested value of "nz" would be rejected

Adding lenient makes checking against enums case insensitive

lenient enum Country {
   NZ("New Zealand"),
   AUS("Australia")
}

An ingested value of "nz", "Nz", "new zealand", "New zealand" etc would match against NZ. A value of "UK" (not defined in the list) would be rejected.

Best practice reccomendation

If you must use lenient enums, restrict them to external data you're ingesting. Try not to make your internal Taxonomy lenient - as you want internal data to be as strict as possible.

You can use synonyms against a lenient external data to match against your stricter internal taxonomy. eg:

namespace my.vendor {
   lenient enum Country {
   NZ("New Zealand") synonym of acme.Country.NZ
   AUS("Australia") synonym of acme.Country.AUS
}

Default values on Enums

Enum values can be marked as default to instruct parsers to apply these values if nothing matches. You can specify at most one default in the list of enum values.

enum Country {
   NZ("New Zealand"),
   AUS("Australia"),
   default UNKNOWN("Unknown")
}

In this example, an ingested value of "NZ" would match NZ, and a value of "UK" would match UNKNOWN. No value would be rejected.

Mixing Default and Lenient

You can mix lenient and default

lenient enum Country {
   NZ("New Zealand"),
   AUS("Australia"),
   default UNKNOWN("Unknown")
}

In this scenario, because there's both a lenient keyword (making the enum case insensitive), and a default:

  • A value of Nz would match Country.NZ
  • A value of new zealand would match Country.NZ
  • A value of Uk would match Country.UNKNOWN
Edit on GitLab