Data types
This article introduces the data types supported by the evitaDB query language, covering both simple and complex types, and provides code examples that demonstrate their usage.
There are two categories of data types:
- simple data types that can be used both for attributes and associated data
- complex data types that can be used only for associated data
Simple data types
evitaDB data types are limited to the following list:
- String, formatted as "string"
- Byte, formatted as 5
- Short, formatted as 5
- Integer, formatted as 5
- Long, formatted as 5
- Boolean, formatted as true
- Character, formatted as 'c'
- BigDecimal, formatted as 1.124
- OffsetDateTime, formatted as 2021-01-01T00:00:00+01:00
- LocalDateTime, formatted as 2021-01-01T00:00:00
- LocalDate, formatted as 2021-01-01
- LocalTime, formatted as 00:00:00
- DateTimeRange, formatted as [2021-01-01T00:00:00+01:00,2022-01-01T00:00:00+01:00]
- BigDecimalNumberRange, formatted as [1.24,78]
- LongNumberRange, formatted as [5,9]
- IntegerNumberRange, formatted as [5,9]
- ShortNumberRange, formatted as [5,9]
- ByteNumberRange, formatted as [5,9]
- Locale, formatted as language tag 'cs-CZ'
- Currency, formatted as 'CZK'
- UUID, formatted as 2fbbfcf2-d4bb-4db9-9658-acf1d287cbe9
- Predecessor, formatted as 789
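The type names in this list match standard Java classes, so, assuming that mapping holds, most of the literal formats above can be parsed with plain JDK calls. The sketch below is illustrative only: the class name LiteralParsingSketch is hypothetical and nothing in it is evitaDB API.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.OffsetDateTime;
import java.util.Currency;
import java.util.Locale;
import java.util.UUID;

// Hypothetical demo class: it only shows that the literal formats listed above
// are accepted by the corresponding JDK parsers.
final class LiteralParsingSketch {
    public static void main(String[] args) {
        BigDecimal decimal = new BigDecimal("1.124");
        OffsetDateTime offsetDateTime = OffsetDateTime.parse("2021-01-01T00:00:00+01:00");
        LocalDateTime localDateTime = LocalDateTime.parse("2021-01-01T00:00:00");
        LocalDate localDate = LocalDate.parse("2021-01-01");
        LocalTime localTime = LocalTime.parse("00:00:00");
        Locale locale = Locale.forLanguageTag("cs-CZ");
        Currency currency = Currency.getInstance("CZK");
        UUID uuid = UUID.fromString("2fbbfcf2-d4bb-4db9-9658-acf1d287cbe9");

        System.out.println(decimal + " " + localDateTime + " " + localDate + " " + localTime);
        System.out.println(offsetDateTime.getOffset());   // the zone offset survives parsing
        System.out.println(locale.getLanguage() + "-" + locale.getCountry());
        System.out.println(currency.getCurrencyCode() + " " + uuid);
    }
}
```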
String
Dates and times
Why do we internally use OffsetDateTime for time information?
DateTimeRange
- when both boundaries are specified:
- when a left boundary (since) is specified:
- when a right boundary (until) is specified:
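The three boundary combinations listed above can be modelled with two optional bounds. The following is a minimal sketch, not the evitaDB class itself: the record name DateTimeRangeSketch, the use of null for a missing boundary, and the inclusive treatment of both boundaries are all assumptions made for illustration.

```java
import java.time.OffsetDateTime;

// Hypothetical stand-in for a date-time range. A null bound means
// "unbounded on that side" (an assumption of this sketch).
record DateTimeRangeSketch(OffsetDateTime since, OffsetDateTime until) {

    DateTimeRangeSketch {
        if (since == null && until == null) {
            throw new IllegalArgumentException("At least one boundary must be specified");
        }
        if (since != null && until != null && since.isAfter(until)) {
            throw new IllegalArgumentException("since must not be after until");
        }
    }

    // Both boundaries are treated as inclusive in this sketch.
    boolean contains(OffsetDateTime moment) {
        return (since == null || !moment.isBefore(since))
            && (until == null || !moment.isAfter(until));
    }

    public static void main(String[] args) {
        var since = OffsetDateTime.parse("2021-01-01T00:00:00+01:00");
        var until = OffsetDateTime.parse("2022-01-01T00:00:00+01:00");
        var moment = OffsetDateTime.parse("2021-06-15T12:00:00+01:00");
        System.out.println(new DateTimeRangeSketch(since, until).contains(moment)); // true
        System.out.println(new DateTimeRangeSketch(since, null).contains(moment));  // true
        System.out.println(new DateTimeRangeSketch(null, since).contains(moment));  // false
    }
}
```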
NumberRange
Both boundaries of the number range must be of the same type. You cannot, for example, combine a BigDecimal lower bound with a Byte upper bound.
- when both boundaries are specified:
- when a left boundary (since) is specified:
- when a right boundary (until) is specified:
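The same-type rule for number range boundaries can be expressed at compile time with a single type parameter. This is a hedged sketch under stated assumptions: NumberRangeSketch is a hypothetical name, a null bound stands for a missing boundary, and both boundaries are treated as inclusive.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for a number range. The single type parameter T
// forces both bounds (and any tested value) to share one numeric type.
record NumberRangeSketch<T extends Number & Comparable<T>>(T from, T to) {

    // A null bound means "unbounded on that side" (an assumption of this sketch).
    boolean contains(T value) {
        return (from == null || from.compareTo(value) <= 0)
            && (to == null || to.compareTo(value) >= 0);
    }

    public static void main(String[] args) {
        var integers = new NumberRangeSketch<>(5, 9);
        var decimals = new NumberRangeSketch<>(new BigDecimal("1.24"), new BigDecimal("78"));
        System.out.println(integers.contains(7));                     // true
        System.out.println(decimals.contains(new BigDecimal("100"))); // false
        // Mixing types is rejected by the compiler:
        // new NumberRangeSketch<BigDecimal>(new BigDecimal("1.24"), (byte) 78);
    }
}
```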
Predecessor
Motivation for linked lists in database sorting
A linked list is well suited to sorting entities in a database that holds large amounts of data. Inserting a new element into a linked list is a constant-time operation that requires only two updates:
- inserting a new element into the list, pointing to an existing element as its predecessor
- updating the element that originally pointed to that predecessor so that it points to the new element instead.
Moving (updating) an element or removing an existing element from a linked list is also a constant-time operation, requiring two similar updates. The disadvantages of a linked list are its poor random-access performance (getting the element at the n-th index) and slow list traversal, which requires many random accesses to different parts of memory. However, these disadvantages can be mitigated by keeping the linked list in the form of an array or binary tree of properly positioned primary keys.
- it doesn't require mass updates of surrounding entities or occasional "reshuffling"
- it doesn't force the client logic to be complicated (and it plays well with the UI drag'n'drop repositioning flow)
- it is very data efficient: it requires only a single integer (4B) per item in the list
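The constant-time insertion described above can be sketched with two hash maps. This is an illustrative in-memory model, not evitaDB's implementation: the class PredecessorChainSketch, the HEAD marker value, and the auxiliary successor map are all assumptions introduced for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of predecessor-based ordering; all names are illustrative.
final class PredecessorChainSketch {
    // Marker used in this sketch for "no predecessor", i.e. the head of the list.
    static final int HEAD = -1;

    private final Map<Integer, Integer> predecessorOf = new HashMap<>();
    private final Map<Integer, Integer> successorOf = new HashMap<>();

    // Inserts primaryKey right after the given predecessor in constant time.
    void insertAfter(int primaryKey, int predecessor) {
        Integer displaced = successorOf.get(predecessor);
        // Update 1: the new element points to its predecessor.
        predecessorOf.put(primaryKey, predecessor);
        successorOf.put(predecessor, primaryKey);
        // Update 2: the element that used to follow the predecessor
        // now points to the new element instead.
        if (displaced != null) {
            predecessorOf.put(displaced, primaryKey);
            successorOf.put(primaryKey, displaced);
        }
    }

    // Walks the chain from the head and returns the primary keys in list order.
    List<Integer> orderedPrimaryKeys() {
        List<Integer> result = new ArrayList<>();
        Integer current = successorOf.get(HEAD);
        while (current != null) {
            result.add(current);
            current = successorOf.get(current);
        }
        return result;
    }

    public static void main(String[] args) {
        PredecessorChainSketch chain = new PredecessorChainSketch();
        chain.insertAfter(1, HEAD);
        chain.insertAfter(2, 1);
        chain.insertAfter(3, 1); // lands between 1 and 2
        System.out.println(chain.orderedPrimaryKeys()); // [1, 3, 2]
    }
}
```

Note that the traversal in orderedPrimaryKeys is linear and hops between map entries, which is the random-access cost the text mentions.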
Maintaining consistency of the linked list
That's why we designed our linked list implementation to tolerate partial inconsistencies, and to converge to a consistent state as missing data is inserted. We support these inconsistency scenarios:
- multiple head elements
- multiple successor elements for a single predecessor
- circular dependencies, where a head element points to an element in its tail
Sorting by an inconsistent predecessor attribute orders the entities by chains, in the following order:
- the chains starting with a head element (from the chain with the most elements to the chain with the fewest)
- the chains with elements sharing the same predecessor (from the chain with the most elements to the chain with the fewest)
- the chains with circular dependencies (from the chain with the most elements to the chain with the fewest)
The inconsistent state is also allowed in the transactional phase, but we recommend avoiding it and updating all the elements involved (in any order) within a single transaction, which will ensure that the linked list remains consistent for all other transactions.
Complex data types
A complex type can contain properties of:
- any simple evitaDB types
- any other complex types (additional inner POJOs)
- generic Lists
- generic Sets
- generic Maps
- any array of simple evitaDB types or complex types
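As an illustration of the property kinds listed above, the following sketch combines them in one type. The record names and fields are hypothetical and chosen only for the example; the sketch makes no claim about how evitaDB serializes such a type.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical inner complex type (an additional inner POJO).
record WarehouseStock(String warehouseCode, Integer quantity) {}

// Hypothetical complex type combining every property kind from the list.
record ProductStockAvailability(
    String note,                            // simple type
    BigDecimal reservedRatio,               // simple type
    WarehouseStock mainWarehouse,           // nested complex type
    List<WarehouseStock> otherWarehouses,   // generic List
    Set<Locale> availableLocales,           // generic Set
    Map<String, Integer> quantityByChannel, // generic Map
    Integer[] recentDailySales              // array of a simple type
) {}
```

An instance could then be created with, for example, `new ProductStockAvailability("in stock", new BigDecimal("0.25"), new WarehouseStock("PRG", 10), List.of(), Set.of(Locale.ENGLISH), Map.of("web", 7), new Integer[] {1, 2, 3})`.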