evitaDB - Fast e-commerce database

Locale filtering

Numerous e-commerce applications function in various regions and rely on localized data. While product labels and descriptions are clear examples, there are also several numeric values that must be specific to each locale due to the distinction between the metric system and imperial units. That's why evitaDB offers first-class support for localization in its data structures and query language.

Entity locale equals

argument:string!
a mandatory specification of the locale to which all localized attributes targeted by the query must conform; examples of valid language tags are: en-US or en-GB, cs or cs-CZ, de, de-AT or de-CH, fr or fr-CA, etc.
If you are working with evitaDB in Java, you can use a Locale (java.util.Locale) instead of the language tag. This is the natural way to work with locale-specific data on the platform.

The language tag, also known as the locale or language identifier, is a standardized format used to represent a specific language or locale in computer systems and software. It provides a way to identify and differentiate languages, dialects, and regional variations.

The most commonly used format for language tags is the BCP 47 (IETF Best Current Practice 47) standard. BCP 47 defines a syntax and set of rules for constructing language tags.
A language tag is typically constructed using a combination of subtags that represent various components. Here's an example breakdown of a language tag: en-US.
  1. Primary Language Subtag: In the example above, en represents the primary language subtag, which indicates English as the primary language.
  2. Region Subtag: The region subtag is optional and represents a specific region or country associated with the language. In the example, US represents the United States.

Language tags can also include additional subtags to specify variations such as script, variant, and extensions, allowing for more granular language identification.
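
On the Java platform, BCP 47 language tags map onto java.util.Locale. The following minimal sketch uses only the standard library (no evitaDB classes) to show how the subtags described above can be read from a tag; the class name LanguageTagDemo and its parse helper are illustrative only.

```java
import java.util.Locale;

class LanguageTagDemo {

    /** Converts a BCP 47 language tag such as "en-US" into a java.util.Locale. */
    static Locale parse(String languageTag) {
        return Locale.forLanguageTag(languageTag);
    }

    public static void main(String[] args) {
        Locale enUs = parse("en-US");
        System.out.println(enUs.getLanguage()); // primary language subtag: en
        System.out.println(enUs.getCountry());  // region subtag: US

        // The region subtag is optional, so "cs" alone is a valid tag.
        Locale cs = parse("cs");
        System.out.println(cs.getCountry().isEmpty()); // true

        // Additional subtags are supported as well, here a script subtag.
        Locale srLatn = parse("sr-Latn-RS");
        System.out.println(srLatn.getScript()); // Latn

        // A Locale can be converted back to its language tag.
        System.out.println(Locale.CANADA_FRENCH.toLanguageTag()); // fr-CA
    }
}
```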

If any filter constraint of the query targets a localized attribute, entityLocaleEquals must also be provided; otherwise the query interpreter returns an error. Localized attributes must be identified by both their name and language tag in order to be used.
Only a single occurrence of entityLocaleEquals is allowed in the filter part of the query. Currently, there is no way to switch context between different parts of the filter and build queries such as find a product whose name in en-US is "screwdriver" or in cs is "šroubovák".
Also, it's not possible to omit the language specification for a localized attribute and ask questions like: find a product whose name in any language is "screwdriver".

While it's technically possible to implement support for these tasks in evitaDB, they represent edge cases, and there were more important scenarios to handle.
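
The two rules above (a localized attribute requires entityLocaleEquals, and entityLocaleEquals may appear only once) can be pictured with a small self-contained sketch. This is not evitaDB code: the Constraint record, the validate method, and the error messages are hypothetical stand-ins chosen for illustration, and the real query interpreter may check these rules differently.

```java
import java.util.List;
import java.util.Optional;

class LocaleRuleSketch {

    /** Hypothetical stand-in for one filter constraint; not an evitaDB type. */
    record Constraint(String name, boolean targetsLocalizedAttribute) {}

    /** Returns an empty Optional for an acceptable filter, otherwise the violated rule. */
    static Optional<String> validate(List<Constraint> filter) {
        long localeConstraints = filter.stream()
            .filter(c -> c.name().equals("entityLocaleEquals"))
            .count();
        boolean usesLocalizedAttribute = filter.stream()
            .anyMatch(Constraint::targetsLocalizedAttribute);

        if (localeConstraints > 1) {
            return Optional.of("only a single entityLocaleEquals is allowed");
        }
        if (usesLocalizedAttribute && localeConstraints == 0) {
            return Optional.of("a localized attribute requires entityLocaleEquals");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Constraint nameEquals = new Constraint("attributeEquals", true);
        Constraint locale = new Constraint("entityLocaleEquals", false);

        System.out.println(validate(List.of(nameEquals)).isPresent());                 // true
        System.out.println(validate(List.of(nameEquals, locale)).isPresent());         // false
        System.out.println(validate(List.of(nameEquals, locale, locale)).isPresent()); // true
    }
}
```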

To test the locale-specific query, we need to focus on the Vouchers for shareholders category in our demo dataset. We know that there are products that have only English (en-US) localization. To select the products with English localization, we can issue this query:

... and we will get a list containing a number of them.

List of all products with English localization in category
You will notice that the output contains two columns: code and name. The code is not a localized attribute, while the name is. The names listed in the response reflect the English locale that is part of the filter constraint.
If you use entityLocaleEquals in your filter, all returned localized data (both attributes and associated data) will respect the filtered locale. If you need data for locales other than the one used in the filter constraint, you can use the require constraint data-in-locale.

But when we request products in the Czech locale:

... the query returns none of them, even though we know there are products in this category.
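
The behavior described in this example can be modeled in plain Java. The sketch below is an in-memory illustration only, not the evitaDB API: the Product record, the namesIn method, and the sample voucher data are made up for the example. It assumes that filtering by a locale keeps only entities that have that localization and returns their localized values in that locale.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class LocaleFilterSketch {

    /** Hypothetical product: a non-localized code plus names keyed by locale. */
    record Product(String code, Map<Locale, String> names) {}

    /** Keeps products localized in the given locale and renders code and localized name. */
    static List<String> namesIn(List<Product> products, Locale locale) {
        return products.stream()
            .filter(p -> p.names().containsKey(locale))
            .map(p -> p.code() + ": " + p.names().get(locale))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Locale en = Locale.forLanguageTag("en");
        Locale cs = Locale.forLanguageTag("cs");

        // Made-up sample data: both products have an English name only.
        List<Product> vouchers = List.of(
            new Product("voucher-a", Map.of(en, "Voucher A")),
            new Product("voucher-b", Map.of(en, "Voucher B"))
        );

        System.out.println(namesIn(vouchers, en)); // [voucher-a: Voucher A, voucher-b: Voucher B]
        System.out.println(namesIn(vouchers, cs)); // []
    }
}
```

In this model the Czech request yields an empty list for the same reason as in the text: the products exist, but none of them carries a Czech localization.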

Author: Ing. Jan Novotný

Date updated: 27.5.2023
