Aim and purpose of the study
This document summarizes the tender documentation and the hypothesis for research grant No. CZ.01.1.02/0.0/0.0/19_262/0020308, and points to further information related to this research.
Hypothesis
A single-purpose NoSQL database optimized for fast reads and tailored to the requirements of e-shop solutions, both in the Czech Republic and abroad, will respond an order of magnitude faster than generic SQL or NoSQL database solutions on the same hardware configuration (i.e. with 10 times lower latency between request and response).
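The acceptance criterion in the hypothesis can be written down as a simple ratio of latencies. The following is a minimal sketch only: the function names, the use of the median as the summary statistic, and the sample numbers are assumptions made for illustration, not part of the tender documentation and not measured results.

```python
import statistics
from typing import Sequence


def latency_speedup(baseline_ms: Sequence[float], candidate_ms: Sequence[float]) -> float:
    """Return how many times lower the candidate's median latency is.

    Using the median is an assumption; the source does not say which
    statistic should be compared.
    """
    candidate_median = statistics.median(candidate_ms)
    if candidate_median <= 0:
        raise ValueError("candidate latencies must be positive")
    return statistics.median(baseline_ms) / candidate_median


def hypothesis_holds(
    baseline_ms: Sequence[float],
    candidate_ms: Sequence[float],
    required_speedup: float = 10.0,  # "order of magnitude" from the hypothesis
) -> bool:
    """True when the candidate is at least `required_speedup` times faster."""
    return latency_speedup(baseline_ms, candidate_ms) >= required_speedup


# Made-up latencies in milliseconds, for illustration only.
generic_db = [120.0, 100.0, 110.0]   # median 110 ms
dedicated_db = [10.0, 11.0, 9.0]     # median 10 ms
print(latency_speedup(generic_db, dedicated_db))
print(hypothesis_holds(generic_db, dedicated_db))
```

Both datasets would have to be collected on the same hardware configuration and for the same queries for the comparison to be meaningful.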
Search engine implementation
Regardless of whether the hypothesis is confirmed or refuted, the actual goal of the project is to build a search engine on top of the chosen technology that will make future implementations of e-commerce platforms easier. This includes:
- implementing a dedicated, stand-alone NoSQL database with its own HTTP API, mature enough to be usable for typical e-commerce catalogs
- documenting the code and preparing technical documentation to enable adoption by third parties
- choosing an open license and monetization options to enable further development and maintenance of the engine
Prerequisites
The following commonly available hardware and operating system are assumed:
- 4x CPU at 2.7 GHz
- 16 GB RAM
- 80 GB SSD drive
- Ubuntu 20.04 Server
Based on our experience and market research, we claim that the basic e-commerce catalog requirements are very similar, and it is therefore possible to define a common API that will provide the general functionality for the majority of current e-commerce websites.
In addition to the functional requirements, the basic non-functional requirement is the performance of the search engine, which can be measured in two ways:
- the latency of a single request
- the throughput of requests processed per second
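Both metrics can be collected with a small timing harness. The sketch below is an assumption about how such a measurement might look, not the project's benchmark: the `measure` function and its result keys are hypothetical, and `request` stands for any callable that performs one query against the engine under test.

```python
import statistics
import time
from typing import Callable, Dict


def measure(request: Callable[[], object], iterations: int = 1000) -> Dict[str, float]:
    """Call `request` sequentially and report latency and throughput."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    latencies = []
    started = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        request()  # one request/response round trip
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    return {
        "median_latency_s": statistics.median(latencies),  # latency of a single request
        "max_latency_s": max(latencies),
        "throughput_rps": iterations / elapsed,  # requests processed per second
    }


if __name__ == "__main__":
    # Placeholder workload; replace with a real query against the engine.
    print(measure(lambda: sum(range(10_000)), iterations=200))
```

This harness issues requests one at a time, so the reported throughput is that of a single client. Measuring the throughput of the whole system would require several concurrent clients.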
Anticipated benefit
Small e-commerce sites can be served by a relatively trivial implementation or an existing platform. However, the e-commerce market is already consolidating, and a number of medium-sized e-shops are emerging whose dataset sizes and throughput requirements run up against the limitations of these solutions.
Today, all e-commerce producers create their own application data structures on top of general-purpose databases and spend huge amounts of resources building their own APIs and engines to fulfill client requirements. They are very often forced to compromise on e-commerce catalog functionality because rich features lead to performance degradation.
A common, purpose-built search engine:
- might save implementers a significant amount of work and open up the market for other startups
- might allow implementers to devote their energy to add-on functionality and thus enrich the market
- might reduce electricity and operational costs: e-commerce websites run in 24x7 mode, so a more efficient search engine would save a considerable amount of hardware resources and money every year (see studies on CPU consumption under different loads)