The Modulus RTD Server is a high-performance time-series database engine that hedge funds, data providers, financial institutions, and exchanges around the world use to store and retrieve multi-petabyte volumes of market data.
Why use Modulus RTD Server over a relational database management system (RDBMS)?
Unlike RDBMSs, RTD Server was built for speed, not for multidimensional data analysis.
One of the key strengths of RTD Server is the ability to quickly locate and return segments of historic time-series data, which may be buried deep within petabytes of data. RTD Server performs this task faster than any database engine we've ever tested.
Although RDBMSs have their advantages in a variety of business applications, they fail miserably for time-series analysis applications. In-depth studies of the way in which Time Series Analysis (TSA) results are obtained reveal that a structured relational database is the worst way to store time-series data.
"Having an RDBMS doesn't mean instant decision-support nirvana. As enabling as RDBMSs have been for users, they were never intended to provide powerful functions for data synthesis, analysis, and consolidation (functions collectively known as multidimensional data analysis)."
SQL databases consist of a set of row/column-based "tables" (containers that store data) indexed by a "data dictionary." A table looks a lot like a spreadsheet, as it is composed of rows (records), and each row is composed of columns (fields). A collection of related tables is known as a relational database.
Using the very flexible SQL (structured query language), you can retrieve data from any table, or groups of related tables, and present that data as a "view."
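The table-and-view model described above can be shown in a few lines. The sketch below is purely illustrative (it is not RTD Server code): it uses Python's built-in SQLite engine, and the `trades` table and `msft_trades` view are hypothetical names chosen for the example.

```python
import sqlite3

# A minimal relational table of trades: each row is a record,
# each column (symbol, ts, price, size) is a field.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, ts INTEGER, price REAL, size INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [("MSFT", 1, 100.0, 200), ("MSFT", 2, 100.5, 100), ("AAPL", 1, 50.0, 300)],
)

# A "view" presents selected rows and columns as a virtual table.
conn.execute("CREATE VIEW msft_trades AS SELECT ts, price FROM trades WHERE symbol = 'MSFT'")
rows = conn.execute("SELECT ts, price FROM msft_trades ORDER BY ts").fetchall()
print(rows)  # [(1, 100.0), (2, 100.5)]
```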
This basic functionality, and the flexibility to store and relate almost anything, makes the relational database management system (RDBMS) so powerful and so widely used for nearly every serious business application.
Unfortunately, this one-size-fits-all approach to data storage and retrieval is exactly why an RDBMS is the wrong choice for time-series applications.
The relational database model produces substantial overhead through its inherent row and table record structures. When indices, clusters, and stored procedures are heaped on top, the overhead grows further and slows performance considerably.
Since all RDBMS records are equally "important" to the database, they are not optimized for speed.
Also, since an RDBMS has no inherent data-compression method, it is usually combined with exception-reporting and averaging techniques, which may result in data loss and inaccurately reproduced data.
Typically, an RDBMS writes to the drive quite slowly. Major RDBMS vendors often publish benchmarks claiming very high transactions per second (TPS). What they don't say is that the TPS figure refers to operations performed on data already in the database, not to the speed at which data is written to the database or retrieved from it. What goes on inside the database is of little interest to the end user; the data-acquisition speed, and the actual time it takes to put a set of results onto the screen, is what matters most.
An additional SQL drawback for time-series reporting is that the RDBMS does not automatically calculate statistics, because SQL mathematics is limited to sums, minimums, maximums, and averages.
Worse, a traditional RDBMS is generally limited to one-second time resolution. This is a problem when acquiring high bursts of data with sub-second or microsecond timestamps, such as in high-frequency trading, real-time telemetry, or other low-latency data acquisition.
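The one-second limit is easy to demonstrate with SQLite's date functions, which (like many SQL timestamp types) drop fractional seconds. The workaround shown after it, storing epoch microseconds as a 64-bit integer, is a common illustrative technique, not a description of RTD Server's storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite's datetime() truncates to whole seconds: the .123456 s is lost.
(truncated,) = conn.execute(
    "SELECT datetime(1700000000.123456, 'unixepoch')").fetchone()
print(truncated)  # '2023-11-14 22:13:20'

# Common workaround: store epoch microseconds as an integer so two ticks
# that arrive in the same second still sort in the right order.
conn.execute("CREATE TABLE ticks (ts_us INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO ticks VALUES (?, ?)",
                 [(1700000000123456, 101.0), (1700000000123457, 101.1)])
rows = conn.execute("SELECT ts_us FROM ticks ORDER BY ts_us").fetchall()
```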
The ideal solution is a storage and retrieval methodology that can access data almost instantaneously and then calculate statistics for a given time span "on the fly," without the overhead of an RDBMS. We've created exactly that solution for time-series data.
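As a rough illustration of the idea (this is not RTD Server's algorithm), keeping timestamps in sorted order lets a time span be located by binary search in logarithmic time, after which statistics are computed "on the fly" over just that slice. The timestamps and prices below are hypothetical sample data.

```python
import bisect
import statistics

timestamps = [1, 2, 5, 7, 9, 12, 15]                 # sorted epoch seconds
prices     = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5, 11.3]

def span_stats(t_start, t_end):
    """Locate [t_start, t_end] by binary search, then compute stats on the slice."""
    lo = bisect.bisect_left(timestamps, t_start)
    hi = bisect.bisect_right(timestamps, t_end)
    window = prices[lo:hi]
    return min(window), max(window), statistics.mean(window)

print(span_stats(5, 12))  # stats over ticks timestamped 5 through 12
```

The point of the sketch is that no full scan is needed: only the ticks inside the requested span are ever touched.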
RTD Server is a new data-retrieval methodology engineered by Modulus specifically for time-series data.
At the heart of RTD Server is a novel time-series search algorithm that uses triangulation. Written in C++, RTD Server is both multi-threaded and highly scalable. It is capable of reading petabytes of data to retrieve specific records or datasets thousands of times faster than the fastest RDBMS.
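The triangulation algorithm itself is not described here, so the sketch below is only a loose analogy: interpolation search, like triangulation, narrows a sorted timestamp range by estimating a position from the values themselves rather than always halving, which can beat plain binary search on evenly distributed timestamps.

```python
def interpolation_search(ts, target):
    """Find target in sorted list ts by estimating its position from the values.

    Loose analogy only -- not RTD Server's actual triangulation algorithm.
    """
    lo, hi = 0, len(ts) - 1
    while lo <= hi and ts[lo] <= target <= ts[hi]:
        if ts[hi] == ts[lo]:
            break
        # Estimate where target falls between ts[lo] and ts[hi].
        mid = lo + (target - ts[lo]) * (hi - lo) // (ts[hi] - ts[lo])
        if ts[mid] < target:
            lo = mid + 1
        elif ts[mid] > target:
            hi = mid - 1
        else:
            return mid
    return lo if lo < len(ts) and ts[lo] == target else -1

ts = list(range(0, 1000, 10))   # evenly spaced hypothetical timestamps
print(interpolation_search(ts, 730))   # found at index 73
```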
RTD Server can quickly locate and return segments of data, even when they're buried deep within historical time-series data, making it suitable for a variety of applications, including quantitative market research, defense and security, manufacturing, medical applications, and more.
RTD Server is available for Windows, Mac, and Linux operating systems, including Linux cluster distributions such as Rocks for high-performance computing.