PostgreSQL is the all-rounder database; it’s the most widely used after MySQL.
If you don’t know what a database is, you can find out here.
What does this database offer?
- Store data in a structured format
- Store data in an unstructured format
- Scale vertically
- Scale horizontally (using plugins)
- Full table partitioning (including indexes)
- It won’t let you down (30 years of reliability)
This database is a standard for application development, thanks to its long history and high compatibility with many libraries across all existing programming languages. It has been in use for over 30 years.
Its language is SQL; this is how a table would be created:
CREATE TABLE users ( id SERIAL PRIMARY KEY, name TEXT, extra_data JSONB );
Why is this database called “the all-rounder”?
Because it allows you to create your own data types for cases where your data structure is very complex.
For example, if we want to create a data type that contains a model’s metrics, we could define a type like this:
CREATE TYPE model_metrics AS ( precision_val NUMERIC(5,4), -- Example: 0,9500 recall_val NUMERIC(5,4), f1_score NUMERIC(5,4) );
And then use it when creating a table:
CREATE TABLE experiment_log ( id SERIAL PRIMARY KEY, model_name TEXT, training_status training_state DEFAULT 'started', -- Using an ENUM results model_metrics -- Using the COMPOSITE type );
To insert data, it would look like this:
INSERT INTO experiment_log (model_name, training_status, results) VALUES ('Neural Network V1', 'finished', ROW(0,8, 0,7, 0,75)), ('Decision Tree', 'finished', (0,9, 0,85, 0,88)), ('Linear Regression', 'failed', NULL);
And finally, to query the data, you can do it like this:
SELECT model_name, (results).precision_val FROM experiment_log WHERE (results).f1_score > 0,8;
This feature allows us to store an $n$ number of variables without having to expand the database table structure excessively.
If you need flexibility, this database also provides it through JSON or JSONB fields.
This type of field allows you to store a JSON directly in the database and even perform searches based on its content (in which case you should use JSONB). This is extremely useful when an application is growing, as it doesn’t force us to constantly modify the table schema.
What is the difference between JSON and JSONB, you ask?
The difference is that one is stored as plain text on the disk (JSON) while the other is stored in binary format (JSONB). Each has its own advantages and disadvantages.
JSON Advantages:
- It is great when you just want to store a JSON, as you don’t need to serialize or deserialize it.
JSONB Advantages:
- More efficient if you want to query fields inside the JSON.
- Takes up less space on disk.
- Allows you to create indexes on the JSON fields for even faster performance.
Examples coming soon.
Bibliography