Introduction
A standalone instance MariaDB ColumnStore 1.2.3 is installed and configured on Ubuntu 18 in this article.
- The installation is performed with a non root user (user
mcs
). MariaDB ColumnStore will run with the usermcs
. - The storage is internal.
- The MySQL/MariaDB front end database is installed in the directory
/sqlpac/mcs/mysql
. - One DBRoot ColumnStore with one disk is created (
DBRoot1
) and installed in the directory/sqlpac/mcs/data1
.
Pre-requisites
Preparing the user mcs
MariaDB ColumnStore will be installed and will run with the user mcs
(id: 10010, $HOME : /opt/mcs)
.
For a non root installation :
- The home directory (
$HOME
) must be the root installation directory of MariaDB ColumnStore. - The group name must be the same than the user name (
mcs
). - When planning multiple instances installation, the user id and group id must be the same on all hosts.
root@vps$ useradd -d /opt/mcs -m -u 10010 -s /bin/bash mcs
System requirements
Increase the number of opened files in the system configuration file /etc/security/limits.conf
for the user mcs
.
/etc/security/limits.conf
mcs hard nofile 65536
mcs soft nofile 65536
Packages dependencies installation
The following packages are mandatory :
- libboost
- libjemalloc (memory allocator)
- libsnappy (compression library)
If not installed, as root
user :
root@vps$ apt-get install libboost-all-dev
root@vps$ apt-get install libjemalloc1
root@vps$ apt-get install libsnappy1v5
MariaDB ColumnStore download and installation
Download MariaDB ColumnStore 1.2.3 from MariaDB website : mariadb-columnstore-1.2.3-1-bionic.amd64.bin.tar.gz.
As user mcs
, unzip and extract the archive in the directory $HOME
.
mcs@vps$ cd $HOME
mcs@vps$ gunzip mariadb-columnstore-1.2.3-1-bionic.amd64.bin.tar.gz
mcs@vps$ tar xvf mariadb-columnstore-1.2.3-1-bionic.amd64.bin.tar
Do not change the directory structure during the extraction, the installation directory must absolutely be $HOME/mariadb/columnstore
, otherwise setup fails.
Configuration and setup
Modifying databases locations
Before running any post installation scripts and building the system, in this article the MySQL Front End database and the DBRoot ColumnStore are not installed in their
default locations, respectively $HOME/mariadb/columnstore/mysql/db
and $HOME/mariadb/columnstore/data1
.
In such case, soft links must be created :
mcs@vps$ ln -s /sqlpac/mcs/mysql /opt/mcs/mariadb/columnstore/mysql/db
mcs@vps$ ln -s /sqlpac/mcs/data1 /opt/mcs/mariadb/columnstore/data1
Check the user mcs
is the owner of the new locations.
post-install
As user mcs
, run the script post-install
.
Be sure the script post-install
is executed when the current directory is $HOME
.
mcs@vps$ ./mariadb/columnstore/bin/post-install --installdir=$HOME/mariadb/columnstore
Various configurations are prepared by this script (links to libraries, directories…).
The output gives the next steps to be performed depending on the context.
NOTE: For non-root install, you will need to run the following commands as root user to
setup the MariaDB ColumnStore System Logging
export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/syslogSetup.sh --installdir=/opt/mcs/mariadb/columnstore --user=mcs install
The next steps are:
If installing on a pm1 node using non-distributed install
export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/postConfigure -i /opt/mcs/mariadb/columnstore
If installing on a pm1 node using distributed install
export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/postConfigure -i /opt/mcs/mariadb/columnstore -d
If installing on a non-pm1 using the non-distributed option:
export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
/opt/mcs/mariadb/columnstore/bin/columnstore start
syslogSetup.sh
Only one script must be executed as root
user : syslogSetup.sh
. This script configures logging functionalities in the
directory /var/log/mariadb/columnstore
. If not performed, log files won’t be available for diagnose purposes
(crit.log, warn.log, info.log, debug.log, warn.log…
).
root@vps$ export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
root@vps$ export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
root@vps$ /opt/mcs/mariadb/columnstore/bin/syslogSetup.sh --installdir=/opt/mcs/mariadb/columnstore --user=mcs install
System logging being used: rsyslog
postConfigure
The columnstore system is then built with the script postConfigure
as user mcs
.
A performance module pm1 using a non distributed install is generated.
mcs@vps$ export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
mcs@vps$ export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
mcs@vps$ /opt/mcs/mariadb/columnstore/bin/postConfigure -i /opt/mcs/mariadb/columnstore
A wizard starts :
- System server type :
single
- System name :
mcs1
- Type of data storage :
internal
- DBRoot IDs assigned to module pm1 : 1
This is the MariaDB ColumnStore System Configuration and Installation tool.
It will Configure the MariaDB ColumnStore System and will perform a Package
Installation of all of the Servers within the System that is being configured.
IMPORTANT: This tool requires to run on the Performance Module #1
Prompting instructions:
Press 'enter' to accept a value in (), if available or
Enter one of the options within [], if available, or
Enter a new value
===== Setup System Server Type Configuration =====
There are 2 options when configuring the System Server Type: single and multi
'single' - Single-Server install is used when there will only be 1 server configured
on the system. It can also be used for production systems, if the plan is
to stay single-server.
'multi' - Multi-Server install is used when you want to configure multiple servers now or
in the future. With Multi-Server install, you can still configure just 1 server
now and add on addition servers/modules in the future.
Select the type of System Server install [1=single, 2=multi] (2) > 1
Performing the Single Server Install.
Enter System Name (columnstore-1) > mcs1
===== Setup Storage Configuration =====
----- Setup Performance Module DBRoot Data Storage Mount Configuration -----
There are 2 options when configuring the storage: internal or external
'internal' - This is specified when a local disk is used for the DBRoot storage.
High Availability Server Failover is not Supported in this mode
'external' - This is specified when the DBRoot directories are mounted.
High Availability Server Failover is Supported in this mode.
Select the type of Data Storage [1=internal, 2=external] (1) > 1
Enter the list (Nx,Ny,Nz) or range (Nx-Nz) of DBRoot IDs assigned to module 'pm1' (1) >1
When the installation is successfull :
===== Performing Configuration Setup and MariaDB ColumnStore Startup =====
NOTE: Setting 'NumBlocksPct' to 50%
Setting 'TotalUmMemory' to 25% of total memory.
Running the MariaDB ColumnStore setup scripts
post-mysqld-install Successfully Completed
post-mysql-install Successfully Completed
Starting MariaDB Columnstore Database Platform
Starting MariaDB ColumnStore Database Platform Starting, please wait ..... DONE
System Catalog Successfull Created
MariaDB ColumnStore Install Successfully Completed, System is Active
Enter the following command to define MariaDB ColumnStore Alias Commands
. /opt/mcs/.profile
Enter 'mcsmysql' to access the MariaDB ColumnStore SQL console
Enter 'mcsadmin' to access the MariaDB ColumnStore Admin console
NOTE: The MariaDB ColumnStore Alias Commands are in /etc/profile.d/columnstoreAlias.sh
Post installation environment variables
After the installation, the file .profile
for the user mcs
is updated. Environment variables
and useful aliases are defined.
$HOME/.profile
# MariaDB Columnstore Non-Root Environment Variables
export COLUMNSTORE_INSTALL_DIR=/opt/mcs/mariadb/columnstore
export LD_LIBRARY_PATH=:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib:/opt/mcs/mariadb/columnstore/lib:/opt/mcs/mariadb/columnstore/mysql/lib
# MariaDB Columnstore Non-Root Alias Variables
# MariaDB Columnstore Alias Commands
#
alias mcsmysql='/opt/mcs/mariadb/columnstore/mysql/bin/mysql --defaults-extra-file=/opt/mcs/mariadb/columnstore/mysql/my.cnf -u root'
alias ma=/opt/mcs/mariadb/columnstore/bin/mcsadmin
alias mcsadmin=/opt/mcs/mariadb/columnstore/bin/mcsadmin
alias cpimport=/opt/mcs/mariadb/columnstore/bin/cpimport
alias mcshome='cd /opt/mcs/mariadb/columnstore'
alias log='cd /var/log/mariadb/columnstore/'
alias core='cd /var/log/mariadb/columnstore/corefiles'
alias tmsg='tail -f /var/log/messages'
alias tdebug='tail -f /var/log/mariadb/columnstore/debug.log'
alias tinfo='tail -f /var/log/mariadb/columnstore/info.log'
alias terror='tail -f /var/log/mariadb/columnstore/err.log'
alias twarning='tail -f /var/log/mariadb/columnstore/warning.log'
alias tcrit='tail -f /var/log/mariadb/columnstore/crit.log'
alias dbrm='cd /opt/mcs/mariadb/columnstore/data1/systemFiles/dbrm'
alias mcsmodule='cat /opt/mcs/mariadb/columnstore/local/module'
Mariadb ColumnStore processes architecture
11 processes run in a MariaDB ColumnStore performance module, ProcMon (ProcessMonitor) is the parent process of all the other processes except
mysql
:
mcs@vps$ ps -fauxwww | grep 'mcs'
\_ /opt/mcs/mariadb/columnstore/bin/ProcMon \_ [ProcMgr] \_ /opt/mcs/mariadb/columnstore/bin/controllernode fg \_ /opt/mcs/mariadb/columnstore/bin/ServerMonitor \_ /opt/mcs/mariadb/columnstore/bin/workernode DBRM_Worker1 fg \_ [PrimProc] \_ [ExeMgr] \_ [WriteEngineServ] \_ [DDLProc] \_ [DMLProc] \_ /opt/mcs/mariadb/columnstore/mysql//bin/mysqld --basedir=/opt/mcs/mariadb/columnstore/mysql/ \ --datadir=/opt/mcs/mariadb/columnstore/mysql/db \ --plugin-dir=/opt/mcs/mariadb/columnstore/mysql/lib/plugin \ --log-error=/opt/mcs/mariadb/columnstore/mysql/db/vps.err \ --pid-file=/opt/mcs/mariadb/columnstore/mysql/db/vps.pid \ --socket=/opt/mcs/mariadb/columnstore/mysql/lib/mysql/mysql.sock \ --port=3306
Processes listen on various ports, ports defined in the configuration file $HOME/mariadb/columnstore/etc/Columnstore.xml
:
mcs@vps$ lsof -i -P -n | grep LISTEN | grep 'mcs'
mysqld 16684 mcs 28u IPv6 182203 0t0 TCP *:3306 (LISTEN) controlle 16789 mcs 5u IPv4 183378 0t0 TCP *:8616 (LISTEN) ServerMon 16810 mcs 3u IPv4 183939 0t0 TCP *:8622 (LISTEN) workernod 16836 mcs 9u IPv4 183420 0t0 TCP *:8700 (LISTEN) PrimProc 16895 mcs 10u IPv4 183479 0t0 TCP *:8620 (LISTEN) ExeMgr 16979 mcs 11u IPv4 183577 0t0 TCP *:8601 (LISTEN) WriteEngi 17039 mcs 9u IPv4 183653 0t0 TCP *:8630 (LISTEN) DDLProc 17079 mcs 10u IPv4 183747 0t0 TCP *:8612 (LISTEN) DMLProc 17138 mcs 10u IPv4 183871 0t0 TCP *:8614 (LISTEN) ProcMon 24558 mcs 4u IPv4 150911 0t0 TCP *:8800 (LISTEN) ProcMon 24558 mcs 11u IPv4 150920 0t0 TCP *:8604 (LISTEN) ProcMgr 24680 mcs 9u IPv4 151557 0t0 TCP *:8603 (LISTEN) ProcMgr 24680 mcs 10u IPv4 151555 0t0 TCP *:8606 (LISTEN)
A schema is more explicit :
To manage Mariadb ColumnStore, use the admin console mcsadmin
.
List processes - getProcessStatus
mcs@vps$ mcsadmin getProcessStatus
getprocessstatus Thu Dec 12 13:50:04 2019 MariaDB ColumnStore Process statuses Process Module Status Last Status Change Process ID ------------------ ------ --------------- ------------------------ ---------- ProcessMonitor pm1 ACTIVE Fri Nov 15 13:37:48 2019 22045 ProcessManager pm1 ACTIVE Fri Nov 15 13:37:54 2019 22171 DBRMControllerNode pm1 ACTIVE Fri Nov 15 13:38:00 2019 22581 ServerMonitor pm1 ACTIVE Fri Nov 15 13:38:02 2019 22604 DBRMWorkerNode pm1 ACTIVE Fri Nov 15 13:38:03 2019 22656 PrimProc pm1 ACTIVE Fri Nov 15 13:38:06 2019 22694 ExeMgr pm1 ACTIVE Fri Nov 15 13:38:10 2019 22771 WriteEngineServer pm1 ACTIVE Fri Nov 15 13:38:14 2019 22853 DDLProc pm1 ACTIVE Fri Nov 15 13:38:19 2019 22916 DMLProc pm1 ACTIVE Fri Nov 15 13:38:23 2019 22981 mysqld pm1 ACTIVE Fri Nov 15 13:38:00 2019 22470
Starting / Stopping the ColumnStore system
Use mcsadmin
to start and stop the database columnstore system :
mcs@vps$ mcsadmin startSystem
startsystem Thu Dec 12 14:32:13 2019 startSystem command, 'columnstore' service is down, sending command to start the 'columnstore' service on all modules System being started, please wait........... Successful start of System
mcs@vps$ mcsadmin shutdownSystem y
shutdownsystem Thu Dec 12 14:38:02 2019 This command stops the processing of applications on all Modules within the MariaDB ColumnStore System Checking for active transactions Stopping System... Successful stop of System Shutting Down System... Successful shutdown of System
The option shutdownSystem
stops the whole system. When using the option stopSystem
, all processes
are stopped except ProcessMonitor and ProcessManager.
Configuration, resource usage
mcsadmin
displays useful informations about configuration (DBRoot…), resources usage (CPU, Memory…) :
mcs@vps$ mcsadmin getStorageConfig
getstorageconfig Thu Dec 12 15:17:49 2019 System Storage Configuration Performance Module (DBRoot) Storage Type = internal System Assigned DBRoot Count = 1 DBRoot IDs assigned to 'pm1' = 1
mcs@vps$ mcsadmin getResourceUsage
getsystemresourceusage Thu Dec 12 15:18:23 2019 System Resource Usage per Module Module 'pm1' Resource Usage CPU: 3% Usage Mem: 7973380k total, 657728k used, 584824k cache, 8% Usage Swap: 0 k total, 0k used, 0% Usage Top CPU Process Users: f2b/server 6% Top Memory Process Users: mysqld 2%, systemd-journald 1%, mono 1%, DMLProc 1%, workernode 1% Disk Usage: / 78%
mcs@vps$ mcsadmin getModuleMemory pm1
getmodulememory Thu Dec 12 15:19:32 2019 Module Memory Usage (in K bytes) Module Mem Total Mem Used cache Mem Usage % Swap Total Swap Used Swap Usage % ------ --------- ------- ------- ---------- ---------- --------- ----------- pm1 7973380 658000 584840 8 0 0 0
For the full list of the available options : use mcsadmin help
.
The MySQL/MariaDB front end database
mcsmysql
Locally, use the alias mcsmysql
to connect to the database. MariaDB 10.3.13 is embedded with MariaDB ColumnStore 1.2.3.
mcs@vps$ mcsmysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 11
Server version: 10.3.13-MariaDB-log Columnstore 1.2.3-1
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
Storage engines, plugins, variables
The ColumnStore storage engine is installed and InnoDB is the default storage engine :
MariaDB [(none)]> show storage engines;
+--------------------+---------+----------------------------------------+--------------+------+------------+ | Engine | Support | Comment | Transactions | XA | Savepoints | +--------------------+---------+----------------------------------------+--------------+------+------------+ | Columnstore | YES | Columnstore storage engine | YES | NO | NO | | ... | | | | | | | MyISAM | YES | Non-transactional engine with good | NO | NO | NO | | | | performance and small data footprint | | | | | ... | | | | | | | InnoDB | DEFAULT | Supports transactions, | YES | YES | YES | | | | row-level locking, foreign keys | | | | | | | and encryption for tables | | | | | ... | | | | | | | Aria | YES | Crash-safe tables with MyISAM | NO | NO | NO | | | | heritage | | | | | ... | | | | | | +--------------------+---------+----------------------------------------+--------------+------+------------+
Many plugins related to the ColumnStore storage engine are enabled :
MariaDB [(none)]> show plugins;
+-------------------------------+----------+--------------------+---------------------------+---------+ | Name | Status | Type | Library | License | +-------------------------------+----------+--------------------+---------------------------+---------+ | Columnstore | ACTIVE | STORAGE ENGINE | libcalmysql.so | GPL | | InfiniDB | ACTIVE | STORAGE ENGINE | libcalmysql.so | GPL | | COLUMNSTORE_TABLES | ACTIVE | INFORMATION SCHEMA | is_columnstore_tables.so | GPL | | COLUMNSTORE_COLUMNS | ACTIVE | INFORMATION SCHEMA | is_columnstore_columns.so | GPL | | COLUMNSTORE_EXTENTS | ACTIVE | INFORMATION SCHEMA | is_columnstore_extents.so | GPL | | COLUMNSTORE_FILES | ACTIVE | INFORMATION SCHEMA | is_columnstore_files.so | GPL |
Historically, the ColumnStore storage engine was first developed by the Calpont company, company that does not exist anymore, and
was called InfiniDB. MariaDB ColumnStore is a fork of Calpont/InfiniDB. That’s why libraires are called for example libcalmysql.so
, cal
for Calpont.
Variables related to the ColumnStore storage engine are named infinidb%
:
MariaDB [(none)]> show variables like 'infinidb%';
+---------------------------------------------+-------+ | Variable_name | Value | +---------------------------------------------+-------+ | infinidb_compression_type | 2 | | infinidb_decimal_scale | 8 | | infinidb_diskjoin_bucketsize | 100 | | infinidb_diskjoin_largesidelimit | 0 | | infinidb_diskjoin_smallsidelimit | 0 | | infinidb_double_for_decimal_math | OFF | | infinidb_import_for_batchinsert_delimiter | 7 | | infinidb_import_for_batchinsert_enclosed_by | 17 | | infinidb_local_query | 0 | | infinidb_ordered_only | OFF | | infinidb_string_scan_threshold | 10 | | infinidb_stringtable_threshold | 20 | | infinidb_um_mem_limit | 0 | | infinidb_use_decimal_scale | OFF | | infinidb_use_import_for_batchinsert | ON | | infinidb_varbin_always_hex | OFF | | infinidb_vtable_mode | 1 | +---------------------------------------------+-------+
By default, the compression is set to 2 (infinidb_compression_type : 2
), compression is enabled.
Client configuration
ColumnStore uses a dedicated database called infinidb_vtable
for creation of all temporary tables
used for ColumnStore tables query processing. The permission "CREATE TEMPORARY TABLES" must be given for all user accounts in the
database infinidb_vtable
.
MariaDB [(none)]> create user flightsdbo@'%' identified by '************';
MariaDB [(none)]> grant create temporary tables on infinidb_vtable.* to flightsdbo;
MariaDB [(none)]> flush privileges;
When this permission is not given, an error occurs when processing queries :
ERROR 1044 (42000): Access denied for user 'flightsdbo'@'%' to database 'infinidb_vtable'
Creating a ColumnStore table
To create a ColumnStore table, only use the option engine=columnstore
in the CREATE TABLE
statement.
MariaDB [(none)]> use flights;
MariaDB [(flights)]>
create table airlines (
iata_code char(2),
airline varchar(30)
) engine=columnstore default character set=utf8;
Loading data with cpimport
cpimport
is the utility to bulk load data from flat files.
Example : loading airports in the table airports
(database flights
) from the file airports.csv
.
mcs@vps$ cpimport -m 2 -s ',' -E '"' flights airports -l $SCHEMA_DIR/airports.csv
2019-11-18 12:36:14 (25992) INFO : Running distributed import (mode 2) on all PMs... 2019-11-18 12:36:16 (25992) INFO : For table flights.airports: 342 rows processed and 342 rows inserted. 2019-11-18 12:36:16 (25992) INFO : Bulk load completed, total run time : 2.11828 seconds
The demo database flights
A useful demo database is available : Mariadb ColumnStore Flights Demo Database.
In this demo database, three tables : airlines
, airports
and flights
. All american flights during the year 2018 are loaded.
/opt/mcs/flights/data/2018-03.csv
2019-11-18 12:01:07 (2547) INFO : Running distributed import (mode 2) on all PMs...
2019-11-18 12:01:10 (2547) INFO : For table flights.flights: 665810 rows processed and 665810 rows inserted.
2019-11-18 12:01:10 (2547) INFO : Bulk load completed, total run time : 3.24067 seconds
/opt/mcs/flights/data/2018-04.csv
2019-11-18 12:01:11 (2674) INFO : Running distributed import (mode 2) on all PMs...
2019-11-18 12:01:15 (2674) INFO : For table flights.flights: 649023 rows processed and 649023 rows inserted.
2019-11-18 12:01:15 (2674) INFO : Bulk load completed, total run time : 4.22641 seconds
The variable infinidb_vtable_mode
An important variable : infinidb_vtable_mode
. This parameter can be defined at the server level and at the session level.
MariaDB [(none)]> show variables like 'infinidb_vtable_mode%';
+----------------------+-------+ | Variable_name | Value | +----------------------+-------+ | infinidb_vtable_mode | 1 | +----------------------+-------+
Value | Behaviour |
---|---|
0 |
Some WHERE clause components can be processed by
ColumnStore, but joins are processed entirely by mysqld
using a nested-loop join mechanism |
1 |
Default : query syntax is evaluated by ColumnStore for compatibility with distributed execution and incompatible queries are rejected. |
2 |
Auto-switch mode : ColumnStore will attempt to process the query internally, if it cannot, it will automatically switch the query to run in row-by-row mode. |
The needed value will depend on the target application querying system.
ColumnStore metadata informations
Space usage
In the database columnstore_info
, the stored procedure total_usage
returns the total space used in the ColumnStore system :
MariaDB [(none)]> call columnstore_info.total_usage();
+-----------------+------------------+ | TOTAL_DATA_SIZE | TOTAL_DISK_USAGE | +-----------------+------------------+ | 701.76 MB | 764.35 MB | +-----------------+------------------+
To get space used for the ColumnStore tables, use the stored procedure columnstore_info.table_usage()
.
MariaDB [(none)]> call columnstore_info.table_usage(null,null);
+--------------+------------+-----------------+-----------------+-------------+ | TABLE_SCHEMA | TABLE_NAME | DATA_DISK_USAGE | DICT_DATA_USAGE | TOTAL_USAGE | +--------------+------------+-----------------+-----------------+-------------+ | flights | airlines | 2.52 MB | 2.01 MB | 4.52 MB | | flights | airports | 9.55 MB | 6.02 MB | 15.58 MB | | flights | flights | 744.25 MB | 0.00 Bytes | 744.25 MB | +--------------+------------+-----------------+-----------------+-------------+
MariaDB [(none)]> call columnstore_info.table_usage('flights','flights');
+--------------+------------+-----------------+-----------------+-------------+ | TABLE_SCHEMA | TABLE_NAME | DATA_DISK_USAGE | DICT_DATA_USAGE | TOTAL_USAGE | +--------------+------------+-----------------+-----------------+-------------+ | flights | flights | 744.25 MB | 0.00 Bytes | 744.25 MB | +--------------+------------+-----------------+-----------------+-------------+
Compression ratio
The average compression ratio is calculated with the stored procedure columnstore_info.compression_ratio
.
MariaDB [(none)]> call columnstore_info.compression_ratio();
+-------------------+ | COMPRESSION_RATIO | +-------------------+ | 2.7716:1 | +-------------------+
INFORMATION_SCHEMA
MariaDB ColumnStore implements four Information Schema tables that expose informations about the tables and columns storage :
COLUMNSTORE_TABLES
COLUMNSTORE_COLUMNS
COLUMNSTORE_EXTENTS
COLUMNSTORE_FILES
The stored procedures columnstore_info.total_usage()
, columnstore_info.table_usage()
,
columnstore_info.compression_ratio()
relies on these information tables.
MariaDB [information_schema]> select * from columnstore_tables;
+--------------+------------+-----------+---------------+--------------+---------------+ | TABLE_SCHEMA | TABLE_NAME | OBJECT_ID | CREATION_DATE | COLUMN_COUNT | AUTOINCREMENT | +--------------+------------+-----------+---------------+--------------+---------------+ | flights | airlines | 3000 | 2019-11-18 | 2 | NULL | | flights | airports | 3004 | 2019-11-18 | 7 | NULL | | flights | flights | 3015 | 2019-11-18 | 32 | NULL | +--------------+------------+-----------+---------------+--------------+---------------+
To find where is stored a column, its properties and its size, for example the column year
in the table flights
:
MariaDB [information_schema]> SELECT b.table_name as "Table", b.column_name as "Column", a.filename, a.file_size, a.compressed_data_size, c.min_value as "Min", c.max_value as "Max" FROM COLUMNSTORE_FILES a, COLUMNSTORE_COLUMNS b, COLUMNSTORE_EXTENTS c WHERE a.object_id = b.object_id AND b.object_id = c.object_id AND b.COLUMN_NAME='year' AND b.table_name='flights'
+-----------+--------+----------------------------------------------------------------------------------------+ | Table | Column | filename | +-----------+--------+----------------------------------------------------------------------------------------+ | flights | year | /opt/mcs/mariadb/columnstore/data1/000.dir/000.dir/012.dir/136.dir/000.dir/FILE001.cdf | +-----------+--------+----------------------------------------------------------------------------------------+ | file_size | compressed_data_size | Min | Max | +-----------+----------------------+---------+--------+ | 16785408 | 860160 | 2015 | 2017 | +-----------+----------------------+---------+--------+
The min
and max
values are important informations when the ColumnStore engine processes query,
these values govern partitions eliminations.
Conclusion
Installing a standalone MariaDB ColumnStore is quite easy and adequate for getting started before planning architectures with multiple performance modules.
First tests performed with analytical queries : the performances are very good. But there is this big inconvenience : unable to create unicity constraints to enforce data quality. If duplicate rows are inserted due to an error, duplicate tuples must be removed manually.
Another inconvenience (for the moment), feeding is efficient only when using cpimport
. Like many other column store systems, atomic feeding
(INSERT, UPDATE, DELETE
) is not optimized in MariaDB ColumnStore : performance and storage degradations. That’s probably the reason why replication is
not (yet) implemented and supported.