cstore_fdw

Fast columnar store for analytics with PostgreSQL

View project on GitHub

Introducing cstore_fdw, the first column-oriented store available for PostgreSQL. Using it will let you:

  • Leverage typical analytics benefits of columnar stores
  • Deploy on stock PostgreSQL or scale-out PostgreSQL (Citus)

Download and get started at https://github.com/citusdata/cstore_fdw.

Highlights

Key areas improved by this extension:

  • Faster Analytics — Reduce analytics query disk and memory use by 10x
  • Lower Storage — Compress data by 3x
  • Easy Setup — Deploy as standard PostgreSQL extension
  • Flexibility — Mix row- and column-based tables in the same DB
  • Community — Benefit from PostgreSQL compatibility and open development

Learn more on our blog post.

Faster Analytics

cstore_fdw brings substantial performance benefits to analytics-heavy workloads:

  • Column projections: only read columns relevant to the query
  • Compressed data: higher data density reduces disk I/O
  • Skip indexes: row group stats permit skipping irrelevant rows
  • Stats collections: integrates with PostgreSQL’s own query optimizer
  • PostgreSQL-native formats: no deserialization overhead at query time
Table Type TPC-H 3 TPC-H 5 TPC-H 6 TPC-H 10
PostgreSQL 4444 4444 3512 4433
cstore 786 754 756 869
cstore (LZ) 322 346 269 302
4GB data using PostgreSQL 9.3 on m1.xlarge

Lower Storage

Cleanly implements full-table compression:

Easy Setup

If you know how to use PostgreSQL extensions, you know how to use cstore_fdw:

  • Deploy as standard PostgreSQL extension
  • Simply specify table type at creation time using FDW commands
  • Copy data into your tables using standard PostgreSQL COPY command

Flexibility

Have the best of all worlds… mix row- and column-based tables in the same DB:

CREATE FOREIGN TABLE cstore_table
  (num integer, name text)
SERVER cstore_server;

CREATE TABLE plain_table
  (num integer, name text);

COPY cstore_table FROM STDIN (FORMAT csv);
-- 1, foo
-- 2, bar
-- 3, baz
-- \.

COPY plain_table FROM STDIN (FORMAT csv);
-- 4, foo
-- 5, bar
-- 6, baz
-- \.

SELECT * FROM cstore_table c, plain_table p WHERE c.name=p.name;
-- num | name | num | name 
-------+------+-----+------
--   1 |  foo |   4 |  foo
--   2 |  bar |   5 |  bar
--   3 |  baz |   6 |  baz

Community

Join the cstore users Google Group.

Full integration with rich PostgreSQL ecosystem:

  • Compatible with all existing PostgreSQL data types
  • Leverage semi-structured data using hstore or json
  • Quickly keep track of distinct values with HyperLogLog

Learn more about the Optimized Row Column (ORC) file format , which influenced the layout used by cstore_fdw, or dive into the code.