

5 Novelties in PostgreSQL 10 That Will Make You a Happier Developer

In the past few years, we’ve heard a lot about NoSQL databases: Redis, MongoDB, Cassandra, Memcached, or DynamoDB, to name a few. Even though they have gained a lot of deserved hype, the good old relational databases remain mostly irreplaceable. PostgreSQL, an object-relational (we will get to it later) database management system, is still gaining new fans.

In this post, I will look at some examples of features introduced in the newest version of Postgres, which was released on 5 October 2017. Throughout this post, I will run most of the queries on Postgres 10 on Ubuntu 16. Only the first part, about identity columns, will also show how things used to be done in “the ol’ days” on Postgres 9.6 (shown first in each pair of snippets below).

Environment Setup

The setup process is rather easy. First, you need to install Docker CE according to your system preferences.

PostgreSQL 9.6

$ docker pull postgres:9.6
$ docker run --name old-postgres -d postgres:9.6
$ docker run -it --rm --link old-postgres:postgres postgres:9.6 psql -h postgres -U postgres
PostgreSQL 10

$ docker pull postgres:10
$ docker run --name new-postgres -d postgres:10
$ docker run -it --rm --link new-postgres:postgres postgres:10 psql -h postgres -U postgres

After pulling and starting the images, you should see a psql prompt showing the Postgres version you are using.

Now we’re ready to go!

1. Identity columns

After switching from MS SQL Server, this was something I could not get used to: identity columns. In short, these are columns that store the IDs of your rows – they should be unique and they should increment automatically.

-- PostgreSQL 9.6
CREATE TABLE foo (id SERIAL PRIMARY KEY, val1 INTEGER);
CREATE TABLE

\d
             List of relations
 Schema |    Name    |   Type   |  Owner   
--------+------------+----------+----------
 public | foo        | table    | postgres
 public | foo_id_seq | sequence | postgres

-- PostgreSQL 10
CREATE TABLE foo (id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY, val1 INTEGER);
CREATE TABLE

\d
             List of relations
 Schema |    Name    |   Type   |  Owner   
--------+------------+----------+----------
 public | foo        | table    | postgres
 public | foo_id_seq | sequence | postgres

The syntax is quite similar in both versions, and underneath there still lies a sequence. So what’s the big deal? First of all, the new syntax is SQL-standard compliant, so it should now be easier to run the same code on different databases. What is more, you can explicitly decide on the insert behavior by choosing either GENERATED ALWAYS or GENERATED BY DEFAULT.
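
As a quick, purely illustrative sketch of the difference (the demo table names and the exact messages are mine, not from the original example): ALWAYS rejects user-supplied ids unless you explicitly override, while BY DEFAULT accepts them.

-- Hypothetical demo tables, one per mode
CREATE TABLE always_demo  (id INTEGER GENERATED ALWAYS     AS IDENTITY, val1 INTEGER);
CREATE TABLE default_demo (id INTEGER GENERATED BY DEFAULT AS IDENTITY, val1 INTEGER);

INSERT INTO always_demo (id, val1) VALUES (42, 1);
-- ERROR: cannot insert into column "id" (HINT: use OVERRIDING SYSTEM VALUE to override)
INSERT INTO always_demo (id, val1) OVERRIDING SYSTEM VALUE VALUES (42, 1);
-- allowed, because we explicitly overrode the system-generated value
INSERT INTO default_demo (id, val1) VALUES (42, 1);
-- allowed, the explicit value simply wins over the sequence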

Another convenient thing is changing the next value of a sequence, which I used a lot while migrating databases. In the new version, you don’t have to pay attention to the sequence your table is using, because all the magic is done by PG:

-- PostgreSQL 9.6: you need to know which sequence the table uses
ALTER SEQUENCE foo_id_seq RESTART WITH 1000;
ALTER SEQUENCE

-- PostgreSQL 10: the table and column are enough
ALTER TABLE foo ALTER COLUMN id RESTART WITH 1000;
ALTER TABLE

If this still does not convince you, there is one more really exciting thing that I was not aware of before. Suppose we would like to create a copy of the table foo:

-- PostgreSQL 9.6
CREATE TABLE bar (LIKE foo INCLUDING ALL);
CREATE TABLE

\d
             List of relations
 Schema |    Name    |   Type   |  Owner   
--------+------------+----------+----------
 public | bar        | table    | postgres
 public | foo        | table    | postgres
 public | foo_id_seq | sequence | postgres

-- PostgreSQL 10
CREATE TABLE bar (LIKE foo INCLUDING ALL);
CREATE TABLE

\d
             List of relations
 Schema |    Name    |   Type   |  Owner   
--------+------------+----------+----------
 public | bar        | table    | postgres
 public | bar_id_seq | sequence | postgres
 public | foo        | table    | postgres
 public | foo_id_seq | sequence | postgres

Both operations result in a new table, but only in Postgres 10 is a new sequence created as well. So if you inserted some new rows into bar, in version 9.6 you would get id 1000 (it reuses foo_id_seq), while in version 10 you would start from 1, which in my opinion is the desired result. Now, if you wanted to drop foo, you could run into problems:

-- PostgreSQL 9.6
DROP TABLE foo;
ERROR:  cannot drop table foo because other objects depend on it
DETAIL:  default for table bar column id depends on sequence foo_id_seq
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

-- PostgreSQL 10
DROP TABLE foo;
DROP TABLE

In 9.6 you cannot drop foo, since its sequence is being used elsewhere. If you do as the hint suggests, you are in for a hard time inserting values into bar, though. And what if you had created a lot of tables by copying existing ones? You would end up with quite a mess…
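
To see why, here is a sketch of what happens on 9.6 if you follow the hint (the messages below are approximate, not copied from a real session):

-- PostgreSQL 9.6, after following the HINT
DROP TABLE foo CASCADE;
NOTICE:  drop cascades to default for table bar column id
DROP TABLE

-- bar.id has lost its default, so a plain insert now fails
INSERT INTO bar (val1) VALUES (1);
ERROR:  null value in column "id" violates not-null constraint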

If you want to see some other monstrosities of the old approach, I encourage you to check out this article.

2. Native partitioning

This is a feature that I’m extremely excited about because of its power and possible applications. I think we’re still some way from the finish line but, in comparison to 9.6, it’s a big step forward for Postgres in terms of ease of use and performance.

What is partitioning all about? Let me give you an example.

Some time ago, my colleagues at 10Clouds wrote about their attempt to predict smog levels using machine learning. Consider a table that stores all the measurements. A minimal model table would have a timestamp of the sample’s acquisition, a site id and the type of pollutant (CO2, PM10, PM2.5, and so on). After some time, the amount of data in the table would grow considerably, and performance would suffer.

Depending on the type of calculations performed most frequently, we could try splitting the data. If we decide that we want a separate table for, say, each month of measurements, this is where partitioning comes in handy. If the application usually queries the database for records from the last two weeks, we would then have to scan at most two partitions instead of all the records. Of course, we could specify in the query which tables to scan, but that would be troublesome and, to be honest, ugly.

What we want to achieve is:

  • Create another level of abstraction – we want to query only one (master) table
  • The data themselves should be dispatched to different child tables based on the sample’s timestamp

Here’s how to accomplish this task in Postgres:

  1. Create a master table.
  2. Create as many child tables with datetime constraints as needed.
  3. Create indices, keys, and other constraints on child tables.
  4. Create a trigger on the master table that will dispatch rows to proper child tables before insert.

As of Postgres 10, point 4 is no longer needed – the DBMS will handle it, and thus the syntax becomes greatly simplified. A sample implementation might look like this:

-- 1. Create the master table and specify the partitioning rule
CREATE TABLE measurement(
    id INTEGER GENERATED ALWAYS AS IDENTITY,
    datetime TIMESTAMPTZ,
    site_id INTEGER,
    pollutant_id INTEGER,
    value FLOAT)
PARTITION BY RANGE (datetime);

-- 2. Create a couple of child tables and define the date ranges they should store
CREATE TABLE measurement_201708
PARTITION OF measurement(datetime)
FOR VALUES FROM ('2017-08-01') TO ('2017-09-01');

CREATE TABLE measurement_201709
PARTITION OF measurement(datetime)
FOR VALUES FROM ('2017-09-01') TO ('2017-10-01');

CREATE TABLE measurement_201710
PARTITION OF measurement(datetime)
FOR VALUES FROM ('2017-10-01') TO ('2017-11-01');

-- 3. Add all needed keys and indices, for each child table
ALTER TABLE measurement_201708 ADD PRIMARY KEY (id);
ALTER TABLE measurement_201708 ADD CONSTRAINT fk_measurement_201708_site FOREIGN KEY (site_id) REFERENCES site(id);
CREATE INDEX idx_measurement_201708_datetime ON measurement_201708(datetime);

Assuming that we created tables with sites and pollutants, this is how we might feed the measurement table and then query it:

-- Inserting looks the same as in normal tables
INSERT INTO measurement(datetime, site_id, pollutant_id, value)
SELECT '2017-08-01'::TIMESTAMPTZ + ((random()*90)::int) * INTERVAL '1 day',
       (1 + random()*(SELECT max(id)-1 FROM site))::int,
       (1 + random()*(SELECT max(id)-1 FROM pollutant))::int,
       random()
FROM generate_series(1,1000000);

-- Selecting data as well
SELECT * FROM measurement WHERE datetime BETWEEN '2017-09-20' AND '2017-09-27';

The new syntax makes it easier to partition data into specific ranges or categories. This means that we could also take a different approach and create a separate partition for every measurement site. If one of them happens to be deactivated, we can easily detach its partition from the master table and archive it – see the sketch below. Probably the most important part is that the performance of native partitioning has improved greatly. I recommend reading depesz’s blog post, where he explores it in depth.
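
As a rough sketch of that per-site approach (the table and partition names here are made up for illustration), list partitioning and detaching could look like this:

-- Partition by site instead of by month
CREATE TABLE measurement_by_site (
    id INTEGER GENERATED ALWAYS AS IDENTITY,
    datetime TIMESTAMPTZ,
    site_id INTEGER,
    pollutant_id INTEGER,
    value FLOAT)
PARTITION BY LIST (site_id);

CREATE TABLE measurement_site_1 PARTITION OF measurement_by_site FOR VALUES IN (1);
CREATE TABLE measurement_site_2 PARTITION OF measurement_by_site FOR VALUES IN (2);

-- Site 2 gets deactivated: detach its partition and keep it as an ordinary table for archiving
ALTER TABLE measurement_by_site DETACH PARTITION measurement_site_2;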

On the other hand, there are still many limitations: you have to set keys and indices for each child table, you can’t define a unique key across different child tables, and so on. But partitioning is clearly being worked on intensively, so I hope to see more in the coming releases.

3. Multicolumn statistics

This feature might be a relief for many of you who know your data and happen to see a lot of underestimated execution plans. A hypothetical example: a business has a class of 1,000 children who like counting rhymes. I mean, they really LOVE them. For some crazy reason, the business also wants to store which child said which word. You know your craft, so you don’t even ask why and just get down to coding. Here’s how you might store and populate the data:

CREATE TABLE counting_log (id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, datetime TIMESTAMP WITH TIME ZONE, child_id INTEGER, word TEXT);
CREATE TABLE
INSERT INTO counting_log(datetime, child_id, word) 
SELECT current_timestamp, i%1000, 
CASE WHEN i%4=1 THEN 'eeny' 
WHEN i%4=2 THEN 'meeny' 
WHEN i%4=3 THEN 'miny' 
WHEN i%4=0 THEN 'moe' 
ELSE 'nope' END 
FROM generate_series(1, 1000000) i;
INSERT 0 1000000
CREATE INDEX idx_counting_log_child_id on counting_log(child_id);
CREATE INDEX
CREATE INDEX idx_counting_log_datetime on counting_log(datetime);
CREATE INDEX

As soon as you report that you are ready, the business comes back and asks you to retrieve the times when the child with id 123 said ‘miny’… This is what Postgres would normally do:

  1. Estimate what fraction of total rows (p1) would be returned with the predicate child_id=123
  2. Estimate what fraction of total rows (p2) would be returned with the predicate word = ‘miny’
  3. Assume that the total number of rows using both predicates would be equal to:
    total_rows*p1*p2

This way of calculating the number of returned rows is accurate only when the columns in the predicates are not correlated. In our case, they quite obviously are, so the estimate would be way lower than the actual result.
But with the introduction of correlated statistics, things might change for the better. Let’s see how it’s done now:

CREATE STATISTICS st_counting_log_child_id_word ON child_id, word FROM counting_log;
CREATE STATISTICS
ANALYZE counting_log;
ANALYZE

In these examples, I will be showing things in reverse order. First, let’s see what the estimate is with the help of the newly created statistic:

EXPLAIN SELECT datetime FROM counting_log WHERE child_id=123 AND word='miny';
                                        QUERY PLAN                                         
-------------------------------------------------------------------------------------------
 Bitmap Heap Scan on counting_log  (cost=19.92..2696.34 rows=967 width=8)
   Recheck Cond: (child_id = 123)
   Filter: (word = 'miny'::text)
   ->  Bitmap Index Scan on idx_counting_log_child_id  (cost=0.00..19.68 rows=967 width=0)
         Index Cond: (child_id = 123)

We expect 967 rows to be returned. Ok. How would we do it in the old-fashioned way?

DROP STATISTICS st_counting_log_child_id_word;
DROP STATISTICS
EXPLAIN SELECT datetime FROM counting_log WHERE child_id=123;
                                        QUERY PLAN                                         
-------------------------------------------------------------------------------------------
 Bitmap Heap Scan on counting_log  (cost=19.92..2693.92 rows=967 width=8)
   Recheck Cond: (child_id = 123)
   ->  Bitmap Index Scan on idx_counting_log_child_id  (cost=0.00..19.68 rows=967 width=0)
         Index Cond: (child_id = 123)
EXPLAIN SELECT datetime FROM counting_log WHERE word='miny';
                             QUERY PLAN                              
---------------------------------------------------------------------
 Seq Scan on counting_log  (cost=0.00..19695.00 rows=202133 width=8)
   Filter: (word = 'miny'::text)

Now, let’s do some math:

returned_rows = total_rows * p1 * p2 = 1000000 * (967/1000000) * (202133/1000000) ≈ 195.46
This is what we would expect. And what would Postgres say?

EXPLAIN SELECT datetime FROM counting_log WHERE child_id=123 AND word='miny';
                                        QUERY PLAN                                         
-------------------------------------------------------------------------------------------
 Bitmap Heap Scan on counting_log  (cost=19.73..2696.14 rows=195 width=8)
   Recheck Cond: (child_id = 123)
   Filter: (word = 'miny'::text)
   ->  Bitmap Index Scan on idx_counting_log_child_id  (cost=0.00..19.68 rows=967 width=0)
         Index Cond: (child_id = 123)

Jackpot!

Now there is also one more thing – we are only talking about estimates. What’s the real number of returned rows?

SELECT count(datetime) FROM counting_log WHERE child_id=123 AND word='miny';
 count 
-------
  1000

To sum up: this example shows that we can achieve a lot if we know our data well. Ignoring the correlation led to a big underestimation – about one order of magnitude! Here it doesn’t look too serious, but consider joining another table that stores data about the children. With such an underestimate, a nested loop might be considered the cheapest option when, in fact, a hash join should be used. If you are interested in seeing what data are stored when collecting multicolumn statistics, just type:

SELECT * FROM pg_statistic_ext WHERE stxname = 'st_counting_log_child_id_word' \gx
-[ RECORD 1 ]---+------------------------------
stxrelid        | 16555
stxname         | st_counting_log_child_id_word
stxnamespace    | 2200
stxowner        | 16385
stxkeys         | 3 4
stxkind         | {d,f}
stxndistinct    | {"3, 4": 1000}
stxdependencies | {"3 => 4": 1.000000}

We will not delve into it any deeper here; just note that stxkeys corresponds to the column numbers (you can specify more than two), and that two kinds of statistics are collected: stxndistinct and stxdependencies.
Oh, and by the way, did you notice that I ended the previous query with \gx instead of a semicolon? Yup, that’s also a new feature 😉
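
For the curious, here is a tiny illustration (any query works the same way): \gx runs the statement just like a semicolon or \g would, but prints that single result in expanded, one-column-per-line form:

SELECT 1 AS answer, 'expanded output' AS note \gx
-[ RECORD 1 ]-----------
answer | 1
note   | expanded output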

4. More parallelism!

The first attempts to introduce parallel queries were made in Postgres 9.6. Since then, you have been able to use parallel sequential scans, hash and nested loop join strategies, and parallel aggregates. From Postgres 10 onwards, you can also use parallel merge joins, bitmap heap scans and, maybe most importantly, index and index-only scans.

That’s why you can find new settings for configuring parallel query usage in the postgresql.conf file:

  • min_parallel_table_scan_size – the minimum size of a table for which parallelism can be considered, 8MB by default
  • min_parallel_index_scan_size – the same as above, but applied to indices, 512kB by default
  • max_parallel_workers – the maximum number of parallel workers that can be active at one time, 8 by default. Note that max_parallel_workers_per_gather was already introduced in the previous version of Postgres.

What are these settings for? Well, everything comes at a cost. Setting up a parallel worker sometimes isn’t worth it, because it can use more resources than simply running the query without any parallel workers. Also, by default the planner will only consider executing a query with more than one worker if the estimated cost is higher than 1000, which corresponds to the parallel_setup_cost setting.
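
If you want to check how your instance is configured, you can simply ask psql; the values in the comments below are just the defaults mentioned above:

SHOW min_parallel_table_scan_size;   -- 8MB
SHOW min_parallel_index_scan_size;   -- 512kB
SHOW max_parallel_workers;           -- 8
SHOW max_parallel_workers_per_gather;
SHOW parallel_setup_cost;            -- 1000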

From my short experience with Postgres 10, I have a feeling that the new parallel workers are rather shy creatures. Let’s see them in action. It isn’t really an interesting example, but let’s store values of trigonometric functions, just like that:

CREATE TABLE trigonometry AS SELECT i AS arg, sin(i) AS sine, cos(i) AS cosine, tan(i) AS tangent FROM generate_series(0, 100000, 0.01) i;
SELECT 10000001
CREATE INDEX idx_trigonometry_arg ON trigonometry(arg);
CREATE INDEX
CREATE INDEX idx_trigonometry_sine ON trigonometry(sine);
CREATE INDEX
CREATE INDEX idx_trigonometry_cosine ON trigonometry(cosine);
CREATE INDEX

First, some aggregations that were introduced in version 9.6:

EXPLAIN SELECT count(arg) FROM trigonometry WHERE arg > 50000;
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=140598.88..140598.89 rows=1 width=8)
   ->  Gather  (cost=140598.67..140598.88 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=139598.67..139598.68 rows=1 width=8)
               ->  Parallel Seq Scan on trigonometry  (cost=0.00..134437.55 rows=2064449 width=8)
                     Filter: (arg > '50000'::numeric)

Easy. The estimated cost is quite big, so the planner decided to activate two additional workers.
Now let’s try to use a parallel index scan.

EXPLAIN SELECT * FROM trigonometry WHERE arg > 50000;
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Index Scan using idx_trigonometry_arg on trigonometry  (cost=0.43..201369.28 rows=4954677 width=32)
   Index Cond: (arg > '50000'::numeric)

Nope. What if you helped it a little bit by lowering the cost of adding another worker?

SET parallel_setup_cost=100;
SET
EXPLAIN SELECT * FROM trigonometry WHERE arg > 50000;
                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Index Scan using idx_trigonometry_arg on trigonometry  (cost=0.43..201367.27 rows=4954562 width=32)
   Index Cond: (arg > '50000'::numeric)

Still nothing, which is a bit surprising to me. The estimated cost is high, and the sizes of the table and the index are well above the thresholds… Let’s try a different query, then:

SET parallel_setup_cost=1000;
SET
EXPLAIN SELECT arg FROM trigonometry WHERE sine > 0.999 AND arg >100 AND arg < 10000;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.43..40801.85 rows=13403 width=8)
   Workers Planned: 2
   ->  Parallel Index Scan using idx_trigonometry_arg on trigonometry  (cost=0.43..38461.55 rows=5585 width=8)
         Index Cond: ((arg > '100'::numeric) AND (arg < '10000'::numeric))
         Filter: (sine > '0.999'::double precision)

Finally, there you are! Now you can see the Parallel Index Scan in all its glory. I ran this query a few times and the results were retrieved in 162 ms on average. Now let’s play some more: let’s try to force parallel mode while limiting additional workers to 0.

SET max_parallel_workers=0;
SET
SET force_parallel_mode=on;
SET
EXPLAIN ANALYZE SELECT arg FROM trigonometry WHERE sine > 0.999 AND arg >100 AND arg < 10000;
                                                                          QUERY PLAN                                                                           
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.43..40801.85 rows=13403 width=8) (actual time=0.265..263.796 rows=14097 loops=1)
   Workers Planned: 2
   Workers Launched: 0
   ->  Parallel Index Scan using idx_trigonometry_arg on trigonometry  (cost=0.43..38461.55 rows=5585 width=8) (actual time=0.072..262.133 rows=14097 loops=1)
         Index Cond: ((arg > '100'::numeric) AND (arg < '10000'::numeric))
         Filter: (sine > '0.999'::double precision)
         Rows Removed by Filter: 975902
 Planning time: 0.164 ms
 Execution time: 264.473 ms

First of all, you can see that the query still executes. I want to emphasize this, because it wasn’t obvious to me that the number of launched workers could be equal to 0, which means there are no additional workers at all.

Secondly, if you limit the number of available processes, the planner might plan for more workers than it can actually get. Finally, the parallel index scan really is faster: without any additional workers, the query above took about 260 ms on average, compared to 162 ms with the parallel index scan. All in all, that’s awesome, even though many of us might not even be aware that this feature exists.

5. Full-text search for JSON and JSONB

Last but not least, full-text search support was added for JSON and JSONB type columns. If you’re familiar with how it works on text columns, this won’t be a breakthrough, but it still is a nice feature that comes out-of-the-box in Postgres 10.

Imagine that you are using an external system for payment management. You store most of the data in normal columns but, just to be safe, you also keep the raw responses from that system in a JSON column. This is what it might look like:

CREATE TABLE transaction(id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, transaction_id VARCHAR(10), user_id INTEGER, created_datetime TIMESTAMP WITH TIME ZONE, result BOOL, amount INT, response_data JSON);
CREATE TABLE

INSERT INTO transaction(transaction_id, user_id, created_datetime, result, amount, response_data)
SELECT tran.id, ceil(random()*100), tran.datetime, tran.result, tran.amount, 
('{"transaction": {"id": "'|| tran.id ||'", "transaction_datetime": "'|| tran.datetime || '","amount": '|| tran.amount::text || ',"is_success": "'|| tran.result || '", "message": "'|| tran.msg || '"}}')::json
FROM (
    SELECT substring(md5(random()::text), 1, 10) as id, 
        current_timestamp + (ceil(random()*1000)-500) * INTERVAL '1 minute' as datetime,
        ceil(random()*1000) as amount,
        NOT (i%3=1) as result,
        CASE WHEN i%9=1 THEN 'insufficient funds' WHEN i%9=4 THEN 'blocked account' WHEN i%9=7 THEN 'fraud detected' ELSE 'accepted' END as msg
    FROM generate_series(1,1000) i) tran;
INSERT 0 1000

This is a small example, but if the table grows considerably, it might be a good idea to create an index for the JSON column. Here’s how we do it:

CREATE INDEX idx_transaction_response_data ON transaction USING GIN (to_tsvector('english', response_data));
CREATE INDEX

Now that we have some data in the table, let’s take a look at how we can query it based on, for instance, the ‘insufficient funds’ response message. Does the query use our new index?

SELECT transaction_id FROM transaction WHERE to_tsvector('english', response_data) @@ to_tsquery('english', 'insufficient') LIMIT 5;
 transaction_id 
----------------
 b114b0927f
 28613b40c6
 649b9d3285
 ce9c16ec6f
 24f69de12e

EXPLAIN SELECT transaction_id FROM transaction WHERE to_tsvector('english', response_data) @@ to_tsquery('english', 'insufficient') LIMIT 5;
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
 Limit  (cost=8.04..23.12 rows=5 width=11)
   ->  Bitmap Heap Scan on transaction  (cost=8.04..23.12 rows=5 width=11)
         Recheck Cond: (to_tsvector('english'::regconfig, response_data) @@ '''insuffici'''::tsquery)
         ->  Bitmap Index Scan on idx_transaction_response_data  (cost=0.00..8.04 rows=5 width=0)
               Index Cond: (to_tsvector('english'::regconfig, response_data) @@ '''insuffici'''::tsquery)

Cool, really cool! Note that the word ‘insufficient’ got normalized to the lexeme ‘insuffici’ – to_tsquery does that stemming for us, as you can see in the plan above.
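
If you want to see that stemming directly, you can ask to_tsquery (or to_tsvector) yourself:

SELECT to_tsquery('english', 'insufficient');
  to_tsquery   
---------------
 'insuffici'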

Ugh… Is this it?

Nope, not even close. As I wrote at the beginning, there is a lot of hype around Postgres 10, as this version introduces a lot of new features. The five novelties above are just the tip of the iceberg, but I chose them because I believe I will be using them most often (consciously or not).

So what did I skip?

  • logical replication
  • quorum commit for synchronous replication
  • XML tables
  • SCRAM authentication
  • renaming of some functions
  • \if \elif \else statements in psql

… and many, many more!

For a full list, visit the Postgres wiki. And if you can’t wait for even newer software, visit depesz’s blog, where he has already started describing features from the upcoming Postgres 11.

Jakub Wilkowski

Python Developer

Telco enthusiast, avoided programming for a long time, but finally gave up and fell in love with it. Likes terminal, weird music and brewing beer.
