

Ah, the frustrating slow-running PostgreSQL query. The first thing to do - without question - is run ANALYZE. PostgreSQL's ANALYZE command collects statistics about specific table columns, entire tables, or entire databases and stores the results in the pg_statistic system catalog. PostgreSQL's query planner then uses these statistics to help determine the most efficient execution plans for queries.

We have found that running ANALYZE more frequently has solved many of our customers' slow query problems. It's like the "restart your computer when it gets slow" advice for databases. OtterTune has a health check in place to determine if ANALYZE needs to be run more often. But what if running ANALYZE doesn't provide the necessary improvement? A recent interaction with an OtterTune user might shed light on your issue.

Recently, a Postgres user contacted us via the OtterTune Community Slack channel. This OtterTune user reported that a query ran slowly when using ORDER BY along with the LIMIT 10 clause. This LIMIT will cause us problems later in this article, which is why we are calling it out.

JOIN item_set ON (item_set.id = em_set_id)

This is a common query pattern in many applications to show the top n entries in a table (e.g., the most recent orders for a customer). In this case, the application retrieves the first ten items from a set named 'ABC.' But the customer reported that the query took an inexplicably long time to complete (18 seconds!).

OtterTune's automated health checks initially recommended that the customer increase statistics collection on the table (i.e., ANALYZE) by increasing PostgreSQL's statistics_target configuration knob. But while this recommendation made other queries run faster, it did not help this problem.

There are several StackOverflow posts with others that have similar issues in PostgreSQL queries that use LIMIT clauses with ORDER BY (Example 1, Example 2, Example 3). See also this Carnegie Mellon Database Tech Talk from a leading PostgreSQL developer that discusses this exact problem (starts at 45:15). So this problem is not unique to our customer.

Our engineers worked with the customer to get the query's execution plan using EXPLAIN ANALYZE. We then used Dalibo's fantastic visualization tool to examine the plan to understand what was going wrong. In the visualization shown here, we see that PostgreSQL's optimizer overestimates the number of tuples that it expects the nested-loop join operator to emit by almost 1000x. The optimizer is trying to be clever here: it knows that there is a LIMIT operator above the join, so rather than sort the thousands of tuples that it expects from the join to only grab the first ten tuples, it uses the item table's index to get the data pre-sorted. This causes it to use an index scan on the item table's primary key index (item.id) for the outer table (i.e., the child operator on the left side of the join). Using this index will automatically sort the item tuples in the same order needed in the query's ORDER BY clause.

[Figure: original query plan with the ORDER BY and LIMIT clauses]

The problem is that the DBMS does not know where the rows that will satisfy the predicate item_set.name = 'ABC' on the other side of the join operator are located physically in the item index. Notice how it almost correctly estimates the number of rows it will retrieve from the index scan on the item table. The DBMS thinks it will be only ten rows, but it ends up being 9400! This would not have been a big deal if the DBMS's assumption that it could cut off the scan after the first ten rows had been correct.

On a lark, we decided to try the SQL query again, but this time without the LIMIT clause, since that seemed to be the source of the problem. To our surprise, the query finished in 0.085 ms! This is a massive 99.9995% reduction. Of course, we were initially suspicious, as one can often make mistakes when running experiments that lead to erroneous results.
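If you want to run this kind of investigation yourself, you can capture an executed plan with EXPLAIN ANALYZE; the JSON output format is convenient for pasting into plan visualizers such as Dalibo's. The query below is a hypothetical stand-in modeled on the fragments quoted in this article, not the customer's actual statement:

```sql
-- Capture the executed plan with runtime row counts and timings.
-- BUFFERS adds I/O statistics; FORMAT JSON produces output that plan
-- visualization tools can ingest directly.
-- NOTE: the table/column names below are assumptions for illustration.
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT item.*
FROM item
JOIN item_set ON (item_set.id = em_set_id)
WHERE item_set.name = 'ABC'
ORDER BY item.id
LIMIT 10;
```

Comparing the "plan rows" estimate against the "actual rows" count on each node is what reveals misestimates like the 10-versus-9400 gap described above.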

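For reference, the statistics-related remedies mentioned above look roughly like this in SQL. The table and column names are assumptions, and note that the global knob is spelled default_statistics_target in PostgreSQL's configuration:

```sql
-- Recollect optimizer statistics for one table
-- (omit the table name to analyze the whole database).
ANALYZE item_set;

-- Raise the sample size for a single column; the next ANALYZE stores
-- a finer-grained histogram for it in pg_statistic.
ALTER TABLE item_set ALTER COLUMN name SET STATISTICS 1000;
ANALYZE item_set;

-- Or raise the target globally (the default is 100);
-- the new value takes effect after a configuration reload.
ALTER SYSTEM SET default_statistics_target = 500;
SELECT pg_reload_conf();
```

As the article notes, better statistics help many slow queries but cannot fix this one, because the misestimate comes from the LIMIT cutoff assumption rather than from stale column statistics.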