SELECT … FOR UPDATE on non-existent rows

TL; DR

SELECT … FOR UPDATE has a (not so) surprising side effect on non-existent rows: it could cause a (serious) performance penalty and even prevent you from inserting new rows at all.

Locking rows for update

A development team of ours was working on an application that needed to ensure an update on a single row item isn’t modified by another transaction. Naturally they started making use of SELECT … FOR UPDATE to lock the row before updating it. This worked excellent to keep anyone else from updating this row. However they started to get some lock wait timeouts on new inserts of totally unrelated items during a load test and they asked me to look into this.

SELECT … FOR UPDATE is described as following in the MySQL documentation:
A SELECT ... FOR UPDATE reads the latest available data, setting exclusive locks on each row it reads. Thus, it sets the same locks a searched SQL UPDATE would set on the rows.
So far so good: this behavior is expected to happen. It also doesn’t mention anything about locking anything but the rows it reads.

I asked the team whether they were attempting to insert the same data as they were locking in the other transaction and they said they were not.

In pseudo code they were doing this:
SELECT ... WHERE uuid='some-uuid' FOR UPDATE;
if row { UPDATE row }
else { INSERT row }

The uuid column here is the primary key of the table. This code executed fine and had no issues by itself as a single transaction.

You may wonder why not use the INSERT … ON DUPLICATE KEY UPDATE or REPLACE INTO?
First of all we are inserting only occasionally, so that would fail the insert command 99% of the time. Second of all we may only be updating a single column within a single row, so that implies we would need to know the entire row up front when we have to insert or replace the row.

No row to update

Now what would happen if there is no row to update?

According to the description in the MySQL documentation it sets an exclusive lock on each row it reads, but what about when there is no row to read? This other excerpt on gap locks might hint what it actually does do:
For example, SELECT c1 FROM t WHERE c1 BETWEEN 10 and 20 FOR UPDATE; prevents other transactions from inserting a value of 15 into column t.c1, whether or not there was already any such value in the column, because the gaps between all existing values in the range are locked.
If there is no row, a lock will still be set on the associated index entries. This shouldn’t be bad, right? We can still insert other rows, right? Wrong: it isn’t a gap lock alone, but a next-key lock!

Since the lock is set on a non-existent index entry, a next-key lock is created. Depending on where you would insert in the index, you may find a whole range being locked as it needs to insert within this range. In our version of UUID this shouldn’t happen very often as there is a random factor, but it still can happen often if you only have a few rows in your table. In that case the gaps between UUIDs is large and due to the random factor yu may end up locking large gaps within the index. As cross region latency is present on this system, this keeps the next-key locks open longer and the chance of a collision in gap increases also a little bit. So that explains the behavior during the load test. So all’s well, ends well?

Nasty side effect of Next-Key locks

There is one nasty side effect with the next-key lock: if the index value would be greater than the largest value in the table it locks everything above the largest value until infinity.

So what would happen to a table where the primary key is sequential like an integer? For example this table with the following rows:
CREATE TABLE sometable (
id int(11) NOT NULL,
some_data varchar(255) NOT NULL default '',
PRIMARY KEY (some_id)
);
INSERT INTO sometable VALUES
(1, 'test'),
(2, 'test'),
(3, 'test'),
(4, 'test'),
(5, 'test'),
(10, 'test'),
(11, 'test'),
(12, 'test'),
(13, 'test'),
(14, 'test'),
(15, 'test');

This would create a gap between 5 and 10 and a gap from 15 till infinity.

When we are selecting within the gap between 5 and 10, we create a next-key lock between 5 and 10 and we can’t insert new rows inside this gap. We can still insert new rows at the end of the table though. However if we would select a row on id greater than 15 we would put a next-key lock on 15 till infinity. This means nobody can append anything to this table anymore until we have committed our transaction! This could become a serious bottleneck if you insert more rows than update.

Conclusion

I wasn’t aware of the fact that SELECT … FOR UPDATE actually locked the index for rows that don’t exist yet. I would have reasoned that if there were no rows to be selected, there wouldn’t be anything to lock. And if there wouldn’t be anything to lock, the whole SELECT … FOR UPDATE would simply fizzle.

Even though SELECT … FOR UPDATE sounds like a great way to ensure your transaction is the only one who modifies a specific row, it’s quite dangerous as it could lock out or delay other transactions. If you would take our example above, the safe way to do it (in pseudo code) is:
SELECT ... WHERE uuid='some-uuid';
if row { SELECT ... WHERE uuid='some-uuid' FOR UPDATE; UPDATE row }
else { INSERT row }

This would ensure the lock would only be set when there actually is a row present, however this is at the expense of an additional query.

Edit: added that UUID gaps can be significant

Using CURDATE() the wrong way

It has been a while since I wrote on this blog. Basically I had too much on my mind (expanding my department, holidays, etc) to actually write here and I’ll promise to post more regularly from now onwards. 😉

Syntax error!

Anyway, as the title already suggests: I found out how you can use CURDATE() in a wrong way. One of the developers in my company asked me to help him out as his code all of a sudden did not work properly anymore. Or even better: it used to process several thousands of rows and all of a sudden it processed none.

I looked at his code snippet and it was quite a long query with a lot of brackets:

SELECT SUM(some_count_col), logdate, loghour FROM logs
WHERE (logdate = CURDATE() AND loghour = HOUR(NOW())
GROUP BY logdate, loghour;

Column wise logdate is of the type DATE and loghour of the type TINYINT.
note that this is, obviously, not the original query, but it is similar

Apart from the fact that his usage of brackets makes the query quite unreadable I was quickly able to simplify the query to this: Continue reading “Using CURDATE() the wrong way”