
Race Condition when Creating Unique Values

During my tuning sessions, there is a race condition I run across so often that I think it is worth blogging about. It concerns the handling of a common requirement: how do I check if the value I want to insert is already in the table? And if it is, don't insert it.

Here is how a typical non-DBA programmer often does this:
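The original code listing did not survive in this copy of the post. What follows is a minimal reconstruction of the check-then-insert pattern being described; the table name (Foo), its columns (X, Payload) and the @X parameter are assumptions taken from later sections of the post and the comments.

```sql
-- Reconstruction of the flawed check-then-insert pattern (names assumed)
DECLARE @X INT = 42;

BEGIN TRANSACTION;

IF NOT EXISTS (SELECT * FROM dbo.Foo WHERE X = @X)
BEGIN
    INSERT INTO dbo.Foo (X, Payload) VALUES (@X, 'New');
END;

COMMIT TRANSACTION;
```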

The problem with this pattern is that it is wrong. Under transaction isolation levels weaker than SERIALIZABLE, this code does not guarantee that you won't insert a duplicate.

Here is why:

Consider two queries A and B both wanting to insert the value 42. If this sequence happens:

  • A executes the SELECT with @X = 42. Concludes that 42 is not in Foo
  • B executes the SELECT also with @X = 42. Concludes that 42 is not in Foo
  • A executes INSERT
  • B executes INSERT

You now have duplicate values of X in the table. More times than I can remember, I have had programmers ask me: "Why is this allowed when there is a transaction around the two statements? Why does the database get us into an inconsistent state here?". The answer is that the transaction isolation level determines just how strong the consistency guarantee is. By default, SQL Server uses the READ COMMITTED isolation level, which releases locks quickly (good for scale). MySQL uses REPEATABLE READ as its default isolation level, but this has other problems (long-held locks for bad queries). Neither guarantees that the above only inserts the value when it is not already there.

Solution 1: Declare a Unique Key

The obvious solution is of course to declare a unique key on X. However, it is not obvious that this is the best solution. There are typically two problems with using a key to enforce the uniqueness:

1) X may only be unique under certain conditions (if you are in this situation, it may indicate that you have a normalisation problem)
2) Exceptions are expensive if the violation is common

Ad 2) When the key violation happens, an exception is typically sent back to the calling application. If your database supports exception handling, you can handle it before it travels to the client. But in both cases, throwing an exception is pretty expensive. Exceptions are meant to be… exceptional. If you expect violations of X's uniqueness to be common, this solution is not a good way to address the problem.
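To make this concrete, here is a sketch of what Solution 1 could look like in T-SQL. The constraint name and the TRY/CATCH handling are illustrative, not from the original post; the error numbers are the standard SQL Server duplicate-key codes.

```sql
-- Enforce uniqueness declaratively (constraint name is illustrative)
ALTER TABLE dbo.Foo ADD CONSTRAINT UQ_Foo_X UNIQUE (X);

-- Handle the (expensive) violation server-side before it reaches the client
BEGIN TRY
    INSERT INTO dbo.Foo (X, Payload) VALUES (@X, 'New');
END TRY
BEGIN CATCH
    -- 2601: duplicate key in unique index, 2627: unique constraint violation
    IF ERROR_NUMBER() IN (2601, 2627)
        PRINT 'Value already present - insert skipped';
    ELSE
        THROW;  -- rethrow anything unexpected (SQL Server 2012+)
END CATCH;
```

Note that the exception path is still paid on every collision, which is exactly the cost discussed above.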

Solution 2: Change the Isolation Level to SERIALIZABLE

Another option is to force the isolation level of the transaction to SERIALIZABLE. However, this has dangerous consequences of its own (deadlocks, as we shall see), and running large transactions in SERIALIZABLE mode also limits the scalability of the database.

To illustrate this, try the following demo:

Execute the below transaction on two connections, taking turns between each statement (and inspecting locks as you go along).
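The demo script is missing from this copy of the post. Below is a minimal reconstruction consistent with the results table that follows; the value 42 and the Foo table are carried over from earlier in the post.

```sql
-- Run this on two connections, alternating statement by statement.
-- Vary the isolation level to reproduce the results table below.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- or SERIALIZABLE, SNAPSHOT, ...

BEGIN TRANSACTION;

IF NOT EXISTS (SELECT * FROM dbo.Foo WHERE X = 42)
BEGIN
    -- Pause here on connection A and run the SELECT on connection B
    INSERT INTO dbo.Foo (X, Payload) VALUES (42, 'New');
END;

COMMIT TRANSACTION;
```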

Not quite what you expected perhaps? Here are the results on my SQL Server:

Isolation Level    Effect
READ UNCOMMITTED   Duplicates
READ COMMITTED     Duplicates
REPEATABLE READ    Duplicates
SERIALIZABLE       Deadlock (but consistent)
SNAPSHOT           Duplicates

Solution 3: Force a Lock

It is perhaps not surprising to my regular readers that the manhandling of the database with query hints is my preferred solution to this problem.

This works beautifully:
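The code listing is missing here; this is a reconstruction built from the hints discussed just below (XLOCK, ROWLOCK and HOLDLOCK), with table and column names assumed from the rest of the post.

```sql
BEGIN TRANSACTION;

-- XLOCK:    take an exclusive lock, so a concurrent check on the same key blocks
-- ROWLOCK:  at row granularity, not page or table
-- HOLDLOCK: hold the (range) lock until the end of the transaction
IF NOT EXISTS (SELECT *
               FROM dbo.Foo WITH (XLOCK, ROWLOCK, HOLDLOCK)
               WHERE X = @X)
BEGIN
    INSERT INTO dbo.Foo (X, Payload) VALUES (@X, 'New');
END;

COMMIT TRANSACTION;
```

Two connections running this serialise on the locked key: the second blocks at the SELECT until the first commits, then sees the row and skips the insert.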

This solution has the added advantage of not causing deadlock situations. Unfortunately, it is database specific. In MySQL, the equivalent is SELECT ... FOR UPDATE.

Why am I forcing both an exclusive lock (XLOCK) and row granularity (ROWLOCK) above? Consider the case where the optimiser decides to scan the table instead of accessing the index to find the row (you can provoke this by hinting INDEX = 0 above). The HOLDLOCK hint is also required to make sure the range lock is held for the duration of the transaction. If the index is unique, HOLDLOCK can be omitted (but there is no harm in leaving it in).

Solution 4: Do it in One Statement

Depending on the exact semantics of your database platform, you may be able to do this in a really wizardly way.

Most databases treat a single statement as a self-contained, fully consistent unit of work. This means that if you check for the existence of the row AND do the INSERT only when it is not found, all in a single statement, you won't have to worry about race conditions. In fact, you won't even have to manually create a transaction.

As Brad points out in the comments, there is an issue with this approach when the key is not unique. The range lock is taken too late, which creates a race condition that allows duplicates even in the single-statement case. This means that even if you do it in a single statement, you still have to hint it. I have pondered this quite a bit, and based on Brad's excellent comments I am concluding that this solution (even with the hints) is dangerous. Solution 3 is much cleaner, as it is clear which order the locks are taken in.

This works in SQL Server:
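The listing is missing from this copy; a reconstruction of the single-statement form, including the locking hints that the discussion above concludes are still necessary when the key is not unique (UPDLOCK plus HOLDLOCK, following the pattern in Brad's comment):

```sql
-- Check and insert in one atomic statement; the hints take the range lock
-- early enough that two concurrent executions serialise instead of racing
INSERT INTO dbo.Foo (X, Payload)
SELECT @X, 'New'
WHERE NOT EXISTS (SELECT *
                  FROM dbo.Foo WITH (UPDLOCK, HOLDLOCK)
                  WHERE X = @X);
```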

And for generating and returning a unique ID from a table that contains the last ID generated, you can do this:
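This listing is also missing; below is the common T-SQL compound-assignment idiom that matches the description. The Ids table and its LastId column are assumptions for illustration.

```sql
-- Assumed single-row table holding the last ID handed out:
-- CREATE TABLE dbo.Ids (LastId INT NOT NULL);

DECLARE @NewId INT;

-- Increment and read back in one atomic statement; the UPDATE takes an
-- exclusive lock, so concurrent callers serialise and never get the same ID
UPDATE dbo.Ids
SET @NewId = LastId = LastId + 1;

SELECT @NewId AS NewId;
```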

Summary

In this short blog, I have shown you ways to work around a common race condition. I encourage you to read the comments section, as there are some very important subtleties here that Brad has kindly helped address.

Happy bug slaying.

  6 Comments

  1. David   •  

    Great article Thomas!

    I would have never considered this before but makes perfect sense.

  2. Brad   •  

    Your solutions specified in 2*, 3, 4 are incorrect and will still allow duplicates given higher concurrency despite your observations.

    This will properly block concurrent threads and prevent duplicates:

    INSERT INTO dbo.FOO
    (
        X,
        Payload
    )
    SELECT
        1001,
        'New'
    WHERE
        1001 NOT IN
        (
            SELECT X
            FROM dbo.Foo WITH (ROWLOCK, UPDLOCK, SERIALIZABLE)
        );

    OR

    BEGIN TRANSACTION;
    IF NOT EXISTS (SELECT * FROM dbo.Foo WITH (ROWLOCK, UPDLOCK, SERIALIZABLE) WHERE X = 1001)
    BEGIN
        INSERT INTO dbo.FOO (X, Payload)
        VALUES (1001, 'New')
    END
    COMMIT TRANSACTION;

    2* — the conversion deadlock you’re getting can be avoided by requesting incompatible locks

    • Thomas Kejser   •     Author

      Great spot Brad.

      2: For the SERIALIZABLE setup, I wanted to show the solution without hints. Your comment provides good additional context, thank you.

      3: In my hurry to show the issue, I forgot that the index hint does not force the right lock when the index is not unique (the Range lock is required, but only the page is locked when there is no unique key). I will correct the post and add the extra hint.

      4: Here, I believe you are mistaken. If you check the sequence of lock events (with XEvent or profiler) you will see that the range lock IS taken by the single statement solution.

      • Thomas Kejser   •     Author

        Update Brad: I have been running concurrency tests (8 cores) with the different variants. There is a situation where the single statement variant is causing duplicates. Currently trying to figure out the exact series of locks being taken that is causing this.

        • Brad   •  

          Thomas —

          I believe it can be explained by the following:

          BEGIN TRANSACTION;
          INSERT INTO Foo
          SELECT 1001, 'New'
          WHERE 1001 NOT IN (SELECT X FROM Foo);

          (Using a transaction and not committing so we can see the order that the locks are acquired)

          From Profiler, this is what I get; my comments are in-line:

          — This section pertains to locks acquired for checking the NOT IN clause —
          acquired: PAGE-IS (1:156581) <– Checking the table for the key value 1001 and getting lock
          released: PAGE-IS (1:156581) <– Didn't find key value 1001 so releasing lock on the page
          — This section pertains to locks acquired for the insert —
          acquired: PAGE-IX (1:169) <– This is where the insert begins, obtaining an intent exclusive lock on page
          acquired: RID-X (1:169:0) <– Written to heap
          acquired: KEY-RangeI-N (37755d7e43d2) <– Testing the range for the key value 1001 "(37755d7e43d2)"
          acquired: KEY-X (37755d7e43d2) <– Written to nonclustered index IX

          Note: page numbers and RIDs are likely to differ; the lock hash will be the same (2008R2)

          The critical section here is that when we test FOO for the non-existence of the value 1001, the check is done using a PAGE-IS lock. (We would also see a KEY-S lock if the value had already existed.)

          The thing to notice here is that two threads executing at the same time will both acquire and release PAGE-IS locks, since intent-shared locks are compatible with each other.

          In the case you are observing in your latest test, two (or more) threads will therefore each acquire and release the PAGE-IS lock, since 1001 does not yet exist from either thread's point of view; both will then attempt to insert the same value 1001, and you get a duplicate.

          • Thomas Kejser   •     Author

            This is indeed what I am seeing.

            What I am looking at is why the KEY-RangeI-N lock isn't preventing the final insert from properly synchronising with the other query. Why isn't the range lock used to validate that 1001 is still not there?

            HOLDLOCK/SERIALIZABLE on either the SELECT or the INSERT portion fixes the issue (as per your original comment). HOLDLOCK alone deadlocks; XLOCK/UPDLOCK gives nice queueing, behaving like solutions 2 and 3.

            The behaviour is different when the index is unique (perhaps a great extension of this blog article)
