
Secret Scale Sauce: Heavy Singleton INSERT spinlocks

Paul Randal recently ran a very interesting test on a simple INSERT workload. His results were promising and provide both solid conclusions and suggestions for further research.

You should read Paul’s blog post first; without it, this post will not make much sense.

I decided to bring Paul’s test onto a DL980 box I have access to, equipped with 8 x 320 GB FusionIO cards. After a few email exchanges with Paul, I got his test harness running; it is essentially 64 concurrent SQLCMD sessions, each running this batch:

 

DECLARE @counter BIGINT;
DECLARE @start   DATETIME;
DECLARE @end     DATETIME;

SELECT @counter = 0;
SELECT @start = GETDATE ();

WHILE (@counter < 2500000)
BEGIN
    IF @counter % 1000 = 0 BEGIN TRAN

    INSERT INTO MyBigTable DEFAULT VALUES;
    SELECT @counter = @counter + 1;

    IF @counter % 1000 = 999 COMMIT TRAN
END;

SELECT @end = GETDATE ();

INSERT INTO Results VALUES (CONVERT (INTEGER, DATEDIFF (second, @start, @end)));
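
For completeness: the batch assumes a MyBigTable where every column has a default (otherwise DEFAULT VALUES will not work) and a Results table that takes a single integer. Paul’s post has the exact definitions he used; the sketch below is just an illustrative stand-in with made-up column names, enough to let the batch run.

-- Illustrative setup only; see Paul's post for the table he actually benchmarked.
CREATE TABLE MyBigTable (
    c1 UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID (),   -- every column needs a default
    c2 DATETIME         NOT NULL DEFAULT GETDATE (),
    c3 CHAR(100)        NOT NULL DEFAULT 'a'
);

-- One runtime (in seconds) per SQLCMD session
CREATE TABLE Results (RunTimeSeconds INT NOT NULL);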

Unfortunately, my current DL980 has the FusionIO drives set up for maximum capacity and low write speeds (and there are files on the drives I cannot move), so the results are not yet comparable to Paul’s. Take these early runs with a grain of salt and do not draw any conclusions from them.

Observations:

  • I can drive the CPU to 100% on my 64 core box.
  • Throughput is not steady; the CPUs are all over the place (big fluctuations).
  • I am running on 8 files
  • The largest wait is WRITELOG (about 70%); the rest are PAGELATCH.
  • There are big differences between thread completion times. The threads on the first two NUMA nodes complete much faster than the others. Runtimes fluctuate between 150 and 600 seconds (but are still nearly four times faster than Paul’s run).

The WRITELOG wait is not surprising: Paul’s run was done on write-optimized drives, and I have sacrificed write optimization for additional space (I hope to get some runs in on write-optimized drives later this weekend). But it might be surprising that I hit 100% CPU while still struggling with this wait. Whenever CPU is high compared to throughput, it is generally a good idea to dig into spinlocks:

SELECT * FROM sys.dm_os_spinlock_stats
WHERE backoffs > 0
ORDER BY spins DESC
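
One note on the DMV: the counters in sys.dm_os_spinlock_stats are cumulative since instance startup, so when comparing runs it is usually more telling to snapshot before the test and look at the delta afterwards. A rough sketch (the temp table name is mine):

-- Snapshot the cumulative counters before the run
SELECT name, collisions, spins, backoffs
INTO   #spinlock_before
FROM   sys.dm_os_spinlock_stats;

-- ... run the INSERT workload ...

-- Delta = spinlock activity caused by the run itself
SELECT s.name,
       s.spins    - b.spins    AS spins_delta,
       s.backoffs - b.backoffs AS backoffs_delta
FROM   sys.dm_os_spinlock_stats AS s
JOIN   #spinlock_before AS b ON b.name = s.name
WHERE  s.spins - b.spins > 0
ORDER  BY spins_delta DESC;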

Aha! This gives us some insights:

[Image: sys.dm_os_spinlock_stats output, showing heavy spins on the log cache spinlocks plus OPT_IDX_STATS and SOS_OBJECT_STORE]

We are spinning on log cache structures and burning CPU on it; again, not surprising, and some of this will go away as we boost transaction log speed. We are trying to squeeze far more through the log than it will allow, and there is no one in the runnable queue behind us.

But what are those other spins? OPT_IDX_STATS and SOS_OBJECT_STORE?

Well, OPT_IDX_STATS is the optimizer trying to update the DMVs sys.dm_db_index_usage_stats and sys.dm_db_missing_index_group_stats. You can read about this in KB 2003031. The KB is about databases with many thousands of rows in those DMVs, but you can see why the problem also appears for a single table with heavy insert activity: a quick calculation (64 threads x 2.5M rows = 160M rows in roughly 600 seconds) shows we are averaging roughly 270,000 inserts/sec on this run, with peaks up to four times that (nice, by the way). It is not strange that there is some contention on that data structure.
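
If you want to watch the structure that OPT_IDX_STATS protects being hammered, you can poll the usage stats for the table while the test runs; something along these lines (user_updates counts inserts, updates and deletes against the table):

-- Watch the insert counter for the test table climb during the run
SELECT OBJECT_NAME (object_id) AS table_name,
       index_id,
       user_updates,
       last_user_update
FROM   sys.dm_db_index_usage_stats
WHERE  database_id = DB_ID ()
  AND  object_id   = OBJECT_ID ('MyBigTable');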

The knowledge base article hints at a solution: trace flag 2330 (TF2330). This has to be set at startup, not with DBCC TRACEON. Running with this trace flag got rid of the spins, and it looks like I got a minor throughput increase (hard to measure under the big fluctuations).
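
For reference: a startup trace flag is added as a -T parameter on the SQL Server service (for example via SQL Server Configuration Manager), followed by an instance restart; you can then verify it is active:

-- Verify the trace flag is enabled globally after the restart
DBCC TRACESTATUS (2330, -1);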

My next step is to diagnose the SOS_OBJECT_STORE, more to follow…

(By the way, I played around with naming the columns in the INSERT statement instead of using DEFAULT VALUES. This turned out to be a very bad idea: I got massive amounts of SQL_MGR spins; you can read about why I got that here.)
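
To make that parenthetical concrete: a column-list variant of the insert would look roughly like the sketch below (column names are the illustrative ones from the setup sketch above, not the real ones):

-- Illustrative only: explicit column list and values instead of DEFAULT VALUES
INSERT INTO MyBigTable (c1, c2, c3)
VALUES (NEWID (), GETDATE (), 'a');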

Category: Grade of the Steel

  6 Comments

  1. Mark

    There is a slight buglet in your loop control (which isn’t present in Paul’s).
    Your code will use an explicit transaction to insert 999 rows, then an auto-commit transaction to insert one row, then another explicit transaction for the next 999 rows. Like this:

    998 INSERT INTO MyBigTable DEFAULT VALUES;
    999 COMMIT TRAN
    999 INSERT INTO MyBigTable DEFAULT VALUES;
    1000 BEGIN TRAN
    1000 INSERT INTO MyBigTable DEFAULT VALUES;

    I’m pretty sure this will double your log IO/sec.

    • Thomas Kejser

      You are right, Mark; there is an off-by-one error here. Thanks for pointing it out. This actually means I am committing too often, which means I should be able to run faster.

      I will rerun the test once I get a chance to play with a big box again 🙂 I will leave the bug in the blog for now, since the results are only valid under that code.

  2. Pingback: Benchmarking: Multiple data files on SSDs (plus the latest Fusion-io driver) | Paul S. Randal

  3. Pingback: Thread Synchronization in SQL Server « Thomas Kejser's Database Blog

  4. Pingback: Benchmarking I/O: The test rig | David Curlewis

  5. Pingback: Diagnosing and fixing SOS_OBJECT_STORE spins for Singleton INSERTS « Thomas Kejser's Database Blog
