Paul Randal recently ran a very interesting test on a simple INSERT workload. His results were promising and provide both good conclusions and more suggestions for research.
You should read Paul's blog post first; without it, this post will not make much sense.
I decided to bring Paul's test onto a DL980 box with 8 x 320 GB FusionIO cards I have access to. After a few email exchanges with Paul, I managed to run his test harness, which is essentially 64 concurrent SQLCMD sessions, each running this batch:
DECLARE @counter BIGINT;
DECLARE @start DATETIME;
DECLARE @end DATETIME;
SELECT @counter = 0;
SELECT @start = GETDATE ();
WHILE (@counter < 2500000)
BEGIN
    IF @counter % 1000 = 0 BEGIN TRAN;
    INSERT INTO MyBigTable DEFAULT VALUES;
    SELECT @counter = @counter + 1;
    IF @counter % 1000 = 999 COMMIT TRAN;
END
SELECT @end = GETDATE ();
INSERT INTO Results VALUES (CONVERT (INTEGER, DATEDIFF (second, @start, @end)));
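The harness assumes two tables already exist; their exact definitions are in Paul's post. A minimal sketch that would satisfy the batch above (the column names here are my invention, not Paul's schema):

```sql
-- Hypothetical minimal schemas; see Paul's post for the real definitions.
-- MyBigTable must accept INSERT ... DEFAULT VALUES, so every column
-- needs either an IDENTITY property or a default.
CREATE TABLE MyBigTable (
    c1 BIGINT IDENTITY PRIMARY KEY,
    c2 CHAR (100) DEFAULT 'x'    -- filler to give the row some width
);

-- Each of the 64 sessions inserts its elapsed seconds here at the end.
CREATE TABLE Results (
    RunTimeSeconds INTEGER
);
```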
Unfortunately, my current DL980 has the FusionIO drives set up for maximum capacity and low write speed (and there are files on the drives I cannot move), so the results will not yet be comparable to Paul's. Take these early runs with a grain of salt and do not draw any conclusions from them. Some early observations:
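The wait percentages quoted in the observations below come from the server's cumulative wait statistics. A minimal sketch of the kind of query I use to read them (the filtering and percentage math are my choice, not part of Paul's harness):

```sql
-- Share of total wait time per wait type, highest first.
SELECT wait_type,
       wait_time_ms,
       100.0 * wait_time_ms / SUM (wait_time_ms) OVER () AS pct
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
ORDER BY wait_time_ms DESC;
```

Remember these counters are cumulative since the last restart (or since the stats were cleared), so on a busy box you want the delta around the run, not the raw totals.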
- I can drive the CPU to 100% on my 64-core box.
- Throughput is not steady; the CPUs fluctuate heavily.
- I am running with 8 data files.
- The largest wait is WRITELOG (about 70%); the rest are PAGELATCH waits.
- There are big differences between thread completion times: the threads on the first two NUMA nodes complete much faster than the others. Runtimes fluctuate between 150 and 600 seconds (but still nearly four times faster than Paul's run).
- The WRITELOG wait is not surprising: Paul's run was done on write-optimized drives, and I have sacrificed write optimization for additional space (I hope to get some runs in on write-optimized drives later this weekend). It might be surprising, though, that I hit 100% CPU while still stalled on this wait. Whenever CPU is high relative to throughput, it is generally a good idea to dig into spinlocks:
SELECT * FROM sys.dm_os_spinlock_stats
WHERE backoffs > 0
ORDER BY spins DESC
Aha! This gives us some insights:
We are spinning on log cache structures and burning CPU on them; again, not surprising, and some of this will go away as we boost the transaction log speed. We are trying to squeeze far more through the log than it will allow, and there is no one in the runnable queue behind us.
But what are those other spins? OPT_IDX_STATS and SOS_OBJECT_STORE?
Well, OPT_IDX_STATS is the optimizer trying to update the DMVs sys.dm_db_index_usage_stats and sys.dm_db_missing_index_group_stats. You can read about this in KB 2003031. While the KB is about databases with many thousands of rows in those DMVs, you can see why this problem appears for a single table under heavy insert activity: a quick calculation (160M rows in 600 seconds) shows we are hitting at least 270,000 inserts/sec on this run, with peaks up to four times that (nice, by the way). It is not strange that there is some contention on that data structure.
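Per the KB, the pain grows with the size of those DMVs. A quick sketch, of my own making, to see how many rows they hold on your instance:

```sql
-- Row counts of the index usage and missing index DMVs; large counts
-- combined with a heavy insert rate make OPT_IDX_STATS contention more likely.
SELECT (SELECT COUNT (*) FROM sys.dm_db_index_usage_stats)        AS index_usage_rows,
       (SELECT COUNT (*) FROM sys.dm_db_missing_index_group_stats) AS missing_index_rows;
```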
The knowledge base article hints at a solution: trace flag 2330. It has to be set at startup, not with DBCC TRACEON. Running with this trace flag got rid of the wait, and it looks like I got a minor throughput increase (hard to measure under the big fluctuations).
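The flag goes into the instance's startup parameters (for example, adding -T2330 via SQL Server Configuration Manager); after the restart you can confirm it is active with DBCC TRACESTATUS:

```sql
-- Check whether trace flag 2330 is enabled globally;
-- Global = 1 in the output means it was set at startup.
DBCC TRACESTATUS (2330, -1);
```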
My next step is to diagnose the SOS_OBJECT_STORE, more to follow…
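In the meantime, since sys.dm_os_spinlock_stats is cumulative, one way to attribute spins to a single run rather than to everything since the last restart is to snapshot it before and after and diff. A sketch (the temp-table approach is my own, not part of Paul's harness):

```sql
-- Snapshot the spinlock counters before the run.
SELECT name, collisions, spins, backoffs
INTO #before
FROM sys.dm_os_spinlock_stats;

-- ... run the insert workload here ...

-- Delta per spinlock type, biggest spinners first.
SELECT s.name,
       s.spins    - b.spins    AS delta_spins,
       s.backoffs - b.backoffs AS delta_backoffs
FROM sys.dm_os_spinlock_stats AS s
JOIN #before AS b ON b.name = s.name
ORDER BY delta_spins DESC;
```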
(By the way, I played around with naming the columns in the INSERT statement instead of using DEFAULT VALUES. This turned out to be a very bad idea: I got massive amounts of SQL_MGR spins. You can read about why I got that here.)
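For reference, the variation that performed badly looked roughly like this (the column name is hypothetical, and this is my reconstruction of the statement, not a copy of it):

```sql
-- Naming the column explicitly instead of using DEFAULT VALUES;
-- in my runs this produced massive SQL_MGR spinlock activity.
INSERT INTO MyBigTable (c2) VALUES (DEFAULT);
```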
Category: Grade of the Steel