
Adding another file to improve Transaction Log Performance

Mike West recently posted a rather active thread on LinkedIn about databases with more than one transaction log file. Eventually, I realised that the full answer to the question “When does it make sense to add another log file to improve transaction log performance?” is complex enough that it deserves a blog post available for future reference. This is my attempt at writing that post.

The Scenario

First of all, let me make clear which scenarios I am speaking about. There are two different ways to “add a log file” to a database:

1) Add another database to the system and split the data in the original database in half.
2) Add another log file to the same database.

Both techniques can be used to improve performance, but the first is much more common than the second. Let me just get the first one out of the way:

Adding another Database

The speed of the log file is limited by the latency of your I/O subsystem. Adding complexity to the I/O path (for example: EMC SRDF for geo-replicated clusters, or VMware and Hyper-V replicated disks) increases that latency – even if you run entirely in DRAM-backed caches.

Because SQL Server flushes the log in blocks of at most 60KB, and because each flush is issued as a single outstanding I/O, your total transaction log throughput is 1 / [latency in seconds] * 60KB. For a reasonably configured system with a log latency of 1ms, this means that the max write throughput of the database is 60MB/sec.
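To make the arithmetic concrete, here is a small worked example you can run in a query window. The latency and flush-size values are assumptions – substitute whatever you measure on your own system (sys.dm_io_virtual_file_stats is a good place to look):

    -- Worked example: the log throughput ceiling implied by write latency.
    -- The values below are assumptions; measure your own system.
    DECLARE @latency_ms float = 1.0;  -- log write latency in milliseconds
    DECLARE @flush_kb   float = 60.0; -- max size of a single log flush

    SELECT (1000.0 / @latency_ms) * @flush_kb / 1000.0 AS max_log_mb_per_sec;
    -- 1000 flushes/sec * 60KB per flush = 60MB/sec at 1ms latency

Halve the latency and the ceiling doubles – which is exactly why complexity in the I/O path matters so much.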

If you need to scale transaction log throughput beyond the limit imposed by your current latency, you need to split the data into multiple databases and hence multiple log files. Some call this method sharding. Note that this makes sense even on the same machine, not just as a trick to scale across multiple machines. The method is particularly attractive in environments where log latency is expected to be high (for example: Azure).
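As a hedged sketch of what this looks like (all database names, file paths and drive letters below are hypothetical), each shard is simply its own database, and therefore gets its own log file – ideally on its own LUN:

    -- Two shard databases, each with its own transaction log on its own LUN.
    -- The two logs can now be written to concurrently and independently.
    CREATE DATABASE OrdersShard1
        ON (NAME = OrdersShard1_data, FILENAME = 'D:\Data\OrdersShard1.mdf')
        LOG ON (NAME = OrdersShard1_log, FILENAME = 'E:\Log1\OrdersShard1.ldf');

    CREATE DATABASE OrdersShard2
        ON (NAME = OrdersShard2_data, FILENAME = 'D:\Data\OrdersShard2.mdf')
        LOG ON (NAME = OrdersShard2_log, FILENAME = 'F:\Log2\OrdersShard2.ldf');

The price you pay is routing: the application must know which shard owns which rows, and transactions that span shards become distributed transactions.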

To illustrate the principle:

[Image: the original database split into multiple shard databases, each writing to its own log file]

Adding another Transaction Log File to the same Database

The point I raised in the LinkedIn discussion is that there are cases where adding a second transaction log file to the same database also makes sense from a performance perspective.

Let me just make it clear: in SQL Server, a single database can only write to one log file at a time (this is not generally true of other database engines). Unlike with data files, there is no concurrency to be gained by adding multiple log files to the same database – you need the sharding trick described above for that. However, this does not mean that there is no performance benefit in some niche cases. These cases are rare, but they do exist.

Consider disk and cache topologies. In a SAN or a disk shelf backed by spindles, when you give the cache time to flush, it will provide consistent throughput the next time you write to it. The same is true for SSD/flash: if you give the drive some time to recover from the “write cliff” (which is not really a cliff at all on a modern drive), it will provide more consistent performance the next time you write to it. Generally, disks of all flavours handle bursty writes better than sustained, consistently high write rates.

Now, assume you have more than one LUN, each backed by some cache (in the case of disk) or available write cycles (in the case of SSD). You can then use the mechanics of the log to cycle between those LUNs – as long as the log files you add are small enough. To illustrate:

[Image: the active portion of the log cycling round-robin over several small log files, each on its own LUN]

This method allows for some extremely consistent performance numbers on the log, but it is a lot of trouble to configure and troubleshoot.
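For the curious, here is a minimal sketch of the configuration (file names, sizes and drive letters are assumptions). The important parts are that each additional log file lives on its own LUN, that the files are small enough for the log to wrap between them frequently, and that autogrowth is disabled so they stay small:

    -- Add small, fixed-size log files on separate LUNs to an existing database.
    -- SQL Server fills the log files in sequence, so as the log wraps around,
    -- the write burst moves from one LUN to the next, giving each cache
    -- (or each SSD) time to recover before its turn comes round again.
    ALTER DATABASE MyDatabase
    ADD LOG FILE (
        NAME = MyDatabase_log2,
        FILENAME = 'F:\Log2\MyDatabase_log2.ldf',
        SIZE = 512MB,
        FILEGROWTH = 0);

    ALTER DATABASE MyDatabase
    ADD LOG FILE (
        NAME = MyDatabase_log3,
        FILENAME = 'G:\Log3\MyDatabase_log3.ldf',
        SIZE = 512MB,
        FILEGROWTH = 0);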

Summary

In this blog post, I have described the only scenarios I know of where adding more than one log file makes sense from a performance perspective.

The second scenario – adding more than one log file to a single database for performance – is very rare, but I know of at least two systems that have, or have had, this configuration in production. It was only made possible by some masterful tuning by the DBA. I have also used this trick to get consistent benchmark results.

In the vast majority of cases, a single log file is perfectly fine for performance. But now you have a new trick up your sleeve, and I hope I have satisfied your curiosity too.