SQLite transaction isolation

| 1116 words

Introduction

I use SQLite for a couple of hobby projects. Since they are only used by a handful of users, SQLite’s simplicity is a great fit.

During an interesting debugging session on a small monitoring app, I discovered a surprising behavior in the SQLite .NET provider. In my case, it caused a noticeable performance degradation.

The application has a C# backend and a JavaScript frontend that displays a dashboard with various system metrics.

The problem

While loading the dashboard, I noticed that some endpoints were much slower than others:

GET ../system/ram-chart responded 200 in 2.662 ms
GET ../system/disk-chart responded 200 in 154.843 ms
GET ../system/cpu-chart responded 200 in 312.298 ms

Interestingly, if the calls are serialized, the slowness disappears:

GET ../system/cpu-chart responded 200 in 2.876 ms
GET ../system/ram-chart responded 200 in 2.777 ms
GET ../system/disk-chart responded 200 in 2.502 ms

The backend implementation that handles these calls can be seen below. It queries the database, and SQLite handles the aggregation:

[HttpGet("cpu-chart")]
public ActionResult<List<ChartDatasetDto>> GetCpuChart([FromQuery] GetSystemChartRequest r)
{
  var window = r.GetTimeWindow();
  var (start, end) = window.GetBounds(r.CurrentTime?.UtcDateTime);

  var rows = Db.Transaction(ctx => SystemSnapshotRow.GetChart(start,
    end,
    window.Resolution,
    SnapshotMetric.Cpu,
    ctx));

  return Ok(BuildChart(rows, (row => row.Cpu, "CPU (%)")));
}

public static List<SystemSnapshotRow> GetChart(Period start,
    Period end,
    TimeSpan resolution,
    SnapshotMetric[] metrics,
    Db.Context ctx)
  {
    string cols = string.Join(",\n", metrics.Select(m => $"{m.Agg}({m.Col}) AS {m.Col}"));

    string sql = $@"
    SELECT
      strftime('%Y-%m-%d %H:%M',
        (strftime('%s', Period) / @ResSeconds) * @ResSeconds, 'unixepoch') AS Period,
      {cols}
    FROM system_snapshots
    WHERE Period >= @Start AND Period < @End
    GROUP BY (strftime('%s', Period) / @ResSeconds)
    ORDER BY Period";

    return ctx.Connection.Query<SystemSnapshotRow>(sql,
      new
      {
        Start = start,
        End = end,
        ResSeconds = (long)resolution.TotalSeconds
      },
      ctx.Transaction).AsList();
  }

Isolation and concurrency in SQLite

SQLite has only the serializable isolation level. As the documentation states:

There can only be a single writer at a time to an SQLite database. There can be multiple database connections open at the same time, and all of those database connections can write to the database file, but they have to take turns. SQLite uses locks to serialize the writes automatically; this is not something that the applications using SQLite need to worry about.

That would explain the slowness if we were writing data, but these endpoints are read-only. SQLite also supports multiple readers, so that should not be the issue. In addition, the database has WAL mode enabled, so even if another thread is inserting data, readers should still not be blocked.

So what is going on?

The database access code looks like this:

public TData Transaction<TData>(Func<Context, TData> execute)
{
  using var connection = new SqliteConnection(GetConnectionString(_name));
  connection.Open();
  using var transaction = connection.BeginTransaction(IsolationLevel.Serializable);

  try
  {
    var result = execute(new Context(connection, transaction));
    transaction.Commit();
    return result;
  }
  catch
  {
    transaction.Rollback();
    throw;
  }
}

At first glance, there is nothing unusual here, at least if you are used to other database providers. But the SQLite transaction documentation adds an important detail:

SQLite supports multiple simultaneous read transactions coming from separate database connections, possibly in separate threads or processes, but only one simultaneous write transaction. A read transaction is used for reading only. A write transaction allows both reading and writing. A read transaction is started by a SELECT statement, and a write transaction is started by statements like CREATE, DELETE, DROP, INSERT, or UPDATE (collectively “write statements”).

This is interesting: SQLite distinguishes between read and write transactions. That makes this look like a transaction-detection problem. The documentation further explains:

Transactions can be DEFERRED, IMMEDIATE, or EXCLUSIVE. The default transaction behavior is DEFERRED. If the first statement after BEGIN DEFERRED is a SELECT, then a read transaction is started. Subsequent write statements will upgrade the transaction to a write transaction if possible, or return SQLITE_BUSY. If the first statement after BEGIN DEFERRED is a write statement, then a write transaction is started. IMMEDIATE causes the database connection to start a new write immediately, without waiting for a write statement. EXCLUSIVE is similar to IMMEDIATE in that a write transaction is started immediately. EXCLUSIVE and IMMEDIATE are the same in WAL mode

This explains the behavior: if a transaction is started as IMMEDIATE, SQLite treats it as a write transaction even if it never actually writes anything.

That still leaves one question: why is this happening in the .NET code above? To answer that, we need to look at the actual implementation of SqliteConnection.BeginTransaction:

public new virtual SqliteTransaction BeginTransaction(IsolationLevel isolationLevel)
  => BeginTransaction(isolationLevel, deferred: isolationLevel == IsolationLevel.ReadUncommitted);

// This calls this constructor
SqliteTransaction(SqliteConnection connection, IsolationLevel isolationLevel, bool deferred)
{
  if ((isolationLevel == IsolationLevel.ReadUncommitted
       && ((connection.ConnectionOptions!.Cache != SqliteCacheMode.Shared) || !deferred))
       || isolationLevel == IsolationLevel.ReadCommitted
       || isolationLevel == IsolationLevel.RepeatableRead
       || isolationLevel == IsolationLevel.Unspecified)
  {
    isolationLevel = IsolationLevel.Serializable;
  }

  _connection = connection;
  IsolationLevel = isolationLevel;

  if (isolationLevel == IsolationLevel.ReadUncommitted)
  {
    connection.ExecuteNonQuery("PRAGMA read_uncommitted = 1;");
  }
  else if (isolationLevel != IsolationLevel.Serializable)
  {
    throw new ArgumentException(Resources.InvalidIsolationLevel(isolationLevel));
  }

  connection.ExecuteNonQuery(
    IsolationLevel == IsolationLevel.Serializable && !deferred
      ? "BEGIN IMMEDIATE;"
      : "BEGIN;");

  sqlite3_rollback_hook(connection.Handle, RollbackExternal, null);
}

Here we can see the root cause. Passing IsolationLevel.Serializable sets deferred = false, which causes the provider to execute BEGIN IMMEDIATE;. That starts a write transaction, and as a result, each database query waits for the others to finish.

The improvement from serializing requests on the frontend also makes sense in light of the provider’s implementation. The .NET provider uses a retry loop with a 150 ms sleep:

// inside SqliteDataReader.NextResult
while (IsBusy(rc = sqlite3_step(stmt)))
{
  if (_command.CommandTimeout != 0
    && (_totalElapsedTime + timer.Elapsed).TotalMilliseconds >= _command.CommandTimeout * 1000L)
  {
    break;
  }

  sqlite3_reset(stmt);
  Thread.Sleep(150);
}

The solution

The fix is simple: pass the deferred parameter explicitly.

public TData Transaction<TData>(Func<Context, TData> execute, bool deferred = true)
{
  using var connection = new SqliteConnection(GetConnectionString(_name));
  connection.Open();

  using var transaction = connection.BeginTransaction(IsolationLevel.Serializable, deferred);

  // ... same as example above
}

After doing this, the problem is gone:

GET ../system/cpu-chart responded 200 in 2.171 ms
GET ../system/ram-chart responded 200 in 1.399 ms
GET ../system/disk-chart responded 200 in 1.786 ms

And because the requests can now run in parallel, the total time drops as well, even if it is not really noticeable in this small example 😀

Conclusion

I only ran into this on my second SQLite project, which probably suggests that for hobby projects this kind of thing often does not matter much. Even then, I initially suspected something else. My first implementation aggregated the chart data in memory, so I assumed that was the source of the slowness. But after moving the aggregation into the database and seeing no improvement, I started digging deeper.

In my opinion, the issue comes from an unfortunate default in the .NET SQLite provider. A better choice would be to make deferred default to true. On the other hand, if that had been the default, my SQLite knowledge would be even worse, and this post would never have been written.