SQL Server manual change tracking using Transaction Log Backups

Shahar Gvirtz
8 min read · Jan 21, 2025


One of the Facebook groups I follow recently had a post where someone asked how they could track changes made to tables in SQL Server in order to transfer the data to another database. At first glance it seems like a simple scenario, and one I’ve written about before. However, the person asking had a few requirements that were a bit less trivial:

  • The tables contain data that changes, not just data that is added (so simply “crawling” an ID column isn’t enough).
  • The modification time isn’t stored in the table, so you can’t tell that a row was changed by looking at such a column.
  • They can’t use CDC or configure anything further at the database level (triggers and similar solutions were suggested in the comments, but these aren’t feasible in their use-case).

This is a somewhat tricky situation, which led to several proposals, such as copying all the data to another service and performing the joins there, or using a query engine that supports federated queries (like Trino or Redshift) to perform a JOIN between the target DB and the source DB, etc.

It’s very possible that these solutions are the easiest to implement. But while thinking about the problem, I wondered whether it’s possible to implement independent “change tracking”, and what the implications of that would be. From various perspectives, the solution I’ll propose in this post may not be cost-effective to implement (especially if the database is small, in which case it’s easy to copy the data each time or to rerun the joining queries), but it is technically interesting.

So, what’s the solution?

I’m going to rely on (and hope) that the database in question has backups, that it is configured with the Full Recovery Model, and that transaction log backups are being taken. I also hope that the person asking the question will be able to access these backups (although there is an option to do so even without access, and we’ll discuss that later).

To understand the core of the solution, it’s important to know one thing about the transaction log file: thanks to write-ahead logging, every change is written to it before being applied to the data files.
Under the Full recovery model, this means the changes are retained in the transaction log until a transaction log backup is taken. In other words, if we read the content of the transaction log, we can determine what the changes were (and the application can then use that information).

So, the idea will involve the following steps:

  1. We collect all the transaction log backup files.
  2. We run a script on each file that extracts the information about the rows where changes were made.
  3. We extract the IDs (i.e., the primary key values that uniquely identify each row) from the transaction log, and then use those values to run a query against the database to fetch the current values.

How can we do this? This is where several documented (and less documented) functions come in handy, which allow us to access the “behind-the-scenes” part of the database and directly query the storage engine for the content of a page, or read the transaction log content.

Step 1: Let’s read the content of the transaction log

To read the content of the current transaction log, we can use the function fn_dblog. To read the content of a transaction log backup file, we can use the function fn_dump_dblog, which takes many parameters, most of which we don't really care about:

SELECT [Current LSN] as LSN, Operation, Context, [Page ID] as PageId
FROM fn_dump_dblog(NULL, NULL, 'DISK', 1, N'c:\tmp\dates\20250120\l1.bak',DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
WHERE Operation IN ('LOP_INSERT_ROWS', 'LOP_MODIFY_ROW') AND Context = 'LCX_CLUSTERED'

And here are the results:

Results of fn_dump_dblog
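
As an aside, if you have access to the live database, the same filter also works against the active transaction log (before it has been backed up) via fn_dblog, which takes only a start and end LSN. Here’s a minimal sketch using dbatools, mirroring the script shown later (the server and database names are placeholders):

$liveLogQuery = @"
SELECT [Current LSN] as LSN, Operation, Context, [Page ID] as PageId
FROM fn_dblog(NULL, NULL)
WHERE Operation IN ('LOP_INSERT_ROWS', 'LOP_MODIFY_ROW') AND Context = 'LCX_CLUSTERED'
"@
# NULL, NULL means "from the start of the active log to its end"
Invoke-DbaQuery -SqlInstance "MyServer" -Database "SO-2016" -Query $liveLogQuery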

Note that we are focusing on certain types of actions (specifically the ones that interested the original poster — insert and update; you may be interested in other actions as well). Each log record contains the PageId that was affected. The page ID is composed of the file ID, followed by a colon, and then the page ID in hexadecimal. For the next step, we need to convert it to decimal.
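
If you want to do that conversion ad hoc, here’s a minimal PowerShell sketch; the page ID string is illustrative and corresponds to page 581463 in file 1, the page we’ll look at in the next step (the exact zero-padding in the real output may differ):

# "Page ID" is "<file id>:<page id in hex>", e.g. "0001:0008df57" (0x0008df57 = 581463)
$rawPageId = "0001:0008df57"
$parts = $rawPageId.Split(':')
$fileId = $parts[0]                               # "0001" can be passed to DBCC PAGE as-is
$pageIdNum = [Convert]::ToInt32($parts[1], 16)    # hex -> decimal: 581463
"DBCC PAGE ('SO-2016',$fileId,$pageIdNum,3) WITH TABLERESULTS"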

Step 2: Reading the content of a page

To read the content of a page, we use the undocumented DBCC PAGE command, which takes the database name, the file ID, the page ID (in decimal) and a print option (3 breaks each slot out into its individual column values):

DBCC PAGE ('SO-2016',1,581463,3) WITH TABLERESULTS

And here are the results:

Results of executing DBCC PAGE

Note that we can actually see the content of the affected row. Within each page we also see the object_id of the table the page belongs to (a single page only holds data from a single table), which lets us link the information back to the table and from there to its metadata, such as the primary key columns we are going to query by.
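
For completeness, here’s a sketch of how the object_id could be resolved to the table name and its primary key columns via the catalog views; the example script below keeps things simple and hard-codes the PK column name instead ($objectId stands for the value read from the page, and $ServerName/$DatabaseName are the same parameters the script uses):

$pkQuery = @"
SELECT t.name AS TableName, c.name AS PkColumnName
FROM sys.tables t
JOIN sys.indexes i ON i.object_id = t.object_id AND i.is_primary_key = 1
JOIN sys.index_columns ic ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE t.object_id = $objectId
"@
Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $pkQuery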

Step 3: Build a dynamic query and fetch the latest value from the table

Although it’s possible to extract all the values from the transaction log, this would require dealing with some nasty edge cases (for example, overflow pages, or in the case of heaps, dealing with forwarded records). It’s much easier (though less efficient) to extract just the PK value of the row, and then run a query against the database itself to fetch the entire row’s data.

Example Script

To wrap everything together (at least enough to explain the current example), the following PowerShell script automates these steps for a given transaction log backup, assuming the name of the PK column is fixed.

param (
    [Parameter(Mandatory = $true)]
    [string]$LogBackupPath,
    [Parameter(Mandatory = $true)]
    [string]$ServerName,
    [Parameter(Mandatory = $true)]
    [string]$DatabaseName
)

#if not installed, execute: install-module -name dbatools -scope currentuser
Import-Module dbatools -ErrorAction Stop
Set-DbatoolsConfig -FullName sql.connection.trustcert -Value $true -Register
Set-DbatoolsConfig -FullName sql.connection.encrypt -Value $false -Register
$pkColumnName = "Id" #in real world it would be configurable, or dynamically detected
$allTablesQuery = "SELECT name, object_id FROM sys.tables"
$tablesList = Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $allTablesQuery
$transactionLogContentQuery = @"
SELECT [Current LSN] as LSN, Operation, Context, [Page ID] as PageId
FROM fn_dump_dblog(NULL, NULL, 'DISK', 1, N'$LogBackupPath',DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
WHERE Operation IN ('LOP_INSERT_ROWS', 'LOP_MODIFY_ROW') AND Context = 'LCX_CLUSTERED'
"@
$transactionLogContent = Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $transactionLogContentQuery

$processedPages = [System.Collections.Generic.HashSet[string]]::new()
$processedIds = [System.Collections.Generic.HashSet[string]]::new()
foreach ($trow in $transactionLogContent) {
    if ($processedPages.Contains($trow.PageId)) {
        continue
    }

    # The Page ID comes back as "<file id>:<page id in hex>"; DBCC PAGE needs the page id in decimal
    $pageIdSplitted = $trow.PageId.Split(':')
    $fileId = $pageIdSplitted[0]
    $pageIdNum = [Convert]::ToInt32($pageIdSplitted[1], 16)

    $pageInfoQuery = "DBCC PAGE ('$DatabaseName',$fileId,$pageIdNum,3) WITH TABLERESULTS"
    $pageInfo = Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $pageInfoQuery

    # All output rows that belong to the same slot (row) share the same ParentObject value
    $pageGroupedByParent = $pageInfo | Group-Object -Property ParentObject
    $tableObjectId = ($pageInfo | Where-Object { $_.Field -eq "Metadata: ObjectId" }).Value
    $tableName = (($tablesList | Where-Object { $_.object_id -eq $tableObjectId }) | Select-Object -First 1).Name

    foreach ($parentGrp in $pageGroupedByParent) {
        # Only the "Slot N ..." groups contain row data; skip the page header and other sections
        if ($parentGrp.Name.StartsWith("Slot ") -eq $false) {
            continue
        }

        # Collect the column name/value pairs of this slot
        $currentRow = @{}
        foreach ($field in $parentGrp.Group) {
            if ($field.Field -ne "" -and $field.Object -clike "Slot * Column *") {
                $currentRow[$field.Field] = $field.Value
            }
        }

        $pkColumnValue = $currentRow[$pkColumnName]
        if ($null -eq $pkColumnValue) {
            continue
        }

        $rowId = "$tableName-$pkColumnValue"
        if ($processedIds.Contains($rowId) -eq $false) {
            $dbQuery = "SELECT TOP 1 * FROM $tableName WHERE $pkColumnName=$pkColumnValue"
            Write-Host "Finished processing row content, going to fetch the current state from DB with query: $dbQuery"
            $updatedRowData = (Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $dbQuery) | Select-Object -First 1
            Write-Host "This is a row that was affected in this transaction log backup: $($updatedRowData | Select-Object * -ExcludeProperty ItemArray, Table, RowError, RowState, HasErrors | ConvertTo-Json)"
            $processedIds.Add($rowId) | Out-Null
        }
    }

    Write-Host "Done processing file '$fileId' page '$pageIdNum'"
    $processedPages.Add($trow.PageId) | Out-Null
}
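
Assuming the script is saved as, say, Track-LogBackupChanges.ps1 (the file name is mine, not part of the original setup), a run against a single log backup looks like this:

# Hypothetical invocation; the server name is a placeholder, the path and database match the examples above
.\Track-LogBackupChanges.ps1 -LogBackupPath 'c:\tmp\dates\20250120\l1.bak' `
    -ServerName 'MyServer' -DatabaseName 'SO-2016'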

Limitations of the current implementation

There are several drawbacks to this script that anyone considering using this method in production should keep in mind:

  • We are going to process rows that were not actually updated as rows that were updated.
    Data in SQL Server is stored at the granularity of pages, with each page always being 8KB. In the transaction log, we can easily extract which page the change occurred on. We can also relatively easily retrieve the content of that page. What we can’t do as easily is parse and truly understand which row was affected by the change. Since the original poster’s goal was to identify rows with changes in order to transfer them to another DB infrastructure, and since all their alternatives involve massive copying of data to the other DB infrastructure (whether temporarily for querying or permanently with each new transfer), this solution is still cost-effective.
  • There are several other situations where we might identify a page as having been updated, even though nothing changed: for example, if a transaction ran, made changes, and then rolled back. Although it’s possible to complicate the implementation to handle this, since a transaction can span multiple backup files that we don’t process at the same time, full handling of this situation would require maintaining state between runs.
  • We will perform a large number of queries (one for each row that was changed in the file we are currently processing). This problem exists in this naive implementation, but it can easily be solved by batching the queries so that the full rows are retrieved together (see the sketch after this list).
  • We are not handling overflow pages: when using varchar(n), nvarchar(n), etc., if the row’s page runs out of space, SQL Server might decide to allocate a separate page elsewhere in the data file to store the actual data and chain the pages together with pointers. In this case, only a pointer to the overflow page is stored on the row’s page. If the page containing the actual row changes (e.g., if the pointer or some other value is updated), this solution will still work correctly: since we’re fetching the current state of the row with a regular query, we’ll simply get the true values. However, the current code doesn’t handle the situation where only the overflow page was updated without a change to the row itself. In that case, we would ignore the row and not have the actual PK column value.
  • I tested the implementation only with a table that has a clustered index. Adjustments may be needed to work with heaps.
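
Regarding the batching point above, here’s a minimal sketch of what that might look like, assuming we first accumulate the changed PK values per table into a hashtable (called $changedIdsByTable here; building it is not shown) instead of querying row by row:

# $changedIdsByTable is assumed to be a hashtable of table name -> array of changed PK values;
# $pkColumnName, $ServerName and $DatabaseName mirror the variables in the script above
foreach ($tableName in $changedIdsByTable.Keys) {
    $ids = @($changedIdsByTable[$tableName])
    for ($i = 0; $i -lt $ids.Count; $i += 500) {
        # Fetch up to 500 rows per round trip instead of one query per row
        $batch = $ids[$i..([Math]::Min($i + 499, $ids.Count - 1))] -join ','
        $batchQuery = "SELECT * FROM $tableName WHERE $pkColumnName IN ($batch)"
        Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $batchQuery
    }
}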

If someone wants to use this solution in production, they also need to put monitoring in place and handle the failure scenarios that may occur (preferably with automatic recovery). For example, what happens if a backup file is missing? This can be identified by a gap in the LSN chain, and in such cases we might need to perform a full copy of the data to avoid missing updates.
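
One possible way to detect such a gap, assuming the msdb backup history is available on the instance taking the backups, is to verify that each log backup’s first_lsn matches the previous backup’s last_lsn:

# Check the log backup chain recorded in msdb: in an unbroken chain, each log backup's
# first_lsn equals the previous backup's last_lsn. $ServerName/$DatabaseName as in the script above.
$chainQuery = @"
SELECT backup_finish_date, first_lsn, last_lsn
FROM msdb.dbo.backupset
WHERE database_name = '$DatabaseName' AND type = 'L'
ORDER BY backup_finish_date
"@
$logBackups = Invoke-DbaQuery -SqlInstance $ServerName -Database $DatabaseName -Query $chainQuery
for ($i = 1; $i -lt $logBackups.Count; $i++) {
    if ($logBackups[$i].first_lsn -ne $logBackups[$i - 1].last_lsn) {
        Write-Warning "Gap in the log backup chain after $($logBackups[$i - 1].backup_finish_date) - consider a full re-copy"
    }
}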
