Measuring Transactional Replication Latency Without Tracer Tokens
SQL Server 2005 introduced Tracer Tokens (SQL 2005 | SQL 2008), a new methodology for programmatically measuring replication latency in transactional replication. To measure latency with a tracer token, you simply insert a tracer token at the publisher. The replication process will trace the token as it moves through the steps of the process and report back how long it took for the token to reach the distributor and the subscriber.
Sounds great, right? Do you see the problem I see?
What’s wrong with tracer tokens?
There is an inherent problem with tracer tokens. In order to get the results back, replication has to be working well, or you have to wait a long time. For example, if the replication subscriber is 4 hours behind guess how long it will take for the tracer token to get from the distributor to the subscriber.
You guessed it: 4 hours!! If replication is latent, and you need to determine by how much, can you really afford to wait for 4 hours to find out? So tracer tokens aren’t really very effective when you need them most.
It’s All in the Distributor
There is no great alternative for tracking latency from Publisher to Distributor because you’d have to be able to read the transaction log to determine if there are transactions that have not been read by the log reader. You can calculate replication latency from Distributor to all Subscribers by querying the tracking tables in the distribution database.
The simplest way to check latency is simply to let the replication monitor procedures do the work for you and then pull the data from the replication monitor tables. To be sure that we’re getting current data, we must first refresh the monitor data by executing sp_replmonitorrefreshjob.
Just modify the query below to set your publisher server and publisher database and execute the query in the distribution database.
Declare @Publisher sysname, @PublisherDB sysname; -- Set Publisher server name Set @Publisher = 'Publisher server name'; -- Set Publisher database name Set @PublisherDB = 'Publisher database name'; -- Refresh replication monitor data Exec sys.sp_replmonitorrefreshjob @iterations = 1; With MaxXact (ServerName, PublisherDBID, XactSeqNo) As (Select S.name, DA.publisher_database_id, max(H.xact_seqno) From dbo.MSdistribution_history H with(nolock) Inner Join dbo.MSdistribution_agents DA with(nolock) On DA.id = H.agent_id Inner Join master.sys.servers S with(nolock) On S.server_id = DA.subscriber_id Where DA.publisher_db = @PublisherDB Group By S.name, DA.publisher_database_id) , OldestXact (ServerName, OldestEntryTime) As (Select MX.ServerName, Min(entry_time) From dbo.msrepl_transactions T with(nolock) Inner Join MaxXact MX On MX.XactSeqNo < T.xact_seqno And MX.PublisherDBID = T.publisher_database_id Group By MX.ServerName) Select [Replication Status] = Case MD.status When 1 Then 'Started' When 2 Then 'Succeeded' When 3 Then 'In progress' When 4 Then 'Idle' When 5 Then 'Retrying' When 6 Then 'Failed' End, Subscriber = SubString(MD.agent_name, Len(MD.publisher) + Len(MD.publisher_db) + Len(MD.publication) + 4, Charindex('-', MD.agent_name, Len(MD.publisher) + Len(MD.publisher_db) + Len(MD.publication) + 5) - (Len(MD.publisher) + Len(MD.publisher_db) + Len(MD.publication) + 4)), [Subscriber DB] = A.subscriber_db, [Publisher DB] = MD.publisher_db, Publisher = MD.publisher, [Current Latency (sec)] = MD.cur_latency, [Current Latency (hh:mm:ss)] = Right('00' + Cast(MD.cur_latency/3600 As varchar), 2) + ':' + Right('00' + Cast((MD.cur_latency%3600)/60 As varchar), 2) + ':' + Right('00' + Cast(MD.cur_latency%60 As varchar), 2), [Latency Threshold (min)] = Cast(T.value As Int), [Agent Last Stopped (sec)] = DateDiff(hour, agentstoptime, getdate()) - 1, [Agent Last Sync] = MD.last_distsync, [Last Entry TimeStamp] = OX.OldestEntryTime From dbo.MSreplication_monitordata MD with(nolock) Inner Join dbo.MSdistribution_agents A with(nolock) On A.id = MD.agent_id Inner Join dbo.MSpublicationthresholds T with(nolock) On T.publication_id = MD.publication_id And T.metric_id = 2 -- Latency Inner Join OldestXact OX On OX.ServerName = SubString(MD.agent_name, Len(MD.publisher) + Len(MD.publisher_db) + Len(MD.publication) + 4, Charindex('-', MD.agent_name, Len(MD.publisher) + Len(MD.publisher_db) + Len(MD.publication) + 5) - (Len(MD.publisher) + Len(MD.publisher_db) + Len(MD.publication) + 4)) Where MD.publisher = @Publisher And MD.publisher_db = @PublisherDB And MD.publication_type = 0 -- 0 = Transactional publication And MD.agent_type = 3; -- 3 = distribution agent