Upgrading SQL Server using Availability Groups – Checklist - JOYK Joy of Geek, Geek News, Link all geek

This is a checklist put together and followed for an upgrade of a pair of physical SQL Server 2012 servers which hosted a single availability group of several terabytes of data with minimal downtime.

The availability group was configured with synchronous commit and automatic failover.

List of Steps

If possible, stop any ETL processes that put data into the affected databases – less traffic here means less traffic to catch up with when you resume the AG HADR.
Make sure all AGs hosted by the servers are on the same server
Put servers, clusters, listeners, everything into Maintenance Mode in SCOM / SQLMonitor / Solarwinds / whatever other monitoring system you use.
Disable Transaction Log backup jobs on all servers hosting the Availability Group(s)
Take Log backups
Configure the Availability Group(s) to manual failover
Suspend data movement for each database on the secondary server
Backup the system databases on the secondary server
Upgrade the secondary server and reboot if required (it might not be required, but it is recommended to do this anyway)
Resume data movement on the secondary, and allow the servers to synchronise.
Failover the availability group(s) to the upgraded server with no data loss
Check that data movement is suspended to the new secondary / unupgraded server due to version incompatibility
Upgrade other server and then reboot
Resume data movement on secondary, and allow to synchronise
(optional) failover AG with no data loss to test all is OK (this isn’t really required, but can be useful for extra peace of mind)
Reconfigure availability group(s) back to automatic failover (if that’s how it was)
Re-enable transaction log backup jobs
Take servers out of maintenance mode in SCOM/SQLMonitor/Solarwinds/Whatever

A couple of useful SQL queries

Suspend / Resume HADR for all databases on the server

This is the suspend version – change suspend to resume and rerun later

DECLARE @sql VARCHAR(MAX) = '';

SELECT @sql = @sql + 'ALTER DATABASE ' + QUOTENAME(DB_NAME(hars.database_id)) + ' SET HADR SUSPEND;' + CHAR(10) + CHAR(13)

FROM sys.dm_hadr_database_replica_states AS hars

INNER JOIN sys.dm_hadr_availability_group_states AS hags

ON hags.group_id = hars.group_id

WHERE hars.is_local = 1

ORDER BY DB_NAME(hars.database_id);

PRINT @sql; -- optional

EXEC (@sql);

Check health of replicas in the AG (run on primary)

SELECT ag.name,

ar.replica_server_name,

hars.is_local,

hars.role_desc,

hars.operational_state_desc,

hars.connected_state_desc,

hars.recovery_health_desc,

hars.synchronization_health_desc,

hars.last_connect_error_number,

hars.last_connect_error_description,

hars.last_connect_error_timestamp,

ag.failure_condition_level,

ag.health_check_timeout,

ar.replica_server_name,

ar.owner_sid,

ar.endpoint_url,

ar.availability_mode_desc,

ar.failover_mode_desc,

ar.session_timeout,

ar.primary_role_allow_connections_desc,

ar.secondary_role_allow_connections_desc,

ar.backup_priority,

ag.automated_backup_preference_desc,

ar.read_only_routing_url

FROM sys.dm_hadr_availability_replica_states AS hars

INNER JOIN sys.availability_groups AS ag

ON ag.group_id = hars.group_id

INNER JOIN sys.availability_replicas AS ar

ON ar.group_id = ag.group_id

AND ar.replica_id = hars.replica_id;

Monitor progress of catchup/synchronisation (run on primary)

SELECT ag.name,

adc.database_name,

ar.replica_server_name,

hdrs.is_local,

hdrs.synchronization_state,

hdrs.synchronization_state_desc,

hdrs.is_commit_participant,

hdrs.synchronization_health_desc,

hdrs.database_state_desc,

hdrs.is_suspended,

hdrs.suspend_reason,

hdrs.suspend_reason_desc,

hdrs.last_sent_time,

hdrs.last_received_time,

hdrs.last_hardened_time,

hdrs.last_redone_time,

hdrs.log_send_queue_size,

hdrs.log_send_rate,

hdrs.redo_queue_size,

hdrs.redo_rate,

hdrs.filestream_send_rate,

CEILING(hdrs.redo_queue_size / hdrs.redo_rate) / 60.0 AS est_redo_completion_minutes, -- approximately

hdrs.last_commit_time,

hdrs.low_water_mark_for_ghosts,

hdrs.secondary_lag_seconds -- for SQL 2016 and above

FROM sys.dm_hadr_database_replica_states AS hdrs

INNER JOIN sys.availability_groups AS ag

ON ag.group_id = hdrs.group_id

INNER JOIN sys.availability_databases_cluster AS adc

ON adc.group_database_id = hdrs.group_database_id

INNER JOIN sys.availability_replicas AS ar

ON ar.replica_id = hdrs.replica_id

WHERE hdrs.log_send_queue_size <> 0

OR hdrs.redo_queue_size <> 0;

Upgrading SQL Server using Availability Groups – Checklist

List of Steps

A couple of useful SQL queries

Suspend / Resume HADR for all databases on the server

Check health of replicas in the AG (run on primary)

Monitor progress of catchup/synchronisation (run on primary)

Recommend

Departure of a SQL Server Administrator – What to Check?

Login Script Generation For Migration or DR

Nested Virtual / Remote Desktops and CTRL+ALT+DELETE

Waiting for a job to complete

Testing SQL Agent Job Step Output During The Job

CmdExec vs Powershell – error? What error?

随想录（gcc生成的中间语言、汇编代码）

多媒体播放（windows音乐播放）

随想录（easyx中的键盘输入和鼠标消息）

随想录（MFC开发有感）

About Joyk