DR Drill with Log shipping
by Rachamallu Jayaprakash Reddy, SQL Server DBA, Rackspace Technology
Introduction
One common task for every DBA is to make sure that all mission-critical SQL Server instances and the databases within them are available around the clock to keep the business up and running by minimizing or no business disruptions.
When your primary location experiences a geographical disaster like an earthquake, flood, or fire the business must be prepared by recovering or resuming the services from a geographically different location.
In SQL Server databases, Log shipping is one of the oldest methods of providing disaster recovery which is implemented in many organizations where other options may be challenging due to environment, administrative skills, or budget.
Why do we need a DR Drill? In order to avoid business downtime we always have to maintain a proper DR mechanism and it's mandatory to conduct a DR test to verify the application connectivity and the data synchronization state once we restore or recover the database on the DR site. It is a best practice to conduct a DR drill every six months, although it entirely depends on the client's requirements.
Let’s get into the detailed steps of the DR drill setup:
I have already configured the Log shipping and below are the configuration details :
- Primary Database Name : LS_P
- LS Primary Server : Node1
- LS Secondary/DR Server : Node2
- Secondary Database: LS_DR
Please follow the checklist for a smooth DR exercise and get the steps prepared for the failover and failback.
Checklist
- Coordinate with the client for the maintenance window to perform the DR drill.
- Engage the application team for the connectivity and the data validation during the DR test.
- Confirm with the customer if they want to copy any application-related SQL agent jobs on DR server.
- Copy all the Logins using `revlogin` script.
Steps for Failover
1. Take the confirmation from customer to stop their applications from connecting to the LS primary server.
2. Once the customer confirms for go-head, Run Manually LS Backup job on Primary Server and LS Copy, LS Restore on DR server.
3. Upon completion of the above jobs, Check the LS report and make sure Last backup, Last copy, and last restore are having the same backup file name.
4. On confirming the backup files are same on step#3, Disable all the LS Jobs mentioned on the Primary and DR server.
5. Take the Tail log backup on the primary server and this will leave the databases into Restoring state on the primary server.
6. Copy the Tail log backups from primary to DR and Restore with recovery on the DR server, this will bring the database online state.
7. Configure Reverse Log shipping from DR to the Primary server.
- From DR server, now configure the Log shipping and during the configuration on secondary database settings, select ‘No. The secondary database is initialized’ Instead of creating the secondary database from scratch and choose the secondary database from the drop-down.
8. Validate the Reverse Log shipping health by manually running the newly created LS backup jobs on the DR server and LS copy and restore jobs on the Primary server.
9. Check the LS report and make sure Last backup, Last copy, and last restore are having the same backup file name.
10. Copy any application-related Jobs, logins from the primary server to DR and fix any orphan users.
11. Confirm to the customer that Failover has been completed and they are good for application testing.
Steps for Failback
1. Once the customer confirmed that DR test has been completed, proceed with the below steps to failback.
2. Run Manually LS Backup job on DR and LS Copy, LS Restore on Primary server and Check the LS report to make sure Last backup, Last copy, and last restore are having the same backup file name.
3. On confirming the backup files are the same on step#2, Disable the LS backup Job on DR and LS copy and restore Jobs on the Primary server.
4. Take the Tail log backup on the DR server and this will leave the databases in Restoring state on the DR server.
NOTE: Clear the tail log backups on the folder which were copied during the failover.
5. Copy the Tail log backups from DR to Primary and Restore the tail log with recovery on the Primary server.
NOTE: Before copying clear all the tail log backups on the local path of the Primary server which were taken during the failover.
6. Enable and run manually the old LS Backup job on the Primary server and old LS copy and restore jobs on the DR server.
7. Validate the health check to make sure Log shipping is working fine.
Well, we can see the status of the Log shipping is Good on both Primary and secondary.
8. Confirm to the customer that Failback has been completed and do a final test with the application connectivity to the primary server.
Conclusion:
I hope you find this discussion useful when you are planning the DR drill with Log shipping.
Recent Posts
Google Cloud Hybrid Networking Patterns — Part 1
October 17th, 2024
Google Cloud Hybrid Networking Patterns — Part 3
October 17th, 2024
Google Cloud Hybrid Networking Patterns — Part 2
October 17th, 2024
How Rackspace Leverages AWS Systems Manager
October 9th, 2024
Windows Server preventing time sync with Rackspace NTP
October 7th, 2024
Related Resources