Resolving Database with State SUSPECT in AlwaysOn AG Secondary Node
If you notice that there is a database in secondary node in your AlwaysOn Availability Group SQL Server that has state "Not Synchronizing/Suspect" then its data movement is in error state and you might notice the huge growth in log file size either in Secondary or Primary. The thing is, you might need to resolve this issue quickly and after that try to troubleshoot what is the cause of the issue. Here is the step on how to resolve this issue if it happens.
If the issue is happening and you see the Primary node is still in synchronized (synchronous commit) or synchronizing state (asynchronous commit) then the problem might lie only in the Secondary node.
1. Steps to be taken in Secondary Instance to try resolving the issue
These are some steps which you can try first to see if it can resolve the issue:
1. Login to your Primary node and then suspend data movement for all of your databases in that Availability Group (Use query so you can do it faster)
2. Go to you Secondary node and then stop the SQL Server Service, use cmd with administrator and "net stop" command to force stop the service
3. After the service has been stopped, start the SQL Server Service again with cmd (administrator) using "net start" command to force start the service
4. Wait for the Secondary Node to start fully, and see the database state, if the Suspect state has been removed then all is good
5. Go to the Primary Node and then resume data movement for all the databases in that Availability Group
6. See your Secondary Node and check to see that all databases have data movement resumed
7. Check you AG Dashboard from Primary Node to see last time sent and last time received, it should start to send live data again
8. You might want to shrink log file afterward to prevent full disk because of log file
2. If the first alternative solution failed, then reinitiate the database
1. If you have done all the steps in the first alternative solution and it's still at Suspect State, then you have to reinitiate the database
2. Remove the database from the availability group and then drop the database
3. Use join only method to join secondary database to availability group to prevent long process in automatic seeding (heavy network load)
4. Backup database from primary and also backup log from primary, and then restore full backup and log backup to secondary with no recovery
5. Rejoin database with join only option to the availability group
6. Monitor the AG Dashboard and see if the database has received data (last sent time and last received time)
Comments
Post a Comment