How to Add Error Handling to an Unexpected Shutdown of the Remote Database Service
For my current project, I have a SQL Server database on a remote server. The database is accessed through a stored procedure, over a connection made via ADO.NET Data Services (formerly “Astoria”). All of this runs in a web farm environment, so many instances of my application are running simultaneously. If any one instance crashes unexpectedly, every application and user is disconnected from the database until I intervene manually, either by restarting services on the remote machine or by rebooting it completely. The same thing happens in two other cases: first, when the Windows service that handles requests on port 1433 goes down; second, when someone who doesn’t have administrator access reboots the machine. So I need error handling code that, when an instance of my web application crashes, stops it gracefully and closes all of its connections within 30 seconds.
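As a rough illustration of the 30-second requirement, here is a minimal sketch (not my production code) of a registry that tracks open connections and closes them all against a deadline. The ConnectionRegistry class and its method names are made up for this example. It accepts any IDisposable, which includes SqlConnection, and it assumes .NET 4 or later for the Task Parallel Library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: tracks open connections so they can all be closed
// together when the application is stopping.
public sealed class ConnectionRegistry
{
    private readonly object _gate = new object();
    private readonly List<IDisposable> _open = new List<IDisposable>();

    // Call this right after opening a connection.
    public void Register(IDisposable connection)
    {
        lock (_gate) { _open.Add(connection); }
    }

    // Disposes every tracked connection in parallel and waits up to 'deadline'.
    // Returns true if every close attempt finished in time, false otherwise.
    public bool CloseAll(TimeSpan deadline)
    {
        IDisposable[] snapshot;
        lock (_gate)
        {
            snapshot = _open.ToArray();
            _open.Clear();
        }

        var tasks = new Task[snapshot.Length];
        for (int i = 0; i < snapshot.Length; i++)
        {
            IDisposable connection = snapshot[i]; // fresh local for the closure
            tasks[i] = Task.Factory.StartNew(() =>
            {
                try { connection.Dispose(); }
                catch (Exception) { /* a dead link may throw on close; ignored here */ }
            });
        }

        // WaitAll returns false if the deadline passes before all tasks finish.
        return Task.WaitAll(tasks, deadline);
    }
}
```

A stop handler could then call registry.CloseAll(TimeSpan.FromSeconds(30)) and log a warning when it returns false. Two limits are worth keeping in mind: the timeout only stops the wait, it does not cancel a close call that is hanging, and none of this runs if the process is killed outright.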
The Application:
The .NET solution I’m working on is a Windows service written in C# using Visual Studio 2010 Ultimate Edition. It runs as Local System on the remote SQL Server computer. A service like this may sound less likely to crash than your average user-mode client program, but any software can contain bugs or hit unexpected errors that bring down the process or the entire machine. The amount of time you have to recover from an ungraceful shutdown varies from application to application. If your software is just querying data, it may only be a matter of seconds before you get the “connection failed” warning box. However, if you are working with large amounts of data in memory (as I am) or updating records via transactions, there will probably be much more time for the error handling code to be called. Luckily, in my scenario an instance crash would not cause any permanent damage, because all changes are stored in the database’s transaction logs and replicated across the other database instances that are still running (this is also where I currently store all my state information).
So how do you handle Unexpected Shutdowns?
- When considering how to implement failure handling, your first thought might be to use your language’s exception handling support. In my case, though, I’m writing a Windows service that runs as Local System and cannot write any files. Other than the error logs the system typically writes when an ungraceful shutdown occurs, there isn’t much opportunity to write extra logging information during an unexpected crash. Instead, you can use Windows Management Instrumentation (WMI) events, which let you catch events related to process termination.
- You can also see what happened just before your application or one of its components stopped working by inspecting the error code in Process Explorer.
- To get started with this approach, I followed Microsoft’s guidelines on monitoring for these errors in PowerShell scripts. I then set up event tracing through the System.Diagnostics.TraceSource class so that all WMI error messages are traced to my application’s log file. Any ungraceful shutdown is now logged automatically, so I can deal with it the next time I’m near a computer.
- The Event Log may show event ID 1108 with source Remote Procedure Call (RPC), which says: “The RPC server is unavailable.” This can indicate that the remote procedure call (RPC) network support or the network transport layers are not functioning properly. It could be caused by low-memory conditions, file corruption, or hardware failure. You may receive this event when a process terminates abnormally while trying to connect to the RPC server service on Windows Server 2003. If your third-party application is closing unexpectedly without being shut down gracefully, configure the following registry key before restarting the server or adding or removing any services from the system:
  Windows Registry Editor Version 5.00
- This will display a notification error message that informs you of the crash and provides instructions on what to do.
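The WMI and trace-source approach from the list above might look roughly like the sketch below. It is only an outline under several assumptions: the ProcessStopLogger class and the “CrashWatch” trace source name are invented for this example, the project references System.Management.dll, and the code runs with enough privileges (such as Local System) to subscribe to Win32_ProcessStopTrace events. Where the trace output ends up depends on the listeners configured for the trace source, which is not shown here.

```csharp
using System;
using System.Diagnostics;
using System.Management; // requires a reference to System.Management.dll

public static class ProcessStopLogger
{
    // "CrashWatch" is a made-up trace source name for this sketch.
    private static readonly TraceSource Log =
        new TraceSource("CrashWatch", SourceLevels.All);

    // Builds the log line. Kept separate so it can be checked without WMI.
    public static string Describe(string processName, uint processId, uint exitStatus)
    {
        return string.Format(
            "Process {0} (PID {1}) stopped with exit status 0x{2:X8}",
            processName, processId, exitStatus);
    }

    // Subscribes to process-stop events for one executable name, for example
    // "MyService.exe". The name is assumed to come from trusted configuration.
    // The caller must keep the watcher alive and Stop()/Dispose() it on shutdown.
    public static ManagementEventWatcher Watch(string processName)
    {
        var query = new WqlEventQuery(
            "SELECT * FROM Win32_ProcessStopTrace WHERE ProcessName = '"
            + processName + "'");
        var watcher = new ManagementEventWatcher(query);

        watcher.EventArrived += (sender, e) =>
        {
            string name = Convert.ToString(e.NewEvent["ProcessName"]);
            uint pid = Convert.ToUInt32(e.NewEvent["ProcessID"]);
            uint status = Convert.ToUInt32(e.NewEvent["ExitStatus"]);

            // Treat a non-zero exit status as suspicious; zero is a normal exit.
            TraceEventType level = status == 0
                ? TraceEventType.Information
                : TraceEventType.Warning;
            Log.TraceEvent(level, 0, Describe(name, pid, status));
            Log.Flush();
        };

        watcher.Start();
        return watcher;
    }
}
```

A process cannot report its own death this way, so the watcher needs to live in a separate monitoring process from the one being watched.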
Conclusion:
Microsoft has confirmed this key as an Application Experience (ANAT) issue. It occurs when an application terminates unexpectedly and the RPC server service starts back up automatically. To turn off this behavior, configure the following registry key:
Windows Registry Editor Version 5.00
This prevents the RPC server service from restarting automatically, so it will not become responsive again, because its last running state was an unexpected shutdown. The server will be forced to shut down, so this behavior may not be appropriate in every situation. As an alternative, you can create a custom exit routine for the program that is crashing, one that can log off users (if applicable) or shut the program down cleanly. And if your application crashes before it has successfully connected to SQL Server, you will need to enable the ‘Terminate Connection on a Configuration Change’ option under the General tab in SQL Server’s connection properties.
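For the custom exit routine mentioned above, one possible shape is sketched here. The ExitRoutine class is hypothetical. It uses the standard AppDomain.ProcessExit and AppDomain.UnhandledException events to run a cleanup action at most once, whichever event fires first.

```csharp
using System;
using System.Threading;

public static class ExitRoutine
{
    private static int _ran; // 0 = cleanup not run yet, 1 = already run

    // Registers 'cleanup' to run on normal process exit or when an
    // unhandled exception is about to take the process down.
    public static void Install(Action cleanup)
    {
        AppDomain.CurrentDomain.ProcessExit += (s, e) => RunOnce(cleanup);
        AppDomain.CurrentDomain.UnhandledException += (s, e) => RunOnce(cleanup);
    }

    // Runs 'cleanup' only on the first call. Returns true if it ran this time.
    public static bool RunOnce(Action cleanup)
    {
        if (Interlocked.Exchange(ref _ran, 1) == 1)
        {
            return false;
        }
        try { cleanup(); }
        catch (Exception) { /* cleanup failures must not mask the original error */ }
        return true;
    }
}
```

Calling ExitRoutine.Install(...) early in startup, with an action that closes connections and flushes logs, would be the typical use. These events are not raised when the process is terminated from outside (for example from Task Manager), on a stack overflow, or when the machine loses power, so this complements external monitoring rather than replacing it.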