This is part of a series on creating and installing Windows services.
- How to Create a Windows Service
- How to Install a Windows Service
- How to Run a Windows Service from Visual Studio
- How to Avoid Windows Service Start-up Timeouts
- How to Make a Windows Service Install Itself
- How to Handle Windows Service Errors
- How to Build a Windows Service Framework
In our first two episodes, we saw that the Service Control Manager automatically logs service lifecycle events (start/stop) to the Windows Event Log. But, what happens if our service encounters an unexpected error and crashes during its run? We want to make it as easy as possible for our operations team to quickly identify the root cause of the problem.
To start, let’s build a service based on the infrastructure we have created over the past couple episodes. To test how our infrastructure reacts in the face of errors, we will intentionally make the service rather poorly behaved:
class Program { public const string Name = "DemoService"; static void Main(string[] args) { BasicServiceStarter.Run<MyService>(Name); } } class MyService : IService { public void Start() { throw new Exception("exception!"); } public void Dispose() { } }
If we run this service from the console, you will notice that even without any special error handling in place, the exception is printed out to the console by the framework:
If you go ahead and install this service, we can experiment with how it behaves when running in service mode. If we start the service via services.msc, we get the following pop-up immediately:
And the service, as we would expect, did not start successfully. If we instead use net start
to start the service, we get the following message:
Notably, when using either method, the Service Control Manager isn’t really telling us anything useful about what happened, and it seems to not even know that there was an error during start-up. However, if we look at the event viewer, we notice that there is a useful error message there:
The framework has kindly gone ahead and logged the issue for us, even using the logger named after the service.
What happens if there is an unhandled exception thrown on a different thread? We know that those will tear down the application as well. Let’s experiment by modifying our misbehaving service to throw on a background thread instead of the main thread:
class MyService : IService { public void Start() { ThreadPool.QueueUserWorkItem( o => { throw new Exception("exception!"); }); } public void Dispose() { } }
Repeating our experiments, we now see the following behavior when running from the console:
Which is pretty similar to what happened when we threw an exception on the main thread. When starting this service, we get the following slightly different (and more useful) message from services.msc:
And the following messages in the event viewer:
Unfortunately, none of these messages display the message associated with the exception that was thrown, although we do get the stack trace.
In both console mode and service mode, the framework already leaves a decent paper trail in the event of an unhandled exception. Any improvements we make should not eliminate this default behavior (by masking exceptions via try/catch
, for example).
There are three major shortcomings of the default behavior:
- In the event of an exception on the main thread during service start-up, the service controller does not seem to know that there was a failure, and reports a vague error.
- In the event of a background thread exception, the error written to the event log does not contain the exception message.
- We are not capturing fatal exceptions for the purpose of logging them to log4net (you are using log4net, right?)
Report start-up errors to the Service Control Manager
To handle this issue, we need to set the service’s ExitCode property to a non-zero value in the face of a start-up error. To do so, we must wrap our OnStart
method in a try/catch
:
public class BasicService<T> : ServiceBase where T : IService, new() { private IService _service; protected override void OnStart(string[] args) { try { _service = new T(); _service.Start(); } catch { ExitCode = 1064; throw; } } protected override void OnStop() { _service.Dispose(); } }
I have chosen error 1064 : “An exception occurred in the service when handling the control request” because it seems to be the most appropriate (if anyone has a better idea of what we should set the error to, please leave a note in the comments). Also note that we re-throw the exception, thus preserving the existing behaviors (including the event logging) we saw above. If we start our service with this change in place, services.msc gives us a much more useful message, indicating that the service not only stopped, but failed:
Write background thread exception messages to the event log
To accomplish this goal, we need to attach to the current AppDomain’s UnhandledException event:
public static class BasicServiceStarter { public static void Run<T>(string serviceName) where T : IService, new() { AppDomain.CurrentDomain.UnhandledException += (s, e) => { if (EventLog.SourceExists(serviceName)) { EventLog.WriteEntry(serviceName, "Fatal Exception : " + Environment.NewLine + e.ExceptionObject, EventLogEntryType.Error); } };
In our handler, we write a custom error message to the event log that will include the message associated with the exception (taking advantage of the default behavior of Exception.ToString). With this change, we will be able to see a bit more information in the event log when our service crashes due to an exception on a non-entry thread:
Log all errors to log4net
It is a good idea to log any error that takes down the application as a fatal error in our application logs. To accomplish this, we will need to add logging to both the error handlers we created above (both the AppDomain.UnhandledException
event handler and the catch
block in our ServiceBase.OnStart
method). We have to do both because an unhandled exception in ServiceBase.OnStart
is actually handled by the service controller and does not trigger firing of the AppDomain.UnhandledException
event. Adding logging calls to these two locations is left as an exercise to the reader.
Best article I’ve found on this topic. Thanks!