Azure - how to troubleshoot failing deployments
I have a worker role that fails to deploy (cycles between initializing and aborted) in the management console. It runs fine in the emulator.
The frustrating thing is not that the deployment fails; it's that it seems virtually impossible to find out why.
I've checked all my connection strings, enabled diagnostics, checked that all my assemblies are deployed, googled a lot and lost some hair.
The point I'm at now is that I'm left adding code bit by bit, and redeploying to find the code that fails, a process that is ridiculously slow.
The worker itself connects to SQL Azure and Azure storage. I have it connecting to the live endpoints in the emulator without any problems.
It seems to fail as soon as I configure StructureMap (IoC). However, I'm using near-identical code in my web role and this works fine.
So where can I go from here (apart from to the bottle)?
I'm going to start by reiterating the feedback you've received so far. The biggest killer is the Run() method in the WorkerRole. If the WorkerRole is having trouble starting, you can wrap the code inside this method in a try/catch and log the exception.
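As a rough sketch of the try/catch advice (not code from this thread), the worker body can be passed through a small guard that logs the full exception before rethrowing. `WorkerGuard` and `RunGuarded` are hypothetical names. In a real role you would call the guard from your `Run()` override, and the trace output only reaches Azure storage if diagnostics are configured to transfer it.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of the Azure SDK.
public static class WorkerGuard
{
    // Runs the worker body; on failure, logs the whole exception and rethrows.
    public static void RunGuarded(Action body)
    {
        try
        {
            body();
        }
        catch (Exception ex)
        {
            // ex.ToString() includes inner exceptions and their stack traces.
            Trace.TraceError("Worker role failed: " + ex);
            Trace.Flush();
            throw; // rethrow so the failure is still visible to the host
        }
    }
}
```

Calling `WorkerGuard.RunGuarded(() => DoWork())` from `Run()`, where `DoWork` stands in for your own startup and processing code, keeps the logging in one place.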
If you choose to use the built-in diagnostics, I'd recommend reading through Ryan Dunn's blog as well as smarx's blog. Both have trodden this ground and done a great job documenting and sharing as they go. The MSDN site (sorry, first answer so only two links :)) has also improved quite a bit on this topic.
The part I'll add to this conversation is HOW you follow the advice. I don't use IntelliTrace as I don't have access to it, so when I hit the wall I resort to configuring Remote Desktop to my roles (this can be done from within Visual Studio). If you configure log4net or something similar to log locally on the role, you'll be able to log on via RDP and read through the logs.
Now, the two things we run into most frequently:
- UseDevelopmentStorage=true - this is a default setting and can create problems when deployed. There's quite a bit written on this already.
- Dependencies - there are many things devs have access to locally that are not on the hosted role. The easiest example of this, IMO, is ASP.NET MVC. You can either manage this with the 'Stable Release' philosophy, or use something like the Web Platform Installer command line (there's also the Azure Bootstrapper on dunnry's blog) to prep the role before startup.
For me, the key is the RDP as you can actually log on and see what's happening.
UPDATE - Can't believe I forgot this one as it kills me all the time, but you may also need to configure the firewall if you're using SQL Azure. In the dev process we'll often destroy and redeploy our roles instead of updating them, which leads to occasional IP address changes. If the new addresses aren't allowed in the SQL Azure firewall, connections from the role can be blocked.
Hope this helps man.
Well, would you believe it, it was a missing assembly on the worker role. My advice to anyone who faces similar issues is to single, double and triple check all of your dependencies.
Microsoft's response was to use IntelliTrace, but if you don't want to shell out for a VS upgrade, you can use AsmSpy (a great little utility by Mike Hadlow).
This is what eventually allowed me to find that one of my worker role dependencies had a dependency on ASP.NET MVC! It shouldn't have been there; shame it took me so long to find.
In addition to the excellent tips shared by Mike above, here are some more to look out for:
- Make sure you're using an HTTPS endpoint for diagnostics.
- Missing assemblies (just to reiterate). I also had an issue with NHibernate 3.1 where the proxy factory factory assembly (NHibernate.ByteCode.Castle) was not copied to the output path, even with Copy Local = true. I had to copy it over manually.
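A minimal sketch of how the two connection string pitfalls mentioned in this thread (development storage and a non-HTTPS endpoint) could be caught at startup. `ConnectionStringCheck` and `FindProblems` are hypothetical names, and the checks only cover these two settings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: flags storage connection strings that tend to work
// in the emulator but cause trouble once deployed.
public static class ConnectionStringCheck
{
    public static List<string> FindProblems(string connectionString)
    {
        var problems = new List<string>();
        if (string.IsNullOrWhiteSpace(connectionString))
        {
            problems.Add("Connection string is empty.");
            return problems;
        }

        foreach (var part in connectionString.Split(';'))
        {
            // Split on the first '=' only; account keys can end in '='.
            var pair = part.Split(new[] { '=' }, 2);
            if (pair.Length != 2) continue;

            var key = pair[0].Trim();
            var value = pair[1].Trim();

            if (key.Equals("UseDevelopmentStorage", StringComparison.OrdinalIgnoreCase)
                && value.Equals("true", StringComparison.OrdinalIgnoreCase))
            {
                problems.Add("Points at development storage.");
            }

            if (key.Equals("DefaultEndpointsProtocol", StringComparison.OrdinalIgnoreCase)
                && value.Equals("http", StringComparison.OrdinalIgnoreCase))
            {
                problems.Add("Uses http rather than https.");
            }
        }

        return problems;
    }
}
```

In a role you would presumably feed it the value returned by RoleEnvironment.GetConfigurationSettingValue for the diagnostics setting and log whatever comes back.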
Here's the code I'm using for writing logs to Azure storage:
    // Requires: using System; using Microsoft.WindowsAzure.Diagnostics;
    #region Setup diagnostics
    DiagnosticMonitorConfiguration diagnosticsConfig
        = DiagnosticMonitor.GetDefaultInitialConfiguration();

    // Windows event logs: transfer warnings and above every minute
    diagnosticsConfig.WindowsEventLog.DataSources.Add("System!*");
    diagnosticsConfig.WindowsEventLog.DataSources.Add("Application!*");
    diagnosticsConfig.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;
    diagnosticsConfig.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

    // Windows Azure logs (output of the Trace listeners)
    diagnosticsConfig.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;
    diagnosticsConfig.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

    // Start the monitor using the storage account named in the role configuration
    DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", diagnosticsConfig);
    #endregion Setup diagnostics
You can set the Azure logs ScheduledTransferLogLevelFilter to Undefined to log everything sent to the Trace listeners.
I use an ILogger interface for logging throughout my application, so I just wrote one that calls Trace.WriteLine so that any exceptions are logged to Azure storage.
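The poster's ILogger is not shown, so the following is only a guess at the shape of such a wrapper. The interface members and the `TraceLogger` class name are assumptions for illustration.

```csharp
using System;
using System.Diagnostics;

// Hypothetical interface; the real ILogger from the post is not shown.
public interface ILogger
{
    void Info(string message);
    void Error(string message, Exception exception);
}

// Forwards everything to the Trace listeners, so whatever listener is
// configured for the role receives the output.
public class TraceLogger : ILogger
{
    public void Info(string message)
    {
        Trace.WriteLine("INFO: " + message);
    }

    public void Error(string message, Exception exception)
    {
        // exception.ToString() includes inner exceptions and stack traces.
        Trace.WriteLine("ERROR: " + message + Environment.NewLine + exception);
    }
}
```

Output written with Trace.WriteLine may be dropped by a Warning-level transfer filter, which is presumably why the post suggests setting the filter to Undefined.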
One issue for me was that even after wrapping everything in a huge try/catch block, the exception produced during the StructureMap initialization was not very useful.
I added the following extension method to my logger so that I could get at the innermost exception. It was this that led to me seeing the missing MVC assembly issue and that classic face-palm moment.
    // Requires: using System; using System.Web;
    // As an extension method, this must be declared inside a static class.
    public static string BuildExceptionMessage(this ILogger logger, Exception x)
    {
        // Walk down to the innermost exception, which usually names the real cause
        var logException = x;
        while (logException.InnerException != null)
        {
            logException = logException.InnerException;
        }

        var errorMessage = string.Empty;

        // HttpContext.Current is null outside a web request (e.g. in a worker role)
        if (HttpContext.Current != null)
        {
            errorMessage = Environment.NewLine + "Error in Path :" + HttpContext.Current.Request.Path;
            // Get the QueryString along with the Virtual Path
            errorMessage += Environment.NewLine + "Raw Url :" + HttpContext.Current.Request.RawUrl;
        }

        // Get the error message
        errorMessage += Environment.NewLine + "Message :" + logException.Message;
        // Source of the message
        errorMessage += Environment.NewLine + "Source :" + logException.Source;
        // Stack Trace of the error
        errorMessage += Environment.NewLine + "Stack Trace :" + logException.StackTrace;
        // Method where the error occurred
        errorMessage += Environment.NewLine + "TargetSite :" + logException.TargetSite;
        return errorMessage;
    }
I hope that helps some others.