Why is my WCF channel failing?
I have a computer that is running a single program that manages up to 48 individual processes on 4 other computers. I have the WCF services (one for each process) set up as such:
public void StartService(Uri uri, string identifier)
{
    unitMetaData = identifier;
    var binding = new WSDualHttpBinding(WSDualHttpSecurityMode.None);
    binding.ReliableSession.InactivityTimeout = TimeSpan.FromDays(20);
    var reader = binding.ReaderQuotas as XmlDictionaryReaderQuotas;
    reader.MaxStringContentLength = WCFContentSize; // 16777216
    service = new ServiceHost(this, uri);
    service.Faulted += TestService_Faulted;
    service.AddServiceEndpoint(
        typeof(IController),
        binding,
        identifier);
    service.Open();
}
Here is the code for the remote processes:
public void Connect()
{
    // External binding used to change the WCF XML text content size
    var binding = new WSDualHttpBinding(WSDualHttpSecurityMode.None);
    binding.ReliableSession.InactivityTimeout = TimeSpan.FromDays(20);
    var reader = binding.ReaderQuotas as XmlDictionaryReaderQuotas;
    reader.MaxStringContentLength = WCFContentSize; // 16777216
    DuplexChannelFactory<IController> factory = new DuplexChannelFactory<IController>(new InstanceContext(this), binding);
    controllerChannel = factory.CreateChannel(
        new EndpointAddress(
            controllerAddress,
            new DnsEndpointIdentity(controllerAddress.DnsSafeHost),
            new System.ServiceModel.Channels.AddressHeaderCollection()));
    ((IClientChannel)controllerChannel).OperationTimeout = TimeSpan.FromSeconds(ChannelOperationTimeoutInSeconds); // 300
    controllerChannel.RequestTestData();
}
I have some code that calls a remote "Ping()" function on each remote process about every 30 seconds; the function simply returns the string "Pong". I did this to ensure that the connection stays open, as I had some issues with the ReliableSession timing out. Occasionally (as in much too often for production code) I get the following exception from one, and usually more, of the services that test processes are connecting to:
An ExceptionDetail, likely created by IncludeExceptionDetailInFaults=true, whose value is:
System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServerReliableDuplexSessionChannel, cannot be used for communication because it is in the Faulted state.
Server stack trace:
at System.ServiceModel.Channels.TransmissionStrategy.WaitQueueAdder.Wait(TimeSpan timeout)
at System.ServiceModel.Channels.TransmissionStrategy.InternalAdd(Message message, Boolean isLast, TimeSpan timeout, Object state, MessageAttemptInfo& attemptInfo)
at System.ServiceModel.Channels.ReliableOutputConnection.InternalAddMessage(Message message, TimeSpan timeout, Object state, Boolean isLast)
at System.ServiceModel.Channels.ReliableDuplexSessionChannel.OnSend(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.DuplexChannel.Send(Message message, TimeSpan timeout)
at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at SEL.MfgTestDev.ESS.ServiceContracts.ITestProcessClient.Ping()
at SEL.MfgTestDev.ESS.Testing.Service.TestService.Ping() in C:\Projects\Mfg_TestDev_ESS_Rev3\branches\MSU-5-18-2010\ESS.Testing.Service\TestService.cs:line 349
So what's going on? Why is it suddenly ending up in a faulted state? Is there a way I can get the reason why a connection has faulted?
It is not a good idea to leave it on in a production environment, but you can try turning on WCF tracing on both the server and the clients. You will hopefully find a better error description there.
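A minimal sketch of what enabling tracing could look like in the app.config of the server and of each client. The log file path is an assumption chosen for illustration; the listener type and the System.ServiceModel source name are the standard WCF ones.

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- Trace warnings and errors raised by the WCF runtime -->
      <source name="System.ServiceModel"
              switchValue="Warning, ActivityTracing"
              propagateActivity="true">
        <listeners>
          <add name="xml" />
        </listeners>
      </source>
    </sources>
    <sharedListeners>
      <!-- Assumed path; pick a location the process can write to -->
      <add name="xml"
           type="System.Diagnostics.XmlWriterTraceListener"
           initializeData="c:\logs\wcf-trace.svclog" />
    </sharedListeners>
    <trace autoflush="true" />
  </system.diagnostics>
</configuration>
```

The resulting .svclog file can be opened with the Service Trace Viewer (SvcTraceViewer.exe), which usually shows the exception that faulted the channel.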
By the way, you had problems with the reliable session because it timed out after 10 minutes of inactivity. You set the inactivity timeout for the reliable session, but the binding also has a receive timeout, which is 10 minutes by default. If no message arrives within 10 minutes, the application session is closed, which means the service instance is destroyed and the reliable session is closed as well.
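A sketch of how both timeouts could be raised together when the binding is built in code, as in the question. The class and method names (BindingFactory, CreateBinding) are hypothetical; ReceiveTimeout and ReliableSession.InactivityTimeout are the actual binding properties.

```csharp
using System;
using System.ServiceModel;

// Hypothetical helper, not part of the original code.
public static class BindingFactory
{
    public static WSDualHttpBinding CreateBinding(TimeSpan idleTimeout)
    {
        var binding = new WSDualHttpBinding(WSDualHttpSecurityMode.None);

        // Timeout of the reliable session itself.
        binding.ReliableSession.InactivityTimeout = idleTimeout;

        // Timeout of the application session; defaults to 10 minutes.
        binding.ReceiveTimeout = idleTimeout;

        return binding;
    }
}
```

Both the service and the client would presumably need to use the same values, for example BindingFactory.CreateBinding(TimeSpan.FromDays(20)).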
Edit:
The problem description is insufficient, and the architecture is very strange. It is not one service communicating with 48 clients over duplex channels, but 48 identical services communicating with one client over duplex channels. This can of course introduce additional problems that are not known from common scenarios, so diagnostics (tracing / performance counters) are really needed!
Looking at the code of the Connect method, it even looks like the client callback is a singleton communicating with all 48 services, isn't it? What concurrency mode is used on that callback? If the concurrency mode is Single, there can be timeout problems when calling the callback, because the message size is set to 16 MB. If all 48 processes send a 16 MB message at the same time, the messages are queued and processed in FIFO order. The default settings require each call to complete within the binding's send timeout (one minute by default); otherwise a timeout exception occurs and the channel is faulted. If the concurrency mode is Multiple, there can still be synchronization problems inside the callback implementation.
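To illustrate the concurrency point, here is a sketch of a callback that accepts concurrent calls and guards its shared state with a lock. The contract and class names (IPingCallback, PingCallback) are hypothetical stand-ins, since the real callback contract is not shown in the question; CallbackBehavior and ConcurrencyMode are the actual WCF types.

```csharp
using System.ServiceModel;

// Hypothetical callback contract used only for this sketch.
public interface IPingCallback
{
    [OperationContract]
    string Ping();
}

// Multiple lets several calls run in parallel instead of queuing them.
[CallbackBehavior(ConcurrencyMode = ConcurrencyMode.Multiple,
                  UseSynchronizationContext = false)]
public class PingCallback : IPingCallback
{
    private readonly object sync = new object();
    private int pingCount;

    public string Ping()
    {
        // Shared state must be protected once calls can overlap.
        lock (sync)
        {
            pingCount++;
        }
        return "Pong";
    }

    public int PingCount
    {
        get { lock (sync) { return pingCount; } }
    }
}
```

Whether Multiple is the right choice depends on how much shared state the real callback touches; with Single no locking is needed, but slow calls block the ones queued behind them.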
Your channel can end up in the faulted state if you do not wrap service exceptions in a FaultException or FaultException<T>:
http://blogs.msdn.com/b/pedram/archive/2008/01/25/wcf-error-handling-and-some-best-practices.aspx
I assume that some other service call throws an exception, the channel is faulted, and then you get the exception you describe when you attempt to ping the service.
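A minimal sketch of that wrapping, assuming a small helper that each service operation calls its body through. The FaultWrapping class and its Execute method are hypothetical; FaultException and its string constructor are part of System.ServiceModel.

```csharp
using System;
using System.ServiceModel;

// Hypothetical helper, not part of the original code.
public static class FaultWrapping
{
    public static T Execute<T>(Func<T> operation)
    {
        try
        {
            return operation();
        }
        catch (FaultException)
        {
            // Already a fault; let it through unchanged.
            throw;
        }
        catch (Exception ex)
        {
            // Convert anything else into a fault so it is reported
            // to the caller as a fault message, not an unhandled exception.
            throw new FaultException(ex.Message);
        }
    }
}
```

A service operation could then be written as, for example, return FaultWrapping.Execute(() => DoWork()); where DoWork stands for the real operation body.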
Assuming that you're using the same channel for pinging the remote service as for the other remote calls (which was the whole point of the ping, right?), could it be that one of the other method calls threw an exception or timed out and faulted your channel?
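If that is what happens, one option is to check the channel state before each call and replace a faulted channel. The sketch below assumes a hypothetical ChannelGuard wrapper that is given a delegate for creating the channel; ICommunicationObject, CommunicationState and Abort() are the actual WCF members.

```csharp
using System;
using System.ServiceModel;

// Hypothetical wrapper, not part of the original code.
public sealed class ChannelGuard<TChannel> where TChannel : class
{
    private readonly Func<TChannel> createChannel;
    private TChannel channel;

    public ChannelGuard(Func<TChannel> createChannel)
    {
        this.createChannel = createChannel;
    }

    public TChannel GetUsableChannel()
    {
        var comm = channel as ICommunicationObject;
        if (comm != null && comm.State == CommunicationState.Faulted)
        {
            // A faulted channel cannot be reused or closed gracefully.
            comm.Abort();
            channel = null;
        }

        if (channel == null)
        {
            channel = createChannel();
        }
        return channel;
    }
}
```

Note that a new channel also means a new reliable session, so any session state held by the service instance would be lost.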
Also, in your configuration for ServiceBehaviors, is 'includeExceptionDetailInFaults' set to true? e.g.
<behaviors>
  <serviceBehaviors>
    <behavior name="MyServiceBehaviors">
      <serviceDebug includeExceptionDetailInFaults="true" />
    </behavior>
  </serviceBehaviors>
</behaviors>
During debugging this is useful, as it allows you to see the exception message from the server, but the downside is that it faults your channel too, so in a production environment it's best to leave it off.