I've been asked to look at a live problem where we seem to have a bottleneck calling OPA. At a high level, the client runs as a WCF service; as it chunks up the work it needs to do, it spawns new threads, which in turn call OPA running under IIS on a separate box. The calls go through the standard .NET OPA API.
Looking at the logs, the threading seems to be working: under load you can see it spawning plenty of threads that call OPA. However, if you watch the open connections on the OPA server, they never rise above 8 per CPU.
Initial thoughts are that the problem is one of the following:
• Our calling code is broken, i.e. the way our code breaks a chunk of work into parallel threads (this seems to be behaving, although I'm not sure whether we've overcomplicated matters)
• The wrapper we’ve written that utilises the OPA API (it seems to follow the standard pattern, so low possibility)
• The OPA API provided is limiting in some way, potentially tied to how it maintains its own thread safety in a singleton
• The setup of IIS hosting OPA (the number of connections created is so low that I think this is a low possibility)
• What we are seeing is a red herring and I need to go and read about how IIS opens and manages connections
• Something's amiss on the network between the two boxes
Am I missing anything obvious from the OPA side?
Thanks in advance,
From your post it sounds like the bottleneck could be the client, the server, or anywhere in between. Since you don't see that many connections on the server, it would seem to point towards the problem being with the client, the network, or both. My advice would be to narrow down the problematic areas. Have you attempted to induce load directly on the server using a local network connection? If so, what are your performance numbers like? Also, have you tried measuring the latency of your network, or confirming that all the requests you expect the client to be making are actually hitting the server?
Also, what version are you using? 10.4.1 is out so if you can upgrade it would be worth doing since we regularly fix issues and make improvements that can affect performance in certain situations.
From my experience this is very likely to be the IIS setup on the OPA box.
Setting up IIS to run OPA fast is a bit of an art form and far beyond what I can write in these pages! Assuming you are using ODS (not OWD, which is stateful and requires a different setup), initially check:
1. You have a separate App Pool for OPA
2. If you are using IIS 7.5 you are running in integrated pipeline mode
3. Set the app pool for minimum monitoring, small timeouts, etc. ODS is stateless, so it simply doesn't need things like 'Load User Profile' and the other settings a normal website would use
When you've got that set up correctly, look at your WCF configuration, in particular the web (or app) .config file, and make sure that you have a connectionManagement element under system.net. You can then override the default connection behaviour. This is the configuration FROM your WCF service TO the OPA server. Effectively your WCF service is acting as a client of OPA in this instance, so the connection behaviour needs to be configured on the caller and not on the callee.
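As a rough sketch of the kind of entry I mean, the fragment below raises the per-host outbound connection limit in the WCF service's web.config or app.config. The connectionManagement element and its maxconnection attribute are standard .NET Framework configuration; the wildcard address and the value 48 are purely illustrative assumptions, not a recommendation, and this only helps if the OPA API makes its calls through the standard .NET HTTP stack, which I'm assuming here.

```xml
<configuration>
  <system.net>
    <connectionManagement>
      <!-- address="*" applies the limit to every remote host;
           replace it with the OPA server's address to scope it.
           48 is an illustrative value only: tune it under load. -->
      <add address="*" maxconnection="48" />
    </connectionManagement>
  </system.net>
</configuration>
```

If the number of open connections on the OPA server moves when you change this value, the cap was on the calling side rather than in IIS.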
It may also be that you have to make alterations to your machine.config file on either the client or the server.
See http://support.microsoft.com/kb/821268 for more details. I know it's about ASP.NET, but it's a good gateway in.