Determining system requirements (hardware, processor & memory) for a batch-based software application

I am tasked with building an application wherein the business users will be defining a number of rules for data manipulation & processing (e.g. taking one numerical value and splitting it equally amongst a number of records selected on the basis of the condition specified in the rule).

On a monthly basis, a batch application has to be run to process around half a million records according to the rules defined. Each record has around 100 fields. The environment is .NET, C# and SQL Server with a third-party rule engine.

Could you please suggest how to go about defining and/or ascertaining what kind of hardware will be best suited if the requirement is to process records within a timeframe of, let's say, around 8 to 10 hours? How will the specs vary if the user wants to either increase or decrease the timeframe depending on the hardware costs?

Thanks in advance

Abby


Create the application and profile it?


Step 0. Create the application. It is impossible to tell the real-world performance of a multi-computer system like the one you're describing from "paper" specifications. You need to try it and see where the biggest slowdowns are. Traditionally that is physical I/O, but not always.

Step 1. Profile with sample sets of data in an isolated environment. This is a gross metric. You're not trying to isolate what takes the time, just measuring the overall time it takes to run the rules.

What does isolated environment mean? You want to use the same sorts of network hardware between the machines, but do not allow any other traffic on that network segment. That introduces too many variables at this point.

What does profile mean? With current hardware, measure how long it takes to complete under the following circumstances. Write a program to automate the data generation.

Scenario 1. 1,000 of the simplest rules possible.

Scenario 2. 1,000 of the most complex rules you can reasonably expect users to enter.

Scenarios 3 & 4. 10,000 simplest and most complex.

Scenarios 5 & 6. 25,000 simplest and most complex.

Scenarios 7 & 8. 50,000 simplest and most complex.

Scenarios 9 & 10. 100,000 simplest and most complex.
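The scenario list above could be driven by a small timing harness. The following is only a sketch in Python (the real system is .NET, so treat it as pseudocode for the idea): `run_batch` is a hypothetical callable that you would replace with whatever generates the rules and launches your actual batch job, and `fake_run_batch` is a placeholder so the sketch runs on its own.

```python
import time

# Rule counts and complexity levels taken from the scenarios listed above.
RULE_COUNTS = [1_000, 10_000, 25_000, 50_000, 100_000]
COMPLEXITIES = ["simple", "complex"]


def build_scenarios():
    """Return the ten (rule_count, complexity) pairs, smallest first."""
    return [(n, c) for n in RULE_COUNTS for c in COMPLEXITIES]


def time_scenarios(run_batch, scenarios):
    """Call run_batch(rule_count, complexity) once per scenario and
    record the overall wall-clock time in seconds (a gross metric)."""
    results = []
    for rule_count, complexity in scenarios:
        start = time.perf_counter()
        run_batch(rule_count, complexity)
        elapsed = time.perf_counter() - start
        results.append(
            {"rules": rule_count, "complexity": complexity, "seconds": elapsed}
        )
    return results


def fake_run_batch(rule_count, complexity):
    # Hypothetical placeholder: the real version would generate the rules
    # and invoke the batch application against the sample data.
    pass


if __name__ == "__main__":
    for row in time_scenarios(fake_run_batch, build_scenarios()):
        print(row)
```

Keeping the runner as a parameter means the same harness can be pointed at different hardware or builds without changing the measurement code.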

Step 2. Analyze the data.

See if there are trends in completion time. Figure out if they appear tied to strictly the volume of rules or if the complexity also factors in... I assume it will.

Develop a trend line that shows how long you can expect it to take if there are 200,000 and 500,000 rules. Perform another run at 200,000. See if the trend line is correct, if not, revise your method of developing the trend line.
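One simple way to build such a trend line is an ordinary least-squares straight-line fit, sketched below in Python. This assumes completion time grows roughly linearly with rule count, which may not hold for your system; if the check at 200,000 is far off, a different model is needed. The function names and the 10% tolerance are illustrative choices, and the timings you feed in must come from your own runs.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted completion time for x rules."""
    return slope * x + intercept


def relative_error(predicted, measured):
    """Fraction by which the prediction misses the measured value."""
    return abs(predicted - measured) / measured


def trend_holds(predicted, measured, tolerance=0.10):
    # The tolerance is an arbitrary illustrative threshold, not a standard.
    return relative_error(predicted, measured) <= tolerance
```

Fit one line per complexity level using the measured (rule count, seconds) pairs, predict the time at 200,000 and 500,000, then compare the 200,000 prediction against the actual run with `trend_holds`.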

Step 3. Measure the database and network activity as the system processes the 20,000 rule sets. See if there is more activity as the number of rules grows. If so, the more you speed up the throughput to and from the SQL Server, the faster it will run.

If database and network activity are "relatively low," then CPU and RAM speed are likely where you'll want to beef up the requested machine's specification.

Of course, if all this testing is going to cost your employer more than buying the beefiest server hardware possible, just quantify the cost of the time spent testing against the cost of buying the best server. If the server is cheaper, buy it and be done with it, and only tweak your app and the SQL that you control to improve performance.


If this system is not the first of its kind, you can consider the following:

  • Re-use (after additional evaluation) hardware requirements from previous projects
  • Evaluate hardware requirements based on workload and hardware configuration of existing application

If that is not the case and performance requirements are very important, then the best way would be to create a prototype with, say, 10 rules implemented. Process the dataset using the prototype and extrapolate to a full rule set. Based on this information you should be able to derive initial performance and hardware requirements. Then you can fine tune these specifications taking into account planned growth in processed data volume, scalability requirements and redundancy.
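The extrapolation can be sketched as simple arithmetic. The Python below assumes, purely for illustration, that run time scales linearly with both the number of records and the number of rules; a real rule engine may scale differently, so the result is a first estimate only. All parameter names are hypothetical, and the full rule count is something you would have to supply.

```python
def required_throughput(total_records, window_hours):
    """Records per second needed to finish inside the batch window."""
    return total_records / (window_hours * 3600)


def extrapolate_hours(sample_seconds, sample_records, sample_rules,
                      total_records, total_rules):
    """Estimate full-run duration in hours from a prototype run.

    Assumes cost is proportional to records * rules (an assumption,
    not something measured here).
    """
    seconds_per_record_rule = sample_seconds / (sample_records * sample_rules)
    total_seconds = seconds_per_record_rule * total_records * total_rules
    return total_seconds / 3600


def fits_window(estimated_hours, window_hours):
    """True if the estimated run fits in the allowed window."""
    return estimated_hours <= window_hours
```

For the figures in the question, `required_throughput(500_000, 10)` and `required_throughput(500_000, 8)` give roughly 14 and 17 records per second, which is the target the chosen hardware has to sustain. Shrinking or growing the window changes that target proportionally.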
