Sizing for Dummies

Sizing Portal applications during Pre-sales activities

More and more customers are asking for sizing inputs during the RFP phase itself. Sizing depends on many factors, so arriving at sizing information at this stage is not a trivial exercise. Some of the inputs required for sizing are:

  • Page views per second
  • Hits per second
  • Hit to cache ratio
  • Think time of users
  • Service time
  • Concurrent sessions
  • Ratio of static and dynamic pages
  • Usage of SSL
  • Usage of clusters

The above is only a partial listing.
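To see how a few of these inputs relate to each other, here is a minimal sketch based on the interactive response time law from queueing theory (throughput = concurrent users / (response time + think time)). The function name and the sample numbers are illustrative assumptions, not figures from a real project.

```python
def throughput_per_second(concurrent_sessions: int,
                          response_time_s: float,
                          think_time_s: float) -> float:
    """Interactive response time law: X = N / (R + Z)."""
    cycle_time = response_time_s + think_time_s
    if cycle_time <= 0:
        raise ValueError("response time plus think time must be positive")
    return concurrent_sessions / cycle_time


# Hypothetical inputs: 1,000 concurrent sessions, 2 s response time,
# 18 s think time between page requests.
page_views = throughput_per_second(1000, 2.0, 18.0)
print(page_views)  # 50.0 page views per second
```

If the customer can only tell you the number of concurrent users, a guess at think time is enough to turn that into a rough page views per second figure.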

During the pre-sales stage, some customers might be able to give an idea of the expected hits per second. However, if one were to ask a customer about the number of portlets on a page in order to estimate possible caching, most customers would be at a loss. Hence it is important to educate the customer that sizing at this stage can only be indicative and that a correct figure can only be arrived at after a detailed analysis.

Approaches for Sizing

There are essentially three approaches for arriving at sizing recommendations:

1. Sizing based on algorithms and formulae
2. Sizing based on a proof of concept
3. Sizing based on benchmarked applications

The first approach (sizing based on algorithms and formulae) is probably more scientific and logical than the others. However, portal applications are a completely new breed of applications and are evolving at a rapid pace, so there is not enough historical data to derive an algorithm that predicts sizing accurately. Moreover, any algorithm or formula will require inputs that are difficult to obtain during the pre-sales stage.
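As an illustration of what a formula-based estimate can look like, the sketch below applies the utilization law (utilization = throughput x service time) to a few of the inputs listed earlier. It is not a vendor formula. The function name, the reading of "hit to cache ratio" as the fraction of hits served from cache, the assumption that cached hits cost no application-server CPU, and the default target utilization of 60% are all assumptions made for this example.

```python
import math


def servers_needed(hits_per_second: float,
                   cache_hit_ratio: float,
                   service_time_s: float,
                   cpus_per_server: int,
                   target_utilization: float = 0.6) -> int:
    """Rough server count from the utilization law U = X * S."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    if cpus_per_server < 1:
        raise ValueError("cpus_per_server must be at least 1")
    # Assumption: only hits that miss the cache consume app-server CPU.
    dynamic_hits = hits_per_second * (1.0 - cache_hit_ratio)
    # CPUs kept fully busy by that load (throughput x CPU seconds per hit).
    busy_cpus = dynamic_hits * service_time_s
    # Inflate so that each CPU runs at the target utilization, not at 100%.
    cpus_required = busy_cpus / target_utilization
    return max(1, math.ceil(cpus_required / cpus_per_server))


# Hypothetical inputs: 200 hits/s, half served from cache, 50 ms of CPU
# per dynamic hit, 4-CPU servers, 50% target utilization.
print(servers_needed(200, 0.5, 0.05, 4, 0.5))  # 3
```

Every argument here is exactly the kind of input that is hard to obtain during pre-sales, which is the difficulty described above.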

The second approach (sizing based on a proof of concept) will probably give the most accurate results. However, it consumes time and resources and cannot be carried out during the pre-sales stage.

The last approach (sizing based on benchmarked applications) provides an optimal method to arrive at sizing recommendations at this stage. It relies on a set of assumptions and a set of benchmarked applications. In this approach, the customer application is compared with a benchmark application that uses the same or similar technologies, and the benchmark results are extrapolated to arrive at approximate sizing figures.
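The extrapolation step could be sketched roughly as follows. This is not the working example mentioned below, only an illustration under stated assumptions: the function name is made up, the scaling efficiency factor is a crude stand-in for the fact that throughput rarely scales linearly with server count, and the headroom factor is an arbitrary safety margin.

```python
import math


def extrapolate_servers(benchmark_servers: int,
                        benchmark_hits_per_second: float,
                        target_hits_per_second: float,
                        scaling_efficiency: float = 0.8,
                        headroom: float = 1.3) -> int:
    """Scale a published benchmark result to a target load (indicative only)."""
    if benchmark_servers < 1 or benchmark_hits_per_second <= 0:
        raise ValueError("benchmark figures must be positive")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")
    # Throughput each server delivered in the benchmark.
    per_server = benchmark_hits_per_second / benchmark_servers
    # Assumption: at the target scale each server achieves only a fraction
    # of its benchmarked throughput.
    effective_per_server = per_server * scaling_efficiency
    # Add headroom to the target load before dividing it across servers.
    return max(1, math.ceil(target_hits_per_second * headroom / effective_per_server))


# Hypothetical benchmark: 2 servers sustained 400 hits/s.
# Target load: 1,000 hits/s with 20% headroom.
print(extrapolate_servers(2, 400, 1000, scaling_efficiency=0.8, headroom=1.2))  # 8
```

The result is only as good as the benchmark chosen and the two fudge factors, so it should be presented to the customer as a baseline and not as a final figure.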

I have a working example of this approach. Mail me if you need one.

Updated: 20th May, 2005

PJ has posted a nice one on sizing here

12 thoughts on “Sizing for Dummies”

  1. The third approach looks inaccurate. More than anything else, it's a sample application against which you are benchmarking. This application may not have been tested with real-life loads of millions of hits, thousands of hits per second, etc. So how this sample app scales up is not clearly known. To size your app against this certainly seems immature.

  2. A sample application does not mean a simple desktop application. Most vendors provide real-life application data that is used as the benchmark. This data contains real-life loads, millions of hits, thousands of hits per second, etc.

  3. Does the benchmark application also state what kind of performance stats it supports? How does one extrapolate these data points to what one’s own app requires? It is a well-known fact that the perf stats for, say, 50,000 hits a day cannot be linearly extrapolated to 5 million hits a day. Any extrapolation, based on any assumptions, is at best only a vague guess and may prove to be a costly mistake.

  4. It depends on which benchmarks you use. As an example, I know BEA provides application stats with varying loads. Also remember, this is not exact sizing. This is only a baseline to be considered during the pre-sales stage.

  5. OK, so BEA gives it to you. What about others? So your fundamental assumption is that applications running on BEA and any other platform (like ATG, IBM, Silverstream etc.) would scale up similarly? Dude, the guy from IBM would certainly like to disagree with you on that one! Also, even if it's pre-sales, these figures are used by the top guys who sign the checks to do budgetary estimates. You can't afford to be off by too much. Am I missing something here?

  6. BEA does. Oracle does and so does IBM. Also, at this stage the assumption is that you do not even know how many portlets, how many database queries etc. there are. In such a scenario, I do not think that baseline numbers for BEA would be drastically different from those of IBM. Have you ever seen an application that uses 2 machines with BEA and 10 with IBM? Usually the difference is not much.

  7. All portal server products will use their entire app server stack too. So essentially you are extrapolating the performance of the portal to the entire stack of products (app server, integration stack, middle layer and, in the case of Oracle, the database too). That, I believe, is not the correct way to do it. The performance stats for a complete product stack vary considerably. Also, what about the hardware stack? The sample application may be benchmarked using HP. What if your stack is Sun or IBM (or even Dell for that matter)? Add another assumption to your already long list. Where are we headed?

  8. I am not. That is your assumption. All I am suggesting is that you use an “appropriate” benchmark application. If you do not get a reasonably “close” benchmark, you take the next best available and make assumptions. BTW, these vendors will give you benchmarks on all kinds of hardware. I have results for the same application on Intel and Solaris from one vendor.

  9. Too many assumptions. Not to mention the question mark on the initial benchmark application itself. Even for budgetary estimates it sounds a little too stretched. I would go for using empirical data on perf requirements and use standard formulae for TPS, THS and required response time. As you pointed out in your first option, I feel that is more optimal than the last one. Finally, it's left to one’s own experience and judgement which method to follow.

  10. Correct. If you have all the data that is required for using the so-called standard formulae, please use them. In fact, that should be the preferred approach. I will repeat, however, that what I’ve written about is for a scenario when you do not have any of this information or the client does not want to share this information. You can obviously say NO to the client, or make assumptions and give him something. If you work in an Indian company (like I do), I am sure you will understand what I mean. Frequently, we have to give “reccos” to clients without knowing anything, and that is when this approach has proven useful for me.

  11. I am a frequent reader of your blog posts. I liked the recent one and other posts on your blog so much that I have subscribed to the blog’s RSS feed in Thunderbird. Even thinking of stealing some ideas and put them to work. Keep all the good work going by posting more informative posts. Thank you. Time well spent on this post.
