Friday Apr 20, 2012

Myth Busters: Ops edition. Is EC2 less expensive than running your own gear?

A buddy of mine and I were trying to figure out how much better ground computing is than cloud computing. We determined our typical Hadoop machine is about equal to an extra large instance. We have a cluster of 20 of these machines, each with 2x quad-core processors, 32GB RAM, and 8x 2TB disks.

Here are some numbers:

20 x extra large instances, 12 hours of usage per day, on a one-year term.

You have to pay about 80K upfront and about 5K a month, so in a year it's going to be about 140K-150K (without any storage).

Now, if I were to run these servers in my own data center:

20 servers = about 150K (upfront cost)
Rack and power = about 2K a month
Total expenses in year one: about 175K, which is about 25-35K more than AWS. Looks bad, huh?
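The year-one arithmetic can be checked with a few lines of Python. This is only a sketch using the rough figures quoted above, in thousands of dollars; the function and variable names are made up for illustration, and the AWS figure uses the low end of the quoted range.

```python
# All figures are the post's rough estimates, in thousands of US dollars.

def first_year_cost(upfront_k, monthly_k, months=12):
    """Upfront payment plus the recurring monthly charge over `months`."""
    return upfront_k + monthly_k * months

# AWS: 20 extra large instances, 12 hours/day, one-year term, no storage.
aws_year_one = first_year_cost(upfront_k=80, monthly_k=5)

# Own gear: 20 servers bought upfront, plus rack and power.
own_year_one = first_year_cost(upfront_k=150, monthly_k=2)

print(f"AWS year one: {aws_year_one}K")
print(f"Own year one: {own_year_one}K")
print(f"Own gear costs {own_year_one - aws_year_one}K more in year one")
```

The own-gear result lands a touch under the "about 175K" quoted above because the post rounds up.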

But... with our own hardware:

Free storage, about 400TB (raw)
Cluster available 24 hours a day
After the first year the cost is only 2K a month (24K/year), and the hardware is already paid for.
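One way to put the 12-hour versus 24-hour availability on the same footing is to divide each year-one bill by the node-hours it buys. The sketch below does that under some assumptions of mine: 140K for AWS (the low end of the quoted range), 150K plus twelve months of rack and power for our own gear, and a 365-day year. The cost-per-node-hour metric itself is just an illustration, not part of the original comparison.

```python
NODES = 20
DAYS_PER_YEAR = 365  # assumption: ignore leap years and downtime

def cost_per_node_hour(total_cost_usd, hours_per_day):
    """Year-one cost divided by the node-hours available in that year."""
    node_hours = NODES * hours_per_day * DAYS_PER_YEAR
    return total_cost_usd / node_hours

# AWS: about 140K for 12 hours of usage per day.
aws_rate = cost_per_node_hour(140_000, hours_per_day=12)

# Own gear: 150K upfront + 12 months of rack and power, available 24 hours.
own_rate = cost_per_node_hour(150_000 + 12 * 2_000, hours_per_day=24)

print(f"AWS: ${aws_rate:.2f} per node-hour")
print(f"Own: ${own_rate:.2f} per node-hour")
```

This only counts hours the cluster is available, not hours it is actually busy, so it flatters our own hardware if the cluster sits idle at night.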

I went ahead and did a 3-year comparison:


$126K upfront for AWS and about 5K a month = 186K for the first year
120K for the next two years
Total to be paid to AWS = 186 + 120 = 306K

If I were to put this in my own datacenter, I'd pay:
150K upfront and 2K a month = about 175K for the first year
48K for the next two years
Total expenses = 223K
Free storage, about 400TB (raw)
Cluster available 24 hours a day
I still have those 3-year-old machines; maybe someone will buy them for 10K? :)
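Here is a small sketch of the 3-year comparison, counting the monthly fee in all 36 months for both options and finding the first month where our own gear becomes the cheaper choice. Figures are the rough ones quoted above, in thousands of dollars; the helper names are made up for illustration, and the possible 10K resale value is left out.

```python
def cumulative_cost(upfront_k, monthly_k, months):
    """Total spent after `months`: upfront payment plus monthly charges."""
    return upfront_k + monthly_k * months

def first_month_cheaper(own, aws, horizon=36):
    """First month where cumulative own-gear cost drops below AWS, else None."""
    for month in range(1, horizon + 1):
        if cumulative_cost(*own, month) < cumulative_cost(*aws, month):
            return month
    return None

AWS = (126, 5)   # (upfront, monthly) for the 3-year AWS estimate
OWN = (150, 2)   # (upfront, monthly) for servers plus rack and power

aws_total = cumulative_cost(*AWS, 36)
own_total = cumulative_cost(*OWN, 36)
crossover = first_month_cheaper(OWN, AWS)

print(f"AWS over 3 years: {aws_total}K")
print(f"Own over 3 years: {own_total}K")
print(f"Own gear becomes cheaper in month {crossover}")
```

The own-gear total comes out 1K below the figure in the text because the post rounds year one up to 175K. Storage is not included on the AWS side.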

BTW - I'm not too sure how to calculate $$$ for storage in AWS yet, but when I do, it comes to about 585K a month, assuming that you want to keep all the data in AWS...

Myth Busted? Absolutely busted to heck.


Ed, Don't forget the cost of administrators (at least for a general comparison). I'm not saying you're wrong, but it's not in your calculation.

Posted by Dean Wampler on April 20, 2012 at 02:14 PM EDT #

What about elasticity and the ability to spin up another server just like that vs. ordering and installing a new one? What is the value of that, or the cost of not having it?

Posted by Otis on April 20, 2012 at 02:34 PM EDT #

Is there empirical evidence that the time spent on operations per unit of compute (whatever the title of the people doing it) varies between AWS, Joyent, hosted, and your own cages? If not, I don't see any way to include it objectively. It would be interesting to know what percentage of instance orders (for public-facing things, not offline batch computations) were "elastic". I fear that's an even more subjective determination.

Posted by Chris Burroughs on April 20, 2012 at 04:39 PM EDT #

Here's one example of elasticity. We wanted to test the agent/client piece of our Performance Monitoring service (http://sematext.com/spm/index.html). Does it install correctly on Ubuntu? CentOS? Fedora? Different versions of those, etc.? A hop to the EC2 Console, a Spot Instance, and a few minutes later I had all the variations I needed, and I paid just a few cents. Sure, it's a different sort of use case and it doesn't mean I could not have all my own hw in a DC, but...

Posted by Otis Gospodnetic on April 20, 2012 at 06:53 PM EDT #

Cost of administrators: Did you guys notice the cost of disk storage is 545K per month? That covers the cost of administrators. We do not spend 6 million paying our ops team every year, though I wish we did. Also notice our comparison is only 12 compute hours a day for EC2. Imagine if we punched in numbers for 24 hours a day. We could easily cover all our costs and then some. I do not like this 'cost of administrators' thing. Why not factor in the cost of developers? After all, you can't have software to install on servers without someone developing it. And without admins, devs are going to have to do the installs! BTW, are you implying that cloud machines do not need administration? Other people have pointed out you still need ops people in the cloud. Racking and prepping servers, which EC2 does "magically", is not as complex or time-dependent as they pretend it is. Our guy who does desktop support periodically racks servers, maybe one or two trips a month to the DC. As for spinning up spot instances to test something: I write Java code, so my stuff runs everywhere by default :) There are pre-built VMware and QEMU images. I do not need EC2 to do that. That is a bit out of scope for this conversation anyway. We are talking about the real cost of running a 20-node Hadoop cluster.

Posted by Edward on April 21, 2012 at 09:32 AM EDT #

Nice analysis. I wrote up something along the same lines (http://blog.rapleaf.com/dev/2008/12/10/rent-or-own-amazon-ec2-vs-colocation-comparison-for-hadoop-clusters/) all the way back in 2008 when we were deciding where to run our Hadoop cluster, and got the same answer. If you know your hardware requirements, I think you'll always be able to buy and maintain physical hardware cheaper than you could get from the cloud. But as people are quick to point out, the real benefit of EC2 is not cheapness, it's the scalability. You have to ask yourself whether your application can wait 2-6 weeks while new servers are shipped. If the answer is yes, then great. If not, then you should be glad that you can pay a premium to make scaling instant.

Posted by Bryan Duxbury on April 23, 2012 at 12:19 PM EDT #

Why not make it a more apples-to-apples comparison and use 20 m2.2xlarge instances utilized 100%? That way, you get 850G per instance for your storage and effectively halve your cost/month ($2586.57), plus you get 24/7 usage if you want.

Posted by Apples and Oranges on April 23, 2012 at 01:23 PM EDT #

This is like my personal pet peeve. Why don't more people realize that EC2 is insanely expensive?

Posted by Kevin Burton on April 25, 2012 at 06:38 PM EDT #

My Response. Agree and Disagree. http://www.jonzobrist.com/2012/04/27/my-response-to-edward-capriolos-myth-busters-ops-edition-is-ec2-is-less-expensive-then-running-your-own-gear/

Posted by Jon Zobrist on April 27, 2012 at 03:00 PM EDT #

Nice analysis. One thing is not clear to me: what is the "80K upfront" cost for the EC2 instances? Is it the price for "Reserved Instances"? I am currently evaluating buy-vs-rent for a Hadoop cluster, but contrary to this article, Amazon comes out cheaper even over 5 years according to my calculations.

Rent from Amazon:
$3K per month - http://calculator.s3.amazonaws.com/calc5.html?key=calc-5F392419-5BEF-48FD-B3EE-6B2E07EE2443
For 5 years - 180K

Own:
Dell R620 rack servers - $2K each (picking the minimum: 2GB RAM, 250GB storage)
50 servers - 50*2000 = 100K
Maintenance/year = 24K
Total for 5 years - 100K + 120K (24*5) = 220K

Am I missing something here? Understanding the upfront cost would probably help me figure this out better.

Posted by Himanish Kushary on September 21, 2012 at 10:47 AM EDT #

