Trying to find a fit for YARN and Mesos
I have been following the development of YARN and Mesos and have done some tinkering over the past few months. If you have never heard of these projects, you can get some information here:
http://nosql.mypopescu.com/post/27840903966/hadoop-yarn-beyond-mapreduce
https://github.com/mesos/mesos/wiki/
I have a good conceptual understanding of what can be done with these projects, but I have some trouble fitting them into my current infrastructure. There is no one specific reason; it is more a question of 'What can this thing do that Puppet cannot?'.
I look at YARN and I see a tool with one use: standing up three revs of Hadoop on the same hardware, mainly because the migration path off a release is ugly. This is something I can already do with configuration management.
I look at the Mesos examples and I see a container that 'starts a jail, installs http and installs ha-proxy'. Again, something I can already do with configuration management.
Maybe I have just been standing up clusters for too long so everything looks the same to me, but in my own head I have trouble sorting these things out. The big questions are:
1. Can a technology like YARN or Mesos be used together with Puppet or Chef?
   - What are the best practices when using these two things together?
2. In YARN's case, how many current software packages can YARN manage outside Hadoop?
   - MPI?
   - Then what?
3. Aren't YARN/Mesos just sneaky forms of DevOps/NoOps?
4. With clusters spinning up and falling on command, how do we monitor this environment and guarantee quality of service?
5. Couldn't AWS/OpenStack do this on a more general scale?
6. Shouldn't we just all be using Solaris zones?
Thinking deeper on #6 (Solaris zones): one of the things about Solaris is that they spent a lot of time making a virtualized environment. They spent time on resource controls: controlling RAM, sockets, and open files per process. Currently, AFAIK, there is no support in the mainline Linux kernel for sharing/limiting disk IO like Solaris has. When I look at YARN, the only resource constraint I see is units of memory.
How are these platforms supposed to be successful when quality of service is an afterthought? Let's say you use YARN to spin up HBase or Cassandra and want low latency. Then, randomly, a map task lands on the machine and crushes the node. Just putting a cap on memory is not going to help, because the map task is crushing your IO subsystem and degrading your service. This is like bringing the noisy-neighbors problem home to your private cloud.
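To make the noisy-neighbor point concrete, here is a toy sketch of an admission check that only looks at memory next to one that also looks at disk IO. This is not YARN's or Mesos's API: the `Task` and `Node` classes, the `disk_iops` field, and all the numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    mem_mb: int
    disk_iops: int  # hypothetical IO demand; a memory-only check never reads it


@dataclass
class Node:
    mem_mb: int
    disk_iops: int  # hypothetical IO capacity of the node
    tasks: list = field(default_factory=list)

    def mem_used(self):
        return sum(t.mem_mb for t in self.tasks)

    def iops_used(self):
        return sum(t.disk_iops for t in self.tasks)


def fits_memory_only(node, task):
    # Admission based purely on units of memory.
    return node.mem_used() + task.mem_mb <= node.mem_mb


def fits_memory_and_io(node, task):
    # Same check, plus a cap on disk IO.
    io_ok = node.iops_used() + task.disk_iops <= node.disk_iops
    return fits_memory_only(node, task) and io_ok


# A node already running a latency-sensitive service (made-up numbers).
node = Node(mem_mb=16384, disk_iops=1000)
node.tasks.append(Task("hbase-regionserver", mem_mb=8192, disk_iops=600))

# An IO-heavy map task that is small in memory terms.
map_task = Task("map-task", mem_mb=2048, disk_iops=900)

memory_only_admits = fits_memory_only(node, map_task)
io_aware_admits = fits_memory_and_io(node, map_task)
print(memory_only_admits, io_aware_admits)
```

With these made-up numbers the memory-only check lets the map task onto the node, while the IO-aware check turns it away. An admission check like this only helps if something underneath can actually enforce the IO limit, which is exactly the piece I do not see.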
Posted at 11:52AM Aug 10, 2012 by edwardcapriolo in General