Friday Jul 16, 2010

Func & Hadoop: the end of start-all.sh


If you follow the stock Hadoop docs for setting up a cluster, they suggest creating ssh keys from your name node or management node to your other nodes so that it can start up the datanode and task tracker components. Hadoop itself does not use ssh keys other than for this startup. This article will show another tactic.

For reference, the way to start a Hadoop daemon locally is (use stop in place of start to stop it):

cd /home/edward/hadoop/hadoop-0.20.2
bin/hadoop-daemon.sh start namenode

If you want to run in the foreground:

cd /home/edward/hadoop/hadoop-0.20.2
bin/hadoop namenode

Note: "Edward Capriolo hates using SSH KEYS for software interaction" 

Why do I dislike ssh keys? I have heard and seen "horror stories" of chains of computers linked together by ssh keys, where one computer auto-logs into the next to move a file or run a remote command. Come on, get real! If you have a process like this, make your own protocol, use a message queue, or use RPC.

Hadoop does not use ssh keys in any way other than for the start-all scripts. These scripts leave a lot to be desired: they do not provide a basic way to start or stop one node, they have no 'status all' switch, and they cannot call start only when a daemon is not already running.
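The "only call start if not already running" behavior can be approximated on a single node with a pid-file check. This is a minimal sketch, not something shipped with Hadoop: the helper names is_running and start_if_stopped are made up for illustration, and the pid file path in the example is an assumption (hadoop-daemon.sh writes its pid file under HADOOP_PID_DIR, which I believe defaults to /tmp, so check your own setup).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Hadoop.

# is_running PIDFILE: succeed if PIDFILE names a live process.
is_running() {
  pidfile=$1
  [ -f "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  [ -n "$pid" ] || return 1
  # kill -0 sends no signal; it only checks that the process exists.
  # It also fails for processes owned by another user, so run this
  # as the user that owns the daemon.
  kill -0 "$pid" 2>/dev/null
}

# start_if_stopped PIDFILE COMMAND...: run COMMAND only if nothing is running.
start_if_stopped() {
  pidfile=$1
  shift
  if is_running "$pidfile"; then
    echo "already running"
  else
    "$@"
  fi
}

# Example (assumed pid file path; adjust to your HADOOP_PID_DIR):
# cd /home/edward/hadoop/hadoop-0.20.2
# start_if_stopped "/tmp/hadoop-$USER-namenode.pid" bin/hadoop-daemon.sh start namenode
```

A stale pid file left behind by a crashed daemon is treated as "not running", so the start command goes ahead in that case.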

Enter func  https://fedorahosted.org/func/

In a nutshell, func is "for ( computer : list of computer) { run command on computer } "
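That loop can be written out literally in shell. This is only a sketch of the idea, not how func is implemented: for_each_host and show_host are hypothetical names, and the runner argument stands in for whatever transport actually reaches the machine.

```shell
#!/bin/sh
# for_each_host PATTERN RUNNER HOST...
# Calls "RUNNER HOST" for every HOST whose name matches the glob PATTERN.
for_each_host() {
  pattern=$1
  runner=$2
  shift 2
  for host in "$@"; do
    case $host in
      # Unquoted on purpose so the shell treats it as a glob pattern.
      $pattern) "$runner" "$host" ;;
    esac
  done
}

# A stand-in runner that only prints what it would do.
show_host() {
  echo "would run command on $1"
}

# Example: only had01 and had02 match 'had0*'.
# for_each_host 'had0*' show_host had01.hadoop.pvt had02.hadoop.pvt had10.hadoop.pvt
```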

Func uses client/server certificate management with a CA, so it is more advanced than, say, just throwing public keys into /root/.ssh/authorized_keys. Func does not have to run as root either.

As I mention all the time, I have a private RPM repo. It is uber awesome. Find these RPMs and throw them in there.

certmaster-0.24-1.el5.noarch.rpm
func-0.24-1.el5.noarch.rpm
pyOpenSSL-0.6-1.p24.7.2.2.x86_64.rpm
python-simplejson-2.0.5-1.el5.rf.x86_64.rpm

Follow the func docs to get the master/cert server set up. If you are clever you can have func and puppet share the same CA (I am not, so I did not).

chkconfig certmaster on

Now we go about installing funcd everywhere we want it. Great news: we already have puppet, so this is a breeze.

vi /etc/puppet/manifests/func.pp

class funcserver{
  package { ["pyOpenSSL","python-simplejson","certmaster","func"]: ensure => installed }
  file { "/etc/certmaster/minion.conf" :
    owner => root,
    group => root,
    path => "/etc/certmaster/minion.conf",
    source => "puppet:///mainfiles/func/minion.conf" ,
    require => Package[certmaster]
  }
  service {
    funcd:
      ensure => true,
      enable => true,
      subscribe => [ File["/etc/certmaster/minion.conf"], Package[func] ] ,
      require =>  File["/etc/certmaster/minion.conf"]
  }
}

As you can see, we are installing the func packages and using puppet to start the service and ensure that it is always running. The only customization we need is to push our minion.conf onto the clients (this tells them where the func server is):

[main]
certmaster = funcserver.jointhegrid.com
log_level = DEBUG
cert_dir = /etc/pki/certmaster

Push func out with puppet

node 'had01.hadoop.pvt','had02.hadoop.pvt','had03.hadoop.pvt','had04.hadoop.pvt','had05.hadoop.pvt',
     'had06.hadoop.pvt', 'had07.hadoop.pvt' ,'had08.hadoop.pvt', 'had09.hadoop.pvt', 'had10.hadoop.pvt',
     'had11.hadoop.pvt', 'had12.hadoop.pvt', 'had13.hadoop.pvt', 'had14.hadoop.pvt', 'had15.hadoop.pvt',
     'had16.hadoop.pvt', 'had17.hadoop.pvt', 'had18.hadoop.pvt', 'had19.hadoop.pvt', 'had20.hadoop.pvt'

{
  include hadoop_tasktracker_cleanup,standardserver,hadoop_prod_conf
  include funcserver
}

Let's test to see what puppet is going to do. Notice the --noop:

[root@had01 ~]# puppetd --server xxxxxxx --waitforcert 60 --test --noop
info: Caching catalog for had01.hadoop.pvt
info: Applying configuration version '1279293057'
notice: //funcserver/Package[func]/ensure: is absent, should be present (noop)
info: //funcserver/Package[func]: Scheduling refresh of Service[funcd]
notice: //funcserver/Package[certmaster]/ensure: is absent, should be present (noop)
notice: //funcserver/File[/etc/certmaster/minion.conf]/ensure: is absent, should be file (noop)
info: //funcserver/File[/etc/certmaster/minion.conf]: Scheduling refresh of Service[funcd]
notice: //funcserver/Service[funcd]/ensure: is stopped, should be running (noop)
notice: //funcserver/Service[funcd]: Would have triggered refresh from 2 dependencies
notice: Finished catalog run in 5.53 seconds

Looks good to me! Now run it for real, without --noop:

puppetd --server xxxxxxx --waitforcert 60 --test

Take a lunch break

Seriously, you deserve it, you're so f'in smart. Soon you are going to be able to run commands across your network without ever leaving your machine. Go get yourself a $10 coffee and toast your brilliance.

Signing clients

So after your coffee, all your clients should have func installed on them. You now need to sign the certs.

Signing certs

# certmaster-ca --list
had01.hadoop.pvt

# certmaster-ca --sign had01.hadoop.pvt
/var/lib/certmaster/certmaster/csrs/had01.hadoop.pvt.csr signed - cert located at /var/lib/certmaster/certmaster/certs/had01.hadoop.pvt.cert
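With twenty nodes, signing one request at a time gets old quickly. Here is a minimal sketch of a loop over the pending requests, assuming certmaster-ca --list prints one hostname per line as it does above. The helper name sign_all_pending is hypothetical, and the CERTMASTER_CA variable exists only so the command can be swapped out for a dry run. Since blindly signing everything defeats the point of reviewing requests, the function only signs hostnames that match a pattern you pass in.

```shell
#!/bin/sh
# Command to call; override for a dry run, e.g. CERTMASTER_CA="echo certmaster-ca"
CERTMASTER_CA=${CERTMASTER_CA:-certmaster-ca}

# sign_all_pending PATTERN
# Signs each pending request whose hostname matches the glob PATTERN.
sign_all_pending() {
  pattern=$1
  $CERTMASTER_CA --list | while read -r host; do
    case $host in
      # stdin is redirected so the sign command cannot eat the host list.
      $pattern) $CERTMASTER_CA --sign "$host" </dev/null ;;
    esac
  done
}

# Example:
# sign_all_pending 'had*.hadoop.pvt'
```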

Using func
 

# func "had*.hadoop.pvt" call command run "df -h /"
('had01.hadoop.pvt',
 [0,
  'Filesystem            Size  Used Avail Use% Mounted on\n/dev/sda2              95G   12G   79G  13% /\n',
  ''])
('had02.hadoop.pvt',
 [0,
  'Filesystem            Size  Used Avail Use% Mounted on\n/dev/sda2              95G  8.2G   82G  10% /\n',
  ''])
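The same call command run form is what replaces start-all.sh: point it at hadoop-daemon.sh instead of df. The sketch below only builds and prints the func command line so you can look at it before anything runs. hadoop_func_cmd is a hypothetical helper, and the install path is the one from the top of this article, which is assumed to be identical on every node.

```shell
#!/bin/sh
# Assumed to be the same on every node.
HADOOP_HOME=${HADOOP_HOME:-/home/edward/hadoop/hadoop-0.20.2}

# hadoop_func_cmd TARGETS ACTION DAEMON
# Prints the func command that would run hadoop-daemon.sh on the targeted nodes.
hadoop_func_cmd() {
  targets=$1
  action=$2   # start or stop
  daemon=$3   # e.g. datanode or tasktracker
  printf 'func "%s" call command run "%s/bin/hadoop-daemon.sh %s %s"\n' \
    "$targets" "$HADOOP_HOME" "$action" "$daemon"
}

# Example: print, review, then pipe to sh to actually run it.
# hadoop_func_cmd 'had*.hadoop.pvt' start datanode
# hadoop_func_cmd 'had05.hadoop.pvt' stop tasktracker
```

One thing to check before using this for real: the daemon starts as whatever user funcd runs as, so make sure that matches the user that owns your Hadoop directories.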

Conclusion

So now we can manage our systems remotely and specify targets with * expressions!

Hints

Do not run "rm -rf /"
