Hadoop 20

WARNING! This page is for an older version of Hadoop. For newer versions, please visit Hadoop Release 3 Installation

Please note: This documentation is for OSG 1.2. While we still provide critical security updates for OSG Software 1.2, we recommend you use OSG Software 3 for any new or updated installations. We are considering May 31, 2013 as possible OSG 1.2 End of Life (EOL).

Owner JeffDost
Area Storage
Role SysAdmin
Type Installation


Purpose: This document provides Hadoop-based SE administrators with the information needed to prepare, install, and validate the SE.

Conventions used in this document:

A User Command Line is illustrated by a green box that displays a prompt:

  [user@client ~]$

A Root Command Line is illustrated by a red box that displays the root prompt:

  [root@client ~]$

Lines in a file are illustrated by a yellow box that displays the desired lines in a file:

priorities=1

Preparation

Introduction

Hadoop Distributed File System (HDFS) is a scalable, reliable distributed file system developed in the Apache project. It is based on the map-reduce framework and the design of the Google File System. The VDT distribution of Hadoop includes all components needed to operate a multi-terabyte storage site: HDFS itself, a FUSE client, a GridFTP server, the BeStMan2 SRM server, and the Gratia transfer and storage probes.

The VDT packaging and distribution of Hadoop is based on YUM. All components are packaged as RPMs. Two YUM repositories are available: a stable repository and a testing repository for the Integration Testbed (ITB).

The stable YUM repository is enabled by default through the osg-hadoop-20 RPM and contains the golden release supported by OSG for LHC operations.

VDT Downloads webpage

The VDT Downloads webpage is http://vdt.cs.wisc.edu/components/hadoop.html

VDT Release notes webpage

The VDT Release notes are available at http://vdt.cs.wisc.edu/hadoop/release-notes.html

Note on upgrading from Hadoop 0.19

If you already have a working Hadoop 0.19 system and would like to upgrade to 0.20, please get familiar with this document first and then proceed to follow the upgrade instructions here.

Architecture

This diagram shows the suggested topology and distribution of services at a Hadoop site. Major service components and modules which need to be deployed on the various nodes are listed. Please use this as a recommendation to prepare for the Hadoop deployment procedure at your site.

HELP NOTE
Throughout this document it will be stated which node the relevant installation instructions apply to. It can apply to one of the following:
  • Namenode
  • Datanode
  • Secondary Namenode
  • GridFTP node (can be installed on the same machine as a Datanode)
  • SRM node

Engineering Considerations

Please read the planning document to understand different components of the system.

Help!

Total installation time should not, on average, exceed 8 to 24 man-hours. If your site needs further assistance to expedite the installation, please email osg-storage@opensciencegrid.org and osg-hadoop@opensciencegrid.org.

Installation Procedure

Main server components can be divided into 3 categories: the core Hadoop/HDFS services (namenode, secondary namenode, and datanodes), the GridFTP transfer service, and the BeStMan2 SRM service.

Main client components are FUSE and the Hadoop command line client.

Initializer RPM

HELP NOTE
This must be done on all nodes

Initializing the YUM Repository

Download and install the osg-hadoop-20 RPM on all nodes. This will initialize the OSG YUM repository for Hadoop.

[root@client ~]$ rpm -Uvh http://vdt.cs.wisc.edu/hadoop/osg-hadoop-20-3.el5.noarch.rpm

This initializes YUM repository configuration in /etc/yum.repos.d/osg-hadoop.repo.

Choosing Stable or ITB Repository

HELP NOTE
For Integration Testbed (ITB) Sites:
  • By default, Stable Repository is enabled (enabled=1) in the YUM configuration. Production sites should use the default setting.
  • ITB sites doing testing can enable the Testing Repository to fetch pre-release packages.

Simply set enabled=0 in the [hadoop] section and enabled=1 in the [hadoop-testing] section of /etc/yum.repos.d/osg-hadoop.repo.

YUM Repository types in /etc/yum.repos.d/osg-hadoop.repo

Production Sites:
[hadoop]
... ...
enabled=1
... ...

[hadoop-testing]
... ...
enabled=0
... ...

[hadoop-unstable]
... ...
enabled=0
... ...
Integration Sites:
[hadoop]
... ...
enabled=0
... ...

[hadoop-testing]
... ...
enabled=1
... ...

[hadoop-unstable]
... ...
enabled=0
... ...
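
Either way, you can confirm which Hadoop repositories YUM considers enabled with a quick check, for example:

[root@client ~]$ yum repolist | grep -i hadoop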

Installing Hadoop

HELP NOTE
Hadoop must be installed on the following nodes:
  • Namenode
  • Datanode
  • Secondary Namenode

HELP NOTE
On the following nodes Hadoop must also be installed but the Hadoop service itself does not need to be started:
  • GridFTP node
  • SRM node

Prerequisites

Hadoop will run anywhere that Java is supported (including Solaris). However, these instructions are for Red Hat 5 derivatives (including Scientific Linux) because of the RPM-based installation.

The HDFS prerequisites are:

  • A minimum of 1 headnode (the namenode), although 2 are recommended (the namenode and the secondary namenode).
  • At least 1 node to hold data, preferably at least 2. Most sites will have 20 to 200 datanodes.
  • The namenode and secondary namenode are not datanodes.
  • A working Yum and RPM installation on every system.
  • The fuse kernel module and fuse-libs.
  • A Java RPM. If Java isn't already installed, we supply the Oracle JDK 1.6.0 RPM and it will come in as a dependency. The Oracle JDK is currently the only JDK supported by OSG, so we highly recommend you use the version supplied.

Compatibility Note: Versions of OpenAFS greater than 1.4.1 and less than 1.4.7 create nameless groups on Linux; these groups confuse Hadoop and prevent its components from starting up successfully. If you plan to install Hadoop on a Linux OpenAFS client, make sure you are running at least OpenAFS 1.4.7.

Note: The rpm/yum installation will create a 'hadoop' system account and group (uid,gid < 500) on the host system for running the datanode services. If you would like to control the uid/gid that is used, then you should create the 'hadoop' user and group manually before installing the rpms.
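
For example, a minimal sketch of creating the account ahead of time (the uid/gid value 4870 below is only a placeholder; choose values appropriate for your site):

[root@client ~]$ groupadd -g 4870 hadoop
[root@client ~]$ useradd -u 4870 -g hadoop hadoop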

Installation

The Hadoop init script assumes that you are not running multiple hadoop services (datanode, namenode, secondary namenode) on the same host.

The only node that requires a FUSE mount is the SRM node. However, the hadoop-0.20-osg rpm requires the fuse and fuse-libs packages to be installed. If you are using RHEL >= 5.4 this requirement is met and they will be brought in as dependencies. Otherwise you must find these packages for your platform or refer to Notes on Building a FUSE Module in the Troubleshooting section below.

To install hadoop, run:

[root@client ~]$ yum install hadoop-0.20-osg

Configuration

The Hadoop RPMs install files into the standard system locations. The following table highlights some of the more interesting locations, and documents whether you might ever want to edit them.

File Type Location Needs editing?
Log files /var/log/hadoop/* No
PID files /var/run/hadoop/*.pid No
init scripts /etc/init.d/hadoop, /etc/init.d/hadoop-firstboot No
init script config file /etc/sysconfig/hadoop Yes
runtime config files /etc/hadoop/conf/* Maybe
System binaries /usr/bin/hadoop No
JARs /usr/lib/hadoop/* No

Edit /etc/sysconfig/hadoop

The most common site configuration settings can be changed in /etc/sysconfig/hadoop. In most cases, this file will be identical on the namenode and datanodes. The configuration settings are documented in the file itself, but we document some of the most commonly edited ones in the table below:

Option Name Needs editing? Suggested value
HADOOP_NAMENODE Yes The host name of your namenode; should match the output of 'hostname -s' on the namenode server
HADOOP_NAMEPORT Yes 9000
HADOOP_SECONDARY_NAMENODE Yes The host name of the secondary namenode; should match the output of 'hostname -s'
HADOOP_CHECKPOINT_DIRS Yes Comma-separated (important: no spaces around the commas!) list of directories to store checkpoints on. The safest configuration is to store 2 checkpoints locally on 2 block devices and 1 checkpoint on an NFS server. At least 1 checkpoint directory is required.
HADOOP_CHECKPOINT_PERIOD Yes The time, in seconds, between checkpoints. 600 is suggested for small sites
HADOOP_REPLICATION_DEFAULT Yes Default number of replications. Suggested: 2
HADOOP_REPLICATION_MIN Yes Minimum number of replications; below this, an error will be thrown. Suggested: 1 or 2.
HADOOP_REPLICATION_MAX Yes Maximum number of replications. Suggested: 512
HADOOP_GANGLIA_ADDRESS Maybe Hostname or IP of your Ganglia gmetad. If left empty, hadoop will try to extract the gmetad address from /etc/gmond.conf. If you aren't using Ganglia, just leave it blank.
HADOOP_DATADIR Yes The base directory where HDFS temp and management data will be written. On datanodes this is usually the parent of the first data partition. It is safe to leave this empty for client-only installations.
HADOOP_DATA Yes A comma-separated list of directories (no spaces!) where the HDFS data blocks will be stored. The first one is typically the same as $HADOOP_DATADIR/data. It is safe to leave this empty for client-only installations.
HADOOP_USER Maybe The username that the hadoop datanode daemons will run under. Suggested: hadoop
HADOOP_NAMENODE_HEAP Maybe The Java heap size for the namenode; bigger is better, but the node shouldn't swap. Minimum: 2048m. Suggested: 8192m
HADOOP_MIN_DATANODE_SIZE Maybe A value in GB; if the data directory is smaller than this size, HDFS will refuse to start. Safeguards against starting the datanode daemon on non-datanodes. Suggested: 300 (this value will vary widely with your datanode size). Set to zero or an empty string to bypass this check.
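
As an illustration of the settings in the table above, a hypothetical two-headnode site might end up with /etc/sysconfig/hadoop entries along these lines (every hostname, directory, and size below is a placeholder to adapt to your site):

HADOOP_NAMENODE=hadoop-name
HADOOP_NAMEPORT=9000
HADOOP_SECONDARY_NAMENODE=hadoop-name2
HADOOP_CHECKPOINT_DIRS=/data1/hadoop/checkpoint,/data2/hadoop/checkpoint
HADOOP_CHECKPOINT_PERIOD=600
HADOOP_REPLICATION_DEFAULT=2
HADOOP_REPLICATION_MIN=1
HADOOP_REPLICATION_MAX=512
HADOOP_DATADIR=/data1/hadoop
HADOOP_DATA=/data1/hadoop/data
HADOOP_USER=hadoop
HADOOP_NAMENODE_HEAP=8192m
HADOOP_MIN_DATANODE_SIZE=300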

After making changes to the file, you must run:

[root@client ~]$ service hadoop-firstboot start

This propagates the changes to the hadoop configuration files in /etc/hadoop and must be run every time you make changes to /etc/sysconfig/hadoop.

NOTE: If you just installed Hadoop for the first time, you must log in/out of your shell or source /etc/profile.d/hadoop.sh before you try playing with the command line tools.

Upgrade note: Configuration files will be saved with a .rpmsave extension whenever you update your hadoop rpms with rpm or yum. Make sure to copy your settings from /etc/sysconfig/hadoop.rpmsave back to /etc/sysconfig/hadoop after such an update. Any manual changes to the hadoop configuration files in /etc/hadoop/ should be preserved during an upgrade, but may be overwritten when running hadoop-firstboot.

Side topic: Multiple data directories on a datanode.

Hadoop has the ability to store data in multiple directories on a datanode. This can be useful if you have multiple drives on your datanode and don't want to run them in a raid array, or if you have multiple large storage volumes mounted on your datanode. To configure a datanode to use multiple directories, you need to enter each directory in the HADOOP_DATA setting in /etc/sysconfig/hadoop as a comma-separated list of directories (no spaces!) and then run service hadoop-firstboot start. Here is an example of a datanode with 4 storage directories:

HADOOP_DATA=/data1/hadoop/data,/data2/hadoop/data,/data3/hadoop/data,/data4/hadoop/data

Running Hadoop

The Hadoop rpms install a startup script in /etc/init.d/hadoop. The same command is used to start hadoop services on a datanode, namenode, or secondary namenode:

[root@client ~]$ service hadoop start

You will also want to configure hadoop to start at boot time with:

[root@client ~]$  chkconfig hadoop on

Side topic: Client-only installation

Sometimes it is handy to configure a node to be a client, that is, a system that has access to hadoop but will not serve as a datanode or namenode. The installation and configuration for such a node is the same as above, except that you do not need to start any hadoop services with /etc/init.d/hadoop. It is still necessary to modify /etc/sysconfig/hadoop, but it is not necessary to specify any datanode directories in HADOOP_DATA.

FUSE

A FUSE mount is only required on the SRM node and on any other node where you would like to use standard POSIX-like commands on the Hadoop filesystem. If these cases don't apply, you may skip to the Hadoop Validation section.

HELP NOTE
Before using FUSE you may need to add the module using modprobe first:

[root@client ~]$ modprobe fuse

Mounting FUSE at Boot Time

You can mount FUSE by adding the following line to /etc/fstab (Be sure to change the /mnt/hadoop mount point and namenode.host to match your local configuration. To match the help documents, we recommend using /mnt/hadoop as your mountpoint):

hdfs# /mnt/hadoop fuse server=namenode.host,port=9000,rdbuffer=131072,allow_other 0 0

Alternatively, this can be taken care of automatically when running hadoop-firstboot if you set the following line in your /etc/sysconfig/hadoop file:

HADOOP_UPDATE_FSTAB=1

Once your /etc/fstab is updated, to mount FUSE run:

[root@client ~]$ mount /mnt/hadoop

When mounting the HDFS FUSE mount, you will see the following harmless warnings printed to the screen:

# mount /mnt/hadoop
port=32767,server=(
fuse-dfs didn't recognize /mnt/hadoop,-2
fuse-dfs ignoring option allow_other
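
Once mounted, a quick sanity check is simply to list the mount point, for example:

[user@client ~]$ ls /mnt/hadoop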

If you have trouble mounting FUSE, refer to Running FUSE in Debug Mode in the Troubleshooting section.

Validation

The first thing you may want to do after installing and starting your Namenode is to verify that the web interface works. In your web browser go to:

http://namenode.hostname:50070/dfshealth.jsp

Get familiar with Hadoop commands. Run hadoop with no arguments to see the list of commands.

[user@client ~]$ hadoop
 
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  fetchdt              fetch a delegation token from the NameNode
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl>   copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest>   create a hadoop archive
  oiv                  apply the offline fsimage viewer to an fsimage
  classpath            prints the class path needed to get the Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

For a list of supported filesystem commands:

[user@client ~]$ hadoop fs
 
Usage: java FsShell
           [-ls <path>]
           [-lsr <path>]
           [-df [<path>]]
           [-du <path>]
           [-dus <path>]
           [-count[-q] <path>]
           [-mv <src> <dst>]
           [-cp <src> <dst>]
           [-rm [-skipTrash] <path>]
           [-rmr [-skipTrash] <path>]
           [-expunge]
           [-put <localsrc> ... <dst>]
           [-copyFromLocal <localsrc> ... <dst>]
           [-moveFromLocal <localsrc> ... <dst>]
           [-get [-ignoreCrc] [-crc] <src> <localdst>]
           [-getmerge <src> <localdst> [addnl]]
           [-cat <src>]
           [-text <src>]
           [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
           [-moveToLocal [-crc] <src> <localdst>]
           [-mkdir <path>]
           [-setrep [-R] [-w] <rep> <path/file>]
           [-touchz <path>]
           [-test -[ezd] <path>]
           [-stat [format] <path>]
           [-tail [-f] <file>]
           [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
           [-chown [-R] [OWNER][:[GROUP]] PATH...]
           [-chgrp [-R] GROUP PATH...]
           [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

An online guide is also available in the Apache Hadoop commands manual. You can use these Hadoop commands to perform filesystem operations directly on HDFS.

Example, to look into the internal hadoop namespace:

[user@client ~]$ hadoop fs -ls /
Found 1 items
drwxrwxr-x   - engage engage          0 2011-07-25 06:32 /engage

Example, to adjust ownership of filesystem areas (there is usually no need to specify the mount itself /mnt/hadoop in Hadoop commands):

[root@client ~]$ hadoop fs -chown -R engage:engage /engage

Example, compare hadoop fs command vs. using FUSE mount:

[user@client ~]$ hadoop fs -ls /engage
Found 3 items
-rw-rw-r--   2 engage engage  733669376 2011-06-15 16:55 /engage/CentOS-5.6-x86_64-LiveCD.iso
-rw-rw-r--   2 engage engage  215387183 2011-06-15 16:28 /engage/condor-7.6.1-x86_rhap_5-stripped.tar.gz
-rw-rw-r--   2 engage engage    9259360 2011-06-15 16:32 /engage/glideinWMS_v2_5_1.tgz

[user@client ~]$ ls -l /mnt/hadoop/engage
total 935855
-rw-rw-r-- 1 engage engage 733669376 Jun 15 16:55 CentOS-5.6-x86_64-LiveCD.iso
-rw-rw-r-- 1 engage engage 215387183 Jun 15 16:28 condor-7.6.1-x86_rhap_5-stripped.tar.gz
-rw-rw-r-- 1 engage engage   9259360 Jun 15 16:32 glideinWMS_v2_5_1.tgz

Creating VO and User filesystem areas

Prior to starting basic day-to-day operations, it is important to create dedicated areas for each VO and/or user. This is similar to user management in simple UNIX filesystems.

HELP NOTE
Create (and maintain) usernames and groups with UIDs and GIDs on all nodes. These are maintained in basic system files such as /etc/passwd and /etc/group.
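
For example, a hypothetical account for the engage VO used in the examples below might be created like this (the uid/gid value 9400 is only a placeholder; keep it identical on every node):

[root@client ~]$ groupadd -g 9400 engage
[root@client ~]$ useradd -u 9400 -g engage engage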

HELP NOTE
In the examples below it is assumed that a FUSE mount is set to /mnt/hadoop. As an alternative, hadoop fs commands could have been used.

For clean HDFS operations and filesystem management:

(a) Create top-level VO subdirectories under /mnt/hadoop.

Example:

[root@client ~]$ mkdir /mnt/hadoop/cms
[root@client ~]$ mkdir /mnt/hadoop/dzero
[root@client ~]$ mkdir /mnt/hadoop/sbgrid
[root@client ~]$ mkdir /mnt/hadoop/fermigrid
[root@client ~]$ mkdir /mnt/hadoop/cmstest
[root@client ~]$ mkdir /mnt/hadoop/osg

(b) Create individual top-level user areas, under each VO area, as needed.

[root@client ~]$ mkdir -p /mnt/hadoop/cms/store/user/tanyalevshina
[root@client ~]$ mkdir -p /mnt/hadoop/cms/store/user/michaelthomas
[root@client ~]$ mkdir -p /mnt/hadoop/cms/store/user/brianbockelman
[root@client ~]$ mkdir -p /mnt/hadoop/cms/store/user/douglasstrain
[root@client ~]$ mkdir -p /mnt/hadoop/cms/store/user/abhisheksinghrana

(c) Adjust username:group ownership of each area.

[root@client ~]$ chown -R cms:cms /mnt/hadoop/cms
[root@client ~]$ chown -R sam:sam /mnt/hadoop/dzero

[root@client ~]$ chown -R michaelthomas:cms /mnt/hadoop/cms/store/user/michaelthomas

Installing GridFTP

HELP NOTE
GridFTP must be installed on the GridFTP node

Prerequisites

  1. Install the Hadoop RPM on your GridFTP node, edit /etc/sysconfig/hadoop, and verify your installation
  2. We assume your site is running a sufficiently recent GUMS server (version 1.3 or newer); grid-mapfiles are not currently tested or supported.

The GridFTP server for Hadoop can be very memory-hungry, up to 500MB/transfer in the default configuration. You should plan accordingly to provision enough GridFTP servers to handle the bandwidth that your site can support.

The installation includes the latest CA Certificates package from the OSG as well as the fetch-crl CRL updater. NOTE the fetch-crl service does not start by default after installing GridFTP. To have fetch-crl update automatically, run:

[root@client ~]$ service fetch-crl-cron start

cron will check for CRL updates every 6 hours. If this is your first time installing you may need to run it immediately:

[root@client ~]$ /usr/sbin/fetch-crl -r 20  -a 24 --quiet

Note: You do not need FUSE mounted on GridFTP nodes.

Installation

To install gridftp-hdfs server, run:

[root@client ~]$ yum install gridftp-hdfs

Updates can be installed with:

[root@client ~]$ yum upgrade gridftp-hdfs

Configuration

The installation of gridftp-hdfs and its dependencies creates several directories. In addition to the Hadoop installation files, you will also find:

Log files /var/log/gridftp-auth.log, /var/log/gridftp.log
xinetd files /etc/xinetd.d/gridftp-hdfs
runtime config files /etc/gridftp-hdfs/*
System binaries /usr/bin/gridftp-hdfs*
System libraries /usr/lib64/libglobus_gridftp_server_hdfs.so*
GUMS client (called LCMAPS) configuration /etc/lcmaps/lcmaps.db
CA certificates /etc/grid-security/certificates/*

lcmaps.db is provided by the globus-mapping-osg package.

gridftp-hdfs reads the Hadoop configuration file to learn how to talk to Hadoop. As per the prerequisites section, you should have already edited /etc/sysconfig/hadoop and run service hadoop-firstboot start. If you did not follow the directions, please do that now.

It is not necessary to start any Hadoop services with service hadoop start if you are running a dedicated GridFTP server (that is, no datanode or namenode services will be run on the host).

In /etc/lcmaps/lcmaps.db you will need to enter the URL for your GUMS server, as well as the path to your host certificate and key:

             "--endpoint https://gums.hostname:8443/gums/services/GUMSXACMLAuthorizationServicePort"

The default settings in /etc/gridftp-hdfs/*.conf should be ok for most installations. The file gridftp-inetd.conf is used by the xinetd service for starting up the GridFTP server. The file gridftp.conf is used by /usr/bin/gridftp-hdfs-standalone for starting up the GridFTP server in a testing mode. gridftp-hdfs-local.conf contains additional site-specific environment variables that are used by the gridftp-hdfs dsi module in both the xinetd and standalone GridFTP server. Some of the environment variables that can be used in gridftp-hdfs-local.conf include:

Option Name Needs Editing? Suggested value
GRIDFTP_HDFS_REPLICA_MAP No File containing a list of paths and replica values for setting the default # of replicas for specific file paths
GRIDFTP_BUFFER_COUNT No The number of 1MB memory buffers used to reorder data streams before writing them to Hadoop
GRIDFTP_FILE_BUFFER_COUNT No The number of 1MB file-based buffers used to reorder data streams before writing them to Hadoop
GRIDFTP_SYSLOG No Set this to 1 if you want to send transfer activity data to syslog (only used for the HadoopViz application)
GRIDFTP_HDFS_MOUNT_POINT Maybe The location of the FUSE mount point used during the Hadoop installation. Defaults to /mnt/hadoop. This is needed so that gridftp-hdfs can convert fuse paths on the incoming URL to native Hadoop paths. Note: this does not imply you need FUSE mounted on GridFTP nodes!
GRIDFTP_LOAD_LIMIT No GridFTP will refuse to start new transfers if the load on the GridFTP host is higher than this number; defaults to 20.
TMPDIR Maybe The temp directory where the file-based buffers are stored. Defaults to /tmp.

gridftp-hdfs-local.conf is also a good place to increase per-process resource limits. For example, many installations will require more than the default number of open files (ulimit -n).
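
As an illustration only, and assuming gridftp-hdfs-local.conf is sourced as a shell script (all values below are placeholders, not recommendations), a site-specific file might look something like:

export GRIDFTP_HDFS_MOUNT_POINT=/mnt/hadoop
export GRIDFTP_LOAD_LIMIT=20
export TMPDIR=/data1/gridftp-tmp
# raise the per-process open file limit for busy GridFTP servers
ulimit -n 32768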

Running GridFTP

If you were not already running the xinetd service (by default it is not installed on RHEL5), then you will need to start it with the command:

[root@client ~]$ service xinetd restart

Otherwise, the gridftp-hdfs service should be configured to run automatically as soon as the installation is finished.

Validation

HELP NOTE
The commands used to verify GridFTP below assume you have access to a node where you can first generate a valid proxy using voms-proxy-init or grid-proxy-init. Obtaining grid credentials is beyond the scope of this document.

[user@client ~]$ globus-url-copy file:///home/users/jdost/test.txt gsiftp://devg-7.t2.ucsd.edu:2811/mnt/hadoop/engage/test.txt

If you are having trouble running GridFTP, refer to Starting GridFTP in Standalone Mode in the Troubleshooting section.

Installing BeStMan2

HELP NOTE
BeStMan2 must be installed on the SRM node

Prerequisites

  1. Make sure FUSE is installed and mounted on the SRM node.
  2. A GridFTP-HDFS server must also be installed, but it does not need to be on the same node as the BeStMan2 server. Larger sites will prefer to have their GridFTP and BeStMan2 servers installed on separate hosts.
  3. In addition to the Java jdk you need the corresponding Java sun-compat package. For example for jdk-1.6.0 you need to install java-1.6.0-sun-compat. If you installed the jdk rpm that we supplied you can just let java-1.6.0-sun-compat come in as a dependency. Otherwise you need to find and manually install the correct version before continuing. See the jpackage installation doc for more details.
  4. CA Certificates installed in /etc/grid-security/certificates.

BeStMan2 is preconfigured to look for the host certificate and key in /etc/grid-security/http/http*.pem. These files must exist and be owned by the bestman user. Using certificates in a different directory or with different names is not supported.

NOTE: The names are misleading. You must copy your hostcert.pem and hostkey.pem into the /etc/grid-security/http/ directory as httpcert.pem and httpkey.pem, respectively. Actual http service certificates and keys will NOT work.
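
For example, assuming your host certificate and key live in the standard /etc/grid-security location (the chown step must wait until the bestman2-server rpm has created the bestman account, and the chmod values simply follow the usual certificate/key conventions):

[root@client ~]$ mkdir -p /etc/grid-security/http
[root@client ~]$ cp /etc/grid-security/hostcert.pem /etc/grid-security/http/httpcert.pem
[root@client ~]$ cp /etc/grid-security/hostkey.pem /etc/grid-security/http/httpkey.pem
[root@client ~]$ chown bestman:bestman /etc/grid-security/http/http*.pem
[root@client ~]$ chmod 444 /etc/grid-security/http/httpcert.pem
[root@client ~]$ chmod 400 /etc/grid-security/http/httpkey.pem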

NOTE: This rpm no longer brings in the OSG CA Certificates package or the fetch-crl CRL updater as dependencies. However, we still provide the packages in the repo. If you would like them, run:

[root@client ~]$ yum install osg-ca-certs

[root@client ~]$ yum install fetch-crl

NOTE: If you choose not to install them, you must install the CA certificates in another way. BeStMan2 still assumes you have them located in /etc/grid-security/certificates.

NOTE the fetch-crl service does not start by default on installation. To have fetch-crl update automatically, run:

[root@client ~]$ service fetch-crl-cron start

cron will check for CRL updates every 6 hours. If this is your first time installing you may need to run it immediately:

[root@client ~]$ /usr/sbin/fetch-crl -r 20  -a 24 --quiet

The rpm/yum installation will create a 'bestman' system account and group (uid,gid < 500) on the host system for running the BeStMan2 SRM process. If you would like to control the uid/gid that is used, then you should create the 'bestman' user and group manually before installing the rpms.
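
As with the hadoop account earlier, a minimal sketch (the uid/gid value 4871 is only a placeholder):

[root@client ~]$ groupadd -g 4871 bestman
[root@client ~]$ useradd -u 4871 -g bestman bestman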

Installation

[root@client ~]$ yum install bestman2-server

Updates can be installed with:

[root@client ~]$ yum upgrade bestman2-server

Configuration

Those familiar with the VDT installation of BeStMan will know about the configure_bestman script for configuring the BeStMan server. This script is not supported or included in the RPM package. Certain operations that you would normally do with configure_bestman, such as changing the certificate location, are not supported.

The installation of BeStMan2 and its dependencies creates several directories. In addition to the Hadoop installation files, you will also find:

Log files /var/log/bestman2
main config file /etc/bestman2/conf/bestman2.rc
other runtime config files /etc/bestman2/conf/*
BeStMan2 lib files /usr/share/java/bestman2/
init.d startup script /etc/init.d/bestman2

BeStMan2 SRM uses the Hadoop FUSE mount to perform namespace operations, such as mkdir, rm, and ls. As per the Hadoop install instructions, edit /etc/sysconfig/hadoop and run service hadoop-firstboot start. It is not necessary (or even recommended) to start any hadoop services with service hadoop start.

The BeStMan2 SRM configuration file is located in /etc/bestman2/conf/bestman2.rc. There are a few settings that you need to add or change manually, depending on your site configuration:

supportedProtocolList=gsiftp://your.gridftp.server1:2811;gsiftp://your.gridftp.server2:2811
GUMSserviceURL=https://your.gums.host:8443/gums/services/GUMSAuthorizationServicePort
localPathListAllowed=/mnt/hadoop;/tmp

BeStMan2 uses sudo to perform changes to the filesystem namespace. This ensures that directories get created and files get removed with the proper permissions. You must manually add the required permissions: append the following to the end of the /etc/sudoers file with the visudo command:

Cmnd_Alias SRM_CMD = /bin/rm, /bin/mkdir, /bin/rmdir, /bin/mv, /bin/ls 
Runas_Alias SRM_USR = ALL, !root 
bestman ALL=(SRM_USR) NOPASSWD:SRM_CMD

If you are running SL5, comment out the following line in /etc/sudoers:

Defaults    requiretty

With this option enabled, BeStMan2 will be unable to use sudo because it does not use a console.

Running BeStMan2

Start the BeStMan2 SRM server with the command

[root@client ~]$ service bestman2 start

To start BeStMan2 SRM automatically at boot time:

[root@client ~]$ chkconfig bestman2 on

Validation

HELP NOTE
The commands used to verify BeStMan2 below assume you have access to a node where you can first generate a valid proxy using voms-proxy-init or grid-proxy-init. Obtaining grid credentials is beyond the scope of this document.

Check SRM server ping response:

[user@client ~]$ srm-ping srm://devg-1.t2.ucsd.edu:8443/srm/v2/server
srm-ping   2.2.1.3.18    Mon Dec 20 20:16:15 PST 2010
BeStMan and SRM-Clients Copyright(c) 2007-2010,
Lawrence Berkeley National Laboratory. All rights reserved.
Support at SRM@LBL.GOV and documents at http://sdm.lbl.gov/bestman
SRM-CLIENT: Connecting to serviceurl httpg://devg-1.t2.ucsd.edu:8443/srm/v2/server

SRM-PING: Mon Jul 25 06:35:16 PDT 2011  Calling SrmPing Request...
versionInfo=v2.2

Extra information (Key=Value)
backend_type=BeStMan
backend_version=2.2.2.0.13
backend_build_date=2011-06-27T21:13:48.000Z 
gsiftpTxfServers[0]=gsiftp://devg-7.t2.ucsd.edu:2811
GatewayMode=Enabled
clientDN=/DC=org/DC=doegrids/OU=People/CN=Jeffrey M. Dost 948199
gumsIDMapped=engage

Check SRM based remote directory listing:

[user@client ~]$ lcg-ls -l -b -D srmv2 srm://devg-1.t2.ucsd.edu:8443/srm/v2/server?SFN=/mnt/hadoop/engage
----------   1     2     2 733669376              UNKNOWN /mnt/hadoop/engage/CentOS-5.6-x86_64-LiveCD.iso
----------   1     2     2 215387183              UNKNOWN /mnt/hadoop/engage/condor-7.6.1-x86_rhap_5-stripped.tar.gz
----------   1     2     2 9259360              UNKNOWN /mnt/hadoop/engage/glideinWMS_v2_5_1.tgz
----------   1     2     2      45              UNKNOWN /mnt/hadoop/engage/test.txt

Check SRM copy using GridFTP underneath:

[user@client ~]$ lcg-cp -v -b -D srmv2 file:/home/users/jdost/test2.txt srm://devg-1.t2.ucsd.edu:8443/srm/v2/server?SFN=/mnt/hadoop/engage/test2.txt
Using grid catalog type: UNKNOWN
Using grid catalog : (null)
VO name: Engage
Checksum type: None
Destination SE type: SRMv2
Destination SRM Request Token: put:2
Source URL: file:/home/users/jdost/test2.txt
File size: 59
Source URL for copy: file:/home/users/jdost/test2.txt
Destination URL: gsiftp://devg-7.t2.ucsd.edu:2811//mnt/hadoop/engage/test2.txt
# streams: 1
           59 bytes      0.04 KB/sec avg      0.04 KB/sec inst
Transfer took 3010 ms

[user@client ~]$ lcg-ls -l -b -D srmv2 srm://devg-1.t2.ucsd.edu:8443/srm/v2/server?SFN=/mnt/hadoop/engage
----------   1     2     2 733669376              UNKNOWN /mnt/hadoop/engage/CentOS-5.6-x86_64-LiveCD.iso
----------   1     2     2 215387183              UNKNOWN /mnt/hadoop/engage/condor-7.6.1-x86_rhap_5-stripped.tar.gz
----------   1     2     2 9259360              UNKNOWN /mnt/hadoop/engage/glideinWMS_v2_5_1.tgz
----------   1     2     2      45              UNKNOWN /mnt/hadoop/engage/test.txt
----------   1     2     2      59              UNKNOWN /mnt/hadoop/engage/test2.txt

Installing Gratia Transfer Probe

HELP NOTE
The Gratia Transfer Probe must be installed on the GridFTP node

Prerequisites

  1. GridFTP is installed and working

The Gratia probe requires the file osg-user-vo-map.txt to exist and be up to date. We provide an rpm package that takes care of creating and updating this file as needed. To get it run:

[root@client ~]$ yum install osg-user-vo-map-cron

It will pull in the gums-client rpm as a dependency. Details on setting up osg-user-vo-map-cron are below.

Installation

To install the Gratia Transfer Probe, run:

[root@client ~]$ yum install gratia-probe-gridftp-transfer

Updates can be installed with:

[root@client ~]$ yum upgrade gratia-probe-gridftp-transfer

Configuration

This RPM does not use Linux-standard file locations. Here are the most relevant file and directory locations:

Purpose Needs Editing? Location
Probe Configuration Yes /opt/vdt/gratia/probe/gridftp-transfer/ProbeConfig
Probe Executables No /opt/vdt/gratia/probe/gridftp-transfer
Log files No /opt/vdt/gratia/var/logs
Temporary files No /opt/vdt/gratia/var/tmp
Gums configuration Yes /etc/gums/gums-client.properties

The RPM installs the Gratia probe into the system crontab, but does not configure it. The configuration of the probe is controlled by the file

/opt/vdt/gratia/probe/gridftp-transfer/ProbeConfig

This is usually one XML node spread over multiple lines. Note that comments (#) have no effect on this file. You will need to edit the following:

Attribute Needs Editing Value
ProbeName Maybe This should be set to "gridftp-transfer:<hostname>", where <hostname> is the fully-qualified domain name of your gridftp host.
CollectorHost Maybe Set to the hostname and port of the central collector. By default it sends to the OSG collector. See below.
SiteName Yes Set to the resource group name of your site as registered in OIM.
GridftpLogDir Yes Set to /var/log, or wherever your current gridftp logs are located
Grid Maybe Set to "ITB" if this is a test resource; otherwise, leave as OSG.
UserVOMapFile Yes Set to the location of your osg-user-vo-map.txt; see below for information about this file.
SuppressUnknownVORecords Maybe Set to 1 to suppress any records that can't be matched to a VO; 0 is strongly recommended.
SuppressNoDNRecords Maybe Set to 1 to suppress records that can't be matched to a DN; 0 is strongly recommended.
EnableProbe Yes Set to 1 to enable the probe.
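
As an illustration, after editing, the relevant attributes inside ProbeConfig might read something like the following fragment (the hostname and site name are placeholders):

    ProbeName="gridftp-transfer:gridftp.example.edu"
    CollectorHost="gratia-osg-transfer.opensciencegrid.org:80"
    SiteName="Example_SE"
    GridftpLogDir="/var/log"
    Grid="OSG"
    UserVOMapFile="/etc/grid-security/osg-user-vo-map.txt"
    SuppressUnknownVORecords="0"
    SuppressNoDNRecords="0"
    EnableProbe="1"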

Selecting a collector host

The collector is the central server which logs the GridFTP transfers into a database. There are usually three options:

  1. OSG Transfer Collector: This is the primary collector for transfers in the OSG. Use CollectorHost="gratia-osg-transfer.opensciencegrid.org:80".
  2. OSG-ITB Transfer Collector: This is the test collector for transfers in the OSG. Use CollectorHost="gratia-osg-transfer.opensciencegrid.org:8881".
  3. Site local collector: If your site has set up its own collector, then your admin will be able to give you an endpoint to use. Typically, this is along the lines of CollectorHost="collector.example.com:8880".

Generating osg-user-vo-map.txt

The osg-user-vo-map.txt is a simple, space-separated format that contains 2 columns; the first is a unix username and the second is the VO that the username corresponds to. In order to create it you must install the osg-user-vo-map-cron rpm as mentioned above. Once osg-user-vo-map-cron is installed, you need to configure the gums client.

The primary configuration file for the gums-client utilities is located in /etc/gums/gums-client.properties. The two properties that you must change are:

Attribute Needs Editing Value
gums.location Yes This should be set to the admin URL for your gums server, usually of the form gums.location=https://GUMS_HOSTNAME:8443/gums/services/GUMSAdmin
gums.authz Yes This should be set to the authorization interface URL for your gums server, usually of the form gums.authz=https://GUMS_HOSTNAME:8443/gums/services/GUMSXACMLAuthorizationServicePort
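
For example, with a hypothetical GUMS host named gums.example.edu, the two properties would read:

gums.location=https://gums.example.edu:8443/gums/services/GUMSAdmin
gums.authz=https://gums.example.edu:8443/gums/services/GUMSXACMLAuthorizationServicePort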

After osg-user-vo-map-cron is installed and the gums client is configured, osg-user-vo-map.txt should be created in the following location:

/etc/grid-security/osg-user-vo-map.txt

Make sure the UserVOMapFile field is set to this location in

/opt/vdt/gratia/probe/gridftp-transfer/ProbeConfig

Without osg-user-vo-map.txt, all gridftp transfers will show up as belonging to the VO "Unknown".

Validation

Run the Gratia probe once by hand to check for functionality:

[root@client ~]$ /opt/vdt/gratia/probe/gridftp-transfer/gridftp-transfer_meter.cron.sh

Look for any abnormal termination and report it if it is a non-trivial site issue. Look in the log files in /opt/vdt/gratia/var/logs/<date>.log and make sure there are no error messages printed.

Installing Hadoop Storage Probe

HELP NOTE
The Hadoop Storage Probe must be installed on the Namenode

Installation

[root@client ~]$ yum install gratia-probe-hadoop-storage

Updates can be installed with:

[root@client ~]$ yum upgrade gratia-probe-hadoop-storage

Configuration

This RPM does not use Linux-standard file locations. Here are the most relevant file and directory locations:

Purpose Needs Editing? Location
Probe Configuration Yes /opt/vdt/gratia/probe/hadoop-storage/ProbeConfig
Probe Executable No /opt/vdt/gratia/probe/hadoop-storage/hadoop_storage_probe
Log files No /opt/vdt/gratia/var/logs
Temporary files No /opt/vdt/gratia/var/tmp

The RPM installs the Gratia probe into the system crontab, but does not configure it. The configuration of the probe is controlled by two files

/opt/vdt/gratia/probe/hadoop-storage/ProbeConfig
/opt/vdt/gratia/probe/hadoop-storage/storage.cfg

ProbeConfig

This is usually one XML node spread over multiple lines. Note that comments (#) have no effect on this file. You will need to edit the following:

Attribute Needs Editing Value
CollectorHost Maybe Set to the hostname and port of the central collector. By default it sends to the OSG collector. You probably do not want to change it.
SiteName Yes Set to the resource group name of your SE as registered in OIM.
Grid Maybe Set to "ITB" if this is a test resource; otherwise, leave as OSG.
EnableProbe Yes Set to 1 to enable the probe.
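
As with the transfer probe, the edited attributes might look like the following fragment of ProbeConfig (the site name is a placeholder):

    SiteName="Example_SE"
    Grid="OSG"
    EnableProbe="1"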

storage.cfg

This file controls which paths in HDFS should be monitored. This is in the Windows INI format.

For each logical "area" (arbitrarily defined by you), specify a name and the list of paths that belong to that area. Unix globs are accepted.

To configure an area named "CMS /store" that monitors the space usage in the paths /user/cms/store/*, one would add the following to the storage.cfg file.

[Area CMS /store]
Name = CMS /store
Path = /user/cms/store/*
Trim = /user/cms

For each such area, add a section to your configuration file.

Example file

Below is a configuration file that includes three distinct areas. Note that you shouldn't have to touch the [Gratia] section if you edited the ProbeConfig above:

[Gratia]
gratia_location = /opt/vdt/gratia
Storage.ProbeConfig = %(gratia_location)s/probe/hadoop-storage/ProbeConfig

[Area /store]
Name = CMS /store
Path = /store/*

[Area /store/user]
Name = CMS /store/user
Path = /store/user/*

[Area /user]
Name = Hadoop /user
Path = /user/*

Installing Hadoop Storage Reports (Optional)

HELP NOTE
The Hadoop Storage Reports may be installed on any node that has access to a local Gratia Collector

The Hadoop storage reports provide a daily report on the status and usage of your SE. This serves as a handy tool for both site administrators and site executives. An example report is included at the end of this guide.

Prerequisites

  1. A working HDFS installation
  2. A local Gratia Collector installed
  3. A Hadoop Storage Probe installed and configured to point to the local Gratia Collector

Installation

[root@client ~]$ yum install GratiaReporting

Updates can be installed with:

[root@client ~]$ yum upgrade GratiaReporting

Configuration

This RPM uses Linux-standard file locations. Here are the most relevant file and directory locations:

Purpose Needs Editing? Location
Report Configuration Yes /etc/gratia_reporting
Cron template Yes /etc/gratia_reporting/gratia_reporting.cron (move to /etc/cron.d)
Logging Configuration No /etc/gratia_reporting/logging.cfg
Log files No /var/log/gratia_reporting.log

Configuration file

Copy the file /etc/gratia_reporting/reporting.cfg to a new filename in /etc/gratia_reporting (for example, /etc/gratia_reporting/reporting_cms.cfg). You will do this once for every report you want to send out.
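
For example, to create a CMS-specific report configuration:

[root@client ~]$ cp /etc/gratia_reporting/reporting.cfg /etc/gratia_reporting/reporting_cms.cfg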

Attribute Needs Editing Value
SiteName Yes Set to the resource group name of your SE as registered in OIM.
database Maybe Set to the database section containing the login details for your Gratia Collector (a few non-functioning example sections are included). Installing a Gratia Collector is covered here, but ask around on osg-hadoop: Nebraska will usually run these reports for you if requested.
toNames Yes Python list for the "to names" for the report email.
toEmails Yes Python list for the "to emails" for the report email.
smtphost Maybe Hostname of a SMTP server that accepts email from this host.
fromName Maybe Set to the "from name" for the report email.
fromEmail Maybe Set to the "from email" for the report email.

Cron

Copy the file /etc/gratia_reporting/gratia_reporting.cron to /etc/cron.d. There is one line per report; comment out all except the hadoop report. It is the line containing -n hadoop. Update the line to point at your new configuration file.

Sample report

This is a sample report from the Nebraska HDFS instance.

============================================================
  The Hadoop Chronicle | 85 % | 2009-09-25
============================================================

--------------------
| Global Storage   |
-----------------------------------------------------
|                  |  Today  | Yesterday | One Week |
-----------------------------------------------------
| Total Space (GB) | 311,470 |   357,818 |  368,711 |
| Free Space (GB)  |  47,304 |    93,719 |  128,391 |
| Used Space (GB)  | 264,166 |   264,100 |  240,320 |
| Used Percentage  |     85% |       74% |      65% |
-----------------------------------------------------
--------------
| CMS /store |
-------------------------------------------------------------------------------------------------------------------------------------
|           Path           | Size(GB) | 1 Day Change | 7 Day Change | Remaining | # Files | 1 Day Change | 7 Day Change | Remaining |
-------------------------------------------------------------------------------------------------------------------------------------
| /store/user              |      771 |            0 | UNKNOWN      | NO QUOTA  |   4,859 |            0 | UNKNOWN      | NO QUOTA  |
| /store/mc                |   95,865 |         -353 | UNKNOWN      | NO QUOTA  |  86,830 |         -171 | UNKNOWN      | NO QUOTA  |
| /store/test              |        0 |            0 | UNKNOWN      | NO QUOTA  |     569 |           25 | UNKNOWN      | NO QUOTA  |
| /store/results           |      237 |            0 | UNKNOWN      | NO QUOTA  |     198 |            0 | UNKNOWN      | NO QUOTA  |
| /store/phedex_monarctest |      729 |            0 | UNKNOWN      | NO QUOTA  |     257 |            0 | UNKNOWN      | NO QUOTA  |
| /store/unmerged          |    3,681 |            3 | UNKNOWN      | NO QUOTA  |  35,687 |           23 | UNKNOWN      | NO QUOTA  |
| /store/CSA07             |        0 |            0 | UNKNOWN      | NO QUOTA  |       0 |            0 | UNKNOWN      | NO QUOTA  |
| /store/data              |        0 |            0 | UNKNOWN      | NO QUOTA  |       0 |            0 | UNKNOWN      | NO QUOTA  |
| /store/PhEDEx_LoadTest07 |        0 |          -21 | UNKNOWN      | NO QUOTA  |       1 |          -22 | UNKNOWN      | NO QUOTA  |
-------------------------------------------------------------------------------------------------------------------------------------

-------------------
| CMS /store/user |
----------------------------------------------------------------------------------------------------------------------------------
|          Path         | Size(GB) | 1 Day Change | 7 Day Change | Remaining | # Files | 1 Day Change | 7 Day Change | Remaining |
----------------------------------------------------------------------------------------------------------------------------------
| /store/user/hpi       |        0 |            0 | UNKNOWN      |     1,099 |      15 |            0 | UNKNOWN      |     9,985 |
| /store/user/gattebury |        0 |            0 | UNKNOWN      |     1,100 |       1 |            0 | UNKNOWN      |     9,999 |
| /store/user/mkirn     |        0 |            0 | UNKNOWN      |     1,100 |       3 |            0 | UNKNOWN      |     9,997 |
| /store/user/spadhi    |       12 |            0 | UNKNOWN      |     1,062 |   1,114 |            0 | UNKNOWN      |     8,886 |
| /store/user/creed     |        0 |            0 | UNKNOWN      |     1,100 |       0 |            0 | UNKNOWN      |    10,000 |
| /store/user/rossman   |        0 |            0 | UNKNOWN      |     1,099 |       5 |            0 | UNKNOWN      |     9,995 |
| /store/user/eluiggi   |        0 |            0 | UNKNOWN      |     1,099 |       6 |            0 | UNKNOWN      |     9,994 |
| /store/user/ewv       |        7 |            0 | UNKNOWN      |     1,081 |     284 |            0 | UNKNOWN      |     9,716 |
| /store/user/test      |        0 |            0 | UNKNOWN      | NO QUOTA  |     167 |            0 | UNKNOWN      |     9,833 |
| /store/user/schiefer  |      751 |            0 | UNKNOWN      |     1,044 |   3,264 |            0 | UNKNOWN      |     6,736 |
----------------------------------------------------------------------------------------------------------------------------------

----------------
| Hadoop /user |
----------------------------------------------------------------------------------------------------------------------------------
|       Path      | Size(GB) | 1 Day Change | 7 Day Change | Remaining | # Files | 1 Day Change | 7 Day Change |    Remaining    |
----------------------------------------------------------------------------------------------------------------------------------
| /user/djbender  |        0 |            0 | UNKNOWN      | NO QUOTA  |       1 |            0 | UNKNOWN      | NO QUOTA        |
| /user/lhcb      |        0 |            0 | UNKNOWN      |        54 |       0 |            0 | UNKNOWN      | NO QUOTA        |
| /user/dzero     |      897 |            0 | UNKNOWN      |       347 |  89,376 |            0 | UNKNOWN      |         410,624 |
| /user/bloom     |      454 |            0 | UNKNOWN      | NO QUOTA  |   1,410 |            0 | UNKNOWN      | NO QUOTA        |
| /user/uscms01   |  101,384 |         -362 | UNKNOWN      | NO QUOTA  | 129,739 |         -141 | UNKNOWN      | NO QUOTA        |
| /user/cdf       |        0 |            0 | UNKNOWN      | NO QUOTA  |       6 |            0 | UNKNOWN      | 536,870,911,994 |
| /user/osg       |        1 |            0 | UNKNOWN      | NO QUOTA  |       3 |            0 | UNKNOWN      |   5,368,709,117 |
| /user/dweitzel  |       20 |            0 | UNKNOWN      | NO QUOTA  |   2,282 |            0 | UNKNOWN      | NO QUOTA        |
| /user/gattebury |        5 |            0 | UNKNOWN      | NO QUOTA  |  10,002 |            0 | UNKNOWN      | NO QUOTA        |
| /user/brian     |       72 |            0 | UNKNOWN      | NO QUOTA  |   2,697 |            0 | UNKNOWN      | NO QUOTA        |
| /user/usatlas   |        0 |            0 | UNKNOWN      | NO QUOTA  |       0 |            0 | UNKNOWN      | NO QUOTA        |
| /user/powers    |        1 |            1 | UNKNOWN      | NO QUOTA  |     211 |          211 | UNKNOWN      | NO QUOTA        |
| /user/ifisk     |        0 |            0 | UNKNOWN      | NO QUOTA  |       1 |            0 | UNKNOWN      | NO QUOTA        |
| /user/gpn       |      261 |           -5 | UNKNOWN      |     1,360 |   3,805 |            1 | UNKNOWN      |         996,195 |
| /user/engage    |      461 |          367 | UNKNOWN      | NO QUOTA  |      16 |           13 | UNKNOWN      |         999,984 |
| /user/clundst   |        0 |            0 | UNKNOWN      | NO QUOTA  |       6 |            0 | UNKNOWN      | NO QUOTA        |
| /user/che       |        0 |            0 | UNKNOWN      | NO QUOTA  |      13 |            0 | UNKNOWN      | NO QUOTA        |
| /user/store     |        0 |            0 | UNKNOWN      | NO QUOTA  |       0 |            0 | UNKNOWN      | NO QUOTA        |
| /user/dteam     |        0 |            0 | UNKNOWN      |        53 |      18 |            0 | UNKNOWN      | NO QUOTA        |
| /user/root      |        0 |            0 | UNKNOWN      | NO QUOTA  |       1 |            0 | UNKNOWN      | NO QUOTA        |
----------------------------------------------------------------------------------------------------------------------------------

-------------
| FSCK Data |
-------------
 Total size:	114592906796932 B (Total open files size: 38923141120 B)
 Total dirs:	41293
 Total files:	295431 (Files currently being written: 38)
 Total blocks (validated):	1356788 (avg. block size 84458962 B) (Total open file blocks (not validated): 297)
 Minimally replicated blocks:	1356788 (100.0 %)
 Over-replicated blocks:	1 (7.370348E-5 %)
 Under-replicated blocks:	0 (0.0 %)
 Mis-replicated blocks:		0 (0.0 %)
 Default replication factor:	3
 Average block replication:	2.2943976
 Corrupt blocks:		0
 Missing replicas:		0 (0.0 %)
 Number of data-nodes:		101
 Number of racks:		1
The filesystem under path '/' is HEALTHY

Troubleshooting

Hadoop

To view all of the currently configured settings of Hadoop from the web interface, enter the following url in your browser:

http://namenode.hostname:50070/conf

You will see the entire configuration in XML format, for example:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration>
<property><!--Loaded from core-default.xml--><name>fs.s3n.impl</name><value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.cache.levels</name><value>2</value></property>
<property><!--Loaded from mapred-default.xml--><name>map.sort.class</name><value>org.apache.hadoop.util.QuickSort</value></property>
<property><!--Loaded from core-site.xml--><name>hadoop.tmp.dir</name><value>/data1/hadoop//scratch</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.native.lib</name><value>true</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.namenode.decommission.nodes.per.interval</name><value>5</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.https.need.client.auth</name><value>false</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.client.idlethreshold</name><value>4000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.system.dir</name><value>${hadoop.tmp.dir}/mapred/system</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.data.dir.perm</name><value>755</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.persist.jobstatus.hours</name><value>0</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.namenode.logging.level</name><value>all</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.address</name><value>0.0.0.0:50010</value></property>
<property><!--Loaded from core-default.xml--><name>io.skip.checksum.errors</name><value>false</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.block.access.token.enable</name><value>false</value></property>
<property><!--Loaded from Unknown--><name>fs.default.name</name><value>hdfs://nagios.t2.ucsd.edu:9000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.child.tmp</name><value>./tmp</value></property>
<property><!--Loaded from core-default.xml--><name>fs.har.impl.disable.cache</name><value>true</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.skip.reduce.max.skip.groups</name><value>0</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.safemode.threshold.pct</name><value>0.999f</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.heartbeats.in.second</name><value>100</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.namenode.handler.count</name><value>40</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.blockreport.initialDelay</name><value>0</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.jobtracker.instrumentation</name><value>org.apache.hadoop.mapred.JobTrackerMetricsInst</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.dns.nameserver</name><value>default</value></property>
<property><!--Loaded from mapred-default.xml--><name>io.sort.factor</name><value>10</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.timeout</name><value>600000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.max.tracker.failures</name><value>4</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.rpc.socket.factory.class.default</name><value>org.apache.hadoop.net.StandardSocketFactory</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.jobhistory.lru.cache.size</name><value>5</value></property>
<property><!--Loaded from core-default.xml--><name>fs.hdfs.impl</name><value>org.apache.hadoop.hdfs.DistributedFileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.skip.map.auto.incr.proc.count</name><value>true</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.block.access.key.update.interval</name><value>600</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.job.complete.cancel.delegation.tokens</name><value>true</value></property>
<property><!--Loaded from core-default.xml--><name>io.mapfile.bloom.size</name><value>1048576</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.reduce.shuffle.connect.timeout</name><value>180000</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.safemode.extension</name><value>30000</value></property>
<property><!--Loaded from mapred-site.xml--><name>tasktracker.http.threads</name><value>50</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.shuffle.merge.percent</name><value>0.66</value></property>
<property><!--Loaded from core-default.xml--><name>fs.ftp.impl</name><value>org.apache.hadoop.fs.ftp.FTPFileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.output.compress</name><value>false</value></property>
<property><!--Loaded from core-site.xml--><name>io.bytes.per.checksum</name><value>4096</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.healthChecker.script.timeout</name><value>600000</value></property>
<property><!--Loaded from core-default.xml--><name>topology.node.switch.mapping.impl</name><value>org.apache.hadoop.net.ScriptBasedMapping</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.https.server.keystore.resource</name><value>ssl-server.xml</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.reduce.slowstart.completed.maps</name><value>0.05</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.reduce.max.attempts</name><value>4</value></property>
<property><!--Loaded from core-default.xml--><name>fs.ramfs.impl</name><value>org.apache.hadoop.fs.InMemoryFileSystem</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.block.access.token.lifetime</name><value>600</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.skip.map.max.skip.records</name><value>0</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.name.edits.dir</name><value>${dfs.name.dir}</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.security.group.mapping</name><value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.persist.jobstatus.dir</name><value>/jobtracker/jobsInfo</value></property>
<property><!--Loaded from core-site.xml--><name>hadoop.log.dir</name><value>/var/log/hadoop</value></property>
<property><!--Loaded from core-default.xml--><name>fs.s3.buffer.dir</name><value>${hadoop.tmp.dir}/s3</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.block.size</name><value>134217728</value></property>
<property><!--Loaded from mapred-default.xml--><name>job.end.retry.attempts</name><value>0</value></property>
<property><!--Loaded from core-default.xml--><name>fs.file.impl</name><value>org.apache.hadoop.fs.LocalFileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.output.compression.type</name><value>RECORD</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.local.dir.minspacestart</name><value>0</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.ipc.address</name><value>0.0.0.0:50020</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.permissions</name><value>true</value></property>
<property><!--Loaded from core-default.xml--><name>topology.script.number.args</name><value>100</value></property>
<property><!--Loaded from core-default.xml--><name>io.mapfile.bloom.error.rate</name><value>0.005</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.max.tracker.blacklists</name><value>4</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.profile.maps</name><value>0-2</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.https.address</name><value>0.0.0.0:50475</value></property>
<property><!--Loaded from core-site.xml--><name>dfs.umaskmode</name><value>002</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.userlog.retain.hours</name><value>24</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.secondary.http.address</name><value>gratia-1:50090</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.replication.max</name><value>32</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.persist.jobstatus.active</name><value>false</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.security.authorization</name><value>false</value></property>
<property><!--Loaded from core-default.xml--><name>local.cache.size</name><value>10737418240</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.min.split.size</name><value>0</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.namenode.delegation.token.renew-interval</name><value>86400000</value></property>
<property><!--Loaded from mapred-site.xml--><name>mapred.map.tasks</name><value>7919</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.child.java.opts</name><value>-Xmx200m</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.https.client.keystore.resource</name><value>ssl-client.xml</value></property>
<property><!--Loaded from Unknown--><name>dfs.namenode.startup</name><value>REGULAR</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.queue.name</name><value>default</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.https.address</name><value>0.0.0.0:50470</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.balance.bandwidthPerSec</name><value>2000000000</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.server.listen.queue.size</name><value>128</value></property>
<property><!--Loaded from mapred-default.xml--><name>job.end.retry.interval</name><value>30000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.inmem.merge.threshold</name><value>1000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.skip.attempts.to.start.skipping</name><value>2</value></property>
<property><!--Loaded from hdfs-site.xml--><name>fs.checkpoint.dir</name><value>/var/hadoop/checkpoint-a</value></property>
<property><!--Loaded from mapred-site.xml--><name>mapred.reduce.tasks</name><value>1543</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.merge.recordsBeforeProgress</name><value>10000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.userlog.limit.kb</name><value>0</value></property>
<property><!--Loaded from core-default.xml--><name>webinterface.private.actions</name><value>false</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.max.objects</name><value>0</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.shuffle.input.buffer.percent</name><value>0.70</value></property>
<property><!--Loaded from mapred-default.xml--><name>io.sort.spill.percent</name><value>0.80</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.map.tasks.speculative.execution</name><value>true</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.util.hash.type</name><value>murmur</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.dns.nameserver</name><value>default</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.blockreport.intervalMsec</name><value>3600000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.map.max.attempts</name><value>4</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.job.acl-view-job</name><value> </value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.handler.count</name><value>10</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.client.block.write.retries</name><value>3</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.max.reduces.per.node</name><value>-1</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.reduce.shuffle.read.timeout</name><value>180000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.expiry.interval</name><value>600000</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.https.enable</name><value>false</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.jobtracker.maxtasks.per.job</name><value>-1</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.jobtracker.job.history.block.size</name><value>3145728</value></property>
<property><!--Loaded from mapred-default.xml--><name>keep.failed.task.files</name><value>false</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.failed.volumes.tolerated</name><value>0</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.profile.reduces</name><value>0-2</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.client.tcpnodelay</name><value>false</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.output.compression.codec</name><value>org.apache.hadoop.io.compress.DefaultCodec</value></property>
<property><!--Loaded from mapred-default.xml--><name>io.map.index.skip</name><value>0</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.server.tcpnodelay</name><value>false</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.namenode.delegation.key.update-interval</name><value>86400000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.running.map.limit</name><value>-1</value></property>
<property><!--Loaded from mapred-default.xml--><name>jobclient.progress.monitor.poll.interval</name><value>1000</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.default.chunk.view.size</name><value>32768</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.logfile.size</name><value>10000000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.reduce.tasks.speculative.execution</name><value>true</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.tasktracker.outofband.heartbeat</name><value>false</value></property>
<property><!--Loaded from core-default.xml--><name>fs.s3n.block.size</name><value>67108864</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.datanode.du.reserved</name><value>10000000000</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.security.authentication</name><value>simple</value></property>
<property><!--Loaded from hdfs-site.xml--><name>fs.checkpoint.period</name><value>3600</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.running.reduce.limit</name><value>-1</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.reuse.jvm.num.tasks</name><value>1</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.web.ugi</name><value>webuser,webgroup</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.jobtracker.completeuserjobs.maximum</name><value>100</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.df.interval</name><value>60000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.tracker.task-controller</name><value>org.apache.hadoop.mapred.DefaultTaskController</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.data.dir</name><value>/data1/hadoop//data</value></property>
<property><!--Loaded from core-default.xml--><name>fs.s3.maxRetries</name><value>4</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.dns.interface</name><value>default</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.support.append</name><value>true</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.job.acl-modify-job</name><value> </value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.local.dir</name><value>${hadoop.tmp.dir}/mapred/local</value></property>
<property><!--Loaded from core-default.xml--><name>fs.hftp.impl</name><value>org.apache.hadoop.hdfs.HftpFileSystem</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.permissions.supergroup</name><value>root</value></property>
<property><!--Loaded from core-default.xml--><name>fs.trash.interval</name><value>0</value></property>
<property><!--Loaded from core-default.xml--><name>fs.s3.sleepTimeSeconds</name><value>10</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.submit.replication</name><value>10</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.replication.min</name><value>1</value></property>
<property><!--Loaded from core-default.xml--><name>fs.har.impl</name><value>org.apache.hadoop.fs.HarFileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.map.output.compression.codec</name><value>org.apache.hadoop.io.compress.DefaultCodec</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.dns.interface</name><value>default</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.namenode.decommission.interval</name><value>30</value></property>
<property><!--Loaded from Unknown--><name>dfs.http.address</name><value>nagios:50070</value></property>
<property><!--Loaded from mapred-site.xml--><name>mapred.job.tracker</name><value>nagios:9000</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.heartbeat.interval</name><value>3</value></property>
<property><!--Loaded from core-default.xml--><name>io.seqfile.sorter.recordlimit</name><value>1000000</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.name.dir</name><value>${hadoop.tmp.dir}/dfs/name</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.line.input.format.linespermap</name><value>1</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.jobtracker.taskScheduler</name><value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.instrumentation</name><value>org.apache.hadoop.mapred.TaskTrackerMetricsInst</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.datanode.http.address</name><value>0.0.0.0:50075</value></property>
<property><!--Loaded from mapred-default.xml--><name>jobclient.completion.poll.interval</name><value>5000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.max.maps.per.node</name><value>-1</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.local.dir.minspacekill</name><value>0</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.replication.interval</name><value>3</value></property>
<property><!--Loaded from mapred-default.xml--><name>io.sort.record.percent</name><value>0.05</value></property>
<property><!--Loaded from core-default.xml--><name>fs.kfs.impl</name><value>org.apache.hadoop.fs.kfs.KosmosFileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.temp.dir</name><value>${hadoop.tmp.dir}/mapred/temp</value></property>
<property><!--Loaded from mapred-site.xml--><name>mapred.tasktracker.reduce.tasks.maximum</name><value>4</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.replication</name><value>2</value></property>
<property><!--Loaded from core-default.xml--><name>fs.checkpoint.edits.dir</name><value>${fs.checkpoint.dir}</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.tasks.sleeptime-before-sigkill</name><value>5000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.reduce.input.buffer.percent</name><value>0.0</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.indexcache.mb</name><value>10</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.job.split.metainfo.maxsize</name><value>10000000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.skip.reduce.auto.incr.proc.count</name><value>true</value></property>
<property><!--Loaded from core-default.xml--><name>hadoop.logfile.count</name><value>10</value></property>
<property><!--Loaded from core-default.xml--><name>fs.automatic.close</name><value>true</value></property>
<property><!--Loaded from core-default.xml--><name>io.seqfile.compress.blocksize</name><value>1000000</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.hosts.exclude</name><value>/etc/hadoop-0.20/conf/hosts_exclude</value></property>
<property><!--Loaded from core-default.xml--><name>fs.s3.block.size</name><value>67108864</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.tasktracker.taskmemorymanager.monitoring-interval</name><value>5000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.acls.enabled</name><value>false</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapreduce.jobtracker.staging.root.dir</name><value>${hadoop.tmp.dir}/mapred/staging</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.queue.names</name><value>default</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.access.time.precision</name><value>3600000</value></property>
<property><!--Loaded from core-default.xml--><name>fs.hsftp.impl</name><value>org.apache.hadoop.hdfs.HsftpFileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.tracker.http.address</name><value>0.0.0.0:50060</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.reduce.parallel.copies</name><value>5</value></property>
<property><!--Loaded from core-default.xml--><name>io.seqfile.lazydecompress</name><value>true</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.safemode.min.datanodes</name><value>0</value></property>
<property><!--Loaded from mapred-default.xml--><name>io.sort.mb</name><value>100</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.client.connection.maxidletime</name><value>10000</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.compress.map.output</name><value>false</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.tracker.report.address</name><value>127.0.0.1:0</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.healthChecker.interval</name><value>60000</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.client.kill.max</name><value>10</value></property>
<property><!--Loaded from core-default.xml--><name>ipc.client.connect.max.retries</name><value>10</value></property>
<property><!--Loaded from core-default.xml--><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
<property><!--Loaded from core-default.xml--><name>io.file.buffer.size</name><value>4096</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
<property><!--Loaded from core-default.xml--><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.task.profile</name><value>false</value></property>
<property><!--Loaded from hdfs-site.xml--><name>dfs.datanode.handler.count</name><value>10</value></property>
<property><!--Loaded from mapred-default.xml--><name>mapred.reduce.copy.backoff</name><value>300</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.replication.considerLoad</name><value>true</value></property>
<property><!--Loaded from mapred-default.xml--><name>jobclient.output.filter</name><value>FAILED</value></property>
<property><!--Loaded from hdfs-default.xml--><name>dfs.namenode.delegation.token.max-lifetime</name><value>604800000</value></property>
<property><!--Loaded from mapred-site.xml--><name>mapred.tasktracker.map.tasks.maximum</name><value>4</value></property>
<property><!--Loaded from core-default.xml--><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
<property><!--Loaded from core-default.xml--><name>fs.checkpoint.size</name><value>67108864</value></property>
</configuration>

Please refer to the OSG Hadoop debug webpage and the Apache Hadoop FAQ webpage for answers to common questions and concerns.

FUSE

Notes on Building a FUSE Module

If you are running a custom kernel, then be sure to enable the fuse module with CONFIG_FUSE_FS=m in your kernel config. Building and installing a fuse kernel module for your custom kernel is beyond the scope of this document.
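Before attempting the mount, you may want to verify that FUSE support is actually available on the node. The checks below are a minimal sketch using standard Linux tools; the /boot/config-$(uname -r) path assumes your distribution installs the kernel configuration file there, as RHEL-based systems do.

[root@client ~]$ grep CONFIG_FUSE_FS /boot/config-$(uname -r)
[root@client ~]$ modprobe fuse
[root@client ~]$ lsmod | grep fuse

If modprobe fails or lsmod shows no fuse entry, the kernel module is missing and FUSE mounts will not work until it is installed.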

Note: If you cannot find a fuse kernel module that matches your kernel, ATRPMs has a guide for using their RPM spec files to generate one. That page mostly works, although some sections are a bit outdated. Contact the osg-hadoop@opensciencegrid.org list if you need help.

Running FUSE in Debug Mode

To start the FUSE mount in debug mode, you can run the FUSE mount command by hand:

[root@client ~]$  /usr/bin/hadoop-fuse-dfs  /mnt/hadoop -o rw,server=namenode.host,port=9000,rdbuffer=131072,allow_other -d

Debug output is printed to stderr, which you will probably want to redirect to a file. Most FUSE-related problems can be tracked down by reading through the stderr output and looking for error messages.
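For example, to keep the mount in the foreground while capturing the debug output, redirect stderr to a file (the log file name below is only an arbitrary choice):

[root@client ~]$ /usr/bin/hadoop-fuse-dfs /mnt/hadoop -o rw,server=namenode.host,port=9000,rdbuffer=131072,allow_other -d 2> /tmp/hadoop-fuse-debug.log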

GridFTP

Starting GridFTP in Standalone Mode

If you would like to test the gridftp-hdfs server in standalone debug mode, you can run the following command:

[root@client ~]$ gridftp-hdfs-standalone

The standalone server runs on port 5002, handles a single GridFTP request, and will log output to stdout/stderr.
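Once the standalone server is running, you can exercise it with a single test transfer. The command below is only a sketch: gridftp.example.org stands in for your GridFTP node's hostname, the destination path assumes the /mnt/hadoop mount point used throughout this document, and a valid grid proxy (e.g. from voms-proxy-init) is required before running it.

[user@client ~]$ globus-url-copy file:///tmp/testfile gsiftp://gridftp.example.org:5002/mnt/hadoop/testfile

Because the standalone server handles only a single request, restart gridftp-hdfs-standalone before each additional test.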

Known Issues

copyFromLocal java IOException

When trying to copy a local file into Hadoop, you may encounter the following Java exception:

11/06/24 11:10:50 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
11/06/24 11:10:50 WARN hdfs.DFSClient: Could not get block locations. Source file "/osg/ddd" - Aborting...
copyFromLocal: java.io.IOException: File /osg/ddd could only be replicated to 0 nodes, instead of 1
11/06/24 11:10:50 ERROR hdfs.DFSClient: Exception closing file /osg/ddd : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /osg/ddd could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1415)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:588)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:528)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1319)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1315)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1313)

This can occur when a Datanode is installed on a machine with less than 10 GB of free disk space. To allow the Datanode to use more of the available space, lower the value of the following property in /usr/lib/hadoop-0.20/conf/hdfs-site.xml:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>10000000000</value>
</property>

Hadoop reserves this amount of disk space on each data volume for non-HDFS use; the Datanode will not store blocks if doing so would cut into the reserve.
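A quick way to check whether this reserve is the problem is to compare the free space on each data volume (dfs.data.dir, /data1/hadoop/data in the sample configuration above) against the reserved value, and then confirm that the Datanode reports non-zero remaining capacity. This is only a sketch; the paths come from the example configuration and may differ at your site.

[root@client ~]$ df -B1 /data1/hadoop/data
[root@client ~]$ hadoop dfsadmin -report

If the free space reported by df is below dfs.datanode.du.reserved, the Datanode will refuse to accept blocks until either more disk is made available or the reserved value is lowered.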

References

Benchmarking

Comments
