Summer Workshop Tutorials 2011

Introduction

This is a repository of information for the OSG Summer workshop and the CMS Tier 3 workshop, both held at Texas Tech University (Lubbock, TX), August 9-11, 2011.

General Requirements

These apply to both the OSG site administrators workshop and the CMS Tier 3 workshop. Attendees are expected to have a laptop with an SSH client to connect to their resources. They should already have a valid grid certificate. Information about certificates is available here. Attendees will work with their own servers or VMs, which should have pre-installed certificates as specified below in the requirements for the different sessions. Hosts for Pacman installations should be one of the platforms supported by VDT. Hosts for RPM installations should be RHEL 5 based systems (CentOS 5 and SL5 are OK). Attendees of the "Site administrators" tracks will need admin (root) privileges for most of the sessions.

Tuesday morning

The morning contributions are presentations. Please check the Indico agenda.

OSG and Tier 3s (Marco Mambelli)

This session presents an overview of OSG 1.2, VDT 2.0 and introduces some OSG concepts and the Tier 3 idea to newcomers. This will be followed by a presentation of key security concepts.

Attendees will receive a brief description of the hands-on sessions.

Site administrators 1

Network performance and troubleshooting (Jason Zurawski)

Description: This hands-on workshop will introduce the participants to network performance tools and suggest strategies to locate and quickly resolve problems. Emphasis will be placed on the use of diagnostic tools available in VDT, including OWAMP, BWCTL and NDT, as well as on introducing the concept of regular monitoring through the use of the perfSONAR-PS framework.

Supplemental Reading: The following set of slides represents a 1.5-day workshop offered by Internet2 to help administrators and operators cope with network performance problems. Reading this material is not required, but may be useful.

Requirements: Workshop exercises will be performed on a specially designed testbed accessed from the participant's laptop. Participants will need to have a laptop with a web browser and a terminal capable of connecting to remote resources via SSH. Flash and Java plugins for the web browser will be required to view some of the web content.

Hands-on Section

  • Attendees will log in to 'npw.internet2.edu' using the usernames and passwords supplied before the workshop
  • Slides will go through the steps we will be taking on these hosts. The goal is to find the following network performance metrics:
    • Round Trip Latency (via 'ping')
    • One Way Latency, packet loss, packet re-ordering, and jitter (via 'owamp')
    • TCP Bandwidth (via 'bwctl')
    • Network diagnostics, round trip latency, packet loss (via 'ndt')
  • Solutions to exercises will be discussed, and reasons for poor network performance will be presented. E.g. small amounts of packet loss on a network with large latencies will impact throughput.
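The last point can be made concrete with the widely used Mathis et al. approximation for steady-state TCP throughput, rate ≤ (MSS / RTT) · (C / √p), where p is the packet loss probability and C ≈ 1.22. The sketch below is only an illustration: the formula, the function name `mathis_throughput_bps` and the sample values are assumptions and are not taken from the workshop slides.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_rate, c=math.sqrt(1.5)):
    """Approximate upper bound on TCP throughput in bits per second.

    Uses the Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)).
    All arguments are illustrative inputs, not measured workshop data.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Same loss rate (0.01%), increasing round trip time.
    for rtt_ms in (1, 10, 100):
        bps = mathis_throughput_bps(1460, rtt_ms / 1000.0, 1e-4)
        print(f"RTT {rtt_ms:>3} ms, loss 0.01%: <= {bps / 1e6:.1f} Mbit/s")
```

Because the bound is inversely proportional to the round trip time, the same loss rate that is harmless on a local network limits a 100 ms path to one hundredth of the throughput of a 1 ms path.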

Agenda: 1h total

KVM hands-on (Steven Timm, Suchandra Thapa)

Description: This talk will give a short description of the KVM hardware virtualization hypervisor, which is the default virtualization method for Red Hat Linux, Scientific Linux, and CentOS. We will give an introduction to virtualized machines and virtualized networking, then cover the basic ideas of KVM and libvirt. There will be a short demonstration of how to use the virt-install, virt-manager, and virsh utilities.

Requirements: Participants in this tutorial should have access to a Red Hat/Scientific Linux/CentOS machine (5 or 6) on which they have root and are allowed to start a virtual machine. This machine must have its BIOS set to support hardware virtualization. (Most Intel and AMD hardware built after 2007 has this capability.) Note: if you have a Windows laptop with VMware, or a Mac with VMware or VirtualBox, there are other similar tutorials on the CernVM site that you can follow.
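One way to check the hardware requirement ahead of time is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flag in `/proc/cpuinfo`. The following is a minimal sketch, not part of the tutorial material; the function name `virtualization_flags` is hypothetical. The flag only shows that the CPU supports hardware virtualization, so the BIOS setting may still need to be checked separately.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo text.

    'vmx' indicates Intel VT-x and 'svm' indicates AMD-V.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            found.update(flag for flag in flags if flag in ("vmx", "svm"))
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            flags = virtualization_flags(handle.read())
    except OSError:
        flags = set()  # not a Linux host, or /proc is not readable
    if flags:
        print("CPU virtualization flags found:", ", ".join(sorted(flags)))
    else:
        print("No vmx/svm flag found; KVM will probably not work on this host.")
```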

Hands on CernVM KVM installation tutorial

Supplemental Reading: KVM Wiki Red Hat Virtualization Guide for RHEL6 Red Hat Virtualization Guide for RHEL5

Storage Element Introduction (Douglas Strain)

Description: This talk will introduce the concept of a storage element and illustrate several examples and their usage, including BeStMan Gateway with Hadoop/Xrootd, BeStMan Fullmode, and dCache. This will not be a hands-on talk.

Requirements: none.

Storage Element Installation: BeStMan2 and Hadoop (Jeff Dost)

Description: This tutorial will demonstrate how to install Hadoop as well as set up a BeStMan2 server to give a fully functional Storage Element. Participants are encouraged to be familiar with Hadoop and the installation process from the official OSG Hadoop twiki page: https://twiki.grid.iu.edu/bin/view/ReleaseDocumentation/Hadoop20Install

Requirements: The participants will need to have three servers (Machine or VM), and root access on all servers.

Each machine must have:

  • Its own public IP address and be registered to a domain name.
  • RHEL 5 (or compatible OS, e.g. CentOS 5, SL5) (>=5.4 recommended)
  • Sun Java 1.6 installed as well as corresponding java-1.6.0-sun-compat package

The machines will be designated as follows:

  • one Hadoop NameNode
  • one Hadoop DataNode
  • one node to run BeStMan2 and GridFTP services

The BeStMan2 + GridFTP machine additionally requires:

  • Access to a working GUMS server for user mappings; grid-mapfile mappings are not currently supported.
  • valid host certificate installed in /etc/grid-security/{hostcert.pem,hostkey.pem}

Additional requirements:

  • To verify services after installation you need the OSG client and must be able to generate a grid proxy
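Before the session it may help to confirm that the host certificate and key are in place on the BeStMan2 + GridFTP machine. The sketch below is an assumption-laden illustration: the function name `check_host_credentials` is hypothetical, and the rule that the private key must not be accessible by group or others reflects common practice for grid host keys rather than anything stated on this page.

```python
import os
import stat


def check_host_credentials(cert_path, key_path):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for path in (cert_path, key_path):
        if not os.path.isfile(path):
            problems.append(f"missing: {path}")
    if os.path.isfile(key_path):
        mode = stat.S_IMODE(os.stat(key_path).st_mode)
        # Assumption: the private key should be accessible by its owner only.
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            problems.append(
                f"key accessible by group/others (mode {mode:o}): {key_path}"
            )
    return problems


if __name__ == "__main__":
    issues = check_host_credentials(
        "/etc/grid-security/hostcert.pem",
        "/etc/grid-security/hostkey.pem",
    )
    print("\n".join(issues) if issues else "host certificate and key look OK")
```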

Xrootd and Distributed Xrootd (Douglas Strain)

Description: This talk will introduce the concepts of Xrootd and how it works in general. It will briefly list how CMS and ATLAS are using the Xrootd framework for analysis and storage. This will not be a hands-on tutorial.

Requirements: None.

Users forum 1

Job submission overview (Marco Mambelli)

Description: This talk will present different ways to access OSG resources, especially for submitting jobs. It is an overview of the material in the talks that follow in this session.

Installing a Glidein WMS VO frontend (Derek Weitzel)

Description: Install and configure a glideinWMS frontend using the RPM. The frontend will point to a test factory set up by UCSD (Jeff).

Requirements:

  • 1 RHEL 5 (CentOS 5, SL5) Machine or VM
  • Root access on the machine
  • DOE Grid Proxy. Preferred Proxy and Host Certs for VM
  • Public network access on the machine (public ip address)

Guide: VO Frontend Documentation

Installing a Campus Factory (Derek Weitzel)

Description: Install and configure a campus factory. I will focus on local submissions, i.e. submissions from the node the campus factory is running on, in order to reduce the number of nodes required.

Requirements:

  • PBS Submission node (doesn't have to be head node)
  • Access (not necessarily root) to the submission node.

Guide: Campus Factory Install

OSG client installation and basic use (Marco Mambelli)

Description: Installation of the OSG client tools both using RPMs and the Pacman package. Attendees may install the software using one, the other or both. Basic use of the OSG client tools to submit grid jobs, transfer files and query for resources.

Requirements:

  • 1 Machine or VM with one of the platforms supported by VDT (RHEL 5 or compatible OS, e.g. CentOS 5, SL5 required for RPM installation)
  • Outbound network access from the machine. Public ip address preferred.
  • Root access on the machine preferred (required for RPM installation).
  • IGTF x509 user certificate (grid certificate as specified in the general requirements). Membership to one OSG VO preferred.

Guide:

Tuning for HTPC submission, GPU, ... and using a single submit host (Greg Thain)

Description:

Requirements:

Site administrators 2

Rocks hands-on (Yu Fu)

Description:

Install, configure and manage a cluster with help of the ROCKS software.

Requirements:

Reference and Guide:

Puppet hands-on (Suchandra Thapa)

Description

This session will go through a toy example of using puppet to install and configure a cluster and update configurations as needed.

Requirements

  • We probably will not have enough time to go through the install and setup of a puppet system but can discuss existing puppet systems as well as go through an existing toy install to see how puppet works.
  • For puppet, at least two systems are needed: one for the puppet master and another for the client system. Both of these systems can be virtual systems, although you probably want at least 1.5GB of memory and 10GB or more of disk space on both systems.

References

Moving away from Pacman to RPMs (Alain Roy)

You might have heard that OSG is moving away from the Pacman-based version of the VDT and is working on a version of the VDT based entirely on RPMs for Red Hat Enterprise Linux 5 (and variants). Alain will give you the scoop in some technical detail. This will complement his Tuesday morning talk, which gave the goals, the timeline, and the background. In this rather brief 30-minute session, it will not be feasible to do a hands-on session, so Alain will walk through the technical details and give a brief demo. He's also hoping that you can give constructive criticism to make sure that we are moving in the right direction and making the right technical decisions. Please come and give feedback!

Some background reading:

Fault tolerant installations and virtualization (Steven Timm)

Description: This hands-on will describe how to set up a fault-tolerant and virtualized version of an OSG web service using GUMS as an example. Past experience shows that it is not possible to complete this whole process in an hour, but we will give demonstrations of the key points in the hands-on tutorial and people can continue later on their own time. We will cover LVS, Heartbeat, DRBD, virtualization, and multi-master MySQL replication.

Prerequisites: To complete the full hands-on tutorial you will need access to six virtual machines on which you have root.

Tutorials:

  • Gums Hands-on Tutorial (installing a single GUMS server)
  • High Availability Hands On Tutorial

TWE: OSG installation tuning (CE/GUMS, ...)

Talk-With-the-Experts: OSG installation tuning (CE/GUMS, ...): This session, which closes the system administrators track, is an open, loosely structured interactive session where OSG experts are available to guide and support users (scientists and system administrators) by answering questions or providing one-on-one coaching. Among other topics, it will cover questions about installation, optimization and troubleshooting of OSG resources (e.g. CE, GUMS, SEs).

A similar session closes the campus grid and users track. Attendees are encouraged to move between the two rooms to meet with the different experts.

To help our planning, submit your questions in advance by sending an email to OSG2011@highenergy.phys.ttu.edu

In the afternoon some experts will continue to be available in a breakout session parallel to the CMS Tier 3 workshop.

Users forum 2

Security best practices, certificates and VO membership (Anand Padmanabhan)

Give an introduction about certificates, CA, CRLs, trusted relationships on OSG.

Instruct on security best practices for users (and maybe site administrators).

OIM and MyOSG (Scott Teige)

Description: In this tutorial you will learn how to register a person, a resource, a site, etc. in the OSG Information Management (OIM) system. There will also be a short demonstration of some of the most popular features of MyOSG, the "one stop shopping" site for the OSG. Nearly the entire content of this tutorial is given in this overview.

Requirements: You should have a valid grid certificate installed in your browser.

Data movement with Globus Online (Douglas Strain)

Description: In this tutorial, you will register with Globus Online, set up your credentials, and go through an example data movement using the Globus Online transfer service.

Requirements: You will need a machine with ssh, and access to 2 GridFTP servers. You will need a valid grid certificate. It would also speed up the tutorial if you create an account on Globus Online beforehand. If you wish to test Globus Connect (i.e. to upload files from a laptop), you will need to install it on your machine. See https://www.globusonline.org/ for details.

Tutorial: The tutorial we will be following is at https://twiki.grid.iu.edu/bin/view/ReleaseDocumentation/GlobusOnlineTutorial

Porting your code to OSG: an example (Derek Weitzel)

Description: Talk about common pitfalls when running on the grid. Things you have to consider, such as:
  • No global file system
  • Data Pull model is much, much easier than push model.
  • Common problems with scientific code when ported to the grid.

Requirements: None really. Maybe code that you want to port.
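The pull model mentioned above can be sketched as a job wrapper that fetches its own input into a private scratch directory, so it does not rely on a shared file system. This is only an illustration under stated assumptions: the function name `run_with_pulled_input` is hypothetical, and plain `urllib` stands in for whatever transfer tool a real grid job would use.

```python
import os
import shutil
import tempfile
import urllib.request


def run_with_pulled_input(input_url, process):
    """Pull one input file into a scratch directory, then run `process` on it.

    `process` is any callable that takes the local file path. The scratch
    directory is removed afterwards, whether or not the job succeeded.
    """
    scratch = tempfile.mkdtemp(prefix="job_")
    try:
        local_path = os.path.join(scratch, "input.dat")
        with urllib.request.urlopen(input_url) as src, open(local_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return process(local_path)
    finally:
        shutil.rmtree(scratch, ignore_errors=True)
```

The job itself decides when and from where to fetch its data, which is what makes the pull model easier to run on worker nodes that the submitter cannot reach directly.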

TWE: Porting your code to the Grid

Talk-With-the-Experts: Porting your code to the Grid:

This session, which closes the users and campus grids track, is an open, loosely structured interactive session where OSG experts are available to guide and support users (scientists and system administrators) by answering questions or providing one-on-one coaching. Among other topics, it will cover questions about running your workflow on OSG campus or production grids and the installation, optimization and troubleshooting of campus grids. A similar session closes the system administrators track. Attendees are encouraged to move between the two rooms to meet with the different experts.

To help our planning, submit your questions in advance by sending an email to OSG2011@highenergy.phys.ttu.edu

In the afternoon some experts will continue to be available for further support in a breakout session parallel to the CMS Tier 3 workshop.

CMS Tier 3 workshop

CRAB (Eric Vaandering)

Requirements

The only hard and fast requirements are that every user must

  • have a DOE Grids certificate installed on the machine from which they wish to submit jobs, and
  • be registered in the CMS SiteDB.

To set up your certificate, please follow these instructions: https://twiki.cern.ch/twiki/bin/view/Main/CRABPrerequisitesGRIDCredentials and to register in SiteDB, follow these: https://twiki.cern.ch/twiki/bin/view/CMS/SiteDBForCRAB

It is also highly recommended that you have an account on the CMS LPC cluster. Please follow these instructions: http://www.uscms.org/uscms_at_work/physics/computing/getstarted/getaccount_fermilab.shtml

The tutorial will work from anywhere that has CMSSW, CRAB, and the gLite Grid client installed. You are encouraged to try to do this at your own institution, but we can't spend much time during the tutorial working through any problems with local setups, so even if you intend to set this up at your home institution, please have an account on the FNAL LPC ready to go as a backup.

For those who wish to try this at home, I can't do better than Malina's instructions:

For the truly adventurous, it is possible that the CRAB client will work with the OSG client commands and not require gLite at all. CRAB in Client mode basically uses the voms-proxy-*, myproxy-*, and lcg-* commands and not a lot else.
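If you try this at home, a quick pre-flight check can tell you whether commands from these families are on your `PATH`. The sketch below is only an illustration: the helper name `missing_commands` is hypothetical, and the specific commands listed are examples from each family, not a confirmed list of what CRAB calls.

```python
import shutil

# Example commands from the voms-proxy-*, myproxy-* and lcg-* families.
# Assumption: these particular names are illustrative, not a CRAB-defined list.
CLIENT_COMMANDS = ("voms-proxy-init", "voms-proxy-info", "myproxy-init", "lcg-cp")


def missing_commands(commands, path=None):
    """Return the commands that cannot be found on PATH (or on `path`)."""
    return [cmd for cmd in commands if shutil.which(cmd, path=path) is None]


if __name__ == "__main__":
    missing = missing_commands(CLIENT_COMMANDS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All example client commands were found.")
```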

So, before coming to the tutorial, please follow the instructions about getting your certificate set up on both your home machine (if applicable) and the FNAL LPC (for sure). If you want to try at home, set up CMSSW, gLite, and CRAB as well.

Talk outline

Tutorial

SummerWorkshopCRABTutorial2011

GlideIns - The Future of GRID Computing (Jeff Dost)

Requirements: None. This talk is a slide based presentation with a demonstration to follow.

Description: This talk will focus on the benefits of using glideinWMS for CMS job management on the grid. It will give a broad overview of the glidein architecture and how it is implemented in CMS, including its integration with CRAB. It will illustrate the advantages that submission with glideinWMS has over non-pilot methods, and show how it changes the workflow for the CMS VO as well as for site administrators.

The demonstration portion will have two major parts:

  1. Trace the entire path a job takes from user submission to matching with a glidein pilot to running on a worker node.
  2. Show some techniques site admins can use to track glideins on their system.
    • example: finding the real user of the job that the pilot is currently running

Reference: Commands used in glidein demo

Monitoring your site (Joel Walker)

Description: Every Tier 3 site is a unique entity composed of a vast array of extremely complicated interdependent hardware and software, extensively cross-networked for participation in the global endeavor of processing LHC data. Successful operation of a Tier 3 site, including performance optimization and tuning, requires intimately detailed, near real-time feedback on how the individual system components are behaving at a given moment, and how this compares to design goals and historical norms. Our monitoring project represents the creation of an array of custom server daemons which harvest data from the excellent existing analysis tools at various locations across the web, collecting the results into a site-specific unified display designed for extreme visual efficiency and information density.

Requirements: A laptop computer with internet access is recommended to facilitate an interactive discussion and tour of the Texas A&M monitoring "Beta" deployment.

Resources:

  • Slides on Indico: https://indico.fnal.gov/contributionDisplay.py?contribId=23&sessionId=5&confId=4531
  • Functioning monitor prototype (in development): http://collider.physics.tamu.edu/tier3/mon/

Integrating Campus Clusters with CMS Jobs (Will Maier)

Internet2 projects for LHC (Jason Zurawski)

Multicore Processing (Eric Vaandering)

PERCEUS Cluster Provisioning Tool (Bill Strossman)

Description: Perceus is a cluster provisioning and management package. In addition to providing easy provisioning, it has a comprehensive set of command-line tools for cluster management. This talk will give an overview of Perceus and, time permitting, a hands-on demonstration.

Requirements

  • RHEL/CentOS/SL/Fedora
  • perceus-1.5.3-2207.x86_64.rpm (master)
  • perceus-provisiond-1.5.3-2207.x86_64.rpm
  • perl-DBI-1.609-1.caos.x86_64.rpm (part of CentOS/SL/RHEL 5 repo)
  • perl-IO-Interface-0.98-4.caos.noarch.rpm
  • perl-Net-ARP-1.0-3.nsa1.noarch.rpm
  • perl-Unix-Syslog-0.99-3.caos.noarch.rpm
  • KVM Suite - for hands on; need only 1 VM for compute node w/1 GB RAM - can be installed by "yum install kvm"

Unless otherwise specified, the RPMs can be downloaded from http://altruistic.infiscale.org/rhel/5/RPMS/x86_64/

Documentation: The Perceus manual can be downloaded as a PDF from http://altruistic.infiscale.org/docs/perceus-userguide1.6.pdf


Comments

-- MarcoMambelli - 20 Jul 2011

Topic revision: r37 - 18 Jun 2012 - 19:51:31 - ElizabethChism
