OSG Area Coordinators Meeting

Meeting Coordinates

Date Thursday, April 16, 2009
Time Noon Central
Telephone Number 510-665-5437
Teleconference ID 1111

Attendees

Abhishek, Chander, Chris, Rob G, Marco, Britta, David, Alina, Tanya, Alain, Mats, Brian, Dan, Maxim

Agenda

  • Canvas for News Items - David Ritchie

  • 1.1 Software Alain Roy (Storage Tanya Levshina)

  • 1.3 Integration & Sites Rob Gardner

  • 1.4 VOs Abhishek Rana, Britta Daudert

  • 1.5 Engagement - Mats Rynge

Software Report

I last reported in January. Major items of note since then:

VDT Releases

  • Enormous amount of work on VDT 1.10.1 update. Focused on:
    • New method for updating the VDT: significant, important
    • Updated Globus
    • Updated GUMS/PRIMA/glexec.
      • GUMS now has user banning
      • New protocol between PRIMA and GUMS
    • GIP 1.2 & new CEMon
    • New version of Bestman
    • Added OSG Matchmaker
    • http://vdt.cs.wisc.edu/releases/1.10.1/release-bigupdate.html?
  • This release is in final testing now.
    • Getting the updater right has taken more work than expected
    • VTB/ITB testing was slower than expected.
  • Good progress on new VDT (1.11.0) focusing on:
    • Improved updatability
    • Debian 5 support for LIGO
  • VDT 1.11.0 is nearly done; it was delayed mostly by the VDT 1.10.1 update.

OSG Storage

Current Initiatives

  • Incorporating Info Provider tests into certification test suite
  • Installing BeStMan-gateway/xrootd, BeStMan-gateway and BeStMan on the BeStMan test stand at Fermilab (20 VMs)
  • Working on performance test suite
  • Testing a new major release of dCache
  • Involved in ITB testing: installation and upgrade procedure, documentation

Issues/Concerns

  • Timely releases
  • Support/efforts

Integration & Sites

Virtual Organizations Group

Weekly VO Forum Meetings

  • Usual attendance is from CDF, D0, DES, Engage-VO, Fermilab-VO, ILC, nanoHUB, NYSGrid, OSG-VO, SBGrid.
  • Increased focus for next few months will be on VOs with low activity: CompBioGrid, GPN, GUGrid, GROW, IceCube.

Bi-monthly VO Forum Meetings

  • Stakeholder virtual Round-table.
  • The second such forum was held in Jan/Feb'09, in two parts over two weeks. In the first week, CIGI, D0, DOSAR, DES, Engage-VO, IceCube, and STAR each presented general plans and reports. In the second week, CDF, CompBioGrid, Fermilab-VO, GEANT4, GRASE, GUGrid, nanoHUB, and NYSGrid presented.

At-Large VOs Consortium Stakeholder Input to OSG Council and to OSG Executive Board

  • Document; unedited to preserve raw input
  • Official input provided by almost 17 at-large VOs. It consists of each VO's general views on: scope (science production, resource provision, or a composite); mission statement; average and peak utilization of OSG; resource provisioning to OSG; quantitative science output; short- and long-term plans and needs; and any key milestones.

Quantitative wall-hour consumption and Efficiency of Usage

  • CDF, D0, DOSAR, Fermilab-VO, GLOW are sustained users with high effectiveness.
  • Usage by the remaining segment is nominal. Causes: (a) custom low-scale needs, or otherwise (b) real low activity.
  • Geant4, ILC, nanoHUB, NYSGrid, SBGrid, STAR are moderately active in usage.
  • GPN, NWICG, GROW are making efforts to start up.

Joint nanoHUB-OSG TaskForce

  • Mission - Investigate workflow improvements that enable nanoHUB science production at high job volume and high efficiency. Target closure - end of May'09.
  • Milestones reached - Jan'09: nanoHUB site monitoring system enhanced, with influx of Condor-G based GridProbe jobs. Feb'09: A high level of OSG site stability established. March'09: Influx of nanoHUB Grid-application-test jobs started. April'09: A new nanoHUB-OSG web monitoring display is in place.
  • General - Constant debugging with TaskForce sites. New error codes agreed with nanoHUB are in use. Error codes and their frequency of occurrence at every site are now tabulated on the web.

Joint ALICE-OSG TaskForce

  • Mission - To enable ALICE's specialized AliEn framework and ALICE production on OSG Facility. Target closure - end of May'09.
  • Milestones reached - Dec'08: Jobs successfully submitted using the AliEn-OSG common integrated interface at one site. Paused Jan-Mar'09 to facilitate momentum and consensus within ALICE. Mar'09: ALICE registered as an official VO in OSG. Security understanding of the certificate renewal mechanism is nearing closure. Apr'09: Scalability tests started; ongoing at the NERSC site; validating a sustained load of 100 running jobs per day; results are being made available in the global ALICE web display.

Joint Geant4-OSG TaskForce

  • More details available from Chris/Engagement.

Science Validation of Integration Testbed

  • ITB 0.9.2 validation, during Mar'09, with Sites/Integration group.
  • 6 participants: ATLAS, CDF, CMS, D0, Fermilab-VO, LIGO.

General Concerns

  • VORS deprecation: Need to get a listing of new functionality and new API, to convey to the VOs.
  • Opportunistic Storage: D0 is a sustained user. CDF got started at 1-2 sites, but has slowed down. Workflows of most other at-large VOs are not in immediate need of SRM-based usage; simple GridFTP is being used.

Engagement

  • Maxim: here's the list of people and organizations I have written letters to, on behalf of OSG Engagement:
    • Vice President for research of Columbia U,
    • Vice President for research of SUNY Stony Brook,
    • Vice Chancellor for Research at CUNY,
    • Provost for Research of NYU,
    • BNL research groups:
    • two biology research groups
    • one group at the Center for Functional Nanomaterials
    • LSST computing
    • Atmospheric research Division at BNL
    • Nonproliferation and National Security Division at BNL
  • From the top portion of the list, I got one reply, from NYU, which was lukewarm and didn't result in further engagement.
  • From the bottom portion of the list, the Nonproliferation activity was a qualified success as the following was accomplished:
    • I established the "embarrassingly parallel" task flow for the project (image analysis for material science)
    • tested necessary code adaptations (code written in IDL)
    • processed existing data for the experiment, using the IDL installation on Jacquard (NERSC)
    • now awaiting further data and working on a generic data hosting solution to give the group web-based access to Panda (which is tied into Education plans as discussed in separate threads)

  • Recent momentum at Duke: we will be at critical mass with users once the recent engagements start showing success (estimate one to two months); working to find a home for campus condor pool central services to seed a campus grid to tie into OSG.

  • Status of OSG Workshop at RENCI. Slightly different agenda than most grid schools: one day for users, one day for providers. We are trying to reach out to new users as well as central IT people at the three big universities in the area: UNC Chapel Hill, Duke, and NCSU. So far, 22 attendees have signed up.

  • UNC-CH campus grid has been green-lighted by UNC-ITS, so it is official now, and done in time for the OSG Workshop. In a perfect world, there will be sufficient activity on TarHeel Grid within a month or two that it will be worthy of an iSGTW article.

  • We are beginning a move in a direction where we can use the technology base that we have developed over the years in our TeraGrid Science Gateway efforts for OSG Engagement activities. As a result, Mats is focusing more on development and integration efforts with this infrastructure for a while, and Chris Bizon and Michael Stealey will ramp up on the user engagements to free up Mats to do this work. Chris Bizon is looking into the feasibility of having Raptor (Jinbo Xu) as a trial run for this new job submission/mgmt interface. Submitted an abstract to TG09 conference related to this.

  • We are struggling with the best way to handle parallel activities at a campus for users submitting to both OSG and a budding local Condor pool. We don't want two separate submission processes, and we don't want to require DOE certs for the local campus Condor pool. Options include the job router, higher-level tools for submission (e.g. Swift), and more. It is not clear yet where it is best to expend effort. This is making OSG harder to sell to campuses.

  • Chris:
    • PNNL: was asked to hold off while John M. worked some contacts.
    • NREL: waiting for outcome of higher level security-related discussions.
    • ITER: "tgyro" MPI application running on remote sites from OSG client: NERSC FRANKLIN and JACQUARD are operational; PURDUE CAESAR (new platform from ITER's p.o.v.) is almost working pending some WN dynamic library dependency issues. Minimal changes to their application package. Phone conference yesterday on next steps: agreed to investigate ways of automating compile / install / maintenance on remote sites. Try to leverage OSGMM but may be non-trivial -- new features?
    • Geant4: Contact renewed this morning after a three week hiatus. May be moving forward again, but still some technical issues.

-- MatsRynge - 16 Apr 2009

Topic attachments

  • 2009-04-16.IntegrationSites.pptx.pdf (304.5 K), attached 16 Apr 2009 - 16:57 by RobGardner. Comment: R. Gardner
Topic revision: r11 - 16 Apr 2009 - 20:28:28 - ChanderSehgal