MinutesOct25
Introduction
- Attendees: RobG, Burt, Tom, RobQ, Jeff, Anand, Suchandra, Horst, Karthik
- Apologies: none
- Coordinates: Thursdays, 2:30pm Central; 510-665-5437, #1212
- Previous meetings, MeetingMinutes
- ITB 0.7 background
Sites coordination survey results
Documentation
Status of ITB 0.7.1 (Suchandra)
- VDT 1.8.1b status (Alain): VDT 1.8.1b is close to finished, with release expected early next week; there are patches for Globus.
- ITB 0.7.1 status
- is.grid.iu.edu will be used for BDII
- will prepare production cache
- Site issues from last week, to follow-up:
- FNAL_FERMIGRID_ITB
- BNL_ITB_Test1
- UC_ITB
- Still working with Britta @ LIGO - Pegasus workflows not working. Adding a few more VOs.
- GIP validator - default SE a requirement? Need to consult w/ Burt.
- Still can't run interoperability jobs. Suchandra will consult with Anand regarding publishing PBS queues.
- CIT_ITB_1
- TTU-TESTWULF
- UCSDT2-ITB1
VO validation (Abhishek)
Due to the current situation in San Diego, I will need to miss the
Integration meeting today. Here is a recent update on the ongoing
validation of ITB 0.7 -
ATLAS, CDF, CMS, DOSAR, Dzero, Engage, Fermilab, nanoHUB and SDSS/DES
have now confirmed a completed validation and have given 'green flags'
towards OSG 0.8.
Of the remaining VOs, validation by CompBioGrid and STAR is in
progress today and more feedback can be expected soon. LIGO workflow
was affected by an unknown failure at UC_ITB; Suchandra and Britta may
need to investigate further on the failure mode.
From the sites' perspective, here is a quick summary. In some cases,
problems are now known, otherwise investigation is in progress and may
need to be continued further.
FNAL_FERMIGRID_ITB - No known problems.
BNL_ITB_Test1 - Failures for CDF, CMS, nanoHUB. CDF unknown yet, need
to follow up. CMS due to firewall at WN's and SRM. nanoHUB due to
error code 74 more recently, discussion in progress.
UC_ITB - Failures for CDF, LIGO. CDF unknown yet, need to follow up.
LIGO affected by unknown failure, investigation in progress.
CIT_ITB_1 - Failures for Engage, LIGO. Engage due to authorization
failure. LIGO due to NFS-Lite/Pegasus related known problem; more
details to be exchanged between LIGO and User Group.
LIGO-CIT-ITB - No known problems.
TTU-TESTWULF - No known problems.
UCSDT2-ITB1 - No known problems.
(OUHEP_ITB - No known problems.)
Again, thanks a lot everyone for your ongoing wonderful work.
Hopefully, ITB 0.7 will very soon get more feedback and remaining
green flags from CompBioGrid, LIGO, and STAR. I will continue to be in
intermittent email contact for the rest of the week.
regards,
Abhishek
Attributes, information services (Anand, Gabriele, Suchandra)
- Some attributes cannot be published correctly through the Glue schema. This is a long-term problem.
- Two updates for the current release. Should be in VDT 1.8.1b. These were minor bugs.
- All done for OSG 0.8 deployment
- Post-deployment, convene the attributes subcommittee to work on validation and requirements collection.
- Define an RSV probe for critical attributes.
WS Gram - final chapter? (Jeff, Suchandra)
- Two types of tests - basic scaling w/ default Condor-G configuration. Submitting 100 jobs at a time. Occasionally there are staging problems.
- There are performance modifications that need to be made, e.g., the default memory for the container. There are docs in the install guide for this.
- Validation w/ the STAR VO. Was able to take their workflow and run it from Jeff's own submit host. Problems with TTU site - waiting to hear from Alan. Troubleshooting w/ Charles.
- Problem: Condor version in osg-client is 6.8.6, but 6.9.3 is what the Globus team recommends. Note - 6.9.4 is the latest development version. Install condor-devel. (Important for scalability w/ ws-gram)
- Suchandra working with Britta to troubleshoot LIGO problems on UC_ITB.
OSG 0.8 Deployment readiness (RobQ)
- BDII and schema issue nearing resolution. Spoke w/ Laurence Field - the current BDII can be moved up to the Glue 1.3 schema without side effects. Will test this tomorrow w/ Burt: replace schema files and restart server tomorrow AM. Will leave in place at 1.3. As sites upgrade to OSG 0.8, they will simply report more information.
- syslog-ng server. Tim and Suchandra working on this - Tim has prepared a machine. (Minor problems with syslog-ng on a Gentoo distro.)
AOB
- Follow-up from last week:
dccp in workernode client? A request from CMS and ATLAS. Alain will consult Ted.
--
RobGardner - 24 Oct 2007