Minutes of the Integration meeting, November 3, 2005.
Rob, Ruth, Greg, Marty, Kent, Chris, Greg, Iosif, Doug, Horst, Karthik, Fred, Rob Q, Alain, Shaowen, Dane, Mike, Michael, Armind, Razvan, John W, Burt, Stu
- Status of testbed installations of ITB 0.3.0:
- UC - installing on development cluster, in progress. Condor
- IU - OSG 0.3.0-ws up, WS-GRAM working. MDS4: opened Bug 3856. Resource is in GridCat with the shortname IUPUI.
- TTU - not here
- UB - not here
- BNL - not here. From Xin via email: Yes, I have done the pacman install part on gridtest01, our new ITB GK. There was a problem in VDT, which couldn't handle the globus user account home dir if it's on NFS (root squash); Alain said that will be fixed in VDT 1.3.8. We got around it yesterday and now I am about to configure it, but got distracted by many other things today...
- FNAL - no success -
- Burt was having problems with the grid exerciser. Need to confirm with another site. Alain consulted.
- John is having trouble with the configure monalisa script (this comes from VDT). Alain is aware of this; it needs a pretty big update, which will be targeted for 1.3.8. This was re-written to fit into the VDT environment better. John has a URL describing the problem. Some of this is VDT specific, not easily integrated back into VDT. Iosif is helping, and hopes to provide a new version by tomorrow.
- IOWA - Have installed the ITB release, and still have to configure Condor. Could submit fork jobs just fine (pre-webservices version). Hope to fix by end of week.
- PU - was able to sort out issues with ws-gram not finding pbs logs. Problem was with how the local installation organizes its logfiles. Fixed with a symlink. Allow a site to designate a filename rather than a directory location (this is for ws-gram)? Alain doesn't think it's configurable at the moment (confirms later that the piece of gram that reads logfiles). Submitting simple jobs from another box with gt4 client tools. Note - these are available separately in the VDT (for those that want to test). Question comes up as to whether GridCat can develop a status check of these WS gatekeepers. Suggestion is to consider whether this could be presented in the Discovery service. (Aside - can the interface to the discovery service be documented some place? Michael says it is really meant as a proof of concept rather than a designed interface for users. Can this be captured in a readiness plan?)
- Caltech - Suresh is starting to work on upgrading their Opteron cluster. Probably delayed till after SC05.
- Kent - LIGO will install after SC05. Also want to upgrade to FC4. (John is installing on FC4.) Alain - thought FC4 was already supported. (Note: will be trying to run on 0.3.0 sites.) Have tried on FSU. Let Kent know when a site is ready, and he will send a LIGO application.
- Horst - will also install when Panda work cools down.
- Other comments - many of the questions are now gone. Possible issue with Condor and Monalisa configuration.
- New Monalisa release for ITB
- Repository version: 1.2.50
- Release date: 2005.11.01
- Updated repository at http://gocmon.uits.iupui.edu:8888/stats?page=summary
- Iosif doesn't think there will be compatibility problems between services. The new service extracts as much as it can from Condor and PBS, and from individual jobs. It extracts accounting information, and the VO modules have been updated. Old and new can be run in parallel. Hope to get more tests for different local schedulers. What does the site admin have to do to configure it if Ganglia isn't used?
- Back to gt4 ws service discovery. By default a registry service describes the container contents for services at that site. How would integration with the discovery (clarens) service work (going local to global)? Stu says there are wsrf query clients available. Stu will follow up with MDS people - perhaps using a trigger service. Mike - what is the priority here? GridCat versus discovery? Long discussion about how the information gets captured and transmitted through the common interface. For OSG 0.4: the LDIF version and BDII, ws not required.
- ValidationPage, in particular USCMS Daily validation of the ITB, here. Burt and John have set up a page to validate the production releases. For the ITB it's under production.
- Ruth is about to release the deployment document on the OSG 0.4 release, which the ITB uses for guidance. One of the deliverables was how environment settings get propagated to worker nodes via config files. The site admins will have to set this up, and it will have to be part of the CE install guide.
- Alain - see mail from today, depends on grid3info.sh script. Job manager perl modules have to be modified for every scheduler (for Condor, PBS, etc). Could be done once, instead, at the Globus RLS level, consulting Stu (will take a look).
- Ruth: Storage elements: 0.4 release should include more information about SEs. Fermilab has made a contribution to fund this (Neha and Timur). Expect to implement the local CE storage for 0.4. Need to read through this line-by-line to understand impact. Will ask Neha to write a Storage Admin guide, and we should carefully review. Marco: there is the possibility that a site may not have shared directories, and not all sites will be the same. Ruth - an area of storage which has an interface should be considered an SE, and should be published with a policy.
- Next week: should walk through the CE storage document, and invite Neha to lead the discussion.
- Proposal for a meeting on provisioning OSG 0.4 at Fermilab, November 30-Dec 1. VDT will attend.
- 03 Nov 2005
Topic revision: r5 - 16 Dec 2008 - 16:16:00 - KyleGross