Storage Meeting Minutes 2009 Sep 09
- Andrew Baranovski
- Brian Bockelman
- Ted Hesselroth
- Alex Sim
- Tanya Levshina
- Igor Sfiligoi (UCSD)
- Michael Thomas (Caltech)
- Dan Gunter (LBL)
Scalability & performance test plans (Igor)
Igor is setting up a test stand with 16 worker nodes and 3 servers that will be used for performance and scalability tests. They are planning to test bestman-gateway/hadoop, dcache and, probably, bestman-gateway/xrootd installations. Each installation should sustain at least 50 Hz of srm calls and have fewer than 5% of jobs fail due to I/O errors. They would like to compare the performance of the current storage installations with the relevant new "golden" releases (srm/dcache with https and load balancing, bestman with https, etc.). They are planning to use the data server nodes as worker nodes and allow access to the data via fuse or dcap.
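The two targets above (at least 50 Hz of srm calls, fewer than 5% of jobs failing on I/O errors) could be checked against a test-run summary roughly as follows. This is only a sketch: the `RunSummary` class, its field names, and `meets_targets` are hypothetical and are not part of any test harness mentioned in the meeting.

```python
from dataclasses import dataclass

# Thresholds taken from the minutes; every name below is hypothetical.
MIN_SRM_RATE_HZ = 50.0
MAX_IO_FAILURE_FRACTION = 0.05


@dataclass
class RunSummary:
    """Aggregate counters for one test run against one installation."""
    installation: str
    srm_calls: int        # srm calls completed during the run
    duration_s: float     # wall-clock length of the run in seconds
    jobs_total: int       # jobs submitted
    jobs_failed_io: int   # jobs that failed because of I/O errors

    @property
    def srm_rate_hz(self) -> float:
        return self.srm_calls / self.duration_s

    @property
    def io_failure_fraction(self) -> float:
        return self.jobs_failed_io / self.jobs_total


def meets_targets(run: RunSummary) -> bool:
    """True if the run sustained the srm rate and stayed under the failure limit."""
    return (run.srm_rate_hz >= MIN_SRM_RATE_HZ
            and run.io_failure_fraction < MAX_IO_FAILURE_FRACTION)
```

For example, a one-hour run with 180000 srm calls and 20 I/O failures out of 1000 jobs (made-up numbers) would pass, while the same run with 100 I/O failures would not.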
Xrootd 20090721-0636 has been released for testing by VDT. Tanya will install it and run tests later this week. It is unclear whether the released version is the one that was tested by Doug.
BeStMan release (Alex)
A new version of BeStMan is coming soon. It contains bug fixes and significant modifications to the configuration script and file. It will be the last BeStMan release before the "BeStMan/https" release later this year.
Storage discovery tools status - packaging, integration with MyOSG
Ted is working with the MyOSG team on understanding how to incorporate the discovery tools into MyOSG. He thinks that no effort is required from him in order to package them in VDT.
SCEC, LIGO status
Mats has notified us that he had problems getting compute cycles at Caltech and that Firefly had been down for a couple of weeks. Michael acknowledged that the site has been pretty busy with CMS jobs recently but promised to allocate some nodes for non-CMS users.
LCG-utils and BeStMan-gateway (Tanya)
Tanya has started a twiki page to record the different behaviors of various clients with different bestman configurations (work in progress). During the tests it was discovered that BeStMan-gateway, when configured with static tokens, returns 0 as the amount of available space for a given space token. Both the fermi and lbnl clients ignore this information when transferring a file to the allocated space, but lcg-copy fails. Wei has sent C code that can be specified as a plugin in the BeStMan configuration and provides the necessary information about available space. It looks like this code is not part of any distribution for the time being. Hadoop will also need a similar plugin; Andrew could probably write it. Brian suggested it would be better if BeStMan did not set the default available space to 0 when it cannot calculate it. Alex will investigate that.
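The difference between reporting 0 and reporting "unknown" can be illustrated with a small sketch of a client-side space check. This is not BeStMan or lcg-utils code, and the minutes do not describe how lcg-copy actually decides to fail; the function name and the use of `None` for "could not be calculated" are assumptions made for illustration only.

```python
from typing import Optional


def can_attempt_transfer(file_size: int, available: Optional[int]) -> bool:
    """Decide whether to attempt a transfer into a space token.

    `available` is the free space reported for the token, in bytes.
    None (a hypothetical convention) means the server could not calculate it.
    """
    if available is None:
        # Unknown free space: try the transfer instead of rejecting it up front.
        return True
    # A reported number is taken at face value, so a default of 0 blocks everything.
    return file_size <= available
```

With such a check, a reported value of 0 rejects every non-empty file, whereas an explicit "unknown" lets the transfer proceed and fail later only if the space really is exhausted.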
7445 - dCache - srm door showing incorrect dcache version number (1.9.3 instead of 1.9.4)
Follow up in progress:
7446 - dCache - 1.9.4-3 slow response to 'df' on nodes mounting pnfs
7449 - dCache - Debugging a "no route to host" file transfer problem
Need to follow up:
7447 - dCache - Low throughput observed after upgrading to dcache 1.9.4-3
7448 - dCache - Poor PNFS/Chimera performance on upgrade from 1.9.2-5 to 1.9.4-x
7373 - GridFTP - How to setup pool accounts for GridFTP?
7109 - dCache - authgrouplist_pkey exceptions in catalina.out
7305 - dCache - Finding standalone GridFTP
Follow up in progress:
7410 - Xrootd - Designing Site layout with xrootd
7380 - dCache - gPlazma drops roles when multiple roles map to same username
7326 - dCache - misleading error message when using dccp to transfer data
7329 - dCache - postgresql-libs vs. apr-util
6971 - dCache - Problem with pnfs register command
6967/7010 - dCache - File replicas not being removed
6908 - dCache - log4j errors when restarting dcache 1.9.2-5
- 11 Sep 2009