OSG Newsletter, April 2011
Engage Status Report
The Engage team’s recent work not only benefits the users it assists but has also yielded generalizable techniques and improvements to the Engage VO’s hosted infrastructure.
Recently, the team has been working with researchers at the University of Iowa and the University of North Carolina at Chapel Hill to develop molecular dynamics simulations. In the process, it has developed techniques for running simulations that take advantage of the OSG’s new High Throughput Parallel Computing (HTPC) capability, scalable workflow management systems, and emerging GPU computing architectures.
In addition, Engage has enhanced its hosted service platform with a higher-capacity hardware and software stack featuring GlideinWMS. Once existing Engage users have been migrated to the new service, they will also be able to use the Pegasus Workflow Management System, which is well suited to larger, more portable workflows.
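At its core, a Pegasus workflow is a directed acyclic graph (DAG) of tasks with dependencies, which the planner maps onto grid resources. The toy sketch below (plain Python, no Pegasus APIs; task names and structure are purely illustrative) shows the underlying abstraction of executing tasks in dependency order:

```python
# Toy DAG execution in dependency order -- the abstraction behind
# workflow management systems such as Pegasus. No Pegasus APIs are
# used; task names and structure are illustrative only.

from graphlib import TopologicalSorter

# Each task lists the tasks it depends on.
dag = {
    "preprocess": [],
    "simulate_a": ["preprocess"],
    "simulate_b": ["preprocess"],
    "analyze":    ["simulate_a", "simulate_b"],
}

order = list(TopologicalSorter(dag).static_order())
print(order)  # dependencies always appear before dependents
```

A workflow manager adds much more on top of this ordering (file staging, retries, site selection), but the DAG of dependencies is the portable description that makes the same workflow runnable on different infrastructures.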
This new architecture moves the Engage user base towards the OSG’s recommended pilot-based submission model, which has been demonstrated to be scalable. It also puts the Engage team on a stable long-term footing for effectively engaging new communities with improved scalability and high-quality services.
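The pilot-based ("late binding") model can be illustrated with a toy sketch (plain Python, no GlideinWMS APIs; all names here are illustrative): pilot jobs start on worker nodes first, and only then pull real user jobs from a central queue, so user jobs are never bound to a node that turns out to be broken or busy.

```python
import queue

# Toy illustration of pilot-based ("late binding") submission.
# Real systems such as GlideinWMS submit pilots to sites; once a
# pilot is running on a healthy worker node, it pulls actual user
# jobs from a central queue. All names below are illustrative.

def run_pilot(job_queue, results):
    """A pilot claims work only after it is already running."""
    while True:
        try:
            job = job_queue.get_nowait()
        except queue.Empty:
            return  # no work left: the pilot simply exits
        results.append(job())  # execute the user job's payload

user_jobs = queue.Queue()
for n in range(5):
    user_jobs.put(lambda n=n: n * n)  # stand-ins for real workloads

results = []
for _ in range(3):                 # three pilots for five jobs:
    run_pilot(user_jobs, results)  # each pilot drains what it can

print(sorted(results))
```

The scalability benefit in the real systems comes from decoupling resource acquisition (pilots) from job scheduling (the queue), which this sketch only hints at.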
~ Steve Cox
Panda Workload Management System
Since the start of data taking, Panda usage has gradually ramped up, reaching 840,000 jobs processed daily by the fall of 2010. It has remained at consistently high levels ever since.
Given the upward trend in load due to the increase in beam intensity and data taking rates, the Panda team is facing a new set of challenges in the areas of database scalability and monitoring system efficiency. These challenges are being met with aggressive R&D directed at two goals. The first is to implement a scalable, efficient storage system for Panda’s monitoring data; to that end, we are adopting a NoSQL solution (Cassandra). The second is to better integrate the existing, diverse components of the monitoring software.
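One common pattern for storing monitoring time series in a store like Cassandra is time-bucketed keys: each row holds one time bucket of events for one metric, so rows stay bounded and a "last N hours" query touches only a handful of rows. The sketch below (plain Python, no Cassandra driver; the key layout and names are our own illustration, not Panda’s actual schema) shows the bucketing idea:

```python
from datetime import datetime, timezone

# Illustrative time-bucketing for monitoring events, the sort of
# key design often used with Cassandra wide rows. This is NOT
# Panda's actual schema; it only demonstrates the partitioning idea.

BUCKET_SECONDS = 3600  # one row per metric per hour

def bucket_keys(metric, ts):
    """Return (row_key, column_key) for an event at timestamp ts."""
    epoch = int(ts.timestamp())
    bucket = epoch - epoch % BUCKET_SECONDS
    row_key = f"{metric}:{bucket}"   # partitions data by hour
    return row_key, epoch            # columns sorted by event time

ts = datetime(2010, 11, 1, 12, 34, 56, tzinfo=timezone.utc)
row, col = bucket_keys("jobs_finished", ts)
print(row, col)
```

Because all events in an hour share a row key, writes for that hour go to one partition and a range scan over column keys returns them in time order, which is what a monitoring dashboard needs.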
We have also welcomed a new group of Panda users from the Daya Bay and LBNE neutrino physics research team at Brookhaven National Laboratory; we continue to work with them to better handle their Monte Carlo production needs.
~ Maxim Potekhin
CMS Production Report
The LHC continues to perform brilliantly, setting new instantaneous luminosity records; in one week alone, over 100 inverse picobarns were delivered to the detector. Higher luminosity also increases “pile-up,” in which the detector triggers on more than one event at a time. Expect CMS jobs to run longer and have larger memory footprints as the LHC continues to deliver! Over the past few months, we have sustained more than 400,000 hours per day of useful computing across the Tier 1, Tier 2, and Tier 3 sites on the Open Science Grid.
~ Burt Holzman
CMS & ATLAS Thumbnails
Red Hat awards “Cloud Leadership Award” to the University of Wisconsin Center for High Throughput Computing
The award is another step in the growing recognition of distributed high throughput computing: its effectiveness, the need for it, the technologies the Condor project offers, and the vision and innovation its leadership has brought, and will continue to bring, to the research and scientific community.
Grid Colombia Launching
We are pleased to report that the Grid Colombia pilot launched 1 April 2011.
The project launched with nearly 20 institutions running the OSG software stack in order to share their computing resources. We have worked to foster a friendly environment similar to that of the OSG, and the result is a great deal of enthusiasm to collaborate.
Our next step is to work towards going into production. There are several ideas we must explore and implement first, however.
For instance, we would like to investigate the Engage model so that we can understand the OSG approach to attracting new users. We would also like to identify some resources that one or two people can try accessing in an OSG production environment, and to set up a Stack Overflow-style tool for building our own knowledge base of configurations, installations, and so on.
As you can see, there are still a lot of things to do. But we are hopeful that our strategy to get new funds for developing projects using this new infrastructure will succeed.
~ Professor Harold Enrique Castro Barrera
From the Executive Director
In April, I attended the European Grid Infrastructure (EGI) User Forum in Vilnius, Lithuania.
I was there as a member of the OSG External Advisory Committee, as well as to give a keynote on the US Shared Cyberinfrastructure. (Ian Fisk of Fermilab also gave a keynote, on behalf of the WLCG.)
The event was well attended, with about 350 participants. I enjoyed several interesting demonstrations; the structural biology portal (eNMR VO), scientific cloud infrastructure services (e.g., StratusLab), and the new user support system (GGUS) were particularly noteworthy.
The challenges EGI faces mirror ours, and they confront a number of difficult questions. Are virtualization technologies beneficial? If so, are they mature enough to be part of a production facility? How can we make the identity and trust infrastructure more usable and coherent while maintaining an appropriate level of protection and security?
EGI is addressing the latter with a new service, now under discussion, that would provide certificate translation between any two implementations of identity certificates. The service will most likely be a collaboration between the SWITCH and Shibboleth projects.
Overall, it was an interesting meeting, and important given the continuing evolution of the services that federate the EU and US grids used by our science communities.
~ Ruth Pordes