
The transition for the OSG Software Stack from Pacman to RPMs.

0.0 Executive Summary

Summary: This document describes how we will transition from Pacman to native packages (RPMs and debs). It covers many details, particularly:

  • Our plan to stop new development of Pacman packages and to begin RPM packaging as soon as possible.
  • The set of packages that will be in the VDT, with notes about whether we need to package them ourselves and the order in which we would work on them.
  • Implications for our end-user community
  • Implications for our testing (internal and ITB)
  • Implications for our documentation

0.1 Known issues that will be addressed in future versions of this document.

  • CA Certs: only from GOC? Also from VDT? IGTF vs. non-IGTF? Different packaging so you can choose subset? (Note from BB: I integrated these below).
  • How do LIGO's requests for SL6 and Debian 6 support mesh with this plan?

0.2 Open questions for readers

  • Sysadmins: how do you feel about the requirement that we depend on EPEL?


1.0 General Approach

Our general approach is documented in our Community Packaging Proposal.

Here is a brief summary of our approach, as quoted from that document:

Proposed Principle of Community Packaging:
The OSG Software Team should be a good community citizen when it comes to packaging: When possible, we should use packages from existing and/or broader communities; when that is not possible, we should make our own packages and contribute them back to the broader communities. Therefore, we should package software only when one of the following is true:

  1. The software is not already packaged; or
  2. The software is packaged but needs significant changes to be acceptable to our users. (Different version, extra patches, etc...)

Otherwise we should use the existing packaging provided by external developers or software repositories.

While this does not mention native packages, that is the implication: we can only use packages from a larger community effectively if we are using the same packaging mechanism with the same standards.

2.0 Outline of timeline

There are two portions to our current timeline.

  1. Finish pending work on Pacman-based releases, then stop any non-critical work on the Pacman-based releases.
  2. Immediately begin work on transitioning our infrastructure and products to RPMs for Red Hat Enterprise Linux 5.

2.1 Finish pending work on Pacman-based releases.

We want to stop new work on Pacman packages as soon as possible, but we have some lingering work to finish up to ensure that we are well-positioned to provide our users with a working VDT while we make the transition to RPMs. We plan for the following.

| Date | Effort | What |
| June 10 | | As of June 10, we won't accept any new requests for changes to the Pacman cache. |
| June | 2-3 FTE Days | Release OSG 1.2.20 (VDT 2.0.0.p27), which is currently in testing in the ITB. This provides some long-requested changes to GRAM for CMS, among other minor updates. This release will not require much effort from the OSG Software Team: perhaps a day or two to deal with minor problems and the release process. |
| July | 1 FTE Week | Minor update to address low priority security issues and a GIP update from CMS. While the security updates are low priority, we want to ensure they are addressed because some of our users may be using the Pacman-based releases for a long time, and it's best to have these resolved. (To be precise, we need to update Squid and MySQL-JDBC.) This should take less than a week of one person in the OSG Software Team. |

After these releases, we will not make any plans to do updates to the Pacman-based packaging of the VDT unless there are critical updates. By "critical update", we mean a moderate or high priority security update, or anything approved by the OSG Executive Team.

Please note: This means that the requested VOMS/VOMS-Admin upgrade will not be provided in Pacman-based packaging.

2.2 Immediately begin work on RPMs for RHEL 5

We will immediately begin the transition to producing the OSG Software Stack as RPMs for Red Hat Enterprise Linux 5 and variants (especially Scientific Linux 5). VDT installations will depend on EPEL (a trusted third-party software repository). Please Note: Because we are relying on EPEL, this implies a transition to Globus 5.

Rough timeline:

| Date | Effort | What |
| mid-August | ?? | glexec + worker node + osg-client + VOMS server, ready for OSG site admin meeting |
| December | ?? | Complete CE installation, ready for LHC downtime |

2.2.1 Order of packages to release

  1. vdt-system-profiler: We will first ship an updated version of the vdt-system-profiler. This is a simple script that allows users to collect debugging information to be sent to VDT support when help is needed. This is our first target because the package itself is simple, but it will allow us to make sure our infrastructure is in place. (Building RPMs, yum repositories, etc.)
  2. Xrootd: Xrootd (without the DSI Plugin) will be our next target. This is already mostly done, so the goal is to make sure that our infrastructure works properly.
  3. Xrootd DSI Plugin: This will require us to use Globus.
    • Note from BB: We have already done this for the Hadoop DSI plugin.
  4. glexec: CMS wants an upgraded version of glexec. While it is not difficult to build, there are several dependencies (lcas, lcmaps, VOMS, and Globus) that need to be addressed.
    • Note from BB: VOMS is an interesting issue, as (last I checked) the VOMS in EPEL doesn't have the "--dont-verify" flag. This is possibly the first place where we'll have to decide between "copy and maintain a patch" versus "work with upstream". This is based on 9-month-old information; may have changed.
  5. Worker Node Client: This is "low-hanging fruit". Many of these packages have already been done by CMS and we can use the results. They also require minimal configuration to get working.
    • Note from BB: We have proof of concept source packages for all of the WN-client except Fermi-SRM-Client.
  6. OSG Client: Probably falls under the "starter" package category in complexity, but it does possibly affect end-users. (Added by BB)
  7. VOMS Admin: This is the first Tomcat+MySQL application we'll need to support. This would be the first test of our ability to port complex configuration scripts over.
  8. OSG-CE: This is the most complex software we offer. It's going to be necessary to start very early, as it will take much longer than any other software, and will require coordination with several pieces of software that have never had native packaging. (Added by BB)
  9. CREAM: EMI is also doing source packages for this; it might be beneficial to let them shake out the bugs first.
    • Note from BB: The following software will be in both OSG-CE and CREAM, and have never had native packages made: Gratia probes, GIP, OSG-RSV, Configure-OSG, Fermi-SRM-Client.

TBD:

  1. Gratia Collector. Small user base (5-6). Requires MySQL+Tomcat; might be easy after VOMS Admin. (Added by BB)
  2. VDT-CA-Manage. This is to manage the set of CAs; while we have RPMs for CAs right now, they're far less powerful than VDT-CA-Manage. We probably don't want to regress on these capabilities, as they were strongly desired by the OSG Technical Director. (Added by BB)
  3. Mostly-unused-packages: dccp, bwctl, ndt, npad, MonaLisa, owamp. These packages all provide some level of functionality to a small crowd. I don't think we should consider removing them (for now), but we might want to state they won't happen for the initial release. (Added by BB)
  4. Pegasus client: Last I checked with the Pegasus folks, they now glide-in their own client. We should check to see if this is still necessary. (Added by BB)
  5. CA certs. Who defines the starting set of CA certs? Can it be only GOC?

What do we say about timeline? Goal of X, Y, and Z by... August 31?

2.2.2 Supported operating systems

Initially we plan to support Red Hat Enterprise Linux 5 and variants (particularly Scientific Linux 5) in both 32 and 64-bit architectures.

Note the absence of other platforms, at least initially: no RHEL 4, no SuSE, no Debian. More platforms will come later, but which ones and when is not yet specified. (The software we provide to LIGO on Debian will continue to exist.)

3.0 Packages: What can we borrow, what do we have to create?

This is a list of packages from the VDT. For each package we say whether it will be in the VDT RPM repository and where its packaging comes from.

Please Note: This is not the complete set of RPMs, not even close. In many cases, there are multiple RPMs for each package (perhaps server, client, and library RPMs) and there are many dependencies (for example, lcg-utils depends on GFAL and cgsi-gsoap). In addition, some will be rather complicated to package. Therefore you cannot easily derive effort from this table, but you can get a feeling for how much needs to be packaged just for the VDT and how much we can borrow from other sources.

(The "Order/Bundle" column is the most useful way to group the table; it corresponds to the packaging order described in section 2.2.1.)

| Name | Will be in VDT repo? | Packaging Source | Order/Bundle | Dependencies |
| Ant | No | OS | | |
| Apache | No | OS | | |
| BC-Provider | No | EPEL | | |
| BDII | No | N/A | | |
| Berkeley-DB | No | N/A | | |
| Bestman | Yes | VDT | | |
| blahp | Yes | EMI | #9: CREAM | |
| BWCTL | Yes | VDT | #6: OSG-Client | |
| CA-Certificates-Manager | Yes | VDT | #3: Xrootd-DSI | |
| CA-Certificates-Updater | Yes | VDT | #3: Xrootd-DSI | |
| CA-Certificates | Yes | OSG | #3: Xrootd-DSI | |
| CEMon-Server | Yes | EMI | #8: OSG-CE | |
| Condor-Cron | Yes | VDT | | |
| Condor | Yes | Condor | | |
| CREAM | Yes | EMI | #9: CREAM | |
| VDT/OSG Configuration | Yes | VDT | | |
| Curl | No | OS | #5: Worker-Node | |
| Dccp | Yes | EPEL | #5: Worker-Node | |
| EDG-GridFTP-Client | Yes | VDT? | #5: Worker-Node | |
| EDG-Make-Gridmap | Yes | EMI | #8: OSG-CE | |
| Expat | No | OS | | |
| Fetch-CRL | No | EPEL | #5: Worker-Node | |
| Generic-Information-Provider | Yes | EPEL | #8: OSG-CE | |
| Glexec | Yes | EMI or glexec | #4: glexec | |
| GlideinWMSVOFrontEnd | Yes | GlideinWMS | | |
| gLite-FTS-Client | Yes | EMI | #5: Worker-Node | |
| Globus 5 | Mixed | EPEL/VDT | #3: Xrootd-DSI | |
| Globus Condor NFSlite Jobmanager | Yes | VDT | #8: OSG-CE | |
| Globus-ManagedFork-Setup | Yes | VDT | #8: OSG-CE | |
| Gratia Probes | Yes | VDT | #8: OSG-CE | |
| Gratia Server | Yes | VDT | | |
| GSIOpenSSH | Yes | VDT | #6: OSG-Client | |
| GUMS | Yes | VDT | #8: OSG-CE | |
| Java 1.6 | No | OS | #7: VOMS-Admin | |
| KX509 | Yes | VDT | | |
| LCG-Info | Yes | EMI | #6: OSG-Client | |
| LCG-Infosites | Yes | EMI | #6: OSG-Client | |
| LCG-Utils | Yes | VDT? | #5: Worker-Node | |
| LFC-Client | Yes | VDT? | #5: Worker-Node | |
| Logrotate | No | OS | | |
| Metronome | No | N/A | | |
| MonaLisa | No | N/A | #8: OSG-CE | |
| MyProxy | No | EPEL | #5: Worker-Node | |
| MySQL | No | OS | #7: VOMS-Admin | |
| NDT | Yes | VDT | #6: OSG-Client | |
| Nmap | Yes | VDT | #6: OSG-Client | |
| NPAD-Client | Yes | VDT | #6: OSG-Client | |
| OpenLDAP | No | OS | #5: Worker-Node | |
| OSG-Discovery | Yes | VDT | #6: OSG-Client | |
| OSG-Match-Maker | Yes? | VDT | | |
| OSG-RSV | Yes | VDT | #8: OSG-CE | |
| OSG-Site-Verify | No | N/A | #8: OSG-CE | |
| OSG-Site-Web-Page | Yes | VDT | #8: OSG-CE | |
| OSG-VO-Map | Yes | VDT | #8: OSG-CE | |
| OWAMP | Yes | VDT | #6: OSG-Client | |
| Pegasus | Yes | Pegasus? | #5: Worker-Node | |
| Perl-Modules | No | EPEL/OS | | |
| Pigeon-Tools | Yes | VDT | | |
| PPDG-Cert-Scripts | Yes | VDT | #6: OSG-Client | |
| PRIMA-GT4 | No | N/A | #8: OSG-CE | |
| PRIMA | Yes | VDT | #8: OSG-CE | |
| Procd | Yes | VDT | | |
| PyGlobus | No? | N/A | | |
| PyGridWare | No? | N/A | | |
| Python | No | OS/EPEL | | |
| Squid | No | OS | | |
| SRM-Client-Fermi | Yes | VDT | #5: Worker-Node | |
| SRM-Client-LBNL | Yes | VDT | #5: Worker-Node | |
| Tomcat-5.5 | No | OS | #7: VOMS-Admin | |
| UberFTP | Yes | VDT | #5: Worker-Node | |
| VDT-System-Profiler | Yes | VDT | #1: vdt-system-profiler | |
| VOMRS | No | N/A | | |
| VOMS-Admin | Yes | EMI | #7: VOMS-Admin | |
| VOMS | Yes | EMI | #4: glexec | |
| Wget | No | OS | #5: Worker-Node | |
| Xrootd | Yes | Xrootd | #2: Xrootd | |
| Xrootd-DSI | Yes | VDT | #3: Xrootd-DSI | |

4.0 Implications for end-user system administrators

4.1 Migration from Pacman to RPM

We will need to provide assistance for our users to migrate their existing Pacman installations to RPM installations. This will be some combination of documentation and/or utilities to assist with the migration. In some cases (such as preserving the contents of databases for things like VOMS), this will be significant work.

4.2 Non-root installations & multiple installations

RPMs do not naturally support non-root installs or multiple installations on a single computer. These are possible, but they are not the common, standard procedures for using RPM. Some VDT users, particularly ATLAS (which installs client software multiple times as non-root on a shared filesystem), will need a solution for installing in these scenarios. There are at least two solutions:

  • Provide relocatable RPMs (this is extra work in addition to simply making standard RPMs), then provide tools to install those RPMs as non-root. (A sketch of such a tool follows this list.)
  • Provide the VDT (or the necessary subset) in an alternative packaging format. Tarballs may be sufficient here, but we do not yet have a complete understanding of how people would maintain an installation based on tarballs.
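
As a sketch of the first option, the non-root install tool could be little more than a wrapper around rpm's --prefix and --dbpath options. Everything below is illustrative: the tool itself is hypothetical, and it assumes the RPMs involved have actually been built relocatable.

    #!/usr/bin/env python
    # Illustrative sketch only: install a relocatable RPM as a non-root user
    # into a private prefix, using a private RPM database so the system
    # database in /var/lib/rpm is never touched.
    import os
    import subprocess
    import sys

    def install_relocatable(rpm_file, prefix):
        dbpath = os.path.join(prefix, "var", "lib", "rpm")
        if not os.path.isdir(dbpath):
            os.makedirs(dbpath)
            # Create the private RPM database on first use.
            subprocess.call(["rpm", "--initdb", "--dbpath", dbpath])
        # --prefix only works for packages built to be relocatable; scriptlets
        # that assume root privileges may still need special handling.
        rc = subprocess.call(["rpm", "--install", "--dbpath", dbpath,
                              "--prefix", prefix, rpm_file])
        if rc != 0:
            raise RuntimeError("rpm exited with status %d" % rc)

    if __name__ == "__main__":
        install_relocatable(sys.argv[1], sys.argv[2])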

4.3 Backups of previous installations (for rollback)

One benefit of Pacman's ability to install in multiple locations on disk is that it is very easy to back up a VDT installation to another directory. This is because the Pacman-based VDT installation is in a directory that is separate from all other installed software and not combined into standard FHS locations. Because it is easy to back up, system administrators have the ability to roll back to previous installations in case there are problems with an upgrade. This is much harder (impossible?) to do with RPM: it is not easy to do rollbacks in general. To some extent we can help administrators by sharing best practices for upgrading systems (testing on separate computers, etc.), but we also have to accept losing the ability to roll back an installation.
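
As one example of the kind of best practice we could document, an administrator can at least snapshot the installed package list and configuration before an upgrade, so a damaged node can be rebuilt by hand. This is only a sketch, not a rollback mechanism, and the paths below are placeholders.

    #!/usr/bin/env python
    # Illustrative pre-upgrade snapshot: record what is installed and copy
    # configuration aside. This is a record for rebuilding, not a rollback.
    import os
    import shutil
    import subprocess
    import time

    def snapshot(backup_root="/root/osg-upgrade-backups",
                 config_dir="/etc/osg"):         # hypothetical config path
        dest = os.path.join(backup_root, time.strftime("%Y%m%d-%H%M%S"))
        os.makedirs(dest)
        # Exact package list, including versions, as installed right now.
        out = open(os.path.join(dest, "rpm-qa.txt"), "w")
        subprocess.call(["rpm", "-qa"], stdout=out)
        out.close()
        # Copy aside the configuration an upgrade might rewrite.
        if os.path.isdir(config_dir):
            shutil.copytree(config_dir, os.path.join(dest, "config"))
        return dest

    if __name__ == "__main__":
        print "Snapshot written to", snapshot()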

4.4 Retraining

The transition to RPMs will require retraining for OSG site administrators. This is unlikely to be a significant problem for larger sites that already have considerable experience with RPMs, but it will be a problem for smaller sites such as Tier-3s that may be run by graduate students with limited time for system administration.

4.5 Site customization

From Steve Timm: "Heavily customized sites such as FermiGrid? would have to redo much of the customization that we have done, both in modification to job managers and in our internal monitoring. This would be amplified if the "community" version of Globus shifts to GT5.x as part of this process. (GT5 is something that Fermilab dearly wants to see deployed but it will take us work to adapt to it when it is deployed.)"

From Steve Timm: "A transition to RPMS which locate their files in standard Linux locations such as /usr and /var would necessitate a OS reconfiguration of many OSG head nodes which are currently configured to have their VDT installed in a separate partition like /opt or /usr/local."

5.0 Implications on configuration management

Our existing configuration tools (configure-osg and the host of underlying scripts used by configure-osg) will need significant porting and/or rewriting to be usable with RPMs. The existing scripts rely on many characteristics of Pacman-based installations and specific installation choices we made. The scripts also blend two kinds of configuration. These are:

  1. Install-time configuration: These are things like making the initial version of a configuration file, initializing a database table, creating a user, and making an init script. These are tasks that are appropriate to do during the installation of an RPM. This kind of configuration will be done as part of our RPMs. That is, the work of creating the RPMs for the VDT will involve rewriting our existing configuration scripts and melding them with the RPMs. (Of course, pre-existing RPMs from other packaging sources will likely already have this work completed.)
  2. Post-install configuration: These are scripts (like configure-osg) that allow system administrators to change the configuration of the software after the installation. We do not yet have a story on what we will do for this kind of configuration. Should we merely document what needs to be done? Should we port our existing configure-osg? Switch to YAIM? Suggest that all sites use a configuration management system like cfengine and provide tools that work with one or more of these configuration management systems?
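
To make the second category concrete, here is a rough sketch of what a ported post-install configuration tool could do: read an admin-edited INI file and regenerate the files a service actually reads. The file names, section names, and options are hypothetical; this is not the real configure-osg interface, only an illustration.

    #!/usr/bin/env python
    # Hypothetical post-install configuration sketch: turn an admin-edited
    # INI file into a key=value fragment for a service. Not a real tool.
    import os
    import ConfigParser              # Python 2, as shipped with RHEL 5

    def apply_site_config(ini_path="/etc/osg/site.ini",
                          out_path="/etc/osg/generated/site-info.conf"):
        parser = ConfigParser.SafeConfigParser()
        parser.read(ini_path)
        site_name = parser.get("Site Information", "resource")
        contact = parser.get("Site Information", "contact")
        out_dir = os.path.dirname(out_path)
        if not os.path.isdir(out_dir):
            os.makedirs(out_dir)
        out = open(out_path, "w")
        out.write("SITE_NAME=%s\n" % site_name)
        out.write("SITE_CONTACT=%s\n" % contact)
        out.close()

    if __name__ == "__main__":
        apply_site_config()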

6.0 Implications for internal VDT testing

The VDT team uses an internal test suite to verify that our Pacman installations work correctly. It exercises many aspects of the VDT installation process and the functionality of the software. It is not used outside UW-Madison; in particular, it is not used by the ITB, and it is tuned for the needs of our pre-release testing. We cannot use this existing internal test suite for the RPM-based VDT without significant updates.

6.1 Need to publish test results against latest system updates

Today the VDT only does internal testing against un-updated versions of the OS. We need to add a new class of tests to verify that our existing releases work correctly with updated versions of the OS. The test results should be published publicly so that both we and our users know whether OS updates work correctly with the VDT. These tests are similar or identical to our pre-release tests, but we will need to extend our set of tests, work with updated platforms, and publish the results. We may need a variety of OSes to test with (RHEL 5, CentOS, and Scientific Linux).

6.2 Need to update installation process

We must replace our installation process, which currently uses Pacman. There are a couple of options for this:

  • Write new scripts to do RPM installations that directly replace the scripts that do Pacman installations. If we just wanted to hack our existing scripts, we could probably replace the Pacman installation with an RPM installation in a day or two. It would probably take a few more days to figure out how to properly remove RPMs and validate that they are removed. (A sketch of this option follows this list.)
  • This would be a good opportunity to significantly overhaul our testing suite and start using virtual machines (likely in the Batlab). This would take a lot more investigation: it would probably take a few days to investigate the problem and form an effort estimate for completion. This would allow us to always do our RPM installations in a known environment (no problems caused by earlier VDT installations).
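
As a rough sketch of the first option, the Pacman install step in the harness could be replaced with something like the following. The package name is an example, and a real harness would need far better error reporting.

    #!/usr/bin/env python
    # Sketch: install packages with yum, verify them with rpm, remove them,
    # and verify the removal. Example package names only.
    import subprocess

    def run(cmd):
        return subprocess.call(cmd)

    def is_installed(pkg):
        # "rpm -q" exits non-zero when the package is not installed.
        return run(["rpm", "-q", pkg]) == 0

    def install_remove_cycle(packages):
        assert run(["yum", "-y", "install"] + packages) == 0, "install failed"
        for p in packages:
            assert is_installed(p), "%s missing after install" % p
        assert run(["yum", "-y", "remove"] + packages) == 0, "remove failed"
        for p in packages:
            assert not is_installed(p), "%s still present after removal" % p

    if __name__ == "__main__":
        install_remove_cycle(["vdt-system-profiler"])    # example package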

6.3 Update existing tests

Almost all of our existing tests will need to be modified because they rely on hardcoded paths within a Pacman installation.  For example, 33 out of 38 test files contain $VDT_LOCATION.  We believe that the changes are mostly straightforward, but it will take significant time to update the tests.
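
One plausible (and purely illustrative) approach is a small translation table from Pacman-era paths to their eventual FHS locations, so the tests can be updated mechanically; the FHS paths below are guesses, not packaging decisions.

    # Illustrative only: map old $VDT_LOCATION-relative paths to guessed
    # FHS locations so existing tests can be updated mechanically.
    FHS_PATHS = {
        "$VDT_LOCATION/globus/bin": "/usr/bin",
        "$VDT_LOCATION/globus/etc": "/etc/globus",
        "$VDT_LOCATION/vdt/etc": "/etc/osg",
    }

    def translate(old_path):
        for old_prefix, new_prefix in FHS_PATHS.items():
            if old_path.startswith(old_prefix):
                return old_path.replace(old_prefix, new_prefix, 1)
        return old_path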

Some of the tests in our test suite no longer make sense because they are tied to the Pacman infrastructure. We would consider doing similar appropriate tests (such as verifying we can remove VDT RPMs correctly) but they would be written from scratch.

6.4 Internal learning

The VDT team needs training and infrastructure to support manual testing. When we need to do testing during development, how do we do it? Will we use virtual machines? How will we make those virtual machines?

We also need to write/update documentation for our new testing framework.

6.5 Possible improvement: new test framework

We should consider updating our existing Perl test framework (Test::More). It is a dated framework that has been supplanted by new, improved, shiny frameworks. That doesn't mean it's bad, but this is the right time to consider whether we should move to an alternative.

7.0 Implications for integration testbed testing

7.1 Impacts on ITB testing

  • Do ITB sites have to test both Pacman and RPM installations? In general, this might imply double the effort, or at least double the knowledge.

7.2 Possibilities for improving the ITB

The change in the packaging infrastructure provides the opportunity to change the testing infrastructure to make it more responsive and to allow for quicker, more targeted releases. In addition, a switch to RPM-based packaging allows for testing that previously would have been difficult to conduct.

7.2.1 Testing smaller changes

With RPMs we should be able to have more granularity in which components can be updated at a given point in time because it is easier and more reliable to install specific sets of software with RPM than with Pacman. This allows for testing of specific components instead of monolithic releases that update components en masse. For example, just Gratia probes could be tested for a release. Although this implies that testing would be a more continual process rather than the cyclical ITB testing process, it would also allow for quicker releases of packages and better responsiveness to stakeholder requests.

That said, there is a potential downside to this: more frequent testing will lead to steadier demands on our limited ITB effort and may lead to ITB tester fatigue; people might just get sick of constant ITB testing. The current test cycles have downtime between them, which gives ITB resource admins the opportunity to work on other things. With ongoing releases, effort will probably be needed more frequently, although the overall effort levels may remain the same. Some of the new effort requirements may be mitigated by having ad-hoc groups of ITB admins who are interested in a given change do the testing, allowing admins who are not affected by the change to work on other things.

7.2.2 Automated testing

Given the infrastructure available for handling RPM installation and configuration, the switch to RPM-based packaging allows for the establishment of automated installation and test stands. With the addition of a configuration management system (like cfengine or puppet), it would be possible to set up infrastructure that does daily installation and configuration of the VDT RPM packages and runs simple tests after the software has been set up. With the addition of ITB robot jobs, quite a bit more functionality can be exercised on a daily basis.
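
A minimal sketch of such a daily driver follows, assuming a freshly provisioned (virtual) machine with the OSG yum repository already configured. The meta-package name is a placeholder, and the "smoke test" here is deliberately trivial.

    #!/usr/bin/env python
    # Sketch of a daily automated install-and-smoke-test run. Assumes a
    # clean machine and an already-configured OSG yum repository.
    import subprocess
    import sys

    PACKAGES = ["osg-wn-client"]     # hypothetical meta-package name

    def run(cmd):
        print "+", " ".join(cmd)
        return subprocess.call(cmd)

    def main():
        if run(["yum", "-y", "install"] + PACKAGES) != 0:
            return 1
        for pkg in PACKAGES:
            # Confirm each package actually landed in the RPM database.
            if run(["rpm", "-q", pkg]) != 0:
                return 1
        # A real test stand would now run functional tests (for example,
        # ITB robot jobs) against the freshly installed software.
        return 0

    if __name__ == "__main__":
        sys.exit(main())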

8.0 Implications for Documentation

All of our existing documentation assumes that installations are being done with Pacman and that end-users can install the VDT in any disk location they like. The documentation is littered with references to Pacman and $VDT_LOCATION.

It will be a significant effort to create new documentation for the new RPM-based VDT installation. It is difficult to estimate the effort, but historically speaking, documentation has been slow to write. That said, we assume that the documentation can be gradually written over time.

We should note that Robert Engel is leaving OSG at the end of September and may be unavailable for help with the creation of the documentation.

9.0 Versioning

Warning: This whole section was added by Brian Bockelman and not yet agreed to by Alain.

We've previously said that the minor version number (as in, OSG 1.2 -> 1.3) is only incremented when upgrades aren't possible. This makes the major version number mostly meaningless, leads to a very large patch level (1.2.19), and means that major changes happen between patch levels. I'd like to fix this while we are doing a clean sweep.

The point of making versioning changes is to communicate within the version number the amount of risk one takes in performing the upgrade.

We use the traditional MAJOR.MINOR.PATCH numbering scheme. By "version number", we are referring to the version number of any OSG meta-package RPMs.

9.1 Major version number

  • Significant new services will only be added on major releases.
  • We don't guarantee configuration file compatibility across major releases (although we will keep the existing policy of making configuration files as backward compatible as possible). We are allowed to obsolete, remove, or semantically change configuration options. We will avoid semantic changes if at all possible.
  • We don't guarantee ABI compatibility across major releases.
  • We don't guarantee protocol compatibility across major releases.
  • Major version changes will be ITB-tested.
  • We support the previous two major versions (i.e., about 18 months of support for a release).
  • Each major release will have a separate repository to prevent inadvertent upgrades. To upgrade across a release, we can use yum plugins to do something like "yum updateosg 3" to switch to the OSG 3.0 repository. We expect the upgrade process to be:
    1. Switch yum repository using custom yum command.
    2. Run "yum update".
    3. Edit configuration files. The average major release is thus far, far simpler than major releases in the past. (A sketch of the repository switch appears after this list.)
  • The MAJOR release repository will contain the RPMs corresponding to version MAJOR.0.0.
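
For illustration, the repository-switching step behind a "yum updateosg N" helper could be as simple as rewriting a single .repo file before running "yum update". The file name, URLs, and repository layout below are hypothetical.

    #!/usr/bin/env python
    # Hypothetical sketch of what "yum updateosg N" might do: point the OSG
    # repo definitions at a different major release; the admin then runs
    # "yum update". File names and URLs are placeholders.
    import sys

    REPO_FILE = "/etc/yum.repos.d/osg.repo"

    def repo_lines(major, channel):
        return [
            "[osg-%s]" % channel,
            "name=OSG %s %s for EL5" % (major, channel),
            "baseurl=http://repo.example.org/osg/%s/el5/$basearch/%s"
                % (major, channel),
            "enabled=1",
            "gpgcheck=1",
            "",
        ]

    def switch_major(major):
        lines = repo_lines(major, "release") + repo_lines(major, "updates")
        f = open(REPO_FILE, "w")
        f.write("\n".join(lines))
        f.close()

    if __name__ == "__main__":
        switch_major(sys.argv[1])    # e.g. "3"; then run "yum update"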

Note that, while we have a significant leeway for change at each major release, we aren't necessarily going to change things. We will continue to be conservative in breaking things - but this will provide a predefined point when we need to.

Also note that previous minor releases would now be considered major releases due to the configuration option changes.

9.2 Minor version number

  • Configuration files will be backward, but not necessarily forward compatible. A configuration file working with minor version X will work with minor version Y if X<Y.
  • Minor version changes will be ITB-tested. We may change this item in the future if ITB effort is reduced.
  • Minor updates will be available in an -updates repo.
  • We expect updating across minor version numbers to be "safe", but admins are advised to not do them automatically.
  • Forward and backward ABI and protocol compatibility is expected.
  • Services may have new capabilities or versions in minor releases.
    • New packages may be added in a minor release if the upgrade of an existing service requires it.
  • Releases will happen on the 1st or 3rd Tuesday of the month. (TODO: check that this jibes with the current policy.)
  • The MAJOR updates repository will contain the RPMs corresponding to the latest MAJOR.MINOR.PATCH.

9.3 Patch-levels

  • Configuration variables may not change.
  • Reserved only for bugfixes or security fixes.
  • ABI compatibility is guaranteed.
  • Patch-level fixes may skip ITB if necessary.
  • Patch-level upgrades are expected to be completely safe.
  • We expect these updates to be safe to apply automatically.
  • Patch-level fixes will go into the -updates repo.

9.4 Exceptions

There are cases where we will make exceptions to the versioning and release policy outlined above.

  • If we find a compelling reason to break the above process, the OSG-ET must approve it. It will be noted in the release notes and separately emailed to osg-sites.
  • The OSG 2.0 release cycle will be special because we will not transition all OSG 1.2 services in one release. OSG 2.0 might start with only the worker node client (or less).
    • Services existing in OSG 1.2 will be added in minor releases of OSG 2.0. The new services will go into the release, not the updates repository. Compatibility guarantees are unchanged. The minor updates of OSG 2.0 will be tested by ITB.
    • Services not in OSG 1.2 will wait for OSG 3.0.

9.5 Repository Content Examples

Suppose OSG 2.0 has an RPM called foo. OSG 2.0.0 released with foo-4.0.3, OSG 2.1.0 released foo-4.0.4, and OSG 2.1.1 released foo-4.0.5. According to the above policy, the OSG2 release repo will contain foo-4.0.3 and the OSG2 updates repo will contain foo-4.0.5. foo-4.0.4 will be available from Koji, but not from the release or updates repository.
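
The policy can also be stated mechanically. The tiny illustration below hard-codes the foo release history purely for clarity.

    # Tiny illustration of the repository policy, using the foo example.
    history = {              # OSG version -> version of foo it shipped
        "2.0.0": "4.0.3",
        "2.1.0": "4.0.4",
        "2.1.1": "4.0.5",
    }

    def repo_contents(major):
        osg_versions = [v for v in history if v.startswith("%d." % major)]
        osg_versions.sort(key=lambda v: map(int, v.split(".")))
        return {"release": history["%d.0.0" % major],  # the MAJOR.0.0 foo
                "updates": history[osg_versions[-1]]}  # foo from latest OSG

    print repo_contents(2)   # release: 4.0.3, updates: 4.0.5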

10.0 RPM Development guide

There are many technical details about how we will maintain our RPMs, and they are documented on a separate page, the RPM Development Guide. Ultimately, the guide will likely outlive this transition document and become a key document for the OSG Software Team and external contributors alike.
