Gratia Release v0.38.3 (9/16/08)
Overview
Release v0.38.3 is primarily a maintenance release.
Main Release Features:
- Fully automated install and updates; major simplification of the procedures.
- Support for Firefox 3.
- Summary table for Transfer record.
- Overhaul of replication and GUI replication control.
- Overhaul of collector logging.
- MetricRecord fixes and improvements.
Reports
- Provide a report, for VOs that use many sites, on the distribution of usage across those sites.
- Cleanups of psacct reports.
- Monthly email report summarizing usage by site for OSG VO users, by DN (ready to be sent; waiting for the email address to send it to).
- Weekly report of usage by user on each site.
- Update the ownership and reporting grade reports to use the information from OIM, including properly detecting a 'hidden site'.
- Fix a problem where the timezone setting depended on the load order of the services, which caused report dates to be off (and made drill-through reports useless).
- Update reports to support BIRT 2.3.1, which is required for Firefox 3 support.
Collector Improvements
- Code Cleanups
- Major simplification and rationalization of the log file output (use exclusively log4j)
- 30 logs of each type kept.
- Initialization of the logger is now easy -- properties are read from Logging.java.
- Straighten out the time zone of the printed time.
- Change the relative positions of the custom log levels (CONFIG, FINE, FINER, FINEST, etc.) with respect to the standard log4j levels (INFO, DEBUG, TRACE) to remove confusion.
- Simplification and rationalization of Admin pages.
- Replication Pages:
- Shows entries read-only, except the one you want to edit.
- Uses hibernate (also in Replication Data Pump)
- Sorts entries, "sensibly."
- Allows one to specify the bundle size.
- Fix insert point focus issue in the Admin pages.
- Fix updates of the probe information when the probe sends only a handshake. Previously, a properly configured site with no jobs was reported as broken.
- HouseKeeping
- Reduce size of bunch to avoid out of memory errors.
- Improve responsiveness to interrupt.
- Improve performance of RawXML deletion.
- Delete records without an EndTime (based on the ServerDate instead).
- Fix cleanup of the trace table (used for reporting).
- Fix recording of ApplicationExitCode in the summary table.
- Fix handling of Condor jobs that were terminated by a signal.
- Fix replication for MetricRecord
- Fix duplicate record detection of MetricRecord when replicated.
- Fix recording of information about MetricRecord duplicate.
- Improve indexing for MetricRecord information.
- Enhance error handling when receiving replicated records with bad XML.
- Enhance error handling when loading configuration settings and at startup.
- Fix problem in duplicate record detection for ProbeDetails (handshake)
Additional Features:
.
Anticipated downtime
It is expected that this release will require the Gratia services and reporting to be unavailable beginning at:
- Start: 9/16/08 hh:mm CST
- Available: 9/16/08 hh:mm CST
The factors affecting downtime for this release are:
- Length of time to make a backup of the database to the backup area using mysqlhotcopy
- Installation and validation on the 6 Gratia schemas
Collectors and Databases Affected
The following Gratia collectors and databases will be converted with this release:
- gratia08
- tomcat-fermi_transfer
- tomcat-fermi_osg
- tomcat-fermi_itb
- tomcat-qcd
- tomcat-ps
- gratia09
- tomcat-osg_transfer
- tomcat-osg_daily
- tomcat-osg_integration
- tomcat-itb
- tomcat-gratia
Build v0.38.3 for distribution
- Make sure your build area contains all committed changes.
Done - v0.38 - 9/15/2008 12:15
Done - v0.38.1 - 9/15/2008 14:18
Done - v0.38.3 - 9/18/2008 16:00
- After the initial release of v0.38, the following changes were made as a result of the initial upgrade of the fermi_itb collector:
gratia/common/configuration/create_build-stored-procedures-sql
Missed commit removing JobName from query
gratia/reporting/gratia-reports/WebContent/MenuConfig/UserConfig_osg.xml
Inexplicably missed commit adding the Site Usage by VO report
gratia/collector/gratia-services/net/sf/gratia/services/CollectorService.java
gratia/collector/gratia-services/net/sf/gratia/storage/DatabaseMaintenance.java
Put missing indexes on MetricRecord, MetricRecord_Meta and ProbeDetails_Meta.
- In gratia/build-scripts/Makefile, change the version_default to:
- version_default = v0.39
- Commit the change.
Done - v0.38 - 9/15/2008 12:20
- Tag the release (for all committed changes). We suggest waiting 5 minutes for the tag to be distributed across all nodes at SourceForge.
- cvs rtag -R v0-38-3 gratia
Done - v0.38 - 9/15/2008 12:25
Done - v0.38.1 - 9/15/2008 14:27
Done - v0.38.3 - 9/18/2008 16:18
- As gratia user, in /home/gratia/gratia-releases, export the tagged release:
- cd /home/gratia/gratia-releases
- cvs export -d gratia-v0.38.3 -r v0-38-3 gratia
Done - v0.38 - 9/15/2008 12:35
Done - v0.38.1 - 9/15/2008 14:43
Done - v0.38.3 - 9/18/2008 16:26
- As gratia user, build this release (this ensures that tar files are produced for VDT):
- cd gratia-v0.38.3/build-scripts
- source setup-jdk15.sh
- make release
Done - v0.38 - 9/15/2008 12:38
Done - v0.38.1 - 9/15/2008 14:52
Done - v0.38.3 - 9/18/2008 16:28
- As gratia user, copy the tarballs for this release to a 'save' area:
- cd /home/gratia/gratia-releases/gratia-v0.38.3/target
- cp -p gratia_reporting_v0.38.3.tar /home/gratia/gratia-releases/tarballs/.
- cp -p gratia_services_v0.38.3.tar /home/gratia/gratia-releases/tarballs/.
Done - v0.38.3 - 9/19/2008 10:49
- As yourself (assuming you have permissions), copy the built tar files to the release area:
- cd /home/gratia/gratia-releases/gratia-v0.38.3/target
- scp gratia_reporting_v0.38.3.tar flxi07.fnal.gov:/afs/fnal.gov/files/expwww/gratia/html/Files
- scp gratia_services_v0.38.3.tar flxi07.fnal.gov:/afs/fnal.gov/files/expwww/gratia/html/Files
Done - v0.38.1 - 9/17/2008 07:55
Done - v0.38.3 - 9/18/2008 16:33
- Update the version number on the services release TWiki page:
- Edit and update the TWiki variable ReleaseVersion.
Done - v0.38.1 - 9/17/2008 07:55
Done - v0.38.3 - 9/18/2008 16:35
- As yourself, create a VDT support ticket via email so it gets prioritized with the VDT team. Example:
Done - v0.38.1 - ???????
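The tag and tarball names used in the build steps above follow a regular pattern: the CVS tag is the version with dots replaced by dashes (CVS tag names may not contain dots), and each tarball is named gratia_[component]_[version].tar. The following is a minimal sketch of that derivation. The helper names version_to_tag and release_tarballs are hypothetical and are not part of the Gratia build scripts.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the Gratia build scripts).

# Derive the CVS tag from a release version, e.g. v0.38.3 -> v0-38-3.
version_to_tag() {
    printf '%s\n' "$1" | tr '.' '-'
}

# List the tarball names that "make release" is expected to produce,
# based on the two file names shown in the copy steps above.
release_tarballs() {
    for component in reporting services; do
        printf 'gratia_%s_%s.tar\n' "$component" "$1"
    done
}

version=v0.38.3
echo "tag: $(version_to_tag "$version")"
release_tarballs "$version"
```

Deriving both names from a single version string avoids mismatches between the tag, the export directory, and the tarballs when a point release (v0.38.1, v0.38.3) is cut.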
Database backups and cron/init.d services
The upgrade should not be performed until all database backups have been completed.
If it is expected that the upgrade will take a long time, then
- on the tomcat/collector nodes (gratia08/gratia09)
- the static report crons should be commented out
- the init.d services should be disabled
- on the database node (gratia06)
- the backup crons should be commented out
- the daily reports cron should be commented out
- the Gratia-APEL interface crons should be commented out
On the tomcat/collector nodes (gratia08/gratia09)
- Comment out the root user cron entry for the static reports. Example:
42 0 * * * '/data/tomcat-fermi_itb/gratia/staticReports.py' '/data/tomcat-fermi_itb' 'http://gratia-fermi.fnal.gov:8881/gratia-reporting/'
- Disable init.d services as root user:
- chkconfig tomcat-[GRATIA_INSTANCE] off
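With several instances per node, the chkconfig step can be generated rather than typed by hand. The sketch below only prints the commands (a dry run) so they can be reviewed before being run as root. The function name disable_commands is hypothetical; the instance names are the gratia08 instances listed in this document.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print one "chkconfig ... off" command
# per tomcat instance name given as an argument. Nothing is executed.
disable_commands() {
    for instance in "$@"; do
        printf 'chkconfig %s off\n' "$instance"
    done
}

# Instances on gratia08, as listed in this document.
disable_commands tomcat-fermi_transfer tomcat-fermi_osg \
    tomcat-fermi_itb tomcat-qcd tomcat-ps
```

Once the output looks right, it could be piped to sh as root; keeping the generation and the execution separate is a deliberate choice for a production node.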
On the MySQL server node (gratia06)
- Comment out the cron entries for:
- gratia user cron jobs.
0 0 1-15 * * dir=/home/gratia/interfaces/apel-lcg; cd $dir; ./lcg.sh --config=lcg.conf --date=previous --update
30 01 * * * dir=/home/gratia/interfaces/apel-lcg; cd $dir; ./lcg.sh --config=lcg.conf --date=current --update
- root user cron entry.
43 2 * * * names supplied when required
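Commenting out cron entries, as described in the steps above, can be done with a small filter instead of hand-editing. The sketch below reads crontab text on standard input and prefixes "#" to every active line containing a given fixed string. The function name comment_out_cron and the sample lines are hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: comment out every active crontab line that
# contains the fixed string given as $1. Lines that are already
# comments, and lines that do not match, pass through unchanged.
comment_out_cron() {
    awk -v pat="$1" '
        index($0, pat) && $0 !~ /^[ \t]*#/ { print "#" $0; next }
        { print }
    '
}

# Illustrative input only; these are not the real gratia06 entries.
printf '%s\n' \
    '# existing comment' \
    '0 0 1-15 * * cd /some/dir; ./lcg.sh --date=previous' \
    '15 3 * * * /some/other/job' |
    comment_out_cron 'lcg.sh'
```

In practice the input would come from "crontab -l", and the reviewed result could be installed with "crontab -" (which reads the new table from standard input). Saving the original output first makes it easy to restore the entries after the upgrade.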
Upgrade and implementation
The upgrades should be single-threaded, that is, performed for each database schema one at a time.
We will perform these upgrades based on the size of the individual database schema, in ascending order.
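The ascending-size ordering can be produced mechanically once the schema sizes are known. The sketch below sorts "name size" pairs numerically by size. The function name order_by_size, the schema names, and the sizes are all hypothetical; this document does not say how the sizes are measured.

```shell
#!/bin/sh
# Hypothetical helper: read "schema size" pairs on stdin and print the
# schema names in ascending order of size.
order_by_size() {
    sort -k2,2n | awk '{ print $1 }'
}

# Illustrative names and sizes only (units are arbitrary but must match).
printf '%s\n' \
    'schema_a 500' \
    'schema_b 20' \
    'schema_c 130' |
    order_by_size
```

Upgrading the smallest schema first means that any problem with the procedure surfaces on the instance that is quickest to back up and restore.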
Tomcat/collector nodes and instances:
- gratia08
- tomcat-fermi_transfer
- tomcat-fermi_osg
- tomcat-fermi_itb
- tomcat-qcd
- tomcat-ps
- gratia09
- tomcat-osg_transfer
- tomcat-osg_daily
- tomcat-osg_integration
- tomcat-itb
- tomcat-gratia
On the tomcat/collector nodes
To install the new software on a Gratia tomcat instance, run the following script:
- /home/gratia/gratia-releases/gratia-v0.38.3/build-scripts/gratia-upgrade.sh
This script does the following:
- Shuts down the tomcat instance.
- Runs the update-gratia-local script to update the software.
- Backs up the tomcat/gratia logs and places them in the /data directory as tarballs.
- Deletes all logs from the /data/tomcat-[GRATIA_INSTANCE]/logs directory so you have a fresh view when you restart the service.
- Optionally allows you to start the Gratia/tomcat instance.
When you have restarted the service, go to the /data/tomcat-[GRATIA_INSTANCE]/logs directory and tail the logs looking for any bizarre errors. Definition of bizarre will be supplied later.
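Until "bizarre" is defined, a first pass over the fresh logs can be sketched as a simple pattern search. The function name scan_logs and the patterns (ERROR, FATAL, Exception) are assumptions, not an agreed definition of what counts as a problem.

```shell
#!/bin/sh
# Hypothetical helper: print every line in the given log directory that
# matches a few common error markers, prefixed with file name and line
# number. The patterns are an assumption, not a project standard.
scan_logs() {
    log_dir=$1
    # /dev/null is added so grep always prints the file name, even when
    # the directory holds a single log file.
    grep -n -E 'ERROR|FATAL|Exception' "$log_dir"/* /dev/null 2>/dev/null
}

# Example (instance name is illustrative):
#   scan_logs /data/tomcat-fermi_itb/logs
```

Because the upgrade script deletes the old logs, anything this prints was written after the restart. It complements watching the logs with tail and does not replace it.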
Run the static reports cron as root and verify that the reports are generated. At the end of the gratia-upgrade.sh execution, the root cron entries for the static reports for that instance are displayed. Run them as root. Then, in the UI, view the reports to verify that they look correct.
Post-mortem
At this time, this will appear to be random notes. After the conversion, they may be organized.
Major updates