BDII Service Level Agreement

Version Control

| Version Number | Date | Author | Comments |
| 1.1 | 2-23-09 | Rob Quick | Move to TWiki and Updates |
| 1.2 | 3-10-09 | Rob Quick | Incorporated comments from Chander and Jim |
| 1.3 | 5-12-09 | Rob Quick | Updates based on CMS and GOC feedback |

Executive Summary

This SLA is an agreement between OSG Operations and the OSG Stakeholders describing details of the OSG BDII Information Service. The BDII service runs separately and independently on hardware at Indiana University and at Indiana University-Purdue University Indianapolis. It consists of the CEMon Server data collection software, the Berkeley Database Information Index (BDII), and GOC-provided software that serves as the bridge between CEMon and BDII. A DNS round-robin system is used for access, redundancy, and load balancing.

Owners

The BDII SLA is owned by OSG Operations and will be reviewed and agreed upon by the OSG Executive Team.

Service Name and Description

Name

GOC Production BDII

Description

The GOC Production BDII service consumes raw GLUE information from CEMon clients located at the OSG Resource level and serves it to OSG users in an LDAP-friendly format. The BDII service consists of two independent BDII software instances made available through DNS round-robin.
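
As a rough illustration of how a client might see the DNS round-robin arrangement described above, the following Python sketch lists the distinct addresses behind a single hostname. The hostname `bdii.example.org`, the default port 2170, and the function name `resolve_instances` are assumptions made for illustration; none of them is specified in this SLA.

```python
import socket


def resolve_instances(hostname, port=2170, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses that a round-robin hostname resolves to.

    `resolver` defaults to socket.getaddrinfo and can be replaced in tests.
    The port default is an assumption, not a value taken from this SLA.
    """
    records = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in records:
        ip = sockaddr[0]
        if ip not in addresses:  # keep the first occurrence, preserve order
            addresses.append(ip)
    return addresses
```

Calling `resolve_instances("bdii.example.org")` with a real round-robin name would be expected to return one address per BDII instance; a client could then try each address in turn if one instance is unreachable.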

Security Considerations

The BDII does not gather or distribute any information deemed private. It will be subject to Indiana University institutional policy. Access to any of the hardware or software will be restricted to OSG Operations staff. Other OSG staff may, upon request, be given access to a development version of the BDII for experimenting with new software updates, changes in database/GLUE schema, or stress testing.

Service Target Response Priorities and Response Times

| Priority | Critical | High | Elevated | Normal |
| Work Outage | The issue causes a full service or customer outage or a compromise to the GLUE data, software, or hardware on both BDII instances | The issue causes a full service outage or a compromise to the GLUE data, software, or hardware on a single BDII instance | The issue causes short (less than 5 minute) periods of unstable or inconsistent performance | The issue causes minor (less than 10 seconds) periods of unstable or inconsistent performance |
| Number of Clients Affected | The issue affects all BDII users or OSG resources | The issue affects all BDII users or OSG resources | The issue may or may not affect all BDII users or OSG resources | The issue affects only a small number of BDII users or OSG resources |
| Workaround | All GOC BDIIs are unavailable | BDII failover mechanisms are used to direct usage away from the source of the outage | BDII failover mechanisms are used to direct usage away from the source of the outage | No workaround will be necessary |
| Response Time | Within one (1) hour | Workaround is addressed within one (1) hour. Issue will be addressed by the next business day | Within the next business day | Within the next business day |
| Resolution Time | The maximum acceptable resolution time is four (4) continuous hours, after initial response time | The maximum acceptable resolution time is 24 continuous hours, after the initial response time | The maximum acceptable resolution time is five (7) business days | The maximum acceptable resolution time is (30) business days |
| Escalates Every | One Hour | Two Hours | One Day | One Week |
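
The Critical and High limits above are stated in continuous hours, so their deadlines can be computed directly. The sketch below is one possible reading, not an official GOC tool: the names `LIMITS` and `deadlines` are hypothetical, and it assumes the resolution clock starts at the actual response time when that is known, otherwise at the latest permitted response time. Elevated and Normal are omitted because their limits are expressed in business days.

```python
from datetime import datetime, timedelta

# Limits in continuous hours, read from the Critical and High columns above.
LIMITS = {
    "critical": {"response": timedelta(hours=1), "resolution": timedelta(hours=4)},
    "high": {"response": timedelta(hours=1), "resolution": timedelta(hours=24)},
}


def deadlines(priority, reported_at, responded_at=None):
    """Return (respond_by, resolve_by) for a Critical or High issue."""
    limits = LIMITS[priority]
    respond_by = reported_at + limits["response"]
    # Assumption: resolution time is counted from the actual response if it
    # is known, otherwise from the latest allowed response time.
    start = responded_at if responded_at is not None else respond_by
    return respond_by, start + limits["resolution"]
```

For example, a Critical issue reported at 08:00 with no recorded response would, under these assumptions, need a response by 09:00 and a resolution by 13:00 the same day.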

Escalation Contacts

| Escalation Level | OSG Contact | VO Contact |
| 1st | OSG Support Lead | ? |
| 2nd | OSG Operations Coordinator | ? |
| 3rd | OSG Production Coordinator | ? |
| 4th | OSG Facility Manager and Executive Director | ? |

Detailed contact information will be kept within the OSG Information Management database.

Any ongoing "Normal" or "High" level issues will be discussed at the weekly Operations and Production meetings.

Service Availability and Outages

The GOC will strive for 99% service availability. If monthly service availability, as monitored by the GOC, falls below 99% in two consecutive months, a service plan describing how an acceptable level of service availability will be restored will be submitted to the OSG stakeholders.
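
The two-consecutive-month rule can be checked mechanically. The following is a minimal sketch, assuming availability is supplied as one percentage per month in chronological order; the function name `needs_service_plan` and its parameters are hypothetical.

```python
def needs_service_plan(monthly_availability, target=99.0, run_length=2):
    """Return True if availability was below `target` for `run_length`
    consecutive months anywhere in the chronological input sequence."""
    consecutive = 0
    for value in monthly_availability:
        # Extend the current run of sub-target months, or reset it.
        consecutive = consecutive + 1 if value < target else 0
        if consecutive >= run_length:
            return True
    return False
```

A single month below 99% followed by a month at or above 99% resets the count, so only back-to-back shortfalls trigger a service plan in this reading.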

A maximum of two non-scheduled outages will be accepted by OSG during each six-month period of service. If the GOC experiences more than the allotted number of outages, a service plan will be submitted to the OSG stakeholders with plans to restore the service to an acceptable level of operations.
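
The outage limit can be tracked in a similar way. The SLA does not say how the six-month periods are delimited, so this sketch assumes calendar half-years (January to June and July to December); that choice, and the name `periods_over_limit`, are illustrative assumptions only.

```python
from collections import Counter
from datetime import date


def periods_over_limit(outage_dates, limit=2):
    """Return the (year, half) periods with more than `limit` unscheduled outages.

    Assumption: six-month periods are calendar half-years, where half 1 is
    January-June and half 2 is July-December.
    """
    counts = Counter((d.year, 1 if d.month <= 6 else 2) for d in outage_dates)
    return sorted(period for period, n in counts.items() if n > limit)
```

Three outages dated in the first half of 2010 would, for instance, be reported as `[(2010, 1)]`.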

Service Support Hours

The BDII service is supported 24x7 by the GOC and Indiana University. Critical and High level issues will receive a response within one (1) hour. All other issues will be investigated by the next business day.

Service Off-Hours Support Procedures

All BDII issues should be reported to the GOC immediately via email, phone, or by trouble ticket web submission. If the problem is deemed critical or high, a GOC staff member will be alerted immediately. If the problem is deemed elevated or normal it will be addressed by the next business day.

Requests for Service Enhancements

OSG Operations will respond to customer requests for service enhancements based on OSG Management's determination of the necessity and desirability of the enhancement. No enhancements will be brought to the production OSG BDII without a minimum of one month of testing. The GOC reserves the right to enhance the physical environment of the service based on IU and GOC needs. No enhancement will occur without advance notice to the OSG community.

Customer Problem Reporting

The GOC provides operators 24x7x365. BDII problems should be reported immediately by email, phone, or trouble ticket web submission.

Responsibilities

Customer Responsibilities

BDII Customers agree to:

  • Use the BDII to gather information about OSG resources for purposes of VO approved work only.
  • Alert the GOC if they are going to use the BDII in a non-standard way; this includes testing or anticipated large increases in usage.
  • Contact the GOC by means outlined in the Customer Problem Reporting section of this document if they encounter any service issues.
  • Be willing and available to provide critical information within one hour of reporting a critical incident, or within one business day for any other criticality.

OSG Operations Responsibilities

General responsibilities:
  • Create and add appropriate documentation to the OSG TWiki for appropriate use of the BDII.
  • Meet response times associated with the priority assigned to Customer issues.
  • Maintain appropriately trained staff.

GOC service desk responsibilities:

  • Log and track all Customer requests for service through the OSG ticketing system.

Database & Application Services responsibilities:

  • Schedule maintenance (downtime) outside of normal business hours (Eastern Time) unless circumstances warrant performing maintenance at another time.
  • Announce and negotiate maintenance with stakeholders to assure minimal interruption to production workload.
  • Alert the community of scheduled maintenance periods at least five business days prior to the start of a service affecting maintenance window.
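
The five-business-day notice requirement can be turned into a latest announcement date. This is a minimal sketch, assuming business days are Monday through Friday and ignoring holidays; the function name `latest_announcement_date` is hypothetical.

```python
from datetime import date, timedelta


def latest_announcement_date(maintenance_start, notice_days=5):
    """Step back `notice_days` business days (Mon-Fri) from the maintenance date.

    Assumption: holidays are not taken into account.
    """
    day = maintenance_start
    remaining = notice_days
    while remaining > 0:
        day -= timedelta(days=1)
        if day.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return day
```

For a maintenance window starting on Tuesday, 26 May 2009, this gives Tuesday, 19 May 2009 as the last day on which the announcement could be made.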

Service Measuring and Reporting

The GOC will provide the customer with the following reports in the intervals indicated (monthly, quarterly, semi-annually, or annually):

| Report Name | Reporting Interval | Delivery Method | Responsible Party |
| Report of Critical and High Priority Issues | Quarterly | Web Posting | GOC |
| System Uptime | Monthly | Web Posting | GOC |
| Service Uptime | Monthly | Web Posting | GOC |

SLA Validity Period

This SLA will be in effect for one year.

SLA Review Procedure

This SLA will renew automatically on a yearly basis unless a change or update is requested by the OSG Operations Coordinator, a member of the OSG Executive Team and the Stakeholders.

References

Appendix A - Customer Information

All BDII end-users and VO representatives are considered customers.

Appendix B - Supported Hardware and Software

Supported Hardware

The following hardware is supported:
  • Physical devices used to provide the BDII service.
  • Physical devices used to provide the environment used to house the BDII service.

Hardware Services

The following hardware services are provided:
  • Recommendations. OSG Operations will be responsible for specifying and recommending for purchase or lease hardware meeting customers' needs.
  • Installation. OSG Operations will install, configure and customize system hardware and operating systems.
  • Upgrades. OSG Operations is responsible for specifying and recommending for purchase any hardware upgrades.
  • Diagnosis. OSG Operations will diagnose problems with service related hardware.
  • Repair. OSG Operations analysts are not hardware technicians and receive no training in hardware maintenance, nor do they have the test equipment and tools necessary to do such work.

Performing repairs under warranty: Any work to be performed under warranty may be referred to the warranty service provider at the discretion of the Service Provider analyst(s). Service Provider analysts will not undertake work that will void warranties on customer hardware unless specifically requested and authorized by customer's management in writing.

Obtaining repair services: The Service Provider analyst will recommend a service vendor whenever he/she feels the repair work requires specialized skills or tools.

  • Backup. Service Provider agrees to fully back up all Service Provider-supported software and data nightly every business day.

Software Services

Service Provider agrees to cover software support services, including software installations and upgrades. All upgrades will be announced via the policy put forth in the "OSG Operations Responsibilities" section of this document. Where possible, all updates will be applied to a single instance of the BDII while the other operates normally, using the DNS mechanisms described above to reduce service downtime.

Software Costs

OSG bears all costs for new and replacement software.

Appendix C - Approval

| Approved By | Position | Date |
| Rob Quick | Operations Coordinator | 5-26-2009 |
| | Facility Coordinator | |
| Ruth Pordes | OSG Executive Director | 5-27-2009 |
| Frank Wuerthwein | USCMS | 5-11-2009 |
| Ian Fisk | USCMS | 7-26-2009 |

Appendix D - Published Availability and Reliability

| Month | Year | Availability | Reliability |
| October | 2009 | 99.42% | 99.96% |
| November | 2009 | 99.89% | 99.99% |
| December | 2009 | 100.00% | 100.00% |
| January | 2010 | 100.00% | 100.00% |
| February | 2010 | 100.00% | 100.00% |
| March | 2010 | 100.00% | 100.00% |
| April | 2010 | 100.00% | 100.00% |
| May | 2010 | 99.8% | 100.0% |
| June | 2010 | 100.00% | 100.00% |
| July | 2010 | 99.91% | 99.91% |
| August | 2010 | 99.72% | 99.72% |
| September | 2010 | 99.91% | 99.95% |
| October | 2010 | 100.00% | 100.00% |
| November* | 2010 | 100.00% | 100.00% |
| December | 2010 | 100.00% | 100.00% |
| January | 2011 | 100.00% | 100.00% |
| February | 2011 | 100.00% | 100.00% |
| March | 2011 | 100.00% | 100.00% |
| April | 2011 | 100.00% | 100.00% |
Recent availability statistics

* November 2010 - Indianapolis BDII experienced network issues due to a degrading fiber connection. Bloomington BDII remained fully functional.
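
To give a sense of scale for the percentages above, an availability figure can be converted into approximate downtime. This sketch assumes availability is measured over every minute of the calendar month, which this SLA does not state; the function name `downtime_minutes` is hypothetical.

```python
import calendar


def downtime_minutes(year, month, availability_percent):
    """Convert a monthly availability percentage into minutes of downtime.

    Assumption: availability is measured over all minutes of the calendar month.
    """
    days = calendar.monthrange(year, month)[1]  # number of days in the month
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - availability_percent) / 100.0
```

Under that assumption, the 99.42% reported for October 2009 corresponds to roughly 259 minutes of unavailability, and the 99% target allows about 446 minutes in a 31-day month.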

-- RobQ - 23 Feb 2009

Topic attachments
| Attachment | Size | Date | Who | Comment |
| 2011.xlsx | 65.9 K | 23 Apr 2012 - 13:44 | ScottTeige | 2011 summary |
| 2012.xlsx | 61.2 K | 06 Jul 2012 - 18:32 | ScottTeige | 2012 Q1,Q2 summary |
Topic revision: r29 - 08 Jan 2014 - 15:08:12 - ScottTeige