This document is in Draft Status

Post Upgrade Functionality Checks

This document lists checks that test the functionality of the BDII services after an upgrade.

Checks Immediately after service is brought up

  • Log in to or and become root.
    • Execute /opt/service-monitor/is/. If this test finds a problem, you will receive e-mail.
  • Check Web Page Display (This is not a vital part of the BDII Service but allows a manual check of the incoming data.)

Checks ~5 Minutes after the service is brought up

  • From Status Page at or
    • Check freshness of Raw Incoming Data; most resources should be < 5 minutes old.
    • Check freshness of Data Feeds to OSG and WLCG; these should also be < 5 minutes old.
  • Checks from Command Line of LDAP Server Functionality (Do not use Mac OS X 10.6.*; the test will fail even with a functional BDII.)
    • Run ldapsearch -h -p 2170 -x -b mds-vo-name=local,o=grid | wc -l
    • Run ldapsearch -h -p 2180 -x -b o=grid | wc -l
    • Repeat these with -h set to the other instance. Compare the two -p 2170 results to each other; they should be the same within 5%. Likewise for the two -p 2180 results.
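
The 5% comparison above can be scripted. The following is a minimal sketch, not part of the existing monitoring: the function name within_pct and the variables HOST_A and HOST_B are hypothetical, and "within 5%" is assumed to mean relative to the larger of the two line counts. Note that the OpenLDAP manual marks -h and -p as deprecated in favor of -H ldap://host:port, so newer clients may need that form.

```shell
#!/bin/sh
# within_pct A B PCT: succeed (exit 0) when the integers A and B differ by
# at most PCT percent of the larger value. Integer arithmetic only.
within_pct() {
    a=$1; b=$2; pct=$3
    if [ "$a" -ge "$b" ]; then hi=$a; lo=$b; else hi=$b; lo=$a; fi
    # (hi - lo) / hi <= pct / 100, rearranged to avoid division
    [ $(( (hi - lo) * 100 )) -le $(( hi * pct )) ]
}

# Usage (HOST_A and HOST_B are placeholders for the two BDII hosts):
#   n_a=$(ldapsearch -h "$HOST_A" -p 2170 -x -b mds-vo-name=local,o=grid | wc -l)
#   n_b=$(ldapsearch -h "$HOST_B" -p 2170 -x -b mds-vo-name=local,o=grid | wc -l)
#   within_pct "$n_a" "$n_b" 5 || echo "port 2170 line counts differ by more than 5%"
```

The same function applies unchanged to the -p 2180 results.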

Ongoing Monitoring Checks

  • All these are optional and may be obsolete. Maintenance may continue to the other instance at this point.
  • Several scripted checks run on these services including:
    • Check of BDII Freshness at CERN Top Level BDII and SAM BDII, available at . Only WLCG resources are listed there, as they are the only resources publishing to the CERN BDIIs. These tests run each hour on the 30 minute mark, so results may lag up to an hour behind after the upgrade has completed.
  • Email alerts are sent to the GOC-ALERTS mailing list for the following conditions:
    • More than 10% of resources are not updating BDII information (either WLCG or OSG)
    • FNAL or BNL is not available from the CERN Top Level BDIIs
    • RSV Probes check timestamps of information in the BDII; failures are reported via mail.
  • The BDII Service also reports many system level metrics via Munin. Check these for anomalies after an upgrade.
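
As an illustration of the 10% alert condition, the sketch below computes the share of stale resources from a list of ages. It is not the GOC alert script, which this document does not show. The function name stale_percent, the input format (one "resource age_seconds" pair per line), and the choice of 300 seconds as the staleness cutoff (borrowed from the 5 minute freshness target above) are all assumptions.

```shell
#!/bin/sh
# stale_percent MAX_AGE: read "resource age_seconds" lines on stdin and print
# the integer percentage of resources whose age exceeds MAX_AGE seconds.
# Lines with fewer than two fields are ignored; empty input prints 0.
stale_percent() {
    awk -v max="$1" '
        NF >= 2 { total++; if ($2 + 0 > max + 0) stale++ }
        END { if (total == 0) print 0; else print int(stale * 100 / total) }
    '
}

# Usage (ages.txt is a hypothetical file of "resource age_seconds" lines):
#   pct=$(stale_percent 300 < ages.txt)
#   [ "$pct" -gt 10 ] && echo "more than 10% of resources are not updating"
```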

LDAP Errors

LDAP Server Not Running

This error occurs if the LDAP Server is not responding.

ldap_bind: Can't contact LDAP server (-1)

Data not found

This type of result occurs when no data matches your query. First check the ldapsearch syntax; if it is correct, the data is missing.

# extended LDIF
# LDAPv3
# base  with scope subtree
# filter: (objectclass=*)
# requesting: ALL

# local, grid
dn: Mds-Vo-name=local,o=grid
objectClass: GlueTop
objectClass: Mds
Mds-Vo-name: local

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1
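
The two cases above can be told apart mechanically. The sketch below is a hypothetical helper (classify_ldap_output is not an existing tool) that reads ldapsearch output and labels it. It assumes that a result with at most one entry, as in the sample above where only the base entry is returned, means data is missing.

```shell
#!/bin/sh
# classify_ldap_output: read ldapsearch output on stdin and print one of
#   unreachable   the "Can't contact LDAP server" message is present
#   no-data       at most one entry was returned (assumed to mean missing data)
#   ok            more than one entry was returned
classify_ldap_output() {
    out=$(cat)
    case $out in
        *"Can't contact LDAP server"*) echo unreachable; return ;;
    esac
    # Pull the integer from the "# numEntries: N" summary line, if present.
    n=$(printf '%s\n' "$out" | sed -n 's/^# numEntries: \([0-9][0-9]*\).*/\1/p')
    if [ "${n:-0}" -le 1 ]; then echo no-data; else echo ok; fi
}

# Usage (HOST is a placeholder; 2>&1 is needed because ldapsearch writes the
# bind error to standard error):
#   ldapsearch -h "$HOST" -p 2170 -x -b mds-vo-name=local,o=grid 2>&1 | classify_ldap_output
```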
-- RobQ - 11 Aug 2010

-- RobQ - 24 May 2011

Topic revision: r6 - 04 May 2016 - 14:47:43 - ScottTeige