Field of Science and Campus Accounting


Over the last few years there have been multiple requests to better understand the Fields of Science that are supported on OSG. These requests come from the sites that provide the resources (especially Fermilab), from OSG management, and from our agency sponsors. Thanks to the recent work to implement OSG as a Service Provider in XD, we now believe we have the building blocks to provide much improved accounting by Field of Science and by Campus.

In OSG Gratia there are two domains of reporting:

  1. "Batch" records which are reported from the CE
  2. "Batch_Pilot" records sent from front-ends and submit hosts.

The vision below does not plan any changes to the "Batch" domain; it relies on extensions to how we treat "Batch_Pilot" records:

  1. Campus submission points will run the Batch_Pilot probe and set VOOverride (e.g. VOOverride="UCSD"). This tags all the records as coming from that submit host and allows Gratia queries to zoom in on the records of interest for that Campus (or community).
  2. For multi-science VOs and/or submit hosts, we ask them to include ProjectName in their jobs (e.g. +ProjectName = "string"). This builds upon the work we did for OSG-XSEDE. Right now ProjectName strings are kept in a hand-managed CSV file, but we envision a simple ProjectName registration tool at GOC where you enter information about yourself, your work, and your field of science, and get back a unique string as your "ProjectName". A ProjectName corresponds to a PI and could be shared by multiple users working on the same project.

With these two constructs, sites will be able to see what science is being supported on their resources. Campus researchers can query by their VOOverride name or by ProjectName, or combine the two with a logical AND to see only Projects from their Campus.
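
To make the combined query concrete, here is a minimal sketch of the selection logic in Python. It does not use the Gratia schema or API: the `UsageRecord` type, its field names, and the site names are hypothetical stand-ins for whatever the Batch_Pilot records actually carry.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical, simplified view of one Batch_Pilot accounting record."""
    vo_name: str        # value set through VOOverride on the submit host
    project_name: str   # value of +ProjectName in the job, "" if absent
    site: str           # site where the job ran (made-up names below)
    wall_hours: float


def select_records(records: Iterable[UsageRecord],
                   vo_name: Optional[str] = None,
                   project_name: Optional[str] = None) -> List[UsageRecord]:
    """Return the records that match every criterion given (logical AND)."""
    selected = []
    for rec in records:
        if vo_name is not None and rec.vo_name != vo_name:
            continue
        if project_name is not None and rec.project_name != project_name:
            continue
        selected.append(rec)
    return selected


records = [
    UsageRecord("UCSD", "OSG-CSC00100", "SiteA", 12.0),
    UsageRecord("UCSD", "", "SiteB", 3.5),
    UsageRecord("HCC", "OSG-CSC00100", "SiteA", 7.0),
]

# Everything submitted from the UCSD campus submit host.
campus_view = select_records(records, vo_name="UCSD")

# Only the named Project, and only as submitted from UCSD.
campus_project_view = select_records(
    records, vo_name="UCSD", project_name="OSG-CSC00100")
```

Passing only `project_name` gives the cross-campus view of a Project; passing both arguments gives the Campus-restricted view described above.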

There are many details to be worked out; the one that concerns us most is getting all multi-science submit hosts to use ProjectName in their jobs. The incentive is better accounting, but will that be enough? Including one line in a job is low overhead, but since the OSG culture mandates very few things, the adoption rate is hard to predict. Do we need a new policy on this from the OSG Consortium?

Development Tasks

ProjectName Registration Database

  • ProjectName database fields
  • Field of Science List used by XSEDE
  • Supported fields at Indiana University
  • Only allow those names that appear in OIM Campus Grids entries or OIM registered VOs to add/edit their own records in the database (for records where they are named as Sponsor). For example, HCC registered names are only allowed to add/edit records where HCC is the named Sponsor.
  • Admins from the Sponsor campus should access the service with an admin certificate registered in the database.
  • Auto-generate a unique project name from the form input, using the template OSG-<3 chars of FOS><5-digit number> (e.g. OSG-CSC00100). A user can instead select a particular project name, which will be checked for uniqueness. If the user leaves the field blank, the name will be auto-generated at submission time.

  • Key commands that are needed include
    • List Projects (and associated info) for my campus (or all)
    • List info associated with a Project (PI, campus, institution, email, scientific publication)
    • Add Project (and associated data)
    • Add multiple projects (typically from a .csv file)
    • Modify PI info for a project (email, institution?)
    • Add/modify science publication associated with a Project
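
The naming rule and a few of the commands listed above could be prototyped roughly as follows. This is an in-memory sketch only, not the registration service itself: the `Project` fields, the `ProjectRegistry` class, and the sequential numbering that starts at 1 are all assumptions made for illustration.

```python
import itertools
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Project:
    name: str
    pi: str
    email: str
    sponsor: str            # campus or VO named as Sponsor
    field_of_science: str   # three-letter FOS code, e.g. "CSC"


class ProjectRegistry:
    """Hypothetical in-memory stand-in for the ProjectName database."""

    def __init__(self) -> None:
        self._projects: Dict[str, Project] = {}
        # Assumption: numbers are handed out sequentially, starting at 1.
        self._counter = itertools.count(1)

    def _generate_name(self, fos_code: str) -> str:
        # Template from the text: OSG-<3 chars of FOS><5-digit number>.
        while True:
            candidate = "OSG-{}{:05d}".format(fos_code, next(self._counter))
            if candidate not in self._projects:
                return candidate

    def add_project(self, pi: str, email: str, sponsor: str,
                    fos_code: str, name: Optional[str] = None) -> Project:
        if not re.fullmatch(r"[A-Za-z]{3}", fos_code):
            raise ValueError("FOS code must be three letters: %r" % fos_code)
        fos_code = fos_code.upper()
        if not name:
            # Field left blank: generate the name at submission time.
            name = self._generate_name(fos_code)
        elif name in self._projects:
            raise ValueError("project name already registered: %s" % name)
        project = Project(name, pi, email, sponsor, fos_code)
        self._projects[name] = project
        return project

    def list_projects(self, sponsor: Optional[str] = None) -> List[Project]:
        """List projects for one Sponsor, or all projects if none is given."""
        return [p for p in self._projects.values()
                if sponsor is None or p.sponsor == sponsor]
```

For example, `ProjectRegistry().add_project("A. PI", "pi@example.org", "HCC", "csc")` would return a project named `OSG-CSC00001` under these assumptions. Access control by certificate and Sponsor is deliberately left out of the sketch.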

Soichi's todolist (2013/07/25)

  1. Add validation to prevent a user from choosing a VO/CG to which the user does not have write access
  2. Bulk load of project names from TeraGrid? Auto-load the FOS and run a cleanup.
  3. Think about a way to allow a PI to edit publications using a token URL
  4. Publish a flat table of all projects with their field of science
  5. Allow downloading the list of all projects and related information in CSV format

Wrapper for enforcing ProjectName
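
One possible shape for the check at the heart of such a wrapper is sketched below in Python. It only inspects the text of a submit description for a `+ProjectName = "string"` line; the function names and the optional `known_projects` set are hypothetical, and a real wrapper would still have to hand the file on to the actual submit command.

```python
import re
from typing import Iterable, Optional

# Matches a line of the form  +ProjectName = "string"  (leading spaces allowed).
# Commented-out lines (starting with '#') do not match.
PROJECT_LINE = re.compile(
    r'^\s*\+ProjectName\s*=\s*"([^"]+)"\s*$',
    re.IGNORECASE | re.MULTILINE,
)


def find_project_name(submit_text: str) -> Optional[str]:
    """Return the declared ProjectName, or None if the job has none."""
    match = PROJECT_LINE.search(submit_text)
    return match.group(1) if match else None


def check_submit_text(submit_text: str,
                      known_projects: Optional[Iterable[str]] = None) -> str:
    """Raise ValueError unless the job declares an acceptable ProjectName."""
    name = find_project_name(submit_text)
    if name is None:
        raise ValueError('job is missing a +ProjectName = "..." declaration')
    if known_projects is not None and name not in set(known_projects):
        raise ValueError("ProjectName is not registered: %s" % name)
    return name


example_job = (
    "executable = run.sh\n"
    '+ProjectName = "OSG-CSC00100"\n'
    "queue\n"
)
declared = check_submit_text(example_job, known_projects=["OSG-CSC00100"])
```

Rejecting the job outright is only one policy; a site could equally choose to warn the user or to fill in a default ProjectName for that submit host.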

Gratia Queries and Graphs

  • Develop additional queries including
    • Bar Graph showing ProjectName usage over sites (for a user specified time period)
    • Pie Chart showing ProjectName usage at a site (for a user specified time period)
    • Pie Chart showing various Science Types usage at a site (for a user specified time period)
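
The data behind these graphs amounts to two aggregations, sketched here in Python over plain tuples rather than the Gratia database. The `(project_name, site, wall_hours)` record layout, the `project_to_fos` mapping, and the site and science names are assumptions for illustration; restricting to a user-specified time period is omitted.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (project_name, site, wall_hours)
Record = Tuple[str, str, float]


def hours_by_project_and_site(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Bar-graph data: wall hours for each ProjectName, broken down by site."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for project, site, hours in records:
        totals[project][site] += hours
    return {project: dict(sites) for project, sites in totals.items()}


def science_share_at_site(records: Iterable[Record],
                          project_to_fos: Dict[str, str],
                          site: str) -> Dict[str, float]:
    """Pie-chart data: fraction of wall hours per Field of Science at one site."""
    hours: Dict[str, float] = defaultdict(float)
    for project, rec_site, wall in records:
        if rec_site != site:
            continue
        # Jobs without a registered ProjectName cannot be attributed.
        hours[project_to_fos.get(project, "Unknown")] += wall
    total = sum(hours.values())
    if total == 0:
        return {}
    return {fos: h / total for fos, h in hours.items()}


usage = [
    ("OSG-CSC00100", "SiteA", 4.0),
    ("OSG-CSC00100", "SiteB", 1.0),
    ("OSG-PHY00002", "SiteA", 2.0),
    ("OSG-CSC00100", "SiteA", 2.0),
]
fos_of = {"OSG-CSC00100": "Computer Science", "OSG-PHY00002": "Physics"}

per_project = hours_by_project_and_site(usage)
site_a_share = science_share_at_site(usage, fos_of, "SiteA")
```

The "Unknown" bucket makes visible how much usage at a site still lacks a ProjectName, which is the adoption question raised earlier.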

Deployment and Use


GOC

  • Operate the ProjectName Registration Database service

Submit Host Operators

  • Only applies to those that support users that span multiple Fields of Science

  • When they enable a new user on their submit host, they register that user in the ProjectName database
  • Advise the PI/user of the ProjectName they have been assigned
  • Implement the wrapper at their site to ensure that all jobs submitted there use ProjectName


Users

  • Must include the +ProjectName = "string" declaration in their jobs


Sites

  • No action needed to enable deployment of this capability
  • Sites can use the appropriate queries to see which Projects and which science types are running at their sites, along with the hours used
  • Sites can choose to ban multi-science VOs that do not use the ProjectName construct to declare their science

Project Management

  • Every quarter, send an email to PIs (from the ProjectName database) who had significant usage, asking them to submit their science publications into the database



* Presentation for the first FOS meeting - 2013/07/10

-- TanyaLevshina - 10 Jul 2013

