OSG Accounting Activity - Report Requirements

Introduction

One of the main Gratia requirements is to serve the accounting needs of the OSG community and the Fermilab Computing Division community. The differences between the accounting reporting requirements of these two main user groups are stated separately. (The following paragraphs are from the existing requirements documents.)

Workflow

Stakeholders will be able to generate accounting reports using the Grid Accounting system. The Grid Accounting system will also offer an interface through which stakeholders will be able to export accounting data for custom manipulation and presentation. The project will work closely with the Stakeholders to develop and implement these reports. The project will be responsible for providing the interfaces needed and initially for posting and archiving these accounting reports to a web accessible repository. Stakeholders will be able to use this web repository to review and transmit the reports as needed.

Software Interfaces (API)

The accounting system will support the following interfaces, accessible over the network (LAN and WAN):

  • An interface for reporting service usage data (push model). This API will be used by the services to periodically report usage data.
  • An interface to read accounting records. This API will allow the development of applications for displaying and analyzing accounting records. The API will support querying (by date, user, VO, site, etc.) of accounting records and of the accounting events used to generate the records. This API will also support exporting accounting records to different data and file formats (Excel, ROOT, etc.).
  • An interface to generate reports and to perform filtering and statistical analysis of the accounting records. The accounting system will offer a basic set of reports and analytical tools. This API will allow the selection of a subset of accounting records, the selection of a function to be applied to the selected records, and the selection of the format used to save the results.
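
The third interface (select a subset of records, apply a function, choose an output format) can be sketched as follows. This is only an illustration under stated assumptions, not Gratia's actual API: the record fields (`site`, `vo`, `wall_hours`), the `run_report` function, and its parameters are all hypothetical.

```python
import csv
import io
import json

# Hypothetical accounting records; the field names are assumptions.
RECORDS = [
    {"site": "SiteA", "vo": "cms", "wall_hours": 10.0},
    {"site": "SiteA", "vo": "atlas", "wall_hours": 4.0},
    {"site": "SiteB", "vo": "cms", "wall_hours": 6.0},
]


def run_report(records, select, function, fmt):
    """Select a subset of records, apply a function to it, serialize the result."""
    subset = [r for r in records if select(r)]
    value = function(subset)
    if fmt == "json":
        return json.dumps({"result": value})
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["result"])
        writer.writerow([value])
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


def total_wall_hours(records):
    """Example analysis function: sum of wall-clock hours."""
    return sum(r["wall_hours"] for r in records)


# Total wall-clock hours for one VO, saved as JSON.
cms_total_json = run_report(
    RECORDS, lambda r: r["vo"] == "cms", total_wall_hours, "json"
)
```

Passing the selection and the analysis function as separate arguments keeps the three choices named in the requirement (subset, function, format) independent of each other.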

Reporting Requirements

General Accounting Requirements Pertaining to Reporting

These requirements are common to both OSG and Fermilab users (as defined in the Gratia requirements documentation).

Accounting Resource Utilization

Req-1.0 The accounting system must report resource utilization per grid user.
Explanation Reporting resource usage per user is the only way the accounting system can provide detailed accounting information that can be used to support the missions of all the system users.

Accounting Information Publishing

Req-2.0 The accounting system must provide a mechanism through which each site can publish its accounting records.
Explanation Resource providers own the accounting information. They must retain control over who accesses this information, where, and how.

Req-3.0 The accounting system must provide a mechanism through which each resource provider can decide which accounting records to publish and which to keep private as well as who has access to what.
Explanation Resource providers own the accounting information. They must retain control over who accesses this information, where, and how.

Accounting Records Tagging

Req-4.0 Accounting records must be “taggable”.
Explanation The accounting system must provide a mechanism that resource provider managers can use to tag accounting records. This will simplify repetitive queries and auditing activities.
In addition, tags could be used to determine the access control list for the accounting records.
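
As a rough illustration of how tags simplify repetitive queries, the sketch below keeps an in-memory index from tags to record IDs. The `TagIndex` class, its methods, and the sample tags are hypothetical; the requirement does not prescribe a storage model.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory store mapping each tag to a set of record IDs."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def tag(self, record_id, *tags):
        """Attach one or more tags to a record."""
        for t in tags:
            self._by_tag[t].add(record_id)

    def find(self, *tags):
        """Return the IDs of records that carry every one of the given tags."""
        if not tags:
            return set()
        # .get avoids creating empty entries for unknown tags.
        sets = [self._by_tag.get(t, set()) for t in tags]
        return set.intersection(*sets)


index = TagIndex()
index.tag(101, "audit-2007", "uscms-t2")
index.tag(102, "audit-2007")

audit_records = index.find("audit-2007")             # records 101 and 102
t2_audit_records = index.find("audit-2007", "uscms-t2")  # record 101 only
```

The same lookup could back an access check, for example by granting a reader access only to records returned for the tags that reader is allowed to see.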

Auditing

Req-5.0 The accounting system must provide a mechanism to link an accounting record with the usage data (records of actions and events related to the user’s resource usage) such that from a tabular view of the accounting record it is possible to see its detailed usage data.
Explanation Because the accounting information will be used by many users to evaluate contracts and resource allocation models, it is paramount that the users have confidence in the accounting information produced.
The accounting system will provide a mechanism to keep all the accounting information (usage data, authentication information, etc.) that is used to create the final accounting records.
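
One way to picture the link required by Req-5.0 is for each accounting record to carry the identifiers of the usage events it was built from, so that a viewer can drill down from a table row to the details. The data classes and field names below are assumptions made for this sketch only.

```python
from dataclasses import dataclass, field


@dataclass
class UsageEvent:
    event_id: str
    description: str


@dataclass
class AccountingRecord:
    record_id: str
    user: str
    wall_seconds: int
    # IDs of the usage events this record was derived from (hypothetical field).
    event_ids: list = field(default_factory=list)


def usage_details(record, events_by_id):
    """Return the usage events behind a record, in the order they were linked."""
    return [events_by_id[eid] for eid in record.event_ids]


events = {
    "e1": UsageEvent("e1", "job submitted"),
    "e2": UsageEvent("e2", "job finished"),
    "e3": UsageEvent("e3", "unrelated event"),
}
record = AccountingRecord("r1", "some-user", 3600, ["e1", "e2"])
details = usage_details(record, events)
```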

Accounting Information Viewing

Req-6.0 The accounting system must provide a web interface that users can use to query the accounting data storage.
Explanation None.

Accounting Records Importing and Exporting

Req-7.0 The accounting system must provide an export function that allows exporting a selected number of accounting records in Windows Comma Separated format (CSV format for importing in Excel), XML format and ROOT format.
Explanation This feature will allow users to use various tools for processing and viewing accounting records.
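
A minimal sketch of the CSV and XML exports, using only the Python standard library. The column names and the XML element names are assumptions, and reading "Windows Comma Separated format" as CRLF line endings is a guess. ROOT export is left out because it needs a third-party library.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ["site", "vo", "wall_hours"]  # assumed column names


def to_csv(records):
    """Serialize records as CSV with a header row and CRLF line endings."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\r\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_xml(records):
    """Serialize records as <records><record>...</record></records>."""
    root = ET.Element("records")
    for r in records:
        rec = ET.SubElement(root, "record")
        for name in FIELDS:
            ET.SubElement(rec, name).text = str(r[name])
    return ET.tostring(root, encoding="unicode")


selected = [{"site": "SiteA", "vo": "cms", "wall_hours": 10.0}]
csv_text = to_csv(selected)
xml_text = to_xml(selected)
```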

Non-Functional Requirements

Security

Req-8.0 The accounting system must protect accounting information from unauthorized access (read and write).
Explanation Accounting information contains detailed information about users and systems. Typically, sites want to keep this information private to prevent cyber security incidents and to respect users' privacy rights.

Req-8.1 The accounting system communication channels must be secure (they will support confidentiality and data integrity).
Explanation See explanation of Req-8.0.
In addition, international agreements require that users' accounting data be encrypted when crossing national borders.

OSG Specific Report Requirements

Req-9.0 Produce summary reports like those of MonALISA, Panda, or OU.

New User Requirements

[Section to be filled in directly by the users to list their requirements. Those will later be folded into the main body of the document. A consolidated list of these requirements/requests is available at https://twiki.grid.iu.edu/twiki/pub/Accounting/ReportRequirements/Gratia_Reporting_Requests-1.xls ]

From Ken Bloom

I would like to be able to do filtering so that I can restrict the view to a smaller number of sites. For instance, I'm typically only interested in the US CMS Tier-2 sites. Right now, it is hard to fish those seven sites out of the very many OSG sites that Gratia tracks.

Then, once that is done, there are several kinds of reports I'd like to get at easily:

  • History of CPU usage by site
  • History of CPU usage by VO
  • History of number of running jobs by site or VO (or better still, do this as a fraction of available batch slots)
  • Integral distributions of all of the above, so we can see the total work done per site/VO for a given time period
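
The site filter and the history and integral reports requested above can be sketched as follows. The job records, field names, and site names are invented for illustration, and wall-clock hours stand in for "CPU usage".

```python
from collections import defaultdict
from itertools import accumulate

# Invented sample data; the real record schema is not specified here.
JOBS = [
    {"date": "2007-05-01", "site": "T2_A", "vo": "cms", "wall_hours": 5.0},
    {"date": "2007-05-01", "site": "Other", "vo": "cms", "wall_hours": 9.0},
    {"date": "2007-05-02", "site": "T2_A", "vo": "cms", "wall_hours": 3.0},
    {"date": "2007-05-02", "site": "T2_B", "vo": "cdf", "wall_hours": 2.0},
]


def usage_history(jobs, sites, group_by):
    """Daily wall hours per site or VO, restricted to the given set of sites."""
    history = defaultdict(lambda: defaultdict(float))
    for job in jobs:
        if job["site"] in sites:
            history[job[group_by]][job["date"]] += job["wall_hours"]
    return {key: dict(sorted(days.items())) for key, days in history.items()}


def integral(daily):
    """Running total of a {date: hours} series, in date order."""
    dates = sorted(daily)
    return dict(zip(dates, accumulate(daily[d] for d in dates)))


selected_sites = {"T2_A", "T2_B"}
by_site = usage_history(JOBS, selected_sites, "site")
by_vo = usage_history(JOBS, selected_sites, "vo")
cumulative_t2_a = integral(by_site["T2_A"])
```

Filtering by site before grouping means the same function serves both the per-site and the per-VO view of the selected sites.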

From fkw

In addition to Ken's plots I want to be able to see:
  • history of CPU usage per VO for a given site. E.g. in a way similar to the "VOs per site" view in the "jobs" folder of ReleaseDocumentationMonALISA.
  • history of CPU usage per site for a given VO. E.g. in a way similar to the "Sites per VO" view in the "jobs" folder of ReleaseDocumentationMonALISA.

I'm assuming both Ken and I are content for now to simply know the wallclock time as a metric for "CPU usage".

  • In addition, I'd like to be able to type an FQAN into a form, preferably with regular-expression wildcards, and display the corresponding plots as if it were a VO, i.e. plotting either the total usage or the usage split up by site. E.g.:
    • I'd like to see the report plotted for a particular DN
    • ... for a particular role or group ... [Developer notes: This information is not yet collected]
  • for a given site and a given time window, I want the usage per FQAN, i.e. a different colored line for each distinct FQAN.
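
The FQAN request above amounts to matching a regular expression against each record's FQAN and then summing usage, either in total or per site. A hedged sketch with invented records and a hypothetical `usage_by_fqan_pattern` helper:

```python
import re
from collections import defaultdict

# Invented sample records; FQAN strings and field names are illustrative only.
RECORDS = [
    {"fqan": "/cms/uscms/Role=production", "site": "T2_A", "wall_hours": 4.0},
    {"fqan": "/cms/uscms/Role=NULL", "site": "T2_A", "wall_hours": 1.0},
    {"fqan": "/cms/Role=production", "site": "T2_B", "wall_hours": 2.0},
    {"fqan": "/atlas/Role=production", "site": "T2_B", "wall_hours": 7.0},
]


def usage_by_fqan_pattern(records, pattern, split_by_site=False):
    """Sum wall hours over records whose whole FQAN matches the regex."""
    rx = re.compile(pattern)
    totals = defaultdict(float)
    for r in records:
        if rx.fullmatch(r["fqan"]):
            key = r["site"] if split_by_site else "total"
            totals[key] += r["wall_hours"]
    return dict(totals)


cms_production = r"/cms/.*Role=production"
total_usage = usage_by_fqan_pattern(RECORDS, cms_production)
usage_per_site = usage_by_fqan_pattern(RECORDS, cms_production, split_by_site=True)
```

Using `fullmatch` rather than `search` keeps a pattern such as `/cms/.*` from accidentally matching an FQAN that merely contains that text somewhere in the middle.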

And do me a favor, don't try to hide things from me for privacy reasons. We have the concept of a local GRATIA accounting DB, and I for sure have the right to see all the records displayed for all the fqans active at the site I operate.

[Developer notes: We will not hide anything until we add the ability to log in with a role having 'extra' privilege (yes, Frank can see everything, but his summer students might not be allowed to).]

P.S. Here are some ideas I would pursue if I knew how.

  • pick all HEP VOs, and show what their usage is within the last month.
  • pick all non-HEP VOs and show them instead.
  • pick Atlas and CMS and show what their usage is within the last month. For CMS this is essentially impossible right now because we have an accounting mess in CMS, with many sites being recorded in a variety of different ways.
  • find among the HEP VOs one that uses many sites, and show the distribution across sites. I'm guessing that's either D0 or CDF because Atlas and CMS use only their own sites. In Atlas, its use of BNL dominates everything. In CMS, they're not even trying to use sites they don't own.

I'd want to demonstrate that:

  1. OSG provides utility to Atlas and CMS
  2. OSG provides utility to HEP other than LHC
  3. OSG provides utility to people outside HEP
  4. the premise that you can find resources that you don't own is indeed working for somebody in HEP, i.e. somebody in HEP benefits from the LHC program without paying for it.

From Ruth

  • I would like to be able to have a set of canned queries that just give the tables and CSV files (i.e. no plots). It would be nice to be able to save the SQL queries I put in for easy reuse later and for other people to use.
  • I would like all the plots to give me the queries used - at the moment there are some exceptions.
  • Can there please be an interface to the "Daily Report" information as well as the main Gratia information, so we can add the USATLAS usage to the main information? At the moment no PANDA usage is reported through the main Gratia web pages (and it is significant).
  • Filter out particular VOs or sites from the overall plots - i.e. complete flexibility to remove one or more sites or VOs from any plot.
  • http://home.fnal.gov/~weigand/gratia/gratia-queries.html#toc

Added on June 9th:

  1. When selecting a plot for one VO or SITE the VO or SITE name should be part of the title of the plot. At the moment this is in small print as part of the X Axis label. The word "GratiaUser" in the title at the moment is not needed.
  2. The date range on plots which are not showing date in the x axis should also be at the top and more prominent.
  3. Can there be a top-level page which gives links to the different Gratia databases - I believe there is one for Fermilab, the Summary information from PANDA for example, and maybe others? (LQCD?) [Developers' note: this is currently available on the twiki page: https://twiki.grid.iu.edu/twiki/bin/view/Accounting/WebHome#Reports.]
  4. I would like to see if we can get CMS and other VO summary information. I presume the CMS might come from the CMS dashboard.
  5. Can one make a set of exclusions or selections and have them apply to a subsequent set of queries?

Added on August 11th:

Please could these daily reports give a summary at the top that would help those skimming the information on a daily basis: the total Wall Duration, the total CPU Time, the number of sites reporting, and the number of CEs in the Registration database listed as in the production OSG infrastructure, as a clearly delineated line at the top of the report? I see 50 sites are now reporting. That is good!

I am also interested in some more information from the daily reports (these could be weekly or monthly instead).

  • Efficiency of jobs, that is, the ratio of CPU time to Wall duration.
  • The profile of the length of time of jobs by VO/by Site. If these cannot be sent as text files but only as xls and charts, that is OK by me.
  • Alerts: I am interested in those jobs that have been running more than X days by VO, where X is a week and a month.
  • Alerts: I am interested in those sites that show 0 Wall Duration for more than X days, where X is a week and a month.
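
The efficiency metric and the zero-usage site alert requested above can be sketched as follows. The function names, the input layout (a per-site mapping from day to wall seconds), and the sample data are assumptions; treating a site that never reported non-zero usage as idle is also a choice made for this sketch.

```python
from datetime import date


def efficiency(cpu_seconds, wall_seconds):
    """CPU time divided by wall duration; None when no wall time was recorded."""
    if wall_seconds == 0:
        return None
    return cpu_seconds / wall_seconds


def idle_sites(daily_wall, today, threshold_days):
    """Sites whose last day with non-zero wall duration is more than
    threshold_days before today, or that never reported any."""
    alerts = []
    for site, days in daily_wall.items():
        active = [d for d, wall in days.items() if wall > 0]
        last = max(active) if active else None
        if last is None or (today - last).days > threshold_days:
            alerts.append(site)
    return sorted(alerts)


# Invented sample data: wall seconds per site per day.
DAILY_WALL = {
    "SiteA": {date(2007, 8, 1): 100, date(2007, 8, 10): 50},
    "SiteB": {date(2007, 8, 1): 20, date(2007, 8, 10): 0},
}
today = date(2007, 8, 11)
idle_for_a_week = idle_sites(DAILY_WALL, today, 7)
idle_for_a_month = idle_sites(DAILY_WALL, today, 30)
job_efficiency = efficiency(45, 60)
```

With this data, SiteB last showed non-zero wall duration ten days before `today`, so it is flagged at the one-week threshold but not at the one-month threshold.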

These can be cron jobs that post to the web, with the URL emailed to me. I would like the structure to be such that I can go back to all old reports from the web interface.

-- KenBloom - 11 May 2007

Topic attachments: Gratia_Reporting_Requests-1.xls (28.0 K, 08 Apr 2008 - 19:40, UnknownUser, comment: Format changes)
Topic revision: r21 - 07 Feb 2017 - 19:16:27 - BrianBockelman