Prep for CDF Research Highlight Interview
Ashutosh and Pasha for Data Processing side
Their email to CDF executive board
expansion of the CDF computing to the Open Science Grid (OSG) during the last 6 months CDF was able to consume more CPU resources than any other experiment, you can find 'by experiment' OSG CPU consumption plot at the CDF tiki:
http://www-cdf.fnal.gov/tiki/tiki-read_article.php?articleId=19 (I don't have authorization to view this).
Ability to use practically unlimited today off-site computing for MC generation had huge impact on the physics productivity of the experiment in 2006.
- In a short time our physicists were able to generate a lot of MC necessary for the data analysis.
- Having moved time-consuming MC generation offsite we were able to use more on-site CPU resources for the data analysis and use proximity of these CPUs to the data to increase efficiency and shorten the time of the data analysis.
GRID expansion of the CDF computing was one of the building blocks underlying success of our physics and leading to multiple important physics results reported by CDF in 2006. This is a tremendous achievement of the CDF GRID team which has already made available to us North-American (NAmCAF) and European GRID (LCGCAF) resources and just recently started working on PacCAF which will allow CDF to access GRID computing sites in Asia.
Actual questions
Start off:
400-word article (not very long)
will discuss grid/computing side with you, and physics side with spokespersons
will need a good image from you
we'll draft, send to you for review
finalized by xmas
How did the grid help the collaboration make the recent discoveries of B sub s mixing and the Sigma sub b particles?
ash (lets pasha speak first)
pasha
for any analysis, need to do lots of computations, when search for new, compare expectations to data,
generate monte carlo
amount of cpu on site huge but limited
in finite time, important to have as many cpu resources as can use
else time is too long
analysis process doesn't converge as fast as want
and the quantity of transfer becomes a quality issue.
if less cpu than x
is impossible to do certain things
for bsub s. need to gen lots of mc, est systematic uncertainties,
uncertainties of detector parameters into observ
how variation in assumptions
compare to data to see if effect
ah rewrite: observe how variation in assumptions
compare to data to see effect of uncertainties of detector parameters
lucky to move all mc to grid, this is longterm strategy: use grid for mc
is true, is because of grid that were able to do mc in timely fashion
at same time, having mc offsite, released more cpu at fnal at close prox for data analysis, so sped up analysis
nice to see people working in mode limited not by cpu but by intellectual capacity
tools for submitting jobs, so that it's easy for phys to submit -- lucky to have cdf dev's to devel these tools
tech comment: no other expt has such adv tools for submitting
and monitoring jobs
ash
agree with pasha, detector as sophisticated as cdf, when look for subtle effects, or rarely produced signal, is important to make case; is crucial to compare data to all hypotheses of what data would look like with signal or without (consistent with this signal, not with others)
how to calculate, only way is to run large amount of mc simulation
to make sci case unequiv, need suffic quant of simulation -- power of grid essential, and will be more and more as move forward
look for in future, even more subtle things seen, need even more precise comparisons
relied on computing power of grid.
looking forward to building own interface to grid, keeps up with technical developments
want to be a full partner, but
the process of scientific discovery is at stake and we are relying on the increasing power of the grid what we can into it, we have something at stake -- the science.
Tools -- is becoming a contribution
ash
our submis tools existed from before osg became such a useful resource
we looked ahead and dev tools to use locally before osg easy to access
as osg opens up, our tools are evolving, old functionality plus now tools can access computers out on the grid.
users can transition to grid without lots of extra learning on their part.
looks same to users, but from sw dev the backend is more generic more gridlike
we have a way of using grid for physics analysis we'd like to share, so if people would like to adopt similar route, happy to share.
it helps if others do same things, partners can adopt. as openly as can share tech of accessing grid, is advantageous.
pash
true one approach that uscms is looking at is next gen of cdf tools, naturally comes with the people. cms is looking at other ways too.
since cdf
ash
proof that people can do phys and make disc is proof that it's a well system -- not hypothetical, people are publishing with it now.
if technol has a path to move to future with osg, confidence level gets higher, so would like new experiments on lhc and others to know about
pash
our physicists are submitting jobs to grid on daily basis. not just a few people, everybody doing, even w/o thinking what is on back end.
"People are just doing it."
this is proof of success of approach, not bleeding edge for users, just everyday work. don't think about where will be executed, routine best proof of viability.
tool : originally
CAF -- central analysis facility, one at fnal, then organized similar dedicated facilities around world, distributed cafs
got many of these, difficult for many batch farms
2nd change, all these previously dedicated started becoming parts of grid sites
together, resulted in evolution of job subm tools,
make it such that users don't know where submitting
resulted in concept of grid entry points, have a virtual batch farm per continent
1st spans north am : north am caf -- legacy name
2nd entry point, implemented in europe, contrib of LCG CAF is one machine, talks to fnal, cdf jobs throu that machine get distrib all through europe, uses lcg grid tools on back end, lcg middleware, we are really well fit to submit jobs to whatever grid is available, all of noram cafs today is osg, canadians use lcg but it's work in progress,
asia, after collabs in asia heard how well we're doing, they decided to create entrypoint pac-caf (pacific), is progressing rapidly, in 1-2 months, jobs transparently submit to asia, same as nor am and europe
ash
japan and taiwan are taking lead in pac caf. and others may join
pash
nor am caf, euro, pac caf -- each has one machine that talks to grid sites
ash
condor, is one of products we use that comes from grid middleware effort
for na caf, a lot of the clever stuff behind scenes, done by condor batch system,
comes with many tools allow us to send jobs to sites, find out which sites to send to
big consortium that develops for condor batch sys. we are clients of condor consort
we use condor and have provided feedback to condor, and there's good ptop communication, we don't dev condor, but help with dev via feedback
all gain by experiences
How, exactly, is CDF using the OSG? How many sites? What's the amount of usage?
Today 5 sites, about 5 more in the works.
plan to continue expansion till all grid sites, want to go step by step, integ into sub system one site after another
can afford it, already good enough for the next year, access to sites we have now, covers most of our est offsite needs
_CPU usage on vs off-site:_ in absolute scale, split 60-40 or 70-30, largest onsite, but mc all offsite
easier/faster to do offsite
not enforcing as policy
mt metrics from fkw article
1.6 PB at fermilab on tape, includes data from det and mc data
4-5000 jobs running concurrently? pas: yes, total onsite is 5000 jobs, + 1/2 of that for offsite, amt of jobs offsite varies with time
10s of 1000s of jobs waiting in queue
ash
often seen in excess of 1000 cdf jobs on no am caf alone (5 osg sites)
peak jobs? or metrics we should include
up to 8000 jobs concurrent
unique features of sw, love to share, monitoring system. mon sw dev of cdf lets users see status conveniently as if running locally, key part of success, what is diff is to watch, monitor job
How long has CDF been using the OSG?
ash
criteria n am caf -- have sites where no computers at sites have cdf-sw. simply runs osg sw, and our interface can submit from fnal to site
use this as "use grid out of box"
n am caf in prod in june 06
pas criteria; osg monitoring (not cdf) see cdf job since beginning of that monitoring from beginning, mon from 2005 (ash likes it better)
what services, resources, expertise has CDF contributed to OSG
pas
ash
ideas and tech all goes with people
igor sfil was w/ cdf now w/ cd
wking on OSG full time
fkw had idea of interface to work on whatever batch system (CAF) igor came along, fkw invented word CAF
people who had cdf connections, go to cms, cd elsewhere, coming from running expt, know what service to provide to users. This helps them and whoever they're helping now.
Some expertise goes via sw and via people.
What are the future plans of CDF and the OSG?
currently pinging 5 sites, want to be able to submit to all sites, operations type work, will integrate lcg sites in canada, will expand set of sites available thru lcg caf euro entry point
until lhc startup, the cpu resources will be increasing, need to learn how to use efficiently, then will expand in asia,
then game will change in a couple of years, today amt of cpu is virtually unlimited, mostly because people don't know how to use it. cdf has learned fast
in 2 yrs, expect landscape to change, competition for cpu resources, plan to learn in this time how to succeed in this new mode when resources become scarce.
Key for that is that jobs to submit do not have any cdf spec requirements to sites
set it as requirement that job be self-contained, any linux cpu.
true concept of a grid: net or web with cpus .
Anne's questions
- when was this written?
- how much is "practically unlimited"?
- Explain CAF, NAmCAF and how it relates to OSG
- Numbers (maybe they're in the image), how many MC jobs, CPU cycles, TBytes, what units are best?
- what services, resources, expertise has CDF contributed to OSG
Katie's suggested questions
- How did the grid help the collaboration make the recent discoveries of B sub s mixing and the Sigma sub b particles?
- How, exactly, is CDF using the OSG? How many sites? What's the amount of usage?
- How long has CDF been using the OSG?
- What are the future plans of CDF and the OSG?
Spokespersons for Physics side
Meet with Rob and Jaco 12/1/06 2:30
Describe B sub S and sigma sub B for layperson
rob
jaco
two results v diff in some ways, sim in other ways
Particle physics analysis is a statistical process, and CDF is seeking
ever rarer processes and particles. "The more data we
get from the Tevatron, the more we can study rarely produced phenomena, and
understand what happens, how particles behave, and gain confidence that
we're not seeing random processes."
global statement:
study very rare processes, theme
more data from tev, more can study rarely produced phenomena
more can understand
more data --> get more events/partcl like b sbs or sigb, can conc
property (osc) or a particle (sigma)
comon denom: more data,
allows to get enough that can underst what they do, how behave
enough that know it's not random pr
when make discovery, not an
rob
is statistical process, get enough on tape so can conclude state that what you see is signal and not bkgnd fluctuation mimicking signal
jaco
links to idea of grid computing
things to do to understand that really disc, understand effect well enough
need to run so many simulated expts, millions of collisions, how it would look in detector, lengthy computing process
analyze through programs
do so much that need distrib computing
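Illustration (not from the interview): a minimal sketch of the "simulated experiments" idea, assuming a simple counting experiment. It runs many background-only toy experiments and reports how often a background fluctuation alone reaches the observed count. The function names and all numbers are hypothetical and are not CDF analysis code.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count (Knuth's method, fine for modest means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def background_only_fraction(observed, expected_background, n_toys, rng):
    """Fraction of background-only toy experiments with at least `observed` events."""
    hits = 0
    for _ in range(n_toys):
        if poisson_sample(expected_background, rng) >= observed:
            hits += 1
    return hits / n_toys


if __name__ == "__main__":
    rng = random.Random(2006)  # fixed seed so the toy study is repeatable
    # Hypothetical numbers: 100 background events expected, 130 observed.
    print(background_only_fraction(130, 100.0, 20000, rng))
```

The smaller the fraction, the harder it is to blame the excess on background alone. Resolving very small fractions needs very many toys, which is one reason this kind of study is CPU-hungry.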
rob
as study, what's interesting: probe standard model, find any flaws as learn more. these are cases in which the measurements turn out standard-model like, not new physics. we know there's something else out there, these don't lead us to what they might be.
jaco
find window to outside
which window is outside stand model
each measurement is complicated, complex, look to match what expect or to show diff landscape,
so far, these are right on expectations
look into many windows.
b sub s mix
b quark and strange q
elec charge combines to neutral
flipping of b and s quarks into anti-b and anti-s and back again, 3 tr/s
improbable, but well described in standard mod
if rate had been found faster, then other things besides anti particles , --: super symm could be there sparticles
j
trans between mat and antim part/spart, very subtle and rapid effect, astonishing that can measure this rapid process.
concrete
sigma b
see a peak, reconstruct en of mass of outgoing particles, see in well defined measure
bsub s
find
stand model; sty? subt eff can reveal existence of other processes,
predict very precise,
small changes will change results signif
open to finding new phys
lot of very accurate meas, turn out to be consistent, then
more and more acc and precise, to make big deviations
when using same collder (limited energy),
indirect way of searching for new phys.
new phys, particles and forces beyond standard model
sigma sub b
like finding a new element at quark level, show periodic table of baryons, fall in certain regions of quantum numbers
quarks bound and form states, like nuclei are formed
periodic table well predicted, could find deviations from stand model
heavier quark , the less well studied they are
sigma b, whether they have pred masses based on understanding of how qu bind together
jaco
Andreas Kron -- LGT
three body systems qcd, difficult to do analyt calcu
Lattice computing: ways of computing qcd,
we find these states, they see if are in accordance with their understanding, and make improv on their things
not as dramatic
is more internal consistency of how qcd can be used to explain fundamental properties of quark, how they find
most exciting part: how to form find? bound states, right place w right quan numbers
remarkable experimental achievement, and from accel point of view.
important thing: triggers, whether to record passage of particle, difficult thing
remarkable success w/ silicon det and triggers
gain ability to trigger and record process with high purity, not let them get lost.
for sigma b, lattice calc not for mc
lattice: ways to find say mass of proton, or sigma b
take quark component, lots of qcd interactions, how to calc?
do calcs in tiny steps, in momentum space
monte carlo, running 10 million times, what's typical.
- Contrast mixing with oscillation.
How was the oscillation detected and measured?
rob
don't observe meson, but decay prod
is b or antib meson
once decided, measure distance between where collision and where decay occurred,
map out oscillation pattern
jaco
another way
look at production point of particle, was particle or antipart?, look at decay point, still part or anti part?
distance gives how long it took to convert from one to other
repeat many times
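Illustration (not from the interview): a minimal sketch of how a measured flight distance becomes a decay time, using the standard relation t = L_xy * m / (pT * c) for the proper decay time in the transverse plane. The function name and the example values are assumptions made for illustration.

```python
C_CM_PER_PS = 0.0299792458  # speed of light in cm per picosecond


def proper_decay_time_ps(l_xy_cm, mass_gev, pt_gev):
    """Proper decay time in ps.

    l_xy_cm  : transverse distance from collision point to decay point (cm)
    mass_gev : particle mass (GeV/c^2)
    pt_gev   : transverse momentum (GeV/c)
    """
    return l_xy_cm * mass_gev / (pt_gev * C_CM_PER_PS)


if __name__ == "__main__":
    # Hypothetical candidate: B_s mass of roughly 5.37 GeV/c^2,
    # 1 mm transverse flight distance, 10 GeV/c transverse momentum.
    print(proper_decay_time_ps(0.1, 5.37, 10.0))
```

Together with the particle-or-antiparticle decision at production and at decay, this time is the per-event input to the oscillation pattern.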
rob
sigma sub b
reconstruct it
trigger on lambda sub b particles, make mass distributions, apply cuts and a peak comes out of the ashes
use mc to subtract backgrounds
make sure bump is real
jaco
use grid for everything, not just for these two measurements
simulate processes over and over till get full range of possibilities,
need to run
really understand in the most rare things as simulated process
every mc job run offsite for cdf
any process
How did use of OSG software and resources help make this discovery possible?
r
1/2 of offsite are osg, 1/2 are lcg sites
don't want to reinvent wheel
use a pull model, send out pilot job to local site, it pulls in all jobs we want to do, gives us control over job, set priorities and monitor, standard osg toolkit, lets physicists worry about phys not computing
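Illustration (not from the interview): a minimal sketch of the pull model described above, assuming an in-memory priority queue kept at the home lab and a pilot that asks it for work once it starts at a site. The class and function names are hypothetical; this is not the CDF, Condor or OSG interface.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(order=True)
class Job:
    priority: int                      # lower number is handed out first
    name: str = field(compare=False)   # label only, ignored when ordering


class CentralQueue:
    """Queue kept by the experiment; pilots pull from it."""

    def __init__(self) -> None:
        self._heap: List[Job] = []

    def submit(self, job: Job) -> None:
        heapq.heappush(self._heap, job)

    def pull(self) -> Optional[Job]:
        return heapq.heappop(self._heap) if self._heap else None


def run_pilot(site: str, queue: CentralQueue, max_jobs: int) -> List[Tuple[str, str]]:
    """A pilot that landed on `site` pulls up to `max_jobs` jobs, highest priority first."""
    executed = []
    for _ in range(max_jobs):
        job = queue.pull()
        if job is None:
            break  # nothing left to do, the pilot exits
        executed.append((site, job.name))  # real user code would run here
    return executed


if __name__ == "__main__":
    q = CentralQueue()
    q.submit(Job(2, "mc_sigma_b"))
    q.submit(Job(1, "mc_bs_mixing"))
    print(run_pilot("site_A", q, max_jobs=5))
```

Because the job choice is made when the pilot asks for work, priorities stay under the experiment's control rather than the site's.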
Get plot, what does the plot show?
amplitude,
delta ms (mass)
value of delta ms
fourier transform
result w freq
can't measure freq, can measure fourier xform of it
test data for all possible values of delta m
do fit of data for that
if consistent with it, gives you 1
scan till data resonates with hypothesis, get 1 clearly and narrowly, found right frequency
too close to zero, not interesting
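Illustration (not from the interview): a toy version of the scan just described, assuming perfect tagging and perfect time resolution, which the real measurement does not have. Each event is a decay time t and a flag q (+1 unmixed, -1 mixed). For each test value of delta m, a simple least-squares amplitude is computed: it comes out near 1 when the test value matches the true frequency and near 0 far from it. All names and numbers are hypothetical.

```python
import math
import random


def simulate_tagged_decays(n_events, delta_m, lifetime, rng):
    """Toy events: exponential decay times, mixed/unmixed flag from cos(delta_m * t)."""
    events = []
    for _ in range(n_events):
        t = rng.expovariate(1.0 / lifetime)
        p_unmixed = 0.5 * (1.0 + math.cos(delta_m * t))
        q = 1 if rng.random() < p_unmixed else -1
        events.append((t, q))
    return events


def amplitude(events, delta_m_test):
    """Least-squares amplitude A for the model  E[q | t] = A * cos(delta_m_test * t)."""
    num = sum(q * math.cos(delta_m_test * t) for t, q in events)
    den = sum(math.cos(delta_m_test * t) ** 2 for t, _ in events)
    return num / den


def amplitude_scan(events, test_values):
    """Amplitude at every test frequency."""
    return [(dm, amplitude(events, dm)) for dm in test_values]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical toy: true delta_m = 17.8 ps^-1, lifetime 1.5 ps.
    toy = simulate_tagged_decays(20000, 17.8, 1.5, rng)
    for dm, amp in amplitude_scan(toy, [5.0, 10.0, 15.0, 17.8, 20.0, 25.0]):
        print(dm, round(amp, 3))
```

In this toy the amplitude peaks at the true frequency and stays near zero elsewhere, which is the "resonates with hypothesis" behaviour in the notes.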
what would you state as the significance of B sub S result?
rob
collecting more data, this is just a couple on the way to a v rich program
going to measure rarer and rarer, w z boson pair is latest, xsec of few picobarns
looking for few 1/10 of picobarn -- higgs
knocking off bricks of foundation, moving to rarer and rarer
j
show can make better and better measurements,
signif in own sense.
achievement of finding rare things
how does this discovery fit in with the goals of CDF?
what are future physics goals, measurements in progress...
end interview
Related Result of the week articles and press releases
B sub S press release (highlights)
http://www.fnal.gov/pub/presspass/press_releases/CDF_meson.html
...announced (September 25, 2006) that they have met the exacting standard to claim discovery of astonishingly rapid transitions between matter and antimatter: 3 trillion oscillations per second.
The CDF discovery of the oscillation rate is immediately significant for two major reasons: reinforcing the validity of the Standard Model, which governs physicists' understanding of the fundamental particles and forces; and narrowing down the possible forms of supersymmetry, a theory proposing that each known particle has its own more massive "super" partner particle. (The currently popular models of supersymmetry, for example, predict a much higher transition frequency than that observed by CDF, and those models will need to be reconsidered. )
"Scientists have been pursuing this measurement for two decades, but the convergence of capabilities to make it possible has occurred just now," said CDF cospokesperson Jacobo Konigsberg of the University of Florida. "We needed to produce sufficient quantities to be able to study these particles in detail. That condition was met by the superb performance of the Tevatron. Then, with a process this fast, we needed extremely precise detectors and sophisticated analysis tools. Those conditions were met at CDF, along with the skill and contributions of a great team of people."
Scientists hope that by assembling a large number of precise measurements involving the exotic behavior of these particles (matter and antimatter, especially as it pertains to strange, charm and bottom quarks), they can begin to understand why they exist, how they interact with one another and what role they played in the development of the early universe. Most importantly, they could also be the place in which to look for new physics beyond the Standard Model, which scientists believe is incomplete.
The experimenters acquired their data between February 2002 and January 2006, an operating period known as Tevatron Run 2,
Surprisingly, the bizarre behavior of the B_s (pronounced "B sub s") mesons is actually predicted by the Standard Model of fundamental particles and forces. The discovery of this oscillatory behavior is thus another reinforcement of the Standard Model's durability.
"Developing the software tools to make maximal use of the information in each collision takes time and effort," said Roser, "but the rewards are there in terms of discovery potential and increased level of precision."
Sigma sub b
http://www.fnal.gov/pub/today/archive_2006/today06-10-23.html
http://www.fnal.gov/pub/presspass/press_releases/sigma-b-baryon.html
announced today (October 23, 2006) the discovery of two rare types of particles, exotic relatives of the much more common proton and neutron.
The CDF collaboration discovered two types of Sigma-sub-b particles, each one about six times heavier than a proton.
The two types of baryons discovered by the CDF experiment are made of two up quarks and one bottom quark (u-u-b), and two down quarks and a bottom quark (d-d-b). For comparison, protons are u-u-d combinations, while neutrons are d-d-u. The new particles are extremely short-lived and decay within a tiny fraction of a second.
The CDF experiment identified 103 u-u-b particles, positively charged Sigma-sub-b particles (Σ+b), and 134 d-d-b particles, negatively charged Sigma-sub-b particles (Σ-b). In order to find this number of particles, scientists culled through more than 100 trillion high-energy proton-antiproton collisions produced by the Tevatron over the last five years.
the two types of Sigma-sub-b particles are produced in two different spin combinations, J=1/2 and J=3/2, representing a ground state and an excited state, as predicted by theory.
Quark theory predicts six different types of baryons with one bottom quark and spin J=3/2 (see graphic). The CDF experiment now accounts for two of these baryons.
Preliminary outline/draft of research highlight
Jazzy introduction to the physics
Include something like "the OSG software stack helps CDF study..." or "the distributed OSG computing resources handled..."
Three trillion times per second -- that's how fast the B sub S particle has been measured to change between its matter and antimatter states. The Collider Detector at Fermilab (CDF) experiment discovered this rapid oscillation with the help of the world's most powerful particle accelerator, the Tevatron, and unprecedented computing power made available by the Open Science Grid.
Fermilab's Tevatron accelerates protons and antiprotons close to the speed of light, and then makes them collide head-on inside the CDF detector. The products of these collisions are studied to discover the identity and properties of the particles that make up the universe and to understand the forces and interactions between those particles. They've just added to the particle menagerie with their second recent discovery: two new types of 3-quark particles (called baryons), a u-u-b quark combination resulting in a positively charged Sigma-sub-b particle, and its negatively charged counterpart composed of d-d-b quarks. For comparison, each of these is about six times heavier than a proton, which is made up of u-u-d quarks.
This year CDF has achieved the convergence of capabilities that make these discoveries possible. "We needed to produce sufficient quantities (of data) to be able to study these particles in detail. That condition was met by the superb performance of the Tevatron," said CDF cospokesperson Jacobo Konigsberg of the University of Florida, speaking of the B sub S measurement. "Then, with a process this fast, we needed extremely precise detectors and sophisticated analysis tools. Those conditions were met at CDF, along with the skill and contributions of a great team of people." Use of OSG resources has been a significant factor as well. CDF computing experts Ashutosh .. and Pasha .. stressed that "The ability to use practically unlimited off-site computing had a huge impact on the physics productivity of the experiment in 2006."
Clarify "mixing" vs "oscillation". Try to enhance the quotes to mention OSG...
How OSG facilitated the research
basic idea: need to run lots of mc, varying signal expectations and factoring in uncertainties.
Compare data to all results, see what matches
This has to happen in timescale that allows analysis to converge. To do this, need lots of cpu.
So -- run mc on grid (virtually unlimited cpu), release local cpu for data analysis, speeds up analysis
Allows physicists to work in mode where their work is not limited by CPU power, but only by their intellectual capacity.
CDF fortunate to be able to move virtually all mc simulation to grid. This will continue as longterm strategy.
for any analysis, need to do lots of computations, when search for new, compare expectations to data, generate monte carlo. amount of cpu on site huge but limited. in finite time, important to have as many cpu resources as can use, else time is too long, analysis process doesn't converge as fast as want, and the quantity of transfer becomes a quality issue. if less cpu than x, is impossible to do certain things
"When you're looking for subtle effects like B sub S mixing, or rarely produced particles like the sigma sub b, it is crucial to compare your data to all hypotheses to verify that the data are consistent with the expected signal, and not with others. To make the scientific case unequivocally, you need to run a large amount of Monte Carlo simulation. For this, the power of the grid is essential, and will be more and more so as we move forward," explains Ashutosh Kotwal.
More detail about physics with quotes
Try to get spokesperson quote on significance of results; below I have text directly taken from press release
The discovery of the B sub S oscillation rate is immediately significant for two major reasons: reinforcing the validity of the Standard Model, which governs physicists' understanding of the fundamental particles and forces; and narrowing down the possible forms of supersymmetry, a theory proposing that each known particle has its own more massive "super" partner particle. The currently popular models of supersymmetry, for example, predict a much higher transition frequency than that observed by CDF, and those models will need to be reconsidered.
Future plans, both data processing and physics goals
--
AnneHeavey - 14 Nov 2006