You are here: TWiki > SoftwareTeam Web>Projects>BoscoV0Usability (08 Aug 2012, AlainRoy?)

Plan for Usability Testing of BOSCO

1 What is usability?

From [1]:

Usability is not a quality that exists in any real or absolute sense. Perhaps it can be best summed up as being a general quality of the appropriateness to a purpose of any particular artefact.

2 Limits to current phase of usability testing

Thorough usability testing is an art and a science. Given the short time-frame and lack of in-house experience, we are going to limit the scope of usability testing. Good usability testing should include a fair number of candidate subjects, multiple rounds of testing, etc. Our goal is to gain insight into the usability we have in a cost-effective way.

3 Inputs for the usability testing

Our usability testing requires a few things to be in place before it can begin:

  1. The software must be ready for usability testing.
  2. An initial round of beta testing should be complete. The goal of usability testing is not to find bugs (though we welcome bug reports) but to understand the usability. Therefore basic bug fixing should be complete before usability testing begins.
  3. There must be sufficient documentation for the user to be able to do all of the tasks required for the usability testing.
  4. Dan will work with the team to identify a couple of testers for usability testing. These should not be the same as the beta testers.

4 Plan

4.1 Target audience

Our target audience is composed of scientists and researchers who:

  • Have access to a cluster of computers using PBSPro or Torque.
  • Have a need for high-throughput computing.
  • Are comfortable with the basics of using Linux and the command-line.

4.2 Task

We will ask our testers to spend about one hour doing the following two tasks:

  1. Install Bosco and connect it to their PBS/Torque cluster.
  2. Run one job via Bosco.

The tester will be given:

The tester is expected to have:

  • An appropriate computer on which to install Bosco
  • Access to a remote cluster (currently PBS) to use with Bosco.

4.3 Usability testing

Our usability testing will consist of three parts. All three are desirable, but due to time constraints we may only be able to do the second and third parts.

4.3.1 Observation (for v1 and beyond)

Have an OSG staff member (preferably a Bosco team member) watch the tester complete the task. The staff member will not interfere with the task. Specifically, we will not provide any guidance or help. We will take extensive notes on what happened. In particular, we will note what parts of the task were hard or confusing and whether the tester used the recommended workflow.
This task will require someone to be present. In future usability testing, we may find appropriate remote collaboration tools to do this remotely.

Questions that the OSG staff member will try to answer are:

  1. Did the tester follow the documentation and process linearly? If they jumped around, what did they do?
  2. What parts of the process made the tester get stuck?
  3. What unexpected errors did the tester find? How did they deal with them?
  4. Did the tester show signs of frustration?
  5. Did the tester run any jobs? If not, why not?

4.3.2 Survey: System Usability Scale (for v1 and beyond)

We will ask the tester to fill out the System Usability Scale survey (below) and mail it to Alain. This survey has been widely used ([1], [2]) and, according to [2], it has been well studied and found to be a consistently reliable measure of usability. It gives a single number (0 - 100) that summarizes the usability of the product. It's particularly useful when we iterate because we can see if we are improving.

Each question is answered on a scale from 1 to 5 where 1 means "Strongly disagree" and 5 means "Strongly agree". If they are unsure, they should select the middle (3).

  1. I think that I would like to use this system frequently
  2. I found the system unnecessarily complex
  3. I thought the system was easy to use
  4. I think that I would need the support of a technical person to be able to use this system
  5. I found the various functions in this system were well integrated
  6. I thought there was too much inconsistency in this system
  7. I would imagine that most people would learn to use this system very quickly
  8. I found the system very cumbersome to use
  9. I felt very confident using the system
  10. I needed to learn a lot of things before I could get going with this system

Scoring, from [1]:

SUS yields a single number representing a composite measure of the overall usability of the system being studied. Note that scores for individual items are not meaningful on their own.

To calculate the SUS score, first sum the score contributions from each item. Each item's score contribution will range from 0 to 4. For items 1, 3, 5, 7, and 9 the score contribution is the scale position minus 1. For items 2, 4, 6, 8, and 10, the contribution is 5 minus the scale position. Multiply the sum of the scores by 2.5 to obtain the overall value of SU.

SUS scores have a range of 0 to 100.
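
The scoring rule quoted above can be sketched in a few lines of Python. This is only an illustration: the function name sus_score and the sample answers are made up for the example and do not come from any real tester.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for ten answers on the 1-5 scale.

    responses[0] is the answer to item 1, responses[9] the answer to item 10.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if answer not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: answer must be 1-5, got {answer!r}")
        if item % 2 == 1:
            # Odd items (1, 3, 5, 7, 9): scale position minus 1.
            total += answer - 1
        else:
            # Even items (2, 4, 6, 8, 10): 5 minus scale position.
            total += 5 - answer
    return total * 2.5


# Hypothetical answer sheets, for illustration only.
neutral = [3] * 10                       # every answer is the middle (3)
best = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]    # most favorable answer to every item
print(sus_score(neutral))  # 50.0
print(sus_score(best))     # 100.0
```

Note that an all-neutral sheet lands at 50, the midpoint of the scale, because every item contributes 2 of a possible 4 points.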

4.3.3 Survey: Detailed comments

In addition to the numerical System Usability Scale, we'll ask the tester for specific feedback.

  1. Did you find the documentation clear? (Do you understand what Bosco does?)
  2. Did you have any difficulty with the installation and configuration? If so, what?
  3. Did you successfully run any jobs? If not, what happened?
  4. Is this software you would like to use? Will it help you? If not, why not?

4.4 Timeline: External dependencies

| Task | State | Owner | Target Start | Target Finish | Actual Finish | Notes |
| BOSCO v0 complete | Achieved | Fraser/Gore | - | 27-Feb-2012 | | |
| Documentation/web site complete | Achieved | Mambelli/Fraser | - | 27-Feb-2012 | | |
| Beta testing complete | Achieved | Fraser/Gore | - | 5-Mar-2012 | | |
| Find usability testers | Achieved | Fraser | 20-Feb-2012 | 5-Mar-2012 | 12-Mar-2012 | Only got one tester |

4.5 Timeline: Usability testing

| Task | State | Owner | Target Start | Target Finish | Actual Finish | Notes |
| Determine impact of Human Subjects Research | Achieved | Alain | 16-Feb-2012 | 24-Feb-2012 | | |
| Update plan to reflect Human Subjects Research | Achieved | Alain | 24-Feb-2012 | 28-Feb-2012 | | |
| Work with tester #1 and #2 | Achieved | Alain | 5-Mar-2012 | 9-Mar-2012 | 12-Mar-2012 | |
| Work with testers #3 and #4 | Skipped | Alain | 12-Mar-2012 | 16-Mar-2012 | | Only got one tester from source of testers |
| Write up usability testing results | Achieved | Alain | 16-Mar-2012 | 19-Mar-2012 | 12-Mar-2012 | |

5 Appendices

5.1 Human Subjects Research

Miron advised Alain to check out rules governing Human Subjects Research, and ensure that we follow appropriate guidelines.

After some research, he communicated with the director of the relevant Institutional Review Board at UW-Madison, who said:

If these people are only commenting on the software, it is evaluation, not human subjects research. If you are getting details that are personal or identifiable-- it becomes research.

To avoid moving into the research realm-- your questions should focus on a "reporting" nature (tell me what you see) rather than, for example, "Compare this to another software and give me your opinion."

It's a gray area, but so far, what you have described is not human subjects research. It involves human subjects-- but what you are doing so far is not human subjects research.

So if we are careful not to collect personal information, we will be fine.

Some other background:

5.2 Result

(As emailed to participants in Bosco project)

Hi everyone,

This afternoon I did our first usability testing of Bosco. I followed the plan laid out at:

https://twiki.grid.iu.edu/bin/view/SoftwareTeam/BoscoV0Usability

The tester followed the documentation at:

https://twiki.grid.iu.edu/bin/view/CampusGrids/BoSCO

We'll call our tester Bilbo, to keep him anonymous.

Bilbo is an undergraduate CS major. While not exactly our target audience, he's not a bad choice. He feels comfortable using a command-line (even tabbed terminal windows and scp) and using computers but he is not an expert in Linux.

I explained what Bosco does, and he understood. He has access to a one-node PBS cluster here in the CS department that we use for testing purposes.

We spent about one hour on the testing.

Interesting things

  • The download link to the Bosco tarball was broken. With Jaime's help I fixed it. This should not have happened.
  • Bilbo is comfortable with the shell, but he edits files on his Mac then uses scp to transfer them over. Our users may do this too--it's not the first time I've seen this.
  • Bilbo got stuck when asked to source a file because he did not know the difference between sh and bash, or how to tell whether he had sh, bash, or csh. I helped him; otherwise we wouldn't have gotten anywhere. He has tcsh, which is not one of the options on our page, so I told him what to do.
  • He installed Bosco but couldn't start it:
    % bosco_start
    03/12/12 16:37:17 Can't open "/afs/cs.wisc.edu/u/k/a/kahl/bosco/local.mumble-20/log/MasterLog"
    ERROR: BOSCO not started.
    
    It turns out that he logged into "best-mumble.cs.wisc.edu", which uses load balancing to select random machines in an instructional lab in the department. Since he used multiple terminal windows, he installed on one computer (mumble-20) but ran it on another computer. He had no idea what to do about this. He eventually decided to just start over. It took a while to undo this because he failed to get a truly fresh installation.
  • While re-installing, he asked me how to remove a directory and what "cd ~" means. Remember, he considers himself comfortable in the shell.
  • I wanted him to submit a job, but he couldn't follow the documentation because there is NO DOCUMENTATION on how to submit jobs with Bosco. The only thing we have is a line in the command summary table that says:
    condor-*	
    Various [Arguments]	
    Various	[Implicit Input]
    Various [Output] see the Condor manual
    So I told him how to use condor_submit and condor_q.
  • He copied the submission file and had no idea that he had to edit the file (to change the grid_resource), even when I told him he had to edit the file in some way. When I pointed to that line, he realized what he had to do.
  • The job sat idle in the queue, and he had no idea what to do next. Why was it idle? How should he figure it out? He was stuck.
  • The machine he was connected to went down. We presume this was because it is a busy instructional machine and not due to Bosco, but we don't know for sure.
  • I make my web browser nice and wide so I can see the full command-line. Everyone does, right? No, Bilbo made his window really narrow so it fit next to his terminal window. People interact with their computers in very different ways.

Survey

I gave him the System Usability Scale survey. His answers gave a score of 37.5/100 where 100 is "most usable". A couple of notable answers:

Q: I think that I would need the support of a technical person to be able to use this system.
A: Strongly agree

Q: I needed to learn a lot of things before I could get going with this system:
A: Strongly agree

Q: I would imagine that most people would learn to use this system very quickly.
A: Strongly disagree

He's only one person, so this is not statistically significant. But given what I saw him struggle with, I think there is a lot of work to be done to make Bosco usable.

Things we should do

  • Clarify the documentation on how to source a file. Can we simplify the process?
  • Tell people how to submit and watch jobs. Make sure it's clear what lines should be edited in the submission file.
  • Tell people how to get the output from their jobs. Whatever the scheme is (i.e. no file transfer) they need to know. The example submit file has /dev/null for the output, so I'm guessing we're not going to get any output. Is that really what we want in our example?
  • Help people figure out what to do when jobs are idle.
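
As one way to make the lines a user must edit obvious, the documentation could generate the submission file from the user's own values instead of asking them to edit a copied file. The Python sketch below is an assumption-laden illustration, not Bosco's actual example file: the helper name render_submit_file, the placeholder user and host, and the output file names are all hypothetical, and whether output files actually come back depends on the file-transfer scheme, which is still an open question above.

```python
def render_submit_file(user, cluster_host, executable,
                       batch_system="pbs", output="job.out"):
    """Build the text of a minimal submit description for one job.

    Hypothetical helper: the grid_resource line is the one that must
    be specific to each user (account and cluster host name).
    """
    lines = [
        "universe = grid",
        # The line every user has to change: their account and cluster.
        f"grid_resource = batch {batch_system} {user}@{cluster_host}",
        f"executable = {executable}",
        # Assumption: a real file name instead of /dev/null, so the
        # user has something to look at once the job finishes.
        f"output = {output}",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values, for illustration only.
text = render_submit_file("bilbo", "cluster.example.edu", "/bin/hostname")
print(text)
```

The resulting text could be saved to a file, passed to condor_submit, and then watched with condor_q, the two commands the tester was shown.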

Things we should know

  • Users that are comfortable with Linux may not know:
    • What shell they are using, or how to find out
    • How to do basic things, like remove a directory or what ~ means.
    • How to edit a file in Linux (may prefer to edit on their own computer)
  • Some errors are very hard to debug, including:
    • Why did I get that error when I started Bosco?
    • Why is my job idle?
  • Users will interact differently than us:
    • Smaller or larger web browser windows
    • May prefer to edit files on their laptop and transfer them
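
Since testers may not know which shell they use, setup instructions could detect it for them. The Python sketch below is a minimal illustration under stated assumptions: it relies on the SHELL environment variable, which names the login shell and may differ from the shell actually running, and the function name shell_family and the two family sets are made up for this example.

```python
import os

# Illustrative groupings; extend as needed.
SH_FAMILY = {"sh", "bash", "dash", "ksh", "zsh"}
CSH_FAMILY = {"csh", "tcsh"}


def shell_family(shell_path):
    """Classify a shell path such as /bin/tcsh as 'sh', 'csh', or 'unknown'."""
    # Keep only the program name and drop the leading '-' that some
    # systems put in front of login shell names (for example '-bash').
    name = os.path.basename(shell_path.strip()).lstrip("-")
    if name in SH_FAMILY:
        return "sh"
    if name in CSH_FAMILY:
        return "csh"
    return "unknown"


# Report the family of the current user's login shell, if SHELL is set.
print(shell_family(os.environ.get("SHELL", "")))
```

With a check like this, a tester using tcsh would be pointed to the csh-style instructions rather than being asked to choose among sh, bash, and csh.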

Conclusion

A lot of hard work has gone into Bosco. It's clear that we are thinking hard about usability, and that's good. However, I think we have a lot of work to do before Bosco v0 can be labeled as "usable" by non-geeks. Some of the work is independent of the changes planned for v1 (like answering why jobs are idle), while others (file transfer and deploying on a laptop) may be related to usability. If we are serious about usability, we have a lot to work on.

5.3 References

[1] Brooke, J. (1996). "SUS: a 'quick and dirty' usability scale". In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & A. L. McClelland (Eds.), Usability Evaluation in Industry. London: Taylor and Francis.

[2] Lewis, J. R. & Sauro, J. (2009). "The factor structure of the System Usability Scale". International conference (HCII 2009), San Diego, CA, USA.

Topic revision: r9 - 08 Aug 2012 - 18:30:20 - AlainRoy?