'Shaping the Curation Costs Exchange: sharing your feedback' by Magdalena Getler
With the end of the project just days away, and the time for the Curation Costs Exchange (CCEx) to fly the nest and brave the digital curation community alone almost upon us, our usability testers share part of its evolution and explain how the project has used structured usability feedback to shape the CCEx into what it is today. Magdalena Getler of the Digital Curation Centre at the University of Edinburgh explains...
We would like to share with you the results from usability testing carried out on the Cost Comparison Tool (CCT) developed for the Curation Costs Exchange platform.
In November and December 2014 we conducted a comprehensive study of the CCT using various research methods, such as focus groups, heuristic evaluations and usability testing. The findings pointed us towards some potential improvements and enhancements, and the results can be seen in the current version of the CCT. In this blog post I will focus on the usability testing results that led us to these improvements.
Usability testing is a technique used to evaluate a product by testing it with representative users. It reveals to what extent the interface facilitates the users’ ability to complete critical tasks, things that users must be able to do, for example create an account. It allows us to test whether something is working as planned.
Throughout the process we focus on learning about users’ behaviour – observing how they interact with the tool, whether they are in control or appear frustrated when using it, and whether they complete the tasks – rather than asking them what they think of it.
During the usability testing of CCT, we recorded and analysed the following:
- Whether participants were able to complete the tasks and find the desired information,
- How long it took them to perform specific tasks,
- The steps taken. This also allowed us to gauge how quickly and easily users learnt to use the tool the first time they saw it.
The test was conducted with a group of typical users, in our case participants who had been responsible for managing budgets in their current or previous roles. Each session lasted approximately an hour.
The usability test consisted of three key steps:
- First, we identified the critical tasks users need to be able to perform with the CCT. This also included testing the homepage design to ascertain whether it had any impact on usability and whether users could figure out what the tool was for. Users were asked questions such as ‘what strikes you about this page?’, ‘what do you think you can do here?’ and ‘what is it for?’.
- Then, for each task we created scenarios, for example: “You would like to compare archival costs at the University of Edinburgh with archival costs at a similar institution”. Task: “Compare costs with peers at California Digital Library”.
- For each scenario we measured success by recording the following:
- Completion rates, coded as a binary measure of task success where 1 = success and 0 = failure,
- Usability problems encountered by users. Each problem was given a severity rating as follows: the problem 'prevents task completion', 'causes a significant delay or frustration', 'has a relatively minor effect on task performance', or 'is a suggestion',
- Task completion time – how long each user spent on each activity,
- Errors – a list of any unintended actions, slips, mistakes or omissions, mapped to usability problems where possible,
- Satisfaction ratings – after the test, users were asked to complete a standardised usability questionnaire,
- Expectation ratings – we asked each participant to rank the difficulty of each task both before and after completion (a rough sketch of how such an observation can be coded appears below this list).
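To make these measures concrete, here is a rough, purely illustrative sketch (in Python) of how a single observation could be coded up for analysis. The structure and field names are shorthand for this post rather than the actual coding sheet used in the study.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Severity levels used when rating usability problems (most severe first)
SEVERITY = {
    4: "prevents task completion",
    3: "causes a significant delay or frustration",
    2: "has a relatively minor effect on task performance",
    1: "is a suggestion",
}

@dataclass
class TaskObservation:
    participant: int
    task_id: int
    completed: int                 # binary: 1 = success, 0 = failure
    time_seconds: int              # time on task
    errors: List[str] = field(default_factory=list)                # slips, mistakes, omissions
    problems: List[Tuple[str, int]] = field(default_factory=list)  # (description, severity 1-4)

# Illustrative record: participant 4 failed Task 1 after 120 seconds
obs = TaskObservation(
    participant=4, task_id=1, completed=0, time_seconds=120,
    errors=["followed the 'Related services' link instead of the tool"],
    problems=[("tool is hard to find from the homepage", 3)],
)
```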
Summary of results
Tasks
Test participants were asked to complete the following tasks:
“You are a lecturer at the University of Edinburgh. You have been awarded a grant for a research project on the Conservation of Mexican Turtles. You have heard of a new online tool that can help you rationalise expenditure...”
- Find the tool.
- The tool appears to be relevant and you would like to try it out. Sign up.
- You will employ one researcher and together you will take video recordings of turtle migrations and will log numbers of turtle hatchlings on spreadsheets. The project will start in 2014 and will last two years. You anticipate that it will generate four gigabytes of data, which will cost £10,000 to store. Please enter information about your dataset and the costs associated with storing it. Save.
- You would like to compare archival costs at the University of Edinburgh with archival costs at a similar institution. Compare costs with peers at California Digital Library.
- Your connection has timed out and you are worried that you may have lost some data. Please contact support.
- You have consulted with colleagues and reached the conclusion that your project will generate six gigabytes of data, not four, and you have to plan accordingly. Go back and amend your dataset.
Task Completion Rate
All participants successfully completed Task 2 (sign up for the CCT), Task 5 (contact support for the website) and Task 6 (go back and amend your dataset). Four of the five participants (80%) completed Task 1 (find the CCT tool on the website) and Task 4 (compare costs with peers). The lowest completion rate, 60% (but still more than half), was for Task 3, which asked users to enter information about a dataset and the costs associated with storing it (the core task in the CCEx).
Although participants struggled with Task 3 (something that has since been addressed in the latest version of the CCT to make the process more intuitive), the results show that the interface is user friendly, as demonstrated by the high task completion rates. The calculation behind the rates in Table 1 is sketched briefly below the table.
Table 1: Task completion rate
Participant | Task 1 | Task 2 | Task 3 | Task 4 | Task 5 | Task 6
1 | 1 | 1 | 1 | 1 | 1 | 1
2 | 1 | 1 | 1 | 1 | 1 | 1
3 | 1 | 1 | 0 | 1 | 1 | 1
4 | 0 | 1 | 0 | 1 | 1 | 1
5 | 1 | 1 | 1 | 0 | 1 | 1
Success | 4 | 5 | 3 | 4 | 5 | 5
Completion rate | 80% | 100% | 60% | 80% | 100% | 100%
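The completion rates in Table 1 are simply the number of successful participants divided by the total number of participants for each task. A minimal sketch of that calculation, using the binary success values from the table:

```python
# Binary success matrix from Table 1: rows = participants 1-5, columns = Tasks 1-6
results = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
]

for task in range(6):
    successes = sum(row[task] for row in results)
    rate = 100 * successes / len(results)
    print(f"Task {task + 1}: {successes}/{len(results)} = {rate:.0f}%")
# Task 1: 4/5 = 80%   Task 2: 5/5 = 100%  Task 3: 3/5 = 60%
# Task 4: 4/5 = 80%   Task 5: 5/5 = 100%  Task 6: 5/5 = 100%
```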
Time on Task
Table 2 shows the time spent on each task, irrespective of whether the user completed the task successfully or not. Some tasks were inherently more difficult and took longer to complete than others, and this is reflected in the average time on task.
Task 3, which required participants to enter information about their dataset and the costs associated with storing it, understandably took the longest time to complete (mean = 755 seconds, a little over 12 minutes). However, completion times ranged from 268 seconds (approximately 4.5 minutes) to 1,392 seconds (more than 23 minutes).
Task 2, which asked users to sign up for the CCT, was the second longest (mean = 176 seconds, approximately 3 minutes), closely followed by Task 4, comparing costs with peers at California Digital Library (mean = 145 seconds, more than 2 minutes).
Table 2: Time on task (in seconds)
Task | User 1 | User 2 | User 3 | User 4 | User 5 | Average
Task 1 | 158 | 70 | 122 | 120 | 30 | 100
Task 2 | 300 | 189 | 171 | 122 | 95 | 176
Task 3 | 680 | 882 | 1392 | 550 | 268 | 755 (c. 12 min)
Task 4 | 245 | 122 | 98 | 118 | 141 | 145
Task 5 | 58 | 54 | 39 | 30 | 61 | 49
Task 6 | 112 | 114 | 314 | 37 | 73 | 130
Errors
In this context an error means any unintended action, slip, mistake or omission made by participants while trying to complete the task scenarios. Table 3 summarises the test data; tasks with low completion rates, high error counts and long times on task stand out clearly. The most complex task, Task 3, attracted the highest number of errors (12).
Table 3: Summary per task in terms of completion rates, errors and time
Task | Task completion (out of 5) | Errors | Time on task (seconds)
1 | 4 | 5 | 100
2 | 5 | 0 | 176
3 | 3 | 12 | 755 (c. 12 min)
4 | 4 | 4 | 145
5 | 5 | 5 | 49
6 | 5 | 9 | 130
Usability Problems
The usability problems encountered by users were also listed and an impact score calculated for each. Impact scores were calculated by combining four levels of impact (4 = prevents task completion, 3 = causes a significant delay or frustration, 2 = has a relatively minor effect on task performance, 1 = is a suggestion) with four levels of frequency (4 = >90% of users, 3 = 51-89%, 2 = 11-50%, 1 = <10%). The resulting score was used to prioritise the issues to be solved in future versions of the CCT.
Overall we recorded 29 issues, ranging from hard-to-spot buttons and imprecise terms used in the tool (e.g. 'Related services') through to difficulties experienced by users in finding the tool on the website.
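The prioritisation itself is straightforward. As a rough sketch, assuming the impact and frequency levels are simply multiplied together (the exact combination rule is illustrative here, as are the example entries):

```python
# A rough, illustrative prioritisation of usability problems by impact score.
# Here the impact level and frequency level are simply multiplied
# (the combination rule is an assumption for illustration, not the exact one used).

IMPACT = {4: "prevents task completion",
          3: "causes a significant delay or frustration",
          2: "has a relatively minor effect on task performance",
          1: "is a suggestion"}

FREQUENCY = {4: ">90% of users", 3: "51-89%", 2: "11-50%", 1: "<10%"}

# (description, impact level, frequency level) -- invented entries based on the
# kinds of issues reported above, not the actual issue log
problems = [
    ("tool is hard to find from the homepage", 3, 3),
    ("'Related services' label is imprecise", 2, 2),
    ("save button is hard to spot when entering dataset costs", 4, 2),
]

for score, desc in sorted(((imp * freq, desc) for desc, imp, freq in problems), reverse=True):
    print(f"{score:2d}  {desc}")
# 9  tool is hard to find from the homepage
# 8  save button is hard to spot when entering dataset costs
# 4  'Related services' label is imprecise
```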
Post-test questionnaire: System Usability Scale (SUS)
We also collected a subjective assessment of system usability. At the end of each session we asked participants to rate the usability of the tool using the SUS questionnaire on a 5-point scale with endpoints of Strongly disagree (1) and Strongly agree (5). The statements covered a variety of aspects of system usability, for example the need for training ('I could use CCT without having to learn anything new'), support ('I thought that I could use CCT without the support of anyone else – colleagues, IT staff, etc.') and confidence ('I felt very confident using CCT').
SUS scores range from 0 to 100; the SUS score for the CCT was 53.5.
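For background, the conventional SUS calculation turns ten alternating positively and negatively worded statements, each rated 1-5, into a single 0-100 score. The sketch below shows that standard scheme; since the statements quoted above are positively worded adaptations, treat it as background on how SUS scoring works rather than the exact arithmetic behind the 53.5.

```python
def sus_score(responses):
    """Standard SUS scoring: ten responses on a 1-5 scale -> a 0-100 score.

    Odd-numbered items are positively worded (contribution = response - 1);
    even-numbered items are negatively worded (contribution = 5 - response).
    """
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Purely illustrative responses, not an actual participant's answers
print(sus_score([4, 2, 4, 3, 3, 2, 4, 3, 3, 4]))  # 60.0
```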
Future plans
Based on this user research and our own vision for the tool, we gathered lots of ideas for making it even better. Here I mention just some of them:
Improve the precision with which the tool calculates costs and makes comparisons between organisations:
- This means broadening the scope of the tool and asking, for example, whether the data your organisation holds is complex or simple, how often it is used, and what the staffing numbers are;
- Another is to eliminate false entries (data that distorts comparisons);
- And finally, to enable other ways of calculating costs: currently the results are displayed in euros per gigabyte per year, which may lead to misleading interpretations if your data is complex but low in volume. Alternatives are to present the results per asset type or per number of objects (a small illustration with invented numbers follows at the end of this section).
Increase user engagement: Throughout the project it has proven very difficult to get users to submit their costs data, due among other things to the sensitivity of financial information and the general lack of central support for acquiring this kind of information.
Continue improving the interface based on ongoing user testing.
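To illustrate the point about euros per gigabyte per year with some invented numbers: a small but complex collection can look wildly expensive per gigabyte even when the absolute spend is identical, whereas a per-asset figure tells a rather different story.

```python
# Invented example: two collections with the same annual curation spend.
# A per-gigabyte rate makes the small, complex collection look roughly
# 2,000x more expensive, whereas a per-asset rate narrows the gap to 5x.

collections = [
    # (name, annual cost in EUR, volume in GB, number of assets)
    ("Large, simple collection", 50_000, 100_000, 10_000),
    ("Small, complex collection", 50_000, 50, 2_000),
]

for name, cost, gb, assets in collections:
    print(f"{name}: {cost / gb:,.2f} EUR/GB/year, {cost / assets:,.2f} EUR/asset/year")
# Large, simple collection: 0.50 EUR/GB/year, 5.00 EUR/asset/year
# Small, complex collection: 1,000.00 EUR/GB/year, 25.00 EUR/asset/year
```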
Conclusion
The results of the tests were positive overall and showed that the tool was simple to use, as reflected in the high task completion rates and the low number of errors made by users. The majority of users (60%) agreed that the various functions in the tool were well integrated, found the CCT intuitive and consistent, and thought that most people would learn to use it very quickly.
However, we discovered that we could make the tool’s purpose and benefits clearer, and we have now addressed this in the new version of the CCT by providing clear explanations of how the tool works and why it should be used, coupled with good help information (explanations on mouse-over, FAQs, step-by-step guides).
We hope that by undertaking this research, the CCT has become a valuable resource that the digital curation community will embrace. This is just a short summary of the results, so tell us what you think of the CCT: use it for yourself, and get in touch with your thoughts.
Magdalena Getler, University of Edinburgh-DCC
Magdalena and the DCC lead on project task 2.2 on stakeholder analysis, building and monitoring a network of partners and stakeholders. The DCC also participates in other tasks in WP2, relating to engagement with and analysis of stakeholder initiatives, outreach, publications, events and social network utilisation.