A part of the UX gate is a set of guidelines for deciding when and how to do usability testing. This is the working document for those guidelines.
A finalized version will live in the documentation pages on drupal.org: http://drupal.org/node/1284300
When is usability testing required?
We want to test changes that make a major impact on the user experience of Drupal core.
- When your change introduces a major new interface element.
- When your solution aims to solve an existing major/critical user experience issue.
We want to prevent requiring (up front) usability testing in these cases:
- Minor or normal usability bug fixes
- When your change introduces a minor to normal new interface element.
Who should do usability testing?
Everyone in the community who is interested in conducting a usability test will be encouraged to contribute.
There will be guidance and resources to get people up to speed on the process of how to conduct a usability test.
How-to
To produce great research results, we supply a standardized methodology for the setup, analysis and reporting of usability testing. This makes it accessible for anyone to perform a usability test.
There are a bunch of things we need to discuss to make this happen.
- Overall process: Define a clear beginning and end, and a workflow for people to follow.
- Setup: Participant requirements, suggested software, test plan
- Analysis: How do we ensure all results are captured?
- Reporting: Standardized reporting by severity, problem statement and additional material (videos, screens)
Overall process
1. The contributor initiates the conversation with the UX team to determine whether the change requires usability testing.
2. In collaboration with the UX team, a methodology is adopted
3. Prep and research are conducted by the contributor and/or others; the UX team will be there to provide assistance.
4. Findings are analyzed, reported and discussed with the contributor and the UX team.
What does it mean to pass this gate?
Bojhan suggests that this should be crystal clear and near quantitative. For example, an 80% success rate for the most important tasks. Whether the issues are low-hanging fruit or core issues, they should communicate a goal.
Dharmesh agrees that it would be good to have a clear, actionable "pass" or "fail", but he is of the opinion that this is difficult with usability research. Usability research is a different beast from programming and should not be forced to quantify things in a way that is not right. Although it is possible to tie the results to one or more usability metrics, that is not enough to make an informed decision. Results have to be viewed (unfortunately) on a case-by-case basis and holistically. How a feature's UI affects core or other parts is also important to consider, and might be missed if the focus is on a metric. Therefore, a discussion is required. A good severity rating would help somewhat to ease the process, but it is still not definitive. This would certainly add more work, but he would rather do more work and do it right than do it in a half-baked way. Over time, as the process matures, there is scope to streamline it; it might not work to streamline right from the get-go only to realize that it does not work. He also understands that contributors would want clear, actionable goals for what makes a pass - but providing those would be a flawed process.
Questions for the team:
- It is not possible to do usability testing for everything. Having a heuristic evaluation of the UI might help expedite the process. My concern is that there might be push-back from the community about the authenticity of the data, but this is a valid and widely adopted methodology with its own merits and demerits. Thoughts?
Section #1: Severity Ratings
Once the issues are identified, they need to be prioritized. For this, we should have a more robust system in place. I (like many other usability people) am inclined to use severity ratings. There are several of them in use: Jakob Nielsen's scale is pretty effective.
More information on severity ratings:
“Severity ratings can be used to allocate the most resources to fix the most serious problems and can also provide a rough estimate of the need for additional usability efforts. If the severity ratings indicate that several disastrous usability problems remain in an interface, it will probably be unadvisable to release it. But one might decide to go ahead with the release of a system with several usability problems if they are all judged as being cosmetic in nature.”
- The severity of a usability problem is a combination of three factors:
- The frequency with which the problem occurs: Is it common or rare?
- The impact of the problem if it occurs: Will it be easy or difficult for the users to overcome?
- The persistence of the problem: Is it a one-time problem that users can overcome once they know about it or will users repeatedly be bothered by the problem?
- Finally, of course, one needs to assess the market impact of the problem since certain usability problems can have a devastating effect on the popularity of a product, even if they are "objectively" quite easy to overcome. Even though severity has several components, it is common to combine all aspects of severity in a single severity rating as an overall assessment of each usability problem in order to facilitate prioritizing and decision-making.
The following 0 to 4 rating scale can be used to rate the severity of usability problems:
0 = I don't agree that this is a usability problem at all
1 = Cosmetic problem only: need not be fixed unless extra time is available on project
2 = Minor usability problem: fixing this should be given low priority
3 = Major usability problem: important to fix, so should be given high priority
4 = Usability catastrophe: imperative to fix this before product can be released”
Link: http://www.useit.com/papers/heuristic/severityrating.html
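To make the three factors and the 0-4 scale above concrete, here is a purely illustrative sketch of a triage helper. Note that Nielsen combines frequency, impact and persistence by expert judgment, not by a formula; the averaging scheme, function name and thresholds below are our own assumptions for demonstration only.

```python
# Illustrative sketch only: the weighting below is a hypothetical
# stand-in for the expert judgment Nielsen's method actually uses.

SEVERITY_LABELS = {
    0: "Not a usability problem",
    1: "Cosmetic: fix if extra time is available",
    2: "Minor: low priority",
    3: "Major: high priority",
    4: "Catastrophe: must fix before release",
}

def combined_severity(frequency, impact, persistence):
    """Map three 0-1 factor scores to a single 0-4 severity rating.

    frequency:   how often the problem occurs (0 = rare, 1 = constant)
    impact:      how hard it is to overcome   (0 = trivial, 1 = blocking)
    persistence: does it keep recurring?      (0 = one-time, 1 = every time)
    """
    score = (frequency + impact + persistence) / 3  # naive average
    return min(4, round(score * 4))

rating = combined_severity(frequency=0.9, impact=0.8, persistence=0.7)
print(rating, "-", SEVERITY_LABELS[rating])  # 3 - Major: high priority
```

In practice an evaluator would simply assign the 0-4 rating directly; the point of the sketch is only that one number per problem makes prioritizing and decision-making easier.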
Section #2: Resources
- Usability Report Template
- How to categorize usability issues? (Severity of the issues)
- How to conduct a usability study?
  - Moderated Usability Study
  - Unmoderated Usability Study
Comments
Initial feedback
Thanks for kick-starting this discussion. For me it's very important that we get the fundamentals down; for far too long we have held confusing standards on what should be usability tested before getting in, and were rarely able to enforce them. To me the primary points of discussion are:
But let me start by addressing some of the questions you posed; let's keep in mind that we need to work towards a document that can be used as a handbook to reference in the gate.
Ideally it is required for all UI changes, but that is obviously not feasible. I am on the fence about whether we should intentionally word this subjectively and just say "Any patch that makes major changes to the user interface." Or be more specific, and say for example that this means we test new concepts (block UI overhaul, module page overhaul, new IA) - but not smaller changes like copy changes or changing small patterns (machine name).
This part confuses me, you pose many different questions. I think for the contributor the process could look something like:
1) A conversation is started with the UX-team to determine whether it needs to be tested and when the appropriate moment is.
2) In collaboration with the UX-Team an approach is determined (methodology, participants, scope, length, tasks).
3) Participants are recruited
4) Research is performed by the contributor and/or others with possible assistance of the UX-Team.
5) Findings are reported back, conforming to the standards
The big question is to what extent the UX-Team is involved. I believe we want to involve more contributors in the actual testing, to increase the impact of our findings and to educate more contributors in doing this. Ideally our role is mostly advisory, jumping in to perform testing ourselves only for mission-critical parts.
This should be crystal clear, and near quantitative. For example, an 80% success rate for the most important tasks. Whether the issues are low-hanging fruit or fundamental problems, we should communicate a clear goal.
Anyone! But with some advised UX-team supervision for mission-critical parts.
I think a heuristic evaluation might be very tough for the community to come to terms with. Although it is a widely adopted technique, the discussion it triggers might be more worthwhile to just solve with real users. I do not see heuristic evaluation as an alternative to usability testing: usability testing is about seeing what your users do, whereas heuristic evaluation is predicting what your users do. It is a technique we can apply in other ways, for example in patch reviews, rather than a way to pass this gate.
Yes, important - but let's keep the discussion around those for when we discuss the specifics of reporting?
Dreaming: I envision that we have one book that outlines the process for this gate requirement, and some additional books to assist the contributor in making many of the considerations.
Determine what qualifies as a "pass" in usability testing
To me this is the most important question: what do we qualify as a pass? Do we look at success rate, number of participants, realistic tasks, etc.? How do we handle patches that don't pass?
Provide documentation on how to set up a usability test that counts
If we want to allow others to do most of the work, we need documentation that clearly lays out which steps are required to get to a valid test.
These are my rather random thoughts so far; it's 3 AM here so I should probably stop rambling :)
Draft 2 posted
Based on your feedback, I posted Draft 2 with my comments.
For heuristic evaluation: agree! agree! But it is not as if we have been doing usability testing since the dawn of Drupal. Even the UX team was doing reviews of the UI before that, wasn't it?
Also, usability research goes beyond usability testing. We are trying to improve the usability of the product - that is the core. I do not agree with you when you describe heuristic evaluation as "predicting what our users do". Heuristic evaluation is done by a usability expert, who will (or should) be able to identify the flaws in the interaction and point them out. I am in no way suggesting that we should do heuristic evaluation instead of usability testing; it depends on the scope of the project and problems. We need to create awareness bit by bit, just the way we have done with usability testing.
About documentation, I agree that it is absolutely necessary. I wonder whether a video in conjunction might be more effective. At the least, if we are the advocates of user experience, the experience of going through our gate should be a pleasant one.
Dharmesh Mistry
UX Researcher | Acquia, Inc.
A request for process, can
A request for the process: can you respond through comments and not in the wiki? It's a bit tough for me, but especially for others, to follow the discussion otherwise.
Actually the UX-Team didn't exist at all before we did the first usability testing. I agree with your points on heuristic evaluation, we simply have to find places where we can apply it and demonstrate its strengths.
What does it mean to pass this gate? What can we do to make this clear? Although I agree that anything quantitative would be fooling ourselves, for this gate to work we need a very clear standard. "Requires discussion" is not clear. Could we have something like "There are no major+ usability issues"? Obviously we should be able to deviate from this on a case-by-case basis. (This might be something for our IRC meeting.)
About documentation, should we make a list of things we need to document?
What does it mean to pass
What does it mean to pass this gate? What can we do to make this clear? Although I agree that anything quantitative would be fooling ourselves, for this gate to work we need a very clear standard. "Requires discussion" is not clear. Could we have something like "There are no major+ usability issues"? Obviously we should be able to deviate from this on a case-by-case basis. (This might be something for our IRC meeting.)
Certainly this is a bigger decision. Awaiting comments and thoughts from other awesome people.
About documentation, should we make a list of things we need to document?
That's what the Resources section in the wiki is for. I would suggest putting your initials when editing that section, so it is easily understood who added what.
We spend the last hour
We spent the last hour discussing on IRC what this gate should mean. The primary conclusion was that we shouldn't expect contributors to meet this gate too often: the effort involved in doing a usability test is still a bit much for a weekend Drupal core contribution. However, it is reasonable to assume that for really large functionality, where the UX-team is often involved, this gate can apply.
So ideally each review process would first go through the principle and guideline gate, and review from the UX-Team before it would need to meet this gate. An example use case would be the new IA, that would first go through all the other gates and then undergo testing.
We concluded that any more quantitative measure, even a threshold, would be misleading, and that we should focus on making "passing" this gate a discussion point - given that it's primarily for major UI changes, that is reasonable to assume.
For the coming weeks we will focus on writing out the process: what it means to pass, and which qualifiers there are for a valid test. How to set up a test and perform testing will be documented after London.
Draft 3
Description:
To provide usability feedback through heuristic evaluation, usability testing or another method, depending upon the scope of the UI change.
Given the nature of usability research, this gate does not have "pass" or "fail" but has prioritized issues that need to be addressed.
When is this needed?
For any changes that affect the UI for core. Major contribs are also encouraged to use this gate.
Details
Based on the scope and impact of the UI change, the UX team will suggest a usability research methodology (usability testing, heuristic evaluation or other) to give feedback. Based on the evaluation, issues will be prioritized by the severity of the problem (frequency, impact and scope), followed by a discussion/meeting on possible solutions.
Contributors with the relevant skill set can conduct usability tests; the UX team, along with resource documentation, will be available to assist in the process. Heuristic evaluation will for now be limited to core UX team members.
Resources
How to conduct usability tests?
Template for usability reports
Prioritizing usability issues
Usability for the overwhelmed, overworked Drupalist
I have no idea whether any pieces of the following will be useful, or whether they even approximate reality in some cases, but I wanted to propose a short HOWTO document that devs/designers can use to understand and execute, if needed, the testing sub-gate. Critique, modification, and even slash-and-burn if the doc is just not realistic are all appreciated. Anyone who reads this, please realize that this is only a draft put here for comment and might not be acceptable in any form or fashion by the Drupal community...
Keep in mind that the goal is not teaching someone how to be a professional usability tester, but how to essentially empower people to do their own first-pass sanity check.
USABILITY FOR THE OVERWHELMED, OVERWORKED DRUPALIST
How to goof-proof your code and avoid the issues queue
It’s a pretty typical scenario: you’re a Drupal code contributor, which means that in addition to your day job you also juggle one or more hunks of Drupal core, contributed modules, and/or themes, squeezing in things wherever you can. You’ve managed to figure out most of the Drupal Way – quite a challenge – and you have a healthy commit record. But lurking in the issue queue are those sometimes-vague issues against your patches – issues that are hard to pin down, hard to replicate, hard to debug. Just what you need, right? Not so much. But never fear; there’s a tried and true way to flush out those phantom bugs that always seem to crop up where the human meets the machine, freeing you up for more hacking and less issue queue hawking.
Usability – a word that traditionally has just meant “annoyance” to many developers – is the art and science of making systems easier, more intuitive, and more satisfactory to use by the intended audience. That last part is important – the intended audience – because we often get into trouble when we write code and develop user interfaces that are how we would personally like them to behave, instead of seeing how other users react to our creations. The process of seeing how others react to your creation is called usability testing, and it can either be quite complicated and expensive to pull off, or it can be done without tears, by even the most frenetic of developers. This sort of guerrilla usability is what this article focuses on.
Sometimes, a small patch to existing code does not necessitate this seeing-it-from-someone-else’s’-perspective; sometimes it does. If you have been keeping up with the Drupal 8 Gates (http://drupal.org/node/1203498) – a principle introduced and endorsed by Dries – you know that making Drupal 8 the most user-friendly and stable version of Drupal ever is a high priority. The Gates are intended to foster the capability in developers to self-assess code for functionality and form before it ever hits the repository, and usability is one of the Gates. Unfortunately, usability is often seen as a tedious and difficult process that takes away from time that could be spent on more interesting things. The good news is that you can pass the Drupal 8 usability verification gate with a minimal amount of effort that pays off in spades once people are happily and effectively using your creation.
Do I need to pass the usability testing gate?
#drupal-usability and we will help out. Proceed to step 5 once you have results.
How to do a heuristic evaluation, Drupal-Style
For each of the following heuristics, take a moment to decide if your code changes interfere with them. If not, check each off as you go. If any one of these seems to fail, we are happy to take a look in #drupal-usability.
How to do guerrilla usability testing, Drupal-style
#drupal-usability and we can help.
While some people in the usability community would argue that this sort of testing done by non-usability professionals is bound to be haphazard and empirically flawed, the reality of life in an Open Source project is that there aren't enough trained professionals to go around, and empowering people to perform first-tier testing - where some is vastly better than none - is key to a better Drupal.
Good progress here. I like to
Good progress here.
I like to think that the call for 'needs usability testing' is mostly made by UX people. Or at least, if people think they need to test (and want to), we can come by and advise on how to test what.
mpearrow: that's a very yummy write-up. Handbook material.
Wow!
This is a great start. Very happy to see this shaping so well!
My comments:
How to do a heuristic evaluation, Drupal-Style
For heuristic evaluations, I think what we agreed is that for starters the Drupal UX members will perform the evaluation and post the report. Although the points mentioned in the evaluation guidelines are good, I feel there are a lot more that should be added. Several versions of heuristic evaluation are used by usability professionals. One of the widely used ones is Jakob Nielsen's set of principles. http://www.useit.com/papers/heuristic/heuristic_list.html
These are ten general principles for user interface design. They are called "heuristics" because they are more in the nature of rules of thumb than specific usability guidelines.
Visibility of system status
The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
Match between system and the real world
The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
User control and freedom
Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
Consistency and standards
Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.
Error prevention
Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
Recognition rather than recall
Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
Flexibility and efficiency of use
Accelerators -- unseen by the novice user -- may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
Aesthetic and minimalist design
Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
Help users recognize, diagnose, and recover from errors
Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
Help and documentation
Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.
We should modify for Drupal and incorporate these as well.
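For reviewers who prefer a checklist, the ten heuristics above could be kept as plain data and walked through per patch. This is only a sketch of one possible format; the data structure, function name and report layout are assumptions of ours, not an agreed Drupal process:

```python
# Hypothetical checklist walker for Nielsen's ten heuristics.
# The structure and report format are illustrative assumptions only.

NIELSEN_HEURISTICS = [
    "Visibility of system status",
    "Match between system and the real world",
    "User control and freedom",
    "Consistency and standards",
    "Error prevention",
    "Recognition rather than recall",
    "Flexibility and efficiency of use",
    "Aesthetic and minimalist design",
    "Help users recognize, diagnose, and recover from errors",
    "Help and documentation",
]

def evaluation_report(findings):
    """Render findings ({heuristic: note or None}) as one text line each."""
    lines = []
    for heuristic in NIELSEN_HEURISTICS:
        note = findings.get(heuristic)
        status = "FAIL" if note else "ok"
        lines.append(f"[{status}] {heuristic}" + (f" -- {note}" if note else ""))
    return lines

# Example: a review that flags one heuristic violation.
report = evaluation_report({
    "Error prevention": "No confirmation before deleting a content type",
})
print("\n".join(report))
```

A format like this would make reports comparable across reviewers, which matters if severity ratings are later attached to each flagged item.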
How to do usability guerrilla style?
Although it sounds vague, we should not give exact numbers (number of tasks, number of users, etc.) because it depends on the use case. Different use cases warrant different methodologies and different segmentation of the user profiles, based on the impact. Since this is such a grey area, usability folks at #drupal-usability will be there to assist. Also, depending on the use case and the availability of resources, the usability test could be performed by the UX team, the developer, or someone from the community. The reason I stress this is that we want to make sure we do the research in the right way, so that it does not backfire on the project.
Hope all this makes sense :)
Follow up
Hi all,
Thinking aloud here a bit - What is the general feeling about the degree to which we want to make each passage through the usability gate a function of the Drupal UX team's availability? I have been working under the impression that we want to make the Gates as much self-service as possible, making interaction with the UX team a requirement less of the time rather than more, just due to the constraints on peoples' availability. If we can afford having more traffic through the UX team's queue, though, that opens things up a bit in terms of the degree of detail we can prescribe.
I'm certainly open to listing more heuristics - the reason I used fewer than the canonical ten from Grandad Jakob is that several of those ten are variations on others in the set, and some are plain confusing to non-usability people. I tried to list the ones I figured were the most salient and understandable to the non-usability folks who would be considering them - hopefully, with the side effect of being less overwhelming and more likely to get used :) I do think that the heuristics absolutely need Drupal-specific examples next to each, to help overcome that "less is beautiful abstract emptiness".
We could potentially have a laundry list of suggested heuristics that are specifically focused for Drupal and for consumption by people who are focusing on usability, and a subset of these - a condensed-and-prioritized list for others that will let them triage reasonably well and catch most of the show-stopper types of issues.
Meeting results
In the meeting we discussed this part of the UX gate. Currently the discussion is lacking focus a bit. We discussed the heuristics briefly and concluded we should initially focus on the questions "When to do usability testing?", "Who should do usability testing?" and "How to do usability testing?".
Although heuristics are definitely something we want to discuss more, for the sake of getting an initial version of this gate finished we might want to focus on answering the when, who and how questions.
From our discussion:
"When to do usability testing?"
We perform usability testing to see real user behaviour with new or existing interface elements. We use test results to inform our decision-making process on what is usable and useful and what isn't. Typically we perform usability testing on major features that significantly impact the user experience of all our users, and preferably this is done on prototypes.
We want to prevent requiring (up front) usability testing in these cases:
"Who should do usability testing?"
Everyone in the community who is interested in conducting a usability test will be encouraged to contribute. Just bring passion; we'll provide you with guidance and resources to get up to speed on how to conduct a usability test.
Companies, individuals and meetup/user groups are encouraged to conduct usability testing.
"How to do usability testing"
We concluded this needs more discussion: what qualities do we look for, and what documentation will we supply to create consistent setup, analysis and reporting?
Fantastico!
Just one thing...
"We want to prevent requiring (up front) usability testing in these cases:
This is a bit unclear. What do you mean by this? Are we saying we should not test something that is deemed "controversial"?
controversial or not
Controversial might mean that there is a 300-comment issue with strong pros and cons building up, like some issues in D7UX.
For me those would be exactly the issues where one should say: stop discussing, we don't know. Conduct a user test to get actual data and a basis for deciding whether feature x helps or harms usability.
But maybe Bojhan means something else altogether by controversial.
Life is a journey, not a destination
@dcmistry We should use
@dcmistry We should use usability testing to inform our decisions, not to make decisions for us. Whenever we employ usability testing as a binary tool in controversial discussions, we create a culture of having to "prove" everything. That doesn't really build trust, and more importantly it breaks down on the fact that we do not have the resources to do so.
I do wish to note that any major user interface change is obviously exempt from this rule. Those will almost always become somewhat or very controversial, and usability testing is required to build understanding of their impact.
Last meeting attendees were
Last meeting's attendees were Bojhan and yoroy, joined by svenryen (look for maxus in IRC), who tracked the discussion and helped review and edit our latest draft. Thanks Sven!
My version of that second guideline.
We want to prevent requiring (up front) usability testing in these cases:
- Minor or normal usability bug fixes
- Requiring testing for making a design decision on a 'controversial' new interface element. This applies to minor and normal issues only.
Still needs work for clarity and brevity. What we want to convey is that this is about verifying changes, and making sure that minor and normal feature requests and tasks can be committed without (solid) prior usability testing. We don't have the resources to test everything, and this guideline would help make the best use of the testing that does get done: verifying the major and critical ones first. It will also ensure room for experimentation. Design principles and product vision would be the guidelines that help design smarter defaults and specific features.
How about this? Priority for
How about this?
Priority for usability testing would be given to changes which significantly affect the user interface. Minor and normal feature requests can be committed without (solid) prior usability testing. This is not ideal, but owing to resource and time constraints we have to prioritize.
Mind.rand()
After tonight's IRC session and after reading this thread over several times, I have some observations. Some of these are probably totally obvious to everyone but me :) Caveat emptor.
I don't think this problem is specific to the UX group, but we need to figure out how other groups manage to overcome the pocketing effect of distributed and asynchronous communications.
If we have limited resources and have to prioritize, it makes sense that core should get top priority, and then maybe contrib in descending order of install-base-size. It follows that the severity rating that yoroy mentions is a function of the size of the affected population, and that can be a metric we use for determining triage order.
I'm not sure what the best way to disseminate that kind of info would be? Handbook? PDF?
Angie noted "Requiring
Angie noted, regarding "Requiring testing for making a design decision on a 'controversial' new interface element. This applies to minor and normal issues only.", that the exemption for minor or normal usability bug fixes should already cover this, so there is no need to emphasize it.
UX-Meeting update
During our UX meeting today we discussed what should be part of the guide and made a big step towards actually getting this documented. A book has been created at http://drupal.org/node/1284300 which needs a lot of work. Most of the outstanding to-dos are listed below. It's likely we will run into some discussion items, which should just be discussed in this topic.
The audience of this book is people who have identified something they want to test through discussion on d.o, and who move to this book to find out how.
Intro
The intro should make clear why testing is important, and set the expectation that you can do it, too!
We need testing for informed decisions; reporting is key for community action.
What to test?
Minor edit, to make sure it doesn't come off as if we don't care about minor or normal usability bug fixes.
Test plan (setup)
What is the focus? What are the goals? (derived from the issue/discussion)
Why is it important? (derived from the issue/discussion)
What needs to be tested? (consider adding an example case)
How should we test it? (scenario, tasks)
Recruiting
How many should you recruit?
Whom should you recruit?
How/where should you recruit?
Testing
How should you test? Timeline: 0-5 minutes contextual, 5-25 minutes tasks, 25-30 minutes outtake
Suggestions: How to be a good moderator
Analysis
How to analyze the results?
How do you prioritize what needs to be fixed? (How to rank severity)
Reporting
How to report so it gets community attention
When/how to move things to the issue queue.
Paw-gress!
This is looking good. I am cautiously optimistic that we are close to finalizing the gate. Of course, documenting this will take a bit of time, but I am sure I will find some time to document.
So because the handbook is
Because the handbook is getting really long and hard to keep an overview of, we split up the Verify documentation into:
- Verify (why, when, process)
  - How-to
  - Communicate
To clarify: dcmistry has
To clarify: dcmistry has written a solid first draft for the 'how to usability test' guide here: http://drupal.org/node/1284300
We discussed it during the UX open hour today, and Bojhan's outline is what we came up with for how to split that page up into a 'Verify' page (why & when to test) with sub-pages for 'How to do it' and 'Communicate the results effectively with the community'.
dcmistry and bojhan will give it another workout on the content and I'll be their editor and format things when they are done :)
I got some feedback already about "too long", so I'll try and write executive summaries for each page.
Some more eyes on it would be good though: http://drupal.org/node/1284300 Just edit the page if you want to improve sentences, grammar etc. Discuss larger concerns in here.
This part of the usability gate is shaping up very well! Thanks dcmistry