By Philip Remedios | CEO, Director of Design & Development
User preference and evaluation studies have been a mainstay design tool in medical device development for decades. To veteran engineers, they are simply good practice even when not specifically required by regulatory bodies. However, these studies also offer opportunities to significantly enhance output quality and to clarify the input requirements that drive the formal design process. Understanding and solving unexpected usability aspirations often produces powerful and valuable intellectual property that can differentiate a product and protect it from competition.
Ideally, some level of “discovery” research has already been conducted to collect the design criteria that will ultimately determine the device’s commercial success. At this stage, pragmatic prioritization of critical versus non-critical features and functions is usually not yet clear. Additionally, early user research often paints a qualitative, non-linear picture of the problem statements, requiring an actionable assessment of which individual concepts hold the most value. Presenting the best collection of these early concepts for preference feedback allows the R&D team to subsequently converge disparate sub-systems into a harmonious, unified design both rapidly and efficiently, minimizing the risk and impact of downstream deviation.
The study also provides an opportunity to fill gaps in design requirements or quantify system specifications through further probing. It helps establish the features and functions for the minimally viable product and rank additional features in order of their value versus negative impact to determine additions to the requirements, finalize system specifications, and provide insights for future product planning considerations.
A methodical, well-thought-out protocol is the backbone of a successful evaluation study. Its purpose is to sharpen the vision of what the emerging design concept must be and which factors will make it acceptable, and ideally favorable, in the intended market. Develop an interview discussion guide that reviews the features and variations between models, randomizing the review order where possible. Test simpler, less capable options first to qualify adequacy of function or effectiveness, as users will always want better, faster, and smaller if given the choice. Frontload quantitative questions so answers aren’t colored by the qualitative responses gathered during the subsequent study model review. A simple 1-5 Likert scale can be deployed where necessary to gauge the strength of a preference, followed by probing to understand the user’s rationale and motivation. Ensure the relative weighting between feature options is consistent (is Option A really a 5 compared to Option B, which scored a 3, and Option C, a 2?).
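Two of the mechanics above, randomizing the review order per participant and sanity-checking 1-5 Likert ratings across options, can be sketched in a few lines. The model names and ratings below are illustrative assumptions, not data from any actual study.

```python
import random
from statistics import mean

def randomized_review_order(models, seed=None):
    """Return a shuffled copy so each participant reviews models in a random order."""
    order = list(models)
    random.Random(seed).shuffle(order)
    return order

def likert_summary(scores):
    """Average 1-5 Likert ratings per model so relative weighting can be compared.

    scores: dict mapping model name -> list of per-participant ratings.
    """
    summary = {}
    for model, ratings in scores.items():
        assert all(1 <= r <= 5 for r in ratings), "Likert ratings must be 1-5"
        summary[model] = round(mean(ratings), 2)
    return summary

# Example: did Option A really earn a 5 relative to Options B and C?
ratings = {"Model A": [5, 4, 5, 4], "Model B": [3, 3, 2, 3], "Model C": [2, 2, 1, 2]}
print(likert_summary(ratings))  # {'Model A': 4.5, 'Model B': 2.75, 'Model C': 1.75}
```

Seeding the shuffle per participant keeps the randomization reproducible for the study record while still varying the order between sessions.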
The key to maximizing the study’s value is structuring the content and development of the response stimuli. These discrete simulators and study models are generally representative of the required size, volume, and relevant configurations, with fidelity driven by technical and cost limitations. Each stimulus should demonstrate disparate features without introducing qualitative flavoring such as individual styling, colors, materials, finishes, and graphics. They may include dynamic display mockups and kinematic function if deemed important for accurate evaluation. This can include graphical interface screens and navigation, touchpoint access, door deployment, attachment of sub-systems (disposable to handpiece), representative weight (if portable), and ergonomic shaping (if handheld).
Another important planning task is determining the primary, secondary, and tertiary end-user groups (e.g., surgeon, nurse, biotech, buyer, patient) in weighted proportions to obtain appropriate feedback. Plan to interview approximately n=4-7 from each user group who will operate or influence the device design, and recruit at least one more per group to cover inevitable cancellations. If a user group is homogenous in applicable attributes, skew to the bottom of that range; if it is diverse, recruit seven or more. If possible, recruit from varying institutional settings (teaching, private, or public hospital versus outpatient clinic) based on projected market opportunities. Ensure the participant screener avoids wasted effort recruiting under-qualified users. Many professional agencies can assist with recruitment, which can be extremely challenging in the clinical space.
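The sizing rule above, interview 4-7 per group depending on diversity and recruit one spare each for cancellations, reduces to simple arithmetic. The group names and diversity flags below are illustrative assumptions for the sketch.

```python
def recruitment_target(is_diverse, low=4, high=7, buffer=1):
    """Interview target plus recruiting buffer for one user group.

    Homogenous groups skew to the low end of the n=4-7 range; diverse
    groups to the high end. The buffer covers inevitable cancellations.
    """
    interviews = high if is_diverse else low
    return {"interviews": interviews, "recruit": interviews + buffer}

# Hypothetical user groups with diversity judgments (True = diverse).
groups = {"surgeon": True, "nurse": True, "biotech": False, "buyer": False}
plan = {name: recruitment_target(diverse) for name, diverse in groups.items()}
total_recruits = sum(p["recruit"] for p in plan.values())
print(plan)
print("Total to recruit:", total_recruits)  # 8 + 8 + 5 + 5 = 26
```

Totaling the buffered figures up front also gives a realistic honoraria budget before recruitment begins.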
Honoraria are standard practice in the industry to encourage study participation. It is important to understand current rates so that high-quality participants can be secured. Some corporations have established “fair-market values” for legal reasons but fail to keep them up to date, which severely hampers recruitment. If in doubt, err on the generous side to ensure the best outcome; after all, the added cost to the project is a drop in the ocean compared to the overall R&D budget for such a critical function.
Keep the interview team to a moderator and a notetaker, with the possibility of a third person to support stimuli management if necessary. At BlackHägen Design, the moderator is typically a usability specialist and the notetaker an industrial designer or engineer. The combination of these disciplines improves the quality of the analysis and recommendations post-interview. Additional observers can be accommodated behind one-way glass or via online live-streaming video. Allow ample time between sessions to reset (and potentially repair) the stimuli and to absorb tardy sessions. Consider adding an initial training or familiarization exercise if the device or procedure is not widely practiced or differs from the standard, although this may diminish the opportunity to evaluate the intuitiveness of a user’s first discovery interaction with an unfamiliar interface.
Plan for each interview to last no more than 60-75 minutes to avoid fatigue. If the testing content requires more time, it may be necessary to plan two separate preference studies.
Keep the stimuli/models under wraps until they are ready to review, deploying one at a time to avoid distraction. Previously viewed stimuli can be left uncovered for comparison.
The interviews should be conducted in a uniform, repeatable manner, with questions structured so they do not introduce bias. Note-taking should be tracked to the activities in the study to ease analysis efforts. Advanced tools such as eye-tracking software and physiological monitoring (blood pressure, respiration, galvanic skin response) can help match emotional and stress responses to tasks.
The analysis phase will determine the study’s overall effectiveness, so it is vital to extract accurate, traceable data to identify trends in preferences and user needs. Because this research is highly qualitative and interpretive, look for emerging preference patterns before drawing conclusions. Sometimes it becomes necessary to add participants to the initial population until these patterns become indisputable. However, the small number of datapoints will never be statistically significant and shouldn’t be confused with quantitative research techniques.
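The pattern-spotting described above can be sketched as a simple tally with a judgment-call threshold. The responses and the 60% margin below are illustrative assumptions; as the text stresses, this is not a statistical significance test.

```python
from collections import Counter

def preference_pattern(responses, clear_margin=0.6):
    """Tally stated preferences and flag whether a pattern has clearly emerged.

    responses: list of option names, one per participant.
    clear_margin: fraction of participants required to call the pattern clear
                  (a judgment call, not a statistical test).
    """
    counts = Counter(responses)
    leader, votes = counts.most_common(1)[0]
    share = votes / len(responses)
    return {"leader": leader, "share": round(share, 2), "clear": share >= clear_margin}

# Example: 5 of 7 participants preferred a hypothetical "side door" option.
print(preference_pattern(["side", "side", "top", "side", "side", "top", "side"]))
# {'leader': 'side', 'share': 0.71, 'clear': True}
```

When `clear` stays False, that is the cue to add participants until the leading preference either consolidates or the options prove genuinely contested.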
Ensure you understand the “why” behind each preference: the true user need, not a suggested solution. The moderator must continue probing after a participant indicates a preference in order to fully understand the reasoning behind it. The ultimate device should be designed by professionals to meet the need, not merely to implement the suggested preference. Distill the whys into a set of user needs that project stakeholders will integrate into the formal design requirements. Balancing those requirements and user needs against feedback from the preference study and other user research determines their suitability for function, use-safety, and ease-of-use.
Carefully interpret all critical user requirements from previous contextual inquiry and marketing studies into a few well-segmented, similarly executed stimuli and study model configurations that test variations in features, functions, and interfaces to determine the best solutions. Leave qualitative flavoring out of the equation; all models should be as consistent in execution as possible. Don’t ask leading questions or suggest answers; encourage participants to respond in their own words and with their own emphasis. If the protocol is well structured and the stimuli are of sufficient resolution and quality, the design team should have little difficulty selecting the best features and interfaces as they head into the hardcore design development phases. Balancing user preferences against risk, cost, and reliability remains the next hurdle in determining where compromises may be needed to ensure the final design delivers the best value for its intended use.