Overview
Designed a collaborative data review interface for IBM's InstructLab, an open-source project training enterprise-level LLMs with synthetically generated data on WatsonX AI. Led UX research and prototyping to solve the critical problem: "How might we allow teams of reviewers to efficiently and collaboratively approve or deny sets of synthetic data?"
Built Figma prototype featuring dual-view design (list + modular), collaborative commenting, and integrated reference documents. Conducted user testing with IBM developers and presented to senior stakeholders from UX, product management, and development teams, receiving validation as "a big step forward from what we've been doing in the past — a large improvement."
The Problem
InstructLab trains enterprise LLMs with synthetically generated data, but the existing workflow had no consistent way to review hundreds of datasets. Teams relied on irregular manual review processes: transferring data into CSV/JSON files, or using command-line interfaces that created barriers for non-technical reviewers.
This inconsistency delayed model training and left gaps in quality. IBM developers reported: "If I'm unsure about the process or answer, I just hope that someone gets around to reviewing the question."
Key Pain Points
No Standard Review Process
There was no standard location or process for review, which made review and collaboration across teams especially difficult.
Decision Uncertainty
Users were often unsure about their decisions, leaving some questions partially or completely unreviewed.
Lack of Collaboration Tools
Despite the process being inherently collaborative and involving multiple reviewers, users were left on their own to assign or review questions.
Design Solution
Working with continuous feedback from two WatsonX UX designers and one InstructLab developer, we designed screens encompassing three prioritized features: Collaborative Team Tools, List and Modular Views, and Approve/Deny/Edit functionalities.
Key Features
List and Modular Views
List view offers faster, streamlined review with basic approve/deny functions for quick workflows. Modular view provides comprehensive review with editing capabilities, collaborative commenting, and side-by-side reference document access.
Collaborative Team Tooling
Enables reviewers to discuss synthetic data with commenting and tagging features. Users can tag teammates for help or additional perspectives, with comments appearing in the tagged reviewer's "to review" feed.
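The tagging flow described above can be sketched in a few lines. This is a minimal illustration, not the prototype's implementation (the prototype was built in Figma): the `@name` mention syntax, the `Comment` and `ReviewFeeds` names, and the in-memory storage are all assumptions made for the example.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Assumed mention syntax: "@" followed by a reviewer handle.
MENTION = re.compile(r"@(\w+)")


@dataclass(frozen=True)
class Comment:
    author: str
    item_id: str  # hypothetical identifier of the Q&A pair under discussion
    text: str


class ReviewFeeds:
    """Routes each comment to the "to review" feed of every tagged reviewer."""

    def __init__(self) -> None:
        self._feeds: dict[str, list[Comment]] = defaultdict(list)

    def post(self, comment: Comment) -> list[str]:
        """Store the comment and return the reviewers it was routed to."""
        tagged: list[str] = []
        for name in MENTION.findall(comment.text):
            # Skip self-mentions and repeated mentions of the same reviewer.
            if name != comment.author and name not in tagged:
                tagged.append(name)
                self._feeds[name].append(comment)
        return tagged

    def to_review(self, reviewer: str) -> list[Comment]:
        """Return a copy of the reviewer's feed, oldest comment first."""
        return list(self._feeds.get(reviewer, []))
```

Posting a comment that tags two teammates would place it in both of their feeds and leave the author's feed untouched.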
Reference Document Integration
Reviewers have direct access to source documents with PDF search tools, allowing accurate fact-checking of Q&A pairs against ground-truth data in real time.
Approve, Deny, and Edit Functionalities
Core review actions: approve accurate questions, deny unsatisfactory ones, or edit questions to align with intended results before final approval.
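One way to picture the three review actions is as a small state model. The sketch below is illustrative only and rests on assumptions: the `QAPair` and `Status` names are hypothetical, and the rule that an edit returns an item to a pending state is one reading of "edit ... before final approval", not a documented behavior of the design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class QAPair:
    question: str
    answer: str
    status: Status = Status.PENDING
    edited: bool = False


def approve(pair: QAPair) -> QAPair:
    """Mark an accurate Q&A pair as approved."""
    pair.status = Status.APPROVED
    return pair


def deny(pair: QAPair) -> QAPair:
    """Mark an unsatisfactory Q&A pair as denied."""
    pair.status = Status.DENIED
    return pair


def edit(pair: QAPair, question: Optional[str] = None,
         answer: Optional[str] = None) -> QAPair:
    """Revise the text; the pair goes back to pending until it is approved."""
    if question is not None:
        pair.question = question
    if answer is not None:
        pair.answer = answer
    pair.edited = True
    pair.status = Status.PENDING  # assumption: edits still need final approval
    return pair
```

Under this model, a reviewer who corrects an answer calls `edit` and then `approve`, so the stored record shows both that the pair was changed and that the changed version was accepted.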
User Research & Testing
Conducted user testing sessions with IBM software developers who work with the existing synthetic data generation process. Breaking into small groups (one developer + three designers), we gathered insights on current workflows, prototype usability, and desired features.
Critical takeaways:
- Data review was a very individualized process
- Reviewers leaned heavily on the reference document
- Participants preferred the modular view to the list view and found it more functional
Design Iterations
Based on user testing feedback, we focused on three key improvements: making navigation more intuitive, improving commenting, and increasing accessibility of reference documents.
Navigation Improvements
Redesigned the toggle between list and modular views with a switch icon and colorful animated transitions that clearly indicate page changes.
Simplified Commenting
Added a minimally invasive comment display and redesigned the comment modal to emphasize short interactions, in line with developers' preference for simple commenting and independent decision-making.
Enhanced Reference Access
Included the reference document for each question in modular view, with a PDF search tool, and made reference documents available in list view after users showed heavy reliance on source materials.
Impact & Recognition
"This is a big step forward from what we've been doing in the past — a large improvement!"
— Jacob Engelbrecht, Backend IBM Software Engineer
Presented final prototype to senior IBM stakeholders from UX, product management, and development teams. Received validation on design approach and strategic alignment with WatsonX AI objectives. Stakeholders highlighted potential for extending design to automated validation mechanisms and suggested additional metrics for data quality evaluation.