IBM InstructLab

Overview

Designed a collaborative data review interface for IBM's InstructLab, an open-source project training enterprise-level LLMs with synthetically generated data on WatsonX AI. Led UX research and prototyping to solve the critical problem: "How might we allow teams of reviewers to efficiently and collaboratively approve or deny sets of synthetic data?"

Built Figma prototype featuring dual-view design (list + modular), collaborative commenting, and integrated reference documents. Conducted user testing with IBM developers and presented to senior stakeholders from UX, product management, and development teams, receiving validation as "a big step forward from what we've been doing in the past — a large improvement."

The Problem

InstructLab trains enterprise LLMs with synthetically generated data, but the existing workflow had no consistent way to review hundreds of datasets. Teams relied on ad hoc manual review: transferring data into CSV/JSON files or using command-line interfaces, both of which created barriers for non-technical reviewers.

This inconsistency delayed model training and left gaps in data quality. IBM developers reported: "If I'm unsure about the process or answer, I just hope that someone gets around to reviewing the question."

Key Pain Points

No Standard Review Process

There was no standard location or process for review, which made reviewing and collaborating across teams especially difficult.

Decision Uncertainty

Users were often unsure about their decisions, leaving some questions partially or completely unreviewed.

Lack of Collaboration Tools

Despite the process being inherently collaborative and involving multiple reviewers, users were left on their own to assign or review questions.

Design Solution

Working with continuous feedback from two WatsonX UX designers and one InstructLab developer, we designed screens encompassing three prioritized features: Collaborative Team Tools, List and Modular Views, and Approve/Deny/Edit functionalities.

Key Features

List and Modular Views

List view offers faster, streamlined review with basic approve/deny functions for quick workflows. Modular view provides comprehensive review with editing capabilities, collaborative commenting, and side-by-side reference document access.

Collaborative Team Tooling

Enables reviewers to discuss synthetic data with commenting and tagging features. Users can tag teammates for help or additional perspectives, with comments appearing in the tagged reviewer's "to review" feed.

Reference Document Integration

Reviewers have direct access to source documents with PDF search tools, allowing accurate fact-checking of Q&A pairs against ground-truth data in real time.

Approve, Deny, and Edit Functionalities

Core review actions: approve accurate questions, deny unsatisfactory ones, or edit questions to align with intended results before final approval.

User Research & Testing

Conducted user testing sessions with IBM software developers who work with the existing synthetic data generation process. Breaking into small groups (one developer + three designers), we gathered insights on current workflows, prototype usability, and desired features.

Critical takeaways:

Data review was a very individualized process
Reviewers leaned heavily on the reference document
Modular view was preferred over list view and seen as more functional

Design Iterations

Based on user testing feedback, we focused on three key improvements: making navigation more intuitive, improving commenting, and increasing accessibility of reference documents.

Navigation Improvements

Redesigned toggle between list and modular views with switch icon and colorful animated transitions to clearly indicate page changes.

Simplified Commenting

Added minimally invasive comment display and redesigned comment modal to emphasize short interactions, aligned with developers' preference for simple commenting and independent decision-making.

Enhanced Reference Access

Included reference document for each question in modular view with PDF search tool, and made reference documents available in list view after users showed heavy reliance on source materials.

Impact & Recognition

"This is a big step forward from what we've been doing in the past — a large improvement!"
— Jacob Engelbrecht, Backend IBM Software Engineer

Presented final prototype to senior IBM stakeholders from UX, product management, and development teams. Received validation on design approach and strategic alignment with WatsonX AI objectives. Stakeholders highlighted potential for extending design to automated validation mechanisms and suggested additional metrics for data quality evaluation.