This is the context of the problem space. Bill is frustrated by how hard it is to manage the large number of experiments he is tasked with evaluating.
Bill’s challenges are common, and the broader industry’s push towards AI/ML helps explain why the MLOps market is expected to grow to $16.6 billion by 2030
Startups have flooded this market to help meet Bill's needs for experiment tracking. I’m tasked with creating an impactful zero-to-one product design in this complex, emerging domain
Once Upon a Time in a Tech Company
Bill Rothdale is an AI researcher
In his role, he is tasked with running experiments to evaluate LLMs …
… selecting the right combination of factors before deployment
His evaluation analyzes parameters and tracks performance …
… while keeping track of the overall budget
In the early days of the AI boom, Bill tracked his work in Jupyter Notebooks
Since then, he’s seen the number of notebooks skyrocket …
… to the point where it’s now hard to manage the scale of his experiments …
… and he is frustrated by how complex his day-to-day job has become
In the ideation phase, I created three different ideas to solve our user’s pain points
The first idea seeks to surface important information related to the user’s experiments …
… and expose detailed information on mouseover. The data in the table below updates to show information specifically related to the highlighted epoch in the chart above
The second idea highlights the connections between different experiments, allowing the user to slice them by different metrics to gauge which performs best
The third idea enables the user to easily distinguish the best experiment by bubbling up the most performant experiments across the most impactful metrics
Which is going to work best for our user? Sam, a data scientist and researcher, provided feedback on the early designs
The first idea seemed immediately understandable. Sam reported that he wanted more control over the table, such as the ability to sort by a metric and add more metrics
The second idea sparked Sam’s interest — “I’ve never seen this approach” — but he reported it would likely be less effective than the first design
The third idea took the longest to grok. While Sam found it intriguing, it seemed the least valuable for this scenario
The final design surfaces relevant metrics …
…
…
…
…
When looking into an emerging domain and designing zero-to-one, I look to understand our user and their context of use
First, I develop empathy by putting myself in our user's shoes and understanding the technical challenges they face day to day. Here I’m conducting background research on AI experimentation
Here, I’m looking into other products in this market to better understand our user’s expectations in the context of other available tools
Let’s take a moment to investigate the existing solutions in more detail
Weights & Biases surfaces all of the user’s experiments in a left-hand panel and important metrics across them in the graphs on the right
On mouseover, metrics for the highlighted experiment are shown in tandem across the different visualizations and called out in the left-hand panel
Comet, similarly, surfaces all available experiments in a left-hand panel …
… but in contrast to Weights & Biases, on mouseover it surfaces only details related to the current metric
Neptune, in contrast, surfaces different evaluation metrics and leverages configurable dashboards customized for common use cases
The Work
Solutions to identified pain points
Define
Design
Discover
Deliver
Specifically, I compiled …
Internal Research
Competitive Research
Market Research
I then …
Explored Design Solutions
Whiteboarded with Colleagues
Created High-Fidelity Prototypes
Specifically, I …
Collaborated with the Research Team
Conducted 1:1 Usability Testing
Synthesized the Findings for Stakeholders
When finalizing the project I …
Collaborated with Developers to Implement Interactions
Documented Guidelines for Other Teams
Monitored Performance Metrics on Release
The Process
The process I follow, and how I used it to deliver this project
…
…
…
…
…
…
…
…
…
The Results
What came out of the work
Thank you!
I appreciate you taking the time to review my work