In September 2021, Synth was an MVP with a big idea: a second brain for anything audio. It was a tool that would help users capture meetings, YouTube videos, podcasts, or any other audio content at a tap. Moreover, it would store that information in a way that made it extremely easy to retrieve whenever needed.
When I joined the team, Synth showed potential but lacked the polish and depth to become indispensable. My challenge was to reimagine Synth as an MLP (Minimum Lovable Product), seamlessly integrated into users' workflows while solving deep-seated pain points around audio information. What followed was an exciting journey of discovery, collaboration, and design innovation.
The project was done in close collaboration with Synth's co-founders: Urvin Soneta, Vaibhav Saxena, and Suneel Matham. I also worked with multiple engineers over the course of the project.
The project lasted around six months: two months of discovery, followed by four months of building, testing, and co-creating with users.
We live in a world of audio overload. The average professional consumes over 1,000 hours of meetings, podcasts, and videos annually, yet how much of it do we remember? When we need to, how much of it are we able to refer back to? How much of it do we accurately capture?
Audio content is dense with critical information and hidden connections that could yield valuable insights. However, without the right tools to extract & connect ideas, these opportunities are often lost.
Manual note-taking often requires intense focus and can’t capture everything, leading to gaps that could lose valuable context or details.
Audio is a linear format, making it difficult to organize into topics or categories. Unlike written notes or knowledge management tools where content can be organized into hierarchies, tags, or networks, audio information is hard to “see” as an overall structure, which makes synthesizing and contextualizing it more challenging.
Even when we do manage to capture details through manual notes or transcription, finding that information when needed is a huge challenge. Audio lacks inherent searchability, and common note-taking approaches often don’t include indexing or tagging methods that allow for quick cross-referencing. This leads to wasted time as we sift through notes, re-listen to audio, or rely on faulty memory, which impacts decision-making and productivity.
The Synth team's simple, core belief, 'it should be easy to take notes during and retrieve information after meetings, videos, or podcasts', led them to build an MVP that let people capture notes in real time without switching screens, timestamp key points as they were discussed, generate summaries, and retrieve specific parts of transcripts by asking questions.
By interviewing and testing novel concepts with 20+ people over the course of the project, and running usability tests with 14 more, we unlocked insights that not only addressed users' pain points but also opened up opportunities that took the product in a direction the team had not previously imagined. This prompted the team to redesign Synth's core user experience from the ground up. Here are some of the top insights that helped shape the product.
Users focus on understanding and participating in real-time, but they also want to capture important points effortlessly for later reference. After listening, users shift their focus from active participation to reflection, and application of the information they consumed. A seamless transition between these phases is essential to ensure the information is not only captured but also retained, organized, and actionable.
In spoken content, tone, emphasis, and speaker interactions add layers of meaning. Transcription doesn’t preserve these nuances, so essential context or intent gets lost. This alters the intended meaning, especially in sensitive or high-stakes discussions, potentially leading to misinterpretations down the line.
Without a clear structure, important points become lost amid the flood of data, leading to inefficiency, stress, and an inability to find valuable knowledge when it’s needed most.
Since different content types (meetings, podcasts, etc.) are often accessed across various platforms, the information becomes scattered across multiple locations. This fragmentation makes it difficult to connect insights across sources, such as linking a concept from a podcast with a discussion from a recent meeting. This knowledge often remains isolated, reducing its value.
In a crowded landscape of productivity and knowledge tools, Synth aimed to address a critical gap that existing solutions missed.
Tools like Notion, Evernote, and Roam Research excelled at organizing written content but struggled with audio, where capturing and extracting insights was inherently more complex.
On the other hand, tools like Otter.ai and Fireflies.ai facilitated audio capture but primarily catered to meeting-specific workflows and fell short in retrieval and cross-context usability.
Synth was designed to bridge this divide—offering both seamless audio capture and powerful, intuitive retrieval features. This dual focus enabled it to address broader use cases, from podcasts to YouTube videos, while positioning itself uniquely in a market heavily segmented by function. Synth’s ability to unlock actionable insights from audio content provided an edge over competitors, ensuring users could connect knowledge across contexts effortlessly.
To translate these insights into actionable design decisions, we established a set of guiding principles. These principles grounded our work and helped us align every feature and interaction with the user’s goals.
Design features that amplify user productivity without adding complexity. Every interaction should feel like it’s saving time or effort, not creating a new layer of friction.
Fit naturally into the tools and workflows users already use. Synth should feel like an extension of their existing processes, not a separate task to manage.
Audio is inherently linear and intangible. Translate it into clear, digestible visuals like timelines, chapters, and notes, making dense audio content easier to navigate and comprehend.
Crafting the MLP required stripping Synth back to its essence and rebuilding it around users' mental models. This meant addressing key challenges like simplifying note-taking in real time, organizing linear audio into actionable insights, and ensuring that retrieval was as intuitive as searching for a bookmarked page in a favorite book.
To achieve this, we needed to rethink Synth’s role in the user’s workflow. Could it integrate seamlessly without disrupting their focus? Could it provide value not just in the moment but long after the session ended? These questions guided our approach as we refined the product into something that didn’t just work but worked beautifully.
The entire Synth experience was divided into two phases: 'In the Moment' and 'Post Listening'.

'In the Moment' is associated with note-taking, which involves capturing information quickly without losing one's train of thought. It offers users shortcuts to capture audio and whatever is visible on screen.
In the 'Post Listening' phase, people shift their focus from active participation to reflection, retrieval, and application of the information they consumed.
Crafted an effortless, distraction-free experience that seamlessly integrated into users’ workflows.
Increased daily active users by 200% within six months and grew the user base to 20k+ within the first year.