Vibe Code Camp Distilled

Naveen Naidu - Building Monologue: Voice-to-Text for the AI Age

Key Insights

Summary

Naveen Naidu, the solo builder behind Monologue at Every, presented a deep dive into building a competitive voice-to-text product in a crowded market. Naveen joined Every about 15 months ago as an engineer-in-residence, ran multiple experiments, and built Monologue, which immediately had internal users recording 100+ times per day. The product has since grown to process 1.5 million words daily.

The talk covered Monologue’s key differentiating features (modes, auto-enter, per-app activation) and previewed the upcoming iOS app launching February 9th, 2026. Naveen shared his workflow for building with AI coding tools: Codex for working with large Swift/iOS codebases and Claude Code for rapid prototyping. He emphasized that in the current era, the constraint isn’t implementation ability but knowing what to build: you can vibe code a complete feature in an hour and start getting real user feedback immediately.

Main Topics

Introduction to Monologue

Monologue is a “smart voice to text Mac app”, with an iOS version launching in early February 2026. The core workflow is simple: set a keyboard shortcut (Naveen uses the right-side Option key), hold it to record short clips (5-10 seconds) or double-tap for longer recordings, and the text gets pasted wherever your cursor is.

Key differentiators from competitors:
  - Per-app modes that auto-activate
  - Custom instructions for personalization
  - Auto-enter feature for hands-free operation
  - Paste last transcript with keyboard shortcut (Ctrl+V for Naveen)

“If you want to live in the future, you cannot be typing anything. You have to be using your voice and you should use Monologue from Every, built by the one and only Naveen.” [00:00:34 - 00:00:46]

Modes: The Power Feature

Modes allow different transcription behavior based on context. You can create per-app modes, for example a “Claude Code mode” that automatically activates when you’re in your terminal or IDE.

How to set up:
  1. Go to settings and create a new mode
  2. Add the apps where it should auto-activate (e.g., Ghostty terminal, Warp, Claude app)
  3. Optionally add custom instructions for that mode
  4. Enable “auto enter” to send text immediately without confirmation
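The setup steps above can be pictured as a small lookup from the frontmost app to a mode. The following is only an illustrative sketch, not Monologue's actual implementation: the `Mode` fields, the `resolve_mode` function, and the example instruction text are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Mode:
    """One transcription mode. All field names here are hypothetical."""
    name: str
    apps: set[str] = field(default_factory=set)  # apps where the mode auto-activates
    instructions: str = ""                       # optional custom instructions
    auto_enter: bool = False                     # send text without confirmation


def resolve_mode(frontmost_app: str, modes: list[Mode], default: Mode) -> Mode:
    """Return the first mode registered for the frontmost app, else the default."""
    for mode in modes:
        if frontmost_app in mode.apps:
            return mode
    return default


default_mode = Mode(name="Default")
coding_mode = Mode(
    name="Claude Code",
    apps={"Ghostty", "Warp", "Claude"},
    instructions="Keep technical terms and file names exactly as spoken.",
    auto_enter=True,
)
modes = [coding_mode]

active = resolve_mode("Ghostty", modes, default_mode)
print(active.name, active.auto_enter)      # Claude Code True
fallback = resolve_mode("Notes", modes, default_mode)
print(fallback.name, fallback.auto_enter)  # Default False
```

Keeping `auto_enter` on the mode rather than as a global setting mirrors the description in the talk: text is sent immediately in the terminal, while other apps keep the default paste-only behavior.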

Demonstration: Naveen opened his terminal, triggered Monologue, and said “Hey, can you record, uh, go through my code base and see if there are any kind of bugs.” The transcription auto-entered directly into the terminal. [00:09:01 - 00:09:18]

“Tap, talk, tap, it’s, uh, send, paste the text, it auto sends. I think that’s the, one of the best ways for you to work with Codex or Claude Code.” [00:09:49 - 00:09:53]

Pro tip: You can toggle between modes mid-recording using the UI or keyboard shortcuts. One user created a mode that adds clapping emojis between every word for emphasis.

Custom Instructions for Better Accuracy

In settings, you can add custom instructions telling Monologue who you are, what you do, and any specific terminology or formatting preferences.

What to include:
  - Your name and role
  - Calendar links or phone numbers (for proper transcription)
  - British English vs American English preference
  - Your specific speaking style

“What happens is monologue understands you much better. So the output that you get from monologue is much better. Uh, so that’s one quick tip I recommend everyone, uh, to just like do a brain dump here.” [00:07:45 - 00:07:59]

Usage Statistics and Patterns

Naveen shared surprising data about how people actually use Monologue:

“People are talking a lot, uh, to their Claude Code or a Codex. And that’s a good thing because you’re giving as much context and you get better.” [00:07:11 - 00:07:13]
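In the talk, Naveen described usage in percentile terms (a P50 of about 48 words per recording and a P90 of roughly 400 to 1000 words). As a reminder of what those figures mean, here is a minimal sketch of a nearest-rank percentile. The word counts below are made up purely for illustration and are not Monologue's data; the `percentile` helper is likewise a hypothetical name.

```python
def percentile(values: list[int], p: int) -> int:
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # ceiling of p% of n, as a 1-based rank
    return ordered[rank - 1]


# Made-up word counts for ten recordings, NOT real usage data.
word_counts = [12, 20, 31, 40, 48, 55, 70, 120, 400, 950]

p50 = percentile(word_counts, 50)
p90 = percentile(word_counts, 90)
print(p50, p90)  # 48 400
```

A large gap between P50 and P90, as in this toy sample, indicates a long tail: most recordings are short, but a meaningful share of them are long, context-heavy prompts.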

iOS App Preview (Launching February 9th, 2026)

The iOS app brings all Mac features to mobile, with full sync between devices:
  - All modes sync automatically
  - Widget support for one-tap recording without opening the app
  - Background recording with timer visible in status bar
  - Settings sync across devices

Notes feature (iOS and Mac): A new feature for longer-form recordings that you want to save and reference later. Unlike quick dictation, it is designed for capturing stream-of-consciousness thoughts on the go, such as during a walk or hike. Naveen plans to integrate it with Spiral (Every’s writing product) to convert voice notes directly into blog posts.

“You can just start immediately recording and then stop it. It will be on your laptop. It will get synced. It will be on your phone as well.” [00:21:11 - 00:21:19]

Building with AI Coding Tools: Codex vs Claude Code

Naveen shared his workflow for deciding which AI coding assistant to use:

Use Codex when:
  - Fixing bugs in large, complex codebases
  - Working with Swift, iOS, or Mac codebases specifically
  - You need precise edits across many files
  - You need the tool to understand existing patterns and context

“Codex is really good at, uh, understanding Swift, uh, iOS and max code base right now, Mac, it’s a huge code base… codex is that one senior engineer where, uh, it understands all the code, everything. So, and it does that precise edits” [00:15:16 - 00:15:26]

Use Claude Code when:
  - Vibe coding new features from scratch
  - You want creative solutions and rapid prototyping
  - Starting from a blank slate

“When I’m vibe coding, I don’t usually do Codex because Codex is like, it’s not that creative, right? Personal fear. So that’s when I go to Claude Code.” [00:15:26 - 00:15:30]

The Notes feature workflow: Naveen built the entire Notes feature by doing a “brain dump” to Claude Code using Monologue, describing what he wanted, and having it vibe code everything from scratch. The first prototype took one hour to implement and share internally for feedback.

“When right now it looks everything well polished, but when we initially try a prototype, the, uh, the big thing that we able to do it as I’m able to vibe code it in one hour and start sharing it internally.” [00:14:02 - 00:14:06]

Competition as Market Education

A counterintuitive insight about competing in a crowded space:

Monologue competes with well-funded voice dictation apps (competitors have raised between $10M-$80M). But Naveen and the team realized this is actually advantageous - those companies are spending millions educating the market about AI voice dictation, while Naveen (as a solo builder supported by Every’s ~$700K total funding) can focus on building a better product.

“It’s actually amazing in this day and age to build a product and have competitors that have raised a ton of money because getting people to use new AI to use products that are, that, that use AI is really hard. There’s a lot of education and educating, educating a market is so expensive and we have competitors that are spending millions of dollars educating a market and you building a product that is just as good, if not better.” [00:24:31 - 00:24:58]

Future Roadmap

Auto-learning modes (next major feature): Instead of static mode templates, Monologue will learn from your edits. When you paste transcribed text and then edit it, Monologue will detect what you changed and automatically update the underlying mode instructions. The system learns your writing style and preferences as you use it.

“You edited out Monologue actually actually learns from it and then goes and updates the mode underlying. So what happens is it learns from you as you use Monologue more” [00:19:23 - 00:19:29]
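The talk did not describe how auto-learning will be implemented. To make the idea concrete, here is one minimal sketch under stated assumptions: it compares the pasted transcript with the user's edited version using a word-level diff from Python's standard `difflib`, then appends a hint per correction to the mode's instructions. The function names, the hint wording, and the diff-based approach itself are assumptions for illustration; a real system might instead ask a language model to summarize the edits.

```python
import difflib


def detect_replacements(transcript: str, edited: str) -> list[tuple[str, str]]:
    """Return (original, corrected) phrase pairs the user changed by hand."""
    before, after = transcript.split(), edited.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":  # ignore pure insertions and deletions in this sketch
            pairs.append((" ".join(before[i1:i2]), " ".join(after[j1:j2])))
    return pairs


def update_instructions(instructions: str, pairs: list[tuple[str, str]]) -> str:
    """Append one hint per new correction to the mode's instructions."""
    lines = instructions.splitlines()
    for old, new in pairs:
        hint = f'Write "{new}" instead of "{old}".'
        if hint not in lines:  # avoid adding the same hint twice
            lines.append(hint)
    return "\n".join(lines)


transcript = "ask cloud code to review the monolog repo"
edited = "ask Claude Code to review the Monologue repo"

pairs = detect_replacements(transcript, edited)
print(pairs)  # [('cloud code', 'Claude Code'), ('monolog', 'Monologue')]
print(update_instructions("Keep a casual tone.", pairs))
```

Applying the same corrections a second time leaves the instructions unchanged, so repeated edits of the same kind do not bloat the mode.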

Other requested features:
  - Mode templates / public library of modes
  - Custom skins (mentioned: see-through Nintendo DS aesthetic)
  - Windows/PC version
  - Integration with Spiral for voice-note-to-blog-post workflow
  - Hardware product (Naveen’s “crazy idea”, on hold for now)

Notes vs Granola positioning: Notes is not meant to replace Granola (a meeting notes product). Granola is for people who live in meetings all day. Monologue Notes is for people with fewer meetings (<5/week) and for capturing stream-of-consciousness thoughts on the go.

Philosophy: Implementation vs. Knowing What to Build

“In the age that right now we are living in implementing features is not really that, uh, important. It’s knowing what to implement and what actually gets, uh, people excited. That’s the most important part for us.” [00:14:28 - 00:14:35]

Naveen emphasized building quickly, sharing with real users immediately, and iterating based on feedback. The Every Discord community provides constant feedback that shapes the roadmap.

Actionable Details

Getting Started with Monologue

  1. Download: Go to monologue.to
  2. Set keyboard shortcut: In settings, configure your trigger key (right-side option recommended)
  3. Choose recording mode:
     - Hold to record: Quick 5-10 second clips
     - Double-tap: Longer recordings with manual stop
  4. Add custom instructions: Go to settings and describe who you are, what you do, preferred terminology
  5. Set up per-app modes:
     - Create mode for coding (terminal, IDE)
     - Create mode for messaging (Slack, Discord, iMessage)
     - Create mode for writing (text editors)
  6. Enable auto-enter for hands-free operation

Pro Tips

Tools and Products Mentioned

Technical Stack

Quotes Worth Saving

“I just love building products, that’s it.” [00:01:24 - 00:01:28]

“The average is around P50 averages on 48 words, but P90 ton of people who put in the words, it’s around 400 to 1000 words. So actually people are talking a lot, uh, to their Claude Code or a Codex. And that’s a good thing because you’re giving as much context and you get better.” [00:06:57 - 00:07:11]

“Tap, talk, tap, it’s, uh, send, paste the text, it auto sends. I think that’s the, one of the best ways for you to work with Codex or Claude Code. And I believe yeah, the apps don’t do it.” [00:09:49 - 00:09:57]

“Even though there are a ton of crowded, uh, things, the way I work, no app supports me. And that’s why we added modes. That’s why we had this auto enter feature.” [00:22:39 - 00:22:45]

“In the age that right now we are living in implementing features is not really that, uh, important. It’s knowing what to implement and what actually gets, uh, people excited. That’s the most important part for us.” [00:14:28 - 00:14:35]

“Having users, uh, in our Discord, I think that’s a great, great feeling, uh, because ton of people just give feedback and we have a rich roadmap now.” [00:26:51 - 00:26:55]