This free Obsidian plugin turns my voice into notes, and it all runs on my computer

After getting inspired by my XDA colleagues, I started using Obsidian as my primary note-taking app on my computer. I use it to jot down my thoughts, ideas, links, photos, and other relevant bits. But typing long notes gets tiring at times, and I wished Obsidian had a native feature to convert voice notes to text. I had previously tried Apple’s Notes app to record and transcribe voice notes, but the experience was underwhelming. It all changed when I discovered the Whisper plugin for Obsidian, which has been a game-changer. I now use it enthusiastically to take voice notes without worrying about when or how I will transcribe them all.

The Whisper plugin has helped me turn Obsidian into a truly powerful note-taking solution. It has changed my workflow in a way I didn’t expect, and my overall experience with Obsidian has become more rewarding. Thanks to it, I keep my personal journal active because I can take quick audio notes on the fly. Also, because the plugin transcribes every recording, searching my sea of notes surfaces the relevant entry instantly. I am late to the party, and I regret not using it sooner.

The Whisper plugin automates swift voice-to-text conversion

Like the voice assistant you’d want

I thought the Notes app was enough to transcribe voice notes easily, but I was wrong. The Whisper plugin, even though it isn’t a built-in feature, was more than good enough to convince me to switch entirely from the Notes app to Obsidian. Whisper is OpenAI’s automatic speech recognition system, which listens to speech and transcribes it into written text. To get the plugin working, I entered my OpenAI API key in its settings.

Next, it was just my microphone and me recording voice notes in Obsidian. When I stop recording, Obsidian creates a fresh note with a mini audio player, and the transcription appears automatically under it as text. After several trials, I noticed that the plugin transcribes shorter notes quickly. But when I tried uploading existing audio files, like a 25-minute podcast episode, it took quite a while to convert the speech to text.

The plugin knocked my socks off with its accurate transcription, even when my pronunciation of certain words was a little off. It impressed me even when I tried to mimic an accent. To test it further, I picked up my old French workbook and read it aloud, and even that was transcribed well, despite my rusty French. Of course, it couldn’t make sense of words that were garbled by my inexpensive headphone mic.

The plugin helps me focus on speaking without any inhibitions while recording notes. I can always review and fix the goof-ups in the transcription notes later. To enable that, I created folders to store the audio and transcription from the plugin. By default, the plugin makes Obsidian save all voice notes separately, so I need to move them to my dedicated voice notes folder. That’s something I can live with.

Is there a downside to using the Whisper plugin?

Privacy at a cost

Setting up the Whisper plugin in Obsidian doesn’t take a lot of effort. However, you’ll need to load a few bucks into your OpenAI account since the free tier won’t work, and you’ll also need to register as a developer. As for cost, I would have to transcribe roughly 2 hours and 45 minutes of audio through the API to spend a full buck on Whisper’s audio-to-text service. Unfortunately, I had to pay for that separately, as API usage isn’t included in my paid ChatGPT account and is billed on a pay-as-you-go basis.
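To put that figure in perspective, here is a small back-of-the-envelope sketch in Python. The per-minute rate is an assumption on my part, chosen because it lines up with the "roughly 2 hours and 45 minutes per dollar" estimate above; check OpenAI’s current pricing page before relying on it. The function names are purely illustrative.

```python
# Assumed rate in USD per minute of audio. The article does not state it;
# it is chosen to be consistent with "roughly 2 hours and 45 minutes per dollar".
ASSUMED_USD_PER_MINUTE = 0.006


def minutes_per_dollar(usd_per_minute: float = ASSUMED_USD_PER_MINUTE) -> float:
    """Return how many minutes of audio one US dollar buys."""
    return 1.0 / usd_per_minute


def transcription_cost(minutes: float, usd_per_minute: float = ASSUMED_USD_PER_MINUTE) -> float:
    """Return the estimated cost in USD of transcribing `minutes` of audio."""
    return minutes * usd_per_minute


def format_duration(minutes: float) -> str:
    """Format a duration given in minutes as 'Hh MMm', rounded to the minute."""
    hours, rem = divmod(int(round(minutes)), 60)
    return f"{hours}h {rem:02d}m"


if __name__ == "__main__":
    print(format_duration(minutes_per_dollar()))  # 2h 47m
    print(f"${transcription_cost(25):.2f}")       # $0.15 for a 25-minute episode
```

At that assumed rate, even a 25-minute podcast episode costs about fifteen cents, so a few dollars of credit goes a long way for short daily voice notes.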

Noticing my Obsidian setup with Whisper, a friend casually commented about letting OpenAI listen to and process all my thoughts. That comment stayed with me. On investigating, I found that OpenAI offers an option to turn off data logging in the account settings, which takes care of the privacy concern. Otherwise, my audio data is stored on OpenAI’s servers for 30 days. At least that’s what OpenAI states, along with a promise not to use that data to train the model. Still, I wanted to explore whether I could run a speech-to-text model locally on my computer.

Making a Whisper model run locally on a PC

It takes quite a lot of effort

Since the core Whisper model is open-source, I explored how to make it work on my base M1 MacBook Air with 8GB of RAM. I stumbled upon whisper.cpp, the C/C++ port of the Whisper model, which can run locally on a computer in offline mode. After cloning the repository and downloading a large Whisper model converted into the project’s custom binary format, I built whisper.cpp.

Using a shell script, I ran a local Whisper model server to work with the Whisper plugin in Obsidian and recorded a voice note. The transcription appeared automatically with the audio note, this time produced by the local Whisper model instance. After testing multiple times, I realized that the local Whisper model fell short in terms of accuracy and occasionally failed to pick up accents. Still, I got satisfactory transcriptions for a few voice notes.
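If you want to check that a local server is answering before pointing the plugin at it, a minimal Python sketch like the one below can send an audio file directly. Treat the details as assumptions: the address, the /inference path, and the file and response_format field names follow the example server bundled with whisper.cpp as I understand it, and they may differ from your setup or from what the plugin sends. The helper names and note.wav are hypothetical.

```python
import urllib.request
import uuid

# Assumed local endpoint; adjust it to whatever your server script exposes.
LOCAL_URL = "http://127.0.0.1:8080/inference"


def build_multipart(fields: dict[str, str], filename: str, audio: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one audio file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # The audio goes in a form field named "file" (an assumption, see above).
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
        ).encode()
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(url: str, audio: bytes, filename: str = "note.wav") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the audio file."""
    body, content_type = build_multipart({"response_format": "json"}, filename, audio)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


if __name__ == "__main__":
    # note.wav is a hypothetical recording in the current directory.
    with open("note.wav", "rb") as f:
        request = build_request(LOCAL_URL, f.read())
    with urllib.request.urlopen(request) as response:
        print(response.read().decode())
```

Nothing here leaves your machine: the request goes to 127.0.0.1, which is the whole point of running the model locally.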

Speak to free yourself from typing notes

Typing isn’t the only way to take notes in Obsidian. The Whisper plugin makes the app suitable for anyone who seeks freedom from clacking away at the keyboard. Even if you enjoy typing, I encourage you to at least try the plugin. It is a prime example of the bustling Obsidian community, which keeps building add-ons that make the app a favorite of many. Although the local Whisper model built from custom binaries works fine, I’d recommend running it on a powerful computer with decent CPU muscle and at least 16GB of RAM.

The Whisper plugin has made me more confident about speaking my thoughts and ideas out loud and recording them. That’s how Obsidian has quickly become my go-to note-taking app after trying out several others.
