dbanswan/voiceprint-ai

Voice Print AI

Introduction

Voice Print AI is a project that uses Whisperfile to generate transcripts from audio files completely locally on your system.

Read more about Whisperfile at Mozilla/whisperfile.

Whisperfile is a high-performance implementation of OpenAI's Whisper created by Mozilla Ocho as part of the llamafile project, based on the whisper.cpp software written by Georgi Gerganov, et al.

Why

This project can be used to generate transcripts where privacy is a concern. The audio files are processed locally on your system and no data is sent to any server, which makes it suitable for recordings that contain sensitive information such as medical records or legal data.

Docker Version

You can run the Docker version of this project with the following commands. This way you don't have to deal with the code or its dependencies.

docker pull dbanswan/voice-print-ai
mkdir -p audio db
docker run -p 3000:3000 -p 8080:8080 -v "$(pwd)/audio:/app/public/audio/uploads" -v "$(pwd)/db:/app/db" dbanswan/voice-print-ai

Open your browser and go to http://localhost:3000 to use the application.

The Whisperfile server will be available on http://localhost:8080.

You can also run just the Whisperfile server using the following command.

docker pull dbanswan/whisper-tiny-server
docker run -p 8080:8080 dbanswan/whisper-tiny-server

Both images are based on the tiny model to keep the image size small.

Run the project locally

  1. Clone the repository

git clone https://github.com/dbanswan/voiceprint-ai.git

  2. Install the dependencies

npm install

  3. Go to Hugging Face and download the model files.

    I suggest downloading "whisper-tiny.llamafile". It is around 315 MB and works well for the most part. Feel free to download the other models as well: the larger the model, the better the results, but larger models need more resources to run.

  4. Go to the folder where you downloaded the model file.


# make it executable

chmod +x whisper-tiny.llamafile

  5. Then run it with the following command

./whisper-tiny.llamafile

The model starts a server on port 8080. This is what the app calls. You can download any other model and run it the same way.
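To make the app-to-server call concrete, here is a minimal sketch of posting an audio file to the local server. It assumes the server exposes the whisper.cpp-style `/inference` endpoint with `file`, `response_format`, and `temperature` form fields; the helper names `buildInferenceForm` and `transcribe` are hypothetical and are not taken from this repository. It relies on the global `fetch`, `FormData`, and `Blob` available in Node 18 or later.

```typescript
// Assumption: the Whisperfile server accepts whisper.cpp-style multipart
// requests on /inference. Helper names here are illustrative only.
const WHISPER_URL = "http://localhost:8080/inference";

type ResponseFormat = "json" | "verbose_json" | "text" | "srt" | "vtt";

// Build the multipart form body for a single audio file.
function buildInferenceForm(
  audio: Blob,
  fileName: string,
  format: ResponseFormat = "json",
): FormData {
  const form = new FormData();
  form.append("file", audio, fileName);
  form.append("response_format", format);
  form.append("temperature", "0.0");
  return form;
}

// Send the audio to the locally running server and return the raw response body.
async function transcribe(
  audio: Blob,
  fileName: string,
  format: ResponseFormat = "json",
): Promise<string> {
  const response = await fetch(WHISPER_URL, {
    method: "POST",
    body: buildInferenceForm(audio, fileName, format),
  });
  if (!response.ok) {
    throw new Error(`Whisperfile server returned HTTP ${response.status}`);
  }
  return response.text();
}
```

Calling `transcribe(blob, "meeting.wav", "srt")` would return the body as text, so the caller decides whether to parse it as JSON or store it as a subtitle file.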

  6. Run the app

npm run dev

No keys or API tokens are required to run this project.

How it works

  1. Select the audio file.

  2. Press the "Transcribe" button.

  3. Go to the "History" tab to see all the transcripts that you have generated.

  4. The app also comes with an audio player, so you can listen to the audio while reading the transcript.

  1. The history is stored in a simple JSON file inside the db folder. Feel free to switch to, say, SQLite or any other database. A JSON file is used to keep things simple and avoid extra dependencies.

  2. The audio files are stored in the public/audio/uploads folder.

  3. For advanced users, the transcript can also be exported in JSON, verbose JSON, VTT, and SRT formats. These can be used to integrate with other systems.
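For readers integrating the exports with other systems, the sketch below shows how timed segments map onto SRT and VTT text. The `Segment` shape (start and end in seconds plus text) is an assumption made for illustration, and the function names are hypothetical; the app itself may simply request these formats from the server.

```typescript
// Assumed segment shape: times in seconds. Not taken from this repository.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses "," and VTT uses "." before milliseconds.
function formatTimestamp(seconds: number, separator: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${separator}${pad(ms, 3)}`;
}

// SRT: numbered cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}\n${seg.text.trim()}\n`,
    )
    .join("\n");
}

// VTT: a "WEBVTT" header followed by unnumbered cues.
function toVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg) =>
      `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}\n${seg.text.trim()}\n`,
  );
  return `WEBVTT\n\n${cues.join("\n")}`;
}
```

The two formats differ mainly in the header, the cue numbering, and the millisecond separator, which is why a single timestamp helper can serve both.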

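As an illustration of the JSON-file history mentioned above, the following sketch reads and writes a list of entries with Node's `fs/promises` module. The entry fields, the function names, and the file path a caller would pass (for example a file inside the db folder) are all assumptions; the actual file name and schema used by the app are not documented here.

```typescript
import { mkdir, readFile, writeFile } from "node:fs/promises";
import { dirname } from "node:path";

// Hypothetical entry shape; the real schema may differ.
interface HistoryEntry {
  id: string;
  audioFile: string;
  transcript: string;
  createdAt: string; // ISO 8601 timestamp
}

// Return a new history array with the entry placed first (newest first).
function addEntry(history: HistoryEntry[], entry: HistoryEntry): HistoryEntry[] {
  return [entry, ...history];
}

// Load the history file, treating a missing file as an empty history.
async function loadHistory(path: string): Promise<HistoryEntry[]> {
  try {
    return JSON.parse(await readFile(path, "utf8")) as HistoryEntry[];
  } catch (error) {
    if ((error as { code?: string }).code === "ENOENT") {
      return [];
    }
    throw error;
  }
}

// Write the whole history back, creating the parent folder if needed.
async function saveHistory(path: string, history: HistoryEntry[]): Promise<void> {
  await mkdir(dirname(path), { recursive: true });
  await writeFile(path, JSON.stringify(history, null, 2), "utf8");
}
```

Rewriting the whole file on every save is simple and dependency-free, which matches the stated goal, but it does not guard against concurrent writes; that is one reason a database such as SQLite may be preferable for heavier use.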

About

Voiceprint AI: Transcribe audio files locally with privacy and speed
