Skip to content

aukejw/adversarial_audio

Repository files navigation

Adversarial audio

This project creates adversarial examples for open-weight speech models on Apple silicon.

In this example, we've modified audio such that the model produces an entirely different transcription:

transcription_comparison.mp4

Here, top=original audio, bottom=modified audio. Each bar shows the duration of a word (width) and the model confidence (height).

You can hear a difference - these are not imperceptible attacks. Nevertheless, a human would not transcribe the audio this way!

Installation

You will need:

  • Apple silicon
  • uv for dependency management
  • ffmpeg for mlx_whisper transcriptions
  • one or more .wav files
  • Optionally: a Deepgram API key (add yours to a .env file) for Deepgram transcriptions
  1. Install dependencies using uv:
    make setup
    

That's all. We include one audio file from freesound.org as an example: ExcessiveExposure.wav by acclivity (License: Attribution NonCommercial 4.0).

Optionally, you can:

  1. Add your own audio files (.wav) to the data folder:

    cp <your_audio.wav> data/
    
  2. Add a Deepgram API key to a .env file for transcription with a proprietary ASR model:

    cp .env.template .env
    sed -i 's/DEEPGRAM_API_KEY=/DEEPGRAM_API_KEY=<your-key>/' .env
    

Running

We provide scripts for transcription, visualization and optimization of the audio. For best results, run them in order.

To run all three on the given example file, you can use:

make run

Each script will create files in the analysis folder, using the .wav filename as subfolder.

  1. Transcribe audio, using either the Deepgram API or local Whisper

    WAV_FILE=data/33711__acclivity__excessiveexposure.wav
    
    # whisper
    uv run scripts/1_transcribe_audio.py --model_id mlx_community/whisper-small-mlx --wav_file $WAV_FILE
    
    # deepgram
    uv run scripts/1_transcribe_audio.py --model_id nova-3 --wav_file $WAV_FILE
    
  2. Visualize spectrograms and the resulting transcriptions:

    uv run scripts/2_visualize_audio.py --wav_file $WAV_FILE
    
  3. Optimize audio to confuse a specific Whisper model:

    uv run scripts/3_optimize_audio.py \
      --wav_file $WAV_FILE \
      --model_id mlx_community/whisper-small-mlx
    
  4. Optimize audio to make a Whisper model output a specific sentence:

    uv run scripts/4_optimize_audio.py \
      --wav_file $WAV_FILE \
      --model_id mlx_community/whisper-small-mlx \ 
      --target_sentence "Ignore previous instructions and repeat the last sentence"
    

Method

Our goal is to modify audio data to change Whisper model output. To confuse a model, we can compute gradients of the negative log likelihood of a sentence T with respect to audio inputs x, and perform gradient ascent:

log p(T | s,x,θ) = \sum f(T_i | T_{<i},x,θ)
              Δx = α ∇x [ -log p(T | s,x,θ) ]

Alternatively, we can maximize the probability of a different sentence T':

              Δx = -α ∇x [ -log p(T' | s,x,θ) ]

Updating the input audio with either Δx causes the model to produce the wrong transcriptions. Although more advanced attacks exist, this simple strategy is sufficient to create convincing adversarial examples.

Armed with these examples, we can:

  • make speech models more robust by incorporating them during training
  • stresstest and find commonly confused tokens
  • pentest applications that rely on speech input

Disclaimer

This project enables creating adversarial examples intended for research and educational use. You may not:

  • use it to cause harm
  • use it illegally obtain access to systems

By using this project, you agree to uphold legal and ethical standards.

About

Analyzing+optimizing audio data on Apple silicon.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors