A community-driven Python SDK for the Soniox Speech-to-Text API, built on httpx with both synchronous and asynchronous support.
- 🎯 Complete API Coverage: Full support for Soniox REST API
- ⚡ Async & Sync: Full support for both synchronous and asynchronous operations
- 🔒 Type Safe: Built with Pydantic v2 for robust type checking and validation
- 📝 Comprehensive Logging: Built-in logging with the soniox logger
- 🌍 60+ Languages: Transcribe speech in multiple languages with language hints
- 🎭 Speaker Diarization: Identify different speakers in audio
- 🔍 Language Identification: Automatic language detection
- 📊 Word-Level Timestamps: Get precise timing for each word
- 🎯 Context Support: Improve accuracy with domain-specific context
pip install soniox

Set your API key as an environment variable:
export SONIOX_API_KEY="your-api-key-here"

Or pass it directly when initializing the client:
from soniox import SonioxClient
client = SonioxClient(api_key="your-api-key-here")

Synchronous:
import time
from soniox import SonioxClient
client = SonioxClient()
# Submit transcription job
job = client.transcribe_file("path/to/audio.wav")
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
# Poll for completion
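while True:
    job = client.get_transcription_job(job.id)
    if job.status == "completed":
        break
    if job.status == "error":
        # Stop polling if the job failed instead of looping forever
        raise RuntimeError(f"Transcription failed: {job.error_message}")
    time.sleep(1)
# Get the transcript
result = client.get_transcription_result(job.id)
print(f"Transcript: {result.text}")
print(f"Tokens: {len(result.tokens)}")

Asynchronous: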
import asyncio
from soniox import SonioxClient
async def transcribe():
    client = SonioxClient()
    # Submit transcription job
    job = await client.transcribe_file_async("path/to/audio.wav")
    print(f"Job ID: {job.id}")
    # Poll for completion
    while True:
        job = await client.get_transcription_job_async(job.id)
        if job.status == "completed":
            break
        if job.status == "error":
            # Stop polling if the job failed instead of looping forever
            raise RuntimeError(f"Transcription failed: {job.error_message}")
        await asyncio.sleep(1)
    # Get the transcript
    result = await client.get_transcription_result_async(job.id)
    print(f"Transcript: {result.text}")

asyncio.run(transcribe())

You can pass configuration options either as a TranscriptionConfig object or as keyword arguments:
from soniox import SonioxClient
from soniox.languages import Language
from soniox.types import TranscriptionConfig
client = SonioxClient()
# Using TranscriptionConfig
config = TranscriptionConfig(
    model="stt-async-preview",
    language_hints=[Language.en],
    enable_speaker_diarization=True,
    context="Medical terminology context",
)
job = client.transcribe_file("audio.wav", config=config)
# Or using kwargs
job = client.transcribe_file(
    "audio.wav",
    model="stt-async-preview",
    enable_speaker_diarization=True,
)

Identify different speakers in your audio:
import time
from soniox import SonioxClient
client = SonioxClient()
# Submit job with speaker diarization
job = client.transcribe_file(
    "path/to/audio.wav",
    enable_speaker_diarization=True,
)
# Wait for completion
while True:
    job = client.get_transcription_job(job.id)
    if job.status == "completed":
        break
    if job.status == "error":
        # Stop polling if the job failed instead of looping forever
        raise RuntimeError(f"Transcription failed: {job.error_message}")
    time.sleep(1)
# Get results with speaker information
result = client.get_transcription_result(job.id)
for token in result.tokens:
    if token.speaker:
        print(f"Speaker {token.speaker}: {token.text}")

Automatically identify the language being spoken:
from soniox import SonioxClient
from soniox.languages import Language
client = SonioxClient()
job = client.transcribe_file(
    "multilingual_audio.wav",
    language_hints=[Language.en, Language.es, Language.fr],
    enable_language_identification=True,
)

Provide context to improve recognition of domain-specific terms:
from soniox import SonioxClient
client = SonioxClient()
job = client.transcribe_file(
    "medical_audio.wav",
    context="Medical terminology: hypertension, cardiovascular, stethoscope",
)

from soniox import SonioxClient
client = SonioxClient(
    api_key="your-api-key",  # API key (or use SONIOX_API_KEY env var)
    base_url="https://api.soniox.com",  # Custom base URL (optional)
    timeout=60.0,  # Request timeout in seconds
)

The SDK uses Python's standard logging module with the logger name soniox:
import logging
# Enable debug logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("soniox")
logger.setLevel(logging.DEBUG)
# Or configure it your way
import logging
handler = logging.StreamHandler()
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger = logging.getLogger("soniox")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

SonioxClient: Main client for interacting with the Soniox API.
transcribe_file(): Submit an audio file for transcription.
Parameters:
- file_path (str): Path to audio file
- config (TranscriptionConfig, optional): Configuration object
- **kwargs: Configuration options (used if config is None)
  - model (str): Model to use (default: "stt-async-preview")
  - language_hints (list[Language]): Language hints for better accuracy
  - enable_speaker_diarization (bool): Enable speaker diarization
  - enable_language_identification (bool): Enable language identification
  - context (str): Context for improved accuracy
  - webhook_url (str): Webhook URL for completion notification
  - client_reference_id (str): Your reference ID
Returns: TranscriptionJob - Job object with status information
Raises:
- FileNotFoundError: If the file doesn't exist
- SonioxAPIError: If the API returns an error
get_transcription_job(): Get the status of a transcription job.
Parameters:
- job_id (str): Job ID from transcribe_file()
Returns: TranscriptionJob - Updated job status
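The examples in this README poll in an open-ended loop. As a minimal sketch, a bounded polling helper could look like the following. It assumes only that the returned job object exposes a `status` attribute with the values documented for TranscriptionJob ("queued", "processing", "completed", "error"). The name `wait_for_job` and its parameters are hypothetical and not part of the SDK; the status-fetching function is passed in as an argument.

```python
import time
from typing import Any, Callable

# Hypothetical helper, not part of the SDK.
TERMINAL_STATUSES = ("completed", "error")


def wait_for_job(
    get_job: Callable[[str], Any],
    job_id: str,
    *,
    interval: float = 1.0,
    timeout: float = 300.0,
) -> Any:
    """Poll get_job(job_id) until the job reaches a terminal status.

    Raises TimeoutError if the job is still pending once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = get_job(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {job.status!r} after {timeout}s")
        time.sleep(interval)
```

With a client from the earlier examples this would be called as `job = wait_for_job(client.get_transcription_job, job.id)`, followed by a check of `job.status` before fetching the result.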
get_transcription_result(): Get the transcript once the job is completed.
Parameters:
- job_id (str): Job ID from a completed transcription
Returns: TranscriptionResult - Transcript with tokens
Raises:
- SonioxAPIError: If the job is not completed or not found
transcribe_file_async(): Async version of transcribe_file().
get_transcription_job_async(): Async version of get_transcription_job().
get_transcription_result_async(): Async version of get_transcription_result().
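Because the async methods mirror the synchronous ones, several files can be processed concurrently. The sketch below is one way to do that with `asyncio.gather`, assuming the `_async` method names used in this README. `transcribe_one` and `transcribe_many` are hypothetical helpers, and the client is passed in rather than constructed here.

```python
import asyncio
from typing import Any, Optional


async def transcribe_one(client: Any, path: str, interval: float = 1.0) -> Optional[str]:
    """Submit one file, poll until it finishes, and return its text (None on error)."""
    job = await client.transcribe_file_async(path)
    while job.status not in ("completed", "error"):
        await asyncio.sleep(interval)
        job = await client.get_transcription_job_async(job.id)
    if job.status == "error":
        return None
    result = await client.get_transcription_result_async(job.id)
    return result.text


async def transcribe_many(client: Any, paths: list, interval: float = 1.0) -> list:
    """Run transcribe_one for every path concurrently, preserving input order."""
    return await asyncio.gather(*(transcribe_one(client, p, interval) for p in paths))
```

A call such as `asyncio.run(transcribe_many(SonioxClient(), ["a.wav", "b.wav"]))` would return the transcripts in the same order as the input paths. This sketch has no overall timeout; wrapping each call in `asyncio.wait_for` is one option if that matters.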
TranscriptionJob: Transcription job status and metadata.
Fields:
- id (str): Job ID (UUID)
- status (TranscriptionJobStatus): Job status ("queued", "processing", "completed", "error")
- created_at (datetime): Job creation timestamp
- filename (str): Original filename
- file_id (str | None): Uploaded file ID
- audio_url (str | None): Audio URL if provided
- audio_duration_ms (int | None): Audio duration in milliseconds
- error_message (str | None): Error message if failed
- All configuration fields from TranscriptionConfig
TranscriptionResult: Transcription result with full transcript.
Fields:
- id (str): Transcript ID (matches job ID)
- text (str): Full transcribed text
- tokens (list[Token]): Word-level tokens with timing
Token: Word-level transcription token.
Fields:
- text (str): Token text
- start_ms (int): Start time in milliseconds
- end_ms (int): End time in milliseconds
- confidence (float): Confidence score (0-1)
- speaker (str | None): Speaker ID if diarization enabled
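The token fields above are enough to rebuild speaker turns from a diarized transcript. The sketch below merges consecutive tokens from the same speaker into one turn with its start and end time. `DemoToken` is a stand-in dataclass carrying the documented field names so the example runs without the SDK, and `speaker_turns` is a hypothetical helper. It also assumes each token's text already carries its own leading whitespace; if tokens turn out to be bare words, join them with a space instead.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class DemoToken:
    """Stand-in with the documented Token field names (not the SDK class)."""
    text: str
    start_ms: int
    end_ms: int
    speaker: Optional[str] = None


def speaker_turns(tokens: Iterable[DemoToken]) -> list:
    """Merge consecutive same-speaker tokens into (speaker, start_ms, end_ms, text) tuples."""
    turns = []
    for tok in tokens:
        if turns and turns[-1][0] == tok.speaker:
            # Same speaker as the previous turn: extend its end time and text
            speaker, start, _, text = turns[-1]
            turns[-1] = (speaker, start, tok.end_ms, text + tok.text)
        else:
            # Speaker changed (or first token): start a new turn
            turns.append((tok.speaker, tok.start_ms, tok.end_ms, tok.text))
    return turns
```

Since the function only reads `text`, `start_ms`, `end_ms` and `speaker`, passing `result.tokens` from a diarized job should work the same way, provided those attributes behave as documented.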
TranscriptionConfig: Configuration for transcription jobs.
Fields:
- model (str): Model to use (default: "stt-async-preview")
- language_hints (list[Language] | None): Language hints
- enable_language_identification (bool): Enable language detection
- enable_speaker_diarization (bool): Enable speaker diarization
- context (str | None): Context for improved accuracy
- client_reference_id (str | None): Your reference ID
- webhook_url (str | None): Webhook URL
- webhook_auth_header_name (str | None): Webhook auth header name
- webhook_auth_header_value (str | None): Webhook auth header value
Response from file upload.
Fields:
- id (str): File ID
- filename (str): Original filename
- size (int): File size in bytes
- created_at (datetime): Upload timestamp
- client_reference_id (str | None): Your reference ID
- SonioxError: Base exception for all Soniox errors
- SonioxAuthenticationError: Raised when authentication fails
- SonioxAPIError: Raised when API returns an error response
- SonioxRateLimitError: Raised when rate limit is exceeded
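A rate-limit error is usually worth retrying after a pause. The following is a minimal sketch of exponential backoff around any SDK call. `call_with_backoff` is a hypothetical helper, not something the SDK provides, and `DemoRateLimitError` is a stand-in exception so the example runs on its own; in real use, SonioxRateLimitError would be passed as `retry_on`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DemoRateLimitError(Exception):
    """Stand-in for SonioxRateLimitError so the sketch runs without the SDK."""


def call_with_backoff(
    func: Callable[[], T],
    *,
    retry_on: type = DemoRateLimitError,
    max_attempts: int = 5,
    base_delay: float = 1.0,
) -> T:
    """Call func(), retrying with exponential backoff whenever retry_on is raised."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

For example: `job = call_with_backoff(lambda: client.transcribe_file("audio.wav"), retry_on=SonioxRateLimitError)`. Other exceptions, such as authentication failures, pass straight through, since retrying them is unlikely to help.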
import time
from soniox import SonioxClient
from soniox.exceptions import (
SonioxAPIError,
SonioxAuthenticationError,
SonioxRateLimitError,
)
client = SonioxClient()
try:
    # Submit transcription
    job = client.transcribe_file("audio.wav")
    # Wait for completion
    while True:
        job = client.get_transcription_job(job.id)
        if job.status == "completed":
            break
        elif job.status == "error":
            print(f"Transcription failed: {job.error_message}")
            break
        time.sleep(1)
    # Get result
    if job.status == "completed":
        result = client.get_transcription_result(job.id)
        print(result.text)
except FileNotFoundError:
    print("Audio file not found")
except SonioxAuthenticationError as e:
    print(f"Authentication failed: {e}")
except SonioxRateLimitError as e:
    print(f"Rate limit exceeded: {e}")
    print(f"Status code: {e.status_code}")
except SonioxAPIError as e:
    print(f"API error: {e}")
    print(f"Status code: {e.status_code}")
    print(f"Response: {e.response_body}")

Run tests with pytest:
# Install development dependencies
pip install -e ".[dev,test]"
# Run all tests
pytest
# Run with coverage
pytest --cov=src --cov-report=html --cov-report=term-missing
# Run specific test file
pytest tests/test_models.py
# Run with verbose output
pytest -v

See tests/README.md for more details on the test suite.
# Clone the repository
git clone https://github.com/mahdikiani/soniox-sdk.git
cd soniox-sdk
# Install in editable mode with dev dependencies
pip install -e ".[dev,test]"
# Run linter
ruff check src/
# Run type checker
mypy src/

This project is licensed under the MIT License - see the LICENSE.txt file for details.
- 📧 Email: mahdikiany@gmail.com
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
See CHANGELOG.md for version history and updates.
Made with ❤️ by Mahdi Kiani