A powerful CLI and GUI tool to export Salesforce SOQL query results to local CSV files using Bulk API v2.
Key Features:
- Uses Salesforce Bulk API v2 for efficient data extraction
- Supports large data volumes with automatic pagination
- Progress indicators for long-running queries
- Post-processing capability for data transformation
- User-friendly GUI for interactive use
- Preserves original data with timestamped exports
- Installation
- Configuration
- Usage
- Library Usage
- Post-processing
- Examples
- Troubleshooting
- Contributing
- License
Install from GitHub:
pip install git+https://github.com/datsom1/soql2csv.gitTo install the unofficial latest version (you probably don't need to do this):
pip install --upgrade --force-reinstall git+https://github.com/datsom1/soql2csv.gitFor development:
pip install -e "git+https://github.com/datsom1/soql2csv.git#egg=soql2csv[dev]"soql2csv requires Salesforce authentication credentials in a .env file.
Example .env file:
SF_USERNAME=your_salesforce_username
SF_PASSWORD=your_salesforce_password
SF_SECURITY_TOKEN=your_salesforce_security_token
SF_DOMAIN=login # or test for sandbox
SF_CLIENT_ID=your_salesforce_client_id
SF_CLIENT_SECRET=your_salesforce_client_secretBasic syntax:
soql2csv [OPTIONS] SOQLFILEGet help:
soql2csv --helpSOQLFILE: Path to the SOQL query file to execute
--outfilename TEXT: Base name for the output CSV file--outfilepath DIRECTORY: Directory where the output CSV will be saved
--envpath PATH: Path to the .env file (defaults to .env in current directory)--postprocess PATH: Path to a Python script for post-processing the exported CSV--gui: Launch the graphical user interface instead of the CLI--version: Show the version and exit-h, --help: Show help message and exit
soql2csv path/to/query.soql --outfilename=Extract_Table__c.csv --outfilepath=./outputLaunch the GUI with:
soql2csv --guiThe GUI provides fields for:
- SOQL File: Select your query file
- Output Folder: Where exported CSVs will be saved
- Env File: Path to your .env file with Salesforce credentials
- Output CSV Base Name: Base filename for the exported data
- Postprocess Script: Optional Python script for transforming the data
A console output area displays progress and results.
You can also call soql2csv directly from Python code without invoking the CLI.
Primary helper:
export_soql_to_csv(soql_path, output_base_name, output_dir='.', env_path='.env', postprocess_path=None, timestamp=True) -> str
Parameters:
| Parameter | Description |
|---|---|
soql_path |
Path to the .soql file containing your query. |
output_base_name |
Base name for the exported CSV (with or without .csv). |
output_dir |
Destination directory (created if it does not exist). Default: current directory. |
env_path |
Path to the .env file containing Salesforce credentials. Default: .env. |
postprocess_path |
Optional path to a Python script for post-processing. Receives temp_input and final_output CSV paths as argv[1], argv[2]. |
timestamp |
If True (default) prefixes filename like the CLI with a timestamp. Set to False to suppress. |
Returns: Absolute path to the final CSV file.
Example:
from soql2csv import export_soql_to_csv
csv_path = export_soql_to_csv(
soql_path="queries/Accounts.soql",
output_base_name="Accounts.csv",
output_dir="./exports",
env_path="./salesforce.env", # or leave as default '.env'
postprocess_path=None, # or a script like 'scripts/clean_accounts.py'
timestamp=True # set False to disable timestamp prefix
)
print("CSV exported to", csv_path)Minimal example (defaults to .env in cwd, current directory output, adds timestamp):
from soql2csv import export_soql_to_csv
export_soql_to_csv("query.soql", "MyData")With post-processing (script signature identical to CLI expectations):
from soql2csv import export_soql_to_csv
export_soql_to_csv(
soql_path="query.soql",
output_base_name="FilteredData",
postprocess_path="scripts/filter_script.py"
)Errors you might encounter:
| Exception | Cause |
|---|---|
FileNotFoundError |
The .soql, .env, or postprocess script path does not exist. |
EnvironmentError |
Missing required Salesforce environment variables. |
SalesforceAPIError |
Any Salesforce API interaction failed. |
The function internally mirrors CLI logic: loads env vars, executes Bulk API v2 query, waits for completion, downloads paginated results, optionally runs a postprocess script (on a temporary copy), and returns the final CSV path.
Post-processing scripts allow you to transform the exported data. The script automatically receives two arguments:
sys.argv[1]: Path to the input CSV file (exported from Salesforce)sys.argv[2]: Path to the output CSV file (to write processed data)
A template post-processing script is included in this repository as TEMPLATE_postprocess-example.py. You can use this as a starting point for your own custom logic.
Example post-processing script:
import sys
import csv
# Get input and output file paths from command-line arguments
input_csv = sys.argv[1] # Path to the input CSV file (exported from Salesforce)
output_csv = sys.argv[2] # Path to the output CSV file (to write processed data)
# ------------------------------------ BEGIN USER CUSTOMIZATION ------------------------------------
# Example: Keep only the first record for each Payment__c (most recent, since CSV is ordered DESC)
# Open the input CSV file for reading
with open(input_csv, newline='', encoding='utf-8') as infile:
reader = csv.DictReader(infile) # Read rows as dictionaries (column name -> value)
rows = list(reader) # Read all rows into a list
header = reader.fieldnames # Get the list of column names
# Define the key field to group by (change as needed for your use case)
key_field = 'Payment__c'
seen_keys = set() # Track which keys have already been processed
filtered_rows = [] # Store the filtered output rows
# Iterate through each row in the input CSV
for row in rows:
key = row.get(key_field) # Get the value for the key field
if not key:
continue # Skip rows without the key field
if key not in seen_keys:
filtered_rows.append(row) # Keep the first occurrence of each key
seen_keys.add(key) # Mark this key as seen
# Note: You can modify the filtering logic above to suit your specific requirements.
# ------------------------------------ END USER CUSTOMIZATION ------------------------------------
# Write the filtered rows to the output CSV file with the same formatting as Salesforce Bulk API v2
with open(output_csv, 'w', newline='', encoding='utf-8') as outfile:
writer = csv.DictWriter(
outfile,
fieldnames=header, # Use the same columns as the input
delimiter=',',
quotechar='"',
quoting=csv.QUOTE_ALL, # Quote all fields for compatibility
lineterminator='\r\n' # Use CRLF line endings (Windows-style)
)
writer.writeheader() # Write the header row
for row in filtered_rows:
writer.writerow(row) # Write each filtered rowRun with:
soql2csv query.soql --outfilename=Extract.csv --outfilepath=./output --postprocess=process_script.pyBasic extraction:
soql2csv queries/Accounts.soql --outfilename=Accounts.csv --outfilepath=./exportsUsing a custom environment file:
soql2csv queries/Contacts.soql --outfilename=Contacts.csv --outfilepath=./exports --envpath=./salesforce.envWith post-processing:
soql2csv queries/Opportunities.soql --outfilename=Opportunities.csv --outfilepath=./exports --postprocess=scripts/clean_opps.pyCommon issues:
- Authentication Failures: Verify your credentials in the .env file
- SOQL Syntax Errors: Check your query syntax
- Permission Errors: Ensure you have access to the queried objects in Salesforce
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

