Skip to content

datsom1/soql2csv

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

92 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

soql2csv

Changelog License

A powerful CLI and GUI tool to export Salesforce SOQL query results to local CSV files using Bulk API v2.

Key Features:

  • Uses Salesforce Bulk API v2 for efficient data extraction
  • Supports large data volumes with automatic pagination
  • Progress indicators for long-running queries
  • Post-processing capability for data transformation
  • User-friendly GUI for interactive use
  • Preserves original data with timestamped exports

Table of Contents

Installation

Install from GitHub:

pip install git+https://github.com/datsom1/soql2csv.git

To install the unofficial latest version (you probably don't need to do this):

pip install --upgrade --force-reinstall git+https://github.com/datsom1/soql2csv.git

For development:

pip install -e "git+https://github.com/datsom1/soql2csv.git#egg=soql2csv[dev]"

Configuration

soql2csv requires Salesforce authentication credentials in a .env file.

Example .env file:

SF_USERNAME=your_salesforce_username
SF_PASSWORD=your_salesforce_password
SF_SECURITY_TOKEN=your_salesforce_security_token
SF_DOMAIN=login  # or test for sandbox
SF_CLIENT_ID=your_salesforce_client_id
SF_CLIENT_SECRET=your_salesforce_client_secret

Usage

CLI Usage

Basic syntax:

soql2csv [OPTIONS] SOQLFILE

Get help:

soql2csv --help

Required Arguments

  • SOQLFILE: Path to the SOQL query file to execute

Required Options

  • --outfilename TEXT: Base name for the output CSV file
  • --outfilepath DIRECTORY: Directory where the output CSV will be saved

Optional Options

  • --envpath PATH: Path to the .env file (defaults to .env in current directory)
  • --postprocess PATH: Path to a Python script for post-processing the exported CSV
  • --gui: Launch the graphical user interface instead of the CLI
  • --version: Show the version and exit
  • -h, --help: Show help message and exit

Example

soql2csv path/to/query.soql --outfilename=Extract_Table__c.csv --outfilepath=./output

CLI Preview

soql2csv v2 2 CLI helptext

GUI Usage

Launch the GUI with:

soql2csv --gui

The GUI provides fields for:

  • SOQL File: Select your query file
  • Output Folder: Where exported CSVs will be saved
  • Env File: Path to your .env file with Salesforce credentials
  • Output CSV Base Name: Base filename for the exported data
  • Postprocess Script: Optional Python script for transforming the data

A console output area displays progress and results.

GUI Preview

soql2csv v2 2 GUI

Library Usage

You can also call soql2csv directly from Python code without invoking the CLI.

Primary helper:

export_soql_to_csv(soql_path, output_base_name, output_dir='.', env_path='.env', postprocess_path=None, timestamp=True) -> str

Parameters:

Parameter Description
soql_path Path to the .soql file containing your query.
output_base_name Base name for the exported CSV (with or without .csv).
output_dir Destination directory (created if it does not exist). Default: current directory.
env_path Path to the .env file containing Salesforce credentials. Default: .env.
postprocess_path Optional path to a Python script for post-processing. Receives temp_input and final_output CSV paths as argv[1], argv[2].
timestamp If True (default) prefixes filename like the CLI with a timestamp. Set to False to suppress.

Returns: Absolute path to the final CSV file.

Example:

from soql2csv import export_soql_to_csv

csv_path = export_soql_to_csv(
    soql_path="queries/Accounts.soql",
    output_base_name="Accounts.csv",
    output_dir="./exports",
    env_path="./salesforce.env",  # or leave as default '.env'
    postprocess_path=None,         # or a script like 'scripts/clean_accounts.py'
    timestamp=True                 # set False to disable timestamp prefix
)

print("CSV exported to", csv_path)

Minimal example (defaults to .env in cwd, current directory output, adds timestamp):

from soql2csv import export_soql_to_csv
export_soql_to_csv("query.soql", "MyData")

With post-processing (script signature identical to CLI expectations):

from soql2csv import export_soql_to_csv
export_soql_to_csv(
    soql_path="query.soql",
    output_base_name="FilteredData",
    postprocess_path="scripts/filter_script.py"
)

Errors you might encounter:

Exception Cause
FileNotFoundError The .soql, .env, or postprocess script path does not exist.
EnvironmentError Missing required Salesforce environment variables.
SalesforceAPIError Any Salesforce API interaction failed.

The function internally mirrors CLI logic: loads env vars, executes Bulk API v2 query, waits for completion, downloads paginated results, optionally runs a postprocess script (on a temporary copy), and returns the final CSV path.

Post-processing

Post-processing scripts allow you to transform the exported data. The script automatically receives two arguments:

  1. sys.argv[1]: Path to the input CSV file (exported from Salesforce)
  2. sys.argv[2]: Path to the output CSV file (to write processed data)

A template post-processing script is included in this repository as TEMPLATE_postprocess-example.py. You can use this as a starting point for your own custom logic.

Example post-processing script:

import sys
import csv

# Get input and output file paths from command-line arguments
input_csv = sys.argv[1]  # Path to the input CSV file (exported from Salesforce)
output_csv = sys.argv[2] # Path to the output CSV file (to write processed data)

# ------------------------------------ BEGIN USER CUSTOMIZATION ------------------------------------
# Example: Keep only the first record for each Payment__c (most recent, since CSV is ordered DESC)

# Open the input CSV file for reading
with open(input_csv, newline='', encoding='utf-8') as infile:
    reader = csv.DictReader(infile)  # Read rows as dictionaries (column name -> value)
    rows = list(reader)              # Read all rows into a list
    header = reader.fieldnames       # Get the list of column names

# Define the key field to group by (change as needed for your use case)
key_field = 'Payment__c'
seen_keys = set()        # Track which keys have already been processed
filtered_rows = []       # Store the filtered output rows

# Iterate through each row in the input CSV
for row in rows:
    key = row.get(key_field)  # Get the value for the key field
    if not key:
        continue              # Skip rows without the key field
    if key not in seen_keys:
        filtered_rows.append(row)  # Keep the first occurrence of each key
        seen_keys.add(key)         # Mark this key as seen

# Note: You can modify the filtering logic above to suit your specific requirements.
# ------------------------------------ END USER CUSTOMIZATION ------------------------------------

# Write the filtered rows to the output CSV file with the same formatting as Salesforce Bulk API v2
with open(output_csv, 'w', newline='', encoding='utf-8') as outfile:
    writer = csv.DictWriter(
        outfile,
        fieldnames=header,         # Use the same columns as the input
        delimiter=',',
        quotechar='"',
        quoting=csv.QUOTE_ALL,     # Quote all fields for compatibility
        lineterminator='\r\n'      # Use CRLF line endings (Windows-style)
    )
    writer.writeheader()           # Write the header row
    for row in filtered_rows:
        writer.writerow(row)       # Write each filtered row

Run with:

soql2csv query.soql --outfilename=Extract.csv --outfilepath=./output --postprocess=process_script.py

Examples

Basic extraction:

soql2csv queries/Accounts.soql --outfilename=Accounts.csv --outfilepath=./exports

Using a custom environment file:

soql2csv queries/Contacts.soql --outfilename=Contacts.csv --outfilepath=./exports --envpath=./salesforce.env

With post-processing:

soql2csv queries/Opportunities.soql --outfilename=Opportunities.csv --outfilepath=./exports --postprocess=scripts/clean_opps.py

Troubleshooting

Common issues:

  1. Authentication Failures: Verify your credentials in the .env file
  2. SOQL Syntax Errors: Check your query syntax
  3. Permission Errors: Ensure you have access to the queried objects in Salesforce

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

About

Cli tool to export Salesforce SOQL query results to local CSV using Bulk API v2.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages