Skip to content

Hogwai/spring-boot-postgresql-copy

Repository files navigation

Using the PostgreSQL COPY command with Spring Boot

This repository demonstrates how to leverage the PostgreSQL COPY command for high-performance bulk inserts in a Spring Boot application.

Two approaches are compared against standard JPA saveAll():

Strategy 10 000 rows Speedup
JPA saveAll() ~25s baseline
CopyManager (custom) ~1.1s x22
pgbulkinsert (library) ~1.6s x15

Getting started

Prerequisites

  • Java 25+
  • Docker

Run

docker compose up -d
./gradlew bootRun

API

GET    /customers              # list all customers
POST   /customers/copy?customerNumber=10000   # insert via CopyManager
POST   /customers/bulk?customerNumber=10000   # insert via pgbulkinsert

Both POST endpoints return an InsertResult with the row count and elapsed time:

{
  "rowCount": 10000,
  "elapsedSeconds": 1.153
}

Tests

docker compose up -d
./gradlew test

Unit tests run without a database. Integration benchmarks (InsertBenchmarkIT) require the PostgreSQL container.


Approach 1 — Custom CopyManager implementation

Architecture

The custom implementation is built around four components:

CopyMapper<T> — defines the mapping contract for any entity:

public interface CopyMapper<T> {
    String[] toCopyRow(T entity);

    String tableName();

    List<String> columns();
}

CopyOptions — configurable COPY options (format, delimiter, null string, batch size):

CopyOptions options = CopyOptions.builder()
                                 .format(CopyOptions.Format.CSV)
                                 .delimiter(",")
                                 .batchSize(5000)
                                 .build();

CopyInputStream<T> — streams rows on-the-fly without buffering the entire dataset in memory. Only the current row is held in memory at any time.

CopyUtils — orchestrates the COPY operation:

@Component
public class CopyUtils {

    public <T> void insertWithCopy(CopyMapper<T> mapper, List<T> entities) { ...}

    public <T> void insertWithCopy(CopyMapper<T> mapper, List<T> entities, CopyOptions options) { ...}
}

Key design decisions

  • Spring transaction participation — connections are obtained via DataSourceUtils.getConnection() so the COPY operation participates in the current @Transactional context
  • StreamingCopyInputStream feeds rows to the CopyManager one by one, avoiding loading the entire payload into memory
  • Batching — when batchSize > 0, entities are split into chunks and each chunk is sent in a separate COPY call
  • SQL injection prevention — table and column names are quoted with " identifiers
  • EscapingescapeTextValue() handles \, \t, \n, \r, \b, \f and null; escapeCsvValue() handles CSV double-quote escaping

Usage example

private static final CopyMapper<Customer> CUSTOMER_MAPPER = new CopyMapper<>() {
    @Override
    public String[] toCopyRow(Customer customer) {
        return new String[]{
                String.valueOf(customer.getId()),
                CopyUtils.escapeTextValue(customer.getFirstName()),
                CopyUtils.escapeTextValue(customer.getLastName()),
                customer.getCreationDate().toString()
        };
    }

    @Override
    public String tableName() {
        return "customer";
    }

    @Override
    public List<String> columns() {
        return List.of("id", "first_name", "last_name", "creation_date");
    }
};

// Simple call with default options (TEXT format, tab delimiter)
copyUtils.

insertWithCopy(CUSTOMER_MAPPER, customers);

// Or with custom options
CopyOptions options = CopyOptions.builder()
                                 .format(CopyOptions.Format.CSV)
                                 .delimiter(",")
                                 .batchSize(5000)
                                 .build();
copyUtils.

insertWithCopy(CUSTOMER_MAPPER, customers, options);

Approach 2 — PgBulkInsert library

A higher-level alternative using the pgbulkinsert library. Define a mapping:

public class CustomerMapping extends AbstractMapping<Customer> {
    public CustomerMapping() {
        super("public", "customer");
        mapLong("id", Customer::getId);
        mapText("first_name", Customer::getFirstName);
        mapText("last_name", Customer::getLastName);
        mapText("country", Customer::getCountry);
        mapText("address", Customer::getAddress);
        mapText("city", Customer::getCity);
        mapTimeStamp("creation_date", Customer::getCreationDate);
        mapTimeStamp("update_date", Customer::getUpdateDate);
    }
}

Then use it:

PgBulkInsert<Customer> bulkInsert = new PgBulkInsert<>(new CustomerMapping());

try(
Connection conn = dataSource.getConnection()){
        bulkInsert.

saveAll(PostgreSqlUtils.getPGConnection(conn),customers);
        }

Project structure

src/main/java/.../
├── controller/
│   └── CustomerController.java          # REST endpoints
├── model/
│   ├── Customer.java                    # JPA entity
│   └── CustomerMapping.java             # pgbulkinsert mapping
├── repository/
│   ├── CustomerRepository.java          # JPA + custom repos
│   ├── CustomerCopyRepository.java      # CopyManager interface
│   ├── PgbulkinsertCopyRepository.java  # pgbulkinsert interface
│   ├── impl/
│   │   ├── CustomerCopyRepositoryImpl.java
│   │   └── PgbulkinsertCopyRepositoryImpl.java
│   └── util/
│       ├── CopyUtils.java               # COPY orchestration
│       ├── CopyMapper.java              # Entity-to-row contract
│       ├── CopyOptions.java             # COPY configuration
│       ├── CopyInputStream.java         # Streaming InputStream
│       └── CopyInsertException.java     # Dedicated exception
├── service/
│   ├── CustomerService.java
│   ├── InsertResult.java                # Response DTO
│   └── impl/
│       └── CustomerServiceImpl.java
└── util/
    ├── CustomerFactory.java             # Test data generator
    └── StringUtil.java

About

Several ways to leverage the COPY command of PostgreSQL

Topics

Resources

License

Stars

Watchers

Forks

Contributors

Languages