
fix: Reconnect on a closed connection when checked out #304

Open

jrogov wants to merge 1 commit into plausible:master from adjust:fix/reconnect-on-closed-checkout-conn

Conversation

jrogov commented Apr 10, 2026

(In collaboration with @trollfred)

Hiya 👋

We've been experimenting with checkout a bit more, and stumbled upon a rather inconvenient behaviour of Ch when a connection has been checked out of an ecto_ch repo.

TL;DR: due to ClickHouse's keep_alive_timeout (defaults to 10 seconds), when a connection is checked out and there's a pause between DB operations, the connection gets closed by the server, and instead of reconnecting, Ch just fails.
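For reference, the server-side timeout in play here is a ClickHouse server setting in config.xml; it can be raised if a longer idle window is needed (60 is an arbitrary example value, not a recommendation):

```xml
<!-- config.xml (ClickHouse server): HTTP keep-alive timeout in seconds,
     default is 10 -->
<keep_alive_timeout>60</keep_alive_timeout>
```

Raising it only widens the window, though; it doesn't remove the failure mode this PR addresses.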

This PR provides a simple single reconnect in such cases.

Steps to reproduce:

Mix.install([
  {:ecto_sql, "~> 3.13.0"},
  {:ch, path: Path.expand("~/code/ch"), override: true},
  {:ecto_ch, "~> 0.8.6"}
])

defmodule ChRepo do
  use Ecto.Repo,
    otp_app: :demo,
    adapter: Ecto.Adapters.ClickHouse
end

{:ok, _} =
  ChRepo.start_link(
    hostname: "localhost",
    port: 8123,
    database: "default",
    # Ensure that nothing else is interfering
    timeout: 60_000
  )

ChRepo.checkout(
  fn ->
    ChRepo.query!("SELECT 'before'")

    # Anything >10s
    :timer.sleep(15_000)

    ChRepo.query!("SELECT 'after'")
  end,
  timeout: :infinity
)

Result:

17:42:51.521 [debug] QUERY OK db=4.0ms
SELECT 'before' []

17:44:51.523 [debug] QUERY ERROR db=0.2ms
SELECT 'after' []

17:44:51.528 [error] Ch.Connection (#PID<0.283.0> ("db_conn_8")) disconnected: ** (Mint.TransportError) socket closed
** (Mint.TransportError) socket closed
    (ecto_sql 3.13.5) lib/ecto/adapters/sql.ex:1113: Ecto.Adapters.SQL.raise_sql_call_error/1
    iex:17: (file)
    iex:9: (file)

Comment on lines 386 to +391

  def request_chunked(conn, method, path, headers, stream, opts) do
-   with {:ok, conn, ref} <- send_request(conn, method, path, headers, :stream),
-        {:ok, conn} <- stream_body(conn, ref, stream),
-        do: receive_full_response(conn, timeout(conn, opts))
+   with_retry_if_stale_connection(conn, fn conn ->
+     with {:ok, conn, ref} <- send_request(conn, method, path, headers, :stream),
+          {:ok, conn} <- stream_body(conn, ref, stream),
+          do: receive_full_response(conn, timeout(conn, opts))
+   end)
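A minimal sketch of the single-reconnect idea behind the diff above (the module, helper name, and error shape here are assumptions for illustration, not the PR's actual code): run the request against the connection, and if it fails because the server already closed the socket, reconnect once and retry the same request.

```elixir
defmodule RetrySketch do
  # `reconnect` is injected here so the control flow can be exercised
  # without a real socket; in Ch it would re-open the HTTP connection.
  def with_retry_if_stale_connection(conn, reconnect, fun) do
    case fun.(conn) do
      # Retry only on a closed socket, and only once, to avoid retry loops.
      {:error, :closed} ->
        with {:ok, conn} <- reconnect.(conn), do: fun.(conn)

      other ->
        other
    end
  end
end
```

As discussed below the diff, this is only safe for requests that can be replayed from the start, which is exactly why applying it to request_chunked (a streamed body) is questionable.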
jrogov (Author) commented
I'm pretty sure we should not do this in request_chunked when stream-writing the data, as a retry could write duplicated data. Just wanted to check whether that's correct.

If that's so, let me know and I'll remove this part.

ruslandoga (Collaborator) commented Apr 10, 2026

> as it would mean that we could write duplicated data

I think since it's an ongoing HTTP request, if the connection gets closed for it, ClickHouse would drop all received data, at least for non-async inserts. But I haven't actually checked that. So it might be that we don't gain anything from reconnecting since we need to start sending the stream anew anyway.

ruslandoga (Collaborator) commented Apr 10, 2026

Also, request_chunked should probably be removed (I'll open a PR); I added it because I didn't know at the time about the ability to do Stream.into/2 with DBConnection.

DBConnection.run(ch, fn conn ->
  File.stream!("demo.csv", 512_000)
  |> Stream.into(Ch.stream(conn, "insert into demo format CSV"))
  |> Stream.run()
end)

But the issue would remain: do we reconnect in those scenarios or not?

ruslandoga (Collaborator) commented

👋 @jrogov

I wonder, what's your use-case? If something is pausing for that long, why not do two separate checkouts?
