Description:
Hi, thanks for the great work on this project! I've been using it and encountered a data loss issue in KtlsRecv when the peer abruptly closes the connection. I have investigated the issue and made a local patch that seems to fix it. I'd like to discuss whether this is the best approach to merge upstream.
The Problem:
During TLS communication, if the underlying network layer (TDI) experiences a sudden disconnect or RST, mbedtls_ssl_read returns MBEDTLS_ERR_SSL_CONN_EOF.
In the original implementation of KtlsRecv, receiving this error code immediately returns STATUS_CONNECTION_DISCONNECTED. However, mbedTLS might have already decrypted some data that is currently sitting in its internal buffer. Returning immediately causes this pending decrypted data to be permanently lost.
Root Cause:
Because of the nature of the TDI layer and network buffering, the socket might report an EOF/disconnect to mbedTLS, prompting it to return MBEDTLS_ERR_SSL_CONN_EOF. But mbedTLS does not flush its internal available bytes automatically on EOF.
Proposed Solution:
I modified KtlsRecv to check for available bytes using mbedtls_ssl_get_bytes_avail before dropping the connection. If MBEDTLS_ERR_SSL_CONN_EOF is returned but there are still bytes left, we should force a read to salvage the remaining data before returning the disconnected status.
Additionally, I improved the Plain TCP path to properly handle STATUS_CONNECTION_RESET.
Here is the core logic of my patch for the TLS part:
if (ret == MBEDTLS_ERR_SSL_CONN_EOF) {
// Check if there is still decrypted data pending in mbedTLS internal buffers
size_t available = mbedtls_ssl_get_bytes_avail(&Session->ssl);
if (available > 0) {
KTLS_LOG_WARN("Forcing read of remaining %zu bytes despite EOF", available);
// Salvage the remaining data
ret = mbedtls_ssl_read(&Session->ssl, (unsigned char*)Buffer, min(BufferSize, available));
if (ret > 0) {
*BytesReceived = (ULONG)ret;
return STATUS_SUCCESS;
}
}
KTLS_LOG_WARN("TLS Connection reset/closed unexpectedly (EOF).");
return STATUS_CONNECTION_DISCONNECTED;
}
Questions for Discussion:
-
Does this approach of using mbedtls_ssl_get_bytes_avail align with your design for KtlsRecv?
-
Is there any edge case in the TDI wrapper that might cause mbedtls_ssl_read to hang if we force read after an EOF? (In my tests, it works perfectly and saves the dropped packets).
3.I also added explicit handling for STATUS_CONNECTION_RESET in the Plain TCP section. Would you like me to submit a PR including both fixes?
Looking forward to your thoughts!
Description:
Hi, thanks for the great work on this project! I've been using it and encountered a data loss issue in KtlsRecv when the peer abruptly closes the connection. I have investigated the issue and made a local patch that seems to fix it. I'd like to discuss whether this is the best approach to merge upstream.
The Problem:
During TLS communication, if the underlying network layer (TDI) experiences a sudden disconnect or RST, mbedtls_ssl_read returns MBEDTLS_ERR_SSL_CONN_EOF.
In the original implementation of KtlsRecv, receiving this error code immediately returns STATUS_CONNECTION_DISCONNECTED. However, mbedTLS might have already decrypted some data that is currently sitting in its internal buffer. Returning immediately causes this pending decrypted data to be permanently lost.
Root Cause:
Because of the nature of the TDI layer and network buffering, the socket might report an EOF/disconnect to mbedTLS, prompting it to return MBEDTLS_ERR_SSL_CONN_EOF. But mbedTLS does not flush its internal available bytes automatically on EOF.
Proposed Solution:
I modified KtlsRecv to check for available bytes using mbedtls_ssl_get_bytes_avail before dropping the connection. If MBEDTLS_ERR_SSL_CONN_EOF is returned but there are still bytes left, we should force a read to salvage the remaining data before returning the disconnected status.
Additionally, I improved the Plain TCP path to properly handle STATUS_CONNECTION_RESET.
Here is the core logic of my patch for the TLS part:
Questions for Discussion:
Does this approach of using mbedtls_ssl_get_bytes_avail align with your design for KtlsRecv?
Is there any edge case in the TDI wrapper that might cause mbedtls_ssl_read to hang if we force read after an EOF? (In my tests, it works perfectly and saves the dropped packets).
3.I also added explicit handling for STATUS_CONNECTION_RESET in the Plain TCP section. Would you like me to submit a PR including both fixes?
Looking forward to your thoughts!