Skip to content

[ENH]: Added functionality to use an in-memory intermediary to load#35

Merged
tanujnay112 merged 1 commit into
masterfrom
add_memory_bindings
Aug 26, 2025
Merged

[ENH]: Added functionality to use an in-memory intermediary to load#35
tanujnay112 merged 1 commit into
masterfrom
add_memory_bindings

Conversation

@tanujnay112
Copy link
Copy Markdown

No description provided.

Copy link
Copy Markdown
Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@tanujnay112 tanujnay112 force-pushed the add_memory_bindings branch 2 times, most recently from da6fa1c to 736bfd1 Compare August 12, 2025 17:30
@tanujnay112 tanujnay112 changed the title [ENH]: Added functionality to use a in-memory intermediary to load [ENH]: Added functionality to use an in-memory intermediary to load Aug 12, 2025
@tanujnay112 tanujnay112 marked this pull request as ready for review August 12, 2025 18:56
@tanujnay112 tanujnay112 requested a review from HammadB August 12, 2025 19:05
@tanujnay112 tanujnay112 force-pushed the add_memory_bindings branch 2 times, most recently from 77bc087 to 06f82f7 Compare August 13, 2025 16:42
Comment thread CMakeLists.txt
Comment thread src/hnsw.rs Outdated
#[repr(C)]
pub struct HnswDataViewFFI {
_data: [u8; 0],
_marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intentionally mut?

Comment thread src/hnsw.rs Outdated
// Create HnswData FFI structure and set the buffers
let ffi_ptr = unsafe { create_hnsw_data_view(header_buffer.as_ptr(), header_buffer.len(), data_level0_buffer.as_ptr(), data_level0_buffer.len(), length_buffer.as_ptr(), length_buffer.len(), link_list_buffer.as_ptr(), link_list_buffer.len()) };
if ffi_ptr.is_null() {
return Err(HnswError::FFIException("Failed to create HnswData FFI structure".to_string()));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've generally isolated the FFI exception to exceptions thrown in C++, that makes it clear to debug where the error is. Could we make this a custom error?

Comment thread src/hnsw.rs Outdated

// Opaque struct for memory buffers
#[repr(C)]
pub struct HnswDataFFI {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming nit - DataFFI vs HnswDataFFI, I know its HNSW

Comment thread hnswlib/hnswalg.h
Comment thread hnswlib/hnswalg.h
Comment thread src/bindings.cpp Outdated
// Memory buffer management functions
hnswlib::HnswDataMut* create_mutable_hnsw_data(char* header_buffer, size_t header_size, char* data_level0_buffer, size_t data_level0_size, char* length_buffer, size_t length_size, char* link_list_buffer, size_t link_list_size)
{
return new hnswlib::HnswDataMut(header_buffer, header_size, data_level0_buffer, data_level0_size, length_buffer, length_size, link_list_buffer, link_list_size);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we error handle here just defensively?

Comment thread src/bindings.cpp Outdated

const hnswlib::HnswDataView* hnsw_data_to_view(hnswlib::HnswDataMut* hnsw_data)
{
return hnsw_data->getView();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we error handle here just defensively?

Comment thread src/bindings.cpp Outdated

hnswlib::HnswDataView* create_hnsw_data_view(char* header_buffer, size_t header_size, char* data_level0_buffer, size_t data_level0_size, char* length_buffer, size_t length_size, char* link_list_buffer, size_t link_list_size)
{
return new hnswlib::HnswDataView(header_buffer, header_size, data_level0_buffer, data_level0_size, length_buffer, length_size, link_list_buffer, link_list_size);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we error handle here just defensively?

Copy link
Copy Markdown

@HammadB HammadB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Some of the size calcs seem descrepant (void * vs unsigned int)
  • Reallocs need to happen the same way as the malloc changes
  • The link list malloc is probably not OK we should alloc all memory on one side of the FFI boundary, ideally rust

Comment thread hnswlib/hnswalg.h
mutable std::atomic<size_t> cur_element_count{0}; // current number of elements
size_t size_data_per_element_{0};
size_t size_links_per_element_{0};
unsigned int size_links_per_element_{0};
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've been serializing and reading this as unsigned int so just making it consistent here.

@tanujnay112 tanujnay112 requested a review from HammadB August 22, 2025 18:59
Comment thread src/hnsw.rs Outdated
pub dimensionality: i32,
pub persist_path: PathBuf,
pub ef_search: usize,
pub hnsw_data: HnswData,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this shouldn't be in the config, its not really config

@tanujnay112 tanujnay112 merged commit 20f8c61 into master Aug 26, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants