83 changes: 83 additions & 0 deletions Documentation/technical/native-odb-api.txt
@@ -0,0 +1,83 @@
Native ODB API overview
=========================

Git's native object database (ODB) exposes `struct object_database` and
`struct odb_source` as the central data structures for working with local and
alternate object stores.【F:odb.h†L102-L160】 The API provides helpers to create a
database (`odb_new()`), attach paths as object sources, and read or write
objects through functions such as `odb_write_object_ext()` that operate on the
local repository's primary object directory.【F:odb.h†L169-L477】

A consumer that wants to experiment with custom storage can allocate its own ODB
using `odb_new()`, populate `struct odb_source` entries, and reuse Git's object
hashing helpers (for example `hash_object_file()`) to stay compatible with Git's
loose-object format.【F:object-file.h†L1-L126】【F:odb.c†L983-L1007】

Simple ODB example helper
-------------------------

The `test-tool simple-odb` helper demonstrates a minimal object database that
stores entries using Git's loose-object layout. It hashes payloads using the
repository's hash algorithm, compresses the `"<type> <size>\0<data>"` payload,
and writes the result underneath an `objects` directory it maintains on disk.
The helper exposes commands to initialise the store, append new objects, and
record the resulting directory as an alternate for the current repository so
that Git can discover the objects without further patches. It also provides a
`lop-write` command that dispatches blobs according to a size threshold: small
payloads are written to the repository's primary object store while larger
payloads are diverted into the simple store and automatically added as an
alternate.【F:simple-odb.c†L1-L158】【F:t/helper/test-simple-odb.c†L1-L157】
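The loose-object layout described above can be sketched in isolation. The following is a minimal illustration, not the helper's actual code: `loose_header()` and `loose_path()` are hypothetical names, and the sketch assumes the object ID is already available as a hex string. It covers only the `"<type> <size>\0"` header and the two-character fan-out directory, leaving hashing and zlib compression out.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the loose-object header "<type> <size>\0"
 * into buf. Returns the header length including the trailing NUL, or 0
 * if buf is too small.
 */
static size_t loose_header(char *buf, size_t buflen,
			   const char *type, size_t size)
{
	int n = snprintf(buf, buflen, "%s %zu", type, size);

	if (n < 0 || (size_t)n >= buflen)
		return 0;
	return (size_t)n + 1; /* count the NUL that separates header and data */
}

/*
 * Hypothetical helper: build "<objects_dir>/<xx>/<rest>" where <xx> is
 * the first two hex digits of the object ID. Returns 0 on success and
 * -1 if the hex string is too short or buf is too small.
 */
static int loose_path(char *buf, size_t buflen,
		      const char *objects_dir, const char *hex)
{
	int n;

	if (strlen(hex) < 3)
		return -1;
	n = snprintf(buf, buflen, "%s/%.2s/%s", objects_dir, hex, hex + 2);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The object ID is computed over the header followed by the data, and the same byte sequence is what gets deflated and written to the resulting path.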

The accompanying regression test (`t/t0039-simple-odb.sh`) uses the helper to
write and read blobs, verifying that repositories can read from the alternate
store via `git cat-file` once the helper has created the loose object and added
its path to `objects/info/alternates`.【F:t/t0039-simple-odb.sh†L1-L43】

Activating the helper via alternates
------------------------------------

The helper relies entirely on Git's existing alternates mechanism: adding the
simple store's `objects` directory to `objects/info/alternates` (or setting
`GIT_ALTERNATE_OBJECT_DIRECTORIES`) is enough for Git to consult it during
object lookups. No changes to `odb.c` are required because the helper writes
objects in the same layout as a regular loose object directory. The test suite
demonstrates this by initialising a repository, adding the helper-managed
directory as an alternate, and reading objects through `git cat-file` without
any additional plumbing.【F:t/helper/test-simple-odb.c†L64-L89】【F:t/t0039-simple-odb.sh†L8-L43】
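Recording an alternate amounts to appending one path per line to `objects/info/alternates`. The sketch below shows one way to do that idempotently with plain stdio. The function name `add_alternate()` is hypothetical and this is not the helper's implementation; it assumes lines shorter than the read buffer and does not handle comment lines or locking, which real code would need to consider.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append objects_dir to the alternates file unless
 * an identical line is already present. Returns 0 when the path is
 * listed afterwards, -1 on I/O error.
 */
static int add_alternate(const char *alternates_file, const char *objects_dir)
{
	char line[4096];
	size_t want = strlen(objects_dir);
	int missing_nl = 0;
	FILE *fp = fopen(alternates_file, "a+"); /* creates the file if absent */

	if (!fp)
		return -1;
	rewind(fp);
	while (fgets(line, sizeof(line), fp)) {
		size_t n = strcspn(line, "\r\n");

		/* remember whether the last line read lacked a newline */
		missing_nl = (line[n] == '\0');
		if (n == want && !strncmp(line, objects_dir, n)) {
			fclose(fp);
			return 0; /* already recorded */
		}
	}
	/* reposition before switching from reading to writing */
	if (fseek(fp, 0, SEEK_END) ||
	    fprintf(fp, "%s%s\n", missing_nl ? "\n" : "", objects_dir) < 0) {
		fclose(fp);
		return -1;
	}
	return fclose(fp) ? -1 : 0;
}
```

Calling the function twice with the same directory leaves a single entry, so a caller can invoke it after every write without growing the file.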

Large Object Promisor experiments
---------------------------------

The Large Object Promisor (LOP) design aims to keep very large blobs on
dedicated promisor remotes while the primary remote serves the remainder of the
repository.【F:Documentation/technical/large-object-promisors.adoc†L15-L114】 The
`lop-write` helper command mirrors that split locally: a caller passes the
simple store path and a `blob:limit`-style threshold, and the helper stores
objects larger than the limit in the alternate while leaving smaller objects in
the main store. Because the alternate is recorded automatically, subsequent
commands like `git cat-file` can resolve those large blobs transparently, which
makes it easy to prototype LOP-aware workflows without modifying Git's core
ODB routines.【F:t/helper/test-simple-odb.c†L90-L157】【F:t/t0039-simple-odb.sh†L45-L80】
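The size-based dispatch can be illustrated with a small standalone sketch. The names `parse_limit()` and `lop_choose_dest()` are hypothetical and not taken from the helper. The sketch assumes the threshold accepts optional `k`, `m`, or `g` suffixes and that only payloads strictly larger than the limit are diverted, following the wording above. Git's own `blob:limit` filter omits blobs of at least the given size, so the behaviour at exactly the limit may differ in the real helper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum lop_dest { LOP_DEST_PRIMARY, LOP_DEST_ALTERNATE };

/*
 * Hypothetical helper: parse "<n>[kmg]" into a byte count.
 * Returns 0 on success, -1 on malformed input or overflow.
 */
static int parse_limit(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;
	uint64_t mult = 1;

	if (!s || *s < '0' || *s > '9')
		return -1;
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno)
		return -1;
	switch (*end) {
	case 'k': case 'K': mult = 1024; end++; break;
	case 'm': case 'M': mult = 1024 * 1024; end++; break;
	case 'g': case 'G': mult = 1024ULL * 1024 * 1024; end++; break;
	}
	if (*end || v > UINT64_MAX / mult)
		return -1;
	*out = v * mult;
	return 0;
}

/* Assumption: only payloads strictly larger than the limit are diverted. */
static enum lop_dest lop_choose_dest(uint64_t len, uint64_t limit)
{
	return len > limit ? LOP_DEST_ALTERNATE : LOP_DEST_PRIMARY;
}
```

A caller would parse the threshold once, then pick the destination store for each payload before writing it.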

Comparison with other ODB APIs
------------------------------

* **libgit2** exposes a `git_odb` type with pluggable backends. Implementers
fill in a `git_odb_backend` structure of function pointers (`read`, `write`,
iteration, and so on) and attach it to an ODB instance with
`git_odb_add_backend()` and a priority, so the integration is centered around
a prioritized list of backends per ODB instead of Git's notion of one primary
source plus a linked list of alternates.
* **gitoxide** (gix) models its ODB as a layered `gix_odb::Store`, combining
a cache and multiple stores selected via configuration. Custom stores
implement the `Store` trait and are typically used by wiring them into a
`Repository` configuration object.

Key compatibility considerations
--------------------------------

Git's native API must preserve backwards compatibility with repositories that
may be accessed by older clients, so helpers should reuse Git's hashing
functions and object formats rather than inventing new on-disk layouts. The
example therefore hashes payloads with the repository's configured algorithm and
reuses the conventional loose-object directory structure, letting existing
clients take advantage of the alternate without special knowledge.【F:simple-odb.c†L63-L135】
2 changes: 2 additions & 0 deletions Makefile
@@ -854,6 +854,7 @@ TEST_BUILTINS_OBJS += test-sha1.o
TEST_BUILTINS_OBJS += test-sha256.o
TEST_BUILTINS_OBJS += test-sigchain.o
TEST_BUILTINS_OBJS += test-simple-ipc.o
TEST_BUILTINS_OBJS += test-simple-odb.o
TEST_BUILTINS_OBJS += test-string-list.o
TEST_BUILTINS_OBJS += test-submodule-config.o
TEST_BUILTINS_OBJS += test-submodule-nested-repo-config.o
@@ -1265,6 +1266,7 @@ LIB_OBJS += setup.o
LIB_OBJS += shallow.o
LIB_OBJS += sideband.o
LIB_OBJS += sigchain.o
LIB_OBJS += simple-odb.o
LIB_OBJS += sparse-index.o
LIB_OBJS += split-index.o
LIB_OBJS += stable-qsort.o
195 changes: 195 additions & 0 deletions simple-odb.c
@@ -0,0 +1,195 @@
#define USE_THE_REPOSITORY_VARIABLE

#include "git-compat-util.h"
#include "abspath.h"
#include "environment.h"
#include "dir.h"
#include "hash.h"
#include "hex.h"
#include "path.h"
#include "repository.h"
#include "simple-odb.h"
#include "wrapper.h"
#include "git-zlib.h"

static int make_dir(const char *path)
{
	char *dup;

	if (!path || !*path)
		return error("simple-odb: empty path");

	dup = xstrdup(path);
	if (safe_create_leading_directories_no_share(dup) < 0) {
		int save_errno = errno;
		free(dup);
		errno = save_errno;
		return error_errno("unable to create directories for '%s'", path);
	}
	free(dup);

	if (mkdir(path, 0777) && errno != EEXIST)
		return error_errno("unable to create '%s'", path);

	return 0;
}

void simple_odb_init(struct simple_odb *odb)
{
	strbuf_init(&odb->root, 0);
	strbuf_init(&odb->objects_dir, 0);
}

void simple_odb_release(struct simple_odb *odb)
{
	strbuf_release(&odb->root);
	strbuf_release(&odb->objects_dir);
}

int simple_odb_prepare(struct simple_odb *odb, const char *path)
{
	struct strbuf real = STRBUF_INIT;
	struct strbuf tmp = STRBUF_INIT;
	int ret = -1;

	if (!path || !*path)
		return error("simple-odb: missing object directory path");

	strbuf_addstr(&tmp, path);
	if (make_dir(tmp.buf))
		goto out;

	if (!strbuf_realpath(&real, tmp.buf, 1)) {
		error_errno("simple-odb: unable to canonicalize '%s'", tmp.buf);
		goto out;
	}

	strbuf_addf(&odb->objects_dir, "%s/objects", real.buf);
	if (make_dir(odb->objects_dir.buf))
		goto out;
	if (make_dir(mkpath("%s/info", odb->objects_dir.buf)))
		goto out;
	if (make_dir(mkpath("%s/pack", odb->objects_dir.buf)))
		goto out;

	strbuf_swap(&odb->root, &real);
	ret = 0;
out:
	strbuf_release(&real);
	strbuf_release(&tmp);
	if (ret)
		simple_odb_release(odb);
	return ret;
}

int simple_odb_store_buffer(struct simple_odb *odb,
			    enum object_type type,
			    const void *data,
			    size_t len,
			    struct object_id *oid)
{
	struct git_hash_ctx ctx;
	struct strbuf dir = STRBUF_INIT;
	struct strbuf path = STRBUF_INIT;
	struct strbuf tmp = STRBUF_INIT;
	struct strbuf header = STRBUF_INIT;
	struct git_zstream stream;
	unsigned long maxsize;
	size_t header_len;
	size_t total_len;
	size_t compressed_len;
	int fd = -1;
	int ret = -1;
	unsigned char *payload = NULL;
	unsigned char *compressed = NULL;
	const char *type_name_str = type_name(type);
	const struct git_hash_algo *algo;

	if (!odb->objects_dir.len)
		return error("simple-odb: object directory not initialized");
	if (!type_name_str)
		return error("simple-odb: invalid object type");

	strbuf_addf(&header, "%s %"PRIuMAX, type_name_str, (uintmax_t)len);
	header_len = header.len + 1;

	total_len = header_len + len;
	payload = xmalloc(total_len);
	memcpy(payload, header.buf, header_len);
	if (len)
		memcpy(payload + header_len, data, len);

	algo = the_repository ? the_repository->hash_algo
			      : &hash_algos[GIT_HASH_SHA1_LEGACY];

	oid_set_algo(oid, algo);

	algo->init_fn(&ctx);
	algo->update_fn(&ctx, payload, total_len);
	algo->final_oid_fn(oid, &ctx);

	git_deflate_init(&stream, zlib_compression_level);
	maxsize = git_deflate_bound(&stream, total_len);
	compressed = xmalloc(maxsize);
	stream.next_in = payload;
	stream.avail_in = total_len;
	stream.next_out = compressed;
	stream.avail_out = maxsize;
	if (git_deflate(&stream, Z_FINISH) != Z_STREAM_END) {
		error("simple-odb: unable to compress object");
		git_deflate_abort(&stream);
		goto out;
	}
	compressed_len = maxsize - stream.avail_out;
	git_deflate_end_gently(&stream);

	/*
	 * Fan out on the first two hex digits, as regular loose object
	 * directories do. oid_to_hex() is called inline rather than through
	 * a local declared mid-function, which -Wdeclaration-after-statement
	 * rejects.
	 */
	strbuf_addf(&dir, "%s/%2.2s", odb->objects_dir.buf, oid_to_hex(oid));
	if (make_dir(dir.buf))
		goto out;

	strbuf_addf(&path, "%s/%s", dir.buf, oid_to_hex(oid) + 2);
	if (!access(path.buf, F_OK)) {
		/* The object already exists; nothing to write. */
		ret = 0;
		goto out;
	}

	strbuf_addf(&tmp, "%s/.tmp_simple_XXXXXX", odb->objects_dir.buf);
	fd = xmkstemp_mode(tmp.buf, 0444);
	if (fd < 0) {
		error_errno("simple-odb: unable to create temporary file");
		goto out;
	}
	if (write_in_full(fd, compressed, compressed_len) < 0) {
		error_errno("simple-odb: unable to write object data");
		goto out;
	}
	if (close(fd) < 0) {
		error_errno("simple-odb: unable to close object file");
		fd = -1;
		goto out;
	}
	fd = -1;

	if (rename(tmp.buf, path.buf)) {
		error_errno("simple-odb: unable to move object into place");
		goto out;
	}
	/* The temporary file has been renamed; skip the unlink below. */
	strbuf_setlen(&tmp, 0);
	if (the_repository)
		adjust_shared_perm(the_repository, path.buf);
	ret = 0;
out:
	if (fd >= 0)
		close(fd);
	if (tmp.len)
		unlink_or_warn(tmp.buf);
	strbuf_release(&dir);
	strbuf_release(&path);
	strbuf_release(&tmp);
	strbuf_release(&header);
	free(payload);
	free(compressed);
	return ret;
}
24 changes: 24 additions & 0 deletions simple-odb.h
@@ -0,0 +1,24 @@
#ifndef SIMPLE_ODB_H
#define SIMPLE_ODB_H

#include "git-compat-util.h"
#include "hash.h"
#include "object.h"
#include "strbuf.h"

struct simple_odb {
	struct strbuf root;
	struct strbuf objects_dir;
};

void simple_odb_init(struct simple_odb *odb);
void simple_odb_release(struct simple_odb *odb);

int simple_odb_prepare(struct simple_odb *odb, const char *path);
int simple_odb_store_buffer(struct simple_odb *odb,
			    enum object_type type,
			    const void *data,
			    size_t len,
			    struct object_id *oid);

#endif /* SIMPLE_ODB_H */