[WIP] feat: unstable SHA256 support#1206
Conversation
dd35da4 to
a332c64
Compare
a332c64 to
fd975c7
Compare
| /// View this OID as a byte-slice. | ||
| /// | ||
| /// * 20 bytes in length if the feature `unstable-sha256` is not enabled. | ||
| /// * 32 bytes in length if the feature `unstable-sha256` is enabled. | ||
| pub fn as_bytes(&self) -> &[u8] { | ||
| &self.raw.id | ||
| } |
There was a problem hiding this comment.
I think this should return 20 bytes if unstable-sha256 is enabled but the object format is SHA-1.
One reason for this is that crate features should be additive; turning on the feature shouldn't unnecessarily affect how the crate behaves with already-supported repositories.
There was a problem hiding this comment.
Thanks for calling this out. That's a fair point, but I'm not sure "feature additivity" is the core issue here.
I think the real question is what Oid::as_bytes() is supposed to mean in this crate: the logical object ID bytes, or the raw backing storage from libgit2.
I'm leaning toward the logical interpretation. That seems like the less surprising API for callers, as you mentioned.
The reason I wasn’t fully convinced yet is that this type has historically been a fairly thin wrapper over libgit2, so I wanted to check whether anyone might rely on the raw representation or the backing buffer. Or maybe we should just mark this with a big
There was a problem hiding this comment.
There is the option to have Oid::as_bytes() give you the logical representation and introduce a new Oid::raw_bytes() for the (probably few) users that actually want the entire buffer.
There was a problem hiding this comment.
Addressed the as_bytes part and also added the raw_bytes method. Thanks for the feedback, folks!
c07b123 to
5c04d59
Compare
* GIT_SHA1_COLLISIONDETECT -> GIT_SHA1_BUILTIN * GIT_WINHTTP -> GIT_HTTPS_WINHTTP * GIT_SECURE_TRANSPORT -> GIT_HTTPS_SECURETRANSPORT * GIT_OPENSSL -> GIT_HTTPS_OPENSSL See <libgit2/libgit2#6994>
* Add `GIT_REPOSITORY_INIT_RELATIVE_GITLINK` at libgit2/libgit2@bc737620d * Remove `GIT_REPOSITORY_INIT_NO_DOTGIT_DIR` at libgit2/libgit2@ca2a241e4
* Remove `GIT_OBJECT_OFS_DELTA` * Remove `GIT_OBJECT_REF_DELTA` See libgit2/libgit2@23da3a8f3
libgit2 simplified SHA256 API to use `_ext` suffixes instead of changing base function signatures. This avoids breaking changes for users who don't need SHA256 support. Updated bindings: - git_oid_from_raw - git_oid_from_prefix - git_oid_from_string - git_diff_from_buffer_ext - git_index_new_ext - git_index_open_ext - git_odb_new_ext - git_odb_open_ext - git_repository_new_ext See libgit2/libgit2@56e2a85643f (libgit2/libgit2#6975)
This adds an `unstable-sha256` Cargo feature, as a follow-up of rust-lang#1201 Also adds some smoke tests for affected operations/types. ## Insta-stable - **NEW** `Index::with_object_format` to create with different format - **NEW** `Oid::object_format` to access underlying oid type ## Behind `unstable-sha256` - **NEW** `ObjectFormat::Sha256` enum variant - **NEW** `RepositoryInitOptions::object_format()` method to set hash algo - **NEW** `Remote::object_format` method to get hash algo on a remote - **NEW** `Oid::raw_bytes` to return the full underlying bytes - **CHANGED** `Diff::from_buffer` to accept an extra object format argument - **CHANGED** `Index::open` to accept an extra object format argument - **CHANGED** `Indexer::new` to accept an extra object format argument - **CHANGED** `Oid::from_str` to accept an extra object format argument - **CHANGED** `Oid::hash_{object,file}` to accept an extra object format argument - **CHANGED** `Oid::as_bytes` to return the logic bytes (SHA1 -> 20 bytes; SHA256 -> 32 bytes) - **CHANGED** `impl Hash for Oid` to hash `git_oid->kind` - **REMOVED** `Index::new` to avoid misuse. - **REMOVED** `impl std::FromStr for Oid` to avoid misuse
This also run systest first, so we can dicover bindding mismatch before
|
I believe to have found something that you missed in this PR. Note that Lines 271 to 275 in e6919b7 I believe that this means that the The function Line 231 in e6919b7 will hit the Line 29 in e6919b7 This is because I discovered this experimenting in code that is downstream of Then I started to think about how much #[inline]
pub(crate) fn zeroed_raw_oid(format: ObjectFormat) -> raw::git_oid {
raw::git_oid {
kind: format.raw(),
id: [0; raw::GIT_OID_MAX_SIZE],
}
}In fact one might even ask why not to have impl Oid {
pub const ZERO_SHA1: Self = Self {
raw: raw::git_oid {
kind: raw::GIT_OID_SHA1,
id: [0; raw::GIT_OID_MAX_SIZE],
}
};
pub const ZERO_SHA256: Self = Self {
raw: raw::git_oid {
kind: raw::GIT_OID_SHA256,
id: [0; raw::GIT_OID_MAX_SIZE],
}
};
// NOTE: Maybe call this `zero_with_object_format` to not break `zero`.
pub fn zero(format: ObjectFormat) -> Self {
// TODO: Match on `format` and return `Oid::ZERO_SHA1` or `Oid::ZERO_SHA256`.
}
}This way, users that know at compile time which kind of zero they want can just use the constant, while those that decide at run time would call If you do not want For |
|
Another comment regarding |
| /// * [`Repository::index`] if you have repository context | ||
| /// | ||
| /// </div> | ||
| #[cfg(not(feature = "unstable-sha256"))] |
There was a problem hiding this comment.
why is this function removed when sha256 support is enabled? If the idea is that we should migrate to with_object_format() anyway, maybe add that function in a separate PR that doesn't need to wait for SHA-256 to be ready?
| /// | ||
| /// Note: Most users should obtain an ODB from [`Repository::odb`](crate::Repository::odb), | ||
| /// which automatically inherits the repository's object format. | ||
| #[cfg(feature = "unstable-sha256")] |
There was a problem hiding this comment.
this can match Index and introduce the new method without the feature gate, so that code can start to use it without requiring the unstable feature
| /// characters (or 64 for SHA256), or contains any non-hex characters. | ||
| pub fn from_str( | ||
| s: &str, | ||
| #[cfg(feature = "unstable-sha256")] format: crate::ObjectFormat, |
There was a problem hiding this comment.
this parameter could also be ungated so that callers can start passing it
| #[cfg(not(feature = "unstable-sha256"))] | ||
| { | ||
| if bytes.len() != raw::GIT_OID_SHA1_SIZE { | ||
| return Err(Error::from_str("raw byte array must be 20 bytes")); |
There was a problem hiding this comment.
maybe a clearer error for trying to accidentally use 32 bytes when the feature isn't enabled?
Also, why have different ways of creating the SHA1 oid depending on if the SHA256 feature is available?
let oid_type = match bytes.len() {
raw::GIT_OID_SHA1_SIZE => raw::GIT_OID_SHA1,
#[cfg(feature = "unstable-sha256")]
raw::GIT_OID_SHA256_SIZE => raw::GIT_OID_SHA256,
#[cfg(not(feature = "unstable-sha256"))]
raw::GIT_OID_SHA256_SIZE => {
return Err(Error::from_str(
"raw byte array of 32 bytes requires unstable-sha256 feature to be enabled",
))
}
_ => {
return Err(Error::from_str(
"raw byte array must be 20 bytes (SHA1) or 32 bytes (SHA256, if enabled)",
))
}
};
unsafe {
try_call!(raw::git_oid_from_raw(&mut raw, bytes.as_ptr(), oid_type));
}
Ok(Oid { raw })| pub fn hash_object( | ||
| kind: ObjectType, | ||
| bytes: &[u8], | ||
| #[cfg(feature = "unstable-sha256")] format: crate::ObjectFormat, |
There was a problem hiding this comment.
another parameter that can already be added immediately - not going to call them out each time though
| } | ||
| #[cfg(not(feature = "unstable-sha256"))] | ||
| { | ||
| let _ = oid_type; |
There was a problem hiding this comment.
hmm, even if the git2-rs bindings don't have SHA256 enabled, if the oid_type corresponds to SHA256 we should say that and probably return an error, rather than just claiming that the repository is formatted with SHA1?
This adds an
unstable-sha256Cargo feature, as a follow-up of #1201.Also adds some smoke tests for affected operations/types.
Part of #1090
Insta-stable
Index::with_object_formatto create with different formatOid::object_formatto access underlying oid typeBehind
unstable-sha256ObjectFormat::Sha256enum variantRepositoryInitOptions::object_format()method to set hash algoRemote::object_formatmethod to get hash algo on a remoteOid::raw_bytesto return the full underlying bytesDiff::from_bufferto accept an extra object format argumentIndex::opento accept an extra object format argumentIndexer::newto accept an extra object format argumentOid::from_strto accept an extra object format argumentOid::hash_{object,file}to accept an extra object format argumentOid::as_bytesto return the logic bytes (SHA1 -> 20 bytes; SHA256 -> 32 bytes)impl Hash for Oidto hashgit_oid->kindIndex::newto avoid misuse.impl std::FromStr for Oidto avoid misuseimpl std::FromStr for Oidto avoid misuselibgit2 1.9 compatibility changes
This PR also includes fixes for libgit2 1.9 breaking changes:
GIT_SHA1_COLLISIONDETECT->GIT_SHA1_BUILTIN, HTTPS backend macros)refdb_typefield togit_repository_init_optionsSee libgit2/libgit2#6994 and libgit2/libgit2#7102 for upstream changes.
These changes can be split out if needed.