digest refactoring for performance #212

While working on #210, I noticed opportunities for consolidating / refactoring code in digest.c.

In some sense, digest.c is providing a [Facade](https://en.wikipedia.org/wiki/Facade_pattern) around interacting with the various quirks of the imported hashing libraries. It does some setup / reinterpretation of the input from R, uses a series of cases to handle each of the different algorithms, and then returns a consistent output.

It seems that over time those cases have been implemented in slightly different ways (and, I imagine, the hasher interfaces have changed since the initial implementation). There is now a fair bit of repetitious code and inconsistent naming that could be consolidated. Consolidating would make the assorted "Consider X" issues smaller lifts to accomplish and present opportunities for a bit of unit testing on the C side, benchmarking, etc.: all the normal benefits of [DRY](https://en.wikipedia.org/wiki/Don't_repeat_yourself).

I propose the following refactoring objectives:

 - [x] refactor macros to functions (per discussion in #210)
 - [ ] eliminate the duplicate code between the streaming and non-streaming approaches (e.g. by figuring out how to share the setup / return code, while switching between stream vs blob inputs)
 - [ ] more clearly distinguish the "jobs" of the various functions (e.g. I imagine `digest` would act more as a traffic manager, with the individual hasher interface logic modularized out; then adding a new hasher means adding a new consistently-surfaced function + another case to `digest` and presto)
 - [ ] additional linting (e.g. spaces around binary operators)
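
To make the second and third objectives a bit more concrete, here is a rough sketch of the shape I have in mind. It is not the current digest.c code: `hasher_ops`, `hash_input`, `toy_digest`, and the two toy "hashers" (a byte sum and a byte XOR) are all hypothetical names invented for illustration, standing in for the real imported algorithms. The point is only that setup and return happen once, and that stream and blob inputs go through the same `update` call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical common interface: every hasher exposes the same three steps. */
typedef struct {
    const char *name;
    void (*init)(uint32_t *state);
    void (*update)(uint32_t *state, const unsigned char *buf, size_t len);
    uint32_t (*final)(const uint32_t *state);
} hasher_ops;

/* Toy stand-ins for real algorithms (NOT the hashers used in digest.c). */
static void sum_init(uint32_t *s) { *s = 0; }
static void sum_update(uint32_t *s, const unsigned char *b, size_t n) {
    for (size_t i = 0; i < n; i++) *s += b[i];
}
static uint32_t sum_final(const uint32_t *s) { return *s; }

static void xor_init(uint32_t *s) { *s = 0; }
static void xor_update(uint32_t *s, const unsigned char *b, size_t n) {
    for (size_t i = 0; i < n; i++) *s ^= b[i];
}
static uint32_t xor_final(const uint32_t *s) { return *s; }

/* Adding a hasher means adding three functions and one row here. */
enum { ALGO_SUM = 0, ALGO_XOR = 1, ALGO_COUNT };
static const hasher_ops HASHERS[ALGO_COUNT] = {
    { "sum", sum_init, sum_update, sum_final },
    { "xor", xor_init, xor_update, xor_final },
};

/* One input description covers both cases: if stream is non-NULL it is
 * read in chunks, otherwise the in-memory blob is hashed directly. */
typedef struct {
    const unsigned char *blob;
    size_t blob_len;
    FILE *stream;
} hash_input;

/* The "traffic manager": shared setup, shared return, one dispatch.
 * Returns 0 on success and -1 on bad arguments or a read error. */
static int toy_digest(int algo, const hash_input *in, uint32_t *out) {
    if (algo < 0 || algo >= ALGO_COUNT || in == NULL || out == NULL)
        return -1;

    const hasher_ops *ops = &HASHERS[algo];
    assert(ops->init && ops->update && ops->final);

    uint32_t state;
    ops->init(&state);

    if (in->stream != NULL) {
        unsigned char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in->stream)) > 0)
            ops->update(&state, buf, n);
        if (ferror(in->stream))
            return -1;
    } else {
        ops->update(&state, in->blob, in->blob_len);
    }

    *out = ops->final(&state);
    return 0;
}
```

A caller would fill in a `hash_input` with either a blob or an open `FILE *` and call `toy_digest(ALGO_SUM, &in, &out)`. The real hashers obviously have differently sized states and outputs, so an actual version would need something like a per-algorithm context and an output buffer rather than a single `uint32_t`; that is exactly the kind of detail worth hashing out in discussion.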

Thoughts? Additional objectives? I think some of these (and possibly additional targets added via discussion) are entangled, but there's some potential to isolate at least some of these as their own PRs - preferences one way or another?
