feat(download): add dunite-download generic binary distribution engine #2

Merged
nrupard merged 2 commits from feat/dunite-download into main 2026-06-01 19:07:11 +02:00
Owner

What

New dunite-download workspace crate: the generic, storage-agnostic engine for proxying binary artifacts from an upstream Forgejo instance. Tracks BUNYIP-29 (subtask of the BUNYIP-28 distribution story).

Why

Bunyip's binary download vertical is currently hand-rolled inside bunyip-domain (ForgejoClient, ReleaseCache, DownloadCache, DownloadLimiter with a hard PgPool dependency), which violates the dunite/bunyip split: generic mechanism belongs here, persistence belongs to the consumer. This crate is that mechanism, extracted and generalized; the follow-up (BUNYIP-30) makes bunyip consume it and deletes the hand-rolled code.

Design

Mirrors dunite-oci exactly:

  • store::AssetStore + store::DownloadCounter are the persistence traits the consumer implements against its own schema. No PgPool, no named tables, no domain types in the crate.
  • ForgejoAssetClient speaks both Forgejo artifact APIs, selected per lookup via the ArtifactSource enum: release attachments (/api/v1/repos/{owner}/{repo}/releases/tags/{tag}) and the generic package registry (/api/v1/packages/{owner}/generic/{package}/{version}/files). Download URLs are validated against the configured base host before the API token is forwarded (same token-safety rule as dunite-oci).
  • ReleaseCache is a moka TTL cache for artifact metadata, shared by both sources.
  • DownloadCache<S: AssetStore> is the on-disk content-addressed asset cache: single-flight fetches via Arc<OnceCell>, SHA-256 hashing while streaming to a temp file, atomic rename, replaced-SHA orphan cleanup, async LRU eviction over the byte cap.
  • DownloadLimiter enforces per-user concurrency in process and delegates the durable daily count to DownloadCounter, mirroring OciLimiter/PullCounter.
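
As a rough illustration of the persistence split described above, the sketch below shows one way a consumer-side store could look. It is not the crate's actual API: `AssetRow`, the method signatures, and `MemoryStore` are all assumptions made for this example, and the real traits are presumably async. The point is only that the engine sees a trait, while the consumer decides how rows are stored.

```rust
use std::collections::HashMap;

/// Hypothetical row shape; the crate's real model is not shown in this PR text.
#[derive(Clone, Debug, PartialEq)]
pub struct AssetRow {
    pub sha256: String,
    pub size_bytes: u64,
}

/// Assumed shape of a consumer-implemented persistence trait (synchronous for brevity).
pub trait AssetStore {
    /// Insert or replace the row for `key`. Returns the SHA that was replaced,
    /// if it differs from the new one, so the caller can remove the orphaned file.
    fn upsert(&mut self, key: &str, row: AssetRow) -> Option<String>;
    /// Look up the row for `key`.
    fn get(&self, key: &str) -> Option<AssetRow>;
    /// Sum of all stored asset sizes; 0 when the store is empty.
    fn total_size_bytes(&self) -> u64;
}

/// In-memory implementation, similar in spirit to a test double.
#[derive(Default)]
pub struct MemoryStore {
    rows: HashMap<String, AssetRow>,
}

impl AssetStore for MemoryStore {
    fn upsert(&mut self, key: &str, row: AssetRow) -> Option<String> {
        let new_sha = row.sha256.clone();
        self.rows
            .insert(key.to_string(), row)
            .map(|old| old.sha256)
            .filter(|old_sha| *old_sha != new_sha)
    }

    fn get(&self, key: &str) -> Option<AssetRow> {
        self.rows.get(key).cloned()
    }

    fn total_size_bytes(&self) -> u64 {
        self.rows.values().map(|r| r.size_bytes).sum()
    }
}
```

A consumer such as bunyip would implement the same trait against its own tables instead of a `HashMap`; nothing in the trait names a database or a schema.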

Testing

23 unit tests, no DB and no network: wiremock for both upstream APIs (releases, generic packages, 404s, token-forwarding refusal on foreign hosts) and an in-memory AssetStore exercising cache hit/miss, stale-row refetch, replaced-SHA cleanup, invalidation, and LRU eviction. cargo test --workspace, cargo clippy -- -D warnings, and cargo fmt --check are green (run inside the rust-builder-glibc 1.94 image).
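
The token-forwarding refusal that these tests exercise can be sketched as a same-origin comparison. This is a minimal, standard-library-only illustration, not the crate's code: `origin` and `may_forward_token` are hypothetical names, and the hand-rolled parser is deliberately strict (it rejects userinfo and anything it cannot parse) rather than complete.

```rust
/// Extracts (scheme, host, port) from an absolute http(s) URL.
/// Hypothetical helper: returns None for anything it does not understand,
/// so unknown shapes are refused rather than guessed at.
fn origin(url: &str) -> Option<(String, String, u16)> {
    let (scheme, rest) = url.split_once("://")?;
    let scheme = scheme.to_ascii_lowercase();
    let default_port: u16 = match scheme.as_str() {
        "http" => 80,
        "https" => 443,
        _ => return None,
    };
    // The authority ends at the first '/', '?', or '#'.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    // Refuse userinfo, e.g. "https://trusted.host@evil.host/".
    if authority.is_empty() || authority.contains('@') {
        return None;
    }
    let (host, port) = match authority.rsplit_once(':') {
        Some((h, p)) => (h, p.parse::<u16>().ok()?),
        None => (authority, default_port),
    };
    if host.is_empty() {
        return None;
    }
    Some((scheme, host.to_ascii_lowercase(), port))
}

/// True only when `download_url` has the same scheme, host, and port as `base_url`.
pub fn may_forward_token(base_url: &str, download_url: &str) -> bool {
    match (origin(base_url), origin(download_url)) {
        (Some(base), Some(target)) => base == target,
        _ => false,
    }
}
```

With this shape, a download URL on a foreign host, on a different scheme, or with embedded credentials is refused before any `Authorization` header would be attached. A production guard would more likely rely on a full URL parser than on string splitting.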

feat(download): add dunite-download generic binary distribution engine
All checks were successful
Checks / fmt + clippy + test (pull_request) Successful in 28s
2465e9135d
New workspace crate providing the storage-agnostic mechanism for proxying binary artifacts from an upstream Forgejo instance, extracted and generalized from the hand-rolled download code in bunyip-domain (BUNYIP-29).

- ForgejoAssetClient speaks both artifact APIs, selected per lookup via the ArtifactSource enum: release attachments (/api/v1/repos/{owner}/{repo}/releases/tags/{tag}) and the generic package registry (/api/v1/packages/{owner}/generic/{package}/{version}/files), with download URLs always validated against the configured base host before the API token is forwarded.
- ReleaseCache is a moka TTL cache for artifact metadata keyed by (scope id, version), shared by both sources.
- DownloadCache<S: AssetStore> is the on-disk content-addressed asset cache: single-flight fetches, SHA-256 hashing while streaming to a temp file, atomic rename, replaced-SHA orphan cleanup, and async LRU eviction over the byte cap. Persistence goes through the AssetStore trait the consumer implements.
- DownloadLimiter enforces per-user concurrency in process and delegates the durable daily count to the DownloadCounter trait, mirroring dunite-oci's OciLimiter/PullCounter split.
- 23 unit tests: wiremock for both upstream APIs and an in-memory AssetStore exercising hit/miss, refetch, orphan cleanup, invalidation, and eviction. cargo test/clippy/fmt green on the workspace.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fix(download): address code-review findings on the distribution engine
All checks were successful
Checks / fmt + clippy + test (pull_request) Successful in 28s
create-release / create-release (pull_request) Has been skipped
da136e0da6
Fixes from a 10-finding review of PR #2 (correctness, integrity, reuse, efficiency):

- Release-asset download URLs are now built from the configured base_url and the canonical attachment route instead of trusting upstream browser_download_url, so downloads work when the API base differs from Forgejo's public ROOT_URL (split internal/external addressing) and the API token can only ever go to the configured host.
- Integrity verification: the generic-packages files API sha256 is captured into ReleaseAsset and verified against the streamed bytes (new ShaMismatch error); the upstream-advertised size is verified when known (new SizeMismatch error). Truncated or corrupted bodies are no longer cached.
- ReleaseCache is keyed on (scope_id, ArtifactSource) instead of (scope_id, version-string), so two sources sharing a version label can never serve each other's metadata.
- Typed errors survive the single-flight cell: a cloneable FetchError mirror replaces string flattening, so an upstream 404 reaches the caller as Forgejo(NotFound) and can map to HTTP 404 instead of 500.
- AssetStore::upsert now documents the atomicity requirement (prior-SHA read + write in one transaction / row lock) that the orphan-cleanup contract depends on; total_size_bytes documents the must-return-0-when-empty rule.
- The per-user rate limiter and the same-origin token-forwarding guard moved DOWN into dunite-core (services::usage_limiter, validation::origin) and are consumed by dunite-download via re-exports; the duplicated download_limiter.rs is deleted. The asset client also sets redirect Policy::none() so the token cannot be bounced via redirects.
- Eviction reads the store total once and decrements locally per deleted row (was one aggregate query per row), and guards against a negative store total wiping the cache via the i64 -> u64 cast.
- Naming leak fixed: release_tag / tag_name fields renamed to version across models, store traits, and engine APIs, matching ArtifactSource::version() for both release and package sources.
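
The eviction fix above (read the store total once, decrement locally, never let a negative total wrap) can be sketched as a pure planning function. The name `plan_eviction`, the `(key, size)` row shape, and the oldest-first ordering are assumptions for illustration; the real engine also deletes files and rows, which is omitted here.

```rust
/// Decide which rows to evict so the cache fits under `cap_bytes`.
///
/// `store_total` is the aggregate size as a signed integer, as a SQL SUM
/// might return it. `lru_rows` is assumed to be ordered least recently used first.
pub fn plan_eviction(
    store_total: i64,
    cap_bytes: u64,
    lru_rows: &[(String, u64)],
) -> Vec<String> {
    // Clamp before casting: a plain `as u64` on a negative value would wrap to a
    // huge number and make every row look like it must be evicted.
    let mut total = store_total.max(0) as u64;
    let mut evict = Vec::new();
    for (key, size) in lru_rows {
        if total <= cap_bytes {
            break;
        }
        evict.push(key.clone());
        // Decrement the local running total instead of re-querying the store.
        total = total.saturating_sub(*size);
    }
    evict
}
```

For example, with a total of 120 bytes, a cap of 50, and three 40-byte rows, the two oldest rows are selected and the third is kept; a negative total selects nothing.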

Workspace test/clippy/fmt green: 116 tests across dunite-core (61), dunite-download (26), dunite-oci (26), dunite-oidc (3).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
nrupard deleted branch feat/dunite-download 2026-06-01 19:07:12 +02:00
Reference
psa-systems/dunite!2