Merged
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,12 @@
helpers for strict JSON-object / JSON-schema generation with final-output
validation and typed decoding.

* Added `LlamaEngine.loadMultimodalProjectorSource(...)` so GGUF
multimodal projector files can use the same `ModelSource` resolver,
native download/cache manager, authentication, checksum, and progress
options as `loadModelSource(...)`, while preserving the existing
`loadMultimodalProjector(...)` path/string API.

## 0.8.12

* Updated the default LiteRT-LM native runtime pin to
33 changes: 29 additions & 4 deletions README.md
@@ -422,7 +422,25 @@ For LiteRT-LM bundles, use the same `loadModelSource(...)` path with a
CPU, GPU, or Android NPU execution after the file is cached.
`llamadart` does not list Hugging Face files or expand sharded GGUF manifests;
pick the exact `.gguf` file path from the repository, and use separate model and
`mmproj` sources for multimodal assets.
`mmproj` sources for multimodal assets. After the GGUF model is loaded, call
`loadMultimodalProjectorSource(...)` to resolve, download, cache, and load the
projector through the same source/cache layer:

```dart
await engine.loadModelSource(
ModelSource.parse('hf://owner/repo/model-Q4_K_M.gguf'),
);
await engine.loadMultimodalProjectorSource(
ModelSource.parse('hf://owner/repo/mmproj.gguf'),
options: ModelLoadOptions(cachePolicy: ModelCachePolicy.preferCached),
);
```

Native/file-backed backends load the cached local projector path. URL-loading
web backends pass unauthenticated remote projector URLs directly to the bridge;
authenticated headers, checksum verification, explicit cache policy changes,
custom cache directories, and local filesystem path sources require a
file-backed native cache manager.

### 7. Generate embeddings

@@ -1139,6 +1157,10 @@ void main() async {
try {
await engine.loadModel('vision-model.gguf');
await engine.loadMultimodalProjector('mmproj.gguf');
// Or use ModelSource when the projector should be downloaded/cached:
// await engine.loadMultimodalProjectorSource(
// ModelSource.parse('hf://owner/repo/mmproj.gguf'),
// );

final session = ChatSession(engine);

@@ -1166,7 +1188,9 @@ void main() async {

Web-specific note:

- Load model/mmproj with URL-based assets (`loadModelFromUrl` + URL projector).
- Load model/mmproj with URL-based assets (`loadModelSource` /
`loadModelFromUrl` + URL projector). `loadMultimodalProjectorSource` supports
remote unauthenticated projector URLs on URL-loading web backends.
- For user-picked browser files, send media as bytes (`LlamaImageContent(bytes: ...)`,
`LlamaAudioContent(bytes: ...)`) rather than local file paths.

@@ -1178,8 +1202,9 @@ LiteRT-LM native note:
- Native LiteRT-LM supports local paths and encoded media bytes (`blob`) for
media parts. Remote image URLs and raw PCM `Float32List` samples fail before
native generation with clear errors.
- `loadMultimodalProjector`, `supportsVision`, and `supportsAudio` remain
projector-oriented APIs for llama.cpp/WebGPU multimodal paths.
- `loadMultimodalProjector`, `loadMultimodalProjectorSource`,
`supportsVision`, and `supportsAudio` remain projector-oriented APIs for
llama.cpp/WebGPU multimodal paths.

### 💡 Model-Specific Notes

109 changes: 101 additions & 8 deletions lib/src/core/engine/engine.dart
@@ -293,6 +293,85 @@ class LlamaEngine {
return _withMmLifecycle(() => _loadMultimodalProjectorLocked(mmProjPath));
}

/// Loads a multimodal projector from a structured [source].
///
/// A model must already be loaded with [loadModel], [loadModelSource], or
/// [loadModelFromUrl]. Calling this before the model is ready throws a
/// [LlamaContextException].
///
/// This method is lifecycle-compatible with [loadMultimodalProjector]:
/// source resolution, package-managed download/cache work, and the final
/// backend projector load are serialized with direct path projector loads and
/// unloads. Concurrent projector lifecycle calls are applied in call order,
/// and loading a new projector replaces any active projector.
///
/// Local path sources are validated by the configured
/// [modelDownloadManager], then loaded from their local file path. Remote
/// sources use the native download/cache manager on file-backed backends. On
/// URL-loading backends, remote unauthenticated sources are passed directly to
/// the backend; package-managed auth, headers, checksum verification, cache
/// policy changes, cache directories, cancellation, retry/resume settings,
/// and progress reporting are not available because the backend/browser owns
/// the network and cache behavior.
///
/// Throws [LlamaUnsupportedException] when the active backend cannot load
/// multimodal projectors, when a local path is used with a URL-loading
/// backend, when the resolver returns a remote target that disallows
/// browser/backend caching, or when URL-backend loading is requested with
/// options that require the package-managed download/cache manager.
Future<void> loadMultimodalProjectorSource(
ModelSource source, {
ModelLoadOptions options = ModelLoadOptions.defaults,
ModelDownloadProgressCallback? onProgress,
}) {
return _withMmLifecycle(() async {
_ensureReady(requireContext: false);

final target = await modelResolver.resolve(
source,
ModelResolveRequest(options: options, onProgress: onProgress),
);

switch (target) {
case LocalModelFile(:final path):
if (backend.supportsUrlLoading) {
throw LlamaUnsupportedException(
'Explicit local multimodal projector paths are not supported by URL-loading backends.',
);
}
final localSource = ModelSource.path(path);
final entry = await modelDownloadManager.ensureModel(
localSource,
options: options,
onProgress: onProgress,
);
return _loadMultimodalProjectorLocked(entry.filePath);
case RemoteModelUrl(:final url, :final useBrowserCache):
if (!useBrowserCache) {
throw LlamaUnsupportedException(
'Remote multimodal projector loading without browser/backend cache is not supported yet.',
);
}
if (!backend.supportsUrlLoading) {
final downloadSource = source.isRemote
? source.withResolvedUri(url)
: ModelSource.url(url, fileName: source.fileName);
final entry = await modelDownloadManager.ensureModel(
downloadSource,
options: options,
onProgress: onProgress,
);
return _loadMultimodalProjectorLocked(entry.filePath);
}
_rejectUnsupportedUrlBackendOptions(
options,
assetType: 'multimodal projector',
);
return _loadMultimodalProjectorLocked(url.toString());
}
});
}
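
Review note: the doc comment above says projector lifecycle calls are serialized and applied in call order. The body of `_withMmLifecycle` is not shown in this diff, so the following is only a minimal sketch of one way such ordering can be achieved in Dart. `SerialQueue` and `run` are hypothetical names introduced for illustration and are not part of llamadart.

```dart
import 'dart:async';

/// Hypothetical stand-in for a lifecycle lock; not llamadart's implementation.
class SerialQueue {
  Future<void> _tail = Future<void>.value();

  /// Runs [action] only after every previously queued action has settled.
  Future<T> run<T>(Future<T> Function() action) {
    final result = _tail.then<T>((_) => action());
    // Swallow errors on the internal chain so one failed action does not
    // block later ones; the caller still sees the error through [result].
    _tail = result.then<void>((_) {}, onError: (Object _, StackTrace __) {});
    return result;
  }
}

Future<void> main() async {
  final queue = SerialQueue();
  final log = <String>[];

  // The first call is slow, the second is fast, yet order is preserved.
  final first = queue.run(() async {
    await Future<void>.delayed(const Duration(milliseconds: 20));
    log.add('load A');
  });
  final second = queue.run(() async {
    log.add('load B');
  });

  await Future.wait([first, second]);
  print(log); // [load A, load B]
}
```

Chaining each action onto the previous one's future keeps the queue allocation-free between calls and needs no explicit lock object, at the cost of running everything strictly one at a time.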

Future<void> _loadMultimodalProjectorLocked(String mmProjPath) async {
final mmProjName = _displayNameForSource(mmProjPath);
LlamaLogger.instance.info('Loading multimodal projector: $mmProjName');
@@ -1148,40 +1227,54 @@ class LlamaEngine {
_isReady = false;
}

void _rejectUnsupportedUrlBackendOptions(ModelLoadOptions options) {
void _rejectUnsupportedUrlBackendOptions(
ModelLoadOptions options, {
String assetType = 'model',
}) {
final isModel = assetType == 'model';
if (options.cachePolicy != ModelCachePolicy.preferCached) {
throw LlamaUnsupportedException(
'${options.cachePolicy.name} model loading requires the native download/cache manager.',
'${options.cachePolicy.name} $assetType loading requires the native download/cache manager.',
);
}
if (options.bearerToken != null || options.headers.isNotEmpty) {
throw LlamaUnsupportedException(
'Authenticated model URL loading requires the native download/cache manager.',
'Authenticated $assetType URL loading requires the native download/cache manager.',
);
}
if (options.cancelToken != null) {
throw LlamaUnsupportedException(
'Cancellation tokens require the native download/cache manager.',
isModel
? 'Cancellation tokens require the native download/cache manager.'
: 'Cancellation tokens for $assetType loading require the native download/cache manager.',
);
}
if (options.sha256 != null) {
throw LlamaUnsupportedException(
'Checksum verification requires the native download/cache manager.',
isModel
? 'Checksum verification requires the native download/cache manager.'
: 'Checksum verification for $assetType loading requires the native download/cache manager.',
);
}
if (options.cacheDirectory != null) {
throw LlamaUnsupportedException(
'cacheDirectory is not supported by URL-loading backends.',
isModel
? 'cacheDirectory is not supported by URL-loading backends.'
: 'cacheDirectory is not supported for $assetType loading by URL-loading backends.',
);
}
if (!options.resume) {
throw LlamaUnsupportedException(
'Disabling resume is not supported by URL-loading backends.',
isModel
? 'Disabling resume is not supported by URL-loading backends.'
: 'Disabling resume is not supported for $assetType loading by URL-loading backends.',
);
}
if (options.maxRetries != ModelLoadOptions.defaults.maxRetries) {
throw LlamaUnsupportedException(
'Custom maxRetries is not supported by URL-loading backends.',
isModel
? 'Custom maxRetries is not supported by URL-loading backends.'
: 'Custom maxRetries is not supported for $assetType loading by URL-loading backends.',
);
}
}
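
Review note: the routing performed by `loadMultimodalProjectorSource` and `_rejectUnsupportedUrlBackendOptions` can be summarized as a small decision function. This is a simplified model of the branches visible in this diff, not code from the package. `ProjectorRoute`, `routeProjector`, and the boolean parameters are hypothetical names; in particular `needsManagedOptions` stands in for any non-default auth, checksum, cache, cancellation, resume, or retry option.

```dart
/// Hypothetical outcome labels; the real method loads the projector or throws.
enum ProjectorRoute {
  loadLocalFile,
  downloadThenLoad,
  passUrlToBackend,
  unsupported,
}

ProjectorRoute routeProjector({
  required bool isLocalPath,
  required bool backendLoadsUrls,
  bool useBrowserCache = true,
  bool needsManagedOptions = false,
}) {
  if (isLocalPath) {
    // Local paths are rejected on URL-loading backends.
    return backendLoadsUrls
        ? ProjectorRoute.unsupported
        : ProjectorRoute.loadLocalFile;
  }
  // Remote targets that disallow browser/backend caching are rejected first.
  if (!useBrowserCache) return ProjectorRoute.unsupported;
  // File-backed backends download into the package-managed cache, then load.
  if (!backendLoadsUrls) return ProjectorRoute.downloadThenLoad;
  // URL-loading backends cannot honor package-managed download options.
  return needsManagedOptions
      ? ProjectorRoute.unsupported
      : ProjectorRoute.passUrlToBackend;
}

void main() {
  print(routeProjector(isLocalPath: false, backendLoadsUrls: true));
  // ProjectorRoute.passUrlToBackend
  print(routeProjector(
    isLocalPath: false,
    backendLoadsUrls: true,
    needsManagedOptions: true,
  ));
  // ProjectorRoute.unsupported
}
```

Every `unsupported` outcome in this model corresponds to a `LlamaUnsupportedException` in the diff; the model collapses the individual error messages into one label.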