diff --git a/docs/parameters.md b/docs/parameters.md
index f70ec3dbdd..d83e5497c6 100644
--- a/docs/parameters.md
+++ b/docs/parameters.md
@@ -98,12 +98,12 @@ Shared configuration options for the pull, and pull & start mode. In the presenc
 
 ## Pull Mode Options for optimum-cli mode
 
-When pulling models outside of OpenVINO organization the optimum-cli api is used inside ovms. You can set two additional parameters for this mode.
+When pulling models outside of OpenVINO organization the optimum-cli api is used inside ovms. You can set additional parameters for this mode.
 
 | Option | Value format | Description |
 |------------------------------|--------------|---------------------------------------------------------------------------------------------------------------|
-| `--extra_quantization_params`| ` ` | Add advanced quantization parameters. Check [optimum-intel](https://github.com/huggingface/optimum-intel) documentation. Example: `--sym --group-size -1 --ratio 1.0 --awq --scale-estimation --dataset wikitext2` |
-| `--weight-format` | `string` | Model precision used in optimum-cli export with conversion. Default `int8`. |
-
+| `--extra_quantization_params`| `string` | Add advanced quantization parameters. Check [optimum-intel](https://github.com/huggingface/optimum-intel) documentation. Example: `--sym --group-size -1 --ratio 1.0 --awq --scale-estimation --dataset wikitext2` |
+| `--weight-format` | `string` | Model precision used in optimum-cli export with conversion. Default `int8`. |
+| `--vocoder` | `string` | The vocoder model to use for text2speech. For example `microsoft/speecht5_hifigan`. |
 
 There are also additional environment variables that may change the behavior of pulling:
@@ -161,7 +161,7 @@ Task specific parameters for different tasks (text generation/image generation/e
 | `--num_streams` | `integer` | The number of parallel execution streams to use for the model. Use at least 2 on 2 socket CPU systems. Default: 1. |
 | `--normalize` | `bool` | Normalize the embeddings. Default: true. |
 | `--truncate` | `bool` | Truncate input when it exceeds model context length. Default: false |
-| `--mean_pooling` | `bool` | Mean pooling option. Default: false. |
+| `--pooling` | `string` | Pooling option. One of: CLS, LAST, MEAN. Default: CLS. |
 
 ### Rerank
 | option | Value format | Description |
diff --git a/src/cli_parser.cpp b/src/cli_parser.cpp
index 1ca3a57e3f..d0c0d89634 100644
--- a/src/cli_parser.cpp
+++ b/src/cli_parser.cpp
@@ -279,10 +279,6 @@ std::variant> CLIParser::parse(int argc, char*
                 "Resets model precision.",
                 cxxopts::value(),
                 "PRECISION")
-            ("resize",
-                "Resets model resize dimensions.",
-                cxxopts::value(),
-                "resize")
             ("model_version_policy",
                 "Model version policy",
                 cxxopts::value(),
diff --git a/src/graph_export/embeddings_graph_cli_parser.cpp b/src/graph_export/embeddings_graph_cli_parser.cpp
index 8bdcffe7bf..192dd6c748 100644
--- a/src/graph_export/embeddings_graph_cli_parser.cpp
+++ b/src/graph_export/embeddings_graph_cli_parser.cpp
@@ -53,7 +53,7 @@ void EmbeddingsGraphCLIParser::createOptions() {
             cxxopts::value()->default_value("false"),
             "truncate")
         ("pooling",
-            "Mean pooling option.",
+            "Pooling option. One of: CLS, LAST, MEAN.",
             cxxopts::value()->default_value("CLS"),
             "POOLING");
 }
@@ -98,7 +98,7 @@ void EmbeddingsGraphCLIParser::prepare(OvmsServerMode serverMode, HFSettingsImpl
         embeddingsGraphSettings.truncate = result->operator[]("truncate").as();
         embeddingsGraphSettings.pooling = result->operator[]("pooling").as();
     }
-    if (!(embeddingsGraphSettings.pooling == "CLS" || embeddingsGraphSettings.pooling == "LAST")){
+    if (!(embeddingsGraphSettings.pooling == "CLS" || embeddingsGraphSettings.pooling == "LAST" || embeddingsGraphSettings.pooling == "MEAN")){
         throw std::invalid_argument("Only CLS and LAST pooling modes are supported");
     }
     hfSettings.graphSettings = std::move(embeddingsGraphSettings);
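
The last hunk extends the pooling check to accept `MEAN`, but the exception text on the unchanged context line still names only CLS and LAST. One possible way to keep the check and the message from drifting apart is to derive both from a single list. The following is a minimal sketch, not code from this patch: `kPoolingModes` and `validatePooling` are hypothetical names, and it assumes C++17 for `std::string_view`.

```cpp
#include <algorithm>
#include <array>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the patch: the accepted pooling modes are
// listed once, and both the check and the error text are built from this list.
constexpr std::array<std::string_view, 3> kPoolingModes = {"CLS", "LAST", "MEAN"};

// Throws std::invalid_argument when `pooling` is not one of the listed modes.
// The comparison is exact and case-sensitive, as in the check shown in the diff.
void validatePooling(const std::string& pooling) {
    const std::string_view value(pooling);
    const bool known =
        std::find(kPoolingModes.begin(), kPoolingModes.end(), value) != kPoolingModes.end();
    if (!known) {
        std::string msg = "Unsupported pooling mode '" + pooling + "'. Supported modes:";
        for (const std::string_view mode : kPoolingModes) {
            msg += ' ';
            msg.append(mode.data(), mode.size());
        }
        throw std::invalid_argument(msg);
    }
}
```

With a helper of this shape, `prepare()` could call `validatePooling(embeddingsGraphSettings.pooling)` in place of the inline condition, so adding a mode later would touch only the list.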