Command line#
Credentials management#
You can use metalchat credential command to store the access tokens. On MacOS
the credential is stored in Keychain Access
in a secure way and only queried by the metalchat command, when accessing remote resources.
Hint
You can create a HuggingFace access token by following through the User Access Tokens Guide.
$ metalchat credential add --host huggingface.co --username $HF_USERNAME --secret $HF_ACCESS_TOKEN
Then you could list access tokens using the list sub-command. Here the hostname is defined
as part of the URL. If the same URL prefix is used in the model pulling command, the access
token will be automatically pulled from the secrets provider and used to authenticate requests.
$ metalchat credential list
https://huggingface username @keychain
Models management#
Note
You will need the access to the gated Meta Llama3 model model in order to run metalchat pull
command. You can do this by creating an account at HuggingFace.
And then requesting access to a
Llama-3.2-1B-Instruct model.
You can use metalchat model command to pull models from remote repositories, list or remove
then from the repository. By default all models are stored into $HOME/.metalchat/models
directory.
$ metalchat model pull -p consolidated https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct
This command assigns to each model a SHA-1 identifier comprised of a repository URL, model architecture, model variant, and weights partitioning. You can use this identifier to switch between models.
$ metalchat model list --abbrev
e37f2df llama3 consolidated https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct
When you already have a cloned repository with a model, you could pull it as well. This command creates hard links to the necessary files, so even if you remove the original directory, the model remains functional.
$ metalchat model pull -p consolidated file:///var/models/Llama-3.2-1B-Instruct
Switching models#
The metalchat utility uses metalchat.toml manifest file to keep the currently used
model version and all respective options of that model, as well as environment parameters.
The utility distinguishes three scopes: local, global, and model. You can find the manifest
file in each of those scopes.
The scopes correspond to the following locations:
local- the current working directory.global- a directory located at$HOME/.metalchat.model- a directory in the$HOME/.metalchat/models.
You can use metalchat checkout command to switch models use either in local or global scope.
By default, this command switches a model in the local scope.
$ metalchat checkout e37f2dfbbef2a9dcad4e1d83274b8ff5d55c5481
$ cat metalchat.toml
[model]
repository = "https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct"
architecture = "llama3"
partitioning = "consolidated"
variant = "huggingface"
In a similar way, you can switch a model in the global scope:
$ metalchat checkout --global e37f2dfbbef2a9dcad4e1d83274b8ff5d55c5481
Configuring options#
The metalchat options command allows to override model options (like, rms_norm_eps,
rope_theta). During the inference, MetalChat runtime merges both model and currently selected
scope and runs a model with merged options:
$ metalchat options set --type=float rms_norm_eps 0.0001
This command updates the manifest file with the new options. After that you could check what options a model will be using during inference and scope of the options.
$ metalchat options list --show-scope
local rms_norm_eps=0.0001
model head_dim=64
model num_attention_heads=32
model num_hidden_layers=16
model num_key_value_heads=8
model rope_theta=500000.0
Alternatively, you can override the options in the metalchat.toml manifest in the section
options, like in the example below.
[options]
rms_norm_eps = 0.0001
Prompting models#
There are multiple ways of prompting a model, all of them start the inference from the 0 position.
You could feed the query into the standard input:
$ echo 'Who are you?' | metalchat -
I'm an artificial intelligence model known as Llama. Llama stands for "Large Language Model Meta AI."
Or you could run the inference using metalchat prompt command.
$ metalchat prompt -c 'Who are you?'
$ echo 'Who are you?' > file.md
$ metalchat prompt file.md
By default model runs the inference without a system prompt. You could specify a custom prompt
through the manifest file. The system option requires an existing file either relative to the
manifest file location, or an absolute path:
[prompt]
system = 'system.md'