Llama inference for Apple Devices

MetalChat is a Metal-accelerated C++ framework and command line interpreter for inference of Meta Llama models. It is designed as a full-stack framework that provides access to both low-level GPU kernels and a high-level LLM interpreter API.

Open source

Distributed under the copyleft GPLv3 license, MetalChat is developed and maintained publicly on GitHub.

Lightweight

MetalChat targets Apple hardware only and has few external dependencies.

HuggingFace compatible

MetalChat supports Llama models distributed through HuggingFace Hub out of the box.

You can install both the framework and the binary by adding a third-party Homebrew tap and running brew install. See the command line guide for details on using the metalchat binary.

$ brew tap ybubnov/metalchat https://github.com/ybubnov/metalchat
$ brew install metalchat
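
After installation, a quick check that the binary is reachable can save some debugging. The snippet below is a minimal sketch, not part of MetalChat: check_installed is a hypothetical helper, and the sketch assumes the installed executable is named metalchat, as the text above suggests.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_installed() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found on PATH" >&2
        return 1
    fi
}

# Assumption: the installed binary is named "metalchat".
check_installed metalchat || echo "try: brew list --versions metalchat"
```

If the binary is not found, brew list --versions metalchat shows whether Homebrew considers the formula installed at all.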

User Guide

Information about installation and usage of the MetalChat library and the binary utility.

User Guide

Development notes and contribution guide

Information about the development principles of this library and how you can contribute.

Development
- Building from source

MetalChat reference

The programming interface exposed by the MetalChat library.

Reference