Safetensors format#
These are methods to initialize neural networks from safetensors files. To use them, include the header as in the example below:
#include <metalchat/safetensor.h>
Safetensor document#
See also
For more details on the safetensors format and other implementations, refer to the Hugging Face page.
-
class safetensor_document#
A document for writing and reading tensors in a safetensor format.
Public Types
-
using metadata_container = std::unordered_map<std::string, std::string, _StringHash>#
A container type used to store metadata of the safetensor_document.
-
using iterator = safetensor_iterator#
An iterator type of safetensor instances.
Public Functions
-
safetensor_document()#
A default safetensor_document constructor.
-
safetensor_document(const safetensor_document&) = default#
A safetensor_document copy constructor.
-
std::vector<std::size_t> offsets() const#
Returns tensor offsets (relative to a safetensor metadata header) in bytes.
-
std::vector<std::size_t> sizes() const#
Returns a list of tensor sizes in bytes.
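The relation between offsets and sizes can be sketched as follows. In the safetensors format, a file starts with an 8-byte length prefix, followed by a JSON header of that length, followed by the tensor data; each header entry stores a begin and an end offset relative to the start of the data section. The `tensor_span` struct and both helper functions below are hypothetical illustrations, not part of metalchat, and the library may compute these values differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one header entry; not a metalchat type.
struct tensor_span {
    std::string name;
    std::size_t begin; // first byte of the tensor, relative to the data section
    std::size_t end;   // one past the last byte of the tensor
};

// Size of every tensor in bytes, derived from its [begin, end) offsets.
std::vector<std::size_t> byte_sizes(const std::vector<tensor_span>& spans) {
    std::vector<std::size_t> out;
    out.reserve(spans.size());
    for (const auto& s : spans) {
        out.push_back(s.end - s.begin);
    }
    return out;
}

// Absolute position of a tensor in the file:
// 8-byte length prefix, then the JSON header, then the data section.
std::size_t file_position(std::size_t header_size, const tensor_span& s) {
    return 8 + header_size + s.begin;
}
```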
-
iterator begin()#
Returns an iterator to the first safe tensor in a document.
auto document = safetensor_document::open("model.safetensors");
for (auto it = document.begin(); it != document.end(); ++it) {
    std::cout << (*it) << std::endl;
}
Note
It’s guaranteed that tensors are returned in an order defined by their offset in a document.
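One way to obtain such an ordering is to sort the header entries by their byte offset, since the keys of a JSON header carry no ordering guarantee of their own. The sketch below assumes a hypothetical `header_entry` struct; it illustrates the idea only and is not the metalchat implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical header entry; the field names are illustrative only.
struct header_entry {
    std::string name;
    std::size_t begin; // byte offset of the tensor within the data section
};

// Sort entries by offset so that iteration follows the on-disk layout.
void sort_by_offset(std::vector<header_entry>& entries) {
    std::sort(entries.begin(), entries.end(),
              [](const header_entry& a, const header_entry& b) {
                  return a.begin < b.begin;
              });
}
```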
-
const_iterator begin() const#
Returns a constant iterator to the first safe tensor in a document.
-
const_iterator end() const#
Returns a constant iterator past the last safe tensor in a document.
-
void insert(const safetensor &st)#
Insert a safetensor into the safetensor document.
-
void insert(const std::string &name, const basic_tensor &tensor)#
Insert a tensor into the safetensor document.
The implementation saves a pointer to the underlying container, so the caller may destroy the tensor that refers to that container after insertion.
Example:
auto weight = zeros<float>({3, 4});
safetensor_document doc;
doc.insert("weight", weight);
doc.save("weights.safetensors");
- Parameters:
name – A name of the tensor to insert.
tensor – A tensor data to insert into the safetensor document.
-
void insert(const std::string &name, const std::string &source)#
Insert a tensor into the safetensor document.
Inserts a new tensor entry within a safetensor document that refers to another tensor with the name specified in the source argument. The source tensor must be present in the safetensor document.
Both tensors will share the same underlying container.
Example:
auto weight = zeros<float>({3, 4});
safetensor_document doc;
doc.insert("input.weight", weight);
doc.insert("output.weight", "input.weight");
- Parameters:
name – An alias of the existing safetensor.
source – A name of the source tensor (should be present in the safetensor document).
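The sharing behaviour described above can be pictured with reference-counted pointers: an alias is a second name that holds the same pointer as its source. The `alias_table` class below is a hypothetical, simplified model that uses `std::vector<float>` in place of a real tensor container; it is a sketch under those assumptions, not the library's data structure.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor container.
using shared_data = std::shared_ptr<std::vector<float>>;

class alias_table {
public:
    // Register a tensor under the given name.
    void insert(const std::string& name, shared_data data) {
        entries_[name] = std::move(data);
    }

    // Register `name` as an alias of `source`; both share one container.
    void insert_alias(const std::string& name, const std::string& source) {
        auto it = entries_.find(source);
        if (it == entries_.end()) {
            throw std::invalid_argument("unknown source tensor: " + source);
        }
        shared_data shared = it->second; // copy the pointer, not the data
        entries_[name] = std::move(shared);
    }

    const shared_data& at(const std::string& name) const {
        return entries_.at(name);
    }

private:
    std::unordered_map<std::string, shared_data> entries_;
};
```

In this model, writing through one name is visible through the other, because no tensor data is duplicated.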
-
void insert(const nn::basic_layer &layer)#
Insert all registered parameters of the specified layer into the safetensor document.
This method recursively traverses layer and inserts parameters into the safetensor document.
Example:
auto linear = nn::linear<float>({10, 64});
safetensor_document doc;
doc.insert(linear);
doc.save("linear.safetensors");
- Parameters:
layer – A layer to use.
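A recursive traversal of this kind might look like the sketch below. The `layer` struct is a hypothetical model of a layer tree, and the dot-separated naming scheme is an assumption suggested by names such as `input.weight` in the examples on this page; the real `nn::basic_layer` interface is not shown here.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a layer tree; not the nn::basic_layer API.
struct layer {
    std::string name;                // name under which the parent registered this layer
    std::vector<std::string> params; // parameters registered directly on this layer
    std::vector<layer> children;     // nested layers
};

// Collect fully qualified parameter names, depth first.
void collect(const layer& l, const std::string& prefix,
             std::vector<std::string>& out) {
    for (const auto& p : l.params) {
        out.push_back(prefix + p);
    }
    for (const auto& c : l.children) {
        collect(c, prefix + c.name + ".", out);
    }
}
```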
-
template<typename Layer>
inline void insert(const nn::indirect_layer<Layer> &layer)#
Insert all registered parameters of the indirect layer into the safetensor document.
- Template Parameters:
Layer – A type of the layer whose parameters are inserted.
- Parameters:
layer – A layer to use.
-
void load(nn::basic_layer &layer) const#
Load tensors from a safetensor document into the layer’s registered parameters.
The method traverses all tensors in the safetensor_document and assigns them to the registered parameters of the specified layer. It raises an exception when a parameter is present in the document but not registered in the layer.
Example:
hardware_accelerator accelerator;
nn::linear<float> linear(accelerator);
auto doc = safetensor_document::open("linear.safetensors", accelerator);
doc.load(linear);
Note
Consider using safetensor_document::begin() and safetensor_document::end() iterators to implement a custom logic of weights assignment.
Warning
Layer parameters should be using the same container type as the safetensor document.
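The assignment rule can be sketched with plain maps: every tensor in the document must match a registered parameter, otherwise loading fails. The function `assign_all` and the use of `std::vector<float>` as tensor data are hypothetical simplifications. How parameters that are registered but absent from the document are treated is not stated on this page; the sketch leaves them untouched, which is an assumption.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for tensor data.
using tensor_data = std::vector<float>;

// Copy every document tensor into the parameter with the same name.
// Throws if the document holds a tensor the layer did not register.
void assign_all(const std::map<std::string, tensor_data>& document,
                std::map<std::string, tensor_data>& parameters) {
    for (const auto& [name, data] : document) {
        auto it = parameters.find(name);
        if (it == parameters.end()) {
            throw std::runtime_error("parameter is not registered: " + name);
        }
        it->second = data;
    }
}
```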
-
template<typename Layer>
inline void load(nn::indirect_layer<Layer> &layer) const#
Load tensors from a safetensor document into an indirect layer.
-
void load(const std::string &name, basic_tensor &tensor) const#
Load tensor from a safetensor document into a specified tensor.
The implementation assigns a new container to the specified tensor (so the target tensor may be empty or of any arbitrary size) and resets the size of the tensor to correctly address elements of the new container. The method also resets offsets if they were set in the target tensor; see tensor_accessor::resize(BidirIt, BidirIt, Accessor&) for more details.
Depending on the allocator type and the way the safetensor document was opened, the new container might alias a pointer to the resources that were used to create it (such as memory-mapped files).
Example:
tensor<float> target;
auto doc = safetensor_document::open("linear.safetensors");
doc.load("weight", target);
Warning
Tensor should be using the same container type as the safetensor document.
- Parameters:
name – A name of the tensor to load.
tensor – A target tensor that will be updated.
-
void save(const std::filesystem::path &p)#
Save all registered tensors into the file at the specified location.
- Parameters:
p – A path to the file to save tensors.
-
metadata_container &get_metadata()#
Get a reference to the safetensor_document metadata.
- Returns:
a reference to the metadata container.
Public Static Functions
-
static safetensor_document open(const std::filesystem::path &p)#
Open a safetensor document.
This implementation uses a memory-mapped file and allocates all tensors into random_memory_container without copying the underlying memory. It is safe to destroy this instance after accessing tensors, since the tensors carry a pointer to the backing file. As long as a pointer to a container exists, the memory-mapped file won’t be closed.
- Parameters:
p – A path in the filesystem to a file in a safetensor format.
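The lifetime rule above (the file stays open while any tensor pointer exists) is the behaviour provided by the aliasing constructor of `std::shared_ptr`. The sketch below models it with a hypothetical `mapped_file` class that merely owns a byte buffer; whether metalchat uses this exact mechanism is an assumption.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memory-mapped file.
class mapped_file {
public:
    mapped_file(std::size_t size, bool& closed) : bytes_(size), closed_(closed) {}
    ~mapped_file() { closed_ = true; } // models unmapping and closing the file
    const std::byte* data() const { return bytes_.data(); }

private:
    std::vector<std::byte> bytes_;
    bool& closed_;
};

// Returns a pointer into the mapping that shares ownership of the mapping.
// The aliasing constructor keeps `file` alive without copying any bytes.
std::shared_ptr<const std::byte> view(const std::shared_ptr<mapped_file>& file,
                                      std::size_t offset) {
    return std::shared_ptr<const std::byte>(file, file->data() + offset);
}
```

In this model the mapping is released only when the last view is destroyed, even if the object that created the mapping is long gone.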
-
static safetensor_document open(const std::filesystem::path &p, hardware_accelerator &accelerator)#
Open a safetensor document.
This implementation, like safetensor_document::open(const std::filesystem::path&), uses a memory-mapped file, but all tensors are allocated using hardware_memory_container.
Similarly, a pointer to the memory-mapped file is carried over by the tensors.
This is the most efficient implementation, since it tries to allocate buffers of the maximum size allowed by the hardware accelerator, and then uses pooling_allocator_adapter and nocopy_allocator to avoid copying memory from the memory-mapped file.
- Parameters:
p – A path in the filesystem to a file in a safetensor format.
accelerator – A hardware accelerator.
-
template<allocator_t<void> Allocator>
static inline safetensor_document open(std::istream &is, Allocator alloc)#
Open a safetensor document.
This implementation reads safetensor data from the specified input stream, so every read copies data from the stream into the tensor containers. The containers do not hold a reference to the specified stream.
- Template Parameters:
Allocator – A type of the allocator used to allocate tensor containers.
- Parameters:
is – An input stream from which tensors are read.
alloc – An instance of a void container allocator to allocate tensor containers.
-
template<allocator_t<void> Allocator>
static inline safetensor_document open(const std::filesystem::path &p, Allocator &alloc, std::size_t max_size = -1)#
Open a safetensor document.
This implementation reads safetensor data from the specified memory-mapped file and then uses a paginated allocator to create large Metal buffers from which tensors are allocated. All containers hold a pointer to the opened file.
- Template Parameters:
Allocator – A type of the allocator used to allocate tensor containers.
- Parameters:
p – A path in the filesystem to a file in a safetensor format.
alloc – An instance of the Allocator type.
max_size – A maximum size of the buffer to allocate.
-
template<allocator_t<void> Allocator>
static inline safetensor_document open(const std::filesystem::path &p, Allocator &&alloc, std::size_t max_size = -1)#
Open a safetensor document.
This implementation is similar to safetensor_document::open(const std::filesystem::path&, Allocator&, std::size_t), except that the allocator must be an rvalue.
-
static void load(const std::filesystem::path &p, nn::basic_layer &layer)#
Load tensors from a safetensor document into the layer’s registered parameters.
The implementation is identical to safetensor_document::load(nn::basic_layer&) const; the only difference is that the safetensor document is not returned to the caller.
Warning
Layer parameters should be using the same container type as the safetensor document.
- Parameters:
p – A path to load tensors from.
layer – A layer instance to load tensors into.
-
template<typename Layer>
static inline void load(const std::filesystem::path &p, nn::indirect_layer<Layer> &layer)#
Load tensors from a safetensor document into the indirect layer’s registered parameters.
- Template Parameters:
Layer – A type of the layer to load into.
- Parameters:
p – A path to load tensors from.
layer – A layer instance to load tensors into.
-
static void save(const std::filesystem::path &p, nn::basic_layer &layer)#
Save all registered parameters of the layer into the file at the specified location.
Warning
Layer parameters should be using the same container type as the safetensor document.
- Parameters:
p – A path to the file to save tensors.
layer – A layer containing parameters to save into the safetensors document.
-
template<typename Layer>
static inline void save(const std::filesystem::path &p, nn::indirect_layer<Layer> &layer)#
Save all registered parameters from the indirect layer into the file at the specified location.
- Template Parameters:
Layer – A type of the layer to save.
- Parameters:
p – A path to the file to save tensors.
layer – A layer containing parameters to save into the safetensors document.
Safetensor#
-
class safetensor#
Public Functions
-
inline const std::string &name() const#
Returns a name of the tensor (a complete path as in the original safetensor document).
-
inline const std::string &dtype() const#
Returns a data type string representation as in the safetensors specification (I8, U8, I16, etc.).
-
inline std::size_t dimensions() const#
Returns the number of dimensions in the tensor.
-
inline std::size_t numel() const#
Returns the total number of elements in the tensor.
-
inline const std::span<std::size_t> sizes() const#
Returns sizes of the tensor.
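The three accessors above are related: the number of dimensions is the length of the sizes list, and the total number of elements is the product of the sizes. A minimal sketch of that relation, using a free function and `std::vector` as stand-ins rather than the `safetensor` class itself:

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Total element count is the product of all dimension sizes.
// An empty shape yields 1 (the empty product), which corresponds to a scalar.
std::size_t numel(const std::vector<std::size_t>& sizes) {
    return std::accumulate(sizes.begin(), sizes.end(), std::size_t{1},
                           std::multiplies<>{});
}
```

For a tensor of shape 3 by 4 this gives 12 elements, and any dimension of size zero makes the whole tensor empty.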
Safetensor allocator#
-
template<allocator_t<void> Allocator>
class safetensor_allocator#
A safetensor allocator dynamically (at run time) dispatches the allocator type binding according to the type of a tensor specified in the safetensor document.
This type is used internally within a safetensor_document and does not expose a public API for registering new, unsupported types.
Here is an example of allocating a 128-element container of int32_t type on the heap:
using Allocator = random_memory_allocator<void>;
safetensor_allocator<Allocator> dynamic_alloc;
auto alloc = Allocator();
auto container_ptr = dynamic_alloc.allocate("I32", 128, alloc);
- Template Parameters:
Allocator – A type-erased allocator used to serve dynamic allocations.
Public Types
-
using container_ptr = std::shared_ptr<basic_container>#
Type of the allocated container. All containers inherit from the basic container, so the allocator returns a polymorphic pointer to the actual container implementation.
Public Functions
-
inline safetensor_allocator()#
A safetensor_allocator default constructor.
-
safetensor_allocator(const safetensor_allocator&) = default#
A safetensor_allocator copy constructor.
-
inline container_ptr allocate(const std::string &type_name, void *data, std::size_t size, Allocator &alloc)#
Allocate a block of contiguous memory of the specified type and initialize it with the data specified by the argument data.
- Parameters:
type_name – a type name of container elements (e.g. ‘I32’, ‘F32’, ‘F64’, etc.).
data – a contiguous block of data to initialize new memory with.
size – a size of a new container in bytes.
alloc – a basic void allocator to use for typed allocation.
-
inline container_ptr allocate(const std::string &type_name, std::size_t size, Allocator &alloc)#
Allocate an uninitialized block of contiguous memory of the specified type.
- Parameters:
type_name – a type name of container elements (e.g. ‘I32’, ‘F32’, ‘F64’, etc.).
size – a size of a new container in bytes.
alloc – a basic void allocator to use for typed allocation.
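Run-time dispatch on a type name can be pictured as a mapping from the safetensors dtype string to a concrete element type, with the result returned through a pointer to a common base class. The sketch below uses hypothetical stand-ins (`erased_container`, `typed_container`, `allocate_by_name`) and a plain `std::vector` as storage; it covers only a few dtypes and is not the metalchat allocator.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not the metalchat container classes.
struct erased_container {
    virtual ~erased_container() = default;
    virtual std::size_t size_bytes() const = 0;
};

template <typename T>
struct typed_container : erased_container {
    std::vector<T> data;
    // `bytes` is a size in bytes, so the element count is bytes / sizeof(T).
    explicit typed_container(std::size_t bytes) : data(bytes / sizeof(T)) {}
    std::size_t size_bytes() const override { return data.size() * sizeof(T); }
};

using erased_ptr = std::shared_ptr<erased_container>;

// Pick the element type at run time from the safetensors dtype string.
erased_ptr allocate_by_name(const std::string& type_name, std::size_t bytes) {
    if (type_name == "I8") return std::make_shared<typed_container<std::int8_t>>(bytes);
    if (type_name == "U8") return std::make_shared<typed_container<std::uint8_t>>(bytes);
    if (type_name == "I32") return std::make_shared<typed_container<std::int32_t>>(bytes);
    if (type_name == "F32") return std::make_shared<typed_container<float>>(bytes);
    if (type_name == "F64") return std::make_shared<typed_container<double>>(bytes);
    throw std::invalid_argument("unsupported dtype: " + type_name);
}
```

A caller that knows the expected element type can recover the typed view with `dynamic_cast`, while code that only moves bytes around can stay with the base pointer.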