Key Features
Automatic Downloads
Easily download models from Hugging Face and other repositories with a simple command.
ONNX Conversions
Convert models to the ONNX format for portable, often faster inference across runtimes.
Automatic Server Setup
Host models with a built-in server for easy integration with your applications.
GPU Support
Leverage GPU acceleration to speed up model inference.
8-bit Quantization Support
Shrink model size and memory footprint with 8-bit quantization.
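To illustrate what 8-bit quantization does under the hood, here is a minimal sketch of affine (asymmetric) quantization, the common scheme for mapping 32-bit floats onto the 0–255 integer range. This is an illustrative standalone example, not Infero's actual implementation; all names are made up for the sketch.

```python
def quantize_8bit(values):
    """Map floats onto 0..255 using an affine scheme: q = round((v - lo) / scale)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # guard against all-equal inputs
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo

def dequantize_8bit(q, scale, lo):
    """Recover approximate floats from the 8-bit codes."""
    return [qi * scale + lo for qi in q]

weights = [-1.5, -0.2, 0.0, 0.7, 2.3]
q, scale, zero = quantize_8bit(weights)
restored = dequantize_8bit(q, scale, zero)

# Rounding loses at most half a quantization step per value.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, which is where the roughly 4x size reduction comes from; the trade-off is the small rounding error bounded by half a quantization step.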
How to Use Infero
Installation
Install Infero using pip:
```bash
pip install infero
```
This will install the latest version of Infero and all its dependencies.
Get Started with Infero Today
Start optimizing your AI model deployments with Infero. Open source, easy to use, and powerful.