Infero

A powerful CLI tool for downloading, converting, and hosting models with the ONNX Runtime.

```bash
$ pip install infero

$ infero pull [hf_model_name]

$ infero run [hf_model_name] --quantize

$ infero list
```

Key Features

Automatic Downloads

Easily download models from Hugging Face and other repositories with a simple command.

ONNX Conversions

Convert models to ONNX format for faster inference.

Automatic Server Setup

Host models with a built-in server for easy integration with your applications.
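Once `infero run` has a server up, you can call it over HTTP. A minimal standard-library sketch; the URL, endpoint path, and JSON payload shape (`prompt`, `max_tokens`) are assumptions here, so check the server's actual API before using them:

```python
import json
import urllib.request

# Hypothetical local endpoint -- an assumption, not Infero's documented API.
INFERO_URL = "http://localhost:8000/generate"

def build_request(url: str, prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build a JSON POST request for a locally hosted model server."""
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )

req = build_request(INFERO_URL, "Hello, Infero!")
# response = urllib.request.urlopen(req)  # uncomment with a running server
```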

GPU Support

Leverage GPU acceleration for faster model inference.

8-bit Quantization Support

Reduce model size and improve performance with 8-bit quantization.

How to Use Infero

Installation

Install Infero using pip:

```bash
pip install infero
```

This will install the latest version of Infero and all its dependencies.
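You can confirm the installation from Python with only the standard library:

```python
from importlib.metadata import PackageNotFoundError, version

# Look up the installed distribution's version, if any.
try:
    print(f"infero {version('infero')} installed")
except PackageNotFoundError:
    print("infero is not installed; run `pip install infero`")
```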

Get Started with Infero Today

Start optimizing your AI model deployments with Infero. Open source, easy to use, and powerful.