LOCAL MODELS · FREE

Run AI models on your own machine

Download and run Ollama models in a click. Call embeddings and completions from your projects through one local endpoint — no key, no cost, nothing leaves your machine.

Local models, one endpoint straight to your project

Models run on your own machine, exposed through one local endpoint. Your PHP / Node / Python projects call it like a cloud API — except the data never leaves your machine.

Your project

PHP · Node · Python

Local endpoint

One unified API

Ollama models

Running on your machine

Three steps to local models

Download in a click

Pick a model in ServBay and hit download — with resume support, so interruptions are no problem.

Endpoint, automatically

Once a model is ready, it's exposed through one local endpoint — zero configuration.

Call it from your project

Point your project at the local endpoint and call embeddings and completions like any cloud API.

What you get

One-click model download with resume support

Local models exposed through one endpoint

Call embeddings and completions from your PHP / Node / Python projects

No key, no cost — data never leaves your machine

Coming soon · llama.cpp and MLX support

Call local models like a cloud API

One standard interface; switch models by changing a single name.

ServBay · localhost:11434

# Call a local model, exactly like a cloud API

curl http://localhost:11434/v1/chat/completions \

-d '{"model": "llama3", "messages": […]}'

# Switch models? Just change the model name

"model": "qwen2"

Leading open models you can run locally

Open models from the Ollama library, downloaded and ready in one click.

Llama

Qwen

DeepSeek

Mistral

Gemma

Phi

… and many more from the Ollama library

Frequently Asked Questions

Which models can I run?

Any open model in the Ollama library, from Llama and Qwen to DeepSeek and Mistral — downloaded and run in a click.

Can my local project call them directly?

Yes. Through one local endpoint, your PHP / Node / Python projects can call embeddings and completions directly.

Is any data uploaded?

No. Models and calls run entirely on your own machine; data never leaves it.

Documentation

Release Notes

About Us