Getting started with LlamaBoss

LlamaBoss is a native Windows app for chatting with AI models that run entirely on your own computer. This page covers installing it, loading your first model, and what it can do.

Figure: LlamaBoss running a local model, fully offline. The chat shows a streamed answer to a C++ question, with a formatted code block and an explanation.

What is LlamaBoss

LlamaBoss is a desktop chat app for Windows that runs large language models locally — on your machine, with no cloud service, account, or sign-in. It bundles the llama.cpp engine, so there is nothing else to set up: no Docker, no Python, no Ollama.

You can chat, drop in or paste images and text files, and get streaming answers with cleanly formatted code. You can also let the model use tools — reading and writing files, running scripts, working with spreadsheets and documents — with your approval on every action.

Install

  1. Download the installer from the home page (Windows 10 or 11, 64-bit).
  2. Run the downloaded .exe and click through the installer.
  3. Launch LlamaBoss from the Start menu.

Because LlamaBoss is a new, independently published app, Windows SmartScreen may show a warning the first time you run the installer. If you got the installer from the home page and trust your download, choose More info → Run anyway. The app is open source and MIT licensed, so you can review the code on GitHub.

Choose and download a model

LlamaBoss needs at least one model to chat with. Models are downloaded once and stored locally on your PC — by default under C:\Users\<you>\AppData\Local\LlamaBoss\models. Use the model menu at the top of the window (the pill that reads something like [ ● gemma-4-12B-it-bf16 ]) to pick a model you already have or download a new one.
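
If you want to check from outside the app which models are already on disk and how much space they use, a short script can walk the default folder mentioned above. This is a minimal sketch, not part of LlamaBoss: the function names are made up for illustration, and it assumes that %LOCALAPPDATA% points at C:\Users\<you>\AppData\Local (the usual Windows setup) and that every file under the models folder belongs to a model.

```python
import os
from pathlib import Path


def default_models_dir(env=None):
    """Return the default models folder named in this page.

    Assumption: the LOCALAPPDATA variable points at the current user's
    AppData/Local folder. If it is not set, fall back to the home folder.
    """
    env = os.environ if env is None else env
    local_app_data = env.get("LOCALAPPDATA")
    if local_app_data:
        base = Path(local_app_data)
    else:
        base = Path.home() / "AppData" / "Local"
    return base / "LlamaBoss" / "models"


def list_models(models_dir):
    """Return sorted (file name, size in GiB) pairs for files under models_dir.

    Returns an empty list if the folder does not exist yet, for example
    before the first model has been downloaded.
    """
    models_dir = Path(models_dir)
    if not models_dir.is_dir():
        return []
    return sorted(
        (p.name, round(p.stat().st_size / 2**30, 2))
        for p in models_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    for name, size_gib in list_models(default_models_dir()):
        print(f"{size_gib:8.2f} GiB  {name}")
```

The script only reads file names and sizes; it does not open, move, or delete anything.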

Bigger models are more capable but need more memory and a stronger graphics card; smaller models run comfortably on modest hardware. If you are not sure where to start, pick a smaller model first and move up once you know how your machine handles it.
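
As a rough guide to "how big is too big", the memory needed just to hold a model's weights is about the parameter count times the bytes per weight. The sketch below does that arithmetic. It is a back-of-the-envelope estimate under stated assumptions, not a figure reported by LlamaBoss: it ignores the extra memory used for the conversation context and runtime overhead, and real quantized files are often slightly larger than their nominal bit width suggests. The function name is hypothetical.

```python
def estimate_weight_memory_gib(params_billions, bits_per_weight):
    """Approximate memory for the model weights alone, in GiB.

    params_billions: model size in billions of parameters (12 for a 12B model).
    bits_per_weight: 16 for bf16/fp16, or a lower nominal value for
                     quantized models (for example 8 or 4).
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    for label, bits in [("16-bit (bf16)", 16), ("8-bit", 8), ("4-bit", 4)]:
        gib = estimate_weight_memory_gib(12, bits)
        print(f"12B model at {label}: about {gib:.1f} GiB for weights")
```

By this estimate, a 12B model stored at 16 bits per weight needs roughly 22 GiB for its weights, while the same parameter count at 4 bits needs under 6 GiB. That is why a smaller or more heavily quantized model is the safer first choice on modest hardware.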

What you need

Everything runs on your own hardware, so response speed depends on your machine and the size of the model you choose.

Tools and files

Beyond plain chat, LlamaBoss can act as an agent. With your permission, the model can read and write files, search folders, run scripts, and work with spreadsheets, PDFs, and Word documents inside a workspace kept per conversation. Every action that touches your system shows an approval prompt first, so nothing runs by surprise.
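
To make the approval step concrete, here is a minimal sketch of an approval gate: a tool action is wrapped so that it only runs after an explicit yes. This illustrates the general pattern and is not LlamaBoss's actual implementation. ToolRequest, run_with_approval, and console_approve are hypothetical names, and the real app shows a prompt in its own window rather than in a console.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToolRequest:
    tool: str     # hypothetical tool name, e.g. "write_file"
    summary: str  # human-readable description shown to the user


def run_with_approval(
    request: ToolRequest,
    action: Callable[[], Any],
    approve: Callable[[ToolRequest], bool],
) -> Dict[str, Any]:
    """Run `action` only if `approve(request)` returns True.

    If the user declines, the action is never called and a "denied"
    result is returned instead.
    """
    if not approve(request):
        return {"status": "denied", "tool": request.tool}
    return {"status": "ok", "tool": request.tool, "result": action()}


def console_approve(request: ToolRequest) -> bool:
    """Ask on the console; anything other than 'y' counts as a refusal."""
    answer = input(f"Allow {request.tool}: {request.summary}? [y/N] ")
    return answer.strip().lower() == "y"
```

The key design choice is that the default answer is "no": the action is passed in as a callable and is only invoked after approval, so a declined or ignored prompt leaves the system untouched.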

You can also organize your work into Projects and save reusable Skills, so common tasks and context are a click away.

Privacy

LlamaBoss is local-first by design. Your conversations and models stay on your computer. There is no telemetry, no account, and no data sent to any server. Once a model is downloaded, you can use LlamaBoss completely offline.

Help and feedback

Found a bug or have a request? The best place to reach me is GitHub Issues — it keeps reports searchable and trackable.