Framework · Self-hosted · No rate limits

DEPLOY YOUR AGENT.

Build production-grade AI agents. Host them on your own server. Private knowledge base, full REST API, end-to-end encryption. Zero data leaves your perimeter.

100% Self-hosted
0 Data leaks
TRACY // BOOT SEQUENCE
LIVE
  _______                              _____ 
 |__   __|                       /\   |_   _|
    | |_ __ __ _  ___ _   _     /  \    | |
    | | '__/ _` |/ __| | | |   / /\ \   | |
    | | | | (_| | (__| |_| |  / ____ \ _| |_
    |_|_|  \__,_|\___|\__, | /_/    \_\_____|
                       __/ |
                      |___/
Runtime: Isolated
Egress: Controlled
Encrypt: AES-256
Model: Agnostic
Early access

96 LEFT.

Join early and lock in 50% off before standard pricing goes live.

4 active early users · 96 places left · 50% off before public release

Capabilities

EVERYTHING
YOU BUILT IN.

01 — DEPLOY

Your server. Your compute. Full stop.

Tracy Server runs on your infra — Docker, bare-metal, Kubernetes. No SaaS, no phone-home, no surprise bills.

$ docker compose down
$ docker compose build --no-cache tracy
$ docker compose up -d
$ docker logs tracy --follow
02 — INGEST

Documents, plain and simple

CSV, DOC, DOCX, Excel, HTML, ODT, PDF, PPTX — parsed by specialized Go extractors, ready for search and RAG on your machine.
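As a sketch, an entire folder of mixed documents can be pushed through the `/collection/file` endpoint from the Protocol section — the `./docs` path, collection name, and embedding model here are examples, not defaults:

```shell
# Batch-ingest a folder of mixed documents (paths and names are examples)
TRACY_URL="http://localhost:9090"
for f in ./docs/*; do
  curl "${TRACY_URL}/collection/file" \
    --request POST \
    --header 'Content-Type: multipart/form-data' \
    --form 'collection=documents' \
    --form 'model=nomic-embed-text' \
    --form "file=@${f}"
done
```

Each file is parsed by the matching Go extractor server-side; nothing needs converting before upload.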

03 — SECURITY

Keep data on-prem

No third-party persistence. Indexes and documents stay on your disk; air-gapped deployments are first-class.

04 — API

Single HTTP entrypoint

A unified HTTP API over Tracy Server to extract, index, and query content, and to run AI agents with persistent conversations, streaming chat, and structured extraction.

05 — AGENTS

Agents that know your documents

Spin up conversational and extractor agents over your own files. Persist conversations, stream answers over SSE, and keep every token grounded in your knowledge base.

Protocol
01
Tokened. Built. Live.

Install Tracy Server

Generate an auth token, persist it to config, build the binary and boot the Docker stack in one go.

terminal
docker compose down
docker compose build --no-cache tracy
docker compose up -d
docker logs tracy --follow
02
From 7B toys to 671B giants.

Pull your LLMs

Download DeepSeek-V3.1:671B and keep every weight on your own hardware.

terminal
curl http://localhost:9090/model/download \
  --request POST \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "deepseek-v3.1:671b"
}'
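Since Tracy runs its LLMs locally through Ollama (see the Security section), you can double-check that the weights actually live on your hardware — assuming the Ollama CLI is installed on the host:

```shell
# List locally stored models and look for the one just pulled
MODEL="deepseek-v3.1:671b"
ollama list | grep "${MODEL%%:*}" || echo "model not pulled yet"
```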
03
Vector store, one call away.

Create your collection

Spin up a Qdrant collection tuned to your embedding size. Ready for chunks in a single POST.

terminal
curl http://localhost:9090/collection \
  --request POST \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "documents",
  "size": 768
}'
04
PDFs in. Chunks out.

Ingest your files

Upload PDFs, spreadsheets, or HTML. Tracy parses, chunks, and embeds them directly into your collection.

terminal
curl http://localhost:9090/collection/file \
  --request POST \
  --header 'Content-Type: multipart/form-data' \
  --form 'collection=documents' \
  --form 'model=nomic-embed-text' \
  --form 'file=@/path/to/file.pdf'
05
RAG and extractors, typed.

Spin up your agents

Create conversational and extractor agents in a few JSON lines, with RAG for chat and schema-based extraction for structured outputs.

terminal
# Conversational agent
curl http://localhost:9090/agent/agent \
  --request POST \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "HR Assistant",
  "role": "<your system prompt>",
  "folder": "documents",
  "model": "llama3.2",
  "embedding_model": "nomic-embed-text"
}'

# Extractor agent
curl http://localhost:9090/agent/extractor \
  --request POST \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "Contract Extractor",
  "role": "<your system prompt>",
  "model": "llama3.2",
  "format": "{ /* JSON schema */ }"
}'
06
Conversational RAG, on-prem.

Chat with your agent

Send prompts into a persistent conversation and stream answers token by token from your own server.

terminal
curl http://localhost:9090/chat \
  --request POST \
  --header 'Content-Type: application/json' \
  --data '{
  "agent_id": 1,
  "conversation_id": 1,
  "prompt": "What leave days are planned this year?",
  "limit": 5
}'
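To consume the stream rather than wait for the full answer, the same call can be made with curl's `-N` (no-buffer) flag; the `Accept: text/event-stream` header is an assumption about how the server negotiates SSE:

```shell
# Stream the answer token by token over SSE (-N disables output buffering)
CHAT_URL="http://localhost:9090/chat"
curl -N "$CHAT_URL" \
  --request POST \
  --header 'Content-Type: application/json' \
  --header 'Accept: text/event-stream' \
  --data '{
  "agent_id": 1,
  "conversation_id": 1,
  "prompt": "What leave days are planned this year?",
  "limit": 5
}'
```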
07
Free text in. JSON out.

Extract structured data

Call extractor agents to turn raw text into typed JSON payloads that slot straight into your pipelines.

terminal
curl http://localhost:9090/extract \
  --request POST \
  --header 'Content-Type: application/json' \
  --data '{
  "agent_id": 1,
  "text": "Contract signed on 01/01/2024...",
  "nb": 3
}'
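Because the response is plain JSON, it drops straight into standard tooling. A sketch with `jq` — the field names here are placeholders; your `format` schema defines the real shape:

```shell
# Hypothetical extractor response; fields are defined by your own schema
RESPONSE='{"signed_on": "01/01/2024", "parties": ["Acme", "Globex"]}'

# Pull one typed field for the next pipeline stage
echo "$RESPONSE" | jq -r '.signed_on'
# -> 01/01/2024
```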
Security

BUILT
PARANOID.
DEPLOYED
CONFIDENT.

Security in Tracy isn't a feature layer — it's the foundation every decision is built on. Local LLMs, on-premise embeddings, hardened binary. When your data never leaves your server, you don't need to trust us.

Data never leaves your infrastructure

Ollama runs your LLMs locally. Qdrant stores your embeddings on-premise. Document parsing (DOCX, PDF, ODT, Excel) happens on your server.

Bearer token authentication

Every API route is protected by Bearer token middleware.
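In practice that means every request carries an `Authorization` header — a sketch, with `TRACY_TOKEN` standing in for the token generated at install time:

```shell
# TRACY_TOKEN is a placeholder for the token generated at install time
TRACY_TOKEN="<your-token>"
AUTH_HEADER="Authorization: Bearer ${TRACY_TOKEN}"

curl http://localhost:9090/collection \
  --request POST \
  --header "$AUTH_HEADER" \
  --header 'Content-Type: application/json' \
  --data '{"name": "documents", "size": 768}'
```

Requests without a valid token are rejected before they reach any handler.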

Air-gapped & fully self-contained

Single static binary, zero runtime dependencies, built for isolated networks and local AI deployments.

GDPR ready · HIPAA compatible · No telemetry · Air-gapped
Field reports

TEAMS WHO
TAKE IT SERIOUSLY.

"We replaced three SaaS AI subscriptions with Tracy in a weekend. Our IP stays ours."
LF

Léa Fontaine

CTO, Meridian Labs

"Air-gapped deployment was exactly what our compliance team needed. Tracy just works in isolation."
MV

Marcus Void

Lead Infra, Arctis Security

"The API design is exceptional. PoC to production in 4 days. Zero surprises."
YT

Yuki Tanaka

AI Engineer, Fluxion

Self-hosted · One license · Yours forever

OWN YOUR
INTELLIGENCE.

One license. Deployed on your infrastructure. No usage fees, no per-seat billing, no data leaving your server. Pay once — run forever on your terms.

$ docker compose up -d
Pricing

ONE PLAN.
NO SURPRISES.

Monthly subscription
Monthly

2-week free trial · Billed monthly · Cancel anytime

  • Full Tracy AI access
  • 2-week free trial
  • Local LLM support (Ollama)
  • RAG + document ingestion
  • API access + Bearer auth
  • Updates & new features
  • Email support
Get started
Perpetual license
One-time

Own it forever · No renewal

Need a perpetual license for your organization? Custom pricing is available for teams requiring a one-time purchase, on-premise deployment agreements, or volume deals.

Reach out and we'll put together a proposal within 48h.

Contact us

All plans include self-hosted deployment · 2-week free trial on the monthly plan · No data leaves your server · Air-gapped compatible