DEPLOY YOUR AGENT.
Build production-grade AI agents. Host them on your own server. Private knowledge base, full REST API, end-to-end encryption. Zero data leaves your perimeter.
TRACY AI
96 LEFT.
Join early and lock in 50% off before standard pricing goes live.
4 active early users · 96 places left · 50% off before public release
EVERYTHING
YOU BUILT IN.
Native file readers, predictable APIs, and local-first indexing. Tracy is a backend you can ship, not a black box you have to trust.
Your server. Your compute. Full stop.
Tracy Server runs on your infra — Docker, bare-metal, Kubernetes. No SaaS, no phone-home, no surprise bill.
Documents, plain and simple
CSV, DOC, DOCX, Excel, HTML, ODT, PDF, PPTX — parsed by specialized Go extractors, ready for search and RAG on your machine.
Keep data on-prem
No third-party persistence. Indexes and documents stay on your disk; air-gapped deployments are first-class.
Single HTTP entrypoint
A unified HTTP API over Tracy Server to extract, index, and query content, and to run AI agents with conversations, streaming chat, and structured extraction.
Agents that know your documents
Spin up conversational and extractor agents over your own files. Persist conversations, stream answers over SSE, and keep every token grounded in your knowledge base.
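Because streaming is plain SSE over HTTP, curl alone can consume it. A minimal sketch, assuming the /chat endpoint (shown in the quick-start below) streams server-sent events when asked; add your Bearer token if auth is enabled:

```shell
# Stream a chat answer as server-sent events (SSE).
# --no-buffer prints tokens as they arrive instead of waiting for EOF.
curl --no-buffer http://localhost:9090/chat \
--request POST \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--data '{
"agent_id": 1,
"conversation_id": 1,
"prompt": "Summarize our leave policy."
}'
```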
Install Tracy Server
Generate an auth token, persist it to config, build the binary, and boot the Docker stack in one go.
docker compose down
docker compose build --no-cache tracy
docker compose up -d
docker logs tracy --follow
Pull your LLMs
Download DeepSeek-V3.1:671B and keep every weight on your own hardware.
curl http://localhost:9090/model/download \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek-v3.1:671b"
}'
Create your collection
Spin up a Qdrant collection tuned to your embedding size. Ready for chunks in a single POST.
curl http://localhost:9090/collection \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"name": "documents",
"size": 768
}'
Ingest your files
Upload PDFs, spreadsheets, or HTML. Tracy parses, chunks, and embeds directly into your collection.
curl http://localhost:9090/collection/file \
--request POST \
--header 'Content-Type: multipart/form-data' \
--form 'collection=documents' \
--form 'model=nomic-embed-text' \
--form 'file=@/path/to/file.pdf'
Spin up your agents
Create conversational and extractor agents in a few JSON lines, with RAG for chat and schema-based extraction for structured outputs.
# Conversational agent
curl http://localhost:9090/agent/agent \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"name": "HR Assistant",
"role": "<your system prompt>",
"folder": "documents",
"model": "llama3.2",
"embedding_model": "nomic-embed-text"
}'
# Extractor agent
curl http://localhost:9090/agent/extractor \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"name": "Contract Extractor",
"role": "<your system prompt>",
"model": "llama3.2",
"format": "{ /* JSON schema */ }"
}'
Chat with your agent
Send prompts into a persistent conversation and stream answers token by token from your own server.
curl http://localhost:9090/chat \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"agent_id": 1,
"conversation_id": 1,
"prompt": "What leave days are planned this year?",
"limit": 5
}'
Extract structured data
Call extractor agents to turn raw text into typed JSON payloads that slot straight into your pipelines.
curl http://localhost:9090/extract \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"agent_id": 1,
"text": "Contract signed on 01/01/2024...",
"nb": 3
}'
BUILT
PARANOID.
DEPLOYED
CONFIDENT.
Security in Tracy isn't a feature layer — it's the foundation every decision is built on. Local LLMs, on-premise embeddings, hardened binary. When your data never leaves your server, you don't need to trust us.
Data never leaves your infrastructure
Ollama runs your LLMs locally. Qdrant stores your embeddings on-premise. Document parsing (DOCX, PDF, ODT, Excel) happens on your server.
Bearer token authentication
Every API route is protected by Bearer token middleware.
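In practice that means every request carries an Authorization header. A sketch, reusing the collection call from the quick-start above; TRACY_TOKEN is a placeholder for the token you generated at install time:

```shell
# The same collection call as above, now authenticated.
# Requests without a valid Bearer token are rejected by the middleware.
curl http://localhost:9090/collection \
--request POST \
--header "Authorization: Bearer $TRACY_TOKEN" \
--header 'Content-Type: application/json' \
--data '{ "name": "documents", "size": 768 }'
```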
Air-gapped & fully self-contained
Single static binary, zero runtime dependencies, built for isolated networks and local AI deployments.
TEAMS WHO
TAKE IT SERIOUSLY.
"We replaced three SaaS AI subscriptions with Tracy in a weekend. Our IP stays ours."
Léa Fontaine
CTO, Meridian Labs
"Air-gapped deployment was exactly what our compliance team needed. Tracy just works in isolation."
Marcus Void
Lead Infra, Arctis Security
"The API design is exceptional. PoC to production in 4 days. Zero surprises."
Yuki Tanaka
AI Engineer, Fluxion
OWN YOUR
INTELLIGENCE.
One license. Deployed on your infrastructure. No usage fees, no per-seat billing, no data leaving your server. Pay once — run forever on your terms.
ONE PLAN.
NO SURPRISES.
2-week free trial · Billed monthly · Cancel anytime
- ⬡ Full Tracy AI access
- ⬡ 2-week free trial
- ⬡ Local LLM support (Ollama)
- ⬡ RAG + document ingestion
- ⬡ API access + Bearer auth
- ⬡ Updates & new features
- ⬡ Email support
Own it forever · No renewal
Need a perpetual license for your organization? Custom pricing is available for teams requiring a one-time purchase, on-premise deployment agreements, or volume deals.
Reach out and we'll put together a proposal within 48h.
All plans include self-hosted deployment · 2-week free trial on the monthly plan · No data leaves your server · Air-gapped compatible