DEPLOY YOUR AGENT.
Build production-grade AI agents. Host them on your own server. Private knowledge base, full REST API, end-to-end encryption. Zero data leaves your perimeter.
TRACY AI
96 LEFT.
Join early and lock in 50% off before standard pricing goes live.
4 active early users · 96 places left · 50% off before public release
EVERYTHING
YOU BUILT IN.
Native file readers, predictable APIs, and local-first indexing. Tracy is a backend you can ship, not a black box you have to trust.
Your server. Your compute. Full stop.
Tracy Server runs on your infra — Docker, bare-metal, Kubernetes. No SaaS, no phone-home, no surprise bill.
Documents, plain and simple
CSV, DOC, DOCX, Excel, HTML, ODT, PDF, PPTX — parsed by specialized Go extractors, ready for search and RAG on your machine.
Keep data on-prem
No third-party persistence. Indexes and documents stay on your disk; air-gapped deployments are first-class.
Single HTTP entrypoint
A unified HTTP API over Tracy Server to extract, index, and query content, and to run AI agents with conversations, streaming chat, and structured extraction.
Agents that know your documents
Spin up conversational and extractor agents over your own files. Persist conversations, stream answers over SSE, and keep every token grounded in your knowledge base.
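Because streaming is plain SSE over HTTP, curl alone can consume it. A minimal sketch, assuming the /chat endpoint (shown in the quick-start below) streams server-sent events when asked; add your Bearer token if auth is enabled:

```shell
# Stream a chat answer as server-sent events (SSE).
# --no-buffer prints tokens as they arrive instead of waiting for EOF.
curl --no-buffer http://localhost:9090/chat \
--request POST \
--header 'Content-Type: application/json' \
--header 'Accept: text/event-stream' \
--data '{
"agent_id": 1,
"conversation_id": 1,
"prompt": "Summarize our leave policy."
}'
```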
Install Tracy Server
Generate an auth token, persist it to config, build the binary, and boot the Docker stack in one go.
docker compose down
docker compose build --no-cache tracy
docker compose up -d
docker logs tracy --follow
Pull your LLMs
Download DeepSeek-V3.1:671B and keep every weight on your own hardware.
curl http://localhost:9090/model/download \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"model": "deepseek-v3.1:671b"
}'
Create your collection
Spin up a Qdrant collection tuned to your embedding size. Ready for chunks in a single POST.
curl http://localhost:9090/collection \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"name": "documents",
"size": 768
}'
Ingest your files
Upload PDFs, spreadsheets, or HTML. Tracy parses, chunks, and embeds directly into your collection.
curl http://localhost:9090/collection/file \
--request POST \
--header 'Content-Type: multipart/form-data' \
--form 'collection=documents' \
--form 'model=nomic-embed-text' \
--form 'file=@/path/to/file.pdf'
Spin up your agents
Create conversational and extractor agents in a few JSON lines, with RAG for chat and schema-based extraction for structured outputs.
# Conversational agent
curl http://localhost:9090/agent/agent \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"name": "HR Assistant",
"role": "<your system prompt>",
"folder": "documents",
"model": "llama3.2",
"embedding_model": "nomic-embed-text"
}'
# Extractor agent
curl http://localhost:9090/agent/extractor \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"name": "Contract Extractor",
"role": "<your system prompt>",
"model": "llama3.2",
"format": "{ /* JSON schema */ }"
}'
Chat with your agent
Send prompts into a persistent conversation and stream answers token by token from your own server.
curl http://localhost:9090/chat \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"agent_id": 1,
"conversation_id": 1,
"prompt": "What leave days are planned this year?",
"limit": 5
}'
Extract structured data
Call extractor agents to turn raw text into typed JSON payloads that slot straight into your pipelines.
curl http://localhost:9090/extract \
--request POST \
--header 'Content-Type: application/json' \
--data '{
"agent_id": 1,
"text": "Contract signed on 01/01/2024...",
"nb": 3
}'
BUILT
PARANOID.
DEPLOYED
CONFIDENT.
Security in Tracy isn't a feature layer — it's the foundation every decision is built on. Local LLMs, on-premise embeddings, hardened binary. When your data never leaves your server, you don't need to trust us.
Data never leaves your infrastructure
Ollama runs your LLMs locally. Qdrant stores your embeddings on-premise. Document parsing (DOCX, PDF, ODT, Excel) happens on your server.
Bearer token authentication
Every API route is protected by Bearer token middleware.
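In practice that means every request carries an Authorization header. A sketch, reusing the collection call from the quick-start above; TRACY_TOKEN is a placeholder for the token you generated at install time:

```shell
# The same collection call as above, now authenticated.
# Requests without a valid Bearer token are rejected by the middleware.
curl http://localhost:9090/collection \
--request POST \
--header "Authorization: Bearer $TRACY_TOKEN" \
--header 'Content-Type: application/json' \
--data '{ "name": "documents", "size": 768 }'
```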
Air-gapped & fully self-contained
Single static binary, zero runtime dependencies, built for isolated networks and local AI deployments.
TEAMS WHO
TAKE IT SERIOUSLY.
"We replaced three SaaS AI subscriptions with Tracy in a weekend. Our IP stays ours."
Léa Fontaine
CTO, Meridian Labs
"Air-gapped deployment was exactly what our compliance team needed. Tracy just works in isolation."
Marcus Void
Lead Infra, Arctis Security
"The API design is exceptional. PoC to production in 4 days. Zero surprises."
Yuki Tanaka
AI Engineer, Fluxion
OWN YOUR
INTELLIGENCE.
One license. Deployed on your infrastructure. No usage fees, no per-seat billing, no data leaving your server. Pay once — run forever on your terms.
ONE PLAN.
NO SURPRISES.
2-week free trial · Billed monthly · Cancel anytime
- ⬡ Full Tracy AI access
- ⬡ 2-week free trial
- ⬡ Local LLM support (Ollama)
- ⬡ RAG + document ingestion
- ⬡ API access + Bearer auth
- ⬡ Updates & new features
- ⬡ Email support
Own it forever · No renewal
Need a perpetual license for your organization? Custom pricing is available for teams requiring a one-time purchase, on-premise deployment agreements, or volume deals.
Reach out and we'll put together a proposal within 48h.
All plans include self-hosted deployment · 2-week free trial on the monthly plan · No data leaves your server · Air-gapped compatible