The Plutobee technology stack.
The tools our engineers reach for, the reason we picked each one, and where we deliberately go boring instead of new. Updated every quarter as the landscape moves.
AI infrastructure.
Production agents and LLM features. Not science projects.
Models
Anthropic Claude (Opus 4.6, Sonnet 4.6, Haiku 4.5) as our default. OpenAI for specific routing. Open-weights (Llama, Mistral) when self-hosting is required for data residency or cost.
Orchestration
Anthropic's Claude Agent SDK and Model Context Protocol (MCP) for tool-using agents. Temporal for durable agentic workflows. LangGraph where the graph metaphor fits the problem.
Retrieval
Postgres with pgvector for small-to-mid corpora (under 10M docs). Pinecone or Qdrant for scale. Voyage embeddings as default. Hybrid BM25 + semantic with reranking via Cohere or local cross-encoders.
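One common way to combine a BM25 ranking with a semantic ranking before reranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: the choice of RRF, the function name, the constant k=60, and the document IDs are assumptions, not a description of a specific Plutobee pipeline.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    the scores are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword (BM25) search and a vector search.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

The fused list would then typically be truncated and passed to a reranker, which only needs to score a small candidate set.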
Evals
Promptfoo and Inspect AI for offline evaluation. Custom regression harnesses wired to CI. Judge-model evaluation with calibrated thresholds. Real-customer-trace replay for end-to-end runs.
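A regression harness wired to CI can be as small as a gate that compares judge-model scores between a baseline run and a candidate run. The following is a minimal sketch under stated assumptions: the `EvalResult` shape, the 0.0 to 1.0 score scale, and the 0.7 threshold are hypothetical, and in practice the threshold would come from calibrating the judge against human labels.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    case_id: str
    judge_score: float  # assumed 0.0-1.0 score assigned by a judge model


def regression_gate(baseline, candidate, pass_threshold=0.7, max_regressions=0):
    """Return (ok, regressed_ids).

    A case counts as regressed when it passed in the baseline run
    but falls below the threshold in the candidate run.
    """
    base = {r.case_id: r.judge_score for r in baseline}
    regressed = [
        r.case_id
        for r in candidate
        if base.get(r.case_id, 0.0) >= pass_threshold
        and r.judge_score < pass_threshold
    ]
    return len(regressed) <= max_regressions, regressed


baseline = [EvalResult("refund-policy", 0.9), EvalResult("tone-check", 0.8)]
candidate = [EvalResult("refund-policy", 0.95), EvalResult("tone-check", 0.5)]

ok, regressed = regression_gate(baseline, candidate)
print(ok, regressed)  # False ['tone-check']
```

A CI job could exit non-zero when `ok` is false, which blocks the merge until the regressed cases are reviewed.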
Observability
Langfuse for trace inspection. Helicone for cost. Datadog LLM observability for production alerting. Custom dashboards for hallucination rate, refusal rate, and p95 token cost.
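The dashboard metrics above reduce to simple aggregates over trace records. This sketch assumes a hypothetical trace shape (`refused`, `hallucinated`, and `cost_cents` fields, with `hallucinated` set by some upstream labeling step) and uses the nearest-rank definition of a percentile; none of it reflects a specific vendor's schema.

```python
def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(traces):
    """Aggregate per-request traces into dashboard metrics."""
    n = len(traces)
    return {
        "refusal_rate": sum(t["refused"] for t in traces) / n,
        "hallucination_rate": sum(t["hallucinated"] for t in traces) / n,
        "p95_cost_cents": p95([t["cost_cents"] for t in traces]),
    }


# Twenty hypothetical traces with costs of 1..20 cents.
traces = [
    {"refused": i in (3, 7), "hallucinated": i == 11, "cost_cents": i}
    for i in range(1, 21)
]
summary = summarize(traces)
print(summary["p95_cost_cents"])  # 19
```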
Safety
Structured output generation (JSON mode, instructor, outlines). Input/output content moderation. PII redaction pipelines. Prompt injection detection with guardrails.
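As a rough illustration of one stage in a PII redaction pipeline, the sketch below masks email addresses and US-style Social Security numbers with regular expressions. The patterns and placeholder tokens are assumptions chosen for brevity; a production pipeline would normally add more entity types and a dedicated detection model, since regexes alone miss a lot.

```python
import re

# Deliberately simple patterns; they are illustrative, not exhaustive.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text):
    """Replace matched PII with placeholder tokens before text reaches a model or a log."""
    text = EMAIL.sub("[EMAIL]", text)
    text = US_SSN.sub("[SSN]", text)
    return text


print(redact("Mail jane.doe@example.com, SSN 123-45-6789."))
# Mail [EMAIL], SSN [SSN].
```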
Web applications.
Performance-first. Type-safe end-to-end. Boring where boring works.
Frameworks
Next.js 15 with the App Router for most product work. Remix for data-heavy admin tools. SvelteKit when Next is overkill. Astro for marketing surfaces. Plain TypeScript + Vite for highly specific apps.
UI
Tailwind CSS v4 + shadcn/ui as foundation. Radix primitives for accessibility. Framer Motion for premium interactions. CSS Modules where Tailwind makes a mess. Design systems built on top.
State & data
TanStack Query for server state. Zustand for client state. tRPC where the same team owns the Node backend. urql or Apollo when GraphQL is already chosen. React Hook Form + Zod for forms.
Type safety
TypeScript strict mode. Zod schemas at network boundaries. Type-safe routing. Generated clients from OpenAPI. End-to-end types from database to UI.
Performance
React Server Components for shipping less JS. Partytown for third-party scripts. Code splitting and route-based chunking. Image pipelines via next/image or Cloudflare Images.
Testing
Vitest for unit. React Testing Library for component. Playwright for end-to-end. Storybook for visual regression. Chromatic in CI.
Server & APIs.
Long-lived services. Right tool for the failure mode.
Languages
Go for high-throughput, long-running services with hard latency budgets. TypeScript on Node for product API surface with type-shared frontends. Rust for performance-critical hotpaths and data tooling. Python for AI orchestration and data work. Elixir for chat, presence, and high-concurrency long-lived sockets.
API
REST + OpenAPI 3.1 default. gRPC for service-to-service. GraphQL when client teams need it and the schema is a real product. tRPC for same-team Node monorepos.
Databases
Postgres as default for everything OLTP. With JSONB, full-text, pgvector, RLS. SQLite + Turso for edge use cases. ClickHouse for analytics. DynamoDB for known-shape high-scale KV. MongoDB only when a customer mandates it.
Messaging
Kafka for event-sourced architectures and high-throughput streams. NATS for service-to-service messaging. SQS/SNS where AWS-native is fine. Redpanda where the Kafka API is required with a smaller ops surface.
Caching
Redis or KeyDB for application cache and sessions. Cloudflare Cache and Workers KV at the edge. Memcached for pure LRU. Application-level cache with bounded LRU for hot paths.
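A bounded in-process LRU cache for hot paths can be sketched with the standard library alone. The class name and the capacity of two entries below are illustrative assumptions; for memoizing pure functions, Python's built-in `functools.lru_cache` covers the same idea.

```python
from collections import OrderedDict


class BoundedLRU:
    """In-process cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._max:
            # Drop the oldest (least recently used) entry.
            self._data.popitem(last=False)


cache = BoundedLRU(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touch "a" so "b" becomes the eviction candidate
cache.put("c", 3)  # evicts "b"
print(cache.get("b"))  # None
```

The hard bound is the point: memory use stays predictable even when the key space is unbounded.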
Background work
Temporal for durable workflows with human-in-the-loop or LLM-in-the-loop steps. Sidekiq, RQ, or BullMQ for queue work in their respective ecosystems. Postgres-based queues (river, pgmq) when adding Redis would be the third moving part.
iOS, Android, cross-platform.
Native
Swift + SwiftUI on iOS. Kotlin + Jetpack Compose on Android. We default to native when the app needs deep platform integration, hardware features, or top-tier performance.
Cross-platform
React Native + Expo when the team is already TypeScript-fluent and platform features are off the shelf. Flutter when the design vocabulary is bespoke and pixel parity matters.
Distribution
Fastlane for both stores. CI-driven builds with signed artifacts. EAS for Expo. Internal beta via TestFlight and Firebase App Distribution. Phased rollout with crash gates.
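A crash gate for a phased rollout is essentially a decision function: advance to the next stage only while the crash-free rate stays above a floor. The stage percentages, the 99.5% floor, and the function name in this sketch are hypothetical; real values depend on the app and on what the store consoles expose.

```python
# Hypothetical rollout stages, as percentages of users.
STAGES = [1, 5, 20, 50, 100]


def next_rollout(current_pct, crash_free_rate, min_crash_free=0.995):
    """Return the next rollout percentage, or None to halt the rollout."""
    if crash_free_rate < min_crash_free:
        return None  # gate tripped: stop and investigate
    for stage in STAGES:
        if stage > current_pct:
            return stage
    return current_pct  # already fully rolled out


print(next_rollout(5, 0.999))  # 20
print(next_rollout(5, 0.980))  # None
```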
Where it actually runs.
Cloud providers
AWS as primary. GCP for BigQuery and Vertex AI when that ecosystem is already in place. Azure for Microsoft-shop customers. Cloudflare for edge, DNS, Workers, R2. Hetzner where European data residency demands it and cost matters.
Compute
EKS or GKE for service workloads. Fargate or Cloud Run for occasional or bursty workloads. Lambda for genuinely event-driven workloads. EC2/Compute Engine when control over the host matters. Workers for edge.
IaC
Terraform, with OpenTofu as a drop-in option. Modules versioned and tested with Terratest. CDK for fully-AWS shops that prefer code. Crossplane when the platform team wants self-service.
Networking
Cloudflare in front of everything. Private subnets default. Service mesh via Linkerd or Istio when scale demands it. mTLS internal. Zero-trust access via Cloudflare Access, Tailscale, or AWS Verified Access.
Secrets
AWS Secrets Manager or HashiCorp Vault. 1Password for human secrets. Doppler for environments where simpler is better. Never in env files committed to git, ever.
Cost
Spot instances where the workload tolerates restart. Right-sizing reports monthly. Savings Plans annually. FinOps reviews quarterly. We surface cost in PR review for anything new.
CI/CD, observability, SRE.
CI/CD
GitHub Actions or GitLab CI, depending on the customer's preference. ArgoCD for Kubernetes deploys. Flux for GitOps. Pipelines hardened with concurrency controls, cancel-in-progress, required reviewers.
Observability
Datadog as default APM. Honeycomb for high-cardinality event analysis. Grafana stack (Mimir, Loki, Tempo) for self-hosted. New Relic where the customer already standardized. Sentry for errors.
SRE
SLOs and SLIs written before launch. Error budgets enforced. Toil tracked and capped at 50% of on-call time. Postmortems blameless and published internally. Chaos engineering once SLOs are stable.
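The arithmetic behind an error budget is short enough to show directly. This sketch assumes an availability SLO measured in minutes over a 30-day window; the function names and the example figures are illustrative.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Minutes of allowed unavailability in the window for a given SLO target."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target, bad_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - bad_minutes) / budget


# A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
budget = error_budget_minutes(0.999)
# After 21.6 bad minutes, about half of the budget is left.
remaining = budget_remaining(0.999, bad_minutes=21.6)
print(round(budget, 1), round(remaining, 2))
```

Enforcing the budget then means agreeing in advance what happens when `remaining` reaches zero, for example pausing feature releases in favor of reliability work.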
Data engineering & analytics.
Warehouse
Snowflake or BigQuery for greenfield. Databricks when the lakehouse pattern fits. Redshift when AWS-native is required. ClickHouse self-hosted for real-time analytical workloads.
Pipelines
dbt for SQL transformation. Dagster or Prefect for orchestration. Airbyte for SaaS sources. Fivetran when the customer's ops team prefers it. Kafka Connect or Debezium for CDC.
BI
Metabase for product-team self-serve. Looker or Lightdash for governed metric layers. Hex and Mode for analyst-facing notebooks. Custom dashboards in Next.js where the audience needs polish.
Want a deep dive on a specific stack?
Our principal engineers can walk you through the tradeoffs for your context. No sales call disguised as architecture review.