Engineering · Stack

The Plutobee technology stack.

The tools our engineers reach for, the reason we picked each one, and where we deliberately go boring instead of new. Updated every quarter as the landscape moves.

AI & LLM

AI infrastructure.

Production agents and LLM features. Not science projects.

Models

Anthropic Claude (Opus 4.6, Sonnet 4.6, Haiku 4.5) as our default. OpenAI for specific routing. Open-weights (Llama, Mistral) when self-hosting is required for data residency or cost.

Orchestration

Anthropic Agent SDK and Model Context Protocol (MCP) for tool-using agents. Temporal for durable agentic workflows. LangGraph where the graph metaphor fits the problem.

Retrieval

Postgres with pgvector for small-to-mid corpora (under 10M docs). Pinecone or Qdrant for scale. Voyage embeddings as default. Hybrid BM25 + semantic with reranking via Cohere or local cross-encoders.
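One common way to combine a BM25 ranking with a semantic ranking before reranking is reciprocal rank fusion (RRF). The sketch below is an illustration under that assumption, not a description of a specific production pipeline; the document IDs and the `k=60` constant are hypothetical placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a BM25 query and a vector query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

The fused list would then be passed to a reranker, which rescores only the top candidates.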

Evals

Promptfoo and Inspect AI for offline evaluation. Custom regression harnesses wired to CI. Judge-model evaluation with calibrated thresholds. Real-customer-trace replay for end-to-end runs.
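As a rough sketch of how judge-model scores can gate a CI run, the example below fails a batch when too many cases fall under a score threshold. The `EvalResult` type, the case names, and both threshold values are hypothetical; in practice the thresholds would come from calibrating the judge against human labels.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    case_id: str
    judge_score: float  # assumed to be in [0, 1], produced by a judge model


def regression_gate(results, min_score=0.8, min_pass_rate=0.95):
    """Return (passed, failing_case_ids) for a batch of judged cases."""
    if not results:
        raise ValueError("no eval results to gate on")
    failing = [r.case_id for r in results if r.judge_score < min_score]
    pass_rate = 1 - len(failing) / len(results)
    return pass_rate >= min_pass_rate, failing


# Hypothetical judged cases from one offline run.
results = [
    EvalResult("refund-policy", 0.92),
    EvalResult("order-status", 0.88),
    EvalResult("edge-case-empty-cart", 0.61),
    EvalResult("greeting", 0.97),
]
passed, failing = regression_gate(results)
print(passed, failing)
```

A CI job could exit non-zero when `passed` is false and print `failing` so the regressed cases are visible in the build log.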

Observability

Langfuse for trace inspection. Helicone for cost. Datadog LLM observability for production alerting. Custom dashboards for hallucination rate, refusal rate, and p95 token cost.
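For illustration, the dashboard metrics named above can be derived from per-request trace records roughly as follows. The trace field names (`hallucinated`, `refused`, `cost_usd`) are hypothetical, and the nearest-rank percentile is one of several valid percentile definitions.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n) in sorted order."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(traces):
    """Compute hallucination rate, refusal rate, and p95 cost for a batch of traces."""
    if not traces:
        raise ValueError("traces must not be empty")
    total = len(traces)
    return {
        "hallucination_rate": sum(t["hallucinated"] for t in traces) / total,
        "refusal_rate": sum(t["refused"] for t in traces) / total,
        "p95_cost_usd": nearest_rank_percentile([t["cost_usd"] for t in traces], 95),
    }


# Hypothetical trace records; real ones would come from the tracing backend.
traces = [
    {"hallucinated": True, "refused": False, "cost_usd": 0.002},
    {"hallucinated": False, "refused": False, "cost_usd": 0.004},
    {"hallucinated": False, "refused": True, "cost_usd": 0.003},
    {"hallucinated": False, "refused": True, "cost_usd": 0.010},
]
summary = summarize(traces)
print(summary)
```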

Safety

Structured output generation (JSON mode, Instructor, Outlines). Input/output content moderation. PII redaction pipelines. Prompt injection detection with guardrails.
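A minimal sketch of one PII redaction step, assuming regex-based masking of email addresses and phone-like numbers before text reaches a model or a log. The two patterns are deliberately simple illustrations; a real pipeline would need broader coverage (names, addresses, national IDs) and would typically combine patterns with an entity-recognition model.

```python
import re

# Simplified patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text):
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


sample = "Contact jane.doe@example.com or call +1 (555) 010-9999 today."
print(redact_pii(sample))
```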

Frontend

Web applications.

Performance-first. Type-safe end-to-end. Boring where boring works.

Frameworks

Next.js 15 with the App Router for most product work. Remix for data-heavy admin tools. SvelteKit when Next is overkill. Astro for marketing surfaces. Plain TypeScript + Vite for highly specific apps.

UI

Tailwind CSS v4 + shadcn/ui as foundation. Radix primitives for accessibility. Framer Motion for premium interactions. CSS Modules where Tailwind makes a mess. Design systems built on top.

State & data

TanStack Query for server state. Zustand for client state. tRPC where the backend is a same-team Node service. urql or Apollo when GraphQL is already chosen. React Hook Form + Zod for forms.

Type safety

TypeScript strict mode. Zod schemas at network boundaries. Type-safe routing. Generated clients from OpenAPI. End-to-end types from database to UI.

Performance

React Server Components for shipping less JS. Partytown for third-party scripts. Code splitting and route-based chunking. Image pipelines via Next/Image or Cloudflare Images.

Testing

Vitest for unit. React Testing Library for component. Playwright for end-to-end. Storybook for visual regression. Chromatic in CI.

Backend

Server & APIs.

Long-lived services. Right tool for the failure mode.

Languages

Go for high-throughput, long-running services with hard latency budgets. TypeScript on Node for product API surface with type-shared frontends. Rust for performance-critical hotpaths and data tooling. Python for AI orchestration and data work. Elixir for chat, presence, and high-concurrency long-lived sockets.

API

REST + OpenAPI 3.1 default. gRPC for service-to-service. GraphQL when client teams need it and the schema is a real product. tRPC for same-team Node monorepos.

Databases

Postgres as default for everything OLTP, with JSONB, full-text search, pgvector, and RLS. SQLite + Turso for edge use cases. ClickHouse for analytics. DynamoDB for known-shape high-scale KV. MongoDB only when a customer mandates it.

Messaging

Kafka for event-sourced architectures and high-throughput streams. NATS for service mesh messaging. SQS/SNS where AWS-native is fine. Redpanda where the Kafka API is required but with a smaller ops surface.

Caching

Redis or KeyDB for application cache and sessions. Cloudflare Cache and Workers KV at the edge. Memcached for pure LRU. Application-level cache with bounded LRU for hot paths.
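A bounded in-process LRU cache for hot paths can be sketched as below. The class name and the `max_entries` parameter are hypothetical; the point is that the cache evicts the least recently used entry once it reaches a fixed size, so memory use stays capped. This version is not thread-safe.

```python
from collections import OrderedDict


class BoundedLRUCache:
    """In-memory cache that holds at most max_entries items."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used entry

    def __len__(self):
        return len(self._data)


cache = BoundedLRUCache(max_entries=2)
cache.put("user:1", {"name": "Ada"})
cache.put("user:2", {"name": "Lin"})
cache.get("user:1")                   # touch user:1 so user:2 becomes the oldest
cache.put("user:3", {"name": "Sam"})  # evicts user:2
```

For caching the results of a pure function, the standard library's `functools.lru_cache(maxsize=...)` gives the same bounded behavior with less code.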

Background work

Temporal for durable workflows with human-in-the-loop or LLM-in-the-loop steps. Sidekiq, RQ, or BullMQ for queue work in their respective ecosystems. Postgres-based queues (river, pgmq) when adding Redis would be the third moving part.

Mobile

iOS, Android, cross-platform.

Native

Swift + SwiftUI on iOS. Kotlin + Jetpack Compose on Android. We default to native when the app needs deep platform integration, hardware features, or top-tier App Store performance.

Cross-platform

React Native + Expo when the team is already TypeScript-fluent and platform features are off the shelf. Flutter when the design vocabulary is bespoke and pixel parity matters.

Distribution

Fastlane for both stores. CI-driven builds with signed artifacts. EAS for Expo. Internal beta via TestFlight and Firebase App Distribution. Phased rollout with crash gates.
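To make "phased rollout with crash gates" concrete, here is a small sketch of the decision logic: the rollout advances to the next stage only while the crash-free session rate stays above a threshold. The stage percentages and the 99.5% threshold are hypothetical values chosen for illustration.

```python
ROLLOUT_STAGES = [1, 5, 20, 50, 100]  # hypothetical percent-of-users stages


def next_rollout_percent(current_percent, crash_free_rate, min_crash_free=0.995):
    """Return the next rollout percentage, or hold at the current one.

    crash_free_rate is the fraction of sessions without a crash (0 to 1)
    observed at the current stage.
    """
    if current_percent not in ROLLOUT_STAGES:
        raise ValueError(f"unknown stage: {current_percent}")
    if crash_free_rate < min_crash_free:
        return current_percent  # gate closed: hold the rollout and investigate
    index = ROLLOUT_STAGES.index(current_percent)
    return ROLLOUT_STAGES[min(index + 1, len(ROLLOUT_STAGES) - 1)]


print(next_rollout_percent(5, crash_free_rate=0.999))  # healthy: advance
print(next_rollout_percent(5, crash_free_rate=0.990))  # unhealthy: hold
```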

Cloud & Platform

Where it actually runs.

Cloud providers

AWS as primary. GCP for BigQuery and Vertex AI when that ecosystem is already in place. Azure for Microsoft-shop customers. Cloudflare for edge, DNS, Workers, R2. Hetzner where European data residency demands it and cost matters.

Compute

EKS or GKE for service workloads. Fargate or Cloud Run for occasional or bursty workloads. Lambda for genuinely event-driven workloads. EC2/Compute Engine when control over the host matters. Workers for edge.

IaC

Terraform, with OpenTofu as a drop-in option. Modules versioned and tested with Terratest. CDK for fully-AWS shops that prefer code. Crossplane when the platform team wants self-service.

Networking

Cloudflare in front of everything. Private subnets default. Service mesh via Linkerd or Istio when scale demands it. mTLS internal. Zero-trust access via Cloudflare Access, Tailscale, or AWS Verified Access.

Secrets

AWS Secrets Manager or HashiCorp Vault. 1Password for human secrets. Doppler for environments where simpler is better. Never in env files committed to git, ever.

Cost

Spot instances where the workload tolerates restart. Right-sizing reports monthly. Savings Plans annually. FinOps reviews quarterly. We surface cost in PR review for anything new.

DevOps

CI/CD, observability, SRE.

CI/CD

GitHub Actions or GitLab CI, depending on the customer's preference. ArgoCD for Kubernetes deploys. Flux for GitOps. Pipelines hardened with concurrency controls, cancel-in-progress, required reviewers.

Observability

Datadog as default APM. Honeycomb for high-cardinality event analysis. Grafana stack (Mimir, Loki, Tempo) for self-hosted. New Relic where the customer already standardized. Sentry for errors.

SRE

SLO/SLI written before launch. Error budgets enforced. Toil tracked and capped at 50% of on-call time. Postmortems blameless and published internally. Chaos engineering once SLOs are stable.
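The error-budget arithmetic is simple enough to show directly. For an availability SLO, the budget is the share of the window that is allowed to fail: a 99.9% target over 30 days allows about 43.2 minutes of downtime. The function names and the 30-day default window below are assumptions for illustration.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Allowed downtime in minutes for an availability SLO over a window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    window_minutes = window_days * 24 * 60
    return (1 - slo_target) * window_minutes


def budget_remaining(slo_target, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1 - downtime_minutes / budget


budget = error_budget_minutes(0.999)       # about 43.2 minutes
remaining = budget_remaining(0.999, 21.6)  # about half the budget left
print(round(budget, 1), round(remaining, 2))
```

Enforcing the budget then becomes a policy question, for example pausing feature releases once `remaining` drops to zero or below.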

Data

Data engineering & analytics.

Warehouse

Snowflake or BigQuery for greenfield. Databricks when the lakehouse pattern fits. Redshift when AWS-native is required. ClickHouse self-hosted for real-time analytical workloads.

Pipelines

dbt for SQL transformation. Dagster or Prefect for orchestration. Airbyte for SaaS sources. Fivetran when the customer's ops team prefers it. Kafka Connect or Debezium for CDC.

BI

Metabase for product-team self-serve. Looker or Lightdash for governed metric layers. Hex and Mode for analyst-facing notebooks. Custom dashboards in Next.js where the audience needs polish.

Want a deep dive on a specific stack?

Our principal engineers can walk you through the tradeoffs for your context. No sales call disguised as architecture review.

Schedule a tech review →