What we've built

Detailed case studies from recent engagements. More projects added as they ship.

AI Data Pipeline

Ad performance evaluation pipeline

An eval-first pipeline that gives a brand owner an independent read on their Meta ad performance. Pulls data from Meta and Shopify each night, evaluates every active ad through Claude Sonnet and delivers recommendations with supporting reasoning via Discord. No agency interpretation required.

The problem

The brand had been running paid ads through an external agency and couldn't get clear answers about whether they were working. Reports arrived in the agency's format. Performance questions were deflected. Decisions about scaling, killing and refreshing individual ads were being made on incomplete information. The owner needed an independent, automated way to evaluate ad performance against his actual Shopify sales data, not the agency's interpretation of it.

What we built

A nightly pipeline that pulls ad performance data from the Meta Ads API and order data from Shopify, aligns both in a common format and stores the combined dataset in Postgres. That dataset is passed to Claude Sonnet, which produces a recommendation for each active ad: scale, kill, refresh or monitor. Each recommendation includes a confidence rating and supporting reasoning that cites specific metrics, plus flags for unusual data or tracking discrepancies between Meta and Shopify.
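As a rough illustration of the alignment step, the sketch below joins one day of Meta rows with Shopify rows and measures how far the two sources disagree. It is a minimal sketch under assumptions, not the production code: the field names, the join on ad ID and date, and the 20% tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdDay:
    """One ad's combined metrics for one day (field names are hypothetical)."""
    ad_id: str
    date: str
    spend: float
    meta_purchases: int    # purchases as reported by Meta
    shopify_orders: int    # orders attributed to the ad in Shopify
    shopify_revenue: float


def align(meta_rows: list[dict], shopify_rows: list[dict]) -> list[AdDay]:
    """Join the two sources on (ad_id, date); ads with no orders get zeros."""
    orders = {(r["ad_id"], r["date"]): r for r in shopify_rows}
    combined = []
    for m in meta_rows:
        s = orders.get((m["ad_id"], m["date"]), {})
        combined.append(AdDay(
            ad_id=m["ad_id"],
            date=m["date"],
            spend=float(m["spend"]),
            meta_purchases=int(m["purchases"]),
            shopify_orders=int(s.get("orders", 0)),
            shopify_revenue=float(s.get("revenue", 0.0)),
        ))
    return combined


def tracking_gap(row: AdDay, tolerance: float = 0.2) -> Optional[float]:
    """Return the relative gap between Meta and Shopify counts if it exceeds
    the (assumed) tolerance, otherwise None."""
    larger = max(row.meta_purchases, row.shopify_orders)
    if larger == 0:
        return None
    gap = abs(row.meta_purchases - row.shopify_orders) / larger
    return gap if gap > tolerance else None
```

Rows shaped like this could be written to Postgres as they are, with any gap passed to the model as context for its tracking flags.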

Recommendations are summarised in a daily report delivered to the client via Discord. Beyond the daily report, the client has access to a Discord bot for follow-up questions about any recommendation: why an ad was flagged for refresh rather than kill, why a confidence rating came back medium, what a tracking issue flag means in practice. The bot only uses the data and reasoning the evaluated prompt produced, so it cannot generate answers that aren't backed by the analysis. If it can't answer with confidence, it says so.

The eval layer

No version of the prompt went into the live pipeline without first being scored against a set of test cases covering eight scenarios, among them clear winners, clear losers, creative fatigue, tracking issues, low-data cases and edge cases. Two graders ran in combination: a code grader checking output structure and field validity, and a model grader assessing reasoning quality, metric citation, confidence accuracy and whether tracking issues were flagged correctly. Each prompt version was measured against the previous one. Only a version that beat a defined score threshold made it into the live system. The same eval-first approach was applied to the Q&A bot prompt.
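The code grader and the promotion gate described above could be sketched as follows. This is an illustrative sketch only. The four actions come from the pipeline description; the confidence labels other than "medium", the output field names, the rule that a structurally invalid case scores zero, and the requirement to also beat the previous version's score are assumptions.

```python
ALLOWED_ACTIONS = {"scale", "kill", "refresh", "monitor"}
ALLOWED_CONFIDENCE = {"low", "medium", "high"}  # assumed label set


def code_grade(output: dict) -> list[str]:
    """Return a list of structural problems; an empty list means a pass."""
    problems = []
    if output.get("action") not in ALLOWED_ACTIONS:
        problems.append("action must be scale, kill, refresh or monitor")
    if output.get("confidence") not in ALLOWED_CONFIDENCE:
        problems.append("confidence must be low, medium or high")
    reasoning = output.get("reasoning")
    if not isinstance(reasoning, str) or not reasoning.strip():
        problems.append("reasoning must be a non-empty string")
    if not isinstance(output.get("flags", []), list):
        problems.append("flags must be a list")
    return problems


def suite_score(outputs: list[dict], model_scores: list[float]) -> float:
    """Average the model grader's per-case scores (0 to 1), counting any
    case that fails the code grader as 0."""
    if not outputs or len(outputs) != len(model_scores):
        raise ValueError("need one model score per non-empty output list")
    per_case = [0.0 if code_grade(o) else s
                for o, s in zip(outputs, model_scores)]
    return sum(per_case) / len(per_case)


def should_promote(candidate: float, previous: float, threshold: float) -> bool:
    """Promote only a version that beats the threshold and the prior version."""
    return candidate > threshold and candidate > previous
```

Running the structural check first keeps the model grader's scores from masking malformed output: a well-argued recommendation with an invalid action still fails the case.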

Outcome

The brand owner now has an independent basis for evaluating his agency's reporting. The pipeline catches under-performing ads earlier than weekly agency reviews. It surfaces tracking discrepancies between Meta and Shopify before they lead to bad scaling decisions. Every recommendation comes with a reasoning trail the client can interrogate directly.

Client

Ecommerce store (US)

Industry

DTC ecommerce, performance marketing

Stack

Claude Sonnet, Claude Haiku, n8n, PostgreSQL, Python, Meta Ads API, Shopify API, Discord API, Hostinger VPS

Capabilities

Prompt evals, Ad scoring, Dual graders, Q&A bot, Daily digest, Tracking reconciliation

AI Data Pipeline

Trade finance document processing platform

An end-to-end AI pipeline that processes trade finance documents, extracts structured data, scores risk and provides a natural language interface for querying the full document set.

The problem

A Singapore-based trade finance firm was reviewing bills of lading, commercial invoices and certificates of origin manually. Staff were cross-referencing documents by eye, checking commodity pricing against market rates in spreadsheets and building risk assessments in Word documents. The process was slow, inconsistent and impossible to audit retroactively.

What we built

A complete document processing pipeline powered by AI. Documents are uploaded through a web interface and routed through an automated workflow that calls the Claude API for entity extraction. The system pulls structured fields from each document type - shipper details, port pairs, commodity descriptions, weights, values and dates.

Extracted data is stored in PostgreSQL and run through a deterministic risk scoring engine. The engine checks for pricing anomalies against a commodity lookup table covering 13 commodity categories, detects potential duplicate financing across the document set and flags routing or date inconsistencies. Every score is logged with a full audit trail.
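The three checks can be illustrated with a small rules function. The production engine is written in JavaScript; the sketch below uses Python and is not the firm's actual rule set. The reference prices are made-up placeholders, and the field names, flag weights, 15% tolerance and duplicate check on bill of lading number are all assumptions.

```python
from dataclasses import dataclass
from datetime import date

# Placeholder reference prices per tonne (invented values for illustration).
REFERENCE_PRICE = {"palm_oil": 900.0, "coffee": 4000.0}

# Hypothetical fixed weights: the same flags always give the same score.
WEIGHTS = {
    "pricing_anomaly": 40,
    "possible_duplicate_financing": 40,
    "date_inconsistency": 20,
    "unknown_commodity": 10,
}


@dataclass(frozen=True)
class Shipment:
    doc_id: str
    bl_number: str       # bill of lading number
    commodity: str
    tonnes: float
    invoice_value: float
    shipped_on: date
    arrived_on: date


def score_shipment(shipment: Shipment, seen_bl_numbers: set[str],
                   tolerance: float = 0.15) -> dict:
    """Apply fixed rules and return a score (0 to 100) with the flags raised."""
    flags = []
    reference = REFERENCE_PRICE.get(shipment.commodity)
    if reference is None:
        flags.append("unknown_commodity")
    elif shipment.tonnes > 0:
        unit_price = shipment.invoice_value / shipment.tonnes
        if abs(unit_price - reference) / reference > tolerance:
            flags.append("pricing_anomaly")
    if shipment.bl_number in seen_bl_numbers:
        flags.append("possible_duplicate_financing")
    if shipment.arrived_on < shipment.shipped_on:
        flags.append("date_inconsistency")
    total = min(100, sum(WEIGHTS[f] for f in flags))
    return {"doc_id": shipment.doc_id, "score": total, "flags": flags}
```

Because the function has no randomness and no model call, re-running it on the same extracted fields always returns the same score and flags, which is what makes the audit trail meaningful.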

The frontend is a React application with a two-panel layout: document details and scores on the left, a persistent AI chat interface on the right. Users can ask questions like "show me all palm oil shipments from Indonesia above market rate" and get answers drawn from the structured data.

Key decisions

Risk scoring was deliberately built as a deterministic JavaScript engine rather than an AI-generated assessment. This means scores are reproducible, auditable and free from the variability that comes with LLM-generated evaluations. The AI handles what it's good at (entity extraction from unstructured text) and the rules engine handles what needs to be consistent.

Outcome

The platform processes in seconds documents that previously took staff 30-45 minutes each. Risk flags that were previously caught only by experienced reviewers are now surfaced automatically. The natural language interface lets junior staff query the document library without needing to understand the underlying data structure.

Client

Trade finance firm (Singapore)

Industry

Trade finance, commodity trading

Stack

Claude API, Workflow Orchestration, PostgreSQL, React, JavaScript, Docker, Traefik, Nginx

Capabilities

Entity extraction, Risk scoring, Duplicate detection, NL querying, Audit logging

AI-Enhanced Ecommerce

DTC health supplement platform rebuild

A ground-up Shopify rebuild for a US-based probiotic kombucha brand, covering custom storefront development, subscription architecture, international DDP shipping and an ongoing SEO campaign with AI-driven reporting.

The problem

The brand had outgrown its original Shopify setup. The cart was broken (a third-party app that had stopped working), subscriptions weren't converting, international orders were generating surprise customs fees and refunds, and the site had no SEO foundation. The store needed a complete rebuild - not a patch job.

What we built

A custom Shopify theme built from scratch in Liquid. The product page features a subscribe-and-save toggle, Mix & Match flavour selection for multi-packs and a custom cart drawer with a progress bar and milestone rewards. Subscription and bundle logic is handled through Appstle, with inventory automation via Shopify Flow.

International shipping was solved with a DDP (Delivered Duty Paid) configuration using EasyPost and DHL eCommerce, eliminating surprise customs charges for international customers. The setup produces standard USPS labels, meaning zero workflow change for the existing 3PL.

SEO was built into the site architecture from the start: schema markup, optimised heading structures, FAQ pages with structured data, a blog with proper content hierarchy and a disavow file for existing spam backlinks. Google Search Console was configured and a sitemap submitted as part of the go-live checklist.

Ongoing engagement

The engagement continues with a monthly SEO campaign and AI-driven performance reporting. Reports cover search console data, keyword visibility, sales trends and actionable recommendations - delivered as interactive HTML dashboards in the brand's own visual identity.

Client

BoochBod (US health supplement brand)

Industry

DTC health and wellness

Stack

Shopify, Liquid, Appstle, Shopify Flow, EasyPost, DHL eCommerce, Google Search Console

Capabilities

Custom theme, Subscriptions, DDP shipping, Technical SEO, AI reporting

Visit BoochBod →

Have a project in mind?

We're always interested in hearing about new problems to solve.

Get in Touch