Building production AI systems while the rest of the world is still figuring out the prompt. Agent orchestration, vision pipelines, LLM infrastructure. First principles. Smallest solution that earns its place.
What I build and how I think about it

Production is the only proof that matters. Everything else is a pitch deck.
Multi-agent systems with sub-agent coordination, semantic memory, and scheduled autonomy.
Multi-stage vision pipelines combining GPT-4V, Google Cloud Vision, and Claude for domain-specific analysis.
Smart model routing across providers, budget management, and complexity-based task classification.
Docker, AWS, CI/CD — systems that run 24/7 without babysitting.
React, TypeScript, Python, FastAPI, PostgreSQL — whatever the problem demands.
Systems I've built to understand where this is going
CompanyCam needed automated analysis of construction site photos — detecting materials, safety issues, and project progress across millions of images.
Single-model approaches failed on domain-specific construction imagery. Accuracy varied wildly across photo types, and costs scaled linearly with volume.
Built a multi-stage vision pipeline: GPT-4V for high-level scene understanding, Google Cloud Vision for object detection, and Claude for structured reasoning over combined outputs. Each stage feeds the next with confidence scoring.
Production system processing thousands of photos daily with 90%+ accuracy on domain-specific classification tasks. Cost per image dropped 60% compared with the single-model baseline.
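To make the "each stage feeds the next with confidence scoring" idea concrete, here is a minimal sketch of a staged pipeline. It is not the production code: the three stage functions are hypothetical stubs standing in for the GPT-4V, Google Cloud Vision, and Claude calls, and the names `StageResult`, `PipelineState`, `run_pipeline`, and the `min_confidence` threshold are assumptions made for illustration. Taking the weakest stage's score as the overall confidence is also an assumption, not a description of the real scoring.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StageResult:
    name: str
    output: dict
    confidence: float  # 0.0 to 1.0, as reported by the stage


@dataclass
class PipelineState:
    image_id: str
    results: list[StageResult] = field(default_factory=list)

    @property
    def confidence(self) -> float:
        # Assumption: the pipeline is only as trustworthy as its weakest stage.
        return min((r.confidence for r in self.results), default=0.0)


Stage = Callable[[PipelineState], StageResult]


def scene_stage(state: PipelineState) -> StageResult:
    # Hypothetical stub for the high-level scene understanding call.
    return StageResult("scene", {"scene": "roof framing"}, 0.92)


def detection_stage(state: PipelineState) -> StageResult:
    # Hypothetical stub for object detection; reads the previous stage's output.
    scene = state.results[-1].output["scene"]
    return StageResult("objects", {"scene": scene, "objects": ["ladder", "plywood"]}, 0.85)


def reasoning_stage(state: PipelineState) -> StageResult:
    # Hypothetical stub for structured reasoning over the combined outputs.
    objects = state.results[-1].output["objects"]
    issues = ["unsecured ladder"] if "ladder" in objects else []
    return StageResult("reasoning", {"safety_issues": issues}, 0.88)


def run_pipeline(image_id: str, stages: list[Stage], min_confidence: float = 0.5) -> PipelineState:
    state = PipelineState(image_id)
    for stage in stages:
        result = stage(state)
        state.results.append(result)
        if result.confidence < min_confidence:
            break  # stop early; a caller could route the image to human review
    return state


state = run_pipeline("img-1", [scene_stage, detection_stage, reasoning_stage])
print(state.confidence, state.results[-1].output)
```

Stopping early when a stage is unsure is one way to avoid paying for later model calls on images that will need review anyway; whether that matches the real cost savings is not something this sketch claims.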
Personal AI assistant with sub-agent orchestration, semantic memory, habit coaching, and scheduled autonomy — running 24/7 on AWS.
Building an agent that operates independently on a schedule, maintains long-term memory, and coordinates multiple specialized sub-agents without human intervention.
Designed a heartbeat-driven architecture with specialized sub-agents for research, memory review, and habit coaching. Semantic memory uses SQLite with embeddings for vector search. Conversations are stored as append-only JSONL for crash safety.
Fully autonomous agent running in production. Conducts independent research, reviews and consolidates memories, and coaches habits — all without prompting.
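The two storage ideas mentioned above, embeddings in SQLite and an append-only JSONL conversation log, can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the agent's actual code: `toy_embed` is a hypothetical stand-in for a real embedding model, the `memories` table layout and the `MemoryStore` and `append_turn` names are invented here, and the search is a brute-force cosine scan rather than an indexed vector search.

```python
import json
import math
import os
import sqlite3


def toy_embed(text: str, dims: int = 8) -> list[float]:
    # Hypothetical stand-in for a real embedding model: hashed character counts.
    vec = [0.0] * dims
    for ch in text.lower():
        vec[ord(ch) % dims] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MemoryStore:
    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL, embedding TEXT NOT NULL)"
        )

    def add(self, text: str) -> None:
        # Embeddings are stored as JSON text; a real system might use a BLOB.
        self.db.execute(
            "INSERT INTO memories (text, embedding) VALUES (?, ?)",
            (text, json.dumps(toy_embed(text))),
        )
        self.db.commit()

    def search(self, query: str, k: int = 3) -> list[str]:
        # Brute-force scan: score every stored memory against the query.
        q = toy_embed(query)
        rows = self.db.execute("SELECT text, embedding FROM memories").fetchall()
        scored = sorted(((cosine(q, json.loads(e)), t) for t, e in rows), reverse=True)
        return [t for _, t in scored[:k]]


def append_turn(path: str, role: str, content: str) -> None:
    # One JSON object per line, opened in append mode, so a crash can at
    # worst leave a truncated final line and never rewrites earlier turns.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"role": role, "content": content}) + "\n")
        f.flush()
        os.fsync(f.fileno())
```

A brute-force scan is fine for a personal assistant's memory at small scale; at larger scale an approximate index would likely be needed, which this sketch does not attempt.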
Original research into whether aspiration-based system prompts change LLM decision-making in autonomous agents, tested across multiple model families.
Standard system prompts tell agents what to do. But autonomous agents face ambiguous decisions where instructions run out. Needed a way to shape judgment, not just behavior.
Developed aspiration-based intent prompts — short identity statements that frame the agent's goals as intrinsic motivation. Tested across Claude, GPT, and Gemini using psychometric methodology (Big Five, Dark Triad). Added a severity clause to prevent over-identification with persona traits.
Measurable personality shifts across all three model families. Intent prompts produced more consistent autonomous behavior than instruction-based prompts. Published as ongoing research, validated in production with Alvis.
Thoughts on AI systems, agent architecture, and production engineering
A controlled experiment testing whether adding purpose to LLM system prompts changes decision-making behavior. Tested across three model families, validated with psychometric methodology, and confirmed in production.
Learn how I built Vectus AI, an advanced medical scheduling assistant powered by OpenAI GPT-4. This technical deep dive explores the architecture, implementation, and challenges of creating an AI-driven conversational agent for streamlining medical appointments.
A technical exploration of building an AI-powered bot to detect and explain logical fallacies in social media posts using Python, multiple LLMs, and web automation.
Start a conversation about your AI project
I'm looking for people who see the inflection point and want to build at it — not talk about it. If you have an AI problem worth solving, let's talk.
Book a Call