How It Works
This page takes a technical deep dive into Julia's architecture. Understanding how Julia works under the hood can help you get the most out of her capabilities and appreciate the technology powering your conversations.
High-Level Architecture
Julia is built as a multi-channel AI assistant with a unified backend that processes messages from WhatsApp and Telegram, orchestrates AI reasoning, and executes actions through integrated tools.
Channels: WhatsApp, Telegram
Privacy Layer: Pseudonymization
Orchestrator: NestJS Backend
Tools: Calendar, Email, etc.
Message Processing Flow
Every message you send passes through a nine-step pipeline that combines context retrieval, AI reasoning, and tool execution:
Message Reception
Channel Dispatcher Service: Your message arrives via a WhatsApp or Telegram webhook. The channel adapter normalizes the message format and identifies your user account through the linked channel identity.
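As a rough illustration of the normalization step, the sketch below maps two channel payloads onto one internal shape and looks up the linked user. The payload types, `normalizeMessage`, `resolveUserId`, and the `linkedIdentities` map are all hypothetical and heavily simplified; real WhatsApp and Telegram webhook payloads are more deeply nested.

```typescript
// Hypothetical, simplified payload shapes (assumptions, not the real webhook schemas).
type TelegramPayload = { channel: "telegram"; chatId: number; text: string; date: number };
type WhatsAppPayload = { channel: "whatsapp"; from: string; body: string; timestamp: string };
type IncomingPayload = TelegramPayload | WhatsAppPayload;

// Channel-independent message shape used by the rest of the pipeline.
interface NormalizedMessage {
  channel: "telegram" | "whatsapp";
  channelUserId: string;
  text: string;
  receivedAt: Date;
}

function normalizeMessage(payload: IncomingPayload): NormalizedMessage {
  if (payload.channel === "telegram") {
    return {
      channel: "telegram",
      channelUserId: String(payload.chatId),
      text: payload.text.trim(),
      receivedAt: new Date(payload.date * 1000), // seconds -> milliseconds
    };
  }
  return {
    channel: "whatsapp",
    channelUserId: payload.from,
    text: payload.body.trim(),
    receivedAt: new Date(Number(payload.timestamp) * 1000),
  };
}

// Stand-in for the linked channel identity table: "<channel>:<id>" -> internal user id.
const linkedIdentities = new Map<string, string>([["telegram:42", "user-1"]]);

function resolveUserId(message: NormalizedMessage): string | undefined {
  return linkedIdentities.get(`${message.channel}:${message.channelUserId}`);
}
```

Keeping one normalized shape means every later step can ignore which channel a message came from until delivery.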
User Context Loading
Context Builder Service: The system loads your profile, recent messages, active tasks, and contacts. This context helps the AI understand who you are and what you've been working on.
Quota Check
Subscription Service + Redis: Your subscription tier is checked to ensure you haven't exceeded your daily or monthly message limits. Redis tracks usage in real time.
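The quota check can be pictured as two counters per user, one per day and one per month. The sketch below is only an approximation: the tier names and limits are invented for illustration, and an in-memory `Map` stands in for Redis (where the counters would more likely be incremented with `INCR` and given an expiry).

```typescript
interface TierLimits {
  daily: number;
  monthly: number;
}

// Hypothetical tiers and limits; the real values are not stated on this page.
const LIMITS: Record<string, TierLimits> = {
  free: { daily: 2, monthly: 5 },
  pro: { daily: 100, monthly: 2000 },
};

// In-memory stand-in for Redis usage counters.
const counters = new Map<string, number>();

// Returns true and records one message if the user is under both limits.
function checkAndConsumeQuota(userId: string, tier: string, now: Date): boolean {
  const limits = LIMITS[tier];
  if (!limits) return false; // unknown tier: reject

  const iso = now.toISOString();
  const dayKey = `usage:${userId}:${iso.slice(0, 10)}`; // YYYY-MM-DD (UTC)
  const monthKey = `usage:${userId}:${iso.slice(0, 7)}`; // YYYY-MM (UTC)

  const usedToday = counters.get(dayKey) ?? 0;
  const usedThisMonth = counters.get(monthKey) ?? 0;
  if (usedToday >= limits.daily || usedThisMonth >= limits.monthly) return false;

  counters.set(dayKey, usedToday + 1);
  counters.set(monthKey, usedThisMonth + 1);
  return true;
}
```

Because the date is part of the key, counters reset naturally when a new day or month begins.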
Privacy Layer (Pseudonymization)
Privacy Gateway Service: Before reaching the AI, all personal data is pseudonymized. Contact names become PERSON_A, emails become EMAIL_A, etc. This ensures the LLM never sees your real personal information.
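A minimal sketch of the pseudonymization idea follows, assuming the sensitive values are already known (for example from your contacts). The `Pseudonymizer` class and its methods are hypothetical names for illustration; how the real Privacy Gateway detects personal data is not described here.

```typescript
type EntityKind = "PERSON" | "EMAIL";

class Pseudonymizer {
  private toPseudonym = new Map<string, string>();
  private toReal = new Map<string, string>();
  private counts: Record<EntityKind, number> = { PERSON: 0, EMAIL: 0 };

  // Register a real value and return its stable placeholder (PERSON_A, PERSON_B, ...).
  register(kind: EntityKind, real: string): string {
    const existing = this.toPseudonym.get(real);
    if (existing) return existing;
    const index = this.counts[kind];
    if (index >= 26) throw new Error("This sketch supports at most 26 values per kind");
    const pseudonym = `${kind}_${String.fromCharCode(65 + index)}`;
    this.counts[kind] = index + 1;
    this.toPseudonym.set(real, pseudonym);
    this.toReal.set(pseudonym, real);
    return pseudonym;
  }

  // Replace every registered real value with its placeholder.
  pseudonymize(text: string): string {
    // Longest values first, so "Sarah Chen" is replaced before a shorter overlapping value.
    const reals = Array.from(this.toPseudonym.keys()).sort((a, b) => b.length - a.length);
    let out = text;
    for (const real of reals) {
      out = out.split(real).join(this.toPseudonym.get(real)!);
    }
    return out;
  }

  // Convert placeholders back to real values; unknown placeholders are left untouched.
  restore(text: string): string {
    return text.replace(/\b(?:PERSON|EMAIL)_[A-Z]\b/g, (match) => this.toReal.get(match) ?? match);
  }
}
```

The same mapping serves both directions, which is what allows tool calls and the final reply to be converted back to real values later in the pipeline.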
LLM Processing
Unified LLM Service: Your pseudonymized message, along with context and available tools, is sent to the LLM (GPT-4o or GPT-4o-mini depending on tier). The LLM decides what actions to take.
Tool Execution
Tool Executor Service: If the LLM decides to use tools (check calendar, send email, etc.), pseudonyms are converted back to real values, tools are executed, then results are re-pseudonymized.
Response Generation
Orchestrator Service: The LLM generates a natural language response based on tool results and context. Memory writes are processed to store important information.
De-pseudonymization
Privacy Gateway Service: Before sending the response to you, pseudonyms are converted back to real names. 'I scheduled with PERSON_A' becomes 'I scheduled with Sarah Chen'.
Message Delivery
Channel Service: The response is sent back through the appropriate channel (WhatsApp or Telegram API) and the conversation is logged for future context.
AI & Function Calling
Julia uses OpenAI's function calling (tool use) capability to interact with external services. This allows the AI to take real actions rather than just generating text.
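To make function calling more concrete, here is a sketch of what one tool definition could look like in the `tools` array format used by OpenAI's Chat Completions API, plus a helper that parses the arguments the model returns as a JSON string. The parameter schema for `task` (`action`, `title`) and the `parseToolArguments` helper are assumptions made for illustration; Julia's actual schemas are not documented on this page.

```typescript
// Shape of one entry in the Chat Completions `tools` array.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, { type: string; description?: string; enum?: string[] }>;
      required: string[];
    };
  };
}

// Hypothetical schema for the `task` tool.
const taskTool: ToolDefinition = {
  type: "function",
  function: {
    name: "task",
    description: "Create, list, and complete tasks",
    parameters: {
      type: "object",
      properties: {
        action: { type: "string", enum: ["create", "list", "complete"] },
        title: { type: "string", description: "Task title (for create)" },
      },
      required: ["action"],
    },
  },
};

// The model returns tool arguments as a JSON string; parse and check required keys.
function parseToolArguments(tool: ToolDefinition, rawArguments: string): Record<string, unknown> {
  const parsed: unknown = JSON.parse(rawArguments);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Tool arguments must be a JSON object");
  }
  const args = parsed as Record<string, unknown>;
  for (const key of tool.function.parameters.required) {
    if (!(key in args)) throw new Error(`Missing required argument: ${key}`);
  }
  return args;
}
```

Validating arguments before execution matters because the model's output is text and is not guaranteed to match the schema.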
Available Tools
- google_calendar: List, search, check availability, create events
- outlook_calendar: Same as Google Calendar for Microsoft accounts
- email: List, search, read, send, reply, forward emails
- task: Create, list, and complete tasks
- contact: Create, update, search, and list contacts
- memory: Read/write user profile and episodic memory
- brave_search: Search the web for current information
- ask_clarification: Request clarification from the user
- account_status: Get subscription and usage information
ReAct Agent Pattern
Julia uses a ReAct (Reasoning + Acting) agent pattern. This means the AI can:
- Reason about what information is needed
- Call one or more tools to gather information or take action
- Observe the results and decide if more actions are needed
- Iterate up to 5 times for complex multi-step tasks
- Generate a final response based on all gathered information
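The loop described above can be sketched roughly as follows. This is a simplified, synchronous illustration: `Model`, `Tools`, and `runAgent` are hypothetical names, the model is injected as a plain function so the sketch runs without network access, and the fallback reply when the step limit is reached is an assumption. Real LLM and tool calls would be asynchronous.

```typescript
// What the model can decide at each step: call a tool or give the final answer.
type Step =
  | { kind: "tool_call"; tool: string; args: Record<string, unknown> }
  | { kind: "final"; text: string };

// The model sees all observations so far and decides the next step.
type Model = (observations: string[]) => Step;
type Tools = Record<string, (args: Record<string, unknown>) => string>;

const MAX_ITERATIONS = 5;

function runAgent(model: Model, tools: Tools): { reply: string; iterations: number } {
  const observations: string[] = [];
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    const step = model(observations); // reason
    if (step.kind === "final") {
      return { reply: step.text, iterations: i };
    }
    const tool = tools[step.tool]; // act
    const result = tool ? tool(step.args) : `Unknown tool: ${step.tool}`;
    observations.push(`${step.tool} -> ${result}`); // observe
  }
  // Assumed fallback when the iteration limit is hit without a final answer.
  return { reply: "I could not finish this within the step limit.", iterations: MAX_ITERATIONS };
}
```

Capping the loop bounds both latency and cost, since each iteration is a separate model call.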
Memory System
Julia's memory system enables her to learn and remember information across conversations. It combines structured profile data with semantic search over episodic memories.
Profile Memory
Structured JSON data that's always included in context:
{
  "personal": {
    "name": "Sarah",
    "timezone": "Europe/Paris"
  },
  "preferences": {
    "meetings": "morning",
    "communication": "concise"
  },
  "work": {
    "company": "TechCorp",
    "role": "Product Manager"
  }
}
Episodic Memory
Important events stored with vector embeddings for semantic search:
- Stored as text with 1536-dimensional OpenAI embeddings
- pgvector extension for efficient similarity search
- Retrieved based on relevance to the current message
- Importance score affects retrieval priority
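As an illustration of how relevance and importance might be combined, the sketch below ranks memories by cosine similarity blended with an importance score. The blending formula and the `importanceWeight` default are assumptions, not Julia's actual ranking; in production the similarity search would typically be delegated to pgvector (for example via its cosine distance operator `<=>`) rather than computed in application code. Tiny 3-dimensional vectors are used in place of 1536-dimensional embeddings.

```typescript
interface EpisodicMemory {
  text: string;
  embedding: number[];
  importance: number; // assumed to lie in [0, 1]
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumed scoring: a weighted blend of similarity and importance.
function retrieveMemories(
  queryEmbedding: number[],
  memories: EpisodicMemory[],
  topK: number,
  importanceWeight = 0.2,
): EpisodicMemory[] {
  return memories
    .map((memory) => ({
      memory,
      score:
        (1 - importanceWeight) * cosineSimilarity(queryEmbedding, memory.embedding) +
        importanceWeight * memory.importance,
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.memory);
}
```

With this blend, a highly important memory can outrank a slightly more similar but trivial one, which matches the idea that importance affects retrieval priority.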
Smart Context Building
When enabled, Julia uses an LLM call (GPT-4o-mini) to analyze your message and determine which context is relevant. This reduces the amount of data sent to the main LLM by 50-75%, improving both speed and cost while maintaining accuracy.
Technology Stack
Backend
- NestJS (Node.js framework)
- TypeScript
- PostgreSQL + pgvector
- Redis (caching, rate limiting)
- TypeORM
AI & Integrations
- OpenAI GPT-4o / GPT-4o-mini
- OpenAI Embeddings (text-embedding-3-small)
- Brave Search API
- Google APIs (Calendar, Gmail)
- Microsoft Graph API
Frontend
- Next.js 14 (App Router)
- TypeScript
- Tailwind CSS
- Framer Motion
- Zustand (state management)
Infrastructure
- Docker / Docker Compose
- Stripe (payments)
- WhatsApp Business API
- Telegram Bot API