Private Alpha: Julia is currently in private alpha.

How It Works

This page provides a technical deep-dive into Julia's architecture. Understanding how Julia works under the hood can help you get the most out of her capabilities and appreciate the technology powering your conversations.

High-Level Architecture

Julia is built as a multi-channel AI assistant with a unified backend that processes messages from WhatsApp and Telegram, orchestrates AI reasoning, and executes actions through integrated tools.

The main components:

  • Channels: WhatsApp, Telegram
  • Privacy Layer: Pseudonymization
  • Orchestrator: NestJS Backend
  • Tools: Calendar, Email, etc.

Message flow: You → Channel → Privacy Layer → LLM + Tools → De-pseudonymize → Response

Message Processing Flow

Every message you send goes through a sophisticated pipeline that combines context retrieval, AI reasoning, and tool execution:

1. Message Reception (Channel Dispatcher Service)

Your message arrives via WhatsApp or Telegram webhook. The channel adapter normalizes the message format and identifies your user account through the linked channel identity.
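As a rough illustration of what normalization means here, the sketch below maps a Telegram update onto one internal message shape. The NormalizedMessage type and the adapter are illustrative assumptions, not Julia's actual code; only the Telegram field names come from the Bot API's Update object.

// Hypothetical internal shape that every channel adapter produces.
interface NormalizedMessage {
  channel: 'whatsapp' | 'telegram';
  channelUserId: string; // identity used to look up the linked Julia account
  text: string;
  receivedAt: Date;
}

// Minimal slice of a Telegram Bot API update (real payloads carry much more).
interface TelegramUpdate {
  message: { from: { id: number }; text?: string };
}

// Sketch of a Telegram adapter: pull out the sender id and the message text.
function normalizeTelegramUpdate(update: TelegramUpdate): NormalizedMessage {
  return {
    channel: 'telegram',
    channelUserId: String(update.message.from.id),
    text: update.message.text ?? '',
    receivedAt: new Date(),
  };
}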

2. User Context Loading (Context Builder Service)

The system loads your profile, recent messages, active tasks, and contacts. This context helps the AI understand who you are and what you've been working on.

3. Quota Check (Subscription Service + Redis)

Your subscription tier is checked to ensure you haven't exceeded daily or monthly message limits. Redis tracks usage in real time.
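A minimal sketch of such a check with Redis, assuming the ioredis client and an illustrative key layout (quota:<userId>:<date>); the real Subscription Service logic is not shown here.

import Redis from 'ioredis';

const redis = new Redis();

// Increment today's counter for the user and reject once the tier limit is reached.
async function checkDailyQuota(userId: string, dailyLimit: number): Promise<boolean> {
  const key = `quota:${userId}:${new Date().toISOString().slice(0, 10)}`;
  const used = await redis.incr(key);
  if (used === 1) {
    await redis.expire(key, 60 * 60 * 24); // counter disappears after a day
  }
  return used <= dailyLimit;
}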

4. Privacy Layer: Pseudonymization (Privacy Gateway Service)

Before reaching the AI, all personal data is pseudonymized. Contact names become PERSON_A, emails become EMAIL_A, etc. This ensures the LLM never sees your real personal information.
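To make the idea concrete, here is a minimal sketch of this kind of substitution; the function names and mapping structure are illustrative, not the actual Privacy Gateway Service.

// Illustrative pseudonymization: swap known personal values for stable placeholders
// and remember the mapping so responses can be translated back later.
function pseudonymize(
  text: string,
  contacts: { name: string; email: string }[],
): { text: string; mapping: Map<string, string> } {
  const mapping = new Map<string, string>(); // pseudonym -> real value
  let result = text;
  contacts.forEach((contact, i) => {
    const personAlias = `PERSON_${String.fromCharCode(65 + i)}`; // PERSON_A, PERSON_B, ...
    const emailAlias = `EMAIL_${String.fromCharCode(65 + i)}`;
    mapping.set(personAlias, contact.name);
    mapping.set(emailAlias, contact.email);
    result = result.split(contact.name).join(personAlias).split(contact.email).join(emailAlias);
  });
  return { text: result, mapping };
}

// De-pseudonymization (step 8) is the inverse: put the real values back.
function depseudonymize(text: string, mapping: Map<string, string>): string {
  let result = text;
  for (const [alias, real] of mapping) {
    result = result.split(alias).join(real);
  }
  return result;
}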

5. LLM Processing (Unified LLM Service)

Your pseudonymized message, along with context and available tools, is sent to the LLM (GPT-4o or GPT-4o-mini depending on tier). The LLM decides what actions to take.

6. Tool Execution (Tool Executor Service)

If the LLM decides to use tools (check calendar, send email, etc.), pseudonyms are converted back to real values, tools are executed, then results are re-pseudonymized.
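A rough sketch of that round-trip at the tool boundary; the wrapper and the runTool callback are hypothetical stand-ins for the Tool Executor Service.

// Hypothetical wrapper: real values go into the tool call, pseudonyms come back out.
async function executeToolSafely(
  toolName: string,
  args: Record<string, unknown>,
  mapping: Map<string, string>, // pseudonym -> real value, from the privacy layer
  runTool: (name: string, args: Record<string, unknown>) => Promise<string>,
): Promise<string> {
  // 1. Replace pseudonyms in the arguments with real values.
  let rawArgs = JSON.stringify(args);
  for (const [alias, real] of mapping) rawArgs = rawArgs.split(alias).join(real);

  // 2. Execute the tool against the real external service.
  const result = await runTool(toolName, JSON.parse(rawArgs));

  // 3. Re-pseudonymize the result before the LLM sees it.
  let safeResult = result;
  for (const [alias, real] of mapping) safeResult = safeResult.split(real).join(alias);
  return safeResult;
}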

7. Response Generation (Orchestrator Service)

The LLM generates a natural language response based on tool results and context. Memory writes are processed to store important information.

8. De-pseudonymization (Privacy Gateway Service)

Before sending the response to you, pseudonyms are converted back to real names. 'I scheduled with PERSON_A' becomes 'I scheduled with Sarah Chen'.

9. Message Delivery (Channel Service)

The response is sent back through the appropriate channel (WhatsApp or Telegram API) and the conversation is logged for future context.

AI & Function Calling

Julia uses OpenAI's function calling (tool use) capability to interact with external services. This allows the AI to take real actions rather than just generating text.

Available Tools

  • google_calendar: List, search, check availability, create events
  • outlook_calendar: Same as Google Calendar for Microsoft accounts
  • email: List, search, read, send, reply, forward emails
  • task: Create, list, and complete tasks
  • contact: Create, update, search, and list contacts
  • memory: Read/write user profile and episodic memory
  • brave_search: Search the web for current information
  • ask_clarification: Request clarification from the user
  • account_status: Get subscription and usage information
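For illustration, the sketch below declares a calendar tool in OpenAI's function-calling format using the official Node SDK. The parameter schema is an assumption for this example, not Julia's real google_calendar definition.

import OpenAI from 'openai';

const openai = new OpenAI();

// Illustrative tool declaration; the parameters are assumptions, not Julia's schema.
const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  {
    type: 'function',
    function: {
      name: 'google_calendar',
      description: 'List, search, check availability, and create calendar events',
      parameters: {
        type: 'object',
        properties: {
          action: { type: 'string', enum: ['list', 'search', 'check_availability', 'create'] },
          timeMin: { type: 'string', description: 'ISO 8601 start of the time window' },
          timeMax: { type: 'string', description: 'ISO 8601 end of the time window' },
        },
        required: ['action'],
      },
    },
  },
];

// The model can answer with a tool call instead of plain text.
async function demo(): Promise<void> {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Am I free tomorrow morning?' }],
    tools,
  });
  console.log(completion.choices[0].message.tool_calls);
}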

ReAct Agent Pattern

Julia uses a ReAct (Reasoning + Acting) agent pattern. This means the AI can:

  • Reason about what information is needed
  • Call one or more tools to gather information or take action
  • Observe the results and decide if more actions are needed
  • Iterate up to 5 times for complex multi-step tasks
  • Generate a final response based on all gathered information
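A minimal sketch of that loop, assuming the OpenAI Node SDK and a hypothetical executeTool helper standing in for the Tool Executor Service.

import OpenAI from 'openai';

const openai = new OpenAI();

// Illustrative ReAct-style loop: call the model, run any requested tools,
// feed the observations back in, and stop after at most 5 iterations.
async function reactLoop(
  messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[],
  tools: OpenAI.Chat.Completions.ChatCompletionTool[],
  executeTool: (name: string, args: string) => Promise<string>, // hypothetical helper
): Promise<string> {
  for (let step = 0; step < 5; step++) {
    const completion = await openai.chat.completions.create({ model: 'gpt-4o', messages, tools });
    const reply = completion.choices[0].message;
    const toolCalls = reply.tool_calls ?? [];

    // No tool calls means the model has produced its final answer.
    if (toolCalls.length === 0) return reply.content ?? '';

    // Otherwise run each requested tool and append the observation.
    messages.push(reply);
    for (const call of toolCalls) {
      if (call.type !== 'function') continue;
      const observation = await executeTool(call.function.name, call.function.arguments);
      messages.push({ role: 'tool', tool_call_id: call.id, content: observation });
    }
  }
  return 'I was not able to finish this request within the allowed number of steps.';
}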

Memory System

Julia's memory system enables her to learn and remember information across conversations. It combines structured profile data with semantic search over episodic memories.

Profile Memory

Structured JSON data that's always included in context:

{
  "personal": {
    "name": "Sarah",
    "timezone": "Europe/Paris"
  },
  "preferences": {
    "meetings": "morning",
    "communication": "concise"
  },
  "work": {
    "company": "TechCorp",
    "role": "Product Manager"
  }
}

Episodic Memory

Important events stored with vector embeddings for semantic search:

  • Stored as text with 1536-dim OpenAI embeddings
  • pgvector extension for efficient similarity search
  • Retrieved based on relevance to current message
  • Importance score affects retrieval priority
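As a sketch of how retrieval over such memories could work, assuming a memories table with a pgvector embedding column and the pg client (the table and column names are assumptions):

import OpenAI from 'openai';
import { Pool } from 'pg';

const openai = new OpenAI();
const pool = new Pool();

// Illustrative retrieval: embed the incoming message, then rank stored
// memories by cosine distance (pgvector's <=> operator).
async function recallMemories(message: string, userId: string): Promise<string[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small', // 1536-dimensional vectors
    input: message,
  });
  const vector = `[${response.data[0].embedding.join(',')}]`;

  const { rows } = await pool.query(
    `SELECT content
       FROM memories
      WHERE user_id = $1
      ORDER BY embedding <=> $2::vector
      LIMIT 5`,
    [userId, vector],
  );
  return rows.map((row) => row.content);
}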

Smart Context Building

When enabled, Julia uses an LLM call (GPT-4o-mini) to analyze your message and determine which context is relevant. This reduces the amount of data sent to the main LLM by 50-75%, improving speed and lowering cost while maintaining accuracy.
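One way to picture this step is a cheap classification call that returns which context blocks to include. The section names and JSON shape below are assumptions for this sketch, not Julia's actual prompt.

import OpenAI from 'openai';

const openai = new OpenAI();

// Illustrative context selection: ask a small model which context blocks
// are worth sending to the main model for this message.
async function selectRelevantContext(message: string): Promise<string[]> {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    response_format: { type: 'json_object' },
    messages: [
      {
        role: 'system',
        content:
          'Given the user message, reply with JSON like {"sections": [...]}, choosing from: ' +
          'calendar, email, tasks, contacts, episodic_memory.',
      },
      { role: 'user', content: message },
    ],
  });
  const parsed = JSON.parse(completion.choices[0].message.content ?? '{"sections":[]}');
  return parsed.sections;
}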

Technology Stack

Backend

  • NestJS (Node.js framework)
  • TypeScript
  • PostgreSQL + pgvector
  • Redis (caching, rate limiting)
  • TypeORM

AI & Integrations

  • OpenAI GPT-4o / GPT-4o-mini
  • OpenAI Embeddings (text-embedding-3-small)
  • Brave Search API
  • Google APIs (Calendar, Gmail)
  • Microsoft Graph API

Frontend

  • Next.js 14 (App Router)
  • TypeScript
  • Tailwind CSS
  • Framer Motion
  • Zustand (state management)

Infrastructure

  • Docker / Docker Compose
  • Stripe (payments)
  • WhatsApp Business API
  • Telegram Bot API