Data Protection Layer

Zero-Knowledge Architecture

Julia features an advanced privacy layer that ensures your personal data never reaches third-party AI providers. This is not just encryption at rest — it's a fundamental architectural decision that makes Julia unique among AI assistants.

Your Real Data Never Reaches OpenAI

When you tell Julia to "schedule a meeting with Sarah Chen at sarah@company.com", the AI doesn't see those real names and emails. Instead, it sees "schedule a meeting with PERSON_A at EMAIL_A". The AI processes the request, and Julia translates the response back to real names before showing it to you. This means your contacts, personal information, and sensitive data are never exposed to external AI providers.

How the Privacy Layer Works

Every message passes through a two-layer protection system before it reaches the AI:

1. You: real names & data
2. Privacy Layer: pseudonymization + PII detection
3. AI (OpenAI): only sees pseudonyms
4. You: real names restored

Two-Layer Protection System

The privacy layer combines two complementary protection mechanisms to ensure comprehensive coverage:

Pseudonymization

The first layer replaces all known personal data with deterministic pseudonyms. This includes information from your contacts, profile, and connected accounts.

Example Transformation:

Before: "Email Sarah Chen at sarah@acme.com about the meeting"

After: "Email PERSON_A at EMAIL_A about the meeting"

Protected Data Types:

Person names
Email addresses
Phone numbers
Company names
Addresses
Profile data
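
The replacement step described above can be sketched roughly as follows. This is a minimal illustration, not Julia's actual implementation: the function names `build_pattern` and `pseudonymize` and the shape of the mapping are assumptions made for the example. It shows case-insensitive matching with boundary checks, so a short name is not replaced inside a longer word.

```python
import re


def build_pattern(known_values):
    # Longest values first, so a full name wins over a shorter overlapping entry.
    ordered = sorted(known_values, key=len, reverse=True)
    alternatives = "|".join(re.escape(value) for value in ordered)
    # Lookarounds approximate word boundaries and also work for values that
    # begin or end with punctuation, such as "+1 555-1234".
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


def pseudonymize(text, mapping):
    """Replace every known real value in `text` with its pseudonym."""
    if not mapping:
        return text
    lookup = {real.lower(): alias for real, alias in mapping.items()}
    pattern = build_pattern(mapping.keys())
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


# Hypothetical mapping; in practice it would come from contacts and profile data.
mapping = {"Sarah Chen": "PERSON_A", "sarah@acme.com": "EMAIL_A"}
print(pseudonymize("Email Sarah Chen at sarah@acme.com about the meeting", mapping))
```

With the sample mapping, the call reproduces the transformation shown in the example above.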

PII Detection

The second layer uses Microsoft Presidio, an enterprise-grade PII detection engine, to catch any personal information that wasn't in your known contacts — like a new phone number or credit card you mention.

Additional Detection:

Before: "My SSN is 123-45-6789"

After: "My SSN is ID_A"

Detected Patterns:

Credit card numbers
Social Security Numbers
Bank account numbers
Passport numbers
Driver's licenses
IBAN codes
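
To make the idea of a second detection pass concrete, here is a small stand-in written with plain regular expressions. It is not Presidio and does not use Presidio's API; the patterns, the function names, and the choice to label every detected value with an `ID_` prefix are assumptions for illustration only. Overlap resolution between findings is left out.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits):
    # Luhn checksum: double every second digit counted from the right.
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def detect_unknown_pii(text):
    """Return (entity_type, start, end) tuples sorted by position."""
    findings = [("US_SSN", m.start(), m.end()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_ok(re.sub(r"\D", "", m.group(0))):
            findings.append(("CREDIT_CARD", m.start(), m.end()))
    return sorted(findings, key=lambda finding: finding[1])


def pseudonymize_detected(text):
    findings = detect_unknown_pii(text)
    aliases = {}
    for _, start, end in findings:
        value = text[start:end]
        if value not in aliases:
            # Single-letter suffixes only; enough for a short example.
            aliases[value] = "ID_" + chr(ord("A") + len(aliases))
    # Replace from the end so earlier offsets stay valid.
    for _, start, end in reversed(findings):
        text = text[:start] + aliases[text[start:end]] + text[end:]
    return text


print(pseudonymize_detected("My SSN is 123-45-6789"))
```

The checksum step matters because a bare digit pattern would flag order numbers and tracking codes; validating candidates before replacing them keeps false positives down.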

Bidirectional Processing

The privacy layer works in both directions, ensuring data is protected throughout the entire conversation lifecycle:

1. Your Message

When you send a message, all personal data is replaced with pseudonyms before the AI sees it.

Call John at 555-1234 → Call PERSON_A at PHONE_A

2. Tool Execution

When Julia needs to execute a tool (e.g., create a calendar event), pseudonyms are converted back to real values so the tool works correctly.

Create event with PERSON_A → Create event with John

3. Tool Results

Results from tools are pseudonymized again before being fed back to the AI for context.

Event created with john@email.com → Event created with EMAIL_A

4. Final Response

The AI's response is de-pseudonymized before you see it, so you get natural text with real names.

I scheduled a meeting with PERSON_A → I scheduled a meeting with John
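
The four steps can be traced end to end with a toy round trip. Everything here is hypothetical: `fake_model` stands in for the AI provider, `create_event` stands in for a calendar tool, and the lookup tables are hard-coded. The point of the sketch is only to show where each translation happens and that the model side sees pseudonyms throughout.

```python
# Hypothetical lookup tables; real ones would be built from the user's contacts.
TO_ALIAS = {"John": "PERSON_A", "john@email.com": "EMAIL_A"}
TO_REAL = {alias: real for real, alias in TO_ALIAS.items()}
CONTACT_EMAILS = {"John": "john@email.com"}

seen_by_model = []  # everything the stand-in model receives, for inspection


def swap(text, table):
    # Simplified exact-match replacement, longest key first.
    for source in sorted(table, key=len, reverse=True):
        text = text.replace(source, table[source])
    return text


def fake_model(text):
    # Stand-in for the provider call; it only ever handles pseudonyms.
    seen_by_model.append(text)
    if text.startswith("Create event"):
        return {"tool": "create_event", "attendee": "PERSON_A"}
    return {"reply": "I scheduled a meeting with PERSON_A"}


def create_event(attendee):
    # Stand-in tool: it needs the real name to look up the real address.
    return f"Event created with {CONTACT_EMAILS[attendee]}"


def handle(user_message):
    outbound = swap(user_message, TO_ALIAS)                  # 1. your message
    call = fake_model(outbound)
    result = create_event(swap(call["attendee"], TO_REAL))   # 2. tool execution
    final = fake_model(swap(result, TO_ALIAS))               # 3. tool results
    return swap(final["reply"], TO_REAL)                     # 4. final response


print(handle("Create event with John"))
```

After the call, `seen_by_model` holds only the pseudonymized request and the pseudonymized tool result, which is the property the four steps are meant to preserve.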

Deterministic Mapping

A key feature of the privacy layer is that pseudonyms are deterministic — the same contact always maps to the same pseudonym within a conversation. This ensures the AI maintains context correctly.

Consistent Mapping Example:

Real Value	Pseudonym	Category
Sarah Chen	PERSON_A	Person
John Miller	PERSON_B	Person
sarah@company.com	EMAIL_A	Email
+1 555-1234	PHONE_A	Phone
Acme Corp	ORG_A	Company

This mapping persists across multi-turn conversations and even through clarification flows, ensuring consistent context for the AI.
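
One way to get this behaviour is a small allocator that hands out the next free letter per category and can be serialized between turns. The class below is a sketch under assumptions: the name `PseudonymMap`, the JSON layout, and the spreadsheet-style suffixes after `Z` (`AA`, `AB`, and so on) are illustrative choices, not details taken from Julia.

```python
import json


def suffix(n):
    # 0 -> "A", 25 -> "Z", 26 -> "AA" (spreadsheet-style column letters).
    letters = ""
    n += 1
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


class PseudonymMap:
    def __init__(self, by_real=None, counters=None):
        self.by_real = dict(by_real or {})    # normalized real value -> pseudonym
        self.counters = dict(counters or {})  # category prefix -> number issued

    def alias_for(self, real, prefix):
        key = real.strip().lower()
        if key not in self.by_real:
            issued = self.counters.get(prefix, 0)
            self.by_real[key] = f"{prefix}_{suffix(issued)}"
            self.counters[prefix] = issued + 1
        return self.by_real[key]

    def to_json(self):
        # Stored alongside the conversation so later turns reuse the same map.
        return json.dumps({"by_real": self.by_real, "counters": self.counters})

    @classmethod
    def from_json(cls, payload):
        data = json.loads(payload)
        return cls(data["by_real"], data["counters"])


pseudonyms = PseudonymMap()
print(pseudonyms.alias_for("Sarah Chen", "PERSON"))
print(pseudonyms.alias_for("John Miller", "PERSON"))
print(pseudonyms.alias_for("sarah chen", "PERSON"))
```

Restoring the map with `PseudonymMap.from_json` on the next turn keeps earlier assignments intact, so a contact first seen in turn one keeps its pseudonym in turn five.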

Sensitive Data Warnings

When the PII detection layer finds highly sensitive information such as credit card numbers or Social Security Numbers, it logs a warning for security monitoring. The data is still pseudonymized; the warning records only that extra-sensitive information was processed.

High-Risk Entity Types

The following patterns trigger additional security logging when detected:

• Social Security Numbers (SSN)
• Credit card numbers
• Bank account numbers
• Passport numbers
• Driver's license numbers
• IBAN codes
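
A sketch of how such logging might look is below. The entity type names match Presidio's built-in recognizer names for these categories, but whether Julia uses exactly this set, this logger name, or this message format is an assumption. The sketch logs the entity type and character span only, never the sensitive value itself.

```python
import logging

# Presidio entity names for the categories listed above (assumed set).
HIGH_RISK = frozenset({
    "US_SSN",
    "CREDIT_CARD",
    "US_BANK_NUMBER",
    "US_PASSPORT",
    "US_DRIVER_LICENSE",
    "IBAN_CODE",
})

logger = logging.getLogger("privacy.sketch")  # hypothetical logger name


def warn_on_high_risk(findings, log=logger):
    """Log one warning per high-risk finding and return the flagged types.

    `findings` is a list of (entity_type, start, end) tuples.
    """
    flagged = [finding for finding in findings if finding[0] in HIGH_RISK]
    for entity_type, start, end in flagged:
        log.warning("high-risk PII detected: type=%s span=%d-%d",
                    entity_type, start, end)
    return [finding[0] for finding in flagged]


print(warn_on_high_risk([("US_SSN", 10, 21), ("PERSON", 0, 4)]))
```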

Technical Architecture

Core Components

PrivacyGatewayService

The orchestrator that manages the full lifecycle of privacy protection. It coordinates pseudonymization and PII detection, and handles map serialization for multi-turn conversations.

PseudonymizationService

Handles deterministic mapping between real values and pseudonyms. Creates maps from user context (contacts, profile, accounts) and performs bidirectional text transformation.

PiiDetectionService

Integrates with Microsoft Presidio Analyzer to detect unknown PII patterns. Presidio combines named-entity recognition models with pattern-based recognizers to identify entities not present in the user's known contacts.

Key Features

Deterministic mapping (consistency across requests)
Bidirectional transformation (pseudonymize ↔ depseudonymize)
Deep object traversal (handles nested data)
Case-insensitive matching (emails, names)
Word-boundary awareness (avoids partial replacements)
Multi-turn conversation support
Clarification flow continuity
Graceful degradation (works if Presidio is down)
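
Two of these features, deep object traversal and graceful degradation, can be illustrated together. The sketch below is an assumption about how they might fit, not Julia's code: `transform_deep`, `protect_payload`, and the `second_layer` callback are hypothetical names, and the first-layer replacement is reduced to exact string matching.

```python
def transform_deep(value, fn):
    # Apply `fn` to every string inside nested dicts, lists, and tuples.
    if isinstance(value, str):
        return fn(value)
    if isinstance(value, dict):
        return {key: transform_deep(item, fn) for key, item in value.items()}
    if isinstance(value, list):
        return [transform_deep(item, fn) for item in value]
    if isinstance(value, tuple):
        return tuple(transform_deep(item, fn) for item in value)
    return value  # numbers, booleans, None pass through unchanged


def protect_payload(payload, mapping, second_layer):
    def protect_text(text):
        # Layer one: known values, longest first.
        for real in sorted(mapping, key=len, reverse=True):
            text = text.replace(real, mapping[real])
        # Layer two: if the detector is unreachable, keep the layer-one result.
        try:
            return second_layer(text)
        except Exception:
            return text

    return transform_deep(payload, protect_text)


def unreachable_detector(text):
    # Simulates the detection service being down.
    raise ConnectionError("analyzer unreachable")


payload = {
    "title": "Sync with John",
    "attendees": [{"email": "john@email.com"}],
    "minutes": 30,
}
mapping = {"John": "PERSON_A", "john@email.com": "EMAIL_A"}
print(protect_payload(payload, mapping, unreachable_detector))
```

Even with the detector failing, the nested title and email are still replaced by the first layer, and the integer field is left untouched.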

Why This Matters

Many AI assistants send your raw data directly to AI providers like OpenAI. This means your contact names, email addresses, phone numbers, and personal details become part of their systems, where they are subject to the provider's privacy policies and exposed to potential data breaches.

Julia's privacy layer changes this fundamentally. The AI provider only ever sees abstract tokens like PERSON_A and EMAIL_A. Even if there were a data breach at OpenAI, your actual personal information would not be exposed because it was never sent there in the first place.

This is true zero-knowledge privacy — not just a promise, but an architectural guarantee.
