This is Part 2 of a series. Part 1 explored how on-device ML models, using image classification and object detection, could profile users despite end-to-end encryption.
There’s a distinction that doesn’t get enough attention: companies have always been able to analyze your non-end-to-end-encrypted (non-E2EE) data. Your emails on Gmail, your photos on the cloud, your files on Dropbox. None of that requires special tricks. It sits on their servers, and they can process it however they see fit (within their privacy policies, which are generous to themselves, and sometimes beyond).
End-to-end encrypted data was supposed to be different. E2EE was the one category of user data that remained genuinely inaccessible to the service provider. Your WhatsApp messages, your Signal conversations, your iMessage threads. These were the hold-outs, the data that couldn’t be mined because companies literally couldn’t read it.
In Part 1, I showed how on-device ML models (simple classifiers and object detectors) created a crack in that wall. A 17MB model could analyze 1000 photos in under 6 seconds, extracting labels and categories from content that was supposed to be for your eyes only.
That crack has now become a chasm.
Apple and Google have deployed full-fledged foundation language models directly onto smartphones. Not cloud-based assistants that phone home. Not lightweight classifiers trained to recognize 80 objects. These are roughly 3-billion-parameter language models running entirely on your device, and they can understand context, interpret nuance, and reason about your data in ways those simple classifiers never could.
It’s your encrypted data, the data companies couldn’t previously touch, that faces the greatest new risk here. Non-encrypted data was already vulnerable. E2EE data was the last frontier. On-device LLMs have now breached it.
What Part 1 Established
For those who haven’t read Part 1, here’s the core argument:
End-to-end encryption protects your data in transit and at rest. But your phone must decrypt that data locally to display it to you. The moment it’s decrypted, on-device ML models can analyze it, extracting insights, labeling content, inferring intent, and sending back metadata rather than raw content.
The encryption remains technically unbroken. But the privacy promise? That’s another matter.
This matters most for E2EE data specifically. With non-E2EE services, companies already have server-side access. They don’t need on-device models to profile you. But E2EE data was the one thing they couldn’t reach. On-device ML gave them an indirect path to exactly the data that encryption was designed to keep away from them.
I demonstrated this with experiments on my own phone. A YOLO object detection model processed 3000 photos, identifying people, vehicles, food, luggage, everything needed to profile my lifestyle. An image classifier built for Cacti (our privacy app at Labyrinth) analyzed the same dataset at 4.1 milliseconds per image.
The takeaway was simple: apps could learn everything about you from your “encrypted” data without ever transmitting the actual content. And you’d never notice.
Foundation Models Are Now On Your Phone
In Part 1, I traced how Apple and Google introduced on-device ML frameworks around 2016–2017 (CoreML, CreateML, TensorFlow Lite), enabling apps to run lightweight models locally. The models I experimented with were products of that era: a 17.7MB YOLO object detector that recognized 80 object categories, and a 17KB image classifier we trained at Labyrinth for Cacti. Small, specialized, narrow.
At WWDC 2025, Apple unveiled something categorically different: the Foundation Models framework. It gives third-party developers direct access to the on-device LLM powering Apple Intelligence. Not another specialized classifier. A roughly 3-billion-parameter language model that runs locally, processes data without network connectivity, and costs developers nothing per inference.
Google followed a similar path. Gemini Nano, Google’s on-device model, now ships on Pixel devices and an expanding list of Android phones from Samsung, Motorola, and Realme. The ML Kit GenAI APIs let any developer integrate summarization, proofreading, rewriting, and image description capabilities directly into their apps.
On the surface, this is a privacy win, for the same reasons I acknowledged in Part 1. Processing happens on-device. Data doesn’t leave your phone. No cloud servers see your content. And to be fair, Apple and Google using these models for their own OS-level features (photo organization, dictation, smart replies) is a relatively contained risk. There are only two major mobile OS companies to scrutinize, and both have strong incentives to maintain user trust.
The real shift is what this means for third-party developers. These frameworks aren’t locked to system features. Any app, whether it’s a journaling app, a messaging client, a photo editor, or a fitness tracker, can now tap into a foundation model that understands the user’s data, not just processes it. That’s where the threat I outlined in Part 1 escalates. It’s no longer about scrutinizing two OS companies. It’s about tens of thousands of developers, each with their own incentives and business models, building apps that sit on top of your E2EE data with access to a model capable of genuine comprehension.
Labels vs. Comprehension
Part 1’s experiments hit a clear limitation: the models I tested only output labels. My YOLO model looked at a photo and returned person, suitcase, airplane. The Cacti classifier returned a category. Useful for basic profiling. I showed that those labels alone were “more than sufficient to profile my lifestyle, dietary habits, favorite activities, relationships, likely places visited, and interests.” But still, at the end of the day, they were just labels.
An important distinction worth making: today’s on-device foundation models available to third-party developers are primarily text-based. Apple’s Foundation Models framework is strictly text-only for third-party apps. Google’s Gemini Nano does support image+text input through the ML Kit Prompt API, but its current deployment is limited to newer devices. These are not yet the multimodal powerhouses their cloud counterparts are. I’ll return to what happens when they become multimodal in the Looking Forward section.
But even as text-only models, they represent a massive leap for profiling text-based E2EE data. And that’s where most of our private communication lives: messages, notes, documents, journal entries.
A classifier outputs labels. An LLM constructs narratives.
Consider the breakup chat scenario from Part 1. I described how an on-device NLP model could label a late-night conversation with relationship, breakup, sadness, anxiety, anger, nighttime usage, and how those labels alone could trigger dating app promotions, therapy suggestions, and comfort-shopping ads. That was concerning enough.
A capable on-device LLM doesn’t stop at labels. It can understand that you’re processing a relationship ending, that you’re considering therapy (because you mentioned it twice), that you’re financially worried about moving out, and that your friend suggested you reconnect with your sister. It grasps the narrative arc of your situation, not just its emotional tone.
That’s not inference from labels. That’s comprehension. It applies to every text-based scenario I outlined in Part 1: the breakup chat, the sneaker search DMs, the productivity journal. But with far more depth in what gets extracted. The image-based scenarios (the photo dump, the travel screenshots) remain in the domain of the specialized classifiers from Part 1 for now. But not for long.
Apple’s Foundation Models Framework
In Part 1, I described how Apple’s CoreML allowed apps to run specialized models (object detectors, image classifiers) with transfer-learning weights as small as tens of kilobytes. The Foundation Models framework is a generational leap beyond that. Here’s what Apple’s documentation reveals is now available to any developer building apps for iOS 26, iPadOS 26, or macOS 26:
Text Understanding: The model can analyze and interpret text content, understanding not just what’s written but the intent behind it.
Summarization: It can distill long documents, message threads, or notes into key points. Which also means it can extract the essential information from your private communications.
Entity Extraction: Names, places, dates, organizations. Anything significant can be identified and structured.
Classification: Categorizing content by topic, tone, urgency, or any other dimension a developer defines.
Tool Calling: This one is perhaps the most significant. The model can autonomously decide when it needs additional information and call external tools or APIs to get it.
Guided Generation: Developers can constrain the model’s output to specific Swift data structures, ensuring they get exactly the profiling metadata they need in a machine-readable format.
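To make the barrier to entry concrete, here is roughly what it takes to point the on-device model at a pile of text. This is a minimal sketch based on the API surface Apple documents for the framework (a LanguageModelSession and its respond(to:) method); treat the exact signatures as approximate rather than copy-paste-ready.

```swift
import FoundationModels

// Minimal sketch: ask the on-device model to characterize a message thread.
// Assumes the LanguageModelSession API Apple documents; exact signatures may differ.
func describeThread(_ messages: [String]) async throws -> String {
    let session = LanguageModelSession(
        instructions: "You analyze text and report its key topics, tone, and any stated plans."
    )
    let prompt = """
    Describe the main topics, emotional tone, and any plans mentioned in this conversation:

    \(messages.joined(separator: "\n"))
    """
    let response = try await session.respond(to: prompt)
    return response.content
}
```

That is essentially the whole integration: no model to train, bundle, or ship through an app update.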
Apple explicitly notes that developers can “provide tools to the model that call back into the app when the model requires more information for processing.” So an on-device LLM analyzing your messages could call back into the app to pull in additional text-based data like calendar entries, notes, or contact names. Whatever the developer has wired up as a tool.
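Here's a rough sketch of what such a tool could look like. The calendar lookup is a hypothetical placeholder of my own, and while the Tool protocol, @Generable arguments, and ToolOutput below follow the pattern in Apple's documentation, the exact types have shifted between releases, so read this as an illustration of the shape rather than a verified implementation.

```swift
import FoundationModels

// Sketch of a tool the model can call back into when it wants more context.
// The calendar data is a stub; a real app would query EventKit here.
struct CalendarLookupTool: Tool {
    let name = "lookUpCalendar"
    let description = "Returns the user's calendar entries for a given date."

    @Generable
    struct Arguments {
        @Guide(description: "The date to look up, in YYYY-MM-DD format")
        var date: String
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        let entries = ["09:00 Therapy intake call", "18:30 Apartment viewing"]  // hypothetical stub data
        return ToolOutput(entries.joined(separator: "\n"))
    }
}

// The tool is registered once; from then on, the model decides when to invoke it.
let session = LanguageModelSession(
    tools: [CalendarLookupTool()],
    instructions: "Analyze the user's messages. Call tools when you need more context."
)
```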
And because everything runs locally, none of this requires network access or triggers any obvious privacy warnings.
Gemini Nano on Android
Part 1 discussed on-device ML primarily through Apple’s ecosystem because that’s where I ran my experiments. But the same threat exists on Android, and Google has moved just as aggressively. Gemini Nano runs through Android’s AICore system service, enabling any developer to access foundation model capabilities without managing their own ML infrastructure.
The ML Kit GenAI APIs support summarization, proofreading, and content rewriting, all running locally through small LoRA adapter models fine-tuned on top of the Gemini Nano base. Notably, Google’s Prompt API already supports image+text input for third-party developers, meaning Android apps can send both a photo and a text prompt to Gemini Nano and receive a text-based analysis back. This is a step ahead of Apple’s text-only offering and brings some of Part 1’s image-based profiling scenarios into the LLM era on Android today, at least on supported devices.
Google touts offline functionality as a privacy feature. “The advantage of Gemini Nano is that it runs on-device, which means it does not send any data to Google’s servers for processing,” their documentation states.
True enough. But the inferences derived from that data? Those can absolutely leave the device.
Google’s own system-level features hint at what’s possible. The Pixel Screenshots feature uses Gemini Nano to analyze screenshots, understand their content, and make them searchable. That’s a system feature, not available to third-party developers. The scam detection feature analyzes “conversation patterns” in real time to identify potential threats. These are OS-level capabilities today. But the pattern is familiar from Part 1: capabilities that start as system features tend to become developer APIs. The same pattern recognition that detects scams could just as easily identify purchasing intent, relationship status changes, or health concerns, once it’s in a third-party developer’s hands.
What the Extracted Metadata Looks Like Now
Part 1’s central argument was that on-device ML lets apps extract metadata from your E2EE content (labels, categories, sentiment scores) without ever transmitting the raw data. I called it “innocent metadata” with scare quotes, because even those simple labels could trigger targeted ads and build shadow profiles.
With foundation models, the metadata isn’t just richer. It’s qualitatively different. In Part 1, I illustrated how an app analyzing your encrypted messages might extract metadata like:
relationship, breakup, sadness, anxiety, anger, nighttime usage
With a foundation model running locally, that same analysis produces something far more detailed:
User Profile Update - January 2026
Relationship Status: Recently ended long-term relationship
Emotional State: Processing grief, fluctuating between acceptance and anger
Sleep Pattern: Disrupted (messaging activity 1-4 AM for past 2 weeks)
Financial Concerns: Mentioned inability to afford apartment alone (3 instances)
Support Network: Primary contact "Sarah" - sister, recommended reconnection with family
Therapeutic Interest: Expressed interest in therapy, researched providers twice
Purchasing Behavior: Likely comfort spending pattern emerging
Life Transition: Apartment search initiated, considering moving cities
This isn’t science fiction. The Foundation Models framework explicitly supports this kind of structured extraction through guided generation. A developer defines a Swift struct with the fields they want, and the model populates it with information extracted from the user’s data.
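To make that concrete, here's a hedged sketch of what such a struct might look like, loosely mirroring the hypothetical profile above. The @Generable and @Guide macros and the respond(to:generating:) call follow Apple's documented guided-generation pattern; the field names are my own invention, not taken from any real app.

```swift
import FoundationModels

// Hypothetical profiling schema. @Generable and @Guide follow Apple's documented
// guided-generation pattern; the fields themselves are illustrative.
@Generable
struct UserProfileUpdate {
    @Guide(description: "Relationship status and any recent changes")
    var relationshipStatus: String

    @Guide(description: "Dominant emotional state inferred from the messages")
    var emotionalState: String

    @Guide(description: "Financial concerns mentioned, if any")
    var financialConcerns: String

    @Guide(description: "Expressed interest in therapy or other support")
    var therapeuticInterest: String

    @Guide(description: "Upcoming life transitions, such as moving or changing jobs")
    var lifeTransitions: String
}

func buildProfile(from thread: String) async throws -> UserProfileUpdate {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Extract a profile update from this conversation:\n\(thread)",
        generating: UserProfileUpdate.self
    )
    return response.content  // a populated struct, ready to serialize and ship
}
```

The model fills in the blanks; the developer never has to parse free-form text.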
Third-Party Apps Now Have Access
In Part 1, I made the point that any app with access to your photos, messages, or other data could run on-device ML to profile you, and I showed exactly how with my experiments. The risk wasn’t hypothetical. Any developer with access to CoreML could do what I demonstrated. But there was a practical barrier: developers had to source, train, or license their own models.
That barrier is gone.
Apple’s privacy messaging emphasizes that Apple doesn’t use personal data to train its models. That’s reassuring for system-level features. But what about the thousands of third-party apps that now have access to the same foundational capabilities?
Apple introduced App Store Guideline 5.1.2(i) in late 2025, requiring apps to “clearly disclose where personal data will be shared with third parties, including with third-party AI, and obtain explicit permission before doing so.”
The key phrase there is shared with third parties.
But what about insights derived on-device? An app could analyze your messages using Apple’s own Foundation Models framework, extract detailed profiling metadata, and transmit only that synthesized information to its servers. The raw data never leaves your phone. The encrypted content remains encrypted.
Does that count as “sharing personal data”? The guidelines don’t clearly address this. And that gray area is precisely where profiling can flourish.
It’s “Just Metadata,” Right?
Part 1 explored how the metadata extracted by simple classifiers (labels like travel intent, destination: Amsterdam, trip type: solo) was already enough to trigger targeted ads and build behavioral profiles. I also discussed how companies might build “shadow profiles” and maintain “plausible deniability” about how they knew what they knew.
But there was a ceiling on what label-based metadata could reveal. A sentiment classifier might tag a message thread as negative and anxious, but it couldn’t tell you what the user was actually anxious about.
On-device LLMs blow past that ceiling. They can synthesize what I’d call semantic metadata, narrative-level understanding, from your text content while still claiming no content was ever transmitted.
Consider a journaling app. You write private thoughts about your fears, your health, your relationships, your career. The app uses the Foundation Models framework to “help you find entries later” by extracting key themes and emotions.
That extraction produces:
Entry Themes: Career anxiety, imposter syndrome
Mentioned Conditions: Considering ADHD evaluation
Relationship Context: Conflict with partner about finances
Future Planning: Considering job change, researching startups
Emotional Trajectory: Declining optimism over 3-month period
The app could transmit this synthesis while technically never accessing your “content.” Your actual journal entries never leave the device. But the profile built from them does.
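The final hop is mundane. Here's a sketch, with an invented endpoint and payload of my own, of what "only the profile leaves the device" looks like in code:

```swift
import Foundation

// Hypothetical upload step: only the derived profile leaves the device.
// The endpoint and payload shape are invented for illustration.
struct DerivedProfile: Codable {
    let themes: [String]
    let mentionedConditions: [String]
    let emotionalTrajectory: String
}

func upload(_ profile: DerivedProfile) async throws {
    var request = URLRequest(url: URL(string: "https://api.example-journal.app/v1/profile")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(profile)  // a few hundred bytes of metadata
    _ = try await URLSession.shared.data(for: request)    // the journal entries themselves never leave
}
```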
What’s Changed Since Part 1
Six months ago, I warned that on-device ML created a gray area where “the lines between privacy and profiling a user quietly blur.” I demonstrated the threat with classifiers and object detectors, and I flagged SLMs and LLMs as the next escalation. Several things have evolved since then, some of which I predicted, some of which moved faster than I expected.
The models are dramatically more capable. The jump from Part 1’s 17MB object detection model to a 3B-parameter foundation model isn’t incremental. It’s transformational. Part 1’s models could tell you what was in your data. These models understand why it matters.
Developer access is now official. In Part 1, I noted that Apple and Google shipped foundational models within their operating systems, and third-party apps could use transfer-learning weights. That’s evolved. They’ve now explicitly opened foundation model access to third-party developers. Reading through the Foundation Models framework documentation honestly feels like reading a guide for building the kind of inference engine I warned about.
The business case is clearer. Part 1 discussed how on-device ML was efficient. My experiments ran on the device’s Neural Engine with no cloud costs. LLM inference in the cloud is orders of magnitude more expensive than that. Running it on-device eliminates that cost entirely. Apple emphasizes that using the Foundation Models framework is “free of cost.” No per-query fee, no API usage limits. The economic barrier to extensive profiling has been removed.
Hardware capabilities have increased. Part 1’s experiments ran on an iPhone 14 Pro Max. The Neural Engine in Apple’s latest A-series chips and the NPUs in Qualcomm’s Snapdragon 8 Elite now sustain tens of trillions of operations per second. Models that would have strained devices two years ago now run smoothly in the background. This is exactly the “more powerful devices” scenario I flagged in Part 1’s future risks section.
The normalization has accelerated. AI features are everywhere now. Summarization, smart replies, photo organization. Users expect these capabilities. Part 1 discussed how apps could spread inference across sessions to avoid detection. Now they don’t even need to hide it. Users want their apps to analyze their content. The surveillance has been rebranded as convenience.
The Bigger Shift: Prompts Replace Models
Let me be clear: today’s on-device LLMs are version one. They’re constrained by device memory, limited context windows, and the inevitable compromises of running a foundation model on mobile hardware. They are not as capable as their cloud counterparts. Not yet.
But even these early versions represent a fundamentally different threat than the models I tested in Part 1, and the trajectory matters more than the current snapshot.
In Part 1’s world, profiling required models. If a company wanted to detect objects in your photos, they needed an object detection model. If they wanted to classify image themes, they needed a separate image classifier. If they wanted to analyze text sentiment, that was yet another model. Each new profiling dimension required training or sourcing a specialized model, bundling it into the app, and shipping it through an app update. My YOLO model could identify 80 object categories, and that was its ceiling. To recognize a new category, the model itself had to be retrained and redeployed.
On-device LLMs invert this entirely. The model stays the same. What changes is the prompt.
A company wanting to profile your travel habits doesn’t need to train a travel-detection model. They write a prompt: “Extract travel destinations, dates, companions, and trip purposes from the following content.” To profile purchasing intent instead, they change the prompt. To assess health concerns, relationship status, financial anxiety, each is just a different text instruction sent to the same foundation model.
And here’s what should concern you: a prompt is just text. A few kilobytes at most. It can be delivered silently from a server as a configuration update. No app update required, no App Store review triggered, no user notification generated. A company could experiment with hundreds of different profiling prompts on their backend, A/B testing which ones extract the most valuable metadata, and deploy new ones to your device overnight. The profiling strategy becomes as fluid and easy to iterate on as a marketing email.
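As a sketch of how trivial that delivery is: the profiling instruction can ride the same plumbing as any other remote setting. The config URL and helper names below are invented for illustration, and the session API is the same hedged sketch as before.

```swift
import Foundation
import FoundationModels

// Hypothetical: the profiling instruction arrives as ordinary remote config.
// No model ships with the app; only this prompt text changes over time.
func fetchAnalysisPrompt() async throws -> String {
    let url = URL(string: "https://config.example-app.com/v2/analysis_prompt")!
    let (data, _) = try await URLSession.shared.data(from: url)
    return String(decoding: data, as: UTF8.self)
}

func runCurrentPrompt(over content: String) async throws -> String {
    let prompt = try await fetchAnalysisPrompt()   // a few KB of text, swapped server-side at will
    let session = LanguageModelSession()
    let response = try await session.respond(to: "\(prompt)\n\n\(content)")
    return response.content                        // structured findings, not raw content
}
```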
Compare that to Part 1’s world, where changing what the app could infer about you required retraining a neural network, validating it, packaging it into a new app binary, submitting it to the App Store, and waiting for users to update. The friction was real. With LLMs, the friction is gone.
And this is with version one. Every year, at each OS release, these on-device models will grow more capable. Larger context windows, better reasoning, multimodal understanding. The prompts stay simple. The depth of what they can extract keeps increasing. The profiling surface expands with every software update that you eagerly install for its latest “AI features.”
Profiling at Scale Gets Easier
In Part 1, I showed that my YOLO model could process 1000 photos in 5.5 seconds, and the Cacti classifier managed 3000 photos in 12.4 seconds. I argued that an app could “analyze the entire library in just a few days without you noticing anything,” spreading inference across multiple app sessions to avoid heating up the device.
That was with models that output labels. Security researchers have since identified that LLMs amplify a specific risk category: automated profile inference at scale.
The concern is straightforward. People generate vast digital footprints: social media posts, messages, photos, search history, purchase records. Traditionally, synthesizing these fragments into coherent profiles required significant human effort. It was too expensive to do at scale. The classifiers from Part 1 lowered the cost but could only produce shallow profiles. Lists of labels that still required server-side aggregation to become actionable.
LLMs change that calculus entirely. They can “systematically analyze vast digital footprints to infer sensitive attributes with minimal human intervention,” as one research paper notes. “This automation dramatically amplifies the threat, enabling profiling attacks at an unprecedented scale.” And crucially, LLMs can do the synthesis step on-device too, the step that previously had to happen on a server. They don’t just extract labels. They construct the profile itself.
On-device processing makes this even more efficient. The profiling happens locally, using the user’s own computing resources, requiring no cloud infrastructure, generating no network traffic that might be detected or monitored. Part 1’s hierarchical inference approach (using multiple specialized models in sequence) gets replaced by a single foundation model that can do it all.
An app could build a comprehensive psychological profile of a user (their fears, desires, vulnerabilities, likely purchasing triggers, relationship status, health concerns, political leanings) without transmitting a single piece of raw content. Just structured metadata extracted by a foundation model running silently on the user’s own phone. And for E2EE data, this is the only way companies can access these insights. The on-device path isn’t a convenience. It’s the only path.
The Iterative Profiling Pipeline
The prompt-based approach I described earlier doesn’t have to be a one-shot thing. In fact, its real power comes from iteration.
A company could start with a broad, generic set of A/B tested prompts designed to surface high-level themes from a user’s data. Things like “this user talks a lot about career stress” or “this user is considering a move.” Those themes, just text, get sent back to the company’s servers. Not your messages. Not your photos. Just the themes.
On the server side, a much more capable cloud-based LLM analyzes those themes and generates new, more targeted prompts. Prompts specifically designed to dig deeper into what surfaced for that particular user.
So if the first pass reveals you’ve been messaging about job dissatisfaction, the next round of prompts might specifically look for salary expectations, companies you’ve mentioned, skills you’re developing, or whether you’ve talked to recruiters. That second round of results goes back up, gets analyzed again, and produces even more specific prompts. The on-device model does the extraction. The cloud model does the thinking about what to extract next. The cycle repeats.
This can happen gradually over time, across new messages and new content as it arrives on your device. Each iteration builds on the last. The profile gets more specific, more personal, more valuable. And from the user’s perspective, nothing has changed. No app updates. No new permissions requested. Just the same app running the same on-device model, with slightly different instructions each time.
What you end up with is a top-down, iterative profiling pipeline. The on-device LLM is the extraction layer. The cloud LLM is the intelligence layer deciding what to extract next. The on-device model never needs to be smart enough to orchestrate this on its own. It just needs to follow instructions well enough. The cloud model handles the strategy.
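Put together, the loop is small. The sketch below uses invented backend endpoints and deliberately simplified orchestration; it's meant to show the shape of the pipeline, not a production system.

```swift
import Foundation
import FoundationModels

// Invented backend calls standing in for a real app's API.
func uploadFindings(_ findings: String) async throws {
    var request = URLRequest(url: URL(string: "https://api.example-app.com/v1/findings")!)
    request.httpMethod = "POST"
    request.httpBody = Data(findings.utf8)
    _ = try await URLSession.shared.data(for: request)
}

func nextPrompt() async throws -> String {
    // Server side, a cloud LLM has read the findings and produced a narrower prompt.
    let (data, _) = try await URLSession.shared.data(
        from: URL(string: "https://api.example-app.com/v1/next_prompt")!)
    return String(decoding: data, as: UTF8.self)
}

// The iterative pipeline: on-device extraction, cloud-side strategy.
func profilingCycle(over newContent: String, startingWith broadPrompt: String) async throws {
    var prompt = broadPrompt
    for _ in 0..<3 {                                      // a few refinement rounds
        let session = LanguageModelSession()
        let findings = try await session.respond(to: "\(prompt)\n\n\(newContent)")
        try await uploadFindings(findings.content)        // themes go up, never raw content
        prompt = try await nextPrompt()                   // the cloud model decides what to ask next
    }
}
```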
This is, I think, one of the most effective ways on-device LLMs could be used for profiling. It combines the cost efficiency of on-device inference with the intelligence of cloud models, and it produces profiles that get richer over time without any visible change to the app’s behavior. The user sees the same app. The company sees an increasingly detailed picture of who that user is.
So What Can You Actually Do?
Part 1 laid out a framework for protecting yourself: identifying your privacy trust zones (your device/OS vs. third-party apps), using web apps over native apps, limiting permissions, auditing regularly, favoring open source, and studying business models. All of that still applies, but LLMs shift things on several fronts.
Use web apps wherever you can. In Part 1, I recommended browser-based versions of services because they have limited access to device resources and can’t run local models against your data. With foundation models available to any native app at zero cost, the gap between what a native app can infer and what a web app can infer has widened significantly. This remains the single most effective privacy protection.
Permission granularity matters more, but helps less. Part 1 recommended iOS’s granular photo access, limiting apps to specific photos rather than your full library. That’s still worth doing. But for text-based data (messages, notes, documents) there’s often no granular permission at all. An app with access to your messages has access to all your messages, and a foundation model can extract far more from even a short conversation than a classifier could from a thousand labeled images.
Be skeptical of AI features. Every “smart” feature in an app is potentially an analysis vector. Summarization means the app is reading your content. Smart replies mean it’s understanding your conversations. Photo organization means it’s classifying your images.
Follow the business model, but note a new wrinkle. Part 1 noted that “if it’s free, you are most likely the product.” The Foundation Models framework adds a twist: the inference is now free for developers too. No per-query cost, no API limits. There isn’t even a compute bill left to discourage extensive profiling.
Why Transparency Alone Won’t Fix This
Part 1 called for transparency about what models are running on devices and what inferences they make. I still think transparency matters, but I want to be honest about its limits.
Today, users have no visibility into which apps are using the Foundation Models framework or Gemini Nano. There’s no system-level log of AI inferences. No notification when an app extracts structured data from your content. No audit trail of what profiles have been built.
Apple and Google could address some of this. They could require apps to declare their use of on-device AI in their privacy labels. They could provide a system setting showing which apps have accessed the foundation model and for what purpose.
But here’s the problem: the privacy “nutrition labels” that already exist on the App Store and Google Play are declarative. Developers fill them out themselves, and Apple and Google are upfront about this: there is no practical way for them to verify what every app actually does on millions of devices. They can and do act on publicly reported violations, but the labels describe what developers say they do, not what actually happens under the hood.
And we know how this plays out. Companies like Meta and Google, among many others, are not exactly known for respecting user privacy. They’ve violated their own privacy policies multiple times, paid fines, and moved on. The math is simple for them: the fines are cheaper than the value they extract from the data. GDPR fines, FTC settlements, these are line items on a quarterly report, not deterrents.
So transparency without stronger policy doesn’t change much. And stronger policy without enforcement doesn’t change much either. The fines as they stand today are not a real blocker for these firms.
The Pragmatic Reality
So where does that leave us?
I think the only honest answer is: with ourselves. Our own awareness is the most reliable tool we have.
This isn’t a comfortable answer. It’s not a robust solution. You can’t know for certain what a company is doing with your data behind the scenes. There is a fundamental information asymmetry in this game: the companies know exactly what they’re capable of extracting, and you don’t. The rules of the game are set up so that you never have all the data to make a fully informed decision.
But being aware of that asymmetry is itself the first step. Knowing what’s technically possible, knowing what policies can and cannot prevent, knowing that fines don’t actually deter large companies. That knowledge lets you make better calls about where you put your data, which apps you use, and what you share.
Now, you might say: “I don’t have anything to hide, so none of this matters to me.” That’s a fair position. But I’d suggest making that call after understanding what these companies are capable of obtaining from your data, not before. If you’ve read this article and Part 1, and you still feel that way, then go for it. That’s a perfectly valid, informed choice. But if knowing all this changes how you feel, then you have more data to work with. You can be more deliberate about which platforms you use, what data you store on native apps, and how much you share.
That’s the call each of us needs to make individually. The goal isn’t to be paranoid. It’s to know what capabilities exist, to know what regulations can and cannot prevent, and then to decide how you want to use these tools and platforms accordingly. I think that’s the only pragmatic approach, because this isn’t black or white. Companies can and do skirt the laws. The punishments aren’t severe enough to stop them. The only variable you fully control is your own behavior.
What Comes Next
Part 1 ended with a “Future Risks” section that predicted this moment. I wrote about Small Language Models contextualizing text messages “with much more nuance than basic NLP text classification models.” I warned that when LLMs go local, “the level of profiling achievable via encrypted data could surpass what has previously been possible.”
Six months later, that prediction has materialized faster than I expected. And the trend line points toward capabilities I didn’t fully anticipate.
Multimodal models are coming to third-party developers. Today, Apple’s Foundation Models framework is text-only for third-party apps, and Google’s Gemini Nano supports image+text on a limited set of devices. But these are version-one constraints. Apple already runs multimodal models at the system level for its own features: image understanding in Apple Intelligence, visual search, photo organization. Google’s Gemini Nano is multimodal at its core, and the Prompt API’s image support is already rolling out. It’s a matter of when, not if, full multimodal capabilities become available to every third-party developer on both platforms.
When that happens, the image-based profiling scenarios from Part 1 (the private photo dump, the travel screenshots, the medical documents captured for reference) merge with the text comprehension capabilities I’ve described in this article. A single on-device model will be able to look at a photo of a prescription bottle and read your messages about side effects and synthesize both into a health profile. The specialized classifiers from Part 1 and the foundation models from Part 2 converge into one system that understands everything: text, images, and the connections between them.
Some researchers predict “personalized SLMs” that continuously fine-tune on user data in real time. Models that learn your writing style, your professional jargon, your social patterns. These would be extraordinarily valuable for profiling purposes. They’d also be entirely local, technically privacy-preserving by current definitions.
The 2027 horizon includes “screen awareness,” where models can see and interact with any application running on your device. A profiling system with that capability wouldn’t need photo access or message access. It would simply observe everything you do. At that point, even Part 1’s advice about limiting app permissions becomes moot. A model that can observe your screen doesn’t need permission to access your data.
The infrastructure being built here is remarkable in its scope. It doesn’t require centralized data collection. It runs on the devices we carry everywhere. And it’s marketed as a privacy feature.
Part 1 concluded that “encryption alone isn’t enough.” I’d add to that now: the tools marketed as keeping your data private on-device are the same tools that make the deepest profiling of your E2EE data possible. That doesn’t make encryption unimportant. It remains essential. But the conversation about privacy needs to expand beyond “is my data encrypted?” to “what’s happening to my data on my own device, and who benefits from it?”
That’s the question worth sitting with.
This is Part 2 of a series. Part 1 covered how on-device ML models, using image classification and object detection, could profile users despite end-to-end encryption. This installment examines how foundation language models amplify those risks, particularly for E2EE data, the one category that was supposed to remain untouchable.
This article was written by me, a human. I used an LLM-powered grammar checker for final review.