We often assume that once a service claims to use end-to-end encryption (E2EE), our data is completely safe. But are we really protected if that same service uses on-device machine learning models to learn from our encrypted, private content?
With non-E2EE services, it’s a given: user data is routinely analyzed, mined, profiled, and monetized—either directly through targeted ads or indirectly via product optimization. We’ve come to expect this trade-off, for better or for worse.
But E2EE is supposed to be different—a cryptographic promise of privacy. And yet there may be a gray area in the use of on-device ML that isn’t talked about enough, one where the line between privacy and profiling quietly blurs.
In this article, I’ll show you how your “encrypted” messages and private photos can still be used to build detailed profiles about you—without ever leaving your device unencrypted. Through real experiments on my own phone, I’ll demonstrate just how quickly and efficiently this invisible profiling can happen, even with end-to-end encryption protecting your data.
TL;DR
Your messages are encrypted—but does that mean nobody’s watching? Apps leveraging on-device machine learning can infer detailed insights about you without directly accessing your private content. Encryption alone isn’t enough; we need awareness, regulation, active management of app permissions, and regular audits to truly protect your privacy.
Understanding E2EE
Let’s set the stage.
Broadly, most modern consumer-facing digital products or services rely on two components:
- A client-side app running on your device
- A server-side system handling coordination and storage
To protect users, these services typically encrypt data:
- In transit (when it moves between device and server)
- At rest (when it’s stored on a server)
These are foundational protections—good, but not foolproof. They prevent malicious third parties from intercepting your data, but they don’t stop the service itself from reading it.
That’s where end-to-end encryption (E2EE) steps in. With E2EE, only the users can decrypt the data—not even the service provider has access. Messaging apps like WhatsApp, Signal, iMessage, and other services like Anytype use this approach.
With E2EE, the decryption keys remain on the user’s device and are shared only with the people who need access to the content, via a key-exchange workflow. Some implementations even generate keys on the fly on users’ devices, ensuring that the provider’s servers can never access the keys or the content.
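To make the key-exchange idea concrete, here is a minimal sketch using Apple’s CryptoKit. It is illustrative only; real messengers such as Signal layer authentication and key ratcheting on top of this basic agreement.

```swift
import CryptoKit
import Foundation

// A minimal sketch of the key-exchange idea: each party derives the same
// symmetric key from its own private key and the other party's public key,
// so the server only ever relays public keys and ciphertext.
func demoKeyAgreement() throws {
    // Private keys are generated on-device and never leave it.
    let alicePrivate = Curve25519.KeyAgreement.PrivateKey()
    let bobPrivate = Curve25519.KeyAgreement.PrivateKey()

    // Only the public halves are exchanged via the server; both sides then
    // independently derive the same shared secret.
    let aliceSecret = try alicePrivate.sharedSecretFromKeyAgreement(with: bobPrivate.publicKey)
    let bobSecret = try bobPrivate.sharedSecretFromKeyAgreement(with: alicePrivate.publicKey)

    // Turn the shared secret into a symmetric key and encrypt a message.
    let salt = Data("demo-salt".utf8)
    let aliceKey = aliceSecret.hkdfDerivedSymmetricKey(
        using: SHA256.self, salt: salt, sharedInfo: Data(), outputByteCount: 32)
    let sealed = try AES.GCM.seal(Data("hello".utf8), using: aliceKey)

    // Bob derives the same key and can decrypt; the server sees only ciphertext.
    let bobKey = bobSecret.hkdfDerivedSymmetricKey(
        using: SHA256.self, salt: salt, sharedInfo: Data(), outputByteCount: 32)
    let plaintext = try AES.GCM.open(sealed, using: bobKey)
    print(String(decoding: plaintext, as: UTF8.self)) // "hello"
}
```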
E2EE has gained traction precisely because users are now more conscious of privacy than ever. It gives the impression—and in many ways, the reality—of secure communication.
But that’s only part of the story.
Why Do Companies Want Your Encrypted Data?
So why would companies that claim to respect your privacy still want to understand what’s inside your encrypted messages?
Because,
Knowledge is profit. Data is the new gold. Every click is a clue. The more they know, the more they grow. Inference is influence. Privacy is a feature, profiling is the foundation. They don’t need to read your mind—just your metadata. Consent is often implied, not informed (unfortunately). Patterns are more powerful than passwords.
Do we need more reasons?
Over the past two decades, tech companies have realized that knowing more about their users means they can:
- Personalize features
- Increase engagement
- Sell better-targeted ads
- Improve retention
- Unlock more value per user
This is especially critical for ad-driven businesses like Meta and Google. They don’t necessarily need to directly read your messages or listen to your conversations—there are easier, more scalable ways to understand your intent and mood.
That’s possibly where on-device ML could come in.
Understanding On-Device Machine Learning
Machine learning has traditionally been done in the cloud, on powerful server-class machines. But around 2016–2017, Apple and Google added frameworks to run ML models locally—on your phone—for both the OS and third-party apps. This shift enabled intelligent OS features like:
- Face recognition in photos
- Dictation
- Scene understanding
- Language translation
- Sentiment detection in text
- Object detection in images
- Scene description in photos

All of this happens right on your device, without their clouds gaining access to your photos, messages, etc.
It was a win for privacy: less personal data had to leave your device. Additionally, companies began shipping foundational models for image, text, and voice directly within their operating systems, so third-party apps no longer had to bundle entire models. Image models alone can range from a few hundred megabytes to tens of gigabytes; with foundational models pre-installed, apps only need to ship transfer-learning weights—often just tens to hundreds of kilobytes—a negligible addition to an app’s size in exchange for smart features.
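To give a sense of how little an app needs to ship, here is a minimal sketch using Vision’s built-in image classifier, which requires no bundled model at all. The confidence threshold is an arbitrary choice for illustration.

```swift
import Vision
import UIKit

// A minimal sketch of leaning on an OS-shipped model: Vision's built-in image
// classifier ships with the OS, so the app bundles no weights of its own.
func classifyWithBuiltInModel(_ image: UIImage) throws -> [String] {
    guard let cgImage = image.cgImage else { return [] }
    let request = VNClassifyImageRequest()
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])

    // Keep only reasonably confident labels (threshold chosen for illustration).
    return (request.results ?? [])
        .filter { $0.confidence > 0.3 }
        .map { $0.identifier }
}
```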
But there’s a catch.
I posit that with on-device ML, companies and their apps no longer need to collect your raw data, store, and analyze it on their servers to learn about you. Instead, they can deploy small, specialized models directly to your device. These models can analyze your text messages, photos, videos, voice notes, or even screenshots—right there on your phone—and send back metadata, inferences, or topics of interest about the user.
In a traditional, non-E2EE app, this changes little; there’s no more threat to the privacy of your data than before, since the company can already access your data on its servers. But in the case of E2EE apps and services, this opens a gray area.
To understand this better, imagine the following two modes of surveillance:
- A central system that collects all the data and processes everything
- Distributed on-device models in the field (users’ devices) that observe, summarize, and report back
On-device ML operates in the second mode. It’s decentralized, efficient, and technically doesn’t violate E2EE—because the data never leaves your device in raw form.
But the insights might, as “innocent” metadata.
On-Device ML: A Trojan Horse for Privacy?
Even with end-to-end encryption in place, apps can still learn a surprising amount—without ever sending your raw messages or photos off the device. On-device ML models can process this data locally in real time and report back structured inferences rather than actual content.
Here are a few illustrations of how that might work.
Please note that I am not claiming that any of the following are happening in these apps today, but the technical ability to do this exists.
1. The Breakup Chat
Two friends exchange over 200 WhatsApp messages late at night. One is going through a tough breakup, expressing frustration, sadness, and vulnerability.
An on-device NLP model bundled with the messaging app could analyze the emotional tone, frequency, and timing of messages (a minimal sketch of such sentiment scoring follows this example) and label the session with:
relationship
breakup
sadness
anxiety
anger
nighttime usage
The raw messages never leave the device, but the app could use these labels to serve:
- Dating app promotions
- Online therapy suggestions
- Comfort-shopping ads if the user has a history of retail behavior when stressed
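Here is a minimal sketch of what such sentiment scoring could look like, using Apple’s NaturalLanguage framework. The tag names and thresholds are purely illustrative, not any app’s actual logic.

```swift
import Foundation
import NaturalLanguage

// A minimal sketch: score each message's sentiment on-device with NLTagger
// and reduce a whole session to a handful of coarse, illustrative tags.
func sessionTags(for messages: [String]) -> [String] {
    let tagger = NLTagger(tagSchemes: [.sentimentScore])
    var scores: [Double] = []

    for message in messages {
        tagger.string = message
        let (tag, _) = tagger.tag(at: message.startIndex,
                                  unit: .paragraph,
                                  scheme: .sentimentScore)
        if let value = tag.flatMap({ Double($0.rawValue) }) {
            scores.append(value) // -1.0 (negative) ... 1.0 (positive)
        }
    }

    let average = scores.isEmpty ? 0 : scores.reduce(0, +) / Double(scores.count)
    var tags: [String] = []
    if average < -0.3 { tags.append("sadness") }                 // illustrative threshold
    if messages.count > 100 { tags.append("high-volume session") }
    if Calendar.current.component(.hour, from: Date()) >= 22 { tags.append("nighttime usage") }
    return tags
}
```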
2. The Sneaker Search
Two people share a string of Instagram DMs, including screenshots and comments about sneakers:
“Clean lines, under $200”
“Onitsuka or Adidas?”
screenshot of product page
A local multi-modal model combines visual and textual inference to generate:
shopping intent
topic: shopping
outfit: sneakers
brand preference: Onitsuka
color: white
budget: <$200
Without uploading a single photo or message, the user begins seeing:
- Targeted sneaker ads
- Local store deals
- Discounts on footwear platforms
3. The Travel Planner
Over the course of a few days, a user takes screenshots of TripAdvisor pages, Google Maps listings, and hotel bookings.
An image-text ML model running on-device identifies patterns such as:
travel intent
destination: Amsterdam
dates: early July
trip type: solo
interest: museums
The raw screenshots stay local, but this meta-profile could trigger:
- Airbnb suggestions
- Museum exhibit ads
- Currency exchange tools
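Screenshots are a particularly rich source. Below is a hedged sketch of this pathway, using Vision’s built-in on-device OCR, with toy keyword matching standing in for a real intent model.

```swift
import Vision
import UIKit

// A minimal sketch: pull text out of a screenshot on-device, then reduce it to
// short signal strings. The keyword checks are toy stand-ins for a real model.
func travelSignals(in screenshot: UIImage) throws -> [String] {
    guard let cgImage = screenshot.cgImage else { return [] }
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .fast
    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])

    let text = (request.results ?? [])
        .compactMap { $0.topCandidates(1).first?.string }
        .joined(separator: " ")
        .lowercased()

    var signals: [String] = []
    if text.contains("hotel") || text.contains("booking") { signals.append("travel intent") }
    if text.contains("amsterdam") { signals.append("destination: Amsterdam") }
    if text.contains("museum") { signals.append("interest: museums") }
    return signals // only these short strings, not the screenshot, would need to leave the device
}
```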
4. The Productivity Spiral
A user writes reflective journal entries in a private notes app and screenshots their tightly packed calendar and Pomodoro timer app.
The model infers:
burnout risk
work stress
attempting focus
low energy
midweek spikes
This might surface:
- Focus-enhancing supplements
- Productivity apps
- Digital detox content
5. The Private Photo Dump
A user never shares their photos—but the gallery app organizes them using on-device vision models.
It detects scenes like:
baby
birthday
pet illness
hospital
elderly care
religious event
Even without uploading a single image, the service can infer:
- Parenting status
- Pet ownership and care patterns
- Recent life events or transitions
- Travel preferences
- Fashion style & preferences
What Data Can Apps Actually Access and Infer?
For apps like WhatsApp, Instagram (Messenger), Facebook (Messenger), Telegram, and others, the primary data at risk, despite E2EE, includes messages and photos shared within these apps. Although end-to-end encryption protects your content in storage and transit, apps must decrypt this data locally on your device to display it, giving them an opportunity to use on-device machine learning models to infer insights directly from your private data.
Beyond encrypted messaging, apps also gain access to unencrypted data stored on your device—such as your photo library, contacts, call logs, location data, and more. Even with OS-level encryption (such as Apple’s iCloud or Android’s Google services), this local data access opens additional pathways for profiling. Companies often augment these insights by aggregating multiple data sources like online browsing behaviors, purchased data from brokers, and offline interactions from loyalty programs or store memberships, building detailed user profiles.
On-device inference enriches these profiles significantly, providing context and depth in near real-time—insights which might otherwise be challenging or slow to obtain. Thus, even though E2EE protects content explicitly shared, it does not necessarily safeguard your broader privacy from inference-based profiling. The combination of direct device access and powerful inference capabilities makes this approach exceptionally effective, sometimes enabling companies to predict user behaviors even better than users themselves can recognize.
Real-World Experiments: How Easy Is Profiling?
To demonstrate how feasible this profiling actually is, I conducted some experiments to show how easy, time-efficient, and resource-efficient it is to run these on-device models within apps that have access to your data.
Experiment 1: Analyzing Photos with Object Detection (No Transfer Learning)
To test the practical impact of on-device ML, I installed a lightweight object detection model in a simple test app on my phone. After granting the app access to my Photos library, the model swiftly analyzed thousands of photos in the background. Check parameters and results below.
Test Conditions:
Parameter | Details |
---|---|
Device | iPhone 14 Pro Max |
OS | iOS 18.5 |
Test App | Built with Swift 5 and SwiftUI |
ML Stack | CoreML (internally uses the Neural Engine and GPU for efficient, fast inference) |
Model Type | Object detection in image (Neural Network → Non Maximum Suppression) |
Model Name | YOLOv3TinyFP16 |
Model Size | 17.7 MB |
Model Link | YOLO on Apple Developer Site |
Recognized Objects Count | 80 object classes (Hugging Face spec) |
I ran 3 batches, each with 1000 random photos from my library. The goal was to understand how long it would take an app using an image object-detection model to go through a significant number of photos on one’s device.
Note that the photos on my iPhone live in iCloud with Optimize Storage enabled, meaning only lower-resolution copies live on the phone. These are, in most cases, sufficient for object-detection models. This one takes a 416x416 input, so no network downloads are needed to run it, making the analysis faster and more inconspicuous.
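For context, below is a rough sketch of the kind of harness this experiment used. It is not my exact test-app code, and it assumes the app already holds Photos permission and has a compiled Core ML detector wrapped as a `VNCoreMLModel`.

```swift
import Photos
import Vision
import UIKit

// A rough sketch of the experiment's harness: fetch photos with PhotoKit, hand
// each thumbnail to a Core ML object-detection model via Vision, and tally the
// detected labels. Assumes Photos access has already been granted.
func detectObjects(inFirst count: Int, using model: VNCoreMLModel) {
    let assets = PHAsset.fetchAssets(with: .image, options: nil)
    let manager = PHImageManager.default()
    let options = PHImageRequestOptions()
    options.isSynchronous = true
    options.deliveryMode = .fastFormat            // the local, lower-res copy is enough

    var labelCounts: [String: Int] = [:]
    let start = Date()

    assets.enumerateObjects { asset, index, stop in
        if index >= count { stop.pointee = true; return }
        manager.requestImage(for: asset,
                             targetSize: CGSize(width: 416, height: 416), // model input size
                             contentMode: .aspectFill,
                             options: options) { image, _ in
            guard let cgImage = image?.cgImage else { return }
            let request = VNCoreMLRequest(model: model)
            try? VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
            for case let observation as VNRecognizedObjectObservation in request.results ?? [] {
                if let top = observation.labels.first {
                    labelCounts[top.identifier, default: 0] += 1
                }
            }
        }
    }

    print("Analyzed \(min(count, assets.count)) photos in \(Date().timeIntervalSince(start))s")
    print(labelCounts.sorted { $0.value > $1.value })
}
```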
Batch 1
Metric | Value |
---|---|
Total images | 1000 |
Total time taken | 5.695s |
Average time per photo | 0.0057s (5.7ms) |
Median time per photo | 0.0056s (5.6ms) |
Minimum time per photo | 0.0051s (5.1ms) |
Maximum time per photo | 0.0293s (29.3ms) |
Standard deviation (1000 data points) | 0.0008s (0.8ms) |
Batch 2
Metric | Value |
---|---|
Total images | 1000 |
Total time taken | 5.5712s |
Average time per photo | 0.0056s (5.6ms) |
Median time per photo | 0.0055s (5.5ms) |
Minimum time per photo | 0.0051s (5.1ms) |
Maximum time per photo | 0.0416s (41.6ms) |
Standard deviation (1000 data points) | 0.0012s (1.2ms) |
Batch 3
Metric | Value |
---|---|
Total images | 1000 |
Total time taken | 5.5592s |
Average time per photo | 0.0056s (5.6ms) |
Median time per photo | 0.0054s (5.4ms) |
Minimum time per photo | 0.0051s (5.1ms) |
Maximum time per photo | 0.0188s (18.8ms) |
Standard deviation (1000 data points) | 0.0005s (0.5ms) |
Let’s try to understand these numbers. The model ran in the background, without blocking the interface or user interactions. On average, the model took a little bit over 5.5 seconds to identify objects in 1000 photos. That is ~5.6 milliseconds per photo, which is very quick! What does that mean?
I have ~96,000 photos in my library. The model would take about 528 seconds, or ~9 minutes to analyze my entire library if it ran continuously. But an app that needs to do this would perhaps sample images in smaller batches, possibly grouped by chronological periods, and run it over multiple uses of the app to not consume too much battery or heat up your device. Imagine how often you use WhatsApp and Instagram; if the app spreads these analyses over a few different app usage sessions, there’s practically no way to tell. For a typical user, these apps can analyze the entire library in just a few days without you noticing anything. In my case, it just needs 9 minutes of app use.
Okay, we understand that these models are quick and can virtually hide the analyses. Let’s see what kind of data it can learn from our photos.
In my case, the model more or less identified the following objects across a random sample of 3000 photos of mine.
person, bicycle, car, motorcycle, airplane, bus, train, bench, bird, elephant, backpack, suitcase, surfboard, wine glass, fork, knife, spoon, banana, orange, pizza, cake, laptop, …
(I’ve omitted some objects, added a few decoys, and left out the per-object counts so as not to publicize my own behavior.)
These tags alone are more than sufficient to profile my lifestyle, dietary habits, favorite activities, relationships, likely places visited, and interests.
This is how easy it is to profile a user despite E2EE, since these models run locally, the one place where this data is available in decrypted form.
Experiment 2: Classifying Images Using Transfer Learning (Create ML)
At my firm, Labyrinth, we built Cacti a few years ago to find nudes in the user’s photo library and help them move such sensitive photos into an encrypted vault. Due to the sensitive nature of the use case, and because we wanted to evaluate the capabilities of on-device ML, we trained our nudity-detection model from scratch with Create ML and shipped it as an on-device-only model. All processing happened on the device, and users’ photos never left their devices.
Let’s take a look at how efficiently this model worked on-device.
Test Conditions:
Parameter | Details |
---|---|
Device | iPhone 14 Pro Max |
OS | iOS 18.5 |
App | Cacti |
ML Stack | CreateML + CoreML (internally uses the Neural Engine and GPU for efficient, fast inference) |
Model Type | Image Classifier |
Model Size | 17 KB |
Number of Classes | 3 |
Model Training Dataset Size | ~15M images |
Model Training Info | Trained using CreateML on a Blackmagic eGPU attached to a Mac. |
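For context, training such a classifier with Create ML takes only a few lines on a Mac. This is a hedged sketch, not Cacti’s actual pipeline; the paths and class folders are illustrative.

```swift
import CreateML
import Foundation

// A minimal Create ML sketch of the training side (run on a Mac), not the
// production pipeline. The exported model carries only the small transfer-
// learned classifier head, which is why it ends up in the kilobyte range.
do {
    // One sub-folder per class, e.g. training/safe, training/sensitive, training/unsure
    let trainingDir = URL(fileURLWithPath: "/path/to/training")
    let classifier = try MLImageClassifier(
        trainingData: .labeledDirectories(at: trainingDir))

    print("Training error:", classifier.trainingMetrics.classificationError)

    // Export for bundling into the iOS app, where Core ML runs it on-device.
    try classifier.write(to: URL(fileURLWithPath: "/path/to/Classifier.mlmodel"))
} catch {
    print("Training failed:", error)
}
```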
I ran 3 batches of 1000 images each. The goal was to show how quickly an on-device model can identify the main theme of each image it sees (i.e., classify it into one of the classes).
As in the previous experiment, the photos on my iPhone live in iCloud with Optimize Storage enabled, so only lower-resolution copies live on the phone. These are, in most cases, sufficient for image-classification models. This model takes a 299x299 input, so no network downloads are needed to run it, making the analysis faster and more inconspicuous.
Results
For brevity, I have combined the results from the 3 batches in one table.
Metric | Value |
---|---|
Total images | 3000 |
Total time taken | 12.3867s |
Average time per photo | 0.0041s (4.1ms) |
Median time per photo | 0.0039s (3.9ms) |
Minimum time per photo | 0.0037s (3.7ms) |
Maximum time per photo | 0.0123s (12.3ms) |
Once again, it is clear how quickly these models are able to classify images and infer from them.
Key Insights from the Experiments
I ran these models in batches of 1000 images at a time. The results above reflect the phone in ideal, fully performant conditions. When these models run continuously over many more images, the phone tends to heat up and performance drops.
The heat-up isn’t an issue if apps that use on-device ML to learn about their users do it smartly. They may employ measures like the following (a sketch of such a throttled, hierarchical approach follows this list) to prevent you from ever noticing anything intensive happening:
- Run ML inferences on a small subset of data on each app launch session.
- Run ML inferences on small batches of data and space out successive batches so the device has enough time to cool down.
- Sample a fixed number of photos from each time-based cluster. Photos clustered by timestamp usually correspond to a specific activity: all photos from a dinner fall in one cluster, all photos from a day’s hike in another. The app only needs a few photos from each cluster to infer that you were out eating a specific cuisine on Sunday, or hiking in Yosemite last Saturday.
- Use a hierarchical inference approach with multiple models. Imagine one model that broadly categorizes your images as ‘travel’, ‘food’, ‘sport’, ‘pets’, etc.
- When an image is classified as travel, they may then use another model to detect travel-specific objects, or one that identifies places in your photos, to learn more specifically where you traveled and what you did there.
- With food-classified images, they may use another model to identify individual dishes, cuisines, or restaurant interiors. Each image’s inference then follows a pathway through different models, keeping the work less resource-intensive.
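Here is a hedged sketch of what such throttled, hierarchical inference could look like. The model calls are stubbed out as plain functions, since the scheduling is the point; names and the per-session budget are illustrative.

```swift
import Foundation

// A sketch of inconspicuous inference: analyze a small sample per session,
// persist a cursor, and route images through a cheap "coarse" model before
// any heavier specialized model. Model calls are stubbed out as closures.
struct InferenceThrottler {
    let perSessionBudget = 200                          // photos per app launch (illustrative)
    let defaults = UserDefaults.standard

    func runSession(allPhotoIDs: [String],
                    coarseClassify: (String) -> String,
                    fineGrained: [String: (String) -> [String]]) -> [String] {
        // Resume where the previous session stopped.
        var cursor = defaults.integer(forKey: "inferenceCursor")
        var inferences: [String] = []

        let end = min(cursor + perSessionBudget, allPhotoIDs.count)
        while cursor < end {
            let photoID = allPhotoIDs[cursor]
            let category = coarseClassify(photoID)       // e.g. "travel", "food"
            if let specializedModel = fineGrained[category] {
                inferences += specializedModel(photoID)  // e.g. ["destination: Amsterdam"]
            }
            cursor += 1
        }
        defaults.set(cursor, forKey: "inferenceCursor")
        return inferences                                // compact metadata, not pixels
    }
}
```

Persisting the cursor across sessions is what lets the work disappear into ordinary app usage instead of showing up as one long, hot, battery-draining burst.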
In short, the technical ability clearly exists to learn about a user from their E2EE data, or from any on-device data an app can access, without transmitting that data to the app’s servers and without indicating to the user that anything heavy is going on in the background of the apps they use.
Ethical and Legal Gray Areas
Whether companies are doing this is hard to confirm. Privacy policies are vague, and the fines for violating them are often trivial for billion-dollar firms. There’s also the question of legality: do inferences count as personal data under laws like GDPR? Should users be informed about what models are running on their devices, what inferences they are making, and how these inferences are being used?
These are murky waters. And as machine learning becomes more embedded into our devices, the ethical questions only grow.
If any of this is actually happening, and these companies are using on-device ML to learn about a user, they are likely very careful about how they use what they’ve learned. They may avoid triggering content and ads that directly give away what they know. Instead, they may use this data to build a shadow profile and wait for another plausible trigger to justify why an ad or piece of content was served, should they ever face a legal battle. Since ML models are black boxes and multiple factors are involved, they can set things up with enough plausible deniability.
Future Risks: Could This Get Worse?
In a nutshell, yes.
More Powerful Devices
As devices get more powerful, they can run larger and smarter neural networks on-device, which widens the opportunity to learn about you.
Increasing Data Availability
Historically, ever more of our data lives on, or is accessible from, our personal devices. Users want everything at their fingertips, and tech companies oblige. That, in turn, means there’s more user data for these companies to learn from.
Small Language Models (SLMs)
Small language models are generative models built with far fewer parameters than Large Language Models (LLMs), making them efficient to run on devices with less memory, compute bandwidth, and storage. This opens an opportunity to use them to contextualize text messages (e.g., in WhatsApp) with much more nuance than basic NLP text-classification models.
When LLMs Go Local
The imminent shift of Large Language Models (LLMs) onto personal devices dramatically amplifies privacy risks posed by on-device ML. LLMs are exceptionally capable at understanding nuanced language, detecting emotional subtleties, and maintaining extensive conversational context. These models don’t just read content—they interpret it deeply, gaining insights into users’ thoughts, preferences, emotional states, and future intentions.
For example, an LLM could analyze ongoing private conversations in real-time, identify stress, excitement, purchasing intent, health concerns, or interpersonal relationships, and generate detailed, actionable metadata without ever transmitting a single plaintext message off the device. This heightened capability to parse language and context makes LLMs a formidable tool for profiling, capable of far deeper and subtler inferences than SLMs or traditional ML models.
As LLMs become standard on smartphones, the level of profiling achievable via encrypted data could surpass what has previously been possible, solidifying the idea that encryption alone is insufficient for privacy.
Okay, What Can One Do About This?
Awareness is the first step, but practical measures are essential. Here’s how one can limit exposure:
Identify Your Privacy Trust Zones:
- Realm 1: Trusting Your Device and OS:
Recognize that Android and iOS inherently support on-device ML. Limit what data you store locally, and consider keeping sensitive data off-device or encrypting it independently. Also pick the OS you trust more not to profile you; there’s no right answer here, as our choices are practically limited to these two options (with some hobbyist exceptions).
- Realm 2: Evaluating Third-Party Apps and Services:
- Differentiate clearly between apps offering E2EE and those that don’t.
- Assume non-E2EE apps actively access and profile any accessible data.
- For E2EE apps, remember they might still use on-device ML for inference, despite the encryption.
- Have a method and threshold for trusting these apps and services.
You need to define your level of trust and exposure differently for each of these realms.
Actionable Privacy Controls:
- Use a Web App Over the Native App: Where possible, if the service offers a web app, use that instead of the native app installed on your device. This limits the app’s access to private data on the device and its ability to use full device resources to run on-device models. It’s not foolproof, but it reduces exposure.
- Limit App Permissions: Follow a “least privilege” principle and grant access only to essential subsets of your data. iOS, for instance, allows granular access, such as limiting an app to specific photos or contacts, which users should actively use. There may be similar controls on Android.
- Run Regular Permission Audits: Regularly review and revoke unnecessary app permissions. If you use granular Photos/Contacts permissions, update those selections regularly.
- Limit Background Activity: Disable background refresh for apps where possible, reducing the likelihood of continuous inference.
- Favor Open Source: Prefer open-source apps that transparently disclose their data-use practices. Using open source exclusively isn’t always practical; do it where possible.
- Study the App’s Business Model: Understand the business models of the apps you use. If it’s free, you are most likely the product. Choose apps and services known to stand for data privacy.
Ultimately, privacy requires constant vigilance—understanding not just who can access your data, but also who can infer from it, even when encrypted.
Conclusion
Encryption gives us peace of mind. It assures us mathematically and cryptographically that nobody is reading our messages or peeking at our photos.
But what if there are models running silently on our phones within the apps we use? They might be drawing conclusions anyway, and may be working in the interests of these large firms.
This isn’t conspiracy—it’s capability. The infrastructure is already here. The opportunity is clear. The incentives are powerful. Yet users remain largely unaware of what’s inferred from their private data, even within the most “secure” systems.
For true privacy, encryption alone isn’t enough. We also need transparency and clear visibility into what our devices and apps are really doing behind those polished interfaces, and better regulations to offer transparency in this regard.
The text was written by an old-school, all-natural human (me) who typed this out. I used an LLM-powered grammar checker for final fixes. Only the cover image was generated using an AI model.