There is a class of engineering problems that look trivial until real constraints show up. Multi-tenant IoT device filtering is one of them.
This post documents the full journey: approaches we considered, why each failed, and the architecture that holds up at scale. The dead ends matter as much as the answers.
## The setup
Picture a typical IoT SaaS stack. Data is split across databases, each chosen for what it does best:
| Database | What it stores |
|---|---|
| SQL (PostgreSQL / MySQL) | Users, devices, user_devices — relational ownership |
| MongoDB | Real-time readings — many parameters per second per device |
| ClickHouse | Aggregated analytics and historical time-series |
| Elasticsearch | Full-text search over device metadata |
A user logs in and sees a paginated table: one row per device, ~20 columns (temperature, voltage, pressure, status, firmware, and so on). They can filter on any column.
That sounds standard. Here is the catch:
- A single user may access 1–2 lakh devices (100k–200k).
- Device data is write-heavy — millions of new documents per hour.
- Ownership is dynamic — access is granted and revoked frequently.
- Filters can target any of those columns, living in MongoDB, ClickHouse, or Elastic.
That combination — scale, dynamic ownership, multi-database reads, arbitrary filters — is what makes the problem genuinely hard.
## Approach 1: The naive `$in` clause
The first instinct is natural: load the user’s device IDs from SQL, then filter MongoDB.
```js
// Step 1: load this user's device IDs from SQL
const rows = await sql.query(
  `SELECT device_id FROM user_devices WHERE user_id = ?`,
  [userId]
);
const deviceIds = rows.map((r) => r.device_id);

// Step 2: filter MongoDB with the full ID list
db.device_data.find({
  device_id: { $in: deviceIds },
  temperature: { $gt: 30 },
});
```
This works in development when a test user has a handful of devices. It struggles in the low thousands and collapses around 1 lakh.
### Why it breaks

- The `$in` array becomes a 100,000-element payload on every request.
- MongoDB still pays a heavy cost to resolve a huge `$in` against a large collection, even with an index on `device_id`.
- The same pattern hurts ClickHouse and Elasticsearch: every store sees the same giant list, on every page, filter change, and sort.
Verdict: Fine up to roughly 1,000 devices. Beyond that, treat it as a prototype pattern only.
## Approach 2: Store `user_ids` inside each document
If cross-DB hops are painful, why not embed ownership in the telemetry document?
```json
{
  "device_id": "abc123",
  "user_ids": ["user_1", "user_456", "user_789"],
  "temperature": 32,
  "voltage": 4.5,
  "timestamp": "2025-04-05T10:00:00Z"
}
```
Query becomes a single MongoDB operation:
```js
db.device_data.find({
  user_ids: "user_456",
  temperature: { $gt: 30 },
});
```
### Why it breaks

- Storage bloat: high-frequency telemetry means enormous document counts. Duplicating `user_ids` on every document multiplies cost brutally when devices are shared across many users.
- Update nightmare: granting access means patching historical documents, potentially tens of millions of updates for one grant.
- Revocation is worse: you must remove the user everywhere, and access is not truly revoked until every copy is consistent. That is a security and ops trap.
Verdict: Reasonable for write-once data with stable ownership. Avoid for high-velocity IoT with dynamic sharing.
## Approach 3: Redis cache + chunked parallel queries

Keep ownership out of each document, but split the giant `$in` into parallel chunks.
```js
// Split an array into fixed-size chunks
const chunkArray = (arr, n) =>
  Array.from({ length: Math.ceil(arr.length / n) }, (_, i) =>
    arr.slice(i * n, (i + 1) * n)
  );

const allDeviceIds = await redis.smembers(`user:${userId}:devices`);
const chunks = chunkArray(allDeviceIds, 500);

const results = await Promise.all(
  chunks.map((chunk) =>
    db.device_data.find({
      device_id: { $in: chunk },
      temperature: { $gt: 30 },
    })
  )
);

// Merge, re-sort, and paginate in application memory
const merged = results.flat().sort(/* ... */).slice(page * size, (page + 1) * size);
```
### Why it breaks
- Fan-out: 100k IDs at 500 per chunk ⇒ 200 concurrent queries per user interaction — connection pools and DB capacity suffer.
- Pagination is wrong in principle: True “global page 1” needs ordering across all matches; chunk results do not give you that without materializing everything.
- Cache sync: Redis must stay aligned with SQL on every assignment change.
Verdict: Better than one mega-`$in`, but still too expensive at this scale. Only consider it when device counts are in the low thousands and approximate pagination is acceptable.
## Approach 4: Separate mapping table in each database
Mirror user_devices inside MongoDB, ClickHouse, and Elasticsearch so each store can join locally.
MongoDB-style sketch:
```js
db.device_data.aggregate([
  {
    $lookup: {
      from: "user_device_mapping",
      localField: "device_id",
      foreignField: "device_id",
      pipeline: [{ $match: { user_id: "user_456" } }],
      as: "access",
    },
  },
  { $match: { access: { $ne: [] }, temperature: { $gt: 30 } } },
]);
```
ClickHouse sketch:
```sql
SELECT d.*
FROM device_data d
INNER JOIN user_device_mapping m ON d.device_id = m.device_id
WHERE m.user_id = 'user_456'
  AND d.temperature > 30;
```
What it fixes: No cross-DB round trip for ownership; filters compose after the join.
### Why it still breaks

- Join cost at scale: huge mapping sets × huge telemetry sets is expensive. MongoDB `$lookup` at lakh scale is especially painful.
- Sync × N: you now maintain truth in SQL and three more systems. Any lag means the wrong devices are shown.
- Trees break the model: real platforms often grant access at a node in a hierarchy ("everything under India"), not a flat precomputed device list that stays still while the fleet changes.
Verdict: Looks right in staging; production volume and tree-based ACL expose the limits.
## The tree problem: access control gets harder
So far we assumed users map to devices. In real IoT products, users map to nodes in a hierarchy — and inherit the whole subtree.
Example (names are not unique; only IDs are):
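A representative shape (the exact labels here are illustrative, not taken from a real deployment):

```
N1 (World)
├── N2 (India)
│   ├── N4 (Pune)
│   │   ├── D1
│   │   └── D2
│   └── N5 (Delhi)
│       ├── D3
│       └── D4
└── N3 (USA)
    └── N6 (Pune)   <- same label as N4, different ID
        └── D5
```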
Assign a user to N2 and they should see D1–D4, including future devices under that subtree. Moves and inserts change who should see what without a simple static device list.
The hard question: given node N2, how do you resolve all devices in the subtree efficiently when the tree has millions of nodes and structure changes multiple times per hour?
## Approach 5: Closure table
Store every ancestor → descendant pair. Subtree membership becomes one indexed lookup.
```sql
SELECT descendant_id
FROM closure_table
WHERE ancestor_id = 'N2'
  AND is_device = true;
```
### Why it breaks at this scale
- A subtree move can touch hundreds of thousands of pairs.
- Frequent structural changes create continuous heavy writes competing with reads.
- At millions of nodes, the closure table itself can become enormous.
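To make the move cost concrete, here is the classic closure-table subtree move, sketched against the `closure_table` above (`'N9'` as the new parent is a hypothetical ID; adapt names to your schema). Both statements touch on the order of ancestors-of-subtree × nodes-in-subtree rows:

```sql
-- 1. Unlink the N4 subtree from every ancestor outside the subtree
DELETE FROM closure_table
WHERE descendant_id IN (SELECT descendant_id FROM closure_table WHERE ancestor_id = 'N4')
  AND ancestor_id NOT IN (SELECT descendant_id FROM closure_table WHERE ancestor_id = 'N4');

-- 2. Relink: every ancestor of the new parent x every node in the subtree
INSERT INTO closure_table (ancestor_id, descendant_id)
SELECT super.ancestor_id, sub.descendant_id
FROM closure_table AS super
CROSS JOIN closure_table AS sub
WHERE super.descendant_id = 'N9'  -- hypothetical new parent
  AND sub.ancestor_id = 'N4';
```

Move a subtree of 100k nodes under a parent with 10 ancestors and each statement rewrites on the order of a million pairs.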
Verdict: Great when the tree is slow-changing. Costly when large subtrees move often. Keep parent/child as source of truth; closure is derived.
## Approach 6: Materialized path

Store a string path per node, e.g. `/N1/N2/N4/D1`. Subtree query:
```sql
SELECT id FROM nodes
WHERE path LIKE '/N1/N2/%'
  AND is_device = true;
```
Moves rewrite descendant paths with a string `REPLACE` on the old prefix.
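For instance, moving N4 (and everything under it) from N2 to N3 is a prefix rewrite, sketched against the `nodes` table above (run it in one transaction):

```sql
-- Rewrite the prefix for the moved node and all of its descendants
UPDATE nodes
SET path = REPLACE(path, '/N1/N2/N4', '/N1/N3/N4')
WHERE path LIKE '/N1/N2/N4/%'
   OR path = '/N1/N2/N4';
```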
Verdict: Simpler updates than closure. Prefix LIKE / anchored regex still degrades when the matching subtree is huge. Good medium scale; know the limits.
## Approach 7: Graph database (Neo4j)
Traversal is what graphs are for:
```cypher
MATCH (user:User { id: 'user_123' })
  -[:ASSIGNED_TO]->(n:Node)
  -[:ANCESTOR_OF*]->(d:Device)
RETURN d.id;
```
What it solves: Deep trees, frequent edge changes, ID-based disambiguation.
What it does not solve: You still get device IDs — and if that list is 1 lakh, downstream MongoDB / ClickHouse / Elastic still face the $in explosion unless you change the filtering strategy.
Verdict: Strong for access resolution. Not a substitute for a cross-store filtering design.
## The pattern that works: ancestor nodes at ingestion time
Shift work left — to write time. When a device emits a reading, the ingestion pipeline attaches full ancestry (ordered node IDs from root to device):
```json
{
  "device_id": "D1",
  "ancestor_nodes": [
    { "id": "N1", "name": "World" },
    { "id": "N2", "name": "India" },
    { "id": "N4", "name": "Pune" },
    { "id": "N11", "name": "Factory A" }
  ],
  "temperature": 32,
  "voltage": 4.5,
  "status": "active",
  "timestamp": "2025-04-05T10:00:00Z"
}
```
If the user is assigned to nodes N2 and N6, the query stays tiny:
```js
db.device_data.find({
  "ancestor_nodes.id": { $in: ["N2", "N6"] },
  temperature: { $gt: 30 },
});
```
Whether the user can reach 100 devices or 100,000, you still pass a handful of node IDs. No per-request tree walk. No giant device $in.
Why store id and name: Access control uses IDs. Search UIs use names — but ambiguous labels (“Pune”) must resolve through paths in the picker so the query uses N4 vs N6, not a raw string.
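The write-time step is tiny. A sketch in Node (`getAncestors` and the in-memory table stand in for the Redis cache warmed from the graph store; they are assumptions, not production code):

```javascript
// Attach a device's full ancestry to a reading before it is written.
async function enrichReading(reading, getAncestors) {
  const ancestors = await getAncestors(reading.device_id);
  return { ...reading, ancestor_nodes: ancestors };
}

// Hypothetical in-memory stand-in for the Redis-backed ancestry cache
const ancestry = {
  D1: [
    { id: "N1", name: "World" },
    { id: "N2", name: "India" },
    { id: "N4", name: "Pune" },
  ],
};
const getAncestors = async (deviceId) => ancestry[deviceId] ?? [];

const doc = await enrichReading(
  { device_id: "D1", temperature: 32 },
  getAncestors
);
// doc now carries ancestor_nodes; upsert it into device_latest
```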
Tree changes: Publish NODE_MOVED (etc.) to Kafka; consumers recompute ancestry for affected subtrees and bulk-update MongoDB / Elastic asynchronously. Expect a short staleness window (seconds to minutes). For many IoT dashboards, that is acceptable; for instant revocation security domains, you need stricter (and more expensive) designs.
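The repair consumer can be equally small. A hedged sketch (all three dependencies are injected stand-ins, not real client code):

```javascript
// On NODE_MOVED: recompute ancestry for every device under the moved node
// and bulk-patch the affected documents.
async function onNodeMoved(event, { listSubtreeDevices, resolveAncestors, mongo }) {
  const deviceIds = await listSubtreeDevices(event.node_id); // e.g. from Neo4j
  const ops = [];
  for (const deviceId of deviceIds) {
    const ancestors = await resolveAncestors(deviceId);
    ops.push({
      updateOne: {
        filter: { device_id: deviceId },
        update: { $set: { ancestor_nodes: ancestors } },
      },
    });
  }
  if (ops.length > 0) await mongo.device_latest.bulkWrite(ops);
  // Elastic gets an equivalent bulk update; affected caches are invalidated after.
}
```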
## The ClickHouse immutability problem
ClickHouse is built for append-only analytics. You do not want to “UPDATE every row” when ancestry changes.
### Practical split
| Store | Role |
|---|---|
| `device_latest` (MongoDB) | One document per device: current readings with `ancestor_nodes`, easy to update |
| `device_history` (ClickHouse) | Append-only telemetry without embedded ancestry; filter by `device_id` once the allowed set is known |
For analytics that must respect user→node assignments, a dictionary refreshed on a schedule can avoid mutating fact data:
```sql
-- Illustrative: adapt names, source, and layout to your schema
CREATE DICTIONARY user_nodes_dict (
    user_id String,
    node_ids Array(String)
)
PRIMARY KEY user_id
SOURCE(CLICKHOUSE(...))
LAYOUT(COMPLEX_KEY_HASHED())
LIFETIME(300);
```

```sql
-- Illustrative pattern: adapt to your schema
SELECT device_id, avg(temperature), max(voltage)
FROM device_history
WHERE timestamp > now() - INTERVAL 7 DAY
  /* plus an access predicate built from the dictionary and your model */
GROUP BY device_id;
```
## End-to-end picture
### Ingestion

- Device emits a reading
- Resolve `ancestor_nodes` (e.g. Redis warmed from Neo4j)
- Write `device_latest` (MongoDB) with ancestry
- Write `device_history` (ClickHouse) append-only
- Index Elastic with ancestry for search
### Query time

- Load the user's assigned node IDs (SQL / cache)
- Mongo / Elastic: `ancestor_nodes.id IN [those nodes]` + filters
- ClickHouse: dictionary (or resolved device set) + analytics query
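The Mongo / Elastic branch of the read path fits in a few lines. A sketch (the helper `getAssignedNodes` and the injected `mongo` handle are assumptions):

```javascript
// Read path: resolve assigned node IDs once, then issue one small query.
async function listDevices(userId, filters, { getAssignedNodes, mongo }) {
  const nodeIds = await getAssignedNodes(userId); // SQL or cache
  return mongo.device_latest.find({
    "ancestor_nodes.id": { $in: nodeIds }, // always a handful of IDs
    ...filters,
  });
}
```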
### Tree change
- Kafka event
- Neo4j graph update
- Async ancestry repair in Mongo + ES
- Invalidate affected caches
- Dictionary refresh on next cycle
## Tradeoffs summary
| Approach | Best for | Avoid when |
|---|---|---|
| Naive `$in` | Under ~1k devices, prototypes | Large device sets |
| `user_ids` on each doc | Stable ownership, low write rates | High-frequency telemetry + dynamic sharing |
| Redis chunking | Medium scale, flat ownership | Huge lists + correct deep pagination |
| Mapping table per DB | Simpler models, moderate data | Heavy joins + multi-way sync + tree ACL |
| Closure table | Slow-changing trees | Frequent large moves |
| Materialized path | Moderate scale, simplicity | Massive prefix match sets |
| Neo4j | Dynamic trees, graph truth | Used alone without filtering strategy |
| Ancestors at ingest | Production multi-DB + trees | Hard real-time revocation requirements |
## Honest conclusion
There is no free lunch. You trade write complexity for read simplicity, and you often accept eventual consistency at the edge.
The pattern that survives in the wild: push complexity to ingestion — pay once per document, query cheaply forever — and scope ClickHouse so immutability stays a feature, not a fight.
If you have shipped something like this, the failed approaches will look familiar. The goal is not a perfect abstraction; it is the least painful fit for your latency, security, and ops constraints.
## About the author
Amit Verma is a software engineer focused on distributed systems, event-driven backends, and data-intensive products — the kind of work where SQL, document stores, search, and columnar analytics have to agree just enough to ship.
- Website: realproblem.me
- Writing: Medium @amitech
- Code: GitHub @amit8889
- Problem solving: LeetCode
If this post saved you a week of dead ends, a share or follow goes a long way.