Hardware Sizing Guide
Choose the right hardware for your GraphnAI deployment based on the number of identities in your environment.
Audience: IT administrators planning a self-hosted GraphnAI deployment.
Related: Quick Start: Self-Hosted, Identity Bridge, Bridge Configuration
Deployment Architecture
GraphnAI runs as two components:
| Component | What It Includes | Where It Runs |
|---|---|---|
| Server container | GraphnAI Platform (API + UI) and ArangoDB (graph database) in a single Docker Compose stack | A dedicated VM or server with Docker |
| Identity Bridge | Lightweight agent that collects Active Directory data | On-premises, one per network segment or AD forest |
The server container handles all graph storage, analysis, and UI serving. The Bridge initiates outbound connections to the server, so it does not require inbound firewall rules.
Server Container Requirements
The GraphnAI Platform and ArangoDB share resources within the same Docker host. ArangoDB is the primary consumer of RAM and storage during identity sync operations.
| Environment | Identities | Estimated Edges | RAM | CPU | Storage (SSD) |
|---|---|---|---|---|---|
| Small | Up to 10,000 | ~200K | 8 GB | 2 cores | 30 GB |
| Medium | 10,000 - 100,000 | 1 - 2M | 16 GB | 4 cores | 75 GB |
| Large | 100,000 - 500,000 | 6 - 10M | 32 GB | 8 cores | 150 GB |
| Enterprise | 500,000+ | 15 - 20M+ | 64 GB | 8+ cores | 300 GB+ |
Understanding the estimates
- Identities include users, groups, computers, service accounts, service principals, OUs, containers, and domain objects synced from Active Directory and Entra ID.
- Edges represent access relationships (MemberOf, ACL permissions, containment, etc.). Environments with complex ACLs (many delegated permissions, fine-grained security descriptors) will have higher edge counts. The ratio is typically 10-20 edges per identity.
- RAM is the most critical dimension. ArangoDB's write-ahead log (WAL), RocksDB block cache, and indexes all compete for memory during sync. Insufficient RAM causes sync failures.
- Storage must be SSD. ArangoDB uses RocksDB, which performs background compaction. Spinning disks cause severe performance degradation. Allow headroom of roughly 2x the live data size to accommodate compaction.
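The tier table and the 10-20 edges-per-identity ratio above can be turned into a quick lookup. The following is a minimal sketch, not an official GraphnAI tool: the `Tier` class and the `pick_tier` and `estimate_edges` helpers are hypothetical names, and treating each tier's upper bound as inclusive is an assumption, since the table ranges overlap at 10,000 and 500,000.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    max_identities: Optional[int]  # None means no upper bound
    ram_gb: int
    cpu_cores: int
    storage_gb: int


# Values copied from the Server Container Requirements table.
# Enterprise figures are minimums ("8+ cores", "300 GB+").
TIERS = [
    Tier("Small", 10_000, 8, 2, 30),
    Tier("Medium", 100_000, 16, 4, 75),
    Tier("Large", 500_000, 32, 8, 150),
    Tier("Enterprise", None, 64, 8, 300),
]


def pick_tier(identities: int) -> Tier:
    """Return the smallest tier whose identity range covers the count.

    Assumption: upper bounds are inclusive, so exactly 10,000 is Small.
    """
    if identities < 0:
        raise ValueError("identities must be non-negative")
    for tier in TIERS:
        if tier.max_identities is None or identities <= tier.max_identities:
            return tier
    raise AssertionError("unreachable: the last tier has no upper bound")


def estimate_edges(identities: int) -> Tuple[int, int]:
    """Return a (low, high) edge estimate using the typical 10-20 ratio."""
    return identities * 10, identities * 20


if __name__ == "__main__":
    count = 45_000
    tier = pick_tier(count)
    low, high = estimate_edges(count)
    print(f"{count} identities: {tier.name} tier, "
          f"{tier.ram_gb} GB RAM, {tier.cpu_cores} cores, "
          f"{tier.storage_gb} GB SSD, roughly {low}-{high} edges")
```

Environments with unusually complex ACLs may exceed the upper edge estimate, in which case sizing one tier up is the safer reading of the table.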
Resource usage pattern
GraphnAI has a bursty workload:
- Idle / query time (majority): Lightweight topology queries, dashboard reads, and analytics cache hits. Minimal resource usage.
- Sync bursts (periodic): Heavy write activity during scheduled identity syncs, covering edge building, criticality evaluation, and the analytics pipeline. This is when RAM and CPU spike.
For most deployments, sizing for the sync burst is what matters. Query-time performance is comfortable even on the Small tier.
Identity Bridge Requirements
Each Bridge instance is lightweight. CPU is the main factor — ACL parsing from Active Directory security descriptors is CPU-bound.
| Resource | Requirement |
|---|---|
| RAM | 2 - 4 GB |
| CPU | 2 - 4 cores (more cores = faster sync) |
| Storage | Minimal (stateless agent, logs only) |
| Network | Outbound HTTPS to server on port 8443 |
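Before installing a Bridge, it can help to confirm that the host can open an outbound connection to the server on port 8443. The sketch below is a generic TCP reachability check using Python's standard library, not a GraphnAI utility. The `can_reach_server` function and the hostname are hypothetical, and the check does not validate TLS or authenticate to the platform.

```python
import socket


def can_reach_server(host: str, port: int = 8443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only proves that the network path and firewall allow the
    connection; it does not perform a TLS handshake.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical hostname: replace with your GraphnAI server address.
    server = "graphnai.example.internal"
    status = "reachable" if can_reach_server(server) else "NOT reachable"
    print(f"{server}:8443 is {status}")
```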
How many Bridges?
Deploy one Bridge per network segment that contains Active Directory domain controllers:
| Scenario | Bridges Needed |
|---|---|
| Single AD forest, one site | 1 |
| Single AD forest, multiple isolated network segments | 1 per segment |
| Multiple AD forests | 1 per forest (minimum) |
| Entra ID only (no on-premises AD) | 0 — Entra sync runs server-side |
Bridges are managed entirely from the platform UI. See Bridge Configuration for deployment instructions.
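The scenario table reduces to a simple rule: count the isolated network segments that contain domain controllers in each forest, with at least one Bridge per forest. A minimal sketch of that arithmetic follows. The `bridges_needed` function and the forest names are hypothetical, and the result is the minimum implied by the table rather than a complete deployment plan.

```python
from typing import Mapping


def bridges_needed(segments_per_forest: Mapping[str, int]) -> int:
    """Return the minimum Bridge count for the given AD layout.

    segments_per_forest maps each AD forest to the number of isolated
    network segments in it that contain domain controllers. An empty
    mapping represents an Entra ID only environment, which needs none.
    """
    # Each forest needs at least one Bridge, even if reported as 0 segments.
    return sum(max(1, segments) for segments in segments_per_forest.values())


if __name__ == "__main__":
    print(bridges_needed({}))                        # Entra ID only
    print(bridges_needed({"corp.example": 1}))       # single forest, one site
    print(bridges_needed({"corp.example": 3}))       # one forest, three segments
    print(bridges_needed({"corp.example": 1, "lab.example": 2}))
```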
Cloud / SaaS Considerations
For cloud-hosted deployments, the bursty workload pattern makes auto-scaling attractive:
- Scale up before sync: If syncs are scheduled (e.g., nightly), increase the container's memory allocation before the sync window starts, then scale back down after completion.
- ArangoDB memory: RocksDB's block cache grows as data is accessed. Scaling up memory before the sync starts is more effective than scaling reactively mid-sync.
- Platform service is stateless: The GraphnAI Platform (API + UI) holds no state; all data lives in ArangoDB. This makes horizontal scaling straightforward for query workloads, though sync operations run single-threaded per domain.
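The "scale up before sync" approach can be expressed as a time-window check that a scheduler evaluates periodically. The sketch below is illustrative only. The function names, the 30-minute lead time, the 2-hour expected sync duration, and the 16 GB and 32 GB figures are all assumptions, and the actual resize call depends on your cloud platform. A production setup would ideally scale down on a sync-completed signal rather than a fixed duration.

```python
from datetime import datetime, time, timedelta


def in_burst_window(now: datetime, sync_start: time,
                    lead: timedelta, expected_duration: timedelta) -> bool:
    """Return True if 'now' falls between (sync start - lead) and
    (sync start + expected duration) for a daily sync."""
    # Check yesterday's, today's, and tomorrow's sync so that windows
    # crossing midnight are handled correctly.
    for day_offset in (-1, 0, 1):
        start = datetime.combine(now.date() + timedelta(days=day_offset),
                                 sync_start)
        if start - lead <= now < start + expected_duration:
            return True
    return False


def target_memory_gb(now: datetime, sync_start: time,
                     lead: timedelta = timedelta(minutes=30),
                     expected_duration: timedelta = timedelta(hours=2),
                     baseline_gb: int = 16, burst_gb: int = 32) -> int:
    """Return the memory allocation that should be in effect at 'now'."""
    if in_burst_window(now, sync_start, lead, expected_duration):
        return burst_gb
    return baseline_gb


if __name__ == "__main__":
    nightly_sync = time(1, 0)  # assumed 01:00 nightly sync
    print(target_memory_gb(datetime.now(), nightly_sync))
```

Scaling on a schedule rather than on observed memory pressure matches the point above: the larger allocation needs to be in place before the sync begins.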
Recommendations by Deployment Type
Small / Proof of Concept
A single VM with 8 GB RAM and 2 cores handles up to 10K identities comfortably. Suitable for evaluation, single-domain environments, or small organizations.
Medium / Multi-Domain
16 GB RAM with 4 cores supports most mid-size organizations. If you have multiple AD domains, plan for one Bridge per network segment. Syncs typically complete in a few minutes.
Large / Enterprise
32-64 GB RAM is recommended. At this scale, identity syncs produce millions of edges, and sync duration depends on ArangoDB write throughput. The platform uses adaptive throttling to prevent database overload; sufficient RAM ensures syncs complete without intervention.
Multi-Forest / Global
For organizations with 500K+ identities across multiple forests, contact your GraphnAI representative for architecture planning. Factors like cross-forest trust relationships, sync scheduling, and network topology affect the optimal deployment layout.
Support
Contact your GraphnAI account representative for sizing guidance specific to your environment.