Version: Unreleased

Elasticsearch Sizing Guidelines

This document gives practical guidance for sizing Elasticsearch clusters (nodes, memory, CPU, storage, and shards) and points to the official Elasticsearch documentation for deeper reference. Use these guidelines as starting points — actual sizing must be validated with realistic load tests for your data, query patterns, and retention policies.

Key principles

  • Right-size heap: give Elasticsearch enough heap to operate, but avoid very large JVM heaps. Use native OS memory for filesystem cache and long-lived buffers.
  • Keep shards at reasonable sizes: many small shards add per-shard overhead, while very large shards are slow to recover and rebalance.
  • Separate concerns: consider hot/warm or cold tiers for different retention/IOPS needs.
  • Measure and iterate: baseline CPU, IOPS, and indexing/query throughput with representative workloads and tune from there.

Node sizing basics

  • RAM: start with 32–64GB of system RAM per data node in small to medium clusters; larger clusters commonly use 64–256GB per node depending on workload.
  • JVM heap: set JVM heap to 50% of physical RAM but no more than ~31GB (to keep compressed ordinary object pointers). Example: with 64GB RAM, set heap to 31GB (or 30g), leaving the rest for OS cache and native memory.
    • Rule: heap = min(physicalRAM/2, 31GB)
  • CPU: choose CPUs based on query and indexing load. Many cores help for heavy concurrent queries. A baseline data node often uses 4–16 vCPUs; scale up for high-throughput indexing or heavy aggregations.
  • Disk: prefer SSDs for data nodes. Choose capacity based on retention and expected indexing rate plus replicas and snapshots. Leave headroom—avoid filling disks above ~70–80% to allow shard reallocation.
  • Network: ensure low-latency, high-throughput network between nodes (1Gbps minimum, 10Gbps for larger clusters or heavy cross-node traffic).
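The heap rule above can be expressed as a small helper (a sketch; the ~31GB compressed-oops ceiling is approximate and the exact cutoff varies by JVM build):

```python
def recommended_heap_gb(physical_ram_gb: float) -> float:
    """Heap = min(physicalRAM / 2, 31GB), keeping compressed oops enabled."""
    COMPRESSED_OOPS_CEILING_GB = 31  # approximate; verify on your JVM
    return min(physical_ram_gb / 2, COMPRESSED_OOPS_CEILING_GB)

# A 64GB data node: 31GB heap, ~33GB left for OS filesystem cache
print(recommended_heap_gb(64))  # 31
print(recommended_heap_gb(32))  # 16.0
```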

Shard sizing and counts

  • Shard size: aim for shard sizes that balance manageability and performance. A common guideline is 20–50GB per shard for many workloads; adjust by your query patterns and recovery targets. Very large shards (>100GB) make recovery and relocation slow.
  • Number of shards per node: keep the total number of shards moderate. Thousands of shards per cluster may be fine, but avoid excessive shard counts per node; each shard carries memory and file-descriptor overhead. A common rule of thumb is to keep fewer than ~20 shards per GB of configured JVM heap on each node.
  • Primary vs replicas: replicas increase read throughput and provide redundancy. Plan replica count so cluster can tolerate node failures while still serving queries.
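As a rough planning aid, the 20–50GB shard guideline can be turned into a primary-shard estimate (a sketch; the 40GB default target is an assumption in the middle of that range):

```python
import math

def estimated_primary_shards(index_size_gb: float, target_shard_gb: float = 40) -> int:
    """Pick a primary shard count so each shard lands near the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))

# A 300GB index at a ~40GB/shard target -> 8 primaries of ~37.5GB each
print(estimated_primary_shards(300))  # 8
```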

Index design and lifecycle

  • Index-per-time-bucket: if using time-series data, create indices aligned to retention and query windows (daily, weekly) to make lifecycle management easier.
  • ILM (Index Lifecycle Management): use ILM to move indices between tiers (hot -> warm -> cold) and to delete or freeze old indices automatically.
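A minimal ILM policy along these lines might look like the following (a sketch in Kibana Dev Tools syntax; the `logs-policy` name, phase ages, and rollover size are illustrative, not recommendations):

```
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb", "max_age": "7d" }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": { "number_of_shards": 1 }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```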

Hot–Warm (and cold) architecture

  • Hot nodes: fast CPUs, more memory, and fast NVMe/SSDs — handle indexing and low-latency queries.
  • Warm nodes: less CPU, larger, cheaper storage — for querying older data with lower performance requirements.
  • Cold/Frozen: for long-term retention, reduce resource usage and accept slower queries.
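In recent Elasticsearch versions, tier membership is declared through node roles in elasticsearch.yml (a sketch; the exact role lists depend on your topology):

```yaml
# elasticsearch.yml on a hot node
node.roles: [ data_hot, data_content, ingest ]

# elasticsearch.yml on a warm node
node.roles: [ data_warm ]
```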

Storage performance and sizing

  • IOPS and throughput: base sizing on expected indexing throughput and query IO patterns. Bulk indexing may require sustained write throughput; heavy aggregations drive read IOPS.
  • Snapshots: allocate network and storage throughput for snapshot operations. Snapshots are incremental, but initial snapshots and long retention windows still consume significant bandwidth and repository storage.

Monitoring and metrics to watch

  • JVM heap usage and GC pause times
  • Node CPU and load
  • Disk usage and IOPS
  • Search rate, latency, and queue sizes
  • Indexing rate, refresh rate, and merge times
  • Cluster health and shard allocation status
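A simple check over these metrics can be sketched as follows (the input shape mirrors the JVM section of the `_nodes/stats` API; the 75% threshold is an assumed alert level, not an official one):

```python
def heap_alerts(nodes_stats: dict, threshold_pct: float = 75.0) -> list[str]:
    """Return a message for each node whose JVM heap usage exceeds the threshold."""
    alerts = []
    for node in nodes_stats["nodes"].values():
        used_pct = node["jvm"]["mem"]["heap_used_percent"]
        if used_pct > threshold_pct:
            alerts.append(f'{node["name"]}: heap at {used_pct}%')
    return alerts

# Trimmed-down example of a _nodes/stats response:
sample = {
    "nodes": {
        "abc": {"name": "data-1", "jvm": {"mem": {"heap_used_percent": 82}}},
        "def": {"name": "data-2", "jvm": {"mem": {"heap_used_percent": 41}}},
    }
}
print(heap_alerts(sample))  # ['data-1: heap at 82%']
```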

Example sizing workflows

  1. Small test cluster (development or low-volume):

    • 3 data nodes, each: 16–32GB RAM, heap 8–15GB, 4 vCPUs, 1–2 SSDs
    • replicas: 1
  2. Production moderate cluster (search + indexing):

    • 3–5 hot data nodes, each: 64GB RAM, heap 31GB, 8–16 vCPUs, NVMe SSDs
    • 3–5 warm nodes with larger, cheaper storage, 64–128GB RAM with smaller heap
    • use ILM to roll indices from hot -> warm

Example calculation (very simplified): expected daily ingestion = 100GB/day raw documents

  • Retention: 30 days -> 3TB raw
  • With replicas=1 (doubling the stored data) and ~20% overhead for indexing and merges -> 3TB × 2 × 1.2 ≈ 7.2TB usable storage required
  • Choose node count and disk sizes such that per-node disk usage stays below 70% with room for snapshots and rebalancing.
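The arithmetic above can be captured in a reusable sketch (the 20% overhead factor and 70% fill ceiling are the assumptions from this example):

```python
def required_cluster_disk_gb(daily_gb, retention_days, replicas=1,
                             overhead=0.20, max_disk_fill=0.70):
    """Translate ingest rate and retention into total disk to provision."""
    raw = daily_gb * retention_days                 # primary data only
    usable = raw * (1 + replicas) * (1 + overhead)  # replicas + merge/indexing overhead
    return usable / max_disk_fill                   # keep disks below the fill ceiling

# 100GB/day for 30 days -> 7.2TB usable, ~10.3TB provisioned across the cluster
print(round(required_cluster_disk_gb(100, 30)))  # 10286
```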

Operational guidance

  • Avoid swapping: configure the OS to avoid swapping Elasticsearch processes and reserve memory for the OS.
  • Set bootstrap.memory_lock: true (and grant the memlock ulimit) to lock the JVM heap in RAM where appropriate.
  • Use discovery.seed_hosts and proper discovery settings for cluster formation.
  • Regularly run queries that reflect production patterns during capacity planning and scaling tests.
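The settings mentioned above might look roughly like this in elasticsearch.yml (a sketch; hostnames are placeholders):

```yaml
bootstrap.memory_lock: true   # lock the JVM heap in RAM (requires the memlock ulimit)
discovery.seed_hosts: ["es-node-1", "es-node-2", "es-node-3"]
cluster.initial_master_nodes: ["es-node-1", "es-node-2", "es-node-3"]
```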
