Version: Unreleased

Elasticsearch Sizing Guidelines

This document gives practical guidance for sizing Elasticsearch clusters (nodes, memory, CPU, storage, and shards) and points to the official Elasticsearch documentation for deeper reference. Use these guidelines as starting points — actual sizing must be validated with realistic load tests for your data, query patterns, and retention policies.

Key principles

  • Right-size heap: give Elasticsearch enough heap to operate, but avoid very large JVM heaps. Use native OS memory for filesystem cache and long-lived buffers.
  • Keep shards at reasonable sizes: many small shards add per-shard overhead, while very large shards are slow to recover and rebalance.
  • Separate concerns: consider hot/warm or cold tiers for different retention/IOPS needs.
  • Measure and iterate: baseline CPU, IOPS, and indexing/query throughput with representative workloads and tune from there.

Node sizing basics

  • RAM: start with 32–64GB of system RAM per data node in small to medium clusters; larger clusters commonly use 64–256GB per node depending on workload.
  • JVM heap: set JVM heap to 50% of physical RAM but no more than ~31GB (to keep compressed ordinary object pointers). Example: with 64GB RAM, set heap to 31GB (or 30g), leaving the rest for OS cache and native memory.
    • Rule: heap = min(physicalRAM/2, 31GB)
  • CPU: choose CPUs based on query and indexing load. Many cores help for heavy concurrent queries. A baseline data node often uses 4–16 vCPUs; scale up for high-throughput indexing or heavy aggregations.
  • Disk: prefer SSDs for data nodes. Choose capacity based on retention and expected indexing rate plus replicas and snapshots. Leave headroom—avoid filling disks above ~70–80% to allow shard reallocation.
  • Network: ensure low-latency, high-throughput network between nodes (1Gbps minimum, 10Gbps for larger clusters or heavy cross-node traffic).
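The heap rule above can be expressed as a small helper (a sketch; the ~31GB compressed-oops ceiling is approximate and the exact cutoff varies by JVM build):

```python
def recommended_heap_gb(physical_ram_gb: float) -> float:
    """Heap = min(physicalRAM / 2, 31GB), keeping compressed oops enabled."""
    COMPRESSED_OOPS_CEILING_GB = 31  # approximate; verify on your JVM
    return min(physical_ram_gb / 2, COMPRESSED_OOPS_CEILING_GB)

# A 64GB data node: 31GB heap, ~33GB left for OS filesystem cache
print(recommended_heap_gb(64))  # 31
print(recommended_heap_gb(32))  # 16.0
```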

Shard sizing and counts

  • Shard size: aim for shard sizes that balance manageability and performance. A common guideline is 20–50GB per shard for many workloads; adjust by your query patterns and recovery targets. Very large shards (>100GB) make recovery and relocation slow.
  • Number of shards per node: keep the total number of shards moderate. Thousands of shards per cluster may be fine, but avoid excessive shard counts per node; each shard carries memory and file-descriptor overhead. A common rule of thumb is to keep fewer than ~20 shards per GB of configured JVM heap on each node.
  • Primary vs replicas: replicas increase read throughput and provide redundancy. Plan replica count so cluster can tolerate node failures while still serving queries.
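As a rough planning aid, the 20–50GB shard guideline can be turned into a primary-shard estimate (a sketch; the 40GB default target is an assumption in the middle of that range):

```python
import math

def estimated_primary_shards(index_size_gb: float, target_shard_gb: float = 40) -> int:
    """Pick a primary shard count so each shard lands near the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))

# A 300GB index at a ~40GB/shard target -> 8 primaries of ~37.5GB each
print(estimated_primary_shards(300))  # 8
```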

Index design and lifecycle

  • Index-per-time-bucket: if using time-series data, create indices aligned to retention and query windows (daily, weekly) to make lifecycle management easier.
  • ILM (Index Lifecycle Management): use ILM to move indices between tiers (hot -> warm -> cold) and to delete or freeze old indices automatically.
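A minimal ILM policy along these lines might look like the following (a sketch in Kibana Dev Tools syntax; the `logs-policy` name, phase ages, and rollover size are illustrative, not recommendations):

```
PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb", "max_age": "7d" }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": { "number_of_shards": 1 }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```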

Hot–Warm (and cold) architecture

  • Hot nodes: fast CPUs, more memory, and fast NVMe/SSDs — handle indexing and low-latency queries.
  • Warm nodes: less CPU, larger, cheaper storage — for querying older data with lower performance requirements.
  • Cold/Frozen: for long-term retention, reduce resource usage and accept slower queries.
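In recent Elasticsearch versions, tier membership is declared through node roles in elasticsearch.yml (a sketch; the exact role lists depend on your topology):

```yaml
# elasticsearch.yml on a hot node
node.roles: [ data_hot, data_content, ingest ]

# elasticsearch.yml on a warm node
node.roles: [ data_warm ]
```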

Storage performance and sizing

  • IOPS and throughput: base sizing on expected indexing throughput and query IO patterns. Bulk indexing may require sustained write throughput; heavy aggregations drive read IOPS.
  • Snapshots: allocate network and storage throughput for snapshot operations. Snapshots are incremental, but initial snapshots and long retention windows still consume significant bandwidth and repository storage.

Monitoring and metrics to watch

  • JVM heap usage and GC pause times
  • Node CPU and load
  • Disk usage and IOPS
  • Search rate, latency, and queue sizes
  • Indexing rate, refresh rate, and merge times
  • Cluster health and shard allocation status
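A simple check over these metrics can be sketched as follows (the input shape mirrors the JVM section of the `_nodes/stats` API; the 75% threshold is an assumed alert level, not an official one):

```python
def heap_alerts(nodes_stats: dict, threshold_pct: float = 75.0) -> list[str]:
    """Return a message for each node whose JVM heap usage exceeds the threshold."""
    alerts = []
    for node in nodes_stats["nodes"].values():
        used_pct = node["jvm"]["mem"]["heap_used_percent"]
        if used_pct > threshold_pct:
            alerts.append(f'{node["name"]}: heap at {used_pct}%')
    return alerts

# Trimmed-down example of a _nodes/stats response:
sample = {
    "nodes": {
        "abc": {"name": "data-1", "jvm": {"mem": {"heap_used_percent": 82}}},
        "def": {"name": "data-2", "jvm": {"mem": {"heap_used_percent": 41}}},
    }
}
print(heap_alerts(sample))  # ['data-1: heap at 82%']
```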

Example sizing workflows

  1. Small test cluster (development or low-volume):

    • 3 data nodes, each: 16–32GB RAM, heap 8–15GB, 4 vCPUs, 1–2 SSDs
    • replicas: 1
  2. Production moderate cluster (search + indexing):

    • 3–5 hot data nodes, each: 64GB RAM, heap 31GB, 8–16 vCPUs, NVMe SSDs
    • 3–5 warm nodes with larger, cheaper storage, 64–128GB RAM with smaller heap
    • use ILM to roll indices from hot -> warm

Example calculation (very simplified): expected daily ingestion = 100GB/day raw documents

  • Retention: 30 days -> 3TB raw
  • With replicas=1 (doubling the stored data) and ~20% overhead for indexing and merges -> 3TB × 2 × 1.2 ≈ 7.2TB usable storage required
  • Choose node count and disk sizes such that per-node disk usage stays below 70% with room for snapshots and rebalancing.
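The arithmetic above can be captured in a reusable sketch (the 20% overhead factor and 70% fill ceiling are the assumptions from this example):

```python
def required_cluster_disk_gb(daily_gb, retention_days, replicas=1,
                             overhead=0.20, max_disk_fill=0.70):
    """Translate ingest rate and retention into total disk to provision."""
    raw = daily_gb * retention_days                 # primary data only
    usable = raw * (1 + replicas) * (1 + overhead)  # replicas + merge/indexing overhead
    return usable / max_disk_fill                   # keep disks below the fill ceiling

# 100GB/day for 30 days -> 7.2TB usable, ~10.3TB provisioned across the cluster
print(round(required_cluster_disk_gb(100, 30)))  # 10286
```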

Operational guidance

  • Avoid swapping: configure the OS to avoid swapping Elasticsearch processes and reserve memory for the OS.
  • Set bootstrap.memory_lock: true (and grant the memlock ulimit) to lock the JVM heap in RAM where appropriate.
  • Use discovery.seed_hosts and proper discovery settings for cluster formation.
  • Regularly run queries that reflect production patterns during capacity planning and scaling tests.
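The settings mentioned above might look roughly like this in elasticsearch.yml (a sketch; hostnames are placeholders):

```yaml
bootstrap.memory_lock: true   # lock the JVM heap in RAM (requires the memlock ulimit)
discovery.seed_hosts: ["es-node-1", "es-node-2", "es-node-3"]
cluster.initial_master_nodes: ["es-node-1", "es-node-2", "es-node-3"]
```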
