Spark Memory Architecture: The Complete Guide to the Unified Memory Model
A Spark job fails with OutOfMemoryError. The team doubles spark.executor.memory from 8 GB to 16 GB. The job passes. Nobody investigates why. Three months later, the same job fails again on a larger dataset. They double it again to 32 GB. The cluster bill doubles with it.
This is the most common pattern in Spark operations — and the most expensive. Teams treat memory as a single knob to turn up. They do not know that Spark splits memory into distinct regions with different purposes, different eviction rules, and different failure modes. They do not know that their 16 GB executor only gives Spark 9.65 GB of usable unified memory. They do not know that their driver OOM has nothing to do with executor memory. They do not know that a chunk of their container memory is overhead they never configured.
This post explains exactly how Spark manages memory. We cover the unified memory model with exact formulas, the difference between execution and storage memory, driver vs executor memory architecture, off-heap memory with Tungsten, container memory for YARN and Kubernetes, PySpark-specific memory, and how to calculate every region from your configuration. The companion post covers OOM debugging, GC tuning, and observability.
The Unified Memory Model
Before Spark 1.6, memory management used a static model with fixed boundaries between execution and storage. If execution memory was full while storage sat idle, execution could not borrow the unused storage memory; it simply spilled to disk. This was wasteful.
Spark 1.6 introduced the Unified Memory Manager (SPARK-10000), which replaced the static model with a dynamic one. Execution and storage share a single pool and can borrow from each other.
The Four Memory Regions
Every Spark executor's JVM heap is divided into four regions:
┌─────────────────────────────────────────────────────┐
│ JVM Heap (spark.executor.memory) │
│ │
│ ┌───────────────┐ │
│ │ Reserved │ 300 MB (hardcoded) │
│ │ Memory │ Internal Spark structures │
│ └───────────────┘ │
│ │
│ ┌───────────────┐ │
│ │ User │ (heap - 300MB) × (1 - 0.6) │
│ │ Memory │ Your objects, UDFs, metadata │
│ └───────────────┘ │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Unified Memory (heap - 300MB) × 0.6 │ │
│ │ │ │
│ │ ┌───────────────────┐ ┌───────────────────┐ │ │
│ │ │ Storage Memory │ │ Execution Memory │ │ │
│ │ │ (initial 50%) │ │ (initial 50%) │ │ │
│ │ │ │ │ │ │ │
│ │ │ Cached DFs/RDDs │ │ Shuffle buffers │ │ │
│ │ │ Broadcast vars │ │ Sort buffers │ │ │
│ │ │ Task results │ │ Hash tables │ │ │
│ │ │ │ │ Aggregation maps │ │ │
│ │ └───────────────────┘ └───────────────────┘ │ │
│ │ ← dynamic boundary → │ │
│ └─────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────┘
Region 1: Reserved Memory (300 MB)
A hardcoded 300 MB set aside for Spark's internal structures — logging, metrics, internal bookkeeping. This is not configurable in production. There is a spark.testing.reservedMemory parameter but it is explicitly for testing only and should never be changed.
Implication: Your executor must have at least 300 MB + some usable space. In practice, never set spark.executor.memory below 1 GB.
Region 2: User Memory
User Memory = (spark.executor.memory - 300 MB) × (1 - spark.memory.fraction)
With default spark.memory.fraction=0.6:
User Memory = (spark.executor.memory - 300 MB) × 0.4
User memory holds:
- Your custom data structures (HashMap, ArrayList, etc.)
- Internal metadata for data processing
- UDF state and closures
- RDD dependency information
- Spark internal objects not managed by the memory manager
This region is not managed by Spark's unified memory manager. If you allocate too many objects here, you get a standard java.lang.OutOfMemoryError: Java heap space with no help from Spark's eviction or spill mechanisms.
Region 3: Unified Memory — Storage
Protected Storage = (spark.executor.memory - 300 MB) × spark.memory.fraction × spark.memory.storageFraction
With defaults:
Protected Storage = (spark.executor.memory - 300 MB) × 0.6 × 0.5
= (spark.executor.memory - 300 MB) × 0.3
Storage memory holds:
- Cached DataFrames and RDDs — data stored via .cache() or .persist()
- Broadcast variables — hash tables and data sent to all executors
- Unroll memory — space used when deserializing blocks for caching
The protected storage portion (defined by spark.memory.storageFraction) is immune to eviction by execution. Storage above this protected floor can be evicted when execution needs space.
Region 4: Unified Memory — Execution
Initial Execution = (spark.executor.memory - 300 MB) × spark.memory.fraction × (1 - spark.memory.storageFraction)
With defaults:
Initial Execution = (spark.executor.memory - 300 MB) × 0.6 × 0.5
= (spark.executor.memory - 300 MB) × 0.3
Execution memory holds:
- Shuffle buffers — data being serialized and written during shuffle
- Sort buffers — UnsafeExternalSorter pages for SortMergeJoin and sort operations
- Hash tables — HashedRelation for BroadcastHashJoin and ShuffledHashJoin
- Aggregation maps — hash maps for GROUP BY operations
Execution memory has priority over storage. If execution needs more space and storage is occupying it, storage blocks are evicted (using LRU). The reverse is not true — execution memory borrowed by storage is never forcibly reclaimed.
How Dynamic Memory Sharing Works
The unified memory model's key feature is that the boundary between execution and storage is soft. Both regions can expand into the other's space when it is idle.
Borrowing Rules
Storage borrows from execution: When storage needs more memory and execution has unused capacity, storage expands into execution's region. However, if execution later needs that memory back, storage must give it up — cached blocks are evicted using LRU (Least Recently Used).
Execution borrows from storage: When execution needs more memory and storage has unused capacity, execution expands into storage's region. If storage later needs memory, execution does not give it back. Storage must wait for execution to release naturally or spill cached data to disk.
Why Execution Has Priority
Execution memory holds intermediate computation state — shuffle buffers, sort arrays, hash tables. If execution runs out of memory mid-operation, the task fails with SparkOutOfMemoryError. Recovering from this requires restarting the entire task.
Storage memory holds cached data. If cached data is evicted, it can be recomputed from the lineage (RDD) or re-read from disk (MEMORY_AND_DISK). Eviction is inconvenient but not fatal.
This asymmetry is intentional: losing execution state is worse than losing cached data.
The Protected Storage Floor
The spark.memory.storageFraction (default 0.5) defines the protected floor below which execution cannot evict storage. Even under extreme execution pressure, Spark guarantees that storage retains at least this fraction of the unified region.
Protected Floor = Unified Memory × spark.memory.storageFraction
= (spark.executor.memory - 300 MB) × 0.6 × 0.5
Anything cached above this floor is fair game for eviction. Anything below it is safe.
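The formulas above are easy to script. Here is a minimal sketch (plain Python, no Spark required) that computes every on-heap region from a configuration; the 300 MB constant and the default fractions are the values discussed above:

```python
RESERVED_MB = 300  # hardcoded reserved memory

def heap_regions(executor_memory_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Compute Spark's on-heap memory regions, in MB."""
    usable = executor_memory_mb - RESERVED_MB
    unified = usable * memory_fraction
    return {
        "reserved": RESERVED_MB,
        "user": usable * (1 - memory_fraction),
        "unified": unified,
        "protected_storage": unified * storage_fraction,  # eviction-proof floor
        "initial_execution": unified * (1 - storage_fraction),
    }

regions = heap_regions(16 * 1024)  # a 16 GB executor
```

For a 16 GB executor this yields about 9,650 MB of unified memory and 6,434 MB of user memory, matching the worked examples that follow.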
Exact Memory Calculations
Example: 16 GB Executor
spark.executor.memory = 16 GB (16,384 MB)
spark.memory.fraction = 0.6
spark.memory.storageFraction = 0.5
| Region | Formula | Size |
|---|---|---|
| Reserved | Hardcoded | 300 MB |
| Usable Heap | 16,384 - 300 | 16,084 MB |
| User Memory | 16,084 × 0.4 | 6,434 MB |
| Unified Memory | 16,084 × 0.6 | 9,650 MB |
| → Storage (initial/protected) | 9,650 × 0.5 | 4,825 MB |
| → Execution (initial) | 9,650 × 0.5 | 4,825 MB |
Key insight: Of your 16 GB executor, Spark only has 9.65 GB of managed unified memory. The rest is reserved (300 MB) and user space (6.4 GB). When you see "executor memory" in configurations, the usable Spark-managed portion is only 60% of the heap minus 300 MB.
Example: 4 GB Executor
spark.executor.memory = 4 GB (4,096 MB)
| Region | Size |
|---|---|
| Reserved | 300 MB |
| User Memory | (4,096 - 300) × 0.4 = 1,518 MB |
| Unified Memory | (4,096 - 300) × 0.6 = 2,278 MB |
| → Storage (protected) | 1,139 MB |
| → Execution (initial) | 1,139 MB |
Only 2.28 GB of managed memory from a 4 GB executor. This is why small executors struggle with large shuffles and caching.
Example: 32 GB Executor
spark.executor.memory = 32 GB (32,768 MB)
| Region | Size |
|---|---|
| Reserved | 300 MB |
| User Memory | (32,768 - 300) × 0.4 = 12,987 MB |
| Unified Memory | (32,768 - 300) × 0.6 = 19,481 MB |
| → Storage (protected) | 9,740 MB |
| → Execution (initial) | 9,740 MB |
19.5 GB of managed memory. This is where broadcast joins and large caches start to become comfortable.
Driver Memory Architecture
The driver is fundamentally different from executors. It coordinates the application but does not process data partitions. Its memory layout serves a different purpose.
What the Driver Holds
- SparkContext — the entry point object for all Spark operations
- DAGScheduler and TaskScheduler — the scheduling infrastructure for all stages and tasks
- Broadcast variables — the driver builds and holds the full copy before broadcasting
- Collect results — any data pulled back via .collect(), .toPandas(), .take(), or .show()
- Accumulator values — aggregated from all executors
- Query plans — for complex queries, the catalyst plan tree itself can be large
Driver Memory Configuration
spark.driver.memory = 1g (default)
The driver's JVM heap. Controls how much data the driver can hold in memory.
spark.driver.memoryOverhead = max(driverMemory × 0.10, 384 MB)
Off-heap memory for the driver process — NIO direct buffers, native memory, JVM internal overhead.
spark.driver.maxResultSize = 1g (default)
Maximum total size of serialized results from all partitions for a single action (like collect()). If a result exceeds this, the job is aborted. This is a safety net — not the memory allocation. Setting this to 0 disables the limit, which is dangerous.
Common Driver OOM Scenarios
1. Large collect():
# This pulls ALL data to the driver — OOM if df is large
all_data = df.collect()
# This too — the entire DataFrame becomes a Pandas DataFrame in driver memory
pandas_df = df.toPandas()
2. Large broadcast:
The driver must hold the entire broadcast table in memory while building the HashedRelation. A 500 MB Parquet table can expand to 2+ GB in memory.
3. Too many accumulator values:
Each task sends accumulator updates back to the driver. With millions of tasks and custom accumulators, the driver accumulates significant state.
4. Complex query plans:
Queries with hundreds of columns, deep subquery nesting, or many unions can create large catalyst plan trees that consume driver heap.
Right-Sizing the Driver
For most workloads:
spark.driver.memory = 2g to 8g
If your workflow includes collect(), toPandas(), or broadcast of tables > 100 MB:
spark.driver.memory = 8g to 16g
If you do not use any of these and the driver is purely a coordinator:
spark.driver.memory = 2g to 4g
Executor Memory Architecture
Each executor runs tasks that process data partitions. Executor memory is where the actual computation happens.
The Full Executor Memory Layout
┌─────────────────────────────────────────────────────────┐
│ Container Memory │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ JVM Heap (spark.executor.memory) │ │
│ │ │ │
│ │ Reserved (300 MB) │ │
│ │ User Memory (heap - 300) × 0.4 │ │
│ │ Unified Memory (heap - 300) × 0.6 │ │
│ │ - Storage (cached data, broadcasts) │ │
│ │ - Execution (shuffles, sorts, joins) │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Memory Overhead (spark.executor.memoryOverhead)│ │
│ │ │ │
│ │ NIO Direct Buffers (Netty) │ │
│ │ Thread Stacks │ │
│ │ JVM Internal Structures │ │
│ │ Native Memory (JNI) │ │
│ │ Compressed Class Space / Metaspace │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ Off-Heap Memory (spark.memory.offHeap.size) │ │
│ │ (only if spark.memory.offHeap.enabled=true) │ │
│ │ │ │
│ │ Tungsten managed memory │ │
│ │ Execution: shuffles, sorts, hash tables │ │
│ │ Storage: off-heap columnar cache │ │
│ └──────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ PySpark Memory (spark.executor.pyspark.memory)│ │
│ │ (only for PySpark workloads) │ │
│ │ │ │
│ │ Python worker process memory │ │
│ │ Pandas DataFrames, NumPy arrays │ │
│ │ Arrow serialization buffers │ │
│ └──────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
Container Memory Formula
Total Container Memory = spark.executor.memory
+ spark.executor.memoryOverhead
+ spark.memory.offHeap.size (if enabled)
+ spark.executor.pyspark.memory (if set)
This total is what YARN or Kubernetes allocates for the container/pod. If the process exceeds this, the container is killed.
Memory Overhead Defaults
The overhead default depends on the deployment platform:
| Platform | Default Formula | Minimum |
|---|---|---|
| YARN | executorMemory × 0.10 (0.07 in older releases) | 384 MB |
| Kubernetes (JVM) | executorMemory × 0.10 | 384 MB |
| Kubernetes (non-JVM) | executorMemory × 0.40 | 384 MB |
What needs overhead memory:
- Netty NIO buffers — shuffle data transfer uses direct byte buffers outside JVM heap
- Thread stacks — each thread needs 512 KB–1 MB of native stack space
- JVM internal structures — class metadata (Metaspace), JIT compiled code, internal tables
- JNI native memory — any native libraries loaded by Spark or your code
- Compressed oops — JVM pointer compression overhead
Per-Task Memory
Each executor runs up to spark.executor.cores concurrent tasks (default 1 on YARN and Kubernetes; all available cores in standalone mode). Each task shares the executor's unified memory pool, managed by a per-task TaskMemoryManager.
Available memory per task ≈ Unified Memory / spark.executor.cores
With a 16 GB executor and 4 cores:
Per-task unified memory ≈ 9,650 MB / 4 = 2,412 MB
This is why increasing cores without increasing memory can cause OOM — each task gets less memory.
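The division above is approximate. Inside Spark, the ExecutionMemoryPool enforces tighter bounds: with N active tasks, each task can claim at most 1/N of the pool and is guaranteed at least 1/(2N) before it is forced to block or spill. A sketch of those bounds:

```python
def per_task_execution_bounds(pool_mb, active_tasks):
    """Per-task share of Spark's execution pool: each of N active tasks
    is guaranteed at least pool/(2N) and capped at pool/N."""
    return pool_mb / (2 * active_tasks), pool_mb / active_tasks

# 16 GB executor (9,650 MB unified) running 4 concurrent tasks
lo, hi = per_task_execution_bounds(9650, 4)
```

So with 4 cores, a single task sees between roughly 1.2 GB and 2.4 GB of the pool depending on what its neighbors are doing.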
Off-Heap Memory
Off-heap memory stores data outside the JVM heap, in native memory managed by Spark's Tungsten engine. The primary benefit: zero GC overhead. Data in off-heap is invisible to the garbage collector.
Configuration
spark.memory.offHeap.enabled = false (default)
spark.memory.offHeap.size = 0 (default, in bytes)
When enabled, Spark creates a second unified memory pool in off-heap space, with the same execution/storage split. The on-heap unified pool still exists — both pools are used.
How Tungsten Uses Off-Heap
Tungsten is Spark's memory management engine that operates on raw binary data:
- Uses sun.misc.Unsafe (or java.lang.foreign.MemorySegment in newer JDKs) for direct memory access
- Allocates memory in pages (the page size is derived from available memory and core count, typically in the 1–64 MB range)
- Stores data in compact binary format — no Java object headers, no boxing
- Enables cache-friendly data layouts for better CPU performance
Operations that benefit from off-heap:
- Shuffle sort and merge
- Hash table construction for joins and aggregations
- Columnar cache storage (spark.sql.columnVector.offheap.enabled)
When to Use Off-Heap
Use off-heap when:
- GC pauses are causing performance problems (> 10% of task time)
- You have large shuffles or hash tables that create GC pressure
- You need predictable latency without GC spikes
Avoid off-heap when:
- Your workload is not GC-bound
- You cannot increase container memory limits (off-heap adds to total memory)
- You are debugging memory issues (off-heap is harder to inspect)
Container Memory Impact
Off-heap memory is not inside the JVM heap. It must be accounted for in the container's total memory:
Container Memory = spark.executor.memory + memoryOverhead + spark.memory.offHeap.size
If you enable 4 GB of off-heap but do not increase the container limit, the process will be killed for exceeding memory.
Example configuration:
spark.executor.memory = 12g
spark.executor.memoryOverhead = 2g
spark.memory.offHeap.enabled = true
spark.memory.offHeap.size = 4g
Total container memory: 12 + 2 + 4 = 18 GB.
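The same arithmetic as a small sketch, to make the sizing rule explicit — every enabled region adds to what the resource manager must grant:

```python
def container_memory_mb(heap_mb, overhead_mb, offheap_mb=0, pyspark_mb=0):
    """Total container memory: JVM heap + overhead + off-heap (if enabled)
    + PySpark worker memory (if set)."""
    return heap_mb + overhead_mb + offheap_mb + pyspark_mb

# The example above: 12 GB heap + 2 GB overhead + 4 GB off-heap = 18 GB
total = container_memory_mb(12 * 1024, 2 * 1024, offheap_mb=4 * 1024)
```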
Container Memory: YARN and Kubernetes
Understanding container memory is critical because the most confusing OOM errors come from the container being killed — not from the JVM running out of heap.
YARN Container Memory
YARN Container Memory = spark.executor.memory
                      + spark.executor.memoryOverhead
                      + spark.memory.offHeap.size
                      + spark.executor.pyspark.memory
YARN overhead default (0.10 in current Spark; older releases used a 0.07 factor under the legacy key spark.yarn.executor.memoryOverhead):
spark.executor.memoryOverhead = max(spark.executor.memory × 0.10, 384 MB)
Example: 20 GB executor on YARN:
JVM Heap: 20 GB
Overhead: max(20 × 0.10, 0.384) = 2.0 GB
Container: 22.0 GB
The YARN NodeManager checks physical RSS memory of the container. If it exceeds the container limit, YARN kills the container immediately with no Java stack trace — just a log message: "Container killed by YARN for exceeding memory limits." The exit code is 137.
Kubernetes Pod Memory
K8s Pod Memory = spark.executor.memory
+ spark.kubernetes.executor.memoryOverhead
+ spark.memory.offHeap.size
+ spark.executor.pyspark.memory
Kubernetes overhead default:
spark.kubernetes.memoryOverheadFactor = 0.10 (JVM) or 0.40 (non-JVM like PySpark)
spark.kubernetes.executor.memoryOverhead = max(spark.executor.memory × factor, 384 MB)
Example: 16 GB executor on Kubernetes (PySpark):
JVM Heap: 16 GB
Overhead: max(16 × 0.10, 0.384) = 1.6 GB
PySpark: 4 GB
Pod Memory: 21.6 GB
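The overhead default itself follows the same max(factor × memory, 384 MB) pattern on every platform; only the factor changes. A sketch:

```python
def default_overhead_mb(heap_mb, factor=0.10, minimum_mb=384):
    """Overhead default: max(executorMemory × factor, 384 MB).
    factor is 0.10 for JVM workloads, 0.40 for non-JVM (e.g. PySpark) on K8s."""
    return max(heap_mb * factor, minimum_mb)

default_overhead_mb(16 * 1024)               # 16 GB JVM executor: ~1.6 GB
default_overhead_mb(16 * 1024, factor=0.40)  # non-JVM workload: ~6.4 GB
default_overhead_mb(1024)                    # small executor hits the 384 MB floor
```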
When a pod exceeds its memory limit, Kubernetes sends OOMKilled and the pod is terminated. You see this in kubectl describe pod output.
Why Container Kills Are Confusing
Container kills produce no Java stack trace. The process is killed by the operating system or container runtime, not by the JVM. This makes them the hardest OOM errors to diagnose because:
- No OutOfMemoryError in application logs
- The Spark UI may not capture the failure
- The error message only says "exceeded memory limits" without telling you what consumed the memory
- The root cause is usually off-heap memory (NIO buffers, native memory) that is not visible in JVM heap metrics
Right-Sizing Overhead
| Workload Type | Recommended Overhead |
|---|---|
| Scala/Java, light shuffle | Default (7-10%) |
| Heavy shuffle, large broadcast | 15-20% |
| PySpark with UDFs | 20-30% |
| PySpark with pandas_udf + Arrow | 25-40% |
| Off-heap enabled | Add full offHeap.size separately |
Rule of thumb: Start with default overhead. If you see container kills but no JVM OOM in logs, increase overhead by 50% and monitor again.
PySpark Memory Architecture
PySpark adds a separate Python process alongside the JVM executor, creating a dual-memory architecture.
How PySpark Memory Works
┌─────────────────────────────────────┐
│ Container │
│ │
│ ┌─────────────┐ ┌──────────────┐ │
│ │ JVM Executor │ │ Python Worker│ │
│ │ │ │ │ │
│ │ Spark Core │←→│ PySpark API │ │
│ │ Catalyst │ │ Pandas UDFs │ │
│ │ Tungsten │ │ NumPy/SciPy │ │
│ │ │ │ │ │
│ │ executor. │ │ pyspark. │ │
│ │ memory │ │ memory │ │
│ └─────────────┘ └──────────────┘ │
│ │
│ + memoryOverhead (shared) │
└─────────────────────────────────────┘
Key PySpark Memory Configurations
spark.executor.pyspark.memory -- not set by default
Limits the Python worker process memory. When set, Spark configures RLIMIT_AS to restrict the Python process. Without this, the Python worker can consume unbounded memory.
spark.python.worker.memory = 512m (default)
Spill threshold for Python-side aggregation in RDD operations: when an aggregation buffer in the Python worker exceeds this amount, data is spilled to disk. It is not a hard cap on the worker process — that is what spark.executor.pyspark.memory provides.
spark.sql.execution.arrow.pyspark.enabled = false (default)
Enables Apache Arrow for efficient data transfer between JVM and Python. With Arrow, data is transferred as columnar batches instead of row-by-row serialization — 10-100x faster.
spark.sql.execution.arrow.maxRecordsPerBatch = 10000
Maximum rows per Arrow record batch. Larger batches are more efficient but use more memory per batch.
PySpark Memory Traps
1. toPandas() pulls everything to driver:
# This collects ALL data to the driver, then converts to Pandas
# Both the Spark collect AND the Pandas DataFrame must fit in driver memory
pandas_df = spark_df.toPandas()
# For a 10 GB DataFrame, you need ~20+ GB of driver memory
2. pandas_udf and grouped operations materialize large chunks:
@pandas_udf("double")
def my_func(v: pd.Series) -> pd.Series:
    return v * 2  # receives one Arrow batch at a time as a Pandas Series
A scalar pandas_udf like this processes Arrow batches capped at spark.sql.execution.arrow.maxRecordsPerBatch, so its footprint is bounded. The real trap is grouped operations: groupBy().applyInPandas() materializes each entire group as a single Pandas DataFrame in the Python worker. If one group holds 50 million rows, all of them land in Python memory at once. Keep groups small, and increase partition count to reduce per-partition size.
3. Arrow serialization overhead:
Arrow transfers between JVM and Python create temporary copies. With large batches, this can double the effective memory usage temporarily.
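To keep per-partition (or per-group) materializations bounded, you can work backwards from a rows-per-partition budget. A hypothetical helper — the target budget is an assumption you would tune for your row width:

```python
import math

def partitions_for_target(total_rows, target_rows_per_partition):
    """Partition count so that each partition holds at most
    roughly target_rows_per_partition rows (assumes even distribution)."""
    return math.ceil(total_rows / target_rows_per_partition)

partitions_for_target(50_000_000, 2_000_000)  # 25 partitions for 50M rows
```

You would then call df.repartition(n) before the UDF; smaller partitions trade some scheduling overhead for a lower peak in the Python worker.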
Recommended PySpark Memory Configuration
spark.executor.memory = 8g
spark.executor.memoryOverhead = 2g
spark.executor.pyspark.memory = 4g
spark.sql.execution.arrow.pyspark.enabled = true
spark.sql.execution.arrow.maxRecordsPerBatch = 10000
Total container: 8 + 2 + 4 = 14 GB.
Memory Configuration Reference
Core Memory
| Configuration | Default | Description |
|---|---|---|
spark.executor.memory | 1g | JVM heap per executor |
spark.driver.memory | 1g | JVM heap for driver |
spark.driver.maxResultSize | 1g | Max serialized result size per action |
spark.executor.cores | 1 | Concurrent tasks per executor (affects per-task memory) |
Unified Memory Model
| Configuration | Default | Description |
|---|---|---|
spark.memory.fraction | 0.6 | Fraction of (heap - 300 MB) for unified pool |
spark.memory.storageFraction | 0.5 | Protected storage floor within unified pool |
Off-Heap
| Configuration | Default | Description |
|---|---|---|
spark.memory.offHeap.enabled | false | Enable off-heap memory |
spark.memory.offHeap.size | 0 | Off-heap allocation in bytes |
Memory Overhead
| Configuration | Default | Description |
|---|---|---|
spark.executor.memoryOverhead | max(memory × 0.10, 384 MB) | Executor off-heap overhead |
spark.driver.memoryOverhead | max(memory × 0.10, 384 MB) | Driver off-heap overhead |
spark.yarn.executor.memoryOverhead | deprecated | Legacy YARN alias for spark.executor.memoryOverhead |
spark.kubernetes.memoryOverheadFactor | 0.10 (JVM), 0.40 (non-JVM) | K8s overhead factor |
PySpark
| Configuration | Default | Description |
|---|---|---|
spark.executor.pyspark.memory | not set | Python worker memory limit |
spark.python.worker.memory | 512m | Python-side aggregation spill threshold |
spark.sql.execution.arrow.pyspark.enabled | false | Enable Arrow for PySpark |
spark.sql.execution.arrow.maxRecordsPerBatch | 10000 | Rows per Arrow batch |
Serialization (Affects Memory Usage)
| Configuration | Default | Description |
|---|---|---|
spark.serializer | JavaSerializer | Serializer class (Kryo is 2-10x more compact) |
spark.kryoserializer.buffer | 64k | Initial Kryo buffer |
spark.kryoserializer.buffer.max | 64m | Max Kryo buffer |
Shuffle and Execution (Affects Memory Usage)
| Configuration | Default | Description |
|---|---|---|
spark.sql.shuffle.partitions | 200 | Partitions per shuffle (more = less data per task) |
spark.shuffle.file.buffer | 32k | Shuffle file output buffer |
spark.reducer.maxSizeInFlight | 48m | Max shuffle data in flight per reduce task |
spark.sql.files.maxPartitionBytes | 128 MB | Max bytes per partition when reading files |
How Cazpian Uses Memory Architecture
Cazpian's compute pools are configured with production-tested memory defaults for Iceberg workloads on S3:
- Unified memory sizing is tuned for the mix of execution (shuffle-heavy joins) and storage (cached dimension tables) typical in lakehouse queries
- Container memory accounts for off-heap overhead from Netty shuffle and Arrow serialization
- PySpark memory is pre-configured when Python runtimes are selected
- Per-query memory tracking in the SQL editor shows exactly how much memory each query consumed — unified, execution, storage, spill — so teams can right-size without guesswork
The companion post — Spark OOM Debugging: The Complete Guide — covers how to diagnose and fix every type of OOM error using Spark UI, GC analysis, and observability tools.
Summary
Spark's unified memory model is elegant but complex. The key takeaways:
- Only 60% of your heap is managed by Spark. The rest is reserved (300 MB) and user memory. A 16 GB executor has ~9.65 GB of unified memory
- Execution has priority over storage. Execution can evict cached data, but storage cannot evict execution buffers. This is by design — computation failures are more expensive than cache misses
- The boundary between execution and storage is dynamic. They borrow from each other, but only storage can be forcibly evicted
- Container memory = heap + overhead + off-heap + PySpark. Container kills (exit 137, OOMKilled) happen when the total exceeds the container limit — not when the JVM heap is full
- Driver and executor OOM are completely different problems. Driver OOM is usually caused by collect(), broadcast, or accumulators. Executor OOM is caused by large shuffles, skewed partitions, or insufficient memory per task
- PySpark doubles the memory challenge. A separate Python process runs alongside the JVM, and both need memory within the same container
The next time you see an OOM error, do not double the memory. Calculate which region overflowed, understand what filled it, and fix the root cause.