Hadoop Configuration Tuning
Default Hadoop settings are conservative and designed for small test clusters. Production clusters require careful tuning across HDFS, MapReduce, and YARN to achieve good throughput and stability.
HDFS Tuning
core-site.xml
<!-- I/O buffer size for reads/writes (default: 4096 bytes) -->
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
<!-- 128 KB; larger buffers reduce syscall overhead for sequential reads and writes -->
</property>
hdfs-site.xml
<!-- RPC handler threads on the NameNode (default: 10) -->
<property>
<name>dfs.namenode.handler.count</name>
<value>64</value>
<!-- Rule of thumb: 20 * log2(cluster_size_nodes) -->
</property>
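To apply the 20 × log₂(N) rule without a calculator, a one-liner sketch (the 50-node count is illustrative):
# Handler-count rule of thumb; awk's log() is natural log, so divide by log(2)
awk 'BEGIN { n = 50; printf "dfs.namenode.handler.count ~ %d\n", 20 * log(n) / log(2) }'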
<!-- Block size: 128 MB (default) is fine; use 256 MB for large sequential files -->
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
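Note that dfs.blocksize is applied when a file is written; changing it later does not rewrite existing files. For a one-off large file, the block size can also be set per write (path and filename are illustrative):
# Write one file with a 256 MB block size regardless of the cluster default
hdfs dfs -D dfs.blocksize=268435456 -put big_dataset.csv /data/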
<!-- Replication factor (default: 3, use 2 for dev clusters) -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
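Like the block size, dfs.replication is a client-side default applied at write time, so existing files keep their old factor. To change it on data already in HDFS (path illustrative):
# -w blocks until re-replication finishes
hdfs dfs -setrep -w 2 /data/dev/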
<!-- DataNode handler threads (default: 10) -->
<property>
<name>dfs.datanode.handler.count</name>
<value>32</value>
</property>
<!-- DataNode data transfer threads for block streaming (default: 4096) -->
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>8192</value>
</property>
<!-- Short-circuit reads: clients co-located with a DataNode read block files directly from local disk, bypassing the DataNode process (requires the native libhadoop library) -->
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.domain.socket.path</name>
<value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
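Short-circuit reads quietly fall back to ordinary reads if the native libhadoop library cannot be loaded, so verify the prerequisites after enabling:
# "hadoop: true" in the output means libhadoop.so was found
hadoop checknative -a
# The socket directory must exist and be writable only by the HDFS user
ls -ld /var/lib/hadoop-hdfs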
<!-- NameNode heap is set via HDFS_NAMENODE_OPTS in hadoop-env.sh (see below) -->
<!-- ~1 GB per million files; minimum 8 GB for production -->
hadoop-env.sh — JVM heap sizes
# NameNode heap (Hadoop 3.x reads HDFS_NAMENODE_OPTS; HADOOP_NAMENODE_OPTS is the deprecated 2.x name)
export HDFS_NAMENODE_OPTS="-Xms8g -Xmx8g -XX:+UseG1GC"
# DataNode heap (usually smaller; the DataNode holds only block metadata in memory)
export HDFS_DATANODE_OPTS="-Xms2g -Xmx2g -XX:+UseG1GC"
# Use G1GC for large heaps (>4 GB) to reduce GC pause times
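After restarting, confirm the flags took effect and watch GC pauses (a sketch assuming JDK tools on the PATH; Hadoop's launcher tags each daemon JVM with a -Dproc_<name> marker):
# Show the -Xms/-Xmx/GC flags each daemon is actually running with
jps -lvm | grep -i namenode
# Sample GC utilization every 5 s, 12 times; watch FGC/FGCT for long pauses
jstat -gcutil "$(pgrep -f proc_namenode)" 5000 12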
YARN Tuning
yarn-site.xml
<!-- Memory in MB that YARN may hand out to containers on each NodeManager -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>49152</value>
<!-- Leave ~15% for OS + DataNode; e.g., 56 GB node → 48 GB YARN -->
</property>
<!-- CPU cores available to YARN per node -->
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>14</value>
<!-- Reserve 2 cores for OS/DataNode on a 16-core machine -->
</property>
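The arithmetic behind both values, as a sketch (the 56 GB / 16-core node matches the comments above; adjust the reservation for your hardware):
awk 'BEGIN {
  ram_gb = 56; cores = 16; reserve_gb = 8   # ~15% of RAM for OS + DataNode
  printf "yarn.nodemanager.resource.memory-mb  = %d\n", (ram_gb - reserve_gb) * 1024
  printf "yarn.nodemanager.resource.cpu-vcores = %d\n", cores - 2
}'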
<!-- Minimum container memory allocation (default: 1024 MB) -->
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>512</value>
</property>
<!-- Maximum single container allocation -->
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>16384</value>
</property>
<!-- Enable log aggregation for debugging -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
<!-- Retain 7 days of logs -->
</property>
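With aggregation enabled, logs for a finished application can be pulled straight from HDFS instead of hunting through NodeManager disks (the application ID is illustrative):
yarn logs -applicationId application_1700000000000_0042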
MapReduce Tuning
mapred-site.xml
<!-- Memory for each Map and Reduce container -->
<property>
<name>mapreduce.map.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<!-- JVM heap inside containers (keep 20-25% below container memory) -->
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1638m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx3277m</value>
</property>
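The -Xmx values above are the container sizes times 0.8; the remaining 20% covers JVM overhead and off-heap use. The arithmetic, as a sketch:
awk 'BEGIN {
  printf "mapreduce.map.java.opts    = -Xmx%.0fm\n", 2048 * 0.8
  printf "mapreduce.reduce.java.opts = -Xmx%.0fm\n", 4096 * 0.8
}'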
<!-- Enable compression for map output (speeds up shuffle significantly) -->
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
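Snappy needs native codec support on every node; the same checknative command used for short-circuit reads also reports codec availability:
hadoop checknative -a | grep -i snappy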
<!-- Number of reduce tasks; a common starting point is 0.95 × (nodes × containers per node), or 1.75× for better load balancing -->
<property>
<name>mapreduce.job.reduces</name>
<value>95</value>
</property>
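A sketch of that guideline; the 10-node, 10-containers-per-node figures are illustrative and reproduce the value above:
awk 'BEGIN { nodes = 10; containers = 10; printf "mapreduce.job.reduces = %d\n", 0.95 * nodes * containers }'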
<!-- Speculative execution: re-run slow tasks on other nodes -->
<property>
<name>mapreduce.map.speculative</name>
<value>true</value>
</property>
<property>
<name>mapreduce.reduce.speculative</name>
<value>true</value>
</property>
<!-- Sort buffer: larger = fewer merge passes during shuffle -->
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>
Key Tuning Rules of Thumb
| Parameter | Formula / Guideline |
|---|---|
| NameNode heap | ~1 GB per million files; minimum 8 GB |
| dfs.namenode.handler.count | 20 × log₂(N), where N = number of cluster nodes |
| YARN node memory | Total RAM − ~15% for OS/DataNode |
| Map container memory | 2–4 GB depending on job type |
| Reduce count | 0.95 × (nodes × containers per node); adjust by shuffle size |
| Block size | 128 MB for mixed workloads, 256 MB for large sequential files |
| io.sort.mb | ~40% of map container heap |
Monitoring the Impact
After tuning, watch these metrics:
# Check NameNode RPC call queue length via JMX (should stay near 0; adjust if your RPC port is not 8020)
curl -s 'http://namenode:9870/jmx?qry=Hadoop:service=NameNode,name=RpcActivityForPort8020' | grep CallQueueLength
# Check per-node YARN memory/vcore usage (get node IDs from `yarn node -list -all`)
yarn node -status <node-id> | grep -E "Memory|CPU"
# View running application resource usage
yarn application -list -appStates RUNNING
# Check per-DataNode status (capacity, decommission state)
hdfs dfsadmin -report | grep -A3 "Name:"
Use the YARN ResourceManager UI (http://<resourcemanager>:8088) and the NameNode UI (http://<namenode>:9870 in Hadoop 3.x) to visualize utilization before and after tuning.