4 posts tagged with "Release"

Hadoop release notes and highlights

Apache Spark 4.0 for Big Data Engineering: What's New and Why It Matters

June 11, 2026 · 7 min read

Big Data Practitioner

Apache Spark 4.0 is the project's biggest leap in years, and it's squarely aimed at the people who build and operate big data pipelines. The release sharpens four areas at once: SQL and workflow authoring, data types and observability, the Python/PySpark experience, and how clients connect to Spark. If you spin up a cluster on Databricks Runtime 17.0, these capabilities are available out of the box.

This article is an original, engineer-focused tour of what changed in Spark 4.0 and why each change matters in practice. If you want the fundamentals first, see our primers on Spark's key components and how Spark supports big data processing.

Hadoop 3 Features and Enhancements: A Deep Dive (2026)

May 22, 2026 · 12 min read

Hadoop.so Editorial Team

Big Data Engineers

Apache Hadoop 3 was the first release in nearly a decade that made operators rethink how they buy storage. Erasure coding cut storage overhead from the 200% of 3x replication to about 50%. NameNode HA grew past the old two-NameNode limit to allow multiple standbys. MapReduce gained a native map output collector that speeds up shuffle-heavy jobs. YARN learned to manage long-running services and Docker workloads. And every default port that lived in the Linux ephemeral range was moved out of it.
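The erasure coding figures above can be checked with a few lines of arithmetic. This is a minimal sketch, not taken from the article itself: it assumes the comparison is between 3x replication and a Reed-Solomon layout with 6 data cells and 3 parity cells per stripe, and the function name `storage_overhead_pct` is made up for illustration.

```python
def storage_overhead_pct(data_units: int, redundancy_units: int) -> float:
    """Extra raw storage needed, as a percentage of the logical data size."""
    return 100.0 * redundancy_units / data_units


# 3x replication: one original copy plus two extra replicas (assumed baseline).
replication = storage_overhead_pct(data_units=1, redundancy_units=2)

# Reed-Solomon with 6 data cells and 3 parity cells per stripe (assumed layout).
rs_6_3 = storage_overhead_pct(data_units=6, redundancy_units=3)

print(f"3x replication overhead: {replication:.0f}%")
print(f"RS(6,3) erasure coding overhead: {rs_6_3:.0f}%")
```

Under these assumptions, the two results line up with the 200% and 50% quoted in the excerpt. Other erasure coding layouts would give different percentages.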

Several years after the 3.0 GA, the Hadoop 3.3 and 3.4 lines are the de facto on-prem standard, and most cloud Hadoop distributions (EMR, Dataproc, HDInsight, CDP) ship a 3.x core. This deep dive walks through every major feature in the Hadoop 3 line (what changed, why it matters, and where the tradeoffs hide) and ends with a side-by-side Hadoop 2.x vs 3.x comparison table.

What's New in Apache Hadoop 3

April 28, 2026 · 2 min read

Hadoop.so Editorial Team

Big Data Engineers

The Apache Hadoop 3.x line was a landmark set of releases that brought significant improvements to performance, reliability, and scalability. Here's a quick tour of the most important changes.

Hadoop and Java: A Version Compatibility Guide

April 21, 2026 · 5 min read

Hadoop.so Editorial Team

Big Data Engineers

Picking the wrong Java version for your Hadoop cluster is one of the most common causes of cryptic build failures, runtime exceptions, and upgrade blockers. This guide maps Hadoop releases to their supported Java versions, explains what changed between Java versions, and offers practical recommendations for 2025.