
Apache Foundation Moves From Mirrors to a CDN to Distribute Software

The Apache Software Foundation is saying goodbye to the system of mirrors that has served downloads of its software for more than 20 years.


About a week ago the Apache Software Foundation, home of the Apache Web Server, Hadoop, OpenOffice, and over 350 other open source software projects, announced the end of the line for its system of mirror sites for delivering its software to users. From now on, the foundation will be using a content delivery network instead.

Most users of open source software, especially those who have downloaded Linux distributions, will be familiar with download mirroring sites, usually just called “mirrors,” which rose to prominence in the 1990s as the internet became the preferred way to distribute software.

Mirrors provided a way to have download servers close to users downloading large applications, during a time when the internet wasn’t as robust as it is today and almost all users connected to the internet using dial-up modems with download speeds of 56 kbit/s or slower. In those days, downloading something like an office productivity suite could be an all-day affair.

Mirrors remain the way that most Linux distributions are delivered, with users being offered a list of mirror sites, often maintained by colleges or universities, so they can choose one close to them instead of having their software downloaded from the other side of the continent or from overseas.

Until now, the Apache Foundation has distributed software in a similar fashion, although it had automated the mirror selection process. Its implementation of mirrors began back in the 1990s when it was still known as the Apache Group, long before downloading something as large as a Linux distro was practical in most cases.

“In April 1997, Brian Behlendorf [the current GM at the Linux Foundation’s Open Source Security Foundation] invited 66 people already hosting mirrors to join the ‘mirror@’ Apache mailing list,” Joe Brockmeier, Apache’s VP of marketing and publicity, wrote in a blog post announcing the change. “In June of the same year users could automatically be directed to a local mirror by a CGI script that would select the right mirror based on their country code.”
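To make the idea of country-code mirror selection concrete, here is a minimal sketch in Python. It is not the original Apache CGI script, which the announcement does not reproduce; the mirror hostnames, the `MIRRORS_BY_COUNTRY` table, and the `select_mirror` function are all hypothetical names chosen for illustration.

```python
from typing import Mapping

# Hypothetical mirror table keyed by two-letter country code.
# These hostnames are placeholders, not real Apache mirrors.
MIRRORS_BY_COUNTRY: Mapping[str, str] = {
    "de": "https://mirror.example.de/apache/",
    "jp": "https://mirror.example.jp/apache/",
    "us": "https://mirror.example.us/apache/",
}

# Hypothetical default used when no mirror is listed for the country.
FALLBACK_MIRROR = "https://mirror.example.org/apache/"


def select_mirror(
    country_code: str,
    mirrors: Mapping[str, str] = MIRRORS_BY_COUNTRY,
    fallback: str = FALLBACK_MIRROR,
) -> str:
    """Return the mirror URL for a country code, or the fallback if none is listed."""
    return mirrors.get(country_code.strip().lower(), fallback)
```

Under these assumptions, a visitor whose country code resolves to "DE" would be sent to the German placeholder mirror, while a visitor from a country with no entry would get the fallback.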

Brockmeier said that by 2002 an Apache mirror site needed to allocate 10 GB of space to handle all the software available for download from ASF.

“Today, that 10GB has grown to more than 180GB for a mirror to carry all ASF software,” he added.

Why a CDN Is a Better Solution

For a variety of reasons, mirrors are becoming the clunky, old-fashioned way of distributing software. Not only did dial-up internet connections largely disappear nearly 20 years ago, but the rise of cloud computing and affordable CDNs such as Cloudflare has also opened up a much more efficient way of bringing servers closer to users at the edge.

“The industry has changed…,” Brockmeier said. “Technology has advanced, bandwidth costs have dropped, and mirror systems are giving way to content delivery networks.”

“After discussion and deliberation,” he added, “the ASF’s Infrastructure team has decided to move our download system to a CDN with professional support and a service level appropriate to the foundation’s status in the technology world.”

Almost all websites these days, especially those with a large amount of traffic, use the services of a CDN. Visitors don’t directly connect with the website’s servers, but with the site’s CDN, which acts as a proxy. Since the CDN has servers located in data centers across the globe, in almost all cases this results in a much shorter jump through cyberspace for the visitor, meaning a quicker response.

It also means less load on the website’s IT infrastructure. If the CDN has a cached copy of the requested page on hand, it sends that without involving the site’s servers. If it doesn’t have a cached copy of the page, it grabs it from the site’s server to pass on, while keeping a copy on file for a specified amount of time (usually an hour) to satisfy any future requests for that particular page.
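The hit-or-miss behavior described above can be sketched as a small time-limited cache. This is a simplified illustration in Python, not how any particular CDN is implemented; the `EdgeCache` class, its `fetch_from_origin` callback, and the one-hour default lifetime are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of a single CDN edge server's cache (illustrative only)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 3600.0,  # assumed default: keep copies for one hour
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a request path to (expiry time, cached body).
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None:
            expires_at, body = entry
            if now < expires_at:
                # Cache hit: answer without contacting the origin server.
                return body
        # Cache miss or expired copy: fetch from the origin and keep a copy.
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body
```

In this sketch, the first request for a path reaches the origin, repeat requests within the lifetime are served from the stored copy, and a request after the lifetime has passed triggers a fresh fetch.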

The process for software downloads from ASF will be much the same going forward.

Now that ASF is using a CDN, someone wanting to download the latest and greatest version of the Apache Web Server or Hadoop will in most cases automatically be served the download from the CDN server closest to them. For ASF, it means the organization will no longer have to deal with the operators of hundreds of different mirror sites, or worry about the security precautions these mirror sites might (or might not) have in place.

No Changes for Project Maintainers or Users

“Our new delivery system is part of a global CDN with economies of scale and fast, reliable downloads around the world,” Brockmeier said. “We expect ASF users will see faster deployment of software, without any lag that one might usually see with a mirror system while local mirrors sync off the main instance.”

ASF project developers and maintainers also won’t have anything new to learn.

“ASF projects won’t see any difference in their workflow, just a faster delivery of open source artifacts to their users,” he said.
