Increase Speed & Reliability: Our Multi-CDN Strategy
Significant Speed Improvements with Wiredrive
We recently completed a multi-month project to increase the speed and reliability of uploads and downloads to Wiredrive. The goal of the project was to speed up upload times by 20% from locations outside of the United States. By analyzing customer traffic patterns, our current vendors, and internet peering agreements, we were able to reach that goal.
How a Traditional CDN Works
A CDN is a collection of servers spread throughout the globe optimized to serve content from servers nearest to the user. They are relied on by major organizations worldwide to serve most of the Internet content we see today, including text, software, graphics, and video. The goal of a CDN is to have the content sent from the nearest server with the highest performance and availability to ensure the fastest delivery.
The success of a CDN is entirely dependent upon the peering relationships it has with end users’ ISPs. As ISPs have become commoditized and merged into just a few large companies, peering between these ISPs and CDN vendors has become well established and stable. The ISPs have optimized their networks for the most common traffic pattern: downloads. They have direct connections, or peering agreements, to the major CDNs, which means the traffic does not pass over the public internet. The result is smooth playback, fast downloads, and quick page loads.
This graph shows global download performance from our CDN provider. Performance is very level and predictable.
The Reality of CDNs
With downloading and peering solved, everything should load fast all the time. But that’s not reality. Looking at the same data at a higher resolution and applying some basic math tells a very different story. The same CDN shows a great deal of inconsistency: spikes recur throughout the month and peak at over 10x the average shown above.
Digging in a bit more, we see that the inconsistency varies by geo-region, with two cities in particular standing out: Miami and London. With a single CDN, there isn’t much we can do to optimize download speeds for our customers. We are dependent on the peering relationships of that one CDN.
The most obvious solution is to bring on more CDNs. When we started this project, we concentrated on the geo-locations and ISPs our customers are using. We didn’t want to sign up with the largest providers and call it a day. We analyzed our upload and download traffic patterns, and the ISPs in between our data centers and our end users. After lots of trial and error, the final strategy combined dedicated data centers, regional CDN providers, and far-reaching global CDNs.
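The post does not say which math was applied, but one common way to expose spikes that an average hides is to compare the mean against a high percentile and the peak. The sketch below assumes hypothetical latency samples in milliseconds; the function name and the data are illustrative only, not our actual measurements.

```python
from statistics import mean, quantiles


def summarize(latencies_ms):
    """Return the average, 95th percentile, peak, and peak-to-average ratio."""
    avg = mean(latencies_ms)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = quantiles(latencies_ms, n=100)[94]
    peak = max(latencies_ms)
    return {"avg": avg, "p95": p95, "peak": peak, "peak_to_avg": peak / avg}


# Hypothetical samples: mostly steady responses with a handful of large spikes.
samples = [100] * 95 + [1500] * 5
stats = summarize(samples)
print(stats)
```

An hourly or daily average over samples like these looks flat, while the percentile and peak figures make the spikes visible.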
Selecting a CDN in Real Time
The hardest part of this entire project was deciding when and where to use each CDN. Doing it manually was out of the question, so we needed a reliable, automated way to select the CDN. We also needed to make sure we were picking the fastest CDN, and the fastest CDN changes all the time. There are a few ways to measure this, so let’s look at each.
Synthetic Tests
Synthetic tests run a set of predefined tests from dedicated servers in major data centers. These tests are great because they cut out the end user, or last mile, delivery. They check everything between our data centers and the users’ ISPs, but not down to the browser. These tests show an idealized connection and can find major slowdowns and outages, but won’t reflect what the end user is experiencing.
Last Mile Tests
Last mile tests run a set of predefined tests from dedicated machines on popular ISPs. The machines are not in a data center, but in homes and offices. These tests are great, but also frustrating, because so many unknowns come into play that it’s hard to make a real decision. They do, however, clearly demonstrate the huge variance between last mile results and the synthetic tests.
Real User Testing (RUM)
RUM testing is done in the end user’s browser. It captures the load time based on timing in the browser. This is a great source of data, but for Wiredrive it wouldn’t work, since every user sees different content. Normalizing the data based on the number of files, presentations, etc. was more work than it was worth.
Cedexis collects all of this data, decides which CDN to use, and updates the DNS record that serves our customers’ data. Pretty simple, and very effective.
Below is a graph of the decisions, showing which CDN is fastest. It’s pretty clear that the fastest CDN is always changing, and that two are performing much better than the others. These two CDNs are picked at the same time by different geo-regions, which confirms the data we saw above: a single CDN can be very inconsistent.
Narrowing down the decisions to just London tells a very different story. The CDN that was dominant in the global scope is not being selected here at all.
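To make the decision step concrete, here is a minimal sketch of choosing the fastest CDN per region from collected measurements. It is not how the vendor's service works internally; the median-based rule, the function name, and the region and CDN labels are all assumptions for illustration, and the DNS update itself is left out.

```python
from collections import defaultdict
from statistics import median


def pick_fastest(measurements):
    """Map each region to the CDN with the lowest median latency.

    measurements: iterable of (region, cdn, latency_ms) tuples.
    """
    by_region = defaultdict(lambda: defaultdict(list))
    for region, cdn, latency_ms in measurements:
        by_region[region][cdn].append(latency_ms)

    decisions = {}
    for region, per_cdn in by_region.items():
        # The median is less sensitive to a single spike than the mean.
        decisions[region] = min(per_cdn, key=lambda cdn: median(per_cdn[cdn]))
    return decisions


# Hypothetical measurements; names and numbers are made up.
measurements = [
    ("london", "cdn-a", 180), ("london", "cdn-a", 900), ("london", "cdn-a", 200),
    ("london", "cdn-b", 120), ("london", "cdn-b", 130),
    ("miami", "cdn-a", 90), ("miami", "cdn-b", 140),
]
decisions = pick_fastest(measurements)
print(decisions)
```

Because the choice is made per region, two different CDNs can be selected at the same moment, which matches the behavior described in this post.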
Tuning the Decisions
Now that we have Cedexis up and running and picking from the various CDNs and data centers, we will start tuning per geographic region and ISP. We are hoping that all this effort results in much faster performance globally for both uploads and downloads.
We will have a follow-up post in a few weeks that analyzes the decisions and what we can now do about them.
While still very new, the multi-CDN strategy is already paying off. Customers are reporting much more consistent upload and download times, and support tickets have dropped off.
We would like to thank all the customers who helped us with this project by providing an endless number of traceroutes, and the CDN vendors who allowed us to run experiments on their networks.
We are always looking for ways to improve the speed of Wiredrive and are dedicated to making downloads and uploads as fast as possible. We know every second we can shave off gives you back valuable time. Please reach out if you would like help.