Composed of the prefix "hyper" (extreme, greater than usual) and the word "scale" (to change size), the term hyperscale represents the IT industry's gold standard for scalability and availability. Data centers with such infrastructure run the world's most taxing workloads seamlessly and cost-effectively, meeting the demands of use cases that cannot run in any other type of facility.
This article explores the core concepts of hyperscale data centers and shows what sets them apart from traditional hosting facilities. We also explain how hyperscale systems work and walk you through the main characteristics (as well as benefits) of hyperscale computing.
Hyperscaling is pushing the limits of what organizations expect from data centers, but this tech is not the only one gaining traction. Get a clear sense of what's shaping the industry in our articles on current data center trends and market analysis.
What Is Hyperscale?
Hyperscale is the capability of an IT architecture to scale automatically, in real time, and with negligible latency in response to increasing or decreasing demand. Such infrastructure runs on tens of thousands of identical servers that activate and deactivate automatically to match current requirements.
The main idea behind hyperscale systems is to deliver the most efficient and cost-effective hosting environment for the most demanding sets of IT requirements. Such infrastructure scales almost instantly, not just from a single server to a few but from several hundred to thousands. This capability is vital for hosting fluctuating, processing-intensive services such as:
- Cloud computing.
- Video streaming.
- Social media.
- Large-scale apps based on artificial intelligence and machine learning.
- Online gaming.
Hyperscale computing relies solely on horizontal scaling (or "scaling out"): the system adds more same-sized servers to the cluster and spreads the workload across a bigger pool of devices to meet increased demand. This strategy differs from vertical scaling (or "scaling up"), in which you improve a single machine's specifications to boost its performance (e.g., add more memory or a faster CPU to a server).
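As a rough illustration of the difference, here is a minimal Python sketch. The per-server throughput and the upgrade cap are made-up numbers, not figures from any real hyperscale deployment.

```python
# Rough sketch of how total capacity grows under the two strategies.
# BASELINE_RPS and the 4x upgrade cap are illustrative assumptions.

BASELINE_RPS = 5_000  # requests per second one commodity server handles (assumed)

def scale_out(node_count: int) -> int:
    """Horizontal scaling: add identical servers; capacity grows linearly."""
    return node_count * BASELINE_RPS

def scale_up(upgrade_factor: float, hardware_cap: float = 4.0) -> int:
    """Vertical scaling: upgrade one machine; capped by what the hardware allows."""
    return int(BASELINE_RPS * min(upgrade_factor, hardware_cap))

print(scale_out(10_000))  # 50,000,000 rps across a fleet of identical nodes
print(scale_up(8.0))      # 20,000 rps - the single box hits its hard cap
```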
So why scale horizontally and not vertically? Here are the main reasons:
- Using identical servers standardizes operations and simplifies day-to-day management. All devices have the same updates, security patches, operating systems, etc.
- Every machine has a hard "cap" for vertical scaling, beyond which further upgrades are impossible.
- Small, cheap servers are more cost-effective in the long run due to economies of scale.
- Relying on the same servers leads to more consistent performance.
- Horizontal scaling does not require admins to take machines offline to upgrade them.
- A network of identical servers easily avoids downtime, as another node takes over the workload if one machine goes down.
Our horizontal vs vertical scaling article compares the two techniques in depth and helps you choose the right scalability strategy for your use case.
What Is a Hyperscaler?
A hyperscaler is the owner and operator of one or more data centers that house horizontally linked servers required for hyperscaling.
The most prominent hyperscalers on the market are the three leading public cloud providers (AWS, Microsoft Azure, and Google Cloud). Massive companies like Facebook and Apple also own facilities that run services in a hyperscale fashion.
Here are the main differences between hyperscalers and regular providers:
- Hyperscale data centers house tens of thousands of servers and petabytes of data storage. In comparison, standard data centers and server rooms host only a few hundred to a few thousand servers on average.
- A hyperscale provider has a lower cost structure due to economies of scale and the use of commodity hardware. These facilities rely on cheaper servers instead of the more complex and expensive equipment found in traditional data centers.
- Hyperscalers have lower per-server power consumption thanks to energy-efficient designs and advanced cooling systems.
- Whereas regular data centers rely heavily on manual provisioning and resource management, a hyperscaler employs a high degree of automation for provisioning, monitoring, and day-to-day operations.
- Regular data centers offer less flexibility with on-demand services and require longer lead times for changes.
- Hyperscalers invest significantly more in ensuring high levels of redundancy and availability.
- On average, a hyperscaler employs fewer IT staff members due to high levels of automation. The number of security team members often exceeds that of the computing staff.
- Hyperscalers rely on standardized, modular designs that enable easy expansion and upgrades. Regular facilities use custom-designed solutions that make expansion and upgrades difficult and time-consuming.
While only a handful of organizations qualify as hyperscalers, some technologies pioneered at these facilities are increasingly making their way into smaller data centers, including:
- Software-defined networking (SDN).
- Converged infrastructure.
- Micro-segmentation.
While relatively few in number, hyperscalers process over 80% of all public cloud workloads.
How Does Hyperscale Computing Work?
Hyperscale computing groups tens of thousands (or more) of small, simple servers and networks them horizontally. "Simple" does not mean primitive, though, only that the servers follow a few basic conventions (e.g., shared network protocols) that make them:
- Easy to network and manage.
- Highly responsive and able to meet fluctuating capacity demands.
- More fault tolerant as a group.
These servers run apps in virtual machines (VMs), computing environments that rely on software-defined resources instead of dedicated hardware. One server can host multiple VMs and enable each to run independently, which allows workloads to move between physical machines without disruption or slowdowns.
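The toy model below shows the host/VM relationship described above. It is purely illustrative; the class names and the migrate helper are hypothetical and do not correspond to any real hypervisor's API.

```python
from dataclasses import dataclass, field

@dataclass
class VirtualMachine:
    workload: str              # e.g., "video-streaming"

@dataclass
class PhysicalServer:
    name: str
    vms: list = field(default_factory=list)

def migrate(vm: VirtualMachine, source: PhysicalServer, target: PhysicalServer) -> None:
    """Because the VM depends only on software-defined resources, moving it is
    just a matter of detaching it from one host and attaching it to another."""
    source.vms.remove(vm)
    target.vms.append(vm)

host_a = PhysicalServer("host-a")
host_b = PhysicalServer("host-b")
video_vm = VirtualMachine("video-streaming")
host_a.vms.append(video_vm)

migrate(video_vm, host_a, host_b)  # the workload keeps running, now on host-b
print([vm.workload for vm in host_b.vms])
```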
Every hyperscale network includes a load balancer that constantly reallocates computing, storage, and network resources. This device manages all incoming requests to the network and routes them to the servers with the most available capacity. The balancer continuously monitors the load on each server, switching them on or off based on the amount of data that currently needs processing:
- If the load balancer detects an increased demand for a workload, it adds servers to the current dedicated pool.
- Once the demand goes down, the balancer removes servers from the pool, either shutting them down or reassigning them to another workload.
This process happens in real-time to maximize cost-effectiveness (both for users and the owner of the facility).
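To make that loop concrete, here is a minimal Python sketch of the route/rebalance cycle. The Server and WorkloadPool classes, the load thresholds, and the pool sizes are all illustrative assumptions rather than details of any real hyperscaler's load balancer.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    load: float = 0.0          # fraction of capacity in use, 0.0 to 1.0

@dataclass
class WorkloadPool:
    active: list = field(default_factory=list)    # servers currently serving traffic
    standby: list = field(default_factory=list)   # powered-down or reassignable servers

    def average_load(self) -> float:
        return sum(s.load for s in self.active) / len(self.active)

    def route(self, request_cost: float) -> Server:
        """Send the request to the active server with the most spare capacity."""
        target = min(self.active, key=lambda s: s.load)
        target.load += request_cost
        return target

    def rebalance(self, scale_out_at: float = 0.75, scale_in_at: float = 0.30) -> None:
        """Add a standby server under heavy load; release one when demand drops."""
        avg = self.average_load()
        if avg > scale_out_at and self.standby:
            self.active.append(self.standby.pop())
        elif avg < scale_in_at and len(self.active) > 1:
            self.standby.append(self.active.pop())

# Example: three busy nodes plus one standby node that joins when demand spikes.
pool = WorkloadPool(
    active=[Server("node-1", 0.8), Server("node-2", 0.7), Server("node-3", 0.9)],
    standby=[Server("node-4")],
)
pool.route(request_cost=0.05)  # lands on node-2, the least-loaded server
pool.rebalance()               # average load is above 75%, so node-4 is activated
print([s.name for s in pool.active])
```

In a real hyperscale environment this decision-making is fully automated and happens continuously across thousands of nodes, but the underlying route-then-rebalance pattern is the same.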
Automation is a massive part of hyperscale computing since it is impossible to manually orchestrate thousands of servers that often span more than one facility. Hyperscale systems also require top-tier networking to enable such a highly distributed and scalable architecture. An ultra-high-speed, high-fiber-count network connects servers, load balancers, and all interlinked data centers.
Learn the difference between orchestration and automation, two overlapping practices that enable effective hyperscaling.
What Is a Hyperscale Data Center?
A hyperscale data center is a facility that houses the equipment for hyperscale computing. In 2021, the number of hyperscale data centers worldwide stood at 728. Experts predict this figure will reach the 1,000 mark by 2026.
While there are no official criteria, an average hyperscale facility is:
- At least 10,000 sq ft in size (although there are significantly larger facilities, such as Microsoft's Northlake data center at 700,000 sq ft or Apple's Mesa facility that spans 1.3 million sq ft).
- Home to at least 5,000 dedicated servers.
- Storing hundreds of petabytes (PB) of data.
- Providing at least 40 Gbps network connections.
- Drawing over 50 MW of power.
Most hyperscale systems operate in a unified network of facilities, not in one building. These fleets of data centers run as highly connected clusters. Some centers are next door to one another, while others are thousands of miles apart. This geographic spread enables companies to:
- Lower the impact of localized power outages and cyberattacks.
- Serve all customers from a nearby facility to ensure quick response times.
Here are the characteristics of hyperscale data centers:
- Size: Hyperscale data centers are massive facilities that often house tens of thousands of servers.
- Scalability: Equipment within hyperscale data centers has one primary goal: to scale as far and as fast as possible.
- Highly modular design: These facilities rely on stripped-down hardware that allows for easy expansion.
- Lower prices: These facilities use economies of scale to offer services at a lower cost than what a regular data center would charge for the same resources.
- Reach: Hyperscale data centers are always a part of global networks, providing access to resources from anywhere in the world.
- Automation: These data centers employ a high degree of automation when provisioning, monitoring, and managing resources.
- Redundancy: Facilities employ multiple layers of redundancy to ensure high service reliability.
Here are a few more numbers that demonstrate the sheer size of these facilities. A Microsoft hyperscale data center in Singapore contains enough concrete to build a sidewalk from London to Paris (about 215 miles).
Another Microsoft facility in Quincy houses around 24,000 miles of network cable, almost equal to the Earth's circumference (roughly 24,901 miles).
Hyperscale Benefits
If you've got a suitable use case, hyperscale computing offers a range of benefits you won't get from other hosting solutions. Here are the main advantages of relying on hyperscale computing:
- There are no realistically achievable upper caps for scaling, so there's no risk of running out of resources in times of high demand.
- End-users never experience overly long loading times or downtime due to top-tier redundancy that automatically self-heals the system in case of an error.
- Scaling occurs automatically based on current requirements, so there's no need to constantly manage the environment and adjust resources manually.
- Hyperscale computing uses economies of scale to reduce infrastructure, power, and cooling costs. If you want to outsource hyperscale services, expect better terms in your service level agreement (SLA) than what you'd get from a typical data center.
- The ability to scale out and back in as demand changes ensures you avoid any unnecessary overhead.
- High levels of automation free the in-house team from maintaining and upgrading IT systems. Organizations free up internal resources for other business avenues, such as innovation and generating revenue.
- You get access to a wide range of on-demand computing resources (storage, processing power, network bandwidth, etc.). The team quickly deploys new apps and services without the constraints of traditional computing infrastructure.
- Since hyperscalers house more servers than typical data centers, these facilities distribute workloads across more devices to avoid overheating problems. Workloads tend to be far more balanced than in legacy hosting environments.
- Hyperscale computing easily handles the advanced processing challenges of cutting-edge tech such as AI, ML, and IoT.
- While all the servers in a hyperscale system are identical, the VMs running on them are not. Users select the operating system and preferred programming language, so teams can create a custom environment that fits their use case.
The positives of hyperscaling are also evident in market growth. The global market for this type of computing is projected to reach $593 billion by 2030 (up from $62 billion in 2021), a compound annual growth rate (CAGR) of roughly 28.42%.
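As a quick sanity check, the quoted growth rate follows directly from those two figures (the small difference comes from rounding in the underlying estimates):

```python
# Back-of-the-envelope check of the quoted CAGR using the figures above.
start_value = 62     # market size in 2021, in $ billion
end_value = 593      # projected market size in 2030, in $ billion
years = 2030 - 2021

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"{cagr:.2%}")  # about 28.5%, in line with the quoted 28.42% CAGR
```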
Too Much for Most Use Cases, but the Only Hosting Option for Some
Hyperscale is an expensive and complex technology that most organizations cannot afford or would not benefit from (usually both).
However, specific large-scale use cases like cloud services or social media can only operate efficiently with hyperscale computing. No other setup meets their incredible scalability requirements, so expect to see more hyperscaling as the world gets more connected and companies become more comfortable with data center outsourcing.