Systems-At-Scale

Stack Overflow Architecture

image

Network Setup

1. DNS (Domain Name System):

2. ISPs (Internet Service Providers):

3. Edge Routers:

4. Load Balancers:

5. 10Gbps MPLS (Multiprotocol Label Switching) between Data Centers:

Load Balancers

1. Load Balancer Software:

HAProxy 1.5.15 on CentOS 7.

2. TLS Termination:

Managed within HAProxy.

3. Network Connections:

Two pairs of 10Gbps connections for each load balancer, designated for the external network and DMZ.

4. Memory Utilization:

64GB or more in each load balancer to efficiently cache TLS sessions.

5. Session Resumption:

Caching allows for faster and more cost-effective connection resumption.

6. Additional Features:

Includes rate limiting and header captures for performance metrics.

Web Tier

1. Software Stack:

IIS 8.5, ASP.Net MVC 5.2.3, and .Net 4.6.1.

2. Web Servers:

9 “primary” servers (01-09) for main sites and 2 “dev/meta” servers (10-11) for staging.

3. Distribution:

Primary servers handle sites like Stack Overflow and Careers, while meta sites run on the last two servers.

4. Multi-tenant Application:

A single application serves requests for all Q&A sites, allowing the entire Q&A network to run off a single application pool on one server. Other applications like Careers and APIs are separate.

5. Monitoring:

They use Opserver, an internal monitoring dashboard, to observe Stack Overflow’s distribution and utilization across the web tier.

6. Overprovisioning:

The servers are overprovisioned for purposes like rolling builds, headroom, and redundancy.

Service Tier

1. Software and Platform

Similar to the web tier, the service tier runs on IIS 8.5, ASP.Net MVC 5.2.3, .Net 4.6.1, and includes HTTP.SYS, all on Windows 2012R2.

2. Function

It operates internal services to support the web tier and other internal systems.

3. Components

4. Redundancy and Efficiency

Unlike the 9-fold redundancy in the web tier, the service tier emphasizes efficiency and has lesser redundancy. Data loading from the database is done only three times for cost efficiency but still maintains safety.

5. Hardware Configuration

The boxes are configured to be optimized for different computational loads, like the tag engine and elastic indexing jobs.

6. Tag Engine

A crucial component handling all tag matching outside of /search, enabling features like new navigation.

Cache & Pub/Sub

1. Redis Usage

Utilized as a rock-solid system for about 160 billion operations per month, consuming less than 2% CPU.

2. L1/L2 Cache System

3. Site-Specific Caching

Each Q&A site has its own L1/L2 caching system, identified by key prefix and database ID.

4. Main Redis Servers

2 primary servers (master/slave) and additional machine learning instances (for recommendations, matching jobs, etc.), known as Providence.

5. Memory

Main servers have 256GB of RAM (90GB in use); Providence servers have 384GB of RAM (125GB in use).

6. Protobuf Format

Values are stored in Protobuf format, using specific in-house and open-source libraries.

7. Publish & Subscribe Mechanism

Beyond caching, Redis also supports a mechanism for messaging, allowing one server to publish a message and all subscribers to receive it. Used for clearing caches on other servers for consistency and other uses like websockets.

Web Sockets

1. Purpose

Websockets are used to push real-time updates to users, including notifications, vote counts, new answers, comments, and other real-time data.

2. Implementation

The socket servers utilize raw sockets running on the web tier and are built on the open-source library, StackExchange.NetGain.

3. Concurrency

At peak times, there can be about 500,000 concurrent websocket connections open.

4. Efficiency

Compared to polling, websockets are much more efficient at Stack Overflow’s scale, enabling the push of more data with fewer resources, and providing instantaneous updates to users.

5. Challenges

Despite their efficiency, websockets are not without issues, such as the exhaustion of ephemeral ports and file handles on the load.

6. Interesting Fact

Some browsers with websocket connections have been open for over 18 months, leading to humorous speculation about the status of those developers.

1. Usage

Elasticsearch is used for handling search functionality on the site, including performing searches, calculating related questions, and offering suggestions when asking a question.

2. Implementation

The searches are conducted against Elasticsearch 1.4, using a specialized client called StackExchange.Elastic, which exposes only a specific subset of the API Stack Overflow uses.

3. Cluster Configuration

Each Elastic cluster has 3 beefier-than-average nodes, with all SSD storage, 192GB of RAM, and dual 10Gbps network each. Indexing is also continually handled by the same domains that host the tag engine.

4. Reason for Using Elasticsearch

5. Challenges

Upgrading to a newer version of Elasticsearch is hindered by a major change to “types,” requiring a complete reindexing of everything and a migration plan.

Databases

1. Single Source of Truth

SQL Server is the single source of truth, with all data in Elastic and Redis originating from it. This decision simplifies data consistency across different platforms.

2. Cluster Configuration

Two clusters are maintained, each with 1 master and multiple replicas, including one in a disaster recovery data center.

3. Hardware Specification

The servers have substantial RAM and SSD space, which caters to the demanding requirements of the site.

4. Low CPU Utilization

They aim to keep CPU utilization low, though there are current issues related to plan cache that are being addressed. This ensures that the databases run efficiently and have room for spikes in demand.

5. Usage Simplicity

The usage of SQL is deliberately kept simple.

References