HTTP2 is an upgrade to the HyperText Transfer Protocol (HTTP), which was originally created by Tim Berners-Lee at the advent of the World Wide Web. HTTP2 retains much of the same functionality as its predecessor, namely its verb system (GET, POST, PUT, DELETE) and its stateless nature, and it still runs on top of [TCP][1] and [IP][2]. If you're interested in reading the original draft for HTTP1, you can find it [here][3].

Furthermore, HTTP2 makes some drastic improvements that address inefficiencies found in HTTP. One of these core improvements is called Multiplexing. HTTP (as mentioned earlier) transfers data between IP addresses over a TCP connection, and in HTTP1 each connection handles a single request at a time. This model fit the original requirements for the web, which was primarily used to help scientists exchange academic literature remotely, unconstrained by their geographical location. Conceptually, the web included three foundational components: a uniform resource locator (URL), a mechanism to transmit data between these locators (HTTP) and a simple markup language to build the documents which were to be sent.

How Multiplexing helps

The issue with this is that web usage has transformed from the simple transmission of academic papers into complex web applications with rich user interfaces and computational abilities. HTTP was not originally built to handle such requirements; however, the tenacious global developer community has consistently found ways around these issues using innovative techniques such as 'image sprites', which combine several images into one file so that they can be fetched with a single HTTP request.

HTTP2 uses multiplexing to reduce the need for optimisation techniques such as image sprites. To see why, it helps to look at the underlying mechanism which multiplexing uses. ![Multiplexing diagram][4]

As shown in the above diagram, various data signals are fed into a multiplexer (also referred to as a MUX). These signals are multiplexed into one channel, sent, and then demultiplexed at the other end to reassemble the individual signals. In HTTP2, the individual requests are split up into frames, each of which carries the identifier of the stream it belongs to (defined in the frame's header). This is done so that frames from different streams can be interleaved on the network and then reassembled into complete streams.
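The framing idea described above can be sketched in a few lines of Python. This is only a conceptual illustration, not the real HTTP2 wire format: the function names, the four-byte frame size and the stream identifiers 1, 3 and 5 are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, payload, frame_size):
    """Cut one message into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def multiplex(frame_lists):
    """Interleave frames from several streams onto one channel, round-robin."""
    channel = []
    for group in zip_longest(*frame_lists):
        # zip_longest pads finished streams with None, so skip the padding.
        channel.extend(frame for frame in group if frame is not None)
    return channel


def demultiplex(channel):
    """Group frames by stream id and join the chunks back into messages."""
    streams = defaultdict(bytes)
    for stream_id, chunk in channel:
        streams[stream_id] += chunk
    return dict(streams)


# Hypothetical payloads: an HTML page, a stylesheet and a script.
messages = {1: b"<html>page</html>", 3: b"body{margin:0}", 5: b"console.log(1)"}
frames = [split_into_frames(sid, data, 4) for sid, data in messages.items()]
channel = multiplex(frames)
reassembled = demultiplex(channel)
```

Because every frame carries its stream identifier, the receiver can rebuild each message even though the frames of different streams arrive mixed together on the single channel.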

There are several implementations of multiplexing which differ based on what the system requires. For example, analog-based signals can be multiplexed using 'Frequency Division Multiplexing', which divides the available frequency range into bands, one per signal, to be sent across a channel. Time Division Multiplexing, by contrast, sends several input streams across a channel in different allocated time slots. However, with HTTP2 the decision of how frames from different streams are scheduled on the connection is left up to the implementation, such as the server.

NOTE: in HTTP2, a stream carries one HTTP request together with its response.

HTTP1 requests were more expensive

Before HTTP2, requests were far more expensive. For example, imagine being restricted to collecting one item from a shop every time you visited; it would quickly become very frustrating! If the average number of items bought during a visit to a local supermarket was 30, this would mean thirty visits to the shop. Wouldn't it be far easier to make one visit and bring all thirty items home at once? It would not only save you time, but it would also allow you to make far fewer visits and only travel when necessary. In the same way, HTTP was initially built to share simple documents and thus had no need for more sophisticated data exchange methods such as Multiplexing. With the arrival of Multiplexing, however, that one efficient visit to the shop is now possible.
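To put rough numbers on the shopping analogy, here is a deliberately simplified model. It assumes a single connection, a fixed round trip of 50 milliseconds, and it ignores transfer time, connection setup and the parallel connections that browsers open in practice; the function names and figures are hypothetical.

```python
def sequential_wait_ms(request_count, round_trip_ms):
    """One request at a time: every item costs a full trip to the shop."""
    return request_count * round_trip_ms


def multiplexed_wait_ms(request_count, round_trip_ms):
    """All requests share one trip, so the wait is a single round trip."""
    return round_trip_ms if request_count else 0


items = 30          # the thirty supermarket items from the analogy
round_trip_ms = 50  # assumed round-trip time, purely illustrative

one_at_a_time = sequential_wait_ms(items, round_trip_ms)
single_visit = multiplexed_wait_ms(items, round_trip_ms)
```

Under these assumptions the thirty separate visits spend 1500 milliseconds waiting on round trips, while the single multiplexed visit spends 50.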

For those interested in the technical implementation of Multiplexing, there are (as mentioned earlier) several methods, such as Frequency-Division Multiplexing (FDM) and Time-Division Multiplexing (TDM). There are more methods; however, they will not be covered in this article, as FDM and TDM are sufficient for a conceptual technical overview.

Frequency-Division Multiplexing

FDM takes various input signals and joins them together by sending them at different frequencies. For example, if a network were to have a bandwidth (here, the range of frequencies the channel can carry) of 2000kHz, then this bandwidth could be divided between the input signals. This means 10 input signals, each needing 200kHz, could be sent across the network through FDM, since 10 × 200kHz = 2000kHz. The input signals first go through the Multiplexer, which shifts each one to its own frequency band and combines them into a single signal that is sent across the network channel. This is perhaps less complex than it sounds; remember using a radio at home to tune into different frequencies? Radio uses FDM to allow different streams of data (in this case, a morning news channel or a sport channel) to be accessed at different frequencies.
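The arithmetic in the example above can be written out as a short sketch. It assumes an even split with no gaps between bands (real systems typically leave guard bands between neighbours), and the function name is made up for illustration.

```python
def allocate_fdm_bands(total_bandwidth_khz, signal_count, start_khz=0):
    """Divide a channel's bandwidth evenly into one (low, high) band per signal."""
    band_width = total_bandwidth_khz / signal_count
    return [(start_khz + i * band_width, start_khz + (i + 1) * band_width)
            for i in range(signal_count)]


# 2000kHz shared between 10 signals gives each signal a 200kHz band.
bands = allocate_fdm_bands(2000, 10)
first_band = bands[0]   # (0.0, 200.0)
last_band = bands[-1]   # (1800.0, 2000.0)
```

Tuning a radio is then simply a matter of picking which of these bands to listen to.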

Time-Division Multiplexing

TDM works in two distinct ways, namely 'Synchronous' TDM and 'Asynchronous' TDM. Synchronous TDM can be described in three stages. Various low bit-rate streams are fed into the multiplexer, where they are multiplexed. The streams are then sent across the channel in an ordered manner, with each individual signal allocated a certain time slot in which it will be transported. Finally, they are demultiplexed at the other end. It is important to note that in synchronous TDM, the inputs are spread across the channel according to a set schedule. This means that regardless of the amount of data inside each stream, they will all be given a fixed duration on the channel. Of course, this has some potential performance implications, as some streams may contain no data but will still be allocated a slot on the channel. This can be quite wasteful, so synchronous TDM is more beneficial in networks which have predictable or fixed demands. For example, if an engineer knows that each stream produces data at the same steady rate (say 30 kbit/s), then sending them synchronously will ensure that the bandwidth of the network is being utilised effectively.
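A minimal sketch of synchronous TDM, assuming three hypothetical input streams and one data unit per time slot, shows where the wasted slots come from. The names and data are invented for the example.

```python
def synchronous_tdm(streams, rounds):
    """Give every stream one slot per round, even if it has nothing to send."""
    queues = [list(stream) for stream in streams]
    channel = []
    for _ in range(rounds):
        for index, queue in enumerate(queues):
            # None marks a slot that was reserved but carried no data.
            unit = queue.pop(0) if queue else None
            channel.append((index, unit))
    return channel


# Stream 0 is busy, stream 1 has one unit, stream 2 is idle.
sync_streams = [["A1", "A2", "A3"], ["B1"], []]
sync_channel = synchronous_tdm(sync_streams, rounds=3)
wasted_slots = sum(1 for _, unit in sync_channel if unit is None)
```

Here nine slots are used to carry four units of data, so five slots are wasted on streams that had nothing to send.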

On the other hand, in Asynchronous TDM each stream is assigned time on the channel dynamically, based on demand. In other words, a stream is allocated more of the channel's time when it has more data to send. This is more flexible than synchronous TDM, as it gives more transmission time to the streams that require more and less to those which require less.
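As a rough sketch of asynchronous TDM, the scheduler below hands out slots only to streams that currently have data waiting. The stream contents and function name are hypothetical, and each slot is tagged with the index of its stream, since the slot position alone no longer tells the receiver where the data came from.

```python
def asynchronous_tdm(streams):
    """Hand out slots on demand: only streams with data waiting get one."""
    queues = [list(stream) for stream in streams]
    channel = []
    while any(queues):
        for index, queue in enumerate(queues):
            if queue:
                channel.append((index, queue.pop(0)))
    return channel


# Stream 0 is busy, stream 1 has one unit, stream 2 is idle.
async_streams = [["A1", "A2", "A3"], ["B1"], []]
async_channel = asynchronous_tdm(async_streams)
```

Four units of data now occupy exactly four slots, and the busy stream ends up with most of the channel's time.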

Looking at this from a software development perspective, this means that images, JavaScript files and CSS files, alongside several other types of hypermedia, can be retrieved from servers with fewer round trips and a lower overall load time. So you can now load webpages faster and more efficiently, often without having to utilise several industry-standard optimisation techniques (image sprites, file concatenation, etc).

There are several resources online for learning about the technical implementations of Multiplexing in more detail; this article simply covers the basic concepts needed to understand this helpful new feature.


Prioritization is something that is included within Multiplexing. The basic concept is that each HTTP request can advertise its priority to the server, indicating the order in which the client would like the responses to be sent back. This is very useful, as resources can be sent back according to the priority set by the client. For example, it is now possible to ask a server to send back a certain image before another, and similarly with CSS files, JavaScript files, etc.
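One way to picture prioritization is as a priority queue on the server. The sketch below is a simplification and not how any particular server implements it: the file names and weights are invented, a higher weight is assumed to mean "send me first", and the stream dependencies that HTTP2 also allows are left out.

```python
import heapq


def order_by_priority(requests):
    """Return resource names in the order a priority-aware server might send them.

    `requests` is a list of (name, weight) pairs in arrival order.
    Higher weights go first; ties are broken by arrival order.
    """
    heap = [(-weight, arrival, name)
            for arrival, (name, weight) in enumerate(requests)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Hypothetical page: the stylesheet matters most, the images least.
requests = [("hero.jpg", 16), ("styles.css", 256), ("app.js", 128), ("footer.png", 16)]
send_order = order_by_priority(requests)
```

With these weights the stylesheet is sent first, then the script, then the two images in the order they were requested. In practice the priority is a hint, and a server is free to schedule responses differently.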

There is far more which could be covered in detail, but this is simply an overview. What is hopefully clear now is that multiplexing enables developers to load multiple resources via one channel, which can save a great deal of time. It also strengthens the relationship between the client and the server, as clients can now ask more specific questions of the server.

I hope you've found this useful and insightful :)

Here are some references: