What is a Web Browser, and How Does it Work?

In the previous articles, we talked about the various technologies at the heart of the web, and we said that web browsers mix them all together in order to display web documents.

Yeah, that really sounds like voodoo magic! Really?

Okay, so today, let’s talk a bit more about how browsers work and… Let’s kill the witchcraft.

Browsers are very complex pieces of software that have evolved drastically since the 1990s, from rendering simple text to executing fully fledged applications. So what exactly are browsers able to do these days?

First of all, because the web is all about requesting documents stored on the Internet, they are able to handle network connections. Historically, that was done using HTTP to perform simple CRUD (create, read, update, delete) operations on documents. But nowadays they can also create bidirectional communication channels with a server, or even peer-to-peer connections with another browser.
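To make the "simple CRUD operations" a bit more concrete, here is a minimal sketch of the conventional pairing between CRUD actions and HTTP methods. The names `CrudAction`, `httpMethodFor`, and `requestLineFor` are made up for this illustration, and the pairing is a common convention rather than a rule every server follows.

```typescript
// Hypothetical names, for illustration only.
type CrudAction = "create" | "read" | "update" | "delete";

// Conventional (not mandatory) pairing of CRUD actions with HTTP methods.
const crudToHttpMethod: Record<CrudAction, string> = {
  create: "POST",
  read: "GET",
  update: "PUT",
  delete: "DELETE",
};

function httpMethodFor(action: CrudAction): string {
  return crudToHttpMethod[action];
}

// Build the first line of an HTTP/1.1 request for a given action and path.
function requestLineFor(action: CrudAction, path: string): string {
  return `${httpMethodFor(action)} ${path} HTTP/1.1`;
}
```

Asking to read `/index.html`, for instance, gives a request line that starts with `GET`, which is what a browser sends when you simply open a page.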

Second, they have to be able to understand, render and/or execute different languages. They also have to be able to render fonts, images, audio, and videos.

Third, everything you see in a browser must be interactive. You can select text, scroll pages, click on links, input text, drag and drop content, etc.

Fourth, they can store data, either by caching content or providing various storage mechanisms. And all of this must work regardless of the complexity of each page and regardless of the number of pages open at the same time!

This is…very challenging.

Last but not least, because the Internet remains a wild place and because websites can aggregate content from very different sources, things have to be reasonably secure. To do that, browsers must be able to manage data encryption and to sandbox each set of content in order to prevent malicious code from accessing other content or, worse, your computer.
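One well-known ingredient of that sandboxing is the notion of "origin": content loaded from one origin is, by default, kept apart from content loaded from another. Here is a minimal sketch of such a comparison. The function name `isSameOrigin` is an assumption made for this example, it only considers http and https URLs, and real browsers layer many more rules on top of this.

```typescript
// Hypothetical helper: two http(s) URLs share an origin when their
// scheme, host name, and port all match.
function isSameOrigin(urlA: string, urlB: string): boolean {
  const a = new URL(urlA);
  const b = new URL(urlB);
  // The URL parser drops a scheme's default port (for example 443 for https),
  // so an explicit ":443" and an omitted port compare as equal.
  return a.protocol === b.protocol && a.hostname === b.hostname && a.port === b.port;
}
```

With this sketch, two pages on `https://example.com` are treated as the same origin, while the `http` version of the same site, or a different host name, is not.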

And this is just the tip of the iceberg. So, let’s try to keep things simple.

Okay, let’s focus on the specific job of a browser, from a high level: rendering multiple documents to display them to you in an interactive way.

Let’s start with the rendering. One. The browser requests all the necessary documents.

For the sake of example… let’s say we are requesting an HTML document and its associated stylesheets and images.

Two. All documents are parsed and turned into machine-friendly representations. If you do a bit of research here, you’ll hear fancy names like DOM trees, style trees, render trees, and display lists. These are just different ways for browsers to represent documents.

Three. The intermediate representations are used to compute the whole page layout. This is where the “mixing all the things” happens. The browser computes the size, position, and representation of everything that will be displayed on the page.

In the browser industry, senior engineers who actually program layout computations are often considered true magicians because… this is very likely one of the most complicated parts of a browser.

Four. The browser finally paints each element of the layout on the screen.

Ta-da!

So to summarize: the browser reads all the documents; it turns them into several intermediate representations that are easier to use; it computes the position of everything on the page; and it paints each pixel accordingly.
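If you like to see things in code, the parse, layout, and paint steps can be sketched as a toy pipeline. Everything here is invented for the illustration: the "size|text" markup, the names `parseToyDocument`, `layoutToyBlocks`, and `paintToyBoxes`, and the guess that a character is half as wide as it is tall. The sketch also assumes the document has already been requested and received as a string. A real browser engine is vastly more involved.

```typescript
// A parsed block of text with its font size in pixels (toy representation).
interface ToyBlock { text: string; fontSize: number; }

// A block with a computed position and size on the page.
interface ToyBox { x: number; y: number; width: number; height: number; text: string; }

// Parse: each non-empty line of the made-up markup is "<fontSize>|<text>".
function parseToyDocument(source: string): ToyBlock[] {
  return source
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => {
      const separator = line.indexOf("|");
      return {
        fontSize: Number(line.slice(0, separator)),
        text: line.slice(separator + 1),
      };
    });
}

// Layout: stack the blocks vertically and wrap text to the page width.
function layoutToyBlocks(blocks: ToyBlock[], pageWidth: number): ToyBox[] {
  const boxes: ToyBox[] = [];
  let cursorY = 0;
  blocks.forEach((block) => {
    const charWidth = block.fontSize * 0.5; // assumption: glyphs are half as wide as tall
    const charsPerLine = Math.max(1, Math.floor(pageWidth / charWidth));
    const lineCount = Math.ceil(block.text.length / charsPerLine);
    const height = lineCount * block.fontSize;
    boxes.push({ x: 0, y: cursorY, width: pageWidth, height, text: block.text });
    cursorY += height;
  });
  return boxes;
}

// Paint: turn the boxes into a display list of drawing commands.
function paintToyBoxes(boxes: ToyBox[]): string[] {
  return boxes.map((box) => `drawText(${box.x}, ${box.y}, "${box.text}")`);
}
```

Note how the position of the second block depends on the height of the first one: change the page width or a font size, and everything below has to move. That dependency is a small taste of why layout is the hard part.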

Okay, this is roughly what any piece of software displaying content is doing.

The nuance comes from the interactive aspect of the web: everything on a web page is dynamic, meaning it can be changed or animated at will! So if something is moving on the page, the layout may have to be recomputed and the page repainted 30 or 60 times per second, or even more, regardless of the number of elements on the page.
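To get a feel for what those frame rates mean, here is the arithmetic as a tiny sketch. The function names `frameBudgetMs` and `fitsInFrame` are made up for this example.

```typescript
// Time available to produce a single frame, in milliseconds.
function frameBudgetMs(framesPerSecond: number): number {
  return 1000 / framesPerSecond;
}

// Does a given amount of work (in milliseconds) fit within one frame?
function fitsInFrame(workMs: number, framesPerSecond: number): boolean {
  return workMs <= frameBudgetMs(framesPerSecond);
}
```

At 30 frames per second the browser has about 33 milliseconds per frame; at 60 it has under 17. Work that takes 20 milliseconds fits the first budget but not the second.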

Actually, the only other types of software that have to face such a challenge are top-of-the-line triple-A games.

Thanks to that, things can become interactive, and there are two ways to handle interactivity. On the one hand, there are all the native interactions that the browser provides, usually through HTML and CSS. Remember? You can select text, scroll pages, click on links, input text, drag and drop content, etc.

On the other hand, there is what JavaScript lets people do.

Fundamentally, JavaScript lets people build any kind of interactivity by granting them access to the core functionalities of the browser. You know, the ones we listed at the beginning of this article…

Yeah, OK, you get it!

The consequence of all of that is that web authors must learn how browsers work in order to optimize the experience they want to provide to their users.

For example, it is a good idea to learn what triggers a new layout computation versus a simple repaint, because the latter is faster than the former. Unfortunately, it’s hard to get into the details because… each browser works somewhat differently.
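Still, the general idea can be sketched. The rule of thumb below assumes that a change to a purely visual CSS property only needs a repaint, while anything that could affect geometry needs a new layout. The short list of properties and the names `RenderWork`, `workTriggeredBy`, and `heaviestWork` are assumptions made for this illustration; actual behavior varies from one browser to another.

```typescript
type RenderWork = "layout" | "paint";

// Simplified assumption: these properties change how a box looks,
// not how big it is or where it sits.
const paintOnlyProperties = ["color", "background-color", "outline-color", "box-shadow"];

// Anything not known to be paint-only is conservatively treated as layout.
function workTriggeredBy(property: string): RenderWork {
  return paintOnlyProperties.indexOf(property) >= 0 ? "paint" : "layout";
}

// For a batch of changes, a single layout-triggering property is enough
// to require a new layout computation.
function heaviestWork(changedProperties: string[]): RenderWork {
  for (let i = 0; i < changedProperties.length; i++) {
    if (workTriggeredBy(changedProperties[i]) === "layout") {
      return "layout";
    }
  }
  return "paint";
}
```

Under these assumptions, changing only `color` and `box-shadow` stays on the cheap path, but adding a `width` change to the same batch brings the layout computation back.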

Oh Thanks! That’s soooo helpful!

Well!

There are no big secrets, and… browser makers are working hard to make sure that even the crappiest web pages render fast and smoothly. But that’s a topic for another article.

Okay, let’s recap.

Browsers handle a lot of things: networks, content display, data storage, content security… all in a very interactive way.

They handle technical requirements that can be compared to those of super high-end games. And they give content creators the ability to create any form of interactivity. Now you shouldn’t consider the web to be magic. It’s just clever computer science.

Thank you all for visiting this article.
