
Critical CSS Rules

Decreasing time to first render by inlining CSS rules for over-the-fold elements

Gorjan Jovanovski

hey@gorjan.rocks

July 2016, 41 pages

Supervisor: Dr. Vadim Zaytsev

Host organisation: The Next Web, http://thenextweb.com

Universiteit van Amsterdam


Contents

Abstract
1 Introduction
  1.1 Problem statement
  1.2 Research questions
  1.3 Use cases
    1.3.1 Caching
    1.3.2 Content Delivery Networks
    1.3.3 Resolution mismatch
  1.4 Solution outline
2 Background
  2.1 Cascading Style Sheets
    2.1.1 Media queries
  2.2 Web page rendering
    2.2.1 Time to first render
  2.3 Critical path CSS
3 Related work
  3.1 Academic Research
  3.2 CSS prefetch and preload
    3.2.1 Prefetch
    3.2.2 Preload
  3.3 Content first
  3.4 Manual extraction
  3.5 Automatic extraction
    3.5.1 Server-side modules
    3.5.2 Node.js modules
4 Impact of external CSS files on time to first render
  4.1 Research Method
  4.2 Results
    4.2.1 Time to first render
    4.2.2 Screen resolution and document height
  4.3 Analysis
5 Methods used by existing tools
  5.1 Research method
  5.2 Results
    5.2.1 Penthouse
    5.2.2 CriticalCSS
    5.2.3 Critical
  5.3 Analysis
6 Focusr
  6.1 Configuration possibilities
  6.2 Algorithm
    6.2.1 CSS detection and extraction
    6.2.2 Over-the-fold elements
    6.2.3 Inlining
  6.3 Dynamic sites
    6.3.1 Wordpress plugin
  6.4 Claims
7 Evaluation
  7.1 Research Questions & Answers
  7.2 Evidence
  7.3 Claims
  7.4 Threats to validity
8 Conclusion
Bibliography
Appendices
A Tool output


Abstract

CSS is a render-blocking resource that increases the time needed for the first elements of a web page to be rendered on screen during the initial load. Detecting the critical CSS rules, those that apply only to the initially visible elements on a page, and inlining them in the page's HTML helps counter this problem by removing the need for extra GET requests and the processing of unneeded styles. Tools for detection of critical CSS exist, but are limited in scope, flexibility and use on dynamic pages. In this thesis, we first explore whether external CSS files create a significant negative impact on page render times, then analyse and evaluate methods used by existing tools that detect, extract and inline critical CSS rules, and finally introduce an automated tool for detection and extraction of critical CSS from static and dynamic web pages.


Chapter 1

Introduction

In this chapter, we explain the problem that CSS introduces to the time it takes for the first render to occur in a browser while loading a web page. We pose three main research questions which will help us come to a better understanding of it, and finally propose a tool that solves the render-blocking problem.

Cascading Style Sheets (CSS) are the primary method that allows web developers and designers to control the way elements on a page are rendered by browsers. They are written following standards imposed by the World Wide Web Consortium, which constantly refines and adds new rules to the ever-expanding style language, allowing for more flexibility and customization of websites [Con16b]. This, in turn, results in a subsequent rise in the use of CSS all around the web. As data collected from HTTPArchive [Arc16] on more than half a million sites shows, in just the past five years, the number of GET requests for CSS files from a single web page has grown from an average of 2.2 to 7.7 requests. The transfer size of CSS files has also increased threefold, from an average of 23kb to 80kb. One reason for this could be the growing trend among developers to use CSS frameworks and fonts side by side with their custom rules. The most popular CSS-based frameworks tracked by the web crawler BuiltWith [Bui16] are displayed in Table 1.1.

Name                   Type             No. of sites that use it
Google Font API        Fonts            17,200,000+
Bootstrap              Framework        6,700,000+
Adobe Edge Web Fonts   Fonts            83,700+
Foundation CSS         Framework        35,000+
Formalize CSS          Form framework   34,000+
Materialize CSS        Framework        15,000+

Table 1.1: Usage of CSS frameworks on the web

This shows the extent to which websites rely on CSS; researchers from AT&T back this claim, reporting that top websites can contain anywhere from 2 to 73 scripts and style sheets [EGJR15].

1.1 Problem statement

Together with HTML, CSS is a render-blocking resource, meaning that the browser cannot render a page without first parsing these resources. For HTML, this is obvious and expected: without content, there is nothing for the browser to display, and parsing it to create a DOM (Document Object Model) tree is an essential rendering step. But for CSS, the browser will not start rendering the page until the CSSOM (CSS Object Model) tree of all style sheets defined in the head tag is constructed and applied to the DOM tree. That process is network-heavy for externally linked files and includes:

1. An HTTP GET request being made for each externally linked CSS file
2. An additional DNS lookup request being made if the CSS file is hosted on another domain
3. The response being received and read
4. A CSSOM tree being constructed based on the response
5. Continuing parsing of the HTML

This is even more problematic when the @import CSS at-rule is used, which allows the embedding of one style sheet into another and in turn requires an additional HTTP GET request to be made. That, combined with the use of popular CSS frameworks as listed above, contributes to a slower first render on a user's screen.
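For illustration, a stylesheet using @import (file names hypothetical); each import below forces another GET request before the CSSOM can be completed:

/* main.css: each @import triggers an extra blocking GET request */
@import url("reset.css");
@import url("typography.css");

body { margin: 0; }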

With so many CSS files being loaded, not all of them should be equally prioritized. Some style sheets apply to mobile screens, others to a printer-friendly version of the page, yet if incorrectly marked, all of them will block the rendering while being downloaded and parsed. Google suggests adding special attributes to CSS link tags to state under what conditions they should be loaded [Dev16a], as shown in the markup below. But even if CSS links are correctly marked, the CSS rules for the main content cover far more elements than are initially visible on the user's screen. This statement is tackled in chapter 4.
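As an illustration of such markup (file names hypothetical), only the first stylesheet below blocks the initial render on a desktop screen; the others are downloaded without blocking:

<link rel="stylesheet" href="main.css">
<link rel="stylesheet" href="print.css" media="print">
<link rel="stylesheet" href="mobile.css" media="(max-width: 600px)">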

An often-cited solution to this problem in research [WBKW13], by top industry engineers [KO16] and companies [Dev16a], is the use of critical (over-the-fold) CSS. Critical CSS rules only affect the elements of a web page that are initially visible after the load, without scrolling in the browser window. These are the most important elements, since they are the first to appear. By extracting and inlining only critical CSS rules, the time to first render (the time needed for the browser to start painting elements on the screen from the initial load of the page) can be decreased. The goal of this thesis is to analyse the effect that injecting critical CSS rules into a web page has on the time to first render, and to provide a tool to automate the detection, extraction and injection of CSS rules that apply to over-the-fold elements. More in-depth definitions of critical CSS and time to first render are provided in sections 2.3 and 2.2.1, respectively.

Our hypothesis, which we test in this thesis, is: Inlining critical CSS rules in web pages and loading non-critical ones asynchronously creates a significant decrease in the time to first render.

1.2 Research questions

In order to successfully test our hypothesis, we first had to answer the following important questions, which direct our research:

• RQ1: Do requests to external CSS files make a significant negative impact on the time to first render?

• RQ2: What methods do existing tools use for detection, extraction and inlining of critical CSS rules?

• RQ3: How can critical CSS inlining be automated for dynamic web pages?

1.3 Use cases

Not all scenarios can benefit from inlining critical CSS rules. We look into multiple scenarios and their positive and negative aspects with regard to the time to first render.


1.3.1 Caching

Browsers can load CSS files from cache faster than from a network request, allowing time to be saved. But this process requires that the user has already visited the site at least once. Aside from that, this approach also has other problems. Popular sites change CSS code on a daily basis, forcing the browser to re-download CSS multiple times, while some even put the current timestamp at the end of a link to the CSS file to prevent it from being cached. Problems are also present on mobile devices, where the total available cache space can be as low as 8MB [QQH+12]. The amount of web traffic originating from mobile devices renders such small cache sizes inadequate, forcing the browser to re-download all resources after the cache has been filled.

1.3.2 Content Delivery Networks

Another option is the use of content delivery networks (CDNs): caching servers that deliver static content quickly by operating speed-optimized servers in multiple locations around the world. Once a file is requested, the server closest to the user dispatches a copy, shortening the network route the data needs to take. Most highly ranked websites use CDNs [VP03], so in our research below, such sites should report a smaller difference in time to first render between requests that allow fetching of external CSS files and ones that block them.

1.3.3 Resolution mismatch

If a web page that is being tested for critical CSS is 100% fluid, meaning that it always takes up 100% of the viewport and never has any hidden content, then the extraction of critical CSS will yield the whole CSS file, negating the benefits of this method.

1.4 Solution outline

For the purpose of usability across different systems and development setups, we propose a Node.js tool that serves as a proof-of-concept solution. It scans an input file, usually a local HTML file or a link to a remote one, and extracts the remote CSS files it references. It then parses and constructs an abstract syntax tree (AST) for all the detected CSS files, renders the input page using PhantomJS as a headless browser, and tests each CSS selector in the AST against the final DOM tree. If an element that matches the selector is found and is located in the predefined viewport, it is considered critical and marked as such in the AST. Finally, the critical CSS is extracted from the AST and embedded in a style tag in the head of the HTML page, while the original CSS files are loaded via simple Javascript code that is inserted at the end of the body and called after the start of the rendering process, so as not to block it. Figure 1.1 visualizes the algorithm used.


Figure 1.1: The algorithm used for processing CSS files, extracting critical CSS rules and inlining them in the original HTML
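In pseudocode, the pipeline in figure 1.1 can be sketched as follows (all helper names are hypothetical, not the tool's actual API):

// Hypothetical sketch of the pipeline in figure 1.1
function processPage(input, viewport) {
    var html = fetchHtml(input);                   // local file or remote URL
    var cssFiles = extractLinkedCss(html);         // hrefs of <link rel="stylesheet">
    var ast = parseCssToAst(cssFiles);             // one merged CSS AST
    var dom = renderWithPhantomJs(html, viewport); // headless render of the page
    ast.rules.forEach(function (rule) {
        // a rule is critical if any element matching its selector
        // lies inside the predefined viewport
        rule.critical = anyMatchInViewport(dom, rule, viewport);
    });
    return inlineCriticalCss(html, ast);           // <style> in head, async loader in body
}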


Chapter 2

Background

2.1 Cascading Style Sheets

The W3 Consortium defines CSS as “A style sheet language that allows authors and users to attach style (e.g., fonts, spacing, and aural cues) to structured documents (e.g., HTML documents and XML applications). By separating the presentation style of documents from the content of documents, CSS simplifies Web authoring and site maintenance” [BLLJ98].

In order to add styles to elements on a page, there needs to be a way to select which ones should be affected by a certain group of rules. This is accomplished by CSS selectors. According to the W3 Consortium, there are a total of 52 types of selector patterns in use across all three versions of CSS. They are used to select elements based on class names, tag names, IDs, attributes or pseudo-selectors. A simple breakdown of a CSS selector is shown in figure 2.1.

Figure 2.1: Breakdown of a CSS selector

The more complicated the selector, the less abstract its selection is. Every additional property on a selector shortens the list of possible elements that it can affect. Ordering is important, and so is whitespace, because slight modifications to a selector can result in a totally different selection of elements on a page. If an element matches the defined pattern, then the CSS rules defined under that selector are applied to it in the render tree. Selectors play a vital role in the detection of critical CSS, by allowing us to find all CSS rules that apply to elements above the fold and mark them as critical.
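For example (class names hypothetical), whitespace alone changes what is selected:

/* matches any element with class "note" inside a div */
div .note { color: red; }

/* matches only a div that itself has class "note" */
div.note { color: red; }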

2.1.1 Media queries

Media queries, originally proposed to the W3 Consortium in 2007 [LCGvK01], are specialized CSS 'at' rules that allow developers to specify the range of devices and screen resolutions a set of CSS rules should be valid for. A media query condition can rely upon the type of screen the page is viewed on, its pixel ratio, orientation, width or height. These conditions can be combined to form more complex media queries that apply to a more specific group of devices. They are especially useful when a web page needs to adapt its layout to a wide range of devices.
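A simple example of such a query (breakpoint values hypothetical):

/* applies only on screens at most 600px wide, in portrait orientation */
@media screen and (max-width: 600px) and (orientation: portrait) {
    .sidebar { display: none; }
}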


But the browser loads all media queries before the initial render, regardless of whether they apply to the current device, and only after it has built the CSSOM tree does it start to parse and determine which queries are applicable. This approach creates even more network traffic by downloading larger files, of which only a portion of the media queries may be valid for the initial render. In our solution, we take this into account by testing beforehand which queries apply to the predefined viewport, which in turn allows us to extract the media queries that will indeed be needed for the first render and run the element hit test on them.

2.2 Web page rendering

Each web browser utilizes a rendering engine to transform HTML and accompanying CSS styles into graphical output on the user's screen. The most popular browsers on the market today use one of the following rendering engines:

• WebKit — open source and used by Safari on OS X and iOS systems, as well as the native Android browser
• Blink — open source and used by Chrome, Opera and Android's WebView component
• Gecko — open source and used by Firefox
• Trident — proprietary and used by Internet Explorer 4.0 through 11.0
• EdgeHTML — proprietary and used by the Edge browser

All of these engines follow a similar procedure in rendering a web page, described below. From this list, we want to focus on optimizing the second step, because there external CSS files create a bottleneck in the rendering process, causing subsequent steps to become slower as well.

1. Construct the DOM tree based on the HTML retrieved from the server
2. Load and parse all linked CSS files and inline styles
3. Construct the CSSOM tree based on the parsed data
4. Construct the rendering tree based on data from steps 1 and 3, making sure all elements in the rendering tree point back to the corresponding element in the DOM tree and are visible (elements that are not initially visible, or are not renderable, like head or script, are not included)
5. For each element in the rendering tree, compute all styles that affect it from the CSSOM, and calculate its coordinates for the layout
6. Paint items on screen

2.2.1 Time to first render

The time to first render is defined as the time span starting from the browser navigation to the moment the first non-white content is painted in the browser viewport. It is affected by multiple factors, including network speed (speed of connection to the server), server processing speed, and initial page parsing and rendering. Since server and connection response times can be improved with hardware upgrades, the best way a web developer can improve the time to first render is to help the browser speed up the processing of objects in the head tag.


Figure 2.2: The render process used by all major browsers

As seen in figure 2.2, the time to first render depends on multiple factors: α1, the time it takes from the beginning of the request until the full HTML of the document is received; the time to parse the result; and the time to receive and parse all requested external CSS files. In the example figure, there are three different CSS files, which the browser starts downloading at the same time. Unlike DOM elements, which can be added to the tree as they are received, CSS rules can override each other, hence the browser cannot start building the CSSOM until all CSS files have been downloaded and parsed, including the inlined rules in the head. The time until the CSSOM building process starts, T2, is determined by the CSS file that takes longest to download and parse, in this example β2. The time it takes to build the CSSOM, T3 − T2, is proportional to the total size of the CSS files being parsed, meaning more CSS rules processed during the initial loading of the page contribute to a longer time to first render.

T1 = α1
T2 = T1 + time to parse html() + max(β1, β2, β3)
T3 = T2 + time to parse all css()
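As a purely illustrative worked example (all values hypothetical): if the HTML arrives after α1 = 200ms and takes 50ms to parse, while the three CSS files finish downloading and parsing after β1 = 100ms, β2 = 350ms and β3 = 150ms, then T2 = 200 + 50 + 350 = 600ms; if building the CSSOM from all parsed rules takes another 120ms, the first paint cannot start before T3 = 600 + 120 = 720ms.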

Based on this, we can see that requests for external CSS files cause a bottleneck in the rendering process, and improving that area can contribute to decreasing the time to first render.

2.3 Critical path CSS

When a page is initially rendered in a browser, in most cases only a section of it is visible on the screen: a rectangular area that starts from the top left corner of the viewport and extends down and to the right to cover the height and width of the viewport. The line where this initially visible content ends, and scrolling needs to be performed to see the rest, is called 'the fold'. CSS rules that apply to the elements above the fold line are referred to as critical path rules. They apply to the first visible content, and thus should be prioritized in loading over rules for invisible and below-the-fold elements, whose styles can be loaded asynchronously without blocking the rendering process. Figure 2.3 gives a visual representation of where the fold is, and which elements' CSS rules should be labeled as critical.


Figure 2.3: ’The fold’ and critical elements above it

In our solution, we allow the developer to set a viewport size for which critical CSS rules should be extracted. This viewport allows developers to define the location of the fold, and accordingly look for CSS that applies only to elements that are above it. Note that the fold does not always have to be horizontal. Even though the majority of web pages today scale their width according to the screen they are viewed on, some still have horizontal scrolling enabled. In these cases, two different folds are applied, one vertical and one horizontal, sectioning off a rectangular area from the top left portion of the screen to the point where the two folds intersect.


Chapter 3

Related work

In this chapter, we go over related work by other authors who try to minimize the time to first render. We also look at prefetching and preloading as attributes added to HTML tags that tell the browser when, and with which priority, to load resources. Then we analyse putting content before style, manual extraction of critical CSS, and finally other Node.js modules that extract critical CSS rules. We focus on the methods used by these Node.js tools to extract critical CSS in order to answer our second research question.

3.1 Academic Research

Researchers from the University of British Columbia in Canada worked on a tool called Cilla that is capable of detecting unmatched and ineffective selectors and properties, as well as undefined class values [MM12]. Other research, performed jointly at Concordia University and the University of British Columbia, analysed refactoring opportunities for CSS code in terms of duplication and size reduction [MTM14]. We consider both important steps for general optimization of CSS code which will keep it clean and up-to-date. However, they do little to help speed up the initial rendering of the page content, which is what we focus on in this thesis.

3.2 CSS prefetch and preload

The W3 Consortium has suggested two methods to deal with static content that will soon need to be loaded in the browser. Both are attributes of the HTML 'link' tag, and both accomplish much the same thing — instructing the browser to load a resource ahead of time, indicating that it will be used in the next navigation step.

3.2.1 Prefetch

The prefetch attribute can ask the browser to perform a range of actions [Con16b]: prefetching DNS, preconnecting (fetching DNS and making a TCP handshake), prefetching a resource and saving it to the cache, or rendering a whole new URL in the background. But these requests for action can be overridden by the browser and are not mandated to actually happen. For example, in Mozilla's browser, Firefox, prefetching occurs only if the browser is idle [Dev16c].

3.2.2 Preload

Preloading, a new specification proposed by the W3 Consortium [Con16b], acts very similarly to prefetch, but cannot be overridden by browsers, and a request must be started as soon as the element is encountered.
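In markup, the two hints look like this (file names hypothetical):

<!-- hint: fetch a stylesheet likely needed on the next navigation -->
<link rel="prefetch" href="/css/next-page.css">

<!-- mandatory early fetch of a resource needed by the current page -->
<link rel="preload" href="/css/main.css" as="style">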


This approach is only valid if the user is already on a website, and the site is trying to optimise the rendering of the next navigation (assuming the developer can predict where the user will go next on the site). It cannot be applied when users arrive from external sites like search engines.

3.3 Content first

Another option is to place all the link tags at the end of the body, thus allowing the browser to create the DOM tree and start rendering it before it encounters CSS styles that need to be applied to it. While this seems like a good approach, the drawback is a phenomenon called Flash of Unstyled Content (FOUC). Because the browser has started the paint process without any custom styles to use, it applies the default styles integrated in each browser when rendering the elements. When the actual CSS files are encountered later on, a repaint is requested and the styles are applied subsequently. Usually this process is fast, but the bare elements will be visible to the user for a short period while fetching these CSS files, thus creating a flash of unstyled content.

3.4 Manual extraction

Manual extraction of critical CSS is also an approach taken by some developers, but multiple things can go wrong. Manual extraction forces developers to inline critical styles by hand, making this approach tiresome, and it becomes problematic when the HTML of the page is changed to add or remove elements, forcing a full re-evaluation of the critical CSS. Relatively defined URLs may also cause problems. If the CSS file is not in the same directory as the input file (in most cases it is not), then inlining properties whose values contain relative URLs will break the relativity. This makes it important to track which URLs in the CSS are relative and where they should point. It is obvious from problems like these that extraction and inlining should be an automated task instead of a manual one.
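A small example of the relativity problem (paths hypothetical):

/* in /assets/css/style.css this resolves to /assets/img/logo.png */
.logo { background-image: url("../img/logo.png"); }

/* inlined into /index.html, the same value resolves to /img/logo.png,
   so it must be rewritten to /assets/img/logo.png when inlining */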

3.5 Automatic extraction

Tools exist that help developers with the extraction and inlining of critical CSS rules. Our evaluation of existing tools that automate this process showed that they mainly come in two flavours: modules for popular server platforms, and Node.js modules that can be run on any platform.

3.5.1 Server-side modules

Google provides a pre-built module, called PageSpeed, for the most commonly used servers, Apache and Nginx. With the help of these modules, multiple optimizations can be performed server-side before a page is served in response to a user's request. One option offered by these modules is inlining small CSS files found in the HTML of a page. Based on a configuration file, the server administrator may set the maximum size a CSS file can have for inlining to occur. While this is indeed part of our suggested solution, it in no way provides extraction of critical CSS in the first place, just inlining of already existing CSS files.

3.5.2 Node.js modules

There are multiple Node.js modules that aim to extract and inline critical CSS. We examined reported issues on the GitHub repositories of all these projects and took them into account while developing our solution. A more detailed look at how they work was performed as part of the second research question, which we examine in chapter 5.


Penthouse

Penthouse (https://github.com/pocketjoso/penthouse/) requires the developer to manually specify which CSS files need to be tested against an HTML file, and fails on invalid CSS. It does not handle relative URLs and does no injection of critical CSS into the HTML, but simply outputs it to a file.

CriticalCSS

CriticalCSS (https://github.com/filamentgroup/criticalcss) has problems while parsing multiple types of pseudo-elements and classes in selectors, and is known to invalidate them if they do not strictly adhere to the CSS convention. It does scan the provided HTML code for included CSS files, but can only work on local HTML documents and does not fetch remote ones (developers must use additional libraries to accomplish this). It also does not inject critical CSS into the HTML code.

Critical

Critical is the most robust of the three Node.js packages that we examined. It does allow inlining of critical CSS in the resulting HTML (and minifies it as well), finds CSS files in HTML, and allows developers to specify certain CSS rules to be ignored while parsing CSS files. All of this is accomplished because Critical is a wrapper around Penthouse, adding the extra functionality mentioned to the existing package. However, paths to background images are improperly handled by Critical, and there is no clear way of integrating it into dynamic sites.



Chapter 4

Impact of external CSS files on time to first render

In this chapter, we research the impact of external CSS files and screen resolutions on the time to first render, using the top 1000 sites ranked by Alexa, a tool called WebPageTest, and data from the host company, The Next Web.

4.1 Research Method

As stated in the introduction, one of the two main issues of CSS that contribute to blocking of browser rendering is fetching external CSS files. Additional HTTP GET requests, especially on mobile networks, are costly and should be kept to a minimum. To see if there is a significant time difference in rendering the same site with and without external CSS requests, we analysed the top 1000 sites by popularity listed in the Alexa rankings [Ale16]. Alexa ranks websites based on tracking information of user visits in the last 3 months, and manual inspection confirmed that they indeed match popular sites today. The ranking data we used in our analysis was from April 2016. WebPageTest.org was used as the tool to help us analyse the rendering time differences. It renders a given URL on a remote machine, recording multiple types of statistics, among which is the time to first render in milliseconds. It also permits us to define what type of GET requests it should block during rendering, thereby allowing us to filter out any requests for files ending in the ".css" extension. In this research, two tests were executed on the main landing page of each of the top 1000 sites: one with normal site rendering, and one blocking all external CSS requests and then rendering the page. All tests were run on a 5Mbps connection and rendered in the latest version of the Chrome browser at the time of the test.

To further analyse the impact external CSS files have on the time to first render, we must also look at the correlation of screen resolutions versus page height. This correlation is important because it shows what percentage of elements on a page fall below the fold, meaning that CSS rules that apply only to those elements are non-critical to the first render. This was also done using the Alexa top 1000 sites and the WebPageTest.org tool, since it allowed us to define custom metrics, in this case the height of the document obtained via Javascript, and track them in the reports. The data was then compared to statistics about the most common screen resolutions from the W3 Consortium [Con16a], and to the same type of data gathered from April 2015 to April 2016 by the host company The Next Web.

4.2 Results

4.2.1 Time to first render

To address the first part of this research question, we used the 1000 most popular sites from Alexa, recording the difference in time to first render on every site between having requests for external CSS files blocked and allowed. Roughly 9% of the tested sites did not report any content (manual inspection showed that they were mostly content delivery networks) or returned an HTTP status code different from 200 (OK), and were ignored. From the final results, containing 909 successfully processed sites, we deduced that, on average, the first render happened around 1.96 seconds earlier when blocking of external CSS files was enabled. This number could possibly be even higher, since we were testing only the landing pages of websites, and some news-like sites may have more extensive content requiring additional CSS files on inner pages. A visualization of the difference in time to first render between disabled and enabled blocking of external CSS files can be seen in figure 4.1.

[Chart data: ≤500ms: 33%, 501ms-1000ms: 20%, 1001ms-1500ms: 15%, 1501ms-2000ms: 11%, ≥2000ms: 21%]

Figure 4.1: Time saved to first render by removal of external CSS files

4.2.2 Screen resolution and document height

The second part of our research on this question was run to gather the height of the analyzed web pages, in order to determine how much of the content is hidden under the fold at the most common screen resolutions. WebPageTest reported that 80.4% of the pages in our test group have a height over 1100px, which was to be expected, since websites today focus on putting more content on one page in order to give the user a seamless scrolling experience. The results are visualized in figure 4.2.

For us to make use of the gathered height data, we turned to two separate sources to determine the most common screen resolutions in use today. Statistics from the W3 Consortium show that almost 50% of users browse the web at resolutions with a height ≤ 800px [Con16a]. That data corresponds with analytics from The Next Web, where screen resolutions ≤ 800px are shared among 60% of the 67 million visitors of the site from April 2015 to April 2016, as shown in figure 4.3.


[Chart data: ≤800px: 14.8%, 801px-1000px: 3%, 1001px-1100px: 1.8%, ≥1100px: 80.4%]

Figure 4.2: Height of rendered web pages from Alexa’s top 1000

[Chart data: ≤800px: 60.2%, 801px-1000px: 12%, 1001px-1100px: 22.5%, ≥1100px: 5.3%]

Figure 4.3: Screen resolution data from The Next Web from 04/15 to 04/16 for 67 million users

4.3 Analysis

Before analysing the results of this research, there are a number of limitations that need to be taken into consideration. First, the servers on which the sites are hosted could be under-performing. To counter that, we calculate only the difference between render times with and without requests for external CSS files. The same goes for network delays, which are out of our control, but given the number of sites tested, interference should be minimized as much as possible. The selection of sites based on Alexa's top ranking may not be relevant in every case, but from visual analysis of the included URLs, we trust that they make up the majority of popular web sites among Internet users. Having this in mind, we can see that by eliminating requests to external CSS files, a noticeable decrease of at least 500ms in the time to first render is achieved in more than 67% of the tested cases. By injecting critical CSS into the HTML code, external calls to CSS files can be eliminated before rendering starts, thus avoiding this issue [WSW13]. And by looking at the height of the tested web pages in comparison to the most popular screen resolutions they are viewed at, it is clear that a large portion of page content, and the CSS that styles it, falls below the fold on the most common screens.


Chapter 5

Methods used by existing tools

In this chapter, we examine methods used by existing tools for detection, extraction and inlining of critical CSS.

5.1 Research method

The solution we present in this thesis is written for the Node.js environment so that it can be deployed in any development setup; because of that, we focused our examination on the three main Node.js packages mentioned in section 3.5.2. They were chosen based on the similarity of their main goal to our research, as well as their popularity in the Node.js community, with Penthouse having 89,649 downloads from April 2015 to April 2016 on the NPM repository, Critical 57,417 and CriticalCSS 20,774 [Man16]. All three are open source and have their code published on GitHub, thus allowing us to examine it and determine what methods they use for detection, extraction and inlining of critical CSS rules.

5.2 Results

5.2.1 Penthouse

Penthouse uses PhantomJS, a headless WebKit browser, to render a page, test for elements inside a given viewport, and extract the styles that belong to them. It generates an AST from the CSS file that is being tested (given in the JSON configuration) using an external library and sends it to a child process where PhantomJS is running. There, the AST is traversed and all selectors are tested to see if the elements they refer to are located above the fold of the defined viewport. This is done using Javascript's querySelectorAll function, which traverses the DOM looking for elements that comply with the given selector and returns them in an array. The coordinates of every element in the resulting array are tested to see if they fall within the width and height of the viewport, and if at least one does, the corresponding CSS rules of that selector are marked as critical.

One interesting observation is that Penthouse only tests whether elements are above the horizontal fold, ignoring the possibility that a web page might not scale to the width of the window and have elements that overflow on the x-axis. This is visible in the function 'isElementAboveFold' in figure 5.1, where only the top coordinate of an element is compared to the height of the viewport.


var isElementAboveFold = function (element) {
    ...
    var aboveFold = element.getBoundingClientRect().top < h
    ...
    return aboveFold
}

Figure 5.1: Penthouse code for checking if viewport contains element

Other processing of the CSS includes the removal of pseudo-elements and classes before testing selectors against elements, since testing with pseudo-enabled selectors may cause elements not to be detected (like the ".class:hover" selector, which can never match any elements in a headless browser). Unused @font-face fonts are also removed: each font-face rule is tested to extract the name of the font family it defines, and all critical CSS font rules are examined to check if they include the detected font family. If no critical rules use that font, the whole font-face rule is ignored.
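A minimal sketch of this kind of pseudo-selector stripping (our own illustration, not Penthouse's actual code):

// strip pseudo-classes/elements such as ":hover" or "::before" so the
// remaining selector can still be matched in a headless browser
function removePseudoSelectors(selector) {
    return selector.replace(/::?[a-z-]+(\([^)]*\))?/gi, '');
}

removePseudoSelectors('.nav a:hover');  // '.nav a'
removePseudoSelectors('p::first-line'); // 'p'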

Media queries are also tested, and the ones only applicable to a print environment are discarded, but only media queries that include the 'min' feature, which requires the tested value to exceed the defined minimum, are tested. This is interesting, because media queries can also define the 'max' feature, or even an exact resolution that certain CSS rules should apply to, and these will not be tested, but simply accepted as applicable media queries, as visible in figure 5.2. A simple example shows where this detection fails: if a page is viewed at a 1200x900 screen resolution, and a media query is formatted as @media screen and (max-width: 700px), it will be considered applicable, even though it does not apply to the given screen resolution.

function _isMatchingMediaQuery (rule, matchConfig) {
    ...
    var keep = mediaAST.some(function (mq) {
        if (mq.type === 'print') {
            return false
        }
        return mq.expressions.some(function (expression, index) {
            if (expression.modifier === 'min') {
                return cssMediaQuery.match('(min-' + expression.feature + ':' + expression.value + ')', matchConfig)
            } else {
                return true
            }
        })
    })
    return keep
}

Figure 5.2: Penthouse code for testing applicable media queries

Penthouse does not inline any detected CSS styles; it deletes all non-critical rules from the initially generated AST before converting it into a string and saving the result to an output file.

5.2.2 CriticalCSS

CriticalCSS generates a dummy page with the contents of the CSS file placed in a style tag, and then uses the 'styleSheets' property of the DOM tree to parse each rule and manually create an AST. This approach minimizes the dependencies on external libraries, but does require an additional run of PhantomJS to parse the CSS being tested.

It takes a similar approach to Penthouse when testing for selectors that hit elements inside the viewport: removing pseudo-elements from selectors, checking if an element is found using the selector, and determining its location relative to the viewport. This is where we spotted most of the issues with CriticalCSS, with the code visible in figure 5.3.

First, the removal of pseudo-elements only works for the 'before' and 'after' elements, and does not omit any other pseudo-elements or classes, causing selection of elements to which such selectors apply to fail. Second, the 'querySelector' function is used, which returns only the first element that matches the given selector. This may yield false negatives when the first detected element is not in the viewport (hidden as a modal box, a menu, or simply positioned somewhere else) but a subsequent one is. Finally, CriticalCSS only tests the top coordinate of elements to see if they are above the horizontal fold, leaving out the possibility that they fall outside a vertical fold.

function criticalSelectorList( list, maxTop ){
    return page.evaluate( function( selectors, maxTop ){
        return selectors.filter(function( selector ){
            var elem = null;
            var selectorNoPsuedos = selector.replace( /\:+(before|after)/gmi, "" );
            try {
                elem = window.document.querySelector( selectorNoPsuedos );
            } catch (e){}

            return elem && elem.getBoundingClientRect().top <= maxTop;
        });
    }, list, maxTop );
}

Figure 5.3: CriticalCSS's code for detection of elements with critical CSS selectors in the viewport

5.2.3 Critical

Critical focuses on the automatic extraction and inlining of CSS instead of the detection, for which it relies on Penthouse as a dependency.

The extraction, done with the help of the package Oust, looks for link tags in the HTML code of the input file and extracts all of those that have the rel attribute set to 'stylesheet'. The content of each CSS file is then extracted and combined into one main CSS file to be sent off to Penthouse for processing. Before that step, Critical takes care of any relative paths, transforming them into paths relative to the base folder defined in the running configuration. But it does not handle relative paths of background images well, as seen in multiple open issues on its GitHub repository.

Inlining is achieved by the inline-critical package, which in turn uses the fg-loadcss package to asynchronously load CSS into an HTML file with Javascript. Link elements may have a rel attribute with the value preload, which signals to a browser to load the element at a later point in time, not blocking the rendering of the page. As soon as an element is loaded, Javascript is used to mark that link element as a CSS stylesheet, forcing the browser to apply the new styles to the render tree. To make the solution work in an environment where Javascript isn't supported, a fallback noscript tag is inserted, as seen in figure 5.4.


$(links).each(function (idx, el) {
    ...
    var $el = $(el);
    ...
    $el.after('\n' + elIndent + '<noscript>' + render(this) + '</noscript>');
    $el.attr('rel', 'preload');
    $el.attr('as', 'style');
    $el.attr('onload', 'this.rel=\'stylesheet\'');
});

var scriptAnchor = $('link[rel="stylesheet"], noscript').filter(function () {
    return !$(this).parents('noscript').length;
}).last().get(0);

$(scriptAnchor).after('\n' + targetIndent + '<script>' + getScript() + '</script>');

Figure 5.4: inline-critical’s code for Javascript enabled loading of CSS resources

5.3 Analysis

By examining the methods used in these tools, we can conclude that all of them depend heavily on PhantomJS as a headless browser for rendering pages and discovering over-the-fold elements. One author described PhantomJS as "Essential in the web development stack. It is ideal for fast unit test watches, end-to-end tests in continuous integration, screen captures, screen scraping, performance data collection, and more." [Fri14]. Penthouse and CriticalCSS focus mainly on detection and extraction of critical CSS, while Critical adds flexibility to Penthouse with automatic detection of CSS in HTML files, as well as automated inlining of the recovered critical styles. Every tool has a set of strengths and flaws, which we took into account while developing our solution.


Chapter 6

Focusr

Having analysed three popular tools for detection, extraction and inlining of critical CSS rules, we propose a solution, a tool called Focusr, that overcomes their limitations stated in chapter 5 and adds additional flexibility by integrating into dynamic websites. As a proof of concept, a plugin for the popular Wordpress content management system was developed, allowing injection of generated critical CSS files into its ecosystem on the fly.

6.1 Configuration possibilities

In order to allow for maximum configurability, a JSON file can be used to override the default values and actions of the tool. Below, a preview of all the values supported by the configuration file is shown, prefilled with the default value for each setting.

1 { 2 "allowJS": false, 3 "debug": false, 4 "processExternalCss": true, 5 "renderTimeout": 60000, 6 "inlineAllCss": false, 7 "groups": [ 8 { 9 "enabled": true, 10 "baseDir": "tests/", 11 "inputFile": "", 12 "outputFile": "", 13 "alwaysInclude": [], 14 "httpAuth": "", 15 "wordpress": false, 16 "viewport": [1200, 900], 17 "outputJS": false 18 }, 19 ... 20 ] 21 }

The configuration is divided into two separate entities: global and group-specific settings. This was done to allow processing of multiple different inputs (local and/or remote) in a single run of the tool. The global settings, shown in table 6.1, apply to all the defined groups, while the group-specific settings, shown in table 6.2, only apply to their specific group.


Name Type Default Description

allowJS boolean true Allow PhantomJS to execute Javascript while looking for over-the-fold elements. Since loading and executing Javascript could potentially slow the process down, it can be disabled, but some sites that rely on filling content using Javascript, so users may want to keep this option on. debug boolean false Adds a debug outline to the output HTML file to show what

elements were marked as critical. Only applies to local HTML files.

inlineNonCritical boolean false Inline the non-critical CSS rules at the end of the body tag instead of using Javascript to load them asynchronously. processExternalCss boolean true Allow for fetching and processing of externally linked CSS

files from hosts different than the main site.

renderTimeout int 60000 How long to allow PhantomJS to render the website before interrupting it.

inlineAllCss boolean false Setting this to true will inline all CSS files, skipping the ex-traction of critical CSS.

groups array [] An array of group objects to process. Table 6.1: Global configuration

enabled (boolean, default: true)
    Enables or disables processing of the group, allowing configurations to be stored in the file but run only when needed.
baseDir (string, default: "tests/")
    Required. The directory on which the input and output files are based.
inputFile (string, default: "")
    Required. The input file to process. Should be a remote URL or a path to a local HTML file relative to the baseDir setting.
outputFile (string, default: "")
    Required. The output file. If inputFile is a remote URL, the critical CSS will be written here; if inputFile is a local HTML file, the same HTML with inlined critical CSS will be written here.
alwaysInclude (array, default: [])
    An array of strings (regular expressions) for selectors of CSS rules that should always be included in the resulting critical CSS, even if they are not detected as critical.
httpAuth (string, default: "")
    Basic HTTP authentication string for remote URLs that require authentication to access. Format: 'username:password'.
wordpress (boolean, default: false)
    If true, will try to connect to the Wordpress plugin of the given inputFile URL and extract the CSS files embedded on the site.
viewport (array, default: [1200, 900])
    The viewport size PhantomJS should use while rendering and analyzing the input file.
outputJS (boolean, default: false)
    If true, will output the Javascript that loads all the CSS files asynchronously.

Table 6.2: Group configuration

6.2 Algorithm

Initially, the tool parses the configuration file, overwriting the default configuration settings with those the user has provided; this way, we compensate with default values for settings omitted by the user. Each enabled group defined in the configuration file is processed in parallel to gain extra speed.

If the input file for the current group is a remote URL, a GET request is made (with basic HTTP authentication added if defined) and the resulting HTML is stored in the group object if the server responds with a 200 (OK) status code. If the input file is a local path, it is read using the Node.js file system, and its HTML code is stored in the group object.

6.2.1 CSS detection and extraction

This step of the process uses the JSDom library (https://github.com/tmpvar/jsdom), which parses a given HTML string and generates an HTML DOM tree. It produces the Javascript 'window' variable, the same window variable generated by browsers while loading a page, so all standard Javascript functions are available on it. This allows us to easily extract linked CSS files by running a query selector on the head of the page, searching for all link tags that define the rel attribute as a stylesheet. Stylesheets defined outside the head tag are in violation of HTML standards, so we exclude them from our search. After all the links are obtained, they too are processed in parallel, reading the CSS code from the local file system if the input file and CSS link are both relative paths, or else issuing a GET request and reading the CSS code from a web location.
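A minimal sketch of this extraction step, assuming a JSDom-produced window object:

// collect the hrefs of all stylesheets declared in the head
var links = window.document.querySelectorAll('head link[rel="stylesheet"]');
var cssUrls = Array.prototype.map.call(links, function (link) {
    return link.getAttribute('href');
});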

AST generation

A Node.js module called ReworkCSS (https://github.com/reworkcss/css) is used to parse the CSS code and generate a traversable AST. As soon as the first CSS file is downloaded, the initial CSS AST is generated. For each subsequent CSS file that gets read, a new AST is generated and merged with the existing one. This process is repeated until the last CSS file for the group is downloaded, after which the final CSS AST is passed on for processing. A sample CSS rule and the resulting AST generated with ReworkCSS are shown below in figure 6.1 and figure 6.2. For each rule defined in the tree, an extra attribute named 'critical' is added and initialized to false. This attribute is used during the rest of the process to track which rules are indeed critical and should be included in the final inlined CSS.

body {
    background: #eee;
    color: #888;
}



1 { 2 "type": "stylesheet", 3 "stylesheet": { 4 "rules": [ 5 { 6 "type": "rule", 7 "critical: "false", 8 "selectors": [ 9 "body" 10 ], 11 "declarations": [ 12 { 13 "type": "declaration", 14 "property": "background", 15 "value": "#eee", 16 "position": { 17 "start": { 18 "line": 2, 19 "column": 3 20 }, 21 "end": { 22 "line": 2, 23 "column": 19 24 }}}, 25 { 26 "type": "declaration", 27 "property": "color", 28 "value": "#888", 29 "position": { 30 "start": { 31 "line": 3, 32 "column": 3 33 }, 34 "end": { 35 "line": 3, 36 "column": 14 37 }}}], 38 "position": { 39 "start": { 40 "line": 1, 41 "column": 1 42 }, 43 "end": { 44 "line": 4, 45 "column": 2 46 } 47 }}]}}

Figure 6.2: Generated AST from sample CSS

Relative issues

As noted for the existing tools, inlining CSS can break relatively defined URLs. To handle this, for every declaration that includes a relative URL in its value, we calculate the relative path from the original file to the output file and rewrite it.

Media queries

Media queries in the AST are also checked for compatibility with the user-defined viewport. If a media query is intended for a screen and its screen-resolution limits match the viewport, it is marked as critical, and no further testing is performed on its rules.

Always include

If the user-defined configuration file includes any regular expressions in the 'alwaysInclude' setting, all rules whose selectors match those expressions are marked as critical, even if no elements above the fold match them. This is helpful in situations where CSS classes are not initially present in the HTML, but are added with Javascript during or right after loading the page.
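For example, a group could force rules for Javascript-injected classes to stay in the critical set (patterns hypothetical):

{
    "alwaysInclude": ["^\\.modal-open", "\\.js-"]
}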

6.2.2 Over-the-fold elements

The main process that detects over-the-fold elements uses PhantomJS, a headless WebKit engine that can silently render a page and allows execution of Javascript on the resulting window object. Since PhantomJS is not directly compatible with Node.js, a pre-built version is downloaded from the Node repository as a dependency. The generated CSS AST and the HTML of the input site are saved in temporary files and, together with the defined viewport size, passed to a child process that runs the PhantomJS processing script.

We set multiple properties on the PhantomJS engine before rendering begins:

• The user agent is set to the latest Chrome signature, to mimic a regular desktop browser environment.
• The resource timeout is set to 3 seconds, to skip all resources that take too long to download.
• Javascript execution is enabled or disabled based on the passed configuration value.
• SSL security checks are disabled to minimize potential errors during rendering.
• Embedded iframes are blocked from loading, since there is no need for their content.
• Javascript errors are ignored.

After the page has finished rendering, we query every selector from the CSS AST that has not been marked as critical so far. By getting all elements that match the CSS query, we are able to check their position in the rendered document by accessing their 'boundingClientRect' properties. If the top and left positions of the element are contained within the viewport defined in the configuration file, then that CSS rule is marked as critical; if not, the next element that matches the selector is tested.


function focusr_processRule(rule, viewportWidth, viewportHeight) {
    for (var i = 0; i < rule["selectors"].length; i++) {
        var selector = rule["selectors"][i];
        selector = focusr_removePseudoSelector(selector);

        if (focusr_isAtRule(selector, rule)) {
            continue;
        }

        var elements = focusr_getElementsFromDom(window, selector);
        var foundElementInViewport = false;
        for (var j = 0; j < elements.length; j++) {
            var element = elements[j];
            if (focusr_isElementInViewport(element, viewportWidth, viewportHeight)) {
                foundElementInViewport = true;
                break;
            }
        }
        if (foundElementInViewport) {
            rule["critical"] = true;
            break;
        }
    }
}

Figure 6.3: Main function run on all CSS rules in PhantomJS
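The viewport test itself is not shown in figure 6.3; a minimal sketch of such a check, consistent with the description above (the function name is taken from the listing, the body is our assumption):

// sketch (assumption): true if the element's top-left corner lies
// inside the configured viewport, mirroring the top/left test above
function focusr_isElementInViewport(element, viewportWidth, viewportHeight) {
    var rect = element.getBoundingClientRect();
    return rect.top < viewportHeight && rect.left < viewportWidth;
}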

When all the non-critical CSS rules from the AST have been checked, the modified AST is saved back into a temporary file that will be read by the main Focusr script, and a status code is reported back from the process.

6.2.3 Inlining

In order to inline the critical CSS rules in the resulting HTML, we remove all rules marked non-critical from the resulting AST before minifying the critical ones.

In the final part of the process, JSDom is again used to parse and create a Javascript-ready environment from the HTML code of the input file, allowing us to modify the DOM tree and perform three steps:

1. Remove all link elements from the head tag that have the rel attribute set to stylesheet.
2. Create a new HTML style tag at the end of the head tag, and add the minified critical CSS code in it.
3. Create a new HTML script tag at the end of the body tag, and add special Javascript code that will load the original CSS links after the document renders.

Javascript loading of styles

Using Javascript to asynchronously load CSS is recommended by Google [Dev16b], and their preferred way is shown in figure 6.4. It utilizes the 'requestAnimationFrame' window method, which allows us to schedule a function right before the next paint event of the browser. Since the browser manages the timing of these frames, the stylesheet loading is deferred until rendering has begun; where 'requestAnimationFrame' is unavailable, the code falls back to the window's 'load' event. Once the event fires, each CSS link that was defined in the header gets inserted into a new 'link' tag, in the order in which they originally appeared. All of these tags are then appended to the body of the document.

var loadDeferredStyles = function() {
    var styleSheets = ["style01.css", "style02.css"];
    var i = 0;
    for (; i < styleSheets.length; i++) {
        var styleSheet = document.createElement("link");
        styleSheet.rel = "stylesheet";
        styleSheet.href = styleSheets[i];
        document.body.appendChild(styleSheet);
    }
};

var reqAnimFrame = requestAnimationFrame || mozRequestAnimationFrame ||
    webkitRequestAnimationFrame || msRequestAnimationFrame;

if (reqAnimFrame) {
    reqAnimFrame(function() {
        window.setTimeout(loadDeferredStyles, 0);
    });
}
else {
    window.addEventListener('load', loadDeferredStyles);
}

Figure 6.4: Javascript for asynchronous loading of stylesheets

Example

As an example of how the tool works, we made a simple test website with two columns of div elements. Each has a unique background color and text to make it easily identifiable. The first four div elements have a background image set, each with a different type of local or remote image definition, done this way to test the ability of the tool to find and fix the paths to the images when inlining the critical CSS.

.div1 { background-image: url("../img/buck.jpg"); }
.div2 { background-image: url(../img/buck.jpg); }
.div3 { background-image: url('../img/buck.jpg'); }
.div4 { background-image: url("https://i.ytimg.com/vi/DkIVqD8pJt8/maxresdefault.jpg"); }

Figure 6.5: 4 types of CSS background image properties

A 300x600 pixel viewport was defined, and the debug option was turned on so a visual representation of this viewport can be seen. Figure 6.6a displays the original site without any modifications, while figure 6.6b shows the site after running Focusr on it, without adding the Javascript that loads the rest of the non-critical CSS. It can be seen that images are still correctly displayed, and only the elements that were in the viewport still display all their associated styles.


(a) Original (b) Only critical CSS

Figure 6.6: Pre and post tool results

6.3 Dynamic sites

As part of this thesis, our goal was to provide a solution for detecting and inlining critical CSS for static as well as dynamic websites. In this section, we introduce a proof-of-concept plugin for Wordpress that works together with Focusr to automatically detect styles introduced by any Wordpress theme or plugin, extract critical CSS for different types of views, and inline it before serving the HTML to the user's browser.

6.3.1 Wordpress plugin

According to monthly W3Techs reviews, Wordpress accounts for almost 60% of the market share of content management systems [W3T16], and is the system in use at The Next Web as well. Those were the main reasons Wordpress was chosen for the proof-of-concept plugin. The platform itself is very powerful, allowing our plugin to hook into the HTML delivery process and modify the contents that will be presented to the user's browser as it happens.

When Focusr is pointed at a Wordpress installation, the plugin delivers a JSON object that lists links to the homepage and to a random post, random page and random category (if they exist). These are then split into separate groups by Focusr and processed individually, generating two output files for each group: the critical CSS file, and the Javascript file that contains code to asynchronously load the rest of the CSS rules. They are named according to the type of page they should be deployed on, making it easy for the plugin to choose the correct one for injection.
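The delivered object might look like this (URLs hypothetical; the exact field names are an assumption):

{
    "homepage": "https://example.com/",
    "post": "https://example.com/2016/07/a-random-post/",
    "page": "https://example.com/about/",
    "category": "https://example.com/category/news/"
}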

Wordpress allows themes and plugins to hook onto different action events that occur in its ecosystem. Because we need to make fundamental changes to the DOM tree, the whole HTML output is needed, so we use the ‘template_redirect’ action hook, which executes just before Wordpress determines which template page to load. By setting the priority of the action hook to a high number (a higher number means lower priority), we allow for a later execution of our plugin, making room for others to modify the DOM first.


To strip the existing stylesheet links, we initially used a PHP DOM parser object, but experienced problems with the removal of invalid HTML tags used for caching, so we resorted to regular expressions to speed up the process and avoid these issues. By hooking onto the ‘wp_head’ action, we can output the correct critical CSS for that type of page. There are 4 main types: homepage, post, page and category, and for each type Focusr generates a separate pair of CSS and JS files, of which the CSS one is inlined here. Lastly, we hook onto the ‘wp_footer’ action so we can insert the Javascript that loads the CSS files asynchronously.

public function remove_link_tags($buffer)
{
    $re = "/<link .*rel=('|\")stylesheet\\1.*(\/>|<\/link>|>)/";
    $buffer = preg_replace($re, "", $buffer);
    return $buffer;
}

public function catch_template_redirect()
{
    ob_start([$this, 'remove_link_tags']);
}

add_action('template_redirect', [$this, 'catch_template_redirect'], 99999);

Figure 6.7: Removing link tags from the generated content

public function inject_critical_css()
{
    $outputDir = get_option('focusr_output_dir', 'focusr/wordpress/');

    $critical = "<style data-generated-by='focusr'>";
    if ($outputDir && $outputDir !== "") {
        if (!$this->ends_with($outputDir, "/")) {
            $outputDir .= "/";
        }
        $prefix = $this->get_current_prefix();
        $cssFilename = $this->get_base_path() . "/" . $outputDir . $prefix . ".css";
        $critical .= $this->read_file($cssFilename, "/*Focusr: Can't load CSS file*/");
    }
    else {
        $critical .= "/* Focusr: Output folder not found */";
    }
    $critical .= "</style>";

    echo $critical;
}

add_action('wp_head', [$this, 'inject_critical_css']);

Figure 6.8: Inserting critical CSS in the header


public function inject_javascript()
{
    $outputDir = get_option('focusr_output_dir', 'focusr/wordpress/');
    $loadCSS = "<script data-generated-by='focusr'>";
    if ($outputDir && $outputDir !== "") {
        if (!$this->ends_with($outputDir, "/")) {
            $outputDir .= "/";
        }

        $prefix = $this->get_current_prefix();
        $jsFilename = $this->get_base_path() . "/" . $outputDir . $prefix . ".js";
        $loadCSS .= $this->read_file($jsFilename, "/*Focusr: Can't load JS file*/");
    }
    $loadCSS .= "</script>";

    echo $loadCSS;
}

add_action('wp_footer', [$this, 'inject_javascript']);

Figure 6.9: Inserting loadCSS Javascript in the footer

6.4 Claims

We claim that this tool successfully detects and extracts CSS rules that affect over-the-fold elements. These rules are then inlined in the head tag of the HTML page, allowing rendering to occur faster by removing the need for additional HTTP requests and by reducing the size of the initial CSS that needs to be parsed. We showcase a Javascript function that utilizes the latest browser standards to queue the loading of the remaining CSS rules without blocking the render process. We also provide a proof-of-concept plugin for the Wordpress platform which, together with Focusr, provides the same functionality for a dynamic site that generates content on the fly.


Chapter 7

Evaluation

7.1 Research Questions & Answers

At the beginning of this thesis, we posed three main questions whose answers were important to our hypothesis. Through our research, we arrived at the following conclusions for each of them:

• RQ1: Do requests to external CSS files make a significant negative impact on the time to first render?

By testing the time to first render on a list of the top 1000 most popular sites listed by Alexa Analytics, we saw that after removing calls to external CSS files, 67% of the sites reported a decrease in time to first render of at least 500 milliseconds, while 21% reported a decrease of over 2 seconds.

• RQ2: What methods do existing tools use for detection, extraction and inlining of critical CSS rules?

We analysed three main Node.js tools that were reported to have the highest usage rates in the NPM repository. All of them used the PhantomJS Webkit headless browser to render pages, some focusing only on detection and extraction, while one extended the functionality of its predecessor to add inlining of the detected critical CSS code. We analysed the methods used to detect critical CSS, and while all tools followed the same rough outline of steps, we detected multiple problems with each implementation that we took into account while building Focusr.

• RQ3: How can critical CSS inlining be automated for dynamic web pages?

Using Wordpress, the most dominant content management system by market share, we built a proof-of-concept plugin that demonstrates how it can cooperate with Focusr to detect and inline critical CSS on all types of pages of the CMS.

7.2 Evidence

In order to prove that Focusr does indeed lower the time to first render, we again tested the top 1000 Alexa-ranked sites that were used in the research questions. Since we needed to modify the HTML content of these sites in order to inline the critical CSS, the following steps were performed for each site:

1. Download the HTML of the homepage of the site

2. Run Focusr on the site on a local machine, producing an optimised HTML output file

3. If the site does not have a ‘base’ tag in the head, add one that points to the original server, so all relatively linked resources can load (see the example after this list)

4. Host both the original and the Focusr-optimised versions of the page on our test server

5. Run the WebPageTest.org analysis on both versions and, as in research question 1, compare the difference in time to first render between the two runs.
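The ‘base’ tag from step 3 is a single line added to the head of the downloaded copy; the host name here is only a placeholder for the site's original server:

<base href="http://www.example.com/">

With it in place, all relative URLs in the local copy resolve against the original server instead of our test machine.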

Because some of the top 1000 sites were owned by banks and government organisations, to comply with their demands we set up our own virtual server hosting the same software as WebPageTest.org, which allowed us to run tests without exposing the URLs of our test sites online. Since all sites were tested from the same machine, server performance impacted all of the tests equally, and we further reduced its influence by comparing the difference in render timings only. As in research question 1, where we used the Dulles, Virginia region of WebPageTest.org to gather the site statistics, all tests were run with a 5Mbps network connection speed, and we tested with a generous resolution of 1200x900px, present on the majority of modern-day laptops.
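Such a comparison can be scripted against WebPageTest's standard HTTP API (runtest.php to queue a test, jsonResult.php to fetch the result). The sketch below only illustrates the procedure: the host name ‘wpt.internal’ and the two page URLs are placeholder assumptions, and error handling is omitted.

// Measure the time to first render of one URL on a private WebPageTest
// instance, then compare the original against the optimised copy.
var http = require('http');

function getJSON(url, callback) {
    http.get(url, function(res) {
        var body = '';
        res.on('data', function(chunk) { body += chunk; });
        res.on('end', function() { callback(JSON.parse(body)); });
    });
}

function measureRender(pageUrl, callback) {
    // f=json makes WebPageTest answer with a JSON object holding the test id
    getJSON('http://wpt.internal/runtest.php?f=json&url=' + encodeURIComponent(pageUrl), function(queued) {
        (function poll() {
            getJSON('http://wpt.internal/jsonResult.php?test=' + queued.data.testId, function(result) {
                if (result.statusCode === 200) {
                    // 'render' is WebPageTest's start-render metric:
                    // the time to first render, in milliseconds
                    callback(result.data.median.firstView.render);
                } else {
                    setTimeout(poll, 5000); // test still running; retry later
                }
            });
        })();
    });
}

measureRender('http://wpt.internal/sites/example/original.html', function(before) {
    measureRender('http://wpt.internal/sites/example/optimised.html', function(after) {
        console.log('Time to first render saved: ' + (before - after) + 'ms');
    });
});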

The results aligned with our findings from the research, with a mean reduction in time to first render of 1.24 seconds; this is lower than in research question 1 because the inlined CSS now actually has to be processed. Percentage-wise, as shown in figure 7.1, the results correlate with those from research question 1, showing very similar percentages of saved time. A standard deviation of 0.35 seconds was observed, and a boxplot of the results is visible in figure 7.2.

To compare whether there is a difference between embedding the full CSS in websites and embedding just the over-the-fold critical CSS, we ran the same test, only this time inlining all the extracted CSS. The results show a median difference of 200ms.

Figure 7.1: Time saved to first render with Focusr (≤500ms: 36.3%; 501ms-1000ms: 22.9%; 1001ms-1500ms: 16.6%; 1501ms-2000ms: 7.4%; ≥2000ms: 16.8%)

Figure 7.2: Boxplot of the time saved to first render. Red - Focusr, Blue - All inline


We also deployed the Wordpress plugin on The Next Web's site and ran nine consecutive tests, comparing the original and optimised version of the site. The results can be visualized in figure 7.3, where the mean time to first render without the plugin was 2.30 seconds, while the mean time to first render with the plugin was 0.95 seconds. These tests were run with a 5Mbps connection speed. The difference in performance is even more visible if the same tests are run on a slower, 3G 1.6Mbps connection, as shown in figure 7.4, resulting in a mean time to first render without Focusr of 3.22 seconds, against 1.82 seconds with Focusr.

Figure 7.3: TheNextWeb.com test on a 5Mbps connection, time in seconds to first render with and without Focusr (lower is better)



Figure 7.4: TheNextWeb.com test on a 3G 1.6Mbps connection (lower is better)

7.3 Claims

In this chapter, we claim that Focusr, the tool built for this thesis, makes a noticeable difference in shortening the time to first render on pages where it has extracted and inlined critical CSS rules. By running tests on the top 1000 most popular sites, we claim that on average 1.24 seconds of time to first render are saved. We also present a plugin to showcase this behaviour on dynamic sites that use the Wordpress CMS, and test it on the host's site, The Next Web, where on average 1.4 seconds were shaved off the time to first render.

7.4 Threats to validity

There are two things to keep in mind when interpreting the results obtained from our validation. First, the sites were tested on a private server, but they pulled their resources from the original one. This introduces an additional DNS lookup request and, depending on the original server's performance, may influence the speed of delivery. As future work, we could download all resources of every page to a local server, introduce an artificial network lag to simulate real-life conditions, and perform the test again. Second, even though 1000 sites were tested, the test was run once per site; to further fortify the evidence, multiple tests per site may need to be run, and possibly on multiple pages per domain.


Chapter 8

Conclusion

In this thesis we explored the scenarios that lead to the blocking of a browser's rendering process during the loading of web pages. By testing a list of the 1000 most popular sites, we determined that an average of 1.96 seconds is added to the time to first render of a page when loading external CSS files. Further research on this topic revealed that unnecessary loading of CSS rules for elements not visible on the first render of a site contributes greatly to this issue, especially when additional GET requests are made and more rules are parsed into the CSSOM tree. Page height data from the top 1000 websites allowed us to compare that information to the most popular screen resolutions, and to conclude that 82.2% of the tested sites do not fit on 72.2% of the most popular screen resolutions. Using this knowledge, we formulated a hypothesis stating that inlining critical CSS rules in web pages and loading non-critical ones asynchronously creates a significant decrease in the time to first render.

We analysed current methodologies and tools that aim to overcome this problem, and by exploring their public GitHub issue lists and manually inspecting their source code, we were able to determine faults and missing features in all implementations. Using this knowledge, we present a Node.js tool, named Focusr, that can perform three main functions:

• Detect and extract CSS files linked in input HTML documents.

• Determine and extract CSS rules that apply only to elements inside a specific viewport resolution — critical above-the-fold CSS.

• Remove externally linked CSS files from the resulting HTML page, inline the critical CSS rules and use Javascript to asynchronously load the remainder of the CSS rules.

In order to show that the same improvements can be accomplished on dynamic sites as well, we created a proof-of-concept plugin for Wordpress, chosen for having almost 60% of the global market share of content management systems. It uses the critical CSS rules and Javascript files generated by Focusr, hooking into the action system of Wordpress to deliver them in the header and footer accordingly, while using regular expressions to remove any existing link tags.

We validated our work by running the same test on the initial list of 1000 websites, this time comparing the time difference between the original and Focusr-optimised versions of the sites. An average 1.24 second reduction was observed, which corresponded to our initial research results, taking into account that this time there was actual data to be added to the CSSOM. The same improvement in the time to first render was observed with the Wordpress plugin deployed on The Next Web's main site, where nine consecutive tests showed a 1.35 second speedup, and a 1.4 second speedup was observed on slower, 3G-speed connections.

Future work

As future work, aside from the items already stated in the threats to validity, the Wordpress plugin could be made independent from the main Focusr tool, analyzing and extracting critical CSS rules on its own.


Acknowledgement

I would like to express my complete gratitude to my supervisor, Dr. Vadim Zaytsev, whose inspiration and expertise guided me throughout this project. I would also like to thank The Next Web for being supportive of my ideas, giving me technical advice and being a wonderful host during my 3 months there. Finally, I would like to thank my close friends and family, who gave their very best to understand the details of my work even if it was in a field completely unknown to them, and Vesna, who was always by my side as my strongest supporter.
