Resource hints, HTTP/2 Push, and CDNs all provide overlapping performance benefits to some degree. I'll refer to these collectively as 'streamlining' techniques. I'll also use the term 'RTT' (Round Trip Time), which is essentially a measure of latency.
Resource hints (prefetch, preload, preconnect)
Resource hints work via either a <link> tag or a 'Link' HTTP header.
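Concretely, the two delivery mechanisms look like this (the hostnames and paths are placeholders):

```html
<!-- As <link> tags in the document <head> -->
<link rel="preload" href="/css/main.css" as="style">
<link rel="preconnect" href="https://fonts.example.com" crossorigin>
<link rel="prefetch" href="/likely-next-page.html">
```

The equivalent 'Link' HTTP header form is e.g. `Link: </css/main.css>; rel=preload; as=style`, which is useful when you can't (or don't want to) modify the HTML.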
Pros:
1) With 'preload' and 'preconnect' we can stop late-discovered resources from becoming unnecessary bottlenecks. Resources referenced in the <head> will generally be discovered quickly and loaded in parallel automatically, but resources may also be referenced inside subresources (e.g. web fonts referenced from CSS) or deeper in the page (e.g. ad code). Use 'preload' if you want the resource downloaded up-front, and 'preconnect' if you just want it to download faster when it is eventually needed (by pre-establishing a connection to its server).
2) With 'prefetch' we can make future navigation to common followup webpages very fast.
Cons:
1) Adds extra request weight to actually make the necessary declarations.
2) Doesn't eliminate the RTT: at best there will be one round trip for subresources, starting as soon as the first HTTP response begins to come in (assuming the declared dependencies can be detected immediately and don't exceed the six-connection limit). If we can't fetch everything at once, then at least Keep-Alive will allow our connections to be recycled.
3) 'prefetch' is not supported by Safari.
HTTP/2 Push
Pros:
1) Deliver all your page resources over the same connection as your main webpage, i.e. only one RTT in total.
Cons:
1) It is still not trivial to get your own HTTP/2 server working.
2) If the connection stutters then your benefits die with it, and the more you try to transport over it, the more likely this is to happen. That typically means mobile connections, which are exactly the connections we most want to optimise.
3) It's wasteful of bandwidth: it takes an RTT for the client to tell the server that it already has (cached) something that is already being pushed to it.
4) If we intend to leverage advantage '1' then we are necessarily making our application server a bottleneck. Everything has to go through that server, risking CPU/memory/bandwidth saturation for high-traffic websites. We can use a proxy (like Cloudflare), but proxying anything has a performance impact of its own.
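As a sketch of how push is typically configured, nginx (1.13.9+) lets you name resources to push alongside a response; the paths here are placeholders:

```nginx
server {
    listen 443 ssl http2;

    location = /index.html {
        # Push these subresources over the same connection,
        # without waiting for the browser to request them
        http2_push /css/main.css;
        http2_push /js/app.js;
    }
}
```

Alternatively, 'http2_push_preload on;' tells nginx to convert 'Link: rel=preload' headers emitted by your application into pushes, which avoids hard-coding the resource list in the server config.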
Cons shared by resource hints & HTTP/2 Push
1) Some people may be philosophically opposed to any streamlining that results in resources being loaded that aren't needed on a page. Practically speaking, this could be a concern for mobile users on tight data limits.
2) Some analytics systems may get confused (though not Google Analytics, which works via JS execution).
3) Adds extra complexity, as you have to maintain all your dependency declarations in code (although I show how to avoid that in 'Other points').
CDNs
Pros:
1) Reduce the RTT significantly by serving resources from a physically near server.
2) Offload work from the application server (i.e. reduce bottlenecking). This is especially important for unproxied Apache servers which are not particularly efficient at delivering static resources.
3) Even modern browsers will only open around six parallel connections per host (to stop servers getting overly stressed), so introducing a second host gives you another six HTTP slots. Writing this makes me wonder why there isn't an HTTP header for a server to declare a higher limit; maybe that could be a future web standard.
4) CDNs can potentially do cool stuff with images, e.g. serving appropriate resources based on device capability, device size, and bandwidth/data concerns.
5) Your images will be served from a cookieless domain, reducing the request overhead a bit.
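As an illustration of point 4, many image CDNs accept transformation parameters in the URL, which combines nicely with 'srcset' so the browser fetches a size-appropriate variant (the CDN hostname and query parameters below are hypothetical):

```html
<img src="https://cdn.example.com/hero.jpg?w=800"
     srcset="https://cdn.example.com/hero.jpg?w=400 400w,
             https://cdn.example.com/hero.jpg?w=800 800w,
             https://cdn.example.com/hero.jpg?w=1600 1600w"
     sizes="100vw"
     alt="Hero image">
```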
Cons:
1) The CDN's hostname will need to be DNS-resolved (unless you use a raw IP address for it, which may be a smart idea).
2) If you're using HTTPS (as everybody now has to) then an HTTPS connection will also need to be established to the CDN. Hopefully the CDN uses a TLS configuration that minimises this handshake time.
3) Depending on how the CDN works, you may have to start priming your resources on it via some new process.
Other points
1) Rather than maintaining data about what to streamline, a smart programmer with plenty of time could use a Bayesian learning algorithm, a system that automatically notices rendering bottlenecks, or a system that performs 'above the fold' calculations. I think Cloudflare's enterprise solution does something like this, and Google Amp too.
2) The techniques are not necessarily mutually exclusive. We can use a CDN that supports HTTP/2 Push or puts out resource hint HTTP headers. A very powerful CDN might combine this with the above self-learning technique(s). At the very least a developer can pre-code up resource hints for the CDN to benefit from, as appropriate.
Conclusions
1) We should probably avoid resource hints & HTTP/2 Push unless the data involved is small and/or very likely to be wanted, to avoid annoying the key mobile users we are likely doing this optimisation for in the first place. These techniques are usually too presumptive/optimistic, and use of a CDN typically wins out.
2) Google Amp avoids the problems discussed, as it:
- heavily promotes reduced page complexity tailored for mobile usage
- leverages shared JS library caching (I assume)
- statically caches all your (necessarily static) content on its CDN
- pre-optimises server-side to work out how to load resources (I assume)
I have expressed my major concerns with Amp before, but those concerns are not technical. Amp works by enforcing very opinionated constraints, ideal for a very common use case of (basically) static mobile-optimised pages, and powering it with Google's unmatchable CDN capabilities. Note that Amp is not necessarily using any particular technique discussed here, as it also relies on its own JS, frame, and inlining techniques.
3) Use Resource hints for very specific bottleneck cases you've detected via detailed analysis using browser developer tools. Use of CSS-initiated web fonts is a good example.
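For the web-font case, the usual fix is a 'preload' in the <head>, so the browser doesn't have to download and parse the CSS before it even discovers the font (the font path is a placeholder; note that 'crossorigin' is required for font preloads even on the same origin):

```html
<link rel="preload" href="/fonts/body-text.woff2" as="font" type="font/woff2" crossorigin>
```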
What I've intentionally left out of my analysis
1) 'prerender', because it is being dropped from Chrome (it just didn't make sense due to the very presumptive hogging of a user's computing resources). As an aside, Amp does prerendering, but it does it using an alternative frame technique (Amp content actually loads in frames from Google-hosted content).
2) 'dns-prefetch' is effectively an older, weaker version of 'preconnect' (it resolves DNS but doesn't open a connection); beyond wider browser support, I can't see any reason to use it.
3) Resource inlining using base64, or inline JS or CSS. This is something that does make sense to do sometimes, but I didn't want to overcomplicate my analysis. The downside to inlining is the lack of atomic cacheability (i.e. inlined resources inside HTML will be re-sent many times).
4) Trying to optimise your DNS lookups is not going to help much: DNS is really fast, with multiple levels of geo-distributed caching, and for most of your resources the client will already have the result cached locally. Avoiding extra DNS lookups for embedded resources is legitimate though.
5) Another technique is using CSS sprites. This is when many small images referenced via CSS backgrounds are merged into one shared image file, hence reducing requests. Personally I prefer image inlining because maintaining sprites is a lot of work and leads to messy code.
6) A web packaging standard is coming. Google Amp is moving towards working with that.
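To illustrate points 3 and 5 above: inlining embeds a small image as a base64 data URI directly in the stylesheet, while a sprite crops icons out of one shared file (the coordinates and the truncated base64 payload are illustrative):

```css
/* Point 3: an icon inlined as a base64 data URI -- no extra request,
   but re-sent whenever the containing stylesheet changes */
.icon-search {
    background-image: url("data:image/png;base64,iVBORw0KGgo...");
}

/* Point 5: a sprite sheet -- many icons merged into one image,
   each class exposing a different 16x16 region */
.icon {
    background-image: url("/img/sprite.png");
    width: 16px;
    height: 16px;
}
.icon-home { background-position: 0 0; }
.icon-user { background-position: -16px 0; }
```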