As I understand it, the client has an opportunity to proactively cancel pushes of assets it already has cached.
The server will send a "push promise" which basically says "I'm going to send this file to you", and then the client can come back and say "don't bother, I already have it". And this all happens in parallel with the download of other assets (like the main page), so it doesn't really slow anything down.
It can slow things down, because push works like this:
The client requests some page (e.g. index.html), which triggers the push on the server - say, one for asset1.png. The server initiates the push by sending a PUSH_PROMISE frame down the wire, which contains the headers of a synthesized HTTP request for that exact resource - as if the client had sent a GET request for asset1.png in a HEADERS frame. The client can only react to this frame once it receives it and sees that asset1.png is already in its cache. It can then cancel the stream by sending a RST_STREAM frame. But obviously that takes time: while the push promise travels to the client and the cancellation travels back, the server might decide to send not only the push promise but also part of (or all of) the data. So asset1.png is already going down the wire, consuming bandwidth that could have been used for other things.
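To make the sequence concrete, here's a toy sketch of the exchange - not a real HTTP/2 implementation, just the order of events. The frame names (HEADERS, PUSH_PROMISE, RST_STREAM) match the spec; the field names and helper function are illustrative.

```python
# Toy model of the push exchange described above. Not real HTTP/2 --
# it only records which frames go in which direction, and in what order.

def push_exchange(client_cache):
    """Return the frames exchanged when the server pushes /asset1.png."""
    frames = []
    # 1. Client requests the page on stream 1.
    frames.append(("client", "HEADERS",
                   {"stream": 1, ":method": "GET", ":path": "/index.html"}))
    # 2. Server promises the push: headers of a synthesized GET for the
    #    asset, reserving stream 2 for the pushed response.
    frames.append(("server", "PUSH_PROMISE",
                   {"stream": 1, "promised_stream": 2, ":path": "/asset1.png"}))
    # 3. Only once the promise arrives can the client check its cache
    #    and cancel the reserved stream.
    if "/asset1.png" in client_cache:
        frames.append(("client", "RST_STREAM",
                       {"stream": 2, "error": "CANCEL"}))
    return frames

# With the asset already cached, the client cancels the promised stream:
for sender, frame, fields in push_exchange(client_cache={"/asset1.png"}):
    print(sender, frame, fields)
```

The point of the sketch is the ordering: the cancellation in step 3 cannot happen before the promise from step 2 has crossed the wire, which is the delay window the discussion below is about.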
That might slow down downloads from other pages or processes, but as far as the download from the current page goes (which uses multiplexed requests and responses over a single connection) I don't see how that's any worse than it'd be without server push.
In the worst-case scenario with server push that you described, the server decides to push content the client already has cached before pushing content the client doesn't have, and begins transmitting that redundant content before the client has a chance to cancel the push. Once the client receives push promises for the content it doesn't need, it can immediately send a request to cancel the download of those resources. Once those cancellation requests reach the server, it stops transmitting the resources the client has cached, and starts transmitting the data the client actually needs.
Contrast this with the same scenario without server push, where the client parses the HTML for the main page to determine what resources it needs, then sends requests for those items, which the server must receive before it starts transmitting them. In both cases a round trip from the server to the client and back is needed before the server can start transmitting the necessary assets. But without server push the client needs to download and parse the main page before it can start telling the server which resources it needs, whereas with server push it can send cancellation notices before the current page is downloaded or parsed.
The difference is that bandwidth is being used by incoming pushed assets before the abort. This might not matter much on fast wired connections but can make a big difference on mobile networks.
Exactly. The start of the push will already consume downstream bandwidth until it's canceled by the remote side. I guess in the worst case it consumes up to the stream's maximum flow-control window size (typically 64 kB). The "classic" approach, by comparison, is the client initiating another GET request for each asset after it has received the index page. This requires more upstream traffic. However, the webserver could directly respond with a 304 for cached assets, which means less downstream traffic.
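A rough back-of-envelope for how much redundant data can go down the wire before the cancel takes effect: the server keeps sending until the client's RST_STREAM arrives (roughly one round trip after the promise), but it can never exceed the stream's flow-control window, which defaults to 65,535 bytes in HTTP/2. The link numbers below are made up for illustration.

```python
# Upper-bound estimate of redundant downstream bytes for one cancelled push.
# Assumptions (illustrative): the server starts sending DATA frames right
# after the PUSH_PROMISE and stops as soon as the client's RST_STREAM
# arrives, about one round trip later. The HTTP/2 default initial
# flow-control window is 65,535 bytes per stream.

INIT_WINDOW = 65_535  # bytes, HTTP/2 default per-stream window

def wasted_bytes(asset_size, bandwidth_bps, rtt_s):
    """Pushed bytes the server can send before the cancel takes effect."""
    in_flight = int(bandwidth_bps / 8 * rtt_s)  # bytes sendable in one RTT
    return min(asset_size, INIT_WINDOW, in_flight)

# 200 kB asset on a fast 10 Mbit/s link with 100 ms RTT:
print(wasted_bytes(200_000, 10_000_000, 0.100))  # 65535, capped by the window
# Same asset on a slow 1 Mbit/s mobile link with 200 ms RTT:
print(wasted_bytes(200_000, 1_000_000, 0.200))   # 25000, capped by bandwidth
```

On the fast link the per-stream window is the limiting factor; on the slow mobile link the bandwidth-delay product is, but 25 kB of wasted transfer hurts proportionally more there, which is the point made above.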
I had a bit of a look around after reading the article; this page might also be useful in explaining how push interacts with the cache, along with some suggestions on when to apply it:
No, even in the worst case it's exactly the same number of round trips.
Without server push:
1. Client requests main page ->
2. Client receives main page <-
3. Client requests subresources ->
4. Client receives subresources <-
With server push:
1. Client requests main page ->
2. Client receives push promises and main page <-
3. Client cancels promises it doesn't need ->
4. Client receives subresources <-
The only difference is that with server push, steps 2, 3 and 4 happen in parallel, and step 3 can be omitted entirely in the event that the client doesn't need to cancel any pushes.
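The parallelism can be put into a toy timing model. All the numbers are made up (100 ms per round trip, 50 ms to parse the page) and transfer time is ignored; the only point is that without push, the parse sits between two sequential round trips, while with push the subresources start flowing right behind the page.

```python
# Toy timing model of the two flows above, in milliseconds.
# Illustrative numbers, not measurements; transfer time is ignored.

RTT_MS = 100    # one round trip (assumed)
PARSE_MS = 50   # time to parse the HTML and discover subresources (assumed)

# Without push: fetch page (1 RTT), parse it, then fetch subresources (1 RTT).
classic = RTT_MS + PARSE_MS + RTT_MS

# With push: the promises and the subresource data arrive right behind the
# page itself, so steps 2-4 overlap; cancellations for cached assets ride
# on the same connection and don't add a blocking round trip.
pushed = RTT_MS

print(classic, pushed)  # prints: 250 100
```

Same number of round trips on the wire in the worst case, but the pushed flow overlaps them instead of serializing them around the parse.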
Ah, fair point. Now I see what you were trying to say.
Assuming the server has no way of determining which assets the client has cached (which depending on implementation may not be the case) you're of course correct. However, after step 2 the page has already fully loaded in both cases, so step 3 doesn't really slow anything down.
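On "the server may have a way of determining which assets the client has cached": one proposed mechanism was cache digests, an IETF draft ("Cache Digests for HTTP/2") in which the client sends the server a compact summary of its cache so the server can skip redundant pushes entirely. The draft used a Golomb-coded set; the sketch below substitutes a plain Python set just to show the idea.

```python
# Simplified sketch of the cache-digest idea: the server consults a
# client-supplied summary of cached URLs before deciding what to push.
# A plain set stands in for the draft's compact Golomb-coded set.

def assets_to_push(page_assets, client_digest):
    """Push only the assets the client's digest says it lacks."""
    return [a for a in page_assets if a not in client_digest]

page_assets = ["/style.css", "/app.js", "/asset1.png"]
client_digest = {"/asset1.png"}  # client reports this as already cached

print(assets_to_push(page_assets, client_digest))  # ['/style.css', '/app.js']
```

With a digest in hand the worst case above disappears, because the redundant push is never started in the first place.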
Here's an article I read which goes into how server push works in a lot more detail: https://hpbn.co/http2/#server-push