Streamdata.io (linked in a different post) has blogged [1] about why they chose SSE instead of WS. Additionally, WS drops you down to something very similar to raw TCP, so you have to bring your own application-level message framing, while SSE comes with its own minimalistic format. From a middlebox's perspective, SSE isn't that different from streaming video or long polling, although the SSE recommendation [2] does have a line about older proxies:
"Legacy proxy servers are known to, in certain cases, drop HTTP connections after a short timeout. To protect against such proxy servers, authors can include a comment line (one starting with a ':' character) every 15 seconds or so."
As far as I know, two of the few things WS does on top of TCP are in fact making endpoints URL-addressable and providing message framing. You can even have text-only frames with a well-defined encoding (UTF-8).
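To make the framing point concrete, here is a minimal sketch of how a single unfragmented text frame is laid out, as I understand RFC 6455. The function name ws_text_frame is made up for illustration. It only covers the unmasked server-to-client direction; client-to-server frames must additionally be masked, which this sketch leaves out.

```python
import struct


def ws_text_frame(message: str) -> bytes:
    """Build one unmasked, unfragmented WebSocket text frame."""
    payload = message.encode("utf-8")  # text frames carry UTF-8
    header = bytearray([0x81])         # FIN=1, opcode=0x1 (text)
    n = len(payload)
    if n < 126:
        header.append(n)                # length fits in the 7-bit field
    elif n < 65536:
        header.append(126)              # marker: 16-bit length follows
        header += struct.pack("!H", n)
    else:
        header.append(127)              # marker: 64-bit length follows
        header += struct.pack("!Q", n)
    return bytes(header) + payload
```

So the receiver always knows where a message ends from the length prefix, which is exactly the thing you would have to invent yourself on a raw TCP stream.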
In any case, thanks for the links. I haven't given them a deep read yet but will definitely do so soon.
"Legacy proxy servers are known to, in certain cases, drop HTTP connections after a short timeout. To protect against such proxy servers, authors can include a comment line (one starting with a ':' character) every 15 seconds or so."
[1] http://streamdata.io/blog/push-sse-vs-websockets/
[2] https://html.spec.whatwg.org/multipage/comms.html#authoring-...