I'm going to say something that is a bit out there, but to this day web frontend still feels like a downgrade in some ways from basic things Visual Basic had in 1998.
I was having similar thoughts about building UIs back in the late 90s and for a few years thereafter. There were a number of tools; I remember using Tcl/Tk to create a GUI scheduling program that was relied on for many years. Later on we had Delphi and numerous others. I suppose building for the web was so much more complicated because the HTML/CSS/JS infrastructure was ill-suited as a basis for comparable GUIs.
Stars kinda famously fuse elements up to iron as part of normal operations. And even if you exclude that, the entire solar system is leftovers from a previous star - all that is inside our current star too. Sure, much of it isn't at the surface, but there's not much of a reason to expect that literally zero of it randomly floats up among the lighter elements.
That said, "heavy ions and atomic nuclei of elements such as carbon, nitrogen, oxygen, neon, magnesium, silicon, sulfur, and iron" make up only "trace amounts" of the solar-wind plasma [1].
That looks a bit bare minimum. Not the use of regex as such, but that it's a single line with a few dozen words. You'd think they'd have a more comprehensive list somewhere and assemble or iterate the regex checks as needed.
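A minimal sketch of assembling the check from a maintained list, in Python. The word list and the function names here are made up for illustration; they are not taken from the code being discussed.

```python
import re

# Hypothetical word list; in practice this would be loaded from a file
# or config that can grow without touching the matching code.
KEYWORDS = ["alpha", "beta", "gamma delta"]


def build_pattern(words):
    """Compile one case-insensitive regex out of a list of literal terms."""
    # Escape each term so it is matched literally, and put longer terms
    # first so a phrase is preferred over a shorter term it starts with.
    escaped = sorted((re.escape(w) for w in words), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


PATTERN = build_pattern(KEYWORDS)


def find_matches(text):
    """Return every listed term found in text, lowercased, in order."""
    return [m.group(0).lower() for m in PATTERN.finditer(text)]
```

The word boundaries keep a term like "alpha" from matching inside "alphabet", and rebuilding the pattern from the list means adding a term is a data change rather than an edit to a long one-line regex.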
A lot of styling is still done in the old Advanced SubStation Alpha subtitle format, which is nice in a whole number of ways but doesn't have any standards working group behind it, so it's a bit ignored in software and operations.
People use either some flavor of W3C's Timed Text or WebVTT instead (and it was already a pain to drag them, feet first, onto those and off the old analog broadcast formats). Now, here's the thing. WebVTT isn't radically different in format and features from (A)SSA and it has plenty of styling options... but, once again, a lot of platforms and software are dragging their feet on supporting them.
So the industry has been sloooowly doing the right thing moving to the W3C standards (not a huge fan of Timed Text myself, but it exists for a reason), but only with the most basic and safe features. Which are also about as many features as you get out of plain speech-to-text output, so it's even easier to make that decision.
The number of requests on some pages displaying the simplest stuff is mind-boggling. 160 requests for a page just displaying an HTML5 video and a title, 360 requests for a Reddit page, it's nuts. We don't need to be like this.