That's cute, but pure fantasy, imo. No non-trivial codebase that's actually used and maintained is immune to obvious deficiencies getting overlooked. Look at this [0], for example: as simple as can be, an obvious defect in a critical area, sitting in the open in open source for ages — the kind of bug an LLM would happily generate in a heartbeat and a reviewer could easily miss. We are human, after all. [1]
Early on in Amazon’s history (long before same-day shipping), they added a feature that would tell you, on a product page, whether you had recently bought that same product. The metrics spoke loud and clear: it caused purchase counts to go down. Human common sense about the customer’s experience overruled the data, and some variation of that feature survives to this day. That’s the “customer obsession”; unfortunately, most businesses only copy the “data driven” part.
I don't know about some of those "incantations", but it's pretty clear that an LLM can respond differently to "generate twenty sentences" vs. "generate one word". That means you can indeed coax it into more verbosity ("in great detail"), and that can help align the output by giving it more relevant context to condition on. (Inserting irrelevant or wildly improbable text into an LLM's output and forcing it to continue from there shows just how detrimental bad context can be.)
Of course, that doesn't mean it'll definitely be better, but if you're making an LLM chain it seems prudent to preserve whatever info you can at each step.
> What I mean by that is: after reading through a brief description of a project, or a conceptual overview, they are no better than noise at predicting whether it will be worthwhile to try out, or rewarding to learn about, or have a discussion about, or start using day-to-day.
You could criticize a Michelin inspector the same way. The poor bastards have to actually taste the dish and can't decide merit based on menu descriptions alone.
I have RSI issues with my wrists. It really helps to have split ortholinear keyboards, if only because I have trained on them specifically to type without moving my wrists at all (using a Miryoku layout), while old habits persist on a standard QWERTY TKL. Of course, it also helps to fit light-sprung switches (like Kailh Silvers) — it's self-evident that lighter switches are easier to press.
I had initially thought that it'd be hard to use both kinds of keyboards at once, but my muscle memory for either seems unaffected when I use the other one.
The actual founder of Bitcoin cannot touch their money without causing a lot of panic in the market. I believe Coinbase's 2021 S-1 prospectus explicitly listed "the identification of Satoshi Nakamoto... or the transfer of Satoshi’s Bitcoins" as a business risk factor.
In other words, AI raises the floor. If you were already near the ceiling, relying on it can (and likely will) bring you down. In areas where raising the floor is exceptionally good value (such as bespoke tools for visualizing data, or assistants that intelligently write code boilerplate, or having someone to speak to in a foreign language as opposed to talking to the wall), AI is amazing. In areas where we expect a high bar, such as an editorial, a fast and reliable messaging library, or a classic novel, it's not nearly as useful and often turns out to be a detriment.
When you say "notes", "community notes" is definitely not the first thing that comes to mind. On your landing page, one has to either scroll three screens down or read some very small print to find the first mention of community notes; I think even the video doesn't make it entirely clear what the "notes" part of your app really is. The fact that this is community notes should be instantly apparent. It would probably help your conversions, too, because the value prop becomes easier to grok.
Perhaps notes is just an unfortunate name for this kind of thing. I much prefer Twitter's original "Birdwatch".
Is this for very large servers? I'm curious to know why community notes was chosen as the lever for better moderation.
HN already has a mechanism for flagging posts. If flagging low-effort/trivial Show HN posts were normalized, I suspect it would work just fine: if a human can't easily decide whether or not to flag a post, they likely wouldn't.
As dang posted above, I think it's better to frame the problem as an "influx of low-quality posts" rather than as policies explicitly about AI. I'm not sure I even know what "AI" is anymore.
[0] https://www.theguardian.com/technology/2014/feb/25/apples-ss...
[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC9436839/