TD Bank believes it will make them a profit. Their interests are not those of eBay’s shareholders: if they can juice the financials long enough to sell their loan, they don’t care if the company goes bankrupt the minute after that sale closes.
Even a cursory familiarity with the history of the industry shows both that this is untrue and that it leaves out many of the core reasons why finance is regulated. Bankers do make mistakes, but their focus is on what makes them a profit now rather than what’s good for their client or the country long term. The bank does not care if GameStop goes bust as long as that happens after the loans are repaid or, more likely, sold. None of the guys who sold incredibly dodgy mortgages—if you weren’t in the market in the late 2000s, they would literally let applicants pencil in their income without checking it—went to jail for packaging those mortgages up so many times removed that they couldn’t reliably prove the loan even existed and reselling them with inflated ratings, and absolutely none of them had to repay their bonuses. Once they found a buyer for an “AAA” derivative, foreclosure was a problem for the retirement fund left holding it after a couple of sales.
That’s what I’d expect here, too: they’ll make some flashy announcements to juice share prices (“AI powered auctions paid in crypto!”) and sell that debt, spin whatever’s left into a subsidiary which splits off, and then profess complete surprise when that goes bankrupt.
> Even a cursory familiarity with the history of the industry shows both that this is untrue
The modern trend of believing that “history” is made up of one or two things that everyone saw on the news is actually really entertaining. Definition of “cursory understanding” tbh.
The banking industry, historically, is far from stupid.
This particular story is just basic PR driven market manipulation and has nothing to do with the banking system.
> The modern trend of believing that “history” is made up of one or two things that everyone saw on the news is actually really entertaining.
It would be useful if you could provide a more detailed version of your argument. I don’t think you seriously believe that banks don’t make mistakes but the way this is written does sound like you’re saying it’s highly unlikely while ignoring the other half of the sentence you quoted.
> Cohen is already rich rich, his GameStop compensation doesn’t really matter much
I think this argument is much stronger in the opposite direction: if his motivations were not focused on accumulating wealth, he’d be retired or running some kind of charity once he was that far past the point where he had to work. The fact that he’s not suggests that he derives his self-identity from wealth and the guys who do that are rarely satisfied at mid-tier rich.
He wants to be the next Warren Buffett, I believe this is a stated goal, or at least he’s been very clear that this is his inspiration. He wants GameStop to become the next Berkshire.
If that were a military problem, they’d have done it. Unfortunately, it’s a societal problem, and you can’t bomb governments into functioning or people out of poverty.
Terrorism comes in two flavors: you have small groups where a pure assassination strike can work (e.g. Bin Laden) and larger groups backed by an actual social faction (e.g. the Taliban, Hamas, ISIS), where such strikes don’t.
The Somali pirates fall into the latter territory: desperately poor people with a dysfunctional national government see money floating by daily. You can’t bomb that dynamic out of existence unless you’re willing to commit mass murder or occupy the territory and make a Marshall Plan-level investment in the local society.
This is definitely complicated—I’m not a neuroscientist but worked for some and married one, so I’ve heard quite a few entries from the genre of how our brains fool ourselves or make our conscious experience seem more coherent and linear than it actually is—but the big ones I see are the inability to learn from experience or have a generalized sense of conceptual reasoning. For the latter, I’m not just thinking about the simple “count the r’s in strawberry” things companies have put so much effort into masking but the way minor changes in a question can get conflicting answers from even the best models, indicating that while there’s something truly fascinating about how they cluster topics it is not the same as having a conceptual model of the world or a theory of mind. This is the huge problem in the field: all of these companies would love to have a model which is safe to use in adversarial contexts because then the mass layoffs could begin in earnest, but the technology just isn’t there.
This isn’t a religious argument that there’s something about our brains which can’t be replicated, but simply that it’s sufficiently more complex than anything we have currently.
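For contrast with the “count the r’s in strawberry” example above: the task itself is a one-liner in conventional code, and the usual explanation for why models fumbled it is tokenization (they see tokens, not characters) rather than a lack of raw capability.

```python
# Trivial in ordinary code: count occurrences of a character in a word.
# Models struggled here because their input is tokenized, so individual
# letters aren't directly visible to them the way they are to this loop.
def count_letter(word: str, letter: str) -> int:
    return word.count(letter)

print(count_letter("strawberry", "r"))  # 3
```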
Not unless you’re referring to significant mental illness, no. Individual people may vary if, say, I ask for health advice but if I ask the same doctor they’re not going to flip the answer based on whether I use medical or wellness influencer phrasings — and that allows them to build a reputation which other people can rely on.
This especially applies to mistakes: the junior developer who drops a database by mistake is unlikely to ever do that again, whereas the same AI companies’ models keep doing that to a small but non-zero number of customers because they don’t have that higher-level learning process or anything like fear of consequences.
Humans can't reliably subitize more than five-ish objects, while chimps can actually do this task better than us. That's our "can't count the R's in strawberry" (general letter counting, which flagship models can now do reliably).
That’s not a valid analogy: humans reliably perform that task billions of times daily. It’s still routine to find cases which reveal that while models may have improved on some basic tasks (or learned to call a tool) there isn’t a deeper understanding of the underlying concept or the problem they’re being asked to solve.
And AI agents reliably-ish do tasks billions of times a day that humans struggle with, namely regurgitating information at incredible rates across wide breadths of topics. I see it as merely a matter of degree, not category.
How do you measure "deeper understanding" in humans? You usually do it by asking them to show their work, show how the dots connect. Reasoning models are getting there, and when they do, I'm sure the goalposts will move yet again.
We subsidize driving by somewhat over a trillion dollars annually, mostly due to lax penalties for negligence which shift liability to drivers’ victims[1]. One way to tackle all of these problems would be requiring drivers to cover the full damages.
Another simple and effective measure would be changing fines from absolute values to a percentage of income. Right now, parking in a bike lane usually doesn’t kill anyone, so drivers figure there’s only a small chance of a small fine, but if it were a chance of losing, say, 0.1% of annual income, Waymo’s technology would magically become capable of not doing that. Add a right of private action and enforcement would be high enough to really speed things along, too, and that’d improve safety and travel times for all road users.
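The arithmetic behind the proposal is easy to sketch (the 0.1% rate and the income figures below are purely illustrative, not from any actual proposal):

```python
def proportional_fine(annual_income: float, rate: float = 0.001) -> float:
    """Income-scaled fine: charge a fixed fraction of annual income."""
    return annual_income * rate

# A flat $100 fine is the same for everyone; a 0.1% fine scales with
# ability to pay, so the deterrent effect is roughly constant across incomes.
for income in (40_000, 400_000, 4_000_000):
    print(f"income ${income:,}: flat $100 vs proportional ${proportional_fine(income):,.0f}")
```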
Yeah, making fines relative to income would change behaviors for sure. A $20 ticket when you make $20 an hour hits different when you're making $200 or $2,000/hr. If it was a percentage of pay, then the ticket would actually sting.
> Besides, nothing legally prevents placing code under a license. Enforceability is the question, not permission.
That’s not how copyright works. If you don’t own the code, you can’t release it under a license. How much human editing is needed to establish copyright is a huge open question right now.
Who’s doing it better? I have yet to hear from a Google or Amazon user who has a transformatively better experience, and I think that’s why they haven’t jumped yet: they have hundreds of millions of users with daily habits that they don’t want to lightly disturb.
> I think that’s why they haven’t jumped yet: they have hundreds of millions of users with daily habits that they don’t want to lightly disturb.
I don't think that's part of their decision making. Liquid Glass moved most things around for seemingly little more than novelty, and that's not the first time.
They have done this before: release something large early in anticipation of a major shift and iron out issues before the shift happens. Liquid Glass started off a little janky, but they appear to have been ironing out the initial issues with each update.
From what I understand (which might be wrong), Liquid Glass was at least partially inspired by visionOS and "spatial computing". And I guess on that platform it might make sense for some use cases.
That doesn't change the fact that I can hardly read some of the user interface in Apple Music for example.
It's not that the idea is bad, but it's badly executed.
Really? None of my issues are fixed. The settings panel still has a massive gray empty chunk hanging off the bottom which makes it look like a 13 year old coded it...
Is Liquid Glass not just a means to slowly force old phones into obsolescence? My iPhone 11 is fairly slow now, and they’ve probably brought forward my next phone purchase by a year.
Liquid Glass was also noteworthy for being the first macOS release since 10.1 which was worse across the board in a deliberate manner. They have shipped bugs before but this time it got such poor reception because all of the regressions were intentional and there wasn’t an expectation that they’d be patched.
I don’t think that cavalier attitude is universal at Apple and I don’t think the Siri PM wanted to break with their past respect for UX.
Agreed. I vaguely remember another HN link that said Apple tried a competing-team approach to building a better siri, but it fell apart due to internal politics reasons?
Right now Alexa+ and Gemini are objectively better.
The best is ChatGPT voice mode. It understands non English words and accents amazingly well, and even though the LLM model isn’t the full fledged one, I can have deep conversations with it for an hour without it missing a beat.
Speech-to-text should work, but I regularly have to manually edit the transcribed input; the more specialized the vocabulary, the more often. It completely disregards the context of the current input: a Hacker News comment, for example, might involve specialized technical and IT vocabulary.
Any of the LLM-based ones should pull this* off, which is to say... none of the popular commercially available ones, yet?
Alexa+ does, but I don't use it for anything except kitchen timers and home automation triggers, so I can't speak to how well it works in a longer conversation.
Zoom's meeting-notes feature excels at this; Google Meet is terrible at it. Meet mishears our company name about 90% of the time, and various attendee names are a coin toss.
* "this" being: context consideration in speech-to-text/transcription.
The iOS equivalent would be Shortcuts, which, while not as powerful as Tasker depending on the context, is an official Apple feature that most apps support. Claude and ChatGPT both have various Shortcuts hooks, including voice conversation.
The experience of having to tell Siri to "Ask ChatGPT <about something>" really sucks, though. It doesn't consistently do it, the handoff frequently just stalls out and you never get a response, the transcription that gets passed to ChatGPT is low quality, etc.
And though I have the feature enabled that should cause it to ask ChatGPT about things it can't answer, that works even less frequently.
But even if all of these things were true, the stuff on your phone you would expect to be exposed to the model as available tool calls is not. So their efficacy is very limited.
Oh I was just thinking creating a shortcut that you'd tap on your Home Screen/control shade (whatever it's called) to activate ChatGPT, or wire up to the action button. I forgot you can have Siri do the "ask ChatGPT xyz" thing – I agree, that integration sucks.
I'd definitely do the former. I don't even think this is specific to ChatGPT or Claude's apps.
There seems to be something about how intents get triggered by Shortcuts on iOS that feels flaky to me. Whenever some app suggests a shortcut (most recently Starbucks promoted a shortcut that orders your "usual"), the success rate when I tap it is <50%.
It's possible it's uniquely worse on my device, since I haven't done a "clean install" (vs letting the device upgrade flow copy over) in like a decade. But I'm also not up for dealing with the pain of setting up from scratch just to find out it's bad on a fresh profile, either.
Alexa+ has been a massive downgrade for me. It's extremely laggy and constantly misunderstands me, whereas the old one never did. "Set a timer for 20 minutes" used to be instant and just work, I did this the other day and it took 10 seconds to respond and set a timer for 10 minutes.
Same here. I can see why LLM-driven voice assistants make sense to product people in the abstract, but introducing non-deterministic behavior into a device I primarily use for timekeeping and controlling lights is nothing but a regression.
I concur that the ChatGPT voice mode is excellent. I can't even think of anything to knock it for other than for whatever reason it never 'hears' my kids, but that's probably because it's not intended to be used in multi-participant chats?
But for one-on-one, it is a really outstanding experience. Especially since they tamped down the way over-the-top humanisms.
My preference, however, is for a voice-control UX just like the one I've had with my Amazon Echo and "classic" Alexa for the past 10 years: I can best describe it as a "voice-driven command line," like your OS's CLI shell, which makes interactions predictable, even if it means I need to "know" what commands are valid in a given context. I need predictability and reliability when it comes to my home-automation integrations.
...but computer interaction with an LLM / transformer-driven "AI agent" is anything but predictable. When Amazon opted everyone into Alexa+ I agreed to give it a go and see if it really made things better or not - and it did not. I opted out of Alexa+ and went back to something actually reliable.
Here's a question: I don't understand the gap between these LLM-powered voice agents and CLI coding agents, the latter of which are obviously useful and quite resourceful at getting something done when asked in plain English.
Seems like an agent given 20-30 tool calls like "read_sms" "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."
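A minimal sketch of that idea (the tool names, the plan format, and the dispatch logic are all made up for illustration; a real agent would receive structured tool-call output from a model API rather than a hand-built list):

```python
# A voice request becomes one LLM call that may emit tool calls, and the
# harness dispatches them against a small allowlist with a hard cap.
TOOLS = {
    "matter_command": lambda device, value: f"{device} set to {value}",
    "send_sms": lambda to, body: f"texted {to}: {body!r}",
}

def run_plan(plan, max_tool_calls=2):
    """Execute an LLM-produced plan, capped and allowlisted at the harness level."""
    results = []
    for name, kwargs in plan[:max_tool_calls]:  # hard cap, not a prompt plea
        if name not in TOOLS:  # allowlist: unknown tools are refused outright
            results.append(f"refused unknown tool {name!r}")
            continue
        results.append(TOOLS[name](**kwargs))
    return results

# "set the house to 72° and text Laura that I did it" might come back as:
plan = [
    ("matter_command", {"device": "thermostat", "value": "72F"}),
    ("send_sms", {"to": "Laura", "body": "House is set to 72"}),
]
print(run_plan(plan))
```

The point of enforcing the cap and allowlist in the harness, rather than in the prompt, is that the model cannot talk its way past plain code.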
> Seems like an agent given 20-30 tool calls like "read_sms" "matter_command", and "send_email" would be able to work out what to do for things like "set the house to 72° and text Laura that I did it."
Incidentally, a major headline in the news this past week was about a coding-agent that wiped its company's entire system, including backups; which the company's staffers were confident was utterly impossible (as it didn't have any access to that system), and yet somehow, it did[1] (the TL;DR is the agent randomly came across an unprotected God-tier admin API-key/token saved to a personal text-file in a filesystem it had read-access to). If an agent can do that with only read-only access to a company's routine/everyday storage area then there's no way I'm giving it the ability to deactivate my house's fire-alarms and security-cameras via Google Home/Matter/Thread/HomeKit/X10/OhFfsNotAnotherCloudBasedAutomationScheme.
If you are really worried about that, the agent already effectively has that access, since it'll go find the key anyway.
The HN thread about that case was much more "why are you putting your prod keys in random text files" and "the SOTA in prompt engineering is that putting DON'T FUCKING DO THE BAD THING in the prompt just makes the agent more desperate to get stuff done."
putting limits at the harness level would do just fine. one LLM call, one tool call per voice message.
Siri's one job I care about is doing exactly what I want while I'm driving. I need it to check my text messages, take dictation, start phone calls and deal with music. I don't need to have conversations with it, I need deterministic responses to known commands.
Whenever I see one of these comments, it's always from someone that tried it at the start and then gave up because of a bad experience. And many times there are more people commenting back that this was essentially the 1.0 version and that the current 2.0 version is much better. So as someone that uses none of these products (old voice assistants vs. ai ones) it's really hard to evaluate if any of these anecdotes mean anything.
You could have tried Alexa+ at the start, when it was shitty compared to plain Alexa, and maybe it's better now. But equally, none of the people who comment that it is "amazing" in its current iteration qualify their statements by comparing and contrasting the old version with the new one, which makes them seem either unqualified to say how much "better" it is than the old version or, at worst, shills (paid or not). The most charitable read is that they are comparing (e.g.) day-one Alexa+ against the current Alexa+ without any comparison to the original Alexa.
... which is to say that it really feels like there are no clear conclusions that could be drawn from all of this.
No matter how good the LLM features are, I just want to turn my lights on and off and check the time. A perfect LLM could maybe perform on par with a simple deterministic command system for these tasks, but not better. All an LLM does is introduce the possibility that a command that worked fine yesterday will randomly not work.
Also, one of my first interactions with this Alexa+ thing was “how long is it until 8:45am”, one of only a few commands I use it for to work out how much sleep I’m getting, and it proceeded to ask me what the current time was… I immediately turned it off after that
> All an LLM does is introduce the possibility that a command that worked fine yesterday will randomly not work
Aren't hallucinations part of GenAI? I would assume that "AI" voice recognition doesn't have that baked in, but I'm not working in either of those spaces so maybe I'm missing the details. So many things are being looped into the "AI" umbrella that would have just been called machine learning or pattern recognition a decade ago (e.g. "facial recognition" vs "AI" at a time when "AI" also means chatbots like ChatGPT).
The point is Amazon is adding an “Alexa+” mode that uses LLMs. The plain voice recognition + keyword matching or however the old version works is more reliable (I assume, I didn’t use the new mode much because it immediately failed at what I wanted)
> that tried it at the start and then gave up because of a bad experience
I've had enough bad experiences with products that never got better, or just got worse (Exhibit A: Windows 11). Like most primates, I am capable of learning, and I've learned that once a consumer product/service goes bad there's little hope of a turn-around. I accept that you're telling me that it's gotten better, but of the people I know IRL who also use an Echo, none of them have told me that Alexa+ is worth trying, let alone committing to.
Yes, it's on me for not giving Alexa+ a second chance, but I'm not willing to give Alexa+ a second chance because, as a technology product/service customer, I just don't feel respected by the industry I work for (...lol); if Amazon, Microsoft, Google, et al won't respect me, why should I venture outside my comfort-zone for... what benefit, exactly?
> I accept that you're telling me that it's gotten better,
I'm not telling you this. I'm basically saying that with Alexa/Alexa+ and with Google's Gemini vs Google Now(?) I've seen many posts like this. Where someone complains about the AI version, but then there are other posts that come in and claim how much better it is. Even for things like Claude Code you get people complaining about how many mistakes it makes, and then people coming in and saying that it's because they are "doing it wrong". Either "Claude has improved by 10x in the last 6 months. It's so amazing! If you used it a year or so ago it doesn't even compare!" or "You aren't using the most expensive tier of Claude which increases context and thinking abilities that are hobbled in the cheaper versions!"
I never really see a comparison on the same level and it sounds like people talking past each other or some people having legitimate complaints and then others coming in to shill for a product.
I'm not in any way implying that "You should totally try this out now that they fixed everything" or anything of the sort. I even stated that I don't use any of these tools, and I was commenting as something more akin to an "outsider."
I don't run Windows 11 so I haven't taken a look, but I speculate it's because it contains a bunch of ML blobs for the Windows Photos app's image classification and photo subject/contents keyword search.
On Windows 10, the Photos app package is about ~140MB on my computer. A good chunk of that is because the package includes a lot of dependencies, including platform deps that I'd expect to be part of the UWP runtime in the OS. It's kinda like how, before Swift ABI stability, iOS IPA packages all bundled the Swift runtime, even though that was demonstrably redundant across apps, because the runtime wasn't an OS-provided framework yet... I'm not up-to-date in the iOS dev scene so I'm unsure why Apple went with that approach.
I'm not an Alexa user myself, but I have watched my wife interact with it for around 5 years now.
The new Alexa, powered by an LLM, is objectively better than the previous Alexa in a few ways. This much was apparent from day one, and it has only gotten smoother.
1. It can reliably execute direct or vague-ish commands "play X movie in app Y" or "play x show" and can infer X movie is only available in app Z so use that.
> It can reliably execute direct or vague-ish commands "play X movie in app Y" or "play x show" and can infer X movie is only available in app Z so use that.
...how does that work, exactly? (or rather: what's the context here?); there's no possible way for an Alexa+-powered Amazon Echo to control my AppleTV or interface with VLC on my desktop.
It's not the early 2000s, where just messing around and wasting time on this stuff was cool in itself. None of that wasted time turned into apps that stuck with me long term. Maybe a banking app and a trail running app.
I ruined multiple dinners with timers that didn't work (with a time/labor cost).
I had to get out of bed in the freezing cold to turn the lights out. It's easy to hit the lights when I go to bed, but it's annoying having the tool fail and getting back out.
Music stuff didn't work well because I used Youtube Music not Spotify.
Those were my 3 use cases for Google voice, and it failed all of them often enough that I just stopped using it altogether. Who cares if it works today if in another month they just change something and break it again? They've shown it's not a tool for tool things, it's a 'gee wow' thing. I don't need to be impressed. I need unburnt food.
Alexa+ is terrible compared to Alexa. It's so bad that I've dusted off my v1 echos cuz they're too old to run Alexa+. Complete shit show that is.
I do like Gemini better than Assistant, even though it's not quite there yet. But that's just a matter of time because they actually designed it from the ground up to be a drop in replacement for Assistant.
Oh man. I made the mistake of converting my Google Home devices to Gemini.
The first problem is that it's just slow. If I want it to turn off some light, it takes a long time before responding.
But yeah, the failure to do basic tasks. I have a routine that I used to have it run (controls several devices at once). Now:
10-20% of the time it runs it.
60% of the time it says it's running it but it doesn't do anything.
20-30% of the time it says it can't do it unless I opt in to invasive permissions. And when I opted into them, it still failed about a third of the time. So I opted out again.
I don't know if it's related to Gemini, but sometimes Android Auto tells me "I don't have permission to do that" simultaneously with actually doing the thing that is allegedly lacking permission. Sometimes I want to move off the grid.
Man, I hate touch screens. And I hate Android Auto. My previous car had an aftermarket Bluetooth system (radio, etc). It was way, way better than Android Auto or any entertainment system I've seen in any car.
Strong disagree. The upgrade was a little bit rough at first (mostly because of slow response) but now it's a million times better than the old assistant. The old assistant basically just repeated "I don't know how to do that" over and over.
I have never had trouble setting timers with either.
The new one was 100% failure to do anything with timers for me. I never saw it work once. If I had ever gotten that to work at all, I may not have uninstalled it, and might have a different impression now. I cannot account for why our experiences are so different.
Your experience is valid of course, but I never once have had the inclination to have a conversation with my phone. I'm not sure which of our experiences is more common.
It’s not a conversation like you’d have with a friend, it’s the type of interaction you’d have with a chatbot, just hands-free.
To give you an example, I was having coffee the other morning while unloading the dishwasher and asked the speaker if today was a good day to apply weed and feed on my lawn. This was not possible with the old assistant and was useful to me.
I hate it too.. the old assistant is pretty smart, obviously it has some language processing, but not "AI", but it's very fast for things like "Set alarm for ... ", "Remind me at X about Y", "Add calendar event on x at y about z", or "Navigate home".
And now if I want to use Gemini on my phone I have to replace Assistant. Nah, I'll keep Assistant thanks, and just have a shortcut to load the Gemini in the browser.
Except the browser experience is so fucking buggy, constant reloads needed..
Claude... I switched my phone assistant to Claude, and it does everything Google's assistant used to do, like setting alarms and timers, but also does everything Claude can do.
The only thing I haven't been able to get it to do is read from my phone's local calendar. The Claude app can, but the voice assistant cannot (why? no idea). Perplexity has no issue doing it, so I actually use it for my rare voice-command needs on my phone.
WhisprFlow produces much better speech-to-text for long voice-dictated text messages (dictation / transcription) than Apple's does. Whisper models in general seem to do a lot better than most built-into-OS/app models, which is interesting, because there's nothing stopping them from just using Whisper models.
I love MacWhisper personally. Also, Gumroad is a fantastic app distribution platform for my personal values.
As for the "decision tree" side... there's not much that can be done about that now. Agents still go too far "off the rails" to be productionized out to the billions of smartphones of the world. I'm working on voice-controlled agentic-with-rails AI features for my HomeAssistant, because Alexa / Google Home suck. But that's a hobby project, and rogue AI actions only affect me, not billions of customers.
It’s not “transformatively better” but it definitely involves fewer frustrations to interact with. That’s always been Apple’s main value proposition, you’re not getting the most cutting-edge stuff but you’re supposed to have something that “just works” not something that makes you go “GODDAMN IT!” when it inexplicably seems to fumble normal things.
So if you buy Apple products based on that value proposition it’s a big problem for Apple if they can’t seem to keep their brand-promise in this area.
My Android phone was so much better for voice-to-everything, whether it was transcribing my voice for text messages or doing lookups on the internet. Siri is just so bad.
Still love not having google's paws all over my data, though, so not going back.
Actually, could you recommend one? The ones I've found all seem to want subscriptions. I'm okay paying a few dollars for a well done frontend, but an ongoing sub to run an open weights model locally is nuts...
Wasn’t planning on it, but Show HN: JoinIn.AI?! We’re working on making a better audio interface to LLMs that is socially adroit enough to handle even multi-party conversation.
Feeling #blessed that I apparently have the exact same upper midwest accent they must've trained Siri on, because I've literally never had an issue with dictation or being misheard. And I use it a lot!
(It misunderstands my wife from California all the time, though.)
Plus, if someone else does it better (or different), I bet they've got a team and technology at a 90% done state waiting to jump on it, pick it apart and make it better. I don't think they're not doing anything.
Yesterday my Google Home Mini gave me the current temperature in Fahrenheit. I live in Canada and use a Pixel. Dumbest fucking AI going. May as well give it to me in coulombs per hectare.
> My personal opinion for a while has been that crypto operations should be in the kernel so we can end the madness that is every application shipping it's own crypto and trust system which has only gotten worse since containers were invented.
There’s a valid argument here, but I think that’d devolve into the DNSSEC trap without both a very well-designed API and a stable way to ship updates for older kernels. If people can’t get a good user experience, or have to force kernel upgrades to improve security, most applications will avoid it. Chrome shipping its own crypto means it can very quickly roll out things like PQC without waiting years, or dealing with kernel n+1 having unrelated driver or performance issues which force a security-vs-functionality fight.
Which does sort of loop around to Linux treating the lack of a stable ABI as a feature, I suppose; a stable module ABI would be one way to implement this with long-term compatibility for kernel modules.
But the Chrome example also highlights the problem: Chrome might ship it, but vanishingly little software is ever going to upgrade and we've got an explosion of statically linked languages now.
Sure, nobody’s saying it’s an inscrutable mystery but if your goal is to inform a wide audience it’s considered good form to expand all but the most common acronyms. It’ll even get you more internet points than petty smugness.