Hacker News | TheTaytay's comments

I quite liked this. It feels approachable and to-the-point.

The +1 button doesn’t work?

Devcontainers are looking pretty gold right now…

A lot of automated scanners run during that week.

Yes, I was more impressed with their decoupling of prompts from parameters!

I think the more likely attack vector in OpenClaw is convincing it to install a malicious npm package or script, have that siphon all machine/env secrets, and then watch those secrets get abused. (Cloud API key -> crypto mining. Wallet key -> theft. Npm credentials -> worm publishes more copies of itself. GitHub key -> more theft and malicious code upload. Email API key -> IP theft and password resets on other systems.) Almost all of this can be automated, so the attacker doesn’t have to know who you are.

It’s not targeted per se.


I tried it in the past, one time “in earnest.” But when I discovered that none of my actual optimized prompts were extractable, I got cold feet and went a different route. The idea of needing to fully commit to a framework scares me. The idea of having a computer optimize a prompt as a compilation step makes a lot of sense, but treating the underlying output prompt as an opaque blob doesn’t. Some of my use cases were just far enough off the beaten path that dspy was confusing, which didn’t help. And lastly, I felt like committing to dspy meant shutting the door on any other framework, tool, or prompting approach down the road.

I think I might have just misunderstood how to use it.


I don't know that you misunderstood. This is one of my biggest gripes with Dspy as well. I think it takes the "prompt is a parameter" concept a bit too far.

I highly recommend checking out this community plugin from Maxime; it helps "bridge the gap": https://github.com/dspy-community/dspy-template-adapter


Woah, that community plugin looks so much closer to what I initially thought (hoped) Dspy was!

Please do elaborate. I’ve only tried switching to codex once or twice, and it’s been probably 3 months since I last tried it, but I was underwhelmed each time. Is it better on novel things in your experience?

My experience is that it is much more terse and realistic with its feedback, and more thoughtful generally. I trust its positive acknowledgements of my work more than Claude's, whose praise I have been trained to be extremely skeptical of.

In my experience, Codex / ChatGPT are better at telling you where you're wrong, where your assumptions are incomplete, etc., and better at following the system prompts.

But more importantly, as a coding agent, it follows instructions much better. I've frequently had Claude go off and do things I've explicitly told it not to do, or write too much code that did the wrong things, and corralling it takes more effort than I want to spend.

Codex will follow instructions better. Currently, it writes code that I find a few notches above Claude's, though I'm working with C# and SQL so YMMV; Claude is terrible at coming up with decent schema. When your instructions do leave some leeway, I find the "judgment" of Codex to be better than Claude's. And one little thing I like a lot is that it looks at adjacent code in your project so it can try to write idiomatically for your project/team. I haven't seen Claude exhibit this behavior; its code is very middle-of-the-road in terms of style and behavior.

But when I use them, I use them in a very targeted fashion. If I ask them to find and fix a bug, the prompt has as much or more detail than a full bug report in my own ticketing system. If it's new code, it comes with a very detailed and long spec for what is needed, what is explicitly not needed, the scope, the constraints, what output is expected, etc., as if it were a wiki page or epic for another real developer to work from. I don't do vague prompts or "agentic" workflow stuff.


GPT is much better at anything mathematical than Claude, as is Gemini. This is evidenced by their superior results at math Olympiads, the Putnam, etc.

I don’t know why this is being downvoted. Danielhanchen is legit, and unsloth was early to the fine-tuning on a budget party.

Haha no worries at all :)

I’m a huge fan of CLIs over MCP for many things, and I love asking Claude Code to take an API, wrap it in a CLI, and make a skill for me. The ergonomics for the agent and the human are fantastic, and you get all of the nice Unix command-line composability built in.

However, MCPs have some really nice properties that CLIs generally don’t, or that are harder to solve for. Most notably, making API secrets available to the CLI, but not to the agent, is quite tricky. Even in this example, the options are env variables (which are a prompt injection away from being dumped) or a credentials file (better, but still very much accessible to the agent if it were asked).
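To make the env-variable problem concrete, here's a minimal sketch (the `EXAMPLE_API_KEY` name is hypothetical): any code the agent runs inherits the parent environment, so a single injected command can sweep up every secret at once.

```python
import os

def dump_secret_like_env_vars() -> dict[str, str]:
    """Return every env var whose name looks like a credential.

    This is the one-liner an injected script would run: no targeting
    needed, it just harvests whatever the environment exposes.
    """
    markers = ("KEY", "TOKEN", "SECRET", "PASSWORD")
    return {
        name: value
        for name, value in os.environ.items()
        if any(marker in name.upper() for marker in markers)
    }

if __name__ == "__main__":
    # Hypothetical secret planted the way a CLI wrapper often does it.
    os.environ["EXAMPLE_API_KEY"] = "sk-not-a-real-key"
    print(dump_secret_like_env_vars())
```

The same sweep works from any subprocess the agent spawns, which is why "just use env vars" doesn't isolate the secret from the agent at all.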

MCPs give you a “standard” way of loading and configuring a set of tools/capabilities into a running MCP server (locally or remotely), outside of the agent’s process tree. This allows you to embed your secrets in the MCP server, via any method you choose, in a way that is difficult or impossible for the agent to dump even if it goes rogue.

My efforts to replicate that secure setup for a CLI have either made things more complicated (using a separate user to run CLIs so that Linux file permissions can hide the secrets) or started to rhyme with MCP (a memory-resident socket server, started before the CLI, that the CLI talks to, much like docker.sock or ssh-agent).
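One shape that ssh-agent-style workaround can take, sketched in Python (the socket path and the "sign" operation are hypothetical, not any real protocol): a broker process holds the secret and answers requests over a Unix socket, performing the privileged operation itself so the raw key never crosses into the agent-visible CLI process.

```python
import os
import socket
import tempfile
import threading

# Hypothetical key; it lives only inside the broker process.
SECRET = "sk-not-a-real-key"

def broker_handle(server: socket.socket) -> None:
    """Broker side: serve one request. Crucially, it returns only the
    result of the privileged operation, never SECRET itself."""
    conn, _ = server.accept()
    request = conn.recv(1024).decode()
    if request == "sign example.com":
        conn.sendall(f"signed:example.com:{len(SECRET)}".encode())
    else:
        conn.sendall(b"denied")
    conn.close()

def cli_request(sock_path: str, request: str) -> str:
    """CLI side: ask the broker to act on its behalf, like ssh-agent."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall(request.encode())
        return client.recv(1024).decode()

if __name__ == "__main__":
    sock_path = os.path.join(tempfile.mkdtemp(), "broker.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(1)
    worker = threading.Thread(target=broker_handle, args=(server,))
    worker.start()
    print(cli_request(sock_path, "sign example.com"))
    worker.join()
    server.close()
```

The caveat from the comment still applies: if the broker will hand back the raw secret on request, a rogue agent can simply ask for it. The isolation only pays off when the broker exposes narrow operations (sign this, fetch that) rather than the credential itself.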


