Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In 2017 LLMs weren't powerful enough to generate working code on their own, but my goal was to at least create a chatbot that could help you rubber-duck-debug your way to a solution. Unfortunately the tech wasn't quite strong enough for that, and not enough engineers even knew what rubber-duck-debugging was. RIP Duckly.

Trying to train an LLM on two 1080ti's on the StackOverflow corpus in my living room was a vibe though. Good times.

 help



Duckly deserved to actually work. There’s a small irony here: the closest study I found to this, robots specifically built to simulate attentive listening, found they performed no better than an actual inanimate rubber duck for adult engineers. The mechanical signal of listening doesn’t seem to be the active ingredient. Makes me wonder if Duckly would have needed real disagreement to close a gap a duck can’t, not just better natural language.

You're probably on to something with the value of disagreement. I think it's one reason why chatting with current models doesn't create the same stimulation as rubber-ducking used to bring. The models are typically too quick to agree and amplify what you think rather than truly break it down and push back.

And thanks for saying it should have worked, I agree. My chagrin has increased over the years as I have realized the magnitude of my ill-timing.


Has anyone seen a good set of prompts for that disagreement? For the "skeptical eyebrow-raise" or "confused/doubtful head tilt" aspect of rubber ducks?

Agentic uses adversarial expert, steel-man opponent, risk-mitigation and failure-mode analysis. But what about almost brainstorming, but with thought-provoking nudge questions? Or on the other hand, arm-waving fight-club style discussion? Or... It's a big design space. I used to go to lots of research talks at MIT, in assorted departments. The post-talk Q&A question cultures varied a lot. Like encompassing both "leaves the speaker in tears", and "nudge so subtle, you won't quickly get it if you've not already spotted the fatal flaw in the work".

So aside from dialing down the "transformative insight!" silliness, there seems a rich multi-agent multi-persona space to explore.


I think agreement has value here too. An LLM that's starting to get a bit sycophantic will rephrase your ideas in a few different ways, and seeing the different presentations is helpful for reconsideration.

fwiw Pangram says 100% of the original article was written by AI

I wonder how much is actually needed to create an automated rubber duck. How well would ELIZA work? (https://en.wikipedia.org/wiki/ELIZA) (might need some adjustment to not talk like a therapist, but you get the idea)

2017 is a bit early to refer to them as LLMs. I'm not sure when exactly we started to refer to LMs as 'large', but I don't think it was before GPT2 (2019). That said, from the NLP work I've done, it was much more interesting working on small specialized models.

fwiw Pangram says 100% of the original article was written by AI

perhaps it is time to resurrect Duckly queue Frankenstein music and thunder in background

That would be interesting but the space is so choked now.

I'm also already busy building Forgetmenaut and enabling data deletion at scale: forgetmenaut.com


It's AIive!

AI ive, the thinnest neural network ever.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: