
I don't think so. Pruning a large model and training a smaller model isn't the same thing. It might appear to be the same thing, but it's not.


Do you expect a model which was overtrained (relative to the Chinchilla law) to be no more affected by pruning than a model of the same size that wasn't overtrained?


Can you reformulate this question? It's hard to know what you mean by "no more affected". How are you defining "more"?


I mean stronger impact on loss or benchmark results.


I mean, by some relative amount (like 10%) or some absolute amount? I think I'd expect the "more trained" model to lose less performance as a percentage (which is hard to define here) but more in an absolute sense. That is basically impossible to measure, and even if it were measurable, I don't feel confident about that prediction. It's speculation.
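
To make the relative vs. absolute distinction concrete, here is a toy sketch. The benchmark scores are made up purely for illustration (they are not measurements of any model), and the `degradation` helper is a hypothetical name, not something from a library. It only shows that the two ways of counting "more affected" can point in opposite directions.

```python
def degradation(score_before, score_after):
    """Return (absolute, relative) drop in a higher-is-better benchmark score."""
    absolute = score_before - score_after
    relative = absolute / score_before
    return absolute, relative

# Hypothetical numbers, chosen only to illustrate the speculation above.
# "Overtrained" model: starts higher, loses more points after pruning.
abs_over, rel_over = degradation(80.0, 72.0)
# Same-size model that was not overtrained: starts lower, loses fewer points.
abs_base, rel_base = degradation(60.0, 53.0)

print(f"overtrained: -{abs_over:.1f} points, -{rel_over:.1%}")
print(f"baseline:    -{abs_base:.1f} points, -{rel_base:.1%}")
```

With these invented numbers the overtrained model drops more in absolute points but less as a percentage of where it started, so the answer to "which is more affected" depends entirely on which definition you pick.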



