csomar 2 days ago

I think the parent-parent poster has explained why we can't trust you (and work on OpenAI doesn't help they way you think it does).

I didn't read the ToS, like everyone else, but my guess is that degrading model performance at peak times will be one of the things that can slip through. We are not suggesting you are running a different model but that you are quantizing it so that you can support more people.

This can't happen with Open weight models where you put the model, allocate the memory and run the thing. With OpenAI/Claude, we don't know the model running, how large it is, what it is running on, etc... None of that is provided and there is only one reason that I can think of: to be able to reduce resources unnoticed.

2
rfoo 2 days ago

An (arbitrarily) quantized model is a totally different model, compared to the original.

Reubachi 2 days ago

I'm not totally sure how you at this point in your online presence associate someone stating their job as a "brag" and not what it really is, providing transparency/disclosure before stating their thoughts.

This is HN and not reddit.

"I didn't read the ToS, like everyone else, but my guess..."

Ah, there it is.