I think that was a joke. New pricing is already in place:
Input: $2.00 / 1M tokens
Cached input: $0.50 / 1M tokens
Output: $8.00 / 1M tokens
https://openai.com/api/pricing/
Now cheaper than gpt-4o and same price as gpt-4.1 (!).
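For a rough sense of what these rates mean per request, a back-of-the-envelope sketch (rates are from the list above; the token counts in the example are invented):

```python
# New o3 rates in $ per 1M tokens, from the pricing page above.
INPUT_RATE = 2.00
CACHED_INPUT_RATE = 0.50
OUTPUT_RATE = 8.00

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Dollar cost of one request at the new o3 rates.

    cached_tokens is the portion of input_tokens served from the
    prompt cache and billed at the discounted rate.
    """
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_RATE
            + cached_tokens * CACHED_INPUT_RATE
            + output_tokens * OUTPUT_RATE) / 1_000_000

# Hypothetical request: 10k prompt tokens (8k of them cached), 2k output.
print(request_cost(10_000, 2_000, cached_tokens=8_000))  # 0.024
```

So a fairly chunky request with good cache hits still lands at a couple of cents; output tokens dominate the bill at 4x the input rate.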
> Now cheaper than gpt-4o and same price as gpt-4.1 (!).
This is where the naming choices get confusing. "Should" o3 cost more or less than GPT-4.1? Which is more capable? Intuitively, generation 3 of a technology feels less advanced than version 4.1 of a (similar) technology.
Do we know parameter counts? The reasoning models have typically been cheaper per token, but use more tokens. Latency is annoying. I'll keep using gpt-4.1 for day-to-day.
o3 is a reasoning model, GPT-4.1 is not. They are orthogonal.
My quibble is with naming choices and differentiating. Even here they are confusing:
- o4 is reasoning
- 4o is not
They simply do not do a good job of differentiating. Unless you work directly in the field, it is likely not obvious what the difference is between "our most powerful reasoning model" and "our flagship model for complex tasks."
"Does my complex task need reasoning or not?" seems to be how one would choose. (What type of task is complex but does not require any reasoning?) This seems less than ideal!
This is true, and I believe apps automatically route requests to appropriate models for normie users.
No, people had tested it after Altman's announcement and had confirmed that they were still being billed at the original price. And I checked the docs ~1h after and they still showed the original price.
The speculation that only input pricing had been lowered arose because yesterday they gave out vouchers for 1M free input tokens while output tokens were still billed.
Thinking models produce a lot of internal output tokens, which makes them more expensive than non-reasoning models for similar prompt and visible-output lengths.
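To make that concrete, a sketch of how hidden reasoning tokens inflate the bill even at identical per-token rates (the $8/1M output rate is from the pricing list upthread; the 5k hidden-token figure is invented for illustration):

```python
OUTPUT_RATE = 8.00  # $ per 1M output tokens (same rate for both models here)

def output_cost(visible_tokens, reasoning_tokens=0):
    """Output-side cost of one response.

    Reasoning tokens are billed as output tokens even though
    they never appear in the visible response.
    """
    return (visible_tokens + reasoning_tokens) * OUTPUT_RATE / 1_000_000

# Same 1k-token visible answer from both models:
plain = output_cost(1_000)                             # non-reasoning model
thinking = output_cost(1_000, reasoning_tokens=5_000)  # + 5k hidden tokens
print(plain, thinking)  # 0.008 0.048
```

Same visible answer, 6x the output cost in this made-up case, which is why "cheaper per token" does not always mean cheaper per answer.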