tauntz 2 days ago

Why's it called o3 then if it's a different thing? There's already a rather extreme amount of confusion with the model names and it's not clear _at all_ which model would be "the best" in terms of response quality.

Here's the current state with version numbers as far as I can piece it together (using my best guess at the naming of each component of the version identifier; might be totally wrong, though):

1) prefix (optional): "gpt-", "chatgpt-"

2) family (required): o1, o3, o4, 4o, 3.5, 4, 4.1, 4.5

3) quality? (optional): "nano", "mini", "pro", "turbo"

4) type (optional): "audio", "search"

5) lifecycle (optional): "preview", "latest"

6) date (optional): 2025-04-14, 2024-05-13, 1106, 0613, 0125, etc. (I assume the last ones are a date without a year for 2024?)

7) size (optional): "16k"

Some final combinations of these version number components are as small as 1 ("o3") or as large as 6 ("gpt-4o-mini-search-preview-2024-12-17").
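To make the guessed breakdown above concrete, here is a minimal Python sketch that splits a model name into those seven components. It is only an illustration of the commenter's guess, not an official grammar: the function name `parse_model_name`, the regex, and the assumption that components always appear in the listed order are all hypothetical. Names that don't fit that order simply return `None`.

```python
import re
from typing import Optional

# Regex built from the commenter's guessed components, in the listed order.
# Assumption: every component is separated by "-" and appears in this order.
MODEL_NAME = re.compile(
    r"^(?P<prefix>gpt-|chatgpt-)?"                  # 1) prefix (optional)
    r"(?P<family>4\.5|4\.1|3\.5|4o|o1|o3|o4|4)"     # 2) family (required)
    r"(?:-(?P<quality>nano|mini|pro|turbo))?"       # 3) quality (optional)
    r"(?:-(?P<type>audio|search))?"                 # 4) type (optional)
    r"(?:-(?P<lifecycle>preview|latest))?"          # 5) lifecycle (optional)
    r"(?:-(?P<date>\d{4}-\d{2}-\d{2}|\d{4}))?"      # 6) date (optional)
    r"(?:-(?P<size>16k))?$"                         # 7) size (optional)
)


def parse_model_name(name: str) -> Optional[dict]:
    """Return the components present in `name`, or None if it doesn't fit."""
    match = MODEL_NAME.match(name)
    if match is None:
        return None
    # Keep only the components that actually appear in the name.
    return {key: value for key, value in match.groupdict().items() if value}


if __name__ == "__main__":
    for name in ("o3", "gpt-3.5-turbo-16k", "gpt-4o-mini-search-preview-2024-12-17"):
        parts = parse_model_name(name)
        print(name, "->", len(parts), "components:", parts)
```

Under these assumptions, "o3" yields one component and "gpt-4o-mini-search-preview-2024-12-17" yields six, matching the counts in the comment.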

Given this mess, I can't blame people for assuming that the "best" model is the one with the "biggest" number, which would rank the model families as: 4.5 (best) > 4.1 > 4 > 4o > o4 > 3.5 > o3 > o1 (worst).
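The "biggest number wins" heuristic can be sketched as a sort key. This is purely illustrative: the function `naive_rank_key` is hypothetical, and the tie-break among "4", "4o", and "o4" (plain number first, then digit-first, then letter-first) is an assumption chosen only to reproduce the ordering written above. It says nothing about actual model quality.

```python
import re

# Family names as listed in the comment.
FAMILIES = ["o1", "o3", "o4", "4o", "3.5", "4", "4.1", "4.5"]


def naive_rank_key(family: str) -> tuple:
    """Sort key for the naive 'biggest number is best' reading of a family name."""
    # Pull out the numeric part, e.g. "4o" -> 4.0, "3.5" -> 3.5.
    number = float(re.search(r"\d+(?:\.\d+)?", family).group())
    # Assumed tie-breaks: names starting with a digit outrank "o"-prefixed
    # ones, and plain numbers outrank names that carry a letter.
    starts_with_digit = family[0].isdigit()
    is_plain_number = family.replace(".", "").isdigit()
    return (number, starts_with_digit, is_plain_number)


ranked = sorted(FAMILIES, key=naive_rank_key, reverse=True)
print(" > ".join(ranked))
```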

tedsanders 2 days ago

o3 pro is based on o3 and its style and outputs will be quite similar to o3.

As an analogy, think of it like this:

o3-low ~ Ford Mustang with the accelerator gently pressed

o3-medium ~ Ford Mustang with the accelerator pressed

o3-high ~ Ford Mustang with the accelerator heavily pressed

o3 pro ~ Ford Mustang GT

Even though a Mustang GT is a different car than a Mustang, you don’t give it a totally different name (e.g., Palomino). The similarity in name signals it has a lot of the same characteristics but a souped-up engine. Same for o3 pro.

Fun fact: before GPT-4, we had a unified naming scheme for models that went {modality}-{size}-{version}, which resulted in names like text-davinci-002. We considered launching GPT-4 as something like text-earhart-001, but since everyone was calling it GPT-4 anyway, we abandoned that system to use the name GPT-4 that everyone had already latched onto. Kind of funny how our original unified naming scheme made room for 999 versions, but we didn't make it past 3.

Edit: When I say the Mustang GT is a different car than a Mustang - I mean it literally. If you bought a Mustang GT and someone delivered a Mustang with a different trim, you wouldn't say "great, this is just what I ordered, with the same features/behavior/value." That we call it a different trim is a linguistic choice to signal to consumers that it's very similar, and built on the same production line, but comes with a different engine or different features. Similar to o3 pro.

dwohnitmok 2 days ago

Can you elaborate on what you mean by o3 pro being a GT? In particular, I don't understand how to reconcile your claim that o3 pro is in some way fundamentally different from o3 (albeit based on o3) with this tweet:

> As o3-pro uses the same underlying model as o3, full safety details can be found in the o3 system card.

https://x.com/OpenAI/status/1932530423911096508

tedsanders 2 days ago

Yeah, I totally get the confusion here. Unfortunately I can't give the recipe behind our models, so there's going to be some irreducible blurriness here, but the following statements are all true:

- o3 pro is based on o3

- o3 pro uses the same underlying model as o3

- o3 pro is similar to o3, but is a distinct thing that's smarter and slower

- o3 pro is not o3 with longer reasoning

In my analogy, o3 pro vs o3 is more than just an input parameter (e.g., not just the accelerator input) but less than a full difference in model (e.g., Ford Mustang vs F150). It's in between, kind of like car trim with the same body but a stronger engine. Imperfect analogy, and I apologize if this doesn't feel like it adds any clarity. At the end of the day, it doesn't really matter how it works - what matters is if people find it worth using.

stonogo 2 days ago

This analogy might work better if the Mustang GT weren't, in fact, the same car as the Mustang. It's just a trim level, not a different car.

energy123 2 days ago

My guess is this comes from an org structure where you have multiple "pods" working on different research. Who comes up with the next shippable model and when that happens is kind of random and the chaotic naming system comes from that. It's just my speculation and could be wildly wrong.

rat9988 2 days ago

The point that o3 and o3-pro aren't the same thing still makes sense, though.