It's probably optimized in some way, but if the optimizations degrade performance, let's hope that shows up in the various benchmarks. An alternative hypothesis is that it's the same model, but that in the early days they make it think "harder" and run a meta-process to collect training data for reinforcement learning on future models.
It's a bit dated now, but it would be cool if people submitted PRs for this one: https://aider.chat/docs/leaderboards/by-release-date.html
Dated? This was updated yesterday: https://aider.chat/docs/leaderboards/
My link is to the benchmark results _over time_.
The main leaderboard page you linked to is updated quite frequently, but it doesn't show multiple benchmark results for the same model over time.