24B is the size of the Small, open-sourced model. The Medium model is bigger (they don't seem to disclose its size) and still gets beaten by DeepSeek R1.
Mistral Large is 123B, so one can probably assume that Medium is somewhere between 24B and 123B. Also, Mistral 3.1 is by a wide margin my go-to model in real-life situations. Benchmarks absolutely don't tell the whole story, and different models have different use cases.
Can you please explain what your "real life situations" are?
I use it as a personal assistant (so tool use integrated into calendar/todo/notes etc.), often using the multimodal aspect (taking a photo of a todo list and asking it to remind me to buy something from the picture). I also use it as a code completion tool in VS Code, as well as a replacement for most basic Google searches ("how does this syntax work", "what's the torch method for X").
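For the photo-of-a-todo-list case, the whole thing basically boils down to one call against Ollama's /api/chat endpoint. Here's a rough sketch; the model tag, file name, and prompt are just placeholders, not my actual setup:

```js
// Sketch only: send a photo to a local multimodal model via Ollama's chat API.
// Model tag, file name, and prompt are placeholder assumptions.
import { readFile } from 'node:fs/promises';

const image = await readFile('./todo-photo.jpg');

const res = await fetch('http://localhost:11434/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'mistral-small3.1',            // assumed local model tag
    stream: false,
    messages: [{
      role: 'user',
      content: 'Read this todo list and remind me to buy anything on it.',
      images: [image.toString('base64')], // Ollama takes base64-encoded images
    }],
  }),
});

const { message } = await res.json();
console.log(message.content);
```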
I use it for almost every interaction I have with AI that isn't asking it to oneshot complex code. I fairly frequently run my prompts against Claude/ChatGPT and Mistral 3.1 and find that for most things they're not meaningfully different.
I also spend a lot of time playing around with it for storytelling/integration into narrative games.
Cool. What framework or program do you use to orchestrate this?
Me, Mistral, and Claude writing modules on top of a homebrew assistant framework in Node with a web frontend. I started out mostly handwriting the framework and the first couple of modules (a todo list and a time tracker), and now the AI is getting pretty good at replicating the patterns I like using, especially with some prompt engineering, as long as I don't ask for entire architectures but just prod it along. It's just so easy to make the exact thing you want now. All the heavy lifting is done by Ollama and the Node/browser APIs.
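To give a feel for the pattern (this is a made-up example, not the actual framework code), a module is basically just a small object the framework registers and wires up to the model's tool calls:

```js
// Purely illustrative: a hypothetical module shape, not the real framework.
export const timeTracker = {
  name: 'time-tracker',
  // Tool definitions the assistant is allowed to call
  tools: [{
    name: 'start_timer',
    description: 'Start tracking time for a task',
    parameters: { task: 'string' },
  }],
  // Handler the framework invokes when the model calls one of those tools
  async handle(tool, args) {
    if (tool === 'start_timer') {
      return { started: args.task, at: new Date().toISOString() };
    }
  },
};
```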
The only dependency on the Node side is 'mime', which is just a dict of MIME types; data lives inside Node's new `node:sqlite`, and everything on the frontend that isn't just vanilla JS is Alpine. It runs on my main desktop and has filesystem access (which doesn't do anything useful yet, really), but the advantage here is that since I've written (well, at least read) all of the code, I can put a very high level of trust in my interactions with it.
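For anyone who hasn't tried it yet, `node:sqlite` is the built-in synchronous SQLite binding (still experimental, so depending on your Node version you may need the --experimental-sqlite flag), and the whole data layer is about this much code. The table here is just an example, not my actual schema:

```js
import { DatabaseSync } from 'node:sqlite';

// Built-in synchronous SQLite; example schema only
const db = new DatabaseSync('./assistant.db');

db.exec(`CREATE TABLE IF NOT EXISTS todos (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  item TEXT NOT NULL,
  done INTEGER DEFAULT 0
)`);

db.prepare('INSERT INTO todos (item) VALUES (?)').run('buy milk');
console.log(db.prepare('SELECT * FROM todos WHERE done = 0').all());
```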