Oras 2 days ago

Would be interesting to see a comparison with Qwen 32B. I've found it to be a fantastic local model (via ollama).

SV_BubbleTime 2 days ago

Last year, whether a model would fit in memory was what mattered. This year, inference speed is key.

Proofreading an email at four tokens per second? Great.

Spending half an hour doing deep research on a topic with artifacts, MCP tools, and reasoning at four tokens per second… a bad time.
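
A minimal back-of-envelope sketch of why decode speed dominates for long sessions; the token counts here are illustrative assumptions, not measurements:

    # Rough wall-clock time to stream a response at a given decode speed.
    def generation_time_minutes(output_tokens: int, tokens_per_second: float) -> float:
        return output_tokens / tokens_per_second / 60

    # A short email proofread: assume ~300 output tokens.
    print(f"email proofread: {generation_time_minutes(300, 4):.1f} min")    # ~1.2 min

    # A long reasoning + tool-use session: assume ~20,000 generated tokens.
    print(f"deep research:  {generation_time_minutes(20_000, 4):.1f} min")  # ~83 min

At four tokens per second the short task is barely noticeable, but anything that generates tens of thousands of tokens stretches past an hour.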

DSingularity 2 days ago

I agree. Qwen models are great.