The problem
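QwQ-32B is a reasoning model whose performance rivals DeepSeek-R1,
but many people have found that it can fall into endless generation, repeating its output without stopping.
After waiting two days, I found that a fix was already available: Tutorial: How to Run QwQ-32B effectively | Unsloth Documentation
This post is a record of my own experience with it.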
Recommended parameter settings
- Temperature = 0.6
- TopP = 0.95
- TopK = 20 ~ 40
- Min_P = 0.02
- Repetition Penalty = 1.0
Chat template to use
<|im_start|>user\nCreate a Flappy Bird game in Python.<|im_end|>\n<|im_start|>assistant\n<think>\n
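If you send raw prompts to the model yourself instead of letting a client apply the template, the string above can be assembled with a small helper. This is only a sketch: the function name `build_qwq_prompt` is made up for illustration, and it covers just the single-turn case shown above.

```python
def build_qwq_prompt(user_message: str) -> str:
    """Wrap one user message in the chat template shown above.

    The prompt ends with an opened <think> tag so the model starts
    with its reasoning. Hypothetical helper, single-turn only.
    """
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
        "<think>\n"
    )


if __name__ == "__main__":
    # repr() shows the newlines as \n, matching the one-line form above.
    print(repr(build_qwq_prompt("Create a Flappy Bird game in Python.")))
```

Clients that apply the chat template for you do not need this; it only matters when you pass the prompt through as raw text.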
Ollama
My MacBook Pro is the only machine I have that can run this model, so I only cover Ollama here. See also: llama.cpp usage
Ollama supports packaging the model settings into a params file, so MacBook users can simply run the following command:
ollama run hf.co/unsloth/QwQ-32B-GGUF:Q4_K_M
Or pick another quantization from unsloth/QwQ-32B-GGUF at main:
ollama run hf.co/unsloth/QwQ-32B-GGUF:Q5_K_M
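If you would rather script the model than chat with it interactively, the recommended settings from earlier in this post can also be passed explicitly on each request. The sketch below assumes an Ollama server on its default local port and uses the option names from Ollama's HTTP API as I understand them (`temperature`, `top_p`, `top_k`, `min_p`, `repeat_penalty`). The helper names and the choice of `top_k = 40` from the 20 ~ 40 range are mine, not from the Unsloth tutorial.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/unsloth/QwQ-32B-GGUF:Q4_K_M"

# Recommended settings from this post, mapped to Ollama option names.
# top_k = 40 is one value picked from the recommended 20 ~ 40 range.
SAMPLING_OPTIONS = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 40,
    "min_p": 0.02,
    "repeat_penalty": 1.0,
}


def build_chat_request(prompt: str) -> dict:
    """Build the JSON body for a single-turn, non-streaming chat request."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "options": dict(SAMPLING_OPTIONS),
        "stream": False,
    }


def ask(prompt: str) -> str:
    """Send the request to the Ollama server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


if __name__ == "__main__":
    # Requires the model to be pulled already; expect a long wait.
    print(ask("Create a Flappy Bird game in Python."))
```

Since the Unsloth build already ships its settings, passing `options` is mainly useful when you want to experiment with different values without touching the model itself.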
Results and summary
I tried the Flappy Bird game example from Unsloth:
Create a Flappy Bird game in Python. You must include these things:
- You must use pygame.
- The background color should be randomly chosen and is a light shade. Start with a light blue color.
- Pressing SPACE multiple times will accelerate the bird.
- The bird's shape should be randomly chosen as a square, circle or triangle. The color should be randomly chosen as a dark color.
- Place on the bottom some land colored as dark brown or yellow chosen randomly.
- Make a score shown on the top right side. Increment if you pass pipes and don't hit them.
- Make randomly spaced pipes with enough space. Color them randomly as dark green or light brown or a dark gray shade.
- When you lose, show the best score. Make the text inside the screen. Pressing q or Esc will quit the game. Restarting is pressing SPACE again.
- The final game should be inside a markdown section in Python. Check your code for errors and fix them before the final markdown section.
I used Chatbox as the chat client. The output is too long to include here; you can look at Unsloth's write-up directly.
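As a reasoning model, it has the following problem on my MacBook Pro M3 Max (64 GB):
- It thinks for a long time. That makes it a poor fit for code completion, because long waits easily break your train of thought.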
Other open-source reasoning models