Say Goodbye to Repeated Output! A Hands-On Guide to Troubleshooting and Fixing QwQ-32B

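The Problem

QwQ-32B is a reasoning model whose performance rivals DeepSeek-R1, but many people have found that it can fall into endless generation, repeating the same output without stopping.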

After waiting two days, I found that a fix had already been published: Tutorial: How to Run QwQ-32B effectively | Unsloth Documentation

This post is a record of my own experience with it.

Ollama

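The only machine I can run this model on is my MacBook Pro, so this post covers Ollama only. See also: llama.cpp usage

Ollama supports bundling a model's settings into a params file, so MacBook users can simply run the following command: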

ollama run hf.co/unsloth/QwQ-32B-GGUF:Q4_K_M

Or pick another quantized version from unsloth/QwQ-32B-GGUF at main:

ollama run hf.co/unsloth/QwQ-32B-GGUF:Q5_K_M
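Once the model has been pulled with one of the commands above, it can also be called from a script. The sketch below assumes Ollama is serving on its default local address and uses its `/api/generate` endpoint. The helper names `build_request` and `generate` are my own, and the `temperature` value in the usage line is only a placeholder: take the actual recommended values from the Unsloth tutorial. If you pass no `options`, the settings bundled with the model are left untouched.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
DEFAULT_MODEL = "hf.co/unsloth/QwQ-32B-GGUF:Q4_K_M"


def build_request(prompt, model=DEFAULT_MODEL, options=None):
    """Build the JSON payload for a single non-streaming generation."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if options:
        # Per-request sampling overrides; omitted keys keep the model defaults.
        payload["options"] = dict(options)
    return payload


def generate(prompt, model=DEFAULT_MODEL, options=None):
    """Send the prompt to the local Ollama server and return the generated text."""
    body = json.dumps(build_request(prompt, model, options)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Placeholder override for illustration only, not a recommended setting.
    print(generate("Say hello in one sentence.", options={"temperature": 0.6}))
```

Keeping payload construction separate from the network call makes it easy to log or inspect exactly which settings each request sends.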

Results and Summary

I tried Unsloth's Flappy Bird game example:

Create a Flappy Bird game in Python. You must include these things:

- You must use pygame.
- The background color should be randomly chosen and is a light shade. Start with a light blue color.
- Pressing SPACE multiple times will accelerate the bird.
- The bird's shape should be randomly chosen as a square, circle or triangle. The color should be randomly chosen as a dark color.
- Place on the bottom some land colored as dark brown or yellow chosen randomly.
- Make a score shown on the top right side. Increment if you pass pipes and don't hit them.
- Make randomly spaced pipes with enough space. Color them randomly as dark green or light brown or a dark gray shade.
- When you lose, show the best score. Make the text inside the screen. Pressing q or Esc will quit the game. Restarting is pressing SPACE again.
- The final game should be inside a markdown section in Python. Check your code for errors and fix them before the final markdown section.

I used Chatbox as the chat client. The output is too long to include here, so have a look at Unsloth's write-up instead.
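Because the prompt asks for the final game inside a Python markdown section, the long response can be trimmed down to just the runnable code. The sketch below assumes the raw output wraps the model's reasoning in `<think>`…`</think>` tags and fences the final code with three backticks followed by `python`. Some clients hide or reformat the reasoning, so treat both helpers (`split_reasoning`, `last_python_block`) as hypothetical starting points.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this example readable

# Matches a fenced block whose opening fence is tagged "python".
PYTHON_BLOCK = re.compile(FENCE + r"python[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def split_reasoning(text):
    """Split raw output into (reasoning, answer) at the last </think> tag."""
    head, sep, tail = text.rpartition("</think>")
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", text.strip()
    return head.replace("<think>", "", 1).strip(), tail.strip()


def last_python_block(answer):
    """Return the contents of the last fenced Python block, or None."""
    blocks = PYTHON_BLOCK.findall(answer)
    return blocks[-1] if blocks else None


sample = (
    "<think>plan the game loop</think>\n"
    "Here is the game:\n" + FENCE + "python\nimport pygame\n" + FENCE + "\n"
)
reasoning, answer = split_reasoning(sample)
print(last_python_block(answer))
```

Taking the last block rather than the first matters here, since the prompt tells the model to check and fix its code before the final markdown section, so earlier blocks may be drafts.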

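As a reasoning model, it has the following drawback on my MacBook Pro (M3 Max, 64 GB):

It thinks for a long time. That makes it a poor fit for code completion, because a long wait easily breaks your train of thought.

Other open-source reasoning models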

NovaSky-AI/Sky-T1-32B-Preview
