
AFAIK, 2-bit quantization loses so much quality that you're better off using a different, smaller model altogether. See here:

https://www.reddit.com/r/LocalLLaMA/comments/18ituzh/mixtral...



