Llama.cpp
= Build llama.cpp =
<ref>https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md</ref>
<syntaxhighlight lang="bash">
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
</syntaxhighlight>
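Newer llama.cpp releases have deprecated the Makefile in favor of CMake (see docs/build.md). A sketch of that path; the job count and explicit Release config are illustrative choices, and GPU backends need extra <code>-D</code> flags:

<syntaxhighlight lang="bash">
# Configure and build with CMake (flags here are illustrative).
cmake -B build
cmake --build build --config Release -j 8
# Binaries such as llama-cli end up under build/bin/
</syntaxhighlight>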
= Convert Hugging Face Model to GGUF =
<syntaxhighlight lang="bash">
pip install -r requirements.txt
python convert_hf_to_gguf.py --help
# M1 MPS does not support bf16, so use f16 output instead
python convert_hf_to_gguf.py ~/Documents/MODELS/Qwen2-0.5B --outfile ~/Documents/MODELS/qwen2-0.5b-fp16.gguf --outtype f16
</syntaxhighlight>
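A quick sanity check that the conversion produced a real GGUF file: the format always begins with the 4-byte magic <code>GGUF</code>. A minimal sketch, using the example path from above:

<syntaxhighlight lang="bash">
# Print the first four bytes; a valid GGUF file yields "GGUF".
head -c 4 ~/Documents/MODELS/qwen2-0.5b-fp16.gguf; echo
</syntaxhighlight>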
= Run the model =
<syntaxhighlight lang="bash">
./llama-cli -m ~/Documents/MODELS/qwen2-0.5b-fp16.gguf -p "Hi, who are you?" -n 128
</syntaxhighlight>
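For interactive chat rather than a one-shot prompt, recent llama-cli builds have a conversation mode, and llama-server exposes an OpenAI-compatible HTTP API. A sketch; the port and request body are illustrative:

<syntaxhighlight lang="bash">
# Conversation mode: applies the model's chat template and keeps history.
./llama-cli -m ~/Documents/MODELS/qwen2-0.5b-fp16.gguf -cnv

# HTTP server with an OpenAI-compatible /v1/chat/completions endpoint.
./llama-server -m ~/Documents/MODELS/qwen2-0.5b-fp16.gguf --port 8080
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hi, who are you?"}]}'
</syntaxhighlight>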
= Quantize =
<ref>https://cloud.aigonna.com/2024/03/25/llama-cpp%E9%87%8F%E5%8C%96/</ref>
<syntaxhighlight lang="bash">
# Quantize the f16 GGUF to 4-bit; Q4_K_M is a common size/quality trade-off.
# (Older builds name this binary `quantize` rather than `llama-quantize`.)
./llama-quantize ~/Documents/MODELS/qwen2-0.5b-fp16.gguf ~/Documents/MODELS/qwen2-0.5b-q4_k_m.gguf Q4_K_M
</syntaxhighlight>
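To gauge how much quality quantization costs, llama.cpp ships a perplexity tool. A sketch; the quantized file name and the evaluation text file are illustrative assumptions:

<syntaxhighlight lang="bash">
# Lower perplexity is better; run the f16 and quantized models on the same text to compare.
./llama-perplexity -m ~/Documents/MODELS/qwen2-0.5b-q4_k_m.gguf -f wiki.test.raw
</syntaxhighlight>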
[[Category:Deep Learning]]
Latest revision as of 06:15, 19 July 2024