Llama.cpp
= Build llama.cpp =
<ref>https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md</ref>
<syntaxhighlight lang="bash">
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
</syntaxhighlight>
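Newer llama.cpp releases have deprecated the Makefile in favor of CMake (see docs/build.md). A sketch of that path; the job count and explicit Release config are illustrative choices, and GPU backends need extra <code>-D</code> flags:

<syntaxhighlight lang="bash">
# Configure and build with CMake (flags here are illustrative).
cmake -B build
cmake --build build --config Release -j 8
# Binaries such as llama-cli end up under build/bin/
</syntaxhighlight>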
= Convert Hugging Face Model to GGUF =
<syntaxhighlight lang="bash">
pip install -r requirements.txt
python convert_hf_to_gguf.py --help
# M1 MPS does not support bf16, so use f16 output instead
python convert_hf_to_gguf.py ~/Documents/MODELS/Qwen2-0.5B --outfile ~/Documents/MODELS/qwen2-0.5b-fp16.gguf --outtype f16
</syntaxhighlight>
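A quick sanity check that the conversion produced a real GGUF file: the format always begins with the 4-byte magic <code>GGUF</code>. A minimal sketch, using the example path from above:

<syntaxhighlight lang="bash">
# Print the first four bytes; a valid GGUF file yields "GGUF".
head -c 4 ~/Documents/MODELS/qwen2-0.5b-fp16.gguf; echo
</syntaxhighlight>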
= Run the model =
<syntaxhighlight lang="bash">
./llama-cli -m ~/Documents/MODELS/qwen2-0.5b-fp16.gguf -p "Hi, who are you?" -n 128
</syntaxhighlight>
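For interactive chat rather than a one-shot prompt, recent llama-cli builds have a conversation mode, and llama-server exposes an OpenAI-compatible HTTP API. A sketch; the port and request body are illustrative:

<syntaxhighlight lang="bash">
# Conversation mode: applies the model's chat template and keeps history.
./llama-cli -m ~/Documents/MODELS/qwen2-0.5b-fp16.gguf -cnv

# HTTP server with an OpenAI-compatible /v1/chat/completions endpoint.
./llama-server -m ~/Documents/MODELS/qwen2-0.5b-fp16.gguf --port 8080
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hi, who are you?"}]}'
</syntaxhighlight>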
= Quantize =
<ref>https://cloud.aigonna.com/2024/03/25/llama-cpp%E9%87%8F%E5%8C%96/</ref>
<syntaxhighlight lang="bash">
# Quantize the f16 GGUF to 4-bit; Q4_K_M is a common size/quality trade-off.
# (Older builds name this binary `quantize` rather than `llama-quantize`.)
./llama-quantize ~/Documents/MODELS/qwen2-0.5b-fp16.gguf ~/Documents/MODELS/qwen2-0.5b-q4_k_m.gguf Q4_K_M
</syntaxhighlight>
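To gauge how much quality quantization costs, llama.cpp ships a perplexity tool. A sketch; the quantized file name and the evaluation text file are illustrative assumptions:

<syntaxhighlight lang="bash">
# Lower perplexity is better; run the f16 and quantized models on the same text to compare.
./llama-perplexity -m ~/Documents/MODELS/qwen2-0.5b-q4_k_m.gguf -f wiki.test.raw
</syntaxhighlight>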
[[Category:Deep Learning]]
Latest revision as of 06:15, 19 July 2024