Last Updated: December 24, 2025
Installation
Build from source, pre-built binaries
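Pre-built binaries
The llama.cpp GitHub releases page publishes ready-to-run archives for Windows, macOS, and Linux, and package managers such as Homebrew (brew install llama.cpp) also ship it.
Build from source
Clone the repository, run cmake -B build, then cmake --build build --config Release. This needs CMake and a C/C++ compiler.
Where the binaries land
A source build places executables such as llama-cli, llama-server, and llama-quantize in build/bin.
Backends are chosen at build time
GPU support is typically compiled in through CMake options (for example -DGGML_CUDA=ON), so switching backends usually means reconfiguring and rebuilding.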
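The source-build steps can be sketched as a small Python helper that assembles the two CMake invocations. This is a minimal sketch, assuming the cheatsheet targets llama.cpp and its standard CMake workflow; the function name and the "llama.cpp" checkout path are hypothetical, and nothing is executed here.

```python
import shlex
from pathlib import PurePosixPath


def cmake_build_commands(source_dir: str, build_dir: str = "build") -> list[list[str]]:
    """Return the configure and build commands for a Release build.

    Assumption: source_dir is a local checkout that contains CMakeLists.txt.
    """
    src = PurePosixPath(source_dir)
    out = src / build_dir
    # Step 1: configure the build tree.
    configure = ["cmake", "-S", str(src), "-B", str(out), "-DCMAKE_BUILD_TYPE=Release"]
    # Step 2: compile; --config matters for multi-config generators (e.g. Visual Studio).
    build = ["cmake", "--build", str(out), "--config", "Release"]
    return [configure, build]


# Print the commands so they can be reviewed or pasted into a shell.
for cmd in cmake_build_commands("llama.cpp"):
    print(shlex.join(cmd))
```

To actually run the build, each list can be passed to subprocess.run(cmd, check=True).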
Model Conversion
Convert to GGUF format
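Start from a Hugging Face checkpoint
The llama.cpp repository includes convert_hf_to_gguf.py, which reads a local model directory (config, tokenizer, and safetensors or PyTorch weights) and writes a single .gguf file.
Install the Python dependencies first
The script relies on the packages listed in the repository's requirements.txt; install them with pip install -r requirements.txt.
Choose the output precision
--outtype selects the stored tensor type, such as f32, f16, bf16, or q8_0. Convert to f16 or bf16 if you plan to quantize further afterwards.
One file holds everything
GGUF stores the weights, the tokenizer, and metadata such as hyperparameters together, so no sidecar files are needed at inference time.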
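A conversion call can be sketched as follows. This assumes llama.cpp's convert_hf_to_gguf.py script and its --outfile and --outtype options; the helper name, the model directory, and the output filename are hypothetical, and the set of accepted types below is only a subset chosen for illustration.

```python
import shlex
import sys

# Subset of output types used in this sketch; the script may accept others.
SUPPORTED_OUTTYPES = {"f32", "f16", "bf16", "q8_0"}


def convert_command(model_dir: str, outfile: str, outtype: str = "f16",
                    script: str = "convert_hf_to_gguf.py") -> list[str]:
    """Build the argument list for converting a Hugging Face model directory to GGUF."""
    if outtype not in SUPPORTED_OUTTYPES:
        raise ValueError(f"unsupported outtype: {outtype}")
    # Use the current interpreter so the script runs in the active environment.
    return [sys.executable, script, model_dir, "--outfile", outfile, "--outtype", outtype]


cmd = convert_command("models/my-model", "my-model-f16.gguf")
print(shlex.join(cmd))
```

Keeping the intermediate file at f16 or bf16 leaves room to produce several quantized variants from the same conversion.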
Quantization Options
q4_0, q4_K_M, q8_0 explained
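q4_0
Legacy 4-bit format: weights are grouped in blocks of 32 that share one 16-bit scale, about 4.5 bits per weight. Small and fast, but usually the least accurate of the three.
q4_K_M
A 4-bit k-quant "medium" mix: most tensors use 4-bit super-blocks of 256 weights, while some sensitive tensors are kept at higher precision. It is usually a better quality-for-size trade-off than q4_0 and a common default choice.
q8_0
8-bit blocks of 32 weights with one 16-bit scale, about 8.5 bits per weight. Output quality is typically very close to f16 at roughly half the size.
Producing a quantized file
Run llama-quantize with the input GGUF, the output path, and the type name (for example Q4_K_M). Quantize from an f32, f16, or bf16 file rather than re-quantizing an already quantized one.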
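To make the block formats concrete, here is a minimal sketch of q8_0-style quantization in pure Python: 32 weights share one scale, and each weight becomes a signed 8-bit integer. It is an illustration under simplifying assumptions, not the library's implementation: the scale is kept as a Python float rather than a 16-bit float, and the function names are hypothetical. The bits-per-weight helper applies to simple block formats such as q4_0 and q8_0, not to the k-quant mixes.

```python
BLOCK_SIZE = 32  # weights per block in q4_0 and q8_0


def quantize_q8_0_block(values: list[float]) -> tuple[float, list[int]]:
    """Quantize one block of 32 floats to (scale, int8 values)."""
    if len(values) != BLOCK_SIZE:
        raise ValueError("a block must hold exactly 32 values")
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0.0, [0] * BLOCK_SIZE
    scale = amax / 127.0  # largest magnitude maps to +/-127
    return scale, [round(v / scale) for v in values]


def dequantize_block(scale: float, quants: list[int]) -> list[float]:
    """Recover approximate floats from a quantized block."""
    return [scale * q for q in quants]


def bits_per_weight(quant_bits: int, block_size: int = BLOCK_SIZE, scale_bits: int = 16) -> float:
    """Storage cost per weight, including the per-block scale."""
    return (quant_bits * block_size + scale_bits) / block_size


def estimated_size_gb(n_params: float, bpw: float) -> float:
    """Rough file size in gigabytes (10^9 bytes), ignoring metadata."""
    return n_params * bpw / 8 / 1e9


block = [i / 31 for i in range(BLOCK_SIZE)]
scale, quants = quantize_q8_0_block(block)
worst = max(abs(a - b) for a, b in zip(block, dequantize_block(scale, quants)))
print(f"q4_0: {bits_per_weight(4)} bpw, q8_0: {bits_per_weight(8)} bpw")
print(f"max round-trip error in sample block: {worst:.5f}")
```

The round-trip error is bounded by half a quantization step (scale / 2), which is why larger outliers inside a block cost precision for every other weight in that block.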
Command Line Usage
Run models, parameters, options
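Load a model and prompt it
llama-cli -m model.gguf -p "prompt" runs a generation from the given prompt; -n limits the number of tokens to generate.
Context and sampling
-c sets the context size in tokens; --temp, --top-k, and --top-p control sampling.
Interactive chat
-cnv starts conversation mode, which applies the model's chat template to each turn.
Serving over HTTP
llama-server -m model.gguf --port 8080 exposes an OpenAI-compatible HTTP API. Flags can change between releases, so check --help for the installed version.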
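A typical invocation can be assembled programmatically, which helps when scripting runs over several models or prompts. This sketch assumes llama.cpp's llama-cli binary and its -m, -p, -n, -c, and --temp options; the helper name, model path, and default values are hypothetical.

```python
import shlex


def llama_cli_command(model: str, prompt: str, n_predict: int = 128,
                      ctx_size: int = 4096, temp: float = 0.8,
                      binary: str = "llama-cli") -> list[str]:
    """Build an argument list for a single prompt-and-generate run."""
    return [
        binary,
        "-m", model,            # path to the GGUF model file
        "-p", prompt,           # prompt text
        "-n", str(n_predict),   # maximum number of tokens to generate
        "-c", str(ctx_size),    # context window in tokens
        "--temp", str(temp),    # sampling temperature
    ]


cmd = llama_cli_command("model-q4_k_m.gguf", "Explain GGUF in one sentence.")
print(shlex.join(cmd))
```

Passing the list to subprocess.run avoids shell quoting problems with prompts that contain spaces or quotes.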
GPU Acceleration
CUDA, Metal, OpenCL
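CUDA (NVIDIA)
Configure the build with -DGGML_CUDA=ON; the CUDA toolkit must be installed at build time.
Metal (Apple Silicon)
Enabled by default in macOS builds, so no extra flag is normally needed.
OpenCL
Configure the build with -DGGML_OPENCL=ON; in current llama.cpp this backend primarily targets Qualcomm Adreno GPUs.
Offloading layers
-ngl (--n-gpu-layers) sets how many model layers are placed in GPU memory. A value at or above the layer count offloads everything; a lower value splits the model between GPU and CPU when it does not fit in VRAM.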
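Backend selection and layer offloading can be sketched together. This assumes llama.cpp's CMake options GGML_CUDA, GGML_METAL, and GGML_OPENCL and the -ngl runtime flag; the helper names are hypothetical, and layers_that_fit is a rough heuristic that treats all layers as equally sized and ignores KV cache growth beyond the stated reserve.

```python
# CMake option per backend; None means a plain CPU build.
BACKEND_FLAGS = {
    "cpu": None,
    "cuda": "-DGGML_CUDA=ON",
    "metal": "-DGGML_METAL=ON",
    "opencl": "-DGGML_OPENCL=ON",
}


def configure_command(backend: str, build_dir: str = "build") -> list[str]:
    """Build the CMake configure command for the chosen backend."""
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unknown backend: {backend}")
    cmd = ["cmake", "-B", build_dir]
    flag = BACKEND_FLAGS[backend]
    if flag is not None:
        cmd.append(flag)
    return cmd


def layers_that_fit(vram_mb: float, model_mb: float, n_layers: int,
                    reserve_mb: float = 500) -> int:
    """Estimate how many layers to offload, assuming equally sized layers."""
    per_layer_mb = model_mb / n_layers
    usable_mb = max(0.0, vram_mb - reserve_mb)
    return min(n_layers, int(usable_mb // per_layer_mb))


def offload_args(n_gpu_layers: int) -> list[str]:
    """Runtime arguments that request GPU offloading."""
    return ["-ngl", str(n_gpu_layers)]


print(configure_command("cuda"))
print(offload_args(layers_that_fit(vram_mb=3000, model_mb=4000, n_layers=32)))
```

The estimate is only a starting point; if loading fails or memory runs out, reduce the layer count and retry.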
Performance Tuning
Batch size, context, threads
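Threads (-t)
Sets the number of CPU threads used for generation. A value near the physical core count is a usual starting point, since going beyond it often slows generation down.
Batch size (-b, -ub)
-b is the logical batch size for prompt processing and -ub the physical micro-batch. Larger values speed up prompt ingestion at the cost of memory.
Context size (-c)
KV cache memory grows linearly with context length, so set -c no higher than the workload needs.
Measure instead of guessing
llama-bench reports prompt-processing and generation throughput in tokens per second for a given model and settings.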
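The cost of a larger context can be estimated before loading a model. The sketch below uses the standard KV cache size formula for transformer decoders: two tensors (keys and values) per layer, each holding n_ctx x n_kv_heads x head_dim elements. The function name is hypothetical, and the model shape in the example (32 layers, head dimension 128) is illustrative rather than taken from the cheatsheet.

```python
def kv_cache_bytes(n_layers: int, n_ctx: int, n_kv_heads: int, head_dim: int,
                   bytes_per_elem: int = 2) -> int:
    """Memory needed for the KV cache; bytes_per_elem=2 corresponds to f16."""
    keys_and_values = 2
    return keys_and_values * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem


GIB = 1024 ** 3

# Full multi-head attention: every attention head has its own K/V.
full = kv_cache_bytes(n_layers=32, n_ctx=4096, n_kv_heads=32, head_dim=128)
# Grouped-query attention: fewer K/V heads shrink the cache proportionally.
gqa = kv_cache_bytes(n_layers=32, n_ctx=4096, n_kv_heads=8, head_dim=128)

print(f"32 KV heads: {full / GIB:.2f} GiB")
print(f" 8 KV heads: {gqa / GIB:.2f} GiB")
```

Because the result scales linearly with n_ctx, halving the context halves the cache, which is often the quickest way to free memory for more offloaded layers.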
💡 Pro Tip: Master the fundamentals first before moving to advanced techniques. Practice regularly and refer to this cheatsheet for quick reference.