Local Models
Offline, Uncensored, Unrestricted
GGUF Quantized Models
Installation Guide
# 1. Download model (4-bit or lower quantization recommended)
$ wget [MODEL_URL]
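For illustration only, a download might look like the line below; the repository and file name are hypothetical placeholders, not a recommendation:
# Example (hypothetical URL; substitute a real GGUF download link)
$ wget https://huggingface.co/example-org/example-7b-GGUF/resolve/main/example-7b.Q4_K_M.gguf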
# 2. Install llama.cpp
$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp && make
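A quick sanity check that the build succeeded: the resulting binary should print its usage text (this assumes the classic Makefile build, which puts ./main in the repo root):
# Should list the supported flags if the build worked
$ ./main --help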
# 3. Run locally
$ ./main -m [MODEL_FILE] -p "Your prompt"
# Advanced: use --temp 0 for greedy, deterministic decoding
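A fuller invocation, as a sketch of commonly used flags on the classic ./main binary ([MODEL_FILE] remains a placeholder):
# -t: CPU threads, -c: context size, -n: max tokens to generate
$ ./main -m [MODEL_FILE] -t 8 -c 2048 -n 256 --temp 0.7 -p "Your prompt"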
Advanced Configuration
# For maximum performance (Linux/NVIDIA):
$ make LLAMA_CUBLAS=1
$ ./main -m [MODEL] -ngl 100 --ctx-size 4096
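Here -ngl sets how many model layers are offloaded to the GPU; 100 simply exceeds the layer count of most models, so everything ends up in VRAM. To confirm the GPU is actually being used (assumes the NVIDIA driver tools are installed):
# Watch VRAM usage and GPU utilization while inference runs
$ watch -n 1 nvidia-smi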
# To route any network traffic through Tor (inference itself is fully offline):
$ torsocks ./main -m [MODEL]
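Since local inference opens no network connections, torsocks matters mostly for the download step; a sketch, assuming a Tor daemon is running locally:
# Route the model download through Tor
$ torsocks wget [MODEL_URL]
# Optional: confirm traffic is exiting via Tor
$ torsocks curl https://check.torproject.org/api/ip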