How to run 30B/65B LLaMa-Chat on Multi-GPU Servers
LLaMa (short for "Large Language Model Meta AI") is a collection of pretrained, state-of-the-art large language models developed by Meta AI. Unlike ChatGPT, which is only accessible as a hosted service, the LLaMa models can be downloaded and run on your own hardware.