How to run 30B/65B LLaMa-Chat on Multi-GPU Servers

LLaMa (short for "Large Language Model Meta AI") is a collection of pretrained state-of-the-art large language models developed by Meta AI. Unlike the well-known ChatGPT, the LLaMa models can be downloaded and run on locally available hardware.

PyTorch 2 GPU Performance

A benchmark-based performance comparison of the new PyTorch 2 with the well-established PyTorch 1. The benchmarks cover different areas of deep learning, such as image classification and language models. The results show that PyTorch 2 generally outperforms PyTorch 1 and scales well across multiple GPUs.

Using Stable Diffusion with webUI in AIME MLC

If you're looking for a convenient and user-friendly way to interact with Stable Diffusion, the webUI from AUTOMATIC1111 is the way to go. This open-source project makes it easy to use image generation models and offers many other features in addition to the normal image generation.

Deep Learning GPU Benchmarks 2022

An overview of current high-end GPUs and compute accelerators best suited for deep learning and machine learning tasks. Included are the latest offerings from NVIDIA: the Hopper and Ada Lovelace GPU generations. The performance of multi-GPU setups is also evaluated.

AIME Machine Learning Framework Container Management

To set up and run a deep learning framework in a GPU environment, certain prerequisites regarding installed drivers and libraries must be met. There are guides for getting a specific version of your favourite framework up and running, but the easiest way is to use the AIME MLC framework.

Multi-GPU training with PyTorch

Training deep learning models consists of a large number of numerical calculations, most of which can be performed in parallel. Since GPUs offer far more cores than CPUs, GPUs (>10k cores) outperform CPUs (<= 64 cores) by a wide margin in most deep learning applications.

Deep Learning GPU Benchmarks 2021

An overview of current high-end GPUs and compute accelerators best suited for deep learning and machine learning tasks. Included are the latest offerings from NVIDIA: the Ampere GPU generation. The performance of multi-GPU setups, such as a quad RTX 3090 configuration, is also evaluated.

Deep Learning GPU Benchmarks 2020

An overview of current high-end GPUs and compute accelerators best suited for deep learning and machine learning tasks. Included are the latest offerings from NVIDIA: the Ampere GPU generation. The performance of multi-GPU setups, such as a quad RTX 3090 configuration, is also evaluated.

Deep Learning GPU Benchmarks 2019

A state-of-the-art performance overview of high-end GPUs used for deep learning in 2019. All tests are performed with the latest TensorFlow version 1.15 and optimized settings. The performance of multi-GPU setups is also evaluated.