GPUs available at Wisconsin CMS T2

Analysis Facility GPUs

The Wisconsin Analysis Facility provides an interactive Jupyter notebook environment with access to local GPUs as well as GPUs in the Condor cluster. Currently, the GPUs available in this way are NVIDIA Tesla T4 GPUs with 16 GB of memory.
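
To confirm which GPUs a notebook session can actually see, one option is to query `nvidia-smi` from Python. The following is a minimal sketch that assumes `nvidia-smi` is on the `PATH` inside the notebook environment (not confirmed by this page); the helper names `parse_gpu_csv` and `list_gpus` are made up for illustration.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, memory' CSV lines from nvidia-smi into (name, memory) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, memory = [field.strip() for field in line.rsplit(",", 1)]
        gpus.append((name, memory))
    return gpus


def list_gpus():
    """Return the GPUs visible to this session, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []  # assumption: no nvidia-smi means no usable NVIDIA GPU here
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)
```

Calling `list_gpus()` in a notebook cell returns one `(name, memory)` tuple per visible GPU, or an empty list if none can be queried.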

Large Memory Machine Learning Computer, cmsnn01.hep.wisc.edu

The computer cmsnn01.hep.wisc.edu is a Dell Pro Max with GB10, the NVIDIA GB10 Grace Blackwell GPU. It has 128 GB of unified memory, accessible to both the CPUs and the GPU, making it suitable for testing and developing LLMs and other large models. Note that cmsnn01 has ARM CPUs, unlike most other T2 computers, which use the x86 architecture.
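
Because binaries built for x86 will not run on ARM, a script shared between cmsnn01 and other T2 computers may want to check the architecture before loading compiled software. The following is a minimal sketch; the helper name `is_arm` is hypothetical, and the set of machine strings is an assumption covering the common values reported on Linux (`aarch64`) and macOS (`arm64`).

```python
import platform

# Machine strings commonly reported for 64-bit ARM systems (assumed list).
ARM_NAMES = {"aarch64", "arm64"}


def is_arm(machine=None):
    """Return True if the given (or current) machine string names a 64-bit ARM CPU."""
    machine = (machine or platform.machine()).lower()
    return machine in ARM_NAMES
```

For example, `is_arm()` can guard the choice between an ARM and an x86 build of a dependency.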

Access is via ssh from login.hep.wisc.edu. The computer does not support AFS, so if your home directory is in AFS, contact help@hep.wisc.edu to have it migrated to CEPH, which is accessible from cms01. (If you just want a quick look, you can still ssh to the computer; you simply will not have access to your AFS home directory.)

One way to use LLMs on this computer is via ollama, which is installed and running.

To list the installed models:

$ ollama list
NAME                                 ID              SIZE      MODIFIED      
llama3:latest                        365c0bd3c000    4.7 GB    2 minutes ago    
nemotron-3-super:120b-a12b-q4_K_M    95acc78b3ffd    86 GB     2 days ago       
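
The same listing can be obtained from Python through Ollama's HTTP API, which exposes installed models at `/api/tags`. The following is a minimal sketch using only the standard library; it assumes the server listens on the default `localhost:11434`, and the helper names `summarize_models` and `fetch_installed_models` are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama address


def summarize_models(tags_response):
    """Return (name, size in GB) pairs, sorted by name, from an /api/tags reply."""
    models = tags_response.get("models", [])
    # The API reports sizes in bytes; convert to GB for readability.
    return sorted((m["name"], round(m["size"] / 1e9, 1)) for m in models)


def fetch_installed_models(base_url=OLLAMA_URL, timeout=10):
    """Query the local Ollama server and summarize its installed models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return summarize_models(json.load(resp))
```

On cmsnn01, `fetch_installed_models()` should return entries corresponding to the `ollama list` output.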

To install a new model:

$ ollama pull llama3:latest
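
A pull can also be requested over the HTTP API by posting to `/api/pull`. The following is a rough sketch, not a tested recipe for this machine: it assumes the default `localhost:11434` address and the `model` request field used by recent Ollama releases (older releases used `name`). The helper names `build_pull_request` and `pull_model` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama address


def build_pull_request(model, base_url=OLLAMA_URL):
    """Build a POST request asking the Ollama server to download a model."""
    # "stream": False asks for a single JSON reply once the pull finishes.
    body = json.dumps({"model": model, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pull_model(model, base_url=OLLAMA_URL, timeout=3600):
    """Send the pull request and return the server's reported status string."""
    request = build_pull_request(model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp).get("status")
```

Large models can take a long time to download, hence the generous default timeout in `pull_model`.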

To start a chat session with a model:

$ ollama run llama3:latest
>>> What is the Higgs Boson?

To call a model programmatically from Python:

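import requests

# Ollama's local HTTP API; 11434 is its default port.
url = "http://localhost:11434/api/chat"
payload = {
    "model": "nemotron-3-super:120b-a12b-q4_K_M",
    "messages": [{"role": "user", "content": "What is the Higgs Boson?"}],
    "stream": False,  # return one complete JSON reply instead of a stream of chunks
}

# Large models can take a while to load and respond, so allow a generous timeout.
response = requests.post(url, json=payload, timeout=600)
response.raise_for_status()  # fail loudly on HTTP errors
print(response.json()["message"]["content"])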
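
For long answers it can be nicer to print text as it is generated. With `"stream": True`, the chat endpoint returns newline-delimited JSON objects, each carrying a fragment in `message.content` and a `done` flag on the last one. The following is a minimal standard-library sketch under that assumption; the helper names `iter_chat_text` and `stream_chat` and the default model are chosen for illustration.

```python
import json
import urllib.request


def iter_chat_text(lines):
    """Yield text fragments from an iterable of newline-delimited JSON chunks."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # ignore blank keep-alive lines
        chunk = json.loads(raw)
        text = chunk.get("message", {}).get("content", "")
        if text:
            yield text
        if chunk.get("done"):
            break  # final chunk: stop reading


def stream_chat(prompt, model="llama3:latest",
                url="http://localhost:11434/api/chat"):
    """Send one user message and print the reply as it arrives."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as resp:
        # The HTTP response object can be iterated line by line.
        for fragment in iter_chat_text(resp):
            print(fragment, end="", flush=True)
    print()
```

A call such as `stream_chat("What is the Higgs Boson?")` then prints the reply incrementally instead of waiting for the full response.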