tdiprima/run_system_checks

System Checks

I got tired of jumping between Stack Overflow tabs, nvidia-smi man pages, and half-remembered PyTorch commands every time I set up a new machine or debugged a training run. So I wrote this: a single script that tells me everything I need to know about my GPU, CUDA, PyTorch, CPU, and disk performance, before I waste an hour wondering why things are slow or broken.

Usage

cd src
./run_system_checks.sh

This runs every check in sequence and prints a report covering the GPU, CUDA, PyTorch, CPU, and disk performance.
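
To give a feel for the kind of checks involved, here is a minimal sketch, not the actual contents of run_system_checks.sh. The run_check helper and the exact commands are assumptions chosen for illustration. Each check is skipped with a message when the tool it needs is not installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): run a check only if its tool exists.
# Usage: run_check <label> <tool> <command...>
run_check() {
    local label="$1" tool="$2"
    shift 2
    echo "== ${label} =="
    if command -v "$tool" >/dev/null 2>&1; then
        "$@" || echo "${label}: check failed"
    else
        echo "${label}: ${tool} not found, skipping"
    fi
}

# GPU name, driver version, and total memory as reported by the NVIDIA driver.
run_check "GPU" nvidia-smi nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
# CUDA toolkit version (present only if the toolkit is installed, not just the driver).
run_check "CUDA toolkit" nvcc nvcc --version
# PyTorch version and whether it can see a CUDA device.
run_check "PyTorch" python3 python3 -c 'import torch; print(torch.__version__, torch.cuda.is_available())'
# CPU model, core count, and flags.
run_check "CPU" lscpu lscpu
```

Skipping a missing tool instead of aborting means one absent tool (say, no nvcc on a driver-only machine) does not hide the rest of the report.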

What's up with disk_speed and filesystem_speed?

They are two separate benchmarks that measure disk performance in different ways:

disk_speed.sh:

  • Uses fio (Flexible I/O Tester) - a sophisticated benchmarking tool
  • Tests read speed only
  • 1GB test file, runs for 30 seconds
  • Requires fio to be installed (on Fedora/RHEL: sudo dnf install -y fio)
  • Includes a reference table comparing HDD, SATA SSD, and NVMe speeds
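
A read-only fio run with those parameters could look roughly like the sketch below. This is not the contents of disk_speed.sh: the function name, job name, block size, file path, and the choice of direct I/O are all assumptions. Only the read-only mode, the 1GB size, and the 30-second runtime come from the list above.

```shell
#!/usr/bin/env bash
# Sketch of a read-only fio benchmark: 1 GB test file, 30-second run.
# fio_read_test, the job name, block size, and default path are assumptions.
fio_read_test() {
    local testfile="${1:-./fio_read_test.bin}"

    if ! command -v fio >/dev/null 2>&1; then
        echo "fio not found; install it first (e.g. sudo dnf install -y fio)" >&2
        return 1
    fi

    # --rw=read            sequential reads only
    # --size=1G            size of the test file fio lays out before reading
    # --runtime/--time_based  keep reading for 30 seconds, looping over the file
    # --direct=1           bypass the page cache (needs a filesystem with O_DIRECT support)
    fio --name=seq_read \
        --filename="$testfile" \
        --rw=read \
        --bs=1M \
        --size=1G \
        --runtime=30 --time_based \
        --direct=1
    local status=$?

    # Remove the test file fio created, then report fio's own exit status.
    rm -f "$testfile"
    return "$status"
}
```

Run it as fio_read_test /path/on/the/disk/to/test.bin so the file lands on the device you actually want to measure.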

filesystem_speed.sh:

  • Uses dd command - basic Unix utility
  • Tests both write and read speed
  • 100MB test file, quick test
  • Clears filesystem cache between tests for accuracy
  • No additional tools needed (dd is built-in)
  • Cleans up the test file when done
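
The steps above can be sketched as follows. This is an illustration rather than the actual filesystem_speed.sh: the function name, default path, block size, and conv=fdatasync are assumptions, and it assumes GNU dd on Linux. Dropping the page cache needs root, so the sketch warns instead of failing when it cannot.

```shell
#!/usr/bin/env bash
# Sketch of a dd-based write/read test (assumes GNU dd on Linux).
# dd_speed_test, the default path, and the block size are assumptions.
dd_speed_test() {
    local testfile="${1:-./dd_speed_test.bin}"
    local size_mb="${2:-100}"

    echo "Write test (${size_mb} MB):"
    # conv=fdatasync makes dd flush the data to disk before reporting,
    # so the figure reflects the device rather than the page cache.
    dd if=/dev/zero of="$testfile" bs=1M count="$size_mb" conv=fdatasync 2>&1 | tail -n 1

    # Drop the page cache so the read comes from disk, not RAM (root only).
    sync
    if ! { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null; then
        echo "Could not clear the cache (not root?); read speed may be inflated" >&2
    fi

    echo "Read test (${size_mb} MB):"
    dd if="$testfile" of=/dev/null bs=1M 2>&1 | tail -n 1

    # Remove the test file when done.
    rm -f "$testfile"
}
```

Calling dd_speed_test with no arguments writes and reads a 100 MB file in the current directory; dd prints its throughput on the last line of its summary, which is what tail keeps.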

The main differences:

  1. Different tools (fio vs dd)
  2. Different scope (read-only vs read+write)
  3. Different test sizes (1GB vs 100MB)
  4. Different trade-offs: fio is more configurable and gives more reliable numbers but has to be installed; dd is cruder but ships with virtually every Unix-like system

About

One-shot script to audit GPU, CUDA, PyTorch, CPU, and disk performance before debugging a slow or broken ML environment.
