mattbhenley/LearnToCodeWithBaseball

Learn To Code With Baseball

Love baseball? Want to code? Start here. Using Python and baseball's biggest stars and stats, we'll learn variables, loops, lists, and more!

This project is in progress and open to feedback. 💡

Why Python?

Python is one of the easiest programming languages to read and write, which makes it a good first language. It is also lightweight and well suited to working with data. Python doesn’t force you to memorize lots of symbols, so you can focus on what you want to do, not on how complicated the language is.

Python is like learning the rules of baseball before learning advanced analytics — it builds a foundation you can grow on.

Step 1: Check if Python is already installed (macOS)

Open Terminal

Applications → Utilities → Terminal

Run:

python3 --version

What you’ll see

✅ Python 3.x.x → Python is installed

❌ command not found → we’ll install it

macOS sometimes ships with a system Python — we will NOT use python, only python3.

Step 2: Install Homebrew (recommended)

What is Homebrew? Homebrew is a popular, free, open-source package manager for macOS and Linux that simplifies installing, updating, and managing software from the command line.

To install Homebrew, copy the following command into Terminal (open the Terminal application if it isn't already open):

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

After it finishes, let's verify that it was installed by checking the version.

brew --version

Step 2.2: Install Python

brew install python

Verify:

python3 --version
pip3 --version

We should see something like this:

Python 3.12.x
pip 23.x

Step 3: Set up project folder

In Terminal, we'll first navigate to the Desktop, then create a project folder and move into it. (We can use another location, but for beginners I recommend the Desktop.)

cd ~/Desktop
mkdir baseball_python
cd baseball_python

mkdir creates the baseball_python folder, and cd changes our directory to it.

Create our first file:

touch baseball.py

💥 You created your first Python file! Congrats.

Step 4: Choose an editor (VS Code is recommended)

Install VSCode

Download it from the official VS Code website.

Once downloaded, we can open up our project.

In your terminal window, we can simply type:

code .

The project should open in VSCode.

You can also open VSCode, select Open and select your baseball.py file.

If code doesn't work, press Cmd + Shift + P in VSCode, run Shell Command: Install 'code' command in PATH, then try again.

Step 5: Run your first Python program

Open baseball.py and type (or copy):

print("Hello, baseball Python!")

Step 5.2

❗️ You will need to add the Python extension in VSCode. VSCode should prompt you to install it when you open a Python file.

In VS Code:

  1. Click the Extensions icon on the left sidebar
  2. Search for Python
  3. Install Python

This extension gives us:

  • Syntax highlighting
  • Helpful error messages
  • Easy run buttons

Run your file either in VSCode or in Terminal.

In VSCode, select ▶ Run. The integrated terminal should open and you should see:

Hello, baseball Python!

Or in Terminal, type:

python3 baseball.py

You'll then see:

Hello, baseball Python!

🎉 That’s it. You just ran your first Python file.

Why this matters

  • print() shows output
  • .py files are Python programs
  • python3 filename.py runs the file in Terminal

Step 6: Install Baseball-Friendly Libraries

Python really shines when working with data, and baseball is full of it.

We’ll install a few libraries that help us work with stats.

Step 6.2: Install the libraries

pip3 install pandas numpy matplotlib

What these do:

  • pandas → tables of player stats
  • numpy → math and calculations
  • matplotlib → charts and graphs

Step 6.3: Quick Test

Create a new file:

touch stats.py

Open it and add:

import pandas as pd

data = {
    "player": ["Judge", "Ohtani", "Trout"],
    "hits": [2, 3, 1],
    "at_bats": [4, 5, 3]
}

df = pd.DataFrame(data)
df["average"] = df["hits"] / df["at_bats"]

print(df)

Run it and you should see output like this:

   player  hits  at_bats   average
0   Judge     2        4  0.500000
1  Ohtani     3        5  0.600000
2   Trout     1        3  0.333333

👏 Now we're ready to code with baseball ⚾️

Step 7: Variables (using Baseball stats)

Before we can do anything interesting in Python, we need to understand variables.

What is a variable?

A variable is a named container that stores a value.

In baseball terms:

  • A stat (hits, at-bats, runs) is a value
  • The stat name is the variable

Example: open baseball.py and replace the code with this:

player_name = "Judge"
hits = 2
at_bats = 4

Here's what's happening:

  • player_name stores text (a string)
  • hits stores a number
  • at_bats stores a number

Using variables together

Now let’s calculate a batting average:

batting_average = hits / at_bats

print(player_name)
print(batting_average)

When we run the file, you should see:

Judge
0.5

Making the output easier to read

Python lets us format output so it looks nicer:

print(f"{player_name}'s batting average is {batting_average:.3f}")

Output:

Judge's batting average is 0.500
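As a small sketch of what the :.3f part does, the example below formats the same average two ways. Dropping the leading zero with lstrip("0") is an optional extra, not something the steps above require; it only imitates how box scores usually print averages.

```python
# Sketch: formatting a batting average for display.
player_name = "Judge"
hits = 2
at_bats = 4
batting_average = hits / at_bats

# :.3f rounds to three decimal places and keeps trailing zeros.
three_decimals = f"{batting_average:.3f}"

# Optional: remove the leading zero so 0.500 is shown as .500.
box_score_style = three_decimals.lstrip("0")

print(f"{player_name}: {three_decimals}")
print(f"{player_name}: {box_score_style}")
```

Without the format specifier, Python would print 0.5, which is correct but not how averages are usually written.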

Why Variables matter:

  • Store baseball stats
  • Reuse values
  • Change data without rewriting code

If Judge gets another hit, we only change one number:

hits = 3

When we run the file again, everything that depends on hits is recalculated from the new value.
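Here is the whole file again with only that one value changed, as a quick way to see the effect:

```python
player_name = "Judge"
hits = 3        # was 2; this is the only line that changed
at_bats = 4

# The calculation itself is untouched; it just uses the new value.
batting_average = hits / at_bats

print(f"{player_name}'s batting average is {batting_average:.3f}")
```

Running it prints Judge's batting average is 0.750.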

Takeaways

  • Variables store information
  • Numbers don’t need quotes
  • Text does need quotes
  • Variables make stats reusable and flexible

Step 8: Data Types (Strings vs Numbers)

Python needs to know what kind of data it’s working with.

These are called data types.

For now, we’ll focus on the two most important ones:

  • Strings (text)
  • Numbers (integers and decimals)

Strings (text)

A string is any text wrapped in quotes.

Baseball examples:

player_name = "Judge"
team = "Yankees"
position = "RF"

Key rule:

  • Strings must be inside quotes (" " or ' ')

If you forget the quotes, Python treats the word as a variable name. If no variable with that name exists, it raises a NameError.
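The sketch below triggers that error on purpose. The try/except wrapper is not covered in this guide; it is used here only so the example can show the error and keep running.

```python
try:
    # No quotes, so Python looks for a variable named Yankees.
    team = Yankees
except NameError as error:
    message = str(error)
    print(f"NameError: {message}")

# With quotes, Python stores the text as a string.
team = "Yankees"
print(team)
```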

Numbers

Numbers are used for math and do NOT use quotes.

hits = 2
at_bats = 4
batting_average = 0.500

There are two main number types you’ll see:

Integers (whole numbers)

hits = 2
at_bats = 4

Floats (decimals)

batting_average = 0.500

Why data types matter

Python treats strings and numbers very differently.

This works (math with numbers):

hits = 2
at_bats = 4
print(hits / at_bats)

Output:

0.5

This does NOT work (math with strings):

hits = "2"
at_bats = "4"
print(hits / at_bats)

❌ Python raises a TypeError because text can’t be divided.
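If the stats do arrive as text (for example, typed in by a user), one common fix is to convert them with the built-in int() first. This is a minimal sketch; try/except is used only to show the failure without stopping the script.

```python
hits = "2"
at_bats = "4"

try:
    print(hits / at_bats)
except TypeError:
    print("Strings can't be divided")

# int() turns text that looks like a whole number into an integer.
average = int(hits) / int(at_bats)
print(average)
```

After the conversion, the division works and prints 0.5.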

Mixing strings and numbers (the right way)

If you want to print text and numbers together, use an f-string:

player_name = "Judge"
hits = 2
at_bats = 4

average = hits / at_bats

print(f"{player_name} has a batting average of {average:.3f}")

Output:

Judge has a batting average of 0.500

Checking a variable’s data type

You can ask Python what type something is:

print(type(player_name))
print(type(hits))
print(type(average))

Output:

<class 'str'>
<class 'int'>
<class 'float'>

Common beginner mistake 🚨

❌ This runs without an error, but it stores text, so math on it will fail:

hits = "2"

✅ This is correct:

hits = 2

Remember:

  • Quotes = text
  • No quotes = number

Key takeaways

  • Strings = text (names, teams, positions)
  • Integers = whole numbers (hits, at-bats)
  • Floats = decimals (averages)
  • Python needs correct data types to do math

Step 9: Lists (storing a roster)

So far, we’ve worked with one player at a time. But baseball is a team sport — we need a way to store multiple players together.

That’s where lists come in.

Creating a List

players = ["Judge", "Ohtani", "Trout"]

Things to remember:

  • Lists use square brackets []
  • Items are separated by commas
  • Order matters

Accessing items in a list

Each item in a list has a position called an index. Indexes start at 0, not 1.

print(players[0])
print(players[1])
print(players[2])

Output:

Judge
Ohtani
Trout

Adding players to a list

You can add a new player to the roster using .append() :

players.append("Betts")
print(players)

Output:

['Judge', 'Ohtani', 'Trout', 'Betts']

Counting players on the roster

To see how many players are in the list, use len() :

print(len(players))

Output:

4

Lists of numbers (baseball stats)

Lists aren’t just for names — they’re great for stats too:

hits = [2, 3, 1]
at_bats = [4, 5, 3]

Each position lines up with the same player:

  • hits[0] → Judge
  • hits[1] → Ohtani
  • hits[2] → Trout

Using list values in calculations

average = hits[0] / at_bats[0]
print(average)

Output:

0.5

Common beginner mistakes 🚨

❌ Using parentheses instead of brackets (this creates a tuple, which can't be changed with .append()):

players = ("Judge", "Ohtani", "Trout")

❌ Forgetting indexes start at 0:

print(players[1])

(This prints the second player, not the first.)
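To make both mistakes concrete, here is a short sketch. The variable names players_tuple and players_list are just for illustration, and try/except is used only to show the error without stopping the script.

```python
players_tuple = ("Judge", "Ohtani", "Trout")   # parentheses: a tuple
players_list = ["Judge", "Ohtani", "Trout"]    # brackets: a list

print(type(players_tuple))
print(type(players_list))

# Lists can grow.
players_list.append("Betts")

# Tuples cannot; they have no .append() method.
try:
    players_tuple.append("Betts")
except AttributeError:
    print("Tuples have no .append()")

# Index 0 is the first player, index 1 is the second.
print(players_list[0])
print(players_list[1])
```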

Key takeaways

  • Lists store multiple values
  • Lists use square brackets
  • Indexes start at 0
  • Lists are perfect for rosters and stat groups

Step 10: Dictionaries (Player Cards)

Lists are great for storing multiple values, but they don’t tell us what each value represents.

For example, this works:

hits = [2, 3, 1]

But which player does each number belong to? And where would at-bats or walks go?

That’s where dictionaries come in.

What is a dictionary?

A dictionary stores data as key–value pairs.

Think of a dictionary like a player card:

  • The key is the stat name

  • The value is the stat itself

Creating a dictionary

Here’s a dictionary for one player:

player = {
    "name": "Judge",
    "hits": 2,
    "at_bats": 4
}

Things to notice:

  • Dictionaries use curly braces { }

  • Keys are strings

  • Each key maps to a value using :

Accessing values in a dictionary

You access values by using the key name:

print(player["name"])
print(player["hits"])
print(player["at_bats"])

Output:

Judge
2
4

Using dictionary values in calculations

average = player["hits"] / player["at_bats"]
print(average)

Output:

0.5

Now the code clearly shows what stat is being used.

Updating values

If a player gets another hit, you can update the dictionary:

player["hits"] = 3

Recalculate:

average = player["hits"] / player["at_bats"]
print(average)

Output:

0.75

Adding new stats

You can add new key–value pairs at any time:

player["walks"] = 1
player["rbi"] = 2
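A quick way to confirm the new stats were added is to print the whole dictionary. The in checks at the end are an extra, shown here as one way to test whether a key exists before using it.

```python
player = {"name": "Judge", "hits": 3, "at_bats": 4}

# Assigning to a key that doesn't exist yet creates it.
player["walks"] = 1
player["rbi"] = 2

print(player)

# "in" checks whether a key is present.
print("walks" in player)
print("home_runs" in player)
```

The two checks print True and False, since home_runs was never added.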

Common beginner mistakes 🚨

❌ Forgetting quotes around keys:

player[hits]

✅ Correct:

player["hits"]

Step 11: Lists of Dictionaries (A Full Roster)

Now we’re going to combine what you’ve learned so far:

  • Lists (multiple players)
  • Dictionaries (player stats)

This is how you represent a real roster in Python.

What are we building

Instead of one player:

player = {"name": "Judge", "hits": 2, "at_bats": 4}

We'll store many players:

roster = [
    {"name": "Judge", "hits": 2, "at_bats": 4},
    {"name": "Ohtani", "hits": 3, "at_bats": 5},
    {"name": "Trout", "hits": 1, "at_bats": 3}
]

Each item in the list is a dictionary representing one player.

Accessing a player

print(roster[0])

Output:

{'name': 'Judge', 'hits': 2, 'at_bats': 4}

Accessing a specific stat

print(roster[0]["name"])
print(roster[0]["hits"])

Output:

Judge
2

Looping through the roster

This is where Python starts to feel powerful.

for player in roster:
    average = player["hits"] / player["at_bats"]
    print(f"{player['name']} batting average: {average:.3f}")

Output:

Judge batting average: 0.500
Ohtani batting average: 0.600
Trout batting average: 0.333

Why this matters

This pattern is used everywhere:

  • Sports analytics
  • Databases
  • APIs
  • Real-world applications

If you understand this, you’re officially past the beginner line.
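The same pattern can also add things up across the roster. As a sketch, this computes a team batting average by totaling hits and at-bats; the names total_hits, total_at_bats, and team_average are illustrative.

```python
roster = [
    {"name": "Judge", "hits": 2, "at_bats": 4},
    {"name": "Ohtani", "hits": 3, "at_bats": 5},
    {"name": "Trout", "hits": 1, "at_bats": 3}
]

total_hits = 0
total_at_bats = 0

for player in roster:
    # += adds the value on the right to the running total.
    total_hits += player["hits"]
    total_at_bats += player["at_bats"]

team_average = total_hits / total_at_bats
print(f"Team batting average: {team_average:.3f}")
```

Note that this totals hits and at-bats first, which is different from averaging the three individual averages.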

Step 12: For Loops (Simulating an Inning)

Accessing players one at a time by index doesn’t scale. Baseball has lineups, innings, and seasons.

That’s where for loops come in.

What is a for loop?

A for loop lets Python repeat an action for each item in a collection.

In baseball terms:

  • A lineup = list
  • Each at-bat = one loop

Basic for loop example

players = ["Judge", "Ohtani", "Trout"]

for player in players:
    print(player)

Output:

Judge
Ohtani
Trout

Python reads this as: “For each player in the list, print the player’s name.”

Looping through a roster of player stats

Using our roster from the previous step:

roster = [
    {"name": "Judge", "hits": 2, "at_bats": 4},
    {"name": "Ohtani", "hits": 3, "at_bats": 5},
    {"name": "Trout", "hits": 1, "at_bats": 3}
]

for player in roster:
    average = player["hits"] / player["at_bats"]
    print(f"{player['name']} average: {average:.3f}")

Output:

Judge average: 0.500
Ohtani average: 0.600
Trout average: 0.333

Why indentation matters 🚨

Python uses indentation, not braces.

This works:

for player in roster:
    print(player["name"])

This does NOT:

for player in roster:
print(player["name"])

Indentation tells Python what belongs inside the loop.
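A small sketch of the difference: indented lines run once per player, while the un-indented line at the end runs once, after the loop finishes. The count variable is added only for illustration.

```python
roster = [
    {"name": "Judge", "hits": 2, "at_bats": 4},
    {"name": "Ohtani", "hits": 3, "at_bats": 5},
    {"name": "Trout", "hits": 1, "at_bats": 3}
]

count = 0

for player in roster:
    print(player["name"])   # indented: runs for every player
    count = count + 1       # indented: also part of the loop

# Not indented: runs one time, after the loop is done.
print(f"Players listed: {count}")
```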

Simulating an inning

Think of a loop like an inning where each batter comes up once:

batters = ["Judge", "Ohtani", "Trout"]

for batter in batters:
    print(f"{batter} is at the plate")

Output:

Judge is at the plate
Ohtani is at the plate
Trout is at the plate
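If you also want the batting order, Python's built-in enumerate() can number the batters as the loop runs. This is an optional variation on the example above; start=1 makes the count begin at 1 instead of 0.

```python
batters = ["Judge", "Ohtani", "Trout"]
announcements = []

# enumerate() yields a (number, item) pair on each pass through the loop.
for spot, batter in enumerate(batters, start=1):
    line = f"Batting #{spot}: {batter}"
    announcements.append(line)
    print(line)
```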

Key takeaways

  • For loops repeat actions
  • Loops work perfectly with lists
  • Indentation matters in Python
  • Loops model real baseball sequences

Step 13: Functions (Baseball Formulas)

So far, we’ve written the same calculations more than once. In real programming, we don’t want to repeat ourselves.

That’s where functions come in.

What is a function?

A function is a reusable block of code that performs a specific task.

In baseball terms:

  • A function is like a stat formula
  • You give it inputs (hits, at-bats)
  • It gives you an output (batting average)

Your first function

def batting_average(hits, at_bats):
    return hits / at_bats

What this means:

  • def starts a function
  • hits and at_bats are inputs (parameters)
  • return sends back a result

Using the function

avg = batting_average(2, 4)
print(avg)

Output:

0.5

Using the function with player data

player = {"name": "Judge", "hits": 2, "at_bats": 4}

avg = batting_average(player["hits"], player["at_bats"])
print(f"{player['name']} average: {avg:.3f}")

Output:

Judge average: 0.500

Using functions inside loops

roster = [
    {"name": "Judge", "hits": 2, "at_bats": 4},
    {"name": "Ohtani", "hits": 3, "at_bats": 5},
    {"name": "Trout", "hits": 1, "at_bats": 3}
]

for player in roster:
    avg = batting_average(player["hits"], player["at_bats"])
    print(f"{player['name']} average: {avg:.3f}")

Output:

Judge average: 0.500
Ohtani average: 0.600
Trout average: 0.333

Why functions matter

Functions:

  • Prevent duplicated code
  • Make programs easier to read
  • Let you change logic in one place
  • Mirror real baseball formulas

Key takeaways

  • Functions package logic into reusable blocks
  • Inputs go in parentheses
  • return sends data back
  • Functions model real baseball stats
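One benefit of keeping the logic in one place: an edge case can be handled once. A player with zero at-bats would cause a ZeroDivisionError, so this sketch returns 0.0 in that case. That return value is an assumption made for illustration, and the if statement is not covered elsewhere in this guide.

```python
def batting_average(hits, at_bats):
    # Assumption: a player with no at-bats is reported as 0.0
    # instead of crashing with ZeroDivisionError.
    if at_bats == 0:
        return 0.0
    return hits / at_bats

print(batting_average(2, 4))
print(batting_average(0, 0))
```

Every place that calls batting_average() gets the fix without any other code changing.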

Step 14: Multiple Stats (OBP, SLG, OPS)

Batting average is useful, but baseball uses multiple stats to measure performance.

In this step, we’ll:

  • Create multiple stat formulas
  • Use functions for each one
  • Combine stats like real baseball analytics

The stats we’ll calculate

We’ll start with three common ones:

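  • Batting Average (BA): BA = hits / at_bats
  • On-Base Percentage (OBP): OBP = (hits + walks) / (at_bats + walks)
  • Slugging Percentage (SLG): SLG = total_bases / at_bats
  • OPS (On-base Plus Slugging): OPS = OBP + SLG

The OBP formula here is simplified. The official version also counts hit-by-pitches and sacrifice flies.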

Batting Average function (review)

def batting_average(hits, at_bats):
    return hits / at_bats

On-Base Percentage (OBP)

def on_base_percentage(hits, walks, at_bats):
    return (hits + walks) / (at_bats + walks)

Slugging Percentage (SLG)

def slugging_percentage(total_bases, at_bats):
    return total_bases / at_bats
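Total bases counts one for a single, two for a double, three for a triple, and four for a home run. As a sketch, a helper can compute it from hit types; the function name total_bases and its parameter names are illustrative and not part of the original code.

```python
def total_bases(singles, doubles, triples, home_runs):
    # Each hit type is worth its number of bases.
    return singles + 2 * doubles + 3 * triples + 4 * home_runs

def slugging_percentage(bases, at_bats):
    return bases / at_bats

# Hypothetical game: one single and one home run in four at-bats.
tb = total_bases(singles=1, doubles=0, triples=0, home_runs=1)
slg = slugging_percentage(tb, 4)

print(tb)
print(slg)
```

A single plus a home run gives 5 total bases, so the slugging percentage over 4 at-bats is 1.25.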

OPS (combining stats)

def ops(obp, slg):
    return obp + slg

Using the functions with a player

player = {
    "name": "Judge",
    "hits": 2,
    "walks": 1,
    "at_bats": 4,
    "total_bases": 5
}

ba = batting_average(player["hits"], player["at_bats"])
obp = on_base_percentage(player["hits"], player["walks"], player["at_bats"])
slg = slugging_percentage(player["total_bases"], player["at_bats"])
player_ops = ops(obp, slg)

print(f"{player['name']} BA: {ba:.3f}")
print(f"{player['name']} OBP: {obp:.3f}")
print(f"{player['name']} SLG: {slg:.3f}")
print(f"{player['name']} OPS: {player_ops:.3f}")

Output:

Judge BA: 0.500
Judge OBP: 0.600
Judge SLG: 1.250
Judge OPS: 1.850
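The same functions can be run over a whole roster by combining them with a loop. This is a sketch: the stat_line() helper is hypothetical, and the walks and total_bases values for Ohtani and Trout are made up for illustration.

```python
def batting_average(hits, at_bats):
    return hits / at_bats

def on_base_percentage(hits, walks, at_bats):
    return (hits + walks) / (at_bats + walks)

def slugging_percentage(total_bases, at_bats):
    return total_bases / at_bats

def ops(obp, slg):
    return obp + slg

def stat_line(player):
    # Hypothetical helper: bundles every stat for one player into a dictionary.
    obp = on_base_percentage(player["hits"], player["walks"], player["at_bats"])
    slg = slugging_percentage(player["total_bases"], player["at_bats"])
    return {
        "name": player["name"],
        "ba": batting_average(player["hits"], player["at_bats"]),
        "obp": obp,
        "slg": slg,
        "ops": ops(obp, slg),
    }

# Judge's numbers match the example above; the other two rows are made up.
roster = [
    {"name": "Judge", "hits": 2, "walks": 1, "at_bats": 4, "total_bases": 5},
    {"name": "Ohtani", "hits": 3, "walks": 0, "at_bats": 5, "total_bases": 6},
    {"name": "Trout", "hits": 1, "walks": 1, "at_bats": 3, "total_bases": 1},
]

stat_lines = []
for player in roster:
    line = stat_line(player)
    stat_lines.append(line)
    print(f"{line['name']}: BA {line['ba']:.3f} | OBP {line['obp']:.3f} | OPS {line['ops']:.3f}")
```

Because each formula lives in its own function, stat_line() only has to call them in order.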

Why this matters

This is real-world programming:

  • Small, focused functions
  • Reusable logic
  • Clear formulas
  • Readable output

This is how analytics code is actually written.

Key takeaways

  • One stat = one function
  • Functions can build on each other
  • Code mirrors real baseball formulas
  • This is foundational analytics logic
