Share



Intro to FastAPI - Create a Best Ball Projections API using FastAPI, SQLAlchemy and Pandas


Flask and Django have long been the go-to web frameworks for Python developers. However, FastAPI is a more modern, high-performance web framework with support for both synchronous and and asynchronous data requests, and much more. Let’s implement a (crude) Best Ball projections API using FastAPI with a little help from SQLAlchemy and Pandas. Enough with the intro, let’s see how fast this thing goes.

Back to the Future

Note: This post is a quick intro to building an API using FastAPI. If you need a more thorough introduction to REST APIs, though it covers Flask, please read my post here. There are some similarities between this post and my Flask API post, as I go over similar topics in both.

Initial Setup

Perhaps you have read through one of my prior posts and already have Python 3 installed. If not, you can install Python 3 the old fashioned way, or do it my preferred way using pyenv. Once Python 3 is installed you will also need to install pipenv.

The following command will install pipenv for you:

pip install -U pipenv

Once Python and pipenv are both installed we then need to create our project folder by entering the following command in a terminal window:

# This assumes you have a "Desktop" directory.
# Feel free to put your project folder anywhere you see fit
$ cd ~/Desktop
$ mkdir underdog_fastapi && cd underdog_fastapi

Now, open the underdog_fastapi directory in your favorite code editor and create a new Pipfile:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]
black = "==22.12.0"
pytest = "==7.2.0"
pytest-cov = "==4.0.0"
pytest-sugar = "==0.9.6"
requests = "==2.28.1"

[packages]
fastapi = "==0.89.0"
pandas = "==1.5.2"
sqlalchemy = "==1.4.46"
uvicorn = "==0.20.0"

[requires]
python_version = "3.11"

Be aware, the above Pipfile calls for Python 3.10. Ensure that you have at least Python 3.10 installed before proceeding.

Note: FastAPI, unlike Flask and Django, does not come with its own built-in server. We will be using Uvicorn, a lightning-fast ASGI server for our needs.

Next, head back to the underdog_fastapi directory in your terminal window and install everything via pipenv. Once pipenv has finished installing all of the project’s dependencies we can activate our virtualenv and start coding!

# Install all dependencies in our Pipfile
$ pipenv install --dev

# Then activate our virtualenv
$ pipenv shell
Project Structure

Our project’s main directory now contains a Pipfile and Pipfile.lock file. Let’s create a few more directories and files, laying out our project structure like so:

fantasy_stats/
│
├── data/
│   └── underdog.csv <-- We will download this file in a step below.
├── underdog_fastapi/
│   ├── __init__.py
│   ├── constants.py
│   ├── database.py
│   └── api/
|       ├── __init__.py
|       ├── crud.py
|       ├── main.py
|       ├── models.py
|       └── schemas.py
│   └── underdog/
|       ├── __init__.py
|       ├── stacks.py
|       └── team.py
├── tests/
|   ├── __init__.py
|   └── test_api.py
├── Pipfile
└── Pipfile.lock

Above, we see a file named underdog.csv. This file is a CSV file downloaded from Underdog Fantasy containing their 2021 Best Ball player projections for their half-point per-reception scoring format. Underdog Fantasy is, in my opinion, the best place to play best ball, thus we will be using their projections for our database. The CSV can be downloaded here, or you can sign up for a free account on Underdog Fantasy and get it there as well. The CSV was last downloaded in early June of 2021, but data relevancy has no impact on this post. If you are doing actual analysis with this data I suggest you download their latest projections from their site.

We now have all of our dependencies installed, the CSV file containing the player projections, and a solid project structure. Time to write some code!

Setting up the Database

Before we start implementing our FastAPI routes we should get our database populated with data and up and running. We will accomplish this by reading the Underdog CSV data via Pandas into a Pandas DataFrame, deleting a few unwanted columns, and lastly calling the to_sql function to load all of the data into the players table via SQLAlchemy. Let’s have a look at database.py and how all of this is accomplished:

import logging
import os

import pandas
from sqlalchemy import create_engine
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

from underdog_fastapi.constants import DB_FILE, PROJECT_ROOT
from underdog_fastapi.underdog.team import TEAM_NAME_TO_ABBREV

SQLALCHEMY_DATABASE_URL = f"sqlite:///{DB_FILE}"
engine = create_engine(
    SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False}
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()

UNDERDOG_CSV = f"{PROJECT_ROOT}/data/underdog.csv"

# We rename the columns in the CSV to match our players table
DB_COLUMNS = [
    "first_name",
    "last_name",
    "adp",
    "projected_points",
    "position",
    "team_name",
    "team_abbreviation",
]
logger = logging.getLogger(__name__)


def create_database():
    """
    Function that creates a new SQLite database from the Underdog CSV
    data file. This file implicitly uses the file referenced
    in the UNDERDOG_CSV constant

    :return: None
    """
    logger.info("Creating new players database from Underdog CSV file.")

    if os.path.exists(DB_FILE):
        # Remove any old DB files, as this allows us to
        # start the API up with a new projections file.
        # The API itself is read-only, thus this is fine.
        os.remove(DB_FILE)

    Base.metadata.create_all(bind=engine)
    df = _get_dataframe_for_db()

    try:
        with engine.begin() as connection:
            df.to_sql("players", con=connection, index=False, if_exists="append")
    except SQLAlchemyError as e:
        logger.error("The database creation failed!")
        raise
    else:
        logger.info("Players database successfully created.")


def _get_dataframe_for_db():
    """
    Function that reads the UNDERDOG_CSV file, parses it to a
    Pandas dataframe, and massages the dataframe to our desired
    data model

    :return: Pandas DataFrame
    """
    logger.debug("Parsing Underdog CSV with Pandas.")

    df = pandas.read_csv(UNDERDOG_CSV)
    df["teamAbbreviation"] = df["teamName"].map(TEAM_NAME_TO_ABBREV)

    # Some of the ADPs in the Underdog CSV are set to "-".
    # We want to set this value to None, instead.
    df.loc[df["adp"] == "-", "adp"] = None

    # Delete the CSV data we do not want to persist in our database.
    # We can delete ID here, as our database table will autoincrement
    # its ID, starting from 1.
    del df["id"]
    del df["lineupStatus"]
    del df["byeWeek"]

    # Rename the Underdog CSV column headers with our table column names
    df.columns = DB_COLUMNS

    logger.debug("Successfully parsed Underdog CSV.")
    return df


def get_db():
    """
    Function used to get a SQLAlchemy Session

    :return: SQLAlchemy Session
    """
    logger.debug("Getting db session.")

    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

The function create_database will be invoked on the startup of our FastAPI application. This means we are creating a new database everytime the app is started, which might seem odd, but the dataset is small and our API is read-only (only get methods, no creates, updates or deletes). Also, the CSV from Underdog could be updated throughout the football preseason, requiring we make a new database to read from anyway.

Note: Creating a new database everytime FastAPI starts up is not something we would likely do with an application in production. Instead, we would manage database changes with a database migration tool such as Alembic.

Perhaps you noticed the get_db function I snuck in above in database.py, towards the bottom of the module. This function will be used throughout our app to get a SQLAlchemy session when needed.

Lastly, before moving on, we have a few imports towards the top of this module that we have not yet implemented. Let’s get constants.py out of the way now as it is nothing more than a few basic constants used throughout our project:

"""
constants.py
File contains some constants shared across our modules
If you move this file the PROJECT_ROOT var needs fixing.
"""

from pathlib import Path

PROJECT_ROOT = Path(__file__).parent.parent
DB_FILE = f"{PROJECT_ROOT}/data/underdog.db"

The constant imported from teams.py, TEAM_NAME_TO_ABBREV, is covered further down below in this post.

Create the SQLAlchemy Models

It is now time to define our SQLAlchemy models in models.py. I say models, but we only need a single model for this post, the Player model. Using the Base class we instantiated in database.py create the Player model in models.py:

from sqlalchemy import Column, Integer, Numeric, String

from underdog_fastapi.database import Base


class Player(Base):
    __tablename__ = "players"

    id = Column(
        Integer, primary_key=True, index=True, autoincrement=True, nullable=False
    )
    first_name = Column(String)
    last_name = Column(String)
    adp = Column(Numeric)
    projected_points = Column(Numeric)
    position = Column(String)
    team_name = Column(String)
    team_abbreviation = Column(String)

That’s all there is to it. Each database column is defined above, and the id column is marked as our primary_key and autoincrements itself.

Create the pydantic Models

FastAPI makes use of pydantic, a library to define data validation, serialization and documentation. We need to define the pydantic models for our API outputs. Our API will consist of routes to retrieve Player models from, and additional routes to retrieve what I am calling PlayerStacks. The Player routes will simply return Players from the players table, as is. The routes that return PlayerStacks will return stacks of players, grouped together by team. The pydantic models look as so:

from pydantic import BaseModel
from typing import List

from underdog_fastapi.underdog.team import Team


class PlayerBase(BaseModel):
    first_name: str
    last_name: str
    adp: float = None
    projected_points: float = None
    team_name: str
    team_abbreviation: Team


class Player(PlayerBase):
    id: int
    bye_week: int = None

    class Config:
        orm_mode = True


class PlayerStack(BaseModel):
    players: List[PlayerBase]
    average_adp: float
    median_adp: float
    projected_points_per_week: float

The above models are pretty basic. As mentioned, our API will return Player and PlayerStack models. For the sake of keeping things DRY we implement a PlayerBase model. The player data in our PlayerStack model contains similar, but lesser data than our Player model, thus inheritance makes sense here.

Note: Pydantic “models” are not the same as SQLAlchemy “models”. Yes, the libraries share some of the same jargon, but pydantic models and SQLAlchemy models are not related.

Implementing our CRUD module

What’s an API without CRUD operations? With our database and validation models squared away next up is our crud.py module containing the various functions our app needs to retrieve data from the database via SQLAlchemy:

import logging

from sqlalchemy.orm import Session

from underdog_fastapi.api.models import Player
from underdog_fastapi.underdog.stacks import generate_player_stack
from underdog_fastapi.underdog.team import Team, TEAM_NAME_TO_ABBREV


logger = logging.getLogger(__name__)


def get_player(db: Session, player_id: int):
    logger.debug(f"Retrieving data for player_id {player_id}.")
    return db.query(Player).filter(Player.id == player_id).first()


def get_players_by_team(db: Session, team: Team):
    logger.debug(f"Retrieving data for team {team}.")
    return db.query(Player).filter(Player.team_abbreviation == team.name).all()


def get_player_stack_by_team(db: Session, team: Team):
    logger.debug(f"Retrieving player stack data for team {team}.")
    players = get_players_by_team(db, team)
    return generate_player_stack(players)


def get_all_player_stacks(db: Session):
    logger.debug("Retrieving all player stacks, sorted by projected points per week.")
    player_stacks = []

    for team_name in TEAM_NAME_TO_ABBREV.values():
        player_stack = get_player_stack_by_team(db, Team(team_name))
        player_stacks.append(player_stack)

    player_stacks.sort(key=lambda stack: stack.projected_points_per_week, reverse=True)

    logger.debug(
        "Successfully retrieved all player stacks, and sorted by projected points per week."
    )
    return player_stacks

Once again, you might notice we are importing a couple of modules we have not implemented just yet, team.py and stacks.py. The former module contains an enum, representing the 32 NFL teams and their corresponding team abbreviations:

from enum import Enum


class Team(Enum):
    ARI = "ARI"
    ATL = "ATL"
    BAL = "BAL"
    BUF = "BUF"
    CAR = "CAR"
    CHI = "CHI"
    CIN = "CIN"
    CLE = "CLE"
    DAL = "DAL"
    DEN = "DEN"
    DET = "DET"
    GB = "GB"
    HOU = "HOU"
    IND = "IND"
    JAX = "JAX"
    KC = "KC"
    LV = "LV"
    LAC = "LAC"
    LAR = "LAR"
    MIA = "MIA"
    MIN = "MIN"
    NE = "NE"
    NO = "NO"
    NYG = "NYG"
    NYJ = "NYJ"
    PHI = "PHI"
    PIT = "PIT"
    SF = "SF"
    SEA = "SEA"
    TB = "TB"
    TEN = "TEN"
    WAS = "WAS"


TEAM_NAME_TO_ABBREV = {
    "Arizona Cardinals": Team.ARI.name,
    "Atlanta Falcons": Team.ATL.name,
    "Baltimore Ravens": Team.BAL.name,
    "Buffalo Bills": Team.BUF.name,
    "Carolina Panthers": Team.CAR.name,
    "Chicago Bears": Team.CHI.name,
    "Cincinnati Bengals": Team.CIN.name,
    "Cleveland Browns": Team.CLE.name,
    "Dallas Cowboys": Team.DAL.name,
    "Denver Broncos": Team.DEN.name,
    "Detroit Lions": Team.DET.name,
    "Green Bay Packers": Team.GB.name,
    "Houston Texans": Team.HOU.name,
    "Indianapolis Colts": Team.IND.name,
    "Jacksonville Jaguars": Team.JAX.name,
    "Kansas City Chiefs": Team.KC.name,
    "Las Vegas Raiders": Team.LV.name,
    "Los Angeles Chargers": Team.LAC.name,
    "Los Angeles Rams": Team.LAR.name,
    "Miami Dolphins": Team.MIA.name,
    "Minnesota Vikings": Team.MIN.name,
    "New England Patriots": Team.NE.name,
    "New Orleans Saints": Team.NO.name,
    "NY Giants": Team.NYG.name,
    "NY Jets": Team.NYJ.name,
    "Philadelphia Eagles": Team.PHI.name,
    "Pittsburgh Steelers": Team.PIT.name,
    "San Francisco 49ers": Team.SF.name,
    "Seattle Seahawks": Team.SEA.name,
    "Tampa Bay Buccaneers": Team.TB.name,
    "Tennessee Titans": Team.TEN.name,
    "Washington Football Team": Team.WAS.name,
}

The other module, stack.py is a module containing just a single function. It generates PlayerStack models, taking in a list of Player models as a parameter. The generate_player_stack function contained in this module will exclude any players projected for less than 100.0 points for the 2021 season. This number averages out to around a mere 5.85 fantasy points per game. This number is an arbritrary cutoff that I made up. Feel free to finagle this value as you see fit.

import logging
import statistics

from underdog_fastapi.api.schemas import PlayerBase, PlayerStack

PLAYER_POINT_CUTOFF = 100.0
NUM_GAMES = 17
logger = logging.getLogger(__name__)


def generate_player_stack(db_players):
    players = []
    logger.debug(f"Building player stack for {len(db_players)} players.")

    for db_player in db_players:
        if db_player.adp and _valid_projected_points(db_player.projected_points):
            player = {
                "first_name": db_player.first_name,
                "last_name": db_player.last_name,
                "adp": db_player.adp,
                "projected_points": db_player.projected_points,
                "team_name": db_player.team_name,
                "team_abbreviation": db_player.team_abbreviation,
            }

            players.append(PlayerBase(**player))

    player_adps = [player.adp for player in players]
    average_adp = statistics.mean(player_adps)
    median_adp = statistics.median(player_adps)
    projected_points_per_weak = sum(player.projected_points for player in players) / NUM_GAMES

    player_stack = {
        "players": players,
        "average_adp": average_adp,
        "median_adp": median_adp,
        "projected_points_per_week": projected_points_per_weak,
    }

    logger.info("Successfully built player stack.")
    return PlayerStack(**player_stack)


def _valid_projected_points(projected_points):
    if projected_points and projected_points >= PLAYER_POINT_CUTOFF:
        return True

    return False

Adding our FastAPI Routes and Running the API

We just have to make our FastAPI routes and start the actual application now, in main.py:

import logging
from typing import List

from fastapi import Depends, FastAPI, HTTPException
from sqlalchemy.orm import Session

from underdog_fastapi.api import crud, schemas
from underdog_fastapi.database import get_db, create_database
from underdog_fastapi.underdog.team import Team

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)-12s %(levelname)-8s %(message)s",
    datefmt="%m-%d %H:%M",
    handlers=[logging.FileHandler("football_api.log"), logging.StreamHandler()],
)
logger = logging.getLogger(__name__)

create_database()
app = FastAPI()


@app.get("/api/players/{player_id}", response_model=schemas.Player)
def get_player(player_id: int, db: Session = Depends(get_db)):
    logger.info(f"Retrieving player by id {id}.")
    player = crud.get_player(db, player_id=player_id)

    if player is None:
        logger.warning(f"Player not found by id {id}.")
        raise HTTPException(status_code=404, detail="Player not found.")

    logger.info(f"Successfully retrieved player by id {id}.")
    return player


@app.get("/api/players/teams/{team}", response_model=List[schemas.Player])
def get_player(team: str, db: Session = Depends(get_db)):
    logger.info(f"Retrieving players for team {team}.")
    players = crud.get_players_by_team(db, team=Team(team))

    if players is None:
        logger.warning(f"Players not found by team {team}.")
        raise HTTPException(status_code=404, detail="Players not found.")

    logger.info(f"Successfully retrieved players for team {team}.")
    return players


@app.get("/api/stacks/{team}", response_model=schemas.PlayerStack)
def get_player_stack(team: str, db: Session = Depends(get_db)):
    logger.info(f"Retrieving player stack for team {team}.")
    player_stack = crud.get_player_stack_by_team(db, team=Team(team))

    if player_stack is None:
        logger.warning(f"Player stack not found by team {team}.")
        raise HTTPException(
            status_code=404, detail=f"Player stack not found by team {team}"
        )

    logger.info(f"Successfully retrieved player stack for team {team}.")
    return player_stack


@app.get("/api/stacks/", response_model=List[schemas.PlayerStack])
def get_player_stack(db: Session = Depends(get_db)):
    logger.info(f"Retrieving all player stacks")
    player_stacks = crud.get_all_player_stacks(db)

    if player_stacks is None:
        logger.warning(f"Player stacks not retrieved.")
        raise HTTPException(status_code=500, detail=f"Internal server error.")

    logger.info(f"Successfully retrieved all player stacks.")
    return player_stacks

Each route above is decorated with app.get({endpoint_here}, response_model={pydantic_schema}). If you were extending our API with additional methods, such as a POST, PUT, etc., those methods would be decorated with app.post and app.put, respectively. Lastly, this decorator also contains a response_model parameter, used for validation purposes. These response_models should reference one of our pydantic models from above, or a collection of said models.

That should do it for our FastAPI implementation! It is now time to start our application and request some data. Let’s start our FastAPI app from the command line from our project’s root directory via the following command:

$ uvicorn underdog_fastapi.api.main:app --reload

Assuming the API started successfully you should now be able to test it out in your favorite browser, or via a curl command:

$ curl http://127.0.0.1:8000/api/players/1
{
  "first_name": "Christian",
  "last_name": "McCaffrey",
  "adp": 1.1,
  "projected_points": 308.3,
  "team_name": "Carolina Panthers",
  "team_abbreviation": "CAR",
  "id": 1,
  "bye_week": null
}

If you received a similar response then everything should be in working order! Since our app is up and running let’s take a look at one of FastAPI’s cooler features. I previously wrote a post about designing a RESTful API with OpenAPI. OpenAPI docs are very useful, but they are a lot of work to create and maintain. However, FastAPI creates these documents for us! Just navigate to http://127.0.0.1:8000/docs when FastAPI is running and you should see something similar:

FastAPI docs

Before wrapping up let’s write a few simple integration tests using pytest!

Testing our FastAPI Application using pytest

Our API is rather small, consisting of just four separate routes and should be quite easy to test using pytest. FastAPI provides a TestClient, which is actually the TestClient from Starlette, making it a breeze to invoke our own API via test code.

Note: Our Pipfile includes the requests package under its dev dependencies because the TestClient requires it.

Here is a simple set of tests, covering most of our code:

from fastapi.testclient import TestClient

from underdog_fastapi.api.main import app
from underdog_fastapi.underdog.team import TEAM_NAME_TO_ABBREV

client = TestClient(app)


def test_get_player():
    response = client.get("/api/players/1")
    assert response.status_code == 200


def test_get_players_by_team():
    response = client.get("/api/players/teams/ARI")
    assert response.status_code == 200


def test_get_player_stack():
    for team in TEAM_NAME_TO_ABBREV.values():
        response = client.get(f"/api/stacks/{team}")
        assert response.status_code == 200


def test_get_all_player_stacks():
    response = client.get("/api/stacks")
    assert response.status_code == 200

Above, we have a separate test for each route, testing all of our happy paths. Feel free to add additional tests, covering the various exceptions found in each of our API routes.

We can run the tests from the command line with the following command from our project’s root directory:

$ pytest tests

When setting up our project dependencies in our Pipfile we included pytest-cov in our dev dependencies. This pytest plugin allows us to see our tests’ code coverage with the following command:

$ pytest --cov underdog_fastapi/

We should get output similar to this

---------- coverage: platform darwin, python 3.10.0-final-0 -----------
Name                                    Stmts   Miss  Cover
-----------------------------------------------------------
underdog_fastapi/__init__.py                0      0   100%
underdog_fastapi/api/__init__.py            0      0   100%
underdog_fastapi/api/crud.py               25      0   100%
underdog_fastapi/api/main.py               48      8    83%
underdog_fastapi/api/models.py             12      0   100%
underdog_fastapi/api/schemas.py            20      0   100%
underdog_fastapi/constants.py               3      0   100%
underdog_fastapi/database.py               47      3    94%
underdog_fastapi/underdog/__init__.py       0      0   100%
underdog_fastapi/underdog/stacks.py        24      0   100%
underdog_fastapi/underdog/team.py          35      0   100%
-----------------------------------------------------------
TOTAL                                     214     11    95%


Results (1.92s):
       4 passed

Conclusion

That wraps up our introduction to FastAPI! This tutorial also served as a quick look at reading CSV data via Pandas and parsing it into a database using SQLAlchemy. If you’re looking for the source code found in this post it can be found on GitHub. I still love Flask, but FastAPI is a lot of fun to work with. My post does not really do it enough justice, so I suggest plugging into FastAPI’s own documentation.

Doc!

Subscribe

Get updates on new content straight to your inbox! Unsubscribe at anytime.

* indicates required