Ordinateur

Feeling Some Type Of Way

March 20, 2025

Why do I care about type safety? There is plenty of literature about this, but I want to look back on two specific experiences I’ve had with type systems.

The Bad

Early in my career, I hit a sharp corner with Python Pandas that permanently changed how I think of dynamic vs. static typing.


import pandas as pd
from typing import Union

def to_series(data: Union[pd.Series, pd.DataFrame]):
	return data.squeeze()

What type does the above function return? The pandas documentation for pandas.Series.squeeze states the following:

This method is most useful when you don’t know if your object is a Series or DataFrame, but you do know it has just a single column. In that case you can safely call squeeze to ensure you have a Series.

Some of the details have faded with time, but I was working on a project that ran digital marketing campaigns on a website. We wanted to combine results across many days of traffic, and we had a function more or less like this:


import pandas as pd
import functools
from typing import Union, List

def sum_conversions_per_campaign(conversions: List[Union[pd.Series, pd.DataFrame]]) -> pd.Series:
	conversion_series = [df.squeeze() for df in conversions]
	return functools.reduce(lambda x, y: x.add(y, fill_value=0), conversion_series)

One day, we got a notification that our application had crashed. The logs held a surprise: AttributeError: 'numpy.int64' object has no attribute 'add'. This was difficult to reproduce, but eventually we learned that if a pd.Series has only a single element, squeeze returns that value as a scalar instead of a pd.Series.


monday = pd.Series([10], index=['blue-button'])
tuesday = pd.Series([12, 15, 7], index=['blue-button', 'cat-image', 'modal-window'])
wednesday = pd.Series([18, 6, 8], index=['blue-button', 'cat-image', 'modal-window'])

sum_conversions_per_campaign([tuesday, wednesday]) # Returns [30, 21, 15]
sum_conversions_per_campaign([monday, tuesday]) # Raises an AttributeError

Of course, this return type behavior is in the documentation (Returns: DataFrame, Series, or scalar), but with a dynamically typed language, we're relying on best intentions to write this code correctly rather than on compile-time guarantees. Python has gotten better about this over time with type hints, mypy, and Pydantic, but a statically typed language would have surfaced the scalar case at compile time and prevented this bug altogether, without any bells and whistles.
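To make that last claim a little more concrete, here is a minimal TypeScript sketch of the same situation. Everything in it is a hypothetical stand-in written for illustration: Series, squeeze, add, and sumConversions are not pandas APIs, and the fallback chosen for the scalar case is just one reasonable option. The point is only that once squeeze declares a number | Series return type, the compiler refuses to let the scalar case slip through unhandled.

```typescript
// Hypothetical stand-in for a labelled pandas Series (not a real library type).
interface Series {
    readonly values: Readonly<Record<string, number>>;
}

// Mimics the behaviour described above: a one-element series collapses
// to a bare number, and the return type has to admit it.
function squeeze(s: Series): number | Series {
    const keys = Object.keys(s.values);
    if (keys.length === 1) {
        return s.values[keys[0]];
    }
    return s;
}

// Label-aligned addition; a label missing on one side counts as 0.
function add(x: Series, y: Series): Series {
    const out: Record<string, number> = { ...x.values };
    for (const key of Object.keys(y.values)) {
        out[key] = (out[key] ?? 0) + y.values[key];
    }
    return { values: out };
}

function sumConversions(days: Series[]): Series {
    let total: Series = { values: {} };
    for (const day of days) {
        const squeezed = squeeze(day);
        // add(total, squeezed) does not compile here:
        // 'number | Series' is not assignable to 'Series'.
        // Narrowing forces an explicit decision for the scalar case;
        // this sketch falls back to the original, still-labelled series.
        total = add(total, typeof squeezed === "number" ? day : squeezed);
    }
    return total;
}

const monday: Series = { values: { "blue-button": 10 } };
const tuesday: Series = {
    values: { "blue-button": 12, "cat-image": 15, "modal-window": 7 },
};

// Totals: blue-button 22, cat-image 15, modal-window 7.
const total = sumConversions([monday, tuesday]);
console.log(total.values);
```

The single-element Monday that crashed the Python version is handled here because the type checker would not accept the code until it was.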


The Good

Fast forward a few years, and I was on a team writing secure-by-default wrappers around our infrastructure as code (IaC). It was my first time writing TypeScript, but I immediately saw how useful the type system was compared to what I had dealt with in Python. For example, if you wanted to eliminate wildcards from your IAM policies, you could do something like this:


import { Construct } from 'constructs';
import * as iam from 'aws-cdk-lib/aws-iam';

export class LeastPrivilegedPolicy extends iam.Policy {
    constructor(scope: Construct, id: string, props: iam.PolicyProps) {
        super(scope, id, props);
        // Reject any ALLOW statement that targets the wildcard resource.
        props.statements?.forEach(statement => {
            if (statement.effect === iam.Effect.ALLOW && statement.resources.includes("*")) {
                throw new Error("Least privileged policy cannot ALLOW action for resource *");
            }
        });
    }
}

Similarly, you may want to enforce that all of the S3 buckets you create are encrypted by a key managed by your application. For this, you could use a union type:


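import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

export interface EncryptedBucketProps extends s3.BucketProps {
    // Narrows the optional s3.BucketEncryption to two required members.
    readonly encryption: s3.BucketEncryption.KMS | s3.BucketEncryption.DSSE;
}

export class EncryptedBucket extends s3.Bucket {
    constructor(scope: Construct, id: string, props: EncryptedBucketProps) {... }
}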
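The narrowing trick above can be tried without the CDK installed. The sketch below is a self-contained approximation, not the real library: Encryption, BucketProps, and describeBucket are hypothetical names, and string literal types stand in for the s3.BucketEncryption enum members. It shows the two things the union buys you: the property becomes required, and only the listed members are accepted.

```typescript
// Hypothetical stand-in for s3.BucketEncryption, modelled as string literals.
type Encryption = "UNENCRYPTED" | "S3_MANAGED" | "KMS" | "DSSE";

// Hypothetical stand-in for s3.BucketProps: encryption is optional and wide.
interface BucketProps {
    readonly bucketName?: string;
    readonly encryption?: Encryption;
}

// The subtype makes encryption required and limits it to two members.
interface EncryptedBucketProps extends BucketProps {
    readonly encryption: "KMS" | "DSSE";
}

function describeBucket(props: EncryptedBucketProps): string {
    return `bucket encrypted with ${props.encryption}`;
}

// Accepted: "KMS" is one of the two allowed members.
const description = describeBucket({ encryption: "KMS" });
console.log(description);

// Both of these are rejected by the compiler if uncommented:
// describeBucket({ encryption: "S3_MANAGED" }); // not in the union
// describeBucket({ bucketName: "logs" });       // encryption is missing
```

Nothing is checked at runtime here; the mistake never makes it past the type checker, which is the whole appeal.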