Why we chose Pants

Tyler Bream
Klaviyo Engineering
19 min read · Jan 2, 2024


We chose Pants for our new Klaviyo backend monorepo. We write RFCs at Klaviyo to help us make technical decisions as Yann Tambouret explained in Always Write Something. Below is a lightly redacted version of the RFC Greg Niemann and I wrote when we chose Pants. I’m sharing it because it’s a good example of how we choose tools at Klaviyo and because it could help other companies in the middle of adopting monorepos.

RFC: Klaviyo Monorepo for Backend (K-Repo)

Klaviyo RFC header that shares the authoring team (Event Pipeline), the authors (Tyler Bream, Greg Niemann), the product spec (Internal — No Product Spec), and any additional internal resources.

Executive Summary

In Quarter 1 2023, the Event Pipeline team experimented with creating a monorepo to house libraries, services and useful scripts. After adding services and libraries and integrating with the Klaviyo app, the monorepo is ready to be opened up to the rest of the company. This RFC will outline the results of the experiment and how the monorepo will function moving forward.

Background

The Event Pipeline team has been targeting k8s as its future deployment platform for over a year. Both past and future projects have involved splitting functionality out of the app repo (our monolith application) and into independent applications. Previously, each of these services has lived in its own repo with its own CI/CD, and team libraries have likewise lived in their own repos, each with their own CI/CD. This has several downsides:

  1. Changes to libraries require updates to many repos that simply bump versions. For example, a change to a shared library requires updating all users of that library (which may not all be known, since they are spread across multiple repos). A monorepo can help with this by (at least) updating all co-located users at once.
  2. Each CI/CD pipeline is a snowflake. Making general changes to CI requires updating many Jenkins files.
  3. Code discoverability suffers from code living in multiple places.
  4. Things such as style rules and linting standards are harder to universally enforce, since the configurations have to be copied across repos.
  5. Ensuring compatibility of requirements can become difficult, since a dependent library does not know what its dependencies' version requirements are, which can cause installation issues.
  6. Python compatibility tests and upgrades have to occur in many places. The monorepo can test multiple versions of python and allow teams flexibility as to when to update their application.

The site monorepo.tools covers many of the downsides of the current poly-repo state of affairs and many of the benefits of a monorepo. It does not, however, compare monorepo build tools to each other.

Build System Requirements

To aid in choosing which monorepo to use, we broke down a set of requirements necessary for the build system to support:

  • Available resources. What public resources are available to aid in getting it set up, and for advanced usage. This includes public documentation and community forums.
  • Protobuf integration. With Klaviyo centralizing around protobuf definitions, we wanted protobuf compatibility built in.
  • Building container images and compatibility of images across environments. The ultimate goal of a code repository is to produce an artifact that can run its code. That artifact should be a container image so we can deploy to k8s. Rather than having to build specialized functionality to support Docker, it should be native to the build system. Additionally, we want requirements and wheels to work consistently across our local machines, build machines, and final container images.
  • Building library artifacts. The team expects that not everything will always exist in the monorepo, or that some code will never be ported over (for instance, the app). However, libraries developed in the monorepo may need to be exported (note that not all libraries will be; some will stay internal to the monorepo). Thus, we would like support for publishing libraries to our artifact store.
  • Integrated test coverage. The ability to measure how much code is covered by tests and to enforce a threshold that must be met to pass.
  • Extensibility. The ability for the build system to be customized and extended for Klaviyo usage: for instance, the ability to add our own rules or commands, or to modify or leverage existing rules.
  • CI integration and caching. Long build times slow down development, and a poor CI experience makes it hard to find out why things are not working. The build system should make working with CI easy and quick. Additionally, caching capabilities should exist to speed up build times and to avoid re-running checks over unchanged code.
  • Requirements management. The ability to manage all requirements for a system and ensure that there are no compatibility issues.
  • Multiple language support. In addition to Python, the team already maintains Java code, and is looking at expanding into using other languages like Go.
  • Linting integrations. The team wants to leverage existing linters to ensure code correctness and enforce coding standards. Ideally one system would handle everything, rather than requiring multiple tools to maintain the repo.
  • IDE Integration. To avoid as much friction as possible, the build system should ideally have integration with the IDE. This will allow for code discovery, auto completion, import discovery, and support running tests. Ideally, it would also contain a language server to make developing BUILD files easier.
  • Handle multiple python environments and versions. When executing Python upgrades or supporting separate requirements files, it is quite useful to be able to build in multiple environments. These environments could use a different set of requirements or a different Python version.
  • Easy to adopt and think about. To enable widespread usage, we want the build system to be easy to work with, easy to reason about, and simple to use, so that teams do not face a steep learning curve when starting with it.

Other MonoRepo Requirements

Outside of what is needed for the build system, we need to consider how to set up and govern the monorepo. Below is a set of requirements considered for the monorepo:

  • Easy to navigate and understand structure
  • How to govern its growth, future and long-term support
  • How to handle rebuilding upstream code when its dependencies (like a downstream internal library) change their requirements.
  • How to handle versioning of libraries, especially when libraries depend on other libraries.

Proposed Changes

Monorepo Name

The team proposes the name k-repo, standing for Klaviyo Monorepo. This would be its own GitHub repository.

Monorepo Structure

The team proposes a structure of directories shown by example below for language code:

python -> base folder for all python projects
├── klaviyo -> top-level package for namespacing
│   └── <domain> -> the domain or area of focus, e.g. event_processing
│       └── <component> -> an individual library or service
│           ├── README.md -> readme for the component
│           ├── client -> code for connecting to the service as a client
│           ├── server -> deployable server
│           ├── proto -> houses protobuf definitions defining communication
│           ├── locust -> houses Locust load testing, if applicable
│           ├── tests -> unit tests
│           └── integration-tests -> integration tests, separated so that Pants tags can control which tests run
java -> base folder for all java projects
go -> base folder for all go projects
schemas -> base folder for the Klaviyo schemas repository (structure here differs and is discussed later)
scripts -> base folder for independent shell scripts

The structure here was chosen to make navigation and source roots easier. The top level is broken down by language, with each root mapping to a specific language. The breakdown within each language folder will vary slightly, but will follow that language's class path / package hierarchy. As we are primarily working in Python, we will explain the Python structure.

We start with a klaviyo top-level package. Underneath that organization level is a sub-package for each domain (for instance, event_processing). Note that the special domain core includes common, shared components which don't fit into other domains. These components tend to include monorepo-specific core functionality that will most likely not be exported as a library outside of the monorepo. Under each domain is a collection of components, typically libraries or services. Within a component, no formal structure is mandated, but we provide a suggested organization (shown in the example above).

Domain was chosen specifically because it provides a descriptive way of finding code, but does not tie the structure to team names. Teams split, or transfer domain ownership to other teams. By choosing domains as the key instead of team names, we avoid having to move components/code or leaving ownership labeled with an out-of-date team name. More importantly, by choosing domains, we make domains a first-class citizen. This aligns with the company's domain decomposition initiative. It makes it clear, when importing functionality, that one is referencing an outside domain, and it makes it easy to build tooling around its usage. Combined with the klaviyo top-level package, it is easy to determine which code is internal versus external.

Notably missing is a deployment directory. Deployment files should not be kept in this repo. The monorepo will build artifacts for deployments, and deployment configuration will live in another location for management (eventually ArgoCD).

This structure underwent a lot of discussion with the ARB before we settled on it. It puts an informative amount of information in import paths without making them so long that they become burdensome.

For instance, distributed tracing comes from the observability domain and the library ktrace, making the import statement from klaviyo.observability.ktrace import start_span.

Besides the language code structure, there is a schema registry. Its structure is slightly different from the language-specific layouts. The structure looks like:

schemas -> base folder for all protobuf definitions
├── java -> Java-specific support
└── klaviyo/schemas/<version> -> houses versioned protobuf definitions
    ├── __init__.py -> enables Python packaging support
    ├── <named_proto>.proto
    └── <named_proto>.proto -> proto files with definitions relevant to their names

The schemas will generate a library in each language where the compiled schemas are used. The structure of the schema registry may evolve from this format as more use is adopted.

In addition to these directories, there will be directories specific to supporting the BUILD system.

3rdparty -> houses third-party dependencies
└── python -> houses dependencies for Python
build_support -> tools for running the build system effectively
├── plugins -> custom rules and modifications to existing rules for the company's build ecosystem; also houses macros
│   └── <plugin> -> code for the plugin
└── python_lint_configs -> Python linter configuration files

Build System Evaluations

Choices

Bazel, Pants, and Please were considered as the monorepo build system. Bazel was ruled out early due to its complexity and known difficulties of use, especially with Python; however, we planned to revisit it if Pants or Please were unable to meet our needs. Both Pants and Please solutions were built out, so the evaluation is based on practical experience and use. While evaluating Please, we relied heavily on the excellent work of Anton Rodionov in please.make, which sped up getting Please up and running and understanding more complex workflows.

Evaluations By Requirement

Available Resources

While Please has documentation and a community to ask questions of, the team found that Pants' documentation and community support were spectacular and better matched how we work. The Pants documentation is very extensive, with many examples. The Pants blogs are very active, with many examples and optimizations. Additionally, the Pants community was quick to answer questions on Slack and was able to patch bugs in Pants within a day of their being discovered. It was harder, on the other hand, to dig through the Please documentation to find specific information.

Protobuf Integration

Both systems have native protobuf support, and please.make showed additional ways to generate protobufs in Please. Both systems were able to meet our needs.

Building Container Images

One of the major challenges in Python build systems is that a generated PEX contains wheels from the build machine, not the target operating system. please.make solved this by bypassing the build system's dependency management and installing requirements directly onto the container image. Natively, Please did have limited support for this, but we found it complex. Pants has a feature that allows us to build inside the target container environment (so the wheels match). This offered better support through the build system's native dependency management and did not require any workarounds. More importantly, it did not require us to list every wheel explicitly for each environment.
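
As an illustration only (this is not our exact configuration, and target and option names vary by Pants version), the Pants environments feature lets a PEX be resolved inside a Docker image matching the deploy target, so the wheels it pulls in match that OS:

# BUILD — a hedged sketch of building inside the target environment
docker_environment(
    name="linux_py39",
    image="python:3.9-slim-bullseye",  # illustrative; match the base image we deploy on
    platform="linux_x86_64",
)

pex_binary(
    name="server-bin",
    entry_point="main.py",
    environment="linux_py39",  # this name must also be registered under [environments-preview.names] in pants.toml
)

docker_image(
    name="server-image",
    dependencies=[":server-bin"],  # the Dockerfile copies the PEX into the final image
)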

Building Artifacts/Distributions

In Pants, publishing artifacts is a built-in capability and well documented. In Please and please.make, building a distribution is supported, but the artifact publishing step would have to be added ourselves.

Test coverage integration

While both Pants and Please have first-class support for running coverage, Please runs coverage as a separate command (with the warning that it could take longer to run than the tests), and please.make did not work with coverage at all. Pants, on the other hand, runs coverage alongside tests and enforces configurable coverage rules. Pants also exports the coverage data and prints it after the test run, making coverage easy to understand.
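
For reference, a hedged sketch of how this can look in practice (option names may differ by Pants version, and the threshold is illustrative):

# Run tests with coverage collected and reported in the same pass:
#   pants test --use-coverage ::
#
# Thresholds can be enforced through the standard coverage.py configuration that Pants picks up,
# e.g. in pyproject.toml:
#   [tool.coverage.report]
#   fail_under = 80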

Extensibility

Both Pants and Please have the ability to define custom rules. please.make uses custom rules entirely to run Please. The Pants plugin system is how all Pants commands are written, which makes it very powerful and capable of solving more complex issues. Pants also allows for macros, which let us define custom rules that enforce or add arguments, making adoption easy. The downside of the Pants engine is a steeper learning curve, and it can be harder to use. However, the more complex capabilities make the Pants plugin environment much more intriguing and useful.
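
For example, a minimal sketch of a macro (the names here are hypothetical; macros live in a prelude file registered via [GLOBAL].build_file_prelude_globs in pants.toml):

# build_support/plugins/macros.py — hypothetical prelude file
def klaviyo_python_tests(**kwargs):
    # Enforce an org-wide default while still letting callers override it.
    kwargs.setdefault("interpreter_constraints", ["==3.9.*"])
    python_tests(**kwargs)

# In a component's BUILD file, teams then write:
#   klaviyo_python_tests(name="tests")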

CI Integration and caching

Pants has dedicated documentation for running in CI. This led to the discovery of multiple ways we could choose to perform checks. Pants can automatically detect what has changed since the last commit and run checks only over that set. Furthermore, Pants can also follow transitive dependencies, so if a library is changed, its usage in test files outside the library is also run. Pants also allows for full builds that rely on a cache to avoid re-running work for anything that has not changed. This can use a local on-disk cache or a remote cache. However, the remote cache must follow the REAPI (Remote Execution API), requiring purchasing a product or standing up a compatible system within our CI infrastructure.
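
A hedged sketch of the kind of CI invocations this enables (flag names differ slightly across Pants versions):

# Lint and test only what changed since the target branch, plus anything that transitively depends on it:
#   pants --changed-since=origin/main lint
#   pants --changed-since=origin/main --changed-dependents=transitive test
# (--changed-dependents was called --changed-dependees in older Pants releases)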

Please did not have the same documentation and required figuring out how to interact with CI ourselves. This led to problems trying to build container images and manage dependency changes. This approach would have required caching, which Please supports with both local and remote caching. The remote caching in Please is very easy to work with, as it allows HTTP caching or implementing a custom cache source.

Requirements Management

In most Python build systems, all requirements and their dependencies must be listed out. Then, for each environment, the wheel and environment pairing must be recorded. This can be involved, and frustrating when adding a new requirement, since all of its dependencies must be listed and the associated wheels included.

In please.make, this is worked around by having a central requirements file. The requirements are then installed directly into the container environment, build machine, or local host where the code will run. This also means that every service and library uses the same set of requirements files. This could lead to bloated container images that include large, unneeded dependencies, but it would speed up container builds because the requirements layer gets cached. Using native Please, all dependencies would have to be listed manually for each application (only the top-level imports; the rest would be auto-included based on each requirement's own requirements).

In Pants, dependencies are auto-detected and do not need to be listed out (except for hidden dependencies). This means the build system can automatically add the dependencies of a service/library and generate the requirements for all applications. Since dependencies are auto-discovered, this still allows for central management and ensures all dependencies run on the same version (and thus are compatible with each other). The auto-detection also allows container images to be built with just the dependencies necessary. Pants creates a lockfile around the requirements (and separates the requirements of tools and of the Pants build system itself from those of the application). Pants also allows for customization of the process: in our build, we customize lockfile generation to use pip-compile so that it produces a readable and easily parsed requirements file. The team still advises adding top-level dependencies to a requirements.in file.
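
A minimal sketch of what this central management can look like, assuming a single default resolve (paths and names are illustrative, not our exact setup):

# 3rdparty/python/BUILD
python_requirements(
    name="reqs",
    source="requirements.txt",  # in our setup, generated from requirements.in via pip-compile
)

# pants.toml (illustrative):
#   [python]
#   enable_resolves = true
#   [python.resolves]
#   python-default = "3rdparty/python/default.lock"
#
# After changing requirements, regenerate the lockfile:
#   pants generate-lockfiles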

Multiple Language support

Both offer Java, Go, and shell support. Pants additionally offers Scala and has Rust and JavaScript support on its roadmap. Please additionally offers C++ support. The Pants framework also allows the team to add support for further languages.

Linting Integration

Please does not have built-in rules for linting code. This would have to be added, or done via another tool like pre-commit.

Pants does have built-in linting, along with auto-formatting. This allows for the use of a single tool to interact with the code. Pants also offers a wide range of linters, and the plugin system makes it easy to add in custom linters.

IDE Integration

Please has a language server for VS Code and PyCharm, which helps with writing Please BUILD files. Since please.make builds a virtual environment, IDE integration is also easy because the environment can be linked directly.

Pants does not yet have built-in IDE support, but it is on the roadmap (the Pants community has reached out to the team about requirements for this integration). Pants can export a virtual environment for the IDE, and source roots can be added manually to get IDE auto-discovery and navigation working. Additionally, the team has been adding workarounds to enable support (for instance, adding a wrapper to set the correct environments so that one can commit from inside the IDE).
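
For instance, a hedged example of the virtualenv export workaround (command flags and output paths vary by Pants version):

# Export a virtualenv the IDE can use as its interpreter:
#   pants export --resolve=python-default
# Then point PyCharm or VS Code at the interpreter under dist/export/python/virtualenvs/<resolve>/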

Handle Multiple Python Environments and Versions

Pants has built-in support for discovering Python versions from pyenv. Additionally, Pants supports parametrizing targets to run tests on multiple versions, which lets services/libraries choose which versions to test for compatibility. Pants also exposes interpreter constraints, allowing a specific version of Python to be used for local development, linting, and tests. This allows teams to upgrade their services individually instead of updating the entire repo at once.
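
A minimal sketch of parametrized interpreter constraints in a BUILD file (the versions shown are illustrative):

python_tests(
    name="tests",
    interpreter_constraints=parametrize(
        py39=["CPython==3.9.*"],
        py311=["CPython==3.11.*"],
    ),
)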

This was not tested with Please, but there was no built-in support for it. We believe it would have to be accomplished by creating multiple rules per test, one for each supported version. In please.make, this would have been easy to do since we used custom wrappers around commands anyway.

Easy to Adopt and Think About

With the available tooling and the ability to generate rules, we thought both systems would be easy for teams to adopt. Most adoption would just be learning which commands to run and how to use the macros.

The next consideration was how easy each system would be to maintain and expand as new requirements and changes were brought into the repo. Pants is slightly favored here, since its tooling was superior, offered more flexibility and enforcement, and its community support in solving problems was spectacular.

Build System Choice

Pants was ultimately chosen after building out both systems. Pants was easier to work with, more flexible and thus more powerful, and involved less friction in everyday use. Additionally, the Pants puns were more fun.

Artifact Tagging Philosophy

Whenever a library changes, the team wants everything that depends on it to get rebuilt. So, if a core library changes, everything that depends on it should get rebuilt, including all dependent libraries and all container images. However, when developing a library, having to bump a version everywhere it is used causes issues. It requires knowing every location and all dependencies (though the Pants build system can show this for you). Additionally, it can cause build version discrepancies when two competing change sets are merged close together, or it can block merges, requiring more changes and another CI run. Below are the solutions the team thought would cause the least disruption while still providing enough information to be useful.

Container Images

Services will still be versioned by their owners as they see appropriate. This allows the semver version to be changed when teams feel it should be. This can be done via a version.py file in the server code that gets auto-discovered. A Pants plugin will discover this version and add it as a tag. Additionally, the plugin will add a suffix to the version to uniquely identify the code being built. During CI, this will be the git commit. Locally, it will be the git commit if the branch is not dirty; if the branch is dirty, it will be the hash of the docker image digest. This approach makes every commit on the master branch generate a unique version and makes it easy to find what caused a given build. It allows dependencies to change without having to update a bunch of versions.
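
A hypothetical sketch of how such a tag could be derived (this is not the actual plugin code):

import subprocess

def image_tag(base_version: str) -> str:
    # Suffix the semver version from version.py with the git commit; the real plugin falls
    # back to the docker image digest hash when the working tree is dirty.
    commit = subprocess.run(["git", "rev-parse", "--short", "HEAD"],
                            capture_output=True, text=True, check=True).stdout.strip()
    return f"{base_version}-{commit}"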

Python Artifacts

Python libraries follow a similar convention. A version file exists and is linked to the library's build target. A plugin reads this version and adds it to the artifact. In addition, a build number, calculated from the length of the commit history, is appended to all Python libraries. This was chosen because it preserves semver conventions while still supporting PEP-508 usage.
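
A hypothetical sketch of the build-number calculation (the exact version format and separator are illustrative):

import subprocess

def library_version(base_version: str) -> str:
    # Build number = length of the commit history, appended to the semver base version.
    build_number = subprocess.run(["git", "rev-list", "--count", "HEAD"],
                                  capture_output=True, text=True, check=True).stdout.strip()
    return f"{base_version}.{build_number}"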

However, because the Pants plugin generates the build number for all libraries, if an upstream library changes and a downstream library does not, the upstream library will depend on a version of the downstream library that was never published. So, instead of publishing only the libraries that changed, we will publish libraries with that build number all the way down the dependency chain.

Java Klaviyo Schema Generation

Similar to the Python artifacts, the Java schema generation uses the same method for calculating a build number and appending it to the version.

Future Language Artifacts

The team feels strongly that all builds should follow the model laid out in the Python artifacts approach: a build number should be appended to the version.

Notifications

After a library or container image is created, there should be a system in place to notify teams when a new version of their application/library is available. CODEOWNERS can be used to map the artifact to the owning team and send a Slack message to inform them. The Slack message should be sent either to the team's own Slack channel or via a central Slack channel that notifies the team; the former approach is preferred.
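
A hypothetical sketch of that flow (the webhook, matching logic, and message format are all illustrative; real CODEOWNERS matching follows GitHub's gitignore-style rules):

import requests

def notify_owner(changed_path: str, artifact: str, version: str, webhook_url: str) -> None:
    owner = None
    with open("CODEOWNERS") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            pattern, *owners = line.split()
            if changed_path.startswith(pattern.lstrip("/")):  # naive prefix match, for illustration only
                owner = owners[0]  # later matches win, mirroring GitHub's behavior
    if owner:
        requests.post(webhook_url, json={"text": f"{owner}: {artifact} {version} was published"})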

Strong Linting Rules

The monorepo uses stronger linting rules than we have previously used in our other repos. The team feels this level of adherence should be in place for all projects in the monorepo and should not be removed.

Linting is useful to force clean, consistent, and easy-to-read code. Linting and formatting help maintain code quality by enforcing rules around syntax, style, and best practices. Linting can catch errors early, make the code easier to understand, and reduce the time spent on code review.

All languages should be linted and formatted to enforce coding standards. Linters should help enforce the coding standards, while formatters should keep code readable and consistent. Linters should aim to perform static analysis to help detect logic or code errors, and stronger checking should be enforced. In Python, code should be typed to increase its readability and detect incorrect usage. Linters should enforce that code is properly documented, including classes, public functions, and modules/packages. Linting should cover everything in the repo: protobuf definitions, Dockerfiles, READMEs (not yet supported), and the coding languages themselves.
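
As an illustration of how this is wired up in Pants, a hedged sketch of backend selection (the exact backend list is a choice and may change over time):

# pants.toml — illustrative backend selection for linting, formatting, and type checking:
#   [GLOBAL]
#   backend_packages = [
#     "pants.backend.python.lint.black",
#     "pants.backend.python.lint.flake8",
#     "pants.backend.python.lint.isort",
#     "pants.backend.python.lint.pylint",
#     "pants.backend.python.typecheck.mypy",
#     "pants.backend.docker.lint.hadolint",
#     "pants.backend.codegen.protobuf.lint.buf",
#   ]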

The specific linters in use may change, but the general enforcement of them will not. For instance, pylint is currently used to enforce things like documenting functions and performs more rigid static analysis, such as checking variable references. It could be replaced with a similar linter, like Ruff, if that performs better.

Linters should not be disabled, except in temporary circumstances like migrating code into the monorepo or rolling out/disabling a linting rule. Tests may use a subset of the linters, like disabling type checking, but should not disable formatting and general linting.

Governance and Ownership

Ownership is broken down into two categories: implementation and standards. Implementation is defined as the work to get the build system running and adhering to the standards. Standards regard the philosophy of how the monorepo should run.

Standards include the following:

  • Monorepo structure
  • Artifact Tagging Philosophy
  • Strong Linting Rules (This should not be the exact rules or tools, but the general guidance)

Implementation includes ensuring that the build system functions properly, that the CI/CD setup works, maintaining the plugins, upgrading the build system, and implementing tools for the linting.

Governance should be controlled centrally by the ARB. Implementation should be owned by the Velocity team, with ownership transferring from the Event Pipeline team to the Velocity team in Q3 2023.

Alternatives Considered

Structure

Many alternatives were considered for the structure:

  • Using klaviyo vs not using klaviyo. We liked using it to distinguish internally sourced dependencies from externally sourced ones.
  • Including separate directories for services/commands/libraries. This would make navigation easier if there were a lot of services, but it makes the names significantly longer. In the end, it does not help navigation enough to be useful.
  • Having core code at the top level and domains under a separate directory. This made the names too long. In the end, core ends up just being another domain in the monorepo, and the separation did not aid navigation.
  • Using team names, pillars, or an area to associate code. But given that teams split and ownership moves, this seemed more likely to require moving things around. It also does not necessarily aid in understanding the code, since a team can own widely differing services and libraries. Domains make it easier to understand interactions across domain boundaries.

Tagging Philosophy

Originally, no additional tags were included on container builds and Python distributions. Had this approach continued, it would have increased the version-bumping burden described above.

Build Systems

As noted in the evaluations, Bazel and Please were also considered.

Linting

Looser linting rules, or laxer restrictions, were considered. However, having only parts of the monorepo enforced and others not makes it harder for developers as they move across domain boundaries (if they change teams, or embed in another team, and have to change coding standards). Additionally, we found that bugs were being introduced in other repos where linters could have caught the problems earlier. Finally, many functions and modules were untyped and undocumented, which made the code much harder to understand. The team feels developers should ensure their code can be maintained by someone other than themselves, and that documentation and typing really help evolve understanding of the code.

Scalability

When using a build system, scalability matters in terms of how fast things can still build, how large caches need to be, and how to manage container build times when a core component changes and everything needs to be rebuilt.

Pants is designed to handle a large number of build targets. By rebuilding only what changes, it limits the vast majority of builds to a small subset of targets, keeping performance fast. Additionally, with remote caching, anything that has not changed does not get rebuilt, keeping build times short.

Pants has blog posts on how to optimize building container images to make the most of caching. Re-using base layers that define most of the requirements increases cache hits and helps reduce build times.

Rollout and Monitoring

Build failures and successes are reported with a message in Slack. Developers can also leverage GitHub to send notifications when their builds fail.

A conversion document exists within the monorepo to help guide developers in moving code into the monorepo from outside sources.

A CODEOWNERS file is used to map directories and files to an owning team, which maintains ownership of everything (*) under its path. This code ownership check gates changes entering the monorepo behind an approval, allowing us to verify that the structure is being followed. A linter may be added to enforce elements of the structure.
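
For illustration, a hypothetical CODEOWNERS fragment mapping directories to owning teams (the paths and team handles are made up):

# Fallback owner for everything not matched below:
*                                        @klaviyo/velocity
# Domain directories map to their owning teams:
/python/klaviyo/event_processing/        @klaviyo/event-pipeline
/python/klaviyo/observability/ktrace/    @klaviyo/observability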

Risks and Failure Modes

To help mitigate risk, CI runs Pants via a pre-built container image. The container image pre-installs the Pants PEX to avoid downloading it on every run and hitting errors communicating over the internet. The container image should also contain everything needed to run all Pants commands (all language installs, container tooling, and all requirements). This ensures the container image is always in a good state. The Jenkinsfile should reference the image by a specific tag or SHA instead of an unpinned version (in case a new version was pushed with untested changes).

[Remainder of RFC excluded.]
