PaveArrow: A tmux-based CI/CD Architecture

Context: This was a startup idea I worked on alone in 2021 and ultimately gave up on before shipping a functional product. The marketing page is still up at pavearrow.com. I let the scope grow and grow without finishing anything: I got the sandboxing working and some of the OpenResty routing, then gave up at migrating container data between machines once I realized I was basically reimplementing Kubernetes.

The Efficiency Question

What makes CI/CD so expensive? Every build starts by cloning repositories, downloading dependencies, and setting up environments - work that gets thrown away minutes later. What if we kept everything in place?

PaveArrow was an attempt to answer these questions by rethinking CI/CD from first principles. The technical architecture that emerged was unlike anything else in the space.

Core Architecture

At its heart, PaveArrow was built around an unusual principle: every CI job was literally a tmux session. Not a container running commands, not a script executor. An actual tmux server that automation controlled by sending keystrokes.

This design choice cascaded through the entire system architecture, as the sections below show.

Execution Model

When a build job started, the system would:

  1. Create a new Linux namespace (using bubblewrap for unprivileged containers)
  2. Start a tmux server inside the namespace
  3. Send keystrokes to tmux to execute the build commands
  4. Keep the tmux server alive for 1 hour if the build failed

The beauty of this approach was that debugging used the exact same interface as automation. When you connected via SSH or the web shell (using xterm.js), you weren't connecting to a "debug environment"; you were attaching to the same tmux session that ran your build.
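
Concretely, the orchestration boiled down to driving tmux over its socket. A minimal sketch, with the socket path, session name, and build command chosen for illustration (in the real system the tmux server ran inside the bubblewrap namespace):

# Start a detached tmux session that will host the build
SOCK="$workspace/tmux-tmpdir/sock"
tmux -S "$SOCK" new-session -d -s build

# The orchestrator "types" the build into the session
tmux -S "$SOCK" send-keys -t build 'npm ci && npm test; echo "exit=$?"' Enter

# Debugging uses the exact same interface: attach to the session that ran the build
tmux -S "$SOCK" attach -t build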

The Three-User Security Model

Just a tiny note on naming: the system used three Linux users with different trust levels. Alice was the trusted orchestrator on the host, Bob owned the small setuid helpers that moved data out of the container, and Carol was the untrusted user that build code ran as inside the container.

This separation was fundamental. Carol could write whatever she wanted inside the container, but only Bob could copy files out to the host. Alice never executed anything Carol created. This created defense in depth: even if Carol found a bug, she'd need to compromise Bob to affect the host, and then compromise Alice to affect the system.

And Alice was watching the syscalls that Bob made via eBPF and would terminate the process if she saw anything suspicious. This was possible because I wrote all the Bob programs myself to be deliberately minimal in their use of system calls. So if a compromised Bob program started allocating more memory, calling ioctls, manipulating kernel resources, and so on, I knew that was behavior Bob would never normally exhibit.
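
For a flavor of what that monitoring looks like (the real monitor also enforced termination), a bpftrace one-liner can tally the syscall mix of a Bob helper such as publish:

# Count syscalls by number for the 'publish' helper; anything outside the
# small expected set would be a red flag
bpftrace -e 'tracepoint:raw_syscalls:sys_enter /comm == "publish"/ { @calls[args->id] = count(); }'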

Mutable Workspace Philosophy

Unlike traditional CI/CD that gives you a fresh environment every time, PaveArrow gave you a persistent, mutable workspace.

Instead of the platform deciding when to purge caches, developers wrote their own logic:

# Check if dependency already downloaded
if [ ! -f "node_modules/.cache-marker" ]; then
    npm install
    touch node_modules/.cache-marker
fi

This philosophy extended the "you own your build" principle: if you wanted fast builds, you optimized them yourself. If you wanted clean builds, you wrote rm -rf yourself. The platform just gave you a persistent filesystem and got out of your way.

But this wasn't the total wild west. The base image (rootfs) would be reset every build, and the source code would be reset by git checkout. So it really came down to what other files in the checkout directory you left around. Did you leave the node_modules? Fine. Did you leave the cargo target directory? Fine.

The system gave you a clean slate for the things that mattered (base image and source code) while letting you decide what build artifacts to preserve between runs.
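
A rough sketch of what that per-build source reset might look like, assuming a known checkout directory and commit (the exclude list is just an example):

# Reset the source tree to the commit under test...
git -C "$checkout" fetch origin
git -C "$checkout" checkout --force "$commit"

# ...but keep whatever caches the last build left behind
git -C "$checkout" clean -fd -e node_modules -e target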

Mutable workspaces were just an extra feature built on top of the immutable system: builds could be configured to reset completely every time, or you could turn that off and keep the workspace between runs.

Storage Architecture

The filesystem design used OverlayFS to create isolated container environments:

/home/pavearrow/pages/data/
├── +rootfs+              # Container base images (per-customer)
│   ├── acme-corp/        # ACME Corp's custom image
│   └── jane-doe/         # Jane's Ubuntu 20.04 image
├── +workspace+           # Active build environments
│   └── author/project/
│       ├── +gitdir+      # Bare git repository
│       │   └── worktrees/
│       │       └── branch-name/
│       └── workNgHHU5bKF/  # Container root filesystem
│           ├── +checkout+   # Git worktree
│           │   ├── package.json
│           │   ├── src/
│           │   └── ...
│           ├── slirp.sock   # Network namespace socket
│           └── tmux-tmpdir/ # tmux socket directory

Each customer got their own base image in +rootfs+. When a customer uploaded a Docker image, PaveArrow would unpack it into an immutable filesystem tree stored only on their assigned machine.

Each container was then assembled with OverlayFS: the customer's immutable base image served as the read-only lower layer, and every build got its own writable upper layer. Every container started from the same base image without being able to modify it or interfere with other builds.
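
Conceptually, assembling one container looked something like this (the upper/work/merged names are illustrative, and the real system did the equivalent inside unprivileged namespaces rather than with a bare mount):

# Read-only base image shared by every build for this customer
base=/home/pavearrow/pages/data/+rootfs+/acme-corp

# Per-build writable layer and mount point
work=/home/pavearrow/pages/data/+workspace+/author/project/workNgHHU5bKF
mkdir -p "$work/upper" "$work/workdir" "$work/merged"

mount -t overlay overlay \
  -o "lowerdir=$base,upperdir=$work/upper,workdir=$work/workdir" \
  "$work/merged"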

Inside each container, the three-user permission model was enforced through file ownership and setuid helper binaries.

To publish a website, your build script would invoke a setuid binary owned by bob:

# Inside your build script (running as 'user')
generate_site > www/index.html
publish www/  # This calls a 'bob'-owned binary

The publish command would (see the sketch after this list):

  1. Validate the input files
  2. Copy them into the content-addressable blobstore
  3. Make them immutable (removing write permissions)
  4. Update OpenResty's routing table
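
A rough sketch of the blobstore and routing steps, with the store path and Redis key shape invented for illustration:

blobstore=/home/pavearrow/pages/data/blobstore   # hypothetical location

find www/ -type f | while read -r f; do
  hash=$(sha256sum "$f" | cut -d' ' -f1)
  install -m 0444 "$f" "$blobstore/$hash"        # copy in and drop write bits
  redis-cli hset "routes:author/project" "/${f#www/}" "$hash"
done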

Network Isolation

I modified bubblewrap to add proper network namespace support. Each container got its own network namespace, with outbound connectivity provided by user-mode networking (hence the slirp.sock in each workspace).
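
With slirp4netns, for example, the wiring looks roughly like this (paths and the child PID are placeholders):

# Give the unprivileged namespace outbound networking via a user-mode TAP device;
# $child_pid is the sandboxed process, tap0 is created inside its namespace
slirp4netns --configure --mtu=65520 \
  --api-socket "$workspace/slirp.sock" \
  "$child_pid" tap0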

Git Repository Management

Git repositories were stored as bare repos, with a git worktree checked out into each build's container (the +gitdir+ and +checkout+ directories in the tree above).

The content-addressable store also held common dependencies (like compilers or libraries) that were hardlinked into each build environment. With fs.protected_hardlinks=1, users couldn't modify these shared files, but they took up no extra disk space. The hardlink-based deduplication meant even large transfers were efficient - you only moved each unique block once.
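
Sketched in shell, with the upstream URL, branch name, and toolchain paths as placeholders:

# One bare repo per project; each build gets a worktree checked out into its container
git clone --bare "$upstream_url" "$gitdir"
git -C "$gitdir" worktree add "$container/+checkout+" "$branch"

# Shared toolchains come out of the content-addressable store as hardlinks, not copies
cp -al "$blobstore/toolchains/gcc-12" "$container/opt/gcc-12"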

Database-Triggered Builds

The Pages service implemented an interesting model: builds were triggered by database changes. The flow was:

  1. Developer modifies SQLite database via Thoth API
  2. Pages detects the database version change
  3. Pages triggers a build with the new database version
  4. Build generates static files into www/ directory
  5. Files get stored in content-addressable storage
  6. OpenResty serves the files

This created a primitive CMS where your "content" lived in SQLite and your "theme" was whatever code generated HTML from that database.
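
For illustration, suppose the version check used SQLite's user_version pragma (the details differed; the real trigger path went through the Thoth API):

# Rebuild only when the content database's version has moved
current=$(sqlite3 "$db" 'PRAGMA user_version;')
if [ "$current" != "$(cat last_built_version 2>/dev/null)" ]; then
  trigger_build "$current"            # hypothetical helper that enqueues a Pages build
  echo "$current" > last_built_version
fi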

Security Model

Instead of heavyweight virtualization, the system relied on Linux namespaces (via bubblewrap), the three-user permission model, and the eBPF syscall monitoring described earlier.

Files had careful ownership matching the three-user model.
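
Roughly, with the exact paths invented for illustration:

# The publish helper is owned by bob and setuid, so 'user' can invoke it but it acts as bob
chown bob:bob /usr/local/bin/publish
chmod 4755 /usr/local/bin/publish

# The blobstore is bob's: readable by everyone, writable by nobody else
chown -R bob:bob /home/pavearrow/pages/data/blobstore
chmod -R a+rX,go-w /home/pavearrow/pages/data/blobstore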

The assumption was that hobby developers had a lower threat model than enterprises. Dedicated machines would be used for any customers who needed stronger isolation.

Migration and Scaling

Each customer was assigned to exactly one machine. This wasn't a limitation; it was a deliberate design choice that simplified everything from routing to storage to migration.

When a project outgrew its current machine, PaveArrow would:

  1. Use BitTorrent to replicate the content-addressable store to the new machine
  2. Start routing new builds to the new machine
  3. Let existing builds complete on the old machine
  4. Sync any new artifacts back
  5. Update routing tables in Redis

Since git repos and content stores were immutable and append-only, this migration could happen without downtime. The BitTorrent replication meant you only transferred each unique block once, even if multiple projects needed it.
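
The last step was just a key update in the global Redis that the OpenResty ingress servers consulted; something like this, with the key layout invented for illustration:

# Point the ingress at the new build machine for this repository
redis-cli hset routing:repos "author/project" "build-host-2.internal"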

Failure Recovery

The outage strategy was refreshingly simple: restore from a stale filesystem backup to a new server, restart processes, and update the global Redis that stored routing for OpenResty ingress servers.

This approach came from my experience at OneSignal, where we ran all servers in one datacenter for 11 years. The lesson: some delay during outages is acceptable at a business level. Perfect availability requires massive complexity that hobby projects don't need. A 30-minute recovery time for a $10/year customer is reasonable. The same outage for a $10,000/month customer is not.

The Temporal Connection

Build orchestration used Temporal (via temporalite) for workflow management, which kept the long-running build workflows durable: scheduling, retries, and timeouts survived process restarts.

Why This Architecture?

Every design decision optimized for one goal: pack as many hobby projects as possible onto cheap hardware while maintaining a good developer experience.

Using tmux meant developers could debug exactly what went wrong. Hardlinks meant shared dependencies used no extra disk space. Git worktrees meant multiple builds could run without copying repositories. Linux namespaces meant isolation without virtualization overhead.

The most interesting aspect was how the constraints (cheap hardware, many users) drove architectural decisions that ended up creating novel developer experiences. Being able to fix and commit code from inside a failed build isn't something mainstream CI/CD systems offer; it fell naturally out of the tmux-based architecture.

The system was simultaneously over-engineered (custom orchestration, content-addressable storage, BitTorrent replication) and under-engineered (basic Linux sandboxing, permissive security model). But each decision made sense given the target market of hobby developers paying $10/year.

The Economics of Bin Packing

To validate whether this architecture could work economically, I scraped GitHub for all public repositories with .travis.yml and .circleci/config.yml configurations. Then I:

  1. Downloaded a sample of these repositories
  2. Measured their disk space requirements
  3. Analyzed their commit patterns to estimate build frequency
  4. Calculated resource usage based on their CI configurations

The results were striking: most projects were tiny. The median repository was under 50MB, builds ran a few times per week, and took minutes not hours. My calculations showed that a single 16-core machine with 32GB RAM and 1TB SSD could handle 150-1000 of these projects.

At $10/year per developer and 150 developers, that's $1,500/year revenue on a machine costing $150/year. The 10x margin left room for support, development, and profit.

Scaling the Security Model

The security model was designed to scale with customer size; the economics worked out such that real business customers would be single tenants on their own machine.

Hobby tier ($10/year): Linux namespaces and eBPF monitoring for syscall anomalies. The reasoning was simple: hobby developers building personal projects don't have nation-state adversaries trying to steal their code. If someone's building a weekend project, they care more about convenience than Fort Knox security. Monitoring for the unusual syscalls used in container escapes (like ptrace or mount in weird contexts) catches the obvious attacks without the overhead of full virtualization.

Business tier ($150/year): Get an entire dedicated VM. This wasn't a compromise: modern cloud VMs are cheap enough that you can give each business customer their own machine. The smallest EC2 instances cost less than what any real business pays for CI/CD.

The architecture was deliberately designed around a single machine per repository.

Fault tolerance came from filesystem snapshots only. If a machine died, you'd restore from backup to a new host: about 30 minutes of downtime. My experience at OneSignal taught me that actual hardware failures requiring this recovery happen years apart when you rent quality servers from a real datacenter. For $10/year customers, 30 minutes of downtime every few years is acceptable. For business customers on dedicated machines, they could implement their own redundancy if needed.

Hot standby was totally possible if customers wanted to pay for it: replicate the filesystem and voila! Downtime reduced to an OpenResty reload. From experience, you can hold HTTP requests in memory during a switchover if you keep it under a couple of seconds. But multi-region disaster recovery? That's complexity nobody was actually using; I saw plenty of contract negotiations where that seemingly hard requirement got redlined away. One machine per git repository keeps things simple.

Parallel Builds Without Distributed Complexity

If you needed to parallelize builds across multiple machines, the design stayed simple:

  1. Git mirror from the single upstream source to follower machines
  2. Each follower has its own local git repository
  3. Dispatch build commands to specific followers
  4. Accept some replication delay (still faster than competitors that check out over the network every time)

This way, all machines running builds for a repository stayed identical, and the git replication delay was hidden from users. Unlike traditional CI/CD, where you wait for a network checkout every time, with PaveArrow the code was already there when your build process started; the checkout step was built in.
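
A sketch of the follower-side replication, with the mirror path as a placeholder:

# Once per follower: keep a full local mirror of the upstream repo
git clone --mirror "$upstream_url" /srv/mirrors/project.git

# Before each dispatched build: pull whatever the upstream has accumulated
git -C /srv/mirrors/project.git remote update --prune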

What Actually Happened

In the end, the technical architecture was sound for its intended use case. The project failed not because the technology didn't work, but because the scope expanded beyond what one person could reasonably build and maintain. The core insight that CI/CD is inefficient because it throws away state remains valid. But building a complete platform to capitalize on that insight proved larger than expected.
