| author | James O'Doherty <james@theodohertyfamily.com> | 2026-06-14 01:28:47 -0400 |
|---|---|---|
| committer | James O'Doherty <james@theodohertyfamily.com> | 2026-06-14 01:28:47 -0400 |
| commit | 04349c45dbf2b04ee89c6c99ac90152daa799097 | |
| tree | 3957ee3eb07bbeea07642cf65f8fbe082f4e33ed /CASE_STUDY.md | |
| parent | 1b23c0acbb6e4f45ce4c4e95ad295218d769f709 | |
Diffstat (limited to 'CASE_STUDY.md')
| -rw-r--r-- | CASE_STUDY.md | 79 |
1 files changed, 79 insertions, 0 deletions
diff --git a/CASE_STUDY.md b/CASE_STUDY.md
new file mode 100644
index 0000000..9930fe8
--- /dev/null
+++ b/CASE_STUDY.md
@@ -0,0 +1,79 @@
+# Case Study: Building `wg-wrap` with an AI Agent
+
+## 1. Introduction
+`wg-wrap` is a tool for Linux that lets you run specific programs over a WireGuard VPN without needing root privileges or changing system-wide network settings.
+
+This project serves as a case study in how a domain expert can use an AI agent to accelerate development. The project was created by a **Principal Systems & Security Architect** (an expert in Linux kernel internals, C/Go interop, and security hardening) using the **`pi` agentic harness** and the **Gemma 4 31B** reasoning model.
+
+---
+
+## 2. How the AI was Managed: The Rulebook
+To prevent the AI from making common mistakes or writing fragile code, the project used a file called `AGENTS.md`. This served as a set of strict instructions the AI had to follow.
+
+### Key Rules That Worked:
+- **Clear Tool Use**: The rules explained exactly how the AI should read and edit files to avoid mistakes.
+- **Stop and Pivot**: If a command failed, the AI was told to stop and try a different approach rather than repeating the same error.
+- **Isolated Testing**: The AI had to use temporary folders for tests so it wouldn't interfere with the actual system.
+
+---
+
+## 3. Development Strategy
+The project progressed rapidly because the creator applied a specific strategic sequence to reduce technical risk:
+
+1. **Solving the Hardest Parts First**: The first priority was "Technical Derisking." The focus was on the C launcher and Linux namespaces. Because this was the most difficult part, it was solved before any other features were built.
+2. **Steel Thread Proof of Concept**: Once the hard part was solved, the goal was to create a "steel thread": the shortest possible path to a working end-to-end demonstration.
+3. **Security Hardening**: After the tool worked, the focus shifted to security. This included adding fuzzing to find crashes and preventing DNS leaks.
+4. **Ergonomics and Usability**: The final phase focused on the user experience, such as creating a standalone binary, removing external dependencies, and refining the CLI.
+
+---
+
+## 4. Timeline Comparison
+The project was built in a few concentrated bursts over about three weeks (May 22 – June 13, 2026), totaling approximately **25–30 active hours**.
+
+### Effort Comparison
+Comparing the total hours spent shows the magnitude of the difference in productivity.
+
+| Scenario | Estimated Total Hours | Comparison to Actual |
+| :--- | :--- | :--- |
+| **Expert + Agent (Actual)** | **25 - 30 Hours** | **1x** |
+| **Expert by Hand** | **80 - 160 Hours** | **~3x - 5x Slower** |
+| **Typical Senior + Agent** | **120 - 240 Hours** | **~5x - 8x Slower** |
+| **Typical Senior by Hand** | **480 - 960 Hours** | **~16x - 32x Slower** |
+
+A traditional manual workflow for a typical senior engineer would require 3 to 6 months of full-time effort. The agentic approach, steered by a systems expert, completed the same project in less than one standard work-week of actual effort.
+
+### Context on "Typical Senior Engineer"
+For this comparison, a "typical senior engineer" is defined as someone proficient in Go and general Linux networking (e.g., comfortable with `ip` and `iptables` and building production apps), but who does not have specialized, daily experience in kernel-level namespace manipulation or C-to-Go bootstrap launchers.
+
+---
+
+## 5. Problems Encountered and How They Were Fixed
+
+### Problem 1: Go doesn't play well with Linux Namespaces
+- **The Issue**: The Go runtime is multithreaded, and the Linux namespace calls (`setns` and `unshare`) either fail or apply only to the calling thread in a multithreaded process, so they cannot be used reliably from inside a running Go program.
+- **The Fix**: A **C launcher** was used. This is a small, single-threaded program that sets up the network environment before the Go program starts.
+
+### Problem 2: Namespaces disappearing too early
+- **The Issue**: The network tunnel would sometimes close while a program was still using it.
+- **The Fix**: A **reference counter** was implemented. The system tracks how many programs are using the tunnel and only closes it when the last one exits.
+
+### Problem 3: Relying on external tools
+- **The Issue**: The program originally called the `ip` command to configure the network, which can behave differently on different systems.
+- **The Fix**: These calls were replaced with native Go code using a library called `netlink`.
+
+---
+
+## 6. What could be improved?
+
+### Easier Building
+- **The Problem**: Building requires `gcc` and `go` to be configured exactly right.
+- **The Fix**: Use a **Docker container** to ensure the code builds the same way for everyone.
+
+### Better Testing
+- **The Problem**: Full tests require a Linux machine with specific permissions, making cloud automation difficult.
+- **The Fix**: Use a **virtual machine (QEMU)** to run tests in a controlled environment.
+
+---
+
+## 7. Conclusion
+The main lesson from `wg-wrap` is that AI agents are a force multiplier for expertise. While the AI handled the implementation and the "grind" of debugging, the human provided the high-level strategy. By solving the hardest problems first and using an agentic harness to execute precisely, a project that would typically take months was completed in a few days of active work.
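Problem 1 in the case study describes a single-threaded C launcher that prepares namespaces before the Go program starts. The commit does not include that code, so the following is only a minimal sketch of how such a launcher might work. It assumes that rootless operation relies on an unprivileged user namespace, which the case study does not state explicitly. The names `format_id_map`, `write_text`, and `launch_in_private_netns` are hypothetical and do not come from `wg-wrap`. WireGuard setup itself is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Format one uid_map/gid_map line: "<inside> <outside> 1\n".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small. */
int format_id_map(char *buf, size_t size, unsigned inside, unsigned outside)
{
    int n = snprintf(buf, size, "%u %u 1\n", inside, outside);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Write a short string to an existing file; 0 on success, -1 on error. */
int write_text(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t len = (ssize_t)strlen(text);
    ssize_t written = write(fd, text, (size_t)len);
    close(fd);
    return written == len ? 0 : -1;
}

/* Hypothetical entry point (not from wg-wrap): create private user and
 * network namespaces while the process still has exactly one thread,
 * then replace the process image with argv[0].
 * Returns -1 on failure; does not return on success. */
int launch_in_private_netns(char *const argv[])
{
    char map[64];
    uid_t uid = getuid();   /* capture IDs before unshare changes our view */
    gid_t gid = getgid();

    assert(argv != NULL && argv[0] != NULL);

    /* unshare(CLONE_NEWUSER) fails with EINVAL in a multithreaded
     * process, so this must run before any Go runtime threads exist. */
    if (unshare(CLONE_NEWUSER | CLONE_NEWNET) != 0)
        return -1;

    /* Map the caller to ID 0 inside the namespace. An unprivileged
     * process must deny setgroups before it may write gid_map. */
    if (write_text("/proc/self/setgroups", "deny") != 0)
        return -1;
    if (format_id_map(map, sizeof map, 0, (unsigned)uid) < 0 ||
        write_text("/proc/self/uid_map", map) != 0)
        return -1;
    if (format_id_map(map, sizeof map, 0, (unsigned)gid) < 0 ||
        write_text("/proc/self/gid_map", map) != 0)
        return -1;

    execvp(argv[0], argv);  /* only returns on error */
    return -1;
}
```

A launcher's `main` would pass its remaining arguments, for example `launch_in_private_netns(argv + 1)`, and report `errno` if the call returns. Because `execvp` keeps the namespaces of the calling process, the Go program starts already inside the prepared environment and never needs to call `unshare` itself.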
