<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Luke Stahl&apos;s Blog</title><description>Articles on web development, CMS platforms, AI tools, and developer marketing</description><link>https://lukestahl.io/</link><language>en-us</language><item><title>Prompting is quick. Shipping a website isn&apos;t.</title><link>https://lukestahl.io/blog/shipping-a-website/</link><guid isPermaLink="true">https://lukestahl.io/blog/shipping-a-website/</guid><description>Prompting a website takes no time. What comes after, from DNS to email to AEO to payments, is a different conversation entirely.</description><pubDate>Fri, 12 Jun 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/shipping-a-website/shipping-a-website-hero_png.png&quot; alt=&quot;shipping-a-website-hero.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Vibe coding a site is fast. Both of my production sites were built with Claude. What took the rest of the time was everything underneath it: the DNS, email deliverability, analytics, SEO that works for humans and AI assistants, Core Web Vitals, payments, design, automation. None of it shows up in the prompt that built the homepage.&lt;/p&gt;
&lt;p&gt;This isn&amp;#39;t a critique. I&amp;#39;m all for AI-assisted building. The question is what comes next: what you do after the site looks good.&lt;/p&gt;
&lt;h2&gt;The two sites&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/shipping-a-website/Shacks_OG_1200_630_png.png&quot; alt=&quot;Shacks_OG_1200_630.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://shacksbloom.com/&quot;&gt;Shacksbloom.com&lt;/a&gt; runs a florist business. Nine public pages (Home, Services, Portfolio, About, Inquire, Pay, Privacy, Terms, and a standalone Instagram page) plus an admin portal behind auth. The visible part is the easy part.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/shipping-a-website/lukestahl-og-logo_1200x600_png.png&quot; alt=&quot;lukestahl-og-logo_1200x600.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Lukestahl.io is this site. &lt;a href=&quot;https://astro.build/&quot;&gt;Astro&lt;/a&gt; 6 with &lt;a href=&quot;https://notion.so/&quot;&gt;Notion&lt;/a&gt; as the CMS, deployed to &lt;a href=&quot;https://pages.github.com/&quot;&gt;GitHub Pages&lt;/a&gt;. Blog, project listings, an AI models guide that checks for model updates every Monday and opens a PR when it finds them, a monthly newsletter built from scratch. Started with plain HTML, CSS, and JS, then migrated to Astro and &lt;a href=&quot;https://react.dev/&quot;&gt;React&lt;/a&gt; once the site needed more structure. Both of these sites are in production. &lt;/p&gt;
&lt;h2&gt;Tech stack decisions&lt;/h2&gt;
&lt;p&gt;Most of the time, these decisions happen before the first prompt, although a good conversation with your agent can sometimes help you decide.&lt;/p&gt;
&lt;p&gt;For shacksbloom.com: &lt;a href=&quot;https://vite.dev/&quot;&gt;Vite&lt;/a&gt; + &lt;a href=&quot;https://vercel.com/&quot;&gt;Vercel&lt;/a&gt;. A React single-page application (SPA) with build-time prerendering. Each deploy runs a prerender script that generates static HTML per route. No running SSR server, just HTML snapshots Vercel serves directly. Two people touch it and neither needs a CMS yet. I did, however, connect the site to &lt;a href=&quot;https://www.builder.io/fusion&quot;&gt;Builder.io’s Fusion&lt;/a&gt; product for visual editing. It suits the non-technical team member who needs to make edits and submit them to GitHub as a PR.&lt;/p&gt;
&lt;p&gt;For lukestahl.io: started with plain HTML, CSS, and JS, then migrated to &lt;a href=&quot;https://astro.build/&quot;&gt;Astro&lt;/a&gt; once the site needed real structure. Astro handles static page building; React handles interactive components like &lt;a href=&quot;https://ui.shadcn.com/&quot;&gt;Shadcn UI&lt;/a&gt;. Posts live in Notion, sync at build time, render to static HTML. Adding a post is a Notion edit and a deploy. The stack should fit how you work, not the other way around. I prefer Notion because I like writing in its editor more than inside a CMS, and I don’t have team members who need one.&lt;/p&gt;
&lt;p&gt;Stack mismatches show up later in the build. A framework that defaults to SSR when you&amp;#39;re serving mostly static content adds complexity everywhere. Don&amp;#39;t let a prompt make that call.&lt;/p&gt;
&lt;h2&gt;Design: I&amp;#39;m not a designer&lt;/h2&gt;
&lt;p&gt;Both sites needed design work. Before &lt;a href=&quot;https://claude.ai/&quot;&gt;Claude Design&lt;/a&gt; existed, I was having Claude generate self-contained HTML files with inline styles, opening them in a browser, screenshotting what worked, and feeding that context back into the next session. I also used the &lt;a href=&quot;https://excalidraw.com/&quot;&gt;Excalidraw&lt;/a&gt; MCP to export those HTML files into hand-drawn style visuals. More character than a polished AI mockup, and a useful middle ground between wireframe and finished design.&lt;/p&gt;
&lt;p&gt;Claude Design changed the workflow considerably. Where I was stitching together HTML mocks, browser screenshots, and context drops, now I iterate in a design-system-aware tool. The output lands closer to finished when you bring intent to it. The &amp;quot;AI slop&amp;quot; label is becoming something people throw around to sound above it, or maybe because they&amp;#39;re nervous about what it means for designers. I understand both positions. I&amp;#39;m not a designer, but with access to these tools I can ship sites that look like a designer worked on them. Nobody one-shots a design and ships it. The output on pass one is a starting point. Push back on defaults, redirect what&amp;#39;s off, keep going. The output gets better every pass. That&amp;#39;s the whole job, same as it&amp;#39;s always been.&lt;/p&gt;
&lt;p&gt;Excalidraw is worth knowing regardless of the AI workflow. It handles everything from rough architecture sketches to clean professional diagrams, and the hand-drawn aesthetic is a deliberate choice you can switch on or off. For visual thinking, system diagrams, and design assets that need a bit of character, it&amp;#39;s one of the better tools out there. The MCP integration is a bonus: take an HTML mock, run it through, and you get a hand-drawn version that doesn&amp;#39;t look like it came straight out of an AI tool.&lt;/p&gt;
&lt;h2&gt;DNS and deployment&lt;/h2&gt;
&lt;p&gt;The DNS chain for shacksbloom.com runs &lt;a href=&quot;https://godaddy.com/&quot;&gt;GoDaddy&lt;/a&gt; → &lt;a href=&quot;https://cloudflare.com/&quot;&gt;Cloudflare&lt;/a&gt; → Vercel. Vercel terminates SSL, which means every Cloudflare record has to stay on DNS-only mode. Flip one record to orange-cloud and HTTPS breaks. That detail isn&amp;#39;t documented anywhere visible. You find it by breaking things.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://developers.cloudflare.com/email-routing/&quot;&gt;Cloudflare Email Routing&lt;/a&gt; came free with the domain. &lt;code&gt;jackie@shacksbloom.com&lt;/code&gt; and &lt;code&gt;hello@shacksbloom.com&lt;/code&gt; forward to Gmail without a Google Workspace account. &lt;code&gt;hello@&lt;/code&gt; is a catch-all. Small thing that eliminates a recurring cost.&lt;/p&gt;
&lt;p&gt;Lukestahl.io is simpler at the DNS layer. GitHub Pages handles SSL and deployment. The complexity lives in the build pipeline.&lt;/p&gt;
&lt;h2&gt;Email infrastructure&lt;/h2&gt;
&lt;p&gt;Both sites use &lt;a href=&quot;https://resend.com/&quot;&gt;Resend&lt;/a&gt; for transactional email. Resend&amp;#39;s free tier allows one verified domain per account, so shacksbloom.com and lukestahl.io each have their own. DKIM on each domain. SPF on the &lt;code&gt;send&lt;/code&gt; subdomain. A &lt;code&gt;p=none&lt;/code&gt; DMARC record at minimum: skip it and sending works fine for weeks, then Gmail starts routing inquiry replies to spam. No error, no warning. It just stops working.&lt;/p&gt;
&lt;p&gt;The shacksbloom.com inquiry handler is a Vercel function at &lt;code&gt;api/inquire.ts&lt;/code&gt;. It validates required fields, silently drops honeypot-flagged submissions, sends Jackie a styled HTML email with &lt;code&gt;Reply-To&lt;/code&gt; set to the person inquiring, and fires an auto-reply. The API key is scoped to sending on shacksbloom.com only.&lt;/p&gt;
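&lt;p&gt;The validation and honeypot steps in a handler like &lt;code&gt;api/inquire.ts&lt;/code&gt; could be sketched roughly as below. This is not the real handler: the field names, the &lt;code&gt;website&lt;/code&gt; honeypot field, and the &lt;code&gt;decideInquiry&lt;/code&gt; helper are all assumptions made for illustration, and the actual email sending is left out.&lt;/p&gt;

```typescript
// Hypothetical payload shape; the real form fields are not listed in this post.
type InquiryPayload = {
  name?: string;
  email?: string;
  message?: string;
  website?: string; // honeypot: hidden from humans, so only bots fill it in
};

type InquiryDecision =
  | { action: "send" }
  | { action: "drop" } // respond as if it worked, send nothing
  | { action: "reject"; missing: string[] };

const REQUIRED_FIELDS = ["name", "email", "message"] as const;

function decideInquiry(payload: InquiryPayload): InquiryDecision {
  // A filled honeypot means a bot. Drop silently so it gets no signal to adapt to.
  if ((payload.website ?? "").trim() !== "") {
    return { action: "drop" };
  }
  // Collect every required field that is absent or only whitespace.
  const missing = REQUIRED_FIELDS.filter(
    (field) => (payload[field] ?? "").trim() === ""
  );
  if (missing.length > 0) {
    return { action: "reject", missing: [...missing] };
  }
  return { action: "send" };
}
```

&lt;p&gt;Keeping the decision in a pure function means it can be tested without a mail provider. The function that wraps it would map &lt;code&gt;send&lt;/code&gt; to the two emails, &lt;code&gt;drop&lt;/code&gt; to a normal-looking success response, and &lt;code&gt;reject&lt;/code&gt; to a 400.&lt;/p&gt;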
&lt;p&gt;Lukestahl.io has a monthly &lt;a href=&quot;https://lukestahl.io/newsletter/&quot;&gt;newsletter&lt;/a&gt;. Not &lt;a href=&quot;https://mailchimp.com/&quot;&gt;Mailchimp&lt;/a&gt;, not &lt;a href=&quot;https://beehiiv.com/&quot;&gt;Beehiiv&lt;/a&gt;. Custom-built. Resend handles delivery. The list, scheduling, and template rendering are all code I own: no subscriber caps, no platform fees, templates that match the site exactly. It&amp;#39;s one more thing to maintain, and it runs exactly how I want it to.&lt;/p&gt;
&lt;h2&gt;Analytics and automation&lt;/h2&gt;
&lt;p&gt;Both sites run &lt;a href=&quot;https://posthog.com/&quot;&gt;PostHog&lt;/a&gt;. For shacksbloom.com, it&amp;#39;s proxied through &lt;code&gt;/ingest/*&lt;/code&gt; so ad blockers don&amp;#39;t drop the events. Five explicit events, autocapture off, session replay scoped to the inquiry flow so I&amp;#39;m not burning the free tier watching generic page views.&lt;/p&gt;
&lt;p&gt;For lukestahl.io, same setup without the proxy. Analytics feed content decisions: which posts get read, where people land, what converts to newsletter signups.&lt;/p&gt;
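&lt;p&gt;For the shacksbloom.com setup described above, the client-side options might look something like this sketch. The option names are posthog-js init options as I understand them, so check them against the current docs. The &lt;code&gt;/inquire&lt;/code&gt; path prefix and both helper functions are assumptions for illustration.&lt;/p&gt;

```typescript
// Option names mirror posthog-js init options; the values are assumptions of this sketch.
type AnalyticsOptions = {
  api_host: string;
  autocapture: boolean;
  disable_session_recording: boolean;
};

function buildAnalyticsOptions(): AnalyticsOptions {
  return {
    api_host: "/ingest",             // same-origin path that the host rewrites to PostHog
    autocapture: false,              // only explicit capture calls are sent
    disable_session_recording: true, // replay starts only where it is asked for
  };
}

// Hypothetical route prefix for the inquiry flow.
const REPLAY_PATH_PREFIX = "/inquire";

// True when the current page belongs to the flow worth recording.
function shouldRecordSession(pathname: string): boolean {
  return pathname.indexOf(REPLAY_PATH_PREFIX) === 0;
}
```

&lt;p&gt;In the browser, the options object would be passed to &lt;code&gt;posthog.init&lt;/code&gt; along with the project key, and &lt;code&gt;posthog.startSessionRecording()&lt;/code&gt; would be called only when &lt;code&gt;shouldRecordSession(location.pathname)&lt;/code&gt; returns true.&lt;/p&gt;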
&lt;p&gt;Automated &lt;a href=&quot;https://github.com/features/actions&quot;&gt;GitHub Actions&lt;/a&gt; handle monitoring on lukestahl.io:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A link checker that emails me via Resend when internal or external links break&lt;/li&gt;
&lt;li&gt;A Core Web Vitals monitor that opens a GitHub issue when LCP, CLS, or INP regress on tracked pages&lt;/li&gt;
&lt;li&gt;An AI models guide updater that pulls Anthropic and OpenAI release feeds, runs a diff through a prompt, pushes a branch, and opens a PR when it finds changes&lt;/li&gt;
&lt;/ul&gt;
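&lt;p&gt;The Core Web Vitals check in the list above can be sketched as a comparison against fixed limits. This is a simplification under stated assumptions: the limits are the upper bounds of the &amp;quot;good&amp;quot; range Google publishes (LCP 2.5 s, INP 200 ms, CLS 0.1), while the real monitor may compare against a stored baseline instead. The type and function names are hypothetical.&lt;/p&gt;

```typescript
type Vitals = { lcpMs: number; inpMs: number; cls: number };

// Upper bounds of the "good" range for each metric.
const GOOD_LIMITS: Vitals = { lcpMs: 2500, inpMs: 200, cls: 0.1 };

// Returns the names of the metrics that exceed their limit for one page.
function regressedMetrics(measured: Vitals, limits: Vitals = GOOD_LIMITS): string[] {
  const names = Object.keys(limits) as (keyof Vitals)[];
  return names.filter((name) => measured[name] > limits[name]);
}
```

&lt;p&gt;A workflow step would run this per tracked page and open an issue only when the returned list is non-empty.&lt;/p&gt;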
&lt;p&gt;For shacksbloom.com, a GitHub Actions workflow runs every Monday, queries PostHog for the prior week, and emails a summary to Jackie and me via Resend.&lt;/p&gt;
&lt;p&gt;For &lt;a href=&quot;https://lukestahl.io/&quot;&gt;lukestahl.io&lt;/a&gt;, Slack integrations are connected via PostHog workflows, with scheduled sends every Monday via a Content Agent. That gives me a quick week-over-week report without going into PostHog’s dashboard. &lt;a href=&quot;https://posthog.com/docs/model-context-protocol&quot;&gt;PostHog’s MCP&lt;/a&gt; also helps create custom dashboards, which I highly recommend.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/shipping-a-website/Analytics_png.png&quot; alt=&quot;Analytics.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;SEO and AEO&lt;/h2&gt;
&lt;p&gt;Core Web Vitals affect ranking directly. Let LCP, CLS, or INP regress on mobile and organic traffic follows. The GitHub Actions monitor surfaces regressions before they add up to a traffic drop. These workflows also run weekly, and a GitHub Action deploys to &lt;a href=&quot;https://lukestahl.io/seo/&quot;&gt;lukestahl.io/seo/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/shipping-a-website/SEO_jpeg.jpeg&quot; alt=&quot;SEO.jpeg&quot;&gt;&lt;/p&gt;
&lt;p&gt;AEO is the other half: making the site readable and useful to AI assistants. Both sites have &lt;a href=&quot;https://llmstxt.org/&quot;&gt;&lt;code&gt;llms.txt&lt;/code&gt;&lt;/a&gt; files that explain what each site is for. Lukestahl.io has JSON-LD structured data on every post, an &lt;a href=&quot;https://www.indexnow.org/&quot;&gt;IndexNow&lt;/a&gt; integration that notifies Bing and Yandex within minutes of a new page, and a sitemap with canonical URLs and &lt;code&gt;noindex&lt;/code&gt; filtering. Shacksbloom.com&amp;#39;s &lt;code&gt;llms.txt&lt;/code&gt; includes the payment page URL so an AI assistant can hand it to someone asking &amp;quot;how do I pay Jackie?&amp;quot;&lt;/p&gt;
&lt;p&gt;LLMs.txt:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://lukestahl.io/llms.txt&quot;&gt;https://lukestahl.io/llms.txt&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://shacksbloom.com/llms.txt&quot;&gt;https://shacksbloom.com/llms.txt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both sites get weekly &lt;a href=&quot;https://ahrefs.com/&quot;&gt;Ahrefs&lt;/a&gt; audits. They surface broken links, crawl errors, missing metadata, and other issues that don&amp;#39;t show up in any build output. Ahrefs is the tool that keeps SEO health visible on an ongoing basis rather than something you check once at launch and forget.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/shipping-a-website/Ahrefs_png.png&quot; alt=&quot;Ahrefs.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;One thing it caught on shacksbloom.com: the homepage was flagged as orphaned three weeks after launch. The same &lt;code&gt;index.html&lt;/code&gt; served every route, with an empty &lt;code&gt;&amp;lt;div id=&amp;quot;root&amp;quot;&amp;gt;&lt;/code&gt; and no internal &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; tags visible to crawlers without JavaScript. The fix was a prerender step that emits a fully rendered HTML body per route. Most LLM scrapers and older search crawlers don&amp;#39;t execute JavaScript. If your build emits a shell, those crawlers see a blank page.&lt;/p&gt;
&lt;h2&gt;Payments&lt;/h2&gt;
&lt;p&gt;Shacksbloom.com has a &lt;code&gt;/pay&lt;/code&gt; &lt;a href=&quot;https://shacksbloom.com/pay&quot;&gt;page&lt;/a&gt;. A &lt;a href=&quot;https://venmo.com/&quot;&gt;Venmo&lt;/a&gt; deep-link button that opens the app on mobile and falls back to the web on desktop. A &lt;a href=&quot;https://zellepay.com/&quot;&gt;Zelle&lt;/a&gt; card with copy-to-clipboard for the receiving email. No &lt;a href=&quot;https://stripe.com/&quot;&gt;Stripe&lt;/a&gt;, no transaction fees on small deposits.&lt;/p&gt;
&lt;p&gt;It&amp;#39;s indexed, linked from the footer, and listed in &lt;code&gt;llms.txt&lt;/code&gt;. Simple setup, but the discoverability piece matters. If someone asks an AI assistant how to pay for a custom arrangement, the assistant can surface the link.&lt;/p&gt;
&lt;h2&gt;Instagram&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;http://shacksbloom.com/instgram&quot;&gt;Shacksbloom.com/instgram&lt;/a&gt; pulls a live Instagram feed via &lt;a href=&quot;https://behold.so/&quot;&gt;Behold.so&lt;/a&gt;. The floral work is the product, and Instagram is where the portfolio lives. Behold.so handles the API connection and caching on their end; the site loads a widget script and stays current without a build.&lt;/p&gt;
&lt;h2&gt;Admin portal&lt;/h2&gt;
&lt;p&gt;Shacksbloom.com has a custom admin portal built on &lt;a href=&quot;https://neon.tech/&quot;&gt;Neon&lt;/a&gt; serverless Postgres with &lt;a href=&quot;https://orm.drizzle.team/&quot;&gt;Drizzle ORM&lt;/a&gt;, auth via &lt;a href=&quot;https://clerk.com/&quot;&gt;Clerk&lt;/a&gt; backed by Google OAuth, and a full quote-to-invoice lifecycle. The entire admin bundle is lazy-loaded so none of its JS ships to public visitors.&lt;/p&gt;
&lt;p&gt;Getting Clerk to production with Google OAuth isn&amp;#39;t just enabling a toggle. You need a Google Cloud project, OAuth credentials with a client ID and secret, and authorized redirect URIs pointing to Clerk&amp;#39;s callback, all wired into Clerk&amp;#39;s dashboard. Most tutorials skip that step entirely. Once Google auth clears, there&amp;#39;s a second layer: the user&amp;#39;s email is checked against an &lt;code&gt;ADMIN_EMAILS&lt;/code&gt; env var. The frontend blocks rendering if it fails, and every API call independently re-verifies the Clerk session token and re-checks the allowlist server-side. Two separate enforcement points, not one.&lt;/p&gt;
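&lt;p&gt;The allowlist half of that check is small enough to sketch. The version below assumes &lt;code&gt;ADMIN_EMAILS&lt;/code&gt; is a comma-separated string, which the post does not actually specify, and both function names are hypothetical. Verifying the Clerk session token is a separate step that is not shown.&lt;/p&gt;

```typescript
// Parses a comma-separated allowlist such as "a@example.com, b@example.com".
function parseAdminEmails(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
}

// True only when the signed-in email is on the allowlist (case-insensitive).
function isAdmin(email: string | null | undefined, allowlist: string[]): boolean {
  if (!email) return false;
  return allowlist.indexOf(email.trim().toLowerCase()) >= 0;
}
```

&lt;p&gt;On the server, every API route would call something like &lt;code&gt;isAdmin(sessionEmail, parseAdminEmails(process.env.ADMIN_EMAILS))&lt;/code&gt; after the session is verified, and fail closed when the variable is missing. The frontend can reuse the same helper for its render guard, but only the server-side call is an actual security boundary.&lt;/p&gt;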
&lt;p&gt;A quote starts as a draft, gets sent to the customer, and converts to an invoice when the job is confirmed. Invoices track payment method and paid date. Refunds are their own invoice type linked back to the original. Every quote and invoice has line items with description, date, notes, and amount. Every invoice can be emailed to the customer as a branded PDF, rendered server-side via &lt;a href=&quot;https://react-pdf.org/&quot;&gt;@react-pdf/renderer&lt;/a&gt; with Jackie&amp;#39;s logo, and every send is logged to an &lt;code&gt;email_logs&lt;/code&gt; table with the Resend message ID and delivery status. A year view shows total collected, receipt count, and refund count.&lt;/p&gt;
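&lt;p&gt;As an illustration of how the year view could be computed from that data model, here is a minimal sketch. The &lt;code&gt;Invoice&lt;/code&gt; shape is a guess based on the description above, not the real Drizzle schema, and it assumes that &amp;quot;total collected&amp;quot; is net of refunds, which may not match the actual portal.&lt;/p&gt;

```typescript
// Hypothetical record shape, loosely following the description in the post.
type Invoice = {
  id: string;
  kind: "invoice" | "refund";
  amountCents: number;     // always positive; kind decides the sign
  paidDate: string | null; // ISO date such as "2026-03-14", null while unpaid
  refundOf?: string;       // id of the original invoice, set on refunds
};

type YearSummary = {
  totalCollectedCents: number;
  receiptCount: number;
  refundCount: number;
};

function summarizeYear(invoices: Invoice[], year: number): YearSummary {
  const summary: YearSummary = { totalCollectedCents: 0, receiptCount: 0, refundCount: 0 };
  for (const inv of invoices) {
    // Skip anything unpaid or paid in a different year.
    if (inv.paidDate === null || inv.paidDate.slice(0, 4) !== String(year)) continue;
    if (inv.kind === "refund") {
      summary.refundCount += 1;
      summary.totalCollectedCents -= inv.amountCents; // assumption: total is net of refunds
    } else {
      summary.receiptCount += 1;
      summary.totalCollectedCents += inv.amountCents;
    }
  }
  return summary;
}
```

&lt;p&gt;Storing amounts as integer cents avoids floating-point drift in the totals, and modeling a refund as its own row keeps the original invoice immutable.&lt;/p&gt;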
&lt;p&gt;This only works because one developer manages both sites. Build vs. buy math changes fast once you add team members or operational complexity. For a personal site or a small business where one developer owns the whole stack, custom-built is often the right call.&lt;/p&gt;
&lt;h2&gt;Knowledge is key&lt;/h2&gt;
&lt;p&gt;The jump from a fun vibe-coded site to one that&amp;#39;s built to last isn&amp;#39;t about the framework or the code quality. It&amp;#39;s about knowing what needs to exist. Having an agent to gut-check ideas, bake a plan, and work through the checklist with you gets you there considerably faster. My main advice is to treat your agent as a team member and dive into the laundry list of deliverables needed to go from &amp;quot;this looks pretty&amp;quot; to &amp;quot;this can scale.&amp;quot;&lt;/p&gt;
</content:encoded><category>Vibe Coding</category><category>AI</category><category>SEO</category><category>Web Development</category></item><item><title>PLG for developer companies</title><link>https://lukestahl.io/blog/plg-for-developer-companies/</link><guid isPermaLink="true">https://lukestahl.io/blog/plg-for-developer-companies/</guid><description>PLG is not a marketing motion. It&apos;s the discipline of letting your product, your pricing, and your free tier do the work a sales team does. This is the sequence that works.</description><pubDate>Tue, 05 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/plg-for-developer-companies/plg_png.png&quot; alt=&quot;plg.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;The core idea&lt;/h2&gt;
&lt;p&gt;PLG is not a marketing motion. It&amp;#39;s the discipline of letting your product, your pricing, and your free tier do the work that a sales team does at a sales-led company. Sales is the layer you add on top once adoption is happening. Never the layer you start with.&lt;/p&gt;
&lt;p&gt;Most companies miss this because they treat PLG as a tactic to bolt onto an existing motion. They run a 14-day trial, gate the API behind a sales call, charge per seat, and wonder why developers churn before they activate. The order matters. So does each individual decision inside it.&lt;/p&gt;
&lt;p&gt;The companies that scaled developer products past $100M ARR (Stripe, Twilio, Vercel, Algolia, Datadog, MongoDB, GitHub, Cloudflare) didn&amp;#39;t do this by accident. They got the same set of decisions right, in roughly the same sequence. This is that sequence, in 10 steps.&lt;/p&gt;
&lt;h2&gt;1. Ship a free tier with utility — your acquisition channel&lt;/h2&gt;
&lt;p&gt;Not a 14-day trial. Not a &amp;quot;contact sales&amp;quot; form. A free tier where a developer can build something, deploy it, and run it in production before paying. The free tier is your acquisition channel.&lt;/p&gt;
&lt;p&gt;Stripe charges nothing until a developer processes a transaction. Vercel&amp;#39;s free tier includes deployments, serverless functions, and a production URL. Twilio gives free credits so developers can send SMS messages during integration. &lt;a href=&quot;https://www.algolia.com/pricing&quot;&gt;Algolia provides 10K search requests a month for free&lt;/a&gt;. What these have in common: a developer can build something they&amp;#39;d be embarrassed to lose.&lt;/p&gt;
&lt;p&gt;That&amp;#39;s the activation bar. Not signups, not free-account counts. If your free tier is a demo environment with watermarks and sandbox-only limits, it&amp;#39;s a trial, and trials don&amp;#39;t drive PLG. Trials drive sales calls.&lt;/p&gt;
&lt;p&gt;The most common mistake is time-based gating. A 14-day trial creates an artificial deadline that pushes developers into a buying decision before they&amp;#39;ve integrated. Usage limits don&amp;#39;t. A developer can integrate Stripe and run in test mode for years before paying anything. By the time they do pay, they&amp;#39;ve shipped a checkout flow, embedded the API in their stack, and accumulated months of historical data they don&amp;#39;t want to migrate. That&amp;#39;s lock-in earned through utility, not contracts.&lt;/p&gt;
&lt;p&gt;Everything above assumes near-zero marginal cost per free user. That assumption breaks when your product includes AI-powered features. Every prompt, generation, or inference call burns compute. The free tier still needs to exist, but it can&amp;#39;t be open-ended. Usage caps on AI features replace unlimited access. Gate volume and speed, not feature access. Midjourney does this with Fast Mode vs. Relax Mode. Fast Mode gives instant GPU access with limited monthly hours. Relax Mode is unlimited but queued. Users pay for priority and throughput, not better outputs.&lt;/p&gt;
&lt;p&gt;The trap is building a free tier so capable that it removes the reason to upgrade. Google&amp;#39;s AI team ran into this launching Gemini subscriptions. The free tier already covered most use cases, so users had no reason to pay. The fix wasn&amp;#39;t restricting features. It was designing a ceiling that creates upgrade desire without killing the experience that gets developers hooked in the first place.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Audit your free tier. Can a developer build and ship something production-grade without paying?&lt;/li&gt;
&lt;li&gt;Remove time-based trial gates. Usage limits work. Countdown timers don&amp;#39;t.&lt;/li&gt;
&lt;li&gt;Track what percentage of free-tier users build something they&amp;#39;d be embarrassed to lose. That&amp;#39;s your activation rate.&lt;/li&gt;
&lt;li&gt;If your product includes AI features, define where free stops and whether that stopping point creates the right motivation to upgrade.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;2. Time to first meaningful action — the metric that predicts everything&lt;/h2&gt;
&lt;p&gt;Time to first API call is the strongest predictor of developer conversion. Every friction point between signup and value drags conversion down. It doesn&amp;#39;t matter how good your product is downstream if developers don&amp;#39;t get there.&lt;/p&gt;
&lt;p&gt;The number to measure is the median time from signup to a developer&amp;#39;s first meaningful action. For Stripe, that meant a working checkout flow in under 15 minutes. For Twilio, it was the &amp;quot;send your first SMS&amp;quot; tutorial, not docs, not a dashboard tour. For Firebase, quickstart templates that deploy a working backend in under five minutes.&lt;/p&gt;
&lt;p&gt;The mistake here is measuring the wrong moment. &amp;quot;Account created&amp;quot; is not first action. &amp;quot;Logged in&amp;quot; is not first action. First action is the moment a developer sees the product do something useful for their use case. That&amp;#39;s specific to your product, and you have to define it precisely. For a payments API, it&amp;#39;s a successful test charge. For a database, it&amp;#39;s a query returning data. For a deployment platform, it&amp;#39;s a live URL.&lt;/p&gt;
&lt;p&gt;Track median time-to-action weekly. Identify the top three friction points between signup and that action (onboarding flow, key generation, SDK install, first request) and fix the biggest one each quarter. Time-to-action improvements ripple through every metric downstream.&lt;/p&gt;
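&lt;p&gt;A minimal sketch of that weekly number, assuming each developer record carries a signup timestamp and an optional first-action timestamp. The &lt;code&gt;DeveloperTimeline&lt;/code&gt; shape and the function name are hypothetical. In practice this would more likely be a warehouse query than application code.&lt;/p&gt;

```typescript
type DeveloperTimeline = {
  signupAt: number;             // epoch milliseconds
  firstActionAt: number | null; // null if the action never happened
};

// Median minutes from signup to first meaningful action, over developers who got there.
function medianMinutesToFirstAction(timelines: DeveloperTimeline[]): number | null {
  const minutes: number[] = [];
  for (const t of timelines) {
    if (t.firstActionAt !== null) {
      minutes.push((t.firstActionAt - t.signupAt) / 60000);
    }
  }
  if (minutes.length === 0) return null;
  minutes.sort((a, b) => a - b);
  const mid = Math.floor(minutes.length / 2);
  // Odd count: middle value. Even count: mean of the two middle values.
  return minutes.length % 2 === 1 ? minutes[mid] : (minutes[mid - 1] + minutes[mid]) / 2;
}
```

&lt;p&gt;Note that this median only covers developers who reached the action. Report it next to the share who never did, or a shrinking median can hide a growing drop-off.&lt;/p&gt;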
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Define your first meaningful action precisely. Not signup, not login. The moment a developer creates something useful for their use case.&lt;/li&gt;
&lt;li&gt;Measure the median time from signup to that action and track it weekly.&lt;/li&gt;
&lt;li&gt;Identify the top three friction points between signup and first action. Fix the biggest one this quarter.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;3. Activation is not signup — define it in one sentence&lt;/h2&gt;
&lt;p&gt;A signup is not a user. &lt;a href=&quot;https://openviewpartners.com/2023-product-benchmarks/&quot;&gt;OpenView&amp;#39;s PLG benchmarks&lt;/a&gt; show PLG companies are 2x more likely than sales-led companies to grow revenue 100% year over year, and 87% of standout PLG companies track an explicit activation metric.&lt;/p&gt;
&lt;p&gt;Activation is the moment a developer experiences the product&amp;#39;s core value for their use case. It has to be specific enough that an analyst can query it from your data warehouse on a Monday morning and a number comes back.&lt;/p&gt;
&lt;p&gt;Slack defined activation as &lt;a href=&quot;https://review.firstround.com/from-0-to-1b-slacks-founder-shares-their-epic-launch-strategy/&quot;&gt;&amp;quot;2,000 messages sent by a team.&amp;quot;&lt;/a&gt; That single metric reshaped their entire GTM strategy: every onboarding decision, every paid feature, every sales handoff was built around getting teams to 2,000 messages. Datadog defined activation as connecting a first integration and sending data. Amplitude defined it as a user creating their first saved chart.&lt;/p&gt;
&lt;p&gt;If you can&amp;#39;t write your activation metric in one sentence, it&amp;#39;s too vague. If you can&amp;#39;t query it from your warehouse today, it doesn&amp;#39;t exist yet. And if you&amp;#39;re still reporting signup counts to leadership without activation rates alongside them, you&amp;#39;re making decisions on the wrong number.&lt;/p&gt;
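&lt;p&gt;To make &amp;quot;queryable&amp;quot; concrete, here is a rough sketch that uses the 2,000-message definition from the Slack example. The &lt;code&gt;TeamUsage&lt;/code&gt; shape and function name are assumptions. The point is only that a one-sentence definition reduces to a threshold and a ratio.&lt;/p&gt;

```typescript
type TeamUsage = { teamId: string; messagesSent: number };

// Threshold taken from the Slack example in the text.
const ACTIVATION_THRESHOLD = 2000;

// Share of signed-up teams that crossed the activation threshold (0 when there are no teams).
function activationRate(teams: TeamUsage[], threshold: number = ACTIVATION_THRESHOLD): number {
  if (teams.length === 0) return 0;
  const activated = teams.filter((team) => team.messagesSent >= threshold).length;
  return activated / teams.length;
}
```

&lt;p&gt;Four signups with one team past the threshold gives a rate of 0.25, which is the number that belongs next to the signup count in a leadership report.&lt;/p&gt;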
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Write a one-sentence activation definition. If it takes more than one sentence, it&amp;#39;s too vague.&lt;/li&gt;
&lt;li&gt;Instrument it. If you can&amp;#39;t query it from your data warehouse today, it doesn&amp;#39;t exist yet.&lt;/li&gt;
&lt;li&gt;Stop reporting signup counts to leadership without activation rates alongside them.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;4. Visible artifacts — your organic loop runs on these&lt;/h2&gt;
&lt;p&gt;The strongest PLG loop: developers build things with your product that other developers can see. Those artifacts drive organic discovery without marketing spend.&lt;/p&gt;
&lt;p&gt;Vercel&amp;#39;s free-tier sites deploy to a vercel.app subdomain by default. Every deployment surfaces the brand. Netlify&amp;#39;s deploy previews in GitHub PRs expose the product to every reviewer on the team. Stripe&amp;#39;s checkout pages are seen by millions of end users. Every transaction surfaces the brand. Webflow&amp;#39;s published sites include Webflow branding on free plans, and every site is a billboard for the platform.&lt;/p&gt;
&lt;p&gt;The pattern: the product creates artifacts visible to non-users. Sites, apps, APIs, integrations, embeds, deploy logs, share links. Make attribution easy but not mandatory. Make the artifact good enough that developers want to share it. And track the inbound signup volume that originates from product-generated artifacts. That&amp;#39;s your organic loop metric, and it&amp;#39;s the closest thing to a free customer acquisition channel that exists in B2B.&lt;/p&gt;
&lt;p&gt;If your product doesn&amp;#39;t create visible artifacts, you have a harder PLG problem. Internal tooling, backend-only services, and infrastructure-layer products have to find their visible surface elsewhere: open source, public docs, technical content, conference talks. The loop still has to exist. The channels are different.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Identify what artifacts your product creates that are visible to non-users — sites, apps, APIs, integrations, embeds, deploy logs, share links.&lt;/li&gt;
&lt;li&gt;Make attribution easy but not mandatory.&lt;/li&gt;
&lt;li&gt;Track inbound signups that originate from product artifacts. That&amp;#39;s your organic loop metric.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;5. Self-serve onboarding — if a developer needs sales, it&amp;#39;s not PLG&lt;/h2&gt;
&lt;p&gt;If a developer needs to talk to a human to get started, you don&amp;#39;t have a PLG motion. You have a sales-led motion with a free tier on top. Docs, quickstarts, templates, and example projects do the work.&lt;/p&gt;
&lt;p&gt;Stripe&amp;#39;s documentation is the gold standard. Every API endpoint includes a working code example. Supabase built project templates that deploy a working app in one click. Neon&amp;#39;s CLI creates a working Postgres database with one command, no dashboard required. None of these required a sales call to evaluate.&lt;/p&gt;
&lt;p&gt;Run the new-developer test. Take someone who has never used your product, sit them in front of a clean machine, and time how long it takes them to sign up, build something, and deploy it without talking to anyone. If they can&amp;#39;t, fix what blocks them before fixing anything else.&lt;/p&gt;
&lt;p&gt;Two specific calls. First, prioritize quickstarts over comprehensive documentation. Developers want to start, not read. Comprehensive docs come second. Second, build at least one template or starter project that runs end-to-end in under 10 minutes. The developer who ships something in 10 minutes activates. The developer who spent an afternoon reading already churned.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Run the new-developer test. Sit someone down at a clean machine and time how long it takes them to sign up, build, and deploy without talking to anyone.&lt;/li&gt;
&lt;li&gt;Prioritize quickstarts over comprehensive docs. Developers want to start, not read.&lt;/li&gt;
&lt;li&gt;Build at least one template or starter project that runs end-to-end in under 10 minutes.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;6. Usage-based pricing — per-seat is friction at the wrong moment&lt;/h2&gt;
&lt;p&gt;Usage-based pricing aligns cost with value. Per-seat pricing creates friction at the exact moment you want expansion, when a second or third developer wants to start using the product.&lt;/p&gt;
&lt;p&gt;Twilio charges per API call. Vercel charges based on bandwidth and serverless function execution. Cloudflare Workers charges per request after a generous free tier. Algolia charges per search request. In all of these, cost scales with the product&amp;#39;s success, not headcount. A team of five developers using Stripe doesn&amp;#39;t cost more than a team of one developer doing the same volume. That&amp;#39;s the right answer.&lt;/p&gt;
&lt;p&gt;Map your pricing to the unit of value a developer gets. API calls, deployments, executions, storage, bandwidth, requests, GB processed. Pick the metric that&amp;#39;s easiest to explain on a single line and that a developer can predict from their usage. Predictability matters as much as fairness.&lt;/p&gt;
&lt;p&gt;Two tests for your pricing model. First, is $0 a starting point? Usage limits beat time limits. A developer should be able to run on the free tier indefinitely, with the bill kicking in only when usage crosses a threshold. Second, what happens when a customer 10x&amp;#39;s their usage? If the answer is &amp;quot;they call sales,&amp;quot; you have a pricing wall, not a growth model. The whole point of usage pricing is that growth happens automatically.&lt;/p&gt;
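&lt;p&gt;Both tests can be run on paper before launch. The sketch below uses a hypothetical plan with a free allowance and a flat per-unit price. Every number is invented for illustration and is not any vendor&amp;#39;s real pricing.&lt;/p&gt;

```typescript
// Hypothetical plan: units under the free allowance cost nothing.
type UsagePlan = { freeUnits: number; pricePerUnitCents: number };

// Monthly bill in cents: only usage above the free allowance is charged.
function monthlyBillCents(units: number, plan: UsagePlan): number {
  const billable = Math.max(0, units - plan.freeUnits);
  return billable * plan.pricePerUnitCents;
}

const examplePlan: UsagePlan = { freeUnits: 10000, pricePerUnitCents: 1 };
```

&lt;p&gt;With this plan, 5,000 units costs nothing, 50,000 units costs 40,000 cents, and a 10x jump to 500,000 units costs 490,000 cents. Nobody had to call sales for the bill to follow the usage, and a developer can predict it from a single line of arithmetic.&lt;/p&gt;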
&lt;p&gt;Pure per-unit billing has a failure mode at scale: bill shock. Vikas Kansal&amp;#39;s team at Google hit this building AI subscription tiers. Unpredictable costs make enterprise buyers nervous and individual developers hesitant. The alternative is intensity tiers: prepaid volume buckets (Plus/Pro/Ultra) that give predictability while still aligning cost with usage. Developers get a taste of most capabilities in every tier, but volume and speed increase as they move up. Predictable tiers convert better than open meters because developers can budget for them.&lt;/p&gt;
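&lt;p&gt;One way to picture intensity tiers is as a lookup from expected usage to the cheapest bucket that covers it. In this sketch the tier names follow the Plus/Pro/Ultra example, while the included volumes, prices, and the function name are invented for illustration.&lt;/p&gt;

```typescript
type Tier = { name: string; includedUnits: number; priceCents: number };

// Hypothetical tiers: names from the text, numbers made up for this sketch.
const TIERS: Tier[] = [
  { name: "Plus", includedUnits: 1000, priceCents: 2000 },
  { name: "Pro", includedUnits: 10000, priceCents: 10000 },
  { name: "Ultra", includedUnits: 100000, priceCents: 50000 },
];

// Cheapest tier whose prepaid volume covers the expected usage, or null if none does.
function cheapestCoveringTier(units: number, tiers: Tier[] = TIERS): Tier | null {
  const covering = tiers.filter((tier) => tier.includedUnits >= units);
  if (covering.length === 0) return null;
  return covering.reduce((best, tier) => (tier.priceCents > best.priceCents ? best : tier));
}
```

&lt;p&gt;A developer expecting 5,000 units lands on Pro and knows the bill in advance. A &lt;code&gt;null&lt;/code&gt; result marks the point where usage has outgrown the published tiers.&lt;/p&gt;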
&lt;p&gt;There&amp;#39;s a second dimension most developer tools haven&amp;#39;t considered yet: outcome-based pricing. Instead of charging per input (API call, prompt, request), charge per successful result. Intercom&amp;#39;s Fin AI agent charges $0.99 per resolution. The AI tries to answer for free. You only pay when the problem is solved. As developer tools add agent and automation capabilities, pricing per outcome starts to make more sense than pricing per request. A developer who pays per successful deployment or per resolved issue is paying for value delivered, not compute consumed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Map your pricing to the unit of value a developer gets — API calls, deployments, executions, storage, bandwidth, requests.&lt;/li&gt;
&lt;li&gt;Make $0 a starting point. Usage limits beat time limits.&lt;/li&gt;
&lt;li&gt;Model what happens when a customer 10x&amp;#39;s their usage. If the answer is &amp;quot;they call sales,&amp;quot; you have a pricing wall, not a growth model.&lt;/li&gt;
&lt;li&gt;If your product has AI-powered features, map each feature to its compute cost. Pricing tiers should align with cost-to-serve, not just perceived value.&lt;/li&gt;
&lt;li&gt;Evaluate whether any capability in your product could be priced on outcomes (successful completions, resolved issues, deployed builds) instead of inputs.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;7. Product-qualified leads — instrument behavior, not vibes&lt;/h2&gt;
&lt;p&gt;A product-qualified lead is a user whose behavior signals they&amp;#39;re ready for a paid plan or a sales conversation. The threshold should be specific and defensible, not a gut feeling.&lt;/p&gt;
&lt;p&gt;Dropbox&amp;#39;s PQL signal was a user hitting their storage limit. Atlassian identified PQLs based on growing team size on free Jira instances. Figma&amp;#39;s PQL signal was 3+ editors working in shared files, the moment usage shifted from solo to collaborative. In each case, the signal was behavioral, queryable, and tied to expansion intent, not &amp;quot;this account looks active.&amp;quot;&lt;/p&gt;
&lt;p&gt;Define two or three behavioral signals that predict conversion from free to paid. Test them against historical data before you trust them. The simplest scoring model (signal A plus signal B equals PQL) outperforms guessing every time, and it gives sales something concrete to act on.&lt;/p&gt;
&lt;p&gt;When PQLs do route to sales, route them with context. Not &amp;quot;this account is active.&amp;quot; Specifically: three developers running production workloads across two workspaces, last 30-day request volume up 4x, hitting rate limits on the free tier. Sales will only land calls where the signal is strong, and the signal is only strong when it&amp;#39;s specific.&lt;/p&gt;
&lt;p&gt;For products with variable cost-to-serve, PQL scoring needs a cost dimension. A developer running 500 prompts a day is both your best conversion candidate and your biggest margin risk. Score for conversion likelihood and cost-to-serve simultaneously. High engagement plus high compute cost means this user needs to convert now, not eventually. The free tier is subsidizing their usage, and the longer they stay free, the worse your unit economics get.&lt;/p&gt;
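&lt;p&gt;A rough sketch of what &amp;quot;signal A plus signal B, with a cost dimension&amp;quot; could look like in code. The field names and all three thresholds are assumptions for illustration. Pick your own from historical conversion data.&lt;/p&gt;

```typescript
// Hypothetical account snapshot; field names are illustrative.
interface AccountUsage {
  activeDevelopers: number;
  rateLimitHitsLast30d: number;
  requestGrowth30d: number; // 4 means request volume is up 4x
  monthlyComputeCostUsd: number;
}

type PqlStatus = "not_pql" | "pql" | "urgent_pql";

// Assumed thresholds; validate them against historical conversions first.
const MIN_DEVELOPERS = 3;
const MIN_GROWTH = 2;
const HIGH_COST_USD = 50;

function scoreAccount(a: AccountUsage): PqlStatus {
  // Signal A: the account has moved from solo to team usage.
  const signalA = a.activeDevelopers >= MIN_DEVELOPERS;
  // Signal B: the account is pressing against the free tier.
  const signalB = a.rateLimitHitsLast30d > 0 || a.requestGrowth30d >= MIN_GROWTH;
  if (!signalA || !signalB) return "not_pql";
  // Cost dimension: expensive free accounts need to convert now.
  return a.monthlyComputeCostUsd >= HIGH_COST_USD ? "urgent_pql" : "pql";
}

console.log(
  scoreAccount({
    activeDevelopers: 3,
    rateLimitHitsLast30d: 12,
    requestGrowth30d: 4,
    monthlyComputeCostUsd: 12,
  })
); // pql
```

&lt;p&gt;The point of keeping it this simple is that every escalation can be explained to sales in one sentence.&lt;/p&gt;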
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Define two or three behavioral signals that predict conversion from free to paid. Test them against historical data before you trust them.&lt;/li&gt;
&lt;li&gt;Build a simple PQL scoring model. Signal A plus signal B equals PQL beats guessing.&lt;/li&gt;
&lt;li&gt;Route PQLs to sales with specific context — developer count, workload type, usage trend, rate-limit hits — not &amp;quot;this account looks active.&amp;quot;&lt;/li&gt;
&lt;li&gt;Track cost-per-user alongside engagement-per-user. High engagement plus high cost equals your most urgent conversion target.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;8. Bottom-up adoption — let developers pull the product into their org&lt;/h2&gt;
&lt;p&gt;Bottom-up adoption is the bet that individual developers will use the product first, then bring it to their team and their org. Enterprise features matter, but they come after individual adoption, not before.&lt;/p&gt;
&lt;p&gt;GitHub grew inside enterprises one developer at a time, before it sold top-down to IT. Slack spread team by team: one team adopted, adjacent teams noticed, and IT eventually had to standardize. Datadog entered orgs through a single DevOps engineer monitoring one service. Each of these scaled because individual developers could get full value from the product without needing org-level approval.&lt;/p&gt;
&lt;p&gt;The most important rule: no admin-only features in the critical path. If a developer can&amp;#39;t use your product without their CTO clicking a button, you&amp;#39;ve broken bottom-up adoption. Permissions, billing, and SSO matter, but they should be optional add-ons, not blockers to first use.&lt;/p&gt;
&lt;p&gt;Build sharing and collaboration features that naturally expose the product to non-users. When one developer adopts the product, the next developer should encounter it within a week: a shared link, a PR, a deploy preview, a notification, an artifact in the repo. Track the internal referral pattern. When a second developer joins an account, what triggered it? That trigger is your expansion lever.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Make sure a single developer can get full value without org-level approval. No admin-only features in the critical path.&lt;/li&gt;
&lt;li&gt;Build sharing and collaboration features that naturally expose the product to non-users.&lt;/li&gt;
&lt;li&gt;Track internal referral patterns. When a second developer joins an account, what triggered it?&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;9. Expansion triggers — one developer is adoption, three is expansion&lt;/h2&gt;
&lt;p&gt;The moment one developer invites a second, or one workspace connects to a second, that&amp;#39;s your expansion signal. This is the bridge between individual adoption and team or enterprise revenue, and it&amp;#39;s the metric that separates products that grow from products that plateau.&lt;/p&gt;
&lt;p&gt;Notion&amp;#39;s expansion came from page-sharing. A user shares a page with a teammate, the teammate joins the workspace, and the account grows. Cross-account sharing was a stronger upgrade signal than any solo-usage metric. Linear&amp;#39;s signature expansion event is the second engineer joining a workspace. Figma&amp;#39;s was a design file shared with a developer, because cross-functional sharing predicted enterprise deals.&lt;/p&gt;
&lt;p&gt;The pattern across all of these: the expansion signal is &amp;quot;more people or more projects,&amp;quot; not &amp;quot;more usage.&amp;quot; A single developer using the product more isn&amp;#39;t expansion. A second developer joining is.&lt;/p&gt;
&lt;p&gt;Measure active developer density per account. One developer is adoption. Three is expansion. That&amp;#39;s the threshold where sales should engage. Earlier and you&amp;#39;re spending sales time on accounts that aren&amp;#39;t ready. Later and you&amp;#39;re missing the window. Build features that make multi-developer collaboration better than solo use, because the moment collaboration is the obvious choice, expansion happens by default.&lt;/p&gt;
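&lt;p&gt;Developer density is straightforward to compute from an event log. A minimal sketch, assuming a hypothetical &lt;code&gt;UsageEvent&lt;/code&gt; shape. Treating two developers as still &amp;quot;adoption&amp;quot; is my assumption, since the threshold above is three.&lt;/p&gt;

```typescript
interface UsageEvent {
  accountId: string;
  developerId: string;
  timestampMs: number;
}

type AccountStage = "inactive" | "adoption" | "expansion";

const EXPANSION_THRESHOLD = 3;

// Count distinct developers with at least one event at or after sinceMs.
function activeDeveloperCount(events: UsageEvent[], accountId: string, sinceMs: number): number {
  const seen: { [developerId: string]: true } = Object.create(null);
  for (const e of events) {
    if (e.accountId !== accountId) continue;
    if (e.timestampMs >= sinceMs) seen[e.developerId] = true;
  }
  return Object.keys(seen).length;
}

// One developer is adoption, three is expansion.
function accountStage(activeDevelopers: number): AccountStage {
  if (activeDevelopers >= EXPANSION_THRESHOLD) return "expansion";
  return activeDevelopers >= 1 ? "adoption" : "inactive";
}
```

&lt;p&gt;Run it on a rolling 30-day window and the stage change from adoption to expansion becomes the event that routes an account to sales.&lt;/p&gt;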
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Define your expansion signal as &amp;quot;more people or more projects,&amp;quot; not &amp;quot;more usage.&amp;quot;&lt;/li&gt;
&lt;li&gt;Measure active developer density per account. One is adoption, three is expansion. That&amp;#39;s the threshold for sales engagement.&lt;/li&gt;
&lt;li&gt;Build features that make multi-developer collaboration better than solo use.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;10. Product-led sales — sales is a layer, not a starting point&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://www.elenaverna.com/&quot;&gt;Elena Verna&lt;/a&gt; calls this product-led sales. Developers adopt bottom-up. Buyer personas (directors of engineering, tech leads, CTOs) approve expansion and budget. Sales engages accounts where adoption is already happening, not accounts where it might.&lt;/p&gt;
&lt;p&gt;Twilio&amp;#39;s sales team only engaged accounts after developers were already making API calls. Sales opened with usage data, not feature demos. MongoDB&amp;#39;s enterprise sales targeted companies where Atlas free-tier clusters were already running. Vercel&amp;#39;s enterprise motion focused on companies where multiple developers had already deployed projects. In all three, sales worked because the product had already done the trust-building.&lt;/p&gt;
&lt;p&gt;Define the product adoption threshold that triggers a sales touchpoint. &amp;quot;Three or more active developers in one account&amp;quot; is better than &amp;quot;high engagement.&amp;quot; The threshold has to be queryable and defensible. Sales should know exactly why this account got escalated.&lt;/p&gt;
&lt;p&gt;The other half of this rule: don&amp;#39;t let sales engage accounts below the adoption threshold. Premature outreach damages developer trust faster than anything else you can do. A developer who got a cold sales email two days after signing up will remember it, and they&amp;#39;ll associate the friction with your product. Hold the line on the threshold even when pipeline pressure builds, because the alternative is faster pipeline this quarter and worse retention forever.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Checklist&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Define the product adoption threshold that triggers a sales touchpoint. &amp;quot;Three or more active developers in one account&amp;quot; beats &amp;quot;high engagement.&amp;quot;&lt;/li&gt;
&lt;li&gt;Equip sales with product usage data — which products are in use, how many developers are active, what the growth trend looks like.&lt;/li&gt;
&lt;li&gt;Don&amp;#39;t let sales engage accounts below the adoption threshold. Premature outreach damages developer trust.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;The order matters&lt;/h2&gt;
&lt;p&gt;These 10 steps aren&amp;#39;t independent. They&amp;#39;re a sequence, and each one only works because the previous ones are in place.&lt;/p&gt;
&lt;p&gt;A free tier with utility doesn&amp;#39;t drive growth without short time-to-action. Time-to-action doesn&amp;#39;t matter if your activation metric is undefined. Activation can&amp;#39;t be measured if you&amp;#39;re charging per seat. Per-seat pricing kills the bottom-up adoption that PQLs depend on. PQLs are noise without an expansion signal. And expansion signals don&amp;#39;t matter if sales engages accounts before they fire.&lt;/p&gt;
&lt;p&gt;Companies fail at PLG because they pull one piece out of the sequence and bolt it onto a different motion. Free tier without usage pricing. Usage pricing without an activation metric. Activation metric without bottom-up adoption. The 10 steps above are the same set of decisions Stripe, Vercel, Twilio, Algolia, and Datadog all made, in roughly the same order.&lt;/p&gt;
&lt;p&gt;If you&amp;#39;re starting a developer product today, the order to follow is the order in this list. If you&amp;#39;re adopting PLG inside a company that already has a sales motion, the order matters even more, because the friction will be highest at the steps where your existing motion contradicts the playbook. Find the contradiction and fix the contradiction. Don&amp;#39;t try to run both at once.&lt;/p&gt;
&lt;p&gt;PLG is the growth half of building a durable developer company. The other half is &lt;a href=&quot;https://lukestahl.io/blog/end-of-coding-age-of-building/&quot;&gt;how companies innovate in the AI era&lt;/a&gt;.&lt;/p&gt;
</content:encoded><category>PLG</category><category>Product Strategy</category><category>Developer Tools</category></item><item><title>Building developer tools for agents (not just humans)</title><link>https://lukestahl.io/blog/building-dev-tools-for-agents/</link><guid isPermaLink="true">https://lukestahl.io/blog/building-dev-tools-for-agents/</guid><description>How developer tool companies stay relevant when AI agents are the ones calling APIs, reading docs, and integrating SDKs.</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/building-dev-tools-for-agents/humans-and-agents_png.png&quot; alt=&quot;humans-and-agents.png&quot;&gt;&lt;/p&gt;
&lt;h2&gt;The core problem&lt;/h2&gt;
&lt;p&gt;Most developer tools were built for humans. A developer reads your docs, copies your quickstart, integrates your SDK, and ships. That loop worked for 20 years.&lt;/p&gt;
&lt;p&gt;The loop is changing. AI agents do more of that work now: reading docs, writing integration code, making API calls. Claude Code, Cursor, and Copilot don&amp;#39;t browse your docs the way a human does. They need your product to be machine-readable, composable, and permission-aware in ways that most tools weren&amp;#39;t designed for.&lt;/p&gt;
&lt;p&gt;This doesn&amp;#39;t mean rebuilding from scratch. It means understanding what changes when the primary consumer of your developer product is an AI agent rather than a human, then adapting deliberately.&lt;/p&gt;
&lt;p&gt;The companies that do this well will deepen their moat. The companies that don&amp;#39;t will find themselves irrelevant as agent-native alternatives emerge.&lt;/p&gt;
&lt;h2&gt;1. API design — the foundation&lt;/h2&gt;
&lt;p&gt;If your API isn&amp;#39;t clean, nothing else matters. Agents pattern-match. Inconsistency breaks that pattern-matching immediately. A human developer can work around a poorly designed API by reading between the lines. An agent can&amp;#39;t.&lt;/p&gt;
&lt;h3&gt;Consistency above everything&lt;/h3&gt;
&lt;p&gt;Every design decision that seemed minor when humans were the primary users becomes critical when agents are calling your API thousands of times autonomously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Resource naming:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Inconsistent: breaks agent pattern matching
GET /getUserById
POST /create_project
DELETE /org-member-remove

# Consistent: agents can predict URL structure
GET /users/{id}
POST /projects
DELETE /organizations/{org_id}/members/{user_id}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Rules that matter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Nouns for resources, verbs for actions&lt;/li&gt;
&lt;li&gt;Consistent casing throughout. Pick snake_case or camelCase, never mix&lt;/li&gt;
&lt;li&gt;Predictable URL hierarchy: &lt;code&gt;/resource/{id}/sub-resource/{sub_id}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;HTTP methods used semantically. GET retrieves, POST creates, PUT replaces, PATCH updates, DELETE removes&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Predictable response structures&lt;/h3&gt;
&lt;p&gt;Every response, regardless of endpoint, should follow the same shape. Agents build internal models of what to expect. Surprise response structures break those models.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;// Success
{
  &amp;quot;data&amp;quot;: { ... },
  &amp;quot;meta&amp;quot;: {
    &amp;quot;request_id&amp;quot;: &amp;quot;req_123&amp;quot;,
    &amp;quot;timestamp&amp;quot;: &amp;quot;2026-04-27T10:00:00Z&amp;quot;
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;// Error — every error looks the same
{
  &amp;quot;error&amp;quot;: {
    &amp;quot;code&amp;quot;: &amp;quot;user_not_found&amp;quot;,
    &amp;quot;message&amp;quot;: &amp;quot;No user with ID usr_123 exists&amp;quot;,
    &amp;quot;retryable&amp;quot;: false,
    &amp;quot;docs_url&amp;quot;: &amp;quot;https://docs.example.com/errors/user_not_found&amp;quot;
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Never return different shapes for the same endpoint depending on state. Agents can&amp;#39;t handle conditional response structures reliably.&lt;/p&gt;
&lt;h3&gt;Idempotency&lt;/h3&gt;
&lt;p&gt;Agents retry. Networks fail. Operations get called twice. Your API needs to handle this without creating duplicate records, duplicate charges, or duplicate emails.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-plain&quot;&gt;POST /payments
Idempotency-Key: pay_abc123
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Same key, same result every time, regardless of how many times the agent calls it. The agent doesn&amp;#39;t need to track whether the call succeeded. It retries safely and trusts the server to dedupe.&lt;/p&gt;
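&lt;p&gt;On the server side, the dedupe can be as simple as remembering the first result for each key. This is a sketch with an in-memory store and a hypothetical &lt;code&gt;createPayment&lt;/code&gt; function. A real service would persist the keys and expire them after some window.&lt;/p&gt;

```typescript
interface Payment {
  id: string;
  amountCents: number;
}

// In-memory store keyed by Idempotency-Key. Illustrative only.
const processedPayments: { [key: string]: Payment } = Object.create(null);
let nextPaymentNumber = 1;

function createPayment(idempotencyKey: string, amountCents: number): Payment {
  const existing = processedPayments[idempotencyKey];
  if (existing !== undefined) return existing; // replay: return the first result

  const payment: Payment = { id: "pay_" + nextPaymentNumber, amountCents };
  nextPaymentNumber += 1;
  processedPayments[idempotencyKey] = payment;
  return payment;
}

const firstAttempt = createPayment("pay_abc123", 500);
const retry = createPayment("pay_abc123", 500);
console.log(firstAttempt.id === retry.id); // true: one charge, not two
```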
&lt;h3&gt;Semantic error codes&lt;/h3&gt;
&lt;p&gt;Generic HTTP status codes aren&amp;#39;t enough. An agent receiving a 400 needs to know exactly what went wrong and what to do about it.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &amp;quot;error&amp;quot;: {
    &amp;quot;code&amp;quot;: &amp;quot;rate_limit_exceeded&amp;quot;,
    &amp;quot;message&amp;quot;: &amp;quot;You have exceeded 100 requests per minute&amp;quot;,
    &amp;quot;retryable&amp;quot;: true,
    &amp;quot;retry_after&amp;quot;: 45,
    &amp;quot;docs_url&amp;quot;: &amp;quot;https://docs.example.com/errors/rate_limit_exceeded&amp;quot;
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Error codes should be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Machine-readable strings, not just numbers&lt;/li&gt;
&lt;li&gt;Stable: never change them once published&lt;/li&gt;
&lt;li&gt;Documented with recommended recovery actions&lt;/li&gt;
&lt;li&gt;Categorized: auth errors, validation errors, rate limits, server errors&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Pagination that works at scale&lt;/h3&gt;
&lt;p&gt;Agents process large datasets. Cursor-based pagination is better than offset because it&amp;#39;s stable. Inserting new records doesn&amp;#39;t shift results mid-traversal.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &amp;quot;data&amp;quot;: [...],
  &amp;quot;pagination&amp;quot;: {
    &amp;quot;next_cursor&amp;quot;: &amp;quot;cur_abc123&amp;quot;,
    &amp;quot;has_more&amp;quot;: true
  }
}
&lt;/code&gt;&lt;/pre&gt;
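&lt;p&gt;To see why the cursor is stable, here is a toy server-side implementation over an in-memory array. The cursor format (&lt;code&gt;cur_&lt;/code&gt; plus the last ID returned) and the &lt;code&gt;listItems&lt;/code&gt; function are assumptions for the sketch. Real cursors are usually opaque.&lt;/p&gt;

```typescript
interface RecordItem {
  id: number;
  label: string;
}

interface ItemPage {
  data: RecordItem[];
  pagination: { next_cursor: string | null; has_more: boolean };
}

// Hypothetical cursor format: "cur_" followed by the last id returned.
function listItems(all: RecordItem[], cursor: string | null, limit: number): ItemPage {
  const afterId = cursor === null ? Number.NEGATIVE_INFINITY : Number(cursor.slice(4));
  const remaining = all
    .filter((item) => item.id > afterId)
    .sort((a, b) => a.id - b.id);
  const data = remaining.slice(0, limit);
  const has_more = remaining.length > limit;
  const last = data[data.length - 1];
  const next_cursor = !has_more || last === undefined ? null : "cur_" + last.id;
  return { data, pagination: { next_cursor, has_more } };
}

const records: RecordItem[] = [10, 20, 30, 40].map((id) => ({ id, label: "item " + id }));
const firstPage = listItems(records, null, 2);

// A record lands in the already-read range while the agent is mid-traversal.
records.push({ id: 15, label: "inserted mid-traversal" });

const secondPage = listItems(records, firstPage.pagination.next_cursor, 2);
console.log(secondPage.data.map((item) => item.id)); // [ 30, 40 ]
```

&lt;p&gt;With offset pagination, the same insert would have pushed item 20 onto the second page and the agent would have processed it twice.&lt;/p&gt;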
&lt;h3&gt;Rate limiting with transparency&lt;/h3&gt;
&lt;p&gt;Agents are fast and don&amp;#39;t naturally pace themselves. Return rate limit state in every response so agents can self-regulate before hitting limits.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-plain&quot;&gt;X-RateLimit-Limit: 100
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1714215600
Retry-After: 45
&lt;/code&gt;&lt;/pre&gt;
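&lt;p&gt;An agent can turn those headers into a pacing decision. A minimal sketch, assuming &lt;code&gt;X-RateLimit-Reset&lt;/code&gt; is a Unix timestamp in seconds. The function names are mine, not part of any SDK.&lt;/p&gt;

```typescript
interface RateLimitState {
  limit: number;
  remaining: number;
  resetEpochSeconds: number;
}

// Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
function parseRateLimit(headers: { [name: string]: string }): RateLimitState {
  return {
    limit: Number(headers["X-RateLimit-Limit"]),
    remaining: Number(headers["X-RateLimit-Remaining"]),
    resetEpochSeconds: Number(headers["X-RateLimit-Reset"]),
  };
}

// Spread the remaining budget evenly across the time left in the window.
// With no budget left, wait for the window to reset.
function delayBeforeNextRequestMs(state: RateLimitState, nowMs: number): number {
  const msUntilReset = Math.max(0, state.resetEpochSeconds * 1000 - nowMs);
  return state.remaining > 0 ? Math.floor(msUntilReset / state.remaining) : msUntilReset;
}
```

&lt;p&gt;With 47 requests left and 47 seconds until reset, this paces the agent at one request per second instead of letting it burn the budget and hit a 429.&lt;/p&gt;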
&lt;h3&gt;Webhooks over polling&lt;/h3&gt;
&lt;p&gt;Agents shouldn&amp;#39;t poll. For operations that take time, use webhooks and give the agent a job ID to reference when the webhook fires.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;// Immediate response
{
  &amp;quot;job_id&amp;quot;: &amp;quot;job_abc123&amp;quot;,
  &amp;quot;status&amp;quot;: &amp;quot;processing&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;// Webhook when complete
{
  &amp;quot;event&amp;quot;: &amp;quot;job.completed&amp;quot;,
  &amp;quot;job_id&amp;quot;: &amp;quot;job_abc123&amp;quot;,
  &amp;quot;result&amp;quot;: { ... },
  &amp;quot;timestamp&amp;quot;: &amp;quot;2026-04-27T10:05:00Z&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;
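&lt;p&gt;On the receiving end, the agent only needs a map from job ID to whatever should happen next. A rough sketch with hypothetical &lt;code&gt;trackJob&lt;/code&gt; and &lt;code&gt;handleWebhook&lt;/code&gt; helpers. Signature verification and HTTP plumbing are left out.&lt;/p&gt;

```typescript
interface JobEvent {
  event: string;
  job_id: string;
  result?: unknown;
  timestamp: string;
}

type JobCallback = (result: unknown) => void;

// Jobs the agent is still waiting on, keyed by job_id.
const pendingJobs: { [jobId: string]: JobCallback } = Object.create(null);

function trackJob(jobId: string, onDone: JobCallback): void {
  pendingJobs[jobId] = onDone;
}

// Returns true when the event completed a tracked job.
function handleWebhook(e: JobEvent): boolean {
  const onDone = pendingJobs[e.job_id];
  if (e.event !== "job.completed" || onDone === undefined) return false;
  delete pendingJobs[e.job_id]; // a redelivered webhook becomes a no-op
  onDone(e.result);
  return true;
}
```

&lt;p&gt;Removing the job before running the callback means a duplicate delivery is ignored, which matters because webhook senders retry too.&lt;/p&gt;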
&lt;h2&gt;2. MCP — making your product callable&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://modelcontextprotocol.io/&quot;&gt;Model Context Protocol&lt;/a&gt; is the new integration layer between AI coding tools and external services. Without an MCP server, agents have to parse your docs and construct API calls from scratch every time. With one, your product becomes a first-class tool that agents can discover and call through natural language. &lt;a href=&quot;https://lukestahl.io/blog/is-headless-making-a-comeback/&quot;&gt;Headless APIs were already designed for programmatic consumption&lt;/a&gt;; MCP adds the discoverability layer they were missing.&lt;/p&gt;
&lt;p&gt;When &lt;a href=&quot;https://clerk.com/&quot;&gt;Clerk&lt;/a&gt; shipped their MCP server, a developer using Claude Code could say &amp;quot;add authentication to this Next.js app&amp;quot; and Claude called Clerk&amp;#39;s APIs directly. No copy-pasting from docs, no configuration errors, no context switching.&lt;/p&gt;
&lt;h3&gt;What an MCP server does&lt;/h3&gt;
&lt;p&gt;An MCP server wraps your API and exposes it as a set of tools. It handles:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Tool discovery, so the agent asks what your product can do and gets a structured list&lt;/li&gt;
&lt;li&gt;Parameter validation before the API call is made&lt;/li&gt;
&lt;li&gt;Response formatting into something the agent can reason about&lt;/li&gt;
&lt;li&gt;Error handling with actionable messages&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Basic MCP server structure&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;import { Server } from &amp;quot;@modelcontextprotocol/sdk/server/index.js&amp;quot;;
import { StdioServerTransport } from &amp;quot;@modelcontextprotocol/sdk/server/stdio.js&amp;quot;;
import { ListToolsRequestSchema } from &amp;quot;@modelcontextprotocol/sdk/types.js&amp;quot;;

const server = new Server(
  { name: &amp;quot;your-product-mcp&amp;quot;, version: &amp;quot;1.0.0&amp;quot; },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () =&amp;gt; ({
  tools: [
    {
      name: &amp;quot;get_user&amp;quot;,
      description: &amp;quot;Retrieve a user by their ID. Returns the user&amp;#39;s profile including email, name, created date, and subscription status. Use this before performing any action that requires knowing the user&amp;#39;s current state. Returns null if the user does not exist — check for this before proceeding.&amp;quot;,
      inputSchema: {
        type: &amp;quot;object&amp;quot;,
        properties: {
          user_id: {
            type: &amp;quot;string&amp;quot;,
            description: &amp;quot;The unique identifier of the user (e.g., usr_abc123)&amp;quot;
          }
        },
        required: [&amp;quot;user_id&amp;quot;]
      }
    }
  ]
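}));

// Serve over stdio so a local MCP client can launch and talk to this process.
const transport = new StdioServerTransport();
await server.connect(transport);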
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Tool description is everything&lt;/h3&gt;
&lt;p&gt;The description field is where most MCP implementations fail. Agents use it to decide when and how to call the tool. A weak description means the agent calls the wrong tool at the wrong time or passes the wrong parameters. This is one of the &lt;a href=&quot;https://lukestahl.io/blog/stop-upgrading-your-model-fix-your-harness/&quot;&gt;harness-level mistakes that breaks agents before any model upgrade can help&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bad:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-plain&quot;&gt;&amp;quot;Gets a user&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Good:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-plain&quot;&gt;&amp;quot;Retrieve a user by their ID. Returns the user&amp;#39;s profile including email, name,
created date, and current subscription status. Use this before performing any
action that requires knowing the user&amp;#39;s current state. Returns null if the user
does not exist — check for this before proceeding.&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The description should tell the agent what the tool does, when to use it, what it returns, and edge cases to watch for.&lt;/p&gt;
&lt;h3&gt;What to expose in your MCP server&lt;/h3&gt;
&lt;p&gt;Not everything in your API needs to be an MCP tool. Start with the actions developers perform most often and the ones that are most error-prone when done manually. Auth flows, resource creation, and configuration are high value. Internal admin operations, bulk data exports, and rarely used endpoints can wait.&lt;/p&gt;
&lt;h2&gt;3. Documentation — built for machines and humans&lt;/h2&gt;
&lt;p&gt;Documentation in the agentic era has to serve two audiences. A human developer evaluating your product needs narrative, context, and examples that tell a story. An AI agent needs structured, machine-readable specifications it can parse and act on.&lt;/p&gt;
&lt;p&gt;Most developer tool docs were built for humans only. That worked when humans were the ones doing the integration. It doesn&amp;#39;t work anymore.&lt;/p&gt;
&lt;h3&gt;OpenAPI spec — the machine-readable foundation&lt;/h3&gt;
&lt;p&gt;A well-maintained &lt;a href=&quot;https://www.openapis.org/&quot;&gt;OpenAPI spec&lt;/a&gt; is the minimum requirement. Agents use these to understand your API surface without reading prose.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;paths:
  /users/{id}:
    get:
      summary: Retrieve a user by ID
      description: &amp;gt;
        Returns a single user object. Use this endpoint when you need to look up
        a specific user&amp;#39;s details before performing an action. Returns 404 if the
        user does not exist — always handle this case before proceeding.
      parameters:
        - name: id
          in: path
          required: true
          description: The unique identifier of the user
          schema:
            type: string
            example: usr_abc123
      responses:
        &amp;#39;200&amp;#39;:
          description: User found
          content:
            application/json:
              schema:
                $ref: &amp;#39;#/components/schemas/User&amp;#39;
              example:
                id: usr_abc123
                email: luke@example.com
                created_at: &amp;quot;2026-01-01T00:00:00Z&amp;quot;
        &amp;#39;404&amp;#39;:
          description: User not found
          content:
            application/json:
              schema:
                $ref: &amp;#39;#/components/schemas/Error&amp;#39;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Critical requirements:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Every endpoint documented with description, parameters, request body, and response schemas&lt;/li&gt;
&lt;li&gt;Examples on every parameter and response&lt;/li&gt;
&lt;li&gt;Error codes enumerated with meanings, not just HTTP status codes&lt;/li&gt;
&lt;li&gt;Keep the spec in sync with the actual API. Stale specs break agents and erode developer trust&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Copy-to-install examples&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://ui.shadcn.com/&quot;&gt;Shadcn-style copy-to-install&lt;/a&gt; patterns have become the standard for a reason. Developers and agents both gravitate toward examples they can paste directly and have working in under a minute. Every quickstart should have this pattern.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Install
npm install @your-product/sdk

# Configure
export YOUR_PRODUCT_API_KEY=sk_live_abc123

# First API call
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;import { YourProduct } from &amp;#39;@your-product/sdk&amp;#39;

const client = new YourProduct({ apiKey: process.env.YOUR_PRODUCT_API_KEY })

const user = await client.users.get(&amp;#39;usr_abc123&amp;#39;)
console.log(user.email)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;One install command. One config step. One working call. If your quickstart takes more than that, you&amp;#39;re losing developers and agents at the starting line.&lt;/p&gt;
&lt;h3&gt;Keeping docs current&lt;/h3&gt;
&lt;p&gt;Stale documentation is worse than no documentation for agents. A human notices when docs don&amp;#39;t match behavior. An agent acts on what the docs say and produces bugs that are hard to trace.&lt;/p&gt;
&lt;p&gt;This means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Docs as code, with documentation living alongside the codebase and updated in the same PR&lt;/li&gt;
&lt;li&gt;Automated API reference generation from the OpenAPI spec, so nobody updates API references by hand&lt;/li&gt;
&lt;li&gt;Changelogs that call out breaking changes explicitly with migration paths&lt;/li&gt;
&lt;li&gt;Versioned documentation when you have multiple API versions active&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Error documentation that enables recovery&lt;/h3&gt;
&lt;p&gt;Most products document happy paths well and error paths poorly. Agents hit error paths constantly. Every error code should have a dedicated doc page that explains what caused it and exactly how to recover.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-plain&quot;&gt;Error: rate_limit_exceeded
Cause: You have made more than 100 requests per minute
Recovery: Wait for the number of seconds given in retry_after (also sent as the Retry-After header), then retry the request
Prevention: Implement exponential backoff and respect the X-RateLimit-Remaining header
&lt;/code&gt;&lt;/pre&gt;
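&lt;p&gt;When the error body carries &lt;code&gt;retryable&lt;/code&gt; and &lt;code&gt;retry_after&lt;/code&gt;, the recovery logic on the agent side gets short. A sketch of one way to do it. The base delay and cap are assumed values, and a production version would add random jitter.&lt;/p&gt;

```typescript
interface ApiError {
  code: string;
  retryable: boolean;
  retry_after?: number; // seconds, when the server provides it
}

const BASE_DELAY_MS = 500;  // assumed starting delay
const MAX_DELAY_MS = 30000; // assumed cap

// Returns how long to wait before the next attempt, or null to stop retrying.
// attempt is zero-based: 0 for the first retry.
function retryDelayMs(error: ApiError, attempt: number): number | null {
  if (!error.retryable) return null;
  // Prefer the server's own hint over a guess.
  if (error.retry_after !== undefined) return error.retry_after * 1000;
  // Otherwise fall back to exponential backoff with a cap.
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
}

console.log(retryDelayMs({ code: "rate_limit_exceeded", retryable: true, retry_after: 45 }, 0)); // 45000
console.log(retryDelayMs({ code: "user_not_found", retryable: false }, 0)); // null
```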
&lt;h2&gt;4. Developer experience — reducing friction at every step&lt;/h2&gt;
&lt;p&gt;The best API in the world loses to a worse API with better DX. This was true before AI agents and it&amp;#39;s still true now. The difference is that agents amplify both good and bad DX. They&amp;#39;ll use a well-designed SDK flawlessly and get stuck in the same confusing flows that trip up humans.&lt;/p&gt;
&lt;h3&gt;SDKs that match how developers work&lt;/h3&gt;
&lt;p&gt;Generate your SDKs from your OpenAPI spec. Don&amp;#39;t handwrite them. Handwritten SDKs drift from the API and introduce inconsistencies. Generated SDKs stay in sync automatically.&lt;/p&gt;
&lt;p&gt;A well-designed SDK should:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Match the structure of the API without adding abstraction that obscures what&amp;#39;s happening&lt;/li&gt;
&lt;li&gt;Include TypeScript types for everything&lt;/li&gt;
&lt;li&gt;Surface errors in a way that makes them easy to handle&lt;/li&gt;
&lt;li&gt;Work with the frameworks your customers actually use: Next.js, Express, FastAPI, not just vanilla HTTP&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Framework-specific quickstarts&lt;/h3&gt;
&lt;p&gt;A generic quickstart is less useful than a Next.js quickstart, a FastAPI quickstart, and a Rails quickstart. Developers work within frameworks, not in the abstract. The faster you can get someone from install to working implementation in their stack, the more likely they adopt.&lt;/p&gt;
&lt;p&gt;Stripe does this well: separate quickstarts for Next.js, Rails, Flask, and more. Each one is tailored to how that framework handles payments, not a generic HTTP example with a note to adapt it.&lt;/p&gt;
&lt;h3&gt;Interactive examples and playgrounds&lt;/h3&gt;
&lt;p&gt;API playgrounds where developers can make calls without setting up an account lower the evaluation barrier. Agents can also use these to test calls before embedding them in code.&lt;/p&gt;
&lt;h2&gt;5. Authentication and permissions for agents&lt;/h2&gt;
&lt;p&gt;This is where most developer tools are least prepared for the agentic era. Human auth is designed for sessions: you log in, you stay logged in. Agent auth needs to be granular, short-lived, auditable, and revocable per task.&lt;/p&gt;
&lt;h3&gt;Scoped API keys&lt;/h3&gt;
&lt;p&gt;Never give agents a master API key. Give them a key scoped to exactly what they need for a specific task.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-plain&quot;&gt;// Too broad — agent can do anything
API_KEY: sk_live_abc123 (full access)

// Better — agent can only read users and create orders
API_KEY: sk_live_xyz789 (scope: users:read, orders:write)
&lt;/code&gt;&lt;/pre&gt;
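&lt;p&gt;Enforcing scopes is a small check at the top of every handler. A minimal sketch, assuming scopes are plain strings in the &lt;code&gt;resource:action&lt;/code&gt; format shown above. The helper names are hypothetical.&lt;/p&gt;

```typescript
// Accepts "users:read, orders:write" or "orders:read orders:write".
function parseScopes(scope: string): string[] {
  return scope.split(/[\s,]+/).filter((s) => s.length > 0);
}

// True only when every required scope was granted to the key.
function hasScopes(granted: string[], required: string[]): boolean {
  return required.every((scope) => granted.indexOf(scope) !== -1);
}

const grantedScopes = parseScopes("users:read, orders:write");
console.log(hasScopes(grantedScopes, ["users:read"]));  // true
console.log(hasScopes(grantedScopes, ["users:write"])); // false
```

&lt;p&gt;Deny by default. If a scope is missing, return a semantic error that names the missing scope so the agent knows what to ask for.&lt;/p&gt;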
&lt;h3&gt;OAuth for agent delegation&lt;/h3&gt;
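&lt;p&gt;When an agent acts on behalf of a human user, OAuth is the right pattern. The agent gets a token scoped to that user&amp;#39;s permissions. It can&amp;#39;t do anything the user couldn&amp;#39;t do themselves. One caveat on the request below: the &lt;code&gt;client_credentials&lt;/code&gt; grant authenticates the agent itself, with no user in the loop. For true delegation, the agent would instead exchange an authorization code that the user approved.&lt;/p&gt;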
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;POST /oauth/token
{
  &amp;quot;grant_type&amp;quot;: &amp;quot;client_credentials&amp;quot;,
  &amp;quot;client_id&amp;quot;: &amp;quot;agent_abc123&amp;quot;,
  &amp;quot;client_secret&amp;quot;: &amp;quot;sk_agent_xyz789&amp;quot;,
  &amp;quot;scope&amp;quot;: &amp;quot;orders:read orders:write&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The token comes back scoped to exactly what the agent requested:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &amp;quot;access_token&amp;quot;: &amp;quot;tok_abc123&amp;quot;,
  &amp;quot;scope&amp;quot;: &amp;quot;orders:read orders:write&amp;quot;,
  &amp;quot;expires_in&amp;quot;: 900
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Short-lived tokens&lt;/h3&gt;
&lt;p&gt;Agent tokens should expire quickly. A human session might last 30 days. An agent token for a specific task should last minutes to hours. This limits the blast radius if a token is compromised.&lt;/p&gt;
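&lt;p&gt;As a rough sketch, expiry can be enforced by stamping each token with an absolute expiry time and checking it on every request. The &lt;code&gt;AgentToken&lt;/code&gt; shape and both helpers are hypothetical; the 900-second lifetime matches the &lt;code&gt;expires_in&lt;/code&gt; value in the token response above.&lt;/p&gt;

```typescript
// Hypothetical token shape; field names are assumptions for illustration.
type AgentToken = { value: string; scopes: string[]; expiresAt: number };

// Issue a token that expires ttlSeconds after `now` (milliseconds since epoch).
function issueToken(scopes: string[], ttlSeconds: number, now: number): AgentToken {
  return { value: "tok_example", scopes, expiresAt: now + ttlSeconds * 1000 };
}

// A token is usable only while the current time is before its expiry.
function isExpired(token: AgentToken, now: number): boolean {
  return now >= token.expiresAt;
}

const issuedAt = 1700000000000;
const taskToken = issueToken(["orders:read"], 900, issuedAt);

console.log(isExpired(taskToken, issuedAt + 60000));  // false, one minute in
console.log(isExpired(taskToken, issuedAt + 900000)); // true, fifteen minutes in
```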
&lt;h3&gt;Audit logging&lt;/h3&gt;
&lt;p&gt;Every agent action should be logged with enough context to answer: who authorized this, what did the agent do, when, and why. Enterprise customers will not buy without it.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &amp;quot;event&amp;quot;: &amp;quot;order.created&amp;quot;,
  &amp;quot;actor&amp;quot;: {
    &amp;quot;type&amp;quot;: &amp;quot;agent&amp;quot;,
    &amp;quot;id&amp;quot;: &amp;quot;agent_abc123&amp;quot;,
    &amp;quot;acting_for&amp;quot;: &amp;quot;usr_xyz789&amp;quot;
  },
  &amp;quot;resource&amp;quot;: {
    &amp;quot;type&amp;quot;: &amp;quot;order&amp;quot;,
    &amp;quot;id&amp;quot;: &amp;quot;ord_def456&amp;quot;
  },
  &amp;quot;timestamp&amp;quot;: &amp;quot;2026-04-27T10:00:00Z&amp;quot;,
  &amp;quot;request_id&amp;quot;: &amp;quot;req_ghi789&amp;quot;
}
&lt;/code&gt;&lt;/pre&gt;
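&lt;p&gt;One way to keep every log entry in that shape is to build it through a single typed helper. This is a sketch under assumptions: the field names mirror the JSON above, while &lt;code&gt;buildAuditEvent&lt;/code&gt; and its parameters are hypothetical.&lt;/p&gt;

```typescript
// Field names mirror the JSON example above; the helper itself is hypothetical.
type AuditEvent = {
  event: string;
  actor: { type: "agent" | "user"; id: string; acting_for?: string };
  resource: { type: string; id: string };
  timestamp: string;
  request_id: string;
};

// Builds one audit entry for an action an agent took on behalf of a user.
function buildAuditEvent(
  eventName: string,
  agentId: string,
  actingFor: string,
  resource: { type: string; id: string },
  requestId: string,
  at: Date
): AuditEvent {
  return {
    event: eventName,
    actor: { type: "agent", id: agentId, acting_for: actingFor },
    resource,
    timestamp: at.toISOString(),
    request_id: requestId,
  };
}

const auditEntry = buildAuditEvent(
  "order.created",
  "agent_abc123",
  "usr_xyz789",
  { type: "order", id: "ord_def456" },
  "req_ghi789",
  new Date("2026-04-27T10:00:00Z")
);

console.log(JSON.stringify(auditEntry, null, 2));
```

&lt;p&gt;Routing every write through one builder makes it harder to ship an entry that is missing the actor or the request ID.&lt;/p&gt;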
&lt;h2&gt;6. Agent skills — teaching agents how to use your product&lt;/h2&gt;
&lt;p&gt;Beyond MCP, some platforms are building agent skills: structured context that AI coding tools load before working with your product. Where MCP lets agents call your API, skills teach agents how to use your API well.&lt;/p&gt;
&lt;p&gt;Clerk&amp;#39;s agent skills embed authentication knowledge directly into Claude Code and Cursor. When a developer adds the Clerk skill, the AI already knows the full API surface, common patterns, which approach to use for which use case, and current examples that match the latest API version.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-markdown&quot;&gt;# clerk-auth-skill.md

## When to use Clerk
Use Clerk when the project needs authentication and the framework is
Next.js, Remix, or Astro. Do not use Clerk for server-only APIs
without a frontend.

## Setup pattern
Always use the App Router integration for Next.js 13+.
Never use the Pages Router integration for new projects.

```tsx
// middleware.ts — always create this first
import { clerkMiddleware } from &amp;#39;@clerk/nextjs/server&amp;#39;
export default clerkMiddleware()
export const config = {
  matcher: [&amp;#39;/((?!.*\\..*|_next).*)&amp;#39;, &amp;#39;/&amp;#39;, &amp;#39;/(api|trpc)(.*)&amp;#39;],
}
```

## Common mistakes
- Don&amp;#39;t wrap the entire app in ClerkProvider if you only need auth
  on specific routes
- Don&amp;#39;t use useAuth() in server components. Use auth() instead
- Don&amp;#39;t store Clerk user IDs as your primary user identifier.
  Sync to your own database on signup
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is what agents need that documentation alone doesn&amp;#39;t provide: opinionated guidance on when to use what, which patterns to avoid, and decision logic that prevents the mistakes developers hit most often. Documentation tells an agent what your API can do. A skill tells it what your API should do in a specific context.&lt;/p&gt;
&lt;p&gt;Building a skill means packaging your docs, examples, and best practices in a format the agent can load as context. Think of it as the difference between handing an agent your API reference and telling it which endpoints to call first, which patterns to avoid, and what to do when the first attempt fails.&lt;/p&gt;
&lt;p&gt;A well-built skill should include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The most common integration patterns with working code&lt;/li&gt;
&lt;li&gt;Decision trees for when to use component A versus component B&lt;/li&gt;
&lt;li&gt;Known gotchas and how to avoid them&lt;/li&gt;
&lt;li&gt;Current examples that match your latest API version, not examples from two versions ago&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;7. Discoverability — being found in the agentic era&lt;/h2&gt;
&lt;p&gt;Developers used to discover tools through Google, Hacker News, and word of mouth. Agents discover tools through MCP registries, tool directories, and the context that gets loaded into their session. This is a new kind of SEO and most developer tool companies aren&amp;#39;t thinking about it yet.&lt;/p&gt;
&lt;h3&gt;MCP registries&lt;/h3&gt;
&lt;p&gt;As MCP adoption grows, registries of available MCP servers are emerging. Getting your MCP server listed and well-described in these registries is the new version of ranking on page one for your category keyword.&lt;/p&gt;
&lt;h3&gt;LLM-optimized content&lt;/h3&gt;
&lt;p&gt;Developers ask ChatGPT, Perplexity, and Claude for tool recommendations now. The content those tools cite is specific and technical and answers concrete questions with working examples. Thin marketing content doesn&amp;#39;t get cited. Deep technical content that helps developers solve problems does.&lt;/p&gt;
&lt;p&gt;Write content that answers the questions developers are asking:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;quot;How do I add authentication to a Next.js App Router project&amp;quot;&lt;/li&gt;
&lt;li&gt;&amp;quot;What&amp;#39;s the difference between JWT and session-based auth&amp;quot;&lt;/li&gt;
&lt;li&gt;&amp;quot;How do I handle multi-tenant authorization for an enterprise SaaS&amp;quot;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Being part of the agent&amp;#39;s default context&lt;/h3&gt;
&lt;p&gt;The most powerful discoverability play is becoming the answer agents give when developers ask a question. When a developer asks Claude or ChatGPT &amp;quot;how do I add payments to a Next.js app,&amp;quot; the response cites whoever wrote the most specific, implementation-level content for that use case. That&amp;#39;s not marketing. That&amp;#39;s product placement at the infrastructure level.&lt;/p&gt;
&lt;h2&gt;What separates the winners&lt;/h2&gt;
&lt;p&gt;The companies that treat this as a technical checkbox will ship an MCP server and call it done. The companies that treat it as a strategic opportunity will be embedded in the agent&amp;#39;s default context. The first group becomes the API your agent fell back to when the embedded one didn&amp;#39;t fit.&lt;/p&gt;
&lt;h2&gt;Checklist&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;API design&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Consistent RESTful naming and URL patterns&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Predictable response structures across all endpoints&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Idempotency keys on write operations&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Cursor-based pagination&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Semantic error codes with recovery guidance&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Rate limit headers on every response&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Webhooks for async operations&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;MCP&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; MCP server wrapping core API surface&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Tool descriptions written for agent reasoning&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Parameter validation before API calls&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Listed in relevant MCP registries&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Published OpenAPI spec kept in sync&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Copy-to-install quickstarts for major frameworks&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Error codes documented with recovery steps&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Docs as code, updated in the same PR as the API&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Auth and permissions&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Scoped API keys (least privilege per task)&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; OAuth delegation for acting on behalf of users&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Short-lived tokens for agent sessions&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Full audit logging with actor context&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Agent skills&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Skill packaging your API knowledge for AI coding tools&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Decision trees for common integration choices&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Current examples matching latest API version&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Discoverability&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; MCP registry listings&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Technical content answering concrete developer questions&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&quot;&quot; type=&quot;checkbox&quot;&gt; Integration into AI coding tool ecosystems&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><category>Developer Tools</category><category>AI Agents</category><category>API Design</category><category>MCP</category><category>Developer Experience</category></item><item><title>Stop upgrading your model. Fix your harness.</title><link>https://lukestahl.io/blog/stop-upgrading-your-model-fix-your-harness/</link><guid isPermaLink="true">https://lukestahl.io/blog/stop-upgrading-your-model-fix-your-harness/</guid><description>What an agent harness does, why harness design beats model upgrades for production agents, and where Claude and OpenAI differ on memory and reasoning control.</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/stop-upgrading-your-model-fix-your-harness/Harness-agent-hero_png.png&quot; alt=&quot;Harness-agent-hero.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;The word &amp;quot;harness&amp;quot; keeps showing up in AI agent conversations. Almost none of them stop to define it. If you&amp;#39;re building with agents, it&amp;#39;s worth pinning down, because once you understand what a harness is, the model selection debates start to look like the wrong argument entirely.&lt;/p&gt;
&lt;h2&gt;The gap between an API call and a working agent&lt;/h2&gt;
&lt;p&gt;The clearest way to understand what a harness is: compare making a &lt;a href=&quot;https://docs.anthropic.com/en/api/getting-started&quot;&gt;Claude API&lt;/a&gt; call directly versus using &lt;a href=&quot;https://claude.ai/code&quot;&gt;Claude Code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Both give you access to the same underlying model. The Claude API gives you a stateless text exchange: send a prompt, get a response, the model forgets everything. Claude Code is something very different. It has access to your filesystem, can run bash commands, reads and writes files across sessions, maintains persistent memory through &lt;a href=&quot;https://docs.anthropic.com/en/docs/claude-code/memory&quot;&gt;CLAUDE.md&lt;/a&gt; files, runs subagents, and enforces stop conditions that control when it acts versus when it asks.&lt;/p&gt;
&lt;p&gt;That gap is the harness. It&amp;#39;s the layer of code that wraps a model and turns it into an agent. The same model accessed through the raw API would forget the previous turn the moment the response finished. The harness is what creates continuity.&lt;/p&gt;
&lt;h2&gt;What a harness actually gives you&lt;/h2&gt;
&lt;p&gt;When you&amp;#39;re evaluating or building an agentic system, these are the layers the harness has to own:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/stop-upgrading-your-model-fix-your-harness/anatomy-of-a-harness_png.png&quot; alt=&quot;anatomy-of-a-harness.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Memory across sessions.&lt;/strong&gt; A model call has no memory by default. The harness decides what context gets injected at the start of every session, what gets written back at the end, and how accumulated knowledge is stored between runs. In Claude Code, this is file-based. CLAUDE.md files sit in your project directory, readable and editable by you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tools.&lt;/strong&gt; Not in the abstract sense, but the specific operations the agent can execute: reading files, running shell commands, calling APIs, searching the web, writing to a database. The harness defines the tool surface area. A narrow tool surface produces a more predictable agent; a wide one produces a more capable one. Getting that tradeoff right is a harness decision, not a model decision.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;State that survives across a task.&lt;/strong&gt; Complex tasks don&amp;#39;t fit in a single context window. The harness has to manage what gets preserved when a session ends, what gets summarized and compressed, and what gets discarded. This is one of the harder engineering problems in agentic systems, and most implementations either ignore it or handle it poorly. The symptom is an agent that loses the thread midway through a long task.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Guardrails that actually run.&lt;/strong&gt; When does the agent stop and ask a human? What happens when it detects it&amp;#39;s looping? Which tools require explicit approval before the agent uses them? These policies can be written into prompts, but prompt-level guardrails drift. The harness enforces them deterministically, at the architecture layer, not through instructions the model can reason around.&lt;/p&gt;
&lt;h2&gt;Why harness design beats model selection&lt;/h2&gt;
&lt;p&gt;The default assumption when an agent underperforms is to upgrade the model. That&amp;#39;s almost never the right first move.&lt;/p&gt;
&lt;p&gt;Every tool in this space pitches some version of the same thing: better orchestration, smarter routing, more specialized agents. The problem is almost always scaffolding quality: how context is managed, how memory passes between agents, whether the tool surface fits the task. Adding more layers doesn&amp;#39;t fix that. I keep a running list of what I&amp;#39;ve tried on my &lt;a href=&quot;https://lukestahl.io/tools/&quot;&gt;tools page&lt;/a&gt; if you want to see what&amp;#39;s worth looking at.&lt;/p&gt;
&lt;p&gt;Before switching models, check these four things.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;System prompt length.&lt;/strong&gt; A 450-line system prompt produces measurably worse behavior than a 100-line one with real examples. Vague instructions at scale fragment attention.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;// Too long
You are a helpful assistant that assists with many tasks. You should always
be polite and professional. When answering questions, consider all angles.
Make sure to be thorough but also concise. Never make assumptions. Always
ask for clarification when uncertain. Format responses clearly...
[400 more lines of vague instructions]

// Tight
You are a code reviewer. Flag bugs, security issues, and style violations.
Return JSON: {&amp;quot;issues&amp;quot;: [{&amp;quot;line&amp;quot;: number, &amp;quot;type&amp;quot;: string, &amp;quot;message&amp;quot;: string}]}
Example: {&amp;quot;issues&amp;quot;: [{&amp;quot;line&amp;quot;: 42, &amp;quot;type&amp;quot;: &amp;quot;security&amp;quot;, &amp;quot;message&amp;quot;: &amp;quot;SQL injection risk&amp;quot;}]}
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;A short prompt with a concrete example outperforms a long one with descriptive rules.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Context management.&lt;/strong&gt; Without compaction, long tasks degrade as older turns crowd out what the agent needs to act on now.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;// Unmanaged
messages.push({ role: &amp;quot;assistant&amp;quot;, content: fullResponse });

// Managed
if (tokenCount(messages) &amp;gt; 80000) {
  messages = [summarizeOlderTurns(messages.slice(0, 10)), ...messages.slice(10)];
}
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;Most agent failures on long tasks aren&amp;#39;t model failures. They&amp;#39;re context failures.&lt;/p&gt;
&lt;/blockquote&gt;
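&lt;p&gt;A self-contained version of the managed case might look like the sketch below. It is one possible approach, not a reference implementation: the four-characters-per-token estimate and the placeholder &lt;code&gt;summarize&lt;/code&gt; function are assumptions, and unlike the snippet above it keeps the most recent turns verbatim rather than slicing a fixed number off the front.&lt;/p&gt;

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Rough stand-in for a tokenizer: about four characters per token (an assumption).
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Placeholder summarizer; a real harness would ask a model to write this.
function summarize(older: Message[]): Message {
  return { role: "user", content: `Summary of ${older.length} earlier turns.` };
}

// Keep the most recent turns verbatim and fold everything older into one summary.
function compact(messages: Message[], maxTokens: number, keepRecent: number): Message[] {
  if (maxTokens >= estimateTokens(messages)) return messages;
  if (keepRecent >= messages.length) return messages;
  const cut = messages.length - keepRecent;
  return [summarize(messages.slice(0, cut)), ...messages.slice(cut)];
}

const turns: Message[] = [
  { role: "user", content: "a".repeat(400) },
  { role: "assistant", content: "b".repeat(400) },
  { role: "user", content: "c".repeat(400) },
  { role: "assistant", content: "d".repeat(40) },
];

const compacted = compact(turns, 200, 2);
console.log(compacted.length);     // 3
console.log(compacted[0].content); // Summary of 2 earlier turns.
```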
&lt;p&gt;&lt;strong&gt;Tool descriptions.&lt;/strong&gt; The model decides whether to use a tool based on its description. Vague descriptions produce inconsistent tool use.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;// Vague
{ name: &amp;quot;bash&amp;quot;, description: &amp;quot;Runs bash commands.&amp;quot; }

// Useful
{ name: &amp;quot;bash&amp;quot;, description: &amp;quot;Runs shell commands. Use to execute code, read file contents, or verify system state. Do not use for string operations you can handle directly.&amp;quot; }
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;The description should tell the model not just what the tool does, but when to reach for it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Stop conditions.&lt;/strong&gt; An agent without stop conditions runs until it hits a hard limit, and output quality degrades the longer it runs unchecked.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;// No stop condition
while (true) {
  response = await agent.run(task);
}

// Explicit stop
for (let attempt = 0; attempt &amp;lt; maxAttempts; attempt++) {
  const response = await agent.run(task);
  if ([&amp;quot;complete&amp;quot;, &amp;quot;failed&amp;quot;].includes(response.status)) break;
  if (detectLoop(response, history)) throw new Error(&amp;quot;Agent is repeating steps&amp;quot;);
  history.push(response); // record the step so later iterations can compare against it
}
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;Loop detection and a max attempt ceiling are the minimum. Most production failures come from agents that didn&amp;#39;t know when to stop.&lt;/p&gt;
&lt;/blockquote&gt;
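&lt;p&gt;Loop detection itself can start out very simple. The sketch below flags a loop when the same tool call with the same input has already run a set number of times. The &lt;code&gt;Step&lt;/code&gt; shape, the &lt;code&gt;limit&lt;/code&gt; parameter, and this version of &lt;code&gt;detectLoop&lt;/code&gt; are assumptions for illustration and use a different signature from the call in the snippet above.&lt;/p&gt;

```typescript
// Hypothetical step shape: which tool ran and with what input.
type Step = { tool: string; input: string };

// Two steps count as the same when both the tool and the input match.
const stepKey = (s: Step): string => `${s.tool}:${s.input}`;

// Flags a loop when the latest step already appeared at least `limit` times.
function detectLoop(latest: Step, previous: Step[], limit: number): boolean {
  const repeats = previous.filter((s) => stepKey(s) === stepKey(latest)).length;
  return repeats >= limit;
}

const pastSteps: Step[] = [
  { tool: "bash", input: "npm test" },
  { tool: "read_file", input: "schema.json" },
  { tool: "bash", input: "npm test" },
];

console.log(detectLoop({ tool: "bash", input: "npm test" }, pastSteps, 2)); // true
console.log(detectLoop({ tool: "bash", input: "ls" }, pastSteps, 2));       // false
```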
&lt;p&gt;If any of those have bad answers, fixing them will do more than upgrading to the next model tier.&lt;/p&gt;
&lt;h2&gt;Claude vs OpenAI: where the harness differences actually show up&lt;/h2&gt;
&lt;p&gt;Once you&amp;#39;ve got the harness fundamentals right, model selection still matters, just less than most people think. Here&amp;#39;s where Claude and OpenAI differ for developers building harnesses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prompt caching.&lt;/strong&gt; Your system prompt (the persistent context, memory injections, tool definitions) is often identical across hundreds or thousands of agent turns. Claude&amp;#39;s &lt;a href=&quot;https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching&quot;&gt;prompt caching&lt;/a&gt; lets you cache that block and pay roughly 10% of the normal input token cost on cache hits.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;import Anthropic from &amp;quot;@anthropic-ai/sdk&amp;quot;;

const client = new Anthropic();

const response = await client.messages.create({
  model: &amp;quot;claude-opus-4-7&amp;quot;,
  max_tokens: 8096,
  system: [
    {
      type: &amp;quot;text&amp;quot;,
      text: persistentContext,
      cache_control: { type: &amp;quot;ephemeral&amp;quot; }
    }
  ],
  messages: [{ role: &amp;quot;user&amp;quot;, content: task }]
});
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;cache_control&lt;/code&gt; marks the system prompt for caching. Subsequent calls with the same prompt pay ~10% of the normal input token cost.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For a harness running hundreds of agent turns per hour, the savings add up fast. &lt;a href=&quot;https://platform.openai.com/docs/api-reference/responses&quot;&gt;OpenAI&amp;#39;s Responses API&lt;/a&gt; achieves something similar with server-side state, though the mechanics and the portability implications differ.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reasoning control.&lt;/strong&gt; Before a model responds, it can think through a problem internally: working out steps, checking assumptions, considering edge cases. That internal process is reasoning, and it costs tokens. Claude exposes it as a configurable parameter. On Opus 4.7, reasoning is adaptive (the model decides how much to think based on the task rather than consuming a fixed budget).&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;const response = await client.messages.create({
  model: &amp;quot;claude-opus-4-7&amp;quot;,
  max_tokens: 16000,
  thinking: {
    type: &amp;quot;adaptive&amp;quot;
  },
  messages: [{ role: &amp;quot;user&amp;quot;, content: task }]
});
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;type: &amp;quot;adaptive&amp;quot;&lt;/code&gt; tells Claude to decide for itself how much reasoning the task needs. You&amp;#39;re not paying for max reasoning on every turn.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude also exposes effort levels (&lt;code&gt;low&lt;/code&gt;, &lt;code&gt;medium&lt;/code&gt;, &lt;code&gt;high&lt;/code&gt;, &lt;code&gt;xhigh&lt;/code&gt;, and &lt;code&gt;max&lt;/code&gt;) as a coarser control knob. Opus 4.7 added &lt;code&gt;xhigh&lt;/code&gt; between &lt;code&gt;high&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;. This matters for harness design because you can tune reasoning intensity per task type: lighter effort for routine operations, heavier for complex debugging or multi-file refactors, rather than running everything at the same cost.&lt;/p&gt;
&lt;p&gt;OpenAI routes reasoning internally and doesn&amp;#39;t expose direct control over it. With Anthropic, you set effort per task. With OpenAI, the model decides for itself.&lt;/p&gt;
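&lt;p&gt;In a harness, tuning effort per task type can be as small as a lookup table. The sketch below uses the effort levels named above; the task categories and the &lt;code&gt;effortFor&lt;/code&gt; helper are hypothetical, and how the chosen value is passed to the API is deliberately left out.&lt;/p&gt;

```typescript
// Effort levels named in the text above; the task categories are hypothetical.
type Effort = "low" | "medium" | "high" | "xhigh" | "max";
type TaskType = "format" | "lookup" | "debug" | "refactor";

// One place to decide how much reasoning each kind of task deserves.
const effortByTask: { [K in TaskType]: Effort } = {
  format: "low",
  lookup: "low",
  debug: "high",
  refactor: "xhigh",
};

function effortFor(task: TaskType): Effort {
  return effortByTask[task];
}

console.log(effortFor("format"));   // low
console.log(effortFor("refactor")); // xhigh
```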
&lt;p&gt;&lt;strong&gt;Tool instructions.&lt;/strong&gt; How you write tool instructions has a larger effect on agent behavior than most people expect. Capable models will often think through a problem before reaching for a tool, which is usually what you want. But it means your harness needs to be explicit about when tool use is required rather than optional:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-typescript&quot;&gt;// Too vague
Check whether the API endpoint returns the correct schema.

// Explicit
Use the read_file tool to open schema.json, then use the bash tool
to run the test suite and verify the output matches that schema.
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;The first leaves the agent to decide whether to act. The second names the tools, the files, and what to check.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This principle extends to running subagents (separate agent instances your harness kicks off to handle tasks in parallel). If your harness depends on that parallel execution across multiple files or items, specify it explicitly. Don&amp;#39;t assume the model will split the work on its own.&lt;/p&gt;
&lt;h2&gt;The memory question nobody asks early enough&lt;/h2&gt;
&lt;p&gt;Switching models is easy. The APIs are similar enough that moving from Claude to GPT or back is a few hours of work. Models are stateless (each request starts fresh with no memory of previous interactions), so there&amp;#39;s nothing accumulated to lose.&lt;/p&gt;
&lt;p&gt;Memory is different. Once an agent has built up context over weeks or months (your codebase conventions, your debugging patterns, how you prefer things structured), that accumulated state is doing the actual work. The same setup without the memory produces a noticeably worse agent. If that memory lives somewhere you don&amp;#39;t control, you can&amp;#39;t take it with you.&lt;/p&gt;
&lt;p&gt;Claude Code&amp;#39;s memory is file-based. CLAUDE.md files, project notes, session summaries all live in your filesystem. You can read them, edit them, commit them to git, and inspect exactly what the agent knows about your project. When a long session gets compacted, the summary goes into a file.&lt;/p&gt;
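&lt;p&gt;To make the file-based idea concrete, here is a generic sketch of memory as a plain Markdown file. It is not how Claude Code implements memory; the file name and the &lt;code&gt;remember&lt;/code&gt; and &lt;code&gt;recall&lt;/code&gt; helpers are assumptions for illustration.&lt;/p&gt;

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Append a dated note to a Markdown memory file, creating the file if needed.
function remember(file: string, note: string, date: string): void {
  fs.appendFileSync(file, `- ${date}: ${note}\n`, "utf8");
}

// Read everything back so it can be injected at the start of the next session.
function recall(file: string): string {
  return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : "";
}

// A temporary directory keeps the example from touching a real project.
const memoryDir = fs.mkdtempSync(path.join(os.tmpdir(), "agent-memory-"));
const memoryFile = path.join(memoryDir, "MEMORY.md");

remember(memoryFile, "Prefer named exports in this repo.", "2026-04-26");
console.log(recall(memoryFile));
```

&lt;p&gt;Because the memory is an ordinary file, it can be read, edited, diffed, and committed like any other part of the project.&lt;/p&gt;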
&lt;p&gt;I think about this the same way I think about using &lt;a href=&quot;https://obsidian.md/&quot;&gt;Obsidian&lt;/a&gt; for my own knowledge management. Notes in Markdown files I own, stored locally, not locked in a platform I&amp;#39;m renting. The agent equivalent is the same idea. Intelligence that accumulates in your filesystem, not theirs. A second brain you actually control.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://openai.com/codex&quot;&gt;OpenAI&amp;#39;s Codex&lt;/a&gt; generates compaction summaries that are encrypted and not portable outside the OpenAI ecosystem. Whatever context your agent accumulates stays in their infrastructure.&lt;/p&gt;
&lt;p&gt;Before you commit to any agent tool for production use, ask:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Where does state actually live, and do I own those files?&lt;/li&gt;
&lt;li&gt;What survives session compaction, and what gets lost permanently?&lt;/li&gt;
&lt;li&gt;Are compaction summaries readable, or are they opaque?&lt;/li&gt;
&lt;li&gt;If I switched providers in a year, could I take the agent&amp;#39;s accumulated context with me?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most people don&amp;#39;t ask this until they&amp;#39;re already paying the switching cost. By then the answer usually isn&amp;#39;t good.&lt;/p&gt;
&lt;h2&gt;Pick your model. Own your memory.&lt;/h2&gt;
&lt;p&gt;Anthropic, OpenAI, Google, and several open-weight teams are all shipping capable models on short release cycles. That&amp;#39;s not going to thin out. It means model selection will stay a decision, just a narrow one. The distance between the top model options is smaller than the distance between a well-engineered harness and a poorly engineered one.&lt;/p&gt;
&lt;p&gt;Switching the model is a small change. The context your agent builds up over months is what&amp;#39;s hard to replace: how your codebase is structured, how you like things handled, what to skip. Keep it in files you own.&lt;/p&gt;
&lt;p&gt;For deeper coverage of where current models stand on benchmarks and pricing, see the &lt;a href=&quot;https://lukestahl.io/ai-models-guide/&quot;&gt;AI Models Guide&lt;/a&gt;.&lt;/p&gt;
</content:encoded><category>AI Agents</category><category>Agent Harness</category><category>Agent Architecture</category><category>AI Models</category></item><item><title>Is headless making a comeback?</title><link>https://lukestahl.io/blog/is-headless-making-a-comeback/</link><guid isPermaLink="true">https://lukestahl.io/blog/is-headless-making-a-comeback/</guid><description>Headless APIs lost ground because everything required a developer. Now LLMs connect to APIs natively, and the developer dependency that killed headless is dissolving.</description><pubDate>Tue, 17 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/is-headless-making-a-comeback/No_bell_png.png&quot; alt=&quot;No_bell.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Headless had its moment around 2018-2021. &lt;a href=&quot;https://www.contentful.com/&quot;&gt;Contentful&lt;/a&gt; handled content, &lt;a href=&quot;https://shopify.dev/docs/api/storefront&quot;&gt;Shopify&amp;#39;s Storefront API&lt;/a&gt; handled commerce, &lt;a href=&quot;https://stripe.com/docs/api&quot;&gt;Stripe&lt;/a&gt; handled payments, and &lt;a href=&quot;https://auth0.com/&quot;&gt;Auth0&lt;/a&gt; handled identity. The pitch was the same everywhere: here&amp;#39;s a powerful API, build whatever frontend you want on top of it.&lt;/p&gt;
&lt;p&gt;The architecture made sense. The problem was that every headless tool required significant engineering investment to set up and maintain. Not just &amp;quot;hook up an API&amp;quot; investment, but &amp;quot;build a frontend, design a content model, configure preview environments, and own the deployment pipeline&amp;quot; investment. Most companies didn&amp;#39;t have the engineering bandwidth to treat their CMS or commerce layer like a custom software project. And the ones that did often couldn&amp;#39;t justify keeping a developer on it long-term once the initial build was done.&lt;/p&gt;
&lt;h2&gt;What headless actually cost you&lt;/h2&gt;
&lt;p&gt;The API-first model works well when you have engineers who treat these services like product infrastructure. When you don&amp;#39;t, you get content teams staring at JSON editors and marketing waiting on engineering to change a checkout flow.&lt;/p&gt;
&lt;p&gt;The headless CMS space is where this played out most visibly. Contentful started bolting on visual editing tools and composable page builders. &lt;a href=&quot;https://strapi.io/&quot;&gt;Strapi&lt;/a&gt; added a content-type builder. &lt;a href=&quot;https://www.sanity.io/&quot;&gt;Sanity&lt;/a&gt; shipped Sanity Studio with customizable desk structures. Every headless CMS slowly crept toward becoming a Digital Experience Platform, rebuilding the same workflows and interfaces they were supposed to eliminate.&lt;/p&gt;
&lt;p&gt;The same pattern showed up in commerce. Shopify&amp;#39;s headless Storefront API gave you full control, but building a custom storefront meant maintaining a React app, handling cart state, managing checkout flows, and syncing inventory. Most merchants went back to Shopify themes because the operational cost of maintaining a custom storefront wasn&amp;#39;t worth it without dedicated engineering.&lt;/p&gt;
&lt;p&gt;Headless vendors spent years telling you to decouple everything through APIs, then spent the next few years rebuilding the interfaces and workflows that monolithic tools already had. Non-technical users couldn&amp;#39;t operate these systems independently, so the vendors built the interfaces back in. The result was headless architecture with monolithic complexity.&lt;/p&gt;
&lt;h2&gt;Visual development ate headless for lunch&lt;/h2&gt;
&lt;p&gt;Visual development tools showed up and solved a different problem: letting non-developers build and ship without waiting on engineering.&lt;/p&gt;
&lt;p&gt;Tools like &lt;a href=&quot;https://webflow.com/&quot;&gt;Webflow&lt;/a&gt;, &lt;a href=&quot;https://www.builder.io/&quot;&gt;Builder.io&lt;/a&gt;, and &lt;a href=&quot;https://www.framer.com/&quot;&gt;Framer&lt;/a&gt; gave marketing teams direct control over pages, layouts, and content without requiring a pull request. Builder.io went further by offering visual editing on top of existing codebases, a hybrid model where developers own the system and marketers own the pages.&lt;/p&gt;
&lt;p&gt;Visual development grew while pure headless adoption slowed outside of enterprise. The hybrid approach (visual editing backed by APIs) turned out to be what most teams needed. You got the content API when you wanted it and a visual editor when you didn&amp;#39;t want to bother engineering.&lt;/p&gt;
&lt;p&gt;Headless started feeling like an architecture for teams with more developers than they knew what to do with. For everyone else, it was overhead.&lt;/p&gt;
&lt;h2&gt;Then LLMs learned to talk to APIs&lt;/h2&gt;
&lt;p&gt;Large language models can build working applications. But the more consequential thing they learned to do is talk to existing APIs and operate services through them.&lt;/p&gt;
&lt;p&gt;This is what &lt;a href=&quot;https://modelcontextprotocol.io/&quot;&gt;MCP (Model Context Protocol)&lt;/a&gt; made repeatable. You give an LLM access to a set of API tools (for a CMS, a commerce platform, a payment processor, an analytics service), and it can read, create, update, and query through those APIs using natural language. No SDK wrappers to write. No frontend to build. You describe what you want, the model figures out the API calls, and the work gets done.&lt;/p&gt;
&lt;p&gt;MCP isn&amp;#39;t without its critics. Perplexity recently announced they&amp;#39;re moving away from MCP internally in favor of their own &lt;a href=&quot;https://docs.perplexity.ai/docs/agent-api/quickstart&quot;&gt;Agent API&lt;/a&gt;, citing token overhead and authentication friction at scale. Those are real problems. But the pattern MCP established, LLMs operating through APIs via standardized tool definitions, is already showing up through function calling and agent APIs regardless of whether MCP itself becomes the long-term standard.&lt;/p&gt;
&lt;p&gt;Sanity is a good example of this. Their content lake API is structured, their schemas are typed, and an LLM connected to Sanity can create documents, update fields, manage assets, and publish content through conversation. The same workflow that used to require a developer writing a custom integration now takes a &lt;code&gt;.json&lt;/code&gt; tool definition and an API key.&lt;/p&gt;
&lt;p&gt;This isn&amp;#39;t limited to CMS. Any well-documented API with token-based auth works the same way. Stripe&amp;#39;s API is so well-structured that models can create payment links, pull transaction data, and manage subscriptions through it. Shopify&amp;#39;s Storefront and Admin APIs follow similar patterns. The developer who used to sit between the business team and the API? A model that already read the docs can do that job now.&lt;/p&gt;
&lt;p&gt;You secure your API keys and set permissions like you always would. But the person using the system doesn&amp;#39;t need to understand REST semantics or write fetch calls. They describe what they want, and the model handles the translation.&lt;/p&gt;
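&lt;p&gt;To make that translation step concrete, here is a minimal Python sketch of a tool definition and the dispatcher that runs a model-produced call. Everything in it is an assumption for illustration: the &lt;code&gt;create_document&lt;/code&gt; tool, its fields, and the stubbed handler are hypothetical and are not Sanity&amp;#39;s, Stripe&amp;#39;s, or MCP&amp;#39;s actual API. A real setup would replace the stub with an authenticated HTTP call.&lt;/p&gt;

```python
import json


def create_document(doc_type, title):
    # Stand-in for an authenticated HTTP call to a content API.
    return {"status": "draft", "doc_type": doc_type, "title": title}


# Hypothetical tool registry. The schema follows the JSON Schema style
# used by function calling; names and fields are illustrative only.
TOOLS = {
    "create_document": {
        "description": "Create a draft document in the CMS.",
        "input_schema": {
            "type": "object",
            "properties": {
                "doc_type": {"type": "string"},
                "title": {"type": "string"},
            },
            "required": ["doc_type", "title"],
        },
        "handler": create_document,
    },
}


def dispatch(tool_call_json):
    """Validate a model-produced tool call and route it to its handler."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise ValueError("unknown tool: " + call["name"])
    arguments = call.get("arguments", {})
    required = tool["input_schema"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError("missing arguments: " + ", ".join(missing))
    return tool["handler"](**arguments)


# The model emits a structured call like this instead of prose.
result = dispatch(
    '{"name": "create_document", '
    '"arguments": {"doc_type": "post", "title": "Hello"}}'
)
print(result)
```

&lt;p&gt;In this sketch the model never touches the API key. It only produces the structured call, and the code you control decides whether and how to run it.&lt;/p&gt;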
&lt;h2&gt;The developer dependency is dissolving&lt;/h2&gt;
&lt;p&gt;The core objection to headless was always operational, not architectural. Nobody argued that APIs were the wrong approach. The argument was always: &amp;quot;Sure, but who&amp;#39;s going to build and maintain all of this?&amp;quot;&lt;/p&gt;
&lt;p&gt;That was a fair objection when building meant writing React components, setting up preview environments, configuring webhooks, and debugging deployment pipelines. It&amp;#39;s less of one when building means opening &lt;a href=&quot;https://docs.anthropic.com/en/docs/claude-code/overview&quot;&gt;Claude Code&lt;/a&gt; and saying &amp;quot;set up a Next.js frontend that pulls content from our Sanity project and deploys to Vercel.&amp;quot; Or &amp;quot;create a Shopify storefront with these products and connect Stripe checkout.&amp;quot;&lt;/p&gt;
&lt;p&gt;The developer&amp;#39;s role shifts from implementation to architecture. You still need someone to design the data model, set up the deployment pipeline, and make security decisions. But the day-to-day work of building interfaces, managing data through APIs, and wiring services together? That&amp;#39;s moving to the LLM, working through the API layers that headless tools already built.&lt;/p&gt;
&lt;p&gt;Companies like Stripe, Shopify, and Sanity spent years building well-structured APIs, thorough documentation, and typed schemas. They did that work to support developer integrations. It turns out those same qualities (structured, documented, typed) are exactly what makes an API easy for an LLM to use. Every headless tool that invested in API quality accidentally built infrastructure that LLMs can now operate.&lt;/p&gt;
&lt;h2&gt;The stack I&amp;#39;m thinking about&lt;/h2&gt;
&lt;p&gt;Headless APIs (Sanity, Stripe, Shopify, &lt;a href=&quot;https://www.algolia.com/&quot;&gt;Algolia&lt;/a&gt;, &lt;a href=&quot;https://clerk.com/&quot;&gt;Clerk&lt;/a&gt;) as the service layer. An agentic coding tool (Claude Code, &lt;a href=&quot;https://www.cursor.com/&quot;&gt;Cursor&lt;/a&gt;) as the builder. A frontend framework (Next.js, Astro) as the delivery layer. MCP or function calling as the bridge between the LLM and the APIs.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/is-headless-making-a-comeback/New_headless_stack_png.png&quot; alt=&quot;New_headless_stack.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;The team describes what they need, the LLM builds it through the APIs, and the developer reviews and maintains the architecture. Each service stays in its lane: Sanity handles content, Stripe handles payments, Algolia handles search.&lt;/p&gt;
&lt;p&gt;I&amp;#39;ve used Claude Code connected to Webflow&amp;#39;s CMS API through MCP to create and manage content programmatically. The experience is closer to pair programming with someone who already read the API docs than it is to traditional service administration. And every headless tool with a decent API is a candidate for this same workflow.&lt;/p&gt;
&lt;p&gt;There&amp;#39;s another layer to this. When a builder asks Claude Code or Cursor to build a newsletter, the model suggests specific APIs like &lt;a href=&quot;https://resend.com/&quot;&gt;Resend&lt;/a&gt; for email and gives a recommended option. Headless services aren&amp;#39;t just competing for mindshare anymore, they&amp;#39;re competing for LLM recommendations. How you show up in those suggestions matters, whether that&amp;#39;s through llms.txt, AEO (Answer Engine Optimization), or having the cleanest docs and SDK in your category. Being one of the three options the model suggests is starting to matter the way ranking on the first page of Google used to.&lt;/p&gt;
&lt;h2&gt;What still needs to be true&lt;/h2&gt;
&lt;p&gt;This only works if the headless service has a well-designed API. Sanity&amp;#39;s content lake API is structured and predictable, with typed schemas that an LLM can query and write to after seeing a few examples. Stripe is the gold standard here with consistent naming, thorough docs, and predictable error formats.&lt;/p&gt;
&lt;p&gt;Where this breaks down:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Undocumented edge cases&lt;/strong&gt; - LLMs can only work with what&amp;#39;s documented. If publishing requires a specific sequence of API calls that isn&amp;#39;t in the docs, the model will get it wrong.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complex permissions models&lt;/strong&gt; - Multi-tenant setups with role-based access and environment-specific publishing rules add friction that conversational interfaces don&amp;#39;t handle gracefully.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Asset management&lt;/strong&gt; - Image uploads, transformations, and CDN invalidation involve binary data handling that&amp;#39;s harder to do through natural language than CRUD operations on text content.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Preview workflows&lt;/strong&gt; - Showing unpublished content in context still requires frontend infrastructure. An LLM can create the content, but previewing it on a staging site is a deployment problem, not an API problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These explain why this isn&amp;#39;t a &amp;quot;headless is fully back&amp;quot; story yet. It&amp;#39;s closer to &amp;quot;headless lost for the wrong reasons, and those reasons are disappearing.&amp;quot;&lt;/p&gt;
&lt;h2&gt;Headless lost a battle it might win retroactively&lt;/h2&gt;
&lt;p&gt;Visual development tools won because they removed the developer dependency for content operations. That was the right answer in 2022. But visual development still comes with tradeoffs: proprietary rendering, platform lock-in, and a ceiling on what you can build before you need custom code. Most visual tools have APIs and integrations, but they&amp;#39;re secondary to the canvas.&lt;/p&gt;
&lt;p&gt;Headless architecture doesn&amp;#39;t have those constraints. It&amp;#39;s APIs and data structures that you can move between frameworks, host anywhere, and compose however you want. Non-developers could use it, but the moment they wanted something different from the initial build, the developer tickets started piling up.&lt;/p&gt;
&lt;p&gt;LLMs and &lt;a href=&quot;https://lukestahl.io/blog/end-of-coding-age-of-building/&quot;&gt;agentic building&lt;/a&gt; tools are filling that gap. The complexity of headless doesn&amp;#39;t disappear, but it becomes something you can work through in conversation instead of code. The APIs and data models don&amp;#39;t change. Who can operate them does.&lt;/p&gt;
&lt;p&gt;I don&amp;#39;t think the future is headless vs. visual. I think it&amp;#39;s both, and the line between them is blurring. The platforms that treat their APIs as an afterthought are going to lose ground to the ones that invest in them. When an LLM can operate a CMS, a commerce API, and a payment processor through conversation, the visual editor becomes one interface among several, not the only way in.&lt;/p&gt;
&lt;p&gt;And if that&amp;#39;s right, a lot of companies that consolidated back to monolithic platforms in 2023 might be rethinking that decision.&lt;/p&gt;
</content:encoded><category>Headless</category><category>APIs</category><category>AI</category><category>Vibe Coding</category></item><item><title>End of coding, age of building</title><link>https://lukestahl.io/blog/end-of-coding-age-of-building/</link><guid isPermaLink="true">https://lukestahl.io/blog/end-of-coding-age-of-building/</guid><description>The constraint moved from writing code to describing what you want built. Here&apos;s what that shift looks like, who it affects, and what it doesn&apos;t answer.</description><pubDate>Sat, 07 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/end-of-coding-age-of-building/end-of-coding_png.png&quot; alt=&quot;end-of-coding.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;The constraint moved. For decades, the thing that determined whether software got built was whether someone could write the code. You needed syntax, framework knowledge, years of pattern recognition. That&amp;#39;s not the constraint anymore. The constraint is whether you can describe what you want built, evaluate what comes back, and make good decisions about what to do next.&lt;/p&gt;
&lt;p&gt;This didn&amp;#39;t happen gradually. A set of tools crossed a threshold within about six months, and the gap between &amp;quot;I have an idea&amp;quot; and &amp;quot;I have a working prototype&amp;quot; collapsed from weeks to hours.&lt;/p&gt;
&lt;h2&gt;What happened in late 2025&lt;/h2&gt;
&lt;p&gt;The shift wasn&amp;#39;t one tool getting better. It was five or six things converging at once.&lt;/p&gt;
&lt;p&gt;Cursor shipped background agents in late 2025, letting multiple AI processes work across a codebase simultaneously. One agent refactors your auth layer while another writes tests for the module you just finished. Parallel execution against a shared codebase, with conflict resolution built in. That&amp;#39;s a different editing model than autocomplete.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://www.anthropic.com/news/claude-opus-4-5&quot;&gt;Anthropic released Opus 4.5&lt;/a&gt; and a generation of RLVR-trained models that crossed a reasoning threshold. Earlier models could write functions. These models could hold architectural context across a 50-file project, understand why a particular abstraction existed, and make changes that respected the existing design. The gap between &amp;quot;generates code&amp;quot; and &amp;quot;understands the codebase&amp;quot; closed.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://docs.anthropic.com/en/docs/claude-code/overview&quot;&gt;Claude Code&lt;/a&gt; started running as a local CLI, reading your file system, running your tests, iterating on failures without round-tripping through a browser. It turned a chat interface into something closer to a junior developer sitting next to you, except it could work through an entire debugging session in minutes.&lt;/p&gt;
&lt;p&gt;And the browser-based builders matured. &lt;a href=&quot;https://bolt.new/&quot;&gt;Bolt&lt;/a&gt;, &lt;a href=&quot;https://lovable.dev/&quot;&gt;Lovable&lt;/a&gt;, and &lt;a href=&quot;https://v0.app/&quot;&gt;v0&lt;/a&gt; went from generating impressive demos to generating deployable applications. Not always clean applications. Not always production-grade. But working software that a non-developer could ship to real users, connected to real databases, with real authentication flows.&lt;/p&gt;
&lt;p&gt;Each of these alone was incremental. Together, they moved the floor.&lt;/p&gt;
&lt;h2&gt;The coding era versus the building era&lt;/h2&gt;
&lt;p&gt;The coding era rewarded a specific set of skills. You needed to know that JavaScript handles async differently than most languages, that &lt;a href=&quot;https://react.dev/&quot;&gt;React&lt;/a&gt; re-renders on state change, that SQL joins have performance implications at scale. Years of practice built an intuition for debugging, architecture, and language-specific idioms. The work was writing code, and the quality of the code depended on your accumulated experience writing it.&lt;/p&gt;
&lt;p&gt;The building era rewards a different set. Articulation matters more than syntax. Can you describe the data model clearly enough that an AI generates the right schema? Do you know that this feature needs async processing instead of a synchronous call, even if you&amp;#39;ve never wired it up yourself? And when the AI generates three approaches, can you tell which one breaks at scale? The skills shifted from implementation to specification and evaluation.&lt;/p&gt;
&lt;p&gt;The difference shows up in specific scenarios. A marketing ops manager needed an internal tool to sync campaign data between HubSpot and their reporting dashboard. In the coding era, that&amp;#39;s a ticket in Jira, a sprint planning conversation, two weeks of developer time. In the building era, she described the sync logic to Claude Code, iterated on the output for an afternoon, and had a working tool by end of day. That was possible because the constraint moved from implementation to specification.&lt;/p&gt;
&lt;p&gt;A designer prototyping a dashboard in &lt;a href=&quot;https://www.figma.com/&quot;&gt;Figma&lt;/a&gt; used to hand off static mockups to a frontend team. Now the prototype becomes the product. v0 takes the design, generates the React components, and the designer iterates on the actual code output. The handoff didn&amp;#39;t get faster. The handoff disappeared.&lt;/p&gt;
&lt;h2&gt;Who builds now&lt;/h2&gt;
&lt;p&gt;The audience expanded, but not in the way the &amp;quot;everyone is a developer&amp;quot; crowd claims. A product manager shipping an internal tool isn&amp;#39;t a developer. A designer generating React components from a prototype isn&amp;#39;t a frontend engineer. They&amp;#39;re building software, but they&amp;#39;re doing it through a different interface than a code editor and a terminal.&lt;/p&gt;
&lt;p&gt;The people making product decisions can now execute on them directly. That&amp;#39;s the actual shift. A domain expert who understands the problem space deeply can build the first version of a solution without translating their knowledge through a development team. The translation layer got thinner.&lt;/p&gt;
&lt;p&gt;Developers didn&amp;#39;t become less important. They became faster. A senior engineer using Cursor with background agents can move through a codebase at a pace that wasn&amp;#39;t possible two years ago. The tedious parts shrink. The judgment parts stay. A staff engineer reviewing AI-generated code still needs to spot the subtle concurrency bug, the missing index, the abstraction that will break when requirements change. That work didn&amp;#39;t go anywhere.&lt;/p&gt;
&lt;h2&gt;What this means for developer content&lt;/h2&gt;
&lt;p&gt;If you&amp;#39;re writing tutorials that start with &amp;quot;First, install the SDK and initialize a client,&amp;quot; you&amp;#39;re writing for a shrinking audience. Not shrinking to zero, but narrowing. A growing segment of builders thinks in prompts, not imports. They describe what they want, evaluate the output, and iterate. The SDK installation happens inside that loop, handled by the AI, often without the builder knowing which package manager ran.&lt;/p&gt;
&lt;p&gt;This doesn&amp;#39;t mean content gets dumber. It means the frame shifts. Instead of &amp;quot;How to implement authentication with NextAuth,&amp;quot; the useful article becomes &amp;quot;How to think about authentication for a SaaS app.&amp;quot; What are the tradeoffs between session-based and token-based auth? When does OAuth make sense versus magic links? What are the actual security implications of each choice?&lt;/p&gt;
&lt;p&gt;The content that holds value is the content that helps someone make decisions, not the content that walks them through keystrokes. Implementation guides aren&amp;#39;t dead, but they&amp;#39;re commoditized. The AI can generate a NextAuth setup. What it can&amp;#39;t do is tell you whether NextAuth is the right choice for your specific situation.&lt;/p&gt;
&lt;h2&gt;What didn&amp;#39;t change&lt;/h2&gt;
&lt;p&gt;Architecture decisions still require someone who understands distributed systems and scaling characteristics. AI can generate a microservices setup, but it doesn&amp;#39;t know whether your team of four should be running microservices or a monolith. It doesn&amp;#39;t know your deployment constraints, your on-call rotation, or the operational complexity your team can absorb.&lt;/p&gt;
&lt;p&gt;Judgment still matters. AI generates plausible code quickly. It also generates plausible bugs quickly. Someone needs to review the output, understand the failure modes, and catch the cases where the AI optimized for the wrong thing. A function that passes all tests but handles errors by swallowing them silently is worse than a function that doesn&amp;#39;t compile, because at least the compiler error is honest.&lt;/p&gt;
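&lt;p&gt;As a rough illustration of that failure mode, here is a small Python sketch. The function names and the amount-parsing scenario are made up for the example. Both versions pass a happy-path test, but only one of them tells you when the input is bad.&lt;/p&gt;

```python
def parse_amount_swallowing(raw):
    # Anti-pattern: any failure is hidden and comes back looking like data.
    try:
        return float(raw)
    except Exception:
        return 0.0


def parse_amount_strict(raw):
    # Fails loudly, with context the caller can act on.
    try:
        return float(raw)
    except ValueError as exc:
        raise ValueError(f"invalid amount: {raw!r}") from exc


# A happy-path test passes for both versions.
assert parse_amount_swallowing("19.99") == 19.99
assert parse_amount_strict("19.99") == 19.99

# Bad input (comma instead of a decimal point): the first version
# quietly reports zero, so nothing downstream knows anything went wrong.
silent = parse_amount_swallowing("19,99")
print(silent)

# The strict version surfaces the same problem as an error.
try:
    parse_amount_strict("19,99")
except ValueError as err:
    print("caught:", err)
```

&lt;p&gt;A test suite that only covers valid input cannot tell these two functions apart. Catching the difference takes a reviewer who asks what happens when the input is wrong.&lt;/p&gt;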
&lt;p&gt;Senior developers aren&amp;#39;t less valuable. They&amp;#39;re more leveraged. The gap between a senior engineer with AI tools and a junior engineer with AI tools is wider than the gap was without AI. The senior knows what to ask for and knows when the output is wrong. The ceiling didn&amp;#39;t move. The skills that make senior engineers valuable are the same ones AI can&amp;#39;t replicate.&lt;/p&gt;
&lt;h2&gt;The career path question&lt;/h2&gt;
&lt;p&gt;Here&amp;#39;s the part nobody has a good answer for: what does a junior developer career path look like when the entry-level work is automated?&lt;/p&gt;
&lt;p&gt;Junior roles traditionally existed as training grounds. You wrote CRUD endpoints, fixed CSS bugs, added form validation. Those tasks built intuition for how systems work, how code breaks, and how to debug when something doesn&amp;#39;t behave. That intuition feeds the judgment that makes senior engineers valuable. The entry-level work wasn&amp;#39;t just work. It was education.&lt;/p&gt;
&lt;p&gt;If AI handles the CRUD endpoints and the form validation and the CSS bugs, the question isn&amp;#39;t whether juniors are needed. It&amp;#39;s how they develop the judgment that AI can&amp;#39;t replace. The obvious answer is &amp;quot;they&amp;#39;ll learn by reviewing AI output instead of writing code from scratch.&amp;quot; Maybe. But reviewing code you don&amp;#39;t fully understand is a different skill than writing it, and it&amp;#39;s not clear it builds the same depth of understanding.&lt;/p&gt;
&lt;p&gt;This is an open question, and I don&amp;#39;t think anyone has an honest answer yet. The tools moved faster than the career structures adapted. Companies are still hiring for roles defined by the coding era while the work increasingly belongs to the building era. That mismatch will resolve, but how it resolves will shape the next generation of engineers.&lt;/p&gt;
</content:encoded><category>AI</category><category>Developer Tools</category><category>Vibe Coding</category><category>Product Development</category></item><item><title>What is developer marketing and why it exists</title><link>https://lukestahl.io/blog/developer-marketing/</link><guid isPermaLink="true">https://lukestahl.io/blog/developer-marketing/</guid><description>Developer marketing is product and growth marketing for a technical audience, with responsibility that runs across the full funnel.</description><pubDate>Mon, 02 Feb 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/developer-marketing/Developer_Marketing_copy_png.png&quot; alt=&quot;Developer_Marketing_copy.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Developer marketing sits at the intersection of product marketing and growth marketing. You’re responsible for launches, messaging, and positioning, but also for how those decisions show up in campaigns, GTM, and adoption. The role owns the developer persona end to end and is accountable for both product-led and sales-led growth, not just shipping something.&lt;/p&gt;
&lt;p&gt;The role runs in parallel with most marketing functions, from content and lifecycle to web, demand gen, and RevOps. The scope doesn’t stop at signups. You’re responsible for the full funnel, which means accounting for revenue, not just activation. If people sign up but deals don’t close, that’s still a problem to solve.&lt;/p&gt;
&lt;p&gt;You’re also more technical than the average marketer. You’re closer to the product, the workflows, and the constraints, and you put developer trust first. Once that trust is lost, it’s hard to recover.&lt;/p&gt;
&lt;p&gt;That combination is what makes developer marketing both exciting and difficult. The role comes with overlap, ambiguity, and frequent justification, especially in developer-first companies where everyone speaks developer and ownership is rarely clean.&lt;/p&gt;
&lt;h2&gt;Why this role exists at all&lt;/h2&gt;
&lt;p&gt;Most marketing is optimized to explain value at a high level. That breaks down when claims have to hold up under actual usage. Developers evaluate through workflows and constraints, and they notice quickly when something doesn’t hold up.&lt;/p&gt;
&lt;p&gt;Developer marketing exists because this kind of evaluation requires technical judgment before messaging ships. Someone has to pressure-test positioning against how the product actually behaves. Someone has to surface mismatches early, before they turn into sales friction, support tickets, or churn that no one planned for.&lt;/p&gt;
&lt;p&gt;When that responsibility is missing, the gaps don’t disappear. They just move downstream, where they’re harder and more expensive to fix.&lt;/p&gt;
&lt;h2&gt;Why developer-first companies make the role harder to see&lt;/h2&gt;
&lt;p&gt;In companies built primarily for developers, developer context is everywhere. Marketing teams tend to be more technical and already speak to developers without translation. There’s a shared baseline for how developers think.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/developer-marketing/devmkt_spiderman_jpg.jpg&quot; alt=&quot;devmkt_spiderman.jpg&quot;&gt;&lt;/p&gt;
&lt;p&gt;In those environments, developer marketing doesn’t always show up as a clearly defined function. The work spreads across teams. Parts of it live in product. Parts live in content, growth, or demand gen.&lt;/p&gt;
&lt;p&gt;That’s where confusion starts. Not because the role isn’t needed, but because responsibility is diffused. Everyone contributes. No one is clearly accountable for how the product is framed and evaluated by developers end to end. The role doesn’t disappear in developer-first companies. Accountability just becomes harder to pin down.&lt;/p&gt;
&lt;h2&gt;Why technical chops matter more than familiarity with developers&lt;/h2&gt;
&lt;p&gt;There’s a difference between being adjacent to developers and having technical judgment. The first gives exposure and the second lets you evaluate whether something will hold up once it’s actually used.&lt;/p&gt;
&lt;p&gt;Developer marketers need to be able to read content and recognize when something is glossed over. They need to look at a demo and tell whether it actually holds up. They need to understand why a limitation matters before customers encounter it and turn it into a problem.&lt;/p&gt;
&lt;p&gt;This isn’t about writing production code every day. It’s about understanding systems well enough to evaluate claims honestly. Familiarity with developer culture helps, but technical fluency is what makes the role effective.&lt;/p&gt;
&lt;h2&gt;What developer marketing is responsible for&lt;/h2&gt;
&lt;p&gt;Developer marketing is responsible for maintaining clarity and credibility with a technical audience, even when ownership across teams is messy.&lt;/p&gt;
&lt;p&gt;In practice, that responsibility tends to show up in a few places:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Shaping positioning and framing so it reflects how developers actually evaluate tools&lt;/li&gt;
&lt;li&gt;Validating that messaging aligns with development workflows, not idealized ones&lt;/li&gt;
&lt;li&gt;Surfacing mismatches early, before they ship and become someone else’s problem&lt;/li&gt;
&lt;li&gt;Representing developer reality consistently across product, sales, and marketing&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This isn’t a checklist of tactics. When no one owns it, the gaps show up fast. Demos that fall apart under usage. Content that explains features but avoids constraints. Messaging that sounds right until someone tries to build with the product.&lt;/p&gt;
&lt;h2&gt;How developer marketing runs alongside other marketing functions&lt;/h2&gt;
&lt;p&gt;Developer marketing doesn’t replace product or growth marketing. In smaller companies, it often &lt;em&gt;is&lt;/em&gt; product and growth marketing because one person owns the work end to end. As teams grow, the functions split out, but the responsibility doesn’t disappear.&lt;/p&gt;
&lt;p&gt;In larger orgs, developer marketing becomes a coordinating role. Product marketing, growth, content, lifecycle, and RevOps have clear owners, but someone still has to keep the work aligned. Positioning has to match how the product actually works. Campaigns can’t get ahead of reality. Growth tactics can’t create downstream cleanup. That’s also why developer marketing experience scales into leadership. You’ve already had to own the whole system.&lt;/p&gt;
&lt;h2&gt;Developer marketing vs Developer Relations&lt;/h2&gt;
&lt;p&gt;Developer marketing and DevRel are often confused because they work with the same audience. They solve different problems. DevRel focuses on relationships, education, and feedback loops. Developer marketing focuses on clarity, positioning, adoption, and revenue. There’s overlap in execution, but not in responsibility. They work best together. Things break when one is asked to replace the other.&lt;/p&gt;
&lt;h2&gt;Developer marketing in the AI era&lt;/h2&gt;
&lt;p&gt;The core responsibility hasn&amp;#39;t changed. You still own trust, positioning, and credibility with a technical audience. But who that audience is and how they evaluate have shifted. AI coding agents moved the constraint from writing code to describing what you want built. The people evaluating your product now include PMs, designers, and domain experts who build through prompts, not syntax. Developer content has two audiences now: humans and the AI agents helping them build. If your docs can&amp;#39;t be parsed by an agent in Cursor or Claude Code, developers won&amp;#39;t see your product at all.&lt;/p&gt;
&lt;p&gt;I wrote more about this shift in &lt;a href=&quot;https://lukestahl.io/blog/end-of-coding-age-of-building/&quot;&gt;End of coding, age of building&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;How I think about being a better developer marketer&lt;/h2&gt;
&lt;p&gt;First, you should use the product. You should build with the thing you’re marketing. You should hit the same rough edges users hit and understand why they exist. It’s hard to explain limitations honestly if you’ve never run into them yourself.&lt;/p&gt;
&lt;p&gt;Second, use AI aggressively, but deliberately. Automate repetitive work and invest in reusable tools, like a writing style guide or a Claude skill, so you’re not starting from scratch every time.&lt;/p&gt;
&lt;p&gt;If you’re vibe coding, write instructions that explain what the generated code is doing. You should understand the structure of a codebase well enough to know where files live and how to update them manually if something breaks. Even if you work primarily in visual tools or AI editors, that baseline matters.&lt;/p&gt;
&lt;p&gt;Build a sandbox. Have a place where you can play around with different dev tools and see how they actually behave. My &lt;a href=&quot;https://lukestahl.io/&quot;&gt;site&lt;/a&gt; became my personal playground. &lt;/p&gt;
&lt;p&gt;Learn from people who are close to the work and opinionated about it. I read an interesting &lt;a href=&quot;https://www.linkedin.com/posts/morganepalomares_questions-i-always-ask-in-interviews-how-activity-7417609243677356032-F4H1?utm_source=share&amp;utm_medium=member_desktop&amp;rcm=ACoAAAVTVpMBxaiFbpYSMC1uTcI4crtPYBeihxA&quot;&gt;question&lt;/a&gt; posed to potential hiring candidates: who do you think is doing marketing well? It made me run through the exercise, and I think it’s important to follow folks who show their thinking, not just outcomes. Here are a few folks who keep my wheels spinning, in a good way of course.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://x.com/theHankTaylor&quot;&gt;Hank Taylor&lt;/a&gt; - Developer Marketing Advisor, &lt;a href=&quot;https://codetomarket.fm/&quot;&gt;CodetoMarket&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/morganepalomares/&quot;&gt;Morgane Palomares&lt;/a&gt; - VP of Marketing, &lt;a href=&quot;https://www.braintrust.dev/&quot;&gt;Braintrust&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/rdegges/&quot;&gt;Randall Degges&lt;/a&gt; - VP of AI Eng &amp;amp; DevRel, &lt;a href=&quot;https://snyk.io/&quot;&gt;Snyk&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://x.com/Steve8708&quot;&gt;Steve Sewell&lt;/a&gt; - CEO, &lt;a href=&quot;https://www.builder.io/&quot;&gt;Builder.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://x.com/james406&quot;&gt;James Hawkins&lt;/a&gt; - Co-CEO, &lt;a href=&quot;https://posthog.com/&quot;&gt;PostHog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I keep most of this thinking written down so I don’t have to relearn the same lessons every time. I collect articles, threads, talks, and tools as &lt;a href=&quot;https://lukestahl.io/gems/&quot;&gt;gems&lt;/a&gt;. Over time, that gets distilled into my &lt;a href=&quot;https://lukestahl.io/handbook/&quot;&gt;developer marketing handbook&lt;/a&gt;, which is where I capture how I approach personas, messaging, enablement, and GTM strategies when developers are part of the equation.&lt;/p&gt;
&lt;h2&gt;A note on how I actually apply this&lt;/h2&gt;
&lt;p&gt;I keep running into the same questions when I’m doing developer marketing work. What actually matters here? What’s noise? Where am I about to overcomplicate something or gloss over a constraint that will come back later?&lt;/p&gt;
&lt;p&gt;One example is a &lt;a href=&quot;https://github.com/Stahlwalker/developer-marketing&quot;&gt;Developer Marketing Claude skill&lt;/a&gt; I built that turns my developer marketing handbook into an interactive reference. I plan to use it to get oriented quickly and sanity-check decisions when I’m moving fast. It’s not meant to replace judgment or thinking. It’s there to reduce the cost of starting from scratch every time.&lt;/p&gt;
&lt;p&gt;I’ve also published a &lt;a href=&quot;https://lukestahl.io/dev-marketing-cheat-sheet/&quot;&gt;developer marketing cheat sheet&lt;/a&gt; that pulls together roles and responsibilities, along with distribution channels and metrics. It’s intentionally lightweight and doesn’t require an email to download. If it’s useful, take it. If it’s not, ignore it.&lt;/p&gt;
&lt;h2&gt;What the job actually demands now&lt;/h2&gt;
&lt;p&gt;Doing this work well today requires more than knowing channels or tactics. It requires technical literacy, comfort operating between teams with unclear responsibility, and a willingness to be specific even when that means accepting tradeoffs. Most of all, it requires accountability for trust and revenue, not just reach.&lt;/p&gt;
</content:encoded><category>Developer Marketing</category></item><item><title>What gets lost during leadership transitions</title><link>https://lukestahl.io/blog/leadership-flywheel/</link><guid isPermaLink="true">https://lukestahl.io/blog/leadership-flywheel/</guid><description>When leadership changes every year, strategy rarely has time to land. This is a look at the leadership flywheel that keeps resetting teams, why it happens, and what gets lost for the people doing the work.</description><pubDate>Fri, 09 Jan 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/leadership-flywheel/leadership-flywheel-v2_png.png&quot; alt=&quot;leadership-flywheel-v2.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Over the past few years, I’ve seen the same pattern repeat across different companies, teams, and stages. It stopped feeling situational and started feeling structural. Leadership changes. Direction shifts. Teams reset. Then the same sequence plays out again.&lt;/p&gt;
&lt;p&gt;I don’t think this usually comes from bad intent. Leadership choices shape the outcome in systems that reward visible change over durability. Over time, I started thinking about it as a flywheel. Not a formal model, just a way to explain why the same behaviors reinforce each other and keep repeating.&lt;/p&gt;
&lt;h3&gt;A pattern I keep seeing&lt;/h3&gt;
&lt;p&gt;It usually starts the same way. A new leader joins with a mandate to fix things. Expectations are high, time is limited, and the organization wants to see movement quickly. There’s pressure to create clarity and confidence, both internally and externally.&lt;/p&gt;
&lt;p&gt;That pressure shapes the first decisions. Change becomes the most visible signal of ownership. Direction gets reset, priorities get reshuffled, and the org starts moving before there’s full context on why things look the way they do.&lt;/p&gt;
&lt;p&gt;What follows is consistent enough that it’s hard to ignore:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A rebrand or reset of priorities&lt;/li&gt;
&lt;li&gt;A restructure to match the new direction&lt;/li&gt;
&lt;li&gt;People exit, sometimes by choice, sometimes not&lt;/li&gt;
&lt;li&gt;New roles open that look a lot like the ones that were removed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Within a year to eighteen months, leadership changes again. The next person inherits a partially reset system, and the cycle starts over.&lt;/p&gt;
&lt;h3&gt;The leadership flywheel, as I see it&lt;/h3&gt;
&lt;p&gt;I think of this as a leadership flywheel because each step reinforces the next. New leadership enters under pressure to show progress. Visible change signals momentum. Structural and team changes follow. Outcomes take time. Leadership tenure runs out before those outcomes fully materialize.&lt;/p&gt;
&lt;p&gt;The flywheel keeps moving not because anyone wants disruption, but because motion is easier to measure than durability. The system rewards visible change more than sustained execution.&lt;/p&gt;
&lt;h3&gt;Why leadership transitions often start with change&lt;/h3&gt;
&lt;p&gt;From the outside, change reads as action. From the inside, it often reads as necessity. New leaders inherit teams they didn’t build and strategies they didn’t choose. There’s limited trust in inherited context and little patience for slow understanding.&lt;/p&gt;
&lt;p&gt;Resetting direction, structure, or messaging establishes ownership and creates distance from what came before. The issue isn’t that change happens. It’s that change becomes the starting point instead of the result of learning.&lt;/p&gt;
&lt;h3&gt;What gets lost first&lt;/h3&gt;
&lt;p&gt;The earliest losses are quiet. Context disappears. The reasoning behind past decisions fades. Work that was in motion loses sponsorship and stalls, not because it failed, but because no one is left to carry it forward.&lt;/p&gt;
&lt;p&gt;Each transition removes another layer of institutional memory. Over time, teams stop assuming their work will compound. They expect it to be revisited.&lt;/p&gt;
&lt;h3&gt;What gets lost over time&lt;/h3&gt;
&lt;p&gt;As these transitions stack, the cost becomes harder to ignore. Strategy turns into something that’s constantly reworked instead of built toward. Teams get cautious about long-term bets. Confidence in direction erodes, even when the direction itself is sound.&lt;/p&gt;
&lt;p&gt;People adapt by narrowing scope and shortening time horizons. Not because they lack ambition, but because they’ve learned what survives resets and what doesn’t.&lt;/p&gt;
&lt;h3&gt;The professional impact no one plans for&lt;/h3&gt;
&lt;p&gt;Leadership transitions don’t just reset strategy. They reset careers. Performance gets evaluated by managers who weren’t there for the work. Progress has to be re-explained. Advocacy disappears when leaders leave, often through no fault of the people they supported.&lt;/p&gt;
&lt;p&gt;Growth becomes uneven. Not tied to output or impact, but to timing. This is one of the harder realities to talk about without sounding bitter, but it shapes how people experience their careers more than most organizations acknowledge.&lt;/p&gt;
&lt;p&gt;I’ve experienced this firsthand, including situations where positive reviews, bonuses, and pay adjustments existed right up until a leadership change reset the narrative. In those moments, the work didn’t suddenly stop delivering. What changed was how success was defined and which outcomes mattered.&lt;/p&gt;
&lt;h3&gt;A leadership assumption worth challenging&lt;/h3&gt;
&lt;p&gt;Teams often get labeled as underperforming when the real issue is instability. Work doesn’t fail because the people were wrong for the role. It fails because the system never gave it time to land.&lt;/p&gt;
&lt;p&gt;Strategy gets blamed when continuity was the missing ingredient. Changing players is easier than stabilizing the field, but it rarely addresses the underlying problem.&lt;/p&gt;
&lt;h3&gt;Entering a team without restarting the cycle&lt;/h3&gt;
&lt;p&gt;There’s another way to enter an organization, though it’s slower and less visible. Spend time with the team before reshaping it. Separate inherited issues from structural ones. Learn what’s already been tried and why it stalled.&lt;/p&gt;
&lt;p&gt;Most teams don’t need to be replaced. They need space to operate without being reset every year. This doesn’t mean avoiding change. It means earning it.&lt;/p&gt;
&lt;h3&gt;Living inside leadership transitions&lt;/h3&gt;
&lt;p&gt;For people inside the system, there’s no clean answer. Some leave early. Some stay and adapt. Some get reshaped into roles they didn’t come in to do. Occasionally, a leader arrives who builds with what’s there instead of tearing it down, but you can’t plan on that.&lt;/p&gt;
&lt;p&gt;What you can plan for is the reset.&lt;/p&gt;
&lt;p&gt;You shouldn’t assume your previous performance reviews will be read. You shouldn’t assume context will carry forward. You shouldn’t assume the work you did speaks for itself once the people who sponsored it are gone. That’s frustrating, and it’s draining, but it’s also the reality of how often leadership changes.&lt;/p&gt;
&lt;p&gt;Writing things down helps more than people want to admit. Not as self-promotion, but as continuity. Capture what you worked on, why it mattered, what changed because of it, and what didn’t get finished. Keep a record of decisions, outcomes, and tradeoffs while the context is still fresh.&lt;/p&gt;
&lt;p&gt;This isn’t about selling yourself constantly. It’s about not starting from zero every time the org resets.&lt;/p&gt;
&lt;p&gt;Over time, these notes become your own internal handoff doc. They make transitions survivable. They give you a way to ground conversations when direction shifts again. They help you advocate for your work without relying on memory or missing context.&lt;/p&gt;
&lt;p&gt;You can’t stop the leadership flywheel alone, but you can make sure it doesn’t erase your contribution every time it turns.&lt;/p&gt;
</content:encoded><category>Marketing</category><category>Leadership</category></item><item><title>Can a modern website run on Markdown + AI alone?</title><link>https://lukestahl.io/blog/markdown-vs-cms/</link><guid isPermaLink="true">https://lukestahl.io/blog/markdown-vs-cms/</guid><description>A clear look at how far Markdown and AI can go, where a CMS becomes necessary, and why visual development offers a different approach entirely.</description><pubDate>Mon, 15 Dec 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/markdown-vs-cms/Markdown_hero_1200_png.png&quot; alt=&quot;Markdown_hero_1200.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;A debate surfaced over the weekend about whether teams still need a CMS at all. The idea is simple. If AI can draft content, migrate files, clean up structure, and commit everything to your repo, maybe you don’t need a content system. Everything becomes Markdown. Everything lives in Git. AI handles the busywork.&lt;/p&gt;
&lt;p&gt;It’s a compelling argument for code-first teams.&lt;/p&gt;
&lt;p&gt;But the real question is bigger.&lt;/p&gt;
&lt;p&gt;Can an entire production website be built and maintained this way, or are there parts of a CMS that AI and Markdown still can’t replace?&lt;/p&gt;
&lt;p&gt;And once you widen the scope to the whole website, a third option starts to matter too: visual development.&lt;/p&gt;
&lt;p&gt;Here’s how these models actually compare.&lt;/p&gt;
&lt;h2&gt;The argument for Markdown and AI&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://leerob.com/agents&quot;&gt;Markdown&lt;/a&gt; has always been appealing to developers. It’s simple, local, versioned, and transparent. With AI acting on files the same way developers already do, a lot of friction disappears. You can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;draft content in your editor&lt;/li&gt;
&lt;li&gt;use agents to clean up structure&lt;/li&gt;
&lt;li&gt;generate frontmatter&lt;/li&gt;
&lt;li&gt;fix formatting&lt;/li&gt;
&lt;li&gt;reorganize folders&lt;/li&gt;
&lt;li&gt;handle bulk changes&lt;/li&gt;
&lt;li&gt;commit and deploy without leaving your flow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a fully technical team, this model feels clean. There are no CMS dashboards to maintain, no API layers to wire up, and no separate content environments to manage. Everything lives in one place, and automation handles the repetitive work. It’s a real advantage for teams that operate entirely in code. But it also assumes something important: every contributor works like a developer. And most websites aren’t run that way.&lt;/p&gt;
&lt;h2&gt;Where Markdown starts to fall short&lt;/h2&gt;
&lt;p&gt;Markdown works well when content is simple and the team is small. That’s why some engineering-led teams can run it without friction. But once a website becomes a system instead of a collection of documents, Markdown introduces gaps that aren’t obvious until you hit them.&lt;/p&gt;
&lt;p&gt;Some websites genuinely don’t need structured content, workflows, localization, or a cross-functional editing model. If your site is simple and your team is entirely technical, Markdown can work fine. But if your website relies on these capabilities, getting them in a Markdown workflow means building and maintaining them yourself.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/markdown-vs-cms/Website_vs_Markdown_png.png&quot; alt=&quot;Website_vs_Markdown.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Here are the big ones.&lt;/p&gt;
&lt;h3&gt;1. No structured content model&lt;/h3&gt;
&lt;p&gt;Some sites don’t need schema-level modeling. Many do. Production sites often rely on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;fields&lt;/li&gt;
&lt;li&gt;types&lt;/li&gt;
&lt;li&gt;relationships&lt;/li&gt;
&lt;li&gt;reusable fragments&lt;/li&gt;
&lt;li&gt;shared data across many surfaces&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Markdown doesn’t enforce any of this. One missing field can break pages. One inconsistent value can break queries.&lt;/p&gt;
&lt;p&gt;You can bolt on schema validation, but at that point you’re recreating parts of a CMS.&lt;/p&gt;
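&lt;p&gt;To make that concrete, here is a minimal sketch of what bolted-on validation might look like in Python. The field names, the allowed categories, and the validate_frontmatter helper are all hypothetical, and the frontmatter is assumed to be already parsed into a dictionary.&lt;/p&gt;

```python
from datetime import date

# Hypothetical schema: required fields and the type each one must have.
REQUIRED_FIELDS = {
    "title": str,
    "description": str,
    "pubDate": date,
    "category": str,
}

# Hypothetical controlled vocabulary for the category field.
ALLOWED_CATEGORIES = {"CMS", "Markdown", "AI"}


def validate_frontmatter(frontmatter):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in frontmatter:
            problems.append(f"missing field: {field}")
        elif not isinstance(frontmatter[field], expected_type):
            problems.append(f"wrong type for {field}")
    category = frontmatter.get("category")
    if isinstance(category, str) and category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category: {category}")
    return problems


# Example post with a missing description and an inconsistent category value.
post = {"title": "Markdown vs CMS", "pubDate": date(2025, 12, 15), "category": "cms"}
print(validate_frontmatter(post))
```

&lt;p&gt;Even a version this small is a schema you now have to keep in sync with every template and query that reads those fields.&lt;/p&gt;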
&lt;h3&gt;2. No safe editing environment&lt;/h3&gt;
&lt;p&gt;Markdown works for engineers. It doesn’t work for non-technical contributors. Even if AI generates the files, someone still ends up reviewing diffs or dealing with frontmatter.&lt;/p&gt;
&lt;p&gt;Supporting broader teams means building:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;internal UIs&lt;/li&gt;
&lt;li&gt;guardrails around edits&lt;/li&gt;
&lt;li&gt;validation layers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Otherwise, you&amp;#39;re asking non-technical contributors to operate in Git.&lt;/p&gt;
&lt;h3&gt;3. No preview tied to actual components&lt;/h3&gt;
&lt;p&gt;Some engineering teams already maintain strong preview systems. Most don’t.&lt;/p&gt;
&lt;p&gt;Without one, a Markdown file can’t show:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;how a component renders&lt;/li&gt;
&lt;li&gt;how content interacts with layout&lt;/li&gt;
&lt;li&gt;how changes ripple across a page&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You only see the real outcome after a build or preview deployment.&lt;/p&gt;
&lt;h3&gt;4. No built-in workflows&lt;/h3&gt;
&lt;p&gt;Websites need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;approvals&lt;/li&gt;
&lt;li&gt;version history&lt;/li&gt;
&lt;li&gt;scheduling&lt;/li&gt;
&lt;li&gt;role-based access&lt;/li&gt;
&lt;li&gt;localization&lt;/li&gt;
&lt;li&gt;multi-surface consistency&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Markdown doesn’t support this. Reproducing it means building bots, branch rules, and CI automation to simulate editorial workflows.&lt;/p&gt;
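&lt;p&gt;As one illustration, scheduling in a Git-based workflow often comes down to a build-time filter plus a recurring CI job that rebuilds the site. The sketch below assumes a hypothetical publish_at field and approved flag in each post’s frontmatter. Neither is a standard, and the helper names are made up for the example.&lt;/p&gt;

```python
from datetime import datetime, timezone


def is_live(post, now):
    """A post is live once it is approved and its publish time has passed."""
    return bool(post.get("approved")) and now >= post["publish_at"]


def publishable(posts, now=None):
    """Filter posts at build time; a scheduled CI job would rerun the build."""
    now = now or datetime.now(timezone.utc)
    return [post for post in posts if is_live(post, now)]


# Hypothetical parsed frontmatter for three posts.
posts = [
    {"slug": "launch", "approved": True,
     "publish_at": datetime(2025, 12, 1, tzinfo=timezone.utc)},
    {"slug": "draft", "approved": False,
     "publish_at": datetime(2025, 12, 1, tzinfo=timezone.utc)},
    {"slug": "future", "approved": True,
     "publish_at": datetime(2026, 6, 1, tzinfo=timezone.utc)},
]

build_time = datetime(2026, 1, 1, tzinfo=timezone.utc)
print([post["slug"] for post in publishable(posts, build_time)])
```

&lt;p&gt;Nothing here covers who is allowed to set the approved flag, how a post gets unpublished, or how an editor picks a time zone. Each of those is another piece to build.&lt;/p&gt;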
&lt;h3&gt;5. No shared space for collaboration&lt;/h3&gt;
&lt;p&gt;Markdown keeps contributors split across tools:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;developers in Git&lt;/li&gt;
&lt;li&gt;designers in Figma&lt;/li&gt;
&lt;li&gt;marketers in docs&lt;/li&gt;
&lt;li&gt;reviewers in comments elsewhere&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There is no shared environment for the website itself. Markdown is simple until the website isn’t.&lt;/p&gt;
&lt;h2&gt;Can AI agents solve this gap?&lt;/h2&gt;
&lt;p&gt;AI agents can automate a lot inside a Markdown workflow. They can generate files, rewrite content, reorganize structure, migrate documents, clean up frontmatter, and handle several repetitive tasks developers used to own.&lt;/p&gt;
&lt;p&gt;But agents don’t replace the systems that keep a full website consistent and safe.&lt;/p&gt;
&lt;p&gt;To match what a CMS provides, AI agents would need to handle:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;schema enforcement&lt;/li&gt;
&lt;li&gt;field validation&lt;/li&gt;
&lt;li&gt;relationship modeling&lt;/li&gt;
&lt;li&gt;reference checks&lt;/li&gt;
&lt;li&gt;localization parity&lt;/li&gt;
&lt;li&gt;role-based permissions&lt;/li&gt;
&lt;li&gt;publishing controls&lt;/li&gt;
&lt;li&gt;approvals and sequencing&lt;/li&gt;
&lt;li&gt;preview environments&lt;/li&gt;
&lt;li&gt;multi-surface consistency&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;AI can help with pieces of this, but it can’t &lt;em&gt;be&lt;/em&gt; the system.&lt;/p&gt;
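&lt;p&gt;Localization parity is one of those pieces. A minimal sketch, assuming a hypothetical layout where each locale maps to the set of page slugs found on disk:&lt;/p&gt;

```python
def missing_translations(pages_by_locale, source_locale="en"):
    """Map each non-source locale to the source slugs it has no page for."""
    source_slugs = pages_by_locale[source_locale]
    gaps = {}
    for locale, slugs in pages_by_locale.items():
        if locale == source_locale:
            continue
        missing = sorted(source_slugs - slugs)
        if missing:
            gaps[locale] = missing
    return gaps


# Hypothetical content tree: locale code mapped to the slugs that exist.
site = {
    "en": {"home", "pricing", "blog/markdown-vs-cms"},
    "de": {"home", "pricing"},
    "fr": {"home", "pricing", "blog/markdown-vs-cms"},
}

print(missing_translations(site))
```

&lt;p&gt;A script like this, or an agent running it, can flag the gap. It doesn’t decide who fixes it, when the check runs, or whether it blocks a publish.&lt;/p&gt;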
&lt;p&gt;To get all of these capabilities in a Markdown world, you’d need to build:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;your own schema engine&lt;/li&gt;
&lt;li&gt;your own validation pipeline&lt;/li&gt;
&lt;li&gt;your own preview pipeline&lt;/li&gt;
&lt;li&gt;your own workflow and approval model&lt;/li&gt;
&lt;li&gt;your own contributor UI&lt;/li&gt;
&lt;li&gt;your own localization framework&lt;/li&gt;
&lt;li&gt;your own content governance rules&lt;/li&gt;
&lt;li&gt;your own automation to prevent destructive edits&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At that point, you’re not “avoiding a CMS.” You’ve built one. AI assists the work. It does not replace the infrastructure that coordinates it. This is the distinction most people miss.&lt;/p&gt;
&lt;h2&gt;The argument for a CMS&lt;/h2&gt;
&lt;p&gt;A &lt;a href=&quot;https://www.sanity.io/blog/you-should-never-build-a-cms&quot;&gt;CMS solves the problems Markdown can’t&lt;/a&gt;. It gives you structure, validation, workflows, and a safer way for non-technical contributors to manage content without touching code.&lt;/p&gt;
&lt;p&gt;People don’t write inside the CMS. They publish inside the CMS.&lt;/p&gt;
&lt;p&gt;That’s the important distinction.&lt;/p&gt;
&lt;p&gt;Even if AI drafts and rewrites content, the CMS provides:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a consistent schema&lt;/li&gt;
&lt;li&gt;a place to enforce rules&lt;/li&gt;
&lt;li&gt;a predictable workflow&lt;/li&gt;
&lt;li&gt;a safe environment to update live content&lt;/li&gt;
&lt;li&gt;a shared model for content relationships&lt;/li&gt;
&lt;li&gt;clear separation between editing and code&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That’s why CMSs exist.&lt;/p&gt;
&lt;p&gt;Not because writing is hard, but because maintaining content across teams requires structure that Markdown doesn’t provide on its own.&lt;/p&gt;
&lt;p&gt;But CMSs are not perfect either. They can feel disconnected from the real site. They often separate content from layout. Editors can break things. API layers add overhead. Preview systems vary in quality.&lt;/p&gt;
&lt;p&gt;So even though a CMS solves real coordination problems, it introduces friction in other areas.&lt;/p&gt;
&lt;p&gt;This is why a third model matters.&lt;/p&gt;
&lt;h2&gt;The case for visual development&lt;/h2&gt;
&lt;p&gt;Visual development sits between Markdown and a CMS. It treats the website as a full system, not just content or code. It gives teams a shared environment with structure built in.&lt;/p&gt;
&lt;p&gt;In a visual development model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;developers define components, logic, and structure&lt;/li&gt;
&lt;li&gt;designers work directly in the real layout&lt;/li&gt;
&lt;li&gt;editors update content inside the same system&lt;/li&gt;
&lt;li&gt;schema and relationships stay consistent&lt;/li&gt;
&lt;li&gt;content and layout stay aligned&lt;/li&gt;
&lt;li&gt;AI operates with awareness of the actual components and fields&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It removes the separation between:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;content and layout&lt;/li&gt;
&lt;li&gt;authors and developers&lt;/li&gt;
&lt;li&gt;code and preview&lt;/li&gt;
&lt;li&gt;structure and design&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Markdown doesn’t provide this.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://webflow.com/blog/headless-cms-developer-tradeoffs&quot;&gt;Headless CMSs&lt;/a&gt; rarely provide this.&lt;/p&gt;
&lt;p&gt;And the role of AI changes completely in a visual development system. AI isn’t generating files in a vacuum. It works inside the actual structure of the site. That means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI site builders generate real pages using real components&lt;/li&gt;
&lt;li&gt;AI visual editing adjusts layout, spacing, and structure with context&lt;/li&gt;
&lt;li&gt;AI app scaffolding uses actual data models and extension points&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;AI doesn’t sit on the side. It participates in the system.&lt;/p&gt;
&lt;p&gt;Visual development builds the website as one environment instead of splitting it across tools.&lt;/p&gt;
&lt;p&gt;This is the gap &lt;a href=&quot;https://webflow.com/&quot;&gt;Webflow&lt;/a&gt; fills.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;ℹ️ Some platforms even combine visual development with Git-based workflows, like &lt;a href=&quot;https://www.builder.io/fusion&quot;&gt;Builder.io&amp;#39;s Fusion&lt;/a&gt; approach, which shows there’s real demand for visual editing that still fits into an engineering-centric versioning model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;So what does a team lose moving from a CMS to Markdown?&lt;/h2&gt;
&lt;p&gt;You lose:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;schema enforcement&lt;/li&gt;
&lt;li&gt;relationship modeling&lt;/li&gt;
&lt;li&gt;predictable workflows&lt;/li&gt;
&lt;li&gt;safe editing environments&lt;/li&gt;
&lt;li&gt;shared context across teams&lt;/li&gt;
&lt;li&gt;reliable previews &lt;em&gt;(unless you maintain your own preview system)&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;localization pipelines&lt;/li&gt;
&lt;li&gt;role-based permissions&lt;/li&gt;
&lt;li&gt;structured publishing&lt;/li&gt;
&lt;li&gt;cross-page consistency&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can rebuild some of this, but now you’re maintaining a custom system that grows in complexity as your website grows.&lt;/p&gt;
&lt;h2&gt;What a team gains by going Markdown-first&lt;/h2&gt;
&lt;p&gt;You gain:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;full developer control&lt;/li&gt;
&lt;li&gt;simpler architecture&lt;/li&gt;
&lt;li&gt;fewer systems&lt;/li&gt;
&lt;li&gt;automation through agents&lt;/li&gt;
&lt;li&gt;transparency through Git&lt;/li&gt;
&lt;li&gt;lower overhead&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It’s a strong option for small, fully technical teams with simple content needs and tight ownership over the site.&lt;/p&gt;
&lt;p&gt;It’s not designed for companies with cross-functional contributors or evolving structure.&lt;/p&gt;
&lt;h2&gt;What visual development offers instead&lt;/h2&gt;
&lt;p&gt;Visual development isn’t trying to be Markdown. It isn’t trying to be a CMS either.&lt;/p&gt;
&lt;p&gt;It offers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;structure like a CMS&lt;/li&gt;
&lt;li&gt;flexibility like a codebase&lt;/li&gt;
&lt;li&gt;real layout context like a design tool&lt;/li&gt;
&lt;li&gt;shared collaboration across roles&lt;/li&gt;
&lt;li&gt;AI that works inside the true structure&lt;/li&gt;
&lt;li&gt;developer extension points when needed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It’s the only model where the website stays a single source of truth for everyone.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/blog/images/markdown-vs-cms/Webflow_AI_png.png&quot; alt=&quot;Webflow_AI.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Developers keep control. Designers keep fidelity. Marketers keep clarity. AI doesn’t act blindly. That’s why visual development matters in this debate.&lt;/p&gt;
&lt;h2&gt;The simple way to think about it&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Markdown + AI&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Great when the team is fully technical and the site structure is simple.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CMS&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Great when content structure, governance, and shared responsibility matter.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Visual development&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Great when a website is treated as a unified system where design, content, development, and AI all work together with real context.&lt;/p&gt;
</content:encoded><category>CMS</category><category>Markdown</category><category>Visual Development</category><category>AI</category></item><item><title>Inside my developer marketing stack</title><link>https://lukestahl.io/blog/my-stack/</link><guid isPermaLink="true">https://lukestahl.io/blog/my-stack/</guid><description>Every builder finds their own rhythm. This is mine. A look at the tools I use every day in developer marketing, how they fit together, why they’ve stuck, and how they keep the work moving.</description><pubDate>Mon, 24 Nov 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/blog/images/my-stack/My_Stack_png.png&quot; alt=&quot;My_Stack.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Every builder has a rhythm to how they build. For me, that rhythm lives somewhere between code and content, the overlap where developer marketing happens. These tools help me stay organized, move faster, and keep projects on track without overcomplicating the work.&lt;/p&gt;
&lt;h3&gt;IDE&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://cursor.com/&quot;&gt;Cursor&lt;/a&gt; is now my main workspace after years in &lt;a href=&quot;https://code.visualstudio.com/&quot;&gt;VS Code&lt;/a&gt;. They share the same foundation and extensions, but Cursor feels like where things are moving.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Same integrations and ecosystem as VS Code&lt;/li&gt;
&lt;li&gt;Familiar setup that just works&lt;/li&gt;
&lt;li&gt;AI that’s advancing faster than VS Code’s, which is surprising since GitHub built Copilot first&lt;/li&gt;
&lt;li&gt;I pair it with &lt;a href=&quot;https://claude.ai/code&quot;&gt;Claude Code&lt;/a&gt; in the terminal for context, explanations, and large-scale debugging when I need deeper insight&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It’s less about what’s new and more about pace. Cursor feels like it’s evolving while VS Code feels like it’s coasting.&lt;/p&gt;
&lt;h3&gt;AI chat tools&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://chat.openai.com/&quot;&gt;ChatGPT&lt;/a&gt; is my go-to for projects, building custom GPTs, and setting up connectors or integrations. I also use it for content curation when I’m gathering examples or early research.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://gemini.google.com/&quot;&gt;Gemini&lt;/a&gt; handles search and visual creation. Its new Nano Banana feature makes generating visual references surprisingly quick.&lt;/p&gt;
&lt;p&gt;Right now, the two tools serve different purposes, but my guess is Gemini eventually edges out OpenAI. We’ll see.&lt;/p&gt;
&lt;h3&gt;Design and visual thinking&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://figma.com/&quot;&gt;Figma&lt;/a&gt; is where I build and prototype designs. I don’t use its AI features; they’re still behind the curve. What makes it useful is how quickly I can move from layout ideas to something ready for production. Components, variables, and shared libraries keep everything consistent without extra effort.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://excalidraw.com/&quot;&gt;Excalidraw&lt;/a&gt; is where I whiteboard. I sketch flows, map systems, and rough out diagrams before anything formal. A cheat code I use is generating Mermaid diagrams in ChatGPT to get a starting point, then rebuilding them in Excalidraw to make them clearer and more visual.&lt;/p&gt;
&lt;h3&gt;Analytics stack&lt;/h3&gt;
&lt;p&gt;A big part of developer marketing is understanding how people use what you build. These tools help turn product usage into insight.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://posthog.com/&quot;&gt;PostHog&lt;/a&gt; for web analytics, event tracking, and session replay&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.fullstory.com/&quot;&gt;FullStory&lt;/a&gt; for journey mapping and deeper behavioral insights&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.sigmacomputing.com/&quot;&gt;Sigma&lt;/a&gt; for the full view, from lead tracking to closed-won pipeline reporting&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://looker.com/&quot;&gt;Looker&lt;/a&gt; for clean visualization and quick summaries&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There isn’t one analytics platform that does everything well. A mix is usually the reality, and this is the combination that works for me.&lt;/p&gt;
&lt;h3&gt;Marketing and SEO stack&lt;/h3&gt;
&lt;p&gt;This is the set of tools I rely on to manage search, automation, and content performance. Each one plays a different role, from tracking technical SEO to finding new opportunities and testing campaigns.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://ahrefs.com/&quot;&gt;Ahrefs&lt;/a&gt; is my go-to for monitoring technical SEO, tracking backlinks, and doing competitive keyword research or content gap analysis&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://surferseo.com/&quot;&gt;SurferSEO&lt;/a&gt; helps grade keyword-focused content and tighten on-page structure before publishing&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://ads.google.com/home/tools/keyword-planner/&quot;&gt;Google Keyword Planner&lt;/a&gt; is great for generating new ideas based on topics or URLs and gives solid monthly search volume data&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://ads.google.com/&quot;&gt;Google Ads&lt;/a&gt; still earns its place. Paid search still works, especially in developer marketing where people search everything (even &lt;a href=&quot;https://www.bing.com/&quot;&gt;Bing&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.airops.com/&quot;&gt;AirOps&lt;/a&gt; is a newer tool I’ve been using for automation workflows, from competitive intel to content research, all built through AI&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://buffer.com/&quot;&gt;Buffer&lt;/a&gt; is my social tool of choice, mainly because its free tier goes further than most and supports multiple platforms without friction&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.commonroom.io/&quot;&gt;Common Room&lt;/a&gt; helps track and understand community activity across social, forums, and other developer channels&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This stack helps me connect research, creation, and distribution. It’s about staying close to what works and improving a little with each iteration.&lt;/p&gt;
&lt;h3&gt;Sales and enablement stack&lt;/h3&gt;
&lt;p&gt;This is where product knowledge turns into enablement.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.gong.io/&quot;&gt;Gong&lt;/a&gt; for reviewing customer calls and spotting themes that shape positioning and messaging&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.workramp.com/&quot;&gt;WorkRamp&lt;/a&gt; for internal education and onboarding content, keeping sales and marketing aligned on what’s new&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.figma.com/slides/&quot;&gt;Figma Slides&lt;/a&gt; for building presentations with real design flexibility. It feels like designing, not fighting templates.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.arcade.software/&quot;&gt;Arcade&lt;/a&gt; for creating interactive product demos that help explain features in context without heavy editing or production time&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They keep information flowing and teams aligned on what actually matters: the customers using the product.&lt;/p&gt;
&lt;h3&gt;Collaboration stack&lt;/h3&gt;
&lt;p&gt;Collaboration works best when it’s async, not noisy.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://notion.so/&quot;&gt;Notion&lt;/a&gt; is where I write everything: content drafts, specs, notes, and ideas. It’s underrated for collaboration and perfect for pulling in code snippets or assets. People think they need Google Docs, but they’re wrong.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://airtable.com/&quot;&gt;Airtable&lt;/a&gt; is a dream for anyone who loves spreadsheets. It’s flexible, visual, and great for tracking content and projects. I’ve been a fan since day one, even if not everyone adopts it.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://slack.com/&quot;&gt;Slack&lt;/a&gt; is the only real choice for team communication. I live and die by the save-for-later feature, and the new canvas tools are great for organizing thoughts. Don’t @ me with Microsoft Teams.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.loom.com/&quot;&gt;Loom&lt;/a&gt; is the best way to explain anything on video. It’s perfect for walkthroughs, feedback, or education. The editing tools are OK, not Descript-level, but they work well enough. The AI still has some catching up to do.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This stack makes it easy to share work, document ideas, and keep everyone aligned without the constant back-and-forth.&lt;/p&gt;
&lt;h3&gt;Developer and data stack&lt;/h3&gt;
&lt;p&gt;This is where projects take shape.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/&quot;&gt;GitHub&lt;/a&gt; for version control and collaboration&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://supabase.com/&quot;&gt;Supabase&lt;/a&gt; for quick databases and APIs&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://vercel.com/&quot;&gt;Vercel&lt;/a&gt; or &lt;a href=&quot;https://www.netlify.com/&quot;&gt;Netlify&lt;/a&gt; for deployments&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.warp.dev/&quot;&gt;Warp&lt;/a&gt; for a cleaner, more visual terminal&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It’s a reliable setup that scales smoothly and keeps development fast without adding unnecessary infrastructure.&lt;/p&gt;
&lt;h3&gt;Closing thoughts&lt;/h3&gt;
&lt;p&gt;These tools have stuck because they make the work smoother. They cover everything I need to build, write, and collaborate without slowing things down.&lt;/p&gt;
&lt;p&gt;The stack will keep changing, but the goal won’t.&lt;/p&gt;
</content:encoded><category>Tech Stack</category><category>AI</category><category>Developer Marketing</category></item></channel></rss>