HTTP: H is for Hallucinated

A Server Of Little Purpose

For the short-on-time: the ginprov web server is real and it’s live at ginprov.com. You probably want to check that out before reading the article. Try changing the URL to some random path and see what happens!

Please note the results are more impressive on desktop than on mobile. My personal favorite so far is Broccoli Desserts Amazing.

Background

In Font Around and Find Out I joked about browsers generating pages with an LLM when the real server is slow. I introduced this as a bad idea, but what if it’s a good idea? What if, instead of developing a website, you could just have an LLM take requests and return HTML responses? I wanted to try this. Then I saw an article on Hacker News about Anthropic discontinuing their own LLM-generated blog, and I really wanted to try it. Since the LLM does most of the work, it should be easy…

Ginprov

The Ginprov improvisational web server makes up responses as it goes along. You can run your own instance and try it out locally in under a minute!

How does it work? I decided to put the greatest possible burden onto the LLM:

When a path is first requested, the LLM is asked to perform a safety check.
Then the LLM is asked to create a site description and outline.
After that, for every user navigation, the LLM generates the HTML in real time. Its context contains the outline and a list of other known pages.
Images are handled similarly.
All LLM output is cached once generated and served as a static page.
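The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual Ginprov code: the LLM interface, the Server and site types, the Get method, and the split of a request into a topic and a path are all assumptions made for the example, and a fake model stands in for Gemini. For simplicity it holds a single lock across the LLM calls, which a real server would probably avoid.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// LLM is a hypothetical interface standing in for the real model client.
type LLM interface {
	IsSafe(topic string) (bool, error)
	Outline(topic string) (string, error)
	Page(outline string, knownPages []string, path string) (string, error)
}

// ErrUnsafe is returned when the safety check rejects a topic.
var ErrUnsafe = errors.New("topic rejected by safety check")

// site holds the outline and the cached pages for one generated site.
type site struct {
	outline string
	pages   map[string]string // path -> generated HTML
}

// Server is a hypothetical wrapper that runs the steps in order.
type Server struct {
	llm   LLM
	mu    sync.Mutex
	sites map[string]*site
}

func NewServer(llm LLM) *Server {
	return &Server{llm: llm, sites: make(map[string]*site)}
}

// Get runs the safety check and outline once per topic, then generates
// each page at most once and serves it from the cache afterwards.
func (s *Server) Get(topic, path string) (string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	st, ok := s.sites[topic]
	if !ok {
		safe, err := s.llm.IsSafe(topic)
		if err != nil {
			return "", err
		}
		if !safe {
			return "", ErrUnsafe
		}
		outline, err := s.llm.Outline(topic)
		if err != nil {
			return "", err
		}
		st = &site{outline: outline, pages: make(map[string]string)}
		s.sites[topic] = st
	}
	if html, ok := st.pages[path]; ok {
		return html, nil // cached: no LLM call
	}
	// The page prompt sees the outline and the other known pages.
	known := make([]string, 0, len(st.pages))
	for p := range st.pages {
		known = append(known, p)
	}
	sort.Strings(known)
	html, err := s.llm.Page(st.outline, known, path)
	if err != nil {
		return "", err
	}
	st.pages[path] = html
	return html, nil
}

// fakeLLM is a stand-in so the sketch runs without a model.
type fakeLLM struct{ calls int }

func (f *fakeLLM) IsSafe(topic string) (bool, error)    { return topic != "forbidden", nil }
func (f *fakeLLM) Outline(topic string) (string, error) { return "Outline for " + topic, nil }
func (f *fakeLLM) Page(outline string, known []string, path string) (string, error) {
	f.calls++
	return fmt.Sprintf("<h1>%s</h1><p>%s</p>", path, outline), nil
}

func main() {
	llm := &fakeLLM{}
	srv := NewServer(llm)
	html, err := srv.Get("broccoli-desserts", "/index")
	fmt.Println(html, err)
	_, _ = srv.Get("broccoli-desserts", "/index") // second request hits the cache
	fmt.Println("page generations:", llm.calls)
}
```

The design choice worth noting is that the safety check and outline are per-site work while page generation is per-path work, so only the first request to a new site pays for all three calls.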

This prompting in code is a bit messy; I need to find a better pattern for this part of an LLM-based app. You can find it in prompter.go.

Which LLM?

For this project I used Gemini Flash. Only Gemini has image generation fast enough for my attention span.

Progress

I’m pleased with the approach to progress. If a page is not yet generated, the LLM’s raw responses are streamed to the user using HTTP chunked transfer encoding. When generation is complete, the page is refreshed to reveal the final content.

Just an LLM Wrapper

The wrapper code invokes the LLM, applies some concurrency control and limits, and performs retries if the LLM fails to produce valid output. It also cleans up the CSS/HTML, forces links to be relative, and tracks pages that are part of the generated site. It was maybe a bit more code than I expected, but not too much. I found that only images needed to be retried; HTML from Gemini has arrived perfectly formed since I started this project.

The server’s output is cached and served by the Cloudflare CDN with a reasonable caching policy.

Results

Take a look at the deployed instance on ginprov.com. Try following links to see what others have found, or just browse to a random topic of your own.

I was pleased with how well the LLM did. As a concept site it works really well. Will this technology replace real websites? Not yet: the output is too uncontrolled and unhinged. But maybe there will be some middle ground where web content starts to adapt more to the user, like the video game from Ender’s Game. As with anything LLM, only time will tell.

Next Steps

Please leave a star on ginprov! For more articles like this follow me on X and subscribe to my RSS Feed. Thanks for reading!