We actually built a full game, but people I showed it to were most surprised when I told them that the music was realtime, pure JS, and vibe coded; thus the title.
Before reading, I'd recommend playing the game so that you can put the story in context.
Let's begin.
Why
At work we've been evaluating LLMs to see if they can help us as engineers (I'm convinced they can, but we're still collectively playing around more). So the natural thought I had one evening after work was, hey, let me vibe code something for lulz.
Vibe coding is when you forget everything about the program, forget your hard-earned wisdom, forget best practices, and just come up with a prompt and ask an LLM to make the program for you instead of writing it by hand. This was not really possible a year ago (for multiple reasons), but it is now very much possible.
The term originated with Andrej Karpathy's tweet.
After a phase of silliness, people realized this way of software engineering actually works for many tasks, and are now trying to see how to vibe code responsibly; e.g. see this talk by Anthropic.
Vibe coding is not just for software. For example, people are vibing Mathematics.
True to the vibe spirit, I didn't even bother coming up with an idea, but instead picked one almost at random from the #idea channel we have in our Discord, a shower thought by our CEO Vishnu:
Game to dismiss as many cookie banners as you can in 60s
Take 1
I placed only one constraint on myself: I would not edit anything by hand. The entire game should be LLM generated.
Since I wanted the end result to be easily playable, I decided on a web based game, and started hacking around in a Claude artifact (a souped up version of a regular Claude chat that can also render an interactive web page).
Things didn't go well. Even before the euphoria of the call to adventure had subsided, I'd hit the downward curve of the hero's arc.
Claude immediately gave me a playable game, but it was meh, and I spent a few hours prompting it this way and that to elevate the game (both gameplay and visuals) before realizing that I, rather we, Claude and I, were going around in circles.
The next evening I changed tack. Instead of asking it to give me a playable game, I asked it to give me a few conceptual variations first, with the thought of spending more energy on promising avenues. It did give me variations, but still nothing sparked.
On the third day I formed a mental picture of what I wanted the game to be like, without saying it aloud to Claude.
On the fourth evening, too, I teetered and tottered around for a few hours. I picked a visual direction that was okay, but struggled with getting the gameplay right. I asked it for variations on difficulty ramps, but again, nothing sparked.
I was losing hope. I knew Claude could do it; what was frustrating me was that I was unable to formulate how to get it to do it. And at the edges were those nagging doubts: maybe, indeed, Claude couldn't do it.
Take 5
It was the weekend. Emerging from the Zen of my Twitter feed, calm like a pond with fishes eating each other inside, I started a fresh chat (artifact), and gave Claude the following prompt, with a black and white Bauhaus-y image as a visual reference:
I want to make a game where the player has to close as many gdpr cookie banners as they can in 30 seconds. The pacing of the game should be like WarioWare. The visual style should be lik
The prompt might look nonchalant, but every word, and the overall phrasing, was a culmination of all the previous takes. As a small example, along the way I'd realized 60 seconds was too long for a cookie clicker game, and had changed it to 30.
I wasn't expecting it to work. If you see the full transcript of my interaction with Claude, you'll see how I absentmindedly said "A" first and had a typo in "lik" - things I normally wouldn't do, since they can slightly harm the context. But even before giving it the prompt, I remember feeling that I had a better understanding of the problem I was posing for Claude, and of how to communicate it, compared to the first take.
And it worked!
One playthrough of the generated game, and I knew that we were 80% there.
Which brings me to the first learning I'd like to share:
Learning #1
When trying to vibe code, if the first shot doesn't already get you the majority of the way, don't try to haggle with the LLM and get it to delta-fix things. Instead, trash that session, spend time iterating on your initial prompt, and then try afresh.
I tinkered with it to get the remaining 20% right, tweaking placements, asking it to add some colors, etc. This took time, but I was in a happy state of mind, so it felt like playing.
Tip #1
At least currently, web-based Claude artifacts are not good for incremental edits - the resultant artifact has various bugs, and I often needed to ask it to "regenerate the page" to fix them. I think this is just a teething issue; Claude Code doesn't have this problem - it is great at edits - and in principle Anthropic should be able to use the same mechanisms to fix the harness for the web-based artifacts too.
Finally, with the game play and visual look ironed out, I came to the last of the trifecta - the music direction. Mentally I'd kept it last since I felt it'd be easy - from years of dabbling in music, I knew that while there is no recipe for making great music, making music that sounds good is formulaic, and with Claude at my side doing the heavy lifting, I should be done in a few prompts.
Boy was I wrong.
The Second Storm
Some of my missteps you can see in the chat transcript, but those were not all of them; I started many new chats on the side too, as I grew increasingly bewildered at Claude's inability to do coherent music.
Tip #2
Claude artifacts have this great feature called "forking". You can go to a previous prompt in the conversation, click "Edit", change the prompt, and then Claude will fork the conversation with the context as it was at that point, effectively causing both Claude and me to travel down a new road.
At any point I can then go back to a previous fork and use the arrows to choose among the other roads taken. You can see examples of this in the chat transcript I shared (or, even better, experiment yourself!).
One thing I do wish is that these forks were visualized better in the UI.
After quite some fumbling around and going nowhere, I gave up.
It was during a walk later that I found the issue - Claude can't hear music! (yet)
What I'm about to say is just a guess, please excuse any inaccuracies.
The thing is, I'd been misled by Claude's ability to "see". From my understanding, if I point Claude at a visual, or a screenshot of something it has made, it can convert that to embeddings that it can then chew on and "see". I was, perhaps naively, extrapolating that ability to "hearing", but then I realized that the current incarnation of LLMs cannot embed sonic media the same way (though I'd expect them to soon, and maybe they already can and I was holding it wrong).
This might be an obvious point, but it took me a long time to figure out, since Claude kept "You're absolutely right!"-ing me as I asked it to listen to YouTube examples of, say, the bass line I wanted, while completely ignoring them and cooking up something entirely generic.
All that said, Claude does know what music is, and how to mechanically generate it, by virtue of reading the countless words that humans have poured out of their hearts about individual songs. It can't hear a song yet, but if we do tell it precisely what sort of notes to generate, it can.
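To make that concrete, here is a minimal sketch (my illustration, not the game's code) of what "telling it precisely what sort of notes to generate" boils down to on the web - the Web Audio API will schedule whatever notes you spell out:

// A minimal sketch: play a motif spelled out as explicit (frequency, beat)
// pairs. Run this from a click handler, since browsers only allow audio
// to start after a user gesture.
const ctx = new AudioContext();
const notes = [[261.63, 0], [329.63, 1], [392.00, 2], [523.25, 3]]; // C4 E4 G4 C5
const secondsPerBeat = 60 / 120; // 120 BPM

for (const [freq, beat] of notes) {
  const t = ctx.currentTime + beat * secondsPerBeat;
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  osc.frequency.setValueAtTime(freq, t);
  gain.gain.setValueAtTime(0.3, t);                                  // note on
  gain.gain.exponentialRampToValueAtTime(0.001, t + secondsPerBeat); // decay, so notes don't click off
  osc.connect(gain).connect(ctx.destination);
  osc.start(t);
  osc.stop(t + secondsPerBeat);
}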
Riding the Storm
Armed with this insight, the knowledge that Claude artifacts are not good at edits, and a general sense that I was pushing their context to its limits (each regeneration was taking longer and longer, sometimes timing out), I decided to bring in the heavyweight - Claude Code.
I knew I couldn't ask it to generate the music on its own. And the original constraint I'd placed on myself was that I wouldn't write any code by hand. To solve this impasse, I remembered the Steve Jobs calculator incident (tl;dr: an engineer, frustrated by Jobs' constant tweaks to the calculator app, made him a calculator construction kit instead).
Thus I started a separate session with Claude Code, with the intent of making a 30-second game music builder instead of the 30-second game music itself:
I want to design the music for a game. As a high level overview, the game is a clicker with a fixed 30 second time of play, so in effect we're creating a 30 second song. The song starts with minimal elements, then builds up by adding more voices (first kick drum, then bass, then pads, then more percussion as we reach 30). Each positive / negative click results in an stab sound that sound fit the song's vibe while being triggered on user action.
A previous claude tried building it, but despite their best efforts the output of their code did not match this description - there just was no audio (except when doing the explicit click actions). So we need to proceed iteratively, starting very simple and only adding the next element when you can demonstrate that the current sound layers you've added are producing sounds.
The game will run on the web, so it needs to use JavaScript.
Some more interesting snippets from the conversation follow. For example, I felt that while it can't hear music, it should know enough from human descriptions of how the TB-303 in "I Feel Love" sounds to be able to reproduce it, so I asked it to do just that:
Let us change the bass line to 16th acid-like pattern. driving, donna summers like.
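For the curious, the textbook 303 recipe that prompt is gesturing at is easy to sketch (my guess at the idea, not Claude's actual output): a sawtooth wave through a resonant lowpass filter whose cutoff is swept closed on every 16th note - the sweep is the "acid" squelch.

// Sketch of a 16th-note acid bass line. The riff frequencies (A1/A2/E2)
// are placeholders for illustration.
function acidBass(ctx, startTime, bars = 2, bpm = 125) {
  const sixteenth = 60 / bpm / 4;
  const riff = [55, 55, 110, 55, 82.41, 55, 55, 110]; // a driving, rolling pattern
  for (let i = 0; i < bars * 16; i++) {
    const t = startTime + i * sixteenth;
    const osc = ctx.createOscillator();
    const filter = ctx.createBiquadFilter();
    const gain = ctx.createGain();
    osc.type = 'sawtooth';
    osc.frequency.setValueAtTime(riff[i % riff.length], t);
    filter.type = 'lowpass';
    filter.Q.value = 12;                                              // high resonance
    filter.frequency.setValueAtTime(2000, t);                         // cutoff opens...
    filter.frequency.exponentialRampToValueAtTime(200, t + sixteenth); // ...then squelches shut
    gain.gain.setValueAtTime(0.25, t);
    gain.gain.exponentialRampToValueAtTime(0.001, t + sixteenth);
    osc.connect(filter).connect(gain).connect(ctx.destination);
    osc.start(t);
    osc.stop(t + sixteenth);
  }
}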
At each step of the conversation, Claude Code would make changes to a simple HTML file, which I kept open in a browser so that I could hear the results and guide it with more prompts.
Some more of these prompts are below (it can be boring to read someone else's prompts, but since the title of this post is about the soundtrack, I'm erring on the side of including more detail rather than less):
Can we try a happy high frequency bippity boppity melody as a counterpoint to the bass. Create it in a separate layer.
1. Can you make the layer indicators as buttons so that I can experiment with different orders of introducing these elements
2. Can you change the pad to a continuous drone, something in which other parts of the composition can sit in
Great, thank you. Let us try a happier pad, and something that is higher in the frequency range. For stabs, let us create two variant. One can be an "ah" sound for when the user clicks in the correct place, and the second one is the error like sound for when the user clicks in the wrong place. The error sound should also echo a few times (with decreasing levels)
The error sound is perfect. The ah sound isn't. Let us also add echo to the ah sound, and increase its volume. Then let us try giving more energy to it, maybe a "ha ha" feel
The pad sounds is also great, but let us add a I-v-iv long sequence to the underlying chord over the duration of the pad - that it, it starts at the current chord, but then transitions to v then resolves to iv as the pieces is finishing
Great, the accept sound is perfect. The chord progression is also great, but we need to maybe rearrange the piece to best use it.
1. The piece can start with the pad
2. Kicks can come in at 5 seconds
3. Bass can come in at 10 seconds.
4. Melody can come in at 15 seconds.
5. Perc can come in at 25 seconds.
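Mechanically, that last arrangement is just functions scheduled at fixed offsets into the 30-second piece. A hypothetical sketch, where startPad, startKick, and friends are placeholder per-layer helpers (not names from the actual code):

// Each helper receives an absolute start time: Web Audio scheduling is
// sample-accurate, so we pass times instead of relying on setTimeout.
function startSong(ctx) {
  const t0 = ctx.currentTime;
  const schedule = [
    [0,  startPad],    // pad from the very start
    [5,  startKick],   // kick at 5 seconds
    [10, startBass],   // bass at 10 seconds
    [15, startMelody], // melody at 15 seconds
    [25, startPerc],   // percussion for the final stretch
  ];
  for (const [offset, startLayer] of schedule) {
    startLayer(ctx, t0 + offset);
  }
}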
Finally, let me call out a happy accident. The song had an irritating beep at the 10 and 20 second marks, something I hadn't asked it to add. I wasn't reading the code, but I wagered a guess that this was filter resonance when the chord transitions were happening. Like a disciplined jazz improviser, I wanted to celebrate the mistake, so instead of taking it out, I just asked Claude to bring the beep down to more sumptuous levels:
Maybe add a limiter or volume envelope over the chords to ensure the artifact volume remains withing some threshold (without changing other parts of the pad).
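In Web Audio terms, the usual way to approximate a limiter is a DynamicsCompressorNode with a hard knee and a high ratio. Something along these lines (a sketch under those assumptions, not the actual fix Claude made):

// Route just the pad through this node instead of ctx.destination,
// leaving the other layers untouched.
function makePadLimiter(ctx) {
  const limiter = ctx.createDynamicsCompressor();
  limiter.threshold.value = -18; // clamp anything peaking above -18 dB
  limiter.knee.value = 0;        // hard knee: limiting kicks in abruptly
  limiter.ratio.value = 20;      // 20:1 is effectively limiting, not gentle compression
  limiter.attack.value = 0.003;  // fast enough to catch a resonance spike
  limiter.release.value = 0.25;
  limiter.connect(ctx.destination);
  return limiter;
}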
That's about it. There was the usual head banging against Claude to fix its minor mistakes while it tried to gaslight me into ignoring them, but nothing too bad (I was having fun!). At the end of this process, I had an HTML file which played the song when I clicked a button.
To ELI5 - I didn't create an MP3 for the song. The entire song is generated on the fly using JavaScript code.
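Concretely (a sketch in the same spirit as the game's code, not the code itself): when you click, a handful of Web Audio nodes are created, play, and are garbage collected - nothing is decoded or streamed. Here is, for instance, how a stab with decaying echoes (like the error sound described earlier) can be built:

// A click-triggered stab: short percussive envelope, plus a feedback
// delay so each repeat comes back at half the previous level.
function playStab(ctx, freq = 880) {
  const t = ctx.currentTime;
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  const delay = ctx.createDelay(1.0);
  const feedback = ctx.createGain();
  osc.type = 'square';
  osc.frequency.setValueAtTime(freq, t);
  gain.gain.setValueAtTime(0.4, t);
  gain.gain.exponentialRampToValueAtTime(0.001, t + 0.15);
  delay.delayTime.value = 0.25;               // echoes every quarter second
  feedback.gain.value = 0.5;                  // each echo at half the previous level
  osc.connect(gain).connect(ctx.destination); // dry path
  gain.connect(delay);
  delay.connect(feedback).connect(delay);     // feedback loop
  delay.connect(ctx.destination);             // wet path
  osc.start(t);
  osc.stop(t + 0.15);
}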
Master of the Two Worlds
I gave this HTML file containing the generated song to the original Claude artifact that was generating the game, asking it to integrate it into the gameplay, and it did so almost in the first shot.
I exported the resultant HTML file (now containing both the game and the soundtrack) into the file system, asked Vishnu to buy the consent.gg domain (I'd decided on the domain during the first day of brainstorming), and plonked that HTML file there.
That's it.
At no step in the process did I write even a single character of code by hand. After deploying it, I wanted to make a few tweaks, e.g. add a meta description tag, but even for these I didn't open a text editor; instead I launched Claude Code in the folder, asked it to read the entire HTML file into context, and make the changes I wanted.
Afterword
If you're one of those folks who is angry about any of this, you're not realizing (a) what you're missing out on because of your preconceptions, and (b) the fun I had in the entire process.
If you're not angry, I'd love to hear your thoughts! There's a lot of noise but very little insight from people on the ground who are learning how to communicate with these new alien intelligences that humanity has found, and I'd love to hear your anecdotes (indeed, that's why I've dusted off and started using my Twitter again; I realized that most AI alpha is in the chaos on X).