Shepherd's Dog: A Game by the Most Dangerous AI Model

78 points by vnglst 3 hours ago

raincole 9 minutes ago

There were dozens (if not hundreds) of more complex games made by Fable on Twitter the first day it was released. The only reason this is on the HN front page is the stupid clickbait title.

Some random examples:

https://x.com/fe_yukichi/status/2064635098411180374
https://x.com/akiraxtwo/status/2064780732082651402
https://x.com/kieradev/status/2064482704763085202
https://x.com/VincentLogic/status/2064699740936356065
https://x.com/XiaohuiAI666/status/2064994538591223911

jna_sh 2 hours ago

“can it build a game idea I've had for years, in a single shot?”

Do people do no research or introspection when they’ve had an “idea for years”? There are countless examples of this exact game. I played this on the Game Boy Advance! There’s like 50 of them on the App Store right now.

The standard “this almost certainly exists wholesale in the training data” applies, but I’m also interested in how you carry an idea for years and don’t notice this, or whether the “idea” here was actually “using this thing that’s been remade thousands of times as an AI benchmark”.

There’s nothing wrong with remaking an old classic formula, especially in game dev. It’s describing it as “an idea I’ve had for years” that rings weird.

redrobein 2 hours ago

While I agree that it isn't revolutionary that it could implement this from a single prompt, what's surprising to see is how well done this one is compared to the other tries. The controls and movement are smooth, the animations aren't jittery, the ui makes sense, there's a clear progression in difficulty. This model clearly "understands" the implementation of this game far better than the others did.
vnglst 2 hours ago

I also realized this: a quick Google search would’ve told me that this game has been made several times before, including way before I ever had this idea. Apparently it’s a pretty obvious game idea.
Ah well, it’s still fun and it does appear to measure how good AI is at creating these kinds of games.
- dools 1 hour ago
  
  Well … it’s a measure of how good it is at reproducing a game that probably already exists in multiple forms in its training data.
  
  puttycat 1 hour ago
  
  The question is more whether this game exists as open source somewhere in the training data (probably does).
  
  sevenzero 1 hour ago
  
  You can't possibly think those models are only trained on open source data?
fennecbutt 2 hours ago

I think that's exactly why AI is suited for 99% of stuff we do.
I have pointed out on here before that the number of truly unique human ideas not grounded in nature or previous ideas from others is almost nil; there are not many examples that someone can give me. All human ideas and work are derivative.
Elves? Humans with pointy ears. Werewolves? Humans mixed with wolves. Car tyre? Cart wheel...stone wheel/roller. Etc.
- jna_sh 2 hours ago
  
  I feel like prior to GenAI, you’d have had to reckon with the true originality of your idea in some form as you did the research. Creatives having to confront their own unoriginality is such a common experience that it is itself reflected in countless pieces of media.
  So it’s interesting to me that the creator here didn’t encounter the tens of physically published versions, or the hundreds of them shipped to digital app stores, or all the codebases on GitHub, in the course of making this. I’m sure they would have done naturally prior to GenAI. Is that good or bad? I don’t know! But it’s interesting to me.
  
  NitpickLawyer 1 hour ago
  
  > the creator here didn’t encounter the tens of physically published versions
  The simplest counterargument: since there are already tens of similar games out there, why did the previous authors, supposedly grass-fed, genuine-checkmark, blood-through-their-veins humans, not notice the other 9-8-7-6-5... games, and still release their own version? Maybe because they still wanted the game out there? Maybe because originality really isn't that common? Maybe because each individual had their own idea and spin on it? Maybe because they wanted the game out as they made it?
  Same for this author. How they made the game is irrelevant, and nitpicking the "originality" or anything else is silly. Something like this wasn't possible 3 years ago. Now it's possible. Deal with it, and stop trying to find ways to diminish it. It's a huge accomplishment any way you cut it.
  
  jna_sh 50 minutes ago
  
  My thoughts are less about the merits of creating something that already exists than they are about _knowing_ you are doing that. Which I think my post made very clear :)
  
  NitpickLawyer 36 minutes ago
  
  > I’m sure they would have done naturally prior to GenAI.
  I gave a simple counterargument to this. Since there are "countless" prior games, many of them released before genAI, your argument is pointless.
  
  jna_sh 29 minutes ago
  
  Do you think the only reaction to knowing you’re not the first to do something is not to do it? Do you think I said that?
  To spell it out in case it is still non-obvious: knowing this allows iteration. It allows remixing. It allows you to inspect what has come before and what it did well and where it succeeded and where it fell short and thus what you could _add_. It is an enabler of creativity! Thus I think it is interesting that GenAI may make it harder to have this experience.
  
  customguy 36 minutes ago
  
  They said they think they would have encountered those other games without GenAI, not that they or any of those other authors shouldn't have released the game.
- bxk76 1 hour ago
  
  Just because AI can give you a recipe for a sandwich doesn't mean everyone who sells, buys, or experiments with making sandwiches is going to stop.
- ai_fry_ur_brain 1 hour ago
  
  I think this is false. New ideas are born every minute, and LLMs aren't going to help people with those for the most part; they'll end up steering you back towards the gradient if you try.
  
  0xEF 33 minutes ago
  
  Can you give us an example of a new idea that is not derivative of something that already exists? Should only take about a minute.
  Snark aside (and apologies), there's absolutely nothing wrong with the "no new ideas" take and nobody should think there is. Humans tend to work collectively, try as we might to do or appear otherwise, and often come to the same conclusions through reasoning and logic. No one person truly invented the light bulb, etc.; really, all inventive thought is branches of derivative thought as we build our collective knowledge base. A better question would be how many novel ideas are the logical conclusion of branches of derivative thought, and how many are tangents brought about by the injection of our irrationality.
uludag 51 minutes ago

Same thoughts exactly. I personally started looking into indie game dev and I've just started to realize how naive I was and how hard just game design can be, and that I'll probably never be good at it, and that most of my ideas are pretty garbage (or incomplete at best).
Even with the perfect AI to write, one would need to iterate through many different ideas, play testing constantly, getting people to play test and analyze what they found fun and where they got stuck. And to get the best ideas you'll need to be playing lots of different kinds of games.
neonstatic 39 minutes ago

Well, “an idea I’ve had for years” and “something that has never been done before” are not the same thing.
- jna_sh 37 minutes ago
  
  This is fair! I am possibly attaching some notion of originality to the word “idea” in the context of a project that others don’t.

ciscoriordan 2 hours ago

My Belgian Tervuren and I have a basic herding title and about 4 years of herding experience.

The sheep movement is excellent. You could make it even more realistic by having them favor lusher areas and by having one occasionally bolt spastically (hard mode?)

A handler mode where you play as a human and shout commands at the dog could be cool too!
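
The two suggestions above (favoring lusher areas, an occasional bolt) could be sketched roughly as below. This is a minimal illustration, not the game's actual code: `sheep_step`, the `lushness` callback, `flee_radius`, `bolt_chance`, and the step sizes are all hypothetical names and values.

```python
import math
import random


def sheep_step(sheep, dog, lushness, rng, flee_radius=3.0, bolt_chance=0.01):
    """Return the (dx, dy) move for one sheep on this tick.

    sheep, dog: (x, y) positions.
    lushness:   hypothetical callback (x, y) -> float, higher means greener grass.
    rng:        a random.Random instance, so runs can be reproduced.
    """
    sx, sy = sheep
    dx, dy = sx - dog[0], sy - dog[1]
    dist = math.hypot(dx, dy)

    # Occasionally bolt: a fast dash in a random direction.
    if rng.random() < bolt_chance:
        angle = rng.uniform(0.0, 2.0 * math.pi)
        return (2.0 * math.cos(angle), 2.0 * math.sin(angle))

    # Dog is close: run directly away from it at unit speed.
    if 0.0 < dist < flee_radius:
        return (dx / dist, dy / dist)

    # Otherwise graze: drift slowly toward the lushest neighbouring spot.
    neighbours = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best = max(neighbours, key=lambda d: lushness(sx + d[0], sy + d[1]))
    return (0.5 * best[0], 0.5 * best[1])
```

A "hard mode" in this sketch would simply raise `bolt_chance`.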

da-x 10 minutes ago

Out of curiosity, I tried the same prompt with Qwen3.6-27B.

One shot produced a game with no sheep. I then had to tell it to fix two bugs.

Overall, the graphics and game seem good enough, and better than most of the closed models that were shown. Not surprisingly, though, it falls short of Fable.

I've put the index.html and open code session here:

https://github.com/da-x/when-ai-fails/tree/qwen3.6-27b/shepa...

evilturnip 1 hour ago

I think it’s impressive that an LLM can take you to a local maximum in one shot.

But once you start maintaining it, improving it and fixing bugs, you’ll eventually need to rip it apart and put it back together again while understanding how it all works.

This is why I think the better approach isn’t to one-shot but to have the architecture in your head and build it up piece by piece, with the AI accelerating the code writing.

dools 1 hour ago

I’ve found it very easy to maintain, add features to and fix bugs in software I’ve written entirely with LLMs, and in languages and frameworks with which I’m unfamiliar. You just ask the LLM to explain the code and then work with it to come up with the fix.
- ai_fry_ur_brain 57 minutes ago
  
  How big are those projects? I don't think this is good for your mental health or, physically, your brain's health. Problem solving keeps your brain strong. The laziness in us is inclined to take shortcuts; don't do it. It's like driving your car 3 blocks instead of walking, your physical health will suffer.
  
  dools 42 minutes ago
  
  > How big are those projects
  Define big I guess. They're non-trivial, mix of internal enterprise tools, a multiplatform app (android/ios/mac/windows/web currently headbutting its way through review), including a billing system for my small telecommunications business.
  > I don't think this is good for your mental health or, physically, your brain's health
  I find the experience of doing it without writing the code to be intellectually pretty similar. I still solve a lot of problems: the LLM couldn't, for example, one-shot the event sourcing model I built for syncing data between devices. It took quite a few iterations and I had to define a lot of the architecture, but I did it at a level that wasn't "here is a class, here is a module, this module does XYZ", more at the "whitepaper" level of describing how specific bits of the app needed to work in order to solve some problem.
  It's also very similar to managing other developers.
  > It's like driving your car 3 blocks instead of walking, your physical health will suffer
  It's more similar to having staff rather than doing everything yourself. The problem solving just shifts to a different area, and you get more done.
  
  matwood 20 minutes ago
  
  > Problem solving keeps your brain strong.
  Coding is not the sole problem solving skill. In fact, coding may be one of the easier skills much of the time. Deciding what to build, where to focus efforts, understanding a customer's needs, could all be just as if not more challenging than the coding part.
  
  dools 15 minutes ago
  
  Also what the code should do and how it should do it. LLMs regularly cannot come up with the best way to approach something. Once those decisions are made, codifying them is kind of the least interesting part of the entire exercise.
hurtigioll 1 hour ago

LLMs are now good at looking at an existing project and suggesting big refactors to remove technical debt, and new, better architectures, after the project has grown organically for a while.
MrScruff 11 minutes ago

I think this is true for projects beyond a certain complexity. I have 100% vibe coded projects with tens of thousands of LOC, and haven't seen any real issues with fully automated maintenance. Will that approach work in every scenario? Absolutely not, but the size and complexity of projects where it does is growing with each new model release.

momocowcow 14 minutes ago

So it created a trivial game that a teenager could’ve built as a part-time project while acquiring deep knowledge.

_pdp_ 18 minutes ago

If you sit down and write that game by hand you will not only finish it in a week but also learn a lot of things along the way and perhaps even discover something about the game that you did not imagine. That is how programming works. It is a search problem.

Also, this game has very simple mechanics; I am sure you can generate it just as easily with Cursor or some other tool.

thih9 14 minutes ago

Cursor has access to the latest models so it should be equivalent, right?
Or is there some other AI usage described in this article that is not supported by cursor?

fennecbutt 2 hours ago

Looks kinda like "Sheepherds" which came out recently.

However, as others have pointed out, the idea is a common one, probably because many people are exposed to sheep and sheep dogs and farming. Which further reinforces a previous point I made that all human work is derivative and barely anything is actually original.

But that's why it doesn't matter! Make that game/app/website that someone else has made before, make your own interpretation! The beauty and uniqueness is in the skin not the flesh!

totetsu 1 hour ago

I’m sure I saw a blog post about this same mechanic being made by LLMs a year or so ago too.
zkry 30 minutes ago

But isn't getting an LLM to n-shot something just going to produce non-unique, non-original interpretations of an idea?

thih9 2 hours ago

The article’s title seems needlessly dramatic; the article itself doesn’t reference the LLM’s danger.

The title could have been just “Shepherd’s Dog: A game by Fable 5”.

vnglst 1 hour ago

Not sure if it would've gone to the front page of Hacker News with that title! I was also trying to poke a little fun at the drama around Mythos/Fable: even though Fable did this really well, to me it does not appear to be fundamentally different from other top models.
- dakolli 1 hour ago
  
  Yeah, fundamentally the same: Worthless.
  
  hurtigioll 1 hour ago
  
  funny how a worthless LLM belongs to the fastest revenue growing company in the history of Capitalism
  
  ps3udo 43 minutes ago
  
  Can you provide any source for that claim? Thanks!
  
  hurtigioll 38 minutes ago
  
  google it. this article from one month ago is already obsolete, annualized revenue grew from 30 bln to 44 bln in the last month
  https://venturebeat.com/technology/anthropic-says-it-hit-a-3...
  
  techpression 33 minutes ago
  
  Because others are paying for it. It’s a lot easier to get revenue when you don’t have to care about CAC or paying the bills.

ernst_klim 28 minutes ago

When you say €20 worth of tokens, is that the fair direct API call price or subsidized Claude Code?

bloomark 23 minutes ago

> It's really fun and exactly how I imagined it.

If this is what you imagined, you need to imagine better.

* Pathfinding is terrible (if I end up inside the fenced area clicking outside doesn’t lead me out).
* Forcing me to go landscape while not even filling the entire screen is terrible (where did you even test this).
* Controls are disastrous (I’m either barking all the time or a bark makes my sprite ignore my movements).

You one-shotted this, and I will admit it’s incredible that these agents can create something like this in minutes.

But your statements along with the “most dangerous AI model” in the title are disingenuous. Please do better.

nickandbro 2 hours ago

I sure do miss Fable. It just knew how to do things and do them well. Sad it’s now blocked.

willtemperley 1 hour ago

I wonder if this is the real problem: it was too good, and a lobby of companies feeling threatened by the competition decided to push the jailbreak narrative as a scapegoat.

sixhobbits 2 hours ago

Enjoyed playing it, here's the direct link to play as otherwise you have to click from the article to the GitHub and then find the correct demo link

https://vnglst.github.io/when-ai-fails/shepards-dog/claude-f...

vnglst 2 hours ago

Thanks for that, I messed up copying the links into the article!

tbreschi 2 hours ago

Brilliant marketing here in the title

stephbook 2 hours ago

Playing on an iPhone 13 mini.

It instructs me to rotate my phone. The pasture doesn't get any bigger, but now the top bar blocks half the screen. The tooltip about rotating stays in the middle of the screen. Unplayable. There's a music note indicating sound, but I never heard the dog bark.

It's exactly the kind of unpolished slop I expected it to be.

PUSH_AX 59 minutes ago

In which harness?

andrepd 49 minutes ago

He should ask AI to tell him that #aaa text on #eee background is not acceptable.
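
For anyone who wants to check that claim rather than take it on faith, here is a small sketch of the WCAG 2.x contrast-ratio formula applied to those two colours. The function names are my own; the luminance coefficients and the 4.5:1 AA threshold for normal-size text come from WCAG.

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB colour like '#aaa' or '#aaaaaa'."""
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand: 'aaa' -> 'aaaaaa'
        h = "".join(c * 2 for c in h)
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB transfer curve to get linear-light values.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colours, always >= 1 regardless of order."""
    la, lb = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio("#aaa", "#eee")
print(f"{ratio:.2f}:1, passes AA for body text: {ratio >= 4.5}")
```

By this formula #aaa on #eee works out to roughly 2:1, well under the 4.5:1 that AA asks for.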

CarRamrod 1 hour ago

BAA VRAM EWE

ai_fry_ur_brain 1 hour ago

Forces me to rotate to get the warning message to disappear (works fine in portrait, but regardless forces me to play with two hands), and when rotated it doesn't even fit on the phone.

fROnTEnD DeV Is DeAd

DeSiGN Is DeAD

Cool idea tho, could be a fun game if the UX wasn't so hostile.

wg0 1 hour ago

Now next game - The Boy who cried wolf! Wolf!

hbarka 2 hours ago

That’s one tired sheepdog.

vnglst 2 hours ago

This was my second attempt, I'm still learning! Besides, the wolf was freaking me out.
- defrost 1 hour ago
  
  Always fun having a go, mind you Michael Nyman had some thoughts on all this: https://www.youtube.com/watch?v=xn1_vUe_Vws
  For interest, some shepherds run two dogs, each on a different whistle or voice command pitch.

esailija 2 hours ago

I didn't even have to play. Immediately after opening, some notification about rotating my phone is obscuring the instructions and I cannot read them.

fennecbutt 2 hours ago

Damn I couldn't load it on my Nokia n95 from 2007 either. Damn bruh, these silly devs should make this stuff work on everything.
- esailija 1 hour ago
  
  I am on a flagship Samsung that runs, for example, the Red Alert 2 browser port well.
  OP is just pushing slop, the 80% part anyone gets for free. (well 20 bucks)

isoprophlex 1 hour ago

"a game idea I've had for years"

Bruv, there are already countless games with this exact mechanic...

chvid 1 hour ago

As far as I can tell it is possible to get this sort of quality game with a properly tuned harness out of one of the cheaper models.