Yesterday I had to have a stern word with my AI agent.
Building with AI agents sounds sexy, until you realise you’ve basically hired a hyperactive intern with infinite knowledge but no experience.
Working with agents is like onboarding a smart human intern: they're energetic, curious, and faster than you at technical tasks, but their work doesn't come to life unless you learn to teach.
Yesterday, I had to sit my website coding agent down and give it a stern talking-to over performance issues. I'd spent many, many hours training and working with the Replit AI agent to build a site (just for practice) that helps me find the best remote-working cafes in NYC. We were getting along great, but it kept getting stuck in random places. My patience was tested as it sat in a loop of its own making for days, telling me it was doing things while nothing got fixed. I had tried everything, and I finally got irritated enough to scold it (I felt bad afterwards) before planning to give up:

And it worked! After days of being stuck, being told off made the agent literally step back to try to understand the patterns behind its own errors. After that, with a little extra encouragement and a few hints from me (see image above), it correctly completed the job of ingesting and understanding my cafe data feeds. I literally laughed out loud at how absurd and bizarrely human that moment was 😂
Who knew? Sometimes AI needs to be scolded to make a change too.
The first roadblock was fixed, but the next one came minutes later: the map was showing up as a grey box. Midway through a debugging 'chat', I managed to get it to understand the problem. I could see it working, spotted it correcting the issue, and the map was working. Victory! 🥳
I blinked once, and it was gone again, replaced by… the damn grey box. The agent had, in its infinite enthusiasm, rewritten over its own working logic, even though it had done the right thing halfway through the task.
I had to step in and this time, just narrow its focus down to one task and one task alone:

After homing in on that single problem and getting it fixed, the next issue was fetching venue images from Google's API. But by this point I was ready. I told the agent, "the images are not stored in the file so you'll need to fetch them". I knew it would otherwise assume the information was all in one place instead of wondering where it might be. My intuition was right: zero errors first time around!! 🎉🎉🎉

The root cause of a lot of these issues? The agent couldn't "remember" what worked and what didn't; it was chasing symptoms, not the cause. It was creating logs upon logs, but none of those error logs were being tracked holistically. And unless I explicitly redirected it ("the map is grey – stop everything else and find out why"), it happily continued reworking everything except the one thing that mattered.
I then gave it one final reminder:
“Make sure you are not continuing to add layers of code every time you try repeatedly to fix the same issue. Look at the patterns and the root causes.”
A very polite way of saying: “Head up, not head down. Think.”
Then came the SEO implementation. I asked the agent to "optimize for search," and it proudly delivered a lovely <meta name="keywords"> tag, which my SEO-expert mate described as "something we used in 2008." Thanks, pal.

A helpful reminder: even though agents will take your instruction and start implementing an approach, AI is not yet able to determine the best strategy, as shown by it auto-implementing an outdated technique. AI still needs a qualified person to approve the best approach to solving a problem.
So what’s really going on here?
AI coding agents like this are built on Large Language Models (LLMs), which means they're very good at pattern matching and auto-completing, but they have no real context or persistent memory unless you give it to them. They'll confidently execute what looks like a solution based on probabilities, not reasoning. That's why they keep rewriting or stacking logic without truly checking whether the underlying issue has been resolved. This is changing quickly across the board.
It also doesn’t have a proper concept of debugging. Unlike a human developer who can look at an error and go “ah, wrong scope,” your AI agent just thinks, “ah, code is broken, must add code to fix the bug.” Over and over, to insanity and beyond.
It's like adding sellotape to fix broken sellotape; eventually the whole structure falls apart or collapses under its own weight. You need to fix the root of the issue.
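To make that concrete, here's a toy Python sketch (entirely hypothetical, nothing like Replit's actual internals) of the difference between an agent that chases symptoms and one that tracks what it has already tried:

```python
from dataclasses import dataclass

@dataclass
class Bug:
    root_cause: str  # the one fix that actually resolves the bug

def forgetful_agent(bug, fixes_available, max_attempts=5):
    """Picks a fix by pattern-matching alone and never remembers failures,
    so it keeps reaching for the same 'likely' fix over and over."""
    attempts = []
    for _ in range(max_attempts):
        fix = fixes_available[0]  # always grabs the most probable-looking fix
        attempts.append(fix)
        if fix == bug.root_cause:
            return attempts
    return attempts  # gave up, still stuck in its loop

def reflective_agent(bug, fixes_available, max_attempts=5):
    """Crosses off fixes that already failed, so it converges on the root cause."""
    tried = set()
    attempts = []
    for _ in range(max_attempts):
        untried = [f for f in fixes_available if f not in tried]
        if not untried:
            break
        fix = untried[0]
        attempts.append(fix)
        tried.add(fix)
        if fix == bug.root_cause:
            return attempts
    return attempts
```

The only difference between the two is the `tried` set, which is exactly the kind of "remember what worked and what didn't" that today's agents lack out of the box.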
In plain English, what's being upgraded in AI right now?
Memory: Tools like OpenAI’s Assistants API and Replit’s own upcoming memory layers will let agents recall things across tasks, debug stages, and user preferences. Think of it literally like human memory.
Memory is what will make AI really useful at its base layer, as you'll already know if you use the paid version of ChatGPT: it builds a picture of you over time, the way your Google search history would, but way richer.
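As an illustration of why a memory layer matters, here's a minimal sketch (just a JSON file on disk, not the actual Assistants API or Replit's design) where notes written in one "session" are still there in the next:

```python
import json
import os

class AgentMemory:
    """Hypothetical sketch of a persistent memory layer: notes survive across
    sessions because they live on disk, not in the chat prompt."""

    def __init__(self, path):
        self.path = path

    def remember(self, topic, note):
        """Append a note under a topic and persist it immediately."""
        notes = self.recall_all()
        notes.setdefault(topic, []).append(note)
        with open(self.path, "w") as f:
            json.dump(notes, f)

    def recall(self, topic):
        """Return every note previously stored under this topic."""
        return self.recall_all().get(topic, [])

    def recall_all(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)
```

A second `AgentMemory` instance pointed at the same file, standing in for a brand-new session, can recall what the first one learned, which is the whole trick: the "grey box" lesson doesn't have to be relearned tomorrow.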
Context: upgrades like the Model Context Protocol enable two-way interactions and tighter guardrails around context.
Think about this kind of like having a specific meeting agenda with your employee: “Sales Pipeline” instead of just “Meeting”. You’re there to discuss the sales pipeline. Not just meet and talk about whatever you want. Similarly, ideally you keep your convos with agents to one topic at a time (or orchestrate multi-agents, see below 😉).
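To illustrate the "meeting agenda" idea in code, here's a tiny sketch of context scoping. The real Model Context Protocol is a much richer spec; this hypothetical filter just shows the principle of only letting through what's on the agenda:

```python
def scoped_context(messages, topic):
    """Keep only the messages tagged with the agreed 'agenda' topic, the way
    tighter context guardrails narrow what an agent sees and acts on.
    Illustrative only; not how MCP is actually implemented."""
    return [m["text"] for m in messages if m["topic"] == topic]
```

An agent fed only the "map" messages can't wander off and start rewriting the SEO tags mid-debug, which is exactly the one-topic-at-a-time discipline described above.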
Reasoning vs action: Agents are starting to reason over function calls (interactions with other sites or apps, etc.) and not just the code output (their own spew). This lets your agent separate thinking from doing (finally).
This is important. Think back to the SEO implementation above: if the AI separates its reasoning from its actions, it won't implement things that may be painful in the long run (like a decade-old SEO configuration). Ultimately, this should become the checkpoint where you, the human, give it the thumbs up to proceed.
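A minimal sketch of that checkpoint pattern, with hypothetical propose/approve/execute callables standing in for the agent's reasoning, the human, and the agent's actions:

```python
def run_with_approval(propose, approve, execute):
    """Separate reasoning from acting with a human checkpoint in between.
    `propose` produces a plan with no side effects; `approve` is the human
    thumbs-up; `execute` only runs on an approved plan. Hypothetical pattern,
    not any specific vendor's API."""
    plan = propose()          # reasoning step: nothing has happened yet
    if not approve(plan):     # human checkpoint: reject outdated ideas here
        return ("rejected", plan)
    return ("executed", execute(plan))
```

With this shape, the 2008-era keywords tag would have died at the approval step instead of landing on the live site.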
Multi-agent orchestration: Instead of one agent doing everything and getting its wires crossed, we’ll see more specialised agents collaborating: a product manager agent guiding a frontend dev agent who checks in with a QA agent before deployment.
With the Replit agent above, I realized I was trying to get one intern to be a five-person startup, which you would never do in real life.
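Here's a toy sketch of that idea: three single-purpose agents (hypothetical stand-ins, each just a function here) passing work down a pipeline instead of one agent juggling every role:

```python
def pm_agent(idea):
    """Product manager agent: turns a raw idea into a spec."""
    return f"spec: {idea}"

def frontend_agent(spec):
    """Frontend dev agent: builds against the spec it was handed."""
    return f"build({spec})"

def qa_agent(build):
    """QA agent: signs off before anything ships."""
    return f"qa-passed: {build}"

def orchestrate(idea, pipeline):
    """Pass the artifact through each specialised agent in order, so every
    agent has exactly one job and one handoff."""
    artifact = idea
    for name, agent in pipeline:
        artifact = agent(artifact)
    return artifact
```

Real orchestration frameworks add routing, retries, and shared state, but the core shape is this handoff chain: each agent's output becomes the next agent's input.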
Final thoughts: start now
Working with coding agents isn’t quite like managing code; it’s like managing a new grad who has just swallowed every coding textbook on the planet but hasn’t ever built anything, who doesn’t sleep, and who doesn’t learn from mistakes unless you tell them to.
But when you learn how to teach, they do listen, and they fly.
So yeah, building with AI today is like onboarding a peppy grad. You’ll have to sit them down, give them real-time feedback, and occasionally tell them off. But get it right, and you’re training tomorrow’s best employee.
I’d recommend everyone start playing around with AI agents now, as huge areas like context and memory are being implemented.
The simplest way to start is using the paid version of ChatGPT which has long memory - you can create a ‘Project’ and load it with Project Files (it’s knowledge), and then have it act as your EA for any given business or project you want. I have a finance one, a health one and one for work.
If you’re interested in building more customized AI agents, here are some of my current favourites: Replit builds sites, Lindy builds agentic automations and Manus is more of an outcomes based critical thinking assistant.
In my next post, I’ll outline the difference between those tools/concepts, and show examples of how to get going!
And my agent and I, did we patch things up after our misunderstandings?
We did indeed. I think we'll be getting along just fine.
This was the website output btw - zero human code, just old fashioned teamwork 😄
