Some of you may remember a newsletter I sent out a month ago about a startup in China that had released a new AI agent to rave reviews.
Following that, the startup, Monica, gave me early access to Manus AI, with plenty of free credits to play around with the agent.
Now that I have first-hand experience, I can confirm that it really ups the game on the AI front.
Its ability to define the checklist needed to take an objective to completion, and then perform one task after another to get there, is breathtaking.
So, the one big task I chose to perform with Manus AI was the high-level design of a game in which the player lives through Indian history by building and defending cities.
Manus had clear instructions: we would brainstorm to come up with the perfect gameplay, and only then code a minimum viable game.
Instead, it tried to achieve the “objective” in a single execution, and we ended up with this landing page for the game, which it deployed on its own, rather than an MVP of the gameplay.
If you think I could just ask it to refine the result and go back to what we were trying to do: uh oh, bad luck, we are all out of credits.
So, I ended up blowing all of those fancy free credits Monica gave me, just so we could all learn this valuable lesson: we need a red button on AI.
AI agents are all the rage. And as somebody who is building multiple AI agents of my own, I am very, very optimistic about them.
But we need to acknowledge the negative side: AI agents, even top-of-the-line ones, are not yet ready to perform complex tasks autonomously.
This needs to be on the label.
Users can’t be expected to blow hundreds (or even thousands) of dollars before realizing this.
This isn’t just the case with Manus AI, though. I face a similar problem from time to time with Cursor, my go-to AI agent for coding.
Sometimes it introduces unnecessary components, or even new libraries entirely different from the ones you are currently using, into your code.
Take your eyes off what the agent is doing for a second, and you may just end up with a disaster of epic proportions.
Now, to be fair, Cursor does have a “Reject All” button. But this only works if you review the code immediately.
Many times, you notice the issue only after doing some more work on the code, and then a git rollback is the only option left.
This isn’t to dissuade you from trying out AI agents (they really are the future); it is just something I needed to say out loud, especially since I have contributed to the hype around them.
AI agents are already a viable option for coding and deploying things like landing pages and classic games end to end, and for in-depth research on any given topic.
I just want that button within reach on my desk that, when hit, stops whatever task the AI is executing at that moment and rolls the whole thing back.
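What such a button might look like can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Step`, `RedButtonRunner`, the `budget` cap, and the per-step credit costs are all hypothetical names made up here, not anything Manus AI or Cursor exposes. The idea is that every step carries its own undo, the runner checks the button and the remaining credits before each step, and a stop unwinds completed steps in reverse order.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One unit of agent work paired with a way to undo it (hypothetical)."""
    name: str
    do: Callable[[], None]
    undo: Callable[[], None]
    cost: int = 1  # credits this step is assumed to consume


@dataclass
class RedButtonRunner:
    budget: int  # hard cap on credits for the whole task (assumed unit)
    stop: threading.Event = field(default_factory=threading.Event)
    done: List[Step] = field(default_factory=list)
    spent: int = 0

    def press(self) -> None:
        """The red button: request a halt before the next step starts."""
        self.stop.set()

    def rollback(self) -> None:
        """Undo completed steps in reverse order. Spent credits are not refunded."""
        while self.done:
            self.done.pop().undo()

    def run(self, steps: List[Step]) -> bool:
        """Return True if all steps finished, False if stopped and rolled back."""
        for step in steps:
            over_budget = self.spent + step.cost > self.budget
            if self.stop.is_set() or over_budget:
                self.rollback()
                return False
            step.do()
            self.done.append(step)
            self.spent += step.cost
        return True


# Usage: three steps, but only two credits to spend.
files: List[str] = []
runner = RedButtonRunner(budget=2)
steps = [
    Step("design doc", lambda: files.append("design.md"), lambda: files.remove("design.md")),
    Step("game loop", lambda: files.append("game.py"), lambda: files.remove("game.py")),
    Step("landing page", lambda: files.append("index.html"), lambda: files.remove("index.html")),
]
finished = runner.run(steps)
print(finished, files)  # prints: False []
```

The sketch only checks the button between steps, so a long-running step would still finish before anything stops. It also assumes every step can be undone, which is not true of things like a public deployment or credits already spent.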
This post is an excerpt from the 12th edition of the Artificially Boosted newsletter.