How to Use the New ChatGPT Agent, If You Trust It

The AI agent era is here: ChatGPT is no longer just for answering your queries with a confident-sounding, often wildly incorrect response synthesized from masses of data scraped from other sources. It’ll now hook into your apps to carry out real actions for you: booking tickets, looking up prices, checking your calendar, creating slideshows, and much more.

This new service is called ChatGPT Agent, and it essentially gives the AI bot its own virtual computer inside your conversations. OpenAI tells us to expect a bot “fluidly shifting between reasoning and action to handle complex workflows from start to finish, all based on your instructions”—which sounds ambitious and perhaps a little scary.

If you’re on any of the paid plans that ChatGPT offers (starting from $20 a month), then you can try out the Agent now, so I thought I’d put it to the test on a couple of made-up projects (I’m not quite ready to trust it with anything real just yet). You’re able to launch ChatGPT Agent in the web app by clicking the + (plus) button to the left of the prompt box and picking Agent mode.

How Agent mode works

Nothing too dramatic happens when you go into Agent mode: You’re simply asked to describe the task you want ChatGPT to carry out. There aren’t any guidelines for the prompt, though you do get some suggestions on screen, ranging from having ChatGPT summarize the news to having it order groceries.

Once you’ve decided what you want ChatGPT to do, it may ask you follow-up questions for clarity, and the interface isn’t that much different from a regular conversation with the AI bot. What is different is an embedded window that gives you a general idea of what ChatGPT is doing on its own virtual computer.

You can jump in and take control whenever you want.
Credit: Lifehacker

It’s not a direct live feed, but ChatGPT will tell you what it’s doing and throw up some graphics to represent each action. At any point you can scroll backwards through the feed, or take control of ChatGPT’s computer—at which point you will see exactly what ChatGPT is doing, as if you were connecting to another PC remotely.

You can also switch to what’s called an activity mode, where you just get a scrolling text feed of the steps ChatGPT is taking, without the visuals. There’s also the option to stop the Agent at any point, if you feel it’s getting off track or doing something you don’t want it to do. It only takes a couple of clicks.

The Agent presents its results in the normal ChatGPT format.
Credit: Lifehacker

When ChatGPT Agent has finished doing everything you asked it to do, you’ll be given a summary and report. You’ll also get a list of sources at the bottom of the final response, as is the norm for ChatGPT conversations, and there’s the option of asking follow-up questions, if needed.

On the whole, the Agent works well, though it can take its time: As with the Deep Research tool, you’ll probably want to set this up and then do something else for a while. That means you can’t watch and check every step that ChatGPT takes, so you’re going to have to decide how much you trust it.

How my ChatGPT Agent experiment went

The first task I asked ChatGPT to do was plan a birthday party for me: I told it the age I am, what kind of party I wanted (a quiet, low-key affair), the kind of space I wanted (a small room next to a bar), and the potential dates I was considering. I also asked the AI to come up with some invites.

And overall, the bot did a pretty good job. It identified the local venues I would’ve picked myself, though it ran into some issues getting booking details (opening PDFs from the web didn’t seem to work). The invitation artwork and text were fine, if a little generic, and the final report gave me a neat comparison chart to help me pick a place to hold the party, along with contact details for booking it.

You get a live feed of what ChatGPT Agent is up to.
Credit: Lifehacker

For my next experiment, I tried to get ChatGPT Agent to produce a nicely formatted spreadsheet with all the iPhone launch dates on it—something that would genuinely help me in my work and save me some time. One definite plus point here was that ChatGPT did well at identifying trustworthy sources: Wikipedia, Apple press releases, and sites like MacRumors.

The final spreadsheet looked fully accurate as far as I could tell, and it was delivered as an Excel file. I didn’t get the nice formatting I asked for, and the sources column didn’t really make sense, but all the key data was there. This did take quite a while to compile, though, and I think I could’ve probably done it myself in the same time (though I was free to do something else while ChatGPT was working).

A text-based feed view is also available.
Credit: Lifehacker

I’m impressed by how slick and capable ChatGPT Agent is. It wasn’t perfect, but most of the time it took the right steps and successfully switched between tasks. There’s a good amount of transparency about what it’s doing, and you can always take control as and when needed.

Personally, though, these are the sorts of tasks I’d still rather do myself. I’m too worried about ChatGPT Agent making a mistake, missing a detail, or not understanding some nuance to rely on it heavily. Your own threshold for those sorts of concerns might be different, and I suspect that plenty of users will overlook minor issues because of the time the Agent can save them.

Disclosure: Lifehacker’s parent company, Ziff Davis, filed a lawsuit against OpenAI in April, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
