OpenAI Launches ChatGPT Agent for Task Automation

OpenAI has officially launched a new AI-powered general-purpose agent integrated into ChatGPT, designed to complete complex digital tasks on users’ behalf. Dubbed the ChatGPT Agent, the tool marks a major leap in OpenAI’s efforts to evolve its chatbot from a conversational assistant into a fully functional autonomous agent.

What Can ChatGPT Agent Do?

The new agent is capable of handling a broad spectrum of computer-based activities, including:

  • Navigating calendars
  • Generating and editing slideshows and presentations
  • Executing code and accessing developer tools

The agent combines technologies from OpenAI’s existing agentic systems, including Operator, which can interact with websites, and Deep Research, which synthesizes insights from large amounts of web data.

ChatGPT Agent is now available to subscribers of OpenAI’s Pro, Plus, and Team plans. To use it, users simply select “Agent Mode” from the tools dropdown menu within ChatGPT and start prompting in natural language.

Smarter, More Capable, and Tool-Ready

Unlike previous agent experiments, this new version is deeply integrated with OpenAI’s Connectors, allowing it to access third-party apps like Gmail and GitHub to retrieve relevant data. It also features a built-in terminal and the ability to use APIs — a first for ChatGPT.

Example use cases promoted by OpenAI include:

  • Planning and purchasing ingredients for a Japanese breakfast
  • Conducting competitor analysis and building a slide deck

These scenarios demand complex reasoning, task planning, and multi-tool coordination — all of which this new agent is designed to handle.

Benchmark Performance

According to OpenAI, the model underlying ChatGPT Agent posts state-of-the-art results on several benchmarks:

  • Humanity’s Last Exam (pass@1): 41.6%, nearly double the scores of previous models like o3 and o4-mini
  • FrontierMath (with tools): 27.4%, compared to o4-mini’s 6.3%

These scores suggest significant improvements in both general knowledge reasoning and high-difficulty mathematical problem solving.
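To put the FrontierMath figures in perspective, the short Python sketch below computes the gap between the two reported scores. The scores are the ones quoted above; the function and variable names are illustrative only.

```python
# FrontierMath (with tools) scores as reported in the article, in percent.
frontier_math = {"ChatGPT Agent": 27.4, "o4-mini": 6.3}


def improvement(new: float, old: float) -> tuple[float, float]:
    """Return (percentage-point gain, ratio of new score to old score)."""
    return round(new - old, 1), round(new / old, 2)


gain, ratio = improvement(frontier_math["ChatGPT Agent"], frontier_math["o4-mini"])
print(f"FrontierMath: +{gain} points, {ratio}x o4-mini's score")
# prints "FrontierMath: +21.1 points, 4.35x o4-mini's score"
```

In other words, the reported FrontierMath result is a gain of about 21 percentage points, or more than four times o4-mini’s score.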

Safety, Security, and Restrictions

OpenAI acknowledges that the agent’s increased capabilities bring heightened risks. In its safety report, the company designates ChatGPT Agent as “high capability” in the biological and chemical weapons domains under its own Preparedness Framework. While OpenAI says it found no direct evidence of misuse, the company is implementing several new safety measures:

  • Real-Time Monitoring: A classifier evaluates every prompt for potential biological content. If a prompt is flagged, the response undergoes further review.
  • Memory Disabled: Unlike other ChatGPT modes, the Agent’s memory feature is turned off to prevent misuse through prompt injection or data exfiltration. OpenAI may revisit this in the future.
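
As a rough illustration of the classify-then-review flow described under Real-Time Monitoring, here is a minimal Python sketch. Everything in it is hypothetical: OpenAI has not published how its classifier works, and the keyword list, function names, and routing labels below are assumptions made for the example. A production system would use a trained model rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical terms standing in for a trained classifier's judgment.
FLAG_TERMS = ("pathogen", "toxin", "nerve agent")


@dataclass
class Decision:
    flagged: bool
    route: str  # "answer" or "further_review"


def classify_prompt(prompt: str) -> bool:
    """Toy stand-in for the classifier: True if any flagged term appears."""
    text = prompt.lower()
    return any(term in text for term in FLAG_TERMS)


def route_prompt(prompt: str) -> Decision:
    """Check every prompt; flagged prompts go to a second review stage."""
    flagged = classify_prompt(prompt)
    return Decision(flagged=flagged, route="further_review" if flagged else "answer")


if __name__ == "__main__":
    for p in ("Plan a Japanese breakfast", "Describe a toxin"):
        print(p, "->", route_prompt(p).route)
```

The point of the two-stage design is that the first check runs on every prompt and stays cheap, while the more expensive review only runs on the small share of prompts that get flagged.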

The Road Ahead

While the capabilities of ChatGPT Agent are promising, OpenAI admits real-world performance will be the true test. To date, agent-based systems have often underdelivered when interacting with real-world workflows. Still, the company claims this is its most advanced and reliable attempt yet.

Stay tuned to TechXNow for continued updates on AI agents and real-world testing insights as ChatGPT Agent rolls out to users globally.

Source: TechCrunch
