
OpenAI has introduced a general-purpose AI agent within ChatGPT, aiming to move beyond answering questions by completing various computer-based tasks.
Known as ChatGPT agent, the tool can navigate calendars, generate editable presentations, and write code on a user's behalf, all from natural language prompts.
Rather than acting as a standalone product, ChatGPT agent integrates capabilities from OpenAI's earlier tools, combining the website navigation of Operator with the in-depth research features of Deep Research.
Rolling out to Pro, Plus, and Team subscribers, ChatGPT agent also connects with external apps like Gmail and GitHub. Instead of being limited to basic queries, it can access a terminal and use APIs, enabling tasks such as analysing competitors or planning shopping lists.
OpenAI claims its underlying model delivers state-of-the-art results, scoring significantly higher than previous versions on academic and maths benchmarks.
While positioning ChatGPT agent as its most capable AI tool yet, OpenAI has implemented several new safety measures because of the risks the agent poses. The company acknowledges that its model could amplify harm in sensitive areas such as biological and chemical threats.
To mitigate such dangers, OpenAI monitors prompts in real time and disables ChatGPT's memory feature within the agent, so that malicious attacks cannot leak stored user data.
Despite these precautions, questions remain over whether ChatGPT agent can reliably perform complex tasks in the real world. Earlier agent technologies from various companies have often failed to meet expectations.
OpenAI, however, insists its new release represents a more robust step towards fulfilling the vision of practical AI agents.