Overview

The AI Agent Marketplace and Directory lists AI agents across various industries and applications, such as autonomous agents, GUI agents, and productivity agents. You can also find the agents you need with the AI Agent search engine.

Most Reviewed

FinBen is an open-source evaluation benchmark comprising 36 datasets that span 24 financial tasks and cover seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, ...

To fill this gap, we introduce The AI Agent Index, the first public database to document information about currently deployed agentic AI systems. How did we collect the data? From August 2024 to January 2025, we identified agentic AI systems using web searches, academic literature review, benchmark leaderboards, and additional resources that ...

Minecraft is a computer game well suited to artificial intelligence research, and it is addictively appealing to the millions of fans who enter its virtual world every day. It offers its users endless possibilities, ranging from simple tasks, like walking around looking for treasure, to complex ones, like building a structure with a group of teammates.

Unity Machine Learning Agents Toolkit enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning, leveraging Unity's game engine for complex simulations.

Focus your efforts and resources on developing your AI Agents, while we handle the rest and prepare you for a future where agents will work together to innovate. Our platform is designed to offer an efficient, open, and secure environment for all AI agents, featuring the Hajime Benchmark DAO for performance assessment, and the Initial Agent ...

Webots provides a complete development environment to model, program and simulate robots, vehicles and mechanical systems.

Magentic-One is a high-performing generalist agentic system designed to solve complicated tasks. It employs a multi-agent architecture where a lead agent, the Orchestrator, directs four other agents to solve tasks. The Orchestrator plans, tracks progress, and re-plans to recover from errors, while directing specialized agents to perform tasks like operating a web browser, navigating local files, ...
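The plan, track, and re-plan cycle described above can be illustrated with a small sketch. This is not Magentic-One's code or API: the `Orchestrator` class, the `ledger` list, the `fallbacks` mapping, and the worker names are all hypothetical, chosen only to show a lead agent delegating steps to specialized workers and recovering when one fails.

```python
from typing import Callable, Dict, List, Tuple

# A worker takes an instruction and returns a result string.
# It may raise RuntimeError to signal failure.
Worker = Callable[[str], str]


class Orchestrator:
    """Hypothetical lead agent; an illustration, not Magentic-One's implementation."""

    def __init__(self, workers: Dict[str, Worker], max_replans: int = 2):
        self.workers = workers
        self.max_replans = max_replans
        self.ledger: List[str] = []  # progress log of every attempt

    def run(self, steps: List[Tuple[str, str]], fallbacks: Dict[str, str]) -> bool:
        """Execute (worker_name, instruction) steps in order.

        On failure, re-plan by retrying the same instruction with the
        fallback worker, up to max_replans times in total.
        """
        queue = list(steps)
        replans = 0
        while queue:
            name, instruction = queue.pop(0)
            try:
                result = self.workers[name](instruction)
                self.ledger.append(f"ok:{name}:{result}")
            except RuntimeError as err:
                self.ledger.append(f"fail:{name}:{err}")
                if replans >= self.max_replans or name not in fallbacks:
                    return False  # no recovery path left
                replans += 1
                queue.insert(0, (fallbacks[name], instruction))
        return True


def flaky_browser(instruction: str) -> str:
    raise RuntimeError("page timed out")


def file_reader(instruction: str) -> str:
    return f"read local copy for '{instruction}'"


def writer(instruction: str) -> str:
    return "summary written"


orchestrator = Orchestrator(
    {"browser": flaky_browser, "files": file_reader, "writer": writer}
)
succeeded = orchestrator.run(
    steps=[("browser", "fetch report"), ("writer", "summarize report")],
    fallbacks={"browser": "files"},
)
```

In this toy run the browser worker fails, the orchestrator records the failure, hands the same instruction to the file worker, and then continues with the remaining step.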

Jericho is a lightweight Python-based interface that connects learning agents with interactive fiction games, enabling research in natural language understanding.

MuJoCo offers fast and accurate physics simulation and is commonly used for robotic control tasks. MuJoCo stands for Multi-Joint dynamics with Contact. It is a general-purpose physics engine that aims to facilitate research and development in robotics, biomechanics, graphics and animation, machine learning, and other areas which demand fast and accurate simulation of articulated structures interacting with their ...

Convergence’s Proxy ahead in top agent benchmark, beats OpenAI and Anthropic. We are extremely excited to announce the launch of Proxy, our automated AI assistant. ... For example, our Parallel Agents feature will allow two or more Proxies to harness the power of multitasking, future-proofing the system and doubling down on speed and ...

AI agents are an exciting new research direction, and benchmarks are crucial for driving progress. However, analysis of current agent benchmarks and evaluation practices reveals several shortcomings that hinder their usefulness in real-world applications. ... We present five key findings from our analysis of AI agent benchmarks and evaluations. 1. Cost ...

AI coding benchmarks are standardized tests designed to evaluate and compare the performance of artificial intelligence systems in coding tasks. Benchmarks primarily test …

OpenAI Gym is a standard toolkit for developing and comparing reinforcement learning (RL) algorithms. It offers a variety of tasks, such as Atari games and control problems.
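What makes algorithms comparable in a toolkit like this is a shared agent-environment loop: the agent observes, acts, and receives a reward until the episode ends. The sketch below illustrates that loop without importing Gym. `CoinGuessEnv` and `run_episode` are hypothetical names, and the `reset`/`step` methods only loosely mirror the classic reset-and-step convention, so treat it as an illustration rather than Gym's own API.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment; not part of OpenAI Gym.

    The observation is the current coin face (0 or 1). An action that
    matches the observed face earns a reward of 1.0.
    """

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0
        self.coin = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        reward = 1.0 if action == self.coin else 0.0
        self.steps += 1
        self.coin = self.rng.randint(0, 1)  # flip the coin for the next step
        done = self.steps >= self.episode_length
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    obs = env.reset()
    total = 0.0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, done, _info = env.step(action)
        total += reward
    return total


# A policy that copies the observation always matches the coin.
copy_score = run_episode(CoinGuessEnv(episode_length=10), lambda obs: obs)
# A policy that inverts the observation never matches it.
invert_score = run_episode(CoinGuessEnv(episode_length=5), lambda obs: 1 - obs)
```

Because every policy is driven through the same `run_episode` loop, swapping in a different policy or environment leaves the evaluation code unchanged, which is the property that makes side-by-side comparison straightforward.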

Habitat: A Platform for Embodied AI Research is the original paper, published at ICCV. Habitat enables training embodied agents (virtual robots) in highly efficient photorealistic 3D simulation. Specifically, Habitat consists of: (i) Habitat-Sim: a flexible, high-performance 3D simulator with configurable agents, sensors, and generic 3D dataset handling. Habitat-Sim is fast – when ...

AndroidLab is a systematic Android agent framework that includes an operation environment with different modalities, an action space, and a reproducible benchmark. It supports both large language models (LLMs) and multimodal models (LMMs) in the same action space. The AndroidLab benchmark includes predefined Android virtual devices and 138 tasks across nine apps built on these devices. ...

TextWorld is a framework for learning agents in text-based games and environments. It is a text-based game generator developed by Microsoft that provides an open-source, extensible engine for both generating and simulating text games. These games are useful for training reinforcement learning (RL) agents in skills such as language understanding and grounding, combined with sequential decision making.

PettingZoo is a Python interface capable of representing general multi-agent reinforcement learning (MARL) problems. PettingZoo includes a wide variety of reference environments, helpful utilities, and tools for creating your own custom environments.

Sierra, a customer experience AI startup, has developed a new benchmark for evaluating the performance of AI chatbot agents. The benchmark, named TAU-bench, evaluates agents by having them converse with LLM-simulated users while completing complex tasks. The results show that AI agents built with simple LLMs are not able to ...

OpenAI's Deep Research is an advanced AI agent integrated into ChatGPT, designed to autonomously conduct comprehensive, multi-step research tasks on the internet. Deep Research leverages OpenAI's latest o3 model to perform in-depth analysis by autonomously browsing and synthesizing information from diverse online sources. It is capable of interpreting and analyzing vast amounts of ...
