Goldman Sachs Tests Agentic AI to Automate Software Engineering
Goldman Sachs is testing Devin, an AI agent by Cognition, to act as an autonomous software developer, marking a leap from assistant tools to agentic AI.
Goldman Sachs is piloting an autonomous AI software engineer named Devin, developed by Cognition Labs, to work alongside its 12,000 human developers.
This initiative, led by Chief Information Officer Marco Argenti, aims to boost productivity by automating complex, multi-step coding tasks such as updating internal codebases and modernizing legacy systems.
Devin operates under human supervision and is expected to raise developer productivity three- to fourfold, marking a significant step toward a “hybrid workforce” where humans and AI collaborate.
This move positions Goldman Sachs as the first major bank to test such agentic AI, signaling a broader trend in Wall Street’s adoption of advanced AI tools for enterprise applications. The pilot was under way as of July 11, 2025, with plans to potentially deploy thousands of AI agents.
Devin
Devin, created by Cognition Labs, is an autonomous AI software engineer designed to handle complex coding tasks with minimal human oversight. Goldman Sachs is currently piloting it alongside the bank’s 12,000 human developers.
The AI excels at generating, debugging, and modifying code across various programming languages, enabling it to tackle projects like updating internal codebases or modernizing legacy systems.
Devin breaks intricate software engineering tasks into manageable steps, then plans and executes them autonomously: analyzing requirements, writing code, testing it, and iterating based on the results. Its advanced reasoning capabilities allow it to understand project contexts and make decisions that mirror a human developer’s problem-solving approach.
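To make the plan, write, test, and iterate cycle concrete, here is a minimal Python sketch of such a loop. It is an illustration only and does not reflect how Devin is actually built. The names `run_task`, `StepResult`, `fake_write_code`, and `fake_run_tests` are hypothetical, and the code-writing and testing stages are stand-in stubs rather than calls to any real model or test runner.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    step: str       # the planned step that was attempted
    attempts: int   # how many write/test cycles it took
    passed: bool    # whether the final attempt passed its tests


def run_task(
    steps: List[str],
    write_code: Callable[[str, int], str],
    run_tests: Callable[[str], bool],
    max_attempts: int = 3,
) -> List[StepResult]:
    """Work through planned steps, retrying each one until its tests pass."""
    results: List[StepResult] = []
    for step in steps:
        passed = False
        attempts = 0
        while not passed and attempts < max_attempts:
            attempts += 1
            candidate = write_code(step, attempts)  # hypothetical code-generation stage
            passed = run_tests(candidate)           # hypothetical test stage
        results.append(StepResult(step, attempts, passed))
        if not passed:
            # Stop and hand the task back to a human instead of pressing on.
            break
    return results


# Stand-in stubs (assumptions for illustration): the "code" is just a label,
# and the "tests" pass only on the second attempt.
def fake_write_code(step: str, attempt: int) -> str:
    return f"{step} v{attempt}"


def fake_run_tests(candidate: str) -> bool:
    return candidate.endswith("v2")


if __name__ == "__main__":
    plan = ["update dependency", "migrate legacy module"]
    for result in run_task(plan, fake_write_code, fake_run_tests):
        print(result)
```

The early exit when a step keeps failing is a design choice in this sketch: it mirrors the supervised setup described in the article, where a human developer can step in rather than letting the agent continue on a broken foundation.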
AI-Enhanced DevOps
Beyond coding, Devin integrates seamlessly with development environments, APIs, and external tools like version control systems or testing frameworks, enabling it to complete tasks from start to finish.
It also identifies bugs, proposes fixes, and implements solutions, significantly reducing the time developers spend on manual debugging.
While Devin operates autonomously, it functions under human supervision, allowing developers to guide or refine its work to ensure alignment with project objectives. In Goldman Sachs’ pilot, Devin is being tested for its ability to raise developer productivity three- to fourfold, with potential plans to deploy thousands of AI agents.
This functionality positions Devin as a transformative tool for streamlining enterprise-scale software development and fostering a hybrid workforce where humans and AI collaborate effectively.