GitHub's Continuous Efficiency: AI Agents That Optimize Code While You Sleep

GitHub Next is building AI agents that automatically optimize codebases for performance and sustainability. Early pilots show real wins, but the tooling is experimental. Here's what works and what doesn't.


TL;DR

  • GitHub Next is building "Continuous Efficiency" — AI agents that automatically optimize codebases for performance and sustainability
  • Built on Agentic Workflows: you write natural-language rules, and AI agents scan repos, find inefficiencies, and submit PRs with fixes
  • Early pilots show real gains: one merged PR now impacts 500M+ monthly npm downloads
  • You can try the experimental framework today, but expect rough edges — this is research-grade tooling

The Big Picture

When's the last time your team discussed carbon efficiency in a sprint planning meeting? Probably never. Green software and performance optimization are perpetually stuck in the backlog, buried under feature work and bug fixes.

GitHub Next and GitHub Sustainability think they've found a way to change that: AI agents that continuously optimize your codebase without human intervention. They call it Continuous Efficiency, and it's the intersection of two emerging practices — Continuous AI (LLM-powered automation in CI/CD) and Green Software (code designed for energy efficiency).

The pitch is simple: what if your repository could improve itself? Not through manual refactoring sprints or performance audits, but through always-on AI agents that scan for inefficiencies, propose fixes, and submit pull requests while you sleep.

This isn't vaporware. GitHub has been running internal and external pilots using an experimental framework called Agentic Workflows. The results are uneven — some agents produce garbage, others deliver measurable wins. One recently merged PR optimizes a library with 500 million monthly downloads. That's the kind of leverage that makes this worth watching.

How It Works

Continuous Efficiency runs on Agentic Workflows, an experimental GitHub Actions framework that lets you write automation in Markdown instead of YAML. You define rules in natural language, compile them to standard GitHub Actions workflows, and let AI agents execute them.

The workflow is straightforward. You write a .md file with YAML front matter (triggers, permissions, tools) followed by plain-English instructions. Run gh aw compile and it generates a .yml file that GitHub Actions can execute. When the workflow runs, it spins up an AI agent (GitHub Copilot CLI, Claude, or OpenAI) in a sandboxed environment. The agent reads your repo, applies your instructions, and outputs comments, PRs, or other modifications.
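As a concrete sketch, a workflow file might look like the following. The front-matter sections (triggers, permissions, tools) come from the description above, but the specific keys and values here are illustrative assumptions, not copied from GitHub's documentation:

```markdown
---
# Triggers, permissions, and tools live in YAML front matter.
on:
  schedule:
    - cron: "0 2 * * *"   # run nightly
permissions:
  contents: read
  pull-requests: write
tools:
  github:
---

Scan the repository for regular expressions constructed inside hot loops.
Where it is safe, hoist them to module scope, verify the change with the
project's test suite, and open a pull request explaining each fix.
```

Running gh aw compile would then turn this into the .yml file that GitHub Actions actually executes.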

GitHub is exploring two main use cases:

Rules and standards enforcement. Traditional linting tools match hard-coded patterns. Agentic workflows interpret intent. You write "avoid RegExp instantiation in hot loops" and the agent understands the semantic meaning across languages and architectures. It doesn't just flag violations — it fixes them and opens a PR.
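To make that rule concrete, here is a before-and-after sketch of the kind of change such an agent might propose. This is illustrative code, not an actual PR from GitHub's pilots:

```javascript
// Before: the rule "avoid RegExp instantiation in hot loops" flags this,
// because a new RegExp object is allocated and compiled on every iteration.
function countDigitsSlow(lines) {
  let total = 0;
  for (const line of lines) {
    total += (line.match(new RegExp("\\d+", "g")) || []).length;
  }
  return total;
}

// After: the fix an agent might open a PR for — hoist the pattern to
// module scope so it is compiled once and reused.
const DIGITS = /\d+/g;
function countDigitsFast(lines) {
  let total = 0;
  for (const line of lines) {
    total += (line.match(DIGITS) || []).length;
  }
  return total;
}
```

The behavior is identical; only the allocation pattern changes, which is exactly why this kind of edit is a good fit for automated review.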

GitHub implemented the W3C Web Sustainability Guidelines as 20 agentic workflows. They ran them against GitHub and Microsoft web properties and found opportunities ranging from deferring asset loading to adopting native browser features. The agents didn't just report issues — they built resolutions.
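For instance, both of those patterns can be expressed in plain HTML with no JavaScript at all. This fragment is illustrative, not taken from GitHub's workflows:

```html
<!-- Deferred loading: the browser fetches offscreen images only when needed -->
<img src="chart.png" loading="lazy" alt="Monthly energy usage chart">

<!-- Native browser feature: a disclosure widget instead of a JS accordion -->
<details>
  <summary>Shipping details</summary>
  <p>Orders ship within two business days.</p>
</details>
```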

Heterogeneous performance improvement. Performance engineering is hard because every codebase is different. Languages, architectures, bottlenecks — it's all over the map. GitHub's "Daily Perf Improver" workflow attempts to solve this with a three-phase process: research and plan improvements, infer how to build and benchmark the repo, then iteratively propose measured optimizations.

In a pilot on FSharp.Control.AsyncSeq, the agent delivered multiple accepted PRs, including the rediscovery of a known performance bug and optimizations verified with microbenchmarks. Results vary dramatically across repos, but the wins are real when they hit.

The technical approach here is notable. Instead of trying to build a one-size-fits-all optimizer, GitHub is betting on agents that can navigate ambiguity. The agent figures out how to build your project, identifies relevant performance tools, runs microbenchmarks, and proposes targeted changes. It's semi-automatic performance engineering — automated iteration under human guidance.

GitHub's internal process for building these workflows is recursive: they use agents to create agents. Define the intent (based on a standard or engineering requirement), author the workflow in Markdown with help from the create-agentic-workflow agent, compile to YAML, then run in GitHub Actions.

What This Changes For Developers

If Continuous Efficiency works at scale, it shifts performance and sustainability from backlog items to background processes. You don't schedule refactoring sprints. You don't debate whether to optimize. The agents just do it.

The developer experience is hands-off by design. You review PRs from bots instead of writing optimization code yourself. The quality bar matters here — if agents produce too much noise or low-quality changes, teams will disable them. GitHub's pilots show mixed results, which is honest. Some PRs get merged immediately. Others get closed.

The business case is clearer than the technical maturity. Reducing power consumption, improving code quality, and lowering costs are measurable outcomes. The challenge is trust. Developers need to believe the agent's changes are correct and safe. That requires good benchmarking, clear explanations in PR descriptions, and easy rollback paths.

This also changes how you think about engineering standards. Instead of writing linting rules in code, you write them in prose. "Use native browser features where possible." "Hoist RegExp literals out of hot functions." The agent interprets and applies them. That's a fundamentally different authoring model than ESLint configs or custom static analysis tools.

The sustainability angle is interesting but secondary for most teams. The real hook is performance and code quality. If agents can make your app faster and your codebase cleaner without manual effort, that's valuable regardless of carbon impact. The green software framing helps with corporate ESG goals, but developers care about speed and maintainability first.

One practical concern: agent-generated PRs could flood your review queue. GitHub hasn't published guidance on rate-limiting or batching yet. If you enable multiple workflows, you might wake up to dozens of bot PRs. That's a workflow problem, not a technical one, but it matters for adoption.

Try It Yourself

Agentic Workflows is open source and available now, but GitHub labels it a "research demonstrator" — expect bugs, breaking changes, and incomplete features. If you're comfortable with experimental tooling, you can start today.

The Agentic Workflows documentation includes examples you can run immediately, including the Daily Perf Improver. You'll need the gh aw CLI extension and access to an LLM provider (GitHub Copilot CLI, Claude, or OpenAI).

GitHub hasn't published the full ruleset for their Green Software workflows yet, but they plan to soon. If you want early access or to become a design partner, you can contact GitHub Sustainability directly through their support channel.

For context on how GitHub is thinking about AI-powered development workflows more broadly, see their WRAP framework for Copilot, which outlines a structured approach to getting real work done with coding agents.

The Bottom Line

Use this if you're already deep in GitHub Actions and comfortable debugging experimental tools. The performance optimization angle is compelling for high-traffic libraries or services where small gains scale massively. Skip it if you need production-ready tooling or can't afford to review bot-generated PRs.

The real risk is adoption friction. If agents produce too many low-quality PRs, teams will turn them off and the whole concept stalls. The real opportunity is leverage — one merged optimization that touches 500 million downloads is the kind of impact that justifies the experimentation cost.

GitHub is betting that natural language rule authoring will democratize performance engineering and standards enforcement. That's a big bet. Traditional static analysis tools are precise but narrow. Agentic workflows are broad but unpredictable. The question is whether LLMs are reliable enough to close that gap. Early results suggest maybe, but it's too soon to call it.

Source: GitHub Blog