GitHub Security Lab Uses LLMs to Triage Vulnerabilities at Scale

GitHub Security Lab built an LLM framework that triaged thousands of CodeQL alerts and found 30 real vulnerabilities. Here's how they combined fuzzy pattern matching with structured workflows to make SAST results actually useful.

TL;DR

  • GitHub Security Lab built an LLM-powered framework that triaged thousands of CodeQL alerts and found ~30 real vulnerabilities since August
  • The Taskflow Agent breaks security audits into discrete YAML-defined tasks, letting LLMs handle fuzzy pattern matching while MCP servers handle deterministic checks
  • The system doesn't auto-exploit — it generates detailed bug reports with code references that human auditors can verify in minutes
  • If you have repetitive security workflows with clear goals but fuzzy logic, this approach might work for you

The Big Picture

Security alert triage is mind-numbing work. You stare at hundreds of CodeQL warnings. Most are false positives. The patterns are obvious to you — "oh, that workflow requires maintainer approval" or "that input gets sanitized three lines down" — but impossible to encode as formal rules without drowning in heuristics and regex hell.

GitHub Security Lab got tired of this and built something different: an LLM-powered triage system that found 30 real vulnerabilities in open source projects while filtering out the noise. The key insight? LLMs are terrible at formal reasoning but excellent at fuzzy pattern matching. So they built a framework that plays to those strengths.

The GitHub Security Lab Taskflow Agent breaks security audits into discrete tasks defined in YAML files. Each task has a narrow scope and clear success criteria. The LLM handles the "does this look like an access control check?" questions. MCP servers handle the "is this workflow disabled?" API calls. Results get stored in a database so you can resume from failures without burning through your quota again.

This isn't about replacing security researchers. The system generates detailed reports with file references and line numbers. A human still makes the final call. But instead of spending hours per alert, you spend minutes verifying conclusions that are already 80% correct.

The approach works because security triage has structure. You're not asking an LLM to "find all the bugs." You're asking it to check specific conditions in a specific order. Does this workflow run in a privileged context? Is there a sanitizer? Can an attacker reach this code path? These are questions with answers, not open-ended research problems.

How It Works

Taskflows are YAML files that define a sequence of prompts and checks. Think of them as security audit scripts where some steps are handled by LLMs and others by conventional code. The framework runs each task, stores the results, and passes them to the next task in the chain.

Here's the structure for triaging a GitHub Actions code injection alert:
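The actual YAML schema lives in the open source repos; as a rough illustration only (field names here are hypothetical, not the real seclab-taskflow-agent syntax), a taskflow for this alert type might look like:

```yaml
# Hypothetical sketch of a taskflow -- field names are illustrative,
# not the actual seclab-taskflow-agent schema.
name: actions-code-injection-triage
tasks:
  - id: collect-triggers
    type: tool            # deterministic step, handled by an MCP server
    tool: get_workflow_triggers
  - id: collect-permissions
    type: prompt          # fuzzy step, handled by the LLM
    prompt: >
      List every permission this workflow requests and whether it
      uses secrets. Cite the file and line number for each finding
      in your audit notes.
  - id: audit
    type: prompt
    depends_on: [collect-triggers, collect-permissions]
    prompt: >
      Using the audit notes, decide whether any known false positive
      pattern applies (maintainer-only triggers, disabled workflow,
      minimal permissions and no secrets). Dismiss or escalate.
  - id: report
    type: prompt
    depends_on: [audit]
    prompt: >
      Synthesize the audit notes into a bug report with code
      snippets and file references. Do not perform new analysis.
```

Each stage in this sketch maps to one of the phases described below.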

Information Collection: The first stage gathers facts. What events trigger this workflow? What permissions does it request? Does it use secrets? Is the workflow disabled in the repo settings? Each question becomes a separate task with explicit instructions. The LLM records findings in "audit notes" — a running commentary that gets serialized to a database.

The prompts are brutally specific. Not "check if this is exploitable" but "include the line number where untrusted code is invoked and the exact package manager command in your notes." This precision reduces hallucination. If the LLM can't cite a line number, it probably made something up.

Audit Stage: Now the LLM reviews the collected information against known false positive patterns. Does this workflow only trigger on maintainer-approved events? Are permissions explicitly restricted? Is there a sanitizer between the user input and the dangerous sink?

For GitHub Actions alerts, common false positives include workflows that require repo maintainer privileges to trigger, workflows that are disabled, or workflows that run with minimal permissions and no secrets. The LLM checks each condition. Alerts that fail these checks get dismissed.

Report Generation: Alerts that survive the audit stage get turned into bug reports. The LLM uses the audit notes to write a structured report with code snippets, file references, and reasoning. No new analysis happens here — it's pure synthesis of previously gathered information.

A validation task then checks the report for completeness and consistency. Missing information usually means the LLM hallucinated or couldn't track down a data flow. Those reports get rejected.

Issue Creation and Review: Valid reports become GitHub Issues. This creates a checkpoint where human auditors can verify the findings. But it also enables something clever: the system can learn from dismissals.

When a human marks an alert as a false positive and documents why, that reason gets stored. Later, a separate taskflow collects all dismissal reasons for a repo and re-evaluates open issues. The LLM can now spot repo-specific patterns — custom permission checks, project-specific sanitizers — that weren't in the original prompts.
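The article describes this feedback loop but not its file format; a hedged sketch of what such a re-evaluation taskflow might look like (tool and field names are assumptions):

```yaml
# Illustrative only -- tool names and fields are hypothetical, not
# the actual seclab-taskflow-agent schema.
- id: collect-dismissals
  type: tool
  tool: fetch_dismissal_reasons   # documented false positive reasons for this repo
- id: reevaluate
  type: repeat_prompt              # one fresh context per open issue
  over: "{{ open_issues }}"
  prompt: >
    Re-evaluate issue {{ item.id }} in light of these repo-specific
    dismissal reasons: {{ dismissal_reasons }}. If a documented
    pattern applies, recommend closing the issue and cite the reason.
```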

The team tested this on CodeQL alerts for GitHub Actions code injection, untrusted checkouts, and JavaScript XSS vulnerabilities. They used Claude 3.5 Sonnet for most tasks. The LLM had access to basic file fetching and search tools via MCP servers, but no static analysis beyond the original CodeQL results and no dynamic testing environment.

For JavaScript XSS alerts, the taskflows focused on highlighting exploitability factors rather than making binary decisions. The reports call out custom sanitization functions, unreachable sources, and context that might prevent exploitation. This gives the human auditor the information they need without pretending the LLM can make the final call on complex client-side vulnerabilities.

What This Changes For Developers

If you maintain open source projects, you might start seeing better vulnerability reports. The GitHub Security Lab has been using these taskflows since August and has published fixes for many of the discovered issues at securitylab.github.com/ai-agents.

If you run security tooling, this is a template for making SAST results actually useful. Static analyzers are great at finding potential issues and terrible at understanding context. LLMs are the opposite. Combining them in a structured workflow gets you closer to human-level triage without the human time investment.

The key is task decomposition. Don't ask an LLM to "audit this codebase." Ask it to answer ten specific questions in sequence, store the answers, and synthesize them into a report. Use MCP servers or similar tooling for anything deterministic — API calls, file operations, regex matching. Save the LLM for semantic reasoning that's hard to encode as rules.

The database-backed approach matters more than you'd think. LLM workflows fail. APIs time out. You hit quota limits. Models hallucinate. By storing intermediate results, you can resume from the last successful task instead of starting over. This also lets you tweak individual tasks and rerun just that part of the workflow.

The team learned to keep tasks small and independent. Early versions tried to do multiple checks in one prompt. The LLM would skip steps or ignore instructions. Breaking everything into separate tasks with fresh context windows fixed most of these issues. They use templated "repeat_prompt" tasks to loop over lists of alerts, starting a new context for each one.
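The source names "repeat_prompt" tasks but not their exact syntax; a rough sketch of what looping over alerts with a fresh context per item might look like (the field layout is a guess):

```yaml
# Illustrative only -- "repeat_prompt" is the feature named in the
# article, but this field layout is hypothetical, not the real schema.
- id: triage-each-alert
  type: repeat_prompt
  over: "{{ alerts }}"        # one iteration per CodeQL alert
  fresh_context: true         # new context window for each item
  prompt: >
    Triage alert {{ item.id }} in {{ item.file }}. Record the
    trigger events, permissions, and any sanitizers in your
    audit notes, citing line numbers.
```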

Another lesson: delegate to code whenever possible. Initially they had the LLM extract workflow trigger events from YAML files. It worked most of the time but was inconsistent — sometimes it would miss triggers or misclassify privilege levels. Moving that logic to an MCP server tool made results deterministic.

The reusable taskflow feature lets you extract common patterns. Many security checks are shared across different vulnerability types. By defining reusable tasks and prompts, changes propagate across all taskflows that use them. This is critical when you're maintaining multiple triage workflows.
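As a hedged illustration of that reuse (the real include mechanism in seclab-taskflow-agent may differ), a shared check might be defined once and pulled into multiple taskflows:

```yaml
# Hypothetical illustration of task reuse -- file names and the
# "include" key are assumptions, not the actual schema.
# shared/check_disabled.yml -- one shared deterministic check
- id: check-disabled
  type: tool
  tool: is_workflow_disabled

# code_injection.yml and untrusted_checkout.yml both pull it in:
tasks:
  - include: shared/check_disabled.yml
  - id: audit
    type: prompt
    prompt: Apply the false positive checklist to the audit notes.
```

A fix to the shared file then propagates to every taskflow that includes it.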

Try It Yourself

Both the seclab-taskflow-agent framework and the example taskflows are open source. The repos include the actual YAML files used to triage GitHub Actions and JavaScript alerts.

Fair warning: running these taskflows generates a lot of LLM API calls. The team notes this can burn through your quota fast. Also, the taskflows create GitHub Issues. Don't run them on someone else's repo without permission.

If you want to build your own taskflows, start with a workflow that has these characteristics:

  • Many repetitive steps with clear, well-defined goals
  • Some steps involve semantic reasoning that's hard to encode as rules but easy for humans to spot
  • You can break the workflow into discrete tasks that don't depend on complex shared state

Security triage fits this perfectly. So does code review for specific patterns, documentation quality checks, or API usage audits. Anything where you're looking for fuzzy patterns in a structured process.

The framework supports model configuration files, so you can swap models across all taskflows without editing individual YAML files. Useful when new model versions drop or you want to compare results across different LLMs.
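A model configuration file along these lines (key names are illustrative assumptions, not the framework's actual format) would let you swap models in one place:

```yaml
# Hypothetical model configuration -- key names are illustrative.
# Changing "name" here applies to every taskflow without editing
# individual YAML files.
models:
  default:
    provider: anthropic
    name: claude-3-5-sonnet
    temperature: 0
```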

The Bottom Line

Use this if you're drowning in SAST false positives and the patterns are obvious to humans but hard to encode. Use this if you have security workflows that are repetitive but require semantic reasoning. Use this if you can clearly define what "done" looks like for each step.

Skip it if your workflow is too open-ended or if you need formal guarantees. The LLM will hallucinate sometimes. The reports need human review. This is a force multiplier for security researchers, not a replacement.

The real opportunity here isn't just vulnerability triage. It's the pattern: structured workflows where LLMs handle fuzzy logic and conventional code handles deterministic operations. That combination works for a lot more than security. The risk is treating this like magic — it's not. It's careful task decomposition and prompt engineering wrapped in a framework that handles the plumbing.

GitHub Security Lab found 30 real vulnerabilities with this approach. That's not a benchmark; it's a proof point. The question is what you'll build with the same pattern.

Source: GitHub Blog