
Semgrep vs CodeQL: Lightweight Patterns vs Semantic Analysis for SAST (2026)

Semgrep vs CodeQL for static analysis - rule syntax, performance, language support, GitHub integration, custom rules, and when to use each SAST tool.

Published: March 13, 2026

Last Updated: March 13, 2026

Quick verdict


Semgrep and CodeQL represent two fundamentally different philosophies for static application security testing. Semgrep is a lightweight, pattern-based scanner that matches YAML rules against source code in seconds, prioritizing speed, simplicity, and developer accessibility. CodeQL is a semantic analysis engine that builds a queryable database of your entire codebase, enabling deep data flow analysis and complex vulnerability research at the cost of significantly longer scan times and a steeper learning curve.

If you need fast, practical security scanning in CI/CD, choose Semgrep. Its YAML-based rules are writable by any developer, scans complete in 10-30 seconds, and the open-source CLI runs anywhere with zero dependencies. The full platform is free for up to 10 contributors. Semgrep is the tool you run on every pull request.

If you need deep semantic analysis and vulnerability research, choose CodeQL. Its QL query language and database-backed architecture enable whole-program data flow analysis, complex taint tracking across call chains, and the kind of deep vulnerability discovery that pattern matching cannot achieve. CodeQL is the tool you run for thorough security audits - nightly, weekly, or as part of dedicated research.

If you want the strongest security posture, run both: Semgrep on every PR for fast pattern matching and standards enforcement, and CodeQL on a schedule for deep semantic analysis. This layered approach is what many security-mature organizations deploy, and both tools are free for open-source projects.

At-a-glance comparison

Dimension	Semgrep	CodeQL
Primary approach	Pattern matching against source code	Semantic queries against code database
Rule/query language	YAML (mirrors target language)	QL (dedicated query language)
Learning curve	Hours	Weeks to months
Scan speed (typical)	10-30 seconds	10-60+ minutes (database build + queries)
Taint tracking	Yes (Pro engine - cross-file)	Yes (deep, whole-program)
Data flow analysis	Cross-file (Pro engine)	Whole-program, inter-procedural
Languages supported	30+	~17
IaC scanning	Yes (Terraform, K8s, Docker, CloudFormation)	No
Open source	Yes (LGPL-2.1)	Partially (queries are open, engine is proprietary)
Free for open source	Yes	Yes
Free for private repos	Yes (OSS CLI, full platform for 10 contributors)	No - requires GitHub Advanced Security ($49/committer/month)
GitHub integration	Good (Actions, PR comments, SARIF)	Native (built into GitHub Security tab)
Non-GitHub CI	Full support everywhere	Possible but licensing requires GHAS
Custom rule authoring	Excellent - minutes to write	Powerful but slow - hours to days
AI features	Semgrep Assistant (AI triage)	Copilot Autofix (AI-generated fixes)
SCA / dependency scanning	Yes (Semgrep Supply Chain with reachability)	Limited (dependency analysis in some queries)
Secrets detection	Yes (Semgrep Secrets with validation)	No dedicated module
Community rules/queries	20,000+ Pro / 2,800+ community	400+ community queries
IDE integration	VS Code (LSP-based)	VS Code (CodeQL extension)

Understanding the comparison: pattern matching vs semantic analysis

Before diving into features, it is important to understand the architectural distinction between these tools. This distinction drives every practical difference - from scan speed to detection depth to the learning curve for writing rules.

Semgrep uses pattern matching. When you run Semgrep, it parses source code into an abstract syntax tree and matches your YAML rules directly against that tree. Rules describe the code patterns you want to find using syntax that mirrors the target language. This approach is fast because it does not require building an intermediate representation of the entire program’s semantics. It is also intuitive because rules look like the code they detect. The trade-off is that pattern matching is inherently local - it excels at finding specific code constructs but struggles with analysis that requires understanding the full program’s behavior across many files and call chains.
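To make "parse to a syntax tree, then match patterns against it" concrete, here is a minimal sketch using Python's standard `ast` module. It is a toy stand-in, not Semgrep's engine: the `find_calls` helper and the `SOURCE` sample are hypothetical, and the matcher only compares how the callee is written, much as a pattern like `subprocess.call(...)` would.

```python
import ast

# Hypothetical sample file to scan.
SOURCE = '''
import subprocess

def ping(host):
    subprocess.call("ping -c 1 " + host, shell=True)

def version():
    subprocess.call(["git", "--version"])
'''

def find_calls(source: str, dotted_name: str) -> list[int]:
    """Return line numbers of calls whose callee is written as `dotted_name`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # A Call node whose function expression prints back as the dotted name.
        if isinstance(node, ast.Call) and ast.unparse(node.func) == dotted_name:
            hits.append(node.lineno)
    return sorted(hits)

print(find_calls(SOURCE, "subprocess.call"))  # [5, 8]
```

The matcher never asks where `host` came from, which is the "inherently local" limitation described above: both calls match, whether or not attacker-controlled data can reach them.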

CodeQL uses semantic analysis. When you run CodeQL, it first builds a relational database (called a “CodeQL database”) that represents your entire codebase’s structure - every function, variable, type, call site, control flow path, and data flow relationship. Then it executes queries written in QL (a purpose-built declarative query language similar to Datalog) against that database. This approach is slower because building the database requires compiling or parsing the entire codebase and extracting all semantic relationships. But it enables queries that are impossible with pattern matching alone - for example, “find every path through which user input can reach a SQL query, across any number of intermediate functions and files.”
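The "database plus query" idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `FLOW_EDGES` facts are invented, and real CodeQL databases and QL queries are far richer. The point is that once flow relationships are stored as facts, a source-to-sink question becomes a recursive reachability query.

```python
# Hypothetical facts an extraction step might record: "data flows from A to B".
FLOW_EDGES = {
    ("request.args['q']", "handler.q"),
    ("handler.q", "build_query.term"),
    ("build_query.term", "build_query.sql"),
    ("build_query.sql", "cursor.execute"),
    ("config.TABLE", "build_query.sql"),
}
SOURCES = {"request.args['q']"}   # where untrusted data enters
SINKS = {"cursor.execute"}        # where a dangerous operation happens

def reachable(start: str, edges: set[tuple[str, str]]) -> set[str]:
    """Transitive closure from `start`, similar in spirit to a recursive Datalog rule."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen

# "Find every source that can reach a sink through any number of hops."
findings = sorted((s, k) for s in SOURCES for k in SINKS if k in reachable(s, FLOW_EDGES))
print(findings)  # [("request.args['q']", 'cursor.execute')]
```

Note that `config.TABLE` also reaches the sink but is not reported, because it is not listed as a source. Choosing sources and sinks is what a taint-tracking query encodes.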

The practical implication: Semgrep is the tool you run on every pull request because it is fast enough to never block developers. CodeQL is the tool you run on a schedule because its depth justifies the longer analysis time. Semgrep catches the patterns you know to look for. CodeQL discovers vulnerabilities you did not know existed. Both are valuable, and they serve different roles in a security program.

What is Semgrep?

Semgrep is a lightweight, programmable static analysis engine built for application security. Created by Semgrep, Inc. (formerly r2c, short for Return to Corporation), it scans source code for patterns that match rules you define or pull from a community registry. The core engine is open source under the LGPL-2.1 license, runs as a single binary with no external dependencies, and completes scans in seconds rather than minutes.

How Semgrep works

Semgrep takes a fundamentally different approach from traditional SAST tools. Rather than building a full program representation and running complex dataflow analyses, Semgrep uses a pattern-matching approach where rules describe the code you want to find using syntax that mirrors the target language. This design makes rules readable by any developer - not just security specialists - and keeps scan times extremely fast.

The Semgrep engine operates in three tiers:

Community Edition (OSS): Single-file, single-function analysis. The core pattern-matching engine with 2,800+ community rules. Free forever, runs anywhere.
Pro Engine: Cross-file and cross-function dataflow analysis. Traces tainted data from sources to sinks across entire codebases. Available with the Team tier. Independent testing found that the Pro engine detected 72-75% of vulnerabilities in test suites compared to just 44-48% for the Community Edition.
Semgrep AppSec Platform: The commercial product that wraps the engine with AI triage (Semgrep Assistant), a managed dashboard, policy management, and integrations. Includes three product modules - Semgrep Code (SAST), Semgrep Supply Chain (SCA with reachability), and Semgrep Secrets (credential detection with validation).

Key strengths of Semgrep

Custom rule authoring. Semgrep’s rule syntax is widely regarded as one of the most approachable in static analysis. Rules are written in YAML and use patterns that mirror the target language:

rules:
  - id: sql-injection-concat
    patterns:
      # Bind the leading string literal to $SQL so its contents can be checked.
      - pattern: |
          $QUERY = "$SQL" + $INPUT + "..."
      # metavariable-regex is left-anchored, so allow any prefix before the keyword.
      - metavariable-regex:
          metavariable: $SQL
          regex: (?i).*\b(select|insert|update|delete)\b
    message: >
      Possible SQL injection: query built with string concatenation.
      Use parameterized queries instead.
    severity: ERROR
    languages: [python]

Any developer who reads Python can read this rule. The learning curve is measured in hours, not weeks. For a deeper look at Semgrep’s pricing tiers and what each includes, see our Semgrep pricing breakdown.
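For context, here is a hedged sketch of the kind of Python code such a rule is aimed at, next to the parameterized alternative its message recommends. The `users` table and both helper functions are hypothetical; the example uses the standard-library `sqlite3` module so that it runs anywhere, and it does not claim to reproduce Semgrep's actual output.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # String concatenation: the shape the rule above is meant to flag.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # Parameterized query: the driver binds `name` as data, never as SQL.
    return conn.execute("SELECT id, name FROM users WHERE name = ?", (name,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "x' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # 2: the injected OR clause matches every row
print(len(find_user_safe(conn, payload)))    # 0: no user has that literal name
```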

Scan speed. Semgrep scans a typical repository in 10-30 seconds. The median CI scan time reported by Semgrep is 10 seconds. This speed makes it practical to run on every commit and every pull request without becoming a bottleneck.

AI-powered triage. Semgrep Assistant uses AI to analyze findings, assess exploitability, and prioritize fixes. Semgrep reports that Assistant reduces false positive noise by 20-40% out of the box.

Broad language and IaC support. Semgrep supports 30+ programming languages plus infrastructure-as-code formats including Terraform, CloudFormation, Kubernetes manifests, and Dockerfiles. This breadth makes it valuable for both application developers and platform engineering teams.

What is CodeQL?

CodeQL is a semantic code analysis engine developed by GitHub (originally Semmle, acquired by GitHub in 2019). It treats code as data by building a relational database of your codebase’s structure, then lets you query that database using QL - a purpose-built logic programming language. CodeQL powers GitHub’s code scanning feature and is the analytical foundation of GitHub Advanced Security (GHAS).

How CodeQL works

CodeQL’s analysis process has two distinct phases:

Database creation: CodeQL extracts information from your source code by observing the build process (for compiled languages like Java, C++, C#, Go) or by directly parsing source files (for interpreted languages like Python, JavaScript, Ruby). The result is a CodeQL database - a relational representation of your code’s abstract syntax tree, control flow graph, data flow graph, type hierarchy, and call graph. For compiled languages, this step requires that your code successfully compiles, which means the build environment must be correctly configured. Database creation time varies from a few minutes for small projects to over an hour for large C++ codebases.
Query execution: Once the database exists, you execute QL queries against it. Each query describes a pattern or property you want to find, expressed as a logical predicate over the code’s structure. GitHub ships a default set of queries for each supported language, covering common vulnerability classes (SQL injection, XSS, path traversal, buffer overflow, etc.). You can also write custom queries using the QL language and CodeQL’s language-specific libraries.
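The two phases map onto two CodeQL CLI invocations. The sketch below only assembles the argument lists in Python rather than running them, since executing them requires an installed CodeQL bundle. The helper names, paths, and the Maven build command are assumptions for illustration; the subcommands and flags shown (`database create`, `database analyze`, `--language`, `--source-root`, `--command`, `--format`, `--output`) are the ones the CLI documents.

```python
from pathlib import Path
from typing import Optional

def database_create_cmd(db: Path, language: str, source_root: Path,
                        build_command: Optional[str] = None) -> list[str]:
    """Phase 1: build the CodeQL database, optionally wrapping a build command."""
    cmd = ["codeql", "database", "create", str(db),
           f"--language={language}", f"--source-root={source_root}"]
    if build_command:  # compiled languages: CodeQL observes this build
        cmd.append(f"--command={build_command}")
    return cmd

def database_analyze_cmd(db: Path, sarif_out: Path) -> list[str]:
    """Phase 2: run queries against the database and write SARIF.

    With no query suite named, the CLI is expected to pick a default suite
    for the database's language (assumes the standard query packs are available).
    """
    return ["codeql", "database", "analyze", str(db),
            "--format=sarif-latest", f"--output={sarif_out}"]

# Hypothetical Java project built with Maven.
print(database_create_cmd(Path("db"), "java", Path("."), "mvn -q clean package"))
print(database_analyze_cmd(Path("db"), Path("results.sarif")))
```

Passing either list to `subprocess.run(cmd, check=True)` would execute the phase. Keeping them separate mirrors the fact that one database can be queried many times without being rebuilt.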

Key strengths of CodeQL

Deep semantic analysis. CodeQL’s database representation captures the complete semantic structure of your code - every type, every function call, every data flow path, every control flow branch. This enables analysis that pattern-matching tools cannot perform. For example, CodeQL can answer: “Does any user-controlled input reach this SQL query through any possible execution path, including through virtual method dispatch, callbacks, and framework-specific routing?” This question requires understanding the entire program’s call graph and data flow, which is exactly what CodeQL’s database provides.

Sophisticated taint tracking. CodeQL’s taint tracking is among the most sophisticated in any static analysis tool. A taint-tracking configuration lets you define sources (where untrusted data enters), sinks (where dangerous operations occur), and sanitizers (where data is validated; CodeQL calls these barriers). CodeQL then exhaustively searches the entire call graph for paths from sources to sinks, accounting for:

Inter-procedural calls across file boundaries
Virtual dispatch and dynamic binding
Framework-specific routing (Spring MVC controllers, Express handlers)
Container operations (adding tainted data to a collection, then reading it later)
Field sensitivity (tracking taint through object fields)

Here is a simplified CodeQL query that finds SQL injection in Java:

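/**
 * @name SQL injection (simplified)
 * @kind path-problem
 * @id java/example/sql-injection
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.dataflow.TaintTracking
import semmle.code.java.security.QueryInjection

// Module-based data flow API (DataFlow::ConfigSig), which replaced the
// older TaintTracking::Configuration class in current CodeQL releases.
module SqlInjectionConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  predicate isSink(DataFlow::Node sink) { sink instanceof QueryInjectionSink }
}

module SqlInjectionFlow = TaintTracking::Global<SqlInjectionConfig>;

import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@.", source.getNode(), "user input"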

This query leverages CodeQL’s built-in libraries for Java data flow analysis, remote flow sources, and SQL injection sinks. The engine traces data from any HTTP request parameter through every possible code path to any SQL query execution. The precision is remarkable, but the QL language requires significant learning investment.

GitHub-native integration. CodeQL is deeply integrated into GitHub’s security ecosystem. When CodeQL runs via GitHub Actions, findings appear as code scanning alerts in the repository’s Security tab. Alerts are automatically annotated on pull requests, dismissed alerts persist across scans, and the alert management workflow is built into GitHub’s native UI. For teams that live in GitHub, this integration is seamless - there is no separate dashboard to manage.

Copilot Autofix. GitHub has integrated Copilot with CodeQL to provide AI-generated fix suggestions for code scanning alerts. When CodeQL identifies a vulnerability, Copilot Autofix generates a proposed code change that addresses the issue. This feature is available to GHAS customers and represents an interesting convergence of AI-assisted development and static analysis.

Open query repository. While the CodeQL engine itself is proprietary, the query libraries are open source and maintained on GitHub. The community contributes queries, and GitHub’s security research team publishes new queries regularly. This creates a collaborative ecosystem where security researchers share detection logic. However, the total number of community queries (approximately 400+) is significantly smaller than Semgrep’s registry (20,000+ Pro / 2,800+ community).

Feature-by-feature deep dive

Rule syntax and authoring

This is the most important practical differentiator between the two tools, and it deserves detailed examination.

Semgrep rules are YAML-based and mirror the target language. A taint-tracking rule that detects command injection through a Flask endpoint takes minutes to write:

rules:
  - id: flask-command-injection
    mode: taint
    pattern-sources:
      - patterns:
          - pattern: flask.request.$ANYTHING
    pattern-sinks:
      - patterns:
          - pattern: subprocess.call(...)
    message: >
      User input from flask.request flows to subprocess.call(),
      creating a command injection vulnerability.
    severity: ERROR
    languages: [python]

Any Python developer can read this rule and understand exactly what it detects. The Semgrep playground at semgrep.dev/playground lets you test rules interactively against sample code. Writing, testing, and deploying a new Semgrep rule to CI takes under an hour.
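As a rough illustration of what a command injection rule like this is guarding against, the sketch below contrasts a shell-string call with an argument-list call. Flask is left out so the block runs with only the standard library: `user_input` stands in for a value read from `flask.request`, and the `ECHO` helper command and both function names are hypothetical.

```python
import subprocess
import sys

# Stand-in for a real external tool: prints its first argument.
ECHO = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

def run_unsafe(user_input: str) -> int:
    # Shape the taint rule targets: request data concatenated into a shell command.
    # Defined for comparison only; it is never called in this example.
    return subprocess.call("echo " + user_input, shell=True)

def run_safe(user_input: str) -> str:
    # Argument list, no shell: metacharacters in user_input stay literal text.
    result = subprocess.run(ECHO + [user_input], capture_output=True, text=True, check=True)
    return result.stdout.strip()

print(run_safe("hello; echo injected"))  # prints: hello; echo injected
```

With `run_unsafe`, the same input would be split by the shell into two commands. A taint rule flags that path because the value reaches the sink without passing through anything that neutralizes it.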

CodeQL queries are written in QL, a dedicated logic programming language. The equivalent command injection detection in CodeQL for Python would involve:

/**
 * @name Command injection (simplified)
 * @kind path-problem
 * @id py/example/command-injection
 */

import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.Concepts

// Module-based data flow API (DataFlow::ConfigSig), which replaced the
// older TaintTracking::Configuration class in current CodeQL releases.
module CommandInjectionConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

  predicate isSink(DataFlow::Node sink) {
    sink = any(SystemCommandExecution cmd).getCommand()
  }
}

module CommandInjectionFlow = TaintTracking::Global<CommandInjectionConfig>;

import CommandInjectionFlow::PathGraph

from CommandInjectionFlow::PathNode source, CommandInjectionFlow::PathNode sink
where CommandInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink,
  "Command injection from $@.", source.getNode(), "user input"

This query is more powerful - it will find command injection through any code path, not just direct flows - but it requires understanding QL’s type system, predicate logic, CodeQL’s class hierarchy, and the language-specific library for Python. The learning curve is measured in weeks for basic competency and months for proficiency with complex taint-tracking queries.

The practical impact: When a security team discovers a new vulnerability pattern - say, an internal API that must always validate authentication tokens - a Semgrep rule can be written, tested, and deployed within an hour. The equivalent CodeQL query might take a day or more for someone proficient in QL, and significantly longer for someone still learning the language. For organizations that need to rapidly encode new detection patterns into automated scanning, Semgrep’s rule authoring speed is a decisive advantage.

Rule authoring dimension	Semgrep	CodeQL
Language	YAML (declarative, code-like patterns)	QL (logic programming language)
Learning curve	Hours	Weeks to months
Time to write a basic rule	15-30 minutes	1-4 hours (for proficient users)
Time to write a taint rule	30-60 minutes	2-8 hours (for proficient users)
Playground/testing	semgrep.dev/playground	VS Code CodeQL extension + database
Documentation	Concise, developer-oriented	Extensive but dense
Community contributions	Easy (submit YAML to registry)	Moderate (submit QL to GitHub repo)

Performance and speed

The performance difference between Semgrep and CodeQL is not a minor distinction - it fundamentally shapes how each tool fits into development workflows.

Semgrep is designed for real-time CI/CD integration. A typical scan completes in 10-30 seconds. Semgrep achieves this speed by matching patterns directly against the parsed abstract syntax tree of source files without building an intermediate database representation. Diff-aware scanning in CI means only changed files are analyzed, keeping incremental scan times constant regardless of total codebase size. This speed makes Semgrep practical to run on every commit, every pull request, and even as a pre-commit hook.

CodeQL is designed for thorough analysis, not speed. A CodeQL analysis involves two phases, each with significant time cost:

Database creation - for compiled languages, CodeQL observes the build process, which means the code must compile fully. For a medium-sized Java project, database creation might take 5-15 minutes. For a large C++ codebase, it can take 30-90+ minutes. For interpreted languages (Python, JavaScript), database creation is faster (2-10 minutes) since CodeQL only needs to parse source files.
Query execution - running the default query suite against a database takes additional minutes. Complex taint-tracking queries that explore the entire call graph are computationally expensive. A full security scan with all default queries might add 5-20 minutes on top of database creation.

Total analysis time comparison:

Codebase size	Semgrep	CodeQL
Small (10K LOC)	5-10 seconds	3-8 minutes
Medium (100K LOC)	10-30 seconds	10-30 minutes
Large (1M LOC)	30-90 seconds	30-90+ minutes
Very large (10M+ LOC)	2-5 minutes	1-3+ hours

The workflow implication: Semgrep fits into the “run on every PR” workflow without friction. CodeQL fits into the “run on schedule” or “run on merge to main” workflow. Attempting to run CodeQL on every PR in a fast-moving repository with frequent merges will create significant pipeline bottlenecks. Many teams address this by running Semgrep on PRs for fast feedback and CodeQL nightly or weekly for deeper analysis.

Language support

Semgrep supports 30+ languages including Python, Java, JavaScript, TypeScript, Go, Ruby, Rust, C, C++, C#, Kotlin, Swift, PHP, Scala, Lua, OCaml, Elixir, R, Solidity, and infrastructure-as-code languages (Terraform/HCL, CloudFormation, Kubernetes YAML, Dockerfiles). The breadth of language support is one of Semgrep’s strongest selling points, especially the IaC coverage that CodeQL lacks entirely.

CodeQL supports approximately 17 languages including C, C++, C#, Go, Java, Kotlin, JavaScript, TypeScript, Python, Ruby, Swift, and Rust (in preview). CodeQL’s analysis depth for its supported languages is typically deeper than Semgrep’s, with language-specific libraries that model framework behaviors, standard library APIs, and common coding patterns with high precision. However, the narrower language roster means teams with polyglot stacks or infrastructure-as-code needs may find gaps.

Key language support differences:

Language / Format	Semgrep	CodeQL
Python	Strong	Strong
Java / Kotlin	Strong	Very strong
JavaScript / TypeScript	Strong	Strong
Go	Strong	Strong
C / C++	Good	Very strong
C#	Good	Strong
Ruby	Good	Good
Rust	Good (stable)	Preview / experimental
Swift	Good	Good
PHP	Good	Not supported
Elixir	Supported	Not supported
Terraform / HCL	Strong	Not supported
Kubernetes YAML	Strong	Not supported
Dockerfiles	Strong	Not supported
CloudFormation	Strong	Not supported

The absence of IaC support in CodeQL is significant for platform engineering and DevOps teams. If scanning Terraform configurations, Kubernetes manifests, or Dockerfiles for security misconfigurations is part of your requirements, Semgrep covers this natively while CodeQL does not.

Taint tracking and data flow analysis

This is where CodeQL’s architectural advantage is most apparent.

CodeQL’s taint tracking is whole-program and exhaustive. Because CodeQL operates on a database that represents the entire codebase’s semantic structure, its taint tracking can:

Trace data through arbitrary numbers of function calls and file boundaries
Handle virtual dispatch - tracking tainted data through interface implementations and method overrides
Model framework-specific routing - understanding that a Spring @RequestMapping method receives HTTP parameters
Track data through container operations - adding tainted data to a List, passing the list to another function, then extracting the data
Handle field sensitivity - tracking which specific object fields are tainted versus clean
Account for sanitizers - recognizing when data passes through a validation or escaping function

This depth makes CodeQL exceptionally powerful for vulnerability research. Security researchers use CodeQL to discover zero-day vulnerabilities by querying for complex data flow patterns across large open-source projects. GitHub’s own security team has used CodeQL to discover CVEs in widely used projects.

Semgrep’s taint tracking is practical and fast. The Pro engine (available on the Team tier) supports cross-file and cross-function taint tracking, tracing data from user-defined sources to sinks across file boundaries. Independent testing showed the Pro engine detected 72-75% of vulnerabilities compared to 44-48% for the single-file Community Edition. Semgrep’s taint mode is defined directly in the YAML rule syntax, making it accessible to developers who are not taint-tracking specialists.

However, Semgrep’s taint tracking has limitations compared to CodeQL:

It does not build a complete call graph of the program, so some inter-procedural paths may be missed
Virtual dispatch and dynamic binding resolution is less complete
Container and field sensitivity is limited
The analysis is faster but less exhaustive
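The Python sketch below shows, in miniature, the kind of flow the two lists above are describing: a tainted value enters a container, is stored in an object field, and reaches a sink through an overridden method. All names are hypothetical and the "sink" only formats a string, so nothing is executed. It is meant to show what an analysis has to model, not how either tool behaves on this exact code.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    label: str                                      # clean field
    args: list[str] = field(default_factory=list)   # field that receives tainted data

class Runner:
    def run(self, command: str) -> str:
        raise NotImplementedError

class ShellRunner(Runner):
    def run(self, command: str) -> str:
        # Sink: in real code this would hand `command` to a shell.
        return f"SHELL<{command}>"

def enqueue(queue: list[Job], user_value: str) -> None:
    # Taint enters a container (the queue) inside an object field (Job.args).
    queue.append(Job(label="report", args=[user_value]))

def drain(queue: list[Job], runner: Runner) -> list[str]:
    # Taint leaves the container, crosses a field read, then a virtual call.
    return [runner.run("render " + " ".join(job.args)) for job in queue]

queue: list[Job] = []
enqueue(queue, "x; rm -rf /tmp/demo")   # source: attacker-controlled string
print(drain(queue, ShellRunner()))      # ['SHELL<render x; rm -rf /tmp/demo>']
```

To connect source and sink here, an analyzer must know that `job.args` is tainted while `job.label` is not (field sensitivity), that items read from `queue` carry what was appended (container flow), and that `runner.run` may resolve to `ShellRunner.run` (virtual dispatch).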

The practical distinction: For catching common vulnerability patterns (OWASP Top 10 categories), Semgrep’s taint tracking is more than sufficient and delivers results in seconds. For discovering complex, novel vulnerabilities that require tracing data through multiple layers of abstraction, CodeQL’s exhaustive analysis is necessary but takes much longer. The choice depends on whether you are defending against known attack patterns (Semgrep) or researching unknown ones (CodeQL).

CI/CD integration

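Semgrep is among the easiest security scanners to add to CI/CD. The CLI runs as a single binary with zero external dependencies. The semgrep-action wrapper that older guides reference has been deprecated in favor of invoking the CLI directly, so in a GitHub Actions workflow the scan is, for example, one run step after checkout:

- uses: actions/checkout@v4
- run: pipx install semgrep && semgrep scan --config p/default --error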

Semgrep supports GitHub Actions, GitLab CI, Jenkins, CircleCI, Bitbucket Pipelines, Azure Pipelines, and any CI system that can run a command-line tool. There is no database to configure, no server to maintain, and no build process to replicate. Diff-aware scanning means only changed files are analyzed, keeping incremental scans fast. For a step-by-step guide, see our Semgrep setup tutorial.

CodeQL’s CI/CD integration is more involved. The standard deployment path is through GitHub Actions:

- uses: github/codeql-action/init@v3
  with:
    languages: javascript
- uses: github/codeql-action/autobuild@v3
- uses: github/codeql-action/analyze@v3

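For compiled languages (Java, C++, C#, Go), CodeQL traditionally requires a successful build to create its database. The autobuild step attempts to detect and run the correct build command, but complex projects with custom build systems may need manual configuration. Recent CodeQL releases also offer a build-free extraction mode (build-mode: none) for some compiled languages, including Java and C#, which trades some precision for easier setup. Where a build is still required, this dependency adds friction that Semgrep avoids entirely.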

CodeQL can run outside of GitHub Actions, but licensing requires GHAS for private repositories. Some teams run CodeQL in Jenkins or GitLab CI by downloading the CLI bundle, though this is a less-supported deployment path.

Integration comparison:

CI/CD dimension	Semgrep	CodeQL
Setup complexity	Very low - single binary	Moderate - requires build environment
Build dependency	None	Yes (for compiled languages)
Scan speed	10-30 seconds	10-60+ minutes
Diff-aware scanning	Yes	Limited (full database rebuild required)
PR comments	Yes	Yes (native GitHub annotations)
GitHub Actions	Official action	Official action
GitLab CI	Full support	Possible (requires GHAS license)
Jenkins	Full support	Possible (requires GHAS license)
Bitbucket Pipelines	Full support	Not officially supported
Non-GitHub platforms	Full support everywhere	Limited by licensing
SARIF output	Yes	Yes

GitHub integration

CodeQL has a significant advantage in GitHub integration because it is a GitHub product.

CodeQL findings are native GitHub code scanning alerts. When CodeQL runs via GitHub Actions, findings appear directly in the repository’s Security tab under “Code scanning alerts.” Each alert includes the vulnerability description, the affected code, the data flow path, and a severity rating. Alerts are automatically annotated on pull requests, and developers can dismiss alerts with reasons that persist across future scans. The code scanning API lets teams programmatically query and manage alerts across their organization’s repositories.

Copilot Autofix generates AI-powered fix suggestions for CodeQL alerts. When a vulnerability is detected, Copilot analyzes the finding and proposes a code change that addresses the issue. This feature reduces the time between detection and remediation, especially for developers who may not be security experts.

Semgrep integrates well with GitHub but through external mechanisms. Semgrep posts PR comments with findings, uploads SARIF to GitHub’s code scanning API (so findings can appear in the Security tab), and supports GitHub Actions as a CI platform. The integration is functional and well-maintained, but it is not as deeply embedded in GitHub’s native security features as CodeQL. Semgrep’s dashboard, alert management, and triage workflows live in the Semgrep AppSec Platform rather than in GitHub’s UI.

For non-GitHub teams, Semgrep has the clear advantage. Semgrep works equally well with GitLab, Bitbucket, Azure DevOps, and any CI system. CodeQL’s licensing model ties it to GitHub, making it impractical for teams that use other source code management platforms.

Pricing and licensing

The pricing models for Semgrep and CodeQL are fundamentally different, and understanding the structure is important for budgeting.

Semgrep pricing:

Tier	Price	What you get
Community Edition (OSS)	Free	Open-source engine, 2,800+ community rules, single-file analysis, CLI and CI/CD
Team	$35/contributor/month (free for first 10 contributors)	Cross-file analysis, 20,000+ Pro rules, Semgrep Assistant (AI triage), Semgrep Supply Chain (SCA), Semgrep Secrets, dashboard
Enterprise	Custom pricing	Everything in Team plus SSO/SAML, custom deployment, advanced reporting, dedicated support

CodeQL / GitHub Advanced Security pricing:

Tier	Price	What you get
Open source	Free	Full CodeQL analysis on public GitHub repositories
GitHub Advanced Security (GHAS)	$49/active committer/month	CodeQL for private repos, secret scanning, dependency review, Copilot Autofix
GitHub Enterprise	Custom pricing	GHAS + enterprise platform features, GHES support

Cost comparison for different team sizes:

Team size	Semgrep Team (annual)	GHAS (annual)	Notes
5 developers	$0 (free for 10 contributors)	$2,940/year	Semgrep is free
10 developers	$0 (free for 10 contributors)	$5,880/year	Semgrep is free
25 developers	$6,300/year (15 paid contributors)	$14,700/year	Semgrep is cheaper
50 developers	$16,800/year (40 paid contributors)	$29,400/year	Semgrep is cheaper
100 developers	$37,800/year (90 paid contributors)	$58,800/year	Semgrep is cheaper
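The figures in the table follow from simple arithmetic, sketched below so you can plug in your own headcount. The prices and the "first 10 contributors free" rule are taken from the pricing tables above; the function names are hypothetical, and the model assumes every developer counts as a billable contributor or committer, which may overstate real costs.

```python
def semgrep_team_annual(contributors: int, price_per_month: int = 35,
                        free_contributors: int = 10) -> int:
    """Annual Semgrep Team cost: only contributors beyond the free allowance are billed."""
    return max(contributors - free_contributors, 0) * price_per_month * 12

def ghas_annual(active_committers: int, price_per_month: int = 49) -> int:
    """Annual GHAS cost: every active committer is billed."""
    return active_committers * price_per_month * 12

for team_size in (5, 10, 25, 50, 100):
    print(team_size, semgrep_team_annual(team_size), ghas_annual(team_size))
# 5 0 2940
# 10 0 5880
# 25 6300 14700
# 50 16800 29400
# 100 37800 58800
```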

Important pricing notes:

GHAS pricing is per “active committer” - GitHub counts unique committers who pushed to repositories with GHAS enabled during the preceding 90 days. This can be lower than total team headcount if not all developers commit to repositories with GHAS enabled.
GHAS includes more than just CodeQL - it also includes secret scanning with push protection, dependency review, and Copilot Autofix. If you would pay for those features separately, the effective cost of CodeQL is lower.
Semgrep’s free tier for 10 contributors includes the full platform with cross-file analysis, AI triage, and the Pro rule library. This makes Semgrep essentially free for small teams.
Semgrep’s OSS CLI is free for unlimited contributors with no licensing restrictions. Teams can use the open-source engine in production indefinitely, only paying if they need the platform features.
CodeQL has no equivalent to Semgrep’s OSS CLI for private repositories. Scanning private code with CodeQL requires GHAS.

For a complete breakdown of Semgrep’s tiers, see our Semgrep pricing guide.

Use cases and recommendations

When to choose Semgrep

Choose Semgrep when:

Fast CI/CD scanning is non-negotiable. If your team practices continuous deployment with frequent merges and cannot tolerate multi-minute scan times blocking pull requests, Semgrep’s 10-30 second scans fit into any workflow without creating bottlenecks. This is especially critical for teams with dozens of daily PRs.

Custom rules need to be written quickly. If your organization discovers new vulnerability patterns - internal API misuse, custom authentication requirements, organization-specific coding standards - and needs to scan for them within hours rather than days, Semgrep’s YAML-based rule authoring delivers that velocity. No other tool matches Semgrep’s rule-to-production speed.

You use multiple CI/CD platforms or non-GitHub hosting. Semgrep runs identically on GitHub, GitLab, Bitbucket, Azure DevOps, Jenkins, CircleCI, and any system that can execute a command-line tool. CodeQL’s licensing ties it to GitHub, making Semgrep the only practical choice for multi-platform or non-GitHub teams.

Infrastructure-as-code scanning is required. Semgrep natively scans Terraform, Kubernetes manifests, CloudFormation templates, and Dockerfiles with security-focused rules. CodeQL does not support IaC scanning at all.

You have 10 or fewer contributors. The full Semgrep platform - including cross-file analysis, AI triage, SCA with reachability, and secrets detection - is free for 10 contributors. This is an extraordinary value for startups and small teams.

Enforcing coding standards is a priority. Banning deprecated APIs, requiring specific error handling patterns, mandating authentication on endpoints - these are code standard enforcement tasks where Semgrep’s speed and simple rule syntax excel. CodeQL is overkill for this use case.

For more options in this space, see our Semgrep alternatives guide.

When to choose CodeQL

Choose CodeQL when:

Deep vulnerability research is the goal. If you have a dedicated security research team that needs to discover novel vulnerabilities - not just detect known patterns - CodeQL’s semantic analysis and whole-program data flow tracking provide analytical power that pattern-matching tools cannot match. CodeQL is how security researchers find zero-days.

Your team is fully committed to GitHub. If you already pay for GitHub Enterprise with GHAS, CodeQL is included at no additional cost. The native integration with GitHub’s Security tab, code scanning alerts, PR annotations, and Copilot Autofix creates a seamless experience. Adding Semgrep would be an additional tool and cost.

Maximum taint tracking depth is required. If your codebase has complex data flow patterns - data passing through multiple abstraction layers, virtual dispatch, framework-specific routing, container operations - CodeQL’s exhaustive taint tracking will find flows that Semgrep’s pattern-based approach might miss.

Compliance requires thorough analysis evidence. Some compliance frameworks and security audits require evidence of thorough static analysis. CodeQL’s detailed data flow paths and exhaustive query execution provide stronger evidence of analytical completeness than pattern-based scanning.

You work primarily in Java, C++, or C# and need the deepest analysis. CodeQL’s language-specific libraries for Java, C++, and C# are exceptionally mature, with detailed modeling of standard libraries, framework APIs, and language-specific vulnerability patterns.

When to use both

Running Semgrep and CodeQL together is the strongest SAST strategy for many organizations. The tools serve different roles with minimal operational overlap:

Pattern 1 - Semgrep for speed, CodeQL for depth:

Semgrep runs on every pull request (10-30 seconds) to catch common vulnerability patterns, enforce coding standards, and provide fast developer feedback
CodeQL runs nightly or weekly (scheduled) to perform deep semantic analysis, discover complex data flow vulnerabilities, and provide thorough security coverage
Both upload SARIF to GitHub’s code scanning API, creating a unified view of findings

Pattern 2 - Semgrep for breadth, CodeQL for core languages:

Semgrep scans all languages in the stack including IaC (Terraform, K8s, Docker) with custom rules for organization-specific patterns
CodeQL provides deep analysis for the primary application languages (Java, Python, JavaScript) where its semantic understanding delivers the most value
The combination covers more languages and more analysis depth than either tool alone

Pattern 3 - Free tiers of both:

Semgrep OSS CLI (free) for fast pattern matching in CI
CodeQL via GitHub Actions (free for public repos, included in GHAS for private)
Cost depends on whether GHAS is already part of your GitHub subscription

This layered approach gives you Semgrep’s speed and rule authoring flexibility for daily development work, paired with CodeQL’s analytical depth for thorough security coverage. Both tools output SARIF, making finding consolidation straightforward.
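As a rough illustration of that consolidation, the sketch below merges two SARIF 2.1.0 logs by concatenating their `runs` arrays and counts findings per tool. The sample logs, tool names, and rule IDs are hand-written stand-ins rather than real scanner output, and the helper names `merge_sarif` and `findings_per_tool` are hypothetical.

```python
import json
from collections import Counter

# A commonly used schema URL for SARIF 2.1.0 logs.
SARIF_SCHEMA = "https://json.schemastore.org/sarif-2.1.0.json"


def merge_sarif(*logs):
    """Combine several SARIF 2.1.0 logs by concatenating their runs."""
    merged = {"$schema": SARIF_SCHEMA, "version": "2.1.0", "runs": []}
    for log in logs:
        if log.get("version") != "2.1.0":
            raise ValueError("expected a SARIF 2.1.0 log")
        merged["runs"].extend(log.get("runs", []))
    return merged


def findings_per_tool(log):
    """Count results per tool, keyed by each run's driver name."""
    counts = Counter()
    for run in log["runs"]:
        counts[run["tool"]["driver"]["name"]] += len(run.get("results", []))
    return counts


# Hand-written sample logs (made-up rule IDs) standing in for scanner output.
semgrep_log = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "Semgrep"}},
        "results": [
            {"ruleId": "example.banned-md5", "message": {"text": "MD5 used"}},
            {"ruleId": "example.os-system", "message": {"text": "os.system used"}},
        ],
    }],
}
codeql_log = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "CodeQL"}},
        "results": [
            {"ruleId": "example/sql-injection", "message": {"text": "Tainted query"}},
        ],
    }],
}

merged = merge_sarif(semgrep_log, codeql_log)
print(json.dumps(findings_per_tool(merged), indent=2))
```

Keeping each tool's output as its own run preserves the tool attribution, which matters when the same line of code is flagged by both scanners.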

Alternatives to consider

If neither Semgrep nor CodeQL fits your requirements, several alternatives are worth evaluating:

Snyk Code provides developer-first SAST with ML-based detection, strong IDE integration across VS Code, JetBrains, Eclipse, and Visual Studio, and a unified platform that includes SCA with auto-fix pull requests. It is a strong choice for teams that want comprehensive application security without writing custom rules. See our Snyk vs Semgrep comparison for details.

SonarQube is a code quality platform that covers both security vulnerabilities and code quality concerns (bugs, code smells, duplication, complexity, coverage). If you need quality gate enforcement and technical debt tracking alongside security scanning, SonarQube covers both dimensions. See our Semgrep vs SonarQube comparison for a detailed breakdown.

Checkmarx is an enterprise SAST and SCA platform with deep dataflow analysis, compliance reporting, and professional services. It targets large enterprises with dedicated AppSec teams and regulatory requirements. Pricing is significantly higher than both Semgrep and CodeQL.

For a broader overview, see our guide to the best SAST tools in 2026.

Final recommendation

Semgrep and CodeQL are complementary tools that serve different roles in a security program. Semgrep is the lightweight, fast, accessible scanner you run on every pull request to catch known patterns, enforce standards, and provide immediate developer feedback. CodeQL is the deep, semantic analyzer you run on a schedule to discover complex vulnerabilities, trace data flows exhaustively, and provide thorough security coverage.

If you must choose one, choose Semgrep for practical, day-to-day security scanning. Its speed (10-30 seconds), accessible rule authoring (YAML that any developer can read and write), broad language support (30+ languages plus IaC), platform independence (runs anywhere), and generous free tier (full platform for 10 contributors) make it the more practical choice for most engineering teams. Semgrep is the tool that actually gets deployed, used consistently, and maintained - because it is fast enough to never get turned off.

Choose CodeQL as your primary tool only if you are a GitHub-native organization that already has GHAS, your team includes dedicated security researchers who will invest in learning QL, and deep semantic analysis is more important to your security program than scan speed and rule authoring velocity.

The strongest posture is both. Semgrep for every PR, CodeQL on a schedule. Pattern matching for speed and coverage, semantic analysis for depth. This is the approach used by security-mature organizations, and both tools offer free tiers (Semgrep OSS CLI plus CodeQL on public repos, or Semgrep free for 10 contributors plus CodeQL via GHAS) that make dual deployment financially accessible.

For teams currently using neither tool, start with Semgrep. Add it to your CI pipeline today - it takes five minutes and scans in seconds. Once that baseline is established, evaluate adding CodeQL for deeper analysis on a schedule. Building security scanning incrementally is usually more effective than planning the perfect tool stack and never deploying it.

Frequently Asked Questions

Is Semgrep or CodeQL better for finding security vulnerabilities?

It depends on the type of analysis you need. CodeQL is better for deep vulnerability research that requires semantic understanding of code - tracing complex data flows across multiple files, identifying subtle logic flaws, and finding vulnerabilities that require full program analysis. Semgrep is better for enforcing known security patterns quickly and at scale - catching OWASP Top 10 issues, enforcing coding standards, and running fast CI scans. For most engineering teams, Semgrep provides more practical value because its speed and ease of use mean it actually gets deployed and used consistently. For dedicated security research teams, CodeQL's depth is unmatched.

Can I use Semgrep and CodeQL together?

Yes, and this is a strong strategy for security-mature organizations. The most effective pattern is running Semgrep on every pull request for fast, lightweight scanning (10-30 seconds) to catch common vulnerability patterns and enforce coding standards, while running CodeQL on a scheduled basis (nightly or weekly) for deeper semantic analysis that catches complex data flow vulnerabilities. Both tools output SARIF format, so findings from both can feed into GitHub's code scanning alerts or any SARIF-compatible dashboard. The tools have complementary strengths with minimal operational overlap.

Is CodeQL free to use?

CodeQL is free for open-source projects hosted on GitHub. For private repositories, CodeQL is included as part of GitHub Advanced Security (GHAS), which costs $49 per active committer per month. The CodeQL CLI can be downloaded and used for research purposes on open-source code without cost, but scanning proprietary code outside of GitHub requires a GHAS license. There is no standalone CodeQL product - it is exclusively distributed through GitHub.

Is Semgrep free for commercial use?

Yes. Semgrep's open-source CLI (Community Edition) is free for commercial use under the LGPL-2.1 license. You can run it in CI/CD pipelines on proprietary codebases, write custom rules, and use the 2,800+ community rules at no cost. The full Semgrep AppSec Platform (Team tier) is also free for up to 10 contributors, which includes cross-file analysis, AI triage, and the 20,000+ Pro rule library. Beyond 10 contributors, the Team tier costs $35/contributor/month.
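To put the per-seat figures quoted in this FAQ side by side, here is a small back-of-the-envelope calculator. It assumes that a Semgrep team above the free threshold pays for every contributor, not only those beyond the tenth; the article does not specify this, so treat the billing model and both function names as assumptions and check current vendor pricing before budgeting.

```python
def semgrep_team_monthly(contributors, rate=35, free_limit=10):
    """Estimated monthly Semgrep Team cost in USD.

    Assumption: teams at or under the free limit pay nothing, and larger
    teams pay the per-contributor rate for every contributor.
    """
    return 0 if contributors <= free_limit else contributors * rate


def ghas_monthly(active_committers, rate=49):
    """Estimated monthly GHAS cost in USD at the quoted per-committer rate."""
    return active_committers * rate


for team_size in (8, 25, 100):
    print(
        f"{team_size} developers: "
        f"Semgrep ${semgrep_team_monthly(team_size)}/mo, "
        f"GHAS ${ghas_monthly(team_size)}/mo"
    )
```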

How long does a CodeQL scan take compared to Semgrep?

Semgrep scans a typical repository in 10-30 seconds. CodeQL takes significantly longer because it must first build a database representation of your code (which can take minutes to tens of minutes depending on language and codebase size), then execute queries against that database. A full CodeQL scan on a medium-sized Java project might take 10-30 minutes, and large C++ codebases can take an hour or more. This speed difference is fundamental to the tools' architectures - Semgrep matches patterns directly against source code, while CodeQL builds and queries a relational database of code semantics.

Which has better custom rule authoring, Semgrep or CodeQL?

Semgrep has far easier custom rule authoring. Its rules are written in YAML using patterns that mirror the target language, making them readable and writable by any developer in hours. CodeQL rules are written in QL, a purpose-built declarative query language with its own type system, predicates, and recursion. QL is powerful but has a steep learning curve measured in weeks or months. Semgrep is the better choice for teams that need to rapidly deploy new detection rules. CodeQL is the better choice for security researchers who need maximum analytical power and are willing to invest in learning the language.

Does CodeQL work outside of GitHub?

The CodeQL CLI can be run outside of GitHub Actions, but the licensing restricts its use. CodeQL is free for analyzing open-source code regardless of where you run it. For proprietary code, you need a GitHub Advanced Security license, and the typical deployment path is through GitHub Actions or GitHub Enterprise Server. Some organizations run the CodeQL CLI in other CI systems (Jenkins, GitLab CI) by downloading the CLI bundle, but this still requires a GHAS license for private repositories. Semgrep, by contrast, runs anywhere with no licensing restrictions on the open-source CLI.

What languages does CodeQL support?

CodeQL supports approximately 17 languages: C, C++, C#, Go, Java, Kotlin, JavaScript, TypeScript, Python, Ruby, Swift, Rust (preview), and several configuration/markup languages. Semgrep supports 30+ languages including all of CodeQL's supported languages plus Rust (stable), Elixir, Lua, OCaml, Terraform/HCL, CloudFormation, Kubernetes YAML, Dockerfiles, and others. Semgrep has significantly broader language coverage, especially for infrastructure-as-code and newer languages.

Which tool has better taint tracking and data flow analysis?

CodeQL has more mature and deeper taint tracking and data flow analysis. Its database-backed approach allows it to perform whole-program analysis, tracking data through complex call chains, callbacks, virtual dispatch, and framework-specific patterns with high precision. Semgrep's taint mode tracks flows within a single function in the open-source engine, and the Pro engine extends it to cross-function and cross-file tracking, but it is not as deep as CodeQL's analysis. CodeQL can answer questions like 'does any user input reach this SQL query through any possible execution path?' with greater precision than Semgrep, especially in large codebases with complex control flow.
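The question "does any user input reach this SQL query?" is, at its core, a reachability question over a data flow graph. The toy sketch below shows that idea with a breadth-first search over a hand-written graph. It is not how either tool is implemented: real engines derive the graph from the code itself, and every node name here is hypothetical.

```python
from collections import deque


def tainted_paths(flows, sources, sinks, sanitizers=frozenset()):
    """Return one shortest path from each source to each sink it can reach.

    flows maps a node to the nodes its value flows into. Paths that pass
    through a sanitizer node are not followed.
    """
    found = []
    for source in sorted(sources):
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                found.append(path)
                continue
            for nxt in flows.get(node, ()):
                if nxt in seen or nxt in sanitizers:
                    continue
                seen.add(nxt)
                queue.append(path + [nxt])
    return found


# Hypothetical data flow graph for a small request handler.
flows = {
    "http_param": ["user_id"],
    "user_id": ["build_query", "escape"],
    "build_query": ["execute_raw"],
    "escape": ["safe_query"],
    "safe_query": ["execute_safe"],
}

paths = tainted_paths(
    flows,
    sources={"http_param"},
    sinks={"execute_raw", "execute_safe"},
    sanitizers={"escape"},
)
for path in paths:
    print(" -> ".join(path))
```

The depth difference between tools comes largely from how complete and precise this graph is: the more call chains, callbacks, and framework conventions the engine models, the more real flows the search can find.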

Can CodeQL replace Semgrep?

CodeQL cannot fully replace Semgrep for most teams. CodeQL's scan times (minutes to hours) make it impractical to run on every pull request in fast-moving development workflows. Its language support is narrower (no IaC scanning, fewer languages). Its QL query language has a steep learning curve that limits adoption to specialized security engineers. It also requires a GitHub Advanced Security license to scan proprietary code. Semgrep's speed, broader language support, easier rule authoring, and platform independence make it more practical as a primary CI/CD security scanner. However, CodeQL provides deeper analysis that Semgrep cannot match, making it a valuable complement rather than a replacement.

Is CodeQL or Semgrep better for GitHub-centric teams?

For teams fully committed to the GitHub ecosystem, CodeQL has a tighter integration. CodeQL findings appear natively in GitHub's Security tab as code scanning alerts, with automatic PR annotations, alert management, and dismissal workflows built into GitHub's UI. Semgrep also integrates with GitHub through Actions and SARIF uploads, and its PR comments work well, but the experience is not as deeply embedded in GitHub's native security features. If your team already pays for GitHub Advanced Security, CodeQL is included at no additional cost, making it the obvious choice for deep analysis alongside Semgrep for fast scanning.

What is the learning curve for CodeQL's QL language?

The QL language has a significant learning curve. QL is a declarative, logic-programming language with its own type system, class hierarchy, predicates, and recursive query patterns. Security engineers with backgrounds in SQL or Datalog will find some concepts familiar, but writing effective CodeQL queries requires understanding the language-specific CodeQL libraries (for example, the Java library has different abstractions than the Python library). Plan for 2-4 weeks of dedicated learning to write basic queries and several months to become proficient at writing complex taint-tracking queries. Semgrep's YAML-based rules, by comparison, can be learned in hours.

Which tool is better for enforcing coding standards?

Semgrep is significantly better for enforcing coding standards. Its YAML-based rules can express most code patterns you want to ban or require, and scans complete in seconds so they can run on every commit without slowing development. Common use cases include banning deprecated API calls, enforcing authentication patterns, requiring specific error handling approaches, and mandating secure defaults. CodeQL can theoretically enforce coding standards, but the effort required to write QL queries for simple pattern matching is disproportionate, and the slow scan times make it impractical for real-time enforcement in CI/CD.
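To make "banning an API call" concrete, the sketch below does a bare-bones version of syntactic matching using Python's standard `ast` module: it walks the syntax tree and flags calls whose dotted name is on a banned list. This only illustrates the idea and is not Semgrep's implementation; the banned list and the advice strings are hypothetical policy choices.

```python
import ast

# Hypothetical policy: calls the organization has decided to ban.
BANNED_CALLS = {
    "hashlib.md5": "use hashlib.sha256 instead",
    "os.system": "use subprocess.run with an argument list",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_banned_calls(source):
    """Return sorted (line, call name, advice) tuples for banned calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in BANNED_CALLS:
                findings.append((node.lineno, name, BANNED_CALLS[name]))
    return sorted(findings)


sample = '''import hashlib, os
digest = hashlib.md5(b"data").hexdigest()
os.system("ls")
ok = hashlib.sha256(b"data")
'''

for line, name, advice in find_banned_calls(sample):
    print(f"line {line}: {name} is banned; {advice}")
```

A check like this needs no build step or database, which is why pattern-style enforcement can run on every commit. The trade-off is that it only sees what is written at the call site: an aliased import such as `from hashlib import md5` would slip past this particular sketch.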
