How to Set Up Semgrep in 2026 - Complete Installation and Configuration Guide
Set up Semgrep for security scanning. Covers CLI install, custom rules, GitHub Actions integration, Semgrep Cloud, PR comments, and troubleshooting.
Why Semgrep and why now
Semgrep is a fast, open-source static analysis tool that finds bugs and security vulnerabilities by letting you write rules that look like the code you are searching for. Unlike legacy SAST tools that require specialized security expertise to configure and produce overwhelming false positive rates, Semgrep was designed from the ground up for developers. Its pattern syntax mirrors your actual source code, its CLI runs in seconds, and its rule library covers over 30 programming languages with thousands of pre-written checks.
Since its launch by r2c (now Semgrep, Inc.) in 2020, Semgrep has become the default security scanner for thousands of engineering teams - from startups running the free open-source engine to enterprises using the full cloud platform. Dropbox, Figma, Snowflake, and HashiCorp all use Semgrep in their development pipelines. The tool scans over 100 million lines of code daily across its user base.
The reason to set up Semgrep now is straightforward. Every major compliance framework - SOC 2, PCI DSS 4.0, ISO 27001 - requires or strongly recommends static analysis in the development lifecycle. Semgrep gives you that compliance evidence while also catching real vulnerabilities. The open-source engine is free, and the full platform is free for teams of up to 10 contributors. There is no licensing cost barrier to getting started.
This guide walks through every step of setting up Semgrep - from installing the CLI on your local machine to running it in CI/CD pipelines with automatic PR comments. By the end, you will have a production-ready Semgrep configuration that catches security issues before they reach your main branch.
Step 1 - Install the Semgrep CLI
Semgrep provides three installation methods. Choose the one that fits your environment.
Install with pip (all platforms)
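The pip installation is the recommended method and works on macOS, Linux, and Windows (via WSL). Current Semgrep releases require Python 3.9 or later (older releases also ran on Python 3.8), so check python3 --version before installing.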
# Install Semgrep
pip install semgrep
# Verify the installation
semgrep --version
If you prefer to isolate Semgrep from your system Python, use pipx:
# Install with pipx for isolated environment
pipx install semgrep
# Verify
semgrep --version
Install with Homebrew (macOS)
On macOS, Homebrew provides a straightforward installation:
# Install via Homebrew
brew install semgrep
# Verify
semgrep --version
Install with Docker (any platform)
Docker is useful for CI environments or when you do not want to install anything on the host system:
# Pull the Semgrep image
docker pull semgrep/semgrep
# Run a scan using the Docker image
docker run --rm -v "${PWD}:/src" semgrep/semgrep semgrep --config auto /src
The Docker approach mounts your current directory into the container at /src and runs the scan against it. This method ensures a consistent environment regardless of the host operating system.
Verify your installation
After installing through any method, confirm that Semgrep is working:
semgrep --version
# Expected output: a bare version number such as 1.x.x
If you see a version number, the installation was successful. If you encounter a “command not found” error, ensure that the installation directory is in your system PATH. For pip installations, this is typically ~/.local/bin on Linux or ~/Library/Python/3.x/bin on macOS.
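The PATH check can also be scripted, which is handy in a setup or onboarding script. The following is a minimal sketch using only the Python standard library; the function names `find_semgrep` and `semgrep_version` are illustrative and not part of Semgrep.

```python
import shutil
import subprocess


def find_semgrep(executable="semgrep"):
    """Return the absolute path of the executable on PATH, or None if absent."""
    return shutil.which(executable)


def semgrep_version(path):
    """Run `<path> --version` and return the trimmed output."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    location = find_semgrep()
    if location is None:
        print("semgrep is not on PATH; check your pip or pipx bin directory")
    else:
        print(f"semgrep found at {location}, version {semgrep_version(location)}")
```

If the script reports that semgrep is missing even though pip reported a successful install, the PATH fix described above is the likely remedy.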
Step 2 - Run your first scan
With Semgrep installed, you can run your first scan immediately without any configuration files.
Scan with the default rule set
Navigate to any project directory and run:
cd /path/to/your/project
semgrep --config auto
The --config auto flag tells Semgrep to automatically select rules that are relevant to the languages and frameworks detected in your project. Semgrep downloads the appropriate rules from the Semgrep Registry, runs the scan, and prints findings to your terminal.
Scan with a specific rule set
For more control over which rules run, specify a rule set by name:
# Run the default curated rule set
semgrep --config p/default
# Run security-focused rules
semgrep --config p/security-audit
# Run language-specific rules
semgrep --config p/python
semgrep --config p/javascript
semgrep --config p/golang
Scan a specific file or directory
You do not have to scan your entire project. Target specific paths:
# Scan a single file
semgrep --config p/default src/auth/login.py
# Scan a specific directory
semgrep --config p/default src/api/
# Scan multiple paths
semgrep --config p/default src/auth/ src/api/ src/middleware/
Understand the output
A typical Semgrep finding looks like this:
src/api/users.py
security.python.sql-injection.sql-injection
Detected string concatenation in SQL query. Use parameterized queries instead.
14│ query = "SELECT * FROM users WHERE id = " + user_id
Each finding includes the file path, the rule ID that triggered the match, a human-readable message explaining the issue, and the exact line of code that was matched. The rule ID is important because you will use it later to customize which rules run and to suppress false positives.
Output in different formats
Semgrep supports multiple output formats for integration with other tools:
# JSON output for programmatic processing
semgrep --config p/default --json > results.json
# SARIF output for GitHub Code Scanning integration
semgrep --config p/default --sarif > results.sarif
# JUnit XML for CI/CD integration
semgrep --config p/default --junit-xml > results.xml
# Emacs/Vim compatible output
semgrep --config p/default --emacs
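The JSON format is the easiest one to post-process. As a rough sketch, the script below summarizes a report by severity and location. The embedded sample is hypothetical and heavily trimmed; it assumes the top-level `results` list with `check_id`, `path`, `start.line`, and `extra.severity` fields that Semgrep's JSON output uses, and real reports contain many more fields.

```python
import json
from collections import Counter

# Trimmed, hypothetical sample of `semgrep --json` output for illustration.
SAMPLE = """
{
  "results": [
    {"check_id": "rules.sql-injection", "path": "src/api/users.py",
     "start": {"line": 14, "col": 5},
     "extra": {"severity": "ERROR", "message": "String concatenation in SQL query."}},
    {"check_id": "rules.insecure-hash", "path": "src/auth/tokens.py",
     "start": {"line": 8, "col": 12},
     "extra": {"severity": "WARNING", "message": "Insecure hash algorithm."}}
  ],
  "errors": []
}
"""


def summarize(report):
    """Count findings per severity and list (path, line, rule) locations."""
    by_severity = Counter(r["extra"]["severity"] for r in report["results"])
    locations = [
        (r["path"], r["start"]["line"], r["check_id"]) for r in report["results"]
    ]
    return by_severity, locations


# In practice, load the file written by the --json command instead of SAMPLE.
report = json.loads(SAMPLE)
by_severity, locations = summarize(report)
for path, line, rule in locations:
    print(f"{path}:{line} {rule}")
print(dict(by_severity))
```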
Step 3 - Understand Semgrep rule sets
Rule sets are collections of rules curated for specific use cases. Choosing the right rule sets determines what Semgrep looks for and how many findings you get.
Core rule sets
p/default is the starting point for most teams. It contains high-confidence security and correctness rules curated by the Semgrep team. These rules have low false positive rates and focus on issues that are almost always worth fixing. Start here and expand later.
p/security-audit is a broader security-focused set that includes rules with moderate confidence. It catches more potential issues but produces more findings that may require manual review. Use this when you want comprehensive security coverage and have the bandwidth to triage additional findings.
p/owasp-top-ten maps rules to the OWASP Top 10 vulnerability categories - injection, broken access control, cryptographic failures, and so on. This set is useful for compliance-driven teams that need to demonstrate OWASP coverage in their security program.
Language-specific rule sets
Semgrep provides curated rule sets for individual languages and frameworks:
| Rule set | Focus |
|---|---|
| p/python | Python security and correctness |
| p/javascript | JavaScript and Node.js security |
| p/typescript | TypeScript-specific patterns |
| p/golang | Go security and error handling |
| p/java | Java security patterns |
| p/ruby | Ruby and Rails security |
| p/csharp | C# security patterns |
| p/php | PHP security patterns |
| p/rust | Rust safety and correctness |
Infrastructure rule sets
For infrastructure-as-code scanning:
| Rule set | Focus |
|---|---|
| p/terraform | Terraform misconfigurations |
| p/dockerfile | Dockerfile security |
| p/docker-compose | Docker Compose issues |
| p/kubernetes | Kubernetes YAML security |
Combining multiple rule sets
You can run multiple rule sets in a single scan by passing multiple --config flags:
semgrep --config p/default --config p/security-audit --config p/python
Start with p/default alone, review the findings, and then add additional sets incrementally. Adding too many rule sets at once can generate an overwhelming number of findings that make it hard to prioritize what to fix first.
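If you drive scans from a script, the incremental approach can be expressed as a list of rule sources that grows over time. This is a small sketch, not an official wrapper: `build_scan_command` and `run_scan` are hypothetical helper names, and the only Semgrep features used are `semgrep scan` and repeated `--config` flags.

```python
import subprocess


def build_scan_command(configs, targets=()):
    """Build a `semgrep scan` argv with one --config flag per rule source."""
    command = ["semgrep", "scan"]
    for config in configs:
        command += ["--config", config]
    command += list(targets)
    return command


def run_scan(configs, targets=()):
    """Run the scan and return Semgrep's exit code (requires semgrep on PATH)."""
    return subprocess.run(build_scan_command(configs, targets)).returncode


# Start narrow, then widen the list as the team works through findings.
stage_one = build_scan_command(["p/default"], ["src/"])
stage_two = build_scan_command(["p/default", "p/security-audit", "p/python"], ["src/"])
print(" ".join(stage_two))
```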
Step 4 - Write custom Semgrep rules
One of Semgrep’s most powerful features is how easy it is to write custom rules. Unlike tools that require a proprietary query language, Semgrep rules use patterns that look like the code they are matching.
Basic rule structure
Create a file called custom-rules.yaml:
rules:
  - id: no-hardcoded-passwords
    patterns:
      - pattern: password = "$VALUE"
    message: >
      Hardcoded password detected. Use environment variables or a
      secrets manager instead of embedding credentials in source code.
    severity: ERROR
    languages:
      - python
    metadata:
      cwe:
        - "CWE-798: Use of Hard-coded Credentials"
      category: security
Every rule needs five required fields:
- id - a unique identifier for the rule, used in output and for suppression
- pattern or patterns - the code pattern to match, using metavariables like $VALUE as placeholders
- message - a human-readable explanation shown when the rule matches
- severity - one of ERROR, WARNING, or INFO
- languages - an array of languages this rule applies to
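To make the rule concrete, here is a small Python target file of the kind you might scan with it. The `# ruleid:` and `# ok:` comments follow Semgrep's rule-testing annotation convention and mark where the no-hardcoded-passwords rule is expected to match or stay silent; they are expectations based on reading the pattern, not captured scan output. The variable values and the `APP_PASSWORD` environment variable name are made up for illustration.

```python
import os

# ruleid: no-hardcoded-passwords
password = "hunter2"  # string literal assigned to `password`: should match

# ok: no-hardcoded-passwords
password = os.environ.get("APP_PASSWORD", "")  # value comes from the environment

# ok: no-hardcoded-passwords
greeting = "hello"  # different variable name, so the pattern does not apply
```

Note how narrow the pattern is: it only covers a variable named exactly password, which is worth keeping in mind before treating a clean scan as proof that no credentials are embedded.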
Using metavariables
Metavariables are placeholders that match any expression. They start with $ and an uppercase name:
rules:
  - id: insecure-hash-algorithm
    patterns:
      - pattern: hashlib.$ALGO(...)
      - metavariable-regex:
          metavariable: $ALGO
          regex: (md5|sha1)
    message: >
      Insecure hash algorithm '$ALGO' detected. Use SHA-256 or
      stronger for cryptographic hashing.
    severity: WARNING
    languages:
      - python
This rule matches any call to hashlib.md5(...) or hashlib.sha1(...) regardless of the arguments passed.
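A short target file shows which calls the metavariable and regex combination is meant to separate. As a sketch, the annotations below mark the expected behavior of the insecure-hash-algorithm rule; the payload is arbitrary.

```python
import hashlib

data = b"example payload"

# ruleid: insecure-hash-algorithm
weak_md5 = hashlib.md5(data).hexdigest()  # $ALGO binds to md5

# ruleid: insecure-hash-algorithm
weak_sha1 = hashlib.sha1(data).hexdigest()  # $ALGO binds to sha1

# ok: insecure-hash-algorithm
strong = hashlib.sha256(data).hexdigest()  # pattern matches, regex filters it out
```

The third call still matches the hashlib.$ALGO(...) pattern; it is the metavariable-regex condition that rejects it, because sha256 does not satisfy (md5|sha1).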
Combining patterns with pattern-either and pattern-not
Use pattern-either to match multiple patterns and pattern-not to exclude safe patterns:
rules:
  - id: dangerous-eval
    patterns:
      - pattern-either:
          - pattern: eval($INPUT)
          - pattern: exec($INPUT)
      # "..." matches any string literal, so constant arguments are excluded
      - pattern-not: eval("...")
      - pattern-not: exec("...")
    message: >
      Use of eval() or exec() with dynamic input is a code injection
      risk. Consider using ast.literal_eval() or a safer alternative.
    severity: ERROR
    languages:
      - python
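The rule message recommends ast.literal_eval() as a replacement. The sketch below contrasts the two approaches; the function names are hypothetical, and the annotations mark where the dangerous-eval rule is expected to fire.

```python
import ast


def parse_user_literal(text):
    """Parse a Python literal (numbers, strings, lists, dicts) without running code."""
    # ok: dangerous-eval
    return ast.literal_eval(text)


def parse_unsafely(text):
    """Counter-example: evaluates arbitrary expressions supplied by the caller."""
    # ruleid: dangerous-eval
    return eval(text)


print(parse_user_literal("[1, 2, 3]"))
try:
    # A function call is not a literal, so literal_eval refuses it.
    parse_user_literal("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", type(exc).__name__)
```

ast.literal_eval() only covers the case where the input is supposed to be data. If the code genuinely needs to execute dynamic logic, the fix is a design change rather than a drop-in replacement.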
A more advanced example - detecting SQL injection
rules:
  - id: flask-sql-injection
    patterns:
      - pattern: |
          $CURSOR.execute("..." + $INPUT + "...")
      - pattern-not: |
          $CURSOR.execute("..." + "..." + "...")
    message: >
      SQL query built using string concatenation with variable input.
      Use parameterized queries with placeholders to prevent SQL injection.
      Example: cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    severity: ERROR
    languages:
      - python
    metadata:
      cwe:
        - "CWE-89: SQL Injection"
      owasp:
        - "A03:2021 - Injection"
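To see why the rule matters, the following self-contained sketch runs both query styles against an in-memory SQLite database. It uses the standard library sqlite3 module, whose placeholder is ? rather than the %s shown in the rule message (that style belongs to drivers such as psycopg2). The table, data, and hostile input are invented for the demonstration, and the annotations mark the expected rule behavior rather than captured output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE users (id TEXT, name TEXT)")
cursor.execute("INSERT INTO users VALUES (?, ?)", ("1", "alice"))

user_id = "1' OR '1'='1"  # hostile input that no real user ID should match

# ruleid: flask-sql-injection
cursor.execute("SELECT name FROM users WHERE id = '" + user_id + "'")
leaked = cursor.fetchall()  # the injected OR clause returns every row

# ok: flask-sql-injection
cursor.execute("SELECT name FROM users WHERE id = ?", (user_id,))
safe = cursor.fetchall()  # the input is treated as a plain value

print(len(leaked), len(safe))
conn.close()
```

The rule as written requires a string literal on both sides of the variable, so a query that simply ends with the concatenated value would need an additional pattern to be caught.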
Testing your custom rules
Run your custom rules against your codebase:
# Run a single custom rule file
semgrep --config custom-rules.yaml
# Run custom rules alongside registry rules
semgrep --config custom-rules.yaml --config p/default
# Test a rule against a specific file
semgrep --config custom-rules.yaml src/database/queries.py
Organizing custom rules
For teams with multiple custom rules, organize them in a directory:
.semgrep/
  security/
    sql-injection.yaml
    auth-bypass.yaml
    secrets.yaml
  correctness/
    null-checks.yaml
    error-handling.yaml
  style/
    naming-conventions.yaml
Then scan with the entire directory:
semgrep --config .semgrep/
Step 5 - Set up Semgrep in GitHub Actions
Running Semgrep in CI ensures every pull request is scanned automatically. Here is how to set up a production-ready GitHub Actions workflow.
Basic GitHub Actions workflow
Create .github/workflows/semgrep.yml:
name: Semgrep
on:
  pull_request: {}
  push:
    branches:
      - main
jobs:
  semgrep:
    name: Semgrep Scan
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Run Semgrep
        run: semgrep ci
        env:
          SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
This workflow runs Semgrep on every pull request and every push to the main branch. The semgrep ci command is designed for CI environments - it performs diff-aware scanning (only analyzing changed files on PRs), uploads results to Semgrep Cloud if a token is configured, and exits with a non-zero code if blocking findings are detected.
GitHub Actions without Semgrep Cloud
If you do not want to use Semgrep Cloud, you can run the scan using only the CLI with specific rule sets:
name: Semgrep
on:
  pull_request: {}
  push:
    branches:
      - main
jobs:
  semgrep:
    name: Semgrep Scan
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Run Semgrep
        run: semgrep scan --config p/default --config p/security-audit --error
The --error flag causes Semgrep to exit with a non-zero code when findings are detected, which fails the GitHub Actions check and blocks the PR from merging if branch protection is configured.
Adding SARIF upload for GitHub Code Scanning
To see Semgrep findings in GitHub’s Security tab alongside CodeQL results:
name: Semgrep
on:
  pull_request: {}
  push:
    branches:
      - main
jobs:
  semgrep:
    name: Semgrep Scan
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    permissions:
      security-events: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Run Semgrep
        run: semgrep scan --config p/default --sarif --output semgrep-results.sarif
      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: semgrep-results.sarif
        if: always()
This uploads Semgrep results in SARIF format so they appear in the GitHub Security tab under “Code scanning alerts.” The if: always() condition runs the upload step even when an earlier step fails, for example if you add --error to the scan command and Semgrep exits with a non-zero code because it found issues.
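Before wiring up the upload, it can help to inspect a SARIF file locally. The sketch below flattens a SARIF 2.1.0 document into simple tuples. The embedded sample is hypothetical and heavily trimmed; it relies only on the standard SARIF layout of `runs`, `results`, `ruleId`, `level`, and `locations`, and the function name `list_alerts` is illustrative.

```python
import json

# Hypothetical, heavily trimmed SARIF 2.1.0 document for illustration.
SAMPLE_SARIF = """
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "Semgrep"}},
      "results": [
        {
          "ruleId": "rules.sql-injection",
          "level": "error",
          "message": {"text": "String concatenation in SQL query."},
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {"uri": "src/api/users.py"},
                "region": {"startLine": 14}
              }
            }
          ]
        }
      ]
    }
  ]
}
"""


def list_alerts(sarif):
    """Flatten a SARIF document into (rule, level, file, line) tuples."""
    alerts = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            where = result["locations"][0]["physicalLocation"]
            alerts.append(
                (
                    result.get("ruleId", "unknown"),
                    result.get("level", "unspecified"),  # level is optional in SARIF
                    where["artifactLocation"]["uri"],
                    where["region"]["startLine"],
                )
            )
    return alerts


# In practice, load semgrep-results.sarif instead of SAMPLE_SARIF.
alerts = list_alerts(json.loads(SAMPLE_SARIF))
for rule, level, uri, line in alerts:
    print(f"{level}: {uri}:{line} {rule}")
```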
Configuring branch protection
After adding the Semgrep workflow, configure GitHub branch protection to require the scan to pass before merging:
- Go to your repository Settings, then Branches
- Edit the branch protection rule for your main branch
- Enable “Require status checks to pass before merging”
- Search for and add the “Semgrep Scan” check
- Save your changes
Now any PR with blocking Semgrep findings will be prevented from merging until the issues are resolved.
Step 6 - Connect Semgrep Cloud
Semgrep Cloud (the Semgrep AppSec Platform) adds a web dashboard, PR comments, AI-powered triage, and cross-file analysis on top of the open-source CLI. It is free for up to 10 contributors.
Create a Semgrep Cloud account
- Go to semgrep.dev and sign up with your GitHub, GitLab, or email account
- Create an organization that matches your GitHub organization
- Navigate to Settings and generate an API token
- Save this token - you will need it for CI configuration
Add the token to GitHub
- Go to your repository Settings, then Secrets and variables, then Actions
- Click “New repository secret”
- Name it SEMGREP_APP_TOKEN
- Paste the token from Semgrep Cloud
- Click “Add secret”
Configure scanning policies in Semgrep Cloud
Semgrep Cloud lets you manage rule configuration from the web dashboard instead of hardcoding rule sets in your CI file:
- In Semgrep Cloud, go to Policies
- You will see a default policy with recommended rules enabled
- Add or remove rule sets based on your needs
- Set rules to Comment, Block, or Monitor mode
When you use semgrep ci in your workflow (instead of semgrep scan --config ...), Semgrep pulls its configuration from the cloud policy. This means you can change what gets scanned and what severity levels block PRs without modifying your workflow file.
Rule modes in Semgrep Cloud
Semgrep Cloud supports three modes for each rule:
- Block - findings from this rule fail the CI check and block the PR from merging
- Comment - findings are posted as PR comments but do not block merging
- Monitor - findings are tracked in the dashboard but are not surfaced on the PR at all
Most teams start with the majority of rules in Comment mode and only promote rules to Block mode after confirming they produce zero or near-zero false positives in their specific codebase.
Step 7 - Configure PR comments
PR comments are how Semgrep delivers findings to developers in their existing workflow - directly in the pull request where the code was changed.
How PR comments work
When Semgrep Cloud is connected and semgrep ci runs in your CI pipeline, it performs a diff-aware scan that analyzes only the code changed in the pull request. New findings are posted as inline comments on the exact lines of code where the issues were detected. Each comment includes the rule name, severity, a description of the issue, and often a suggested fix.
Install the Semgrep GitHub App
For PR comments to work, you need the Semgrep GitHub App installed:
- In Semgrep Cloud, go to Settings, then Source Code Managers
- Click “Add GitHub” and authorize the Semgrep GitHub App
- Choose which repositories or organizations to grant access to
- Confirm the installation
Once installed, any repository connected to Semgrep Cloud will receive inline PR comments when semgrep ci detects new findings.
Customizing comment behavior
In Semgrep Cloud under Settings, you can control:
- Whether comments include fix suggestions
- Whether to add a summary comment at the top of the PR
- Whether to leave comments only for blocking findings or for all findings
- Whether to automatically resolve comments when the underlying code is fixed
Example PR comment
A typical Semgrep PR comment looks like:
⚠️ semgrep: python.flask.security.injection.tainted-sql-string
Detected user input flowing into a SQL query without sanitization.
This is a SQL injection vulnerability.
Suggested fix: Use parameterized queries.
- cursor.execute("SELECT * FROM users WHERE id = " + request.args["id"])
+ cursor.execute("SELECT * FROM users WHERE id = %s", (request.args["id"],))
🔗 Rule details | 📘 CWE-89 | Triage in Semgrep Cloud
Developers can respond to findings directly in the PR - fixing the code, marking as false positive, or adding a # nosemgrep comment to suppress the finding with justification.
Step 8 - Set up .semgrepignore
The .semgrepignore file tells Semgrep which files and directories to skip during scans. This is essential for reducing noise from test files, generated code, vendored dependencies, and other paths where findings are not actionable.
Create a .semgrepignore file
Create a .semgrepignore file in your repository root:
# Test files - findings in tests are usually not exploitable
tests/
test/
*_test.go
*_test.py
test_*.py
*.test.js
*.test.ts
*.spec.js
*.spec.ts
# Generated code - cannot be fixed in source
generated/
__generated__/
*.generated.ts
*.gen.go
# Vendored dependencies - managed upstream
vendor/
node_modules/
third_party/
# Build artifacts
dist/
build/
out/
.next/
# Documentation and configuration
docs/
*.md
*.rst
# Migrations - often contain raw SQL by design
migrations/
alembic/
.semgrepignore syntax
The syntax follows .gitignore conventions:
- directory/ ignores an entire directory and its contents
- *.ext ignores all files with that extension
- pattern ignores matching files anywhere in the tree
- !pattern negates a previous ignore (force-includes a file); confirm that your Semgrep version supports negation before relying on it
- # starts a comment line
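To build intuition for how these patterns interact, here is a deliberately simplified matcher written with Python's fnmatch module. It is an approximation for illustration only, not Semgrep's implementation: it handles directory patterns, file name globs, negation, and comments, and ignores finer points of gitignore semantics such as patterns containing inner slashes. The name `is_ignored` and the sample patterns are assumptions.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """Approximate gitignore-style matching: the last matching pattern wins."""
    parts = PurePosixPath(path).parts
    ignored = False
    for raw in patterns:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no pattern
        negated = line.startswith("!")
        pattern = line[1:] if negated else line
        if pattern.endswith("/"):
            # Directory pattern: matches when any parent directory has that name.
            matched = any(fnmatch(part, pattern[:-1]) for part in parts[:-1])
        else:
            # File pattern: compared against the file name only.
            matched = fnmatch(parts[-1], pattern)
        if matched:
            ignored = not negated
    return ignored


PATTERNS = ["# vendored code", "vendor/", "*.md", "!README.md"]

for candidate in ["vendor/lib/a.py", "docs/guide.md", "README.md", "src/app.py"]:
    print(candidate, is_ignored(candidate, PATTERNS))
```

The ordering matters: README.md is first caught by *.md and then re-included by the later !README.md line, which is why negations belong after the patterns they override.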
When not to ignore
Be cautious about ignoring too much. Some common patterns to avoid ignoring:
- Do not ignore configuration files. Security misconfigurations in Dockerfiles, Terraform files, and Kubernetes manifests are high-value findings.
- Do not ignore migration files entirely. While raw SQL in migrations is expected, injection vulnerabilities can still appear when migrations accept dynamic input.
- Do not ignore scripts/ or tools/ directories. Internal tooling often has weaker security standards and is a common source of vulnerabilities.
Step 9 - Advanced configuration
Beyond the basics, Semgrep supports several configuration options that help you tune scanning behavior for your codebase.
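Keeping scan options in one place
Semgrep does not read a project-level file of default scan options. A .semgrep.yml file or .semgrep/ directory holds rule definitions (the rules: format from Step 4), not registry rule set names or settings such as timeout. To keep rule sets and limits consistent between local runs and CI, put the full command in a script or Makefile target that every environment calls:
# scripts/semgrep-scan.sh (hypothetical path)
semgrep scan --config p/default --config p/security-audit --config .semgrep/ --timeout 30 --max-memory 5000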
Excluding specific rules
If a specific rule from a registry set produces too many false positives in your codebase, exclude it:
# Exclude specific rules by ID
semgrep --config p/default --exclude-rule "generic.secrets.gitleaks.generic-api-key"
Inline suppressions
Suppress individual findings with inline comments:
# nosemgrep: python.flask.security.injection.tainted-sql-string
cursor.execute("SELECT * FROM config WHERE key = " + safe_internal_key)
The nosemgrep comment accepts a rule ID to suppress only that specific rule on the annotated line. This is preferable to broad suppressions because it documents exactly which check was reviewed and accepted.
For other languages, the comment syntax follows the language convention:
// nosemgrep: javascript.express.security.injection.tainted-sql-string
db.query("SELECT * FROM config WHERE key = " + internalKey);
// nosemgrep: go.lang.security.injection.tainted-sql-string
db.Query("SELECT * FROM config WHERE key = " + safeKey)
Setting exit codes for CI
Control how Semgrep’s exit code maps to CI pass/fail behavior:
# Exit 1 only for ERROR severity findings
semgrep --config p/default --severity ERROR --error
# Exit 1 for WARNING and ERROR findings
# (--severity filters to the listed levels, so pass it once per level)
semgrep --config p/default --severity WARNING --severity ERROR --error
This lets you treat ERROR-level findings as blocking while allowing WARNING-level findings to be advisory.
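Because --severity filters which findings are reported at all, another way to keep WARNING findings visible while blocking only on ERROR is to run the scan with --json and apply the gate in a small script. The following is a sketch under that assumption; `blocking_findings`, `main`, and the `BLOCKING` set are hypothetical names, and the script expects the path to a report such as results.json as its first argument.

```python
import json
import sys

BLOCKING = {"ERROR"}  # assumption: only ERROR findings should fail the build


def blocking_findings(report, blocking=BLOCKING):
    """Return the findings whose severity is in the blocking set."""
    return [
        finding
        for finding in report.get("results", [])
        if finding["extra"]["severity"] in blocking
    ]


def main(path):
    """Print blocking findings from a Semgrep JSON report and return an exit code."""
    with open(path, encoding="utf-8") as handle:
        report = json.load(handle)
    blockers = blocking_findings(report)
    for finding in blockers:
        print(f'{finding["path"]}:{finding["start"]["line"]} {finding["check_id"]}')
    return 1 if blockers else 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1]))
```

In CI this would run as a step after the scan, for example python gate.py results.json, where gate.py is whatever name you give the script.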
Max target size and timeout
For large repositories, configure limits to prevent scans from hanging:
# Skip files larger than 1MB
semgrep --config p/default --max-target-bytes 1000000
# Set per-rule timeout to 30 seconds
semgrep --config p/default --timeout 30
# Set maximum memory usage
semgrep --config p/default --max-memory 5000
Step 10 - Troubleshooting common issues
Here are the most common problems teams encounter when setting up Semgrep and how to resolve them.
“command not found” after installation
If semgrep is not found after installing via pip, your Python scripts directory is not in your PATH:
# Find where pip installed Semgrep
python -m site --user-base
# Add the bin directory to your PATH
export PATH="$HOME/.local/bin:$PATH"
# Make it permanent by adding to your shell profile
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
On macOS with Homebrew, this issue is rare. If it occurs, run brew link semgrep.
Slow scans on large repositories
If scans are taking too long, there are several optimizations:
# Use diff-aware scanning to only check changed files
semgrep ci # Automatically diff-aware in CI
# Exclude large directories
semgrep --config p/default --exclude "vendor" --exclude "node_modules"
# Limit concurrency to reduce memory usage
semgrep --config p/default --jobs 2
# Skip large files
semgrep --config p/default --max-target-bytes 500000
Out of memory errors
Semgrep can consume significant memory on very large files or complex rule sets:
# Limit maximum memory usage (in MB)
semgrep --config p/default --max-memory 4000
# Reduce the number of rules being run
semgrep --config p/default # Instead of p/security-audit which has more rules
# Exclude known large files
# Add to .semgrepignore:
# *.min.js
# *.bundle.js
# package-lock.json
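To decide which files to exclude, it helps to know where the large ones are. This is a minimal standard-library sketch that lists files above a size threshold; the function name `large_files`, the default threshold, and the skipped directory names are assumptions chosen to mirror the examples above.

```python
import os


def large_files(root, max_bytes=500_000, skip_dirs=(".git", "node_modules", "vendor")):
    """Yield (size, path) for files under root that are larger than max_bytes."""
    for directory, subdirs, files in os.walk(root):
        # Prune directories that would be excluded from the scan anyway.
        subdirs[:] = [d for d in subdirs if d not in skip_dirs]
        for name in files:
            path = os.path.join(directory, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # broken symlink or unreadable file
            if size > max_bytes:
                yield size, path


if __name__ == "__main__":
    # Show the 20 largest offenders in the current directory tree.
    for size, path in sorted(large_files("."), reverse=True)[:20]:
        print(f"{size:>12,d}  {path}")
```

Files that show up here and are minified, bundled, or generated are good candidates for .semgrepignore.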
Rules not matching expected code
When a custom rule does not match code that you expect it to match:
# Test with verbose output to see what Semgrep is doing
semgrep --config your-rule.yaml --verbose target-file.py
# Use --debug for detailed matching information
semgrep --config your-rule.yaml --debug target-file.py
# Validate your rule syntax
semgrep --validate --config your-rule.yaml
Common reasons rules fail to match:
- Language mismatch - the languages field in the rule does not include the file’s language
- Whitespace sensitivity - Semgrep normalizes most whitespace, but multiline patterns need to use ... for gaps
- Metavariable scope - a metavariable used across patterns must match the same expression in all occurrences
GitHub Actions workflow not triggering
If the Semgrep workflow does not run on pull requests:
- Verify the workflow file is in .github/workflows/ on the default branch
- Check that the on: trigger includes pull_request
- Ensure the YAML syntax is valid - GitHub will not run a malformed workflow file
- Check the Actions tab for workflow run errors
SEMGREP_APP_TOKEN errors
If semgrep ci fails with authentication errors:
- Verify the secret is named exactly SEMGREP_APP_TOKEN in GitHub Settings
- Ensure the token has not expired in Semgrep Cloud
- Check that the repository is connected to the correct organization in Semgrep Cloud
- Regenerate the token in Semgrep Cloud and update the GitHub secret
Findings in generated or vendored code
If Semgrep is flagging code you cannot change:
- Add the paths to .semgrepignore
- Use inline # nosemgrep comments for individual suppressions
- In Semgrep Cloud, triage findings as “Ignored” with a reason of “Generated code” or “Vendored dependency”
Recommended setup for different team sizes
The right Semgrep configuration depends on your team’s size, security maturity, and available bandwidth for triaging findings.
Solo developers and small teams (1-5 developers)
Use Semgrep OSS with the default rule set. Install Semgrep locally, run semgrep --config p/default before pushing code, and add a basic GitHub Actions workflow with p/default. This catches the highest-confidence security issues with minimal noise. Total setup time is about 15 minutes.
# Local development workflow
semgrep --config p/default --error
Mid-size teams (5-20 developers)
Use Semgrep Cloud. The platform is free for up to 10 contributors, so teams at the upper end of this range should budget for a paid plan. Connect your repositories to Semgrep Cloud, enable PR comments, and start with rules in Comment mode. This gives your team automated feedback on every PR without blocking velocity while you learn which rules are most relevant to your codebase. Promote high-confidence rules to Block mode after a few weeks of triage data. Total setup time is about 30 minutes.
Large teams and enterprises (20+ developers)
Use Semgrep Cloud with a dedicated policy per repository or team. Enable Semgrep Assistant for AI-powered triage to reduce false positive noise. Create custom rules for your internal frameworks and patterns. Integrate SARIF output with GitHub Code Scanning for centralized security visibility. Set up separate policies for different service tiers - stricter rules for payment processing code, lighter rules for internal tools. Total setup time is about 2 hours for the initial configuration plus ongoing refinement.
What to do after setup
Once Semgrep is running in your CI pipeline, the work shifts from configuration to maintenance. Here is a practical sequence for the first 30 days:
Week 1 - Run Semgrep with p/default in Comment mode. Review findings as they come in on PRs. Note any rules that produce frequent false positives in your codebase.
Week 2 - Add paths to .semgrepignore for test files, generated code, and vendored dependencies that are generating noise. Suppress any rule IDs that are consistently false positives for your project.
Week 3 - Promote high-confidence rules to Block mode. Start with rules for critical vulnerabilities like SQL injection, command injection, and hardcoded credentials. Add p/security-audit in Monitor mode to evaluate its findings without surfacing them on PRs.
Week 4 - Write your first custom rule targeting a pattern specific to your codebase - an internal API misuse, a deprecated function that should not be called, or an authentication pattern that must be followed. Share the rule with your team and collect feedback.
This gradual rollout ensures that Semgrep becomes a trusted part of your workflow rather than a noisy tool that developers learn to ignore. The goal is not to enable every rule on day one - it is to build a scanning configuration that your team actually reads and acts on.
Frequently Asked Questions
How do I install Semgrep?
Install Semgrep using pip with 'pip install semgrep', using Homebrew on macOS with 'brew install semgrep', or using Docker with 'docker run semgrep/semgrep'. The pip method works on all operating systems and is the recommended approach. After installation, verify it works by running 'semgrep --version' in your terminal.
Is Semgrep free to use?
Yes, the Semgrep open-source engine is free under the LGPL-2.1 license and includes 2,800+ community rules. The full Semgrep platform - which adds cross-file analysis, SCA, secrets detection, and AI-powered triage - is free for teams of up to 10 contributors. Beyond 10 contributors, the Team plan costs $35 per contributor per month.
What is the difference between Semgrep OSS and Semgrep Cloud?
Semgrep OSS is the open-source CLI engine that performs single-file pattern matching with 2,800+ community rules. Semgrep Cloud (also called Semgrep AppSec Platform) adds cross-file dataflow analysis, 20,000+ Pro rules, AI-powered triage with Semgrep Assistant, a web dashboard for managing findings, and integrations for PR comments. Semgrep Cloud is free for up to 10 contributors.
How do I set up Semgrep in GitHub Actions?
Create a workflow file at .github/workflows/semgrep.yml that triggers on pull_request and push events. Run the Semgrep CLI inside the semgrep/semgrep container image, as shown in Step 5. For Semgrep Cloud integration, add your SEMGREP_APP_TOKEN as a GitHub secret and use 'semgrep ci' as the scan command to get PR comments and dashboard reporting.
How do I write custom Semgrep rules?
Create a YAML file with a rules array. Each rule needs an id, a pattern or patterns block, a message, a severity level (ERROR, WARNING, or INFO), and a languages array. Semgrep's pattern syntax mirrors the target language - you write patterns that look like the code you want to match, using metavariables like $X as placeholders. Test rules with 'semgrep --config your-rule.yaml' against your codebase.
What languages does Semgrep support?
Semgrep supports over 30 programming languages including Python, JavaScript, TypeScript, Java, Go, Ruby, C, C++, C#, Rust, Kotlin, Swift, PHP, Scala, Terraform, Dockerfile, and Kubernetes YAML. The open-source engine supports all languages. The Pro engine adds deeper cross-file analysis for a subset of these languages.
How do I reduce false positives in Semgrep?
Use several strategies: create a .semgrepignore file to exclude test files, generated code, and vendor directories. Use Semgrep Assistant (available in the Cloud platform) for AI-powered triage that automatically filters noise. Write custom rules with pattern-not clauses to exclude known-safe patterns. Start with high-confidence rule sets like p/default rather than broad sets like p/security-audit.
What are the best Semgrep rule sets to start with?
Start with p/default, which includes high-confidence security and correctness rules curated by the Semgrep team. Add p/security-audit for broader security coverage when you are ready for more findings. For specific languages, use targeted sets like p/python, p/javascript, or p/golang. For infrastructure scanning, add p/terraform or p/docker-compose. You can combine multiple rule sets in a single scan.
How do I configure Semgrep to post comments on pull requests?
Connect your repository to Semgrep Cloud at semgrep.dev, install the Semgrep GitHub App, and add your SEMGREP_APP_TOKEN to your CI environment. When Semgrep runs with 'semgrep ci' in your pipeline, it automatically posts inline comments on pull requests for any new findings. Comments include the rule ID, severity, explanation, and remediation guidance.
Can Semgrep scan Docker containers and infrastructure-as-code?
Yes. Semgrep has dedicated rule sets for Dockerfile security (p/dockerfile), Terraform misconfigurations (p/terraform), Kubernetes YAML (p/kubernetes), and Docker Compose files (p/docker-compose). These rules detect issues like running containers as root, exposing unnecessary ports, missing resource limits, and insecure cloud resource configurations.
How fast is Semgrep compared to other SAST tools?
Semgrep is one of the fastest SAST tools available. The median CI scan time is approximately 10 seconds because Semgrep uses diff-aware scanning to analyze only changed files on pull requests. Full repository scans typically complete in under 60 seconds for most codebases. This is significantly faster than tools like SonarQube, Checkmarx, or Veracode, which often take minutes to hours for comparable analysis.
How do I ignore specific Semgrep findings?
Ignore findings at three levels. In code, add a '# nosemgrep: rule-id' comment on the line before the finding. In configuration, add paths or patterns to a .semgrepignore file. In Semgrep Cloud, triage findings as false positives or accepted risks in the dashboard - these triage decisions persist across future scans so the same finding is not reported again.
Does Semgrep work with GitLab CI and other CI platforms?
Yes. Semgrep works with any CI platform that can run a shell command. For GitLab CI, add a Semgrep job to your .gitlab-ci.yml file. Semgrep also has documented integrations for Jenkins, CircleCI, Buildkite, and Azure Pipelines. The 'semgrep ci' command handles authentication, diff-aware scanning, and result reporting regardless of which CI platform you use.