Why Semgrep?
Semgrep is a fast, open-source static analysis tool. It finds bugs and security issues using pattern matching. Think "grep for code structure."
Quick Start
Installation
# macOS
brew install semgrep
# pip (any platform)
pip install semgrep
# Docker
docker pull returntocorp/semgrep
First Scan
# Scan with default security rules
semgrep scan --config auto
# Scan specific directory
semgrep scan --config auto ./src
What "auto" Config Includes
- Security audit rules (OWASP Top 10)
- Secret detection
- Best practices for your languages
Configuration Options
Using Registry Rules
# Security-focused rules
semgrep --config p/security-audit
# Secret detection
semgrep --config p/secrets
# Language-specific
semgrep --config p/javascript
semgrep --config p/typescript
semgrep --config p/react
# Multiple configs
semgrep --config p/security-audit --config p/secrets
Popular Rule Packs
| Pack | Focus |
|---|---|
| p/security-audit | General security |
| p/secrets | Hardcoded secrets |
| p/owasp-top-ten | OWASP vulnerabilities |
| p/xss | Cross-site scripting |
| p/sql-injection | SQL injection |
| p/jwt | JWT security issues |
GitHub Actions Integration
Basic Workflow
# .github/workflows/semgrep.yml
name: Semgrep Security Scan
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/secrets
Block PRs with Findings
name: Semgrep Security Gate
on:
  pull_request:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: p/security-audit p/secrets
          generateSarif: true
      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: semgrep.sarif
        if: always()
Configuring Required Status Check
- Go to repository Settings
- Branches → Add rule for main
- Enable "Require status checks"
- Select "semgrep" job
Custom Rules
Basic Rule Structure
# .semgrep/custom-rules.yml
rules:
  - id: no-console-log
    pattern: console.log(...)
    message: "Remove console.log before production"
    severity: WARNING
    languages: [javascript, typescript]
  - id: no-any-type
    pattern: |
      function $FUNC(...): any { ... }
    message: "Avoid using 'any' return type"
    severity: INFO
    languages: [typescript]
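The two rules above share the same basic shape: an id, a message, a severity, a language list, and one pattern operator. As a rough illustration of that shape, here is a minimal Python sketch that checks an already-parsed rule (a plain dict) for those keys. The helper name `rule_problems` is hypothetical, and the check deliberately ignores other rule styles such as taint-mode rules. Semgrep's own `semgrep --validate --config <file>` remains the authoritative check.

```python
# Keys that every rule in this article carries.
REQUIRED_KEYS = {"id", "message", "severity", "languages"}
# A rule in this article uses one of these pattern operators.
PATTERN_KEYS = {"pattern", "patterns", "pattern-either", "pattern-regex"}


def rule_problems(rule: dict) -> list[str]:
    """Return a list of human-readable problems found in one parsed rule."""
    present = set(rule)
    problems = [f"missing '{key}'" for key in sorted(REQUIRED_KEYS - present)]
    if not PATTERN_KEYS & present:
        problems.append("missing a pattern operator")
    return problems


# The no-console-log rule from above, written as a Python dict.
no_console_log = {
    "id": "no-console-log",
    "pattern": "console.log(...)",
    "message": "Remove console.log before production",
    "severity": "WARNING",
    "languages": ["javascript", "typescript"],
}

print(rule_problems(no_console_log))      # complete rule
print(rule_problems({"id": "half-done"}))  # incomplete rule
```

A check like this is only useful as a quick pre-commit sanity pass; it says nothing about whether the pattern itself parses.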
Pattern Syntax
# Match a specific call (with any arguments)
pattern: console.log(...)
# Match any function call
pattern: $FUNC(...)
# Capture an argument in a metavariable
pattern: fetch($URL)
# Match any sequence of statements
pattern: ...
# Match specific structure
pattern: |
  if ($COND) {
    $BODY
  }
AI Code Security Rules
# .semgrep/ai-security.yml
rules:
  - id: sql-template-literal
    patterns:
      - pattern-either:
          - pattern: $DB.query(`...${$VAR}...`)
          - pattern: $DB.execute(`...${$VAR}...`)
    message: |
      SQL injection risk: Template literal with variable interpolation.
      Use parameterized queries instead.
    severity: ERROR
    languages: [javascript, typescript]
  - id: dangerous-innerhtml
    pattern: 'dangerouslySetInnerHTML={{__html: $VAR}}'
    message: |
      XSS risk: dangerouslySetInnerHTML with variable.
      Sanitize content with DOMPurify or avoid innerHTML.
    severity: WARNING
    languages: [javascript, typescript]
  - id: hardcoded-stripe-key
    pattern-regex: sk_(live|test)_[A-Za-z0-9]{24,}
    message: "Hardcoded Stripe key detected. Use environment variables."
    severity: ERROR
    languages: [generic]
  - id: missing-auth-check
    patterns:
      - pattern: |
          export async function $METHOD(request: Request) {
            ...
            $DB.$QUERY(...)
            ...
          }
      - pattern-not: |
          export async function $METHOD(request: Request) {
            ...
            getServerSession(...)
            ...
          }
    message: "API route accesses database without session check"
    severity: WARNING
    languages: [typescript]
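Regex-based rules such as hardcoded-stripe-key are easy to sanity-check outside Semgrep before committing them. The sketch below runs the same expression through Python's `re` module, on the assumption that a pattern this simple behaves the same way in Semgrep's regex engine. The helper `find_keys` and the sample key are made up for illustration.

```python
import re

# Same expression as the pattern-regex in the hardcoded-stripe-key rule.
STRIPE_KEY = re.compile(r"sk_(live|test)_[A-Za-z0-9]{24,}")


def find_keys(text: str) -> list[str]:
    """Return every substring of text that looks like a Stripe secret key."""
    return [match.group(0) for match in STRIPE_KEY.finditer(text)]


# Build a fake 24-character key body so no real-looking secret sits in the file.
fake_key = "sk_test_" + "a1B2" * 6

sample = (
    f'const stripe = new Stripe("{fake_key}");\n'
    "const key = process.env.STRIPE_SECRET_KEY;\n"
)

print(find_keys(sample))           # the hardcoded key is reported
print(find_keys("sk_live_short"))  # too short to match
```

Checking both a positive and a negative sample helps confirm the length bound (`{24,}`) is doing what the rule intends.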
Using Custom Rules
# Local file
semgrep --config .semgrep/custom-rules.yml
# Combined with registry
semgrep --config p/security-audit --config .semgrep/custom-rules.yml
Ignoring False Positives
Inline Comments
// nosemgrep: sql-injection
const query = `SELECT * FROM logs WHERE level = '${level}'`
// This is safe because 'level' is an enum validated elsewhere
Semgrep Ignore File
# .semgrepignore
# Ignore test files
tests/
**/*.test.ts
**/*.spec.ts
# Ignore specific files
src/legacy/old-code.js
# Ignore generated files
generated/
*.generated.ts
Rule-Specific Ignores
# In custom rule
rules:
  - id: my-rule
    pattern: ...
    paths:
      exclude:
        - tests/*
        - "*.test.ts"
Output Formats
# Default (human-readable)
semgrep --config auto
# JSON (for processing)
semgrep --config auto --json > results.json
# SARIF (for GitHub)
semgrep --config auto --sarif > results.sarif
# JUnit (for CI systems)
semgrep --config auto --junit-xml > results.xml
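The JSON output is the easiest format to post-process. As a minimal sketch, the Python snippet below counts findings per severity. It assumes the report has a top-level `results` list whose entries carry `extra.severity`, which matches Semgrep's JSON output as far as I know; verify against the output of your installed version. The sample report is hand-written, not real scan data, and `summarize` is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize(report: dict) -> Counter:
    """Count findings per severity in a parsed Semgrep JSON report."""
    return Counter(
        finding.get("extra", {}).get("severity", "UNKNOWN")
        for finding in report.get("results", [])
    )


# Hand-written sample shaped like a Semgrep JSON report (assumed layout).
sample_report = json.loads("""
{
  "results": [
    {"check_id": "no-console-log", "path": "src/app.ts",
     "start": {"line": 12}, "extra": {"severity": "WARNING"}},
    {"check_id": "hardcoded-stripe-key", "path": "src/pay.ts",
     "start": {"line": 3}, "extra": {"severity": "ERROR"}},
    {"check_id": "no-console-log", "path": "src/util.ts",
     "start": {"line": 40}, "extra": {"severity": "WARNING"}}
  ],
  "errors": []
}
""")

counts = summarize(sample_report)
for severity, count in sorted(counts.items()):
    print(f"{severity}: {count}")
```

To run it against a real scan, load `results.json` with `json.load` and pass the resulting dict to `summarize` instead of the sample.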
Performance Optimization
Scan Specific Files
# Only changed files (in CI)
git diff --name-only HEAD~1 | xargs semgrep --config auto
Parallel Scanning
# Use multiple cores
semgrep --config auto --jobs 4
Caching
# GitHub Actions with caching
- name: Cache Semgrep
  uses: actions/cache@v3
  with:
    path: ~/.semgrep
    key: semgrep-${{ runner.os }}
Complete Setup Example
Project Structure
my-project/
├── .github/
│   └── workflows/
│       └── semgrep.yml
├── .semgrep/
│   ├── custom-rules.yml
│   └── ai-security.yml
├── .semgrepignore
├── src/
│   └── ...
└── package.json
Workflow File
# .github/workflows/semgrep.yml
name: Security Scan
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Semgrep
        uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/security-audit
            p/secrets
            .semgrep/custom-rules.yml
            .semgrep/ai-security.yml
          # Needed so the upload step below has a semgrep.sarif to send
          generateSarif: true
      - name: Upload Results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: semgrep.sarif
        if: always()
Troubleshooting
"No rules found"
Check config path:
semgrep --config ./correct/path/to/rules.yml
"Pattern parse error"
YAML indentation matters:
# Wrong
pattern: function $F(...) { ... }
# Right
pattern: |
  function $F(...) {
    ...
  }
"Too many findings"
Start with specific rules:
# Instead of auto
semgrep --config p/sql-injection
# Then gradually add
semgrep --config p/sql-injection --config p/secrets
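When a scan still produces more findings than you can triage, it can help to see which rules generate most of the noise before deciding what to drop or tune. The sketch below ranks rule ids by finding count from a parsed JSON report. It assumes each entry in `results` has a `check_id` field, as in Semgrep's JSON output to my knowledge; the helper name `noisiest_rules` and the sample data are invented for illustration.

```python
from collections import Counter


def noisiest_rules(report: dict, top: int = 3) -> list[tuple[str, int]]:
    """Return the `top` rule ids with the most findings, highest first."""
    counts = Counter(
        finding.get("check_id", "unknown")
        for finding in report.get("results", [])
    )
    return counts.most_common(top)


# Hand-written sample: three rules with different finding counts.
sample_report = {
    "results": (
        [{"check_id": "no-console-log"}] * 5
        + [{"check_id": "no-any-type"}] * 2
        + [{"check_id": "hardcoded-stripe-key"}]
    )
}

for rule_id, count in noisiest_rules(sample_report):
    print(f"{count:3d}  {rule_id}")
```

A rule that dominates this list is a candidate for a `paths: exclude` entry or a tighter pattern rather than outright removal.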
The Bottom Line
Semgrep gives you control over security scanning. Start with registry rules, add custom rules for your patterns, and integrate into CI/CD.
Automated scanning catches what manual review misses. Every push, every PR.