Documentation

Ship Safe CLI

18 AI security agents. 80+ attack classes. One command.

npx ship-safe audit .

Installation

Ship Safe requires Node.js 18 or later. No signup or API key is required.

# Run directly (no install)
npx ship-safe audit .

# Or install globally
npm install -g ship-safe
ship-safe audit .

For AI-powered deep analysis, set one of these environment variables (optional):

export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GOOGLE_AI_API_KEY=AIza...

Quick Start

# Full security audit with remediation plan + HTML report
npx ship-safe audit .

# Red team: 18 agents, 80+ attack classes
npx ship-safe red-team .

# Quick secret scan
npx ship-safe scan .

# Security health score (0-100, A-F)
npx ship-safe score .

# Fun emoji security grade
npx ship-safe vibe-check .

# Scan only changed files (fast pre-commit)
npx ship-safe diff --staged

# CI/CD mode with threshold gating
npx ship-safe ci . --threshold 80

Commands

Core Audit

Command	Description
`audit .`	Full audit: secrets + 18 agents + deps + remediation plan + HTML report
`red-team .`	Run 18 agents with 80+ attack classes
`scan .`	Secret scanner (pattern matching + entropy scoring)
`score .`	Security health score (0-100, A-F grade)
`deps .`	Dependency CVE audit with EPSS scores
`diff`	Scan only changed files (supports `--staged`)
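
The `scan` command is described as pattern matching plus entropy scoring. The general technique can be sketched as follows. This is a minimal illustration in Python, not Ship Safe's implementation: the regexes, the 20-character minimum token length, and the 4.0 bits-per-character threshold are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Well-known shape of an AWS access key ID; a real scanner would carry many such patterns.
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Candidate tokens: long runs of base64/hex-like characters (assumed heuristic).
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")
ENTROPY_THRESHOLD = 4.0  # bits per character; hypothetical cutoff


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(line: str) -> bool:
    """Flag a line on a known pattern, or on a long high-entropy token."""
    if AWS_KEY_RE.search(line):
        return True
    return any(shannon_entropy(tok) >= ENTROPY_THRESHOLD for tok in TOKEN_RE.findall(line))
```

Pattern matching catches keys with a recognizable prefix, while the entropy check catches random-looking strings that match no known pattern. A repeated-character string scores 0 bits and is ignored.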

AI-Powered

Command	Description
`hooks install`	Install real-time Claude Code hooks that block secrets before they land on disk
`hooks status`	Check if Claude Code hooks are installed
`hooks remove`	Uninstall Claude Code hooks
`agent .`	AI audit: scan + classify with LLM + auto-fix
`remediate .`	Auto-fix hardcoded secrets (rewrite code + write .env)
`rotate .`	Open provider dashboards to revoke exposed keys
`audit . --deep`	LLM-powered taint analysis for critical/high findings
`audit . --verify`	Probe provider APIs to check if leaked secrets are active

CI/CD & Baseline

Command	Description
`ci .`	Pipeline mode: compact output, exit codes, threshold gating
`baseline .`	Accept current findings, only report regressions
`vibe-check .`	Fun emoji security grade with shareable badge
`benchmark .`	Compare score against industry averages
`watch .`	Continuous monitoring (watch files for changes)
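
The `baseline` command accepts current findings and reports only regressions. One common way to do this is to fingerprint each finding and subtract the accepted set. The sketch below assumes a hypothetical finding shape (`rule`, `file`, `snippet`) and a SHA-256 fingerprint; the actual layout of `.ship-safe/baseline.json` is not documented here.

```python
import hashlib


def fingerprint(finding: dict) -> str:
    """Stable ID built from fields that survive line-number shifts (assumed fields)."""
    key = "|".join([finding["rule"], finding["file"], finding["snippet"].strip()])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_findings(current: list, baseline: list) -> list:
    """Return only the findings whose fingerprint is not in the accepted baseline."""
    accepted = {fingerprint(f) for f in baseline}
    return [f for f in current if fingerprint(f) not in accepted]
```

Leaving the line number out of the fingerprint means an accepted finding stays accepted when unrelated edits move it up or down in the file.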

Infrastructure

Command	Description
`init`	Initialize security configs (.gitignore, headers)
`doctor`	Environment diagnostics
`sbom .`	Generate CycloneDX SBOM (CRA-ready)
`abom .`	Agent Bill of Materials (CycloneDX 1.5)
`policy init`	Create policy-as-code config
`guard`	Block git push if secrets found
`checklist`	Launch-day security checklist
`update-intel`	Update threat intelligence feed

Flags

Flag	Description
`--json`	Structured JSON output
`--sarif`	SARIF format for GitHub Code Scanning
`--csv`	CSV export
`--md`	Markdown report
`--html [file]`	Custom HTML report path
`--pdf [file]`	PDF report (requires Chrome/Chromium)
`--deep`	LLM-powered taint analysis
`--local`	Use local Ollama for deep analysis
`--model <model>`	Specify LLM model
`--provider <name>`	LLM provider: groq, together, mistral, deepseek, xai, perplexity, lmstudio
`--base-url <url>`	Custom OpenAI-compatible base URL (e.g. LM Studio, vLLM)
`--budget <cents>`	Cap LLM spend (default: 50 cents)
`--verify`	Check if leaked secrets are still active
`--baseline`	Only show findings not in baseline
`--compare`	Show score delta vs. last scan
`--timeout <ms>`	Per-agent timeout in milliseconds (default: 30000, i.e. 30 s)
`--no-deps`	Skip dependency audit
`--no-ai`	Skip AI classification
`--no-cache`	Force full rescan

18 Security Agents

All agents run in parallel with per-agent timeouts. Each implements shouldRun(recon) to skip irrelevant projects automatically.

Agent	Category	What It Detects
InjectionTester	Code Vulns	SQL/NoSQL injection, command injection, XSS, path traversal, XXE, ReDoS, prototype pollution
AuthBypassAgent	Auth	JWT flaws (alg:none, weak secrets), cookie security, CSRF, OAuth misconfig, BOLA/IDOR, TLS bypass
SSRFProber	SSRF	User input in fetch/axios, cloud metadata endpoints, internal IPs, redirect following
SupplyChainAudit	Supply Chain	Typosquatting, git/URL deps, wildcard versions, suspicious install scripts, dependency confusion
ConfigAuditor	Config	Docker (root user, :latest), Terraform, Kubernetes, CORS, CSP, Firebase, Nginx
SupabaseRLSAgent	Auth	Row Level Security issues, service_role key exposure, anon key inserts
LLMRedTeam	AI/LLM	OWASP LLM Top 10: prompt injection, excessive agency, system prompt leakage
MCPSecurityAgent	AI/LLM	MCP server misuse, tool poisoning, typosquatting, unvalidated inputs
AgenticSecurityAgent	AI/LLM	OWASP Agentic AI Top 10: agent hijacking, privilege escalation, memory poisoning
RAGSecurityAgent	AI/LLM	RAG pipeline security: context injection, document poisoning, vector DB access
PIIComplianceAgent	Compliance	PII detection: SSNs, credit cards, emails, phone numbers in source code
VibeCodingAgent	Code Vulns	AI-generated code anti-patterns: no validation, empty catches, TODO-auth
ExceptionHandlerAgent	Code Vulns	OWASP A10:2025: empty catches, unhandled rejections, leaked stack traces
AgentConfigScanner	AI/LLM	Prompt injection in .cursorrules, CLAUDE.md, malicious hooks, OpenClaw security
MobileScanner	Mobile	OWASP Mobile Top 10 2024: insecure storage, WebView injection, debug mode
GitHistoryScanner	Secrets	Leaked secrets in git commit history
CICDScanner	CI/CD	OWASP CI/CD Top 10: pipeline poisoning, unpinned actions, secret logging
APIFuzzer	API	Routes without auth, mass assignment, GraphQL introspection, debug endpoints

Post-processors: ScoringEngine (8-category weighted scoring), VerifierAgent (secrets liveness verification), DeepAnalyzer (LLM-powered taint analysis)
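
The execution model described above (parallel agents, a per-agent timeout, and a `shouldRun(recon)` gate) can be sketched in Python with `asyncio`. Ship Safe itself runs on Node.js, so this is only an analogue of the pattern: `Recon`, `Agent`, `SupabaseAgent`, and `run_agents` are hypothetical names, and the fields of the real recon object are not documented here.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Recon:
    """Hypothetical recon summary of the project being scanned."""
    frameworks: frozenset = frozenset()


class Agent:
    name = "agent"

    def should_run(self, recon: Recon) -> bool:
        return True  # run by default

    async def analyze(self, recon: Recon) -> list:
        return []


class SupabaseAgent(Agent):
    name = "supabase"

    def should_run(self, recon: Recon) -> bool:
        return "supabase" in recon.frameworks  # skip projects without Supabase


async def run_agents(agents, recon, timeout_s=30.0):
    """Run the relevant agents concurrently; a timed-out agent yields None."""
    async def guarded(agent):
        try:
            return agent.name, await asyncio.wait_for(agent.analyze(recon), timeout_s)
        except asyncio.TimeoutError:
            return agent.name, None

    selected = [a for a in agents if a.should_run(recon)]
    return dict(await asyncio.gather(*(guarded(a) for a in selected)))
```

Wrapping each agent separately means one slow agent cannot hold up the others' results, and filtering with `should_run` before scheduling means skipped agents cost nothing.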

Scoring System

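The score starts at 100. Each finding deducts points according to its category and severity, scaled by its confidence level (high: 100%, medium: 60%, low: 30%). In the table below, Cap is the largest total deduction a single category can contribute; it equals the category's weight.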

Category	Weight	Critical	High	Medium	Cap
Secrets	15%	-25	-15	-5	-15
Code Vulnerabilities	15%	-20	-10	-3	-15
Dependencies	13%	-20	-10	-5	-13
Auth & Access Control	15%	-20	-10	-3	-15
Configuration	8%	-15	-8	-3	-8
Supply Chain	12%	-15	-8	-3	-12
API Security	10%	-15	-8	-3	-10
AI/LLM Security	12%	-15	-8	-3	-12

Grades: A (90-100), B (75-89), C (60-74), D (40-59), F (0-39)

Exit codes: 0 when the score is 75 or higher (grade A or B), 1 otherwise (grade C, D, or F). Use the exit code in CI to fail builds.
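
The arithmetic can be illustrated with a short Python sketch built from the table, grades, and exit codes above. It rests on assumptions the text does not spell out: deductions within a category are summed and then clamped to that category's cap, the final score is floored at 0, and low-severity findings (which have no column in the table) deduct nothing. The function names are hypothetical.

```python
# category: (critical, high, medium, cap), copied from the table above as positive numbers
TABLE = {
    "Secrets": (25, 15, 5, 15),
    "Code Vulnerabilities": (20, 10, 3, 15),
    "Dependencies": (20, 10, 5, 13),
    "Auth & Access Control": (20, 10, 3, 15),
    "Configuration": (15, 8, 3, 8),
    "Supply Chain": (15, 8, 3, 12),
    "API Security": (15, 8, 3, 10),
    "AI/LLM Security": (15, 8, 3, 12),
}
SEVERITY_COLUMN = {"critical": 0, "high": 1, "medium": 2}
CONFIDENCE = {"high": 1.0, "medium": 0.6, "low": 0.3}
GRADES = [(90, "A"), (75, "B"), (60, "C"), (40, "D"), (0, "F")]


def score(findings):
    """findings: iterable of (category, severity, confidence) tuples."""
    per_category = {}
    for category, severity, confidence in findings:
        column = SEVERITY_COLUMN.get(severity)
        points = TABLE[category][column] if column is not None else 0  # assumption: low = 0
        per_category[category] = per_category.get(category, 0.0) + points * CONFIDENCE[confidence]
    # Assumption: each category's summed deduction is clamped to its cap.
    total = sum(min(deduction, TABLE[cat][3]) for cat, deduction in per_category.items())
    return max(0.0, 100.0 - total)


def grade(value):
    return next(letter for floor, letter in GRADES if value >= floor)


def exit_code(value):
    return 0 if value >= 75 else 1
```

Under these assumptions, a single high-confidence critical secret would deduct 25 points, is clamped to the Secrets cap of 15, and leaves a score of 85 (grade B, exit code 0).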

CI/CD Integration

# Basic CI: fail if score < 75
npx ship-safe ci .

# Strict: fail on any critical finding
npx ship-safe ci . --fail-on critical

# Custom threshold + SARIF for GitHub Security tab
npx ship-safe ci . --threshold 80 --sarif results.sarif

# Post results as PR comment
npx ship-safe ci . --github-pr

# Only report new findings (not in baseline)
npx ship-safe ci . --baseline

Export formats: --json, --sarif, --csv, --md, --html, --pdf
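
A file written with `--sarif results.sarif` can be post-processed by any SARIF 2.1.0 consumer. As a small example, the Python sketch below counts results per level. It relies only on the standard SARIF layout (`runs[].results[].level`, where a missing level is treated as `warning`); whether Ship Safe sets `level` on every result is an assumption, and the function names are hypothetical.

```python
import json
from collections import Counter


def count_levels(sarif: dict) -> dict:
    """Count SARIF results per level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("level", "warning")] += 1
    return dict(counts)


def summarize_sarif(path: str) -> dict:
    """Load a SARIF file from disk and count its results per level."""
    with open(path, encoding="utf-8") as fh:
        return count_levels(json.load(fh))
```

For example, `summarize_sarif("results.sarif")` returns a dictionary such as `{"error": 2, "warning": 5}` that a pipeline step could log or act on.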

GitHub Action

# .github/workflows/security.yml
name: Security Audit
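on: [push, pull_request]

permissions:
  contents: read
  security-events: write  # needed by upload-sarif to publish results to Code Scanning
  pull-requests: write    # assumed to be needed for `comment: true`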

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: asamassekou10/ship-safe@v6
        with:
          path: .
          threshold: 75
          sarif: true
          comment: true

      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: /tmp/ship-safe-results.sarif

Input	Default	Description
`path`	`.`	Path to scan
`threshold`	`75`	Minimum passing score (0-100)
`fail-on`	(unset)	Fail on severity: critical, high, medium, low
`sarif`	`true`	Generate SARIF for Code Scanning
`deep`	`false`	Enable LLM deep analysis
`deps`	`true`	Audit dependency CVEs
`baseline`	`false`	Only report new findings
`comment`	`true`	Post PR comment with results

Multi-LLM Support

AI classification is optional. All core commands work fully offline. Use --provider <name> or set the matching environment variable.

Provider	Env Variable	Flag	Default Model
Anthropic	`ANTHROPIC_API_KEY`	auto-detected	claude-haiku-4-5
OpenAI	`OPENAI_API_KEY`	auto-detected	gpt-4o-mini
Google	`GOOGLE_AI_API_KEY`	auto-detected	gemini-2.0-flash
Ollama	`OLLAMA_HOST`	`--local`	Local models
Groq	`GROQ_API_KEY`	`--provider groq`	llama-3.3-70b-versatile
Together AI	`TOGETHER_API_KEY`	`--provider together`	Llama-3-70b-chat-hf
Mistral	`MISTRAL_API_KEY`	`--provider mistral`	mistral-small-latest
DeepSeek	`DEEPSEEK_API_KEY`	`--provider deepseek`	deepseek-chat
xAI (Grok)	`XAI_API_KEY`	`--provider xai`	grok-beta
Perplexity	`PERPLEXITY_API_KEY`	`--provider perplexity`	llama-3.1-sonar-small-128k-online
LM Studio	none	`--provider lmstudio`	Local server
Custom	any	`--base-url <url> --model <model>`	Any OpenAI-compatible

Incremental Scanning

Ship Safe caches file hashes and findings in .ship-safe/context.json. Only changed files are re-scanned on subsequent runs.

~40% faster on repeated scans
Auto-invalidation after 24 hours or when ship-safe updates
--no-cache to force a full rescan

LLM responses are cached in .ship-safe/llm-cache.json with a 7-day TTL to reduce API costs.
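
The invalidation rules above can be sketched in Python as a hash comparison with two escape hatches: age and tool version. The cache layout used here (`version`, `createdAt`, `hashes`) and the choice of SHA-256 are assumptions for illustration; the real schema of `.ship-safe/context.json` may differ.

```python
import hashlib
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # the 24-hour auto-invalidation window


def hash_file(path: Path) -> str:
    """SHA-256 of a file's contents (the hash algorithm is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_scan(current_hashes, cache, tool_version, now=None):
    """Return the paths that need a rescan, given {path: hash} for the working tree."""
    now = time.time() if now is None else now
    expired = now - cache.get("createdAt", 0) > MAX_AGE_SECONDS
    upgraded = cache.get("version") != tool_version
    if expired or upgraded:
        return sorted(current_hashes)  # cache invalid: rescan everything
    known = cache.get("hashes", {})
    return sorted(p for p, h in current_hashes.items() if known.get(p) != h)
```

In this model, `--no-cache` corresponds to passing an empty cache, which forces every file back into the scan list.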

Suppressing Findings

Inline: Add # ship-safe-ignore on any line:

password = get_password()  # ship-safe-ignore

File-level: Create .ship-safeignore (gitignore syntax):

# Exclude test fixtures
tests/fixtures/
*.test.js

# Exclude documentation
docs/
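
Both suppression mechanisms amount to a filter applied before findings are reported. The Python sketch below is a deliberately simplified model: it treats a trailing `/` as "anything under this directory" and everything else as a glob, which covers the example patterns above but not full gitignore semantics (negation, anchoring, `**`). The function names are hypothetical.

```python
from fnmatch import fnmatchcase

IGNORE_MARKER = "ship-safe-ignore"


def is_line_suppressed(line: str) -> bool:
    """Inline suppression: the marker appears anywhere on the flagged line."""
    return IGNORE_MARKER in line


def is_path_ignored(path: str, patterns: list) -> bool:
    """Simplified .ship-safeignore matching for forward-slash relative paths."""
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # blank lines and comments
        if pattern.endswith("/"):
            if f"/{pattern}" in f"/{path}":  # directory at any depth
                return True
        elif fnmatchcase(path, pattern) or fnmatchcase(path.rsplit("/", 1)[-1], pattern):
            return True
    return False
```

With the example file above, `tests/fixtures/keys.py`, `src/app.test.js`, and `docs/intro.md` would be skipped, while `src/app.js` would still be scanned.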

Policy-as-Code

npx ship-safe policy init

Creates .ship-safe.policy.json:

{
  "minimumScore": 70,
  "failOn": "critical",
  "requiredScans": ["secrets", "injection", "deps", "auth"],
  "ignoreRules": [],
  "maxAge": { "criticalCVE": "7d", "highCVE": "30d", "mediumCVE": "90d" }
}
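
How such a policy might be enforced can be sketched in Python. The semantics below are assumptions, not documented behavior: `failOn` is read as "fail on this severity or anything worse", `ignoreRules` is read as a list of rule IDs to skip, and `maxAge` is not evaluated. The finding shape (`rule`, `severity`) and the `evaluate_policy` name are hypothetical.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def evaluate_policy(policy: dict, score: float, findings: list, scans_run: list) -> list:
    """Return a list of violation messages; an empty list means the policy passes."""
    violations = []
    if score < policy["minimumScore"]:
        violations.append(f"score {score} is below minimumScore {policy['minimumScore']}")

    threshold = SEVERITY_ORDER.index(policy["failOn"])
    ignored = set(policy.get("ignoreRules", []))
    for finding in findings:
        if finding["rule"] in ignored:
            continue
        if SEVERITY_ORDER.index(finding["severity"]) >= threshold:
            violations.append(f"{finding['severity']} finding: {finding['rule']}")

    for scan in sorted(set(policy.get("requiredScans", [])) - set(scans_run)):
        violations.append(f"required scan not run: {scan}")
    return violations
```

Returning messages rather than a boolean lets a CI step print every reason for a failure in a single run.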

OWASP Coverage

Standard	Coverage
OWASP Top 10 Web 2025	A01-A10: Broken Access Control, Cryptographic Failures, Injection, Insecure Design, Security Misconfiguration, Vulnerable Components, Auth Failures, Data Integrity, Logging Failures, SSRF
OWASP Mobile 2024	M1-M10: Improper Credentials, Supply Chain, Insecure Auth, Insufficient Validation, Insecure Communication, Privacy, Binary Protections, Misconfiguration, Insecure Storage, Insufficient Crypto
OWASP LLM 2025	LLM01-LLM10: Prompt Injection, Sensitive Disclosure, Supply Chain, Data Poisoning, Output Handling, Excessive Agency, System Prompt Leakage, Vector Weaknesses, Misinformation, Unbounded Consumption
OWASP CI/CD Top 10	CICD-SEC-1 to 10: Flow Control, Identity Management, Dependency Chain, Pipeline Poisoning, PBAC, Credential Hygiene, System Config, Ungoverned Usage, Artifact Integrity, Logging
OWASP Agentic AI	ASI01-ASI10: Agent Hijacking, Tool Misuse, Privilege Escalation, Unsafe Execution, Memory Poisoning, Identity Spoofing, Excessive Autonomy, Logging Gaps, Supply Chain, Cascading Hallucination

Compliance mapping to SOC 2 Type II, ISO 27001:2022, and NIST AI RMF is included in audit reports.

OpenClaw Security

# Focused OpenClaw security scan
npx ship-safe openclaw .

# Auto-harden configs (0.0.0.0->127.0.0.1, add auth, ws->wss)
npx ship-safe openclaw . --fix

# Red team: simulate ClawJacked, prompt injection, data exfil
npx ship-safe openclaw . --red-team

# CI preflight
npx ship-safe openclaw . --preflight

# Scan a skill before installing
npx ship-safe scan-skill https://clawhub.io/skills/some-skill

# Generate Agent Bill of Materials
npx ship-safe abom .

# Update threat intelligence (ClawHavoc IOCs, malicious skills)
npx ship-safe update-intel

OpenClaw GitHub Action

Drop-in CI action that blocks PRs introducing agent config vulnerabilities:

# .github/workflows/openclaw-security.yml
name: OpenClaw Security Check
on: [pull_request]
permissions:
  contents: read
jobs:
  openclaw:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: asamassekou10/ship-safe/.github/actions/openclaw-check@main
        with:
          fail-on-critical: 'true'

Input	Default	Description
`path`	`.`	Path to scan
`fail-on-critical`	`true`	Fail the check if critical findings are found
`node-version`	`20`	Node.js version to use

Scans openclaw.json, .cursorrules, CLAUDE.md, Claude Code hooks, and MCP configs. Checks against the bundled threat intelligence database for known ClawHavoc IOCs.

Claude Code Plugin

claude plugin add github:asamassekou10/ship-safe

Command	Description
`/ship-safe`	Full security audit with remediation plan
`/ship-safe-scan`	Quick scan for leaked secrets
`/ship-safe-score`	Security health score (0-100)
`/ship-safe-deep`	LLM-powered deep taint analysis
`/ship-safe-ci`	CI/CD pipeline setup guide

Configuration Files

File	Purpose
`.ship-safeignore`	Exclude paths from scanning (gitignore syntax)
`.ship-safe.policy.json`	Policy-as-code: minimum score, fail-on severity, required scans
`.ship-safe/context.json`	Incremental scan cache (auto-generated)
`.ship-safe/history.json`	Score history for trend tracking
`.ship-safe/baseline.json`	Accepted findings baseline
`.ship-safe/llm-cache.json`	LLM response cache (7-day TTL)

The .ship-safe/ directory is automatically excluded from scans and should be added to .gitignore.

Supply Chain Hardening

Ship Safe practices what it preaches. Our own supply chain is hardened against the 2026 Trivy/CanisterWorm attack chain:

All GitHub Actions pinned to full commit SHAs
CI token scoped to contents: read
npm ci --ignore-scripts in all pipelines
OIDC trusted publishing with provenance attestation
CODEOWNERS on supply-chain-critical files
Strict files allowlist in package.json
Self-scanning with ship-safe in CI
5 direct dependencies (minimal attack surface)