Gitlyzer fetches, classifies, and visualises every commit and pull request across your entire GitHub organisation, without sending a single byte of code to a third party. Runs on your infrastructure, backed by PostgreSQL.
Incremental, idempotent data collection meets an interactive web dashboard: no manual spreadsheets, no third-party SaaS.
Every commit is automatically classified into 30+ semantic types using pattern matching on message content, diff stats, and branch context. Track noise vs. meaningful change over time.
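As a rough illustration of pattern-based classification, the sketch below checks an ordered list of regular expressions against the first line of the commit message and falls back to branch name and diff size. The rule set, type names, and the classify_commit function are hypothetical examples, not Gitlyzer's actual 30+ types or internals.

```python
import re
from typing import Optional

# Hypothetical rules, checked in order; the first match wins.
RULES = [
    ("merge", re.compile(r"^merge\b", re.IGNORECASE)),
    ("revert", re.compile(r"^revert\b", re.IGNORECASE)),
    ("fix", re.compile(r"\b(fix(es|ed)?|bug|hotfix)\b", re.IGNORECASE)),
    ("docs", re.compile(r"\b(docs?|readme)\b", re.IGNORECASE)),
    ("feature", re.compile(r"\b(feat(ure)?|add(s|ed)?|implement(s|ed)?)\b", re.IGNORECASE)),
]


def classify_commit(message: str, lines_changed: int = 0,
                    branch: Optional[str] = None) -> str:
    """Return a semantic type for a commit (illustrative only)."""
    stripped = message.strip()
    first_line = stripped.splitlines()[0] if stripped else ""

    # 1. Message content: first matching pattern decides the type.
    for commit_type, pattern in RULES:
        if pattern.search(first_line):
            return commit_type

    # 2. Branch context: assumed naming convention "hotfix/...".
    if branch and branch.startswith("hotfix/"):
        return "fix"

    # 3. Diff stats: a commit that changes no lines is treated as noise.
    if lines_changed == 0:
        return "empty"
    return "other"
```

Because the rules are plain data, they could be stored in a table and edited at runtime rather than hard-coded.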
Measure average and median review time, PR size distribution, merge rate, and which authors get reviewed fastest. Identify bottlenecks before they block delivery. Track PRs merged without any review.
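A minimal sketch of how such review metrics can be computed, assuming "review time" means the gap between a PR being opened and its first review. The dictionary keys (created_at, first_review_at, merged) are illustrative names, not Gitlyzer's schema.

```python
from datetime import datetime
from statistics import mean, median


def review_hours(prs):
    """Hours from PR creation to first review, skipping unreviewed PRs."""
    hours = []
    for pr in prs:
        if pr["first_review_at"] is None:
            continue
        delta = pr["first_review_at"] - pr["created_at"]
        hours.append(delta.total_seconds() / 3600)
    return hours


def summarize(prs):
    """Average/median review time plus the count of PRs merged unreviewed."""
    hours = review_hours(prs)
    merged_unreviewed = sum(
        1 for pr in prs if pr["merged"] and pr["first_review_at"] is None
    )
    return {
        "avg_hours": mean(hours) if hours else None,
        "median_hours": median(hours) if hours else None,
        "merged_without_review": merged_unreviewed,
    }
```

Reporting the median alongside the average matters here because a single PR that waits weeks for review can dominate the mean.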
Visualise commit velocity, PR throughput, and activity patterns at daily, weekly, or monthly granularity. Interactive Chart.js-powered charts are built into every dashboard view.
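To make the granularity options concrete, here is a small sketch of bucketing commit dates by day, week, or month before charting. The function names are hypothetical, and starting weeks on Monday is an assumption.

```python
from collections import Counter
from datetime import date, timedelta


def bucket_start(day: date, granularity: str) -> date:
    """Map a date to the first day of its daily, weekly, or monthly bucket."""
    if granularity == "daily":
        return day
    if granularity == "weekly":
        # Assumption: weeks start on Monday (weekday() == 0).
        return day - timedelta(days=day.weekday())
    if granularity == "monthly":
        return day.replace(day=1)
    raise ValueError(f"unknown granularity: {granularity!r}")


def commit_velocity(commit_dates, granularity: str):
    """Count commits per bucket, returned in chronological order."""
    counts = Counter(bucket_start(d, granularity) for d in commit_dates)
    return dict(sorted(counts.items()))
```

The resulting bucket-to-count mapping is the kind of label/value series a line or bar chart expects.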
Drill into individual contributors or entire teams. Each profile shows commits by type, PR stats, top repos, and activity periods across configurable time windows: all time, last year, or last 3 months.
Clone repositories locally for faster, rate-limit-free data fetching. Smart size-based selection automatically falls back to the GitHub API for oversized repos. Configurable clone directory and cleanup policy.
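The size-based selection can be pictured as a simple threshold check. This sketch assumes repository sizes in kilobytes (the unit used by the size field of the GitHub REST API); the 500,000 KB default and the function names are made up for illustration.

```python
def choose_fetch_mode(repo_size_kb: int, max_clone_size_kb: int = 500_000) -> str:
    """Return "clone" for repos small enough to clone locally, else "api"."""
    return "clone" if repo_size_kb <= max_clone_size_kb else "api"


def plan_fetch(repo_sizes_kb: dict, max_clone_size_kb: int = 500_000) -> dict:
    """Decide a fetch mode for every repo in a {name: size_kb} mapping."""
    return {
        name: choose_fetch_mode(size_kb, max_clone_size_kb)
        for name, size_kb in repo_sizes_kb.items()
    }
```

For example, plan_fetch({"web": 12_000, "monorepo": 4_200_000}) would clone "web" locally and read "monorepo" through the API under the default threshold.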
Rank contributors and repositories by gross lines, net lines, noise ratio, or PR count. Full-text search, column sorting, and pagination handle orgs with hundreds of engineers.
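A sketch of the ranking metrics under assumed definitions: gross lines as additions plus deletions, net lines as additions minus deletions, and noise ratio as the share of commits classified as noise. The ContributorStats class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContributorStats:
    login: str
    additions: int
    deletions: int
    commits: int
    noise_commits: int
    pr_count: int

    @property
    def gross_lines(self) -> int:
        # Assumed definition: every line touched, added or removed.
        return self.additions + self.deletions

    @property
    def net_lines(self) -> int:
        # Assumed definition: growth of the codebase.
        return self.additions - self.deletions

    @property
    def noise_ratio(self) -> float:
        # Assumed definition: fraction of commits classified as noise.
        return self.noise_commits / self.commits if self.commits else 0.0


def rank(stats, key: str, descending: bool = True):
    """Sort contributors by one of the metrics above or by pr_count."""
    return sorted(stats, key=lambda s: getattr(s, key), reverse=descending)


def paginate(items, page: int, per_page: int = 25):
    """Return one page of results; pages are numbered from 1."""
    start = (page - 1) * per_page
    return items[start:start + per_page]
```

Comparing gross against net lines helps separate contributors who churn a lot of code from those who mostly add to it.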
A dedicated admin interface with session-based Flask-Login authentication, bcrypt password hashing, repository management via AJAX toggles, and a data sync dashboard for triggering fetch jobs.
Re-run any fetch job safely: already-processed records are skipped automatically. Supports --since and --until flags for targeted backfills and daily cron automation.
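One common way to get this kind of idempotency is a unique key plus ON CONFLICT DO NOTHING. The sketch below uses an in-memory SQLite database as a stand-in so it runs anywhere; the same clause exists in PostgreSQL, though psycopg2 uses %s placeholders instead of ?. The commits table and its columns are illustrative, not Gitlyzer's schema, and this is not necessarily how Gitlyzer implements the skip.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical commits table keyed by commit SHA."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS commits (sha TEXT PRIMARY KEY, message TEXT)"
    )


def store_commits(conn: sqlite3.Connection, commits) -> int:
    """Insert (sha, message) pairs, skipping SHAs that are already stored.

    Returns the number of rows actually added, so a re-run reports 0.
    """
    def row_count() -> int:
        return conn.execute("SELECT COUNT(*) FROM commits").fetchone()[0]

    before = row_count()
    conn.executemany(
        "INSERT INTO commits (sha, message) VALUES (?, ?) "
        "ON CONFLICT(sha) DO NOTHING",
        commits,
    )
    conn.commit()
    return row_count() - before
```

Running store_commits twice with the same batch leaves the table unchanged the second time, which is what makes a repeated or overlapping backfill safe.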
11 tables with foreign keys, ON DELETE CASCADE, and proper indexes covering commits, PRs, teams, users, contributions, and classification results.
Real views from a live deployment: every chart and table pulls directly from your GitHub data. Click any screenshot to enlarge.
Gitlyzer is modular by design. Each job runs independently, so you fetch only what you need, as often as you need it.
Clone the repo, run pip install -e ., copy config/config-example.yaml to config/config.yaml, and add your GitHub token and DB connection.
Create a PostgreSQL database and run psql -d gitlyzer-v2 -f db/gitlyzer_create_tables.sql. 11 tables, all foreign-keyed, with proper cascade deletes.
Run jobs from jobs/ to sync repos, commits, PRs, users, and teams. Use --since / --until or automate with cron.
Run python run_webapp.py and open localhost:8082. Filter by date, team, or user; every chart is interactive and drill-down ready.
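For the cron automation mentioned in the sync step, a daily job needs a rolling date window. This sketch builds the argument list for yesterday-to-today using the --since and --until flags and the YYYY-MM-DD format shown in the setup commands. Whether --until is inclusive is not stated here, so treat the window boundaries as an assumption; daily_backfill_command is a hypothetical helper.

```python
import sys
from datetime import date, timedelta


def daily_backfill_command(today: date,
                           script: str = "jobs/fetch_and_store_commits.py"):
    """Build the command for fetching the previous day's commits."""
    yesterday = today - timedelta(days=1)
    return [
        sys.executable, script,
        "--since", yesterday.isoformat(),
        "--until", today.isoformat(),
    ]
```

A cron wrapper could pass the returned list to subprocess.run(..., check=True). Because re-runs skip already-processed records, overlapping windows are harmless.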
A full-featured admin interface protected by Flask-Login session authentication and bcrypt password hashing: no plain-text credentials, ever.
Enable or disable repositories via AJAX toggles. Search, sort, and control which repos are actively tracked; no script restarts needed.
Create and manage admin accounts with bcrypt-hashed passwords. The last remaining admin cannot be deleted, which prevents lockout.
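The lockout guard boils down to one check before deletion. A minimal sketch, using an in-memory set of usernames in place of the real admin table; LastAdminError and delete_admin are hypothetical names.

```python
class LastAdminError(Exception):
    """Raised when a deletion would leave the system with no admin."""


def delete_admin(admins: set, username: str) -> None:
    """Remove an admin account unless it is the only one left."""
    if username not in admins:
        raise KeyError(username)
    if len(admins) <= 1:
        raise LastAdminError("cannot delete the last remaining admin")
    admins.remove(username)
```

In a database-backed version, the count check and the delete would need to run in the same transaction so that two concurrent deletions cannot both pass the check.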
Trigger fetch jobs directly from the UI: pull the latest commits, PRs, users, or teams on demand without touching the command line.
Customise commit classification rules through the admin UI. Add, edit, or disable patterns without touching source code or restarting the server.
Follow these steps to get Gitlyzer collecting your GitHub data on your own infrastructure.
Use an editable install so all project imports resolve correctly from the project root.
git clone https://github.com/satsha7/gitlyzer
cd gitlyzer
pip install -e .
Copy the example config and fill in your GitHub token and PostgreSQL connection details.
cp config/config-example.yaml config/config.yaml
# Set GitHub token, DB host/name/user/password
export FLASK_SECRET_KEY="your-strong-random-key"
createdb gitlyzer-v2
psql -d gitlyzer-v2 -f db/gitlyzer_create_tables.sql
python jobs/fetch_and_store_all_repos.py
python jobs/fetch_and_store_commits.py \
  --since 2025-01-01 --until 2025-12-31
python jobs/fetch_and_store_pull_requests.py
python jobs/fetch_and_store_users.py
python jobs/fetch_and_store_teams.py
python run_webapp.py
# Open http://localhost:8082
pip install flask-login bcrypt flask-wtf
psql -d gitlyzer-v2 \
  -f db/migrations/add_admin_fields.sql
PYTHONPATH=. python utils/create_admin_user.py
# Admin: http://localhost:8082/admin/login
requests: GitHub API
psycopg2-binary: PostgreSQL
GitPython: Local clone mode
Flask: Web dashboard
pyyaml: Configuration
Dashboard: http://localhost:8082
No vendor lock-in. Every component is open source and runs entirely on your own infrastructure.
Core data fetching, incremental classification, and analytics layer
11-table relational schema with foreign keys, indexes, and cascade deletes
Lightweight web dashboard with Jinja2 templates and Flask-Login auth
Interactive, responsive charts for all trend and analytics views
Download Gitlyzer, connect your GitHub token, and have your first analytics dashboard running in under 30 minutes, entirely on your own infrastructure.
MIT License • Python 3.9+ • PostgreSQL 13+ • No data leaves your network