Data Tool Momentum Index — Open Dataset
A weekly snapshot of the open source data tooling ecosystem — GitHub stars and 4-week growth, PyPI and npm downloads, and how many active job listings ask for each tool. One tidy row per tool, with a single momentum score that blends all four signals. Free to download, analyse and republish — just credit the source.
What's inside
Each download contains the most recent snapshot, one row per tool, ranked by momentum. The CSV is plain tabular data; the JSON wraps the same rows with metadata (version, licence, generation timestamp). Columns:
| Column | Type | Description |
|---|---|---|
| snapshot_date | date | UTC date the latest snapshot was taken (YYYY-MM-DD). |
| tool | string | Tool name (e.g. dbt, Apache Airflow, DuckDB). |
| slug | string | Stable identifier used across Datamata surfaces. |
| category | string | Tooling category: transform, orchestrator, processing, streaming, ingestion, bi, ml, ai, mlops, warehouse or quality. |
| stars | integer | GitHub stargazers on the snapshot date. |
| forks | integer | GitHub forks on the snapshot date. |
| open_issues | integer | Open GitHub issues on the snapshot date. |
| pypi_downloads_month | integer | PyPI downloads in the trailing month. Blank for tools not on PyPI. |
| npm_downloads_month | integer | npm downloads in the trailing month. Blank for tools not on npm. |
| job_listing_count | integer | Active job listings mentioning the tool. Blank for tools not in the skill taxonomy. |
| star_growth_4w_pct | number | Change in GitHub stars over the trailing 4 weeks, as a percentage. Blank until 4 weeks of history exist. |
| momentum_score | integer | 0-100 percentile composite of stars, job demand, downloads and 4-week star growth. |
| github | string | GitHub repository (owner/repo). Blank if not tracked on GitHub. |
| website | string | Project homepage. |
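
The column table above can be turned into a small typed loader. The following is a minimal sketch, assuming the CSV header uses the column names exactly as listed and that blank values arrive as empty strings; `parse_row`, `load_snapshot` and the sample rows are hypothetical and made up for illustration, not real index values.

```python
import csv
import io

# Columns the schema lists as integer or number. A blank means "not available".
INT_COLUMNS = {
    "stars", "forks", "open_issues", "pypi_downloads_month",
    "npm_downloads_month", "job_listing_count", "momentum_score",
}
FLOAT_COLUMNS = {"star_growth_4w_pct"}


def parse_row(row):
    """Convert one CSV row of strings into typed values, mapping blanks to None."""
    typed = {}
    for key, value in row.items():
        value = (value or "").strip()
        if value == "":
            typed[key] = None
        elif key in INT_COLUMNS:
            typed[key] = int(value)
        elif key in FLOAT_COLUMNS:
            typed[key] = float(value)
        else:
            typed[key] = value
    return typed


def load_snapshot(text):
    """Parse CSV text into a list of typed row dictionaries."""
    return [parse_row(row) for row in csv.DictReader(io.StringIO(text))]


# Made-up sample rows for illustration only.
SAMPLE_CSV = "\n".join([
    "snapshot_date,tool,slug,category,stars,forks,open_issues,"
    "pypi_downloads_month,npm_downloads_month,job_listing_count,"
    "star_growth_4w_pct,momentum_score,github,website",
    "2024-01-01,Example Tool A,example-tool-a,orchestrator,1200,300,45,"
    "50000,,120,2.5,80,example-org/example-tool-a,https://example.com/a",
    "2024-01-01,Example Tool B,example-tool-b,bi,800,90,12,"
    ",7000,,,40,,https://example.com/b",
])

if __name__ == "__main__":
    for row in load_snapshot(SAMPLE_CSV):
        print(row["tool"], row["stars"], row["star_growth_4w_pct"])
```

Mapping blanks to `None` rather than `0` keeps "not tracked" distinct from "zero downloads", which matters when averaging or ranking a column.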
How often it updates
The pipeline takes a fresh snapshot each week from the GitHub REST API, pypistats.org, the npm registry and our active job listings, then recomputes the momentum score. The momentum score is a percentile composite — 35% job demand, 30% GitHub stars, 20% downloads and 15% four-week star growth. For the full collection method, sources and known limitations, read the data methodology. To explore the same data interactively, see the live Data Tool Momentum Index.
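To make the weighting concrete, here is a rough sketch of a weighted percentile composite using the stated 35/30/20/15 split. Only the weights come from this page. The percentile-rank definition, the treatment of blanks as rank 0, the summing of PyPI and npm downloads, and the rounding are all assumptions for illustration, so the published scores may be computed differently; see the data methodology for the actual method.

```python
import bisect

# Weights as stated on this page.
WEIGHTS = {"job_demand": 0.35, "stars": 0.30, "downloads": 0.20, "star_growth": 0.15}


def percentile_ranks(values):
    """Rank each value 0-100 among the non-missing values.

    Assumption: a missing value (None) gets rank 0.0.
    """
    present = sorted(v for v in values if v is not None)
    ranks = []
    for v in values:
        if v is None or not present:
            ranks.append(0.0)
        else:
            at_or_below = bisect.bisect_right(present, v)
            ranks.append(100.0 * at_or_below / len(present))
    return ranks


def combined_downloads(tool):
    """Assumption: the downloads signal is PyPI plus npm, None if both are blank."""
    parts = [tool.get("pypi_downloads_month"), tool.get("npm_downloads_month")]
    parts = [p for p in parts if p is not None]
    return sum(parts) if parts else None


def momentum_scores(tools):
    """Return one rounded 0-100 composite score per tool, in input order."""
    signals = {
        "job_demand": [t.get("job_listing_count") for t in tools],
        "stars": [t.get("stars") for t in tools],
        "downloads": [combined_downloads(t) for t in tools],
        "star_growth": [t.get("star_growth_4w_pct") for t in tools],
    }
    ranks = {name: percentile_ranks(vals) for name, vals in signals.items()}
    return [
        round(sum(WEIGHTS[name] * ranks[name][i] for name in WEIGHTS))
        for i in range(len(tools))
    ]


if __name__ == "__main__":
    # Made-up inputs for illustration only.
    tools = [
        {"stars": 900, "job_listing_count": 40, "pypi_downloads_month": 5000,
         "npm_downloads_month": None, "star_growth_4w_pct": 3.0},
        {"stars": 100, "job_listing_count": 5, "pypi_downloads_month": None,
         "npm_downloads_month": 200, "star_growth_4w_pct": 0.5},
    ]
    print(momentum_scores(tools))
```

Because each signal is ranked before weighting, one tool with an outsized star count cannot dominate the composite the way it would in a weighted sum of raw values.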
Query it with the API
The same snapshot is available as a read-only JSON endpoint with open CORS, so you can fetch it from the browser or a script. Pass a category to narrow it to one tooling group.
# Every tool
curl https://www.datamatastudios.com/api/datasets/data-tool-momentum
# Just orchestrators
curl "https://www.datamatastudios.com/api/datasets/data-tool-momentum?category=orchestrator"

Responses carry the same columns as the download, plus the licence, attribution and the two bulk download URLs. Please keep the attribution link when you republish.
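
The same requests can be made from a script. This is a minimal sketch using only the Python standard library; the endpoint and the `category` parameter come from the curl examples above, while `build_url` and `fetch_snapshot` are hypothetical helper names. The exact shape of the JSON response is not documented here, so the sketch returns the parsed payload without assuming any key names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://www.datamatastudios.com/api/datasets/data-tool-momentum"


def build_url(category=None):
    """Return the endpoint URL, optionally narrowed to one tooling category."""
    if category is None:
        return BASE_URL
    return BASE_URL + "?" + urllib.parse.urlencode({"category": category})


def fetch_snapshot(category=None, timeout=10):
    """Fetch and parse the JSON payload. Raises on network or HTTP errors."""
    with urllib.request.urlopen(build_url(category), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = fetch_snapshot("orchestrator")
    # Inspect the top-level structure before relying on specific keys.
    print(type(payload).__name__)
```

Building the query string with `urlencode` avoids the shell-quoting issues that a bare `?` in a URL can cause, and it escapes category values safely if the list of categories ever grows.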
Licence & attribution
The Data Tool Momentum Index is released under the Creative Commons Attribution 4.0 (CC BY 4.0) licence. You can share and adapt it for any purpose, including commercially, as long as you credit Datamata Studios and link back to this page.
Suggested citation
Datamata Studios. "Datamata Data Tool Momentum Index." —. https://www.datamatastudios.com/datasets/data-tool-momentum. Licensed under CC BY 4.0.
More open datasets
Skill Demand Index
Daily share of active tech job listings mentioning each skill across data, engineering, product, DevOps, security and AI. Free, CC BY 4.0.