Datamata Studios

Free · CC BY 4.0

Data Tool Momentum Index — Open Dataset

A weekly snapshot of the open source data tooling ecosystem — GitHub stars and 4-week growth, PyPI and npm downloads, and how many active job listings ask for each tool. One tidy row per tool, with a single momentum score that blends all four signals. Free to download, analyse and republish — just credit the source.

What's inside

Each download contains the most recent snapshot, one row per tool, ranked by momentum. The CSV is plain tabular data; the JSON wraps the same rows with metadata (version, licence, generation timestamp). Columns:

Column                Type     Description
snapshot_date         date     UTC date the latest snapshot was taken (YYYY-MM-DD).
tool                  string   Tool name (e.g. dbt, Apache Airflow, DuckDB).
slug                  string   Stable identifier used across Datamata surfaces.
category              string   Tooling category: transform, orchestrator, processing, streaming, ingestion, bi, ml, ai, mlops, warehouse or quality.
stars                 integer  GitHub stargazers on the snapshot date.
forks                 integer  GitHub forks on the snapshot date.
open_issues           integer  Open GitHub issues on the snapshot date.
pypi_downloads_month  integer  PyPI downloads in the trailing month. Blank for tools not on PyPI.
npm_downloads_month   integer  npm downloads in the trailing month. Blank for tools not on npm.
job_listing_count     integer  Active job listings mentioning the tool. Blank for tools not in the skill taxonomy.
star_growth_4w_pct    number   Change in GitHub stars over the trailing 4 weeks, as a percentage. Blank until 4 weeks of history exist.
momentum_score        integer  0-100 percentile composite of stars, job demand, downloads and 4-week star growth.
github                string   GitHub repository (owner/repo). Blank if not tracked on GitHub.
website               string   Project homepage.
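
Several numeric columns can be blank, so a loader has to decide how to represent missing values. The sketch below is one way to do that with Python's standard library. It assumes the CSV has a header row using the column names listed above and that blanks appear as empty fields; the function names (parse_row, load_snapshot) are hypothetical, and the sample rows are made-up placeholders, not real index data.

```python
import csv
import io

# Column names come from the table above; the typing choices are assumptions.
INT_COLUMNS = (
    "stars", "forks", "open_issues", "pypi_downloads_month",
    "npm_downloads_month", "job_listing_count", "momentum_score",
)
FLOAT_COLUMNS = ("star_growth_4w_pct",)


def parse_row(row):
    """Convert one CSV row (a dict of strings) to typed values; blank becomes None."""
    typed = dict(row)
    for name in INT_COLUMNS:
        raw = (row.get(name) or "").strip()
        typed[name] = int(raw) if raw else None
    for name in FLOAT_COLUMNS:
        raw = (row.get(name) or "").strip()
        typed[name] = float(raw) if raw else None
    return typed


def load_snapshot(text):
    """Parse CSV text into a list of typed row dicts."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


# Placeholder rows for illustration only (invented values, not real data).
SAMPLE = (
    "snapshot_date,tool,slug,category,stars,forks,open_issues,"
    "pypi_downloads_month,npm_downloads_month,job_listing_count,"
    "star_growth_4w_pct,momentum_score,github,website\n"
    "2000-01-01,Example Tool,example-tool,transform,100,10,5,2000,,7,1.5,50,"
    "example/example-tool,https://example.com\n"
    "2000-01-01,Other Tool,other-tool,bi,50,4,2,,300,,,20,,https://example.org\n"
)

rows = load_snapshot(SAMPLE)
```

Keeping blanks as None rather than 0 preserves the difference between "not on PyPI" and "zero downloads", which matters when averaging or ranking.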

How often it updates

The pipeline takes a fresh snapshot each week from the GitHub REST API, pypistats.org, the npm registry and our active job listings, then recomputes the momentum score. The score is a weighted percentile composite: 35% job demand, 30% GitHub stars, 20% downloads and 15% four-week star growth. For the full collection method, sources and known limitations, read the data methodology. To explore the same data interactively, see the live Data Tool Momentum Index.
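
To make the weighting concrete, here is a minimal sketch of a weighted percentile composite using the four weights stated above. It is not the published pipeline. The percentile definition (share of tools with a value at or below this one), the treatment of a missing signal as contributing zero, and the signal key names are all assumptions for illustration; the data methodology is the authority on how the real score is built.

```python
# Weights from the description above; the keys are hypothetical signal names.
WEIGHTS = {
    "job_demand": 0.35,
    "stars": 0.30,
    "downloads": 0.20,
    "star_growth": 0.15,
}


def percentile_ranks(values):
    """Rank each non-missing value as the percentage of non-missing values <= it."""
    present = [v for v in values if v is not None]
    ranks = []
    for v in values:
        if v is None:
            ranks.append(None)
        else:
            at_or_below = sum(1 for p in present if p <= v)
            ranks.append(100.0 * at_or_below / len(present))
    return ranks


def momentum_scores(tools):
    """Return one 0-100 integer score per tool (a list of dicts keyed like WEIGHTS)."""
    ranked = {key: percentile_ranks([t.get(key) for t in tools]) for key in WEIGHTS}
    scores = []
    for i in range(len(tools)):
        total = 0.0
        for key, weight in WEIGHTS.items():
            rank = ranked[key][i]
            # Assumption: a missing signal contributes nothing to the score.
            total += weight * (rank if rank is not None else 0.0)
        scores.append(round(total))
    return scores


# Invented inputs for illustration only.
example_tools = [
    {"job_demand": 10, "stars": 100, "downloads": 1000, "star_growth": 2.0},
    {"job_demand": 5, "stars": 200, "downloads": None, "star_growth": 1.0},
]
example_scores = momentum_scores(example_tools)
```

Because every signal is converted to a percentile before weighting, a tool with an enormous star count cannot dominate the score through raw magnitude alone; only its rank relative to the other tools counts.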

Query it with the API

The same snapshot is available as a read-only JSON endpoint with open CORS, so you can fetch it from the browser or a script. Pass a category to narrow it to one tooling group.

# Every tool
curl https://www.datamatastudios.com/api/datasets/data-tool-momentum

# Just orchestrators
curl 'https://www.datamatastudios.com/api/datasets/data-tool-momentum?category=orchestrator'

Responses carry the same columns as the download, plus the licence, attribution and the two bulk download URLs. Please keep the attribution link when you republish.
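
The same requests can be made from a script. The sketch below uses only Python's standard library and the endpoint and category parameter shown above. The helper names (build_url, fetch_snapshot) are hypothetical, and because the exact shape of the JSON response is not documented here, the function simply returns the decoded payload for you to inspect.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.datamatastudios.com/api/datasets/data-tool-momentum"


def build_url(category=None):
    """Return the endpoint URL, optionally narrowed to one tooling category."""
    if category is None:
        return BASE_URL
    return BASE_URL + "?" + urlencode({"category": category})


def fetch_snapshot(category=None, timeout=10):
    """Fetch and decode the JSON payload (performs a network request)."""
    with urlopen(build_url(category), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_snapshot("orchestrator") corresponds to the second curl command above. Building the query string with urlencode avoids quoting mistakes if a category value ever contains characters that need escaping.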

Licence & attribution

The Data Tool Momentum Index is released under the Creative Commons Attribution 4.0 (CC BY 4.0) licence. You can share and adapt it for any purpose, including commercially, as long as you credit Datamata Studios and link back to this page.

Suggested citation

Datamata Studios. "Datamata Data Tool Momentum Index." —. https://www.datamatastudios.com/datasets/data-tool-momentum. Licensed under CC BY 4.0.

More open datasets

Skill Demand Index

Daily share of active tech job listings mentioning each skill across data, engineering, product, DevOps, security and AI. Free, CC BY 4.0.

Building something with this data? We'd love to see it — get in touch.