Data
Visualizing Code Repositories Metadata
The code repository that builds this page
All GitHub Organizations Data Gathered in the Flattened CSV
const repos = FileAttachment("./data/all_orgs_merged_20240120.csv").csv();
Original data in flattened CSV from Ecosyste.ms
Processed data with cohorts calculated
Samples
Number of code repositories
Note that this data comes from the Ecosyste.ms API. Ecosyste.ms does not collect data on all repositories, only on a subset: for example, repositories with engagement or repositories that are the source for a package. The total number of repositories in each GitHub organization is larger than what is captured here.
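As a rough sketch of how per-organization counts could be tallied from the loaded rows: the example below assumes each row carries an `owner` column naming the GitHub organization. That column name, the `full_name` field, and the sample rows are hypothetical stand-ins, so the actual CSV headers may differ.

```javascript
// Count rows (one row per repository) for each organization.
// Assumption: "owner" is a hypothetical column name for the organization.
function countReposByOrg(rows, orgColumn = "owner") {
  const counts = new Map();
  for (const row of rows) {
    const org = row[orgColumn];
    counts.set(org, (counts.get(org) ?? 0) + 1);
  }
  return counts;
}

// Made-up rows for illustration only; not taken from the real dataset.
const exampleRepoRows = [
  { owner: "org-a", full_name: "org-a/repo-1" },
  { owner: "org-a", full_name: "org-a/repo-2" },
  { owner: "org-b", full_name: "org-b/repo-1" },
];

const exampleRepoCounts = countReposByOrg(exampleRepoRows);
console.log(Object.fromEntries(exampleRepoCounts));
```

On the page itself, the same function could be applied to the awaited `repos` array instead of the made-up rows.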
NASA (US government agency)
National Security Agency (US government agency)
AirBnB (tech company known for creating open source tools)
home-assistant (one of the most forked and contributed to open source projects)
houstondatavis (local meetup for data visualization)
Counts of repositories with committer statistics that are not zero
Many repositories appear to have no committer data. It is not clear whether that is because they received no pushes after public release or because Ecosyste.ms only collects committer data for a subset of repositories.
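One way to probe this, sketched under assumptions: separate rows where the committer field is blank from rows where it is an explicit zero. The column name `total_committers` is hypothetical. The split cannot explain why the data is absent, but it shows whether "no data" and "zero committers" are recorded differently. CSV values load as strings unless type inference is requested, so the sketch converts them explicitly.

```javascript
// Classify each row's committer count as non-zero, zero, or missing.
// Assumption: "total_committers" is a hypothetical column name.
function summarizeCommitterData(rows, column = "total_committers") {
  const summary = { nonZero: 0, zero: 0, missing: 0 };
  for (const row of rows) {
    const raw = row[column];
    // Blank or absent cells are treated as missing, not as zero.
    if (raw === undefined || raw === null || String(raw).trim() === "") {
      summary.missing += 1;
      continue;
    }
    const value = Number(raw);
    if (Number.isNaN(value)) {
      summary.missing += 1; // unparseable values are also counted as missing
    } else if (value > 0) {
      summary.nonZero += 1;
    } else {
      summary.zero += 1;
    }
  }
  return summary;
}

// Made-up rows for illustration only.
const exampleCommitterRows = [
  { total_committers: "12" },
  { total_committers: "0" },
  { total_committers: "" },
];
console.log(summarizeCommitterData(exampleCommitterRows));
```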
NASA (US government agency)
National Security Agency (US government agency)
AirBnB (tech company known for creating open source tools)
home-assistant (one of the most forked and contributed to open source projects)
houstondatavis (local meetup for data visualization)
Repositories with more than 100 committers
This is the small subset of repositories with very large developer communities. Neither HoustonDataViz nor CMSgov has any repositories on this list.
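A minimal sketch of the filter behind a list like this, assuming hypothetical `owner` and `total_committers` columns (the real headers may differ). The comparison is strictly greater than the threshold, so a repository with exactly 100 committers is excluded, and blank cells never pass.

```javascript
// Keep only repositories whose committer count exceeds the threshold.
// Assumption: "total_committers" is a hypothetical column name.
function reposAboveCommitterThreshold(rows, threshold = 100, column = "total_committers") {
  // Number("") is 0 and Number(undefined) is NaN; neither exceeds the threshold.
  return rows.filter((row) => Number(row[column]) > threshold);
}

// List the organizations that have at least one repository in the filtered set.
// Assumption: "owner" is a hypothetical column name for the organization.
function orgsRepresented(rows, orgColumn = "owner") {
  return [...new Set(rows.map((row) => row[orgColumn]))];
}

// Made-up rows for illustration only.
const exampleLargeRepoRows = [
  { owner: "org-a", total_committers: "250" },
  { owner: "org-b", total_committers: "100" },
  { owner: "org-c", total_committers: "" },
];
const exampleLargeRepos = reposAboveCommitterThreshold(exampleLargeRepoRows);
console.log(orgsRepresented(exampleLargeRepos));
```

Organizations absent from the resulting list have no repository above the threshold in the collected data.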