Measures Concepts
GitHub icon

Does every programming language have a central package repository?

by Breck Yunits

February 7, 2019 — Like millions of other programmers, every day I depend on central package repositories (CR) like npm, PyPI and CRAN.

The other day I was curious: does every programming language have one of these? I decided to find out. I pointed my crawler and trained a model to check for a package repository for every one of the 3,006 languages I am tracking. The results surprised me.

★ Only 1% have them

My model found only 39 languages with central package repositories. (For comparison, Wikipedia lists ~20). That's just ~1% of languages. I thought it would be higher.

★ ~30% of the Top 100 have them

Given that a programming language is very popular and appears in my top 100 list, it is about 15 - 30x more likely to have a CR. Given a language is not in the top 100, <1% will have a CR.

★ 2 Million+ Packages Total

There are over 2,000,000 packages (aka modules or libraries) across these CRs. That means there are about 1,000x more packages than there are programming languages.

With that many packages, name collision is certainly a problem (maybe a subject for another post), though not as much of a problem as in the domain system where 130,000,000+ ".coms" alone are registered.

★ The top 5 account for ~80% of all packages

At over 900,000, Javascript's npm has almost more packages than all other CRs combined. Javascript, Java, PHP, Perl, and Python account for about 80% of packages.

★ GitHub has over 100,000,000 repositories

Given the size of GitHub, and it's growth as somewhat of a universal central package repository (though totally unmoderated), and given that many (if not the majority) of the packages in these CRs are also listed on GitHub, it's conceivable that GitHub is the largest CR and that the number of packages out there is easily 10x bigger than 2M.

★ Newer languages are not more likely to have CRs

This surprised me. The median age of a language with a CR is 24 (1995). Of the top 5 languages I mentioned above, all were created by then. Almost always the creation of the CR follows the launch of the language, sometimes by months or sometimes by years. I expected most CRs to be from newer languages, but that wasn't the case. While some new languages like Rust and Julia have CRs, others like Go and Kotlin do not.

★ A Visual

Languages with web package managers plotted by year and number of published packages.

★ The List

Here's my list of the main central package repositories for languages that have them. I cut the list a bit to only include CRs with more than 100 packages available. As always, open an issue if you spot any omissions or mistakes!

id website packages appeared
javascript http://npmjs.org 901,025 1995
java https://search.maven.org/ 266,776 1995
php https://packagist.org/ 211,636 1995
perl https://www.cpan.org/ 176,876 1987
python https://pypi.python.org/pypi 167,097 1991
ruby https://rubygems.org/ 154,445 1995
csharp https://www.nuget.org/ 141,524 2000
swift https://cocoapods.org/ 57,000 2014
clojure https://clojars.org/ 23,459 2007
rust https://crates.io/ 22,486 2010
r https://cran.r-project.org/ 13,674 1993
haskell https://hackage.haskell.org/ 13,487 1990
matlab https://www.mathworks.com/matlabcentral/fileexchange/ 9,718 1984
erlang https://hex.pm/ 8,069 1986
julia https://juliahub.com/ui/Packages 7,820 2012
tex https://ctan.org/ 5,649 1978
stata https://www.stata.com/manuals/rssc.pdf 4,608 1985
smalltalk http://smalltalkhub.com/ 4,534 1972
powershell https://www.powershellgallery.com/ 4,382 2006
emacs-editor https://melpa.org/ 4,079 1976
dart https://pub.dartlang.org/ 2,751 2011
maple https://www.maplesoft.com/applications/ 2,650 1982
ocaml https://opam.ocaml.org/ 2,224 1996
lua https://luarocks.org/ 2,047 1993
d https://code.dlang.org/ 1,498 2001
dynamo-visual-language https://dynamopackages.com/ 1,494 2011
haxe https://lib.haxe.org/ 1,303 2005
racket https://pkgs.racket-lang.org/ 1,122 1994
elm https://package.elm-lang.org/ 594 2012
coldfusion https://www.forgebox.io 519 1995
nim https://nimble.directory/ 499 2008
spark https://spark-packages.org/ 441 1988
prolog http://www.swi-prolog.org/pldoc/doc/_SWI_/library/prolog_pack.pl 275 1972
mathematica http://packagedata.net/ 210 1988

Updates

As Jay18001 pointed out, a few of these repositories serve packages for more than one language. Cocoapods => Objective-C, and Nuget => F# and other .Net langs. In this post I collapsed things so each repo only has 1 language.

Update: 8/26/2019. Multiple readers pointed out that my stat for Ruby was off by 10x. I used the # I found on this page, which turned out to be just the gems beginning with the letter A. I apologize for the mistake and am very grateful for the corrections.

Thanks to PallHaraldsson for reviewing this post and encouraging it to get updated.

Analyze this data yourself in Observable
Analyze this data yourself in Ohayo

View source

- Build the next great programming language · Search · Add Language · Features · Creators · Resources · About · Blog · Acknowledgements · Queries · Stats · Sponsor · Day 605 · feedback@pldb.io · Logout