Wednesday, November 23, 2016

Continuous Unix commit history from 1970 until today in Git

The Unix History Repository on GitHub is a Continuous Unix commit history from 1970 until today:

The history and evolution of the Unix operating system is made available as a revision management repository, covering the period from its inception in 1970 as a 2.5 thousand line kernel and 26 commands, to 2016 as a widely-used 27 million line system. The 1.1GB repository contains about half a million commits and more than two thousand merges. The repository employs Git system for its storage and is hosted on GitHub. It has been created by synthesizing with custom software 24 snapshots of systems developed at Bell Labs, the University of California at Berkeley, and the 386BSD team, two legacy repositories, and the modern repository of the open source FreeBSD system. In total, about one thousand individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology. The project aims to put in the repository as much metadata as possible, allowing the automated analysis of Unix history.

The files appear to be added in the repository in chronological order according to their modification time, and large parts of the source code have been attributed to their actual authors. Commands like git blame and (sometimes) git log produce the expected results.

No comments: