Managing Billions of Lines of Code – Archivists Take Note

A piece posted today at Wired by Cade Metz discusses the code repository Google uses to manage the code for all its services, apps, etc. Metz draws on comments by Google’s Rachel Potvin made at a Silicon Valley engineering conference. According to Potvin, and as discussed at length by Metz, Google keeps all of the code for its services in one central repository accessible to its 25,000 engineers. The amount of code is staggering. Roughly two billion lines of it are managed, tracked and made accessible through Google’s home-grown repository system known as Piper.

For some archivists, this information is neat but ultimately not relevant to what we do. I insist that in the post-custodial world we live in, how massive, large-scale software is developed is something to take note of. The software itself may not be considered a record, though that’s up for debate, but the interactions between developers, bots, and testers generate records, many of which are tracked by version control software like Piper. The questions we have to ask ourselves are these: What should we do with these records? How do we capture them? How does a scenario like a large software code repository translate to other instances where records can be manipulated by large groups of users? I can’t answer any of these, but it’s time to pay attention.


