What can we learn from 750 billion GitHub events and 42 TB of code by Felipe Hoffa

Published on: Friday, 16 June 2017

“Data gives us insights into how people build software, and the activities of open source communities on GitHub represent one of the richest datasets ever created of people working together at scale.” –GitHub Universe 2016

We are going to analyze, live on stage, 5 years of GitHub metadata and the 42 TB of code stored in it to answer questions like:

– How is this run?
– How coding patterns have changed through time.
– Guiding your project design decisions based on actual usage of your APIs.
– How to request features based on data.
– The most effective phrasing to request changes.
– Effects of social media on a project’s popularity.
– Who starred your project, and what other projects interest them.
– Measuring community health.
– Running static code analysis at scale.
– Tabs or spaces?

(I gave a shorter overview of this at the official GitHub Universe conference in 2016.)

Felipe Hoffa (Google)
In 2011 Felipe Hoffa moved from Chile to San Francisco to join Google as a Software Engineer. Since 2013 he has been a Developer Advocate for big data, inspiring developers around the world to use Google Cloud Platform tools to analyze and understand their data in ways they never could before. You can find him in several YouTube videos, in blog posts, and at conferences around the world.