What is Kaggle, and why should I care about it?

  • Should be on your radar, if it's not already
  • Opportunity to apply skills to areas outside your day-to-day activities
  • Can peek over people's shoulders
    • Kernels preserve the work that people have already done
    • Easy way to quickly see how an approach is implemented in practice
  • It's an excellent source for unique datasets


  • Data science, gamified
  • You can win $ (hypothetically)
  • Recruitment
  • Research
    • Kiva competition (ongoing)
    • Malware classification (3 yrs ago)
    • American Epilepsy Society's Seizure Prediction using EEG
  • Rankings & Tiers
    • Complete your profile: Novice, Contributor
    • Win some medals, discuss content: Expert
    • Be more baller than 'Experts': Master, Grandmaster

You can work in the cloud (kernels!)

  • Kaggle docker images
  • Come pre-loaded with popular libraries
    • All of CRAN
    • Plus development versions of other packages, such as rstan and h2o
  • Can run Jupyter notebooks using either Python or R, executing code in cells
  • Or draft scripts
  • Version control is baked in
  • 17 GB of RAM, ? CPU, ~55 MB of storage space; operations time out after 2 hours
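
Since the resource figures above are approximate (and the CPU count is a question mark), one way to check what a kernel actually provides is to ask the machine from inside a notebook cell. This is a minimal sketch using only the Python standard library; the helper name `describe_environment` is made up for illustration, and the RAM lookup assumes a Unix-like system, which is an assumption about the Docker images rather than something confirmed here.

```python
import os
import shutil


def describe_environment(path="."):
    """Return basic resource figures for the machine running this code."""
    usage = shutil.disk_usage(path)  # total/used/free bytes for the volume holding `path`
    info = {
        "cpu_count": os.cpu_count(),           # may be None if it cannot be determined
        "disk_free_mb": usage.free / 1024**2,  # free space in MiB
    }
    # Total physical RAM is exposed through sysconf on Unix-like systems only,
    # so fall back to None anywhere that lookup is unavailable.
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
        info["ram_gb"] = page_size * page_count / 1024**3
    except (AttributeError, ValueError, OSError):
        info["ram_gb"] = None
    return info


if __name__ == "__main__":
    for name, value in describe_environment().items():
        print(name, value)
```

Running this in the first cell of a notebook gives a quick read on the limits before starting anything long enough to risk the timeout.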


  • Kaggle hosts job listings from companies that want to advertise positions to its user base
  • I have no experience applying through one, but would be curious to hear about other people's
  • Can see how many people have looked at a posting at a given time, kind of interesting?
  • Would guess that companies advertising might have a better grasp on their needs?
  • Maybe preferable to recruiters contacting you?


Kernel/Competitions interface demonstration


  • Other platforms for crowd-sourced data science?

  • Help me be less wrong on the internet:
    • GitHub: @mooreaw
    • Twitter: @theysayHeyDrew
    • Blog
    • A2MADS Slack: @andrew