Menu
A key challenge when deploying an analytics model is to keep it from becoming stale in the presence of evolving data. In the context of supervised machine learning (ML), we describe a quick-and-dirty sampling-based method for maintaining model accuracy over time that allows existing analytic algorithms for static data to be applied to dynamic streaming data essentially without change. Specifically, we provide temporally-biased sampling schemes that weight recent data most heavily, with inclusion probabilities for a given data item decaying exponentially over time.