This summer Margaret Tian interned on Fitbit’s Performance and Capacity team and worked on researching, designing, and implementing an automatic mean shift detection system to help with production site health monitoring. Here she describes her project and her experience as an intern at Fitbit.
What is a mean shift? A mean shift occurs when the mean of a time series changes in a statistically significant way. Some examples can be seen below, where the vertical lines mark the locations of the change, aka the change points.
In a production environment as large as Fitbit's, there are numerous application-level and system-level metrics our engineering team is interested in observing and monitoring. We would like an automated system to detect mean shifts like the ones above in our metrics, because they may indicate potential problems or interesting shifts in user behavior. The system I worked on this summer is designed to alert Fitbit engineers to mean shifts as early as possible, so they can monitor many metrics at once and respond quickly to interesting changes.
Two major tools are used in developing the detection system. The first is STL (Seasonal Trend Decomposition using Loess by Cleveland et al.) which is a method of decomposing seasonal data into three components: seasonal, trend, and random variation.
Many of the metrics (or time series) we monitor have seasonal components that may obscure the underlying mean shifts. In my research we found that STL was useful for removing the data's seasonal component, allowing us to extract the trend for a clearer signal of how the data was changing. We ran STL in R using the stl function. To get the best decomposition, we adjusted 1) the window of data and 2) the s.window parameter, which corresponds to n_(s) in the STL paper.
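To make the idea of separating trend from seasonality concrete, here is a minimal Python sketch. It is not STL and not the code used in the project: it swaps in a classical moving-average decomposition as a simpler stand-in for the Loess smoothing that R's stl performs. The function names and the `period` argument are assumptions made for illustration.

```python
from statistics import mean

def moving_average_trend(series, period):
    """Trend estimate: average over one seasonal period centered on each point.

    Points too close to either edge get None because the window does not fit.
    """
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        if period % 2:
            # Odd period: the window is exactly one period long.
            trend[i] = mean(window)
        else:
            # Even period: half-weight the two end points so the
            # weights still add up to one full period.
            trend[i] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend

def decompose(series, period):
    """Split a series into (trend, seasonal, remainder) lists.

    A classical additive decomposition, used here only as a stand-in for STL.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal periods")
    trend = moving_average_trend(series, period)
    # Detrended values, tagged with their position in the seasonal cycle.
    detrended = [(i % period, x - t)
                 for i, (x, t) in enumerate(zip(series, trend)) if t is not None]
    phase_means = [mean(d for p, d in detrended if p == phase)
                   for phase in range(period)]
    offset = mean(phase_means)  # center the seasonal component on zero
    seasonal = [phase_means[i % period] - offset for i in range(len(series))]
    remainder = [x - t - s if t is not None else None
                 for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, remainder

if __name__ == "__main__":
    # Hypothetical daily metric: weekly pattern plus a level change at day 28.
    weekly = [0, 1, 2, 1, 0, -2, -2]
    data = [(10 if day < 28 else 15) + weekly[day % 7] for day in range(56)]
    trend, seasonal, remainder = decompose(data, period=7)
    print(trend[10], trend[45])
```

With the weekly pattern removed, the trend sits flat on either side of the level change, which is the kind of signal a change point detector can work with.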
The second tool we use is the R changepoint package by Killick and Eckley, which performs change point detection. As mentioned earlier, a mean shift is a particular type of change point. Change point detection is a statistical method that first finds the most likely point at which a change could have occurred and then performs a hypothesis test to determine whether that change is statistically significant. In our use case, we adopted the changepoint package and modified some of its methods to determine the point in time at which a mean shift occurs. If you're interested, you can learn more about it here or in "Inference About the Change-Point in a Sequence of Random Variables" by David Hinkley (1970).
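The two steps described above (find the most likely change location, then test whether it is significant) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the changepoint package's implementation: it looks for a single change in mean by minimizing squared error, and the `penalty` default of 3 * log(n) is an arbitrary illustrative threshold rather than a value taken from the package or the paper.

```python
import math

def sse(values):
    """Sum of squared deviations from the segment mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def find_mean_shift(series, min_seg_len=5, penalty=None):
    """Locate the single most likely change in mean and test it.

    Returns None if the series is too short or perfectly flat, otherwise a
    dict with the change index, the size of the shift, the test statistic,
    and whether the statistic exceeds the penalty.
    """
    n = len(series)
    if n < 2 * min_seg_len:
        return None
    if penalty is None:
        penalty = 3.0 * math.log(n)  # illustrative threshold (assumption)

    total = sse(series)
    best_tau, best_split = None, total
    # Step 1: try every allowed split and keep the one that fits best.
    for tau in range(min_seg_len, n - min_seg_len + 1):
        split = sse(series[:tau]) + sse(series[tau:])
        if split < best_split:
            best_tau, best_split = tau, split
    if best_tau is None:
        return None  # no split improves on a single mean

    # Step 2: compare the improvement against the remaining noise level.
    sigma2 = best_split / (n - 2)
    stat = math.inf if sigma2 == 0 else (total - best_split) / sigma2
    left, right = series[:best_tau], series[best_tau:]
    return {
        "index": best_tau,
        "shift": sum(right) / len(right) - sum(left) / len(left),
        "statistic": stat,
        "significant": stat > penalty,
    }

if __name__ == "__main__":
    noise = [0.0, 0.5, -0.5, 0.2, -0.2]
    shifted = [10 + e for e in noise * 4] + [15 + e for e in noise * 4]
    print(find_mean_shift(shifted))
```

The `min_seg_len` argument plays the same role as a minimum segment length in change point detection: it keeps the detector from declaring a change on the strength of one or two points at the edge of the data.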
Armed with these tools, we need an algorithm to determine whether the user should be alerted that a mean shift has occurred in a given time series. The basic idea is to 1) examine a window of data, 2) use STL to extract the trend if applicable, and then 3) alert based on the location and characteristics of the detected change point. The window is then rolled forward as time passes. We experimented with and chose the best values for the following:
- The length of the window of data
- The minimum segment length used in change point detection (see paper)
- The alerting thresholds for the location and characteristics of the change point
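The rolling-window idea and the three tunable settings listed above can be sketched as follows. Everything specific here is an assumption for illustration: the parameter names (`window_len`, `min_seg_len`, `max_age`, `min_rel_shift`), their default values, and the simplified least-squares split finder standing in for the real change point test. The STL step is omitted to keep the sketch short.

```python
def best_split(window, min_seg_len):
    """Return (index, shift) for the split that minimizes squared error.

    Returns None when the window is too short to hold two segments.
    """
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    candidates = range(min_seg_len, len(window) - min_seg_len + 1)
    if not candidates:
        return None
    tau = min(candidates, key=lambda t: sse(window[:t]) + sse(window[t:]))
    left, right = window[:tau], window[tau:]
    return tau, sum(right) / len(right) - sum(left) / len(left)

def rolling_alerts(series, window_len=30, min_seg_len=5,
                   max_age=10, min_rel_shift=0.1):
    """Roll a window over the series and collect (alert_time, change_time).

    An alert fires when the detected change is recent (at most max_age
    points before the window's end) and large relative to the level
    before it (at least min_rel_shift as a fraction).
    """
    alerts = []
    for end in range(window_len, len(series) + 1):
        window = series[end - window_len:end]
        found = best_split(window, min_seg_len)
        if found is None:
            continue
        tau, shift = found
        age = window_len - tau            # points since the change
        baseline = sum(window[:tau]) / tau
        if (age <= max_age and baseline != 0
                and abs(shift) / abs(baseline) >= min_rel_shift):
            alerts.append((end - 1, end - window_len + tau))
    return alerts

if __name__ == "__main__":
    # Hypothetical metric: level 100 for 40 points, then 120.
    metric = [100.0] * 40 + [120.0] * 20
    for alert_time, change_time in rolling_alerts(metric):
        print(f"alert at t={alert_time}, change estimated at t={change_time}")
```

The location threshold is what keeps the alert timely: once a change has drifted too far back in the window it is treated as old news and stops triggering.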
The final algorithm has been tested on many historical time series and has managed to alert on mean shifts quickly with very few false alarms. Two test cases are presented below. The red lines mark the days that the user would have received an alert about a mean shift, and they agree well with observation.
The Automated System
Finally, we automated the algorithm using Python and R to monitor real production time series. The automatic mean shift detection system pulls in many time series from our monitoring system and determines whether an alert should be raised. It also categorizes the alerts as new, continuing, or discontinued. It then sends a daily email of the categorized alerts to a select group of Fitbit engineers and updates a dedicated monitoring dashboard that is accessible to all Fitbit engineers.
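The new / continuing / discontinued categorization can be pictured as a comparison between the metrics that alerted on the previous run and those alerting now. The sketch below is one plausible reading of those categories, not the production logic; the function names and the metric names in the usage example are hypothetical.

```python
def categorize_alerts(previous, current):
    """Compare two runs' alerting metrics and bucket them.

    new:          alerting now but not on the previous run
    continuing:   alerting on both runs
    discontinued: alerting on the previous run but not now
    """
    previous, current = set(previous), set(current)
    return {
        "new": sorted(current - previous),
        "continuing": sorted(current & previous),
        "discontinued": sorted(previous - current),
    }

def format_digest(categories):
    """Render the buckets as plain text suitable for a daily email body."""
    lines = []
    for name in ("new", "continuing", "discontinued"):
        metrics = categories[name]
        lines.append(f"{name.capitalize()} ({len(metrics)}):")
        lines.extend(f"  - {metric}" for metric in metrics)
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    yesterday = {"api.latency.p95", "db.connections"}
    today = {"db.connections", "sync.error_rate"}
    print(format_digest(categorize_alerts(yesterday, today)))
```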
During my last few weeks at Fitbit, the automatic mean shift detection system already proved useful, identifying a few production trend changes during our testing phase. There is a lot of potential for this system to be expanded beyond performance-related data to monitor all sorts of metrics across Fitbit.
The Internship Experience
Before this summer at Fitbit, I did not have a clear idea of what I would do after graduation. I decided to join the Fitbit internship program mostly based on recommendations from a few upperclassmen who seemed to love their Fitbit internship experience. Also, because Fitbit is a tech company with hardware, software, and some interesting health-related datasets, I felt that interning there might expose me to different areas.
In the end, I learned quite a few things this summer and they were not all just about work:
- Working with data is a great fit for my math and computer science skill set, and I’d love to explore data science opportunities
- The advanced theoretical classes I took were actually useful! Being able to understand the theory made me much more confident in applying STL and change point detection
- Knowing your own health data, like steps and sleep quality, is both surprisingly empowering and addicting. I freak out every time I leave my Fitbit Alta at home
- There’s a great Chinese food place called Hunan’s Home near the office
- I can actually run 3 miles without wanting to pass out – who knew! Thank you to my running partner and manager, Bryce Yan, for encouraging this highly sedentary college student to become slightly less sedentary
I want to give a huge shout-out to my mentor Aaron Floyd, my manager Bryce Yan, and the rest of the wonderful Performance and Capacity team for trusting me with this project, supporting me through the journey, and buying me endless coffee (especially David Lam). Also, thank you for encouraging me to start running – I'm going to win all those work week challenges! And of course, thank you to Fitbit for giving me this awesome summer 2016 opportunity.
About the Author
Margaret Tian is about to start her junior year at MIT. She is a member of the class of 2018, majoring in Math with Computer Science. She is the managing director of TechX, an MIT club that empowers students through technology. She spent summer 2016 as an intern on Fitbit’s Performance and Capacity team.