Custom stat tracking tool StatHat processes 12 billion data points per week, peaking at 100 million per hour, or roughly 30,000 per second. This is what CTO Patrick Crosby deals with every week, every hour, every second. Data is in the blood of this Chicago techie: as CTO of OkCupid, he previously dealt with the massive amounts of data that company processes. These days at StatHat, he is working on the company's custom time-series dataset storage system and its native iPhone app, so users can access their stats quickly.
How did your role as CTO of OkCupid prepare you for your role as CTO of StatHat?
When OkCupid launched in February 2004, we wrote a tool that allowed anyone in the company to track any number or event with one line of code and chart the results instantly.
It ended up being the most important tool we built. It was used for years to help us answer all kinds of questions: Is the new matching algorithm doing a better job? Do we need a different image serving architecture? Is OkCupid successfully bringing people together? Do iPhone or Android users get more replies to their messages?
StatHat started based on the same ideas as the internal OkCupid tool. We've added significant features that we never had at OkCupid, and made it work for any company on the internet, not just one.
I learned a lot about building scalable systems at OkCupid. OkCupid needs to process a ton of data very quickly, and so does StatHat.
What technologies are playing the biggest roles in StatHat?
We were an early adopter of the Go programming language. We use Go for all of our backend and web services. We built a custom time-series dataset storage system that stores all the stat data. We also developed a native iPhone app last year to let our users access their stats and get push alerts.
How is StatHat anticipating trends in the data and data analysis industry?
Tools like StatHat are making it easy for companies to track a lot of custom stats. However, there is still a lot to learn regarding what data is the most useful to collect and how to analyze it. We are working on several projects that will help companies discover stats that are crucial to their businesses. At the same time we are adding more analysis services to automatically process the data, like our existing automatic anomaly detection service.
How do you constantly improve and innovate StatHat?
StatHat is one of StatHat's biggest users. We use it to track all kinds of custom stats about the service itself. Based on our own usage and feedback from our users, we have a long list of things we would like to experiment with to make StatHat better. We scrap the code for many more features than we release.
What would you tell young companies looking for early product validation?
I would charge for your product from day one. There's nothing like paying customers to validate a product. Get advice in areas where you are weak, but you know the most about your product, so filter accordingly.