Thumbtack helps customers search for the right local professionals to get projects done. Our search product collects project details from customers and matches them against preferences from professionals. Our ranking algorithm then displays the professionals most likely to deliver a job well done. We tackle the search ranking problem by scoring professionals that match the customer’s requirements and then sorting them by score. Earlier this year, we changed our search ranking algorithm from a heuristic scoring system to a machine learning (ML) based scoring system. The change was challenging but impactful. In this blog post, we’ll discuss why we wanted to transition our search ranking algorithm to use machine learning,
By: Ben Anderson & Xin Liu
When a customer posts a request on Thumbtack, we want to match them with the right professional for the job. When the marketplace was small, this was easy—just blast the request out to all of the pros in the request’s category and location. Today, with millions of requests a year and hundreds of thousands of active pros, we can’t rely on that simple algorithm anymore. The definition of “right” is no longer obvious—the pro and the customer each have their own preferences, and we need to balance how we benefit customers and pros to grow a healthy marketplace in the long run.
We’re excited to introduce Raghavendra Prabhu (RVP), the newest member of our team.
RVP joins us as Thumbtack’s Director of Engineering. Previously, he was the Head of Infrastructure at Pinterest, where he helped build the core backend infrastructure, including storage systems, caching, service framework and core business logic. Prior to Pinterest, he held senior engineering roles at Twitter and Google.
In his free time, RVP enjoys traveling and outdoor activities with his family.
Part 1: Organizing Chaos
Over the past year, we’ve built out Thumbtack’s data infrastructure from the ground up. In this two-part blog post, I want to share where we came from, some of the lessons we’ve learned, and the key decisions we’ve made along the way.
When we started this project in early 2015, Thumbtack didn’t have a standalone data infrastructure; all analytics and data-oriented tasks ran directly against production databases. People across engineering and non-engineering teams ran queries with the pgAdmin desktop tool. These queries, along with other dashboard and analytics queries, hit a production PostgreSQL replica directly.