Meet a Data Scientist: Tatiana Maravina
WiDS Puget Sound is excited to present the next entry in our series, “Meet a Data Scientist!”
“Meet a Data Scientist” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.
Do you know any marvelous women in data science? Send us a tip here!
Tatiana Maravina, Director, Machine Learning Science, Expedia Group
Tatiana Maravina's journey into data science began with a natural affinity for mathematics. Growing up in Russia, she pursued an undergraduate degree in mathematical statistics and computer science at Lomonosov Moscow State University. She remembers taking her first statistics class and reflecting on how different the subject was from abstract math, which had always come easily to her; statistics was so practical. Coincidentally, it was also the class in which Tatiana got her lowest grade.
Seeking further specialization, Tatiana moved to the United States to earn her PhD in Statistics at the University of Washington. Her earlier studies in Russia had focused heavily on advanced statistical theory and mathematical analysis, which sharpened her analytical abilities but offered little direct preparation for applied data science work. At UW, she found that the courses struck a balance between theory and practical application. "I was surprised by the practical approach of the courses at UW," Tatiana explained. "The various applied statistics courses at both the Master's and PhD level were immensely valuable, providing knowledge I continue to apply in my work to this day." The doctoral program equipped Tatiana with strong theoretical foundations in statistics and mathematics, complemented by training that helped her develop intuition for using science to solve real-world business problems.
While working on her PhD, Tatiana joined Boeing’s Applied Mathematics group within Boeing Research & Technology (BR&T). Starting as an intern, she later became a full-time applied statistician. Tatiana enjoyed applying her statistical knowledge to a wide range of complex engineering challenges in manufacturing, quality control, materials sciences, reliability engineering, and structural engineering.
Tatiana further honed her skills at SpaceX, where she worked on statistical reliability engineering and accelerated life testing, developed automated anomaly detection algorithms for satellite data, and analyzed on-orbit test data from Starlink Demo satellites. However, she still sought exposure to big data and to deploying models into production at scale.
In 2019, Tatiana joined Expedia Group, working as an individual contributor for the first few years. She applied her statistical expertise to develop A/B testing methodologies, prediction models, performance evaluation techniques, and methods for marketing capital allocation. One of her projects was a booking cancellation prediction model, which became especially relevant during COVID. After gaining experience, she transitioned to a technical lead role, managing a team that has grown from two to four data scientists.
Another notable project Tatiana worked on was developing predictive models for customer lifetime value (CLV), which estimate the future cash flows from each customer over a long-term horizon. Accurate CLV predictions enable better business decisions around customer acquisition, retention strategies, marketing investments, and more. The CLV system uses gradient-boosted tree models to predict an individual customer's future value as a function of many input features representing the customer's past purchase behavior and engagement. The model draws on data from multiple Expedia Group brands and lines of business, including stays, flights, packages, cars, and cruises.
More than 200 engineered input features fall into two main categories: bookings and engagement. The engagement features capture the customer's interactions with Expedia Group brands outside of bookings, such as engagement with marketing emails (e.g., number of clicks in the last 3 months) and loyalty program tiers. Customers are first categorized into five geographical regions and then grouped within each region by metrics such as booking recency and frequency, resulting in a total of 30 segments. Each segment is modeled separately, with a dedicated CatBoost model trained for it. The models are retrained monthly, and CLV predictions for hundreds of millions of customers are updated daily to fuel various business intelligence applications. Here is a Medium blog with more details about this project.
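To make the segment-wise design concrete, here is a minimal Python sketch of routing customers to one model per segment. It is an illustration under stated assumptions, not Expedia Group's implementation: the region labels, the recency and frequency thresholds, and the 3 x 2 split within each region are hypothetical (the article gives only the counts, five regions and 30 segments), and a simple mean predictor stands in for the CatBoost model so the sketch has no dependencies.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

# Hypothetical labels: the article states only 5 regions and 30 segments.
REGIONS = ["NA", "EMEA", "APAC", "LATAM", "OTHER"]
RECENCY_BUCKETS = ["recent", "lapsing", "dormant"]
FREQUENCY_BUCKETS = ["single", "repeat"]
ALL_SEGMENTS = list(product(REGIONS, RECENCY_BUCKETS, FREQUENCY_BUCKETS))


def segment_of(region, days_since_last_booking, bookings_last_2y):
    """Map a customer to a (region, recency, frequency) segment key.

    The thresholds below are made up for illustration.
    """
    if days_since_last_booking <= 180:
        recency = "recent"
    elif days_since_last_booking <= 540:
        recency = "lapsing"
    else:
        recency = "dormant"
    frequency = "repeat" if bookings_last_2y >= 2 else "single"
    return (region, recency, frequency)


class MeanModel:
    """Placeholder regressor (NOT CatBoost): predicts the mean training target."""

    def fit(self, X, y):
        self.value_ = mean(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


def fit_per_segment(rows, make_model=MeanModel):
    """Train one model per segment.

    rows: iterable of (segment_key, feature_vector, target_value).
    make_model: zero-argument factory returning an object with fit/predict.
    """
    grouped = defaultdict(lambda: ([], []))
    for segment, features, target in rows:
        grouped[segment][0].append(features)
        grouped[segment][1].append(target)
    return {seg: make_model().fit(X, y) for seg, (X, y) in grouped.items()}


def predict_clv(models, segment, features):
    """Route a customer to the model trained for their segment."""
    return models[segment].predict([features])[0]


# Toy usage with fabricated numbers, purely to show the data flow.
seg_a = segment_of("NA", 30, 3)
seg_b = segment_of("EMEA", 700, 1)
training_rows = [
    (seg_a, [3, 12], 100.0),
    (seg_a, [5, 20], 200.0),
    (seg_b, [1, 0], 40.0),
]
models = fit_per_segment(training_rows)
example_prediction = predict_clv(models, seg_a, [4, 15])
```

Any regressor exposing `fit` and `predict` could be supplied through `make_model`; with the `catboost` package installed, a factory returning `catboost.CatBoostRegressor` would be the natural substitution. Keeping the segment key as a plain tuple makes it easy to retrain all models on a schedule while scoring customers more frequently with the latest stored models.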
When asked how she stays up to date with developments in the field of data science, Tatiana recalled the Lewis Carroll quote: “It takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”
"The field is evolving so rapidly that the learning never stops," Tatiana added. "I learn so much from my colleagues at Expedia Group, both informally and through days of learning, which are a great opportunity to share knowledge." She noted that Expedia Group nurtures a culture of formal and informal knowledge exchange with courses, seminars and dedicated monthly days of learning.
Throughout her career, Tatiana has cultivated a diverse skill set at the intersection of mathematics, statistics, computer science, and business acumen. She emphasizes the importance of clear communication as a critical skill for having impact and driving action in any data science role.
Tatiana has had to hone the ability to explain complex technical concepts clearly to diverse audiences, from engineers to business leaders.
She pointed out several main aspects of communication:
· Clearly explaining the work, main learnings, insights, and any important data limitations to business stakeholders at a high level, so that both technical and non-technical audiences come away with a good understanding. She recommends first thinking about the core story and main messages, learning the audience's background where possible, and structuring the content accordingly within the given time, while being mindful of the tendency to go deep into technical details. In her experience, if the key messages are clear and the high-level context is set, the Q&A session provides a great opportunity to drill deeper into technical specifics for those interested.
· Keeping up day-to-day communication. Especially in a stressful situation, there can be a tendency to look for a perfect solution before updating anyone. She noted how important it is to keep stakeholders informed while searching for the solution: sharing work that might not be perfect, soliciting early feedback, and sending regular updates can be very helpful in keeping the project on track and aligned with business goals.
· Influencing and convincing others of the benefits of one approach over another, especially when opinions differ. This more challenging aspect of communication requires skills in persuasion, negotiation, and securing stakeholder buy-in.
"My journey underscores that data science is a continually evolving craft at the intersection of mathematics, statistics, computer science, and business acumen," Tatiana reflected. "The most fulfilling aspect is the opportunity to always be learning and expanding my capabilities in tandem with the rapid pace of innovation in this field."