Unlocking Leadership: How Volunteering Supercharges Your Career

At Diversity in Data Science, we believe that leadership isn’t confined to titles—it’s about taking initiative, inspiring change, and making an impact wherever you are. We held a virtual mentoring session with our Women in Data Science Puget Sound volunteers, where we explored how the skills we cultivate through volunteer work can become powerful assets in our professional journeys. From translating WiDS experience into career growth, to mastering the art of delegation and leading without formal authority, our discussion was filled with valuable takeaways. Here’s what we learned.

Using WiDS Experience to Elevate Your Career

Volunteering with WiDS isn’t just about giving back and strengthening the data science community; it’s also a powerful catalyst for personal and professional growth. Several of our former conference chairs and leads have stepped into leadership roles within their companies, leveraging the skills they honed in WiDS to drive impact in their workplaces. To make the most of your volunteer time and translate those experiences into career success, focus on these key leadership skills:

  • Prioritize What Matters: Let go of perfectionism and focus on impact. Identify key priorities and make strategic decisions about where to invest your time and energy.

  • Optimize Workflows: Leverage project management tools such as Asana or action item trackers to keep initiatives moving forward with clarity and accountability.

  • Delegate to Empower: True leadership is about lifting others up. Assign tasks thoughtfully to distribute workload and cultivate growth.

  • Run Meetings Like a Pro: Clear agendas, structured discussions, and defined action items turn meetings into powerful tools for progress.

  • Set Expectations for Effective Communication: Establish expectations (such as checking Slack once a day) to ensure clear and timely communication across committees.

  • Stay Transparent: Regular status reports and check-ins ensure that everyone stays informed and aligned, making it easier to tackle challenges proactively.

The Art of Delegation: Elevating Others While Achieving More

Great leaders don’t do everything alone—they trust and empower others to contribute meaningfully. Delegation can be tough, but these best practices make it easier:

  • Recognize Strengths and Encourage Growth: Identify comfort zones and push team members to expand their capabilities.

  • Be Clear and Specific: Avoid ambiguity by setting clear expectations and deadlines.

  • Break It Down: Large tasks can be daunting—divide them into manageable steps.

  • Make Collaboration a Habit: Group working sessions foster teamwork and ensure that tasks don’t fall through the cracks.

  • Support with Training: Provide the knowledge and resources necessary for others to succeed.

  • Create a Culture of Shared Knowledge: Train multiple people on essential tasks so that expertise is never bottlenecked.

Leading Without Authority: Making an Impact from Any Role

Leadership is about influence. Many of our members have led projects and initiatives without direct authority, and their success comes from key strategies like these:

  • Take Initiative: Instead of asking, “What can I do?” propose a specific action you can take to add value.

  • Clarify Goals and Align Priorities: Clearly articulate what you aim to achieve and ensure it resonates with broader team objectives.

  • Align Needs with Action: Understand the needs and perspectives of different stakeholders to drive alignment and collaboration.

  • Show the Bigger Picture: Frame your contributions in terms of time saved, efficiency gained, or strategic impact.

Integrate Leadership Skills into Your Personal and Professional Environments

The skills you develop from volunteering don’t just make a difference in your community—they shape you into a leader ready to take on bigger challenges in both your career and personal life. Whether you’re refining your ability to delegate, leading without authority, or optimizing workflows, every step you take contributes to your professional growth and success.

A heartfelt thank you to everyone who participated in our session! Your insights and experiences are what make this community thrive. 

If you're looking for further inspiration on leadership, here are some excellent book recommendations to deepen your knowledge and refine your skills:

  • How Remarkable Women Lead - Explores the leadership traits that drive successful women, emphasizing personal meaning, positive framing, and connecting with others to inspire and achieve impact.

  • How Women Rise - Identifies 12 habits that hold women back in their careers and provides strategies to overcome them, empowering women to advance and succeed in leadership.

  • Never Split the Difference - Teaches negotiation tactics based on FBI hostage negotiation techniques, emphasizing empathy, tactical mirroring, and emotional intelligence to achieve the best possible outcomes in any negotiation.

  • Getting to Yes - A fundamental guide to effective negotiations and win-win solutions.

Data work with an impact: Discover the many applications of data science in the public sector in King County
 
 
 
 

Join us at Northeastern University on Tuesday, October 22nd to learn about how data science is shaping public policy and improving lives in our community. Three experts will share their experiences and insights on:

  • Using data to track changing health, economic and demographic conditions in King County.

  • How data scientists help shape planning, implementation, and evaluation of critical social service innovations like our regional crisis care centers.

  • Innovative strategies to support collaboration and cross-sector work while stewarding sensitive data about people in our communities.

If you’re a data analyst, scientist, or student curious about career opportunities in the public sector, a public policy enthusiast interested in the role of data in government, or just interested in using technology for social good, this event is for you.

Register here!


Meet the Speakers

 

Aley Joseph Pallickaparambil is a Senior Epidemiologist with Public Health – Seattle & King County, and is clinical faculty at the UW Department of Health Systems and Population Health. At King County, Aley oversees and works alongside a talented quantitative methods team of epidemiologists, social research scientists, and a data scientist who use data mining and epidemiological analyses to conduct population health surveillance, evaluate public health programs, and build data modernization capacity. Before King County, Aley worked with tribal communities to assess race misclassification in health records. She studied mathematics and microbiology at the University of Mumbai, and has an MPH in epidemiology and biostatistics from the University of Michigan in Ann Arbor.

 
 

Carolina Johnson is a data scientist committed to developing ethical public sector data capacity. She has worked with King County for over seven years, supporting cross-system data integration, equity-centered data governance, and creative problem-focused data uses, as well as the technical development of a large and growing team of evaluators and data scientists. Before joining King County, she completed a PhD in Political Science on the CSSS track at UW, with a focus on understanding the civic effects of participatory budgeting in local communities.

 
 

Minh Phan oversees data and evaluation efforts for behavioral health crisis programs at King County, leading a team of program evaluators and data scientists. Their projects include, but are not limited to, integrating and transforming complex behavioral health data from local providers and applying time series analysis, client matching algorithms, and crisis episode modeling. These efforts support the creation of data-driven strategies for both existing crisis services and the forthcoming Crisis Care Centers Levy.

 
Helen Jenne
Know Before You Go: Career Club Meetup - Behavioral Interview Practice

Career Club, powered by Diversity in Data Science, aims to foster growth among data professionals through targeted professional development events and a supportive peer network. The club is focused on bringing you resources to boost your career journey, including a Behavioral Interview Prep night! Read on to learn more about the event and how to make the most of it.

Career Club Meetup: Behavioral Prep Night

  • Date: Wednesday, August 7

  • Time: 5:15-7:15 PM

  • Location: Capitol Hill Branch - The Seattle Public Library (425 Harvard Ave E, Seattle, WA 98102)

  • Event: Behavioral Interview Prep Night

  • Host: WiDS Puget Sound

  • Registration: Register Here

Event Overview

Join us for an engaging and practical session focused on honing your behavioral interview skills. This interactive event is perfect for anyone preparing for career advancements or sharpening their interview techniques. You'll be able to practice in a supportive environment and receive constructive feedback to boost your confidence and performance.

What to Expect?

The session will be divided into three rounds of interview practice, each lasting 25 minutes (10 minutes per partner plus 5 minutes for mutual feedback). During these rounds, you'll practice answering common behavioral interview questions with different partners, allowing you to experience a variety of interview styles and perspectives.

Format:

  1. 5:15 - 5:30 : Arrive & Network

  2. 5:30 - 5:45 : WiDS Intro & Activity Overview

  3. 5:45 - 7:00 : Interview Practice Rounds in Pairs

      – 5:45 - 6:10: Round 1

      – 6:10 - 6:35: Round 2

      – 6:35 - 7:00: Round 3

      – Each interview round is 25 minutes

      – 10 minutes for each partner (about 3 questions)

      – We suggest starting with “Tell me about yourself”

      – You can choose questions yourself or ask your practice partner to choose for you

      – 5 minutes at the end for mutual feedback

  4. 7:00 - 7:15 : Network & Farewells

 

Sample Questions and Core Competencies Tested:

  1. "Tell me about a time you faced a significant challenge at work. How did you handle it?"

    • Core Competency: Problem-solving skills, resilience, and staying calm under pressure.

  2. "Describe a situation where you had to work as part of a team to achieve a goal. What was your role?"

    • Core Competency: Teamwork, collaboration, and leadership abilities.

  3. "Can you give an example of a time you had to manage multiple responsibilities? How did you prioritize?"

    • Core Competency: Time management, organizational skills, and ability to handle stress.

  4. "Tell me about a time you received constructive feedback. How did you respond?"

    • Core Competency: Receptiveness to feedback, willingness to improve, and self-awareness. 

Tips for Participants:

  • Preparation: Familiarize yourself with common behavioral interview questions and practice using the STAR method.

  • Engagement: Be active and engaged during the session, offering and receiving feedback constructively.

  • Flexibility: Be open to adjusting your responses based on feedback and different interviewer styles.

  • Networking: Take advantage of the opportunity to network with other professionals and learn from their experiences.

We look forward to seeing you at the Career Club Meetup! This is a fantastic opportunity to refine your interview skills and boost your career prospects. Don't forget to register and come prepared to engage and learn.


Shweta Manjunath
Meet a Data Scientist: Fidan Aydamirova

WiDS Puget Sound is excited to present the next entry in our series, “Meet a Data Professional!”

“Meet a Data Professional” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

Fidan Aydamirova’s journey to becoming a data scientist is a story of hard work, passion, and learning. Born and raised in Baku, Azerbaijan, Fidan studied Finance at Azerbaijan State University. After graduation, she worked briefly in the banking industry before moving to the United States with her husband in 2015. The United States offered educational and work opportunities as well as adventure for this young couple.

In the US, her husband studied Computer Science at the University of Texas at Austin. They started a family, having two daughters. During this time, Fidan supported her husband and even helped him with his homework. She became interested in computer science herself and decided to take classes at Austin Community College, initially virtually and then in person. The community college was a great transition back into being a student. She realized she was good at programming, but data science gave her the ability to also be innovative and creative.

In 2021, Fidan's husband accepted a job at Expedia, and they moved to Seattle. She started a Graduate Certificate in Computer Science at Seattle University to qualify for their MS in Computer Science with a Data Science specialization program. When Seattle University opened a new MS degree in Data Science, she enrolled in the first cohort for that program. The classes were all in person and in the evening, allowing students to work and be in the program at the same time. This allowed Fidan to care for her daughters, then aged 3 and 5, during the day and let her husband take on that role in the evening while she went to school. She attended full time and completed the program in two years, graduating in June 2023.

In Azerbaijan, women often choose less technical professions due to their underrepresentation in the technical fields. Fidan is grateful for the support her family and husband have given to her educational and work goals. Her kids brag that their mom is a scientist. She knows she has inspired and supported other women to get degrees and challenge themselves in their careers. It’s rewarding to be a role model for both her kids and others.

While studying, Fidan took every opportunity to gain practical experience in data science. She was part of a group that did a data science project for the astronomy department. Later, their work was published and presented at a conference of the American Astronomical Society. While getting her master's, she worked in the procurement office at Seattle University as a systems administrator and data analyst, creating dashboards and automated processes, which greatly helped the university manage its finances. Her previous experience in finance and computer science, combined with her new skills in data science, made her work very effective. She feels that having these projects on her resume helped get her recognized by recruiters.

Fidan has nothing but positive things to say about the program at Seattle University. Each professor took their students seriously, pushing them to their limits but being supportive at the same time. They arranged for social meetings to catch up and create a sense of community among students and faculty. Her capstone project was with Costco and gave her invaluable experience, using all the skills she had gained in the program.

Fidan started job hunting early in her master's program. She strongly advises not waiting until graduation to start the process. She would set aside time every day to apply for positions. After many applications and interviews, she had two offers: one at a startup as a machine learning engineer and one at Holland America Line as a Senior Strategic Analyst, which she accepted. Job hunting is hard work and you have to persevere. Sometimes you are over- or under-qualified. Sometimes you are sure a company will offer you a job because the interviews went so well, and then you don’t get the job and your self-esteem takes a hit. But you have to believe in yourself and keep going.

Fidan started at Holland America in December 2023. In her new role, she has worked on predictive modeling, forecasting, automating processes, and creating dashboards. All her past experience allowed her to get up to speed quickly, and her managers have been impressed with how quickly she adapted to her new job. Right now she’s working remotely, but she’s looking forward to working from the company’s Seattle office starting in September. Not long ago, she was applying for intern positions, but this summer she will be mentoring and leading a Data Scientist intern herself. She hopes to one day become a manager and have the chance to mentor others.

Fidan Aydamirova’s story is a testament to resilience, adaptability, and the pursuit of knowledge. She transformed challenges into opportunities and became a role model for others in her community. Her journey from Baku to a successful career in data science in the United States is an inspiration to many, demonstrating that with determination and support, significant achievements are possible.

Dana Lindquist
Meet a Data Professional: Mridula Polina

WiDS Puget Sound is excited to present the next entry in our series, “Meet a Data Professional!”

“Meet a Data Professional” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

Mridula Polina, Data Engineering Manager at Amazon Web Services (AWS) Finance Technology Team

After graduating with a B.E. in Computer Science from Osmania University in India, Mridula Polina was a Systems Engineer at Infosys, where she worked on various data engineering tasks before making her transition to the U.S. She first worked at a nonprofit organization called Community Center for Education Results, then went on to complete her master’s in Information Systems at the University of Washington. Upon graduation, she started working at Amazon as a Data Engineer, eventually moving to Technical Program Manager roles. Currently, she is the Data Engineering Manager at Amazon Web Services (AWS) Finance Technology Team. In this blog interview, she shares her experience navigating the career path in the data field as an Engineering Manager.

Q: You initially began your career as a Systems Engineer in India. How did you make the decision to pursue your graduate studies at UW MS in Management Information Systems and Services (MSIS)? And how did the MS program at UW help you navigate your career in data engineering?

My interests in data engineering first bloomed when I started working as a Systems Engineer at Infosys, where I was part of various projects in data warehousing, business intelligence and analytics.

At Infosys, I witnessed how large-scale data systems are built and implemented to drive business decisions, and how big data can be transformed into meaningful insights. I was intrigued by how companies leveraged large-scale data. I had always wanted to pursue a master’s degree at the intersection of technology and business. To bridge the gap between the two, I decided to apply to the MS in Information Systems at UW, which combined these fields in its curriculum.

When I first applied to the master’s program at UW, I did not get in. It was a humbling experience, and while it was a difficult time, I decided to join a nonprofit called Community Center for Education Results and worked on the Roadmap Project, which aimed to help underprivileged students in South King County get back into the education system. Here, as the only data engineer at the organization, I collected and analyzed data and built systems to facilitate different projects. After gaining experience working in this role, I applied again to UW and was able to get into the Master of Science in Information Systems (MSIS) program.

Q: At AWS, you have worked in various roles, from data engineer to technical program manager (TPM) and now data engineering manager. How did your roles and responsibilities change over time?

Upon earning my MS in Information Systems at UW, I joined Amazon as a Data Engineer. Since I had prior experience as a data engineer, this helped me get a foot in the door. When I got in, I started working in the AWS Enterprise business space in 2017, and at that time AWS was taking off and expanding rapidly in scale. It was an exciting period at AWS, and I grew a lot professionally during this time. 

One of the important takeaways from my experience as an entry-level engineer was that if you have a good idea, you need to actually build a product and present it for people to recognize the value of your idea. If you want to convince others, you have to work on developing clear deliverables.

As a Technical Program Manager (TPM), you work on multiple projects with different engineering teams. I was able to deliver important initiatives at AWS, and my managers saw potential in me to become a people manager. As a people manager, I love helping my team deliver and helping engineers develop their career paths. As an Individual Contributor (IC), you tend to work in depth on one problem within a narrower scope; as a manager, on the other hand, the focus is on looking at the bigger picture and problem solving.

At Amazon, you truly use data extensively - without data, you don’t make decisions. When you are a manager, I think the biggest differentiating factor is that you are responsible for the entire team that looks up to you, and you have a great responsibility to look out for both their individual needs and the team’s goals.

Q: How do you navigate your situation in balancing your work and family?

I would say that it is definitely not straightforward to navigate and balance your work responsibilities fully while also caring for young humans at home. In such situations, having a team that allows for flexibility and understanding of your needs is crucial. I believe it is important to ask for help and flexibility, and have leadership that is supportive and understanding of your needs.

Q: In your experience, do you think there are challenges that come with being a female leader? How did you address the difficulties? In your opinion, how does mentorship play a role in adjusting to the role or helping more women break into these positions?

I have observed that as you grow in your career, there are fewer women in leadership positions. While it is incredibly inspiring to learn from women leaders across the technology industry, I would like to see a higher number of women in leadership positions.

Early in my career, I received feedback that was not actionable; it was more about my personality than about anything that would help me grow in my role. What I wanted was constructive, actionable feedback that would help me grow in my career. I got advice from mentors who helped me get to my next goal in career development. Through skill development and by voicing my requests, I was able to experience various roles. That is why I believe it is important to find a mentor who can guide you through the gridlock situations you run into at times.

During my master’s, I had a mentor in a senior position at Amazon who helped me land a job at Amazon. I believe it is important to find a mentor you can look up to and who can offer you insights - don’t be afraid to reach out to people and ask them to be your mentor if you feel you can gain value from it. Getting advice from my mentors was crucial to growing in my roles, and I came to build confidence through their help and suggestions.

Disclaimer: Mridula is not speaking for Amazon. All opinions shared are her own.

hb gloria
Meet a Data Professional: Shanu Sushmita

WiDS Puget Sound is excited to present the next entry in our series, “Meet a Data Professional!”

“Meet a Data Professional” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

Shanu Sushmita, Assistant Teaching Professor at Khoury College of Computer Sciences, Northeastern University

“My happy place is in my classroom”, says Shanu.

Born and brought up in a small city in India, Shanu Sushmita is a first-generation PhD from her family. She draws inspiration from Buddha as she comes from the land where Buddha found his enlightenment. As a child, Shanu was interested in mathematics. After studying computer science, curiosity and her quest to learn more took her to Glasgow, Scotland where she did her PhD.

As an assistant professor at Khoury College of Computer Sciences, Northeastern University, Shanu shares her love of teaching and engaging with students and how it helped her navigate her career from heading a data science team to going back to teaching.

Let’s get to know her better and learn about her journey.

After earning an undergraduate degree in computer science in India, Shanu worked as a research assistant while completing her MTech at IIT Delhi. Her passion for mathematics, problem-solving, research, and AI propelled her towards a doctoral degree in information retrieval at the University of Glasgow, Scotland. While doing her PhD, she got an offer from UCLA to be a visiting research scholar, where she explored ways to disambiguate author profiles in digital libraries. Her focus was on building an optimized method for this task.

During her PhD, she investigated users’ search behavior in online health information. The goal was to examine users’ preferences for the type of search results (image, news, video, etc.). In 2012, her doctoral work was recognized as among the most interesting of the year by the ACM SIGIR newsletter.

Her first job after her doctorate was at the University of Washington where she joined as a post-doctoral research scientist and worked on several projects. She led a graduate student team in a personality prediction project, where the focus was on predicting the personality type of YouTube video bloggers. During this period, she also worked on various healthcare analytics projects to find solutions to problems in healthcare settings like estimating the future healthcare cost for individuals based on their past medical and cost information. Various data mining and machine learning methods were used to predict the healthcare costs of the population and individuals based on their prior history of medical and claims records.

She also worked on building prediction models for the risk of hospital readmission, length of stay, and mortality. The hospital readmission rate within 30 days post-discharge is a widely acknowledged metric for healthcare quality and expenditure in the United States. Estimating hospitalization costs together with the 30-day readmission risk offers added value for accountable care, a global concern and a cornerstone of the US government's mandate under the Affordable Care Act. Recent data mining efforts typically focus on either predicting healthcare costs or the risk of hospital readmission, but rarely both. In one paper, Shanu and her team introduced a dual predictive modeling approach that uses healthcare data to forecast both the risk and cost of any hospital readmission (referred to as "all-cause"). To do this, they explored machine learning algorithms for predicting healthcare costs and the risk of 30-day readmission. Their risk prediction results for "all-cause" readmission showed promise when compared to the standardized readmission tool (LACE). Furthermore, the techniques proposed for cost prediction consistently outperformed baseline models, with significantly lower mean absolute error (MAE).
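For readers new to the metric, here is a minimal sketch of how mean absolute error compares a cost model against a baseline. The cost figures, the predictions, and the choice of "always predict the average cost" as the baseline are all made up for illustration; they are not taken from Shanu's paper.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical per-patient costs in dollars (illustrative only).
actual_costs = [1200.0, 300.0, 4500.0, 800.0]

# Assumed baseline: predict the average cost for every patient.
average_cost = sum(actual_costs) / len(actual_costs)
baseline_predictions = [average_cost] * len(actual_costs)

# Hypothetical predictions from a trained model.
model_predictions = [1000.0, 400.0, 4000.0, 900.0]

baseline_mae = mean_absolute_error(actual_costs, baseline_predictions)
model_mae = mean_absolute_error(actual_costs, model_predictions)
print(f"baseline MAE: {baseline_mae:.2f}, model MAE: {model_mae:.2f}")
```

A lower MAE means the predictions are, on average, closer to the real costs, which is the sense in which a model "outperforms" a baseline.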

When a colleague approached Shanu with an offer to serve as a senior consultant for his new company, Shanu enthusiastically accepted, firmly believing that every opportunity holds value. According to her philosophy, "No learning is ever wasted," as there's always something to be gained from each experience. Thus, she joined KenSci as a senior research consultant, providing strategic guidance to the data science team. Her role involved devising machine learning solutions for various challenges in healthcare, including predicting hospital readmission risks, estimating healthcare costs, and detecting fraud in healthcare claims data.

Later, Shanu was offered a full-time position at the company where she was asked to lead the data science team. With her constant thirst for knowledge and curiosity, Shanu eagerly accepted this opportunity. Beginning with just one data scientist, she gradually built up the team to twelve members within three years. Reflecting on her experience, Shanu reveals that her greatest challenge lay in shifting her perspective to align with the client's needs. Unlike in research, where the focus is on delivering the best results possible, in the corporate world, she had to learn to manage tight deadlines and prioritize delivering results, even if they weren't perfect.

Shanu stresses the significance of enjoying data storytelling and data exploration for aspiring data scientists. She points out that storytelling is a skill that takes time to develop, as it requires hands-on experience with data-driven projects. Additionally, she underscores the importance of effectively communicating findings to both technical and non-technical audiences, highlighting storytelling as a crucial yet often overlooked aspect of data science.

According to Shanu, storytelling is not only vital but also undervalued in the field of data science. She encourages her students to embrace the challenges of working with data and to approach it fearlessly. She believes that maintaining a sense of curiosity and a drive to uncover patterns and relationships within data is essential for success in this field. Shanu advises her students to use various resources, such as AWS, conferences, and Kaggle data challenges, to hone their skills. She emphasizes the limitations of classroom learning alone, stressing the importance of applying concepts to real-life problems and experimenting with algorithms across different types of data. In her view, the more practical experience one gains, the deeper their understanding of data science concepts becomes.

Shanu expresses her enthusiasm and inquisitiveness about ChatGPT and the development of large language models (LLMs) as avenues for expanding her knowledge. With her team, she built a model called MUGC (Machine Generated vs User Generated Content Detection), which can detect whether a text was written by a human or by ChatGPT. She and her team performed a comparative evaluation of eight traditional machine-learning algorithms to distinguish between machine-generated and human-generated data across three diverse datasets: Poems, Abstracts, and Essays. The results indicated that traditional machine learning can achieve a high level of accuracy in identifying machine-generated data, reflecting the documented effectiveness of popular pre-trained models like RoBERT. They found that machine-generated texts tend to be shorter and exhibit less word variety compared to human-generated content. Furthermore, readability, bias, moral, and affect comparisons revealed a discernible contrast between machine-generated and human-generated content. There are variations in expression styles and potentially underlying biases in the data sources (human and machine-generated). This study provides valuable insights into the advancing capacities and challenges associated with machine-generated content across various domains.
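To make the "shorter, with less word variety" finding concrete, here is a small sketch of how those two properties could be measured for a piece of text. Using the ratio of unique words to total words as the measure of variety is an assumption made for illustration; the post does not say which measures the MUGC study used, and the sample sentences are invented.

```python
def length_and_variety(text):
    """Return (word count, share of words that are unique) for a text."""
    words = text.lower().split()
    if not words:
        return 0, 0.0
    return len(words), len(set(words)) / len(words)


# Invented sample sentences, not drawn from the MUGC datasets.
samples = {
    "repetitive": "the cat saw the dog",
    "varied": "a quiet fox watched seven restless geese",
}

for label, text in samples.items():
    count, variety = length_and_variety(text)
    print(f"{label}: {count} words, variety {variety:.2f}")
```

Simple numbers like these are the kind of input a traditional machine-learning classifier can work with, alongside richer features.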

Not being a social media savvy person, Shanu enjoys reading conference papers and listening to podcasts and seminars to keep herself updated in the field of data science. She firmly believes that curiosity is paramount in this field, especially given the constant evolution of technology. For her, it's fascinating to learn about the problems others are tackling and the challenges they face. In addition to her passion for data science, Shanu's interest in psychology led her to explore a project on the psychological impact of music on young minds. She recognizes the profound effect music can have on children and was drawn to investigate further. As a mother, she worried that the music her children were listening to could have a significant impact on their growth and development.

Music has a profound impact on our lives, bringing people together, enhancing health and well-being, providing a creative outlet, and more. Most significantly, music influences our emotions and brain function, activating some of the most extensive and diverse networks in the brain. The amount of time that children and adolescents spend listening to various forms of music has steadily increased over the years. Consequently, the influence of music on these demographics can be significant. As children develop their personal identities, they often imitate the behaviors and language of musical role models. However, some themes presented in song lyrics raise concerns, which are also recognized by the American Academy of Child and Adolescent Psychiatry (AACAP). Certain themes frequently found in song lyrics can be particularly troubling, such as the glamorization of drug and alcohol abuse, suicide as an "alternative" or "solution," graphic violence, and sex that may focus on control, devaluation of women, or violence toward women. Therefore, she felt that it was important to find solutions embedded within online music platforms and virtual home assistants (Google Home, Alexa, etc.) to empower parents like her to make the right music choices for their children.

Reflecting on her journey, Shanu admits that while she didn't envision herself becoming a professor during her formative years, she always relished the opportunity to explain concepts and share her knowledge with others. Unfazed by public speaking, she embraced every opportunity that came her way, navigating her path one step at a time. After spending over fourteen years in the field of data science, Shanu draws her motivation and inspiration from the meaningful impact her work has on people's lives. To her, data science is more than just numbers; it's about telling stories that resonate with real experiences and emotions. She says, “Data Science is the art of storytelling through data.”

hb gloria
Meet a Data Scientist: Tatiana Maravina

WiDS Puget Sound is excited to present the next entry in our series, “Meet a Data Scientist!”

“Meet a Data Scientist” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

Tatiana Maravina, Director, Machine Learning Science, Expedia Group

Tatiana Maravina's journey into data science began with a natural affinity for mathematics. Growing up in Russia, she pursued an undergraduate degree in mathematical statistics and computer science at Lomonosov Moscow State University. She remembers taking her first statistics class and reflecting on how different the subject was from abstract math, which had always come easily to her; it was so practical. Coincidentally, it was also the class where Tatiana got her lowest grade.

Seeking further specialization, Tatiana moved to the United States to earn her PhD in Statistics at the University of Washington. While her earlier studies in Russia focused heavily on advanced statistical theory and mathematical analysis, sharpening her analytical abilities but lacking direct preparation for applied data science work, Tatiana found the courses at UW struck a balance between theory and practical applications. "I was surprised by the practical approach of the courses at UW," Tatiana explained. "The various applied statistics courses at both the Masters and PhD level were immensely valuable, providing knowledge I continue to apply in my work to this day." The doctorate program equipped Tatiana with strong theoretical foundations in statistics and mathematics, complemented by training that allowed her to develop intuition for using science to solve real-world business problems.

While working on her PhD, Tatiana joined Boeing’s Applied Mathematics group within Boeing Research & Technology (BR&T). Starting as an intern, she later became a full-time applied statistician. Tatiana enjoyed applying her statistical knowledge to a wide range of complex engineering challenges in manufacturing, quality control, materials sciences, reliability engineering, and structural engineering.

Tatiana further honed her skills at SpaceX, diving into statistical reliability engineering and accelerated life testing, developing automated anomaly detection algorithms for satellite data, and analyzing on-orbit test data from Starlink Demo satellites. However, she still sought exposure to big data and to deploying models into production at scale.

In 2019, Tatiana joined Expedia Group, working as an individual contributor for the first few years. She applied her statistical expertise to develop A/B testing methodologies, prediction models, performance evaluation techniques, and methods for marketing capital allocation. One of the projects she was working on was the booking cancellation prediction model, which became very relevant during COVID. After gaining experience, she transitioned to a technical lead role managing a team that has grown from two to four data scientists. 

Another notable project Tatiana worked on was developing predictive models for customer lifetime value (CLV), which estimate the future cash flows from each customer over a long-term horizon. Accurate CLV predictions enable better business decisions around customer acquisition, retention strategies, marketing investments, and more. The CLV system uses gradient-boosted tree models to predict an individual customer's future value as a complex function of numerous input features representing the customer's past purchase behavior and engagement. The CLV model leverages data from multiple Expedia Group brands and lines of business, including stays, flights, packages, cars, and cruises. Over 200 engineered input features are divided into two main categories: bookings and engagement. The engagement features capture the customer's interactions with Expedia Group brands outside of bookings, such as engagement with marketing emails (e.g., number of clicks in the last 3 months) and loyalty program tiers. Customers are first categorized into five geographical regions and then grouped within each region based on metrics like booking recency and frequency, resulting in a total of 30 segments. Each segment is modeled individually, with a dedicated CatBoost model trained for it. The models are retrained monthly, and CLV predictions for hundreds of millions of customers are updated daily to fuel various business intelligence applications. Here is a Medium blog with more details about this project.
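The segment-then-model structure described above can be sketched in a few lines. This is an illustrative outline only: the region names, the recency and frequency thresholds, and the field names are all hypothetical, and a per-segment average stands in for the dedicated CatBoost model so that the example runs with the standard library alone.

```python
from collections import defaultdict
from statistics import mean

# Placeholder region names; the article only says there are five regions.
REGIONS = [f"region_{i}" for i in range(1, 6)]
RECENCY_GROUPS = ["recent", "lapsing", "dormant"]
FREQUENCY_GROUPS = ["frequent", "occasional"]


def behavior_group(days_since_last_booking: int, bookings_last_year: int) -> str:
    """Bucket a customer by booking recency and frequency (hypothetical thresholds)."""
    if days_since_last_booking <= 90:
        recency = "recent"
    elif days_since_last_booking <= 365:
        recency = "lapsing"
    else:
        recency = "dormant"
    frequency = "frequent" if bookings_last_year >= 3 else "occasional"
    return f"{recency}_{frequency}"


def segment_key(customer: dict) -> tuple:
    """A segment is a (region, behavior group) pair: 5 x 6 = 30 segments."""
    group = behavior_group(
        customer["days_since_last_booking"], customer["bookings_last_year"]
    )
    return (customer["region"], group)


def train_segment_models(history: list) -> dict:
    """Fit one stand-in 'model' per segment: the mean observed future value.

    In the system described in the article, this is where a gradient-boosted
    tree model would be trained on each segment's engineered features.
    """
    values_by_segment = defaultdict(list)
    for customer in history:
        values_by_segment[segment_key(customer)].append(customer["future_value"])
    return {seg: mean(vals) for seg, vals in values_by_segment.items()}


def predict_clv(models: dict, customer: dict, default: float = 0.0) -> float:
    """Route a customer to their segment's model; fall back to a default."""
    return models.get(segment_key(customer), default)


all_segments = [
    (region, f"{r}_{f}")
    for region in REGIONS
    for r in RECENCY_GROUPS
    for f in FREQUENCY_GROUPS
]

history = [
    {"region": "region_1", "days_since_last_booking": 30,
     "bookings_last_year": 4, "future_value": 500.0},
    {"region": "region_1", "days_since_last_booking": 60,
     "bookings_last_year": 5, "future_value": 700.0},
    {"region": "region_2", "days_since_last_booking": 400,
     "bookings_last_year": 0, "future_value": 50.0},
]
models = train_segment_models(history)
new_customer = {"region": "region_1", "days_since_last_booking": 10,
                "bookings_last_year": 3}
prediction = predict_clv(models, new_customer)
print(len(all_segments), prediction)
```

Keeping segment assignment in one function means training and daily scoring route customers identically, which matters when models are retrained on a different schedule from the one on which predictions are refreshed.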

When asked how she stays up to date with the developments in the field of data science, Tatiana recalled the Lewis Carroll quote “It takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”

"The field is evolving so rapidly that the learning never stops," Tatiana added. "I learn so much from my colleagues at Expedia Group, both informally and through days of learning, which are a great opportunity to share knowledge." She noted that Expedia Group nurtures a culture of formal and informal knowledge exchange with courses, seminars and dedicated monthly days of learning.

Throughout her career, Tatiana has cultivated a diverse skill set at the intersection of mathematics, statistics, computer science, and business acumen. She emphasizes the importance of clear communication as a critical skill for having impact and driving action in any data science role.

Tatiana has had to hone the ability to explain complex technical concepts clearly to diverse audiences, from engineers to business leaders.

She pointed out several main aspects of communication:

·  Clearly explaining the work, main learnings, insights, and any important data limitations to business stakeholders at a high level so that both technical and non-technical audiences get a good understanding. She recommends first thinking about the core story and main messages, knowing (or learning, if possible) the audience's background, and structuring content accordingly within the given time, while being mindful of the tendency to go deep into technical details. In her experience, if the key messages are clear and the high-level context is set, the Q&A section provides a great opportunity to drill deeper into technical specifics for those interested.

·  Day-to-day communication matters as well. She noted that, especially in a stressful situation, there can be a tendency to look for a perfect solution before updating anyone, and stressed how important it is to keep stakeholders informed while searching for the solution. Sharing work that might not be perfect, soliciting early feedback, and sending regular updates can be very helpful in keeping the project on track and aligned with the business goals.

·  A more challenging aspect of communication is influencing and convincing others of the benefits of one approach over the other, especially when there are differing opinions. This requires skills in persuasion, negotiation and getting stakeholder buy-in.

"My journey underscores that data science is a continually evolving craft at the intersection of mathematics, statistics, computer science, and business acumen," Tatiana reflected. "The most fulfilling aspect is the opportunity to always be learning and expanding my capabilities in tandem with the rapid pace of innovation in this field."

Valentina Komarova
Meet a Data Scientist: Lucy Zou

WiDS Puget Sound and Data Circles are excited to present the next entry in our series, “Meet a Data Scientist!”

“Meet a Data Scientist” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound's mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

“As a Data Scientist, you can build a model that improves offline metrics by 20% but you’ll still have to convince your team that it’s effective and needs to be put on the website,” says Lucy Zou, a Machine Learning Scientist III at Expedia Group (EG) in Seattle, Washington. “And you have to adapt your material to different audiences.”

Zou's words encapsulate the multifaceted nature of a data scientist's role in the field’s dynamic landscape. She has worked in data science for three years and has seen her own models deployed live on EG’s website. These models mark an important shift in how the travel booking company engages with customers. Let's delve deeper into Lucy's journey and glean insights from her education and career.

Like many in the field of data science, Zou did not start her career in machine learning. She grew up in Shanghai, China, and came to the US as a high school cultural exchange student, initially spending a year in Virginia and then a year in Palm Desert, California. In high school, she discovered that working with numbers was especially rewarding. After graduating, she left the 120-degree temperatures and sudden sandstorms of Palm Desert to return east, where she earned degrees in Accounting and Finance, with a minor in Mathematics, at Georgetown University in Washington DC.

Shortly after graduation, Zou began her career at Deloitte, where she worked on a Risk Advisory Team in Washington DC, consulting clients on IT database applications. This experience helped cultivate Zou’s communication skills, which allow her to engage successfully with diverse stakeholders today. After a year at Deloitte, Zou returned to university to pursue a master’s degree in Computational Analytics at Georgia Institute of Technology in Atlanta. She sought to merge her passions for technology and business with mathematics, and was excited to dip her toes into Deep Learning and Natural Language Processing.

During her studies, Zou gained valuable experience through internships in Strategic Analytics at Intercontinental Exchange. Shortly after graduating with her master’s degree, she accepted a position at EG and moved west to Seattle. She is glad she pivoted toward data science and she says it shows that it’s never too late to change what you’re doing if something else really sparks your interest. 

At EG, Zou is part of the Recommendations Team, which is responsible for implementing machine learning models to provide personalized recommendations to customers. She is instrumental in improving EG's recommendation system. The company is moving away from older models that relied on popularity to recommend cars and hotels, to more complex models that factor in user click-history, geographic area of interest, traveler preferences and other variables. These new models provide a more seamless and personalized user experience. They use embeddings, matrix factorization, tree-based algorithms and neural networks to now recommend new properties, destinations, flight bundles and activities that touch on different parts of a traveler’s journey.

Like many in the data science field, Zou works closely with a variety of stakeholders including product managers, platform engineers, machine learning engineers, and business stakeholders. “It’s really collaborative,” she says. “I love it and I’m really passionate about it. The collaboration is one of the main reasons I wanted to go into this field.” 

One of Zou's key contributions to her team has been her ability to visualize and communicate the impact of their machine learning models effectively. By using maps to illustrate how recommendations adapt to user preferences in real time, she has been able to demonstrate model uplift and garner support for their implementation.

Zou is committed to continuous learning and development, which is also part of EG’s company culture. She takes advantage of opportunities such as weekly paper reading sessions with colleagues, as well as the monthly Day of Learning, when the Data & AI team’s employees are encouraged to set down their work and invest time in learning new concepts and technologies. “It’s possible to read through a paper in 20 minutes, but it takes longer to really understand it. The Day of Learning is good for taking time to work with new concepts.”

She also stays informed through research papers, blogs (she’s a fan of The Batch by DeepLearning.AI), and online course platforms like Udemy and Coursera. Many companies, including EG, offer access to learning platforms such as these, and Zou takes advantage of these benefits. She most recently earned a certificate in Deep Learning through Coursera. Zou says that YouTube tutorials are another great resource for digging into specific topics.

Although Zou is already proficient at communicating complex ideas, she continues to refine her communication skills, which are important in data science. She emphasizes that practice is key and that improving presentation and communication abilities is an ongoing process. Every bit helps. To this end, she actively participates in conferences, where she also gets a chance to learn about new models, best practices, and trends in academia and industry. Notably, she attended RecSys, a conference dedicated to recommender systems, where she gained insights into how other companies integrate recommender systems into their day-to-day operations.

Zou places a high value on mentoring and knowledge-sharing as integral components of career development. By mentoring interns at EG through its Buddy program and actively seeking opportunities to learn from her colleagues, she fosters a culture of collaboration and continuous learning. Recognizing the importance of developing management skills early in her career, Zou emphasizes the role of mentorship in her professional growth, underscoring its significance in shaping her journey as a data scientist. These opportunities for greater responsibility help Zou stay organized, manage timelines, set clear, tangible deliverables, and communicate updates to her team more often. She is a strong advocate for frequent and timely communication about updates to models and project details.

Looking ahead, Zou sees team management as a potential path in her career but remains open to exploring different opportunities. She recognizes that, as an individual contributor, she loves finding solutions for different use-cases, reading research papers, exploring cutting-edge machine learning techniques, and seeing what colleagues are doing. She is aware that many data scientists come to a point where they make the decision to focus mostly on being individual contributors or becoming people managers. “It’s hard to make that decision if you don’t know enough about both roles, so having opportunities to learn about both will help you decide well when that time comes.” And if others are interested in management experiences, Zou encourages them to get a good taste of what it is like to be a people manager. To gain these experiences, Zou encourages others to talk to their higher-ups, voice the desire to take on more ownership of projects, more responsibility, and additional mentoring opportunities. 

When asked how she describes data science, Zou says, “Data is all around us. If someone asked me 5-10 years ago what data science was, I would have answered it is a way to scientifically analyze data. Now I know it is so much more. For instance, when you go into a store, data informs where products are placed, how they are selected, how many are stocked and how they are priced.” She says that generative AI (like ChatGPT) has been a disruptive force recently. It has been adopted widely and its development is happening very rapidly, which has helped many more companies personalize customer experiences, including small businesses. “There is a lot of potential for boosted business revenue, but that’s not the extent of it.” She says, “Data science is a way to extract meaningful insight and knowledge from many different scenarios, from structured and unstructured data, identify trends, behaviors, and risk factors and help everyone make data-driven decisions in competitive landscapes. It really is a way for us to understand the world around us.”

As an enthusiastic traveler, Zou is happy to work for EG because it offers her opportunities to explore new destinations and experiences. For her next non-data science adventure, she’s hoping to secure tickets and travel to a EuroCup soccer game in Germany this summer.

Speed Mentoring Success

Last night was our first event of the year, and we had an amazing time! Thanks so much to our mentors, volunteers, and the attendees who came out for our mentoring event. And extra thanks to the team at Northeastern University Seattle for hosting!

We had 13 mentors and 50-60 guests. It was a great ratio for small group chats. We rotated every ten minutes so attendees could chat with 5 or 6 mentors throughout the evening. I think I speak for both mentors and attendees when I say that 10 minutes is not nearly enough. That said, I felt like I started some really great conversations. I hope to hear from any attendees who want to continue these chats! (You may have to remind me of the topic, because things have already started to blur together.)

 

The one and only moment I paused to take a picture.

 

Aside from our 2023 conference, this was our first in-person event since February of 2020. We are excited to get back into a routine of smaller, year-round events to keep our community active in between conferences. It was amazing to see so many new and familiar faces.

This event was a full-circle moment for me, personally. Our last in-person event before Covid was also speed mentoring, in a very similar format. I was a brand new volunteer with WiDS Puget Sound, and I attended the event to seek advice about finding my first job in tech. Four years later, I ended up filling in for a mentor who was sick. I remember vividly what it was like to be brand new to this industry, so it was somewhat surprising to find that I actually had answers to so many questions that came up. If you doubt your own ability to mentor others, I encourage you to give it a shot. Perhaps there is value in offering guidance while the perspective of being a beginner is still so fresh.

A few recurring questions came up, and a few answers were left incomplete, so I will throw in a few tips below.


Q: When transitioning from another career, how much information should my resume include about my past work experience?
A: The amount of information depends on how relevant your previous job experiences are to the job you want. It is worth including jobs from your recent past, but make sure every bullet point contains information directly relevant to the job you want; focus on transferable skills! Be concise.


Q: How essential is networking? What kinds of networking activities are most worthwhile?

A: It is hard to measure, but engaging in some form of networking can have a big impact on your career. Even if it doesn’t get you your next job, it can teach you about what jobs are out there and what it is like to work for different companies.
When I was job searching, I focused on two forms of networking: educational events and volunteering. I went to events where I would learn something useful, so even if I made no great connections, it would still be a good use of my time. I also (obviously) got involved with WiDS Puget Sound. By helping to organize these events, I have made my most valuable career connections. It is not instantaneous; I have given a lot of energy to the team, and I have gained meaningful relationships as a result. If you want to get involved, start up conversations with the organizers when you go to events. Ask if they are run by volunteers or if the event is organized by a company.
Local groups that may have volunteer opportunities, and definitely have educational & networking events: Puget Sound Programming Python (PuPPy), PyCascades, WiDS Puget Sound (yes, I’m biased), SeattlePyLadies, R-Ladies Seattle, PyData Seattle, Women Techmakers Seattle

Q: When I meet people in the industry, what should I talk about?

A: I had three go-to questions:
(1) Do you have any tips for catching the attention of a recruiter from your company?
(2) Is there a centralized department for data scientists, or are they dispersed throughout various teams at the company?
(3) How did you initially get started at your company? Did you cold-apply? Or did you know someone at the company?

I hope this event proved useful, and would love to hear feedback from anyone!

Thanks for your time.

Cheers,

Kelly

Co-chair of WiDS Puget Sound


Kelly Stroh
Meet a Data Scientist: Frederike Dubeau

WiDS Puget Sound and Data Circles are excited to present the next entry in our series, “Meet a Data Scientist!”

“Meet a Data Scientist” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

Frederike Dubeau was an NCAA Division I athlete who started her business career at PACCAR, a Fortune 500 company headquartered in the Seattle, Washington area. PACCAR partially funded Dubeau’s Master’s Degree in Predictive Analytics, and Dubeau, in turn, applied what she learned directly to PACCAR’s operations, transforming the way the company did business. She helped establish a Data Science Center of Excellence at PACCAR and went on to become a Manager of Data Science at Logic20/20, a business and technology consulting firm where she currently works.

“Communication is the most important part of data science,” says Frederike Dubeau, Manager of Data Science at Logic20/20, a business and technology consulting firm headquartered in Seattle, Washington. “If you’re able to present complicated topics in a clear and concise way to people who have less familiarity, and convey the benefits of a solution, that’s crucial.” Dubeau knows this because her ability to communicate the value of data has been a critical driver of her career.

Dubeau started as an intern about 10 years ago in the Purchasing department at PACCAR, a Fortune 500 company and one of the world’s largest manufacturers of medium- and heavy-duty trucks. She had just earned a Bachelor’s Degree in Business Administration with a focus in Supply Chain Management from Cleveland State University in Ohio. While she had played Division I soccer throughout college, a choice that had provided her many valuable experiences, she had not had the opportunity to do internships as an undergraduate and find her professional passion. Fortunately, her bachelor’s program, which was statistics-heavy and focused on topics like demand forecasting, struck a strong chord with Dubeau. The PACCAR internship confirmed that she really wanted to focus her career on data. 

Dubeau accepted a full-time job in the Materials department at PACCAR Parts, the aftermarket parts division of PACCAR, in a role that focused on reporting inventory KPIs. In this position, she was happy to work with data, but felt that more could be achieved with the application of data science techniques she had learned about. The challenge was, however, that the 26,000-employee company had no formal experience with data science applications. Fortunately, Dubeau was surrounded by great leaders at PACCAR who gave her opportunities to showcase the power of data to others across the division.  

As she looked to further her education, Dubeau remembered one of her undergraduate professors, who had started each class with interesting examples from the news of how data and statistics were being used to solve a diverse set of tangible, real-world problems. In those moments, she was wowed. “Data!” she’d thought, “Applying those approaches to the real world could be cool!” She chose to get a Master’s Degree in Predictive Analytics from Northwestern University, instead of the more traditional MBA route that many of her cohort took. She studied after her working hours, and PACCAR supported this choice by covering half of her tuition costs.

The investment turned out to be good for both parties because Dubeau pursued topics that inspired her, and then she applied her new knowledge at PACCAR, transforming the way it did business. Dubeau says,

“I was using what I was learning in school in my day-to-day with my team. I was showing there was a different path. Leadership was also seeing the value that data-driven approaches could bring.”  

Dubeau stands behind the open door of a shiny white and baby-blue Peterbilt truck. She's smiling through the open window. The truck is parked in a warehouse with tall shelves containing many boxes behind it. A sign shows the truck's EV model number.

Dubeau at a PACCAR shareholder meeting.

When she graduated with her Master’s Degree in 2017, she looked for other teams who were using predictive analytics at PACCAR. She discovered that PACCAR’s IT Division had a plan to start a data science Center of Excellence (CoE). When she had the chance, Dubeau made the move and joined the four-person CoE team. Initially, the CoE team acted as an internal consulting group, approaching different divisions of the company (which include, among others, Peterbilt, Kenworth, PACCAR Parts, and DAF Trucks) with novel and powerful analytics applications. The team demonstrated that they understood the business side of the problems each division faced and they were able to scope advanced solutions. In those moments, she says,

“Defining the business value was key. That was the main driver for continued investment in the group and growing the team across the organization. With data science, it’s not just Oh great you have a model that can predict x, y, z; it’s also How do you put it into production, or use those insights?” 

The team developed insights, put solution after solution into production, and grew. Within four years, the team of four grew to about thirty people across multiple divisions. They operated using a hub-and-spoke model, with the CoE as the hub and smaller teams in the divisions as the spokes. Team growth was a challenge. “Being a manufacturing company in the Seattle area, competing with big tech firms for analytics talent was tough.” Once again, she relied on her ability to communicate the impact these technologies could have at PACCAR.

“Many times, people with non-traditional backgrounds, who had curiosity and capability, were the ones we were able to convert toward data science roles. A lot of times, for these projects, the business or engineering backgrounds they had or other experience were really helpful.”

As an early member of the CoE during the high growth phase, Dubeau helped establish data science best practices for PACCAR and defined the career paths for the Data Scientists at the company.

About 18 months ago, Dubeau left PACCAR and joined Logic20/20. She is now a manager in the Advanced Analytics practice and delivers data science solutions to a new set of customers.

In her current role, she works with southern California utility companies, helping to prevent electricity outages and wildfires that result from trees coming into contact with power lines. While wildfires are a big problem in California, vegetation management is a problem for utilities across the planet. Dubeau says, “A ton of work goes into managing crews and inspections, trimming vegetation, and follow-up work.” She explores data related to tree growth patterns to help clients understand which areas, at which times of the year, should get prioritized across service territories. 

Dubeau, wearing a hard hat and high-visibility vest, stands below power lines under a clear blue sky. A herd of white and black goats lies in a brown field behind her. In the distance, palm trees and other large, bushier trees are visible.

Dubeau on a field visit to a project site, inspecting southern California utility lines

So how does one cultivate the good communication Dubeau sees as essential? She says,

“I’m an athlete. Practice is what I always did, so I promote that, which can be harder now with remote work. One-off conversations, or talks with larger audiences, like presentations to management or at conferences, are all great opportunities. It’s always uncomfortable to present. As with anything, by doing it more, you get more comfortable, and you get better.”

She suggests even finding small ways to connect professionally and communicate, like describing your work to colleagues not directly connected to your projects, or talking about your work with other people in your network. She remembers having a manager who said,

“If you can’t explain this to me in a simple way, you probably need to learn more about it.”

While she worries that may come off a bit harsh, she does believe it is true. Dubeau says she has worked with difficult stakeholders in the past.

“The biggest thing is educating [stakeholders] and showing the concepts in a digestible way, but also clearly defining the measures of success for what they are trying to drive forward. If you lay that out first, and show you are doing this because you are moving that metric from a to b, and this is how we will do it. That’s important.”

As consultants, data scientists can expect projects to be well defined within a statement of work. Within a larger organization that may not yet have established data science methods, however, objectives may be less clear, and things can get messy.

“Defining the solution, documenting the delivery items, and laying out a road map with clear delivery dates helps projects stay on track. Early in my data science career, I had projects that went on forever because it was not clear what the measurements of success were.”

All of this boils down to practicing and getting good at clear communication.

Another helpful carry-over from her background as a soccer player is that Dubeau sees data science as a team sport.

“You must have collaboration. The data engineers get the data to the right place and in the right form. If data is not accessible or not reliable, the data scientists can’t easily pick it up. The data scientists explore and create the models, and then cloud engineers put the models into production in a scalable way. It is a collaboration, not just with the technical pieces, but also with the business owners. Understanding how to scope a data science project is crucial because there are so many different pieces along the way and it can fail if you don’t have buy-in from the person who owns the process.” 

Dubeau keeps up with what’s happening in data science by keeping her certificates current and learning new technologies. Her recent work focused on GIS data, so she’s excited about Amazon SageMaker geospatial capabilities. She also sees the projects on which she consults as opportunities to learn more about a variety of industries and the world. For example, in her current role, she’s learned a lot about government regulations on privately owned public utilities in the state of California.

Clearly, her roles have varied dramatically, so how does Dubeau describe data science to people? She says it comes down to using data to understand patterns, being able to process information, and finding ways to optimize current processes “…to do what you’re doing, only better or on a larger scale, to save time and reduce risk.”

When Dubeau speaks with people who are considering data science as a career, she highlights the many areas on which a person can focus. For example, you can be super technical, with an in-depth understanding of the algorithms and underlying math, or you can (like her) understand business needs along with what is technically possible, bridging the technical and business teams. There are many other flavors of data scientist, though, and Dubeau admits that the infamous imposter syndrome does exist. To that, she says,

“Find the skills you’re good at and lean into those while also staying on top of the technology. For me, I’m an extrovert and I’m curious. I like to talk to people and understand problems, then come back to my team and convert those into technical pieces of work… and I’ve learned that, on a day-to-day basis, I need that social part!”

Dubeau has found ways in her data science career to combine her passion for data and for people, to communicate data science’s strengths and solutions, to bring value, and to make a difference in the companies where she’s worked and in the communities they affect.

Jenica Andersen
Meet a Data Scientist: Dr. Joyce Cahoon

WiDS Puget Sound and Data Circles are excited to present the next entry in our series, “Meet a Data Scientist!”

“Meet a Data Scientist” is dedicated to recognizing the amazing women powering the Puget Sound area’s data science community, spotlighting their journey into the field, their incredible accomplishments, and the weighty challenges that they faced along the way. This lies at the heart of WiDS Puget Sound and Data Circles’ mission of inspiring women to enter the data science field by showcasing its many incredible role models.

Do you know any marvelous women in data science? Send us a tip here!

Joyce Cahoon, Senior Data Scientist at Microsoft

Joyce Cahoon, PhD, a Senior Data Scientist at Microsoft, has roots in entrepreneurship and is a supporter of open-source sharing online. She is a high achiever who spent a year on Wall Street, did some soul searching, and landed in a role she loves, using large language models at Microsoft’s Gray Systems Lab.

“I’ve always been a data scientist,” says Joyce Cahoon, PhD, a Senior Data Scientist at Microsoft. “At the end of the day, I wanted to make decisions based on evidence, not just based on the last anecdote I’d heard.” Dr. Cahoon remembers, throughout her life, always wanting to see the larger data. After graduating from Duke University with bachelor’s degrees in biomedical engineering and economics, she worked for a year on Wall Street as an investment banker. The lifestyle on Wall Street was not a great fit for her, and she elected to pursue other roles.

Entrepreneurship, a passion of hers from earlier in life when she’d participated in startup competitions, is where she turned her attention next. She joined a startup called SMSmart, which made an app that allowed users to access sites like Yelp, Yahoo Finance, and Google Maps using just text messages, without internet access. Her team applied to Y Combinator (the company that coaches and funds early-stage tech startups). They advanced quite far in the competition and found they had a steady user base (surprisingly, in the trucking and transportation industry), but ultimately they disbanded when they were not selected in the final rounds for funding. After that, she tried her hand at bartending and other roles.

“When you’re a college student and high achiever, you want a straight path to success, but this was a period of soul searching. At some point, though, you look back and the dots connect; they make sense.”

She says that at the time, these different experiences felt like getting outside of herself. Then, in 2015 she applied to the Statistics PhD program at North Carolina State University, was accepted, and things began to coalesce.

As a graduate student, Dr. Cahoon held multiple internships. One particularly formative experience was her internship as a software developer at RStudio, which makes the open-source IDE for R. The experience was impactful because of the contributions she made to the open-source community. “I was able to give back,” she says. R is used heavily in academia, especially in the field of statistics, and contributing to the development and evolution of the open-source tool felt meaningful to Dr. Cahoon. A mentor there, Max Kuhn, a PhD statistician and software engineer, inspired her with his approach to sharing his life’s work at no cost. “It’s great because it’s like accessing the contents of his mind,” Dr. Cahoon said in reference to his insightful and widely read book, Applied Predictive Modeling, which (along with other books of his) is available online for free. She asked him why he made it free, and his answer still resonates with her. “That’s one of the great things about life,” said Kuhn. “Give people a platform and see what they do with it.” After she left RStudio, Dr. Cahoon completed her PhD and started at Microsoft shortly after.

Today Dr. Cahoon works at Microsoft’s Gray Systems Lab (GSL), and is proud to be on a team that makes many of its products and materials available open-source too. GSL is an R&D team that develops and evaluates database systems technologies before bringing them to the Azure product line. She and her team currently work with large language models, figuring out how they can be applied across Microsoft. She says one of the data science problems her team is trying to solve stems from the fact that engineers approach their problems with a lot of experience and knowledge, which gives them intuition on, for example, how to configure a parameter.

“Because they have seen so much, they know intuitively how to tune it. But all of that knowledge and intuition is localized to one human.”

GSL, under the guidance of Azure CTO Raghu Ramakrishnan, believes there is a more robust, efficient way to operate, and her team is trying to make new models that can turn that individual intuition into organizational knowledge—a collective mindshare.

What excites Dr. Cahoon about data science now are the advances in natural language processing, especially large language models and prompt engineering.

“We’re close to entering a new era in data science.”

She says programs like OpenAI Playground, GPT-3, and GitHub Copilot, which take in a docstring or natural language command and convert it to code, are able to increase the productivity of desk workers and tech workers by 4x to 5x. “If we can take what GitHub Copilot has done and get a co-pilot to apply to other things, that will allow us to augment ourselves so we can have more of our lives dedicated outside of work.” She is excited that the applications of this technology are far-reaching. Just recently, people have been able to type up research notes and use GPT-3 to write abstracts that were submitted to and accepted by peer-reviewed journals. To assuage the concerns of anyone who worries that increased productivity will lead to job loss in the tech industry, Dr. Cahoon says, “Think of it like a cell phone. Yes, it can do a lot, but it still needs a human to operate it.” She says that, with the infrastructure and tools becoming available to us, you don’t have to be amazing at code; the focus of a career can now be on the bleeding-edge science.

And how does Dr. Cahoon stay up to date on the newest discoveries in data science? She says she uses a customized RSS feed, which includes news sites that interest her and Google Scholar links to researchers she follows. This way she is able to read the articles her favorite scholars are publishing right as they come out. “It’s my own curated news feed.” It’s also her way of seeing which startups are getting funded and where venture capitalists are putting their money. One other factor that helps her keep up to speed is instructing others. For RStudio, she took rigorous instructor tests to be able to teach in the R community. Now at Microsoft, there is a conscious effort in her group to make sure people are teaching. Recently Dr. Cahoon taught an introduction to statistics to the solutions architects at Microsoft who communicate with customers, customers who are making major decisions that have a real impact on the day-to-day lives of millions of people across the globe.

“It’s great to give them statistical approaches so they don’t have to just go with their gut feeling,”

she says of the experience. She was also very active in the R-Ladies meetup group for about five years, but has turned her attention to another great shift in her life: she’s become a mother. She says she loves opportunities to teach, ad hoc, because they keep her mind fresh and connected to important data science topics.

But the fact that data science is both a far-reaching field and an elusive term is not lost on Dr. Cahoon. In reality, the ambiguity of its origins and evolution is something she values. She says,

“Data science is sprinkled into everything we do, especially today where so much data is available. Everyone is a data scientist—if they’re thinking critically.” 

In describing what she means by “data science”, she points to a 2015 article called “50 Years of Data Science” by Dr. David Donoho that was based on a presentation he gave at a workshop in Princeton, NJ that year. “I’ve read that article every year since it came out,” she says, “because it is timeless. Every time you read it, you get something new out of it.” What stands out to her is that the article describes three fields in academia where data science has its empirical roots: Statistics, Computer Science, and Mathematics. “Their claims to data science are all true; they are all applying their own rigorous proof.” She says she has yet to see one definition of data science that has stood the test of time. And like the various experiences in Dr. Cahoon’s own eclectic journey to her position as Senior Data Scientist at Microsoft’s GSL, the world of data science would not be what it is today without the diverse contributors who lay claim to it.

“Every field should claim part of it because we need these different areas and domain experts working together. We need this interdisciplinary approach to make it what it is. All of these actors and agents working together: it is because of them that we have a richer ecosystem in data science than just one field claiming it. Everyone deserves a seat at the table.”