Dina Bavli

Life, Death, and Shopping

Data Scientist


Dina is a data scientist with experience in NLP, graph theory, NetworkX, churn prediction, and automated speech recognition. Her Master’s thesis deals with classifying and characterizing persuasion. She is a former teaching assistant for ML and an experienced international public speaker. She is a data science content writer for workshops, meetups, and online courses, and an official author of the Towards Data Science and Better Programming publications.

Dina spent a significant part of the summer at the German Aerospace Data Science Center.

She is passionate about data, sharing knowledge, and contributing to society and open source.

Whenever she is unable to find a sufficient tutorial, she creates one.


This talk is a step-by-step introduction to purchase prediction, including an implementation in PySpark. The same approach also applies to survival analysis and churn prediction.

In survival analysis, a model succeeds when it predicts death correctly. But the same kind of model can also predict an engine failure, abandonment, or even purchases.

In purchase prediction, survival analysis, or churn prediction, the data is usually either labeled already or labeled artificially by a set of rules, such as treating 30 days of inactivity as churn. But the data structure differs from that of classical machine learning, so the data handling and modeling differ, too.
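The rule-based labeling mentioned above can be sketched in a few lines of plain Python. This is only an illustration, not the talk's implementation: the function name `label_churn`, the `last_activity` mapping, and the snapshot date are all hypothetical, and the 30-day threshold is the one example rule the abstract names.

```python
from datetime import date, timedelta


def label_churn(last_activity, snapshot, inactivity_days=30):
    """Label a user 1 (churned) if they were inactive for at least
    `inactivity_days` before `snapshot`, else 0.

    `last_activity` maps a user id to the date of that user's last event.
    """
    cutoff = snapshot - timedelta(days=inactivity_days)
    return {user: int(last_seen <= cutoff)
            for user, last_seen in last_activity.items()}


# Hypothetical sample data, made up for this sketch.
last_activity = {
    "u1": date(2023, 1, 5),   # long inactive
    "u2": date(2023, 2, 20),  # recently active
    "u3": date(2023, 1, 30),  # exactly 30 days before the snapshot
}
labels = label_churn(last_activity, snapshot=date(2023, 3, 1))
```

Whether a user who is inactive for exactly 30 days counts as churned is a design choice; this sketch counts them as churned.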

This lecture covers the data structures and aggregations for such analysis, focusing on time aggregations using PySpark and how NLP is involved.
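As a rough idea of what a time aggregation looks like, the sketch below counts each user's events in trailing windows before a snapshot date. It uses plain Python so that it runs without a Spark cluster, and it is not the talk's PySpark code. The function `window_counts`, the feature names, and the 7- and 30-day windows are assumptions made for illustration. In PySpark the same idea would typically be expressed as a grouped aggregation with conditional sums.

```python
from collections import defaultdict
from datetime import date


def window_counts(events, snapshot, windows=(7, 30)):
    """Count each user's events in the trailing `windows` (in days)
    ending at `snapshot`. `events` is an iterable of (user, day) pairs."""
    features = defaultdict(lambda: {f"events_last_{w}d": 0 for w in windows})
    for user, day in events:
        age = (snapshot - day).days
        if age < 0:
            continue  # ignore events after the snapshot to avoid leakage
        row = features[user]  # creates an all-zero row on first sight
        for w in windows:
            if age < w:
                row[f"events_last_{w}d"] += 1
    return dict(features)


# Hypothetical event log, made up for this sketch.
events = [
    ("u1", date(2023, 2, 27)),
    ("u1", date(2023, 2, 10)),
    ("u1", date(2023, 1, 1)),
    ("u2", date(2023, 2, 25)),
    ("u2", date(2023, 3, 5)),  # after the snapshot, so it is skipped
]
features = window_counts(events, snapshot=date(2023, 3, 1))
```

Each user ends up with one row of window features, which is the shape a classical model expects even though the raw data is a log of timestamped events.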


Planned Agenda

8:45 Reception
9:30 Opening words by WiDS TLV ambassador Nitzan Gado and by Lily Ben Ami, CEO of the Michal Sela Forum
9:50 Prof. Bracha Shapira – Data Challenges in Recommender Systems Research: Insights from Bundle Recommendation
10:20 Juan Liu – Accounting Automation: Making Accounting Easier So That People Can Forget About It
10:50 Break
11:00 Lightning talks
12:20 Lunch & poster session
13:20 Roundtable session & poster session
14:05 Roundtable closure
14:20 Break
14:30 Merav Mofaz – “Every Breath You Take and Every Move You Make…I'll Be Watching You:” The Sensitive Side of Smartwatches
14:50 Reut Yaniv – Ad Serving in the Online Geo Space Along Routes
15:10 Rachel Wities – It’s Not Just the Doctor’s Handwriting: Challenges and Opportunities in Healthcare NLP
15:30 Closing remarks
15:40 End