Robin von Malottki

AI

Techniques for Chat Data Analytics with Python

Part II: Topic Extraction with BERTopic

Robin von Malottki · Published in Towards Data Science · 10 min read · 22 hours ago · Photo by Mikechie Esparagoza, obtained from Pexels.com

In the first part of this series, I introduced you to my artificially created friend John, who was nice enough to provide us with his chats with five of the closest people in his life. We used just the metadata, such as who

AI

Techniques for Chat Data Analytics with Python

Part I: Communication Density Analysis

Robin von Malottki · Published in Towards Data Science · 8 min read · 14 hours ago · Photo by Mikechie Esparagoza, obtained from Pexels.com

Over the years, our communication has become increasingly digital. Whether through quick text messages in chat applications or through email, digital messaging has firmly embedded itself in our daily lives. This leads to an increase in the amount of digitally produced data. Since

AI

ETL Pipelines in Python: Best Practices and Techniques

Strategies for Enhancing Generalizability, Scalability, and Maintainability in Your ETL Pipelines

Robin von Malottki · Published in Towards Data Science · 10 min read · 12 hours ago · Photo by Produtora Midtrack, obtained from Pexels.com

When building a new ETL pipeline, it’s crucial to consider three key requirements: Generalizability, Scalability, and Maintainability. These pillars play a vital role in the effectiveness and longevity of your data workflows. However, the challenge often lies

AI

The Transparent Human

How data can reveal the true nature of anyone

Robin von Malottki · Published in Towards Data Science · 8 min read · 6 hours ago · Image by zahtris from Pixabay

In 2014, a German data scientist started to collect data about articles from the German online news magazine “Der Spiegel.” For 2.5 years, he scraped over 100,000 articles as well as the associated metadata, such as publishing time, section, and author. After

AI

Efficient Testing of ETL Pipelines with Python

How to Instantly Detect Data Quality Issues and Identify their Causes

Robin von Malottki · Published in Towards Data Science · 10 min read · 2 days ago · Photo by Digital Buggu, obtained from Pexels.com

In today’s data-driven world, organizations rely heavily on accurate data to make critical business decisions. As a responsible and trustworthy Data Engineer, ensuring data quality is paramount. Even a brief period of displaying incorrect data on a
