Aaron Birnbaum and Matthew Makansi

AI

Evolving Product Operating Models in the Age of AI | Towards Data Science

In a previous article on organizing for AI (link), we looked at how the interplay between three key dimensions — ownership of outcomes, outsourcing of staff, and the geographical proximity of team members — can yield a variety of organizational archetypes for implementing strategic AI initiatives, each implying a different twist to the product operating model. Now we take a closer look at how the product operating model, and the core competencies of empowered product

Read More »
AI

No More Tableau Downtime: Metadata API for Proactive Data Health | Towards Data Science

In today’s world, the reliability of data solutions is everything. When we build dashboards and reports, we expect the numbers shown there to be correct and up-to-date. Based on these numbers, insights are drawn and actions are taken. If, for any unforeseen reason, the dashboards are broken or the numbers are incorrect, it becomes a fire-fight to fix everything. If the issues are not fixed in time, then it damages the trust placed on

Read More »
AI

What Germany Currently Is Up To, Debt-Wise | Towards Data Science

€1,600 per second. That’s how much interest Germany has to pay on its debt. In total, the German state owes trillions: more than a thousand billion euros. And the government is planning to borrow even more; up to one trillion in additional debt is rumored to follow over the next 10 years. The numbers involved in governmental finances are so huge that one probably cannot realistically assess just how much even
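To put the per-second figure in perspective, a back-of-the-envelope conversion (assuming the €1,600/s rate quoted above stays constant over a non-leap year):

```python
# Scale the per-second interest figure up to a full year.
INTEREST_PER_SECOND_EUR = 1_600          # figure quoted in the teaser
SECONDS_PER_YEAR = 60 * 60 * 24 * 365    # ignoring leap years

annual_interest = INTEREST_PER_SECOND_EUR * SECONDS_PER_YEAR
print(f"~{annual_interest / 1e9:.1f} billion EUR per year")  # ~50.5 billion
```

So a constant €1,600 per second compounds to roughly €50 billion of interest per year, which makes the trillion-euro debt figures easier to relate to.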

Read More »
AI

Google’s Data Science Agent: Can It Really Do Your Job? | Towards Data Science

On March 3rd, Google officially rolled out its Data Science Agent to most Colab users for free. This is not something brand new — it was first announced in December last year, but it is now integrated into Colab and made widely accessible. Google says it is “The future of data analysis with Gemini”, stating: “Simply describe your analysis goals in plain language, and watch your notebook take shape automatically, helping accelerate your ability to

Read More »
AI

Deciphering language processing in the human brain through LLM representations

During speech production, it is evident that language embeddings (blue) in the IFG peaked before speech embeddings (red) peaked in the sensorimotor area, followed by the peak of speech encoding in the STG. In contrast, during speech comprehension, the peak encoding shifted to after the word onset, with speech embeddings (red) in the STG peaking significantly before language encoding (blue) in the IFG. All in all, our findings suggest that the speech-to-text model embeddings provide

Read More »
AI

R.E.D.: Scaling Text Classification with Expert Delegation | Towards Data Science

With the new age of problem-solving augmented by Large Language Models (LLMs), only a handful of problems remain that have subpar solutions. Most classification problems (at a PoC level) can be solved by leveraging LLMs at 70–90% Precision/F1 with just good prompt engineering techniques, as well as adaptive in-context-learning (ICL) examples. What happens when you want to consistently achieve performance higher than that — when prompt engineering no longer suffices? The classification conundrum Text classification is
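The prompt-engineering-plus-ICL baseline mentioned above can be sketched as a few-shot prompt builder. The label set and examples here are invented for illustration and are not from the article:

```python
# Few-shot (in-context learning) classification prompt, pure string assembly.
# Labels and examples are hypothetical, for illustration only.
EXAMPLES = [
    ("Refund hasn't arrived after two weeks", "billing"),
    ("App crashes when I open settings", "bug"),
    ("How do I export my data?", "how-to"),
]
LABELS = ["billing", "bug", "how-to"]

def build_prompt(text: str) -> str:
    """Assemble a few-shot classification prompt for an LLM."""
    lines = [f"Classify the text into one of: {', '.join(LABELS)}.", ""]
    for example_text, label in EXAMPLES:
        lines.append(f"Text: {example_text}\nLabel: {label}\n")
    lines.append(f"Text: {text}\nLabel:")
    return "\n".join(lines)

prompt = build_prompt("I was charged twice this month")
print(prompt)
```

The model's completion after the final `Label:` is taken as the prediction; the "adaptive" variant would pick the in-context examples per input rather than using a fixed list.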

Read More »
AI

Load balancing with random job arrivals

Cluster management systems, such as Google’s Borg, run hundreds of thousands of jobs across tens of thousands of machines with the goal of achieving high utilization via effective load balancing, efficient task placement, and machine sharing. Load balancing is the process of distributing network traffic or computational workloads across multiple servers or computing resources, and it is one of the most critical components of a modern cluster management system. Effective load balancing is critical to
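One classic policy for balancing load under random job arrivals is "power of two choices": each arriving job probes two machines chosen uniformly at random and is placed on the less-loaded one. This is a standard textbook policy used here purely as an illustration, not necessarily the one analyzed in the post:

```python
import random

def place_jobs(num_machines: int, num_jobs: int, seed: int = 0) -> list[int]:
    """Power-of-two-choices placement: each job samples two random
    machines and goes to the one currently holding fewer jobs."""
    rng = random.Random(seed)
    load = [0] * num_machines
    for _ in range(num_jobs):
        a = rng.randrange(num_machines)
        b = rng.randrange(num_machines)
        target = a if load[a] <= load[b] else b
        load[target] += 1
    return load

loads = place_jobs(num_machines=100, num_jobs=1000)
print(max(loads))  # maximum load stays close to the mean of 10
```

Compared with placing each job on a single random machine, the extra probe shrinks the gap between the most-loaded machine and the average dramatically, which is why variants of this idea appear throughout cluster scheduling.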

Read More »
AI

Loss of Pulse Detection on the Google Pixel Watch 3

Acknowledgements The research described here is joint work across Google Research, Google Health, Google DeepMind, and partnering teams, including Consumer Health Research, Personal Safety, quality, regulatory, and clinical operations. The following researchers contributed to this work: Kamal Shah, Anran Wang, Yiwen Chen, Jitender Munjal, Sumeet Chhabra, Anthony Stange, Enxun Wei, Tuan Phan, Tracy Giest, Beszel Hawkins, Dinesh Puppala, Elsina Silver, Lawrence Cai, Shruti Rajagopalan, Edward Shi, Yun-Ling Lee, Matt Wimmer, Pramod Rudrapatna, Thomas Rea, Shelten

Read More »
AI

Generating synthetic data with differentially private LLM inference

Due to challenges in generating text while maintaining DP and computational efficiency, prior work focused on generating a small amount of data points (<10) to be used for in-context learning. We show that it’s possible to generate two to three orders of magnitude more data while preserving quality and privacy by solving issues related to the privacy budget and computational efficiency. The privacy budget constrains the amount of output the model can release while maintaining
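A hedged illustration of why the privacy budget constrains output volume: under basic sequential composition, releasing n items from a total budget ε means each item gets ε/n, so the per-item Laplace noise scale grows linearly with n. This is a textbook calculation, not the paper's actual accounting:

```python
def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    return sensitivity / epsilon

TOTAL_EPSILON = 1.0
SENSITIVITY = 1.0

# Basic sequential composition: split the budget evenly across n releases.
for n in (1, 10, 1000):
    per_item_eps = TOTAL_EPSILON / n
    print(f"n={n}: noise scale {laplace_scale(SENSITIVITY, per_item_eps)}")
```

Going from 10 releases to 1,000 multiplies the per-item noise by 100 under this naive accounting, which is why generating orders of magnitude more data without destroying quality requires the tighter budget and efficiency techniques the post describes.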

Read More »
AI

Mastering Hadoop, Part 3: Hadoop Ecosystem: Get the most out of your cluster | Towards Data Science

As we have already seen with the basic components (Part 1, Part 2), the Hadoop ecosystem is constantly evolving and being optimized for new applications. Various tools and technologies have therefore developed over time that make Hadoop more powerful and even more widely applicable. It now goes beyond the pure HDFS & MapReduce platform and offers, for example, SQL as well as NoSQL queries and real-time streaming. Hive/HiveQL Apache Hive is

Read More »