Aaron Birnbaum and Matthew Makansi

AI

The Impact of GenAI and Its Implications for Data Scientists | Towards Data Science

GenAI systems affect how we work. This general notion is well known. However, we are still unaware of the exact impact of GenAI. For example, how much do these tools affect our work? Do they have a larger impact on certain tasks? What does this mean for us in our daily work? To answer these questions, Anthropic released a study based on millions of anonymized conversations on Claude.ai. The study provides data on how GenAI is incorporated

AI

Mastering Prompt Engineering with Functional Testing: A Systematic Guide to Reliable LLM Outputs | Towards Data Science

Creating efficient prompts for large language models often starts as a simple task… but it doesn’t always stay that way. Initially, following basic best practices seems sufficient: adopt the persona of a specialist, write clear instructions, require a specific response format, and include a few relevant examples. But as requirements multiply, contradictions emerge, and even minor modifications can introduce unexpected failures. What was working perfectly in one prompt version suddenly breaks in another. If you

AI

Nine Pico PIO Wats with Rust (Part 2) | Towards Data Science

This is Part 2 of an exploration into the unexpected quirks of programming the Raspberry Pi Pico PIO with Rust. If you missed Part 1, we uncovered four Wats that challenge assumptions about register count, instruction slots, the behavior of pull noblock, and smart yet cheap hardware. Now, we continue our journey toward crafting a theremin-like musical instrument, a project that reveals some of the quirks and perplexities of PIO programming. Prepare to challenge your understanding

AI

Forget About Cloud Computing. On-Premises Is All the Rage Again | Towards Data Science

Ten years ago, everybody was fascinated by the cloud. It was the new thing, and companies that adopted it rapidly saw tremendous growth. Salesforce, for example, positioned itself as a pioneer of this technology and saw great wins. The tides are turning though. As much as cloud providers still proclaim that they’re the most cost-effective and efficient solution for businesses of all sizes, this is increasingly clashing with the day-to-day experience. Cloud Computing was touted

AI

Anatomy of a Parquet File | Towards Data Science

In recent years, Parquet has become a standard format for data storage in Big Data ecosystems. Its column-oriented format offers several advantages: faster query execution when only a subset of columns is being processed; quick calculation of statistics across all data; and reduced storage volume thanks to efficient compression. When combined with storage frameworks like Delta Lake or Apache Iceberg, it seamlessly integrates with query engines (e.g., Trino) and data warehouse compute clusters (e.g., Snowflake, BigQuery).


Fourier Transform Applications in Literary Analysis | Towards Data Science

Poetry is often seen as a pure art form, ranging from the rigid structure of a haiku to the fluid, unconstrained nature of free-verse poetry. In analysing these works, though, to what extent can mathematics and Data Analysis be used to glean meaning from this free-flowing literature? Of course, rhetoric can be analysed, references can be found, and word choice can be questioned, but can the underlying, even subconscious, thought process of an author be

AI

Mastering Hadoop, Part 2: Getting Hands-On — Setting Up and Scaling Hadoop | Towards Data Science

Now that we’ve explored Hadoop’s role and relevance, it’s time to show you how it works under the hood and how you can start working with it. To start, we are breaking down Hadoop’s core components — HDFS for storage, MapReduce for processing, YARN for resource management, and more. Then, we’ll guide you through installing Hadoop (both locally and in the cloud) and introduce some essential commands to help you navigate and operate your first

AI

Are You Still Using LoRA to Fine-Tune Your LLM? | Towards Data Science

LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS 🤯… And most are based on a matrix technique I like a lot: the SVD (Singular Value Decomposition). Let’s dive in. LoRA The original LoRA insight is that fine-tuning all the weights of a


2026 Will Be The Year of Data + AI Observability | Towards Data Science

GenAI has already made an extraordinary impact on enterprise productivity. Marc Benioff has stated Salesforce will keep its software engineering headcount flat due to a 30% increase in productivity thanks to AI. Users leveraging Microsoft Copilot create or edit 10% more documents. But this impact has been evenly distributed. Powerful models are a simple API call away and available to all (as Meta and OpenAI ads make sure to remind us). The real disruption lies


Introducing Gemini Robotics and Gemini Robotics-ER, AI models designed for robots to understand, act and react to the physical world.

Research | Published 12 March 2025 | Authors: Carolina Parada

Introducing Gemini Robotics, our Gemini 2.0-based model designed for robotics

At Google DeepMind, we’ve been making progress in how our Gemini models solve complex problems through multimodal reasoning across text, images, audio and video. So far, however, those abilities have been largely confined to the digital realm. In order for AI to be useful and helpful to people in the physical realm, they have to demonstrate “embodied”
