Aaron Birnbaum and Matthew Makansi

AI

How to Train LLMs to “Think” (o1 & DeepSeek-R1) | Towards Data Science

In September 2024, OpenAI released its o1 model, trained with large-scale reinforcement learning, giving it “advanced reasoning” capabilities. Unfortunately, the details of how they pulled this off were never shared publicly. Today, however, DeepSeek (an AI research lab) has replicated this reasoning behavior and published the full technical details of their approach. In this article, I will discuss the key ideas behind this innovation and describe how they work under the hood. OpenAI’s

Read More »
AI

Generative AI and Civic Institutions | Towards Data Science

Different sectors, different goals Recent events have got me thinking about AI as it relates to our civic institutions — think government, education, public libraries, and so on. We often forget that civic and governmental organizations are inherently deeply different from private companies and profit-making enterprises. They exist to enable people to live their best lives, protect people’s rights, and make opportunities accessible, even if (especially if) this work doesn’t have immediate monetary returns. The

Read More »
AI

LLM + RAG: Creating an AI-Powered File Reader Assistant | Towards Data Science

Introduction AI is everywhere. It is hard not to interact with a Large Language Model (LLM) at least once a day. Chatbots are here to stay. They’re in your apps, they help you write better, they compose emails, they read emails…well, they do a lot. And I don’t think that’s a bad thing. In fact, my opinion is quite the opposite – at least so far. I defend and advocate for the use of

Read More »
AI

Data Science: From School to Work, Part II | Towards Data Science

In my previous article, I highlighted the importance of effective project management in Python development. Now, let’s shift our focus to the code itself and explore how to write clean, maintainable code — an essential practice in professional and collaborative environments.  Readability & Maintainability: Well-structured code is easier to read, understand, and modify. Other developers — or even your future self — can quickly grasp the logic without struggling to decipher messy code. Debugging & Troubleshooting: Organized code with clear variable

Read More »
AI

Avoidable and Unavoidable Randomness in GPT-4o | Towards Data Science

Of course there is randomness in GPT-4o’s outputs. After all, the model samples from a probability distribution when choosing each token. But what I didn’t understand was that those probabilities themselves are not deterministic. Even with consistent prompts, fixed seeds, and temperature set to zero, GPT-4o still introduces subtle, frustrating randomness. To be clear, there’s no fix for this, and it might not even be something OpenAI could fix if they wanted to
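One plausible low-level source of this behavior (an illustration on my part, not an analysis from the article) is that floating-point addition is not associative, so the order in which a GPU accumulates partial sums can shift the computed token probabilities between runs:

```python
# Floating-point addition is not associative: grouping the same three
# numbers differently gives results that differ in the last bits.
# This is why a GPU reduction whose summation order varies run-to-run
# can produce slightly different logits from identical inputs.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # the two groupings disagree
```

A tiny discrepancy in the logits is enough to flip which token has the highest probability in borderline cases, even at temperature zero.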

Read More »
AI

Vision Transformers (ViT) Explained: Are They Better Than CNNs? | Towards Data Science

1. Introduction Ever since the introduction of the self-attention mechanism, Transformers have been the top choice for Natural Language Processing (NLP) tasks. Self-attention-based models are highly parallelizable and require substantially fewer parameters, making them much more computationally efficient, less prone to overfitting, and easier to fine-tune for domain-specific tasks [1]. Furthermore, the key advantage of transformers over past models (like RNNs, LSTMs, GRUs, and other neural-based architectures that dominated the NLP domain
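For readers unfamiliar with the mechanism the excerpt refers to, here is a minimal NumPy sketch of scaled dot-product self-attention (my own illustration, not code from the article):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise token similarities
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V, weights

# Self-attention: queries, keys, and values all come from the same tokens.
tokens = np.random.default_rng(0).normal(size=(4, 8))  # 4 tokens, dim 8
out, attn = scaled_dot_product_attention(tokens, tokens, tokens)
```

Because every token attends to every other token in a single matrix product, the whole computation parallelizes across the sequence, which is the efficiency advantage the excerpt highlights.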

Read More »
AI

Unraveling Large Language Model Hallucinations | Towards Data Science

Introduction In a YouTube video titled Deep Dive into LLMs like ChatGPT, Andrej Karpathy, former Senior Director of AI at Tesla, discusses the psychology of Large Language Models (LLMs) as emergent cognitive effects of the training pipeline. This article is inspired by his explanation of LLM hallucinations and the information presented in the video. You have probably seen model hallucinations: instances where LLMs generate incorrect, misleading, or entirely fabricated information that appears plausible. These hallucinations

Read More »
AI

Announcing the Towards Data Science Author Payment Program | Towards Data Science

At TDS, we see value in every article we publish and recognize that authors share their work with us for a wide range of reasons — some wish to spread their knowledge and help other learners, others aim to grow their public profile and advance in their career, and some look at writing as an additional income stream. In many cases, it’s a combination of all of the above. Historically, there was no direct monetization

Read More »
AI

Mind the GAP: Geometry Aware Passthrough mitigates cybersickness

User study design We introduce a comprehensive protocol focused on key VST use cases to holistically assess visually induced discomfort and cybersickness in VST HMDs. We then use this protocol to compare our GAP algorithm to DP. To ensure reproducibility, repeatability, and real-life relevance, we began with tasks identified in the literature, tested them in a pilot study, and iteratively refined the task nature and duration based on participant feedback. A total of 25 consenting participants with

Read More »
AI

I Won’t Change Unless You Do | Towards Data Science

In Game Theory, how can players ever settle on a choice if a better option might still be available? Maybe one player still wants to change their decision. But if they do, maybe the other player wants to change too. How can they ever hope to escape this vicious circle? To solve this problem, the concept of a Nash equilibrium, which I will explain in this article, is fundamental to game
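To make the “no one wants to switch” idea concrete, here is a minimal sketch (my own illustration, not from the article) that brute-force checks for Nash equilibria in the classic prisoner’s dilemma; the payoff numbers are the standard illustrative ones:

```python
from itertools import product

# Prisoner's dilemma payoffs as (row player, column player);
# C = cooperate, D = defect. Higher is better.
payoffs = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3,  0),
    ("D", "C"): ( 0, -3),
    ("D", "D"): (-2, -2),
}
actions = ["C", "D"]

def is_nash(a, b):
    # Nash equilibrium: neither player gains by unilaterally switching.
    row_ok = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in actions)
    col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in actions)
    return row_ok and col_ok

equilibria = [(a, b) for a, b in product(actions, actions) if is_nash(a, b)]
```

Only mutual defection survives the check: from (D, D), switching to cooperation makes either player strictly worse off, so the “vicious circle” of revisions stops there.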

Read More »