Ben Dickson

AR/VR

Breaking down Grok 3: The AI model that could redefine the industry

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

Less than two years since its launch, xAI has shipped what could arguably be the most advanced AI model to date. Grok 3 matches or beats the most advanced models on all key benchmarks as well as the user-evaluated Chatbot Arena, and its training has not even been completed yet. We still don’t have a lot …

Read More »

Researchers find you don’t need a ton of data to train LLMs for reasoning tasks

Large language models (LLMs) can learn complex reasoning tasks without relying on large datasets, according to a new study by researchers at Shanghai Jiao Tong University. Their findings show that with just a small batch of well-curated examples, you can train an LLM for tasks that were thought to require tens of thousands of training instances. …

Read More »

Less supervision, better results: Study shows AI models generalize more effectively on their own

Language models can generalize better when left to create their own solutions, a new study by Hong Kong University and University of California, Berkeley, shows. The findings, which apply to both large language models (LLMs) and vision language models (VLMs), challenge one of the main beliefs of the LLM community — that models require hand-labeled training …

Read More »

OpenAI responds to DeepSeek competition with detailed reasoning traces for o3-mini

OpenAI is now showing more details of the reasoning process of o3-mini, its latest reasoning model. The change was announced on OpenAI’s X account and comes as the AI lab is under increased pressure from DeepSeek-R1, a rival open model that fully displays its reasoning tokens. Models like o3 and R1 undergo a lengthy “chain of …

Read More »

Beyond benchmarks: How DeepSeek-R1 and o1 perform on real-world tasks

DeepSeek-R1 has surely created a lot of excitement and concern, especially for OpenAI’s rival model o1. So, we put them to the test in a side-by-side comparison on a few simple data analysis and market research tasks. To put the models on equal footing, we used Perplexity Pro Search, which now supports both o1 and R1. Our …

Read More »

DeepSeek-R1 is a boon for enterprises — making AI apps cheaper, easier to build, and more innovative

The release of the DeepSeek-R1 reasoning model has caused shockwaves across the tech industry, with the most obvious sign being the sudden sell-off of major AI stocks. The advantage of well-funded AI labs such as OpenAI and Anthropic no longer seems very solid, as DeepSeek has reportedly been able to develop its o1 competitor at …

Read More »

No retraining needed: Sakana’s new AI model changes how machines learn

Researchers at Sakana AI, an AI research lab focusing on nature-inspired algorithms, have developed a self-adaptive language model that can learn new tasks without the need for fine-tuning. Called Transformer² (Transformer-squared), the model uses mathematical tricks to align its weights with user requests during inference. This is the latest in a series of techniques that aim …

Read More »

Beyond RAG: How cache-augmented generation reduces latency, complexity for smaller workloads

Retrieval-augmented generation (RAG) has become the de facto way of customizing large language models (LLMs) for bespoke information. However, RAG comes with upfront technical costs and can be slow. Now, thanks to advances in long-context LLMs, enterprises can bypass RAG by inserting all the proprietary information in the prompt. A new study by the National Chengchi University …

Read More »
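The prompt-stuffing idea behind cache-augmented generation can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the study's actual setup: the function and document contents are hypothetical, and a real deployment would rely on a long-context model whose KV cache for the static context block is computed once and reused across queries.

```python
def build_cag_prompt(documents, question):
    """Cache-augmented generation: skip retrieval and place the whole
    knowledge base in the prompt. With a long-context LLM, the cache for
    this static context prefix can be precomputed once and reused, so
    each new question only pays for its own tokens."""
    knowledge = "\n\n".join(documents)
    return f"Context:\n{knowledge}\n\nQuestion: {question}\nAnswer:"


# Hypothetical proprietary documents (tiny here; CAG targets corpora
# that fit within the model's context window).
docs = [
    "Refund policy: purchases may be returned within 30 days.",
    "Shipping policy: orders over $50 ship free.",
]

# The same `docs` block is reused verbatim for every question, which is
# what makes the context prefix cacheable, unlike RAG, where the
# retrieved passages change per query.
prompt = build_cag_prompt(docs, "When can a purchase be returned?")
```

The trade-off the article's teaser hints at: CAG removes the retrieval pipeline (indexing, embedding, vector search) at the cost of a larger prompt, which long-context models and prefix caching make affordable for smaller knowledge bases.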