Ben Dickson


How Meta’s latest research proves you can use generative AI to understand user intent

Meta — parent company of Facebook, Instagram, WhatsApp, Threads and more — runs one of the biggest recommendation systems in the world. In two recently released papers, its researchers have revealed how generative models can be used to better understand and respond to user intent. By looking at recommendations as a generative problem, you can tackle…

Read More »
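The idea of framing recommendation as generation can be illustrated with next-item prediction over interaction sequences. The bigram counter below is a hypothetical, minimal stand-in for the generative models Meta's papers describe; the session data and function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_next_item(sessions):
    """Count item-to-item transitions across user sessions (a toy
    stand-in for a generative sequence model over interaction histories)."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            transitions[prev][nxt] += 1
    return transitions

def recommend(transitions, last_item, k=3):
    """'Generate' the k most likely next items given the last interaction."""
    return [item for item, _ in transitions[last_item].most_common(k)]

sessions = [
    ["shoes", "socks", "laces"],
    ["shoes", "socks", "insoles"],
    ["shirt", "tie"],
]
model = train_next_item(sessions)
print(recommend(model, "shoes"))  # → ['socks']
```

A real generative recommender would replace the bigram counts with a sequence model, but the interface is the same: condition on the interaction history, then decode the next items.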

OpenAI’s o3 shows remarkable progress on ARC-AGI, sparking debate on AI reasoning

OpenAI’s latest o3 model has achieved a breakthrough that has surprised the AI research community. o3 scored an unprecedented 75.7% on the super-difficult ARC-AGI benchmark under standard compute conditions, with a high-compute version reaching 87.5%. While the achievement in ARC-AGI is impressive, it does not yet prove that the code to artificial general intelligence (AGI) has…

Read More »

Hugging Face shows how test-time scaling helps small language models punch above their weight

In a new case study, Hugging Face researchers have demonstrated how small language models (SLMs) can be configured to outperform much larger models. Their findings show that a Llama 3 model with 3B parameters can outperform the 70B version of the model in complex math problems. Hugging Face has fully documented the entire process and provides…

Read More »
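Test-time scaling of this kind can be sketched as best-of-N sampling: draw several candidate answers and keep the one a verifier scores highest. The sampler and scorer below are hypothetical stand-ins for the small model and reward model used in the case study.

```python
import random

def sample_answer(rng):
    """Hypothetical stand-in for sampling one answer from a small LM:
    here, a noisy guess at the true answer 42."""
    return 42 + rng.randint(-10, 10)

def verifier_score(answer):
    """Hypothetical verifier/reward model: higher for answers nearer 42."""
    return -abs(answer - 42)

def best_of_n(n, seed=0):
    """Spend more inference compute (larger n) to pick a better answer."""
    rng = random.Random(seed)
    candidates = [sample_answer(rng) for _ in range(n)]
    return max(candidates, key=verifier_score)

# With the same seed, the n=32 candidate pool contains the n=1 draw,
# so more test-time samples can only match or improve the result.
print(abs(best_of_n(1) - 42), abs(best_of_n(32) - 42))
```

The case study's setup is more sophisticated (process reward models, search over reasoning steps), but the core trade of extra inference compute for accuracy is the same.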

Sakana AI’s CycleQD outperforms traditional fine-tuning methods for multi-skill language models

Researchers at Sakana AI have developed a resource-efficient framework that can create hundreds of language models specializing in different tasks. Called CycleQD, the technique uses evolutionary algorithms to combine the skills of different models without the need for expensive and slow training processes. CycleQD can create swarms of task-specific agents that offer a more sustainable alternative…

Read More »
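The evolutionary, archive-based idea behind quality-diversity methods like CycleQD can be sketched in a few lines. This is a generic quality-diversity loop, not Sakana's actual algorithm: "models" are 2-d weight vectors, the two skills and the mixing rule are invented for illustration.

```python
import random

def evaluate(weights):
    """Hypothetical skill scores for a 'model' (a 2-d weight vector):
    skill A rewards the first weight, skill B the second."""
    return {"skill_a": weights[0], "skill_b": weights[1]}

def qd_loop(generations=200, seed=0):
    """Generic quality-diversity sketch: keep an archive holding the best
    specialist per skill, and breed candidates by mixing parents' weights,
    an evolutionary alternative to gradient-based fine-tuning."""
    rng = random.Random(seed)
    best = {"skill_a": [0.0, 0.0], "skill_b": [0.0, 0.0]}
    for _ in range(generations):
        p1, p2 = best["skill_a"], best["skill_b"]
        # Crossover (average) plus a small mutation.
        child = [(a + b) / 2 + rng.gauss(0, 0.1) for a, b in zip(p1, p2)]
        scores = evaluate(child)
        for skill in best:
            if scores[skill] > evaluate(best[skill])[skill]:
                best[skill] = child
    return best

archive = qd_loop()
# Each archive slot ends up specializing in its own skill.
```

The population of specialists in the archive is the "swarm" the teaser refers to; CycleQD itself cycles which skill is being optimized while the others define the diversity axes.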

Alibaba releases Qwen with Questions, an open reasoning model that beats o1-preview

Chinese e-commerce giant Alibaba has released the latest model in its ever-expanding Qwen family. This one is known as Qwen with Questions (QwQ), and serves as the latest open-source competitor to OpenAI’s o1 reasoning model. Like other large reasoning models (LRMs), QwQ uses extra compute cycles during inference to review its answers and correct its…

Read More »
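The review-and-correct behavior the teaser describes can be sketched as a generate–critique–revise loop. The arithmetic task, the deliberately flawed draft solver, and the checker below are hypothetical stand-ins for QwQ's learned self-review.

```python
def first_draft(question):
    """Hypothetical flawed solver: drops the hundreds digit when adding."""
    a, b = question
    return (a + b) % 100

def critique(question, answer):
    """Hypothetical checker: verify the draft against ground truth."""
    a, b = question
    return answer == a + b

def revise(question, answer):
    """Hypothetical correction step: restore the dropped hundreds digit."""
    a, b = question
    return answer + 100 if answer < max(a, b) else answer

def solve_with_review(question, max_rounds=3):
    """Spend extra inference compute reviewing and correcting the answer."""
    answer = first_draft(question)
    for _ in range(max_rounds):
        if critique(question, answer):
            break
        answer = revise(question, answer)
    return answer

print(solve_with_review((70, 50)))  # → 120, despite the flawed first draft
```

In an actual LRM the critique and revision are produced by the model itself inside a long chain of thought, but the budget knob is the same: more review rounds cost more inference compute.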

Alibaba researchers unveil Marco-o1, an LLM with advanced reasoning capabilities

The recent release of OpenAI o1 has brought great attention to large reasoning models (LRMs), and is inspiring new models aimed at solving complex problems that classic language models often struggle with. Building on the success of o1 and the concept of LRMs, researchers at Alibaba have introduced Marco-o1, which enhances reasoning capabilities and tackles problems with…

Read More »

Chinese researchers unveil LLaVA-o1 to challenge OpenAI’s o1 model

OpenAI’s o1 model has shown that inference-time scaling — using more compute during inference — can significantly boost a language model’s reasoning abilities. LLaVA-o1, a new model developed by researchers from multiple universities in China, brings this paradigm to open-source vision language models (VLMs). Early open-source VLMs typically use a direct prediction approach, generating answers without reasoning about the prompt…

Read More »
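One way the structured alternative to direct prediction can be sketched is stage-wise best-of-N decoding: generate candidates for each reasoning stage, keep the best, and condition the next stage on it. The stage names, random candidate generator, and quality score below are hypothetical stand-ins for the model and its candidate scorer.

```python
import random

STAGES = ["summary", "caption", "reasoning", "conclusion"]

def generate_stage(stage, context, rng):
    """Hypothetical stand-in for sampling one candidate continuation for a
    stage; `context` (earlier stages) is unused in this toy version."""
    quality = rng.random()
    return f"<{stage}> draft (quality={quality:.2f})", quality

def staged_answer(n_candidates=4, seed=0):
    """Stage-level best-of-N: sample several candidates per stage and keep
    the best before moving on, instead of direct one-shot prediction."""
    rng = random.Random(seed)
    context = []
    for stage in STAGES:
        candidates = [generate_stage(stage, context, rng)
                      for _ in range(n_candidates)]
        best_text, _ = max(candidates, key=lambda c: c[1])
        context.append(best_text)
    return context

stages = staged_answer()
for line in stages:
    print(line)
```

Pruning candidates per stage rather than per full answer keeps the search budget small while still letting the model reconsider each step of its reasoning.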

Anthropic’s Computer Use mode shows strengths and limitations in new study

Since Anthropic released the “Computer Use” feature for Claude in October, there has been a lot of excitement about what AI agents can do when given the power to imitate human interactions. A new study by Show Lab at the National University of Singapore provides an overview of what we can expect from the current generation…

Read More »

How Microsoft’s next-gen BitNet architecture is turbocharging LLM efficiency

One-bit large language models (LLMs) have emerged as a promising approach to making generative AI more accessible and affordable. By representing model weights with a very limited number of bits, 1-bit LLMs dramatically reduce the memory and computational resources required to run them. Microsoft Research has been pushing the boundaries of 1-bit LLMs with its BitNet…

Read More »
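The weight-compression idea behind 1-bit LLMs can be illustrated with ternary quantization in the style of BitNet b1.58's absmean scheme: scale each weight by the tensor's mean absolute value, then round and clip to {-1, 0, +1}. This is a sketch of the scheme as publicly described, not Microsoft's code.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize weights to {-1, 0, +1} with a per-tensor scale: divide by
    the mean absolute value, then round and clip to the ternary range."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Reconstruct approximate weights from ternary values and the scale."""
    return [q * scale for q in quantized]

w = [0.31, -0.06, 0.58, -0.45]
q, s = absmean_ternary(w)
print(q)  # → [1, 0, 1, -1]
```

Because every weight is one of three values, matrix multiplies reduce to additions and subtractions plus one rescale, which is where the memory and compute savings come from.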