The recent Nobel Prize for groundbreaking advancements in protein discovery underscores the transformative potential of foundation models (FMs) in exploring vast combinatorial spaces. These models are ...
Recent advancements in training large multimodal models have been driven by efforts to eliminate modeling constraints and unify architectures across domains. Despite these strides, many existing ...
The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years, numerous modifications to this architecture have been ...
The transformative impact of Transformers on natural language processing (NLP) and computer vision (CV) is undeniable. Their scalability and effectiveness have propelled advancements across these ...
In a new paper Time-Reversal Provides Unsupervised Feedback to LLMs, a research team from Google DeepMind and Indian Institute of Science proposes Time Reversed Language Models (TRLMs), a framework ...
A research team introduces Automated Search for Artificial Life (ASAL). This novel framework leverages vision-language FMs to automate and enhance the discovery process in ALife research.
An NVIDIA research team proposes Hymba, a family of small language models that blend transformer attention with state space models, which outperforms the Llama-3.2-3B model with a 1.32% higher average ...
An Apple research team introduces AIMV2, a family of vision encoders that is designed to predict both image patches and text tokens within a unified sequence. This combined objective enables the model ...
In a new paper Diffusion Models Are Real-Time Game Engines, a Google research team presents GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with ...
In a new paper Diffusion Models Are Real-Time Game Engines, a Google research team presents GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with ...