SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model | #ai #2025 #genai #google
AI Today, 7 February 2025

Paper: https://arxiv.org/pdf/2501.17161
This research paper compares supervised fine-tuning (SFT) and reinforcement learning (RL) for post-training foundation models. Using novel and existing tasks involving arithmetic and spatial reasoning, the study finds that RL generalizes better to unseen data, whereas SFT tends to memorize the training data. Further analysis shows that RL improves visual recognition in multimodal models, while SFT helps stabilize RL training by improving output formatting. The paper also explores how increased inference-time computation affects generalization.

Episodes (30)

Deepseek Janus-Pro: Unified Multimodal Understanding and Generation | #ai #2025 #genai #deepseek

Paper: https://github.com/deepseek-ai/Janus/blob/main/janus_pro_tech_report.pdf GitHub: https://github.com/deepseek-ai/Janus/tree/main?tab=readme-ov-file The paper introduces Janus-Pro, an improved m...

30 January 2025, 16 min

Memory Layers at Scale | #ai #2024 #genai #meta

Paper: https://arxiv.org/pdf/2412.09764 This research paper explores the effectiveness of memory layers in significantly enhancing large language models (LLMs). By incorporating a trainable key-value...

11 January 2025, 14 min

Large Concept Models: Language Modeling in a Sentence Representation Space | #ai #2024 #genai

Paper: https://scontent-dfw5-1.xx.fbcdn.net/... This research paper introduces Large Concept Models (LCMs), a novel approach to language modeling that operates on sentence embeddings instead of indi...

6 January 2025, 29 min

DeepSeek v3 | #ai #2024 #genai

Technical Report: https://arxiv.org/pdf/2412.19437 GitHub: https://github.com/deepseek-ai/DeepSe... This research paper introduces DeepSeek-V3, a 671-billion-parameter Mixture-of-Experts (MoE) large ...

31 December 2024, 28 min

VISION TRANSFORMERS NEED REGISTERS | #ai #2024 #genai #meta

Paper: https://arxiv.org/pdf/2309.16588 This research paper examines artifacts in vision transformer feature maps, specifically high-norm tokens appearing in non-informative image areas. The authors ...

30 December 2024, 33 min

Byte Latent Transformer: Scaling Language Models with Patches | #ai #2024 #genai

Paper: https://arxiv.org/pdf/2412.09871v1.pdf The paper introduces the Byte Latent Transformer (BLT), a novel large language model architecture that processes raw byte data without tokenization. BLT ...

27 December 2024, 21 min

CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | #ai #2024 #genai

This research paper introduces CosyVoice 2, an improved streaming speech synthesis model. Building upon its predecessor, CosyVoice 2 utilizes advancements in large language models (LLMs) and incorpora...

27 December 2024, 20 min