Bilateral Reference for High-Resolution Dichotomous Image Segmentation | #ai #genai #llm #cv #2024
AI Today | 27 November 2024

Paper: https://arxiv.org/pdf/2401.03407
Github: https://github.com/ZhengPeng7/BiRefNet

This research introduces BiRefNet, a deep learning framework for high-resolution dichotomous image segmentation. BiRefNet uses a bilateral reference mechanism that incorporates both original image patches and gradient maps to segment fine details more accurately. The framework consists of a localization module and a reconstruction module, and it is trained with multi-stage supervision and other training strategies. In extensive experiments, BiRefNet outperforms existing state-of-the-art methods across several image segmentation tasks. The authors also highlight the model's potential applications and its adoption by the community in various third-party projects.

Tags: ai, computer vision, cv, nankai university, artificial intelligence, arxiv, research, paper, publication, lvm, large visual models, llm

Episodes (30)

SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with MotionAware Mem | #2024

Paper: https://arxiv.org/pdf/2411.11922 Github: https://github.com/yangchris11/samurai Blog: https://yangchris11.github.io/samurai/ The paper introduces SAMURAI, a novel visual object tracking method...

27 November 2024 | 14 min

Adding Error Bars to Evals: A Statistical Approach to LM Evaluations | #llm #genai #anthropic #2024

Paper: https://arxiv.org/pdf/2411.00640 This research paper advocates for incorporating rigorous statistical methods into the evaluation of large language models (LLMs). It introduces formulas for c...

27 November 2024 | 14 min

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | #ai #llm #alibaba #genai #2024

Paper: https://arxiv.org/pdf/2411.14405 Github: https://github.com/AIDC-AI/Marco-o1 The Alibaba MarcoPolo team introduces Marco-o1, a large reasoning model designed to excel in open-ended problem-sol...

27 November 2024 | 14 min

FLUX.1 Tools | #ai #computervision #cv #BlackForestLabs #2024

Github: https://github.com/black-forest-labs/... Black Forest Labs announced FLUX.1 Tools, a suite of four open-access and API-based models enhancing their FLUX.1 text-to-image model. FLUX.1 Fill exc...

27 November 2024 | 14 min

Tülu 3 opens language model post-training up to more tasks and more people | #ai #llm #allenai #2024

Blog: https://allenai.org/blog/tulu-3 The Allen Institute for Artificial Intelligence (Ai2) has released Tülu 3, an open-source family of post-trained language models. Unlike closed models fr...

27 November 2024 | 14 min

Multimodal Autoregressive Pre-training of Large Vision Encoders | #ai #computervision #apple #2024

Paper: https://arxiv.org/pdf/2411.14402 Github Link: https://github.com/apple/ml-aim This research introduces AIMV2, a family of large-scale vision encoders pre-trained using a novel multimodal auto...

27 November 2024 | 14 min