Meta unveils Llama 3.2: A multimodal AI model for image and text processing
Sep 26, 2024 at 5:21 PM

Meta has introduced Llama 3.2, its first open-source AI model that can process both images and text. This multimodal capability opens the door to advanced AI applications such as augmented reality tools, visual search engines, and document analysis systems. Llama 3.2 also improves real-time video understanding and content-based image sorting.

The release includes two vision models, with 11 billion and 90 billion parameters, and two lightweight text-only models, with 1 billion and 3 billion parameters, designed for mobile and lower-power hardware. Llama 3.2 supports Meta's hardware projects, including the Ray-Ban Meta glasses, where vision capabilities are essential. The company also plans to make Llama 3.2 run on Arm-based mobile platforms, expanding where the models can be deployed.
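For developers who want to experiment with the new vision models, below is a minimal sketch of querying the 11-billion-parameter variant through the Hugging Face transformers library. The model ID corresponds to Meta's gated Hugging Face release, which requires requesting access; the image path and prompt are purely illustrative.

```python
# Minimal sketch: image + text inference with Llama 3.2 11B Vision via
# Hugging Face transformers. Assumes approved access to the gated
# meta-llama repository and a GPU with enough memory for bfloat16 weights.
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("receipt.png")  # illustrative input image

# Build a chat-style prompt that interleaves the image with a question.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "What is the total amount on this receipt?"},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

inputs = processor(image, prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```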

The previous release, Llama 3.1, remains significant for the powerful text generation capabilities of its 405-billion-parameter model.

Sep 26, 2024 by Mauricio B. Holguin

