
Cohere Introduces Aya Vision AI Model for Image and Text Understanding

Cohere has introduced Aya Vision, an AI model designed to handle text and image-based tasks such as generating captions, answering visual questions, and translating text in 23 major languages. The company aims to bridge language gaps in AI models, making technical advancements more accessible. 

Aya Vision is available in two versions: Aya Vision 32B, which outperforms models more than twice its size, and Aya Vision 8B, which beats some models ten times larger. Both models perform strongly on vision-language benchmarks, with Cohere reporting that they outperform Meta's Llama-3.2 90B Vision on certain tasks. Cohere offers Aya Vision for free through WhatsApp and on the Hugging Face platform under a Creative Commons Attribution-NonCommercial 4.0 license, which prohibits commercial use.
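For readers who want to try the Hugging Face release, the sketch below shows one way to load the 8B checkpoint and ask a visual question with the transformers library. The repo id (CohereForAI/aya-vision-8b), the example image URL, and the chat format are assumptions here; check the model card on Hugging Face for the definitive usage and license terms.

```python
# Minimal sketch: loading Aya Vision 8B and running a visual question.
# Repo id, image URL, and prompt format are assumptions, not confirmed by the article.
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "CohereForAI/aya-vision-8b"  # assumed Hugging Face repo id

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# Build a chat-style request pairing an image with a multilingual instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder image
            {"type": "text", "text": "Describe this image in Spanish."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```

Because the checkpoint is released under a non-commercial license, this kind of local experimentation is permitted, but commercial deployment is not.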

To train Aya Vision, Cohere drew on diverse English datasets, generated synthetic annotations for them, and translated the results into the model's other supported languages. These annotations, which pair images with descriptive labels, help the model learn to recognize and interpret images accurately across languages.
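As an illustration of this general recipe (a sketch of the idea, not Cohere's actual pipeline), the snippet below translates English image captions into another language with an off-the-shelf translation model, producing the kind of multilingual image-caption pairs such training relies on. The dataset, file paths, and translation checkpoint are all hypothetical examples.

```python
# Illustrative sketch only: turning English image captions into multilingual
# training pairs via machine translation. Not Cohere's actual pipeline.
from transformers import pipeline

# Hypothetical English caption data: (image_path, caption) pairs.
english_captions = [
    ("images/dog.jpg", "A brown dog catching a frisbee in a park."),
    ("images/market.jpg", "A crowded outdoor market with fruit stalls."),
]

# Any English-to-French translation model would do; this checkpoint is one example.
translator = pipeline("translation_en_to_fr", model="Helsinki-NLP/opus-mt-en-fr")

multilingual_pairs = []
for image_path, caption in english_captions:
    translated = translator(caption)[0]["translation_text"]
    multilingual_pairs.append(
        {"image": image_path, "lang": "fr", "caption": translated}
    )

print(multilingual_pairs)
```

In practice, a pipeline like this would be repeated across many target languages and combined with quality filtering, but the core idea is the same: existing English annotations become synthetic multilingual supervision.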

The use of AI-generated synthetic data is becoming a trend among AI firms as real-world data sources become scarcer. OpenAI and other competitors are also adopting this method, with research indicating that synthetic data plays a key role in AI development.

Aya Vision marks a major step forward in AI-powered multimodal learning. Cohere’s approach to making such models available for free aligns with its goal of advancing AI research and accessibility. With its superior performance and multilingual capabilities, Aya Vision has the potential to set new standards in visual AI applications.

