Instella

AMD's Instella is here to shake things up in open-source language models. It is designed to be easy to use and efficient, in contrast to alternatives that can be expensive or restrictively licensed.
Instella is a text-only language model with 3 billion parameters. It handles long passages and diverse language patterns, making it suitable for many uses, and it supports a context window of up to 4,096 tokens.
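To make that concrete, here is a minimal sketch of loading the model and generating text with Hugging Face Transformers. The repository name amd/Instella-3B and the use of trust_remote_code are assumptions for illustration; check the official model card for the exact identifiers.

```python
# Minimal sketch, assuming the model is published on the Hugging Face Hub
# under "amd/Instella-3B" (an assumption -- verify against the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/Instella-3B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Explain what a context window is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The 4,096-token context limit covers the prompt and generated tokens combined.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```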
Key Features
Instella has 36 decoder layers and 32 attention heads, and it uses a tokenizer with a vocabulary of roughly 50,000 tokens. Its training is optimized with FlashAttention-2, Torch Compile, and Fully Sharded Data Parallelism (FSDP).
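To show what those optimizations look like in practice, the sketch below wires them together in plain PyTorch: FlashAttention-2 at load time, FSDP for parameter sharding, and torch.compile for kernel fusion. This is not AMD's actual training code; the repository name and every setting shown are illustrative assumptions.

```python
# Illustrative sketch of the optimizations mentioned above, assuming a
# torchrun-style multi-GPU launch. Repository name and settings are assumptions.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

dist.init_process_group("nccl")  # RCCL backs the "nccl" backend on AMD GPUs
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = AutoModelForCausalLM.from_pretrained(
    "amd/Instella-3B",                        # assumed repository name
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn and checkpoint support
    trust_remote_code=True,
).cuda()

model = FSDP(model, use_orig_params=True)  # shard parameters across ranks
model = torch.compile(model)               # fuse kernels for faster training steps
```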
Benefits
Instella's open-source nature makes it well suited to both academic research and real-world applications, and its training and performance optimizations help it run efficiently.
Use Cases
Instella is versatile and can be used in many applications. Its instruction-tuned variants, such as Instella-3B-Instruct, are well suited to tasks that require a deep understanding of questions and context-aware responses. The model has been evaluated on several benchmarks, showing an average improvement of around 8 percent over other open-source models of similar scale.
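As a rough illustration of how an instruction-tuned checkpoint is typically used, the sketch below sends a chat-formatted prompt through the model. The repository name amd/Instella-3B-Instruct and the presence of a built-in chat template are assumptions; consult the model card for the exact usage.

```python
# Hedged sketch of chat-style generation with the instruction-tuned variant.
# Assumes the tokenizer ships with a chat template; verify on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amd/Instella-3B-Instruct"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [{"role": "user", "content": "Summarize the benefits of open-weight models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```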
Instella-VL-1B, a vision-language model, is also part of the family. It is designed for image understanding and pairs a vision encoder with the language model, performing well on both general vision-language tasks and OCR-related benchmarks.
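For completeness, here is a generic vision-language inference sketch. Whether Instella-VL-1B loads through these generic Transformers Auto classes is an assumption, as is the repository name and preprocessing, so treat this purely as an outline and defer to the official model card.

```python
# Generic vision-language inference sketch; the repository name "amd/Instella-VL-1B"
# and the use of these Auto classes are assumptions, not the documented API.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "amd/Instella-VL-1B"  # assumed repository name
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

image = Image.open("receipt.png")  # any local image, e.g. for an OCR-style query
inputs = processor(
    text="What text appears in this image?", images=image, return_tensors="pt"
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```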
Cost Price
The article did not provide specific cost or pricing information for Instella.
Funding
The article did not provide specific funding information for Instella.
Reviews Testimonials
Users and researchers have found Instella to be a competitive and practical choice for a range of applications. Its open-source approach and clear methodology have been well received, making it a solid foundation for further research and development.