Command A is a transformer-based language model family developed by Cohere and Cohere Labs, featuring a hybrid attention mechanism that supports context lengths up to 256,000 tokens. The model incorporates grouped-query attention, shared input/output embeddings, and SwiGLU activation functions for enhanced efficiency. Training employs supervised fine-tuning, reinforcement learning from human feedback, and merging of specialized expert models across domains including safety, mathematics, and code generation. The model demonstrates competitive performance on academic benchmarks, supports 23 languages, and provides robust tool-use functionality.
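For readers unfamiliar with the SwiGLU feed-forward pattern mentioned above, the following is a minimal PyTorch sketch of such a gated block. The layer names, toy dimensions, and the omission of biases are illustrative assumptions, not the published Command A configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUFeedForward(nn.Module):
    """Gated feed-forward block using the SwiGLU activation.

    Illustrative sketch only: sizes and the bias-free projections are
    assumptions for this example, not Command A's actual hyperparameters.
    """

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, d_ff, bias=False)  # gating branch
        self.up_proj = nn.Linear(d_model, d_ff, bias=False)    # value branch
        self.down_proj = nn.Linear(d_ff, d_model, bias=False)  # project back to model width

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: SiLU(x @ W_gate) multiplied elementwise with (x @ W_up),
        # then projected back down to the model dimension.
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


if __name__ == "__main__":
    block = SwiGLUFeedForward(d_model=512, d_ff=2048)  # toy dimensions
    out = block(torch.randn(2, 16, 512))                # (batch, seq, d_model)
    print(out.shape)                                     # torch.Size([2, 16, 512])
```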