SOLAR 10.7B is a large language model with 10.7 billion parameters developed by Upstage AI, built on a Llama 2-style transformer architecture initialized with Mistral 7B weights. The model employs Depth Up-Scaling (DUS): the 32-layer base network is duplicated, a block of layers is trimmed from each copy, and the remainders are concatenated depth-wise, yielding a 48-layer architecture. Released in both pretrained and instruction-tuned variants under open licensing, SOLAR 10.7B demonstrates competitive performance on standard benchmarks through multi-stage training that includes continued pretraining, instruction fine-tuning, and alignment optimization.
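The layer arithmetic behind DUS can be illustrated with a minimal sketch. This assumes the configuration reported for SOLAR 10.7B (a 32-layer base with 8 layers trimmed per copy); the function name and exact trim placement are illustrative, not the authors' code.

```python
def dus_layer_indices(n_layers: int = 32, n_trim: int = 8) -> list[int]:
    """Return base-model layer indices composing the up-scaled network.

    The base network is duplicated; the top n_trim layers are dropped
    from the first copy and the bottom n_trim from the second, and the
    remainders are concatenated depth-wise.
    """
    first = list(range(0, n_layers - n_trim))   # layers 0..23 of copy 1
    second = list(range(n_trim, n_layers))      # layers 8..31 of copy 2
    return first + second                       # 2 * (32 - 8) = 48 layers


scaled = dus_layer_indices()
print(len(scaled))  # 48
```

Note that the middle layers (8–23 here) appear twice in the stacked model, which is why DUS is followed by continued pretraining to recover performance after duplication.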