Model Report
upstage / SOLAR 10.7B
SOLAR 10.7B is a large language model developed by Upstage AI with 10.7 billion parameters and a transformer architecture based on Llama 2. The model employs Depth Up-Scaling (DUS), which increases network depth by duplicating, trimming, and concatenating the layers of a base network initialized with Mistral 7B weights, resulting in a 48-layer architecture. Released in both pretrained and instruction-tuned variants under open-source licensing, it demonstrates competitive performance on standard benchmarks through multi-stage training including continued pretraining, instruction fine-tuning, and alignment optimization.
SOLAR 10.7B is a large language model with 10.7 billion parameters developed by Upstage AI. It employs Depth Up-Scaling (DUS), an architectural approach for increasing model depth that, combined with subsequent training stages, is designed to enhance language understanding and instruction-following capabilities. Released in both pretrained and instruction-tuned variants, the model demonstrates competitive results across prominent machine learning benchmarks and is publicly available for research and development under open-source licensing.
Promotional banner for SOLAR-10.7B-Instruct, highlighting its identity within the Upstage AI model suite.
Architecture and Depth Up-Scaling
SOLAR 10.7B is built on a transformer architecture that extends the Llama 2 design. It introduces Depth Up-Scaling (DUS), a methodology that increases model depth while retaining architectural compatibility and training efficiency. The process begins by initializing a 32-layer Llama 2 network with pretrained weights from Mistral 7B, a choice that leverages an already strong base model while maintaining compatibility with the Llama 2 architecture, as described in the SOLAR 10.7B arXiv paper.
Depth Up-Scaling involves duplicating the base model, trimming the final layers from one copy and the leading layers from the duplicate, and concatenating the remaining layers to produce a deeper network. For SOLAR 10.7B, eight layers were removed from the tail of one 32-layer copy and eight from the head of the other, yielding a 48-layer architecture that balances performance against hardware constraints.
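The layer arithmetic of this procedure can be sketched in a few lines; the snippet below uses plain Python lists as stand-ins for transformer blocks and illustrates the described slicing, not Upstage's implementation.

```python
# Illustrative sketch of the Depth Up-Scaling layer arithmetic described above.
# Strings stand in for transformer blocks; this is not Upstage's code.
n_layers = 32    # 32-layer Llama 2-style base, initialized with Mistral 7B weights
n_trimmed = 8    # layers removed at the seam

base = [f"block_{i}" for i in range(n_layers)]   # first copy
duplicate = list(base)                           # duplicated copy

# Drop the final 8 layers of the first copy and the first 8 of the duplicate,
# then concatenate what remains into a single deeper network.
scaled = base[: n_layers - n_trimmed] + duplicate[n_trimmed:]

assert len(scaled) == 48                         # SOLAR 10.7B's final depth
```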
After the network's depth increases, continued pretraining is essential to recover and improve model performance, particularly to resolve heterogeneity at the junction where layers are joined. This strategy enables the creation of a larger-capacity model with minimal disruption to the training and inference workflows, distinguishing DUS from approaches such as Mixture-of-Experts that introduce additional complexity in expert routing and inference, as detailed in Upstage's model documentation.
Training Data, Fine-Tuning, and Alignment
The SOLAR 10.7B training pipeline comprises multiple phases, beginning with depthwise scaling and continued pretraining. The next major step is instruction fine-tuning, in which the model is trained to follow natural language instructions more robustly using open-source datasets including Alpaca-GPT4 and OpenOrca, as well as a synthetic mathematics question-answering dataset constructed to strengthen mathematical reasoning while avoiding benchmark leakage.
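As a rough illustration of what a reformatted instruction-tuning example might look like, the sketch below converts an Alpaca-style record into a single training string; the prompt template and field names are assumptions, since the exact format Upstage used is not specified here.

```python
# Hypothetical reformatting of an Alpaca-style instruction record into a single
# chat-style training string; the template and field names are assumptions.
def format_instruction_example(record: dict) -> str:
    instruction = record["instruction"]
    context = record.get("input", "")
    response = record["output"]
    user_turn = instruction if not context else f"{instruction}\n\n{context}"
    return f"### User:\n{user_turn}\n\n### Assistant:\n{response}"

example = {
    "instruction": "Summarize the idea of Depth Up-Scaling.",
    "input": "",
    "output": "Duplicate a pretrained model, trim layers at the seam, "
              "concatenate the copies, then continue pretraining.",
}
print(format_instruction_example(example))
```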
Instruction datasets are reformatted in a consistent style to optimize conversational and task-oriented performance, taking care to filter out data that could directly contaminate benchmark evaluations. Following instruction tuning, the model undergoes further alignment via sDPO (stepwise Direct Preference Optimization), an extended variant of standard DPO designed to align the model's outputs with human or strong-model preferences, such as those established by large reference models like GPT-4. This alignment stage uses both general-purpose and specialized synthetic datasets in a triplet format of {prompt, chosen, rejected} to reinforce accurate and preferred responses, following techniques similar to those described in HuggingFace documentation.
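To make the preference format and objective concrete, the sketch below shows one triplet and the standard DPO loss that sDPO builds on; the field names follow common DPO conventions, the numeric log-probabilities are invented, and the beta value is illustrative.

```python
import math

# One preference triplet in the {prompt, chosen, rejected} format described above;
# the contents are invented examples.
triplet = {
    "prompt": "What is 17 * 24?",
    "chosen": "17 * 24 = 408.",
    "rejected": "17 * 24 = 398.",
}

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO objective for one triplet, given summed log-probabilities of
    each response under the policy (pi_*) and a frozen reference model (ref_*)."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# The loss shrinks as the policy prefers the chosen response over the rejected
# one more strongly than the reference model does.
print(dpo_loss(pi_chosen=-12.0, pi_rejected=-15.0, ref_chosen=-13.0, ref_rejected=-14.0))
```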
The project has also experimented with merging models trained at different stages to harness task-specific strengths, illustrating the utility of model merging tools for post-training enhancement.
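One common merging technique is plain weight interpolation between checkpoints that share an architecture; the sketch below shows that approach under the assumption of identical state-dict keys, and is not necessarily the recipe used for SOLAR.

```python
import torch

# Naive linear interpolation of two compatible state dicts (same keys and shapes);
# one common merging approach, not necessarily the one used for SOLAR 10.7B.
def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}

# Toy tensors stand in for real checkpoint weights.
a = {"w": torch.tensor([1.0, 2.0])}
b = {"w": torch.tensor([3.0, 4.0])}
print(merge_state_dicts(a, b))   # {'w': tensor([2., 3.])}
```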
Benchmarks and Performance
SOLAR 10.7B has been systematically evaluated against leading open-source large language models on the HuggingFace Open LLM Leaderboard, using tasks such as ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K. The alignment-tuned variant, SOLAR-10.7B-Instruct, achieves an H6 average score of 74.20, positioning it competitively with models substantially larger in size.
On tasks demanding instruction-following and mathematical proficiency, the instruct model consistently demonstrates higher performance metrics relative to its pretrained counterpart, as documented in the model's research publication. This performance profile is attributed to the multi-stage training and alignment strategy, as well as the up-scaling and merging methodologies.
Applications and Use Cases
SOLAR 10.7B is primarily designed for research and engineering tasks involving general language understanding, generation, and instruction-following. The instruction-tuned variant excels in scenarios requiring structured task completion, question answering, and mathematical reasoning. Given its architecture and its licensing, Apache-2.0 for the base model and CC-BY-NC-4.0 for the instruction-tuned version, SOLAR 10.7B is particularly suitable for domain adaptation, further fine-tuning, and academic exploration.
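As a usage sketch, the instruction-tuned variant can be loaded with Hugging Face transformers roughly as follows; the repository ID, chat-template availability, and generation settings are assumptions, and running in fp16 requires on the order of 20 GB of GPU memory or quantization.

```python
# Minimal sketch of loading and prompting the instruction-tuned model with
# Hugging Face transformers; the repo ID and settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "upstage/SOLAR-10.7B-Instruct-v1.0"   # assumed HF repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # roughly 20 GB of weights in half precision
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain Depth Up-Scaling in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```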
While robust out-of-the-box, the pretrained variant is intended for further customization and benefits from application-specific fine-tuning to reach optimal performance in specialized domains. The versatility and adaptability enabled by its architecture and licensing make it an attractive resource for a broad range of linguistic and cognitive tasks in artificial intelligence research.
Limitations
Despite its strengths, SOLAR 10.7B presents certain constraints. The architectural choices for layer removal during up-scaling were guided by hardware considerations rather than exhaustive hyperparameter optimization, suggesting room for further empirical refinement. Like all large-scale language models, it inherits biases from its training data and presents significant computational demands during both training and inference, which may limit its accessibility for some users. Furthermore, substantial energy requirements raise environmental concerns. While depth up-scaling enables efficient enlargement, it initially causes a transient drop in performance that must be recovered through extended pretraining.
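To put the resource demands in rough numbers, the sketch below estimates the weight memory of a 10.7-billion-parameter model at common precisions; it ignores activations, KV cache, and framework overhead.

```python
# Back-of-the-envelope weight-memory estimate for a 10.7B-parameter model;
# ignores activations, KV cache, and framework overhead.
params = 10.7e9
for precision, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{precision:>9}: ~{gib:.1f} GiB of weights")
```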
The instruction-tuned model, although exhibiting improved alignment and performance on instruction-based tasks, may still necessitate further fine-tuning to realize specialized or domain-specific outcomes.
Release and Licensing
SOLAR 10.7B was first introduced publicly through an arXiv research paper published in December 2023, with incremental updates continuing into 2024. Both the base and instruct variants are available on HuggingFace under the licenses noted above, Apache-2.0 for the base model and CC-BY-NC-4.0 for the instruction-tuned variant, fostering wide accessibility and further experimentation, as described in the SOLAR 10.7B model documentation.