Xwin 70B is a large language model (LLM) developed as part of the broader Xwin-LM project, which focuses on advancing open-source alignment technologies for LLMs through techniques such as supervised fine-tuning, reward modeling, rejection sampling, and reinforcement learning from human feedback (RLHF). Released in September 2023, Xwin 70B is built on the Llama 2 architecture and uses RLHF as a central methodology for improving communicative effectiveness and safety.
The model and its family are released under the Llama 2 License and are intended to foster reproducible, transparent research in model alignment and scalable conversational systems.
Model Architecture and Alignment Techniques
Xwin 70B is based on the Llama 2 70B parameter model, which provides a foundation for extended context and generative capabilities. The Xwin-LM team specifically focuses on enhancing alignment through a multi-step process that includes supervised fine-tuning to guide base behaviors, reward modeling to assess desirable system outputs, and reinforcement learning from human feedback to iteratively adapt the model based on user preference data.
Core to Xwin 70B's alignment is its use of RLHF, whereby human annotators provide feedback on model outputs to create a reward function that further tunes the system. This process is designed to yield more helpful, precise, and polite responses consistent with user expectations. The model also employs a conversational prompt format pioneered by Vicuna and incorporated by FastChat, which structures multi-turn dialogues for improved interaction.
Technical Features and Inference
Xwin 70B emphasizes stable and reproducible conversational alignment while maintaining compatibility with widely adopted machine learning frameworks. The model utilizes a prompt format that begins with an introductory description of the chat, followed by alternating user and assistant turns, as exemplified below:
“A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user’s questions. USER: Hi! ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT: I am Xwin-LM.</s>”
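The template above can be assembled programmatically. The following sketch builds a Vicuna-style multi-turn prompt matching the example; the helper name `build_prompt` and its signature are illustrative, not part of any official Xwin-LM codebase.

```python
def build_prompt(turns):
    """Assemble a Vicuna-style prompt for Xwin-LM from (user, assistant) pairs.

    The assistant reply of the final pair may be None, leaving the prompt
    open-ended so the model generates the next assistant turn.
    Note: this helper is a sketch, not an official Xwin-LM utility.
    """
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    prompt = system + " "
    for user_msg, assistant_msg in turns:
        prompt += f"USER: {user_msg} ASSISTANT:"
        if assistant_msg is not None:
            # Completed assistant turns are closed with the </s> end token.
            prompt += f" {assistant_msg}</s>"
    return prompt

# Open-ended prompt ready for the model to complete:
prompt = build_prompt([("Hi!", "Hello."), ("Who are you?", None)])
```

Keeping the exact spacing and the `</s>` terminator after each completed assistant turn mirrors the published template, which the project recommends following strictly.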
For inference, Xwin 70B is compatible with Hugging Face's AutoModelForCausalLM and AutoTokenizer classes, and is optimized for high-speed inference using vLLM thanks to its Llama 2-based implementation. Strict adherence to the prescribed conversational template is recommended to ensure coherent and contextually relevant outputs. Suggested inference settings include a max_new_tokens value of 4096 and a temperature parameter of around 0.7 to balance creativity and determinism in responses.
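The suggested settings can be collected into a generation configuration and passed to the transformers API. The dictionary below uses only the values stated above; the repository ID and the commented-out loading code are illustrative assumptions about a typical Hugging Face workflow, not verified project code.

```python
# Suggested decoding settings for Xwin 70B (values from the section above).
generation_kwargs = {
    "max_new_tokens": 4096,  # allow long-form answers
    "temperature": 0.7,      # balances creativity and determinism
    "do_sample": True,       # temperature only takes effect when sampling
}

# With Hugging Face transformers, the call would look roughly like the
# following (shown as comments because the 70B weights are far too large
# to download here; the repo ID is an assumption):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("Xwin-LM/Xwin-LM-70B-V0.1")
#   model = AutoModelForCausalLM.from_pretrained(
#       "Xwin-LM/Xwin-LM-70B-V0.1", device_map="auto")
#   inputs = tok(prompt, return_tensors="pt").to(model.device)
#   output = model.generate(**inputs, **generation_kwargs)
#   print(tok.decode(output[0][inputs["input_ids"].shape[1]:],
#                    skip_special_tokens=True))
```

Because sampling is enabled, the temperature of 0.7 softens the output distribution without making generations fully deterministic.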
Benchmark Performance
Following release, Xwin 70B achieved strong performance across a number of public benchmarks. In the AlpacaEval evaluation, Xwin 70B recorded a 95.57% win-rate against Text-Davinci-003. It also achieved a 60.61% win-rate versus GPT-4 and 87.50% versus ChatGPT in the same benchmark. The model generally exhibited higher relative scores in head-to-head comparisons against models such as Llama-2-70B-Chat and WizardLM-70B in several evaluations.
Xwin 70B also scored highly on the NLP foundation tasks tracked by the Open LLM Leaderboard. The reported average across core tasks was 71.8, with per-task scores of 69.6 on MMLU (5-shot), 70.5 on ARC (25-shot), 60.1 on TruthfulQA (0-shot), and 87.1 on HellaSwag (10-shot). These benchmarks span question-answering, knowledge, and commonsense-reasoning challenges.
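The reported average can be checked against the four per-task scores; a quick sanity check:

```python
# Per-task scores reported for Xwin 70B on the Open LLM Leaderboard.
scores = [69.6, 70.5, 60.1, 87.1]  # MMLU, ARC, TruthfulQA, HellaSwag
average = sum(scores) / len(scores)
print(f"{average:.1f}")  # consistent with the reported average of 71.8
```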
Xwin-LM Model Family
The Xwin-LM project has released several models alongside Xwin 70B, notably the Xwin-LM-13B-V0.1 and Xwin-LM-7B-V0.1 models. These models are also based on the Llama 2 architecture and employ the same alignment strategies, including RLHF.
The 13B variant achieved a 91.76% win-rate on AlpacaEval against Text-Davinci-003 and recorded win-rates of 81.79% versus ChatGPT and 55.30% versus GPT-4. Its average score on NLP foundation tasks was 61.9. The 7B variant recorded an 87.82% win-rate on AlpacaEval, with corresponding win-rates of 76.40% and 47.57% against ChatGPT and GPT-4, respectively, and an average NLP score of 58.4. Both models are distributed under the Llama 2 License, consistent with the project's open-release approach.
Limitations and Future Directions
While Xwin 70B and its related models demonstrate strong results on automated benchmarks, the Xwin-LM project has outlined areas for continued development. One primary aim is to release more comprehensive source code to support greater scientific transparency and reproducibility. Furthermore, the project envisions enhancing capabilities in mathematics and reasoning to address domains that require more rigorous logic or step-wise solution generation.
Ongoing research is also anticipated to address any limitations in prompt flexibility, context retention, or specific use-case adaptation as highlighted in the broader LLM community.