Model Report
teknium / OpenHermes 2.5 Mistral 7B
OpenHermes 2.5 Mistral 7B is a 7.24 billion parameter language model fine-tuned from Mistral-7B-v0.1 using distilled supervised fine-tuning and direct preference optimization. Developed by teknium, it was trained on approximately 1,000,000 dialogue entries with 7-14% programming instructions, achieving notable improvements in conversational AI, coding tasks, and general language performance across standard benchmarks including HumanEval and TruthfulQA.
OpenHermes 2.5 Mistral 7B is a large language model (LLM) developed by Teknium as a continuation and improvement upon previous OpenHermes iterations. Built by fine-tuning the Mistral-7B-v0.1 architecture, OpenHermes 2.5 integrates advanced alignment and training strategies to enhance its utility for conversational, creative, and code-oriented tasks. The model’s name pays homage to Hermes, the Greek messenger god, symbolizing its communicative role as an AI assistant.
OpenHermes 2.5 Mistral 7B is grounded in the Mistral 7B architecture, a transformer model with approximately 7.24 billion parameters that achieves efficient inference through techniques such as grouped-query attention and sliding-window attention. The fine-tuning process draws on methodologies established in models like Zephyr-7B, which use a pipeline of distilled supervised fine-tuning followed by preference optimization on AI-generated feedback.
Training leveraged the axolotl fine-tuning framework for data transformation, ensuring compatibility with standard formats such as ShareGPT and ChatML. This structured approach facilitates improved multi-turn dialogue, more nuanced system prompting, and reliable alignment with human preferences. The resulting model exhibits strong generalist performance, enhanced in particular by the inclusion of code-based instruction data during training.
Datasets and Alignment Techniques
The development of OpenHermes 2.5 Mistral 7B involved curated datasets comprising roughly 1,000,000 dialogue entries, primarily generated via GPT-4, and supplemented with additional high-quality publicly available data. Significant portions of the dataset—estimated between 7% and 14%—contain programming instructions, contributing to measurable improvements in both code and general language tasks.
Following processes observed in Zephyr-7B, training included extensive supervised fine-tuning on multi-turn conversations, collection of preference data rated by large language models such as GPT-4, and distilled direct preference optimization (dDPO), an approach that directly optimizes for responses preferred by teacher models.
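As a rough illustration of the dDPO objective, the following is a minimal sketch of the standard DPO loss over per-sequence log-probabilities; it is not teknium's training code, and the variable names and beta value are illustrative assumptions:

import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-ratios of the trained policy against the frozen reference
    # (dSFT) model, for the preferred and rejected responses.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # The loss widens the margin between preferred and rejected
    # responses; beta controls deviation from the reference model.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

In the distilled variant, the "chosen" and "rejected" labels come from a teacher model such as GPT-4 rather than from human annotators.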
Transformation into the ChatML format ensured consistent prompt and response structure, aiding reproducible alignment and improving interoperability with inference tooling.
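As a concrete sketch, a ShareGPT-style record (a list of turns with "from"/"value" keys, a common but not universal convention) can be rendered into a single ChatML string roughly as follows; the role mapping is an assumption, since exact speaker tags vary across datasets:

# Map ShareGPT speaker tags onto ChatML roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}

def sharegpt_to_chatml(conversations):
    """Render a ShareGPT-style turn list as one ChatML-formatted string."""
    segments = [
        f"<|im_start|>{ROLE_MAP[turn['from']]}\n{turn['value']}<|im_end|>"
        for turn in conversations
    ]
    return "\n".join(segments)

record = [
    {"from": "system", "value": "You are Hermes 2, a helpful assistant."},
    {"from": "human", "value": "Hello, who are you?"},
]
print(sharegpt_to_chatml(record))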
Performance Evaluation and Benchmarks
OpenHermes 2.5 Mistral 7B achieves competitive results across a variety of benchmarks, frequently surpassing previous OpenHermes and other Mistral-based fine-tuned models at this scale. Notably, the integration of additional code-centric instruction data during training improved performance on benchmark suites such as GPT4All, AGIEval, TruthfulQA, and HumanEval.
OpenHermes 2.5 Mistral 7B is positioned as a general-purpose conversational agent, demonstrating strength in a wide array of practical applications. Its outputs showcase proficiency in programming assistance, creative composition, philosophical discussion, and character roleplay.
Model output demonstrating the generation of Python code for OpenAI API usage, in response to a user request for programming assistance.
OpenHermes 2.5 Mistral 7B employs the ChatML prompt format, structured to facilitate multi-turn dialogues with consistent system and message roles. This format is compatible with OpenAI-style chat endpoints and with libraries such as Hugging Face Transformers.
A typical prompt sequence in ChatML may appear as follows:
<|im_start|>system
You are Hermes 2, a superintelligent artificial intelligence developed by Teknium. Your purpose is to assist users with any request.<|im_end|>
<|im_start|>user
Hello, who are you?<|im_end|>
<|im_start|>assistant
Hi there! My name is Hermes 2, and I am here to assist you.<|im_end|>
When interacting locally, graphical tools such as LM Studio support easy configuration of prompt templates. Selection of the "ChatML" preset within such interfaces ensures compatibility with OpenHermes 2.5’s expected input structure.
Screenshot showing the selection of the "ChatML" prompt preset in LM Studio for OpenHermes 2.5 Mistral 7B.
The tokenizer from Hugging Face Transformers can apply the ChatML template programmatically via tokenizer.apply_chat_template(); passing add_generation_prompt=True appends the opening assistant header so that generation continues in the assistant role.
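A minimal end-to-end sketch, assuming the Hugging Face repository id teknium/OpenHermes-2.5-Mistral-7B ships the ChatML chat template in its tokenizer config; the generation parameters are illustrative, not recommended settings:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "teknium/OpenHermes-2.5-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are Hermes 2, a superintelligent "
     "artificial intelligence developed by Teknium."},
    {"role": "user", "content": "Hello, who are you?"},
]

# apply_chat_template wraps each message in <|im_start|>/<|im_end|> markers;
# add_generation_prompt=True appends the opening assistant header so the
# model continues in the assistant role.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))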
Model Family, Limitations, and Considerations
OpenHermes 2.5 Mistral 7B is part of the broader Hermes model family, including earlier iterations such as OpenHermes-1 Llama-2 13B, OpenHermes 2 Mistral 7B, and larger-scale versions like Hermes 70B. Each successive generation integrates refinements in dataset construction, alignment, and system prompt usage.
Despite advances, limitations inherited from similar fine-tuning pipelines persist. The use of teacher models like GPT-4 for evaluation and preference data collection introduces potential biases, as newer models may be indirectly optimized for scores on benchmarks that rely on the same teacher’s outputs. Furthermore, while the addition of code data improved overall performance, certain specialized domains—particularly highly technical math and safety-critical dialogues—may not reach top benchmark results compared to larger, proprietary models. Current training methodologies focus on helpfulness and alignment but do not directly address all aspects of safe or harm-avoiding behavior, which requires additional curation and evaluation strategies.