Mythalion 13B is a generative language model tailored for fictional writing and entertainment-focused conversational applications. It is the result of a collaborative merge between PygmalionAI and Gryphe that combines strengths of both parties' prior models into a tool for roleplay (RP) and chat-based storytelling. Like many contemporary language models, Mythalion 13B is built atop the foundational Llama-2 architecture and is openly released for both research and commercial use.
Model Development and Release
Mythalion 13B was introduced as part of the PygmalionAI initiative to advance text-generation models with an emphasis on creative writing and interactive roleplay. The model is a merge, produced with advanced merging techniques, of Pygmalion-2 13B and Gryphe's MythoMax L2 13B, both of which are themselves derivatives of Meta's Llama-2. The release was aimed particularly at the RP community, with a focus on generating longer, in-character responses and sustaining conversational fiction.
The development incorporated community feedback, notably from testers who reported improved performance relative to MythoMax L2 13B in roleplay and chat scenarios. Technical details of the merge can be found in the official PygmalionAI blog post, which also offers guidance on use with various chat interfaces.
Model Architecture
Mythalion 13B’s architecture is grounded in the Llama-2 transformer design. It comprises approximately 13 billion parameters and utilizes 16-bit floating-point (FP16) tensors, providing a balance between model expressiveness and computational efficiency. The model was built using the Axolotl training framework, which supports modular experiment workflows for large language model research.
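As a concrete illustration, the following is a minimal sketch of loading the model in FP16 with the Hugging Face transformers library. The repository id and the availability of a GPU with sufficient memory (roughly 26 GB for FP16 weights) are assumptions about the deployment, not details confirmed by the model's documentation.

```python
# A minimal sketch of loading Mythalion 13B in FP16 with Hugging Face
# transformers. The repo id and hardware are assumptions, not specifics
# from the model's documentation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PygmalionAI/mythalion-13b"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 16-bit weights, matching the model card
    device_map="auto",          # requires the accelerate package
)
```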
This architecture enables the model to produce long-form, context-rich dialogue while remaining attentive to narrative structure and character consistency throughout extended exchanges. The merge blends the respective strengths of the underlying models, yielding behavior tuned for creative output and in-character roleplay.
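The exact merge recipe has not been published in full, so the sketch below is illustrative only: it shows a plain linear weight blend between two same-architecture Llama-2 derivatives as a stand-in for the general technique. The repository ids and the blend ratio alpha are assumptions, and merging two 13B FP16 checkpoints in memory requires roughly 50 GB of RAM.

```python
# Illustrative sketch of a linear weight blend between two Llama-2
# derivatives. The actual Mythalion merge recipe is not public; repo ids
# and the alpha ratio here are assumptions for demonstration.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "PygmalionAI/pygmalion-2-13b", torch_dtype=torch.float16)
other = AutoModelForCausalLM.from_pretrained(
    "Gryphe/MythoMax-L2-13b", torch_dtype=torch.float16)

alpha = 0.5  # hypothetical blend ratio
other_state = other.state_dict()
merged = {name: alpha * tensor + (1.0 - alpha) * other_state[name]
          for name, tensor in base.state_dict().items()}

base.load_state_dict(merged)
base.save_pretrained("mythalion-style-merge")  # illustrative output path
```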
Training Techniques and Data
For training, Mythalion 13B utilizes specialized conversational prompts characterized by explicit role tokens: <|system|>, <|user|>, and <|model|>. The <|system|> token injects meta-context or out-of-band narrative guidance, <|user|> marks the user's direct contributions, and <|model|> prompts the AI to generate its response. This conversational scaffolding enables chaining of turns into multi-step dialogues, a design choice that mirrors the natural RP and storytelling formats seen in user communities, as the sketch below illustrates.
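The following minimal sketch assembles a multi-turn prompt in this Pygmalion/Metharme format. The build_prompt helper, persona text, and dialogue are invented for illustration; only the role tokens themselves come from the format described above.

```python
# A minimal sketch of assembling a multi-turn Pygmalion/Metharme prompt.
# The helper and the example persona/dialogue are invented; only the
# <|system|>, <|user|>, and <|model|> tokens come from the documented format.
def build_prompt(system: str, history: list[tuple[str, str]], user_msg: str) -> str:
    """Chain prior (user, model) turns after a <|system|> preamble,
    ending with an open <|model|> token so generation continues in character."""
    prompt = f"<|system|>{system}"
    for past_user, past_model in history:
        prompt += f"<|user|>{past_user}<|model|>{past_model}"
    return prompt + f"<|user|>{user_msg}<|model|>"

prompt = build_prompt(
    system="Enter RP mode. You are Captain Mira, a wry starship pilot.",
    history=[("Hello, captain.", '"Welcome aboard," Mira says, tipping her hat.')],
    user_msg="What's our heading?",
)
```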
The core component Pygmalion-2 13B was trained on data that includes a diverse selection of fictional dialogues, character-driven scenarios, and entertainment-oriented exchanges, aligning with the model’s intended use cases.
Applications and Usage
Mythalion 13B is intended primarily for fictional and entertainment contexts, particularly RP chat environments. Users engage the model to enact characters, develop plotlines, and simulate interactive storytelling. Community assessments note its ability to generate extended responses while maintaining persona-specific consistency.
The model supports two prompting formats to maximize compatibility with popular chat applications. The Alpaca format structures prompts as instruction-response pairs, while the Pygmalion/Metharme format leverages the role tokens defined above for nuanced conversation management. Details concerning these formats and recommended generation settings are available via the PygmalionAI documentation.
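For comparison with the role-token format shown earlier, below is one common rendering of the Alpaca instruction format. Exact headers and spacing vary between frontends, so this should be read as a typical layout rather than a canonical template.

```python
# One common rendering of the Alpaca instruction format; headers and
# spacing vary between frontends, so this is a typical layout rather
# than a canonical template.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(
    instruction="Continue the scene: the detective opens the vault."
)
```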
Limitations and Responsible Use
While capable of creative entertainment, Mythalion 13B was not fine-tuned for safety, factual accuracy, or moderation. The training data of its constituent models includes unfiltered internet text, which may lead to outputs that are profane, offensive, or factually incorrect. The developers recommend restricting its use to designated fictional and entertainment purposes, as outputs may be unsuitable for other contexts or sensitive applications.
The model's licensing aligns with the Llama-2 community license, allowing both non-commercial and commercial use cases, provided users adhere to the responsible AI use guidelines specified therein.