Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers
Together Research has introduced the StripedHyena models: StripedHyena-Hessian-7B (SH 7B) and StripedHyena-Nous-7B (SH-N 7B). The models are designed to improve training and inference performance on long-context sequence modeling tasks. They are the first models built on an alternative architecture to be competitive with the best open-source Transformers in both short- and long-context evaluations, achieving comparable or better performance with faster and […]