Thanks to Charles's conversion scripts, I've converted several of the new InternLM2 models into Llama format. I also made ExLlamaV2 quants of them while I was at it.

You can find them here:

https://huggingface.co/bartowski?search_models=internlm2

Note: the chat models seem to do something odd, outputting [UNUSED_TOKEN_145] in a way that seems equivalent to <|im_end|>. I'm not sure why, but they work fine despite outputting that token at the end.
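If the trailing token bothers you, it can be trimmed in post-processing. Here's a minimal sketch, assuming you already have the generated text as a plain string and that both markers mean end-of-turn (that equivalence is my guess from the behaviour above, not something confirmed). `STOP_MARKERS` and `strip_end_marker` are hypothetical names for illustration, not part of ExLlamaV2 or any other library.

```python
# Markers treated as end-of-turn. Treating [UNUSED_TOKEN_145] as equivalent
# to <|im_end|> is an assumption based on the observed output.
STOP_MARKERS = ("[UNUSED_TOKEN_145]", "<|im_end|>")


def strip_end_marker(text: str, markers=STOP_MARKERS) -> str:
    """Cut the text at the earliest stop marker, if one is present."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    # Drop trailing whitespace left in front of the marker.
    return text[:cut].rstrip()


if __name__ == "__main__":
    print(strip_end_marker("Hello there![UNUSED_TOKEN_145]"))
```

Most frontends also let you add custom stop strings, which gets you the same result without touching the output afterwards.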

0 comments

LocalLLaMA

noneabove1182


WizardLM/WizardCoder-33B-V1.1 released!

WizardLM/WizardCoder-33B-V1.1 · Hugging Face

https://huggingface.co/WizardLM/WizardCoder-33B-V1.1

We’re on a journey to advance and democratize artificial intelligence through open source and open science.


11 comments


Microsoft announces WaveCoder

https://twitter.com/_akhaliq/status/1739486811100004513?t=3dcn2vphG5G-1boaLBQH6w

0 comments


Mixture of Experts Explained (Huggingface blog)

Mixture of Experts Explained

https://huggingface.co/blog/moe


0 comments


Mistral releases version 0.2 of their 7B model

La plateforme

https://mistral.ai/news/la-plateforme/

Our first AI endpoints are available in early access.

5 comments


Mistral drops a new magnet download

https://twitter.com/MistralAI/status/1733150512395038967?t=1qjjZauoJSPikKFkNto2kg&s=19

2 comments


Orca 2: Teaching Small Language Models How to Reason

https://www.microsoft.com/en-us/research/blog/orca-2-teaching-small-language-models-how-to-reason/

At Microsoft, we’re expanding AI capabilities by training small language models to achieve the kind of enhanced reasoning and comprehension typically found only in much larger models.

7 comments

My personal collection of interesting models I've quantized from the past week (yes, just a week)

itsme2417/PolyMind: A multimodal, function calling powered LLM webui.

Introducing Nomic Embed: A Truly Open Embedding Model

InternLM2 models llama-fied
