Finetune LLMs via the Finetuning Hub

The Lounge
Tags: com, hosting, ai-models, beta-testing, performance
rsaha7 (#1)

Hi community, I have been benchmarking publicly available LLMs these past couple of weeks. More precisely, I am interested in the finetuning piece, since a lot of businesses are starting to entertain the idea of self-hosting LLMs trained on their proprietary data rather than relying on third-party APIs. GitHub repo: https://github.com/georgian-io/LLM-Finetuning-Hub

At this point, I am tracking the following four pillars of evaluation that businesses typically look into:

  • Performance
  • Time to train an LLM
  • Cost to train an LLM
  • Inference (throughput / latency / cost per token)

For each LLM, my aim is to benchmark it on popular tasks, i.e., classification and summarization, and to compare the models against each other. So far, I have benchmarked Flan-T5-Large, Falcon-7B, and RedPajama, and have found them to be very efficient in low-data situations, i.e., when there are very few annotated samples. Llama2-7B/13B and Writer's Palmyra are in the pipeline. But there are so many LLMs out there! If this work interests you, it would be great to join forces. Feedback is always welcome :) Happy hacking!
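
To make this concrete, below is a minimal sketch of the kind of LoRA finetuning run I am benchmarking. The model name, dataset, and hyperparameters are illustrative assumptions for a summarization task, not the repo's exact configuration:

```python
# Illustrative LoRA finetuning sketch (Hugging Face transformers + peft).
# Everything here is a stand-in chosen for demonstration purposes.
import time

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Trainer,
    TrainingArguments,
)

model_name = "google/flan-t5-large"  # one of the models benchmarked above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# LoRA trains a small set of adapter weights instead of the full model,
# which is what keeps the "time to train" and "cost to train" pillars low.
model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=32, lora_dropout=0.05, task_type="SEQ_2_SEQ_LM"),
)
model.print_trainable_parameters()

# Small public summarization set as a stand-in for the low-data situation;
# a business would swap in its own proprietary corpus here.
dataset = load_dataset("samsum", split="train[:500]")

def preprocess(batch):
    inputs = tokenizer(batch["dialogue"], truncation=True, max_length=512)
    inputs["labels"] = tokenizer(
        text_target=batch["summary"], truncation=True, max_length=128
    )["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="flan-t5-lora-summarization",
        per_device_train_batch_size=8,
        num_train_epochs=1,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)

start = time.time()
trainer.train()
print(f"Training time: {time.time() - start:.1f}s")  # the time-to-train pillar
```

On the inference pillar, throughput and latency can be measured the same way, e.g. by timing model.generate() over a batch of held-out inputs and dividing generated tokens by wall-clock time.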

In reply to rsaha7 (#1)

Southmountain (#2)

What is your use case for LLMs?

      diligent hands rule....

In reply to Southmountain (#2)

k5054 (#3)

        Landing on the Moon?

        Keep Calm and Carry On

In reply to rsaha7 (#1)

Lost User (#4)

We have a number of LLMs in our parish; maybe I could ask one of them.

In reply to rsaha7 (#1)

BillWoodruff (#5)

More details are needed. Why not compare the most-used ones?

            «The mind is not a vessel to be filled but a fire to be kindled» Plutarch
