this post was submitted on 29 Jun 2023

LocalLLaMa


Magazine to talk about LLaMA (large language model created by Meta AI) and any related Open Source LLMs. Inspired by Reddit's /r/LocalLLaMA/ subreddit.

founded 1 year ago
 

TLDR: We trained a series of 7B LLMs named XGen-7B with standard dense attention on sequence lengths of up to 8K, for up to 1.5T tokens. We also fine-tuned the models on public-domain instructional data. The main takeaways are:

* On standard NLP benchmarks, XGen achieves comparable or better results
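The "standard dense attention" mentioned above is ordinary scaled dot-product attention, in which every query position attends to every key position, so the score matrix grows quadratically with sequence length (at the 8K context used here, that matrix alone is 8192 x 8192 per head). A minimal NumPy sketch, with illustrative function name and toy shapes:

```python
import numpy as np

def dense_attention(q, k, v):
    """Standard (dense) scaled dot-product attention.
    q, k, v: arrays of shape (seq_len, d).
    Every query attends to every key, so the score matrix is
    (seq_len, seq_len) -- quadratic in sequence length."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # dense (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ v

# Toy example: 4 positions, head dimension 8.
q = np.random.randn(4, 8)
k = np.random.randn(4, 8)
v = np.random.randn(4, 8)
out = dense_attention(q, k, v)   # shape (4, 8)
```

This is the vanilla attention used in most transformer LLMs; XGen's distinguishing choice is simply training that same dense mechanism at longer (8K) contexts rather than switching to a sparse or windowed approximation.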
