Statistical language models
This page contains a few useful links but needs updating.
Articles
Factored Language Models and Generalized Parallel Backoff
Tools
SRILM
Intro to the SRI Language Modeling toolkit
Introduction to SRILM Toolkit
MITLM
IRSTLM
A tutorial on the IRSTLM library
CMU SLM Toolkit
OpenGRM
NGramLibrary
KenLM: Faster and Smaller Language Model Queries
(repository)
KenLM: Faster and Smaller Language Model Queries
(paper)
Named entity recognition
CasEN
Data
List of corpora
Neural language models
FlauBERT: Unsupervised Language Model Pre-training for French
FlauBERT: Unsupervised Language Model Pre-training for French
(arXiv)
OPT-175B
Meta’s Challenge to OpenAI—Give Away a Massive Language Model: At 175 billion parameters, it’s as powerful as OpenAI’s GPT-3
Democratizing access to large-scale language models with OPT-175B
OPT: Open Pre-trained Transformer Language Models
(pdf)
EleutherAI
(GPT-NeoX, GPT-J)
BLOOM
(Hugging Face)
https://github.com/openai/openai-cookbook: Examples and guides for using the OpenAI API
Libraries
langchain: Building applications with LLMs through composability
Tokenizer
tiktoken
Analysis
How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources
Historical analogies for large language models
(forward-looking)
Chatbot
HuggingChat