Microsoft Research has trained Turing-NLG, a transformer-based generative language model with over 17 billion parameters. It performs very well, answering many natural language questions directly.


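To make "answering questions directly" concrete, here is a minimal sketch of generative question answering with the Hugging Face transformers library. Turing-NLG itself is not publicly downloadable, so the sketch substitutes the freely available gpt2 checkpoint; the prompt format and decoding settings are illustrative assumptions, not Microsoft's setup.

```python
from transformers import pipeline

# Generative QA sketch: a language model continues a Q/A-style prompt.
# gpt2 stands in for Turing-NLG, which is not publicly released.
generator = pipeline("text-generation", model="gpt2")

prompt = "Q: What is the capital of France?\nA:"
result = generator(prompt, max_new_tokens=10, do_sample=False)
print(result[0]["generated_text"])
```

The point is that the answer comes straight out of the model's next-token predictions, with no retrieval step or task-specific head; larger models simply get better at this.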
What is the right size? How much bigger can these models get? We are rapidly approaching human levels of natural language performance (but not comprehension).
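One practical constraint on "how much bigger" is memory. A back-of-the-envelope sketch, assuming fp16 weights and a standard mixed-precision Adam training setup (the byte counts are conventional approximations, not figures from Microsoft's announcement):

```python
# Rough memory footprint of a 17-billion-parameter model (assumed numbers).
params = 17e9

# Inference: fp16 weights take 2 bytes per parameter.
inference_gb = params * 2 / 1e9
print(f"fp16 weights for inference: ~{inference_gb:.0f} GB")  # ~34 GB

# Training with mixed-precision Adam: fp16 weights + fp16 gradients
# + fp32 master weights + fp32 momentum + fp32 variance ~= 16 bytes/param.
training_gb = params * 16 / 1e9
print(f"rough training-state footprint: ~{training_gb:.0f} GB")  # ~272 GB
```

Even the inference footprint exceeds a single 32 GB GPU, which is why models at this scale depend on model parallelism and memory-optimization techniques rather than a single device.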