Emergent abilities of large language models
Scaling up language models has been shown to predictably improve performance and
sample efficiency on a wide range of downstream tasks. This paper instead discusses an …
Quaternion knowledge graph embeddings
In this work, we move beyond the traditional complex-valued representations, introducing
more expressive hypercomplex representations to model entities and relations for knowledge …
Long range arena: A benchmark for efficient transformers
Transformers do not scale very well to long sequence lengths largely because of quadratic
self-attention complexity. In recent months, a wide spectrum of efficient, fast Transformers …
Deep learning based recommender system: A survey and new perspectives
With the growing volume of online information, recommender systems have been an effective
strategy to overcome information overload. The utility of recommender systems cannot be …
Sparse sinkhorn attention
We propose Sparse Sinkhorn Attention, a new efficient and sparse method for learning to
attend. Our method is based on differentiable sorting of internal representations. Concretely, …
Printability region for 3D concrete printing using slump and slump flow test
Rheological studies are important for successful 3D concrete printing. The main challenge for
successful 3D concrete printing is the complex set of characteristics the materials must possess. …
Palm: Scaling language modeling with pathways
Large language models have been shown to achieve remarkable performance across a variety
of natural language tasks using few-shot learning, which drastically reduces the number …
Scaling instruction-finetuned language models
Finetuning language models on a collection of datasets phrased as instructions has been
shown to improve model performance and generalization to unseen tasks. In this paper we …
Palm 2 technical report
We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and
reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is …
The flan collection: Designing data and methods for effective instruction tuning
We study the design decisions of publicly available instruction tuning methods by reproducing
and breaking down the development of Flan 2022 (Chung et al., 2022). Through careful …