Sainbayar Sukhbaatar
2020 – today
- 2024
- [c21] Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston: Self-Rewarding Language Models. ICML 2024
- [i39] Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Sainbayar Sukhbaatar, Jing Xu, Jason Weston: Self-Rewarding Language Models. CoRR abs/2401.10020 (2024)
- [i38] Lucas Lehnert, Sainbayar Sukhbaatar, Paul McVay, Michael Rabbat, Yuandong Tian: Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping. CoRR abs/2402.14083 (2024)
- [i37] Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu: Teaching Large Language Models to Reason with Reinforcement Learning. CoRR abs/2403.04642 (2024)
- [i36] Sainbayar Sukhbaatar, Olga Golovneva, Vasu Sharma, Hu Xu, Xi Victoria Lin, Baptiste Rozière, Jacob Kahn, Daniel Li, Wen-tau Yih, Jason Weston, Xian Li: Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM. CoRR abs/2403.07816 (2024)
- [i35] Olga Golovneva, Zeyuan Allen-Zhu, Jason Weston, Sainbayar Sukhbaatar: Reverse Training to Nurse the Reversal Curse. CoRR abs/2403.13799 (2024)
- [i34] Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston: Iterative Reasoning Preference Optimization. CoRR abs/2404.19733 (2024)
- [i33] Olga Golovneva, Tianlu Wang, Jason Weston, Sainbayar Sukhbaatar: Contextual Position Encoding: Learning to Count What's Important. CoRR abs/2405.18719 (2024)
- [i32] Weizhe Yuan, Ilia Kulikov, Ping Yu, Kyunghyun Cho, Sainbayar Sukhbaatar, Jason Weston, Jing Xu: Following Length Constraints in Instructions. CoRR abs/2406.17744 (2024)
- [i31] Tianhao Wu, Weizhe Yuan, Olga Golovneva, Jing Xu, Yuandong Tian, Jiantao Jiao, Jason Weston, Sainbayar Sukhbaatar: Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge. CoRR abs/2407.19594 (2024)
- 2023
- [c20] Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam: A Data Source for Reasoning Embodied Agents. AAAI 2023: 8438-8446
- [c19] Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston: The CRINGE Loss: Learning what language not to model. ACL (1) 2023: 8854-8874
- [c18] Jack Lanchantin, Shubham Toshniwal, Jason Weston, Arthur Szlam, Sainbayar Sukhbaatar: Learning to Reason and Memorize with Self-Notes. NeurIPS 2023
- [i30] Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari: Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping. CoRR abs/2301.02099 (2023)
- [i29] Raghav Goyal, Effrosyni Mavroudi, Xitong Yang, Sainbayar Sukhbaatar, Leonid Sigal, Matt Feiszli, Lorenzo Torresani, Du Tran: MINOTAUR: Multi-task Video Grounding From Multimodal Queries. CoRR abs/2302.08063 (2023)
- [i28] Lina Mezghani, Piotr Bojanowski, Karteek Alahari, Sainbayar Sukhbaatar: Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions. CoRR abs/2304.11063 (2023)
- [i27] Jack Lanchantin, Shubham Toshniwal, Jason Weston, Arthur Szlam, Sainbayar Sukhbaatar: Learning to Reason and Memorize with Self-Notes. CoRR abs/2305.00833 (2023)
- [i26] Imanol Schlag, Sainbayar Sukhbaatar, Asli Celikyilmaz, Wen-tau Yih, Jason Weston, Jürgen Schmidhuber, Xian Li: Large Language Model Programs. CoRR abs/2305.05364 (2023)
- [i25] Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster: Improving Open Language Models by Learning from Organic Interactions. CoRR abs/2306.04707 (2023)
- [i24] Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam: A Data Source for Reasoning Embodied Agents. CoRR abs/2309.07974 (2023)
- [i23] Jason Weston, Sainbayar Sukhbaatar: System 2 Attention (is something you might need too). CoRR abs/2311.11829 (2023)
- [i22] Jing Xu, Andrew Lee, Sainbayar Sukhbaatar, Jason Weston: Some things are more CRINGE than others: Preference Optimization with the Pairwise Cringe Loss. CoRR abs/2312.16682 (2023)
- 2022
- [c17] Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Alessandro Lazaric, Karteek Alahari: Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping. CoRL 2022: 1401-1410
- [c16] Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston: Director: Generator-Classifiers For Supervised Language Modeling. AACL/IJCNLP (1) 2022: 512-526
- [c15] Lina Mezghani, Sainbayar Sukhbaatar, Thibaut Lavril, Oleksandr Maksymets, Dhruv Batra, Piotr Bojanowski, Karteek Alahari: Memory-Augmented Reinforcement Learning for Image-Goal Navigation. IROS 2022: 3316-3323
- [c14] Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason Weston: Staircase Attention for Recurrent Processing of Sequences. NeurIPS 2022
- [c13] Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio: Temporal abstractions-augmented temporally contrastive learning: An alternative to the Laplacian in RL. UAI 2022: 641-651
- [i21] Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio: Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL. CoRR abs/2203.11369 (2022)
- [i20] Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston: DIRECTOR: Generator-Classifiers For Supervised Language Modeling. CoRR abs/2206.07694 (2022)
- [i19] Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, Karteek Alahari: Walk the Random Walk: Learning to Discover and Reach Goals Without Supervision. CoRR abs/2206.11733 (2022)
- [i18] Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston: The CRINGE Loss: Learning what language not to model. CoRR abs/2211.05826 (2022)
- 2021
- [c12] Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan: Not All Memories are Created Equal: Learning to Forget by Expiring. ICML 2021: 9902-9912
- [c11] Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston: Hash Layers For Large Sparse Models. NeurIPS 2021: 17555-17566
- [i17] Lina Mezghani, Sainbayar Sukhbaatar, Thibaut Lavril, Oleksandr Maksymets, Dhruv Batra, Piotr Bojanowski, Karteek Alahari: Memory-Augmented Reinforcement Learning for Image-Goal Navigation. CoRR abs/2101.05181 (2021)
- [i16] Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan: Not All Memories are Created Equal: Learning to Forget by Expiring. CoRR abs/2105.06548 (2021)
- [i15] Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason Weston: Staircase Attention for Recurrent Processing of Sequences. CoRR abs/2106.04279 (2021)
- [i14] Stephen Roller, Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston: Hash Layers For Large Sparse Models. CoRR abs/2106.04426 (2021)
- 2020
- [i13] Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar: Accessing Higher-level Representations in Sequential Transformers with Feedback Memory. CoRR abs/2002.09402 (2020)
- [i12] Lina Mezghani, Sainbayar Sukhbaatar, Arthur Szlam, Armand Joulin, Piotr Bojanowski: Learning to Visually Navigate in Photorealistic Environments Without any Supervision. CoRR abs/2004.04954 (2020)
2010 – 2019
- 2019
- [c10] Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin: Adaptive Attention Span in Transformers. ACL (1) 2019: 331-335
- [c9] Edouard Grave, Sainbayar Sukhbaatar, Piotr Bojanowski, Armand Joulin: Training Hybrid Language Models by Marginalizing over Segmentations. ACL (1) 2019: 1477-1482
- [c8] Amanpreet Singh, Tushar Jain, Sainbayar Sukhbaatar: Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks. ICLR (Poster) 2019
- [i11] Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin: Adaptive Attention Span in Transformers. CoRR abs/1905.07799 (2019)
- [i10] Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Hervé Jégou, Armand Joulin: Augmenting Self-attention with Persistent Memory. CoRR abs/1907.01470 (2019)
- 2018
- [b1] Sainbayar Sukhbaatar: Elements of Intelligence: Memory, Communication and Intrinsic Motivation. New York University, USA, 2018
- [c7] Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus: Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play. ICLR (Poster) 2018
- [c6] Amy Zhang, Sainbayar Sukhbaatar, Adam Lerer, Arthur Szlam, Rob Fergus: Composable Planning with Attributes. ICML 2018: 5837-5846
- [i9] Amy Zhang, Adam Lerer, Sainbayar Sukhbaatar, Rob Fergus, Arthur Szlam: Composable Planning with Attributes. CoRR abs/1803.00512 (2018)
- [i8] David Folqué, Sainbayar Sukhbaatar, Arthur Szlam, Joan Bruna: Planning with Arithmetic and Geometric Attributes. CoRR abs/1809.02031 (2018)
- [i7] Sainbayar Sukhbaatar, Emily Denton, Arthur Szlam, Rob Fergus: Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning. CoRR abs/1811.09083 (2018)
- [i6] Amanpreet Singh, Tushar Jain, Sainbayar Sukhbaatar: Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks. CoRR abs/1812.09755 (2018)
- 2017
- [i5] Sainbayar Sukhbaatar, Ilya Kostrikov, Arthur Szlam, Rob Fergus: Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play. CoRR abs/1703.05407 (2017)
- 2016
- [c5] Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus: Learning Multiagent Communication with Backpropagation. NIPS 2016: 2244-2252
- [i4] Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus: Learning Multiagent Communication with Backpropagation. CoRR abs/1605.07736 (2016)
- 2015
- [c4] Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus: End-To-End Memory Networks. NIPS 2015: 2440-2448
- [c3] Sainbayar Sukhbaatar, Rob Fergus: Learning from Noisy Labels with Deep Neural Networks. ICLR (Workshop) 2015
- [i3] Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus: Weakly Supervised Memory Networks. CoRR abs/1503.08895 (2015)
- [i2] Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus: MazeBase: A Sandbox for Learning from Games. CoRR abs/1511.07401 (2015)
- [i1] Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus: Simple Baseline for Visual Question Answering. CoRR abs/1512.02167 (2015)
- 2013
- [c2] Sainbayar Sukhbaatar, Takaki Makino, Kazuyuki Aihara: Auto-pooling: Learning to Improve Invariance of Image Features from Image Sequences. ICLR (Workshop Poster) 2013
- 2011
- [c1] Sainbayar Sukhbaatar, Takaki Makino, Kazuyuki Aihara, Takashi Chikayama: Robust Generation of Dynamical Patterns in Human Motion by a Deep Belief Nets. ACML 2011: 231-246
last updated on 2024-09-04 00:32 CEST by the dblp team
all metadata released as open data under CC0 1.0 license