How can we do better prompt engineering? By automating it! New research introduces OPRO, a method that uses large language models (LLMs) as optimizers. By framing optimization tasks as natural language prompts, OPRO lets LLMs iteratively refine solutions. The real game-changer? Automated prompt engineering. OPRO-optimized prompts significantly outperform human-designed ones on reasoning benchmarks, with accuracy gains of up to 50%. This showcases LLMs' potential as versatile optimizers that adapt to different tasks through natural language alone. While challenges remain, OPRO opens new possibilities for AI-driven optimization across industries. Check out the paper below or try Anthropic's prompt generator https://lnkd.in/eM-qx_Ct #anthropic
Atria AI’s Post
More Relevant Posts
-
Optimization by Prompting (OPRO) is an approach that utilizes large language models (LLMs) as optimizers by describing optimization problems in natural language and instructing the LLM to iteratively generate solutions. This method adapts to different tasks by modifying the problem description in the prompt and allows for customization based on desired solution properties. Case studies on linear regression and the traveling salesman problem demonstrate that LLMs can achieve competitive results through prompting alone. OPRO also explores prompt optimization, focusing on maximizing task accuracy by refining the prompt format, which LLMs are sensitive to. By using a meta-prompt that includes past prompts and their scores, OPRO iteratively generates new prompts to improve task accuracy. Experiments across various LLMs show consistent improvements in task performance through optimization, with OPRO-optimized prompts outperforming human-designed ones in several benchmarks. Key challenges addressed include the trade-off between exploration and exploitation, optimization stability, and managing prompt space constraints. OPRO is further evaluated on different models and tasks, demonstrating its effectiveness in both mathematical and prompt optimization contexts. https://lnkd.in/gKBF9grE
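To make the meta-prompt loop above concrete, here is a minimal Python sketch of how such an optimize-by-prompting loop could look. The function names (`call_llm`, `evaluate_on_task`), the meta-prompt wording, and the hyperparameters are illustrative assumptions, not the paper's actual code or template:

```python
# Minimal sketch of an OPRO-style prompt-optimization loop.
# `call_llm` and `evaluate_on_task` are placeholders for your own LLM API
# and task-accuracy evaluator; the meta-prompt text is illustrative only.

def call_llm(prompt: str) -> str:
    """Call whatever LLM you use as the optimizer (placeholder)."""
    raise NotImplementedError

def evaluate_on_task(instruction: str, dev_set) -> float:
    """Score a candidate instruction by task accuracy on a dev set (placeholder)."""
    raise NotImplementedError

def opro(dev_set, seed_instructions, num_steps=20, top_k=10):
    # Keep a trajectory of (instruction, score) pairs.
    trajectory = [(ins, evaluate_on_task(ins, dev_set)) for ins in seed_instructions]
    for _ in range(num_steps):
        trajectory.sort(key=lambda p: p[1])  # worst to best
        history = "\n".join(f"text: {ins}\nscore: {s:.1f}" for ins, s in trajectory[-top_k:])
        meta_prompt = (
            "Here are previous instructions with their training accuracies, "
            "ordered from worst to best:\n"
            f"{history}\n"
            "Write a new instruction that is different from the ones above "
            "and achieves a higher accuracy."
        )
        candidate = call_llm(meta_prompt).strip()
        trajectory.append((candidate, evaluate_on_task(candidate, dev_set)))
    return max(trajectory, key=lambda p: p[1])
```

Each round the optimizer LLM sees the best-scoring instructions so far and proposes a new one, so the exploration/exploitation trade-off mentioned above is steered entirely through the natural-language meta-prompt.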
Large Language Models as Optimizers
arxiv.org
-
Wave Network: An Ultra-Small Language Model https://lnkd.in/gtp8gmsq We propose an innovative token representation and update method in a new ultra-small language model: the Wave network. Specifically, we use a complex vector to represent each token, encoding both global and local semantics of the input text. A complex vector consists of two components: a magnitude vector representing the global semantics of the input text, and a phase vector capturing the relationships between individual tokens and global semantics. Experiments on the AG News text classification task demonstrate that, when generating complex vectors from randomly initialized token embeddings, our single-layer Wave Network achieves 90.91% accuracy with wave interference and 91.66% with wave modulation - outperforming a single Transformer layer using BERT pre-trained embeddings by 19.23% and 19.98%, respectively, and approaching the accuracy of the pre-trained and fine-tuned BERT base model (94.64%). Additionally, compared to BERT base, the Wave Network reduces video memory usage and training time by 77.34% and 85.62% during wave modulation. In summary, we used a 2.4-million-parameter small language model to achieve accuracy comparable to a 100-million-parameter BERT model in text classification.
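For intuition only, here is a tiny sketch of the complex-vector idea described in the abstract: a magnitude component for global semantics, a phase component relating each token to that global signal, and interference/modulation as complex addition/multiplication. The exact way the Wave Network computes magnitudes and phases differs in detail; everything below is a toy assumption, not the paper's implementation:

```python
# Toy illustration of complex-vector token representations.
import numpy as np

def to_complex(token_embs: np.ndarray) -> np.ndarray:
    """Build complex vectors: a shared magnitude (global semantics) and
    per-token phases (token-to-global relationship). Toy formulas only."""
    # Global semantics: one magnitude vector shared by the whole sequence.
    magnitude = np.linalg.norm(token_embs, axis=0, keepdims=True)  # shape (1, d)
    # Local semantics: each token's phase relative to the global magnitude.
    phase = np.arctan2(token_embs, magnitude + 1e-9)               # shape (n, d)
    return magnitude * np.exp(1j * phase)                          # (n, d) complex

def interfere(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Wave interference: complex addition of two representations."""
    return a + b

def modulate(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Wave modulation: element-wise complex multiplication."""
    return a * b

# Toy usage: 5 tokens with 8-dimensional embeddings.
embs = np.random.randn(5, 8)
z = to_complex(embs)
mixed = interfere(z, modulate(z, z))
```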
Wave Network: An Ultra-Small Language Model
arxiv.org
-
Why Write Tests in Natural Language?

In our experience, natural language is the only format that truly stands the test of time. Here's why:

Manual test cases in natural language rarely change. They describe what needs to happen at a high level, independent of the underlying implementation. Automated tests, on the other hand, are constantly changing: every tweak in the implementation means rewriting or updating code-based tests, which can be a time sink.

By keeping all test cases in natural language, we ensure they remain stable and usable over the long term. Plus, as our technology evolves, the capabilities of your tests improve without needing constant rewrites. It's a smarter, more resilient approach to testing, one that adapts with your system rather than working against it.

How do you write and maintain your test cases? I'd love to hear your thoughts!
-
How confident are you that the prompt you are using is the best prompt for your task? This paper from DeepMind searches for a better prompt for you and explores the fascinating world of language models as optimisers: the optimisation task is described entirely in natural language. The researchers show that LLM-optimised prompts can outperform human-written ones on reasoning benchmarks, and the study showcases how LLMs can iteratively refine prompts into better-performing ones. https://lnkd.in/d7nabbYk Check out the original paper for more insights into the exciting possibilities of LLMs as optimisers.
2309.03409.pdf
arxiv.org
-
https://lnkd.in/exiBW9Xv Enlightening work that shows how language models trained on next-word prediction can build internal representations that contain a semblance of semantic meaning. Quoting Alistair Isaac - "Words bear natural meaning about other words, and, though it is not equivalent to the conventional meaning they bear about the world, this natural meaning nevertheless determines some of their paradigmatically semantic features." Isaac discusses a theory of semantic vectors, rooted in Shannon's information theory, that attributes semantic meaning to objects based on their correlations across a distribution of objects occurring in a natural environment. Highly recommend this reading to gain a better perspective: https://lnkd.in/eN3s6wnT
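As a concrete (if very simplified) illustration of "meaning from correlations", here is a standard pointwise-mutual-information co-occurrence sketch in Python. This is not Isaac's formalism or the paper's probing setup, just the textbook distributional-semantics version of the idea:

```python
# Words acquire crude "semantic vectors" from their co-occurrence statistics:
# pointwise mutual information (PMI) over a small toy corpus.
import numpy as np
from itertools import combinations

corpus = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "the mouse ate the cheese".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

for sent in corpus:                      # count within-sentence co-occurrences
    for a, b in combinations(sent, 2):
        counts[idx[a], idx[b]] += 1
        counts[idx[b], idx[a]] += 1

p_xy = counts / counts.sum()
p_x = p_xy.sum(axis=1, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(p_xy / (p_x * p_x.T))
pmi[np.isneginf(pmi) | np.isnan(pmi)] = 0.0   # zero out unseen pairs

# Each row of `pmi` is a crude semantic vector: words with similar
# co-occurrence statistics (e.g. "cat" and "dog") get similar rows.
```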
Emergent Representations of Program Semantics in Language Models Trained on Programs
arxiv.org
-
💡 Thinking Tokens For Language Models!

How much is 56 times 37? Can you answer that right away?

In a short paper, David Herel and Tomas Mikolov propose a simple method to improve the reasoning of language models when performing complex calculations.

📌 They note that, although language models are not that good with difficult calculations, humans also cannot perform these calculations immediately and require a considerable amount of time to come up with an answer. Inspired by this, they introduce 💡Thinking Tokens💡

So what are those "thinking tokens"?! Nothing fancy: they are just special tokens '<T>' that you insert after each word in a sentence whenever a complex problem is encountered. That's it!

👉 The main idea is to "buy" the model some time to think about the problem with these additional computations before answering. Using this method they observed a slight improvement in perplexity.

👉 Before getting excited, note that they added these tokens manually and used an RNN language model. From the paper: "As a proof of concept, we have added N ’thinking tokens’ (< T >) after each observed word in a dataset. Our vision is that this basic concept can be extended to a self-adjusting model, which will be able to decide itself if and how many ’thinking tokens’ will be used for a specific problem, where N could also vary throughout the sentence. This would allow us to reduce the computational time, which would not increase N times."
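A minimal sketch of the manual insertion described above (whitespace tokenization and the value of N are just for illustration):

```python
# Insert N copies of a special '<T>' token after each word, giving the model
# extra compute steps before it has to produce the next real token.
def add_thinking_tokens(sentence: str, n: int = 3) -> list[str]:
    tokens = []
    for word in sentence.split():
        tokens.append(word)
        tokens.extend(["<T>"] * n)   # "buy time" after every observed word
    return tokens

print(add_thinking_tokens("What is 56 times 37 ?", n=2))
# ['What', '<T>', '<T>', 'is', '<T>', '<T>', ..., '?', '<T>', '<T>']
```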
-
Reflections on Evaluating Retrieval-Augmented Language Models Reading about the evaluation of Retrieval-Augmented Language Models (RAG) through automated exams was enlightening. The use of Item Response Theory (IRT) to quantify model performance is innovative, yet I worry about its reliance on initially generated questions and potential biases. Integrating online data and user preferences could offer stronger, real-world insights. Including metrics like click-through rates and user satisfaction scores would align evaluations more closely with human preferences. This hybrid approach could bridge the gap between automated assessments and practical effectiveness, providing a more holistic understanding of RAG systems' real-world performance. https://lnkd.in/e8mkMVv5
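For readers unfamiliar with IRT, here is a small sketch of the standard two-parameter logistic (2PL) item response function, which maps a model's latent "ability" and an exam item's difficulty and discrimination to a probability of answering correctly. The exact IRT variant and fitting procedure used in the paper may differ; this is only meant to show the shape of the idea:

```python
# Two-parameter logistic (2PL) item response function.
import math

def p_correct(ability: float, difficulty: float, discrimination: float = 1.0) -> float:
    """P(correct answer | ability, item difficulty, item discrimination)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

# A strong RAG system (high ability) on an easy vs. a hard exam question:
print(p_correct(ability=1.5, difficulty=-1.0))  # ~0.92
print(p_correct(ability=1.5, difficulty=2.5))   # ~0.27
```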
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
arxiv.org
-
This survey covers Retrieval-Augmented Generation (RAG) techniques that combine large language models (LLMs) with external retrieval to alleviate limitations of LLMs such as hallucination and out-of-date internal knowledge. A very good paper .... https://lnkd.in/eiZCTB2c
A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models
arxiv.org
-
Traditional autoregressive (AR) language models generate text one token at a time, which is slow and limits the model's potential. This paper introduces Self-Distillation Through Time (SDTT), a method for improving text generation speed and quality by using discrete diffusion instead of autoregressive decoding. SDTT enables the model to generate 32 tokens simultaneously, improving speed by up to 8× compared to AR models while achieving better text quality as measured by perplexity. The technique builds on a diffusion approach, where the model learns to gradually refine the generated text. I have thought for a while that the diffusion approach could benefit LLMs, so I am very glad someone finally did it. Source: https://lnkd.in/d_KfY_qf
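Here is a hedged pseudocode sketch of the self-distillation-through-time idea as I read it: a student copy of the diffusion model learns to reproduce, in a single denoising step, what the teacher produces in several. The objects and names (`denoise_step`, the divergence, the data loader) are placeholders, not the paper's actual API, and many details (noise schedule, logits vs. tokens, timestep bookkeeping) are omitted:

```python
# One round of self-distillation for a discrete diffusion language model.
# `teacher`, `student`, `denoise_step`, `divergence`, and `data_loader`
# are all placeholders; only the overall training pattern is shown.
def distill_round(teacher, student, data_loader, optimizer, divergence, teacher_steps=2):
    """Student learns to match, in one step, the result of `teacher_steps` teacher steps."""
    for noisy_tokens, t in data_loader:
        # Teacher: several small denoising steps.
        target = noisy_tokens
        for k in range(teacher_steps):
            target = teacher.denoise_step(target, t - k)
        # Student: one large step from the same starting point.
        prediction = student.denoise_step(noisy_tokens, t)
        loss = divergence(prediction, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return student  # repeat rounds to keep shrinking the number of sampling steps
```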
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
arxiv.org
-
Happy to share our latest work "MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation" published in the Table Representation Workshop, NeurIPS 2024. Text-to-SQL translation is an important problem for enabling agentic workflows that automate database tasks. Recent advances in text-to-SQL rely on closed proprietary models like GPT-4, which present challenges in accessibility, privacy, and latency. We address these issues by developing small, efficient open models (under 10 billion parameters) for text-to-SQL translation and obtain state-of-the-art results compared to other open models while remaining competitive with larger proprietary models at a much lower cost. Joint work with my colleagues at Layer 6 AI: Ilan Gofman, Zhaoyan Liu, Paul Wu, Noël Vouitsis, Guangwei Yu, Jesse Cresswell, Rasa Hosseinzadeh. Paper link: https://lnkd.in/ghi5mKwB
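For readers curious how a sample-then-critique pipeline fits together, here is a generic Python sketch. The function names and flow are placeholders based on the post's description (sampling several SQL candidates from a small model, then letting a critic choose), not the exact MSc-SQL implementation:

```python
# Generic sample-then-critique text-to-SQL pipeline; all functions are stubs.

def generate_sql(model, question: str, schema: str, n_samples: int = 4) -> list[str]:
    """Sample n candidate SQL queries from a small text-to-SQL model (placeholder)."""
    raise NotImplementedError

def run_query(db, sql: str):
    """Execute a candidate against the database, returning rows or an error (placeholder)."""
    raise NotImplementedError

def critique(critic_model, question: str, candidates: list[tuple[str, object]]) -> int:
    """Ask a critic model which (sql, execution_result) pair best answers the question."""
    raise NotImplementedError

def text_to_sql(generator, critic, db, question: str, schema: str) -> str:
    candidates = generate_sql(generator, question, schema)
    executed = [(sql, run_query(db, sql)) for sql in candidates]
    best = critique(critic, question, executed)
    return candidates[best]
```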
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
arxiv.org