Prabakaran Chandran’s Post

Engineer & Data Scientist | Building AI/Data-First Products & Solutions | Problems First | at the Crossroads of Systems, Behavior, and Intelligence | Learning over Knowing

6mo Edited

🚀 I recently dived into Anoop Kunchukuttan’s survey presentation (AI4Bhārat) on extending Large Language Models (LLMs) to other languages. It’s nothing short of a playbook for building customized LLMs. 🌐🛠️

🔍 Highlights:
• End-to-end overview: a crisp walkthrough of extending the power of English-based LLMs to Indic and other languages. 🌍
• Technical breakdown: clear introductions to canonical LLMs and design patterns, useful for both beginners and experts. 🧩
• In-depth process guide: from extending vocabularies 📚 to embedding new words and initialisation methodologies 🧬, the survey outlines each step.
• Practical insights: details on pretraining corpora, continual pretraining, instruction fine-tuning, and the importance of preparing datasets by augmenting or distilling from existing models and translations. 💡 It also covers data-mixing strategies such as romanized text, code-switching, and cross-lingual datasets.
• Cross-lingual innovation: discusses cross-lingual prompting methods and instruction alignment. 🤖⚙️
• Empirical evidence: ablation studies back the claims with concrete data on what worked better at each step. 📊

This survey is not just an academic read but a solid reference for anyone looking to adapt LLMs to specific domains or languages. It underscores the critical steps in making AI accessible and functional across linguistic barriers. A massive shoutout to Anoop Kunchukuttan!

#AI #LanguageModels #MachineLearning #InclusiveAI #AI4Bharat #generativeai #deeplearning
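To make the "embedding new words / initialisation" step concrete: one commonly used heuristic is to start a newly added token at the average of the embeddings its old sub-word pieces already had. The sketch below is only an illustration under that assumption. The toy vectors, the split of "namaste", and the helper name `init_new_token` are hypothetical and are not taken from the survey.

```python
from typing import Dict, List


def init_new_token(pieces: List[str],
                   old_embeddings: Dict[str, List[float]]) -> List[float]:
    """Return the element-wise mean of the embeddings of `pieces`."""
    vectors = [old_embeddings[p] for p in pieces]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Toy 2-d embeddings for existing sub-word pieces (made-up values).
old_embeddings = {"na": [1.0, 0.0], "mas": [0.0, 2.0], "te": [2.0, 4.0]}

# Assume the original tokenizer split "namaste" into these three pieces.
new_vector = init_new_token(["na", "mas", "te"], old_embeddings)

# Extended vocabulary: the old entries plus the new whole-word token.
extended = dict(old_embeddings, namaste=new_vector)
```

Starting from the mean keeps the new token close to where the model already "expected" that word to sit, so continual pretraining only has to refine it rather than learn it from random noise.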

4 Comments

Anoop Kunchukuttan

Researcher-Microsoft, Co-founder and Co-lead-AI4Bharat. I work on Machine Translation, Multilingual Learning and Indian Language NLP

6mo

Thanks, happy to see you found it useful!

3 Reactions

Abhishek Pawar

Senior Data Scientist | Blogging @ Medium

6mo

Can you share the link?

Pete Grett

GEN AI Evangelist | #TechSherpa | #LiftOthersUp

6mo

Excited to dive into this! Thanks for sharing your insights. Prabakaran Chandran

2 Reactions


More Relevant Posts

Tech Taniwha

544 followers
1mo
"Has the horse already bolted on data sovereignty and ownership for te reo Māori?"

We had some thought-provoking kōrero last night at our Kōrero AI Technical Panel Discussion.

Ngā mihi aroha ki a koutou ki ngā korokoro taniwha:
🤖 Jahminique Grace - Māori Data Analyst and AI Specialist
🤖 Punahamoa Walker - Accomplished AI Researcher
🤖 Michael Puhara - Self-taught Developer and AI Enthusiast

Some of the kōrero shared by our speakers and the audience:
💚 We need to be at the forefront of creating our own models, involving iwi, hapū and the communities the model is built for in the process.
💚 Some barriers to creating an LLM could include data quality, open vs closed source, lack of access to technology or people in rural communities, and larger corporates not having the kaupapa at the centre of what they do.
💚 Might we be able to use AI as a tool for language preservation? Only if we work with the impacted communities to understand their needs in terms of preservation, rather than tell them that we are here to preserve their language.
💚 Is anyone thinking about the intersection of AI and linguistics? There are so many dialects (within every language, not just te reo Māori) that need to be accounted for when building these models. Effortless code-switching between languages was also talked about. An "AI linguistic notation" was mentioned. Engari, I'll leave that kōrero for those who studied linguistics, nē.
💚 The best way to get into AI? Just start! Subscribe to YouTube channels, watch TikTok educational videos, curate your LinkedIn feed and just try it out.

Although the horse may (or may not) have bolted, there is still time to rein it in, so to speak, with an indigenous saddle me kī. 🐴

Once again, a special thank you to everyone who took time out of their evening to dial in, ask pātai and share whakaaro. Mīharo ake nei! See you at the next one. 💻 🐉 💚

#TechTaniwha #AI #ArtificialIntelligence #Aotearoa
2 Comments
Walled AI

164 followers
5mo Edited
Read our top ACL publication on AI safety! We are a group of researchers who have a record of publishing in top AI places. If you have a challenging AI problem related to AI governance (evaluation/safety/alignment), we can solve it for you! Email: admin@walled.ai Website: walled.ai

Rishabh Bhardwaj

Researcher and founder
5mo Edited

Got my recent work accepted at the ACL 2024 main track. When I started my PhD about four years ago, I felt a lot happier about top AI publications, regardless of their impact beyond the research community. Now it feels like I need to make them valuable to the general public. So, I will aim to work in directions that have a wider impact, more specifically in the AI safety domain. RESTA is one such work, receiving scores of 4, 4, and 4 (out of 5) from the reviewing committee, placing it in the top 3% of papers submitted to the ARR.

Paper: Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Impact: Prevent safety compromise of LLMs due to enterprise-specific fine-tuning.
Link: https://lnkd.in/gtbRY_5k
Code: https://lnkd.in/geAtnctr
Hugging Face (safety evaluation multilingual dataset): https://lnkd.in/gEzMV3Pp
More about the approach: https://lnkd.in/gb_y62hX

Grateful to my collaborators Duc Anh Do and Soujanya Poria. AI safety needs a lot of exploration, including discussions on its definition, evaluation, and alignment.

#AISafety #NLProc #ACL2024

Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic

arxiv.org
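As a rough illustration of the task-arithmetic idea named in the paper title: a "safety vector" can be taken as the weight difference between a safety-aligned and an unaligned copy of the same base model, and a scaled copy of it is then added to the fine-tuned weights. In the sketch below, plain dicts of floats stand in for real tensors. The function names, the toy numbers, and the `scale` value are assumptions made for illustration, and the paper's full method may include steps not shown here.

```python
from typing import Dict

Weights = Dict[str, float]  # parameter name -> toy scalar "tensor"


def safety_vector(aligned: Weights, unaligned: Weights) -> Weights:
    """Direction in weight space that the safety alignment added."""
    return {name: aligned[name] - unaligned[name] for name in aligned}


def realign(fine_tuned: Weights, vector: Weights, scale: float = 1.0) -> Weights:
    """Add a scaled safety vector back onto fine-tuned weights."""
    return {name: fine_tuned[name] + scale * vector[name] for name in fine_tuned}


# Toy values, chosen to be exact in floating point.
aligned = {"w1": 0.5, "w2": -1.0}
unaligned = {"w1": 0.25, "w2": -1.5}
fine_tuned = {"w1": 1.0, "w2": 2.0}

vector = safety_vector(aligned, unaligned)
restored = realign(fine_tuned, vector, scale=2.0)
```

The appeal of this style of fix is that it is plain weight addition: no extra training pass over safety data is needed once the vector has been computed.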
SentientMatters

1,947 followers
7mo
ASMv2: Revolutionizing Relation Comprehension in Visual AI The All-Seeing Project V2 introduces a new model, ASMv2, designed for advanced image understanding, particularly in identifying and comprehending object relations. It combines text generation, object localization, and relational understanding into a single task, aimed at enhancing Multi-modal Large Language Models (MLLMs) capabilities. The project includes a specialized ReC dataset for training and a novel CRPE benchmark for evaluation, showcasing significant improvements in relation understanding, evidenced by its outperformance of existing models. This work aims to push the boundaries of AI towards greater interpretative and relational insight in visual data. #ASMv2 #AllSeeingProject #VisualAI #RelationComprehension #ObjectRelations #MultiModalLargeLanguageModels #MLLMs #ReCDataSet #CRPEBenchmark #AIInnovation #VisualData #Research #Interpretation

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

arxiv.org
Arslan Chaudhary

LinkedIn manager & ghostwriter For AI, digital transformation, & cybersecurity leaders.
2w
Here’s a little-known trick to boost your AI game: the language you use with Large Language Models (LLMs) directly impacts the quality of responses. And guess what? It’s not the model’s fault; it's all about the data.

Here’s the inside scoop: LLMs are trained on a massive amount of English data, giving them a deeper understanding of context, tone, and accuracy. However, the quality can dip when you switch to languages with less training data (like Portuguese).

Even if you're working in a different language, try this method to get better responses:
✅ Translate your prompt into English. Start with English; it’s the goldmine for rich, nuanced answers.
✅ Generate the response. Let the LLM do its magic in English, where it’s most effective.
✅ Translate back to your target language. Take that high-quality response and shift it back to your desired language. Voilà, a more detailed and accurate output.

Why it works: more data means more precision. By using English, you're leveraging the LLM’s strongest dataset. This can lead to better answers, even on specialized or complex topics.

When to use this:
↳ Specialized industries
↳ Tackling complex queries
↳ Writing for multilingual audiences

The next time you work with an LLM in a multilingual context, consider translating your prompt to English first. It’s a simple switch, but the results can be game-changing.

#AI #generativeAI #LLMs #multilingual #promptengineering #translationHacks #automation
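The three steps can be wired together in a few lines. This is a minimal sketch only: `translate` and `generate` are hypothetical callables standing in for whatever translation service and LLM client you actually use, and the stub implementations exist just so the example runs end to end.

```python
from typing import Callable

Translate = Callable[[str, str, str], str]  # (text, source_lang, target_lang)
Generate = Callable[[str], str]             # (prompt) -> response


def answer_via_english(prompt: str, user_lang: str,
                       translate: Translate, generate: Generate) -> str:
    """Translate to English, generate there, translate the answer back."""
    english_prompt = translate(prompt, user_lang, "en")
    english_answer = generate(english_prompt)
    return translate(english_answer, "en", user_lang)


# Stubs that only tag the text, so the data flow is visible.
def stub_translate(text: str, src: str, dst: str) -> str:
    return f"[{src}->{dst}] {text}"


def stub_generate(prompt: str) -> str:
    return f"answer({prompt})"


result = answer_via_english("Olá", "pt", stub_translate, stub_generate)
```

Keep in mind that each translation hop can lose nuance (idioms, names, domain terms), so it is worth comparing against a direct prompt in the target language before adopting the round trip.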
3 Comments
Sarfraz Nawaz

Agentic Process Automation | AI Agents | CxO Advisory | Angel Investor
1mo
Hugging Face's Multilingual Speech-to-Speech pipeline is making a splash! 🌊

This tool lets you switch languages mid-conversation, thanks to its 100ms delay. Imagine having GPT-4o-like experiences on your device, no matter where you are or who you're talking to. 🌎

Building on the success of their GitHub repository (2600 stars and counting!), they have expanded the library to support multiple languages.

Want to try it out?
Specify a language: add the flag --language fr for French, or any other supported language.
Let the system decide: skip the flag, and let the pipeline automatically detect the language.

The future of multilingual communication is here! 🌍

#AI #MachineLearning #SpeechRecognition #Multilingual #HuggingFace #GPT4o #LanguageBarrier
To Data & Beyond

8,515 followers
5mo
Quantization is an indispensable technique for serving Large Language Models (LLMs) and has recently found its way into LoRA fine-tuning. However, in such cases it is common to observe a consistent gap in downstream-task performance between full fine-tuning and the quantization-plus-LoRA approach.

"LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models", a new paper published last week, focuses on the scenario where quantization and LoRA fine-tuning are applied together to a pre-trained model. LoftQ (LoRA-Fine-Tuning-aware Quantization) is a novel quantization framework that simultaneously quantizes an LLM and finds a proper low-rank initialization for LoRA fine-tuning. Such an initialization alleviates the discrepancy between the quantized and full-precision model and significantly improves generalization on downstream tasks.

The method is evaluated on natural language understanding, question answering, summarization, and natural language generation tasks, and it outperforms existing quantization methods, especially in the challenging 2-bit and 2/4-bit mixed-precision regimes.

➡ Paper Link: https://lnkd.in/dSPcbxrb
⭐ Subscribe to To Data & Beyond to receive similar content: https://lnkd.in/dVkD-7WT
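"Simultaneously quantizes and finds a low-rank initialization" can be read as a joint approximation problem. The formulation below is a sketch of that reading, not a quotation from the paper. The symbols are notation introduced here: W for a pre-trained weight matrix, Q for its N-bit quantized counterpart, A and B for the rank-r LoRA factors, and q_N for the quantizer.

```latex
% Joint objective: quantized part plus low-rank part should reproduce W.
\min_{Q,\,A,\,B} \; \bigl\lVert W - Q - A B^{\top} \bigr\rVert_F

% One way to approach it: alternate the two steps, starting from A_0 B_0^{\top} = 0.
Q_t = q_N\!\bigl(W - A_{t-1} B_{t-1}^{\top}\bigr),
\qquad
(A_t, B_t) = \text{rank-}r \text{ SVD factors of } \bigl(W - Q_t\bigr)
```

Under this reading, ordinary quantization followed by a zero-initialized LoRA adapter is the special case where no alternating step is taken, which is why the gap to the full-precision weights is larger at the start of fine-tuning.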
Vasyl Rakivnenko

CEO @ IngestAI Labs | Serial Entrepreneur | Stanford GSB Alumni | Angel Investor | Research Partner @ Stanford
6mo
Diving into the world of Large Language Model (LLM) Functions! 💻🌐 👇

As #LLMs become more powerful and pick up more use cases, I thought it was important to have a structured approach for leveraging their capabilities. This is where #LLM Functions come in. An LLM Function is essentially a task-specific wrapper around a language model, and it includes:
1. An LLM instance (the actual #AI model) 🤖
2. A prompt template (instructions for the specific task) 📝

By encapsulating these components into one reusable LLM Function, you get a clear interface for performing different LLM-specific tasks, such as:
- Text Summarization 📄➡️📃
- Question Answering ❓➡️💬
- Language Translation 🌐➡️🌍

At a high level, the power of LLM Functions lies in their modularity and reusability. For a deeper look at how they actually work, see my Medium post: https://lnkd.in/gVvNxg8B
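The two-part structure described above (a model plus a prompt template) maps naturally onto a small callable class. This is a minimal sketch, not the implementation from the linked Medium post: the class name `LLMFunction`, its fields, and the `echo_llm` stand-in model are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LLMFunction:
    """Task-specific wrapper: an LLM callable plus a prompt template."""
    llm: Callable[[str], str]
    prompt_template: str

    def __call__(self, **fields: str) -> str:
        # Fill the template with the task inputs, then query the model.
        prompt = self.prompt_template.format(**fields)
        return self.llm(prompt)


def echo_llm(prompt: str) -> str:
    """Stand-in model that just upper-cases the prompt it receives."""
    return prompt.upper()


# The same model instance is reused; only the template changes per task.
summarize = LLMFunction(echo_llm, "Summarize: {text}")
translate = LLMFunction(echo_llm, "Translate to {language}: {text}")

summary = summarize(text="hello")
translation = translate(language="French", text="hi")
```

Swapping `echo_llm` for a real client call is the only change needed to go from this toy to a working function, which is the modularity argument in a nutshell.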
11 Comments
Mohammad Arshad

CEO DecodingDataScience.com | 🌎 AI Community Builder | Data Scientist | Strategy & Solutions | Generative AI | 20 Years+ Exp | Ex- MAF, Accenture, HP, Dell | LEAP & GITEX Keynote Speaker & Mentor | LLM, AWS, Azure & GCP
3mo
Embarking on the transformative journey of language generation with cutting-edge LLMs (Large Language Models)

Exciting news! Presenting our latest exploration, "Mastering the Art of Language Generation: A Guide to Large Language Models Parameters." This guide is a treasure trove of insights that have changed how I approach digital communication and creativity. Delving into it, I've seen how four parameters shape the output of Large Language Models:

🔑 Temperature: Rescales the model's token probabilities before sampling. Higher values flatten the distribution and produce more surprising word choices; lower values make the output more predictable.
🔑 Top_P: Restricts sampling to the smallest set of tokens whose cumulative probability reaches the Top_P value. A higher value admits more candidates and so more varied vocabulary.
🔑 Frequency Penalty: Addresses repetition. Raising it penalizes a token in proportion to how often it has already appeared, which reduces redundant echoing.
🔑 Presence Penalty: Applies a flat penalty to any token that has already appeared at least once, nudging the model toward new words and topics. Lower values keep the text closer to the topics already introduced.

Exploring these parameters, especially through examples starting with 'A', shows the capabilities and versatility of Large Language Models. I'm thrilled to invite you to delve into this guide and start your own language generation exploration. Let's leverage AI to elevate our creativity and communication! 🌍

Download the complete guide for free by joining our AI community. For a deep dive into language generation mechanics, check the link in the comments below. Together, let's explore, create, and inspire!

Follow Decoding Data Science for insightful posts.

#ai #tech #llm #generativeai
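To see how the four parameters interact, here is a small self-contained sketch over a toy vocabulary. It follows one common formulation: the penalties are subtracted from the raw scores, temperature rescales them, and Top_P trims the resulting distribution. The words, scores, and function names are made up for illustration, and real APIs may differ in order of operations and in details.

```python
import math
from collections import Counter
from typing import Dict, List


def penalize(logits: Dict[str, float], generated: List[str],
             frequency_penalty: float, presence_penalty: float) -> Dict[str, float]:
    """Lower the score of tokens that were already generated."""
    counts = Counter(generated)
    return {
        tok: logit
        - counts[tok] * frequency_penalty                 # grows with each repeat
        - (presence_penalty if counts[tok] > 0 else 0.0)  # flat, applied once
        for tok, logit in logits.items()
    }


def softmax_with_temperature(logits: Dict[str, float],
                             temperature: float) -> Dict[str, float]:
    """Turn scores into probabilities; higher temperature flattens them."""
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Toy next-token scores; "apple" has already been generated twice.
logits = {"apple": 2.0, "avocado": 1.0, "anchor": 0.0}
penalized = penalize(logits, ["apple", "apple"],
                     frequency_penalty=0.5, presence_penalty=0.5)
probs = softmax_with_temperature(penalized, temperature=1.0)
nucleus = top_p_filter(probs, top_p=0.5)
```

In this toy run the penalties push "apple" from the top score down to 0.5, so "avocado" becomes the most likely token and, with `top_p=0.5`, the only one left in the nucleus.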

14 Comments
Chen Qi

Columbia University | Data Science Student | Python, SQL | Seeking Full-Time & Internship Opportunities | 2024 Graduation
4mo
🌟 Holistic Evaluation of Language Models: A Must-Read! 🌟

I've recently come across a fantastic paper titled "Holistic Evaluation of Language Models" by Stanford researchers, and I can't help but share it with everyone. This paper is incredibly comprehensive, providing deep insights into evaluating the outputs of large language models (LLMs) across various scenarios.

🔍 Why You Should Read It:
• Comprehensive Coverage: The paper covers a wide range of evaluation metrics and methodologies, making it a valuable resource for understanding how to assess LLMs effectively.
• Inspiring Business Value: The insights from this paper can inspire new ways to generate business value by leveraging LLMs in innovative applications.
• Great Starting Point: Whether you're new to the field or an experienced professional, this paper serves as an excellent starting point for learning about the evaluation of LLM outputs.

If you're interested in the future of AI and language models, this paper is a must-read! 📚✨

Read the paper here: Holistic Evaluation of Language Models (https://lnkd.in/e8G_wWQU)

#AI #MachineLearning #LanguageModels #Evaluation #Research #Innovation

Holistic Evaluation of Language Models

arxiv.org

5 Comments
