Lingua Custodia


Technology, Information and Internet

Paris, Île-de-France · 3,616 followers

Natural Language Processing (NLP) for Finance

About

Lingua Custodia is a Fintech company and a leader in Natural Language Processing (NLP) and Language Technologies for Finance. Created in 2011 by finance professionals, it initially offered specialised machine translation and is now a specialist in Generative AI, with many publications in the field. Leveraging its state-of-the-art NLP expertise, the company offers a growing range of applications: Speech-to-Text automation, document classification, linguistic data extraction from unstructured documents, mass web crawling and data collection, and more. It achieves superior quality thanks to highly domain-focused machine learning algorithms. Its cutting-edge technology has been regularly rewarded and recognised by both the industry and its clients: investment houses, custody and investment banks, private banks, financial divisions within major corporations, and service providers for financial institutions.

Website
http://www.linguacustodia.finance
Industry
Technology, Information and Internet
Company size
11-50 employees
Headquarters
Paris, Île-de-France
Type
Civil company/commercial company/other company types
Founded
2011
Specialties
Machine Translation, Translation Memories Management, Financial Translations, Artificial Intelligence, Neural Networks, Machine Learning, Natural Language Processing, Finance, Technology, Fintech, Deep Learning, Research, NLP, SaaS, Large Language Models and Generative AI

Locations

Employees at Lingua Custodia

News

    We were thrilled to be at STATION F last Friday for the XYZ Paris event to discuss #generativeai, #innovation and #transformation for #finance. 🕺 ℹ The common themes? Generative AI can streamline processes and optimise productivity. 🙌 Upskilling and reskilling will help us to leverage its full potential, while our soft skills and creativity are irreplaceable. 😍 Generative AI should 'augment' and not 'replace' humans! 💪 Our 'magic makers' Raheel Qader, Olivier Debeugny, Gaetan Boulard, Gaëtan Caillaut, Jingshu Liu, Massinissa Ahmim and Jean-Gabriel BARTHELEMY enjoyed meeting and discussing their favourite topics (LLMs and RAG) with the other attendees! 😊 Thank you XYZ Paris for a great event! #genai #ai #finance #rag #augment #softskills #creativity #llm #productivity


    🎉 We have just had a third research paper accepted for 2024! "Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task". We are so very proud of our Lab team: Gaëtan Caillaut, Raheel Qader, Mariam N., Jingshu Liu and Jean-Gabriel BARTHELEMY 😍 So why are we so excited? 🕺 🔬 This work is our first, and successful, attempt at training large-scale models for multilingual and multi-domain machine translation. ℹ MLMD models are able to handle multiple languages, which is a huge benefit given that the majority of LLMs are English-focused. ℹ MLMD models can also handle and detect different domains. Lingua Custodia is specialised in financial terminology, but MLMD means that more generic text translations are also possible using the same LLM. 👩🔬 For example, a client can translate both a KID and a marketing brochure using an MLMD LLM! We also found that a decoder-only architecture helps to make the model more efficient for certain tasks! A win-win! 👉 paper https://lnkd.in/eNyhsFxc #llm #mlmd #genai #finance #fintech #innovation

    arXiv: 2409.15051 (arxiv.org)


    Our CAO, Charlotte S Bain, was one of the 45,000 volunteers for the Paris Olympics. She was part of a team of volunteers supporting the Paralympic #tabletennis athletes during the games. ❓ Can you share some interesting facts on para table tennis? Happily. It is fast and furious! It's the third largest sport in terms of athlete numbers - 280 players and 31 medals! 😍 ❓ Can you sum up your volunteering experience in 3 words? Fabulous, fun and thought-provoking ❓ Will this experience have an impact on your professional and/or social life? Definitely. I loved hearing all the different languages around me, so I'd like to learn another language. The power of a smile. When I could not easily communicate, or when I wanted to celebrate or commiserate with the athletes, a smile always helped, so I definitely need to smile more (and stress less!). 😊 The other point was the realisation that we are always fully capable of finding a solution, whether it is working together as a team of volunteers to address an issue, or the athletes themselves adapting to a new temporary environment/challenge. We always manage to resolve the problem. 💪 So in a nutshell - communicate, smile and be confident that a solution exists. A perfect way to start my September! 🎉 #paris #france #olympics #paralympics #volunteer


    Our two brilliant research interns Zhihan Hu and Hayder Bouaziz, who have been with us for six months, leave us today 😢 We will miss them! As always, we asked for their feedback! 🤔 🤞  1. (Very, very important question) How often did the team make you laugh or smile? ⁉ 😂 Honestly, all the time! Whether it was a quick joke in the middle of a meeting or a funny story shared over lunch, the team always knew how to keep things light while staying focused. The team at Lingua Custodia really knows how to create a pleasant atmosphere, which makes work very enjoyable! We often laugh together, especially during lunch breaks. 😍 2. What were your highlights during these 6 months ❔ One of the highlights for me was being able to share the cool libraries and datasets I worked with. There’s something really rewarding about showing colleagues new tools and watching their eyes light up when they realise how they can apply them to their own work. It’s those little moments of knowledge-sharing that made my time here really fulfilling. 💪 One of the highlights of working at Lingua Custodia was the collaborative and innovative environment. I had the opportunity to work on cutting-edge projects that challenged me and helped me grow professionally. It was both inspiring and rewarding. 🕺 3. What should Lingua Custodia focus on next ❔ 👨🔬 I believe the company is in a great position to dive deeper into large language models. Having worked on LLMs myself, I’ve seen firsthand how they can be harnessed, and hopefully it gives the team a nice foundation to explore how these models can be integrated into workflows for even greater impact. 👨🔬 Having worked on a vision model during this internship, I believe the company should continue to enhance its RAG (retrieval augmented generation) tool by incorporating visual modality, and I hope that my work will serve as a first step in this direction. Thank you both so much for all your hard work, and we wish you every success for the future. You've both been fantastic! 🤗 #research #internship #france #paris #llm #visionmodel #chatbot #ai #proudtoworkwithyou #happyatwork #funinternship


    We hope you’ve enjoyed our series of posts during the summer and that you are now feeling comfortable with all the different definitions of LLMs, foundation models and RAG! 😁 A summary of the main points is below (which you can use to test your knowledge!) 😜 ❓ What is a foundation model? 👨🔬 A foundation model is an AI model, trained on huge amounts of data (documents, audio, images, text…). It is trained to ‘generate’ the next word as it ‘learns’ the language. It can then be specialised and fine-tuned for a wide variety of applications and tasks, which then means it is no longer a foundation model! ❓ What is an LLM? 👩🔬 An LLM is an umbrella term used for all foundation and specialised models. ❓ How do you optimise and improve the accuracy of an LLM? 1. Fine-tuning (on a specialised dataset) 2. RAG – Retrieval Augmented Generation (adding a retrieval technology connected to an external knowledge base) ❓ What are the use cases for RAG? Infinite! Knowledge management systems, chatbots, question answering, data extraction, data summarisation, data analysis... It ensures information is retrieved rapidly and accurately so the LLM can generate responses that are precise and contextually relevant. 💪 And a new question - What is the role of humans in all this? 🤔 Humans act as gatekeepers of these models, which is a huge responsibility. We need to ensure the output is accurate, measurable and reliable. This is only really possible if the underlying data is ‘clean’ and organised, and the algorithms and calculations are fully transparent (avoiding the ‘black box’ issue). Human input is needed to validate and verify the data sets these models are based on, and to monitor, assess and evaluate the accuracy of these models. #llm #foundationmodel #rag #knowledgemanagement #ai #genai #data #transparency


    ❓ What is knowledge management? ℹ Knowledge management is the process of capturing, organising, sharing and using knowledge across an organisation. This helps an organisation to be more efficient and productive! Employees no longer have the frustration of not knowing where to find the answers to their questions. 🎉 A knowledge management system helps to ensure company procedures and policies are easy to access, communicated, distributed and followed. 😁 A RAG architecture enhances and improves a knowledge management system. It can be connected to a company’s existing knowledge database, retrieving information rapidly and accurately while using an LLM to generate responses that are precise and contextually relevant. 💪 The workflow: The RAG query process starts with an initial prompt (question). This is sent to a Retriever, which extracts relevant chunks of information from a database. The retrieved ‘chunks of information’ are sent to an LLM, which generates a coherent and contextually relevant response. This end-to-end process ensures that the final output is a direct answer to the user’s query, optimised for accuracy and relevance. 🙌 🕺 #rag #ai #llm #knowledgemanagement #data
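The query flow described above (prompt → retriever → relevant chunks → LLM → answer) can be sketched in a few lines of Python. This is a toy illustration, not Lingua Custodia's implementation: the retriever here scores chunks by simple word overlap (real systems use vector embeddings), and `generate()` is a stand-in for an actual LLM call; all names and the sample knowledge base are hypothetical.

```python
# Toy RAG pipeline: prompt -> retriever -> chunks -> "LLM" -> grounded answer.

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:top_k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: answers using only the retrieved context."""
    return f"Answer to {query!r} based on: " + " | ".join(context)

# A hypothetical company knowledge base, already split into chunks.
knowledge_base = [
    "Expense reports must be submitted within 30 days.",
    "The data retention policy keeps client records for 10 years.",
    "Office hours are 9am to 6pm.",
]

question = "What is the data retention policy?"
context = retrieve(question, knowledge_base)
print(generate(question, context))
```

Because the answer is generated from retrieved chunks rather than from the model's parameters alone, the sources backing each response stay inspectable, which is the transparency benefit the post describes.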


    ❓ What are the use cases for RAG? Reminder: RAG (Retrieval Augmented Generation) is an AI framework which helps to enhance the accuracy and reliability of LLMs through the addition of contextual information retrieved from external sources. It’s this retrieval aspect that really enhances the range of use cases for RAG, as it 'extends' the LLM's knowledge, ensuring the responses are factual and contextual. RAG models are fundamental for: 💬 Chatbots and conversational models Ensuring that responses are up to date and informative – for example, for customer support queries such as ‘Where is my delivery?’ or ‘What is the balance on my bank account?' 🔎 Information retrieval Optimising the relevancy and accuracy of search results when searching for information in a large volume of data. For example – ‘what are the data retention policies for company X?’ 📄 Content summarisation Summarising documents based on specific prompts or topics. The RAG model finds the relevant parts of the document and generates a concise summary of the most important points. #rag #llm #ai #usecases #chatbot #data #generativeai


    ❓ What is RAG? 👩🔬 RAG (Retrieval Augmented Generation) is an AI framework which helps to enhance the accuracy and reliability of Large Language Model (LLM) outputs (Recap note: Foundation models and specialised models are all types of LLMs!) 👨🔬 Both RAG and fine-tuning improve a model’s outputs using data. Fine-tuning focuses on retraining a model using newer, more specific datasets, while RAG uses an external knowledge base to supplement the model’s own knowledge. 🙂 RAG helps to ensure that the model has access to the most current, reliable facts. The sources are accessible, so responses can be checked for accuracy. 🙂 The advantages of RAG are that the external knowledge base remains up to date and current, avoiding the problem of ‘out of date’ and redundant data. It helps to reduce the risk of hallucinations and inaccurate responses, and helps enforce transparency (a win, win, win!) 💪 ℹ Note that both RAG and fine-tuning can be used together – this is then known as RAFT! A Retrieval-Augmented Language Model is known as a REALM! 🕺 #ai #genai #llm #rag #finetuning #foundationmodel


    ❔ How do you fine tune a foundation model? ℹ A foundation model is an AI model trained on a huge amount of generic data, which then allows it to be used for a wide range of tasks. 👩🔬 We’ve already highlighted the importance of ‘clean data’ to ensure the outputs generated are accurate and free from bias. It is also possible to ‘fine tune’ a foundation model for specific tasks, which helps to optimise its accuracy and performance for a specialised domain. 👨🔬 Fine tuning an existing foundation model might be a better option for companies which do not have the expertise and resources to create a specialised model from scratch. 📰 Fine tuning techniques include: 👷♀️ Retraining the foundation model on a smaller subset of more specific data. 👷♂️ Ensuring the model is optimised for specific use cases. #foundationmodels #finetuning #ai #generativeai #data #usecases


    ⁉ Is data the new oil?! Recap: ℹ Foundation models are trained on huge amounts of data, which allows them to ‘learn’ and also generate content (hence Generative AI). The more data used for training the models, the better the model tends to be. However, if the data used to train the foundation model is of poor quality, then any outputs risk being inaccurate. Garbage in, garbage out 😭 ℹ There are also the risks we’ve already highlighted for data, which include bias, discrimination, privacy and security. So, how can we ensure that the underlying data for a foundation model is good or 'clean' data? 🤔  1. Use publicly available data sets (though this data might still need to be cleaned!) 2. Use your own data and clean it using algorithms to: ⛑ Remove duplicates ⛑ Remove outliers ⛑ Remove noisy data (errors, inaccuracies, or irrelevant elements) ⛑ Organise and structure the data 😍 Our superhero data team was asked to sum up why data is the new oil! Mariam N.: ‘I like to think of data as snapshots of our world - the more we have and the cleaner it is, the better will be the image of the world we're trying to describe’ Arezki SADOUNE: 'Data is so precious today that it will outlive all current systems and technologies to feed those of the future' #data #ai #genai #foundationmodels
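The four cleaning steps listed above can be illustrated on a toy numeric dataset: de-duplicate, drop noisy entries, remove outliers, then sort to organise the result. This is a minimal sketch under stated assumptions (negative values are treated as errors, and an outlier is anything far from the median); real pipelines clean text and documents with far richer heuristics.

```python
# Toy data-cleaning pipeline: duplicates -> noise -> outliers -> structure.
import statistics

def clean(values: list[float]) -> list[float]:
    deduped = list(dict.fromkeys(values))      # 1. remove duplicates, keep order
    valid = [v for v in deduped if v >= 0]     # 2. drop noisy entries (assume negatives are errors)
    med = statistics.median(valid)
    mad = statistics.median(abs(v - med) for v in valid)  # median absolute deviation
    # 3. remove outliers: keep values within 10 MADs of the median
    kept = [v for v in valid if abs(v - med) <= 10 * max(mad, 1e-9)]
    return sorted(kept)                        # 4. organise and structure

raw = [12.0, 12.0, 15.0, -1.0, 14.0, 13.0, 9000.0]
print(clean(raw))  # → [12.0, 13.0, 14.0, 15.0]
```

The median-based outlier test is used here rather than a mean/standard-deviation rule because a single extreme value (like 9000.0) inflates the mean and standard deviation enough to hide itself.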



Funding

Lingua Custodia: 5 rounds in total

Last round

Grant

US$5,340,152.00

See more information on Crunchbase