The Federal Trade Commission's new, strict interpretation of #anonymization practices exposes the inadequacies of traditional methods like hashing, which can no longer guarantee #dataprivacy. Companies relying on these outdated techniques face increased regulatory scrutiny and risk. The FTC has made it clear that data can only be considered truly anonymous if it cannot be re-identified under any circumstances. This shift necessitates a deeper understanding and adoption of advanced privacy-preserving methods. Differential privacy offers a mathematically sound framework that ensures data cannot be traced back to individuals. It provides a level of security that traditional methods simply cannot match, making it the gold standard in the evolving landscape of data privacy. At Sarus (YC W22) we’ve embedded #differentialprivacy at the heart of our solutions, meeting the FTC’s stringent requirements, allowing organizations to harness the full potential of their data without compromising privacy. Our approach is designed for the future, ensuring that as regulations become stricter our clients remain compliant and their data secure. #DataSecurity #DataProtection #RegulatoryCompliance https://lnkd.in/dPPJ_sxa
À propos
At Sarus, we believe that the potential of AI can only be realized if it does not come at the cost of privacy. This is why we are developing a privacy layer to let Enterprises and Public organizations unleash the full potential of sensitive data, and do research, analytics and AI while keeping data safe. With Sarus, data scientists and analysts work on data without ever seeing it. Compliance is streamlined, data value unlocked from day one and security maximal, especially in the fields of healthcare, finance, and sustainable cities. Our solution implements the latest research in privacy-preserving technology, namely differential privacy, remote execution and synthetic data. Sarus was founded in 2019 by three seasoned entrepreneurs with strong engineering and scientific backgrounds. Since then, we have raised over €3M, delivered breakthrough science, hired an outstanding team, published innovative approaches to privacy-preserving AI, built a product and signed first clients. Interested in joining the adventure? Reach out to us! All our job offers are here: https://sarus-technologies.welcomekit.co/
- Site web
-
https://sarus.tech
Lien externe pour Sarus (YC W22)
- Secteur
- Produits logiciels de sécurité des données
- Taille de l’entreprise
- 11-50 employés
- Siège social
- Paris
- Type
- Société civile/Société commerciale/Autres types de sociétés
- Fondée en
- 2020
- Domaines
- ai, privacy, data, compliance, data protection, machine learning, software, security et data governance
Lieux
-
Principal
75008 Paris, FR
Employés chez Sarus (YC W22)
Nouvelles
-
🔒 Fine-Tune LLMs with Differential Privacy in Databricks Databricks simplifies the specialization of foundation models and offers a range of tutorials to showcase their capabilities. However, standard fine-tuning doesn't guarantee #dataprivacy. This is where #DP fine-tuning comes into play, providing the necessary privacy assurances. In our latest project, we demonstrate the ease of using Sarus and Databricks to fine-tune an open-source model without worrying about training set leakage. We compared models fine-tuned with and without DP (Epsilon 3, Delta 1e-5). Using GPT-4 to judge the answers' correctness and similarity, we found that in 90% of cases, the models achieved the highest grades. While DP models had slightly lower mean utility, they offered guaranteed privacy! This approach ensures robust model fine-tuning while safeguarding sensitive information. 🚀 If you're a Databricks user looking to harness the power of LLMs without compromising on privacy, let's connect! #GenAI #Databricks #MachineLearning #LLM https://lnkd.in/euRPtYid
LLM DP fine-tuning with Sarus in a Databricks workspace
sarus.tech
-
Sarus (YC W22) a republié ceci
OpenAI has just announced the possibility to fine-tune #GPT4omini. It makes fine-tuning of LLM more accessible than ever, opening many new possibilities. But it comes with #privacy pitfalls. Yes #LLMs are blabbermouths, they reveal secrets 🔒 from your training data in an uncontrolled way 😨. Read our post to learn more: https://lnkd.in/eyEvf_Sv Contact us if you want to fine-tune LLMs (#Mistral or #Llama3) with privacy guarantees.
Fine-Tuning GPT-4o mini: Privacy not Included
sarus.tech
-
Last week, we explored the possibility of fine-tuning pre-trained LLMs like #Mistral7B to solve general #ML problems, such as classification for medical diagnoses, without needing to define a specific loss function or clean unstructured data. One limitation we encountered was the model memorizing many training examples, which raised privacy concerns and hindered its use in medical applications. In this new experiment, we continue to learn from our fictitious disease dataset but use the #DPSGD algorithm to guarantee record privacy. Key takeaways: - LLMs fine-tuned with DP-SGD can still solve the problem with 93% accuracy 🤯 - More data is needed to learn the same knowledge; below a critical amount of data, accuracy drops dramatically. - If you lack additional data, you can guide the LLM by changing the loss function to ensure it focuses on getting diagnoses right. https://lnkd.in/evq4Qrsg Pretty neat! 😎 If you have private data Sarus (YC W22) can help you build LLM based solutions 🚀 . Feel free to contact us.
Discovering New Knowledge while Protecting Privacy
sarus.tech
-
New week, new post on #privacy in #AI from Sarus (YC W22). Using a synthetic dataset of fake symptoms and corresponding diagnoses expressed in natural language, we explored whether #Mistral7B can produce accurate diagnoses just by fine-tuning it on text. - Good news: The model performs well as a doctor 👩⚕️. - Bad news: The model memorizes 🐘 a lot of irrelevant, yet private, information about individuals in the dataset. Next week's post will show how to overcome this limitation 👌. https://lnkd.in/eGCgxph3 Mistral AI
Fine-tuning Mistral 7B with QLoRA for new knowledge learning
sarus.tech
-
Can a pre-trained #LLM learn a new skill from a bunch of private documents? 🤔 How does #RAG perform on these tasks? What about #FineTuning? ❓ Testing and benchmarking with public datasets can be misleading since they might have been used to pre-train the base #LLM itself 🪞. To address this, Sarus (YC W22) generated a synthetic dataset of conversations between patients and doctors, specifically for testing and benchmarking private knowledge understanding. Explore the dataset on Hugging Face and see how it can enhance your LLM's capabilities: https://lnkd.in/e2wCWZvy New releases will be published on a regular basis as this dataset may be used to train the next iteration of #gpt or Mistral AI ...
An open-source dataset to test LLM private knowledge understanding
sarus.tech
-
Navigating Privacy in RAG-Based Architectures 🌐 Are you deploying Retrieval-Augmented Generation (RAG) systems? Here’s what you need to know to safeguard #dataprivacy and #security: 1. Handle sensitive prompts: mask identifiers and ensure user data is protected. 2. Secure knowledge bases: implement access controls and filter sensitive information. 3. Protect training data: use #differentialprivacy to maintain compliance and security. 4. Ensure overall compliance: address data security at each stage with a zero-trust approach. 🔗 Dive deeper into these crucial strategies to mitigate risks and secure your RAG applications. (link in the comments) #Privacy #RAG #DataSecurity #AI #Compliance #GenAI #GenerativeAI
-
Generate Time-Series Synthetic Data with OpenAI's Fine-Tuning API ⏲ Unlock the potential of your time-series data by leveraging OpenAI's Fine-Tuning API. We've published an insightful guide to help you seamlessly generate synthetic data while ensuring privacy and efficiency. 💡 What you can expect to learn: 1. Enhance privacy and security with synthetic data. 2. Rapid generation using OpenAI's Fine-Tuning API. 3. Practical examples and code snippets included. 4. Applications across various industries. Discover how Sarus (YC W22) can elevate your data projects and help you: - Achieve privacy compliance. - Enhance data utility without compromising security. - Optimize AI integration with synthetic data. Full blog post in comments! #AI #SyntheticData #Privacy #OpenAI #DataSecurity #TechInnovation
-
Sarus (YC W22) a republié ceci
🌟 And the new Shake Up accelerated startups are... Since 2016, we've successfully accelerated around 50 startups including Yuka, Beekast, Whispli, and Tolv. Now, we're thrilled to announce our latest call for projects focused on Artificial Intelligence, led by our co-founders Eva Rosilio, Mathilde Peyret and the project team. After reviewing more than 45 applications in categories like trusted AI, ethical AI and social responsibility, disruptive AI Business Model, our expert jury selected 3 winners: 🏆 UncovAI, introduces an innovative platform for detecting Generative AI Content, offering a blend of efficiency and sustainability. 🏆 Naaia, first AIMS® on the market in Europe, it is the solution for governance and management of AI systems without ethical or compliance compromises. 🏆 Sarus (YC W22), the privacy layer that unleashes the full potential of sensitive data. Do research, analytics, or AI while keeping data safe. 👏 Congratulations to these groundbreaking startups for joining our accelerator and pushing the boundaries of AI! 🚀 And thanks again to the jury: Valentin Blanchot, Caroline Chopinaud, Adrian Dan, Chadi Hantouche, Matthias Houllier, Maximilien Moulin, Charlie Perreau, Marianne Tordeux Bitker
-
+1
-
🚀 Enhancing Privacy with Easy PII Replacement for AI APIs 🚀 Integrating AI features from OpenAI, Mistral AI, or Anthropic into your applications can be challenging when managing Personally Identifiable Information (#PII). We published a step by step guide on how you can seamlessly handle PII replacement, ensuring privacy without compromising on the functionality. 🔑 Key Takeaways: - Understand the challenges of PII in AI applications and the necessity of efficient PII handling. - Learn how to implement PII replacement using various APIs, enhancing security while maintaining performance. - Get hands-on with detailed code snippets and practical examples to integrate these solutions into your projects effortlessly. - Explore the differences between OpenAI, Mistral, and Anthropic APIs, and discover best practices for each. Reach out to learn how Sarus (YC W22) Arena can help you: - Ensure compliance with data protection regulations. - Boost your application’s trustworthiness by safeguarding user data. - Gain insights into optimizing AI integration with robust privacy measures. Don't miss out on making your AI applications both powerful and privacy-compliant (full blogpost in the comments). #AI #Privacy #DataProtection #OpenAI #Mistral #Anthropic #TechInnovation #DataSecurity