Over the past few months, we built Vijil Evaluate into a high-performance evaluation engine that runs at massively parallel scale to test LLM applications. In parallel, we constructed a “Lite” version of every benchmark of interest to us using tinyBenchmarks, a principled method for approximating a benchmark. Our Lite version of MMLU-Pro reproduces the score of the full version with 95% accuracy at 95% cost savings, running 1000x faster on Vijil Evaluate than the full MMLU-Pro on the default evaluation harness. Read the blog post for details. https://lnkd.in/g5FBcFaE
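To make the cost and accuracy trade-off concrete, here is a minimal sketch of scoring a model on a 5% subset of a benchmark and comparing it with the full score. It uses plain random subsampling on synthetic results, which is a stand-in and not the tinyBenchmarks method or Vijil's implementation; the function name `estimate_score`, the 12,000-item size, and the 60% base accuracy are all assumptions made for illustration.

```python
import random


def estimate_score(per_item_correct, fraction, seed=0):
    """Estimate benchmark accuracy from a random subset of the items.

    Plain random subsampling, used here only as an illustrative stand-in.
    Returns (estimated accuracy, number of items evaluated).
    """
    rng = random.Random(seed)
    k = max(1, round(len(per_item_correct) * fraction))
    sample = rng.sample(per_item_correct, k)
    return sum(sample) / k, k


# Synthetic per-item results (1 = correct, 0 = wrong); not real benchmark data.
data_rng = random.Random(42)
full_results = [1 if data_rng.random() < 0.6 else 0 for _ in range(12000)]

full_score = sum(full_results) / len(full_results)
lite_score, n_items = estimate_score(full_results, fraction=0.05)

# Evaluating 5% of the items means 95% fewer model calls.
cost_savings = 1 - n_items / len(full_results)

print(f"full score:   {full_score:.3f}")
print(f"lite score:   {lite_score:.3f} on {n_items} items")
print(f"cost savings: {cost_savings:.0%}")
```

Random subsampling gives an unbiased but noisy estimate; a principled method chooses which items to keep so that a small subset tracks the full score more closely.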
vijil
Software Development
Helping AI developers build intelligent agents that people can trust
About us
Vijil is an AI software startup on a mission to help developers build and operate intelligent agents that humans can trust. Many organizations today are drawn to the productive potential of AI agents but are held back from production use because the large language models inside have inherent vulnerabilities to attack and propensities for harm. For AI developers at enterprises and startups that want to measure and mitigate these risks, Vijil offers tools that continuously improve the trustworthiness of agents.
- Website: https://vijil.ai
- Industry: Software Development
- Company size: 11-50 employees
- Type: Privately Held
- Founded: 2023
Updates
-
vijil reposted this
"The upside of GenAI - is that it should be saving you a lot of work elsewhere." Thanks to everyone for joining us today! Huge thank you to our panelists 🎉 Rodney Shetler, Subho Majumdar, PhD, Ryan Carr, Dan F., and Rahul Pradhan, from Chainguard, Couchbase, Enveil, and vijil. Thanks to Brittany Carambio for moderating 🙌 If you missed it, stay tuned to find out when you can catch the recorded session. 🔐 Find out how you can securely and confidently run GenAI at scale within the enterprise environment with OctoStack - https://lnkd.in/e4NmdkxG
-
Subho Majumdar, PhD, will be a panelist at the upcoming GenAI Collective x OctoAI Builder's Roundtable. The panel will cover LLM security, privacy, and observability, topics that matter to enterprises deploying GenAI into production. Mark your calendars for August 27th and secure your spot today! Registration link: https://lu.ma/jdbxsd0w
🔐 Meet the panelists! We are officially 1 week out from our Secure GenAI for Enterprise Builder's Roundtable. Sign up and let us know what questions you want answered ✋ Happy to have Cerebral Valley as our Community Partner! The expert panel: 🔑 Dan F. - Staff Product Manager, Chainguard 🔑 Ryan Carr - CTO, Enveil 🔑 Subho Majumdar, PhD - Co-founder, Head of AI, vijil 🔑 Rodney Shetler - Director of Sales Engineering, OctoAI 🔑 Rahul Pradhan - Vice President, Product and Strategy | Data and AI - Cloud to Edge, Couchbase. Moderated by Brittany Carambio, Director of Corporate Marketing at OctoAI. Register here 👉 https://lu.ma/jdbxsd0w
-
vijil reposted this
vijil raised in their Seed Round. Congrats, Vin Sharma! Full Round Info: https://lnkd.in/g2FMpZFV Round Investors: AIStart seed fund, Gradient Ventures
-
vijil reposted this
vijil is helping businesses to create safe and reliable #AI agents that can be used to automate tasks, improve efficiency, and make better decisions. Learn more from Mayfield Partner Vijay Reddy and Vijil CEO and Co-founder Vin Sharma here: https://lnkd.in/gfM_F48r
-
vijil reposted this
Dazz, a three-year-old Palo Alto startup, has raised $50 million at a post-money valuation of approximately $400 million. The funding round was co-led by Greylock Partners, Cyberstarts, Insight Partners, and Index Ventures.
Pearl, a five-year-old Los Angeles startup, secured $58 million in its Series B round. The investment was led by Left Lane Capital.
Vanta, a seven-year-old San Francisco startup that streamlines security and compliance processes for businesses, raised $150 million in a Series C round, achieving a valuation of $2.45 billion.
Igloo, Inc., a Miami startup developing Abstract, a consumer-facing blockchain aiming to make blockchain technology more accessible and user-friendly, raised $11 million.
Lakera, a three-year-old San Francisco startup focused on protecting AI systems from malicious activities by detecting and mitigating vulnerabilities in large language models, secured $20 million in a Series A round.
Splight, a four-year-old San Francisco startup utilizing AI technology to integrate more clean energy into power grids by predicting and managing energy flow to reduce waste and enhance efficiency, raised $12 million in a seed round led by Noa.
Star Catcher, a newly founded startup in Jacksonville, FL, developing a space-based grid to capture solar energy in space and distribute it to satellites and other space assets, raised $12.5 million in a seed round.
Farmblox, a two-year-old Boston startup providing an AI-powered platform that assists farmers in monitoring and managing their crops through real-time data collection on soil, weather, and plant health, raised $2.5 million in a seed round led by Hyperplane, with Slow Ventures.
Noded Ai, a one-year-old San Francisco startup that transforms user notes into a central hub for managing work tasks and collaborations, raised $4 million in a round led by boldstart ventures.
Promptfoo, a newly founded San Mateo, CA, startup helping developers find and fix vulnerabilities in their AI applications, raised $5 million in a seed round led by Andreessen Horowitz.
rift, a months-old San Francisco startup offering tools for sales professionals to manage leads, track sales activities, and automate routine tasks, raised $5 million in a seed round led by Sequoia Capital.
vijil AI, a one-year-old Menlo Park startup ensuring AI agents operate safely and reliably by running automated tests tailored to specific business contexts, raised $6 million in a round co-led by Mayfield Fund and Gradient Ventures.
ZEST Security, a one-year-old startup that automatically identifies and addresses vulnerabilities in cloud-based systems, raised $5 million in a seed round.
Agellus Capital secured $400 million for its debut private equity fund aimed at investing in essential services.
CityRock Venture Partners, a venture outfit associated with the 15-year-old H/L Ventures, raised a second fund of $24 million.
#fundraising #startups #vc #privateequity #investing #privatemarkets
Dazz Secures $50M in Funding to Revolutionize Cloud Security with AI, Valued at $400M
TRACT on LinkedIn
-
Today is a big day for us at vijil. We started our company this year with a simple mission: help enterprises build trustworthy agents based on open, safe, and secure models. We’re coming out of stealth today to announce the availability of two products ready for preview. Vijil Evaluate is a cloud service to measure trust in AI applications built with language models. Vijil Dome is a cloud service to maintain that trust by mitigating risks in real-time.
We’re thrilled to have assembled a small team with amazing talent to focus on trustworthy AI. Some of us have worked together at Amazon and others are new mates. Together we’re proud of having tackled some thorny problems to make LLM evaluations and guardrails faster, cheaper, and easier to use than ever before. We’re excited to work with customers who can now make the evaluation of LLM reliability, security, and safety a seamless part of AI agent development and deployment.
It took a village to bring Vijil this far. We are deeply grateful to our investors who believed in our vision long before the demand signal became obvious as a market. Thank you Navin Chaddha, Vijay Reddy, Patrick Salyer, Guru Pangal, and Gamiel Gran at Mayfield Fund for your thoughtful support even before our inception. Thank you Darian Shirazi, Vig Sachidananda, PhD, and Kyle Duffy at Gradient Ventures for the clarity of your commitment to responsible AI. In addition, we are incredibly lucky to have pioneers in academia and industry as advisors. Thank you Leon Derczynski, Russ Salakhutdinov, Bratin Saha, and Joseph Spisak for your kindness and wisdom. Thank you Manvinder Singh and Oscar Wahltinez for your collaboration on responsible AI, and Kyle Grimsrud and for your partnership on Google Cloud. And to all our families and friends, thank you for your unwavering support.
This launch is just the beginning. If you want to dive deep into customer-obsessed research and development of trustworthy agents, come join us! If you’re in an AI team looking for tools to build and deploy trusted agents based on open LLMs, send us a note; we’d love to help. https://lnkd.in/gEDJxWzP
Press release: https://lnkd.in/gfM_F48r
More updates are coming soon! Vin Sharma, Zdravko Pantic, and Subho Majumdar
Vijil Emerges from Stealth with Seed Funding from Mayfield and Gradient Ventures to Help Enterprises Build Generative AI Agents That They Can Trust
prnewswire.com
-
vijil reposted this
We evaluated Meta Llama 3 and Llama 2 for the propensity to perpetuate harmful stereotypes in response to jailbreaking prompts. Using 10K diverse prompts, we tested whether each model agrees with common stereotypes that associate negative attributes with people in EEOC protected groups and might lead to unfair outcomes in the workplace. Turns out, there are stark differences between Llama 3 and Llama 2, and across the demographic groups. Leave us a comment or suggestion. https://lnkd.in/gW4J-6DU
Evaluating Meta Llama 3 for Stereotyping
vijil.substack.com
-
vijil reposted this
Product Marketing @ NVIDIA | GTM lead for GitHub Copilot GA | AI developer tools | PMC | Startup Advisor
"Testing and mitigating toxicity — inherent in LLMs trained on Internet data — is important" If you had asked me a month ago about how to measure (much less mitigate) toxicity in LLMs, I wouldn't have had a good answer. The parameter counts and training-data sizes of the most popular models make pruning the data prohibitively expensive. OctoAI partnered with vijil to evaluate WizardLM-8x22B, Mixtral-8x22B (base model), and Mixtral 8x7B Instruct. Vijil tested these models for several categories of toxicity. The Vijil evaluation framework draws test prompts from three sources: the RealToxicityPrompts (RTP) benchmark dataset, prompts generated by a custom adversarial model, and prompts hand-crafted by red-team experts. A test uses the most provocative prompts (according to the prompt authors) in each category and evaluates responses using a toxicity detection model. This standards-based approach means that anyone can use the Vijil eval service to measure model risks systematically. Have you used a toxicity mitigation technique on LLMs before or is there more you'd be interested in learning? Link in comments for the blog post.
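The test flow described in the post above (pool prompts from several sources, keep the most provocative ones per category, score each response with a toxicity detector) can be sketched roughly as follows. This is a hypothetical illustration, not Vijil's code or API: the `Prompt` fields, `select_prompts`, `run_toxicity_test`, and the stub model and detector are all invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Prompt:
    text: str
    category: str           # toxicity category, e.g. "insult"
    source: str             # "rtp", "adversarial", or "red_team" (assumed labels)
    provocativeness: float  # author-assigned rating; higher = more provocative


def select_prompts(prompts: List[Prompt], per_category: int) -> Dict[str, List[Prompt]]:
    """Keep the most provocative prompts in each category."""
    by_category = defaultdict(list)
    for p in prompts:
        by_category[p.category].append(p)
    return {
        cat: sorted(ps, key=lambda p: p.provocativeness, reverse=True)[:per_category]
        for cat, ps in by_category.items()
    }


def run_toxicity_test(
    prompts: List[Prompt],
    generate: Callable[[str], str],   # model under test: prompt -> response
    is_toxic: Callable[[str], bool],  # toxicity detector: response -> verdict
    per_category: int = 2,
) -> Dict[str, float]:
    """Return the fraction of toxic responses per category."""
    rates = {}
    for cat, selected in select_prompts(prompts, per_category).items():
        flagged = sum(1 for p in selected if is_toxic(generate(p.text)))
        rates[cat] = flagged / len(selected)
    return rates


# Stub data, model, and detector so the sketch runs without any real service.
prompts = [
    Prompt("p1", "insult", "rtp", 0.9),
    Prompt("p2", "insult", "adversarial", 0.4),
    Prompt("p3", "insult", "red_team", 0.7),
    Prompt("p4", "threat", "rtp", 0.8),
]
fake_model = lambda text: "TOXIC reply" if text == "p1" else "polite reply"
fake_detector = lambda response: "TOXIC" in response

rates = run_toxicity_test(prompts, fake_model, fake_detector, per_category=2)
print(rates)
```

In a real run, `generate` would call the model being evaluated and `is_toxic` would wrap a trained toxicity classifier; passing both in as callables keeps the selection and scoring logic independent of any particular model or detector.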
-