Toloka

IT Services and IT Consulting

Your high quality data partner for all stages of AI development

View all 1,041 employees

About us

Toloka empowers businesses to build high-quality, safe, and responsible AI. We are the trusted data partner for all stages of AI development, from training to evaluation. Toloka has over a decade of experience supporting clients with our unique methodology and optimal combination of machine learning technology and human expertise, offering the highest quality and scalability in the market.

Website: https://toloka.ai/
Industry: IT Services and IT Consulting
Company size: 51-200 employees
Headquarters: Amsterdam
Type: Public Company
Founded: 2014
Specialties: Data Annotation, Data Labeling, Machine Learning, Computer Vision, Autonomous Driving, Training Data, Deep Learning, Search, Data Collection, Text creation, Crowdsourcing, Product descriptions, Web research, Tagging, Categorization, Surveys, Sentiment analysis, AI Training Data, and Natural Language Processing (NLP)

Products

Toloka

Data Science & Machine Learning Platforms

Empower AI Development and LLM Fine-Tuning. Elevate your ML with next-level expert data for SFT and RLHF. Access skilled experts in 20+ domains and 40+ languages with unlimited scalability, backed by an advanced technology platform.

Locations

Primary

Amsterdam, NL

Lucerne, Switzerland, 6005, CH

Newburyport, US

San Francisco, US

Chicago, US

Warsaw, PL

Montreal, CA

Tel Aviv, IL

Singapore, SG

Belgrade, RS

Updates

Toloka

101,736 followers
6d
🚀 We’re LIVE at GITEX GLOBAL Largest Tech & Startup Show in the World!

Toloka is officially onsite at GITEX 2024 and ready to meet you! Our booth is buzzing with excitement, and we’re eager to share our latest innovations in AI with you.

🎉 Why Should You Stop By?
⭐ Get hands-on: Explore our interactive demos and see how Toloka’s AI solutions can transform your business.
⭐ Meet the experts: Ranjay Ghai, Catherine Fedorenko, Nima Karimi and Abdulrazzak Jaroukh are ready to chat, answer your questions, and dive deep into the future of AI with you!
⭐ Discover opportunities: Whether you’re looking to enhance your AI models, improve data quality, or find new crowdsourcing strategies, we’ve got something for you.

📍 Where to Find Us: Hall 9, H9-B60, GITEX Exhibition Hall

Don’t miss out: swing by, say hello, and let’s shape the future of AI together. We can't wait to meet you at GITEX! 👋
2 comments

Toloka

101,736 followers
6d
🚀 Building great AI starts with quality data. But where do you get yours? From labeled datasets to synthetic generation, the options are endless, each with its own strengths and challenges. We’re curious: what’s your go-to source for training data? Vote and tell us how you fuel your AI! 🔥👇

#AI #DataScience #MachineLearning #DataStrategy #TolokaAI

1 comment

Toloka

101,736 followers
1w, edited
Are you ready to experience the future of AI? Join us at GITEX GLOBAL Largest Tech & Startup Show in the World. 🚀

At Toloka, we’re on a mission to push the boundaries of AI, and we can’t wait to show you how! Ranjay Ghai, Catherine Fedorenko, Nima Karimi and Abdulrazzak Jaroukh will be at the heart of GITEX, presenting our cutting-edge solutions and real-world applications powered by human intelligence and machine learning.

🎉 Join the AI Revolution and be inspired by the possibilities. Whether you’re a tech enthusiast, industry leader, or simply curious, Toloka’s booth at GITEX is the place to be!

📅 When: October 14-18, 2024
📍 Where: Dubai World Trade Centre, GITEX Exhibition Hall
🚪 Visit Us: Hall 9, H9-B60

Let’s shape the future together! See you at GITEX! 👋
1 comment

Toloka

101,736 followers
1w
🚀 Meet Beemo – The Ultimate Benchmark for AI-Generated Text Detection!

We’re excited to announce Beemo, a cutting-edge tool developed in collaboration with Toloka, the University of Oslo, and Penn State University to push the boundaries of AI text detection!

Beemo lets you compare three types of responses to any prompt:
⭐ Human-written
⭐ LLM-generated
⭐ Expert-edited LLM-generated answers

Why is Beemo a game-changer?
1️⃣ Benchmark zero-shot and trained AI detection systems.
2️⃣ Test AI detectors across diverse LLMs and prompt categories.
3️⃣ Train your own AI detectors to distinguish between machine-generated, human-written, and hybrid texts!

Experts like Adaku Uchendu from MIT and Preslav Nakov from MBZUAI (Mohamed bin Zayed University of Artificial Intelligence) emphasize the importance of detecting AI-generated and hybrid texts to maintain data integrity and address ethical concerns. With contributions from top researchers, Beemo sets a new standard in AI content detection.

👉 Check out Beemo on GitHub, try it for yourself, and contribute to improving AI detection: https://lnkd.in/dp4db-gt

Let’s continue innovating and enhancing AI together!

🔗 Full blog in the comments!
1 comment

Toloka

101,736 followers
2w
A big thank you to everyone who participated in our recent poll, in which we asked where the next generation of LLM training data will come from. Most of you voted for a combination of synthetic and human-curated data.

At Toloka, we specialize in both. Our data pipelines blend LLM-generated data with human input from experts, AI tutors, and a global crowd, tailored to meet your price, quality, and speed needs. While LLMs help deliver fast and cost-effective solutions, human experts ensure final accuracy and quality.

Talk to us, and we'll help you find the right balance between automation and human expertise: https://bit.ly/3YUM67F

#ArtificialIntelligence #MachineLearning #LLMs #genAI #Data

Toloka

101,736 followers
2w
We’re excited to continue sharing key takeaways from #ICML2024 in Vienna.

One research paper that stood out to us, authored by Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, and Ian Fischer, introduces an innovative agent system designed to handle tasks with extended contexts. Current LLMs are limited in processing long inputs because they are restricted by a maximum context length. ReadAgent is a system that expands the effective context length by up to 20x. Its design is inspired by how humans read and interact with long documents rather than simply processing text word by word.

Thank you to the authors for pushing the boundaries of modern AI. Check out the GitHub page for more details: https://lnkd.in/gkxmWTaG

#ArtificialIntelligence #MachineLearning #LLMs #genAI Google DeepMind

A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

read-agent.github.io

Toloka

101,736 followers
1mo
Scaling AI models is no small feat, especially when human-labeled data is in the mix. What's the biggest challenge you see in this process? Cast your vote and join the conversation!

1 comment

Toloka

101,736 followers
1mo
🚀 Exciting News: Toloka and Top Universities Launch Innovative Benchmark for Detecting AI-Generated Texts!

We’re thrilled to announce a groundbreaking collaboration between the University of Oslo, Penn State University, and Toloka, unveiling Beemo, a cutting-edge benchmark to revolutionize AI text detection.

This new benchmark, created by experts from leading institutions, offers a robust, realistic testing environment for AI text detectors. Beemo is designed using LLMs like LLaMA and expert human annotators, challenging detectors to differentiate between purely machine-generated texts and human-edited ones, reflecting real-world scenarios.

Why is this important? Detecting AI-generated content is crucial for:
1️⃣ Maintaining data integrity,
2️⃣ Addressing ethical and legal concerns,
3️⃣ Enhancing the reliability of AI systems.

Adaku Uchendu from MIT Lincoln Labs emphasizes the importance of distinguishing artificial texts from human-written ones to protect the integrity of our information ecosystem. Meanwhile, Preslav Nakov from MBZUAI (Mohamed bin Zayed University of Artificial Intelligence) highlights the challenge of detecting hybrid texts co-authored by humans and AI, as they can be particularly deceptive.

The benchmark includes contributions from top NLP researchers such as Vladislav Mikhailov, Saranya Venkatraman, Jason Lucas, M.Sc., MPH, Ph.D (cand), Jooyoung Lee, and more.

As AI evolves, this benchmark is a vital tool for NLP practitioners and researchers. It sets new standards for AI-generated content detection and paves the way for future innovations.

Beemo is now available for public use on:
GitHub: https://lnkd.in/dksfBKFD
Hugging Face: https://lnkd.in/dp4db-gt

Let’s continue pushing the boundaries of AI together! Read the full blog - link in the comments!

Ekaterina Artemova Natalia Fedorova
13 comments

Toloka

101,736 followers
1mo
We are pleased to continue sharing insights from our participation at #ICML2024 in Vienna.

A notable research paper by Alexander Wettig, Aatmik Gupta, Saumya Malik, and Danqi Chen has garnered our attention for its exploration of high-quality data selection in language model training. The authors present a novel approach that encapsulates human intuition on data quality by focusing on four key factors: writing style, required expertise, factual accuracy, and educational value.

By leveraging language models to perform pairwise comparisons of texts and translating these judgments into scalar values, they propose an efficient method for selecting superior data for model training. Their findings highlight the importance of balancing data quality with diversity, demonstrating that models trained with this approach achieve lower perplexity and improved in-context learning performance compared to traditional methods.

This research represents a significant advancement in optimizing language model training, and we extend our gratitude to the authors for their valuable contributions.

Read the full paper: https://lnkd.in/dVi3YSgY

#ArtificialIntelligence #MachineLearning #LLMs #genAI

QuRating: Selecting High-Quality Data for Training Language Models

arxiv.org
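
The post above describes turning pairwise text comparisons into scalar quality values. One standard way to do this is the Bradley-Terry model, where the probability that text i is preferred over text j is a logistic function of the difference between their scores. The sketch below fits such scores with simple stochastic gradient ascent. It is an illustration under stated assumptions, not the authors' implementation: the function name `fit_bradley_terry`, the toy comparisons, and the `epochs` and `lr` settings are all hypothetical.

```python
import math


def fit_bradley_terry(comparisons, n_items, epochs=200, lr=0.1):
    """Fit one scalar score per item from (winner, loser) index pairs.

    Model assumption: P(i preferred over j) = sigmoid(score[i] - score[j]).
    Scores are only meaningful relative to each other.
    """
    scores = [0.0] * n_items
    for _ in range(epochs):
        for winner, loser in comparisons:
            # Probability the model currently assigns to the observed outcome.
            p_win = 1.0 / (1.0 + math.exp(-(scores[winner] - scores[loser])))
            # Gradient of the log-likelihood with respect to the score gap.
            step = lr * (1.0 - p_win)
            scores[winner] += step
            scores[loser] -= step
    return scores


# Toy data (hypothetical): each pair is (preferred_text, other_text).
toy_comparisons = [(0, 1), (0, 2), (1, 2), (1, 0), (2, 1), (0, 2)]
toy_scores = fit_bradley_terry(toy_comparisons, n_items=3)
ranking = sorted(range(3), key=lambda i: toy_scores[i], reverse=True)
print(ranking)
```

In this toy data, text 0 wins most of its comparisons and text 2 loses most, so the fitted scores rank them in that order. In a data-selection setting, such scores could then be used to filter or sample documents.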

Toloka

101,736 followers
1mo
Inter-rater reliability has long been considered an important factor in ensuring data quality for AI and machine learning projects, but there are better ways to ensure data quality. 📊

In our latest blog, we cover:
💡 What is inter-rater reliability (IRR)? A fundamental concept that measures the level of agreement among different annotators working on the same data set.
💡 Why IRR matters: Reliable data annotations are vital for training accurate and dependable AI models. Consistency in labeling can impact the performance of your algorithms.
💡 How to measure IRR: We discuss various methods such as Cohen's Kappa, Fleiss' Kappa, and Krippendorff's Alpha, explaining how each technique helps in assessing annotation consistency.
💡 Improving on IRR: Practical strategies and best practices to ensure high-quality data for your AI models.

Dive into the full article to learn more: https://bit.ly/4dgMPDH

#AI #MachineLearning #DataAnnotation #InterRaterReliability #DataQuality #TolokaAI

4 comments
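
Of the IRR measures named in the post above, Cohen's Kappa is the simplest to compute by hand: it compares the observed agreement between two annotators with the agreement expected by chance from each annotator's label frequencies. The following is a minimal sketch in plain Python. The function name `cohens_kappa` and the toy labels are illustrative assumptions, not taken from the linked article.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example (hypothetical labels): agreement on 4 of 6 items.
annotator_1 = ["cat", "cat", "dog", "dog", "cat", "dog"]
annotator_2 = ["cat", "dog", "dog", "dog", "cat", "cat"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.333
```

Here the raw agreement is 4/6, but half of that is expected by chance with balanced labels, so kappa comes out at about 0.33. Fleiss' Kappa and Krippendorff's Alpha extend the same idea to more than two annotators and to missing labels.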


Employees at Toloka

Andrew Braun

Global Accounts at Toloka, a global leader in crowd science and AI

Dmitriy Kachin

VP of Product - Hybrid Data Labeling at Toloka AI | ex-COO, Chatfuel (YC, W16)

Tania Ignatova

Director of Finance @ Toloka | Financial Planning and Analysis | ex-Microsoft

Oleg Levchuk

CPO at Toloka AI, ex-Yandex

Affiliated pages

Crowdsourcing Practice for Efficient Data Labeling

AI/ML Memes and Laughs

Similar pages

Mindrift

SuperAnnotate

Yandex

Nebius

Remotasks

mindrift

Appen

DataAnnotation

Outlier

Crossover
