Join us at the W&B Hackathon to build and improve LLM Judges. Whether you’re refining an existing model or creating a new annotation UI, this event is for AI Engineers who are ready to push the boundaries. Cash prizes and LLM API credits available. Register here: https://lnkd.in/g8VXptg7
Weights & Biases’ Post
More Relevant Posts
-
Posts on Generative AI | learner | Winner of Huggingface / Cohere / Machine Hack / Adobe global hackathons🏅 | Prompt engineer🦜 | Creator of Shaheen 🦅, Baith-al-suroor, meme world 🤗.
Octopus v2: On-device 📱 language model for super agent 👮, a new method that empowers an on-device 2B model to outperform GPT-4 in both accuracy and latency and to decrease the context length by 95%. I have been working on a digital assistant for visually impaired people for the upcoming Deutsche Telekom hackathon, and I think combining this with the GPT-4V model would be perfect for that use case. If anybody wants to team up, feel free to ping me. Paper 📄 - https://lnkd.in/gGVDvaPY
-
🔥 Join Gabe Monroy from Google, Sanjeev Hasiza of Sensormatic, and Jyoti Bansal at {unscripted}'s closing keynote for an insightful session. Be there when they explore how to measure developer productivity in an AI-driven landscape, the impact of generative AI on code generation and maintenance, and the broader role of AI in automating and enhancing the Software Development Life Cycle (SDLC). 🗓 9/25 📍 https://lnkd.in/gGiF93mq
-
OpenAI's new model o1 is crazy powerful. We built an AI financial analyst that can make its own financial models depending on the input files.
We built an AI financial analyst embedded in Excel for the OpenAI o1 hackathon. It takes a company's annual report, extracts the financial data, and builds its own projections. Scary stuff.
-
For data teams interested in hackathons, a cool side effect of multimodal LLMs: if the project is an internal tool, never again will you need a dashboard to be the artefact by the end of the day! The graphing capabilities of an LLM mean that end users can convince themselves of the tool's usefulness, given a CSV and a subscription to GPT-4!
-
HumaneIntelligence Fellow | Responsible & Sustainable AI | AI for Social Impact | Certified in Ethics of AI at LSE
HuggingFace announces the new Open LLM Leaderboard, with many changes: benchmark selection, normalization techniques in the evaluation, a new interface, and a voting process for model selection, to name some. Here are the reasons for the change, in their words:
Over the past year, the benchmarks we were using got overused/saturated:
- They became too easy for models. For instance, models are now reaching baseline human performance on HellaSwag, MMLU, and ARC, a phenomenon called saturation.
- Some newer models also showed signs of contamination. By this, we mean that models were possibly trained on benchmark data or on data very similar to benchmark data. As such, some scores stopped reflecting the general performance of the model and started to overfit on some evaluation datasets instead of reflecting the more general performance of the task being tested. This was, in particular, the case for GSM8K and TruthfulQA, which were included in some instruction fine-tuning sets.
- Some benchmarks contained errors. MMLU was recently investigated in depth by several groups (see MMLU-Redux and MMLU-Pro), which surfaced mistakes in its responses and proposed new versions. Another example was that GSM8K used a specific end-of-generation token (:), which unfairly pushed down the performance of many verbose models.
We thus chose to completely change the evaluations we are running for the Open LLM Leaderboard v2!
Check the full article here: https://lnkd.in/dvVtBViK
Senior AVP - Transformational BI & Generative AI Leader @ EXL | 2024 3AI Pinnacle Award for Inspiring Women Leader | 2024 EmpowHer access award finalist by Women in Cloud | 2023 Role Model by Women in Cloud | Speaker
2mo · Is this an in-person hackathon?