Unum


Exascale Search & AI

About us

Unum is a deep-tech company that designs core software technologies for next-generation Data-Lakes and Cloud providers. Unum's portfolio of products includes hardware-accelerated Storage Engines, Compute Kernels, Search Algorithms, and Neural Networks that can process Google-scale datasets in a fraction of the time and at lower costs.

Website
https://unum.cloud
Industry
Software Development
Company size
11-50 employees
Headquarters
San Francisco, CA
Type
Privately Held
Founded
2015
Specialties
Machine Learning, Mathematics, AI, Database, Cryptography, C++, HPC, OpenCL, CUDA, Artificial Intelligence, Neural Networks, Deep Learning, DBMS, Semantic Search, Generative Search, and Analytics


Updates

  • Unum

    Today, we are releasing a new set of pocket-sized multimodal AI models, trained in partnership with Nebius and already available on Hugging Face 🤗
    - Matryoshka-style multimodal embeddings ranging from 64 to 256 to 768 dimensions 🖼️
    - Improved multimodal chat in 1.2B parameters, aligned with Direct Preference Optimization 💬
    - ONNX backend, making the PyTorch dependency optional for lightning-fast deployments ⚡
    This marks our biggest AI release to date, paving the way for real-time multimodal perception and personalized assistants that can run on any device, including Windows, Linux, macOS, iOS, Android, and most wearable and IoT devices.
    Tuning with Direct Preference Optimization grew our "Multi-Modal Evaluation" (MME) perception score from 863 to 1049 for the same baseline model. The gain was unexpectedly significant for us and visible to the naked eye. Dropping the PyTorch dependency shrank the image size from over 5 GB to under 500 MB, and it will shrink further. Training with Matryoshka losses lets you crop the resulting embeddings, dropping up to 92% of the dimensions while often retaining 99% of the original accuracy.
    The new guide also covers the recommended quantization and down-casting approaches, helping you export `f32`, `f16`, `i8`, and binary vectors and use them in conjunction with USearch and the rest of our hardware-friendly ecosystem of Search & AI infra 🤗
    New models: https://lnkd.in/dpfGcV59
    GitHub repository: https://lnkd.in/dTrZ5Q2d

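A minimal sketch of the cropping and down-casting workflow described in the post above, assuming the `usearch` Python package (`usearch.index.Index`) and NumPy. The random vector is only a placeholder for a real UForm embedding, and the key 42 is arbitrary:

```python
import numpy as np
from usearch.index import Index

# Placeholder for a 768-dimensional UForm embedding (a real vector would come from the model).
full = np.random.rand(768).astype(np.float32)
full /= np.linalg.norm(full)  # cosine similarity assumes unit-length vectors

# Matryoshka cropping: keep only the leading 64 dimensions and re-normalize.
cropped = full[:64].copy()
cropped /= np.linalg.norm(cropped)

# Down-casting recipes: f32 -> i8 (scaled integers) and f32 -> binary (packed sign bits).
as_i8 = np.clip(np.round(cropped * 127.0), -127, 127).astype(np.int8)
as_bits = np.packbits((cropped > 0).astype(np.uint8))

# Index the cropped vectors with USearch, letting it store them in 8-bit precision.
index = Index(ndim=64, metric="cos", dtype="i8")
index.add(42, cropped)             # the key 42 is arbitrary
matches = index.search(cropped, 10)
print(matches.keys)
```

Because Matryoshka-style training concentrates information in the leading dimensions, truncation plus re-normalization is the whole compression step; the `dtype` argument then controls the precision USearch uses internally.
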
  • Unum reposted this

    Nebius AI

    How do you make language models more compact using Nebius AI? ⚡️ Our colleagues at Unum just did exactly that on our infrastructure.
    The goal was to scale down UForm-Gen, which is already among the industry's smallest multimodal generative models. The smaller it becomes, the easier it is to deploy on edge and mobile devices, popularizing local, privacy-preserving AI. Unum pre-trained the model on an internal image-captioning dataset using H100 GPUs, then fine-tuned it on public instruction datasets such as SVIT, LVIS, and VQAs.
    The resulting UForm-Gen-2 has just been released on Hugging Face: https://lnkd.in/dyUjQVdF. It works with the HF Transformers library out of the box and is free for commercial use. The model also significantly outperforms its predecessor in visual question answering and multi-turn chat capabilities.
    If you are facing the task of training ML models, learn how to do it effectively with Nebius AI: https://lnkd.in/dezbZsPQ

  • Unum

    How small should a Language Model be? Less than two months ago, we released UForm-Gen, one of the industry's most miniature multimodal Generative #AI models. Now downloaded over 100,000 times a month, it is also among the most popular captioning and visual question-answering models. It was built on a Language Model of only 1.3 billion parameters and sometimes performed better on Vision-Language tasks than the 100-1000x larger Google Gemini. Scaling down is the key to privacy-preserving AI models running on every one of the billions of chips produced yearly!
    So, using NVIDIA DGX-H100 nodes on the Nebius #GPU cloud, we've trained several tiny models! One of them features the smallest Language Model we've used to date, derived from "Qwen1.5" by Alibaba Group, with only 0.5 billion parameters. The model significantly outperforms our previous result thanks to a considerably larger, higher-quality multimodal dataset and an improved Vision tower!
    On the technical side, the new "UForm-Gen-2" scored 19.58 on MM-Vet, 45.5 on SQA, and 880 on MME benchmarks, similar to last year's models 10-20x its size. It works with the Hugging Face Transformers library out of the box and is already available online, free for commercial use 🤗 https://lnkd.in/gti4HdAK

    • Unum UForm Gen2 captioning previews
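Below is a minimal, hedged sketch of captioning with a UForm-Gen-2 checkpoint through Hugging Face Transformers, following the standard `trust_remote_code` pattern. The repository id, prompt, and generation arguments are assumptions (the shortened link in the post points to the exact model card, which may specify additional settings such as an explicit `eos_token_id`):

```python
from PIL import Image
import torch
from transformers import AutoModel, AutoProcessor

# Assumed checkpoint name; follow the link in the post for the exact repository id.
repo = "unum-cloud/uform-gen2-qwen-500m"
model = AutoModel.from_pretrained(repo, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)

prompt = "Describe this image in one sentence."
image = Image.open("photo.jpg")
inputs = processor(text=[prompt], images=[image], return_tensors="pt")

with torch.inference_mode():
    output = model.generate(
        **inputs,
        do_sample=False,
        max_new_tokens=128,
        pad_token_id=processor.tokenizer.pad_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
caption = processor.tokenizer.batch_decode(output[:, prompt_len:], skip_special_tokens=True)[0]
print(caption)
```
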
  • Unum

    UForm is going Generative! The UForm family of tiny multimodal AI models just got broader! In addition to the existing CLIP-like embedding models, we now have a generative model useful for image captioning, visual question answering, and multimodal chats. All of it is #opensource and takes around a billion parameters, small enough to fit even on mobile devices 🎉
    Repository: https://lnkd.in/dTrZ5Q2d
    Generative model: https://lnkd.in/gZ9y4KEW
    Chat model: https://lnkd.in/gpaRVvKm
    Discord: https://lnkd.in/gGj-rRGW
    Check out the quality of image captions in the comments ⬇️

