Break Into Data

Technology, Information and Internet

San Francisco, California · 8,561 followers

Learn. Build. Show.

About us

Whether you're aiming for that dream job, wanting to boost your skills, acing academics, or even achieving fitness goals, Break Into Data is a community that will support and empower you to reach new heights!

Industry: Technology, Information and Internet
Company size: 11-50 employees
Headquarters: San Francisco, California
Type: Self-Owned
Founded: 2024

Updates

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    Machine Learning is such an incredibly vast and generous field. ❤️

    If you DON'T like what you are working on, you can always move to:

    - Computer Vision
      Projects: Face recognition, object detection, autonomous driving
      Big Tech: Meta (PyTorch), Google (MediaPipe), Apple (Vision ML)
      Hot Startups: Scale AI, Weights & Biases, Roboflow

    - Recommender Systems
      Projects: Product recommendations, content personalization
      Big Tech: Netflix (Shows), Amazon (Products), Spotify (Music)
      Hot Startups: TikTok, Glean, Constructor

    - Large Language Models (LLMs) / GenAI
      Projects: Pre-training, fine-tuning and deploying LLMs
      Big Tech: Google (Gemini), Meta (Llama), Amazon (Titan)
      Hot Startups: OpenAI, Anthropic, Cohere, Imbue

    - MLOps / AI Infrastructure
      Projects: Model deployment, monitoring, scaling
      Big Tech: Google (Vertex AI), Amazon (SageMaker)
      Hot Startups: Databricks, Anyscale, Weights & Biases

    - Natural Language Processing
      Projects: Translation, content moderation, sentiment analysis
      Big Tech: Google (BERT), Meta (RoBERTa), Microsoft (T5)
      Hot Startups: Hugging Face, Duolingo, Grammarly

    - AI for Search / Information Retrieval
      Projects: Semantic search, ranking, question answering, RAG applications
      Big Tech: Google, Microsoft (Bing), Amazon
      Hot Startups: Perplexity AI, You.com, Neeva

    - Reinforcement Learning / Robotics
      Projects: Motion planning, perception, control
      Big Tech: Amazon (Robotics), Google (DeepMind)
      Hot Startups: Tesla, Agility Robotics, Boston Dynamics

    ...

    If you want to explore your career options, sign up for my newsletter. 👇
    merinova.substack.com

    In the next edition, I will share trending industry opportunities in ML and AI.

    ...

    What domains are you most excited about?

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    Build a state-of-the-art (SOTA) architecture as a PhD researcher - make $45k/year.
    Fine-tune pre-trained LLMs from Hugging Face as an ML engineer - make $200k/year.
    Debug production issues at 3 a.m. as a senior engineer - make $500k/year.
    Build an open-source library - make $0.
    Build a SaaS wrapper around that library - make $100k/month.

    #thank_you_open_source 😭

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    Sparse and dense retrieval methods are fundamentally different!

    *This is for you if you are building RAG systems.

    Sparse retrieval (Term-Based):
    - focuses on matching query keywords directly within documents
    - ideal for finding documents with exact phrases like "ML engineering"
    - uses inverted indices for quick lookups; optimized for high-speed retrieval
    - scores matches with TF-IDF or BM25
    - typically cheaper, since indexing is less compute-intensive

    Dense retrieval (Embedding-Based):
    - embeds text in a vector space to retrieve similar meanings, not just matching words
    - effective when meaning is more critical than keywords
    - usually better at semantic matching, but slower at retrieval time
    - requires a vector index (e.g., FAISS) for efficient nearest-neighbor search
    - higher cost due to embedding generation and vector storage, especially with frequently updated data

    #machinelearning #ai #nlp
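The contrast between the two approaches can be sketched in a few lines of plain Python. This is only an illustration, not a production setup: the three documents, the function names, and above all the hand-written `TOY_VECTORS` are assumptions made up for the example. A real dense retriever would get its vectors from a trained embedding model and search them with a nearest-neighbor index such as FAISS; here the "embedding" is a sum of made-up word vectors and the search is brute-force cosine similarity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for this example.
DOCS = [
    "car maintenance guide",
    "automobile repair manual",
    "pasta cooking recipes",
]


def tokenize(text):
    return text.lower().split()


# --- Sparse retrieval: inverted index + BM25 scoring ---
def build_inverted_index(docs):
    index = defaultdict(dict)  # term -> {doc_id: term frequency}
    lengths = []               # number of tokens per document
    for doc_id, doc in enumerate(docs):
        tokens = tokenize(doc)
        lengths.append(len(tokens))
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, lengths


def bm25_search(query, index, lengths, k1=1.5, b=0.75):
    n_docs = len(lengths)
    avg_len = sum(lengths) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term not in the corpus: contributes nothing
        df = len(postings)
        # Smoothed, non-negative IDF variant.
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        for doc_id, tf in postings.items():
            norm = tf + k1 * (1 - b + b * lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# --- Dense retrieval: toy embeddings + brute-force cosine similarity ---
# ASSUMPTION: hand-made 3-d vectors (vehicle, fixing, food) standing in
# for a real embedding model. Unknown words map to the zero vector.
TOY_VECTORS = {
    "car": (1.0, 0.0, 0.0), "automobile": (1.0, 0.0, 0.0),
    "maintenance": (0.0, 1.0, 0.0), "repair": (0.0, 1.0, 0.0),
    "pasta": (0.0, 0.0, 1.0), "cooking": (0.0, 0.0, 1.0),
    "recipes": (0.0, 0.0, 1.0),
}


def embed(text):
    vec = [0.0, 0.0, 0.0]
    for token in tokenize(text):
        for i, x in enumerate(TOY_VECTORS.get(token, (0.0, 0.0, 0.0))):
            vec[i] += x
    return vec


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dense_search(query, docs, top_k=2):
    q = embed(query)
    scored = [(i, cosine(q, embed(d))) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    index, lengths = build_inverted_index(DOCS)
    print("sparse:", bm25_search("automobile repair", index, lengths))
    print("dense: ", dense_search("automobile repair", DOCS))
```

With the query "automobile repair", the sparse search only scores the document that literally contains those words, while the toy dense search also surfaces "car maintenance guide" because its vector points in the same direction. The sparse path touches only the postings for the query terms, whereas the dense path here compares the query against every document, which is the cost a vector index is meant to reduce.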

  • Break Into Data reposted this

    Daliana Liu

    Founder of "Data Science & ML Career Accelerator" | Ex-Amazon Sr. Data Scientist | I write about {career growth, stakeholder management, my solo-founder journey}

    Kaggle and Leetcode might help you get a 6-figure data science job. But once you get it, reality will slap you in the face.

    · The data is always messy
    · The stakeholders reject your solutions
    · Your manager doesn’t trust you with high-impact projects

    These are just a few common struggles data scientists face.

    I have seen it over and over again: very technical data scientists struggle to progress in their careers. It’s not because they lack technical skills, but because they haven’t developed the problem-solving and communication skills to advance further.

    Success isn’t just about the hard skills, it’s about knowing how to:

    · break down complex business problems into actionable steps
    · communicate value to the stakeholders effectively
    · advocate for yourself for promotions

    If you want to know what it takes to build a successful data science career, I’m doing a fireside chat + Q&A with Meri Nova - AI/ML engineer, founder of the “break into data” community - this Tuesday, 7 pm PT.

    I’ll share the ah-ha moments I learned only after I became a senior data scientist. I will talk about:

    · how to create more impact as a data scientist
    · how to position yourself to get a promotion
    · how to identify your own niche

    Join us on Tuesday at 7 PM Pacific time!

    *Update: this event has ended. It was a great session packed with actionable take-aways. Watch the recording: https://lnkd.in/embvfGat

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    Don't try to learn Machine Learning all at once! You will get overwhelmed and paralyzed.

    Instead, follow these rules👇 (This is a beginner-friendly guide.)

    Rule #1 - No theory without application
    If you want to gain breadth across ML topics, don't just consume content! If you do not immediately try to implement what you've learned, you will likely forget it within a few weeks or months. Theory and application should go hand in hand!
    Action: For every new ML concept you learn, set aside time to implement a simple version or experiment with it using a small dataset.

    Rule #2 - Learn the math from the source code
    When importing popular ML libraries, don't just use them as black boxes. Dive into the source code to understand the math that's running the show under the hood.
    Action: Choose a simple ML model from a popular library (like scikit-learn) and reverse engineer its source code to uncover the underlying logic and math. (This is like watching a fitness coach squat and trying to learn his form.)

    Rule #3 - Build from scratch
    Before jumping to the next model, give yourself at least 30 minutes to try building that model from scratch using only NumPy and Python.
    Action: After learning the fundamental logic behind a new ML algorithm, try to recreate your own implementation. (This is like actually squatting.)

    A small gift for you. 🎁
    This free ML from Scratch Course (https://lnkd.in/g6fcGD8N) takes you through writing 10 algorithms from scratch with nothing but Python and NumPy!

    - K-Nearest Neighbors
    - Linear Regression
    - Logistic Regression
    - Decision Trees
    - Random Forest
    - Naive Bayes
    - PCA
    - Perceptron
    - SVM
    - K-Means

    ...

    Don't wait! Follow Break Into Data's motto: Build. Learn. Share!
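As a taste of what "build from scratch" can look like, here is a minimal sketch of the first algorithm on the list, K-Nearest Neighbors. It is not taken from the linked course: the class name, the toy data, and the choice of Euclidean distance with majority voting are assumptions made for illustration, and it uses plain Python rather than NumPy so it runs with no dependencies (NumPy would mainly let you vectorize the distance computation).

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class KNNClassifier:
    """Hypothetical from-scratch k-nearest-neighbors classifier."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # KNN has no training step: it just memorizes the data.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict_one(self, x):
        # Distance from x to every stored point, paired with its label.
        neighbors = sorted(
            ((euclidean(x, xi), label) for xi, label in zip(self.X, self.y)),
            key=lambda pair: pair[0],
        )
        # Majority vote among the k closest points.
        top_labels = [label for _, label in neighbors[: self.k]]
        return Counter(top_labels).most_common(1)[0][0]

    def predict(self, X):
        return [self.predict_one(x) for x in X]


# Toy data invented for this example: two well-separated clusters.
X_TRAIN = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
Y_TRAIN = ["a", "a", "a", "b", "b", "b"]

if __name__ == "__main__":
    model = KNNClassifier(k=3).fit(X_TRAIN, Y_TRAIN)
    print(model.predict([(1.5, 1.5), (8.5, 8.5)]))
```

Once this works, comparing it against scikit-learn's `KNeighborsClassifier` on the same data is a quick way to apply Rule #2 as well.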

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    SQL and Python might get you a $130k data science job. But once you get it, reality will slap you in the face.

    Everything will change. The focus will shift. Suddenly, your ability to navigate company politics will become FAR more important than just writing good code.

    I’ve seen it happen time and again: brilliant data scientists who build powerful models but struggle to progress in their careers. They find themselves stuck in the same role for years, not because they lack technical skills, but because they haven’t developed the strategic and communication skills to advance further.

    Success isn’t just about the hard skills, it’s about:

    - finding your zone of genius
    - advocating for yourself in the eyes of management
    - driving influence with stakeholders

    If you want to know what it takes to build a successful data science career, then join me and Daliana Liu this Tuesday, 7 pm PT.

    Daliana is a Senior Data Scientist with 7 years of experience at Amazon, founder of the Data Science Career Accelerator program, and a creator with a 300k following! You can't miss this event, because Daliana will share secrets only Senior Data Scientists know.

    We will talk about:

    - finding a fulfilling career path that leverages your own strengths
    - how to find and lead high-impact data science projects
    - senior career opportunities nobody talks about
    - and much more!

    There will be a Q&A session towards the end, so prepare your questions!

    Join us on Tuesday at 7 PM Pacific time!
    Register here - https://lu.ma/td2zq4gs

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    I still can’t believe that:

    - Hugging Face is free
    - Python is free
    - Google Colab is free
    - PyTorch is free
    - Kaggle is free
    - VSCode is free
    - Andrej Karpathy is free
    - Andrew Ng is free
    - Meri Nova is also free 😂

    ...

    Aren't we the luckiest generation to get to build the future of AI?

    ...

    #machinelearning

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    17 new engineering articles from Big Tech worth reading to improve your ML system design:

    1. Uber: Optimizing LLM training - https://lnkd.in/g5Qr6_eY
    2. Netflix: Recommending for long-term satisfaction - https://lnkd.in/g7FJg5yK
    3. LinkedIn: RecSys - https://lnkd.in/gnBnGvTd
    4. Discord: Rapid GenAI development - https://lnkd.in/gNb6cEVA
    5. Pinterest: Ad ranking - https://lnkd.in/g7njuVn8
    6. Instacart: Fraud detection - https://lnkd.in/gfK-NEas
    7. GoDaddy: Classify support tickets with LLMs - https://lnkd.in/g7BYKfi8
    8. GitLab: LLM-powered features - https://lnkd.in/ghVFmSBx
    9. Goldman Sachs: NLP to improve PRs - https://lnkd.in/gnYgkZiw
    10. Target: Recommender system - https://lnkd.in/gZPzWrw9
    11. eBay: Developer productivity with LLMs - https://lnkd.in/gqAE9rqF
    12. Replit: Fine-tuning LLMs for code repairs - https://lnkd.in/gti_PSfx
    13. LinkedIn: Suggesting new connections - https://lnkd.in/g7bj3-aY
    14. Canva: Detect related groups of objects - https://lnkd.in/gaAuRYW6
    15. Yelp: Detect inappropriate video content - https://lnkd.in/gZN9vR_p
    16. Nvidia: Detect software vulnerabilities - https://lnkd.in/gsdFTBzk
    17. Grammarly: Detect delicate text - https://lnkd.in/gsVdM89k

    ...

    I personally enjoyed learning about Discord's framework on "Developing rapidly with GenAI".

    ...

    If you find these helpful...
    👍 React
    ♻️ Share
    💬 Comment
    So more people can learn.

    #machinelearning #systemdesign

  • Break Into Data reposted this

    Meri Nova

    ML/AI Engineer | Community Builder | Founder @Break Into Data | ADHD + C-PTSD advocate

    The most useful AI roadmaps are not found in textbooks. They're shaped by real people and their inspiring career transitions.

    Check out my favorite stories on breaking into AI with unique roadmaps!

    1. "How I got into Deep Learning?" - https://lnkd.in/gK38nA45
       A candid story written by Vik Paruchuri on breaking into AI, with helpful resources.

    2. "How I Started Learning Machine Learning?" - https://lnkd.in/gMVFua-B
       A great Reddit thread on breaking into NLP and LLMs, with resources.

    3. "How I got into OpenAI?" - https://lnkd.in/gHphR-VF
       An amazing read by Rai (Michael) Pokorny with the most comprehensive list of resources.

    ...

    If you want to hear another success story of breaking into GenAI, with Aishwarya Naresh Reganti, don't miss today's event! 👇
    Sign up here - https://lu.ma/rpqpvbkt
