Small Data SF

Data Infrastructure and Analytics

San Francisco, California 879 followers

Build bigger with small data and AI - join the movement.

About us

Small Data SF is a first-of-its-kind, in-person experience where the community can gather, build together, and learn how to make the most of their data with simple, efficient architectures and workflows. What was once considered 'Big Data' can now fit on your laptop, so why are we still treating it like Big Data? Not everything deserves a round-trip to the cloud. Think small, develop locally, ship joyfully.

Industry
Data Infrastructure and Analytics
Company size
11-50 employees
Headquarters
San Francisco, California
Type
Privately Held

Updates

  • Small Data SF reposted this

    MotherDuck

    17,793 followers

    Something small is happening 🔥 On Wednesday, 11/13, join Altana, Jamsocket, MotherDuck, and the Small Data community for Watch Party Wednesday @ 25 Kent. We heard that the be(a)st coast 💪 wanted in on Small Data SF, so we're bringing you something special: a first look at talks from Benn Stancil and Jordan Tigani before they drop online. Space is limited. Save your spot: https://lnkd.in/exs9WvpM

  • Small Data SF reposted this

    MotherDuck

    17,793 followers

    The cost of running large language models (LLMs) has fallen significantly, making advanced natural language processing techniques more accessible than ever. The emergence of small language models (SLMs) like gpt-4o-mini has cut costs for very capable language models by another order of magnitude. This democratization of AI has reached a stage where integrating SLMs like OpenAI’s gpt-4o-mini directly into a scalar SQL function is practical from both a cost and a performance perspective. We’re thrilled to announce the prompt() function, which is now available in Preview on MotherDuck.
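To make the phrase "scalar SQL function" concrete, here is a minimal local sketch of the general pattern, not MotherDuck's implementation. It assumes Python's built-in sqlite3 module as a stand-in database and a hypothetical `stub_model` function in place of a real language model call; the table and column names are invented for illustration.

```python
import sqlite3


def stub_model(text: str) -> str:
    """Hypothetical stand-in for a small language model call (no network access)."""
    return f"[stub reply to: {text}]"


conn = sqlite3.connect(":memory:")

# Register the Python function as a one-argument scalar SQL function.
# The name "prompt" only mirrors the announcement; this is a local stub.
conn.create_function("prompt", 1, stub_model)

# Illustrative table; the names are assumptions, not from the post.
conn.execute("CREATE TABLE reviews (id INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [(1, "Fast and simple"), (2, "Too slow")],
)

# A scalar function is evaluated once per row, like any built-in function.
rows = conn.execute(
    "SELECT id, prompt('Summarize: ' || body) FROM reviews ORDER BY id"
).fetchall()

for row in rows:
    print(row)
```

Because the function is evaluated once per row, cost and latency grow with the number of rows it touches, which is presumably why the falling per-call cost of small models matters for this design.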

  • Small Data SF reposted this

    Ravit Jain

    Founder & Host of "The Ravit Show" | LinkedIn Top Voice | Startups Advisor | Gartner Ambassador | Evangelist | Data & AI Community Builder | Influencer Marketing B2B | Marketing & Media | (Mumbai/San Francisco)

    Small Data vs Big Data!!!! Why is focusing on Small Data important? I spoke to Ryan Boyd 🐤, Co-Founder of MotherDuck, at Small Data SF last week, and he shared some really interesting insights about how machines are becoming more powerful, why enterprise leaders are intrigued by small data, and much more! Can’t wait for Small Data SF 2025! #data #ai #smalldatasf #theravitshow

  • Small Data SF

    879 followers

    Wow. We've been very quiet after last week, soaking in your response to our first gathering to officially kick off the Small Data movement. Thank you. Thank you for showing up and bringing unmatched energy, enthusiasm, and collective community brain power to this idea that small really is mighty 💪 Our moment is here, and it's just getting started... We're so grateful to be building bigger together with you.

  • Small Data SF reposted this

    Koosha T.

    Data & Analytics Engineering | ETL Pipelines | Data Warehouses & Lakes | Business Intelligence | ML Ops | AWS | GCP | Snowflake

    Post-Small Data SF conference post (2/2). Of course, there was the main event, and the speakers were amazing. There was Jordan Tigani's opening talk on Big Data and why it's not what you might think, plus the awesome conversation with Fivetran CEO George Fraser about why and where Big Data problems really occur (and why ideally they shouldn't). Also presenting were Julia Schottenstein from LangChain, discussing the rise of agents and Directed Cyclic Graphs (DCGs); Lindsay Murphy from Hiive, explaining how the end of the Zero Interest Rate Policy (ZIRP) era will completely change the data engineering game; and Mode founder Benn Stancil, on why traditional BI needs a new approach when it comes to small data (which most of the world is working with!). There were too many others to list (just look at how awesome the agenda was: https://lnkd.in/g2HCVcBJ), and this was the limit of what I managed to catch! Can't wait until Small Data 2025! Way to go Sheila Sitaram, Margaret Lawrence Rosas, and the MotherDuck team for organizing! 👏

  • Small Data SF reposted this

    Ravit Jain

    Founder & Host of "The Ravit Show" | LinkedIn Top Voice | Startups Advisor | Gartner Ambassador | Evangelist | Data & AI Community Builder | Influencer Marketing B2B | Marketing & Media | (Mumbai/San Francisco)

    I had the pleasure of interviewing Wes McKinney, Creator of Pandas, a name well-known in the data world through his work on the Pandas Project and his book, Python for Data Analysis. Wes is now at Posit PBC, and during our conversation at Small Data SF, we covered several key topics around the evolving data landscape! Wes shared his thoughts on the significance of Small Data, why it’s a compelling topic right now, and what “Retooling for a Smaller Data Era” means for the industry. We also dove into the challenges and potential benefits of shifting from Big Data to Small Data, and discussed whether this trend represents the next big movement in data. Curious about Apache Arrow and what's next for Wes? Check out our interview where Wes gives some great insights into the future of data tooling. #data #ai #smalldatasf2024 #theravitshow

  • Small Data SF reposted this

    James Winegar

    CEO of CorrDyn | Helping Great Companies to Make Smarter Strategic Decisions with their Data

    Most data is Small Data, and we talked about it; the goalposts are constantly moving, and most data questions don't require a distributed cluster to answer them. Today, most analytical queries can run on a single node. Some of the most important queries still run over the full data set and might require distributed processing, but 99% don't. How can we improve our processes and latency while enabling new use cases with small data? The world is moving towards composable data systems, and I believe there's a large opportunity for systems optimized for single workers, like DuckDB.

    Shoutout to the MotherDuck team for putting together the Small Data SF conference, particularly Sheila Sitaram, who put on an amazing event (you would never believe it was the first conference she ran), with support from Ryan Boyd 🐤 and Jordan Tigani. I might not have been the most contrarian or the most joy-sparking member of the panel with Ravit Jain, Celina Wong, Jake Thomas, and Josh Wills, but we had a good time!

    My highlights from the conference:
    - A whiteboarding session with the MotherDuck team where we talked about how we can do more for our customers
    - Finally meeting Margaret Lawrence Rosas and Alex Monahan in person after our last failed attempt
    - Being introduced to Guillermo Gonzalez Aleman, who runs Litebox and put together some amazing digital for the event
    - Accidentally running into Sean Lynch while reading each other's nametags
    - Meeting with some of CorrDyn's clients, new and old, while visiting SF

  • Small Data SF reposted this

    Lindsay Murphy

    Director, Head of Data @ Hiive // Host of Women Lead Data Podcast // Advanced dbt Instructor @ Uplimit

    I had an amazing time at Small Data SF this week, learning about all the exciting things that folks are doing with "small" data. I gave a talk on squeezing max ROI out of small data...but I think the concepts really apply to any situation for data teams. Here are the key takeaways:

    🔑 The "modern data stack" industry tends to focus on our lack of constraints (easily scalable cloud data warehouses, endless dbt models, and dashboards for everyone that no one uses).
    🔑 This abundance mindset teaches data teams bad behaviours (frivolous building, lack of governance, prioritizing speed over quality and performance).
    🔑 Hyperfocus to deliver value: identify the most critical stakeholders in your business and focus on their needs, and learn how the business functions as well as (or better than) they do. This is how we maximize our efforts and avoid getting caught in support ticket hell with low-influence stakeholders.
    🔑 Stop ignoring our constraints (cost, scope, time) when designing solutions, whether they're obvious or not. Give constraints a first row seat at the table in projects (and PRs). It helps drive innovation.

    I also loved catching Benn Stancil's talk on BI's Big Lie, and an awesome panel to end the day with Celina Wong, Jake Thomas, Ravit Jain, James Winegar, and Josh Wills. Huge thank you to the MotherDuck team, Sheila Sitaram and Ryan Boyd 🐤, for having me join the conference. That was awesome, can't wait for the next one!!
