Crunchy Data

Software Development

Charleston, South Carolina 5,461 followers

The Trusted Open Source Enterprise PostgreSQL Leader

About us

Crunchy Data is the industry leader in enterprise PostgreSQL support and open source solutions. Founded in 2012, Crunchy Data's mission is to bring the power and efficiency of open source PostgreSQL to security-conscious organizations while eliminating expensive proprietary software costs. Since then, Crunchy Data has leveraged its expertise in managing large-scale, mission-critical systems to provide a suite of products and services, including:

* Building secure, mission-critical PostgreSQL deployments
* Architecting on-demand, secure database provisioning solutions on any cloud infrastructure
* Eliminating support inefficiencies to give customers guaranteed access to highly trained engineers
* Helping enterprises adopt open source solutions safely and at scale

Crunchy Data is committed to hiring and investing in the best talent available to provide unsurpassed PostgreSQL expertise to your enterprise.

Industry: Software Development
Company size: 51-200 employees
Headquarters: Charleston, South Carolina
Type: Privately Held
Founded: 2012
Specialties: PostgreSQL, Security, Kubernetes, Containers, Geospatial, PostGIS, and Cloud

Updates

  • Database term of the day: “vectorized query”

    Postgres can be a vector data store. This has been common for years, especially in PostGIS and geospatial data. Vectors are also common in AI workloads, and the pgvector extension is now very popular for ML and AI projects that use Postgres. *Note: vector data and vectorized queries are two different things!*

    In the context of database queries, a vector is a one-dimensional array or list of values: basically a batch of values processed as a group. Vectorized queries are optimized for modern hardware that can handle multiple operations in parallel. The word "vector" reflects the fact that these batches are essentially arrays of values being processed simultaneously.

    Unlike traditional row-based processing, vectorized execution processes data in chunks, or vectors. This results in significant speed improvements on larger data sets and analytical queries. Row-based processing is still ideal for application and transactional workloads.

    By fusing DuckDB and Postgres, Crunchy Bridge for Analytics enables vectorized querying for Postgres. This unlocks new power for data science, business intelligence, log data, time series, and spatial analytics.
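
    As a rough illustration, this is the kind of analytical query that benefits from vectorized execution; the events table and its columns are hypothetical:

      -- Hypothetical analytical aggregate over many rows.
      -- A vectorized engine processes column values in batches rather than row by row.
      SELECT date_trunc('day', created_at) AS day,
             count(*)         AS events,
             avg(duration_ms) AS avg_duration
      FROM   events                                  -- hypothetical table
      WHERE  created_at >= now() - interval '30 days'
      GROUP  BY 1
      ORDER  BY 1;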

  • Today we're excited to announce the release of a new open source extension from Crunchy Data: pg_parquet. pg_parquet makes it easy to:

    • Export tables or queries to Parquet files
    • Ingest data from Parquet files into Postgres
    • Inspect the schema and metadata of Parquet files

    You can read a lot more in the release announcement on our blog, but if you want to get started right away, or show some love on GitHub with a star, check it out here (a sketch of typical usage follows the link): https://lnkd.in/gd7u_gQh

    GitHub - CrunchyData/pg_parquet: Copy to/from Parquet files from/to PostgreSQL. Inspect metadata of Parquet files.
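
    A minimal sketch of the usage described above, assuming pg_parquet is installed on the server; the table name and file paths are hypothetical, and the exact options are documented in the repository README:

      -- Load the extension (assumes pg_parquet is installed).
      CREATE EXTENSION pg_parquet;

      -- Export a table (or any query) to a Parquet file.
      COPY my_table TO '/tmp/my_table.parquet' WITH (format 'parquet');

      -- Ingest data from a Parquet file back into Postgres.
      COPY my_table FROM '/tmp/my_table.parquet' WITH (format 'parquet');

      -- Inspect a Parquet file's schema (function name assumed from the
      -- repo README; verify against your installed version).
      SELECT * FROM parquet.schema('/tmp/my_table.parquet');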

  • Audit logs can burn money, unless … unless … you use a strategy like partitioning. IoT and sensor data also save money with partitioning.

    “Partitioning” is a strategy for managing large tables on a single host (“sharding” spreads data across multiple hosts). IoT and audit logs share similar characteristics:

    • time-series data
    • written once, updated never
    • older data is queried less than recent data

    The best part: applications can use Postgres partitioning without code changes. Partitioning lets the application interact with a single table while data storage is distributed across multiple child tables. Send all writes and queries to the parent table, and Postgres routes them to the proper partition.

    The benefits of partitioning:

    • keeps tables small
    • smaller tables translate into smaller indexes
    • faster queries because of smaller indexes
    • ability to export, detach, or drop older data to cold storage
    • reduced costs due to less data and smaller indexes on hot storage

    Configuring partitioning (see the sketch after this list):

    • create the parent table with PARTITION BY RANGE (column)
    • pick a partition strategy (by hour, day, week, etc.), depending on how quickly the table grows and the desired partition size

    Partitioning maintenance: once partitioning is configured, you're not done. You'll need to:

    • implement a retention strategy
    • use pg_partman (or roll your own with cron, but we recommend pg_partman) to extract older data to cold storage, detach and drop partitions as they roll out of the retention period, and create future partitions
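
    A minimal sketch of daily range partitioning, assuming a hypothetical audit_log table; partition names and date ranges are illustrative:

      -- Parent table: no data is stored here directly.
      CREATE TABLE audit_log (
          logged_at  timestamptz NOT NULL,
          user_name  text,
          action     text,
          detail     jsonb
      ) PARTITION BY RANGE (logged_at);

      -- One partition per day (pg_partman can create these automatically).
      CREATE TABLE audit_log_2024_11_01 PARTITION OF audit_log
          FOR VALUES FROM ('2024-11-01') TO ('2024-11-02');

      -- Writes go to the parent; Postgres routes them to the right partition.
      INSERT INTO audit_log (logged_at, user_name, action)
          VALUES ('2024-11-01 10:00:00+00', 'alice', 'login');

      -- Retention: detach and drop a partition once it ages out.
      ALTER TABLE audit_log DETACH PARTITION audit_log_2024_11_01;
      DROP TABLE audit_log_2024_11_01;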

  • Every organization is facing the question of how to use AI, and we see a lot of folks looking to incorporate AI into their workflows. The good news is that Postgres users are already covered. The community and extensibility of Postgres make it a winning platform to build on, with pgvector and a range of other tools that make it easy to start developing AI-enabled applications.

    Crunchy Data Postgres technology comes with AI-enabled Postgres out of the box and by default, whether you are deploying Postgres to your own infrastructure, to the cloud, on Kubernetes, or are interested in a fully managed offering.

    Indico Data recently shared how Crunchy Postgres for Kubernetes helped them reduce costs and stop worrying about their data infrastructure so they could focus on building LLM and AI tools that equip customers with data-centered solutions for decision making, pricing, and research.

    Interested in learning more about how to get started with high-performance AI applications on Postgres? Crunchy Data has a number of resources to help you get started (a small pgvector sketch follows below). https://lnkd.in/gb6tUMt2
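
    A minimal pgvector sketch, assuming the extension is installed and using a hypothetical items table with 3-dimensional embeddings for brevity (real embeddings typically have hundreds or thousands of dimensions):

      -- Enable the extension.
      CREATE EXTENSION IF NOT EXISTS vector;

      -- A table with an embedding column (3 dimensions only to keep this short).
      CREATE TABLE items (
          id        bigserial PRIMARY KEY,
          content   text,
          embedding vector(3)
      );

      INSERT INTO items (content, embedding) VALUES
          ('first document',  '[0.1, 0.2, 0.3]'),
          ('second document', '[0.9, 0.1, 0.4]');

      -- Nearest-neighbor search by L2 distance (<-> is pgvector's distance operator).
      SELECT content
      FROM   items
      ORDER  BY embedding <-> '[0.1, 0.2, 0.25]'
      LIMIT  1;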

  • Query your Iceberg tables with your Postgres!

    The Iceberg data format is already prominent in the big-data space, mainly for accessing data lakes. Recently, it has gained momentum in all facets of the data industry. In the future, we expect Iceberg to be a tool used by application developers and traditional DBAs. (Aside: AWS has a product called “Glacier”; Iceberg and Glacier are entirely different.)

    Iceberg: The Basics

    Iceberg is an open table format specification. Its purpose is to let you interact with files as if they were databases. The specification defines metadata for organizing the files and the structure within them. Iceberg is designed for large data sizes (i.e. big data) and supports analytical workloads on top of that. Because Iceberg is a specification, any data tool can integrate with Iceberg tables.

    How is Iceberg different from other files on disk? Iceberg offers several advantages similar to those of traditional databases, including:

    • smart query scan planning
    • schema evolution
    • hidden partitioning
    • version rollback

    Iceberg & Parquet

    Iceberg tables are stored on disk or on cloud object storage, mainly anything S3 compatible. The most popular format is sets of Parquet files. Parquet is a columnar file format with built-in compression, optimized for analytical queries. Storing Parquet files in an open table format provides a crucial benefit: interoperability. Many tools and query engines can access the same data. For example, Spark jobs can run large-scale data transformations against the same Iceberg table that Postgres queries for analytics.

    Why is a Postgres company talking about Iceberg?

    Crunchy Data is betting on Iceberg, and we are betting on the interoperability of Crunchy Postgres with Iceberg tables. In July 2024, we launched the ability to query Iceberg tables with Postgres (a sketch follows below). Companies with data scientists and analysts use SQL to query Iceberg tables from their warehouse. You can use Postgres for the ET (extract-transform) of your Iceberg tables. Since launch we have released additional Postgres + Iceberg features, and we have more on the roadmap. Iceberg will be a foundation of the data ecosystems of the future.
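
    A rough sketch of what querying an Iceberg table from Postgres can look like through a foreign table. The server name, option names, and S3 path below are placeholders (assumptions), not the exact Crunchy Bridge for Analytics syntax; check the product documentation for the real setup:

      -- Assumption: an analytics-capable foreign data wrapper and server are
      -- already configured; names and options here are placeholders.
      CREATE FOREIGN TABLE sales_iceberg (
          product_id bigint,
          amount     numeric,
          sold_at    timestamptz
      )
      SERVER iceberg_analytics_server                     -- placeholder server name
      OPTIONS (path 's3://my-bucket/warehouse/sales');    -- placeholder location

      -- Once defined, the Iceberg-backed table is queried with ordinary SQL.
      SELECT product_id, sum(amount) AS total
      FROM   sales_iceberg
      GROUP  BY product_id
      ORDER  BY total DESC
      LIMIT  10;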

  • “It was much easier to move to Postgres than we thought it would be.”

    The Wyoming Department of Transportation (WYDOT) improved performance 4x by migrating from Oracle to Crunchy Postgres. On Oracle, WYDOT struggled with uptime, scaling and costs, and a lack of support. WYDOT was introduced to Postgres on a joint project and quickly prioritized migrating their core data stack from Oracle to Postgres. Since completing the migration, WYDOT has better uptime and more options than ever before.

    Read more: https://lnkd.in/gVYj8JXW

  • We're proud to be a Gold Sponsor at Red Hat Summit Connect in Darmstadt next month. This flagship open source event in Germany offers the opportunity to network with like-minded people, learn about the latest Red Hat technologies, and engage with industry leaders to explore the future of technology. Join us on November 19, 2024 and discover the world of open source technologies. https://lnkd.in/exEecfgv #RHSummit #OpenSource

  • Most queries against a database are short lived. Whether you're inserting a new record or querying for a list of upcoming tasks for a user, you're not typically aggregating millions of records or sending back thousands of rows to the end user. A typical short-lived query in Postgres can easily complete in a few milliseconds or less. But lying in wait is a query that can bring everything crashing to a crawl.

    Queries that run too long often create cascading effects. Most commonly these queries take one of four forms:

    • an intensive BI/reporting query that scans a lot of records and performs some aggregation
    • a database migration that inadvertently updates a few too many records
    • a miswritten query that wasn't intended to be a reporting query, but is now joining a few million records
    • a runaway recursive query

    Each of these queries is likely to scan a lot of records and churn the cache within your database. It may even spill from memory to disk while sorting data, or, worse, hold locks so new data can't be written.

    Enter your key defense to keep your PostgreSQL database safe from these disaster situations: the statement_timeout configuration parameter. You can set this value at the database, user, or session level, which makes it easy to have a sane default while overriding it intentionally for long-running queries (examples below). If you haven't already set this on your Postgres database, what are you waiting for?
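
    A minimal sketch of setting statement_timeout at each level; the database and role names are hypothetical:

      -- Default for every connection to a database (hypothetical name).
      ALTER DATABASE app_db SET statement_timeout = '30s';

      -- Stricter default for a specific role (hypothetical name).
      ALTER ROLE web_user SET statement_timeout = '5s';

      -- Override for the current session, e.g. for a known long-running report.
      SET statement_timeout = '10min';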

Funding

Crunchy Data: 1 total round
Last round: Series A