Building a data platform on Kubernetes? Looking for a leg up on automating the process? Josh Lee and I will be showing how Terraform, Helm, and Argo CD can help you automate setup of analytic stacks based on ClickHouse. Join us on July 23 to find out more! https://lnkd.in/grRrF32j #opensource #clickhouse #kubernetes #terraform #opentofu #helm #argocd #analytics
Robert Hodges’ Post
More Relevant Posts
💡 At LoopStudio, we believe that efficient data processing architectures are the backbone of any successful ML solution. Recently, we implemented a powerful system that integrates Snowflake, DBT, Apache Airflow, and AWS services like SQS, Lambda, and Aurora to solve complex data challenges for one of our clients. Here’s a sneak peek of what we covered:
1) Data Collection: High-quality data is the foundation for any ML project.
2) Data Processing: Efficiently transforming raw data into valuable insights.
3) Data Storage: Choosing the right solution to ensure scalability and performance.
4) Machine Learning in Action: Leveraging data to drive innovation and business decisions.
Want to know more? Check out the full breakdown in our latest blog post and learn how we build scalable, resilient architectures that push the boundaries of what's possible ➡️ https://lnkd.in/drM2R86W
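For a flavor of how these pieces can fit together, here is a minimal Airflow DAG sketch, not LoopStudio's actual implementation: it assumes raw files have already landed via the SQS/Lambda path, loads them into Snowflake, and then runs dbt models. The DAG id, stage and table names, and dbt paths are placeholders.

```python
# Hypothetical sketch (not LoopStudio's code): an Airflow DAG that loads raw data
# already staged by the SQS/Lambda path into Snowflake, then runs dbt models.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="raw_to_insights",          # assumed name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    # Load staged raw data into a Snowflake table (stage/table names are assumptions).
    load_raw = SnowflakeOperator(
        task_id="load_raw",
        snowflake_conn_id="snowflake_default",
        sql="COPY INTO raw.events FROM @raw_stage FILE_FORMAT = (TYPE = JSON);",
    )

    # Transform the raw data into analytics models with dbt.
    run_dbt = BashOperator(
        task_id="run_dbt",
        bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
    )

    load_raw >> run_dbt
```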
Are you looking for a way to simplify your data engineering processes? DoubleCloud is here to save you time and effort by maintaining your data infrastructure. Introducing the Managed Apache Airflow service! Say goodbye to the headaches of managing Airflow. With DoubleCloud, you get:
- Regular software updates
- Automatic scaling
- Continuous monitoring
DoubleCloud accepts custom Docker images, allowing you to easily incorporate your workflow-specific libraries into Airflow running on a cluster. And the best part? This service is currently FREE! Join the DoubleCloud community today: apply for early access via this page or follow the steps below -- https://bit.ly/3Kg2Ayq
- Sign up on double.cloud.
- Create a cluster.
- Select the Airflow service and apply for the preview.
Check out the GIF below to see how easy it is to get started. Let's accelerate your CI/CD processes together with DoubleCloud! #data #ai #dataengineering #doublecloud #theravitshow
30 Days DevOps Challenge - Day 1: Weather Dashboard. For the first task, I built an application that fetches real-time weather data like temperature, humidity, and weather conditions for multiple cities and automatically stores the weather data in S3. DeShae Lyda, thanks for making it straightforward. Check out my submission on GitHub: https://lnkd.in/ghb2MTVC #DevopsAllStarsChallenge #s3 #API
GitHub - PreciousDipe/weather-dashboard: Day 1 of 30 Days DevOps Challenge: Weather Dashboard
github.com
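For anyone who wants the gist without opening the repo, here is a minimal sketch of the pattern described above, not the repository's actual code: fetch current conditions for a few cities from the OpenWeather API and write the raw JSON to S3. The environment variable names, bucket name, and city list are placeholders.

```python
# Minimal sketch of the pattern in the post (not the repo's actual code):
# fetch current weather for a few cities and store the raw JSON in S3.
import json
import os
from datetime import datetime, timezone

import boto3
import requests

API_KEY = os.environ["OPENWEATHER_API_KEY"]          # placeholder env var
BUCKET = os.environ.get("WEATHER_BUCKET", "my-weather-dashboard-bucket")
CITIES = ["London", "Lagos", "New York"]

s3 = boto3.client("s3")

def fetch_weather(city: str) -> dict:
    """Return temperature, humidity, and conditions for a city."""
    resp = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": city, "appid": API_KEY, "units": "metric"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def save_to_s3(city: str, payload: dict) -> None:
    """Write the raw API response to S3, keyed by city and timestamp."""
    key = f"weather/{city}/{datetime.now(timezone.utc):%Y-%m-%dT%H%M%S}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(payload))

if __name__ == "__main__":
    for city in CITIES:
        data = fetch_weather(city)
        print(city, data["main"]["temp"], data["main"]["humidity"])
        save_to_s3(city, data)
```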
Leveraging Redis for processing: Workato recipe understanding and analysis of hardcoded values in recipes using the Workato Copilot resource. https://lnkd.in/gFV5_Ayr https://lnkd.in/gtkdQk-p https://lnkd.in/gkeHR2aB #productivityhack #workato #automation #integration #genaiusecase
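The linked posts cover the Workato Copilot flow itself; as a loose illustration of where Redis can fit, here is a hedged Python sketch that caches per-recipe analysis results in Redis so repeated lookups of hardcoded values do not re-scan the recipe JSON. The scan logic, key names, and recipe structure are assumptions, not Workato's API.

```python
# Illustrative only: cache per-recipe analysis results in Redis so repeated
# lookups of hardcoded values skip re-scanning the recipe JSON.
# Key names and the naive scan are assumptions, not Workato's actual format.
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def find_hardcoded_values(node) -> list[str]:
    """Naive recursive scan that collects literal string values from recipe JSON."""
    if isinstance(node, str):
        return [node]
    if isinstance(node, dict):
        return [v for child in node.values() for v in find_hardcoded_values(child)]
    if isinstance(node, list):
        return [v for child in node for v in find_hardcoded_values(child)]
    return []

def analyze_recipe(recipe_id: str, recipe: dict) -> list[str]:
    """Return cached analysis if present; otherwise compute it and cache for an hour."""
    cache_key = f"recipe:{recipe_id}:hardcoded"
    cached = r.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    result = find_hardcoded_values(recipe)
    r.set(cache_key, json.dumps(result), ex=3600)
    return result
```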
Databricks’ recommended approach to MLOps may be confusing when you look at the diagram. This is what they actually suggest:
➡ Trunk-based development: short-lived feature branches (developers work in the Development workspace). PRs go directly into the main branch.
➡ At PR to main, the CI pipeline is triggered. Unit tests and integration tests (in the Staging workspace) run. After all the tests are green and the changes are approved, the PR is merged into main.
➡ After the PR is merged into main, the CD pipeline runs to deploy to the Staging workspace.
➡ When the deployment to Staging works as expected, the main branch is merged into the release branch.
➡ A push to the release branch, or creating it, triggers the deployment to the Production workspace.
The Asset Bundle is what actually gets deployed. As shown in the picture, it consists of a Model train-deploy Workflow, a Batch inference Workflow, and a Monitoring Workflow. I believe this approach would work for most teams. We follow a similar git branching and deployment strategy, except we use tag-based deployment to Production. 💎 Now it is easy for everyone to follow this approach if you use mlops-stacks to create a new project. Simply run “databricks bundle init mlops-stacks”. It is currently in Public Preview. Check out the details here: https://lnkd.in/efJHe2YG #databricks #mlops #machinelearning
All good points in the Databricks MLOps post above.
What we learned after running Airflow on Kubernetes for 2 years #apacheairflow #kubernetes #dataengineering #dataplatform #workflow https://lnkd.in/eZHkYQ9r
What we learned after running Airflow on Kubernetes for 2 years
medium.com
🚀 Day 36 of #90daysofdevopschallenge: Managing Persistent Volumes in Your Deployment! 💥 Excited to delve into the intricacies of Persistent Volumes in Kubernetes and optimize storage utilization for enhanced application performance. Let's harness the power of Persistent Volumes to ensure seamless data storage and retrieval within our deployments. 🙌🔥 #devops #learningandgrowing #90daysofdevops #90daysofdevopschallenge #trainwithshubham #LearningIsFun #k8s #Kubernetes #LearningJourney #configmap #secrets #pvc #pv #persistentvolume #persistentvolumeclaim 🛠️📂
Day 36 - Unlocking Data Persistence: Mastering Persistent Volumes in Kubernetes! 🗃️💡
nilkanth1010.hashnode.dev
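The write-up presumably works with YAML manifests; as a small companion sketch, here is the same idea expressed with the official Kubernetes Python client: creating a PersistentVolumeClaim that a pod can then mount. The claim name, namespace, storage class, and size are placeholders.

```python
# Companion sketch (the blog itself likely uses YAML manifests): create a
# PersistentVolumeClaim with the official Kubernetes Python client.
# Claim name, namespace, storage class, and size are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="app-data-pvc"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "1Gi"}),
        storage_class_name="standard",  # depends on the cluster
    ),
)

v1.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)

# A pod then references the claim in a persistentVolumeClaim volume, which is
# what keeps its data around across pod restarts and rescheduling.
```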
Community Spotlight! Georg Heiler shared a powerful template that brings together some of the best open-source tools in data engineering: DuckDB, Dagster, dbt, and Rust-based tooling for ensuring code quality. What makes it particularly noteworthy is its use of Pixi for seamless dependency management - eliminating one of the biggest headaches in setting up local development environments. While the stack runs locally, it's designed with scale in mind. The template can be readily deployed to any cloud platform using Kubernetes, complete with intelligent partition handling for optimal performance. This flexibility makes it an ideal starting point for teams that want to start small but plan for growth. Check it out today! Link in Comments
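This is not the template itself (see the link in the comments), just a hedged sketch of the flavor it describes: a pair of Dagster assets that build tables in a local DuckDB file. The file paths, table names, and column names are assumptions, and in the template a transformation like the second asset would typically be handled by dbt.

```python
# Hedged sketch of the Dagster + DuckDB flavor described above (not the template's code).
# File paths, table names, and column names are assumptions.
import dagster as dg
import duckdb

@dg.asset
def raw_trips() -> None:
    """Load a local CSV into a DuckDB table (path is a placeholder)."""
    con = duckdb.connect("warehouse.duckdb")
    con.execute(
        "CREATE OR REPLACE TABLE raw_trips AS "
        "SELECT * FROM read_csv_auto('data/trips.csv')"
    )
    con.close()

@dg.asset(deps=[raw_trips])
def trips_per_day() -> None:
    """Downstream aggregate; in the template this kind of model would live in dbt."""
    con = duckdb.connect("warehouse.duckdb")
    con.execute(
        "CREATE OR REPLACE TABLE trips_per_day AS "
        "SELECT CAST(pickup_at AS DATE) AS day, COUNT(*) AS trips "
        "FROM raw_trips GROUP BY 1"
    )
    con.close()

# Register both assets so `dagster dev` can materialize them locally.
defs = dg.Definitions(assets=[raw_trips, trips_per_day])
```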
Join me tomorrow to see what we've learned building a platform for big and fast data on Kubernetes. #opensource #realtime #analytics #kubernetes #bigdata
Don’t forget to RSVP to tomorrow's #DoK talk, where Robert Hodges will present on “Building a Kubernetes Platform for Trillion Row Tables.” Click here to RSVP: https://lnkd.in/geBS8-aw #Kubernetes #RunDoK
Building a Kubernetes Platform for Trillion Row Tables-- CN Data at Scale, Thu, Jun 13, 2024, 10:00 AM | Meetup
meetup.com