Please join us in welcoming Ben Lucas to Illumination Works! Ben joins our Commercial Division, where he will be leveraging his data engineering talents on a supply chain logistics project. Ben brings a wealth of experience in data visualization, SQL, Python, and ETL technologies. Ben values maximizing quality and efficiency to provide accurate, streamlined data flows for reporting. Welcome to the team! #ILW #Welcome
Illumination Works’ Post
More Relevant Posts
-
AI & Data Marketing Maven: Turning Your Tech into Talk with a Dash of Humor and a Heap of Results – Let's Connect!
Why Data Modeling is the Secret Superpower Every Software Engineer Needs! ▸ https://lttr.ai/ATgwS #DataReignsSupreme #DataModeling #DigitalAge #SoftwareEngineering #WellOiledMachineDriving #MakingDataUnderstandable #HarmonizesRawData #DataMasteryStarts #EffectiveDataModeling #SecretSuperpower
-
Senior Platform Engineer | Senior Cloud Infrastructure Engineer | Data Platform Engineer | AWS | GCP | Kubernetes | Terraform | Golang | Kafka | Flink | Java
I was looking for a place to eat in my area when I saw that this restaurant is close to my house. If you are a Data Engineer working with Lakehouse architectures, I think this is the restaurant you should be eating at. #dataengineering #cloudcomputing #coding #programming
-
Planning on breaking into the data engineering field? Here are some fundamental skills you should know, based on my experience:
- Data Warehousing concepts (star schemas, dimensions and facts, SCDs, full load vs. incremental)
- SQL (joins, aggregates, unions)
- Python (data manipulation techniques -- knowing how to read files in various file formats and apply transformations; see the short pandas sketch below)
These technical skills are game changers and have definitely proven useful in my career.
Also, these soft skills:
- a growth mindset (stay humble, always be curious)
- hacking skills (problem solving)
- communication and collaboration
- being proactive (ownership of tasks and projects)
10/10 I would choose to work with those who possess these soft skills over those who possess purely technical skills. The right attitude is something you can't easily teach.
--
Join Data Engineering Pilipinas 🇵🇭 to kickstart your journey! #dataengineering #DataEngineeringPilipinas #community
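To make the Python bullet above concrete, here is a minimal, hypothetical pandas sketch of reading two file formats, joining them (the SQL-style join), and aggregating; all file and column names are invented for illustration, not taken from the post.

```python
# Minimal sketch of "read files in various formats and apply transformations".
# File names and column names here are hypothetical.
import pandas as pd

# Read the same logical dataset from two common formats.
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])
customers = pd.read_parquet("customers.parquet")

# Join (like a SQL JOIN), then aggregate (like GROUP BY ... SUM).
joined = orders.merge(customers, on="customer_id", how="left")
daily_revenue = (
    joined.groupby(["order_date", "country"], as_index=False)["amount"].sum()
          .rename(columns={"amount": "total_amount"})
)

print(daily_revenue.head())
```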
-
DATA ENGINEER | PYTHON | PYSPARK | AZURE DATA BRICKS | SQL | PANDAS | NUMPY | DATABRICKS | DATA FACTORY |
Ready to dive into the world of data engineering with 2 years and 3 months of valuable experience? 👩‍💻💼 Here's a snapshot of my journey:
- Expertise in database management and optimization for seamless operations
- Proficient in data modeling and analysis to drive strategic decision-making
- Skilled in Python, SQL, and ETL tools for efficient data processing
Let's connect, share insights, and explore new opportunities together! 🚀 #DataEngineering #CareerGrowth #Networking
-
🚀 Data engineering meets teamwork! 📊 Imagine your data as a puzzle 🧩, with each piece needing to fit just right. Now, let's talk about how a data engineer can seamlessly make that data available for a quality assurance check by a data analyst. 🔍
💡 Picture this: the data engineer has ingested the data into a PySpark DataFrame, let's call it raw_df. Now, they need to give the data analyst access to this data in SQL form, but just for the duration of the Spark session. 🕒✨
🔧 The magic command here is raw_df.createOrReplaceTempView("raw_df").
📊 It's like setting up a temporary exhibit in a museum 🏛️ - the data is available for analysis, but only for this specific session. Once the session ends, the exhibit disappears, keeping things tidy and efficient. 🧹
🌟 So, with this command, the data engineer ensures smooth collaboration between teams while maintaining control over data access and integrity. 🛠️💻 #DataEngineering #DataAnalysis #Teamwork #SparkSession 📈✨
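A minimal sketch of the pattern this post describes, assuming a CSV source and an order_id column purely for illustration:

```python
# The engineer registers a session-scoped temp view; the analyst queries it
# with plain SQL. The path and column names are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("qa-handoff").getOrCreate()

# Data engineer: ingest raw data into a DataFrame.
raw_df = spark.read.option("header", True).csv("/data/raw/orders.csv")

# Expose it to SQL users for the lifetime of this Spark session only.
raw_df.createOrReplaceTempView("raw_df")

# Data analyst: run a quality-assurance check in SQL against the temp view.
null_check = spark.sql("""
    SELECT COUNT(*) AS null_order_ids
    FROM raw_df
    WHERE order_id IS NULL
""")
null_check.show()
```

Because the view is registered with createOrReplaceTempView, it disappears when the Spark session ends, which is exactly the "temporary exhibit" behavior described above.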
-
SOLID principles do apply to Data Engineers and Platforms. Or maybe not, you be the judge... 👇
- Single Responsibility: use Data Marts, each focusing on one business case.
- Open/Closed: when adding a new data source, add a new pipeline; existing ones should not need to be changed.
- Liskov Substitution: decommissioning a dashboard should not require a six-month sign-off process! Each component should be swappable with minimal effort.
- Interface Segregation: prefer views, stored procedures, table functions, and other APIs over direct access to table objects.
- Dependency Inversion: provide an abstraction layer between users and data, even for Python power users and Parquet objects (a small Python sketch of this idea follows below).
I found the L and D to be the most challenging ones. They are not purely engineering and require close alignment with the Business.
What is your experience with them? Let me know below! #dataengineering #data #dataanalytics
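As a rough illustration of the Dependency Inversion point (not the author's implementation), here is a small Python sketch in which report code depends on an abstract reader rather than on Parquet files directly; all class, path, and column names are hypothetical.

```python
# Consumers depend on an abstraction, not on where or how the data is stored.
from abc import ABC, abstractmethod
import pandas as pd


class SalesReader(ABC):
    """Abstraction that dashboards and notebooks code against."""

    @abstractmethod
    def load_sales(self) -> pd.DataFrame: ...


class ParquetSalesReader(SalesReader):
    """One concrete implementation; could be swapped for a warehouse-backed one."""

    def __init__(self, path: str):
        self.path = path

    def load_sales(self) -> pd.DataFrame:
        return pd.read_parquet(self.path)


def monthly_report(reader: SalesReader) -> pd.DataFrame:
    # Depends only on the interface, so the storage layer can change freely.
    sales = reader.load_sales()
    return sales.groupby("month", as_index=False)["amount"].sum()
```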
-
Excellent conversation between Steven Johnson and Julia Salinas, MBA. This is just a short clip, but the whole conversation is great! To piggyback off of her point here, in my experience analytics engineers provide the base layer of business logic to be consumed by analysts and data scientists. This consists of a lot of data modeling as well as domain knowledge (heavy SQL). Data engineers, on the other hand, are more focused on building and designing end-to-end pipelines for data loads and potential transformations, before the business logic is applied (heavy Python and SQL). That being said, it is never easy to generalize, and odds are there is a lot of overlap for most people in these positions.
-
You're not a Data Engineer until you have…
✅ Built a dataset that no one uses
✅ Cried because your source data schema changed
✅ Archived a dozen tables to cold storage and then realized you actually still need them
✅ Spent hours looking for a missing comma in a SQL query
✅ Fought with your Airflow environment for hours just to install a Python package
✅ Had at least one nightmare about a SQL query that never finishes running
✅ Tried to explain your job to your friends, given up, and just said “I’m a Software Engineer” instead
Data Engineers of LinkedIn, what did I forget? #dataengineering #dataengineer
-
Data Engineer @ Wells Fargo || PySpark, Alteryx, AWS, Stored Procedures, Hadoop, Python, SQL, Airflow, Kafka, Iceberg, Delta Lake, Hive, BFSI, Telecom
𝗗𝗮𝘁𝗮 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 𝗝𝗼𝘂𝗿𝗻𝗲𝘆: 📌 𝗗𝗔𝗬 𝟳𝟲/𝟵𝟬
📢 𝗗𝗮𝘁𝗮𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 -- 𝗣𝘆𝘀𝗽𝗮𝗿𝗸
🚩 𝗤𝘂𝗲𝘀𝘁𝗶𝗼𝗻: 𝗖𝗹𝘂𝘀𝘁𝗲𝗿𝗶𝗻𝗴 𝗸𝗲𝘆𝘀 𝗮𝗻𝗱 𝗰𝗼𝗺𝗽𝗼𝘀𝗶𝘁𝗲 𝗸𝗲𝘆𝘀 𝗶𝗻 𝗦𝗻𝗼𝘄𝗳𝗹𝗮𝗸𝗲
---------------------------------------------------------
𝗖𝗹𝘂𝘀𝘁𝗲𝗿𝗶𝗻𝗴 𝗞𝗲𝘆𝘀
Clustering keys in Snowflake are used to improve the performance of queries on very large tables by co-locating similar rows in the same micro-partitions. Here are some key points:
𝗗𝗲𝗳𝗶𝗻𝗶𝘁𝗶𝗼𝗻: A clustering key is a subset of columns in a table (or expressions on a table) that are explicitly designated to co-locate the data in the table in the same micro-partitions.
𝗣𝘂𝗿𝗽𝗼𝘀𝗲: They help improve scan efficiency by skipping data that does not match filtering predicates, and they can also enhance column compression.
𝗨𝘀𝗮𝗴𝗲: Clustering keys are particularly useful for very large tables where the natural clustering has degraded over time due to extensive DML operations.
𝗜𝗺𝗽𝗹𝗲𝗺𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻: ALTER TABLE sales CLUSTER BY (order_date, customer_id);
𝗖𝗼𝗺𝗽𝗼𝘀𝗶𝘁𝗲 𝗞𝗲𝘆𝘀
𝗣𝘂𝗿𝗽𝗼𝘀𝗲: Composite keys are used to uniquely identify rows in a table by combining multiple columns. They can be used as primary keys or unique keys.
𝗨𝘀𝗮𝗴𝗲: Useful when a single column is not sufficient to uniquely identify a row. Composite keys ensure data integrity and can be used in joins and other operations.
𝗘𝘅𝗮𝗺𝗽𝗹𝗲: A table with order_id and product_id as a composite key ensures that each product in an order is uniquely identified: ALTER TABLE orders ADD CONSTRAINT unique_order_product UNIQUE (order_id, product_id);
#SQL #DataEngineering #python #TechTips #dataengineer #Pyspark #Pysparkinterview #Bigdata #BigDataengineer #interview #sparkdeveloper #sparkbyexample #ApacheKafka #Kafka #BigData #StreamingData #DataArchitecture #EventDriven #RealTimeAnalytics #DataIntegration #DataPipeline
-
Solution Architect & Data Engineering Guy || Snowflake & Matillion Certified || Driving Data-Driven Innovation & Cloud-Based Solutions || SQL, Python, Snowflake, Matillion, DBT, ADF, AWS, MSBI, Spark & Data Modeling
You don't need to create complex pipelines to showcase in your resume at the early stage of a data engineering career. Just get hands-on with the following.
Work on source connections:
1- Connect your ETL/ELT tool with S3 & ADLS.
2- How will you extract data from SFTP/FTP?
3- Connect to S3/ADLS using Python.
4- Connect to your on-prem databases using an ETL tool or Python.
5- How do you extract data from APIs?
6- Read CSV/Parquet files using Spark.
7- Download sample data from Kaggle and use pandas for your analytics.
Second part, transformations:
1- How will you combine data from multiple sources?
2- How do you handle nulls and missing values?
3- How do you implement SCD 2 & CDC using tools or scripts?
4- Work on incremental/full loads and delta loads.
5- Work on data cleansing like aggregation, filtering, and data validation.
Then unload all of this data to on-prem or your warehouse, or unload it to a data lake (S3/Blob/GCS etc.).
Break the complex process into small chunks and then implement them. It will give you confidence and keep you motivated on your data engineering journey. (A minimal PySpark sketch covering a few of these steps follows below.)
#dataengineering #etl #elt #dataengineer #careerdevelopment #learnings #projectideas
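As a starting point, here is a minimal PySpark sketch touching a few of the steps above (reading a CSV, an incremental filter, basic cleansing, and a Parquet unload). The paths, columns, and watermark value are hypothetical placeholders, not a prescribed setup.

```python
# Starter pipeline sketch: extract, incremental filter, cleanse, unload.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("starter-pipeline").getOrCreate()

# Extract: read a raw CSV drop (could equally be an s3a:// or abfss:// path).
raw = spark.read.option("header", True).csv("/landing/orders.csv")

# Incremental load: keep only rows newer than the last successful load.
last_loaded = "2024-01-01"  # in practice, read this watermark from a control table
incremental = raw.filter(F.col("updated_at") > F.lit(last_loaded))

# Cleansing: handle nulls and missing values, then a simple validation filter.
cleaned = (
    incremental
    .fillna({"country": "UNKNOWN"})
    .filter(F.col("order_id").isNotNull())
)

# Load: unload the result as Parquet to the lake/warehouse staging area.
cleaned.write.mode("append").parquet("/lake/staging/orders")
```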