Machine learning education is broken, especially for those who aspire to start solving real-world problems at a company.

Most classes, courses, and books start with a dataset and show you how to train a model.

dataset → model

This is, at best, 5% of the work you'll need to do.

Real-life problems never start with a "dataset," and they never end after you finish training a model.

I've never seen a company with a "dataset" ready to go. In fact, most companies don't even have any data at all. It's your job to determine what data you need and how to collect it.

Here is a simplified process that will give you a better idea of how people solve real problems:

problem → framing → data → model → feedback → repeat

Before understanding the problem and deciding how you'll frame it to solve it, you can't start thinking about datasets.

A few other challenges:

1. How do you get data from its source?
2. Is the data diverse enough to solve the problem?
3. Do you have enough data?
4. How is the data biased?
5. How frequently does the data change?
6. How sensitive is the data?
7. Are there missing, inconsistent, or incorrect values?
8. How noisy is the data?
9. How can you trace back every piece of data to its source?
10. Are there any legal restrictions on the use of the data?
11. How do you scale as data grows?
12. How quickly does the data become stale?

Building systems that work requires a lot of effort. I wish more people would talk about this.
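
A couple of the questions in that list (7 and 12 in particular) can be partly checked by a script once you have collected some data. Here is a minimal sketch, assuming records arrive as a list of Python dicts. The function name `audit_records`, the field names, and the 30-day staleness threshold are all hypothetical choices for illustration. It only counts missing values, exact duplicate rows, and stale rows; it says nothing about bias, diversity, or legal restrictions, which need human judgment.

```python
from datetime import datetime, timedelta


def audit_records(records, required_fields, timestamp_field, now, max_age_days=30):
    """Count missing values, exact duplicate rows, and stale rows.

    Illustrative only: the thresholds and field names are assumptions.
    """
    missing = {field: 0 for field in required_fields}
    seen = set()
    duplicates = 0
    stale = 0
    cutoff = now - timedelta(days=max_age_days)

    for record in records:
        # Missing values: treat None and the empty string as missing.
        for field in required_fields:
            if record.get(field) in (None, ""):
                missing[field] += 1

        # Exact duplicates: compare whole rows, independent of key order.
        key = tuple(sorted(record.items(), key=lambda item: item[0]))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

        # Staleness: rows collected before the cutoff date.
        timestamp = record.get(timestamp_field)
        if timestamp is not None and timestamp < cutoff:
            stale += 1

    return {
        "rows": len(records),
        "missing": missing,
        "duplicates": duplicates,
        "stale": stale,
    }


# Made-up example records, not real data.
now = datetime(2024, 1, 31)
records = [
    {"user_id": "a1", "amount": 12.5, "collected_at": datetime(2024, 1, 30)},
    {"user_id": "a2", "amount": None, "collected_at": datetime(2023, 11, 1)},
    {"user_id": "a1", "amount": 12.5, "collected_at": datetime(2024, 1, 30)},
    {"user_id": "", "amount": 3.0, "collected_at": datetime(2024, 1, 15)},
]
report = audit_records(records, ["user_id", "amount"], "collected_at", now)
print(report)
```

A report like this is a starting point for the feedback step of the loop, not a replacement for understanding where the data came from.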