Things I Wish I Knew Before I Started With Machine Learning

Looking back at my journey so far, here are a few things that would have made my life easier.

It is not easy
You’re gonna spend a lot of time searching through resources. Just running an algorithm from the caret library isn’t enough. You have to understand the maths behind it, and that takes a lot of careful reading and mental effort. Any resource that tells you “you don’t need to learn maths” is probably not worth the effort.

Machine learning is just one part of data science
Data science as a whole is a much bigger game, and machine learning is just one bit of it. Even though it is not officially defined, its building blocks include data pre-processing/cleansing, feature engineering, modeling, model training, model validation, and presentation of results. It is critical to move beyond predefined techniques and develop an intuition for all the factors in data science.
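As a rough sketch of how these building blocks fit together, here is a toy pipeline in plain Python. Everything in it is an assumption made for illustration: the synthetic data, the function names, the one-feature linear model, and the choice to reduce feature engineering to a simple rescale. Real projects will look very different, but the order of the stages is the point.

```python
import random

def make_raw_data(n=200, seed=0):
    """Synthetic (x, y) records with noise and some missing x values (invented data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 3.0 * x + 2.0 + rng.gauss(0, 1)
        if rng.random() < 0.05:
            x = None  # simulate a missing measurement
        rows.append((x, y))
    return rows

def preprocess(rows):
    """Pre-processing/cleansing: drop records with a missing feature."""
    return [(x, y) for x, y in rows if x is not None]

def engineer_features(rows):
    """Feature engineering, reduced here to rescaling x from [0, 10] to [0, 1]."""
    return [(x / 10.0, y) for x, y in rows]

def split(rows, validation_fraction=0.2, seed=1):
    """Shuffle, then hold out a fraction of the rows for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]

def train(rows):
    """Model training: closed-form least squares for a single feature."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def validate(model, rows):
    """Model validation: mean squared error on held-out rows."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in rows) / len(rows)

rows = engineer_features(preprocess(make_raw_data()))
train_rows, validation_rows = split(rows)
model = train(train_rows)
mse = validate(model, validation_rows)

# Presentation of results, in its most basic form.
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} validation MSE={mse:.2f}")
```

Notice that the actual model fitting is only one small function out of six.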

Mostly, you will be occupied with preparing data sets
When you’re doing machine learning, the majority of your time will be spent preparing data: extracting data from multiple sources, cleaning datasets, aligning different datasets, mapping attributes to different dimensions, etc.
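To give a feel for what that preparation looks like, here is a minimal sketch using only the Python standard library. The two CSV snippets, the column names and the cleaning rules are hypothetical, chosen only to show the typical chores: removing duplicates, handling missing values, normalising codes, and aligning two sources that name the same key differently.

```python
import csv
import io

# Hypothetical raw exports from two systems; names and values are made up.
USERS_CSV = """user_id,age,country
1,34,US
2,,in
3,29,US
3,29,US
"""

ORDERS_CSV = """uid,amount
1,20.50
1,5.00
3,12.25
4,99.00
"""

def load_rows(text):
    """Extract: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_users(rows):
    """Clean: drop duplicate ids, upper-case country codes, turn blank ages into None."""
    seen, cleaned = set(), []
    for row in rows:
        user_id = int(row["user_id"])
        if user_id in seen:
            continue
        seen.add(user_id)
        age = int(row["age"]) if row["age"].strip() else None
        country = row["country"].strip().upper()
        cleaned.append({"user_id": user_id, "age": age, "country": country})
    return cleaned

def total_spend(rows):
    """Aggregate order amounts per user; this source calls the key 'uid'."""
    totals = {}
    for row in rows:
        uid = int(row["uid"])
        totals[uid] = totals.get(uid, 0.0) + float(row["amount"])
    return totals

def build_dataset(users_text, orders_text):
    """Align: keep every known user and attach their spend (0.0 if no orders)."""
    users = clean_users(load_rows(users_text))
    spend = total_spend(load_rows(orders_text))
    return [dict(user, total_spend=spend.get(user["user_id"], 0.0)) for user in users]

dataset = build_dataset(USERS_CSV, ORDERS_CSV)
for row in dataset:
    print(row)
```

Even in this tiny example there are decisions to make: the order from unknown user 4 is silently dropped, and the missing age is kept as None instead of being guessed. Choices like these are where most of the time goes.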

Visualizations are important
Running all the algorithms is cool, but the results are not of much use if they cannot be presented effectively to people who are not from a machine learning background. Executives and product managers are probably not gonna spend time looking at how your R console looks. Creating visualizations that consider both technical and non-technical audiences is crucial in machine learning.

There are libraries for EVERYTHING, you just need to find them
Most of the popular algorithms are already implemented in R and Python. These implementations serve as a great point of reference for performance. They also help a lot in understanding the configuration options of each algorithm and seeing how they affect your model. Sure, you can code up an algorithm yourself, but do it for the sake of learning or only if required.

Most of the valuable material is in books and papers
Most of the online resources (including this post) will never give you the “whole deal”. They usually work only if you follow exactly what they say. But they don’t give much intuition on how to extend it beyond the example presented in the article. Eventually you will start reading books to get a better understanding.

Following the path of machine learning is a big commitment in terms of time and effort. Be prepared for it.


Chinmay Kulkarni

Full stack polyglot developer and amateur guitarist. I like to code in Python, golang and JavaScript. Currently, I am pursuing an MS from Carnegie Mellon University. I am a machine learning enthusiast and an avid open source contributor. You can reach me on GitHub, LinkedIn and Twitter.

