I just realized how misleading tutorials can be and I had to write down my thoughts on the topic.
Read a book! At least the chapter that covers the topic you need. 🙂
Recently I started to write my own notebook/book about Machine Learning and Computer Vision algorithms, so I can practice and keep my knowledge up to date. I discover (re-discover) the algorithms at the lowest level possible, and I concentrate on the details and the math. After every chapter I also implement the algorithm from scratch in Python. I am not optimizing the code; I keep it simple, verbose and understandable. Should I make it public?
While I started doing this project for myself, I wanted to see how others write tutorials on the algorithms or problems.
Take Linear Regression, for example. Simple, isn’t it? It is a basic algorithm which I knew very well, but I still wanted to include it in my notes. So I started reading tutorials to see how others explain it. I read more than 20 tutorials on the internet! After that, you would think I dream about this technique. But this was not the case.
Linear Regression Tutorial Nightmare
The first thing I noticed is that almost all of the tutorials were about the univariate case (but this was not stated in the title). They refer to it as Simple Linear Regression. Ok…but what about the multivariate case, which is much more useful in the real world? Also, the lack of effort put into the writing was ridiculous. I think someone first wrote a tutorial blog post, and then 15 others copied it and changed a few things. Sometimes there were mistakes, but the authors did not understand the algorithm, so they just kept them. And beginners can’t spot the mistakes, so they learn what they should not.
98% of these tutorials are just for self-advertisement and collecting views. The remaining 2% is quality content with enough low-level and high-level abstraction, but those write-ups are too hard for the people who usually read these blogs, so they do not get much attention (except 1-2). And as a result, they are much harder to find.
After this first disappointment I challenged myself to find a tutorial on Multivariate LinReg with the normal equation solution. Impossible. I found a few which mentioned it, but I could not find a really nice write-up. All of them concentrated on the Gradient Descent method and treated it as if it were the only solution. And this is a big problem; at least they should mention the alternatives. I guess this is because Gradient Descent is a fantastically useful, simple algorithm, while the Normal Equation Method would require explaining the matrix inverse, linear dependency, the pseudo-inverse, etc… And the writers often could not multiply 2 matrices, but they can “intuitively” explain the hiker metaphor, without math of course.
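To make the comparison concrete, here is a minimal sketch of both approaches on the multivariate case. It assumes NumPy, synthetic noiseless data, and the usual least-squares solution theta = pinv(X) y. The function names, learning rate and iteration count are my own illustrative choices, not taken from any particular book or tutorial.

```python
import numpy as np


def add_intercept(X):
    """Prepend a column of ones so theta[0] acts as the intercept."""
    return np.column_stack([np.ones(len(X)), X])


def fit_normal_equation(X, y):
    """Closed-form least squares. The pseudo-inverse also copes with
    linearly dependent columns, where a plain inverse would fail."""
    X_b = add_intercept(X)
    return np.linalg.pinv(X_b) @ y


def fit_gradient_descent(X, y, lr=0.1, n_iters=5000):
    """Batch gradient descent on the mean squared error.
    lr and n_iters are hand-picked for this toy data (assumption)."""
    X_b = add_intercept(X)
    theta = np.zeros(X_b.shape[1])
    for _ in range(n_iters):
        gradient = X_b.T @ (X_b @ theta - y) / len(y)
        theta -= lr * gradient
    return theta


# Synthetic data: 200 samples, 3 features, no noise (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([4.0, 2.0, -1.0, 0.5])  # intercept first
y = true_theta[0] + X @ true_theta[1:]

theta_ne = fit_normal_equation(X, y)
theta_gd = fit_gradient_descent(X, y)
print("normal equation: ", theta_ne)
print("gradient descent:", theta_gd)
```

On well-scaled data like this, both should land on essentially the same parameters. The normal equation gets there in one step, while Gradient Descent needs a learning rate and many iterations, which is exactly why a tutorial should at least mention both.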
Enough is enough. At this point I could not bring myself to continue the challenge, so I closed all the browser tabs and opened a good old book. I realized that the book explained the problem and the solution more clearly, in more detail and with fewer words than the online tutorials.
Beginners are afraid to read books like these because they do not use the “kitchen language”. Instead, they think a simple 5 min read tutorial is perfectly enough. Unfortunately it’s not.
A few tips on how to find good tutorials and read them:
- Read more than 6 tutorials from different authors. If 2 look the same, skip one and find another
- Prefer the ones which use equations in the explanation. Of course it depends on the topic itself, but this holds for ML and CV tutorials
- First, read a book (only the chapter which covers the topic), even if you don’t understand everything. Only then look for online tutorials. Even without fully understanding what was written in the book, you’ll be able to “rank” tutorials
- popular != good
- Never trust a tutorial (yes, even this one I just wrote). Always take it with a pinch of salt
Thanks for reading this, and sorry if I was too raw, but this journey really got to me. I don’t like the commercial use of tutorials just to promote a certain platform/API, etc…
Stay clever! 🙂