My First Month as a Junior Data-Scientist

In the middle of the summer I was just surfing the web when I found an interesting startup. “Neuromarketing” was the word that grabbed my attention. Just for fun, I applied for the Data Scientist job (because why not?). After three interviews (and tests), I have now worked my first month with them. 🙂

Before this job, I worked with machine learning and computer vision for a little more than 1.5 years at a startup. But there I learned only from the internet, because I was the only one who knew these things. That had its good and bad sides.

The interviews begin…
In one of the interviews, where I spoke with the R&D Lead, I found out how much I didn’t know and that I had only scratched the surface of data science. But I was really passionate, and I wanted to learn more and more… After the interview I thought I wouldn’t get the job because I had failed some questions. I didn’t know the answers because I only knew how to use ML and DL for various problems; I didn’t know the theory behind them, just a small piece of it.

Now I’ll skip to the interesting part; enough of the background story. 😉

When I walked in on my first day, I imagined I would build all the neural nets I had dreamed about and train the sh*t out of them. It was a sweet dream, but I soon realized this was not what I had signed up for. 😀
One thing I knew for sure: I wanted to learn everything, and even more than that!

In the first and second weeks I talked with almost everyone in the company and started to get to know our data. A lot of data (but as we know, data is power)… I had never seen this many rows and columns before. :O
After that first shock I learned about data pipelines and got a small project: merging data files into a usable dataset and then cleaning it. I didn’t know it would be so hard: whenever I thought the data was clean, the senior members showed me 10 more things to clean. I made some rookie mistakes, like not checking the data types: in a pandas DataFrame the values looked the same, but in one table the column was a string and in the other it was an int.
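
To make that rookie mistake concrete, here is a minimal sketch with made-up toy tables (not our real data). The `user_id` column and both table names are assumptions for illustration only. Depending on the pandas version, merging a string key with an int key either raises an error or quietly matches nothing, so it is worth checking and aligning the dtypes first.

```python
import pandas as pd

# Hypothetical toy tables: the same ID is a string in one table and an int in the other.
users = pd.DataFrame({"user_id": ["1", "2", "3"], "age": [19, 34, 27]})
sessions = pd.DataFrame({"user_id": [1, 2, 4], "clicks": [10, 3, 7]})

# When printed, both key columns look the same; only the dtypes reveal the difference.
print(users.dtypes["user_id"], sessions.dtypes["user_id"])

# Cast the string key to the integer type used by the other table before merging.
users["user_id"] = users["user_id"].astype("int64")

# validate="one_to_one" makes pandas complain if a key is unexpectedly duplicated.
merged = users.merge(sessions, on="user_id", how="inner", validate="one_to_one")
print(merged)
```

Casting explicitly before the merge keeps the check visible in the code, which is easier to review than relying on how a particular pandas version handles mixed key types.
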
At first I worked in Jupyter notebooks, and once the merging worked there, I had to create pipeline tasks (with the Luigi framework) so we could save the intermediate files we needed. After this “session” I started to realize what it really means to be a data scientist.

I have to say it is really interesting, because you need a lot of knowledge to create usable data from raw data. And the other important thing I learned in my first two weeks: cleaning data is a lot harder than it sounds, and it is a lot of work.

After that I continued writing the merging tasks and started looking for correlations in the data (those who are under 20 are more likely to … etc…). Fortunately I am a full member of the R&D team and they listen to my ideas (so I am not that guy who does the boring, monotonous work).

So far I really enjoy working on these problems, and I think I will like it more and more as I learn more. 🙂

If you have any questions about the interviews, the experience, etc., just write a comment! 😉

11 thoughts on “My First Month as a Junior Data-Scientist”

vrtslav
October 28, 2017 at 9:07 pm


Woah, good article! You said you worked a bit with ML and data-science and you’ve found a job. Here is my question, maybe you can answer it: do you think if I am learning (via internet and books as you did) ML and data science for few months, I could try to look for a job in data science in one year? My background is microbiology and biophysics, I like it a lot, however data science is way more fascinating for me. Thanks for your answer in advance!

- Gábor Vecsei
  October 28, 2017 at 9:23 pm
  
  
  Thank you very much!
  I think you should start learning it from courses, and don’t be afraid to ask questions in ML and DL forums. Also concentrate on creating small, publicly available projects so your future employers will know what you are capable of.
  
  - vrtslav
    October 28, 2017 at 9:34 pm
    
    Currently I am taking courses on Udemy, for example (honestly, money very well spent!), and the more I know, the more motivated I am. Now I want to find some ideas for minor projects, as you advised, which I can show to my potential employers. Thanks again for your answer! 🙂
    
  - Gábor Vecsei
    October 28, 2017 at 10:45 pm
    
    It was my pleasure to help you a little bit. (But as I see you don’t need much help 🙂 )
    
Elka Firmanda
October 29, 2017 at 8:36 am


Wow, great. Right now I’m looking for my first job as a data scientist.
I worked on a lot of data mining and some machine learning projects in college.

- Gábor Vecsei
  October 29, 2017 at 1:00 pm
  
  
  Then just go for it! 🙂
  
chaurasiamit
October 29, 2017 at 5:42 pm


I want to be a data scientist. Can you tell me how to get started on becoming a DS?

- Gábor Vecsei
  October 30, 2017 at 8:14 am
  
  
  Learn learn learn. 😉 Just start creating smaller projects as you learn new things. This will do the trick.
  
  - Ida A
    November 2, 2017 at 9:06 pm
    
    I am a Data Analyst. I do data cleaning and data analysis with R and, mostly, Stata. I have a master’s degree in Statistics. I am now interested in Data Science. Can you please advise how I should go about it?
    
Harsha
October 29, 2017 at 6:04 pm


If you can, could you please tell us what types of questions they asked you in the interview that you couldn’t answer? Thanks in advance.

- Gábor Vecsei
  October 30, 2017 at 8:13 am
  
  
  For example: “Could you explain how TF-IDF works?”, “Explain how random forest works”.
  
  I had always used these things; I just couldn’t explain in depth how they work. This was my biggest problem.
  
GáborVecsei