The world of tech has witnessed an extraordinary surge in the popularity of Data Science in recent years. More and more people are taking academic courses in the discipline, or else repurposing their skills to integrate data analysis and manipulation into their line of work.
In spite of this, the topic remains relatively nebulous for beginners. This article is intended for just those beginners, and particularly those who are interested in Data Science and would like to learn more (either independently or via a university, school or bootcamp), but are not sure where to start.
This is not intended as a breakdown of the fields that make up Data Science (much less as an exhaustive overview), but as a guide to getting started. We will present a number of steps which, followed in order, will furnish you with the basic skills to start Data Science in earnest, and also with a robust sense of what Data Science is and whether it’s right for you.
If you intend to take this any further after the final step, well – the sky is the limit!
Step 0: Do Your Research
We are calling this Step 0 because most people who are reading this article have probably already completed it, or are in the process of completing it. If this is not the case, however, this is where you’ll have to start.
The field of Data Science is not as mystical as it sounds, and contrary to what many take for granted, you don’t need an outstanding talent in mathematics to practice it. At the same time, it’s also not for everyone, and it’s certainly not the most intuitive of topics. So it’s not inconceivable that someone may commit to a learning program only to discover after a month that this is not what they really want to do.
Start with a few of the basic “What Is Data Science” articles/videos you will find by the handful on Google – here is one video to get you started, and another one that is slightly more polemical. Once you have the basics down, get on social media and ask around. There is an excellent subreddit about Data Science, including a pinned “Weekly Entering & Transitioning Thread” where you can ask beginner questions. LinkedIn and Twitter also let you ask questions and join public conversations, although it may take a little longer to find the best groups and the most useful contacts.
Finally, if you’re considering a bootcamp, you can always give them a call. If they’re serious, they should have no qualms about having a chat and helping you figure out if Data Science is right for you (we certainly don’t).
Step 1: Brush Up On Statistics
We already said that Data Science does not require you to be a mathematical genius. But make no mistake, you do have to crunch numbers.
If you do not already have at least a foundational understanding of statistics, then get hold of a few textbooks or online courses on that topic. Make sure that they include practical exercises, as opposed to purely conceptual overviews – you are not looking for pop science here, but for something that will let you get handy with inferential statistics (interpreting and drawing conclusions from data). If you get some (reasonably updated) high school textbooks, for example, those should serve your purpose, as they offer more or less the level you need to get started.
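As a taste of what a practical exercise in inferential statistics looks like, here is a minimal sketch that estimates a population mean from a small sample. The numbers are invented for illustration, and the interval uses a normal approximation for simplicity (with a sample this small, a textbook would normally reach for the t-distribution instead).

```python
import math
import statistics

# Hypothetical sample: hours of sleep reported by ten people (made-up numbers).
sample = [6.5, 7.0, 8.0, 5.5, 7.5, 6.0, 7.0, 8.5, 6.5, 7.5]

# Descriptive statistics: summarise the sample itself.
mean = statistics.mean(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: say something about the wider population.
# The standard error measures how much the sample mean is expected to vary.
standard_error = stdev / math.sqrt(len(sample))

# z-value for a 95% confidence interval under a normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)

ci_low = mean - z * standard_error
ci_high = mean + z * standard_error

print(f"Sample mean: {mean:.2f}")
print(f"Approximate 95% confidence interval: {ci_low:.2f} to {ci_high:.2f}")
```

If you can follow what each line is doing and why, you are at roughly the level this step is aiming for.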
Some may suggest a gentler, more gradual path into the mathematical side of Data Science. In this case, however, we believe the frontal approach is best. Statistics is so essential to Data Science that if you try this branch of maths and come to the conclusion that you hate it, then it’s a sure sign that Data Science is not for you, and you’re only doing yourself a favour by finding out early.
If you want to know what other mathematics you may need aside from statistics, we have also published a more detailed article on the minimum mathematical requirements to study Data Science.
Step 2: Learn Python
Although Python is not the only programming language used in Data Science, it is almost certainly the best one to get started with (incidentally, not just for Data Science). It’s among the most approachable and intuitive languages to use, it has a huge amount of freely available learning material, and it has an equally large community able to help.
Most importantly, proficiency with Python will remain every bit as valuable at every stage of your career in Data Science. Indeed, when it comes to professional applications in this field, it is probably the most prevalent and popular programming language, with its closest competitor, the R language, being generally more popular in academic settings.
Bear in mind that Python is vast, and that once you get beyond the basics, you’ll want to start learning about the specific aspects relevant to Data Science. Two libraries that you should get into as soon as possible are pandas, which is essential for manipulating and visualising data, and scikit-learn. The latter may be less essential for a beginner, but it’s still very much worth looking into, as it will give you your first taste of machine learning – a hugely fascinating sub-field in its own right.
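To give a sense of what working with pandas feels like, here is a minimal sketch. It assumes pandas is installed, and the team names and scores are made up purely for illustration. scikit-learn is left out to keep the example short.

```python
import pandas as pd

# Hypothetical match results (invented data, not from a real league).
matches = pd.DataFrame({
    "team": ["Reds", "Blues", "Reds", "Blues", "Reds", "Blues"],
    "goals": [2, 1, 0, 3, 4, 5],
})

# Summary statistics for the numeric column (count, mean, std, quartiles...).
print(matches["goals"].describe())

# Average goals per team: group the rows by team, then take the mean.
goals_per_team = matches.groupby("team")["goals"].mean()
print(goals_per_team)
```

A few lines like these replace what would otherwise be a fair amount of manual looping and bookkeeping, which is a large part of why the library is so widely used.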
Step 3: Identify A Topic You Like
At this point, the next step in a linear approach to learning Data Science would involve learning about data manipulation tools, like advanced Microsoft Excel or Tableau. While these are unquestionably important and we would not discourage anybody from moving on to them next, our suggestion is to take a slightly more indirect approach.
Once you have a solid foundation in statistics and in Python, you can actually start thinking in terms of a project, albeit a rather simple one. In other words, find something that interests you and which allows you to gather data in some form or another – a great example would be the field of professional sports, for which a ton of data is publicly available for the fans. Then build a program to process that data and get an insight.
The program doesn’t actually have to be useful, and in fact it’s just as well to start with something frivolous, like the relationship between the number of times a day you open your fridge and the number of times you use your microwave. The essential point is that you should be doing something practical. This is the quickest way to get a sense of how you feel about actual Data Science work, and it will also start building up your skills and intuition.
Potentially, this may include skills with the tools we mentioned at the outset. You’ll probably need at least the basic functions of Microsoft Excel, for example, but often it’s best to figure out why you need something first and then look into the field – and not the other way around.
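To make the fridge-and-microwave idea concrete, here is one possible sketch of such a first project. The daily counts are invented, and the function name is our own choice for illustration. It computes the Pearson correlation coefficient by hand, which is good practice before handing the job to a library.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two lists of the same length, with at least 2 values.")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # How the two variables move together, and how much each varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)

    if variance_x == 0 or variance_y == 0:
        raise ValueError("Correlation is undefined when one list is constant.")

    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical one-week log (made-up numbers): one entry per day.
fridge_openings = [12, 9, 15, 7, 11, 14, 10]
microwave_uses = [3, 2, 4, 1, 3, 4, 2]

r = pearson_correlation(fridge_openings, microwave_uses)
print(f"Correlation between fridge openings and microwave uses: {r:.2f}")
```

A value close to 1 would suggest the two habits rise and fall together, a value close to -1 that one rises as the other falls, and a value near 0 that there is no clear linear relationship. With only a week of data, treat any result as a curiosity rather than a finding.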
Final Step: Go Pro!
If you have completed all of the steps above, it’s no exaggeration to say that you have started learning Data Science and that you’re ready to say how you feel about it. At this point, if you have decided that you want to continue, it’s time to commit to a proper learning path and get yourself an official certification.
You can sign up for a university course, for a bootcamp like one of our own, or you can go totally solo and figure out your own path with free courses online. The question of which of these options will work best for you goes beyond the remit of this article, and we’re happy to leave it for you to decide.