By and large, this is good advice for someone looking to start a company or find a job with some skills in programming and machine learning, but following it will not make anyone a data scientist. Remember that data science is more than becoming a programming wizard: it is in equal parts domain knowledge, machine learning and data processing skills, and computation. What you write, Chris, will cover two of these three skills well, but that is not enough to make one a data scientist.

Remember that data science is about doing science with data. This is why a data scientist must not only be proficient in the domain, but also be able to question and formulate theories that explain the world on the basis of data, then set up hypotheses and use data to make sense of the world. Developing data-driven applications is fine but not enough. For example, you can certainly use data engineering and feature engineering skills to set up a dashboard where you plug in values and estimate house prices in your locality, but that is not the job of a data scientist. A data scientist would explore the relationships between the different variables used to predict house prices, and build explanatory and causal models based on a deep understanding of the economics of house prices. That is the difference between a software programming wizard and a scientist who can 'see' and think with data.

Hence, my two cents' worth of advice for an aspiring data scientist would be: do everything that you, Chris, have written here, but also develop critical appraisal skills and the ability to read, either widely if you want to be a generalist, or deeply in a specific area. Definitely devote time to developing visualisation skills. Above all, develop a sense of curiosity and of the scientific method.

Associate Professor of Epidemiology and Environmental Health at the University of Canterbury, New Zealand.

