Given the success of machine learning, it is easy to forget that data science is a much wider field, one that is not limited to artificial intelligence. It is in fact a discipline that aims to better understand the world around us by combining mathematics, statistics, data analysis, data visualization, the scientific method and computer science. This endeavour is made possible by an intangible but omnipresent resource: data.

Before any project, it is crucial to understand the main data types: numerical data, which is either continuous or discrete, and categorical data, which is either nominal or ordinal.
This knowledge is key to fully grasping the statistical nature of the available data and to handling each feature properly. Despite its simplicity, this step is essential for a robust and meaningful data analysis. In fact, data types usually dictate which imputation strategies, statistical measurements, plot designs and algorithms are most appropriate. Being comfortable with these properties is thus one of the most valuable tools for a data scientist.
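
To make the link between data type and imputation strategy concrete, here is a minimal sketch using only Python's standard library. The column names, the toy values and the `impute` helper are all hypothetical, and the type-to-strategy mapping (mean, median, mode, lower median of ranks) is one common convention assumed for illustration, not a fixed rule.

```python
from statistics import mean, median, median_low, mode

# Hypothetical toy records; None marks a missing value.
columns = {
    "height_cm":    [170.2, 165.0, None, 180.5],        # numerical, continuous
    "n_children":   [0, 2, 1, None],                    # numerical, discrete
    "city":         ["Paris", "Lyon", None, "Paris"],   # categorical, nominal
    "satisfaction": ["low", "high", None, "medium"],    # categorical, ordinal
}

# Assumed data type of each column.
kinds = {
    "height_cm": "continuous",
    "n_children": "discrete",
    "city": "nominal",
    "satisfaction": "ordinal",
}

# Ordinal data needs an explicit ordering of its categories.
SATISFACTION_ORDER = ["low", "medium", "high"]


def impute(values, kind, order=None):
    """Fill missing entries with a statistic suited to the data type."""
    observed = [v for v in values if v is not None]
    if kind == "continuous":
        fill = mean(observed)        # the mean is meaningful on a continuous scale
    elif kind == "discrete":
        fill = median(observed)      # the median is robust for counts
    elif kind == "nominal":
        fill = mode(observed)        # without an order, only frequency is meaningful
    elif kind == "ordinal":
        # Rank the categories, take the lower median rank, map back to a label.
        ranks = [order.index(v) for v in observed]
        fill = order[median_low(ranks)]
    else:
        raise ValueError(f"unknown data type: {kind}")
    return [fill if v is None else v for v in values]


filled = {
    name: impute(
        values,
        kinds[name],
        order=SATISFACTION_ORDER if name == "satisfaction" else None,
    )
    for name, values in columns.items()
}

for name, values in filled.items():
    print(name, values)
```

Note that averaging the nominal `city` column would be meaningless, and that the ordinal `satisfaction` column can only be given a median once its categories have an explicit order. This is the sense in which the data type constrains the choice of statistic.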