Exploratory Data Analysis (EDA) - Data Understanding - Data Types

Exploratory Data Analysis (EDA) is an essential step in the data analysis process that helps to uncover important patterns, trends, and relationships in the data. Understanding the different data types is crucial for effective EDA. In this blog post, we will discuss the different data types used in EDA, including nominal, ordinal, discrete, and continuous data, as well as the difference between quantitative and qualitative data.


Nominal Data: Nominal data is categorical data that has no inherent order or hierarchy. The categories are simply labels: no category is greater or smaller than another, so they can be listed in any order without losing information. Examples of nominal data include:

  • Gender (Male or Female)
  • Marital Status (Single, Married, Divorced, or Widowed)
  • Hair Color (Blonde, Brown, Black, or Red)


Nominal data can be analyzed using frequency tables or bar charts to show how often each category occurs. The most common category (the mode) is the only meaningful measure of central tendency. In EDA, nominal data is considered a qualitative data type.
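A frequency table for a nominal variable can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the `marital_status` sample and the variable names are made up for the example and do not come from any real dataset.

```python
from collections import Counter

# Hypothetical sample of a nominal variable (values invented for illustration).
marital_status = [
    "Single", "Married", "Married", "Divorced",
    "Single", "Married", "Widowed",
]

# Count how often each category occurs.
counts = Counter(marital_status)
total = len(marital_status)

# Frequency table rows: (category, count, share of all observations).
# Rows are sorted by count because nominal categories have no order of their own.
frequency_table = [
    (category, count, count / total)
    for category, count in counts.most_common()
]

for category, count, share in frequency_table:
    print(f"{category:<10} {count:>3} {share:6.1%}")

# The mode is the most frequent category.
mode = counts.most_common(1)[0][0]
print("Mode:", mode)
```

Sorting by count is a presentation choice; alphabetical order would be equally valid for nominal data.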

Ordinal Data: Ordinal data is a categorical data type where the categories have an inherent order or hierarchy. Ordinal data is often used to represent qualities that can be ranked or ordered. Examples of ordinal data include:

  • Education Level (Elementary, High School, Associate's Degree, Bachelor's Degree, or Master's Degree)
  • Disease Severity (Mild, Moderate, Severe)
  • Likert Scale Responses (Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree)

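Ordinal data can be analyzed using frequency tables or bar charts, with the categories kept in their natural order. Box plots can be used to compare the distribution of a numeric variable across the ordered categories. Note that the order is meaningful, but the distance between adjacent categories is not necessarily equal. In EDA, ordinal data is considered a qualitative data type.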
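For ordinal data, the category order has to be stated explicitly, because sorting the labels alphabetically would usually give the wrong ranking. The sketch below assumes a made-up `severity` sample and a hand-written `SEVERITY_ORDER` list; both are illustrative names, not part of any library.

```python
from collections import Counter

# Explicit rank order; this ordering is what makes the variable ordinal.
SEVERITY_ORDER = ["Mild", "Moderate", "Severe"]

# Hypothetical observations (invented for illustration).
severity = ["Moderate", "Mild", "Severe", "Mild", "Moderate", "Mild", "Moderate"]

counts = Counter(severity)

# Frequency table in rank order, with a running (cumulative) count.
cumulative = 0
ordered_table = []
for level in SEVERITY_ORDER:
    cumulative += counts[level]
    ordered_table.append((level, counts[level], cumulative))

for level, count, running in ordered_table:
    print(f"{level:<10} {count:>3} {running:>3}")

# Median category: sort the observations by rank and take the middle one.
# With an even number of observations this picks the upper of the two middle values.
rank = {level: position for position, level in enumerate(SEVERITY_ORDER)}
sorted_obs = sorted(severity, key=rank.__getitem__)
median_category = sorted_obs[len(sorted_obs) // 2]
print("Median category:", median_category)
```

The median is meaningful here because the categories can be ranked; a mean would not be, since the gaps between categories are not defined.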

Discrete Data: Discrete data is a numeric data type that is made up of whole numbers or counts. Discrete data is often used to represent quantities that cannot be divided into smaller parts. Examples of discrete data include:

  • Number of Siblings (0, 1, 2, 3, 4, or more)
  • Number of Cars in a Parking Lot (0, 1, 2, 3, 4, or more)
  • Number of Items Sold (0, 1, 2, 3, 4, or more)

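Discrete data can be analyzed using frequency tables or bar charts to understand how the observations are distributed across the distinct values. Because the values are numeric, summary statistics such as the mean and median also apply. In EDA, discrete data is considered a quantitative data type.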
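The same counting approach works for discrete data, with the difference that numeric summaries now make sense. The following sketch uses an invented `siblings` sample and prints a simple text bar chart; it relies only on Python's standard `collections` and `statistics` modules.

```python
from collections import Counter
import statistics

# Hypothetical counts of siblings for ten people (invented for illustration).
siblings = [0, 1, 1, 2, 0, 3, 1, 2, 1, 4]

counts = Counter(siblings)

# One row per whole number in the observed range, including values that never occur.
value_counts = [
    (value, counts.get(value, 0))
    for value in range(min(siblings), max(siblings) + 1)
]

# Text bar chart: one '#' per observation.
for value, count in value_counts:
    print(f"{value} | {'#' * count}")

# Unlike nominal or ordinal data, discrete data supports arithmetic summaries.
mean_siblings = statistics.mean(siblings)
median_siblings = statistics.median(siblings)
print("Mean:", mean_siblings, "Median:", median_siblings)
```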

Continuous Data: Continuous data is a numeric data type that can take on an infinite number of values within a given range. Continuous data is often used to represent quantities that can be measured. Examples of continuous data include:

  • Height (in inches)
  • Weight (in pounds)
  • Temperature (in degrees Celsius or Fahrenheit)

Continuous data can be analyzed using histograms or box plots to examine the distribution of a single variable, and scatter plots to examine the relationship between two variables. In EDA, continuous data is considered a quantitative data type.
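Continuous values rarely repeat exactly, so instead of counting each value we group them into bins, which is what a histogram does. Here is a rough sketch with a made-up `heights` sample; the bin start, bin width, and number of bins are arbitrary choices for this example.

```python
import statistics

# Hypothetical heights in inches (invented for illustration).
heights = [61.2, 63.5, 64.0, 65.8, 66.1, 67.3, 68.0, 68.4, 70.2, 72.9]

# Assumed binning: five bins of width 3 covering 60 to 75 inches.
BIN_START = 60.0
BIN_WIDTH = 3.0
N_BINS = 5

bins = [0] * N_BINS
for height in heights:
    # Index of the bin this value falls into; clamp so the top edge stays in the last bin.
    index = min(int((height - BIN_START) // BIN_WIDTH), N_BINS - 1)
    bins[index] += 1

# Text histogram: one '#' per observation in each bin.
for i, count in enumerate(bins):
    low = BIN_START + i * BIN_WIDTH
    print(f"{low:4.0f}-{low + BIN_WIDTH:<4.0f} | {'#' * count}")

# Quartiles are the numbers a box plot is drawn from.
q1, q2, q3 = statistics.quantiles(heights, n=4)
print("Q1:", q1, "Median:", q2, "Q3:", q3)
```

Changing the bin width changes the shape of the histogram, so it is worth trying a few widths before drawing conclusions.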

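Quantitative Data: Quantitative data is any type of data that can be measured numerically. This includes both discrete and continuous data. Because arithmetic is meaningful on these values, quantitative data can be summarized with statistics such as the mean, median, and standard deviation, and relationships between variables can be measured with correlation.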

Qualitative Data: Qualitative data is any type of data that describes a quality or category rather than a numeric measurement. This includes nominal and ordinal data. Qualitative data can be summarized using counts, proportions, and the mode, and ordinal data also supports the median because its categories can be ranked.
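To tie the four data types together, the classification and the charts mentioned in this post can be written down as a small lookup table. This is only a sketch: `EDA_GUIDE` and `describe` are hypothetical names created for this example, and the suggested methods are simply the ones listed above, not an exhaustive rulebook.

```python
# Maps each data type to its family and the methods mentioned in this post.
EDA_GUIDE = {
    "nominal": ("qualitative", ["frequency table", "bar chart"]),
    "ordinal": ("qualitative", ["frequency table", "bar chart", "box plot"]),
    "discrete": ("quantitative", ["frequency table", "bar chart"]),
    "continuous": ("quantitative", ["histogram", "box plot", "scatter plot"]),
}


def describe(data_type: str) -> str:
    """Return a one-line summary for a data type name (case-insensitive)."""
    key = data_type.lower()
    family, methods = EDA_GUIDE[key]
    return f"{key}: {family}; try {', '.join(methods)}"


for name in EDA_GUIDE:
    print(describe(name))
```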

In conclusion, understanding the different data types is crucial for effective EDA. By identifying the properties of each data type, we can choose appropriate methods and visualizations to explore the data and gain insights. Categorizing data as either quantitative or qualitative is a useful first step, because it tells us whether arithmetic summaries apply or whether we should work with counts and categories instead.
