Exploratory Data Analysis (EDA) - Data Understanding - Data Types
- Gender (Male or Female)
- Marital Status (Single, Married, Divorced, or Widowed)
- Hair Color (Blonde, Brown, Black, or Red)
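Nominal data can be summarized with frequency tables or bar charts that show how often each category occurs. Because the categories have no inherent order, they can be displayed in any sequence, for example from most to least frequent. In EDA, nominal data is considered a qualitative data type.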
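As a minimal sketch of the frequency table described above, the snippet below counts how often each category occurs in a small sample. The `hair_color` values and the `frequency_table` helper are made up for illustration, and only the Python standard library is used.

```python
from collections import Counter

# Hypothetical sample of a nominal variable (made-up values for illustration).
hair_color = ["Brown", "Black", "Brown", "Blonde", "Red", "Brown", "Black"]


def frequency_table(values):
    """Return (category, count, share) rows, most frequent category first."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, n / total) for cat, n in counts.most_common()]


table = frequency_table(hair_color)
for category, count, share in table:
    # One row per category: label, absolute count, relative frequency.
    print(f"{category:<8}{count:>3}{share:>8.1%}")
```

The same rows could feed a bar chart, with one bar per category and the count as the bar height.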
Ordinal Data: Ordinal data is a categorical data type where the categories have an inherent order or hierarchy. Ordinal data is often used to represent qualities that can be ranked or ordered. Examples of ordinal data include:
- Education Level (Elementary, High School, Associate's Degree, Bachelor's Degree, or Master's Degree)
- Disease Severity (Mild, Moderate, Severe)
- Likert Scale Responses (Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree)
Ordinal data can be summarized with frequency tables or bar charts, with the categories displayed in their natural order. Box plots are useful for comparing the distribution of a numeric variable across the ordered categories. In EDA, ordinal data is considered a qualitative data type.
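The sketch below illustrates what the ordering adds in practice: with an explicit category order, we can report counts in that order, accumulate them, and pick a median category. The `severity` observations and the names `SEVERITY_ORDER`, `rank`, and `cumulative` are assumptions made for this example.

```python
from collections import Counter

# The order must be stated explicitly; it cannot be inferred from the labels.
SEVERITY_ORDER = ["Mild", "Moderate", "Severe"]

# Hypothetical observations (made-up values for illustration).
severity = ["Moderate", "Mild", "Severe", "Mild", "Moderate", "Mild", "Moderate"]

# Map each label to its position in the hierarchy.
rank = {label: position for position, label in enumerate(SEVERITY_ORDER)}

# Sort by rank rather than alphabetically, then take the middle observation.
# With an even number of observations this picks the upper of the two middle values.
ordered = sorted(severity, key=lambda label: rank[label])
median_category = ordered[len(ordered) // 2]

# Frequency table in category order, with a running (cumulative) count.
counts = Counter(severity)
cumulative = []
running = 0
for label in SEVERITY_ORDER:
    running += counts[label]
    cumulative.append((label, counts[label], running))

print("Median category:", median_category)
for label, count, running_total in cumulative:
    print(f"{label:<10}{count:>3}{running_total:>4}")
```

A mean is deliberately not computed here, since the distance between neighboring categories is not defined for ordinal data.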
Discrete Data: Discrete data is a numeric data type that is made up of whole numbers or counts. Discrete data is often used to represent quantities that cannot be divided into smaller parts. Examples of discrete data include:
- Number of Siblings (0, 1, 2, 3, 4, or more)
- Number of Cars in a Parking Lot (0, 1, 2, 3, 4, or more)
- Number of Items Sold (0, 1, 2, 3, 4, or more)
Discrete data can be summarized with frequency tables or bar charts that show how often each distinct value occurs. In EDA, discrete data is considered a quantitative data type.
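A rough sketch of a frequency table for a discrete variable follows. Unlike with categories, numeric summaries such as the mean are meaningful for counts, and values that never occur can be listed with a count of zero. The `siblings` sample is made up for illustration.

```python
from collections import Counter
from statistics import mean, mode

# Hypothetical counts of siblings per respondent (made-up values).
siblings = [0, 1, 1, 2, 0, 1, 4, 2, 1, 0]

counts = Counter(siblings)

# Fill in every integer between the minimum and maximum so that gaps
# (values with no observations) remain visible in the table.
distribution = {
    value: counts.get(value, 0)
    for value in range(min(siblings), max(siblings) + 1)
}

average = mean(siblings)      # arithmetic mean of the counts
most_common = mode(siblings)  # most frequent value

for value, count in distribution.items():
    print(f"{value}: {'#' * count}")  # a crude text bar chart
print("mean:", average, "mode:", most_common)
```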
Continuous Data: Continuous data is a numeric data type that can take on any value within a given range, limited only by the precision of the measurement. Continuous data is often used to represent quantities that can be measured. Examples of continuous data include:
- Height (in inches)
- Weight (in pounds)
- Temperature (in degrees Celsius or Fahrenheit)
Continuous data can be analyzed using histograms, box plots, or scatter plots to identify patterns and relationships. In EDA, continuous data is considered a quantitative data type.
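To make the histogram and box plot ideas concrete, here is a minimal sketch that bins a continuous variable and computes its quartiles, which are the numbers a box plot draws. The `heights_in` sample, the `histogram` helper, and the bin width of 3 inches are all assumptions chosen for illustration.

```python
from statistics import quantiles

# Hypothetical heights in inches (made-up values for illustration).
heights_in = [61.2, 63.5, 64.0, 65.1, 66.3, 66.8, 67.4, 68.0, 69.9, 72.5]


def histogram(values, bin_width, start):
    """Count values in half-open bins [start + k*w, start + (k+1)*w)."""
    bins = {}
    for v in values:
        k = int((v - start) // bin_width)  # index of the bin containing v
        left_edge = start + k * bin_width
        bins[left_edge] = bins.get(left_edge, 0) + 1
    return dict(sorted(bins.items()))


hist = histogram(heights_in, bin_width=3.0, start=60.0)

# Quartiles split the sorted data into four equally sized groups.
q1, median, q3 = quantiles(heights_in, n=4)

for left_edge, count in hist.items():
    print(f"[{left_edge:.0f}, {left_edge + 3.0:.0f}): {'#' * count}")
print("min:", min(heights_in), "Q1:", q1, "median:", median,
      "Q3:", q3, "max:", max(heights_in))
```

The choice of bin width changes how the histogram looks, so it is worth trying a few widths before drawing conclusions about the shape of the distribution.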
Quantitative Data: Quantitative data is any type of data that can be measured numerically. This includes both discrete and continuous data. Quantitative data can be analyzed using statistical methods to identify patterns and relationships.
Qualitative Data: Qualitative data describes categories or qualities rather than numeric measurements. This includes nominal and ordinal data. Qualitative data can be summarized with counts, proportions, and the mode; ordinal data also supports the median, because its categories are ordered.
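The grouping of the four data types into two families can be sketched as a small lookup. The `DataType` enum, the `family` helper, and the column names are hypothetical; the suggested views simply restate the charts mentioned for each type in this post.

```python
from enum import Enum


class DataType(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"


# Nominal and ordinal are qualitative; discrete and continuous are quantitative.
QUALITATIVE = {DataType.NOMINAL, DataType.ORDINAL}

# Starting points for exploration, taken from the descriptions in this post.
SUGGESTED_VIEWS = {
    DataType.NOMINAL: ["frequency table", "bar chart"],
    DataType.ORDINAL: ["frequency table", "bar chart"],
    DataType.DISCRETE: ["frequency table", "bar chart"],
    DataType.CONTINUOUS: ["histogram", "box plot", "scatter plot"],
}


def family(data_type):
    """Return 'qualitative' or 'quantitative' for a given data type."""
    return "qualitative" if data_type in QUALITATIVE else "quantitative"


# Hypothetical dataset columns, each tagged by the analyst with its data type.
columns = {
    "marital_status": DataType.NOMINAL,
    "education_level": DataType.ORDINAL,
    "number_of_siblings": DataType.DISCRETE,
    "height_in": DataType.CONTINUOUS,
}

plan = {name: (family(t), SUGGESTED_VIEWS[t]) for name, t in columns.items()}
for name, (fam, views) in plan.items():
    print(f"{name}: {fam} -> {', '.join(views)}")
```

Tagging each column by hand is intentional in this sketch: a numeric-looking column such as a ZIP code is still nominal, so the data type cannot always be inferred from the stored values.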
In conclusion, understanding the different data types is crucial for effective EDA. Once we know whether a variable is nominal, ordinal, discrete, or continuous, we can choose summaries and visualizations that suit it. Grouping variables as quantitative or qualitative gives a quick first check that the chosen methods are appropriate for the data.