Posts

Placement Preparation AIML Internship

Portions to Cover:

Basics (for the assessment test)
1) Aptitude, Reasoning, Quantitative
2) Data Structures (C and Python)
3) SQL (MySQL, Postgres, OracleDB)

Advanced
1) Advanced Data Structures
2) Design and Analysis of Algorithms
3) Advanced Competitive Coding
4) Data Preprocessing & Data Visualization
5) Machine Learning
6) Deep Learning
7) Scala
8) PySpark & Kafka
9) Apache Hadoop
10) Docker and Kubernetes
11) Pentaho (Data Integration)
12) Keycloak
13) Flask and Streamlit

EDA Preprocessing Using MySQL: Steps

EDA Preprocessing in MySQL:

1) Connect to the MySQL database.
2) Retrieve the dataset or table to be analyzed.
3) Handle missing values:
   a) Identify columns with missing values.
   b) Decide on a strategy to handle missing values (e.g., removing rows, imputation).
   c) Implement the chosen strategy to fill or remove missing values.
4) Handle duplicate values:
   a) Identify duplicate rows or columns.
   b) Decide on a strategy to handle duplicates (e.g., removing duplicates, keeping the first or last occurrence).
   c) Implement the chosen strategy to remove or modify duplicate values.
5) Handle outliers:
   a) Identify columns with outliers.
   b) Decide on an approach to handle outliers (e.g., removing outliers, transforming values).
   c) Implement the chosen approach to handle outliers.
6) Perform data type conversion and normalization:
   a) Convert columns to the appropriate data types (e.g., dates to datetime, numbers to numeric types).
   b) Normalize numerical columns if required (e.g., scaling to a specific range).
7) Handle categorical variables: Identify ...
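
As a rough illustration of the missing-value and duplicate steps above, here is a minimal Python sketch. It rests on some assumptions: an in-memory SQLite database stands in for MySQL so that the example runs without a server, and the table `people`, its columns, and its rows are all invented for illustration. The SQL is kept generic, but a real MySQL setup would need its own connector and may need small syntax adjustments. Outliers, type conversion, and categorical handling are not covered here.

```python
import sqlite3

# Assumption: SQLite (in-memory) stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, invented for illustration.
cur.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age REAL)")
cur.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [("Asha", 30), ("Ben", None), ("Asha", 30), ("Chen", 60)],
)

# Identify missing values: count the NULLs in the age column.
cur.execute("SELECT COUNT(*) FROM people WHERE age IS NULL")
missing_before = cur.fetchone()[0]

# Chosen strategy: impute missing ages with the column mean (AVG skips NULLs).
cur.execute("SELECT AVG(age) FROM people")
mean_age = cur.fetchone()[0]
cur.execute("UPDATE people SET age = ? WHERE age IS NULL", (mean_age,))

# Chosen strategy: keep the first occurrence (lowest id) of each (name, age) pair.
cur.execute(
    """
    DELETE FROM people
    WHERE id NOT IN (
        SELECT keep_id FROM (
            SELECT MIN(id) AS keep_id FROM people GROUP BY name, age
        ) AS keepers
    )
    """
)
conn.commit()

cur.execute("SELECT name, age FROM people ORDER BY id")
rows = cur.fetchall()
print(missing_before, rows)
```

Mean imputation and keep-first deduplication are only two of the strategies listed above; which one fits depends on the dataset.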

Mindmap for studying Fundamentals of Mathematics in Data Science

Here's a mind map outlining the fundamentals of mathematics in data science:

Fundamentals of Mathematics in Data Science
  Math Basics
    Statistics
      Descriptive Statistics
        Mean, Median, Mode
        Variance, Standard Deviation
    Linear Algebra
      Matrix Operations
        Eigenvalues, Eigenvectors
        Determinants, Inverses
        Matrix Multiplication
        Matrix Transposition
        Dot Products
    Calculus
    Probability
    ...
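
To make the descriptive statistics branch of the mind map concrete, here is a small sketch using Python's standard `statistics` module. The sample values are invented for illustration, and the population (not sample) variance and standard deviation are used as an assumption.

```python
import statistics

# Hypothetical sample, invented for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)           # arithmetic average
median = statistics.median(values)       # middle value of the sorted data
mode = statistics.mode(values)           # most frequent value
variance = statistics.pvariance(values)  # population variance
std_dev = statistics.pstdev(values)      # population standard deviation

print(mean, median, mode, variance, std_dev)
```

For data that is a sample from a larger population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.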

Exploratory Data Analysis (EDA) - Data Understanding - Data Types

Exploratory Data Analysis (EDA) is an essential step in the data analysis process that helps to uncover important patterns, trends, and relationships in the data. Understanding the different data types is crucial for effective EDA. In this blog post, we will discuss the different data types used in EDA, including nominal, ordinal, discrete, and continuous data, as well as the difference between quantitative and qualitative data.

Nominal Data: Nominal data is categorical data that has no inherent order or hierarchy. Nominal data is often used to represent characteristics or attributes that are unique to a particular category. Examples of nominal data include:

  Gender (Male or Female)
  Marital Status (Single, Married, Divorced, or Widowed)
  Hair Color (Blonde, Brown, Black, or Red)

Nominal data can be analyzed using frequency tables or bar charts to identify the frequency of occurrence of each category. In EDA, nominal data is considered a qualitative data type.

Ordinal Data: Ordinal dat...
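
The frequency-table analysis of nominal data mentioned above can be sketched in a few lines of Python. This is only an illustration: the list of marital status responses is invented, and `collections.Counter` is one of several ways to build such a table.

```python
from collections import Counter

# Hypothetical survey responses, invented for illustration.
marital_status = ["Single", "Married", "Single", "Divorced", "Married", "Single"]

counts = Counter(marital_status)
total = len(marital_status)

# most_common() lists the categories from most to least frequent.
for category, count in counts.most_common():
    print(f"{category:<10}{count:>3}{count / total:>8.1%}")
```

Because nominal categories have no order, sorting by frequency is a presentation choice and not a property of the data.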

MINDMAP Basic Python and SQL for Data Science

The basics of Python and SQL are the first and most important things we have to learn. For Python, I used GREAT LEARNING PYTHON COURSE FREE. This course covers all the fundamental topics needed to understand Python. For SQL, you can choose free courses like these, or many are available on YouTube as well.

Mind map to study basic Python for data science:

BASIC CONCEPTS
      |
Variables and Data Types
      |
Operators
      |
Lists
      |
Dictionaries
      |
Tuples
...
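
As a quick taste of the first few topics in the mind map, here is a tiny sketch that touches each one. All names and values are made up for illustration and are not taken from the course.

```python
# Variables and data types
course = "Python basics"   # str
hours = 12                 # int
rating = 4.5               # float

# Operators
minutes = hours * 60       # arithmetic
is_long = hours > 10       # comparison, gives a bool

# Lists: ordered and mutable
topics = ["variables", "operators", "lists"]
topics.append("dictionaries")

# Dictionaries: key-value pairs
progress = {"variables": True, "operators": False}
progress["operators"] = True

# Tuples: ordered and immutable, can be unpacked into names
version = (3, 12)
major, minor = version

print(course, rating, minutes, is_long, topics, progress, major, minor)
```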

Where To Begin

For a computer science student, becoming a data scientist is one option from a vast pool. I think I don't have to explain here what data science is and things like that. So, what is the purpose of this blog? This blog will showcase my journey from fresher to Data Scientist, maybe more. It will contain all the steps that I am taking along the way. I hope this will help others follow my journey and keep me on track as well. It will contain learning materials, links, projects, code, etc.

To learn data science, Python is essential. That leads us to the first step of our learning journey: Basics of Python (explained in the next blog post).