Pandas - The Data Manipulation Ninja

 

Introduction

Pandas is the king of data manipulation in Python! If NumPy is a race car, then Pandas is a fully loaded spaceship that lets you explore and manipulate tabular data with ease. It’s like Excel on steroids!

Installing Pandas

Before we dive in, let’s install Pandas:

pip install pandas

Check if it’s installed correctly:

import pandas as pd
print(pd.__version__)

Creating a Pandas DataFrame

A DataFrame is like a super-powered spreadsheet inside Python.

import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Tokyo"]
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age      City
0    Alice   25  New York
1      Bob   30    London
2  Charlie   35     Tokyo

Boom! You’ve created a DataFrame! 
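Once you have a DataFrame, it helps to check what is actually inside it. Here is a minimal sketch, reusing the same sample data, that looks at the shape, the column names, and the column types. The exact dtype reported for text columns can differ between Pandas versions, so no output is shown for that line.

```python
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Tokyo"],
}
df = pd.DataFrame(data)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names in order
print(df.dtypes)         # data type of each column
```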

Reading & Writing Data

Pandas makes it easy to read and write data in different formats.

Reading a CSV file:

df = pd.read_csv("data.csv")
print(df.head())  # Show the first 5 rows

Writing to a CSV file:

df.to_csv("output.csv", index=False)

Reading an Excel file:

df = pd.read_excel("data.xlsx")  # Needs an Excel engine such as openpyxl (pip install openpyxl)

Writing to an Excel file:

df.to_excel("output.xlsx", index=False)  # Also needs an Excel engine such as openpyxl
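To see why the CSV example passes index=False, here is a small round-trip sketch. It writes to an in-memory buffer (io.StringIO) instead of a real file, which is an assumption made only so the example runs without touching the disk; the two-row sample data is also just for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# Write without the index, then read it back: the columns come back unchanged.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
print(list(restored.columns))

# Write with the index (the default): reading it back adds an extra column.
buffer_with_index = io.StringIO()
df.to_csv(buffer_with_index)
buffer_with_index.seek(0)
with_index = pd.read_csv(buffer_with_index)
print(list(with_index.columns))
```

If you do want to keep the index in the file, passing index_col=0 to read_csv restores it as the index instead of as an extra column.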

Selecting & Filtering Data

Selecting columns:

print(df["Name"])  # Select a single column
print(df[["Name", "Age"]])  # Select multiple columns

Filtering rows:

print(df[df["Age"] > 25])  # Select rows where Age > 25
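Filters can also be combined. The sketch below, using the same sample data, joins two conditions with & (and); use | for "or". Each condition needs its own parentheses or its own variable, because & binds more tightly than the comparison operators.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Tokyo"],
})

older_than_25 = df["Age"] > 25       # boolean Series, one value per row
not_in_tokyo = df["City"] != "Tokyo"

# Keep only the rows where both conditions are True
result = df[older_than_25 & not_in_tokyo]
print(result)
```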

Using .loc and .iloc:

print(df.loc[1])  # Select the row whose index label is 1 (label-based)
print(df.iloc[1])  # Select the second row by position (zero-based)
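With the default index (0, 1, 2, ...) the two calls above return the same row, which hides the difference. Here is a small sketch with a custom index (the labels 10, 20, 30 are made up for illustration) where labels and positions no longer match.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=[10, 20, 30],  # hypothetical labels, chosen to differ from positions
)

by_label = df.loc[20]     # the row whose index label is 20
by_position = df.iloc[1]  # the second row, whatever its label is

print(by_label["Name"])
print(by_position["Name"])
```

Both lines pick Bob here. Calling df.loc[1] on this DataFrame would raise a KeyError, because no row carries the label 1.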

Modifying Data

Adding a new column:

df["Salary"] = [50000, 60000, 70000]
print(df)

Updating a value:

df.at[0, "Age"] = 26  # Update Alice’s age to 26

Removing a column:

df.drop(columns=["Salary"], inplace=True)
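Two variations on the steps above, as a rough sketch: updating a value by condition with .loc instead of by row label, and dropping a column without inplace=True so the original DataFrame is left untouched. The variable name trimmed is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})
df["Salary"] = [50000, 60000, 70000]

# Update Alice's age by matching on her name rather than on the row label
df.loc[df["Name"] == "Alice", "Age"] = 26

# drop() returns a new DataFrame; df itself still has the Salary column
trimmed = df.drop(columns=["Salary"])

print(list(df.columns))
print(list(trimmed.columns))
```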

Grouping & Aggregating Data

Pandas makes it easy to summarize and analyze data.

grouped = df.groupby("City")["Age"].mean()
print(grouped)
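In the sample data every city appears only once, so each group holds a single row. The sketch below uses a slightly different, made-up dataset with repeated cities so the grouping has something to do, and asks for several statistics at once with .agg().

```python
import pandas as pd

# Hypothetical data: two people per city so the groups have more than one row
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "Age": [25, 30, 35, 45],
    "City": ["London", "London", "Tokyo", "Tokyo"],
})

# One row per city, one column per statistic
summary = df.groupby("City")["Age"].agg(["mean", "min", "max", "count"])
print(summary)
```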

Handling Missing Data

Checking for missing values:

print(df.isnull().sum())  # Count missing values per column

Filling missing values:

df.fillna("Unknown", inplace=True)  # Replaces every missing value, in every column, with "Unknown"

Dropping missing values:

df.dropna(inplace=True)
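Filling every gap with the same text is not always what you want, especially in numeric columns. As a sketch on made-up data with a few gaps, fillna() also accepts a dictionary so each column gets its own replacement. Using the column mean for Age is one possible choice here, not a recommendation.

```python
import pandas as pd

nan = float("nan")

# Hypothetical data with one missing value per column
df = pd.DataFrame({
    "Name": ["Alice", "Bob", None],
    "Age": [25, nan, 35],
    "City": ["New York", None, "Tokyo"],
})

missing_per_column = df.isnull().sum()
print(missing_per_column)

# Fill City with a placeholder and Age with the column mean; Name is left alone
filled = df.fillna({"City": "Unknown", "Age": df["Age"].mean()})

# Drop any row that still has a missing value (here, the row without a Name)
cleaned = filled.dropna()
print(cleaned)
```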

Summary 

Feature          Pandas Advantage
DataFrames       Powerful table-like structure
File Handling    Reads/Writes CSV, Excel, etc.
Data Selection   Easy column & row selection
Filtering        Select specific data easily
Grouping         Aggregate data efficiently
Missing Data     Handle NaN values like a pro

Pandas is the Swiss Army knife for data manipulation in Python.
