Introduction
Pandas is the go-to library for data manipulation in Python! If NumPy is a race car, then Pandas is a fully-loaded spaceship: it builds on NumPy arrays and adds labeled rows and columns, so you can load, explore, and reshape tabular data with ease. It’s like Excel on steroids!
Installing Pandas
Before we dive in, let’s install Pandas:
pip install pandas
Check if it’s installed correctly:
import pandas as pd
print(pd.__version__)
Creating a Pandas DataFrame
A DataFrame is a two-dimensional table with labeled rows and columns, like a super-powered spreadsheet inside Python.
import pandas as pd
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Tokyo"],
}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age      City
0    Alice   25  New York
1      Bob   30    London
2  Charlie   35     Tokyo
Boom! You’ve created a DataFrame!
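Once a DataFrame exists, a few attributes are handy for checking what you actually built. Here is a minimal sketch that reuses the same sample data; nothing in it goes beyond standard DataFrame attributes.

```python
import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Tokyo"],
}
df = pd.DataFrame(data)

print(df.shape)          # (rows, columns) → (3, 3)
print(list(df.columns))  # column labels
print(df.dtypes)         # data type pandas inferred for each column
```

Checking the shape and dtypes right after loading is a cheap way to catch surprises, such as a numeric column that was read in as text.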
Reading & Writing Data
Pandas makes it easy to read and write data in different formats.
Reading a CSV file:
df = pd.read_csv("data.csv")
print(df.head()) # Show the first 5 rows
Writing to a CSV file:
df.to_csv("output.csv", index=False)
Reading an Excel file (pandas needs an Excel engine such as openpyxl installed for .xlsx files):
df = pd.read_excel("data.xlsx")
Writing to an Excel file:
df.to_excel("output.xlsx", index=False)
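If you want to try the read/write round trip without creating files, one option is an in-memory text buffer. This is only a sketch: the two-row DataFrame is made up for illustration, and `io.StringIO` stands in for a real file such as "output.csv".

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# Write to an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves out the 0, 1, ... row labels

# Rewind the buffer and read the CSV text back into a new DataFrame
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored)
```

Without `index=False`, the row labels would be written as an extra unnamed column and show up again when the file is read back.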
Selecting & Filtering Data
Selecting columns:
print(df["Name"]) # Select a single column
print(df[["Name", "Age"]]) # Select multiple columns
Filtering rows:
print(df[df["Age"] > 25]) # Select rows where Age > 25
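Filters can also be combined. The sketch below reuses the sample data and shows the usual pattern: `&` for AND, `|` for OR, and `isin` for membership tests. The specific conditions are arbitrary examples.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "London", "Tokyo"],
})

# AND: wrap each condition in parentheses and join with &
over_25_not_tokyo = df[(df["Age"] > 25) & (df["City"] != "Tokyo")]

# OR: join conditions with |
young_or_tokyo = df[(df["Age"] < 30) | (df["City"] == "Tokyo")]

# Membership: keep rows whose City is in the given list
in_cities = df[df["City"].isin(["London", "Tokyo"])]

print(over_25_not_tokyo)
print(young_or_tokyo)
print(in_cities)
```

The parentheses matter: `&` and `|` bind more tightly than comparisons like `>`, so leaving them out usually raises an error.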
Using .loc and .iloc:
print(df.loc[1]) # Select row by index (label-based)
print(df.iloc[1]) # Select row by position (zero-based)
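With the default 0, 1, 2 index, both calls return the same row, which hides the difference. A small sketch with string labels makes it visible; the index values here are hypothetical, chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Age": [25, 30, 35]},
    index=["alice", "bob", "charlie"],  # hypothetical string labels
)

print(df.loc["bob", "Age"])  # look up by label → 30
print(df.iloc[1, 0])         # look up by position (row 1, column 0) → 30

# Slicing differs: .loc includes the end label, .iloc excludes the end position
label_slice = df.loc["alice":"bob"]  # 2 rows
position_slice = df.iloc[0:1]        # 1 row
print(len(label_slice), len(position_slice))
```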
Modifying Data
Adding a new column:
df["Salary"] = [50000, 60000, 70000]
print(df)
Updating a value:
df.at[0, "Age"] = 26 # Update Alice’s age to 26
Removing a column:
df.drop(columns=["Salary"], inplace=True)
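The same kinds of changes can be made without modifying the original DataFrame. The sketch below uses `assign` to derive a column and `drop` without `inplace=True`; the 10% bonus rate and the `Bonus` column are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Salary": [50000, 60000, 70000],
})

# Derive a new column from an existing one (hypothetical 10% bonus)
with_bonus = df.assign(Bonus=df["Salary"] * 0.10)

# drop() returns a new DataFrame when inplace=True is not used
without_salary = with_bonus.drop(columns=["Salary"])

print(with_bonus)
print(without_salary)
print(df)  # the original still has only Name and Salary
```

Returning new objects like this makes it easier to chain steps and to keep the original data around for comparison.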
Grouping & Aggregating Data
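Pandas makes it easy to summarize and analyze data. groupby splits the rows into groups that share a value in a column, and an aggregation such as mean() then reduces each group to a single number.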
grouped = df.groupby("City")["Age"].mean()
print(grouped)
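In the three-row sample every city appears once, so each group's mean is just that person's age. Grouping is more interesting when a key repeats. A minimal sketch with made-up rows, computing several aggregations at once via `agg`:

```python
import pandas as pd

df = pd.DataFrame({
    "City": ["London", "London", "Tokyo"],  # made-up rows with a repeated city
    "Age": [30, 40, 35],
})

# One row per city, one column per aggregation
summary = df.groupby("City")["Age"].agg(["mean", "min", "max", "count"])
print(summary)
```

Here London's mean age is 35.0, taken over two rows, while Tokyo's statistics come from a single row.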
Handling Missing Data
Checking for missing values:
print(df.isnull().sum()) # Count missing values per column
Filling missing values:
df.fillna("Unknown", inplace=True) # Replace every missing value, in every column, with "Unknown"
Dropping missing values:
df.dropna(inplace=True)
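A single fill value rarely suits every column. As a sketch, the example below fills text and numeric columns differently by passing a dict to `fillna`, and drops rows only when a specific column is missing. The data and the choice of the column mean as a fill value are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", None],
    "Age": [25, float("nan"), 35],
})

# Count missing values per column
missing = df.isnull().sum()

# Fill each column with its own replacement value
filled = df.fillna({"Name": "Unknown", "Age": df["Age"].mean()})

# Drop only the rows where Age is missing
dropped = df.dropna(subset=["Age"])

print(missing)
print(filled)
print(dropped)
```

Whether to fill or drop depends on the analysis: filling keeps every row but invents values, while dropping keeps only observed data at the cost of fewer rows.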
Summary
Feature | Pandas Advantage |
---|---|
DataFrames | Powerful table-like structure |
File Handling | Reads/Writes CSV, Excel, etc. |
Data Selection | Easy column & row selection |
Filtering | Select specific data easily |
Grouping | Aggregate data efficiently |
Missing Data | Handle NaN values like a pro |
Pandas is the Swiss Army knife for data manipulation in Python.