Pandas Interview Q&A

1. What is pandas?
Answer: An open-source Python library for analyzing structured (tabular) data, built around two core data structures: Series and DataFrame.
2. Series vs DataFrame?
Answer: A Series is a 1D labeled array; a DataFrame is a 2D labeled table whose columns are Series.
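A minimal sketch of the difference, using made-up fruit data (the names prices and fruit are illustrative only):

```python
import pandas as pd

# A Series is a single labeled column of values.
prices = pd.Series([1.5, 2.0, 3.25], index=["apple", "pear", "kiwi"], name="price")

# A DataFrame is a table; each of its columns is a Series sharing one index.
fruit = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "stock": [10, 0, 7]},
    index=["apple", "pear", "kiwi"],
)

print(prices.ndim, fruit.ndim)         # 1 2
print(type(fruit["price"]).__name__)   # Series
```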
3. How do you read a CSV quickly?
Answer: Use pd.read_csv() with an explicit dtype mapping, parse_dates for date columns, and usecols to load only the columns you need.
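As a rough illustration, the call below reads a small in-memory CSV with those three options. The column names and the sample data are assumptions made for the example; a real file path would replace the StringIO object.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a file on disk.
csv_text = """order_id,customer,amount,created_at,notes
1,alice,10.5,2024-01-05,first order
2,bob,20.0,2024-01-06,
3,alice,7.25,2024-01-07,gift
"""

orders = pd.read_csv(
    io.StringIO(csv_text),
    # Load only the columns that are needed; "notes" is skipped.
    usecols=["order_id", "customer", "amount", "created_at"],
    # Declare dtypes up front so pandas does not have to infer them.
    dtype={"order_id": "int32", "customer": "category", "amount": "float32"},
    # Parse this column as datetimes while reading.
    parse_dates=["created_at"],
)

print(orders.dtypes)
```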
4. What are loc and iloc?
Answer: loc selects by label; iloc selects by integer position.
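A small sketch with a non-default integer index, chosen so that labels and positions differ. It also shows one detail that often comes up in interviews: label slices include the end label, while position slices exclude the end position.

```python
import pandas as pd

# Index labels (10, 20, 30) deliberately differ from positions (0, 1, 2).
df = pd.DataFrame({"score": [88, 92, 75]}, index=[10, 20, 30])

by_label = df.loc[10, "score"]     # row labeled 10 -> 88
by_position = df.iloc[0, 0]        # first row, first column -> 88

label_slice = df.loc[10:20]        # labels 10 and 20: end label is included
position_slice = df.iloc[0:1]      # position 0 only: end position is excluded

print(len(label_slice), len(position_slice))  # 2 1
```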
5. How do you filter rows?
Answer: Apply boolean masks like df[df["col"] > 10].
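For instance, with a toy DataFrame (the "group" column is an assumption added for the example), masks can be used alone or combined with & and |, with each condition wrapped in parentheses:

```python
import pandas as pd

df = pd.DataFrame({"col": [5, 12, 30, 8], "group": ["a", "b", "a", "b"]})

# Single condition: keep rows where col is greater than 10.
over_ten = df[df["col"] > 10]

# Combined conditions: use & (and) or | (or), not the Python keywords.
over_ten_in_a = df[(df["col"] > 10) & (df["group"] == "a")]

print(over_ten["col"].tolist())        # [12, 30]
print(over_ten_in_a["col"].tolist())   # [30]
```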
6. What is the purpose of GroupBy?
Answer: Split data by keys, apply aggregates, and combine results.
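A short split-apply-combine sketch on invented sales data, using named aggregation so the output columns are explicit:

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west"],
        "units": [3, 5, 4, 1],
        "revenue": [30.0, 50.0, 40.0, 10.0],
    }
)

# Split by region, apply one aggregate per output column, combine into a table.
summary = sales.groupby("region").agg(
    total_units=("units", "sum"),
    mean_revenue=("revenue", "mean"),
)

print(summary)
```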
7. Merge vs concat?
Answer: merge joins on keys, like a SQL join; concat stacks objects along rows or columns.
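One way to see the contrast, using hypothetical customers and orders tables:

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3], "amount": [20, 35, 15]})

# merge: match rows on a key. A left join keeps every customer,
# so Ben (no orders) appears once with a missing amount.
joined = customers.merge(orders, on="customer_id", how="left")

jan = pd.DataFrame({"amount": [20, 35]})
feb = pd.DataFrame({"amount": [15]})

# concat: stack frames on top of each other without matching keys.
stacked = pd.concat([jan, feb], ignore_index=True)

print(len(joined), len(stacked))  # 4 3
```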
8. How do you handle missing values?
Answer: Detect them with isna(), then drop them with dropna() or fill them with fillna(), depending on what the missing data means for the analysis.
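A brief sketch of all three calls on a toy frame. The fill choices here (column mean for age, the string "unknown" for city) are only illustrative; the right strategy depends on the data.

```python
import pandas as pd

df = pd.DataFrame(
    {"age": [25.0, float("nan"), 40.0], "city": ["Oslo", None, "Lima"]}
)

# Count missing values per column.
missing_per_column = df.isna().sum()

# Drop every row that has at least one missing value.
complete_rows = df.dropna()

# Fill each column with its own replacement value.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(missing_per_column)
print(filled)
```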
9. What is a pivot table?
Answer: A summary table that groups data by row and column keys and aggregates the values in each cell, built with pivot_table().
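As an example, the sketch below sums invented revenue figures by region (rows) and quarter (columns):

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "east"],
        "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1"],
        "revenue": [100, 150, 80, 120, 50],
    }
)

# Rows come from "region", columns from "quarter"; each cell is summed revenue.
table = sales.pivot_table(
    index="region",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
    fill_value=0,  # used for region/quarter pairs with no rows
)

print(table)
```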
10. How do you optimize pandas performance?
Answer: Avoid Python loops, vectorize operations, use smaller or categorical dtypes, and read large files in chunks.
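The sketch below touches each of those points on toy data. The column names are assumptions, and the ten-row CSV only stands in for a file too large to load at once.

```python
import io
import pandas as pd

df = pd.DataFrame(
    {"qty": [1, 2, 3], "price": [2.0, 3.0, 4.0], "status": ["new", "new", "done"]}
)

# Vectorized arithmetic on whole columns instead of a Python loop over rows.
df["total"] = df["qty"] * df["price"]

# Smaller dtypes: downcast integers, store repeated strings as categories.
df["qty"] = pd.to_numeric(df["qty"], downcast="integer")
df["status"] = df["status"].astype("category")

# Chunking: process a large CSV piece by piece instead of loading it whole.
csv_text = "value\n" + "\n".join(str(i) for i in range(10))
running_total = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    running_total += int(chunk["value"].sum())

print(df.dtypes)
print(running_total)  # 45
```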
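11. What is the caveat with apply()?
Answer: It can be slow, especially row-wise with axis=1, because it calls a Python function once per element or row; prefer vectorized or built-in pandas operations first.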
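To make the caveat concrete, here is the same calculation written both ways on a tiny frame. The results match; on large data the vectorized form is typically much faster, though the actual gap depends on the workload.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Row-wise apply: the lambda runs once per row in Python.
slow = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# Vectorized: one column-level multiplication handled inside pandas/NumPy.
fast = df["price"] * df["qty"]

print(slow.tolist())
print(fast.tolist())
```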
12. Pandas in one line?
Answer: Pandas is the go-to toolkit for data cleaning and tabular analysis in Python.