Pandas Interview Q&A

1. What is pandas?

Answer: A Python library for structured data analysis using Series and DataFrame.

2. Series vs DataFrame?

Answer: Series is 1D labeled data, DataFrame is 2D tabular labeled data.
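
A minimal sketch of the difference, using made-up grocery data (the names prices and items are illustrative assumptions, not from any real dataset):

```python
import pandas as pd

# A Series is a single labeled column of values (illustrative data).
prices = pd.Series([1.5, 2.0, 3.25], index=["apple", "bread", "milk"], name="price")

# A DataFrame is a table: several columns sharing one row index.
items = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [4, 1, 2]},
    index=["apple", "bread", "milk"],
)

# Selecting one column of a DataFrame gives back a Series.
price_column = items["price"]

print(prices.ndim, items.ndim)
print(type(price_column).__name__)
```

A useful way to say it in an interview: a DataFrame behaves like a collection of Series that share the same index.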

3. How do you read CSV quickly?

Answer: Use pd.read_csv() with dtype hints, parse_dates, and usecols where possible.
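
The sketch below shows one way these three options might be combined. The CSV content and column names are invented for illustration, and the file is read from an in-memory string so the example runs on its own; with a real file you would pass a path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = """order_id,customer,amount,created_at,notes
1,alice,10.50,2024-01-05,first order
2,bob,20.00,2024-01-06,
3,alice,7.25,2024-01-07,gift
"""

orders = pd.read_csv(
    io.StringIO(csv_text),
    # Skip columns that are not needed (here, the free-text "notes").
    usecols=["order_id", "customer", "amount", "created_at"],
    # Declare dtypes up front so pandas does not have to infer them.
    dtype={"order_id": "int32", "customer": "category", "amount": "float32"},
    # Parse the timestamp column while reading.
    parse_dates=["created_at"],
)

print(orders.dtypes)
```

Reading fewer columns and smaller dtypes reduces both parse time and memory; how much it helps depends on the file.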

4. What are loc and iloc?

Answer: loc is label-based indexing, iloc is integer position-based indexing.
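
A small illustration with made-up data. The index uses integer labels (10, 20, 30) on purpose, since that is where the two accessors are most often confused:

```python
import pandas as pd

# Illustrative frame whose row labels are integers, not positions.
df = pd.DataFrame({"score": [90, 75, 60]}, index=[10, 20, 30])

by_label = df.loc[10, "score"]   # row whose label is 10
by_position = df.iloc[0, 0]      # first row, first column

# Label slices include both endpoints; position slices exclude the stop.
label_slice = df.loc[10:20]      # rows labeled 10 and 20
position_slice = df.iloc[0:1]    # only the first row

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```

The endpoint difference in slicing is a common follow-up question.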

5. How to filter rows?

Answer: Apply boolean masks like df[df["col"] > 10].
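
A short sketch of single and combined masks. The "kind" column is a hypothetical addition for the example:

```python
import pandas as pd

# Illustrative data; "kind" is an assumed second column.
df = pd.DataFrame({"col": [5, 12, 30, 8], "kind": ["a", "b", "a", "b"]})

# A mask is a boolean Series aligned with the rows.
mask = df["col"] > 10
over_ten = df[mask]

# Combine conditions with & (and) or | (or); each needs its own parentheses.
over_ten_kind_a = df[(df["col"] > 10) & (df["kind"] == "a")]

print(over_ten)
print(over_ten_kind_a)
```

Python's plain `and`/`or` do not work element-wise on Series, which is why the bitwise operators are used here.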

6. GroupBy purpose?

Answer: Split data by keys, apply aggregates, and combine results.
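
The split-apply-combine idea might look like this on a tiny, invented sales table (column names are assumptions):

```python
import pandas as pd

# Illustrative data only.
sales = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west"],
        "revenue": [100, 200, 50, 25],
    }
)

# Split by region, apply sum, combine into one Series indexed by region.
totals = sales.groupby("region")["revenue"].sum()

# Several aggregates at once, using named aggregation.
summary = sales.groupby("region").agg(
    total=("revenue", "sum"),
    average=("revenue", "mean"),
)

print(totals)
print(summary)
```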

7. Merge vs concat?

Answer: Merge joins by keys; concat stacks objects across rows or columns.
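
A side-by-side sketch with hypothetical customer and order tables:

```python
import pandas as pd

# Illustrative tables; names and values are made up.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ann", "Bo", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3], "amount": [10, 20, 5]})

# merge: match rows on a key, like a SQL join.
joined = customers.merge(orders, on="customer_id", how="inner")

# concat: stack frames that share the same columns, one after the other.
january = pd.DataFrame({"amount": [1, 2]})
february = pd.DataFrame({"amount": [3]})
stacked = pd.concat([january, february], ignore_index=True)

print(joined)
print(stacked)
```

Customer 2 has no orders, so the inner join drops that row; `how="left"` would keep it with a missing amount.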

8. How to handle missing values?

Answer: Use isna(), dropna(), or fillna() depending on context.
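
One way the three calls might be used together, on invented data. Filling age with the column mean is only an example policy, not a recommendation for every dataset:

```python
import numpy as np
import pandas as pd

# Illustrative frame with one missing value per column.
df = pd.DataFrame({"age": [25, np.nan, 40], "city": ["Oslo", None, "Lima"]})

# Detect: count missing values in each column.
missing_per_column = df.isna().sum()

# Drop: keep only rows with no missing values.
complete_rows = df.dropna()

# Fill: choose a replacement per column (assumed policy for this sketch).
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(missing_per_column)
print(filled)
```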

9. What is a pivot table?

Answer: A summarized table using rows, columns, and aggregation functions.
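
A minimal sketch with a made-up sales table, summing revenue by region (rows) and product (columns):

```python
import pandas as pd

# Illustrative data only.
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west", "east"],
        "product": ["pen", "ink", "pen", "ink", "pen"],
        "revenue": [10, 20, 30, 40, 5],
    }
)

# Rows come from "region", columns from "product", cells are summed revenue.
table = pd.pivot_table(
    sales,
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
)

print(table)
```

The two "east"/"pen" rows collapse into a single cell, which is the aggregation step that distinguishes a pivot table from a plain reshape.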

10. How to optimize pandas performance?

Answer: Avoid loops, vectorize ops, optimize dtypes, and use chunking for big files.
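
The sketch below touches three of these ideas on toy data: a vectorized column operation, smaller dtypes, and chunked reading. The data and the chunk size of 4 are arbitrary assumptions, and the "file" is an in-memory string so the example is self-contained.

```python
import io

import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0], "qty": [2, 3, 4], "color": ["red", "blue", "red"]}
)

# Vectorize: operate on whole columns instead of looping over rows.
df["total"] = df["price"] * df["qty"]

# Optimize dtypes: downcast integers, store repeated strings as categories.
df["qty"] = pd.to_numeric(df["qty"], downcast="integer")
df["color"] = df["color"].astype("category")

# Chunking: process a large CSV piece by piece rather than loading it whole.
csv_text = "value\n" + "\n".join(str(i) for i in range(10))
running_total = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    running_total += int(chunk["value"].sum())

print(df.dtypes)
print(running_total)
```

Chunking suits aggregations that can be accumulated incrementally, such as sums and counts.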

11. What is the caveat with apply()?

Answer: It can be slow; prefer vectorized/native pandas operations first.
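
To make the trade-off concrete, here is the same calculation written both ways on invented data. Both give the same values; the row-wise version calls a Python function once per row, while the vectorized version works on whole columns:

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"price": [1.0, 2.0, 3.0], "qty": [2, 3, 4]})

# Row-wise apply: flexible, but runs a Python-level call for every row.
row_wise = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# Vectorized equivalent: one column-level multiplication.
vectorized = df["price"] * df["qty"]

print(row_wise.tolist())
print(vectorized.tolist())
```

apply() remains reasonable when no vectorized equivalent exists; the advice is about reaching for it last, not never.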

12. Pandas in one line?

Answer: Pandas is the go-to toolkit for data cleaning and tabular analysis in Python.
