PwC Azure Data Engineer Interview Q/A

I handle data skew using techniques like salting, broadcast joins, and repartitioning so that skewed keys are spread evenly across executors. I also analyze the Spark UI to identify skewed stages that cause long execution times. In production pipelines, I optimized joins and reduced shuffle operations to improve overall processing performance. repartition() increases or decreases partitions with a full shuffle and provides […]
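To make the salting idea concrete, here is a minimal sketch in plain Python rather than Spark, so it runs without a cluster. The data, the `NUM_SALTS` value, and the `salted_join` helper are all hypothetical and only illustrate the technique: a random salt is added to the skewed side, the small side is replicated once per salt value, and the join runs on the composite (key, salt).

```python
import random
from collections import Counter

NUM_SALTS = 4  # hypothetical salt range; in practice tuned to the observed skew

# Skewed "fact" side: customer "C1" dominates the data (made-up sample rows).
orders = [("C1", i) for i in range(1000)] + [("C2", 1000), ("C3", 1001)]
# Small "dimension" side.
customers = [("C1", "Alice"), ("C2", "Bob"), ("C3", "Carol")]


def salted_join(orders, customers, num_salts=NUM_SALTS, seed=0):
    """Join orders to customers on a salted key and report rows per salted key."""
    rng = random.Random(seed)

    # 1. Attach a random salt to every key on the skewed side.
    salted_orders = [
        ((key, rng.randrange(num_salts)), order_id) for key, order_id in orders
    ]

    # 2. Replicate each row of the small side once per salt value.
    lookup = {
        (key, salt): name
        for key, name in customers
        for salt in range(num_salts)
    }

    # 3. Join on the composite (key, salt), then drop the salt from the output.
    joined = [
        (key, order_id, lookup[(key, salt)])
        for (key, salt), order_id in salted_orders
        if (key, salt) in lookup
    ]

    # Row count per salted key: the hot key is now split into several smaller groups.
    rows_per_salted_key = Counter(salted_key for salted_key, _ in salted_orders)
    return joined, rows_per_salted_key


joined, rows_per_salted_key = salted_join(orders, customers)
```

The join result is the same as an unsalted join, but the 1000 rows for "C1" are divided among several salted keys, which is what lets a distributed engine place them on different executors. The cost is that the small side grows by a factor of `NUM_SALTS`.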


Deloitte Azure Data Engineer Interview Q/A

I would use DENSE_RANK() ordered by salary descending and filter where the rank equals 3. Unlike a simple TOP or LIMIT query, this approach handles duplicate salaries correctly. Window functions are preferred because they are scalable and easier to maintain. INNER JOIN returns matching records from both tables and is mostly used in transactional reporting. LEFT […]
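The DENSE_RANK() approach can be sketched with Python's built-in sqlite3 module. The `employees` table, its columns, and the sample rows are made up for illustration, and the sketch assumes the bundled SQLite is version 3.25 or later, which is when window functions became available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Asha", 9000), ("Ben", 9000),   # tie for the highest salary
        ("Chen", 8000),                  # second-highest distinct salary
        ("Dev", 7000), ("Eli", 7000),    # tie for the third-highest salary
        ("Fay", 6000),
    ],
)

# DENSE_RANK gives tied salaries the same rank and leaves no gaps,
# so rank 3 is the third-highest distinct salary.
THIRD_HIGHEST_SQL = """
SELECT name, salary
FROM (
    SELECT name,
           salary,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM employees
) AS ranked
WHERE salary_rank = 3
ORDER BY name
"""
third_highest = conn.execute(THIRD_HIGHEST_SQL).fetchall()

# For contrast: skipping two rows lands on the third row, not the third salary.
naive_third = conn.execute(
    "SELECT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 2"
).fetchone()

print(third_highest)  # [('Dev', 7000), ('Eli', 7000)]
print(naive_third)    # (8000,)
```

With two employees tied at 9000, the LIMIT/OFFSET query returns 8000, which is the second-highest distinct salary, while the DENSE_RANK() query returns both employees earning 7000.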


KPMG Azure Data Engineer Interview Q/A

I worked on Azure-based Data Engineering projects involving ADF, Databricks, ADLS, and PySpark. My role included building ingestion pipelines, developing transformation logic, optimizing Spark jobs, and handling deployments through Azure DevOps. In my recent project, we processed large-scale transactional data and built reporting-ready Gold layer datasets for business teams. ADF mainly supports Schedule Trigger, Tumbling […]


Tiger Analytics Azure DE Interview Experience

I worked on Azure-based Data Engineering projects involving ADF, Databricks, ADLS, and PySpark. My role included building ingestion pipelines, developing transformation logic, optimizing Spark jobs, and handling deployments through Azure DevOps. In my recent project, we processed large-scale transactional data and built reporting-ready Gold layer datasets for business teams. ADF mainly supports Schedule Trigger, Tumbling […]
