Data Engineer

Big Data Development, Montreal, Quebec, Canada

SUMMARY OF SKILLS AND QUALIFICATIONS
• Programming: Python, Java, Scala
• Big Data Technologies: Apache Spark, Hadoop, Hive (HQL), Sqoop
• Cloud Services (AWS): S3, EMR, Redshift, Athena, Aurora DB
• ETL Tools: Ab Initio, Microsoft SSIS
• Database: Oracle 11g/12c, Microsoft SQL Server
• Applications: Jupyter Notebook, PyCharm, Eclipse, Oracle SQL Developer, SQL Workbench, TOAD
• Data Science: Dask, pandas, scikit-learn, matplotlib
• Other: GitHub, HP ALM, Microsoft TFS, GCP(AWS)

PROFESSIONAL EXPERIENCE
Data Engineer, Infosys Ltd., Bengaluru, India 2015 - 2017
Client – American Family Insurance
Project: Data Lake on Cloud
• Worked on a 45-node Hadoop cluster.
• Developed data pipelines in Spark to migrate 200 TB of data from on-premises data centres to a cloud data lake.
• Created analytics pipelines to ingest data into a Redshift warehouse for analytics and reporting.
• Scheduled batch and incremental jobs in Airflow for near-real-time (NRT) data loads.
• Worked on Aurora DB for schema loads and scheduled ETL jobs in AWS Glue.
• Optimized MapReduce code and Sqoop jobs, reducing execution time by a factor of 10.
• Drove defect triage meetings, delivered daily status reports to the onshore team, ensured effective communication in team meetings, and escalated identified risks.
• Helped onboard new team members with varying levels of experience, mentored them, and conducted knowledge transfer sessions.
ETL Automation Test Engineer, Infosys Ltd., Mysore, India September 2017 – June 2018
Client – SunTrust Banks, Inc.
Project: Data Warehouse for Reporting
• Developed an automation framework in Spark for data pipeline validation and NRT load validation.
• Analyzed the requirements, mappings, data flow, transformation logic, lookup systems, and source and target output types of files and tables.
• Developed SQL queries to validate business rules in the EDW, Oracle, and BI reports.
• Developed automation scripts using the iDTW automation tool to validate the data integration process, statistical analysis, and BI reports.

Data Analyst (Project Trainee), Tata Consultancy Services, Delhi, India June 2014 – August 2014
Client – Government organisation and PSB
• Provided a detailed report on daily logs captured from the systems in the examination.
• Optimized the data collection and integration process.