Data Engineer
Summary
SUMMARY OF SKILLS AND QUALIFICATIONS
•	Programming: Python, Java, Scala
•	Big Data Technologies: Apache Spark, Hadoop, Hive (HQL), Sqoop
•	Cloud Services (AWS): S3, EMR, Redshift, Athena, Aurora DB.
•	ETL Tools: Ab Initio, Microsoft SSIS.
•	Database: Oracle 11g/12c, Microsoft SQL Server.
•	Applications: Jupyter Notebook, PyCharm, Eclipse, Oracle SQL Developer, SQL Workbench, TOAD.
•	Data Science: Dask, pandas, scikit-learn, matplotlib.
•	Other: GitHub, HP ALM, Microsoft TFS, GCP(AWS)
PROFESSIONAL EXPERIENCE
Data Engineer, Infosys Ltd., Bengaluru, India						2015 - 2017
Client – American Family Insurance
Project: Data Lake on Cloud
•	Worked on a 45-node Hadoop cluster.
•	Developed data pipelines in Spark to migrate 200 TB of data from on-premises data centres to a cloud data lake.
•	Created pipelines to ingest data into a Redshift warehouse for analytics and reporting.
•	Scheduled batch and incremental jobs in Airflow for near-real-time (NRT) data loads.
•	Worked on Aurora DB for schema loads and scheduled ETL jobs in AWS Glue.
•	Optimized MapReduce code and Sqoop jobs, cutting execution time by a factor of 10.
•	Led defect triage meetings, delivered daily status reports to the onshore team, ensured effective communication in team meetings, and escalated identified risks.
•	Helped onboard new team members with varying experience, mentored them, and ran knowledge transfer sessions.
ETL Automation Test Engineer, Infosys Ltd., Mysore, India	September 2017 – June 2018
Client – SunTrust Banks, Inc.
Project: Data Warehouse for Reporting
•	Developed an automation framework in Spark for data pipeline and NRT load validation.
•	Analyzed requirements, mappings, data flow, transformation logic, lookup systems, and source and target file/table output types.
•	Developed SQL queries to validate business rules in EDW, Oracle and BI Reports.
•	Developed automation scripts using the iDTW automation tool to validate data integration processes, statistical analysis, and BI reports.
Data Analyst (Project Trainee), Tata Consultancy Services, Delhi, India	June 2014 – August 2014
Client – Government organisation and PSB
•	Provided a detailed report on daily logs captured from the systems in the examination.
•	Optimized data collection and integration process.
Expectations
Challenging projects and a healthy work environment.
Employment Preferences
Expected Base Salary
**,000 CAD
Skills
- Programming
- Python
- Java
- Scala
- Big Data Technologies
- Apache Spark
- Hadoop
- Hive
- HQL
- Sqoop
- Cloud Services
- AWS
- S3
- EMR
- Redshift
- Athena
- Aurora DB
- ETL Tools
- Ab Initio
- Microsoft SSIS
- Database
- Oracle 11g/12c
- Microsoft SQL Server
- Applications
- Jupyter Notebook
- PyCharm
- Eclipse
- Oracle SQL Developer
- SQL Workbench
- TOAD
- Data Science
- Dask
- Pandas
- Scikit-learn
- Matplotlib
- GitHub
- HP ALM
- Microsoft TFS
- GCP
