Data Engineering

Summary

Client: Walmart   March 2019 to Present
Location: Bentonville, AR
Role: Big Data Engineer
Description: The project supports Walmart's international supply chain management, in which suppliers and vendors drive Walmart's international business and decisions are made on profit, margin, quality, and other key factors.
Responsibilities:
Hands-on experience with file formats such as Avro, Parquet, SequenceFile, and text, converting between them and loading the data into Hive tables.
Hands-on experience with the Hortonworks cluster.
Experience with ETL processing and loading data into SAP HANA, various servers, and HDFS.
Performed preprocessing and cleansing of text files before loading the data into the target destinations.
Implemented partitioning, dynamic partitioning, and bucketing on Hive tables (a sketch follows this project's environment list).
Designed and performed CRUD operations on SAP HANA and Hive to carry out data loads.
Performed day-to-day Git operations and kept branches up to date.
Analyzed job log files to identify the root cause of issues and addressed them.
Analyzed scripts and modified them to meet data-movement requirements.
Worked with the CA7 mainframe job scheduler to schedule time-triggered and file-triggered jobs.
Supported and resolved production issues in parallel with project work.
Worked with source systems such as MySQL, Teradata, Informix, and SAP HANA.
Hands-on experience with compression techniques (Snappy) in HQL.
Hands-on experience writing HQL scripts, since most of the Walmart data was structured.
Experience with data formats such as Avro, Parquet, and ORC, and with compression formats such as 7-Zip and Z-zip.
Analyzed and ran PySpark scripts to execute jobs and complete data loads.
Improved HDFS performance and storage efficiency by applying various compression techniques.
Knowledge of importing and exporting data with Sqoop.
Used Spark with Scala, working with DataFrames/SQL/Datasets as well as RDDs/MapReduce for data aggregation and queries.
Gained working knowledge of the Talend tool.
Reviewed reports in Tableau to support decision-making.
Worked with the Scrum team to deliver agreed user stories on time in every sprint.

Environment: Hadoop ecosystem, HDFS, MapReduce, Hive, Sqoop, Teradata, Informix, SAP HANA, Tumbleweed, MySQL, GitHub, CA7 Scheduler, PuTTYgen, PuTTY, Tableau.
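
A minimal Spark-on-Scala sketch of the Hive loading pattern described above (dynamic partitioning, bucketing, Snappy-compressed Parquet, and a simple DataFrame aggregation) is shown below. The database, table, column names (supply_chain.shipments, ship_country, vendor_id, margin_usd) and paths are hypothetical placeholders, not the actual Walmart schema.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object HiveLoadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-load-sketch")
      .enableHiveSupport()                 // use the Hive metastore for saveAsTable
      .getOrCreate()

    // Dynamic partitioning and Snappy compression, as described above
    spark.conf.set("hive.exec.dynamic.partition", "true")
    spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
    spark.conf.set("spark.sql.parquet.compression.codec", "snappy")

    // Hypothetical source: cleansed, delimited text files landed on HDFS
    val shipments = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/data/landing/shipments")

    // Write Snappy-compressed Parquet into a table partitioned by country
    // and bucketed by vendor, so downstream HQL can prune and join cheaply
    shipments.write
      .mode(SaveMode.Overwrite)
      .format("parquet")
      .partitionBy("ship_country")
      .bucketBy(16, "vendor_id")
      .sortBy("vendor_id")
      .saveAsTable("supply_chain.shipments")

    // Simple DataFrame aggregation of the kind used for margin/profit reporting
    shipments.groupBy("ship_country")
      .sum("margin_usd")
      .show()

    spark.stop()
  }
}
```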

Client: HMS   Nov 2017 to Feb 2019
Location: Dallas, TX
Role: Spark Developer
Description: HMS provides the broadest range of cost containment solutions in healthcare to help payers and at-risk providers improve performance.
Responsibilities:
Hands-on experience with file formats such as Avro, Parquet, SequenceFile, and text, converting between them and loading the data into Hive tables.
Developed Spark and Hive jobs to apply business rules and logic and to transform data.
Hands-on experience with compression techniques in HQL.
Migrated the existing architecture to Spark Streaming to process live streaming data.
Responsible for Spark Core configuration based on the type of input source.
Converted Hive/SQL queries into Spark transformations using Spark DataFrames and Scala (a sketch follows this list of responsibilities).
Imported data from sources such as cloud services and the local file system into Spark RDDs, and worked with Amazon Web Services (EMR, S3, EC2, Lambda).
Strong knowledge of the Kappa and Lambda architectures.
Experience writing SQL queries to process data with Spark SQL in Scala.
Used Spark Structured Streaming to transform data in the data lake, reading from Kafka and writing to HDFS (a streaming sketch follows this project's environment list).
Wrote Pig scripts to transform, filter, and group data and store it in HDFS.
Created Cassandra tables to store variable data formats coming from different portfolios.
Wrote shell scripts to automate everyday processes.
Implemented workflows with the Apache Oozie framework to automate tasks.
Experience with data formats such as Avro, Parquet, and ORC, and with compression formats such as 7-Zip and Z-zip.
Used Oozie, a scheduler similar to Airflow, to schedule and monitor jobs.
Implemented Spark jobs in Scala and also used PySpark (Python) for faster testing and processing of data.
Improved HDFS performance and storage efficiency by applying various compression techniques.
Worked with the Scrum team to deliver agreed user stories on time in every sprint.
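
Below is a minimal sketch of the Hive-to-DataFrame conversion mentioned above, showing one HQL query run as-is through Spark SQL and the equivalent DataFrame transformations in Scala. The table and column names (claims, payer_id, billed_amount, claim_status) are hypothetical, chosen only to illustrate the pattern.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum}

object HiveToDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-dataframe-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // The original HQL, runnable unchanged through Spark SQL
    val viaSql = spark.sql(
      """SELECT payer_id, SUM(billed_amount) AS total_billed
        |FROM claims
        |WHERE claim_status = 'PAID'
        |GROUP BY payer_id""".stripMargin)

    // The same query expressed as DataFrame transformations in Scala
    val viaDataFrame = spark.table("claims")
      .filter(col("claim_status") === "PAID")
      .groupBy("payer_id")
      .agg(sum("billed_amount").as("total_billed"))

    // Both produce the same result; the DataFrame form composes more
    // easily with further Scala transformations.
    viaSql.show()
    viaDataFrame.show()

    spark.stop()
  }
}
```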

Environment: Hadoop, HDFS, MapReduce, Hive, Scala, Spark, Kafka, ZooKeeper, Sqoop, Oozie, MongoDB, MySQL, CSV, AWS EMR, S3, EC2 instances, Avro, Parquet, Sequence, GitHub, PuTTYgen.
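
The Structured Streaming work above (Kafka in, HDFS out) can be summarized with the sketch below. The broker addresses, topic name, and HDFS paths are hypothetical placeholders, and the Kafka source additionally requires the spark-sql-kafka package on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.streaming.Trigger

object KafkaToHdfsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-hdfs-sketch")
      .getOrCreate()

    // Read the live stream from a (hypothetical) Kafka topic
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
      .option("subscribe", "claims-events")
      .option("startingOffsets", "latest")
      .load()
      // Kafka delivers key/value as binary; cast the payload to text
      .select(col("value").cast("string").as("payload"))

    // Write micro-batches to the data lake on HDFS as Parquet, with a
    // checkpoint so the job can recover after failures
    val query = events.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/lake/claims_events")
      .option("checkpointLocation", "hdfs:///checkpoints/claims_events")
      .trigger(Trigger.ProcessingTime("1 minute"))
      .start()

    query.awaitTermination()
  }
}
```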

Client: Sreemantra Technologies Pvt. Ltd Sep 2016 to Jun 2017
Location: Hyderabad, IND
Role: Java/Hadoop Developer
Description: The project involved developing an online banking application that provides bank customers with features such as managing accounts, applying for a new account, viewing balances, transferring funds, and viewing account summaries.
Responsibilities:
Installed and configured JDK, Hadoop, Pig, Sqoop, Hive, and HBase in a Linux environment; assisted with performance tuning and monitoring.
Used Sqoop to load data from MySQL into HDFS on a regular basis.
Created reports for the BI team using Sqoop to export data into HDFS and Hive.
Created MapReduce programs to parse data for claim report generation and ran the JARs on Hadoop; coordinated with the Java team on these MapReduce programs (a sketch follows this project's responsibilities).
Created Pig scripts for most modules to provide a comparative effort estimate for code development.
Wrote MapReduce programs in Java.
Created Hive queries to process large sets of structured, semi-structured, and unstructured data and stored the results in managed and external tables.
Created HBase tables to store variable data formats coming from different portfolios.
Shared the knowledge of Hadoop concepts with team members.
Used JUnit for unit testing and Continuum for integration testing.
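
The claim-report MapReduce work above was done in Java; the sketch below shows the same Mapper/Reducer/driver pattern against the standard Hadoop API, written in Scala only to keep a single language across the code sketches in this document. The class names, CSV layout, and column position are hypothetical.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Map each claim record to (claimType, 1); the CSV layout is hypothetical
class ClaimTypeMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val claimType = new Text()

  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
    val fields = value.toString.split(",")
    if (fields.length > 2) {            // claim type assumed to sit in the third column
      claimType.set(fields(2).trim)
      context.write(claimType, one)
    }
  }
}

// Sum the counts per claim type for the report
class ClaimTypeReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      context: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    var total = 0
    val it = values.iterator()
    while (it.hasNext) total += it.next().get()
    context.write(key, new IntWritable(total))
  }
}

// Job driver: input and output HDFS paths are passed on the command line
object ClaimReportJob {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "claim-report")
    job.setJarByClass(classOf[ClaimTypeMapper])
    job.setMapperClass(classOf[ClaimTypeMapper])
    job.setReducerClass(classOf[ClaimTypeReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```
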
Client: Win It Solutions Jan 2015 to Aug 2016
Location: Hyderabad, IND
Role: Java Developer
Description: The project involved developing an online banking application that provides bank customers with features such as managing accounts, applying for a new account, viewing balances, and transferring funds.
Responsibilities:
Extensively involved in different stages of the Agile development cycle, including detailed analysis, design, development, and testing.
Designed the database, worked on DB2, and executed DDL and DML statements.
Implemented front-end features such as web page design, data binding, and single-page applications using HTML/CSS, JavaScript, jQuery, and AJAX.
Used jQuery libraries to simplify front-end programming; performed user input validation with JavaScript and jQuery.
Thoroughly documented the detailed process flow with UML diagrams and flow charts for distribution across various teams.
Used JIRA for project tracking and Git for version control.

Employment Preferences

Expected Hourly Rate: ** USD/hr
Academic Degree

Experience

Total Professional Experience: 6 years
Startup Experience: 6 years
Big-Tech Companies: no experience
Enterprise Experience: no experience