Data Scientist
Education
Indian Institute of Information Technology, Gwalior Madhya Pradesh, India
Integrated Post Graduate (B.Tech+M.Tech) - Information Technology July 2019 - May 2024
Coursework: Artificial Intelligence, Machine Learning, Neural Computing, NLP, DBMS, Data Structures
Work Experience
Data Science Intern Gurugram, India | On-Site
Newgen Software | Homepage Jan 2024 - June 2024
Developed RAG (Retrieval-Augmented Generation) pipelines, including data splitting techniques, pre-retrieval
techniques, and advanced retrieval methods
Data Generation: Created a dataset of 6,000 query-corpus pairs from 300 documents using the Mistral-7B LLM.
Embedding Fine-tuning: Fine-tuned the GTE-large embedding model on the generated query-corpus dataset, improving
NDCG@10 from 45.52 to 54.31.
Instruction fine-tuned Mistral-7B LLM for better response generation, enhancing retrieval performance by 20%.
Employed a Vision Grid Transformer for layout understanding to segment documents for efficient storage in
Milvus vector store.
AI Software Developer Intern Hyderabad, India | Remote
AmberFlux | Homepage June 2023 - August 2023
Engineered two products: a PDF-centered Q&A system and an Audio Transcript Analyzer.
Applied transformer-based models to speech-to-text conversion and used the resulting transcripts to
generate insights and power Q&A functionality.
Implemented chained conversation for enhanced responses using the LangChain framework.
Tools Used: LangChain, Hugging Face, AWS, Whisper, Python 3.9+, ChromaDB
ML Pioneer Bengaluru, India | Remote
MarkovML | Homepage | MLOps May 2023 - June 2023
Spearheaded the development of a dialogue-based system that incorporates emotional context to
enhance responses, using the MarkovML platform.
Used the MELD dataset, containing 1,400 dialogues from actors labeled with 7 types of emotion.
Implemented emotion recognition and speech-to-text using 2D neural networks and transformers. Also created
4,800+ samples pairing emotion and conversation from alpaca-x to fine-tune the Llama-2 7B transformer.
Tools Used: Hugging Face, PyTorch 2.0+, Transformers 4.2+, Python 3.9+, MarkovML
Skills
Languages: Python, SQL, C++, JavaScript
Frameworks: TensorFlow, Keras, PyTorch, Hugging Face, Tableau, LangChain, fast.ai, AWS, Apache NiFi
Tools: Git, GitHub, Anaconda, VS Code, Kaggle
Projects
Brain Tumor Segmentation | Image Processing Dec 2021 - Feb 2022
Objective: To segment brain tumors, aiding surgical decision-making.
Used the BraTS 2020 dataset, with 369 cases and four MRI modalities.
Built a 3D U-Net model tailored for brain tumor segmentation, trained for 50 epochs.
Achieved a mean IoU of 0.56 during training and 0.52 during validation.
Tools Used: TensorFlow 2.0+, NumPy, Matplotlib
Multilingual Extreme Summarization | NLP June 2022 - September 2022
Objective: To summarize and translate research articles in multiple languages using mBART transformer models.
Used the X-SCITLDR dataset, which included 1992 training papers, 619 development papers, and 618 test papers
across German, Spanish, and French.
Fine-tuned mBART for 25 epochs for summarization and translation of articles.
Attained mean ROUGE scores across all languages: ROUGE-1 (30.10), ROUGE-2 (10.20), ROUGE-L (21.54),
and F1 (70.68).
Tools Used: Transformers 4.2+, PyTorch 1.7+, Python 3.7+, Hugging Face, fast.ai
Expectations
A role in the AI and ML domain.
Employment Preferences
Relocation destinations:
- Bangalore, Karnataka, India
- Hyderabad, Telangana, India
Expected Base Salary
*,*00,000 INR
