A data-crazy person, be it doing some exploratory analysis on Kaggle, deploying a really deep neural network, or doing regression in Excel!
I have 5 years of work experience in Data Engineering and Data Science. I am a highly skilled Data Engineer with extensive experience in developing and optimizing data pipelines, leveraging tools such as Databricks, Azure DevOps, and Terraform. I have a knack for the energy and finance sectors and a proven track record in improving data quality, creating data models, and implementing advanced forecasting models. My expertise covers cloud migration, DataOps, MLOps, and building scalable data solutions across various industries. I am adept at collaborating with clients to deliver impactful data-driven insights and solutions.
May 2023 - Current
Data Engineering
- Leveraged dbt and Airflow to create a near-real-time ETL pipeline supporting a data product for a commodity trading desk. Used dbt's data quality checks to build a data quality framework that identifies issues with KDE.
- Led the data engineering workstream for a global hotel client, architecting the data layer with Snowflake dynamic tables and building a ThoughtSpot semantic layer to enable natural language querying; reduced General Managers’ time to uncover insights on customer surveys, service requests, and revenue by 80% across hotels worldwide.
- Developed Delta Live Tables (DLT) pipelines in Databricks to aggregate and enhance customer and vehicle data for an automobile client. Monitored data quality metrics using Great Expectations, resulting in an 18% improvement in data quality across domains.
- Collaborated with clients to design a comprehensive data model that integrates customer information from six different sources. Created customer segments using advanced clustering techniques to enable targeted marketing campaigns.
- Engineered advanced Glue pipelines to ingest and aggregate data from Snowflake, subsequently feeding it into a Large Language Model (LLM) for a firm-wide, high-performance search retrieval application.
DataOps
- Created a DataOps platform that uses Azure Data Factory (ADF) for orchestration and Azure Synapse for compute, optimizing data transformation and processing across medallion architecture layers.
- Helped a packaging client set up a governance and cataloging platform using Microsoft Purview, an Azure-native catalog tool, which resulted in a 65% adoption rate in the first 3 months.
August 2022 - Dec 2022
Grew tremendously while working in an extremely small team setting.
- Collaborated with the investment team to create forecasting models that predict client requests up to 3 years in advance.
- Implemented recent time series forecasting models such as Prophet and N-BEATS and integrated them with the firm’s ecosystem.
June 2022 - August 2022
- Worked on the end-to-end implementation of an ML monitoring dashboard using recent toolkits including Evidently AI, MLflow, and Grafana.
- Built scalable connector functions to the firm’s AutoML capability and deployed the dashboard using FastAPI.
July 2019 - Aug 2021
Worked in the Cloud ML/AI team
- Built an omni-channel chatbot and call-center capability for a client using Amazon Lex, Connect, and Lambda.
- Worked on NLP tasks such as custom NER models, sentiment analysis, topic modelling, question-answer detection, and classification.
- Prototyped an AutoML capability as an internal asset for regression, classification, and time series forecasting.
- Did a POC on posture recognition using Xception and ResNet models and deployed it as a Flask application.
- Played a vital role in a cloud migration project where we migrated 2B+ healthcare data points using tools like Apache Kafka and Apache NiFi.
Jan 2019 - March 2019
Worked for 3 months as an AI engineer. Built a facial verification system for the company, which provides services for conducting offline and online exams.
Jan 2017
Worked on building websites for clients using HTML, CSS, JS, and Bootstrap.
Aug 2021 - April 2023
GPA: 4.0/4.0
Related subjects: Deep Learning, Supervised Machine Learning, Unsupervised Learning, Data Management and Processing
July 2015 - June 2019
GPA: 7.9/10
Undergrad in Computer Science with a focus on Machine Learning and Big Data
Having worked with different technologies, here are some of the projects I have worked on:
An interesting blend of computer vision (ResNet) and NLP (DistilBERT), this project implemented two deep-learning-based models to retrieve images based on a user's text query.
Used an IBM dataset to predict customer churn with machine learning models such as XGBoost, a neural network, Random Forest, and a voting classifier, achieving an F1 score of 0.81. Deployed the model using Streamlit, where the user can train models with custom hyperparameters. Also used MLflow to track the hyperparameters.
Used Apache Kafka to stream Tweets about Trump and Biden and stored them in topics. Used a consumer to read from the topic and put the data in a MongoDB collection. Then used change streams to capture changes in the collection, performed sentiment analysis on the tweets, and plotted a pie chart every second showing who is leading the Twitter war.
We aim to automate traffic signal timers based on traffic at the previous junction. If there is more traffic at the current junction of the road, the timer at the next junction is shortened so that traffic there disperses quickly to make room for the incoming traffic. We use an object detection algorithm called YOLOv3 (You Only Look Once).
Using a lexicon-based approach, we performed sentiment analysis of the reviews on various websites and suggested to the user the website with the highest satisfaction percentage. Presented this paper at the RISE 2017 conference, where it was published in the journal.
Aimed at helping farmers with very little, and in some cases no, knowledge about the crops that could give them the best yield in a particular weather and place, it provides predictions with an accuracy of about 97.3%. It predicts one of the 13 most common crops in India depending on 5 different factors. It also gives farmers tips, based on the output, on how much more fertilizer, water, etc. to add to their fields.
"Never Stop Learning, because life never stops teaching"
I constantly strive to upskill myself with the help of certifications. I feel one should always keep learning. Here are some of my key certifications:
One of the beauties of entering the field of data science is that there is still an entire universe to be explored. I love to think outside the box about how gaining insights from data can solve human issues.
New York
niduttnb@gmail.com