A data-crazy person, whether it's doing exploratory analysis on Kaggle, deploying a really deep neural network, or running a regression in Excel!
I have 3+ years of work experience in Data Engineering and Data Science. A highly skilled Data Engineer with extensive experience in developing and optimizing data pipelines, leveraging tools such as Databricks, Azure DevOps, and Terraform. Proven track record in improving data quality, creating data models, and implementing advanced forecasting models. Expertise in cloud migration, DataOps, MLOps, and building scalable data solutions across various industries. Adept at collaborating with clients to deliver impactful data-driven insights and solutions.
May 2023 - Current
Data Engineering
- Developed Delta Live Tables (DLT) pipelines in Databricks to aggregate and enhance customer and vehicle data for an automobile client. Monitored data quality metrics using Great Expectations, resulting in an 18% improvement in data quality across domains.
- Collaborated with clients to create a data model that consolidates customer information from 6 different sources, then built customer segments using clustering to support targeted marketing campaigns.
- Leveraged Azure DevOps to create a CI/CD pipeline for the clients.
August 2022 - Dec 2022
Had tremendous growth working in an extremely small team setting.
- Collaborated with the investment team to create forecasting models that predict client requests up to 3 years in advance.
- Implemented recent time series forecasting models such as Prophet and N-BEATS and integrated them with the firm’s ecosystem.
June 2022 - August 2022
- Worked on the end-to-end implementation of an ML monitoring dashboard using toolkits including Evidently AI, MLflow, and Grafana.
- Built scalable connector functions to the firm’s AutoML capability and deployed the dashboard using FastAPI.
July 2019 - Aug 2021
Worked in the Cloud ML/AI team
- Built an omni-channel chatbot and call-center capability for a client using Amazon Lex, Connect, and Lambda.
- Worked on NLP tasks such as custom NER models, sentiment analysis, topic modelling, question-answer detection, and classification.
- Prototyped an AutoML capability as an internal asset for regression, classification, and time series forecasting.
- Did a POC on posture recognition using Xception and ResNet models and deployed it with Flask.
- Played a vital role in a cloud migration project where we migrated 2B+ healthcare data points using tools like Apache Kafka and Apache NiFi.
Jan 2019 - March 2019
Worked for 3 months as an AI engineer. Built a facial verification system for the company, which provides services for conducting offline and online exams.
Jan 2017
Built websites for clients using HTML, CSS, JS, and Bootstrap.
Aug 2021 - April 2023
GPA: 4.0/4.0
Related subjects: Deep Learning, Supervised Machine Learning, Unsupervised Learning, Data Management and Processing
July 2015 - June 2019
GPA: 7.9/10
Undergrad in Computer Science with a focus on Machine Learning and Big Data
Having worked on different technologies, here are some of the projects I have worked on:
An interesting blend of computer vision (ResNet) and NLP (DistilBERT), this project implements two deep-learning-based models to retrieve images based on a user's text query.
Used an IBM dataset to predict customer churn with machine learning models such as XGBoost, a neural network, Random Forest, and a voting classifier, reaching an F1 score of 0.81. Deployed the model using Streamlit, where users can train models with custom hyperparameters. Also used MLflow to track the hyperparameters.
Used Apache Kafka to stream tweets about Trump and Biden and stored them in topics. A consumer read from the topics and wrote the data to a MongoDB collection. Change streams then captured changes in the collection, sentiment analysis was performed on the tweets, and a pie chart was plotted every second showing who was leading the Twitter war.
We aim to automate traffic signal timers based on the traffic at the previous junction. If there is more traffic at the current junction of the road, the timer at the next junction will be shorter, so that the traffic can disperse quickly to suit the needs of the incoming traffic. We use an object detection algorithm called YOLOv3 (You Only Look Once).
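A minimal sketch of the timer rule described above, under stated assumptions: the vehicle count is taken as already available (for example from an object detector, which is not shown here), "timer" is read as the waiting time at the next junction, and the function name and every constant are hypothetical values chosen for illustration, not the project's actual settings.

```python
def next_junction_wait_seconds(vehicle_count, max_wait=90, min_wait=15,
                               seconds_per_vehicle=2):
    """Return an illustrative waiting time for the next junction.

    The wait shrinks linearly as the vehicle count at the current
    junction grows, and is clamped so it never drops below min_wait.
    All defaults are hypothetical.
    """
    if vehicle_count < 0:
        raise ValueError("vehicle_count must be non-negative")
    wait = max_wait - seconds_per_vehicle * vehicle_count
    return max(min_wait, wait)


if __name__ == "__main__":
    for count in (0, 10, 100):
        print(count, "vehicles:", next_junction_wait_seconds(count), "seconds")
```

A linear rule with a lower bound is only one possible choice; it keeps the behavior easy to reason about while guaranteeing a minimum waiting time for cross traffic.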
Using a lexicon-based approach, we performed sentiment analysis on reviews from various websites and suggested to the user the website with the highest satisfaction percentage. Presented this paper at the RISE 2017 conference, where it was published in the journal.
Aimed at helping farmers who have very little, or in some cases no, knowledge about which crops would give them the best yield in a particular weather and place, it provides predictions with an accuracy of about 97.3%. It predicts one of the 13 most common crops in India depending on 5 different factors. It also gives farmers tips based on the output, such as how much more fertilizer or water to add to their fields.
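As a rough illustration of a lexicon-based approach like the one described above, the sketch below scores each review by counting positive and negative words, then computes a satisfaction percentage per website. The word lists, function names, and sample reviews are all hypothetical; the lexicon and scoring used in the paper are not reproduced here.

```python
import re

# Tiny hypothetical lexicon, for illustration only.
POSITIVE = {"good", "great", "excellent", "fast", "love"}
NEGATIVE = {"bad", "poor", "slow", "terrible", "hate"}


def review_score(text):
    """Count positive words minus negative words in one review."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def satisfaction_percentage(reviews):
    """Percentage of reviews whose score is strictly positive."""
    if not reviews:
        return 0.0
    positive = sum(1 for review in reviews if review_score(review) > 0)
    return 100.0 * positive / len(reviews)


def best_website(reviews_by_site):
    """Return the site name with the highest satisfaction percentage."""
    return max(reviews_by_site,
               key=lambda site: satisfaction_percentage(reviews_by_site[site]))


if __name__ == "__main__":
    sample = {
        "site_a": ["Great and fast delivery", "Terrible support"],
        "site_b": ["Good price", "Love it", "slow shipping"],
    }
    print(best_website(sample))
```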
"Never Stop Learning, because life never stops teaching"
I constantly strive to upskill myself with the help of certifications. I feel one should always keep learning. Here are some of my key certifications:
One of the beauties of entering the field of data science is that there is an entire universe yet to be explored. I love to think outside the box about how gaining insights from data can solve human problems.
New York
niduttnb@gmail.com