Resume | Aditya Dave

Experience

Beyond Limits June 2022-Present

Software Engineer Glendale, CA

Implemented Kafka exactly-once processing in Flink Job, resolved a critical ETL pipeline issue causing Kafka record size overflow through a recursive fix. Collaborated with DevOps to optimize the CI/CD pipeline, reducing Flink job initialization time from 15 minutes to under 10 seconds.
Enhanced Flink pipeline efficiency by optimizing the reduce step, implementing custom keying logic on raw strings from the previous stage. Resulted in a substantial reduction in processing time for each Kafka record, contributing to overall performance improvements.
Designed and implemented an ML Pipeline utilizing Flink and Flink Stateful Functions for real-time streaming inference. Led a sub-team of one frontend and one backend engineer to seamlessly integrate this feature into the product. Additionally, architected a comprehensive system for on-demand ML model training, deployment, retraining, and streaming inference, intended for use across multiple products.
Contributed to cross-team evaluations of market-ready services, enhancing collaboration and evaluating tools for potential integration.
Enhanced product feature response time by implementing data analysis with Pandas and PyArrow, incorporating async/concurrent programming for fetching data from private APIs. Achieved a remarkable reduction in processing time, decreasing it from over 5 minutes to less than 15 seconds.
Initiated and developed a cross-product Python package using Poetry to facilitate seamless integration of machine learning features with respective product microservices.

Software Engineer Intern

Developed RESTful APIs in Spring Boot using TDD principles, encompassing both Unit and Integration Testing. Utilized JSON and ProtoBuf data formats for seamless communication with both internal and public-facing microservices.
Explored Airflow as a tool for running user-demand scheduled tasks and developed an on-demand schedule generator. This system dynamically processes user-generated data and schedules, effectively handling streaming data.
Implemented a scalable domain-specific language compilation and execution system in a Spring Boot Application using ANTLRv4. Deployed the system as Apache Flink Stateful Functions.
Significantly contributed to the improvement of Spring Boot and React.js + Typescript microservice templates. Implemented industry-leading coding practices, testing strategies, and analysis tools. Dockerized projects included integrated support for linting, codestyle validation, unit and integration tests using testcontainers, jacoco reports, and static code analysis using Sonar.
Spearheaded the creation of a master repository using git submodules for product microservices, optimizing collaboration and efficiency across development teams.
Collaborated with the ML team to establish a Python workflow template using MLFlow and cookiecutter, significantly enhancing development speed and consistency.

Crosstown LA September 2021-December 2021

Fullstack Engineer, Student Worker Los Angeles, CA

Collaborated in a team environment, contributing to both frontend (VueJS) and backend (Node.js) development of a newsletter generation web application.
Developed Python-based data scrapers for websites, CSVs, and Excel files, ensuring regular data updates via CRON jobs, resulting in a data update efficiency by 30%.
Led as a lead engineer for weekly newsletter releases, providing technical support to reporters and reduced newsletter release time by 25%
Facilitated onboarding of new engineers and seamlessly integrated Crosstown's core newsletter system into client systems, resulting in enhanced team cohesion.

DnT InfoTech LLP January 2020-March 2021

Senior Fullstack Engineer (Consultant) Ahmedabad, India

Led a 5-member team in establishing a collaborative environment and designing a scalable architecture and database design for VoIP-based meetings applications, addressing client requirements.
Developed a Python Flask application for data-intensive tasks, utilizing SQLAlchemy ORM for MySQL database interactions. Ensured code quality through rigorous testing with the pytest package and maintained a structured codebase.
Programmed in the MERN stack and Python (Flask and FastAPI) to create robust RESTful APIs for diverse functionalities.
Implemented automated tests using Selenium in PHP to validate the functionality of data-intensive web applications.
Managed AWS accounts and services as an AWS Certified Solutions Architect, optimizing costs while maximizing performance.

AhamAdroit October 2017-December 2019

Fullstack Developer Ahmedabad, India

Developed CMS portals for e-commerce websites and photo galleries using the MERN stack, ensuring seamless functionality and user-friendly interfaces.
Created reusable front-end modules in ReactJS, enhancing efficiency across multiple projects and promoting code consistency.
Optimized workflow by translating client requirements into code, implementing backend services, and achieving a comprehensive understanding of design-to-development processes.

Projects

Fairness Classification and Obligation Detection in ToS | Pytorch, HuggingFace, BERT, RoBERTa August 2022-December 2022

Extracted embeddings from the Terms of Service dataset using the pre-trained BERT-base-uncased model.
Trained SVM, LSTM, Bi-LSTM, GRU, and other RNN models using BERT embeddings to classify unfair Terms of Services, achieving the best accuracy of 0.76 with GRU.
Conducted Named Entity Recognition (NER) to identify user obligations in the Terms of Services dataset achieving an accuracy of 72%.

Ontology Mediated Attention | Pytorch, HuggingFace, BigBird, Transformers, QuickUMLS January 2022-May 2022

Investigated the application of ontology-mediated attention in language modeling, exploring its impact on a biomedical question-answering task.
Evaluated model performance, observed that the Random Ontology mediated attention with window size of 4 showed a 0.14% increase and 0.10% increase in F1 score for relation and heart disease task respectively.
Conducted an ablation study, revealing benefits of ontology-mediated attention over random attention and highlighting areas for further exploration and optimization in sparse models for the biomedical domain.

VAE and Transformer Models | Python, NumPy, PyTorch January 2022-May 2022

Implemented, trained, and visualized an auto-encoder architecture on the MNIST dataset.
Experimented with and trained a Variational Autoencoder (VAE) model on the MNIST dataset, conducting thorough analysis to inspect the learned representations.
Implemented and fine-tuned Transformer models, employing BERT-style language modeling for pre-training and achieved an accuracy of 91.54% for sentiment analysis on the IMDB movie review dataset.

Aditya Dave

Education

University of Southern California August 2021-May 2023

Skills