Preet Sojitra | Projects

Featured Projects

Depression Detection through Speech Analysis

An end-to-end research project to detect depression from speech recordings using the DAIC-WOZ dataset. Systematically evaluated a range of models, from classic CNNs (VGGNet, ResNet) to Vision Transformers (ViT), on spectrograms and raw audio data.

Accomplishment: Gained experience in the full research pipeline, including feature engineering for audio, comparative model benchmarking, and handling class imbalance.

[GitHub]

imgcv: An Image Processing Library from Scratch

Developed and published a Python package on PyPI that re-implements core OpenCV image processing functions from the ground up using only NumPy. The goal was to build a deep, fundamental understanding of how computer vision and image processing algorithms work under the hood.

Accomplishment: Solidified my understanding of algorithms for filtering, color space manipulation, morphological operations, and edge detection.
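To illustrate the kind of NumPy-only filtering described above, here is a minimal sketch. It is not taken from the imgcv source: the function name `filter2d`, the zero-padding choice, and the test image are assumptions made for this example. It slides a kernel over a grayscale image (cross-correlation) and applies a Sobel kernel to a vertical step edge.

```python
import numpy as np


def filter2d(image, kernel):
    """Cross-correlate a 2D grayscale image with an odd-sized 2D kernel.

    Hypothetical helper for illustration; borders are zero-padded so the
    output has the same shape as the input.
    """
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(
        image.astype(np.float64),
        ((pad_h, pad_h), (pad_w, pad_w)),
        mode="constant",
    )
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + height, j:j + width]
    return out


# Horizontal-gradient Sobel kernel, a common edge detector.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)

# A 5x5 test image with a vertical step edge between columns 2 and 3.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

edges = filter2d(image, sobel_x)
```

Looping over kernel entries instead of pixels keeps the Python loop tiny (9 iterations for a 3x3 kernel) and leaves the per-pixel work to NumPy.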

[GitHub] [PyPI Package]

nanoGPT: Building a Transformer from Scratch

Implemented a decoder-only GPT-style language model from scratch in PyTorch, training it to generate text in the style of Shakespeare. This project was a deep dive into the fundamental mechanics of the Transformer architecture.

Accomplishment: Gained a code-level understanding of self-attention. I am now extending the project to train a similar model on the more complex Sanskrit language.
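The core of a decoder-only model is causal self-attention, which can be sketched as follows. This is an illustrative single-head version written in NumPy rather than PyTorch to keep it self-contained. It is not the project's code, and the function name, weight shapes, and random inputs are assumptions.

```python
import numpy as np


def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a (T, C) sequence.

    Hypothetical helper: returns the attended values and the (T, T)
    attention weights so the masking can be inspected.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Scaled dot-product scores between every pair of positions.
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    # Mask out future positions (strict upper triangle) so that token t
    # can only attend to tokens 0..t.
    seq_len = x.shape[0]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))        # 4 tokens, 8-dimensional embeddings
w_q = rng.normal(size=(8, 8))
w_k = rng.normal(size=(8, 8))
w_v = rng.normal(size=(8, 8))

out, weights = causal_self_attention(x, w_q, w_k, w_v)
```

Because of the mask, the first token attends only to itself, so its output row equals its own value vector.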

[Kaggle]

Micrograd: An Autograd Engine in Python

A from-scratch implementation of a scalar autograd engine, demystifying the backpropagation process used in neural networks. Later extended to a PyTorch-like structure.

Accomplishment: Developed a deep understanding of how autograd works, including the mechanics of computing gradients and the chain rule.
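The idea can be sketched in a few lines of plain Python. This is a minimal illustration in the spirit of a scalar autograd engine, not the code from the repository: the class name `Value` and its attributes are assumptions, and only addition and multiplication are supported.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule from the
        # output back to the leaves.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


a = Value(2.0)
b = Value(3.0)
c = a * b + a      # c = 8.0
c.backward()       # dc/da = b + 1 = 4.0, dc/db = a = 2.0
```

Gradients are accumulated with `+=` because a value such as `a` can feed into the result along more than one path, and the chain rule sums the contributions from each path.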

[GitHub]

Chorale Music Generation with LSTMs

Trained a Long Short-Term Memory (LSTM) network on a dataset of Bach chorales to generate novel, harmonically coherent musical sequences.

Accomplishment: Gained insights into sequence modeling and the unique challenges of generating structured outputs like music.

[Colab Notebook]

End-to-End News Classification with MLOps (In Progress)

Building a full-stack text classification system for the AG News dataset. This project moves beyond a simple notebook to create a deployable, end-to-end MLOps pipeline, featuring custom tokenizers and models built from scratch in PyTorch.

Goal: To gain hands-on experience in MLOps, CI/CD for machine learning, and to solidify core PyTorch concepts by building a production-ready application.

[GitHub Repo]

Fine-Grained Animal Classification (In Progress)

Developing a classification model for a challenging dataset of 90 animal classes with a limited number of samples per class. This project focuses on techniques to handle data scarcity and achieve high accuracy in a fine-grained classification scenario.

Goal: To explore and implement advanced techniques for training robust models on datasets with long-tail distributions and limited data.

[GitHub Repo]

Multimodal AI Story Generator

A Streamlit application that generates a unique story from a sequence of user-selected images. The pipeline scrapes images based on user queries, generates descriptive captions for each selected image using a Vision Transformer (ViT), and then feeds the captions to a Mistral LLM to write a coherent narrative.

Accomplishment: Successfully integrated multiple AI models (vision and language) into a seamless, interactive web application, demonstrating skills in building full-stack multimodal systems.

[Demo] [GitHub Repo]

Web Development Projects

Sinkedin: A Social Media Parody

A satirical take on professional networking, described as "LinkedIn's darker, funnier, and more honest cousin." This project serves as a practical sandbox for implementing and testing scalable system design concepts.

Technologies: Next.js, React, Tailwind CSS, Supabase, Vercel

[Live App] [GitHub Repo]

Kalasangam: An MR-Powered Artisan Marketplace

An e-commerce platform connecting local artisans with customers, featuring an integrated mixed reality (MR) experience that allows users to virtually preview products. The platform also includes a comprehensive dashboard for artisans to manage listings and track sales analytics.

Technologies: React, Zustand, Express.js, Flask

[Demo Video] [GitHub Repo]

Anuvaad Ratna: A Multilingual Translation Suite

A full-stack web application that leverages machine learning for language translation. It currently supports English-to-Hindi text and PDF document translation, and includes a text-to-speech (TTS) feature for the translated output.

Technologies: Next.js, Flask, TensorFlow, Hugging Face (Helsinki-NLP)

[Demo] [GitHub Repo]

Loud and Clear: Freelance Client Website

Designed and developed a professional, responsive website for a freelance client to establish their online business presence. The project involved the full lifecycle from client consultation and requirements gathering to final deployment.

Technologies: Next.js, GSAP, Tailwind CSS

[View Live Site]