Auto Pre-Processing Data Tools
Automated Data Pre-Processing Workflow for Machine Learning

Project Overview
A web-based tool that automates and simplifies the data pre-processing workflow for machine learning. Users upload a dataset and apply pre-processing techniques through an interactive interface, turning raw data into input that is ready for machine learning models.
This project was developed to address one of the most time-consuming parts of machine learning: data preparation. The tool provides a no-code interface that guides users through each critical step, from data cleaning (handling missing values and outliers) to transformation (normalization and encoding). The goal is to help data scientists and analysts prepare high-quality datasets in less time.
Challenges
- Handling large datasets efficiently in a web environment without crashing the browser
- Integrating a Python backend for data processing with a SvelteKit frontend in real time
- Providing an intuitive interface for users with varying technical expertise
Solutions
- Processed data asynchronously on the backend, so the frontend only displays status updates and results
- Used WebSockets for real-time communication between the frontend and backend during processing
- Designed a wizard-based (step-by-step) workflow in SvelteKit to guide users through each pre-processing stage
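The asynchronous processing and status reporting described in the solutions above could look roughly like the following. This is a minimal sketch, not the project's actual backend: `preprocess`, `run_job`, the message shapes, and the sample rows are all hypothetical, and an `asyncio.Queue` stands in for the WebSocket connection so the example runs with the standard library alone.

```python
import asyncio


async def preprocess(rows, status_queue):
    """Hypothetical long-running job: drop rows with missing values, reporting progress."""
    cleaned = []
    total = len(rows)
    for index, row in enumerate(rows, start=1):
        if all(value is not None for value in row.values()):
            cleaned.append(row)
        # Publish a progress message; a real server would push this over a WebSocket.
        await status_queue.put({"type": "progress", "done": index, "total": total})
        await asyncio.sleep(0)  # yield control so other tasks can run
    await status_queue.put({"type": "result", "rows": cleaned})
    return cleaned


async def run_job(rows):
    """Start the job in the background and collect the messages a client would receive."""
    status_queue = asyncio.Queue()
    task = asyncio.create_task(preprocess(rows, status_queue))
    messages = []
    while True:
        message = await status_queue.get()
        messages.append(message)
        if message["type"] == "result":
            break
    await task
    return messages


# Assumed sample data, for illustration only.
sample_rows = [
    {"age": 31, "city": "Oslo"},
    {"age": None, "city": "Lima"},
    {"age": 45, "city": "Pune"},
]
messages = asyncio.run(run_job(sample_rows))
```

The point of the design is that the heavy work never runs in the browser: the client only receives small status messages and, at the end, the result.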
Project Details
Client: Personal Project
Duration: 4 months
Year: 2025
Category: Data Science & AI
Key Features
- Dataset Upload in Multiple Formats (CSV, Excel)
- Missing-Value Handling with Various Methods
- Data Transformation and Scaling (Normalization, Standardization)
- Categorical Variable Encoding (One-Hot, Label Encoding)
- Interactive Data Visualization for Exploration
- Dataset Splitting (Training & Testing Data)
- Download Pre-Processed Dataset
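To make the pre-processing features listed above concrete, here is a minimal standard-library sketch of mean imputation, min-max normalization, standardization, one-hot and label encoding, and a train/test split. The function names, defaults, and sample values are assumptions for illustration; the tool itself may well rely on a data library rather than hand-written helpers like these.

```python
import random
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    center, spread = mean(values), pstdev(values)
    if spread == 0:
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]


def one_hot(values):
    """Encode categories as 0/1 indicator columns, one per distinct category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    mapping = {c: i for i, c in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Assumed sample columns, for illustration only.
ages = [20, None, 40, 30]
colors = ["red", "blue", "red", "green"]

filled_ages = impute_mean(ages)               # [20, 30, 40, 30]
scaled_ages = min_max_normalize(filled_ages)  # [0.0, 0.5, 1.0, 0.5]
z_ages = standardize(filled_ages)
encoded_colors, categories = one_hot(colors)  # categories: ['blue', 'green', 'red']
color_labels = label_encode(colors)           # [2, 0, 2, 1]

rows = list(zip(scaled_ages, encoded_colors))
train_rows, test_rows = train_test_split(rows, test_ratio=0.25)
```

Each helper takes one column at a time, which matches the step-by-step way a user would apply these operations in a wizard.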

