Available for Remote Work

Hello, I'm

Arun Kumar

Building production-grade ETL pipelines, backend systems, and AI-powered tools. Currently at Technocas & Zank AI. Oracle Cloud GenAI Certified.


Get to Know Me

About Me

I'm a Software Engineering student at SZABIST, Karachi (graduating 2027). I currently work as a Data Engineer at Technocas and as a Backend Developer at Zank AI, a US-based fintech startup, and I hold both roles concurrently.

I specialize in building end-to-end data pipelines, scraping systems, and production backend infrastructure. From wrangling 10k+ product records out of e-commerce sites to designing Snowflake ETL workflows with Airflow, I build things that work at scale.

I'm also trained in Cloud Data Engineering at SMIT and hold an Oracle Cloud GenAI certification. Open to remote roles in Cloud Data Engineering, Backend, or AI integration.

Where I've Worked

Work Experience

Data Engineer

Technocas · Karachi, Pakistan

Apr 2026 – Present

Owning the full data collection and delivery lifecycle — engineering scrapers that handle bot defenses, building ETL pipelines, and delivering clean, warehouse-ready datasets for business consumption.

  • Built production-grade scrapers in Python using Requests, BeautifulSoup, Playwright, and Apify — covering static HTML pages through to fully dynamic, JavaScript-rendered platforms including Pinterest.
  • Designed multi-stage ETL pipelines that normalize raw scraped data, resolve duplicates, and enforce schema contracts before loading into the data warehouse.
  • Automated end-to-end data workflows via Python scheduling, eliminating manual extraction runs so that data delivery needs almost no human intervention.
  • Implemented transformation logic to handle inconsistent formats, null fields, and nested JSON structures across heterogeneous source sites.
  • Integrated processed datasets with Metabase to power reporting dashboards and business intelligence queries.
  • Built resilient scraping infrastructure with retry logic, request throttling, and session management to sustain throughput against anti-bot systems.
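
The retry and throttling behaviour described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the production code: `Throttle`, `fetch_with_retry`, and every parameter name are hypothetical, and `fetch` stands in for whatever HTTP call (Requests, Playwright, and so on) the scraper actually makes.

```python
import random
import time


class Throttle:
    """Enforce a minimum gap between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # timestamp of the previous request, if any

    def wait(self):
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


def fetch_with_retry(fetch, url, throttle, max_attempts=4, base_delay=1.0,
                     sleep=time.sleep):
    """Call fetch(url), retrying failures with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        throttle.wait()  # never exceed the configured request rate
        try:
            return fetch(url)
        except Exception:
            # Assumption: every exception is treated as transient. A real
            # scraper would retry only timeouts, 429s, and 5xx responses.
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 1x, 2x, 4x, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter avoids bursts
```

Passing `fetch`, `sleep`, and `clock` in as arguments keeps the sketch testable without network access; in practice `fetch` might wrap a shared session so that cookies persist across retries.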

Target Sites → Requests / BeautifulSoup / Playwright / Apify → Cleaning & Normalization → ETL → Data Warehouse → Metabase

Raw website HTML becomes a queryable warehouse record — fully automated, no manual steps.
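
As a rough sketch of the cleaning and normalization stage, the snippet below flattens nested JSON, treats blank strings as nulls, parses a price field, drops duplicates, and rejects rows that break a schema contract. The field names in `REQUIRED_FIELDS` and the price format are assumptions made for illustration; the real contracts and source formats are not shown on this page.

```python
REQUIRED_FIELDS = ("id", "title", "price")  # hypothetical schema contract


def flatten(record, parent="", sep="."):
    """Turn nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def normalize(record):
    """Flatten, trim strings, map blanks to None, and parse the price."""
    out = {}
    for key, value in flatten(record).items():
        if isinstance(value, str):
            value = value.strip() or None  # blank strings become nulls
        out[key] = value
    price = out.get("price")
    if isinstance(price, str):
        # Assumed format such as "$1,299.00"; other formats would raise ValueError.
        out["price"] = float(price.replace("$", "").replace(",", ""))
    return out


def clean(records):
    """Normalize rows, enforce required fields, and keep the first row per id."""
    seen, result = set(), []
    for raw in records:
        row = normalize(raw)
        if any(row.get(field) is None for field in REQUIRED_FIELDS):
            continue  # violates the contract, so it never reaches the warehouse
        if row["id"] in seen:
            continue  # duplicate of a row already accepted
        seen.add(row["id"])
        result.append(row)
    return result
```

Deduplicating on a single `id` key is the simplest possible rule; records merged from several source sites would more likely need a composite or fuzzy key.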

Python · Requests · BeautifulSoup · Playwright · Apify · ETL · Metabase

Backend Developer

Zank AI · Remote (USA)

Feb 2026 – Present

Building core backend infrastructure for a US-based fintech startup — REST API design, database architecture, authentication systems, and banking workflow engineering.

  • Architected and shipped RESTful APIs with FastAPI powering core banking workflows: account management, transaction processing, and user onboarding.
  • Designed and enforced JWT-based authentication and role-based access control (RBAC) across all protected endpoints.
  • Modeled relational database schemas in PostgreSQL optimized for financial data integrity, referential consistency, and concurrent-safe operations.
  • Engineered backend workflows for fintech-specific features — fund transfers, balance ledgers, and statement generation.
  • Implemented Redis caching for high-frequency API responses and session data, reducing database load on hot paths.
  • Containerized backend services with Docker for consistent local development and production deployment environments.
  • Built and maintained a double-entry ledger system for tracking financial transactions with full audit trail support.
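
To illustrate the double-entry idea behind the ledger bullet, here is a small in-memory sketch. It is not the Zank AI implementation: the `Entry` and `Ledger` classes, the signed-amount convention, and the account names are all hypothetical, and a production system would persist the journal in PostgreSQL inside a transaction.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class Entry:
    account: str
    amount: Decimal  # assumed convention: positive adds to the account, negative removes


@dataclass
class Ledger:
    journal: list = field(default_factory=list)  # append-only, doubles as the audit trail

    def post(self, description, entries):
        """Record a transaction only if its entries sum to exactly zero."""
        if sum((e.amount for e in entries), Decimal("0")) != 0:
            raise ValueError("unbalanced transaction")
        self.journal.append((description, tuple(entries)))

    def transfer(self, src, dst, amount):
        """Move money between two accounts as one balanced transaction."""
        amount = Decimal(amount)
        if amount <= 0:
            raise ValueError("transfer amount must be positive")
        self.post(f"transfer {src}->{dst}",
                  [Entry(src, -amount), Entry(dst, amount)])

    def balance(self, account):
        """Derive a balance by replaying the journal; nothing is stored mutably."""
        return sum((e.amount for _, entries in self.journal
                    for e in entries if e.account == account), Decimal("0"))
```

Because every posted transaction sums to zero, the balances of all accounts always sum to zero as well, which gives a cheap integrity check. The sketch uses `Decimal` instead of floats to avoid rounding drift, and it deliberately omits overdraft checks and concurrency control.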

React Frontend → FastAPI Endpoints → JWT Auth → Redis Cache → PostgreSQL / Ledger → Banking Logic → Docker → API Response

Shipping backend systems that handle real user financial data in a live US fintech product.

Python · FastAPI · PostgreSQL · JWT · RBAC · REST API · Ledger · Redis · Docker

Software Engineer (AI)

HexaVibes Solutions · Karachi, Pakistan

Aug 2024 – Dec 2025
  • Integrated ML models into production applications, building inference wrappers and API layers to expose model outputs as usable product features.
  • Profiled and optimized model inference pipelines to reduce prediction latency for real-time use cases.
  • Delivered AI-powered features across cross-functional teams, translating research outputs into stable, deployable backend services.
Python · TensorFlow · ML Integration · FastAPI · Inference Optimization

Agentic AI Developer

UXGENIE · Karachi, Pakistan

Sep 2025 – Oct 2025
  • Designed and implemented multi-step agentic AI workflows using LLM orchestration for automated UX research tooling.
  • Built automation pipelines that replaced manual UX research tasks through AI-driven data extraction and synthesis.
Agentic AI · LLM Orchestration · Python · Automation

Frontend Developer

High Tech Software House · Karachi, Pakistan

Aug 2025 – Sep 2025
  • Built pixel-perfect, responsive landing pages and portfolio sites in React (TypeScript) with Tailwind CSS.
  • Implemented component-driven UI architecture ensuring cross-browser compatibility and consistent design fidelity.
React · TypeScript · Tailwind CSS

Freelance Engineer

Fiverr · Remote (Global)

Dec 2025 – Present
  • Delivering custom data engineering, scraping pipelines, and backend solutions for international clients across e-commerce, research, and analytics domains.
  • Scoped, architected, and shipped complete client projects end-to-end — from requirements to deployment.
Python · Web Scraping · ETL · FastAPI · Data Pipelines

What I've Built

Featured Projects

Airflow ETL: S3 → Snowflake

Jan 2026

Production-style ETL DAG that detects CSV files in S3, auto-creates the Snowflake table schema, and loads data using COPY INTO. Includes SMTP email alerts, with Airflow Sensors and Operators handling file detection and loading.

Apache Airflow · AWS S3 · Snowflake · Python · SMTP

Real-Time Chat Application

May – Jun 2025

Full-featured chat app with one-to-one and group conversations. Built on the MERN stack with Socket.IO for instant message delivery and a fully responsive UI.

MongoDB · Express · React · Node.js · Socket.IO

AI Thief Detection System

Jun – Jul 2025

Browser-based real-time surveillance system that detects humans via webcam using TensorFlow.js — no server required. Optimized for low-latency inference in the browser.

Next.js · TensorFlow.js · WebRTC · TypeScript

Banggood E-Commerce Pipeline

Nov 2025

Full ETL pipeline that scrapes 10k+ product records from Banggood using Selenium. It implements anti-blocking mechanisms, cleans the data with Pandas, and loads it into a structured MySQL schema.

Python · Selenium · Pandas · MySQL · ETL

Deep Dives

Case Studies

Airflow ETL: S3 → Snowflake

Fully orchestrated cloud data pipeline with schema detection, automated loading, and alerting

Jan 2026
What

A zero-touch ETL pipeline from cloud storage to data warehouse

Built an Apache Airflow DAG that monitors an S3 bucket for incoming CSV files, automatically detects and creates the Snowflake table schema, loads data using COPY INTO, and dispatches SMTP email alerts on completion or failure — no manual steps at any stage.

Why

Manual data loading doesn't scale and breaks silently

Loading files from S3 to Snowflake by hand is error-prone and collapses under volume. The business needed a system that detects new data automatically, handles schema changes without engineer intervention, and alerts the team so no one babysits a data job overnight.

Where

Cloud-orchestrated, warehouse-native, alert-driven

Airflow runs as the scheduler and orchestration layer. S3 Sensors listen for new file arrivals and trigger the DAG. Snowflake receives the cleaned load via COPY INTO. SMTP delivers success and failure notifications to the team. All components are cloud-native with no on-prem dependency.

System Architecture

S3 Bucket → Airflow S3 Sensor → DAG Trigger → Schema Auto-Detection → COPY INTO Snowflake → SMTP Alert
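
The schema auto-detection and COPY INTO steps in this flow could look something like the sketch below, which infers column types from a CSV sample and builds the two SQL statements an Airflow task would then run against Snowflake. It is an illustration under assumptions, not the DAG's actual code: `infer_type`, `build_statements`, the table name, and the stage path are hypothetical, and the type inference is intentionally crude (NUMBER, FLOAT, or VARCHAR).

```python
import csv
import io


def infer_type(values):
    """Pick a Snowflake column type from sample string values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"
    for caster, sql_type in ((int, "NUMBER"), (float, "FLOAT")):
        try:
            for v in non_empty:
                caster(v)
            return sql_type  # every sampled value parsed cleanly
        except ValueError:
            continue
    return "VARCHAR"


def build_statements(csv_text, table, stage_path, sample_rows=100):
    """Return (CREATE TABLE, COPY INTO) SQL for a CSV with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    sample = [row for _, row in zip(range(sample_rows), reader)]
    columns = []
    for i, name in enumerate(header):
        values = [row[i] for row in sample if i < len(row)]
        # Assumption: header names are already safe SQL identifiers.
        columns.append(f"{name.strip().upper()} {infer_type(values)}")
    create = f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(columns)})"
    copy = (f"COPY INTO {table} FROM {stage_path} "
            "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)")
    return create, copy
```

For example, `build_statements(text, "ORDERS", "@my_stage/orders.csv")` yields one statement to create the table if it is missing and one to load the staged file. Real header names would need quoting or validation before being interpolated into SQL.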

Outcome

Files land in S3, the warehouse table is updated, and the team is notified, with no human in the loop and no manual SQL.

Stack

Apache Airflow · AWS S3 · Snowflake · Python · SMTP · ETL

Kind Words

What People Say

Arun built our entire scraping and ETL pipeline from scratch. Clean code, on time, and the data quality was exactly what we needed. Will hire again.

Client via Fiverr

E-Commerce Startup

Arun consistently delivered scalable backend solutions at Zank AI. His understanding of API design and database optimization is well above his experience level.

Team Lead

Zank AI — US Fintech

He integrated our ML models into production seamlessly and improved inference performance significantly. Great communicator and fast learner.

Tech Lead

HexaVibes Solutions

What I Do Best

Core Expertise

Data Engineering & ETL Pipelines

Design and ship end-to-end data pipelines — raw source ingestion to warehouse-ready structured tables. Orchestrate multi-stage ETL workflows with schema handling, validation, and automated alerting.

Apache Airflow · Snowflake · AWS S3 · ETL Orchestration

Web Scraping & Data Extraction

Extract structured data from any website — static HTML through to fully JavaScript-rendered platforms. Build resilient scrapers with anti-bot handling, retry logic, and multi-source pipelines at scale.

Selenium · Playwright · BeautifulSoup · Apify · Requests

Backend API Development

Build production-grade REST APIs — authentication, RBAC, database modeling, caching, and financial-grade reliability. Architect backend systems that handle real user data and concurrent requests safely.

FastAPI · PostgreSQL · JWT · Redis · Docker

Cloud & Infrastructure

Deploy, containerize, and scale backend systems on AWS. Comfortable across S3, EC2, Lambda, SQS, and Glue — with Docker for containerization and Redis for caching and session management.

AWS S3 · EC2 · Lambda · Docker · Redis

Workflow Automation & Scripting

Automate repetitive data and business workflows end-to-end using Python. From scheduled ETL triggers and data reporting to n8n pipelines — eliminate manual steps at every layer of the stack.

Python · n8n · Apache Airflow · Task Scheduling

What I Work With

Technical Skills

Languages

Python · SQL · Java · C/C++ · HTML · CSS · TypeScript

Data & Scraping

Pandas · NumPy · Selenium · BeautifulSoup · Apify · Playwright · Metabase

Frameworks

FastAPI · Node.js · React · Next.js · Express

Databases

MySQL · SQL Server · Snowflake · PostgreSQL · Firebase

Cloud & DevOps

AWS S3 · EC2 · Lambda · SQS · SNS · Glue · Athena · QuickSight · Docker · Redis

Tools & Platforms

Git · GitHub · VS Code · Vercel · Figma · n8n · Apache Airflow

Let's Talk

Get In Touch

Open to remote roles in Data Engineering, Backend, or AI integration. Whether it's a job, project, or just a hello — my inbox is always open.