Md Farhad Alam Bhuiyan

Research Engineer | Data Scientist | CTO

A highly accomplished research engineer with over a decade of experience in software engineering, data engineering, and data science. Specialized in architecting scalable microservices and leading technical teams. Currently leading R&D at Penta Global and co-founding krait.io.

About Me

Current Positions

🚀
Co-Founder & CTO
krait.io ↗
🔬
Head of R&D
Penta Global Ltd. ↗

Professional Summary

With over 12 years of industry experience and 5 years in research and teaching, I specialize in building scalable, robust, and efficient solutions. My expertise spans software engineering, data engineering, and data science, with a proven track record of leading technical teams and delivering impactful projects.

I've successfully built highly scalable applications handling 8000+ RPS, developed data fusion platforms used by 18 law enforcement agencies, and conducted cutting-edge research in NLP and computer vision. My work bridges the gap between academic research and industry applications, focusing on real-world impact and innovation.

Currently, I'm leading multiple initiatives including an advanced security platform at krait.io and research projects on trade-based money laundering with Bangladesh Bank.

Past Experiences

Consultant, Data Engineer
March 2024 - June 2024
  • Structured a Data Lakehouse to future-proof data consistency
  • Migrated all legacy cron jobs to a proper data pipeline

Responsible for industrial products that increased collaboration between academic research and industry. Managed industry-standard training programs.

Head of Data Team
September 2021 - March 2023
  • Designed Organization-wide Data Strategy
  • Built Modern Data Lakehouse from scratch
  • Advanced Data Analytics (Revenue metrics dashboard, KPI, interactive reports)
  • Worked as Growth Hacker between Marketing, Engineering, and Product teams
  • Ensured Data Governance throughout the organization
Engineering Consultant (Solution Architect)
The Coding Crowd
April 2020 - September 2021
  • Led a small agile team to serve different data-driven solutions for South African clients
  • Built a complete solution for NBR's back office
Team Lead
October 2019 - February 2020
  • Led a small agile team to design, code, and deploy a mobile app and data-driven dashboard for bKash merchandisers
  • Pipeline processed at least 20k daily uploaded images through a CNN model for data-driven analysis and reporting
Lead, Data Science & Engineering Team
May 2019 - April 2020
  • Designed a data-driven policy across the business, basing decisions on inferential statistics (e.g. Shohoz Quest and Discount Planning Tool, cutting time and effort by 4x)
  • Designed and deployed data pipelines on both AWS & Azure for data cleansing, analysis, and reporting (reducing server cost by one-seventh)
  • Built a fraud-detection application based on network analysis & inferential statistics (reducing food fraud by 25% & ride fraud by 32%)
  • Designed an end-to-end fraud identification pipeline automation for detecting and reporting fraud
Assistant Professor (Adjunct Faculty)
January 2018 - 2020
Researcher & Software Consultant
October 2017 - 2021

Worked with a talented cybersecurity group in Bangladesh with wide-ranging interests across the field.

Research Assistant
January 2017 - June 2018
Lead, Data Science & Engineering Team
January 2015 - April 2019
  • Optimized data ingestion and database queries of eprocure.gov.bd, impacting thousands of tenders and millions of users
  • Built a data visualization tool as a product for the Education Ministry's internal usage
Software Engineer
October 2013 - September 2014

Responsible for both Android and iOS mobile platforms.

Game and Mobile Application Developer
May 2012 - September 2013

Cross-platform game and mobile application development for both Android and iOS.

Social Project

Research Engineer
2025

A digital repository documenting the July-August 2024 mass uprising in Bangladesh. Features an interactive map of verified deaths, a detailed timeline of events, and preserves digital evidence of human rights violations.

Research Engineer
2026

Technical Skills

Programming Languages

Python Go Rust JavaScript

Web Development

FastAPI Django Flask Firebase NodeJS Microservices Serverless

DevOps & MLOps

Docker Kubernetes MLflow Kubeflow AWS Azure GCP

Data Engineering

Azure Databricks Apache Airflow Apache NiFi PySpark ELK Stack Kafka Spark Streaming

Databases

PostgreSQL MySQL BigQuery Redshift MongoDB Neo4j Cassandra

Machine Learning & AI

Scikit-learn PyTorch TensorFlow Jupyter NLP Computer Vision

Data Visualization

Data Lakehouse Metabase Streamlit Superset D3.js Matplotlib

Mobile Development

Android Flutter Cross-Platform

Research Publications

2026

BanglaProtha: Evaluating Vision Language Models in Underrepresented Long-tail Cultural Contexts
IEEE/CVF WACV 2026 (Core Rank: A)
Accepted
How Good LLMs Are at Answering Bangla Medical Visual Questions? Dataset and Benchmarking
Medicine and Healthcare Bridge Program at AAAI 2026
Under Review
Decoupling Representation for Fine-tuning with Linear Probing and LoRA on Robust Downstream Adaptation
LREC 2026
Under Review
Evaluating Vision Language Models on Bangla Medical Visual Question Answering: Dataset and Benchmarking
LREC 2026
Under Review

2025

ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla
ECML-PKDD 2025 (Core Rank: A)
Accepted
BanTH: A Multi-label Hate Speech Detection Dataset for Transliterated Bangla
Findings of the ACL: NAACL 2025 (Core Rank: A*/A)
Accepted
LP-FT-LoRA: A Three-Stage PEFT Framework for Efficient Domain Adaptation in Bangla NLP Tasks
Second Workshop on Bangla Language Processing (BLP-2025)
Accepted
Robustness of LLMs to Transliteration Perturbations in Bangla
Second Workshop on Bangla Language Processing (BLP-2025)
Accepted
Benchmarking Large Language Models on Bangla Dialect Translation and Dialectal Sentiment Analysis
Second Workshop on Bangla Language Processing (BLP-2025)
Accepted
PentaML at BLP-2025 Task 1: Linear Probing of Pre-trained Transformer-based Models for Bangla Hate Speech Detection
Second Workshop on Bangla Language Processing (BLP-2025)
Accepted

2024

BanglaTLit: A Benchmark Dataset for Back-Transliteration of Romanized Bangla
Findings of the ACL: EMNLP 2024 (Core Rank: A*)
Accepted
Penta ML at EXIST 2024: Tagging Sexism in Online Multimodal Content With Attention-enhanced Modal Context
EXIST Lab at CLEF 2024
Published
Penta-nlp at EXIST 2024 Task 1-3: Sexism Identification, Source Intention, Sexism Categorization in Tweets
EXIST Lab at CLEF 2024
Published

2022

Tools and Techniques Adapted for Teaching Software Engineering Topics Remotely during the COVID-19 Pandemic
HICSS 2022
Published

2018

Scrutiny of Electricity Billing and Supply Data as a Probable Proxy for Economic Activities: A Comprehensive Analysis of Power Consumption of Dhaka, Bangladesh
SSRN
Published

Education & Certifications

Bachelor of Science
Computer Science & Engineering
2012

University of Dhaka, Bangladesh

Professional Certifications
  • Deep Learning Specialization
  • Certified Data Scientist in Python, Dataquest
  • Certified Data Scientist in R Track, DataCamp
  • Mastering Product Management, Reforge Fall Cohort 2022
  • Data for Product Manager, Reforge Summer 2022 Cohort
  • Certified Scrum Master