About Me
Karthigai Selvan
Data Engineer with over 13 years of experience building scalable, cloud-native data platforms, most recently for cybersecurity.
🧠 Skills & Technologies
- Big Data: Spark, Airflow, Delta Lake
- Cloud: AWS (S3, Redshift, Glue, EMR, Lambda, EKS)
- Governance: DataHub, Great Expectations
- Infra & DevOps: Terraform, Docker, Kubernetes
💼 Work Experience
Development Engineer 4 – Comcast (Dec 2023 – Present)
- Leading a cybersecurity-focused data team (DataBee project)
- Built ingestion pipelines for 50+ data sources using Benthos + OCSF
- Extended OCSF with new event classes (e.g., Training Inventory)
- Integrated DataHub to enhance metadata governance
Lead Data Engineer – Yubi (May 2022 – Dec 2023)
- Built dbt models to replace legacy marts
- Architected solutions using AWS Glue, Lambda, and Redshift
- Implemented Apache Ranger + Trino for secure data access
- Led data quality automation with Great Expectations
Data Engineer II – Bank of America (Aug 2017 – May 2022)
- Migrated Teradata workloads to the Hadoop ecosystem
- Used Spark, Hive, and Sqoop for scalable data transformations
- Designed data hubs aggregating diverse customer data
Associate – Cognizant (Jan 2015 – Aug 2017)
- Developed Teradata applications for control assurance workflows
- Automated ETL using shell scripting and BTEQ utilities
- Handled client interactions and business requirements analysis
System Engineer – TCS (Dec 2011 – Jan 2015)
- Worked on e-commerce and privacy data warehouse systems
- Designed and optimized complex SQL queries in Teradata
- Created macros and procedures for privacy-focused data views
🎓 Education
B.E. Mechanical Engineering
Government College of Engineering – Tirunelveli (2007 – 2011)
- SSLC: 93.6%
- HSC: 83.3%
- GPA: 75% (First Class with Distinction)
📜 Certifications
- Databricks Certified Data Engineer Associate – Dec 2022
- Databricks Certified Associate Developer for Apache Spark 3.0 – Jul 2022
- Databricks Certified Data Analyst Associate – Oct 2022
- Apache Airflow Fundamentals Certification – Jan 2022
- Apache Cassandra 3 Developer Associate – Jul 2021
- Microsoft Certified – Azure Data Fundamentals (DP-900) – Mar 2021
- Microsoft Certified – Azure Fundamentals (AZ-900) – Feb 2021
- Data Engineering Nanodegree – Udacity – Jan 2021
- Tableau Desktop Specialist – Jul 2020
- Certified Developer, Teradata Vantage – May 2020
- Certified Associate, Teradata Vantage – May 2020