About me

I’m Kartik Mathur. I grew up in a small town in India, spending most of my time obsessing over cricket and wondering how the world works. That curiosity has stayed with me. When I am not working, I enjoy reading and spending time at the piano, usually figuring things out one note at a time.

Currently, I am a Director of Engineering at Snorkel AI, where I lead teams building and scaling the AI infrastructure behind the Snorkelflow labeling platform and Expert Data services. My focus sits at the intersection of LLM deployment, evaluation, and the production challenges of AI infrastructure.

Before Snorkel, I was at Domino Data Lab, where I led the model deployment and model registry teams for the Domino machine learning platform. Prior to that, I spent several years at Hewlett Packard Enterprise, where I led the inception, development, and scaling of the model training and deployment initiatives on the HPE Ezmeral Software Platform.

During that time, I co-created the open-source KubeDirector project, which was designed to support stateful machine learning applications on Kubernetes. That work was recognized by the broader cloud native community, including an invited talk at the Cloud Native Computing Foundation on building dynamic machine learning pipelines with KubeDirector.

My Journey

My path in technology began at AMD and continued through engineering leadership roles at several technology startups, including BlueData, which was acquired by HPE in 2019. Over the years, I have focused on building systems that sit beneath complex products, the kind that need to be reliable, scalable, and understandable by the teams who operate them.

I hold multiple granted patents in distributed systems, machine learning platforms, and data infrastructure, which you can find here. Academically, I earned my MS in Computer Science from Indiana University Bloomington, where my research centered on parallel computing and distributed systems.

These days, my interests are increasingly focused on large language model evaluation, deployment optimization, and cost-efficient strategies for training and operating machine learning systems at scale. I continue to learn by building, writing, and reflecting, which is ultimately why this blog exists.

What You'll Find Here

This blog is where I share my thoughts on:

Technology & Engineering: Deep dives into distributed systems, ML infrastructure, and cloud-native technologies
Industry Insights: Observations from the trenches of building AI/ML platforms at scale
Personal Reflections: Musings on leadership, career growth, and the occasional non-tech topic
Personal Projects: Projects I'm working on and contributions to the open-source community

Get In Touch

Feel free to connect with me through the social links below, or reach out if you have questions about my posts. I'm always up for interesting conversations about technology, engineering leadership, or just about anything that sparks curiosity.

Thanks for stopping by!