
Cox Automotive, Inc.
Automotive services & software division of Cox Enterprises
Principal Software Engineer – Site Reliability Engineering
2018 - Present
Platform, Reliability & Developer Experience
- Architected and managed a cloud-based platform-as-a-service (PaaS) on AWS ECS & Docker, supporting 165+ applications across 30+ teams and 200+ engineers; designed it from the ground up to minimize friction and maximize developer velocity.
- Led API gateway implementation to streamline management of 15 billion annual requests, improving reliability and simplifying the integration experience for consuming teams.
- Partnered with architecture teams to define production SLOs, error budgets, and observability standards; planned multi-region Diamond+ resiliency tier.
- Facilitated infrastructure and reliability Community of Practice across a 200+ developer organization; hosted workshops on Terraform, AWS, and AI workflows, empowering dev teams to self-manage and own their infrastructure.
AI Transformation
- Transitioned a team of six engineers to AI-first engineering practices, slashing feature delivery times from weeks to days through spec-driven development patterns.
- Implemented Claude Code skills, MCP tooling, and GitHub Copilot to automate code review, deployments, and previously manual processes, freeing developers to focus on higher-value work.
Cloud Migration & Automation
- Led phased migration of 20–30 web servers from a legacy data center to AWS, converting 100% of infrastructure to code (Terraform) with automated deployments.
- Managed migration of a 4 TB database from IBM DB2 to MySQL with zero data loss while optimizing performance.
- Designed multi-AZ disaster recovery strategy eliminating single points of failure through automated backups and failover mechanisms.
- Provided infrastructure and support for $40M logistics re-platforming project; streamlined Ready Logistics operations through successful vendor transition to DHL.
- Built CI/CD pipelines (GitHub Actions) for Terraform and application deployments, reducing deployment errors and giving developers a fast, self-service path to production.