As a seasoned Technical Leader with over 16 years of experience, I have consistently delivered innovative and transformative solutions for financial institutions and diverse industries, including Banking, Insurance, Payments, Healthcare, and Retail. My expertise spans Site Reliability Engineering, Non-functional Engineering, Solution Architecting, JVM Tuning, Capacity Planning, Resilience, Observability, and Cloud Adaptation. I specialize in designing and implementing scalable, resilient, and highly available distributed systems using tools such as Datadog, Dynatrace, New Relic APM, and SQL, along with deep expertise in cloud-native architectures (AWS, GCP, Azure) and Kubernetes. In my role as a Senior SRE Architect, I focus on driving enterprise-wide reliability strategies by implementing self-healing and fault-tolerant infrastructures. Leveraging SLOs, SLIs, error budgets, and cutting-edge technologies like AIOps, chaos engineering, and predictive analytics, I ensure system performance and operational excellence. My passion lies in architecting next-generation reliability frameworks that prevent failures before they occur, aligning engineering efforts with business goals.
Additionally, I bring extensive experience managing high-performing teams, including mentoring senior engineers and fostering a culture of collaboration, learning, and proactive reliability. With over seven years of experience in backend engineering, primarily with Java and distributed systems, I have a deep understanding of asynchronous microservices and data consistency. My work with the Kafka ecosystem, including Apache Flink, has equipped me to design and scale complex systems that meet the needs of customer-first environments. As a strategic thought leader at the intersection of SRE, DevOps, and software engineering, I thrive in fast-paced environments, ensuring challenging projects are delivered with precision and impact. I am passionate about inspiring teams, advocating for innovation, and driving the future of scalable, reliable systems.
Work Experience

SRE & Performance Architect | High-Performance Application Design
Teacher Retirement Systems of NYC
2018 - Present

Platform & Observability Transformation Lead
HealthFirst Insurance
2017 - 2018

Cloud Modernization Architect
Maplin Online Electronic Store
2015 - 2016

NFT Manager - Simplification Program
Lloyds Banking Group
2013 - 2016

Performance Engineer
OCBC Bank
2011 - 2012

Senior Performance Test Engineer
Credit Suisse Bank
2009 - 2011

Software Developer
Persistent Systems Inc
2008 - 2009
Certifications
AWS Certified Cloud Practitioner
Issued by: Amazon Web Services (AWS)
Issued on: March 2021
Salesforce Certified AI Associate
Issued by: Salesforce
Issued on: April 2025
GitHub Copilot Certified
Issued by: GitHub
Issued on: May 2025
Sun Certified Java Programmer
Issued by: Sun/Oracle
Issued on: February 2007
Connect
Feel free to contact me at aksahthakur34@gmail.com
