Infrastructure & Cloud
Senior Site Reliability Engineer (SRE / DevOps)
Singapore | Full Time | Posted 1 week ago
About the Role
As a Senior SRE, you will be the backbone of our infrastructure, ensuring our global gaming and livestreaming systems are fast, reliable, and scalable for millions of users. You will manage day-to-day operations, drive automation, and collaborate closely with engineering teams to maintain the highest levels of service availability.
Key Responsibilities
- Manage day-to-day operations, deployment, monitoring, and incident response for global gaming/livestreaming systems.
- Collaborate with engineering, QA, and product teams to quickly diagnose and resolve production issues and keep service availability within SLA targets.
- Analyze system performance and optimize network quality across global regions.
- Oversee production database health: conduct routine inspections, manage backups and recovery, optimize slow queries, and plan for capacity.
- Implement and maintain monitoring and alerting systems to ensure infrastructure observability.
- Automate operational tasks and workflows using scripting languages such as Shell or Python.
- Support capacity planning, cost optimization, and disaster recovery preparedness.
- Participate in an on-call rotation to support 24/7 system uptime as needed.
Requirements
- 5-7 years of relevant experience in DevOps, SRE, or infrastructure operations in the internet, gaming, or livestreaming industry.
- Strong Linux system administration and troubleshooting skills.
- Proficient in infrastructure scripting (Shell/Python) and automation.
- Solid experience with production database management (e.g., MySQL/PostgreSQL), including tuning, scaling, and disaster recovery.
- Familiar with global cloud infrastructure providers such as AWS and AliCloud.
- Experience building network observability and monitoring systems for overseas markets.
- Working knowledge of container technologies (e.g., Docker, Kubernetes) and CI/CD pipelines.
- Experience supporting 24/7 mission-critical environments or participating in on-call duty is a strong advantage.
Nice to Have
- Experience with real-time systems (gaming, live streaming, WebRTC).
- Familiarity with GPU infrastructure for AI/ML workloads.
- Certifications in AWS/GCP (Solutions Architect, DevOps Engineer, etc.).
What We Offer
- Be part of a high-impact global product team targeting emerging markets.
- Competitive compensation and performance-based bonuses.
- Opportunities to grow into infrastructure leadership roles.
- Dynamic, inclusive, and tech-forward working environment.
Quick Apply
Interested in this role? Send your resume and portfolio directly to our hiring team.
Only shortlisted candidates will be contacted.
Role Summary
Location: Singapore
Type: Full Time
Experience: 5+ Years
Department: Infrastructure & Cloud