Infrastructure Manager

Role: Infrastructure Manager
Reports to: Head of Technology / Director
Direct reports: Lead DevOps, Lead SRE (with dedicated teams)
Scope: Cloud & on-prem infrastructure, reliability, tooling, cost optimization
Collaboration: B2B
Role Purpose
Own and scale the infrastructure supporting multiple SaaS products, including cloud environments and on-prem deployments. Ensure fast, reliable deployments (within weeks), high availability (~99.99%), and scaling without proportional cost growth.
________________________________________
Requirements
• 10+ years in Infrastructure / DevOps / SRE, including 3+ years in leadership roles
• Strong experience with cloud platforms and on-prem infrastructure (server hardware, networking, virtualization such as VMware and Hyper-V, and remote management at scale across distributed customer sites)
• Proven experience in scaling B2B SaaS environments
• Solid understanding of SRE principles (SLOs, incident management, automation)
• Hands-on experience with: Infrastructure as Code (e.g., Terraform), Kubernetes / containerization, CI/CD pipelines
• Experience with FinOps and cost optimization
• Familiarity with compliance frameworks (e.g., SOC 2, ISO 27001)
• Experience managing managers (not just individual contributors)
• Experience in regulated environments is a plus
________________________________________
Key Responsibilities
1. Infrastructure Strategy (Cloud & On-Prem)
• Define and execute the long-term infrastructure strategy
• Own architecture decisions (multi-region, multi-tenant, disaster recovery, data residency)
• Establish a unified infrastructure-as-code approach across cloud and on-prem
• Manage cloud vendors and optimize costs
• Define scalable standards and reference architectures for on-prem deployments
2. Build — DevOps
• Oversee CI/CD pipelines, provisioning, infrastructure-as-code, and deployment tooling
• Enable product teams to deploy independently, safely, and frequently
• Drive self-service infrastructure (minimal reliance on tickets)
• Automate on-prem deployments via remote tooling
• Own release engineering and environments (including demo/sandbox setups)
3. Run — SRE
• Own production reliability (SLOs, SLIs, uptime, monitoring, alerting, incident response)
• Define and manage on-call and operational models across regions
• Implement remote monitoring for on-prem environments
• Drive incident management in coordination with the Lead Knowledge and Incident under the Head of Operations (who owns the customer-facing incident process)
• Implement AI-powered anomaly detection and predictive incident management to reduce MTTR across both cloud and on-prem (coordinating with the Manager AI and Data)
4. Security & Compliance
• Own infrastructure-level security controls for cloud and on-premises: network security, encryption at rest and in transit, access management, vulnerability scanning, and patch management.
• Define secure baselines for on-prem environments
• Support compliance efforts (e.g., SOC 2, ISO 27001)
• Ensure audit readiness
5. Cost Management
• Own infrastructure cost as a line item: cloud spend, on-premises hardware and licensing, tooling licenses, and operational overhead.
• Track and optimize cost per product, cost per customer, and cost per transaction — contributing to the company's RPE (revenue per employee) trajectory.
• Implement FinOps practices for cloud (tagging, showback/chargeback, reserved capacity planning, waste elimination) and total cost of ownership modelling for on-premises.
6. Scaling & Capacity
• Forecast infrastructure capacity needs based on product roadmaps, customer pipelines, and data growth (especially for Data and AI products under Manager Product Portfolio 2).
• Plan infrastructure scaling without proportional headcount growth — leverage automation, self-service, and AI-powered operations.
• Define on-premises sizing guides and capacity planning tools so pre-sales and implementation teams can scope customer hardware requirements accurately.