VP of Infrastructure Engineering
- Job Title: VP of Infrastructure Engineering
- Location: Sunnyvale, CA
Head/VP of Infrastructure Engineering
We are looking for a highly skilled, experienced, and motivated leader to run and grow our Infrastructure and System Reliability Engineering group. You will innovate to solve scalability and reliability problems. You will participate in product direction and roadmap planning, schedule project execution, and hire engineers to scale our systems. A successful candidate will be a deeply curious individual who brings technical and leadership expertise and the ability to work within a fast-paced startup culture in a large company that has broad business impact.
What you'll do:
As Head of Infrastructure, you will be expected to speak authoritatively on behalf of your team, and your technical knowledge should demonstrate both depth and breadth. You will be responsible for aligning your team with the product and engineering organizations. Leveraging the strengths of individual team members, delegating tasks appropriately, and managing the delivery of long-term projects will all be critical to this role. A Technical Manager has deep knowledge of their domain and is a sought-after thought leader across the organization. They have both management and technical expertise and actively participate in the organization's planning processes.
- Daily Responsibilities:
- Manage a high-functioning team of SRE, Cloud Operations, Database, and DevOps engineers
- Work with the team to define and deliver short-, medium-, and long-term strategies for scaling applications and infrastructure
- Partner with teams throughout the organization to build the relationships and trust necessary to improve the performance, reliability, and availability of the infrastructure
- Enable the team to deliver projects and roll out new tools at scale
- Experience managing and mentoring diverse individuals (diverse in both experience and skill set)
- Practical knowledge of and experience working in public cloud environments (preferably GCP)
- Interested in building creative solutions to challenging problems
- Strategic Responsibilities:
- Create and manage a roadmap for the Infrastructure reliability engineering team to help scale the infrastructure. Align it with the quarterly and annual product roadmaps to meet the growth goals of the organization.
- Manage the health of monoliths, including but not limited to the inflow of changes into the monolith, releases, incident management, logging, and monitoring.
- Conduct blameless postmortems for incidents in the monoliths, and diligently follow through on learnings and remediations.
- Define and adhere to SLAs, availability requirements and change management processes for critical services.
- Develop CI/CD frameworks and tools with time to market, developer empowerment, and shift-left as key principles. Evangelize the frameworks with our application development teams and drive their adoption.
- Research and evaluate new technology products for pilot or proof of concept by technical teams. Bring a mature view of cost at scale to the evaluation, both at the POC stage and for full-scale use of a tool or technology.
- Develop and execute designs, runbooks, and implementation plans from concept to production for complex migration, performance, and scale problems.
- Partner with Security to proactively identify and remediate infrastructure security vulnerabilities. Implement a mature security infrastructure aligned with the security roadmap.
- Partner with backend, frontend web, and Android leaders to build solutions that modularize the stack and enable independent release cycles.
What you'll need:
- 15+ years of overall experience in Technology
- 8+ years of experience in DevOps, reliability engineering, release management, or technical operations
- 5+ years of team management experience.
- 3+ years of experience creating and implementing strategic plans and roadmaps at the executive level.
- 3+ years of experience with technical requirements, design, testing, implementation and production rollout.
- 3+ years of experience conducting project meetings, presentations and status reporting.
- 3+ years of experience with Agile project methodologies (Daily Standup, Sprint Planning and Sprint Retrospective meetings)
- Excellent verbal, written, and interpersonal communication skills
- Effective in building partnerships with senior technical, functional, and business leaders to advance short-term and longer-term initiatives
- Experience with any of the technologies used in our stack is a strong plus: Terraform, Puppet, MySQL, Kubernetes, Google Cloud, Jenkins, GitHub, SRE/observability tools, Linux systems