Site Reliability Engineer

Job Description

We stand as the revolutionary vanguard of web3, a vision of a world powered by individual autonomy, shared self-sovereignty and limitless collaboration. Established by trailblazers behind The Graph, we're on a mission to make The Graph the internet's unbreakable foundation of open data. We invented and standardized subgraphs across the industry, solidifying The Graph as the definitive way to organize and access blockchain data. Drawing on deep expertise in developing open-source software, tooling, and protocols, we empower builders and entrepreneurs to bring unstoppable applications to life with revolutionary digital infrastructure.

We act on a set of unwavering principles that guide our journey in shaping the future. We champion a decentralized internet, free from concentrated power, where collective consensus rather than authoritative dictation aligns what is accepted as truth. Our commitment to censorship resistance reinforces our vision of an unyielding information age free from the grasp of a single entity. By building for open-source, we challenge the stagnant landscape of web2, recognizing that true innovation thrives in transparency and collaboration. We imagine a permissionless future where the shackles imposed by central gatekeepers are not only removed, but relegated to the dustbin of a bygone era. And at the foundation of it all, our trust shifts from malevolent middlemen to trustless systems, leveraging smart contracts to eliminate the age-old vulnerabilities of misplaced trust.

The Site Reliability team works closely with Engineering teams across the organization to ensure the services we operate are reliable, performant, and predictable. We focus on a mix of software development, operational automation and collaboration with other teams to help take our service delivery to the next level.

We are looking for a highly motivated engineer with either SRE or DevOps experience who can help us develop and automate the various services E&N operates as part of The Graph ecosystem. In this role, you will have the opportunity to drive availability and reliability across multiple engineering teams and work closely with them to ensure the operational aspects of managing services are automated and observable.

What you'll be doing:

Building automation and management systems to deliver the various services which enable The Graph to function
Coaching teams across The Graph ecosystem on best practices for deployment, observability and scalability
Collaborating with other SREs and engineering leaders to ensure our architecture and operations are world-class
Cultivating a culture of learning by providing insight into performance and reliability at an operational level

What We Expect

Experience building and delivering large-scale software systems
Previous experience working with both bare metal infrastructure (e.g. Equinix) and cloud infrastructure (ideally GCP)
Experience operating as an SRE (or similar role) with hands-on experience implementing processes that drive reliability and performance
History of working across organizations to codify and implement best practices for both operation and construction of software systems; knowledge of CI/CD best practices and the ability to implement them are considered a plus
Deep working knowledge of Kubernetes (or other container orchestration systems) and associated technologies
Clear communication skills (written and verbal) to document processes and architectures

Work in United States
Base Salary

120,000 - 160,000 USD

Skills
  • DevOps

Company

Company Name

Talent Partners Ltd

Recruiter

Elizabeth Saruh

External Recruiter

Nairobi, Nairobi City, Kenya
