Senior Site Reliability Engineer
Project Detail
You are responsible for:
- Helping to create a repeatable OpenStack cloud deployment
- Implementing a network topology using Open vSwitch, providing per-tenant networking, load balancing, and IPv6
- Performing day-to-day operational tasks on Wikimedia’s Cloud Services infrastructure (deployment, maintenance, configuration, troubleshooting), and developing and supporting automation tools and processes for these tasks
- Participating in the on-call rotation and providing support in a 24×7 environment
Skills and Experience:
- Comfortable working and thriving within a Linux ecosystem
- Understanding of networking in the physical domain of switches and servers
- Software development skills in at least one of the following languages: Python, Go, JavaScript, or Ruby
- B.S. or M.S. in Computer Science or a related field, or equivalent related work experience
Qualities that are important to us:
- Share our values, appreciate our code of conduct, support our team norms, and work by all three
- Strong English language skills and the ability to work independently as an effective part of a globally distributed team
- Support of our users (volunteer and staff developers) using our service offerings
- Passionate about the value of learning and growing together
Additionally, we’d love it if you have:
- Used configuration management tools such as Puppet, Ansible, Chef, or SaltStack
- Used Kubernetes, Docker Swarm, Mesos, or similar container orchestration platforms
- Operated an elastic computing environment such as OpenStack or CloudStack
- Operated a multi-tenant capable software-defined network (SDN)
- Experience in serverless computing environments
- Linux systems troubleshooting and debugging skills
- Interest in open-source software projects and communities
To apply for this role, visit: Senior Site Reliability Engineer vacancy