Site Reliability Engineer

Would you like to help implement innovative cloud computing solutions and solve the most complex technical problems? Do you have a deep passion and desire to engineer and operate the world's largest cloud computing infrastructure to provide a better world for future generations?

Amazon Web Services is seeking top industry technical experts to grow our team dedicated to expanding our cloud service lines to our customers. We are seeking talented engineers skilled in the art of integration who understand the Agile mindset and DevOps philosophies, yet are not constrained by how “things are usually done” and are willing to decompose and reinvent systems, processes, and tools.

Amazon has a fast-paced environment where we “Work Hard, Have Fun, Make History.” On a typical day engineers might deep dive to root cause a customer issue, investigate why a metric is trending the wrong way, consult with the top engineers at Amazon, or discuss radical new approaches to automate operational issues.

Site Reliability Engineers are responsible for maintaining tools/systems/platforms for our cloud service lines. This includes troubleshooting problems with systems and services, regular deployment of new versions of the systems and their subcomponents, deployment/system validation and testing, service monitoring, standing up new services/tools, etc. The teams work with many different internal Software Development teams to drive improvement of the systems/services within the team's scope. It is important to be able to work collaboratively and independently to investigate and document issues and create solutions to solve them at scale.

You'll be part of a world-class team in a fast-paced environment that has the entrepreneurial feel of a start-up. This is an opportunity to operate and engineer systems on a massive scale, and to gain world-class experience in cloud computing. You'll be surrounded by people who are wickedly smart, passionate about cloud computing, and believe that first-class service is critical to customer success.

If you possess that rare mix of experience in Development, Operations, Networking, and Systems Engineering, then we are waiting for your application.

Amazon is committed to a diverse and inclusive workforce. Amazon is an equal opportunity employer and does not discriminate on the basis of race, ethnicity, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

BASIC QUALIFICATIONS

· Linux experience – 5+ years
· Development experience in Python, Go, Ruby, or related languages – 5+ years
· Experience of Agile methods and processes – 5+ years
· Advanced knowledge of configuration management systems such as Puppet, Chef, Ansible, or related systems – 5+ years
· Ability to participate in an on-call rota

PREFERRED QUALIFICATIONS

· Bachelor’s degree in Computer Science or equivalent Engineering discipline
· Experience with service-oriented architectures
· Advanced knowledge of TCP/IP networking, architecture, and core technologies (such as DNS, DHCP, HTTP, Routing, VPN)
· Excellent problem-solving skills with a strong attention to detail
· Ability to dive deep into complex technical problems
· Experience of multithreaded, distributed development
· Meets/exceeds Amazon’s leadership principles requirements for this role
