Staff Engineer, Product Software - Site Reliability
Posted on: September 14, 2019
Equinix is the
leading global interconnection platform, accelerating business
performance by connecting companies to their customers and partners
inside the world's most networked data centers. More than 4,500
customers trust us to provide a place where they can run their
mission-critical applications and grow their businesses.
Our dream is to interconnect the world - and create a historically
significant company in the process. Today we are a $3.6 billion
company with over 7,000 employees worldwide, and we're growing - in
fact, in 2013, Forbes named Equinix as the #15 Fastest Growing
Technology Company in America. Our leadership team is top-notch,
our employees are dedicated and committed to customers and each
other, and our size is just right for people who truly want to make
a difference every day.
At Equinix, we make the internet work faster, better, and more
reliably. We hire hardworking people who thrive on solving hard
problems and give them opportunities to hone new skills, try new
approaches, and grow in new directions. Our culture is at the heart
of our success and it's our authentic, humble, gritty people who
create The Magic of Equinix. They share a real passion for winning
and put the customer at the center of everything they do.
- Defining and evangelizing cloud-related optimizations and best
practices to improve reliability, scalability, and performance
- Supporting edge services before they go live through activities
such as system design consulting, capacity planning and launch
- Maintaining edge services once they are live by measuring and
monitoring availability, latency and overall system health,
- Designing, managing and deploying monitoring solutions and
tools to identify and address reliability risks and performance
bottlenecks, optimizing the ROI on the infrastructure and reducing
- Troubleshooting edge services infrastructure,
systems, network, and application stacks
- Performing on-call duty as part of a team maintaining the
availability and performance of our cloud infrastructure as well as
the various internal services and systems that our engineering team
- Hands-on experience with Kubernetes is preferred
- BS degree in Computer Science or related technical field with
5+ years of working experience; or master's degree with 3+ years of
- 5+ years of system administration experience
- 2+ years of cloud engineering experience with at least one of
the leading public cloud platforms, AWS, GCP, Azure, AliCloud,
Oracle Cloud, etc. Experience with administering AWS or GCP is
- Deep understanding of the software development life cycle and
zero-downtime release management. Experience with agile-based
iterative development and knowledge of software engineering best
- Experience with OpenStack Ironic, Neutron, Keystone, Ceph,
- Ability to solve problems independently and systematically,
coupled with strong written and verbal communication skills and a sense of
ownership and drive
- Experience implementing and managing enterprise-scale
monitoring, trending, and alerting solutions.
- Sufficient Linux environment / OS performance and
troubleshooting skills. Able to debug and identify network
- Strong automation skills using tools such as Ansible, Chef,
Terraform, Jenkins, etc.
- Self-motivated with ability to multi-task and work under
- Working knowledge of network routing & switching, load balancing,
clustering, distributed systems, content delivery networks, message
queueing systems, etc.
- Proficiency with Python programming in the domain of
- Interest in designing, analyzing and troubleshooting
large-scale distributed systems.
- Experience with algorithms, data structures, complexity
analysis and software design.
- Experience working with fault tolerant and highly-available
- Experience with big data systems and/or database administration
(e.g. PostgreSQL, Cassandra, etc.) a plus
- Experience with performance profiling and optimization,
preferably in a distributed environment
- A cloud solutions certification at associate level or above from
any of the public clouds (such as AWS, GCP, Azure, etc.) is a plus
- Experience with ELK stack
Equinix is an equal opportunity employer. All applicants will
receive consideration for employment without regard to race,
religion, color, national origin, sex, sexual orientation, gender
identity, age, status as a protected veteran, or status as a
qualified individual with disability.
Keywords: Equinix, Sunnyvale, Staff Engineer, Product Software - Site Reliability, Professions, Sunnyvale, California