We are looking for a Principal Site Reliability Engineer (Platforms and Systems Specialist).
For further details, please email your updated profile to aravinthu@tsmspl.com
Location: Remote/Hybrid
General Summary:
The Cloud Operations team provides 24x7x365 support for all Company SaaS &
Hosting customers globally. This business unit is responsible for the
day-to-day management and support of the cloud operations environment including
the uptime, performance and high availability of all customer-supporting
systems inside the SaaS & Hosted environments. The SaaS & Hosted
ecosystem comprises multi-tiered applications, microservice
architectures, containers & virtual servers, as well as large & complex
multi-terabyte SQL database systems. The SRE Platforms & Systems Specialist
will be focused on the orchestration and automation of infrastructure and
deployments supporting the lifecycle management processes for Company hosted
environments in both public and private cloud settings. The ultimate objective
will be to minimize toil as much as possible through automated solutions for
the day-to-day maintenance, upkeep, and operations control tasks. This frees
the engineer to focus on improving the performance of the
existing platform and contributing to the design and architecture of new
infrastructure and software inside the SaaS ecosystem. The candidate should
have the skill set of both a principal systems engineer and a junior to
mid-level software developer. This includes a deep understanding of Windows and
Linux, AWS, and other cloud platforms.
Key Responsibilities
• Develop orchestration and automation for Active
Directory, task management systems, secure file transfer systems and other
common cloud operations platforms in support of the Company cloud operations
SaaS & Hosted environments.
• Co-develop the automation & orchestration
framework from the ground up with other SREs, including establishing design
patterns for the CMDB, configuration management, password management and
other key integrations.
• Implement and maintain CI/CD pipelines for the
automation and orchestration of the SaaS & Hosted cloud operations
environments.
• Create automation and orchestration for core
datacenter cloud operations services.
• Continuously develop self-healing automation
for systems to reduce toil.
• Participate in a rotation providing incident and
request handling support, identifying improvement opportunities where
automation or rearchitecting of solutions can improve overall outcomes and
reduce toil.
• Serve as technical lead for Active
Directory and other central platform services on major projects inside
hybrid cloud environments.
• Train SRE team members,
project engineers, technical support staff and application development staff to
better utilize AD & other managed platforms.
Professional Skills & Abilities
• Desire and ability to thrive in a fast-paced,
highly demanding, dynamic business and cloud operations environment.
• The role requires analytical acumen and
solution orientation to probe for understanding and to make appropriate
decisions to address the nuances of technical and business challenges to
achieve the targeted outcome.
• Strong customer service orientation
• Excellent communication skills and experience in driving
cross-departmental initiatives to
achieve organizational objectives & meet customer needs
• Strong communication, presentation, business,
and technical writing skills
• The ability to provide excellent customer
service as well as manage and build strong relationships both internally and
externally
• Strong interest in further developing and
integrating operations with technology in ways that create business value
• Awareness of emerging issues, including
regulations, industry practices and technology
• Experience with Kubernetes and container
administration is a plus.
Technical Skills & Experience
• 15+ years of experience in job-specific skills.
• 8+ years of experience writing automation
scripts in Bash, Python, or PowerShell to solve technical and business problems
in IT operations.
• 8+ years of experience with Active Directory,
DNS, secure file transfer, OS patching, and various other platform services.
• 4+ years of experience in orchestrating
automation for cloud operations or managed services environments building
runbooks that align with work streams and value streams
• 3+ years of direct involvement with
datacenter buildouts and/or disaster recovery of core platform systems such
as Active Directory and the other services mentioned above.
• 1+ year of experience automating against AWS
APIs for system builds, backups and other system management interactions.
• Degree in Computer Science or equivalent experience.