Under supervision, perform the activities and provide the team leadership required to support a large, complex Linux-based computing environment and an increasing transition to Linux infrastructure in AWS. Assist in driving an “infrastructure as code” mentality throughout the organization and demonstrate a passion for automation concepts and tools. Utilize customer service skills while acting as a technical resource to internal departments and system users. Use technical skills to proactively put scripts and documentation in place to comply with current standards.
Primary Duties and Responsibilities:
- Provide advanced system administration, operational support, and problem resolution for a large, complex Linux computing environment, including both virtualized and physical servers
- Create and patch AMIs, submit pull requests, and write automation code
- Perform Linux administration including changes, deletes, disk space management, application installation, and backup
- Use your infrastructure and networking knowledge to maintain cloud-based infrastructure, predominantly on AWS, involving EC2, S3, RDS, and VPC
- Use configuration management and provisioning tools (primarily Ansible and Terraform) to build and maintain a hybrid infrastructure hosted both at colocation facilities and in the public cloud.
- Work directly with the development team to build supporting infrastructure for specific new application functionality.
- Run proof-of-concept projects on early-stage infrastructure improvements to validate the feasibility of an approach, evaluate performance, and spike an implementation.
- Review and evaluate virtual and physical server performance and capacity
- Forecast system demands and recommend upgrades, expansions, and reconfigurations
- Perform automated computing environment builds, site setup, user training, hardware/software installation, maintenance and support, and documentation of operating procedures and processes
- Support VMware environment including changes, adding/removing systems, and disk space management.
- Troubleshoot hardware and software problems, take appropriate corrective action, and/or interact with IT staff or vendors in performing complex testing, support, server recovery, and troubleshooting functions.
- Assist with development and testing of changes needed to maintain DR environment
- Follow the change management process
- Comply with all audit, compliance, and regulatory requirements
- Attend meetings as a team representative
- Support on-call, weekend, and off-hours work as needed
- Perform other duties as assigned
Skills and Qualifications:
- Good consultative, communication, analytical, and judgment skills
- Ability to work effectively with clients, technical staff, consultants and vendors
- Ability to work well under pressure and within deadlines
- Experience with disaster recovery testing and creating technical documentation
- Ability to communicate well and perform as part of a team located in multiple cities
- Extensive knowledge of Linux operating systems, Linux shells and standard utilities, and common Linux security tools
- In-depth system administration knowledge and skills for Red Hat Linux. Knowledge of Amazon Linux is a plus.
- Experience using GitHub or other version control tools for source code management
- Experience using configuration management tools such as Puppet, Chef, or Ansible and container tools such as Docker
- Ability to write and maintain automation code, scripts, and infrastructure as code (IaC), such as Terraform
- Familiarity with DevOps activities and using CI/CD pipeline software to deploy code
- Working knowledge of cloud components and services in AWS or Azure.
- System administration experience and knowledge of VMware and administration of virtual servers
- GRUB, PXE boot, Kickstart
- Yum, RPMs, Satellite server
- SVM, LVM, Boot from SAN, UFS/ZFS, filesystem configuration
- General working knowledge of NAS, SAN, and networking
- Experience with GitHub, Ansible, Jenkins, and Terraform tools/applications
- Knowledge or experience with DevOps, OpenShift, AWS cloud, or other similar technologies is desirable
Education and/or Experience:
- Bachelor’s degree in Computer Science or a related discipline or an equivalent combination of education and work experience.
- Three or more years of experience in Linux systems installation, operations, administration, and maintenance of physical and virtualized servers