Project: Cloud File Transfer (CFT)
Part 1: Mandatory Requirements
Responsibilities:
- Spearhead cloud operations with a strong focus on monitoring, performance tuning, and release management within AWS environments.
- Ensure L2 incident management and escalation procedures are robust and proactive, prioritizing multiple issues effectively.
- Coordinate with internal and external teams to swiftly resolve application and security incidents in line with SLAs.
- Develop and refine operational support processes, including daily checklists, work dashboards, and communication protocols to maintain clear timelines and issue tracking.
- Regularly analyse operational metrics, report on cloud system status, and provide insightful updates to stakeholders.
- Exhibit excellent communication skills to convey key findings and maintain strong relationships across the board.
- Lead change management initiatives by assessing impacts thoroughly, crafting strategies, and developing risk mitigation measures.
- Oversee a validation team tasked with rigorous QA and security assessments to ensure stakeholder changes are thoroughly vetted before release.
- Organise and execute maintenance schedules and system upgrades to optimise cloud infrastructure performance, liaising with vendors and internal teams to keep the cloud environment stable.
- Set and evolve OKRs and SLAs, striving for continuous enhancement of cloud operation performance.
Experience and Skills:
- Degree or equivalent in Computer Science, Information Technology, or related fields, supplemented by relevant experience.
- At least 2 years of hands-on management of public cloud services, preferably AWS.
- Acute problem-solving skills within varied cloud infrastructures and applications.
- Exceptional customer service acumen, with a strong sense of urgency and detail-oriented approach to issue resolution.
- Track record in developing and enforcing IT processes, procedures, and policies.
- Proficient in managing cloud production environments and instituting preventative measures to mitigate potential business impact.
- Competent in operational cloud technology activities, including impact assessments and service improvement execution.
Key Technologies:
- Experience with infrastructure as code, specifically Terraform, for efficient resource provisioning and management.
- Proficiency in GitLab for continuous integration/continuous deployment (CI/CD) pipelines and version control.
- Strong understanding of AWS services and architecture, underpinning the majority of our cloud operations.
Part 2: General Requirements
As a DevOps Specialist, you will:
- Develop automation and processes that enable teams to deploy, manage, scale and monitor their applications in data centres and in the cloud.
- Troubleshoot and solve system problems across platform and application domains, and participate in on-call escalations to troubleshoot customer-facing issues.
- Take ownership of end-to-end solutions provided by teams across the organisation.
- Deploy and manage tools for monitoring infrastructure performance, utilisation and health.
- Implement a configuration management system for business continuity management and automate disaster recovery measures.
- Provision virtual machines, databases, application containers and licenses for the development team.
As a DevOps Specialist, you need to bring to the team:
- Passion for automation, standardization and best practices
- Excellent understanding of Software Development Life Cycle, Test Driven Development, Continuous Integration and Continuous Delivery
- Experience working with high-availability, high-performance, high-security, multi-data-centre systems and hybrid cloud environments
- Demonstrable skills in three or more programming/scripting languages
- Experience with version control systems such as Git
- Experience with cloud platforms such as GPC, GCC (i.e. AWS, Azure, Google Cloud)
- Ability to troubleshoot complex issues ranging from system resource to application stack traces
- Comfortable with Agile methodologies and working closely with product development teams
- Strong collaboration and communication skills, including documentation
- Degree or Diploma in Computer Science, Computer or Electronics Engineering, Information Technology or related disciplines
Experience required:
- Experience in one or more automated provisioning tools such as Vagrant, Ansible, Puppet, Chef
- Experience in one or more automated infrastructure testing tools such as Serverspec
- Experience in one or more cloud infrastructure platforms such as OpenStack, CloudStack, vSphere
- Knowledge of RPM file deployment, management and design
- Knowledge of disaster recovery, system backup and restore
- Experience in one or more virtualization technologies (KVM, Xen, VMware, Hyper-V)
- Knowledge of container technologies such as Docker, LXC
To apply for this job, email your details to elavarasan@acpcomputer.edu.sg