EMS Software: Senior Cloud Operations Engineer (Centennial, CO) (allows remote)
Posted: Mar 20, 2017
Leveraged by millions of users every day, EMS Software manages some of the highest-profile spaces in the world (including the NASDAQ bell). We are consistently delivering new features to our suite of products. We want to tackle bigger challenges and accomplish some truly amazing things. Our team is always improving our codebase and operations footprint, and we have amassed a sizeable backlog of interesting challenges and product initiatives. Our team needs to grow to enable even greater success in our industry, and that is where you come in.
Your First Three Months
In your first month, as your familiarity with the product grows, your responsibilities and influence will grow as well. You will collaborate with other members of the team in established patterns and continue to hone your skills as you push the design, architecture and implementation of our production environments to their next phase for general adoption.
Within two months, you and your team will have a well-tested, low-latency and highly available hosted environment. Additionally, you will have helped to create the process of onboarding customers into this new environment, and you will continue to support them throughout their transition.
Within three months, you will have played an instrumental role in growing your team, helping to hire both your direct manager (our next Director of Cloud Operations) as well as another Operations Engineer. Further, you will have helped to drive changes to the operational and development roadmap as we inch closer to onboarding 40% of our customer base into hosted production environments by the end of 2017.
- Design, provision, configure and maintain platform operations that can handle the scale of running several application stacks in the cloud for customers worldwide
- Automate the deployment and maintenance of cloud platform technologies
- Oversee production operations, log management, data warehouse, and database operations, including management of Splunk services
- Ensure all monitoring systems (IT, development, service management, Apdex) are in place
- Enforce consistency of monitoring, reporting, and alarming systems
- Drive process improvements for service management, including outage/incident management, rollbacks and reporting
- Research emerging virtualization techniques and advise management
- Perform capacity management, load and scalability planning
- Ensure compliance with deployment and operations documentation
- Assist management in development and optimization of operational cost models
- Design cloud infrastructure for high reliability and availability
- Build strategic and tactical plans for continued improvement of cloud architecture and operations
- Manage, maintain, and enforce service level agreements
- Provide operational reports (e.g., Service Level, Usage, Cost) to upper management
- Optimize cost structure for cloud operations
- Assist in the establishment of 24x7 performance monitoring and response protocols
- Provide on-call support outside of normal work hours/days