Infrastructure Cloud Engineer

Spire Global is a space-to-cloud analytics company that owns and operates the largest multi-purpose constellation of satellites. Its proprietary data and algorithms provide the most advanced maritime, aviation, and weather tracking in the world. In addition to its constellation, Spire’s data infrastructure includes a global ground station network and 24/7 operations that provide real-time global coverage of every point on Earth.

Who we are and what we’re looking for

The Infrastructure Services team here at Spire are very passionate about the quality of our infrastructure; we fully embrace modern approaches and best practices. Gone are the days of sysadmins painstakingly handcrafting each virtual machine running on an on-prem monolithic server. We maintain a cloud-based PaaS running on Kubernetes, hosted on AWS.

We are looking for someone equally passionate about infrastructure to join our team and help us continue our journey. We’ve made big strides, but there is plenty more to build and innovate. We are currently working on the second iteration of our Kubernetes PaaS. This means addressing some of the pain points from v1, while also migrating our development teams from standard AWS EC2 instances onto Kubernetes.

Our ideal candidate will have considerable soft skills as well as strong technical abilities. The team’s mission goes beyond just building software. Socializing solutions, encouraging adoption and supporting our internal customer team will all be essential. Confident individuals with a cool temperament would fit in well, especially if they can empathize with our developers when challenges arise.

Your responsibilities

SREs in the Infrastructure Services team are primarily responsible for building and maintaining the PaaS running on Kubernetes. A typical task might be extending the Kubernetes cluster, or deploying a new service on top of it. If a task requires a large number of new technical decisions, we require a Technical Proposal to be written first. This allows the entire team to collaborate on new designs, rather than just the individual assigned to the ticket.

In addition to building out the PaaS, we also support the various teams that rely on it. We tend to be very involved in the migration process to ensure a smooth transition onto Kubernetes. You would be providing technical expertise to developers with varying amounts of infrastructure experience. We are also required to help debug infrastructure issues as they crop up. This might mean helping a user find their application’s logs in Kibana, or debugging why a pod is crashing. To that end we also have a follow-the-sun on-call rota. One of the benefits of being a global company is that nobody is required to handle on-call overnight.

Technologies we use

  • Kubernetes (AWS EKS)
  • Helm
  • Terraform (HCL and Terraform CDK)
  • TypeScript
  • Concourse
  • Argo CD
  • Prometheus/Alertmanager/Grafana
  • Elasticsearch/Kibana
  • AWS

Our journey so far

Any experienced Site Reliability Engineer knows that Kubernetes is not a magic bullet. Instead it is the launch pad for the solutions that support your business requirements. We run the following on top of Kubernetes:

Continuous Integration 

Concourse is our CI tool of choice. Primarily we use it to build, test, and containerise our applications. It is also used to automate various miscellaneous tasks.

Continuous Deployment

Argo CD is our CD tool of choice. It is Kubernetes-native, flexible, and incredibly easy to use.

  • One-click zero-downtime deployments that are rapid and safe
  • One-click rollbacks for when mistakes are made
  • Developers are empowered to handle their own deployments via an easy-to-use GUI

Infrastructure without Toil

We define toil as a boring repetitive task that an individual needs to complete. We hate toil, so we automate it away as much as possible. To date we have built:

  • Fully automated TLS with Let’s Encrypt. Our services don’t experience yearly outages due to expired certificates.
  • Fully automated DNS, which removes the potential for human error.
  • Self-service DNS and TLS: developers configure both for their application via a simple YAML file. This removes human bottlenecks and maintains developer momentum.
  • Automatic Pod and Node scaling, so that we remain cost-effective while still providing adequate compute power.

Basic Qualifications

  • BSc in Computer Science or similar
  • Linux Administration
  • Infrastructure as Code (Terraform/Ansible/Puppet/Chef)
  • Programming/Scripting

Preferred Qualifications

  • Cloud Infrastructure (AWS/Azure/GCP)
  • Kubernetes (EKS/AKS/GKE)
  • Experience managing CI/CD systems

Spire is Global and our success draws upon the diverse viewpoints, skills and experiences of our employees. We are proud to be an equal opportunity employer and are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender identity or veteran status.

 
