DevOps / Site Reliability Engineer

Posted 6/29/2022

This position is responsible for rearchitecting how we update existing services and deploy new ones. Our systems, which transcode and stream live broadcast television, must run 24/7/365 without interruption, so deploying new or updated services must not disrupt our production workflows. DevOps is responsible for developing the methodologies for testing and deploying without service interruption. Site Reliability Engineering is responsible for developing the systems that monitor and maintain the health of our core video services, and for gathering and displaying core metrics on system performance.

Our systems involve many branches of software development:

  • Embedded software for interfacing to broadcast systems and performing live transcoding.
  • Massive, load-balanced server farms for performing stream processing and distribution.
  • Web-based tool development for managing and scheduling system operations.
  • Multi-layer, cached database systems for stream management.
  • Large-scale data mining for analytics.
  • App and web development for viewer-facing systems.
  • Systems for monitoring and alerting on all of the above.

We develop in many different languages, including:

  • Go
  • Java
  • C++
  • Node.js
  • JavaScript

We develop apps for the following platforms:

  • Android
  • iOS
  • Apple TV
  • Roku
  • Fire TV Stick
  • Web

We use a variety of database, caching, and messaging systems:

  • SQL Server
  • Redis
  • RabbitMQ
  • DynamoDB
  • ElastiCache