stepscale

Founding Platform Engineer

Remote (EU / Israel timezones) · Full-time · Competitive + meaningful equity

About the role

You will own the core stepscale AI runtime that ingests workload telemetry, runs the tuning models, and writes optimized configs back to customer autoscalers across AWS ECS and Kubernetes. This is a foundational position. You will shape what production infrastructure looks like at stepscale for the next several years.

What you will work on

  • The metrics ingestion pipeline that handles queue-depth, task-count, and request-rate streams from customer environments
  • The applier service that pushes tuned configurations through native APIs (ECS UpdateService, HPA / KEDA CRDs) safely, with rollback
  • Multi-tenancy isolation and per-customer state stores
  • Observability for our own scaling decisions - every change should be explainable

What we are looking for

  • 5+ years building production cloud infrastructure, ideally including ECS or Kubernetes at scale
  • Comfortable owning a system end-to-end from API surface to operations
  • Strong written communication - we are async-first
  • A bias toward shipping over polishing the architecture diagram
  • Bonus: prior work on autoscaling, scheduler internals, or time-series systems

How we work

Remote, async by default, no status meetings. We hire ICs only at this stage. You ship to production with confidence, and code review pairs you with whoever has the most context rather than a fixed gatekeeper.

Apply

Send a CV and a few lines on why this role interests you.