
RevenueCat
We are looking for a Senior Site Reliability Engineer to help design, build and support reliable core systems and infrastructure: someone who is passionate about reliability, scalability, efficiency and visibility, and who can collaborate effectively across teams and extend the SRE culture. Our stability affects the experience of millions of users.
About you:
- You have 5+ years of experience designing and maintaining complex/large/growing systems.
- You collaborate well with others, and can communicate effectively in a fully-remote culture.
- When reviewing new system designs or code, you naturally think about what can go wrong: edge cases, failure modes, bottlenecks, migrations, releases, interesting metrics, etc.
- You love debugging and finding the root cause of production issues.
- You can’t sleep if something doesn’t have enough metrics to ensure everything is working properly.
- You are proactive: when you see something broken, you jump in to fix it or suggest improvements.
- You move fast, test and iterate quickly.
- You love the Linux/Unix shell, but hate manual processes and love to automate all the things.
Preferred Experience:
- Experience with AWS, k8s, Terraform/Pulumi, Prometheus, OpenTelemetry, Elasticsearch, PostgreSQL, MariaDB/MySQL.
- Experience supporting highly available, high-throughput REST APIs
- Solid knowledge of Python
In the first month, you’ll:
- Meet frequently with your team and mentor to get up to speed
- Get set up: familiarize yourself with the repositories, task management and dev environment
- Implement and ship your first project
- Familiarize yourself with the RevenueCat dashboards, logging, debugging tools, cloud providers, infrastructure management and general architecture
- Familiarize yourself with workflows and subscription business concepts.
Within the first 3 months, you’ll:
- Be able to scope and work on tasks self-sufficiently.
- Start on-call training
- Participate in code reviews and contribute in other ways (testing, visibility, …) to improve the reliability and quality of services
Within the first 6 months, you’ll:
- Contribute to risk assessment, disaster planning and response strategies
- Be obsessed with our uptime
- Detect our blind spots and add observability to mitigate them
- Work closely with product engineers to design reliable rollouts of new features
- Review code and proposals, and participate in architectural discussions.
Within the first 12 months, you’ll:
- Know all the major components of our system and be able to debug complex issues
- Have your own initiatives for improving the services and our infrastructure
- Be able to spec and architect medium-to-large projects, gather feedback, and design validation and rollout plans.
- Mentor other engineers
- Influence the org to improve general reliability, scalability and performance
What we offer:
- $212,000 USD salary regardless of your location
- Competitive equity in a fast-growing Series C startup backed by top-tier investors including Y Combinator
- 10-year window to exercise vested equity options
- Fully remote work environment that promotes autonomy and flexibility
- Suggested 4 to 5 weeks of time off to recharge and focus on mental, physical, and emotional health
- $2,000 USD to build your personal workspace
- $1,000 USD annual stipend for your continuous learning and growth