Site Reliability Engineer (SRE) – Containerized Microservices
Aylo · Montréal
Job description
About the role
We are looking for a skilled Site Reliability Engineer to ensure the reliability, scalability, and performance of our production systems built on containerized microservices. You will join a dynamic, international team that values diversity, inclusion, and innovative problem‑solving.
Key responsibilities
- Own reliability, availability, and performance of production systems in a Kubernetes‑based environment.
- Monitor system health with Grafana dashboards, alerts, and observability tools.
- Manage and operate Kubernetes clusters via Rancher, handling deployments, scaling, and troubleshooting.
- Lead incident management using OpsGenie, including on‑call rotations and post‑incident reviews.
- Troubleshoot across application, infrastructure, messaging, database, and container layers.
- Develop automation scripts with Bash, Go, and Python to improve operational efficiency.
- Support and optimize CI/CD pipelines in GitLab for smooth releases.
- Collaborate with development teams to enhance application reliability and observability.
- Monitor and resolve performance issues in MySQL, Redis, Kafka, and RabbitMQ.
- Maintain operational documentation, runbooks, and knowledge bases in Jira and Confluence.
- Perform root‑cause analysis and implement preventative measures while ensuring security and compliance.
- Leverage AI‑powered engineering tools to accelerate troubleshooting and documentation.
Required profile
- 3+ years of experience in Site Reliability Engineering, DevOps, Production Support, or Systems Engineering.
- Bachelor’s degree in Computer Science or a related field.
- Hands‑on experience with Grafana, Kubernetes, Docker, and Rancher.
- Proven incident‑management experience using OpsGenie.
- Strong background with GitLab/Git, CI/CD pipelines, and release processes.
Required skills
- Grafana
- Kubernetes
- Docker
- Rancher
- OpsGenie
- Bash
- Go
- Python
- GitLab
- Git
- CI/CD pipelines
- MySQL
- Redis
- Kafka
- RabbitMQ
- Jira
- Confluence
What we offer
- Hybrid work environment with flexibility for remote collaboration.
- Opportunity to work on cutting‑edge AI‑assisted tooling.
- Collaborative international team across Montreal, Austin, and Nicosia.
Posted 1 day ago
Expires in 1 month