DevOps/Service Reliability Engineer
Leaf Space is a rapidly growing scale-up company and a leading provider of ground segment as-a-service (GSaaS) solutions.
Our innovative and proprietary concept is focused on providing satellite and launch vehicle connectivity as-a-service, enabling clients to efficiently manage their assets and fully exploit data.
Our GSaaS solutions have been recognized by the market for their efficiency, security, and effectiveness in supporting different applications, from remote sensing to IoT communications.
At Leaf Space, we operate with a flat organizational structure, we are growing both financially and in headcount, and we work professionally and autonomously in a fast-paced environment.
We prioritize hiring top talent and cultivating a collaborative, high-achieving, and supportive workplace.
Our core team is headquartered in Lomazzo (Como), Italy, and we have expanded our presence in the U.S. with headquarters based in Northern Virginia.
As we continue to develop innovative technologies to support the NewSpace economy, we are looking for world-class talent ready to tackle challenging projects to drive expansion and sustainability of the space ecosystem.
Leaf Space offers comprehensive benefits and flexible remote work options to support our employees in achieving their goals.
DevOps/Service Reliability Engineer
We are seeking a highly skilled and motivated DevOps/SRE to join our dynamic and fast-paced team.
As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems and services.
You will collaborate with cross-functional teams, including developers, operations, cybersecurity, mission managers, and quality assurance, to design, implement, and maintain robust and efficient infrastructure solutions.
In this role, you will be responsible for maintaining and optimizing the telecommunication infrastructure of our ground stations network.
This network enables seamless communication and data transfer with satellites, and it is essential for providing customers and internal operators with the necessary tools to monitor and operate the service effectively.
RESPONSIBILITIES
System Reliability: Build and maintain highly available and scalable infrastructure systems, ensuring optimal performance and reliability across all services.
Incident Management: Respond to and resolve production incidents promptly and effectively, utilizing incident management best practices. Conduct post-incident analysis to identify root causes and develop preventive measures.
Monitoring and Alerting: Develop comprehensive monitoring and alerting systems to proactively identify potential issues, troubleshoot problems, and ensure system health. Continuously improve monitoring processes and tools.
Automation and Tooling: Design, develop, and maintain automation tools, scripts, and frameworks to streamline operational processes, enhance system resilience, and reduce manual effort.
Infrastructure Optimization: Identify opportunities for infrastructure optimization and implement performance tuning strategies. Collaborate with development teams to improve system architecture and application performance.
Capacity Planning: Analyze system capacity and usage patterns, forecast future growth, and work with stakeholders to plan and scale infrastructure resources as required.
Collaboration and Communication: Work closely with cross-functional teams, including developers, operations, and quality assurance, to facilitate effective collaboration, knowledge sharing, and alignment towards common goals.
Documentation and Knowledge Management: Document system configurations, troubleshooting guides, and operational procedures. Contribute to knowledge sharing initiatives and maintain an up-to-date knowledge base.
Incident Prevention and Reliability Engineering: Proactively identify potential system vulnerabilities, performance bottlenecks, and reliability risks. Develop and implement preventive measures, conduct system failure simulations, and participate in architectural reviews.

BENEFITS AND PERKS
Competitive salary package with performance-based incentives.
Flexible working hours and the option for remote work to support a healthy work-life balance.
A collaborative and inclusive work environment that encourages idea sharing, innovation, and personal growth.
Ticket Restaurant and transportation allowance.

QUALIFICATIONS AND REQUIREMENTS
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience in previous roles.
Proven experience in a Service Reliability Engineering (SRE) or related role (DevOps or similar roles are preferable), with a strong focus on managing large-scale, distributed systems.
Solid understanding of Linux/Unix systems.
Basic experience with cloud platforms (e.g., AWS, Azure, GCP).
Knowledge and understanding of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture.
Knowledge of Infrastructure as Code (IaC) for managing complex infrastructures with multiple environments (Terraform, Helm charts).
Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack) and experience in building observability solutions.
Basic understanding of networking principles, protocols (e.g., TCP/IP, HTTP), and load balancing techniques.
Strong problem-solving skills and the ability to analyze complex systems to identify root causes and implement effective solutions.
Excellent communication and interpersonal skills, with the ability to collaborate effectively with diverse teams and stakeholders.
Demonstrated ability to work in a fast-paced, dynamic environment and adapt to changing priorities and technologies.
Experience with serverless computing and event-driven architecture.
Experience with the Python programming language.
Knowledge of security best practices.

PROJECTS YOU WILL WORK ON
Work on the Leaf Space SaaS platform, maintaining, monitoring, and deploying our network.
Work on Core Leaf Space Software, managing a distributed IoT platform.
Work on Edge and innovative technologies, where we must find solutions beyond the state of the art (Stack Overflow or ChatGPT won't help us).

EMPLOYER ELIGIBILITY
Reports to: Head of SRE
Job Location: Leaf Space Headquarters in Lomazzo (Italy).
A balance of on-site and remote work is welcome.
Expected Start Date: ASAP
Job Type: Full Time