**Experienced Full Stack Site Reliability Engineer – Cloud Infrastructure Development and Optimization**

Posted 2025-10-26
Remote, USA | Full Time | Immediate Start
**Join the Gigster Talent Network and Unlock Your Potential**

Are you a highly skilled and experienced Site Reliability Engineer (SRE) looking for a new challenge? Do you want to work on cutting-edge projects with the world's best IT engineers? Do you wish you could control which projects you work on and choose your own pay rate? If so, the Gigster Talent Network is for you.

**About Gigster**

Gigster is a platform that connects top-tier international companies with the world's best software developers. Our model is unique in the software development industry: you choose from a large variety of 'Gigs' that fit your interests, skills, and career goals. Whether you're looking for part-time, short-term projects or full-time, long-term, open-ended roles, we've got you covered.

**About the Role**

We're seeking an experienced Full Stack Site Reliability Engineer to join our Gigster Talent Network. As a Staff SRE, you will play a pivotal role in shaping infrastructure for our client and driving initiatives that improve overall service quality. You will be responsible for ensuring the reliability, scalability, and performance of our critical systems and services.

**Key Responsibilities**

* **System Design and Architecture**: Design, build, and maintain scalable and reliable infrastructure. Collaborate with engineering teams to ensure systems are designed with reliability and scalability in mind. Evaluate and integrate new technologies to enhance our infrastructure.
* **Monitoring and Incident Management**: Implement and maintain monitoring and alerting systems to detect and respond to issues promptly. Lead incident response efforts, ensuring quick resolution and effective communication. Conduct post-incident reviews and drive improvements based on findings.
* **Automation and Optimization**: Architect and build automation projects from scratch (preferably in Python or Go) to reduce day-to-day SRE toil. Create Bash scripts to automate manual activities such as upgrades, status checks, and deployments. Develop and maintain infrastructure as code (IaC) using tools such as Terraform, Ansible, or similar. Automate repetitive tasks and processes to improve efficiency and reduce manual intervention.
* **Collaboration and Mentorship**: Collaborate with cross-functional teams to deliver high-quality products and services. Mentor and guide junior SREs and other team members. Advocate for best practices in reliability engineering across the organization.
* **Continuous Improvement**: Drive initiatives to improve service reliability, capacity, and performance. Participate in capacity planning and disaster recovery exercises. Stay current with industry trends and emerging technologies.

**Essential Qualifications**

* Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
* At least 8 years of industry experience as a Software Engineer, SRE, or Platform Engineer
* At least 3 years of experience as a Platform Engineer or SRE
* Proven experience in managing large-scale, mission-critical infrastructure

**Preferred Qualifications**

* Master's degree in Computer Science, Engineering, or a related field
* 10+ years of industry experience as a Software Engineer, SRE, or Platform Engineer
* Experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Docker, Kubernetes)
* Strong knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack)
* Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI)

**Technical Skills**

* Deep understanding of Linux/Unix systems and networking
* Proficiency in at least one programming language (e.g., Python, Go, Java)
* Intermediate to expert skill in Bash scripting
* Experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Docker, Kubernetes)
* Strong knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack)
* Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI)

**Soft Skills**

* Excellent problem-solving skills and a proactive attitude
* Strong communication and collaboration skills
* Ability to work independently and as part of a team
* Demonstrated leadership and mentoring abilities

**Work Environment and Culture**

* Working hours: 8 am to 5 pm Pacific Time
* Willingness to join an on-call rotation
* Collaborative and dynamic work environment
* Opportunity to work with top-tier international companies
* Access to cutting-edge technologies and tools

**Compensation and Benefits**

* Competitive salary and benefits package
* Opportunity to choose from a variety of 'Gigs' that fit your interests and skills
* Access to professional development and training opportunities

**Career Growth Opportunities**

* Opportunity to work on cutting-edge projects with the world's best IT engineers

**How to Apply**

If you're a highly skilled and experienced Site Reliability Engineer looking for a new challenge, we want to hear from you. Apply now to join the Gigster Talent Network and unlock your potential.

**Note:** The application process includes an English Proficiency Assessment, a Technical Assessment, a Recruiter Screen, and a Technical Interview. We strive to move efficiently from step to step so that the recruitment process is as fast as possible.