[Remote] Site Reliability Specialist
Note: This is a remote position open to candidates in the USA. Company 3 provides a full range of Creative Services for content creators. The Site Reliability Specialist builds, maintains, and optimizes CI/CD pipelines, ensures reliable software releases, and supports cloud infrastructure.
Responsibilities
- Configure and maintain development, staging, and production environments
- Implement and maintain monitoring solutions for system health and performance
- Investigate and resolve infrastructure and deployment issues
- Maintain infrastructure documentation and operational runbooks
- Work with development teams to improve build and deployment processes
- Support AWS services including Lambda, S3, DynamoDB, and API Gateway
- Implement security best practices in infrastructure and deployment processes
- Perform other functions as needed
Skills
- Advanced Linux proficiency
- Advanced command-line operations
- System administration
- Shell scripting
- Familiarity with GitHub Actions, GitLab CI, Jenkins, or similar CI/CD pipeline tools
- Strong experience with Git for source code and infrastructure management
- Basic understanding of Terraform for cloud resource management
- Basic knowledge of AWS services and cloud computing concepts
- Understanding of Docker and container orchestration basics
- Proficiency in Python and shell scripting for automation
- Systematic approach to troubleshooting infrastructure and deployment issues
- Clear documentation and effective collaboration with development teams
- Ability to perform repeatable infrastructure tasks with consistent accuracy
- Terraform Experience: Hands-on experience with Terraform for infrastructure provisioning
- AWS Certification: AWS Cloud Practitioner or Solutions Architect Associate certification
- Database Knowledge: Understanding of DynamoDB, SQL databases, and data storage concepts
- Monitoring Tools: Experience with CloudWatch, Prometheus, Grafana, or similar monitoring solutions
- Security Practices: Knowledge of cloud security best practices and compliance requirements
- API Management: Understanding of API Gateway, REST APIs, and microservices architecture
- Testing Frameworks: Experience with pytest, unittest, or similar testing frameworks
- Agile Methodologies: Experience working in an Agile/Scrum development environment
Benefits
- Comprehensive package of health, retirement, and insurance benefits
- Paid time off
- Retirement
- Select insurance benefits
- Health benefits