STAFFXPERT LLC is seeking a System Engineer / Site Reliability Engineer (SRE) on behalf of our client for a remote position based in the United States. This role is ideal for a highly experienced IT professional with a strong background in enterprise monitoring, incident management, observability, and infrastructure reliability.
Responsibilities:
- Monitor, analyze, and troubleshoot enterprise applications and infrastructure to improve system reliability and performance
- Perform incident triage, root cause analysis, and resolution for complex production issues
- Utilize enterprise monitoring and observability tools to identify operational risks and recommend improvements
- Collaborate with application owners, DevOps teams, infrastructure engineers, and network administrators to resolve system issues
- Analyze application workflows, dependencies, and system behavior across distributed environments
- Support enterprise reliability initiatives by identifying trends, documenting findings, and implementing operational best practices
- Work with cloud, middleware, database, and operating system technologies to support business-critical applications
- Partner with development and security teams during advanced troubleshooting and service investigations
- Create and maintain technical documentation, reports, and operational recommendations
Requirements:
- 8+ years of experience supporting enterprise-scale systems, infrastructure, or application reliability initiatives
- 8+ years of experience in system monitoring, troubleshooting, incident management, and production support
- 3+ years of hands-on experience with two or more enterprise monitoring tools such as Dynatrace, Splunk, SolarWinds, or ServiceNow Operator Workspace
- Strong technical expertise in one or more of the following areas: Windows Administration, Unix/Linux Systems, Network Engineering, AWS or Azure Cloud Platforms, WebSphere Middleware, Java/JavaScript Development, Oracle or Microsoft SQL Databases
- Experience supporting SaaS, PaaS, cloud-native, or virtualized environments
- Proven ability to independently solve complex technical challenges and lead troubleshooting efforts
- Strong communication, collaboration, and analytical skills
- Proficiency with Microsoft Office tools including Word, Excel, and PowerPoint
- High School Diploma/GED with 20+ years of relevant experience OR Master's degree in Computer Science, Engineering, or related technical field with 10+ years of relevant experience
Preferred Qualifications:
- Experience with distributed systems, microservices, and cloud-native application environments
- Familiarity with test-driven development (TDD) practices
- Experience with tools such as Oracle Enterprise Manager, Riverbed Aternity, or ServiceNow VTBs
- Experience working with remote or virtual teams
- Strong critical thinking and problem-solving abilities
- Public Trust Clearance is a plus
- Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or related technical discipline preferred
- Equivalent combination of education and relevant professional experience will also be considered