Job details

Site Reliability Engineer - Americas

BforeAI is an innovative and rapidly expanding scale-up dedicated to deterring cybercrime through cutting-edge predictive and pre-emptive technologies. We harness the power of prescriptive AI to revolutionize the way we tackle cyber threats, particularly in the realm of brand protection.

Named by Gartner in 26 reports over the last 2 years, BforeAI is the industry’s fastest, most accurate solution for automated protection against online fraud.

We are like weather forecasts for cyber threats. Join us in the fight for a safer cyberspace!

✨ What’s cool about this job

As an SRE at BforeAI, you will be a critical part of our technology team, responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure and applications. Your expertise in Kubernetes, networking, security, cloud environments, and database optimization will be essential for maintaining our high-traffic, data-intensive systems.

Please note: this role can be based anywhere in the Americas. A specific country is listed only because job boards require one.

📣 What you’ll be doing

Architect, deploy, and manage Kubernetes clusters, ensuring high availability, scalability, and reliability to meet organizational demands.
Drive performance improvements for database systems through advanced query optimization, indexing strategies, and efficient caching mechanisms.
Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, or equivalent technologies to enable consistent, automated, and scalable deployments.
Implement and manage robust monitoring and alerting systems to proactively maintain system health and ensure optimal performance.
Enforce cloud environment best practices for security, access control, and compliance with regulatory standards.
Establish, maintain, and own our incident management procedures.
Partner with engineering teams to support their infrastructure needs, ensuring alignment with SRE practices and system requirements.
Ensure our infrastructure and products are resilient and recoverable by establishing and maintaining resiliency and recovery procedures.
Establish and maintain SRE best practices, and remove blockers to system reliability.
Create and maintain detailed documentation for configurations, processes, and procedures to ensure transparency and knowledge sharing across teams.

💥 You’ll be a great fit if

You have 8+ years of experience in SRE, system administration, or similar roles.
You are an expert in Kubernetes, with hands-on experience in cluster setup, management, and maintenance, ideally backed by certifications such as Certified Kubernetes Administrator (CKA) and/or Certified Kubernetes Security Specialist (CKS).
You are proficient in performance optimization and administration of databases such as PostgreSQL, MySQL, or similar.
You have experience with Infrastructure as Code (IaC) tools such as Terraform (ideally with a HashiCorp Terraform certification), Ansible, or similar.
You have experience with monitoring and logging tools such as Splunk, Prometheus, Grafana, Datadog, ELK, Logstash, Fluentd, etc.
You have experience with incident response tools such as PagerDuty, Opsgenie, etc.
You have experience with cloud platforms, such as AWS, Azure, or GCP, ideally supported by an architect-level certification from at least one provider.
You have experience with secrets management tools such as HashiCorp Vault, CyberArk Conjur, AWS Secrets Manager, etc.
You have strong problem-solving and troubleshooting skills.
You are a strong communicator with the ability to collaborate across multi-disciplinary global teams.
You have RHCSA (Red Hat Certified System Administrator) and/or RHCE (Red Hat Certified Engineer) certification.

Don't meet every single requirement? Don't count yourself out just yet. Studies show some individuals are less likely to apply to jobs unless they meet every qualification. At BforeAI, we're dedicated to building a diverse workplace based on merit, work ethic, and character, and we believe everyone deserves a fair shot at success!

If you're excited about this role but your past experience doesn't align perfectly with every qualification, we hope you’ll still consider applying!

We use an Employer of Record (EOR) service to facilitate seamless global hiring and to offer benefits tailored to the country where you will be working. For countries not supported by our EOR partner, talk to us about being a contractor. In all cases, you will need to be authorized to work in the country you’re based in.

We offer a compensation package of up to $110,000 USD per year in CTC (Cost to Company). Cost to Company represents our total investment, which includes all benefits and employer contributions. The final take-home pay will differ due to local tax regulations, selected benefits, and mandatory deductions. The actual offer will be based on the role level and on the candidate's skills and experience.

🚀 Why it’s great to work here

We are a location-independent company (no physical office required) and we operate as a fully distributed team. We deeply believe in the value of diversity and inclusivity within our workplace, understanding that these principles lead to a happier team and ultimately a superior product. We offer an intellectually stimulating company environment and you’ll be working with a bright, dedicated team from across the globe.

If you possess a high level of autonomy and self-organization, and feel you can thrive at BforeAI, we’d love to hear from you!

💡 Want to know more about BforeAI?

What You Should Know About Site Reliability Engineer - Americas, BforeAI

Join BforeAI as a Site Reliability Engineer in the Americas, where you'll play a pivotal role in our mission to combat cybercrime using advanced predictive technologies. At BforeAI, we are committed to redefining the landscape of online protection with our prescriptive AI solutions, which have been named by Gartner in 26 reports over the past two years. As a key member of our technology team, you'll ensure the reliability, scalability, and performance of our cloud infrastructure and applications. Your strong expertise in Kubernetes, networking, security, cloud environments, and database optimization will help us maintain our high-traffic systems. Your day-to-day will involve architecting and managing Kubernetes clusters, optimizing database performance, and developing Infrastructure as Code to ensure seamless deployments. You’ll implement robust monitoring and alerting systems, enforce cloud best practices, and collaborate with engineering teams to meet their infrastructure needs. You’ll have the flexibility to work remotely from anywhere in the Americas, and your contributions will help establish best practices that maintain system reliability and resilience. If you’re an experienced SRE with a passion for cyber defense, we can’t wait for you to join our diverse and talented team dedicated to making the internet a safer place!

Frequently Asked Questions (FAQs) for Site Reliability Engineer - Americas Role at BforeAI

What are the primary responsibilities of a Site Reliability Engineer at BforeAI?

As a Site Reliability Engineer at BforeAI, your key responsibilities will include architecting and managing Kubernetes clusters, optimizing database performance, developing Infrastructure as Code, implementing monitoring solutions, and collaborating across teams to ensure system reliability. You'll also be responsible for incident management and documentation, helping to build a robust and efficient infrastructure.