
Data Pipeline Operations Engineer

We are seeking a detail-oriented and technically skilled Data Pipeline Operations Engineer to manage and execute our weekly scanning process. This critical role ensures the timely flow of customer data through our research, scanning, and UI ingest pipeline. The ideal candidate has a mix of programming, database, and Linux system administration skills to handle the various steps in the scanning workflow.

SixMap is the leading Automated Cyber Defense Platform for continuous threat exposure management (CTEM) across today’s largest, most complex and dynamic enterprise and government environments. With zero network impact and zero agents, SixMap automatically discovers all Internet-facing assets across IPv4 and IPv6 to deliver the most comprehensive external attack surface visibility. The platform identifies vulnerabilities, correlates proprietary and open-source threat intelligence, and provides actionable insights to defend against imminent threats with supervised proactive response capabilities. The SixMap team brings deep intelligence community expertise and best practices to the defense of both U.S. Federal agencies and Fortune 500 corporations.

Responsibilities

    • Manage the weekly scanning process, ensuring customer data progresses through research, scanning, and UI ingest phases according to defined SLAs
    • Prepare input files and kick off processes on the scanning cluster via Airflow (see the DAG sketch after this list)
    • Monitor and troubleshoot jobs, adjusting parameters like rate files as needed to optimize runtimes
    • Perform data ingest into production databases using SQL and Python
    • Clear data artifacts and caches in between ingest cycles
    • Execute post-ingest data refresh routines
    • Perform quality checks on ingested data to validate that contractual obligations are met
    • Identify process bottlenecks and suggest or implement improvements to the automated tooling to increase speed and reliability
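
A minimal sketch of what the Airflow side of this weekly workflow could look like is below. It is illustrative only: the DAG name, task IDs, file paths, and cluster commands are hypothetical, not SixMap's actual pipeline.

```python
# Hypothetical weekly scanning DAG; all names, paths, and commands are
# invented for illustration and do not reflect SixMap's real tooling.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="weekly_scan",                        # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@weekly",                 # matches the weekly cadence
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=15)},
) as dag:
    # Build the target list that the scanning cluster consumes.
    prepare_inputs = BashOperator(
        task_id="prepare_inputs",
        bash_command="python prepare_inputs.py --out /data/scan/targets.txt",
    )
    # Kick off the scan once the inputs are ready.
    run_scan = BashOperator(
        task_id="run_scan",
        bash_command="scan-cluster submit --targets /data/scan/targets.txt",
    )
    prepare_inputs >> run_scan                   # explicit task dependency
```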

Required Skills

    • Strong Linux command line skills
    • Experience with Airflow or similar workflow orchestration tools
    • Python programming proficiency
    • Advanced SQL knowledge for data ingest, refresh, and validation (see the validation sketch after this list)
    • Ability to diagnose and resolve issues with long-running batch processes
    • Excellent attention to detail and problem-solving skills
    • Good communication to coordinate with other teams
    • Flexibility to handle off-hours work when needed to meet SLAs
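
As a concrete illustration of the SQL-plus-Python validation mentioned above, a post-ingest quality check could look like the sketch below. The table, columns, and row-count threshold are hypothetical, and sqlite3 stands in for the production database driver.

```python
# Illustrative post-ingest validation; schema, names, and the row-count
# floor are assumptions for this sketch, not SixMap's actual checks.
import sqlite3

EXPECTED_MIN_ROWS = 1_000  # hypothetical contractual minimum

def validate_ingest(db_path: str, batch_id: str) -> None:
    conn = sqlite3.connect(db_path)
    try:
        row_count, null_assets = conn.execute(
            """
            SELECT COUNT(*),
                   COALESCE(SUM(CASE WHEN asset_id IS NULL THEN 1 ELSE 0 END), 0)
            FROM scan_results
            WHERE batch_id = ?
            """,
            (batch_id,),
        ).fetchone()
        if row_count < EXPECTED_MIN_ROWS:
            raise ValueError(f"batch {batch_id}: only {row_count} rows ingested")
        if null_assets:
            raise ValueError(f"batch {batch_id}: {null_assets} rows missing asset_id")
    finally:
        conn.close()
```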

Preferred Additional Skills

    • Familiarity with network scanning tools and methodologies
    • Experience optimizing database performance
    • Scripting skills to automate routine tasks
    • Understanding of common network protocols and services
    • Knowledge of AWS services like EC2

Benefits

    • Competitive compensation packages, including equity
    • Employer paid medical, dental, vision, disability & life insurance
    • 401(k) plans
    • Flexible Spending Accounts (health & dependents)
    • Unlimited PTO
    • Remote Working Options

Average salary estimate

$100,000 / year (est.)
min: $80,000
max: $120,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

What You Should Know About Data Pipeline Operations Engineer, SixMap, Inc.

Join SixMap as a Data Pipeline Operations Engineer and play a pivotal role in managing our weekly scanning process! In this exciting position, you'll ensure that customer data flows smoothly through our research, scanning, and UI ingest pipeline. We're looking for someone detail-oriented and technically skilled, equipped with programming, database, and Linux system expertise. Your contributions will help us maintain the high standards of our Automated Cyber Defense Platform, which protects complex enterprise and government environments from cyber threats. You'll be responsible for preparing input files, monitoring jobs, and troubleshooting any issues that may arise during scanning. Your familiarity with Airflow will be helpful as you kick off processes on our scanning cluster. Additionally, your proficiency in SQL and Python will be put to good use as you perform data ingests and validate the quality of ingested data. Furthermore, your attention to detail will be crucial in identifying and resolving bottlenecks within the process. At SixMap, we offer competitive compensation packages, including equity and comprehensive benefits, while providing a flexible working environment. If you’re ready to make a significant impact in the world of cybersecurity, we want to hear from you!

Frequently Asked Questions (FAQs) for Data Pipeline Operations Engineer Role at SixMap, Inc.
What are the primary responsibilities of a Data Pipeline Operations Engineer at SixMap?

As a Data Pipeline Operations Engineer at SixMap, your main responsibilities include managing the weekly scanning process, ensuring the timely and correct progression of customer data through research, scanning, and UI ingest phases. You'll prepare input files, monitor and troubleshoot jobs on the scanning cluster utilizing Airflow, and perform data ingests into production databases using SQL and Python. Ensuring the quality and validation of the ingested data is also a key task, alongside identifying process bottlenecks for continuous improvement.

What qualifications do I need to become a Data Pipeline Operations Engineer at SixMap?

To qualify for the Data Pipeline Operations Engineer position at SixMap, you should have strong Linux command line skills, proficiency in Python programming, and advanced SQL knowledge for data ingestion and validation. Experience with workflow orchestration tools like Airflow is essential, along with the ability to troubleshoot long-running batch processes. Flexibility to handle off-hours work is also important to meet SLAs. Familiarity with network scanning tools and AWS services can be an added advantage.

What programming languages and tools should I be familiar with as a Data Pipeline Operations Engineer at SixMap?

As a Data Pipeline Operations Engineer at SixMap, you should be proficient in Python for programming tasks and have advanced SQL skills for data handling. Familiarity with Airflow or similar workflow orchestration tools is crucial as it will assist in managing and monitoring the job execution processes. Experience with Linux command line operations is also necessary to navigate the system effectively.

What are the challenges faced by a Data Pipeline Operations Engineer at SixMap?

Challenges you may encounter as a Data Pipeline Operations Engineer at SixMap involve managing the complexity of data ingestion workflows, troubleshooting issues in a timely manner, and identifying bottlenecks in the scanning processes. You must also ensure that all data processing meets contractual obligations while optimizing for efficiency and speed. This position may require flexibility in handling uncertainties that arise outside of regular hours to meet service-level agreements.

How does the role of Data Pipeline Operations Engineer contribute to SixMap's mission?

The role of a Data Pipeline Operations Engineer at SixMap is crucial as it directly supports our mission to enhance cybersecurity for enterprises and government agencies. By ensuring seamless data flow through our scanning processes, you are enabling SixMap to deliver comprehensive visibility into external attack surfaces, identify vulnerabilities, and provide actionable insights. Your efforts in maintaining data integrity and performance optimizations are key to our proactive defense strategies against imminent threats.

Common Interview Questions for Data Pipeline Operations Engineer
Can you explain your experience with Linux and how it applies to the Data Pipeline Operations Engineer role?

When discussing your Linux experience, highlight specific tasks you've completed on the command line, such as managing files, troubleshooting systems, or automating processes. Mention how these skills will help you efficiently handle data ingestion workflows and resolve issues at SixMap.

How do you prioritize tasks when handling multiple data workflows?

To answer this question, share your strategy for prioritization, such as assessing deadlines, impact on stakeholders, and aligning with SLAs. Discuss any tools or methodologies you use to stay organized, ensuring that no critical items fall through the cracks.

Describe a time you identified and resolved a bottleneck in a data pipeline.

Share a specific example where you analyzed the data flow, recognized inefficiencies, and implemented a solution—be it optimizing a process or introducing new tools. Highlight the outcomes of your actions, emphasizing improvements in speed and reliability.

What strategies do you use to troubleshoot issues with long-running batch processes?

Provide a step-by-step approach you take to troubleshoot, including monitoring logs, checking system resources, and testing various parameters. Mention any tools you've used to facilitate these tasks and how your methodological approach has led to successful resolutions.
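
For example, one generic way to spot a stalled long-running job is to watch its log file for inactivity. The sketch below is hypothetical; the log path and stall threshold are invented for illustration.

```python
# Toy stall detector for a long-running batch job: if the log file has
# not been written to recently, flag the job as possibly hung.
import os
import time

LOG_PATH = "/var/log/scan/current.log"  # hypothetical log location
STALL_SECONDS = 900                     # assumed threshold: 15 minutes

def watch(log_path: str = LOG_PATH) -> None:
    while True:
        age = time.time() - os.path.getmtime(log_path)
        if age > STALL_SECONDS:
            print(f"WARNING: no log activity for {age:.0f}s; job may be stalled")
        time.sleep(60)
```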

How familiar are you with Airflow and its features?

Discuss your experience with Airflow or similar orchestration tools by outlining how you've used them to schedule, monitor, and manage workflows. Be sure to highlight specific features you have leveraged, such as task dependencies and retry mechanisms, to streamline data processing.

Can you explain the importance of data validation in the data pipeline process?

Emphasize data validation's role in ensuring data quality and accuracy. Talk about methods you use for validating data after ingestion and how this contributes to meeting client contractual obligations and maintaining the integrity of the process at SixMap.

What scripting languages have you utilized to automate routine tasks?

Discuss any scripting languages you are adept with, particularly emphasizing Python. Provide examples of automation you have implemented in past roles that improved efficiency and reduced manual intervention.

Describe your approach to communicating with cross-functional teams.

Emphasize the importance of clear and consistent communication. Talk about how you keep team members informed of issues and progress, and how you leverage tools for transparency. Share a specific scenario where effective communication resulted in a successful outcome.

How would you approach learning new technologies or tools pertinent to the Data Pipeline Operations Engineer role?

Highlight your proactive approach to continuous learning. Discuss resources you use, such as courses, documentation, or community forums, and how you apply this newfound knowledge to real-world scenarios.

What is your experience with database performance optimization?

Discuss your previous experiences optimizing database performance. Include techniques you employed, such as indexing, query optimizations, or partitioning, and the results you achieved through these practices.
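
As a small, self-contained illustration of the indexing technique, the snippet below uses sqlite3 (standing in for a production database) with an invented schema to show a query plan switching from a full table scan to an index search.

```python
# Toy demonstration of index-driven query optimization; the schema is
# invented and sqlite3 substitutes for a production database engine.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_results (batch_id TEXT, asset_id TEXT, port INTEGER)")

# Without an index, filtering on batch_id scans the whole table.
print(conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM scan_results WHERE batch_id = 'b1'"
).fetchall())  # plan detail reads roughly: SCAN scan_results

# With an index, the engine can seek directly to matching rows.
conn.execute("CREATE INDEX idx_batch ON scan_results (batch_id)")
print(conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM scan_results WHERE batch_id = 'b1'"
).fetchall())  # plan detail reads roughly: SEARCH ... USING INDEX idx_batch
```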

Employment type: Full-time, remote
Date posted: December 4, 2024
