
Platform & HPC Data Engineer

RGi is searching for a talented Platform and HPC Data Engineer to join our team, where you will play a key role in the design, implementation, and optimization of data management solutions within high-performance computing (HPC) environments. We are looking for a candidate with substantial experience in diverse file systems, data labeling and tagging systems, as well as the configuration of various storage appliances.

In this position you will be responsible for ensuring that data workflows, storage configurations, and metadata management are not only efficient and scalable but also adhere to organizational and government security standards. As part of a dynamic, cross-disciplinary team, you will help address the technical demands of HPC platforms, effective data management, and large-scale computational workflows. Join us in advancing our innovative solutions and shaping the future of high-performance computing!


Clearance:

Active Top Secret clearance with willingness and ability to obtain an SCI and CI polygraph

US Citizenship required


As a Platform & HPC Data Engineer you will...
  • Design and implement data management systems and architectures for HPC platforms, focusing on optimizing data flow, storage, and access in large-scale computing environments.
  • Oversee the configuration, maintenance, and optimization of distributed file systems (e.g., Lustre, IBM Spectrum Scale (GPFS), NFS) and storage solutions used in HPC environments to ensure efficient performance, scalability, and reliability.
  • Implement and manage metadata-driven systems for data labeling/tagging. This includes the development of strategies for classifying, indexing, and organizing datasets to enhance data discoverability, access control, and auditing.
  • Configure and maintain various storage appliances (e.g., NetApp, Dell EMC, HPE) and integrated storage solutions. Ensure that storage devices are optimized for performance, capacity, and availability within the HPC ecosystem.
  • Integrate data storage and management systems with HPC clusters, ensuring seamless data flow between compute nodes and storage appliances. Optimize data pipelines to support high-throughput workloads and minimize bottlenecks in I/O performance.
  • Monitor and improve the performance of storage systems, focusing on I/O throughput, latency, and efficient resource allocation. Use performance metrics to guide optimizations across storage appliances and file systems.
  • Implement security best practices for data access, protection, and management, ensuring compliance with government regulations and internal data governance policies. Configure encryption, access control, and secure data sharing methods.
  • Develop and maintain automation scripts (e.g., using Python, Bash, or Perl) to streamline storage configurations, data labeling/tagging, and system monitoring tasks. Automate processes related to data integration and HPC platform management.
  • Work closely with data scientists, HPC administrators, software developers, and other technical staff to support ongoing projects. Provide expertise in troubleshooting data storage issues and ensuring optimal system performance.
  • Maintain thorough documentation for storage configurations, file system setups, data labeling/tagging procedures, and performance optimization strategies. Provide regular reports on system health, data management processes, and any improvements made.


Platform & HPC Data Engineer Qualifications:
  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field. A Master’s degree or higher is a plus.
  • 7+ years of experience in managing data infrastructure in HPC environments, with expertise in file systems, storage appliances, and data workflows.
  • Hands-on experience with distributed file systems, including Lustre, IBM Spectrum Scale (GPFS), NFS, and others commonly used in HPC settings.
  • Proven experience with storage appliance configuration (e.g., NetApp, Dell EMC, HPE, or similar systems), including performance tuning, capacity management, and reliability.
  • Strong experience in implementing data labeling/tagging systems, metadata management, and structuring large datasets for efficient access and compliance.
  • Knowledge of high-performance networking protocols (e.g., InfiniBand, RDMA) and their role in data transfer and storage optimization.
  • Familiarity with data transfer tools and protocols such as GridFTP, rsync, and NFS for large-scale data movement.


Additional Skills We'd Like to See:
  • Experience with cloud storage integration or hybrid cloud environments, with knowledge of cloud-native storage solutions (e.g., AWS S3, Ceph, OpenShift).
  • Familiarity with high-performance computing (HPC) schedulers (e.g., SLURM, PBS, Torque) and their interaction with data storage systems.
  • Understanding of data protection mechanisms, including data replication, backup strategies, and disaster recovery in HPC environments.
  • Experience with containerization (Docker, Singularity) in an HPC context for data processing and application deployment.
  • Experience with machine learning or data science workflows in HPC environments.


Who we are:

Reinventing Geospatial, Inc. (RGi) is a fast-paced small business that has the environment and culture of a start-up, with the stability and benefits of a well-established firm. We solve complex problems within geospatial software development and national defense to make an Immediate Impact for our nation’s soldiers and analysts.


We pride ourselves on giving employees an exceptional life experience, where creativity thrives, and challenges are simply part of the fun. We provide truly excellent benefits, including:


  • 100% paid employee healthcare & dental insurance
  • Paid parental leave
  • 401k with matching
  • Escalating vacation time
  • Referral bonuses
  • Tuition reimbursement
  • Professional development training
  • Free beverages and snacks
  • Weekly catered lunches and breakfast on Fridays

 

Grow to be our next leader:

At RGi, fostering a strong and organic corporate culture is paramount and serves as a compass for the decisions we make and how we operate the company. We believe our culture of camaraderie, innovation, and collaboration reflects the caliber of our employees and their dedication to the mission of providing quality software to our customers. As such, we want our employees to feel empowered to seek growth and leadership opportunities within the company, which positions us to maintain our culture as we grow. RGi provides opportunities, resources, training, and mentorship to all our employees so they can take control of their careers and become leaders or crucial members of our company. If this is what you are looking for in a company, then you are what we are looking for in an employee.


Reinventing Geospatial, Inc. is an Equal Opportunity Employer committed to hiring and retaining a diverse workforce. We make decisions without regard to race, color, religion, sex, national origin, age, veteran status, disability, or any other protected class. U.S. Citizenship is required for all positions.

Average salary estimate

$135,000 / yearly (est.)
min: $120,000
max: $150,000


What You Should Know About Platform & HPC Data Engineer, Reinventing Geospatial, Inc. (RGi)

Are you ready to take your expertise as a Platform & HPC Data Engineer to the next level? RGi, based in Herndon, VA, is excited to welcome a talented professional to our innovative team. Here, you'll be at the forefront of designing, implementing, and optimizing cutting-edge data management solutions within high-performance computing (HPC) environments. Your experience with diverse file systems and data management strategies will be invaluable as you ensure that data workflows are efficient, scalable, and secure, all while adhering to our organizational and government standards. You'll be working alongside a dynamic, cross-disciplinary team, tackling the technical demands of HPC platforms and large-scale computational workflows. In this role, you'll configure and maintain various storage appliances, manage metadata-driven systems, and optimize data pipelines, all to streamline workflows for high-throughput computing. Beyond technical skills, we value creativity and a collaborative spirit. You'll also have the opportunity to advance your expertise by automating tasks and integrating new technologies. At RGi, we’re not just about the work; we’re about fostering a culture of growth, where you can develop your career and take on leadership roles within our supportive environment. If you have an active Top Secret clearance, a passion for HPC, and a drive to make an impact, we’d love to hear from you!

Frequently Asked Questions (FAQs) for Platform & HPC Data Engineer Role at Reinventing Geospatial, Inc. (RGi)
What are the main responsibilities of a Platform & HPC Data Engineer at RGi?

As a Platform & HPC Data Engineer at RGi, your key responsibilities include designing and implementing data management systems tailored for HPC platforms, optimizing data flow and access, configuring distributed file systems like Lustre and NFS, and managing metadata-driven systems. You will also oversee storage appliances, ensuring they operate efficiently within the HPC framework while upholding security best practices.

What qualifications are required for the Platform & HPC Data Engineer position at RGi?

To qualify for the Platform & HPC Data Engineer position at RGi, you should have a Bachelor’s degree in Computer Science or a related field, along with at least 7 years of experience managing data infrastructure in HPC settings. Proficiency in distributed file systems, storage appliance configuration, and metadata management is essential, alongside an active Top Secret clearance.

What technologies and tools do Platform & HPC Data Engineers at RGi work with?

At RGi, Platform & HPC Data Engineers work with a variety of technologies and tools, including distributed file systems like Lustre and IBM Spectrum Scale, storage solutions from NetApp and Dell EMC, and automation scripts using Python, Bash, or Perl. Familiarity with high-performance networking protocols and data access protocols is also valuable.

What is the work environment like for a Platform & HPC Data Engineer at RGi?

The work environment for a Platform & HPC Data Engineer at RGi is dynamic and collaborative. You will be part of a cross-disciplinary team that encourages creativity and innovation. We value work-life balance and provide excellent benefits in a culture that supports professional growth and leadership development.

What opportunities for advancement exist for Platform & HPC Data Engineers at RGi?

At RGi, we prioritize employee growth and offer numerous opportunities for advancement. As a Platform & HPC Data Engineer, you can access mentorship, professional development training, and leadership roles, allowing you to take control of your career while contributing significantly to our mission.

Common Interview Questions for Platform & HPC Data Engineer
Can you describe your experience with distributed file systems in HPC environments?

When answering this question, provide specific examples of distributed file systems you've managed, such as Lustre or NFS. Discuss the challenges you faced, how you optimized performance, and any improvements you achieved. Highlight your ability to troubleshoot and maintain these systems to demonstrate your expertise.

How do you approach security in data management for HPC systems?

Discuss your understanding of security best practices, including encryption, access control, and compliance with regulations. Share specific strategies you've implemented to protect data integrity and ensure secure data sharing, emphasizing your proactive approach to maintaining security within HPC environments.

What strategies do you use to optimize data workflows and pipelines?

Explain the methodologies you employ to optimize data workflows, such as automating processes, monitoring performance metrics, and identifying bottlenecks. Provide examples of successful optimizations you have executed that improved throughput and reduced latency in HPC contexts.

Can you provide an example of a complex data labeling/tagging system you've implemented?

When responding, describe a specific project involving a data labeling/tagging system. Detail how you structured the dataset, the tools you used for metadata management, and the impact this system had on data discoverability and compliance within the HPC framework.

What is your experience with storage appliances, and how do you ensure their performance?

Share your hands-on experience with storage appliances, detailing the brands and models you've worked with. Discuss how you regularly tune performance, manage capacity, and conduct reliability tests. Mention tools or metrics you utilize to monitor and maintain their performance comprehensively.

How do you stay updated on the latest trends in high-performance computing?

Talk about your methods for staying current, such as participating in industry conferences, following relevant journals, or being part of professional groups. Explain how you have applied new knowledge from these sources to improve HPC solutions in your past roles.

Describe your experience working in cross-disciplinary teams.

Provide examples of successful collaboration with data scientists, software developers, and HPC admins. Discuss how you facilitated communication, shared knowledge, and addressed challenges. Highlight the importance of teamwork in achieving project goals and delivering effective data management solutions.

How do you approach troubleshooting data storage issues?

Outline your systematic approach to troubleshooting, including identifying the problem, analyzing performance metrics, and implementing solutions. Share a specific incident where your troubleshooting skills resolved a significant issue and led to improved system performance.

What automation scripts have you developed, and what tasks do they streamline?

Discuss specific automation scripts you've created, explaining the tasks they streamline, such as storage configuration or system monitoring. Share the programming languages you used (like Python or Bash) and the impact these automations had on efficiency and productivity.

Can you explain your experience with cloud storage integration in HPC?

Describe your familiarity with cloud storage solutions and how you have integrated them into HPC environments. Discuss any challenges you faced during the integration process and how you addressed them while optimizing data workflows and ensuring performance.

Employment type: Full-time, on-site
Date posted: January 13, 2025
