Job details

Data Engineer

We are a dedicated team developing Large Language Models, one of the most prestigious and advanced ongoing Natural Language Processing projects in the world. Our team is responsible for the entire data engineering pipeline—from the collection of raw text data, through preprocessing and storage, to serving the data for model training and deployment. Additionally, our Data Engineering team is actively involved in Optical Character Recognition and image processing.

Our focus is on the rapid development and deployment of state-of-the-art information retrieval systems to meet complex information needs. As a Data Engineer, you will play a critical role in our team, owning the core data engineering tasks in our product pipeline. You will collaborate closely with cross-functional teams to provide innovative solutions to real-world problems.

To succeed in this role, you'll need a results-driven mindset, a passion for excellence, and a continuous desire to learn and improve. Your key responsibilities in this project will include:

  • Utilizing programming languages such as Python, R, Scala, etc., to analyze data and build statistical models.
  • Providing insights, metrics, and explanations for data variance through your technical expertise.
  • Building knowledge graphs and services to support the information retrieval process.
  • Implementing best-practice data quality assurance mechanisms.

Qualifications we are looking for:

  • Bachelor’s degree in Computer Engineering, Software Engineering, or an equivalent field.
  • 3+ years of experience with data cleaning, preprocessing, and data architecture, especially with big data
  • 3+ years of coding experience in at least one modern programming language (Python is preferred; R, Ruby, Scala, Java, etc. are also acceptable)
  • Extensive knowledge and practical experience in several of the following areas: machine learning, statistics, deep learning, recommendation systems, information retrieval, data preparation, and web crawling
  • Basic NLP skills (e.g., word embeddings, language models) to facilitate communication between end users and data. Knowledge of Large Language Models and their data preparation steps is a plus
  • Basic knowledge of NoSQL databases, with a preference for Elasticsearch and MongoDB. Experience with RDBMS such as PostgreSQL or MySQL is also valuable
  • Experience with data visualization tools. Familiarity with Grafana and Airflow is a strong plus
  • Basic knowledge of Apache Spark and Hadoop is a big advantage
  • Proficiency in Linux-based OS operations
  • A solid understanding of search-related business scenarios and core technologies
  • A passion for sharing knowledge and the confidence to seek help when needed
  • Fluency in both written and spoken English
  • Experience mentoring or leading teams of 5+ members, and providing technical or professional guidance, is a plus
  • An eagerness to learn new technologies is highly valued

Huawei is a global provider of information and communications technology (ICT) infrastructure and smart devices. Huawei is headquartered in Shenzhen, China.

EMPLOYMENT TYPE
Full-time, remote
DATE POSTED
October 14, 2024
