We’re so excited to have you on our team as we create the future of Rev.
As you know, our mission is to unlock the full power of human speech. Why? Because we believe in bridging the gap between individuals, communities, and global audiences, and fueling connections and experiences that drive meaningful change. Simply put … every voice deserves to be heard.
We built the most accurate Speech-to-Text ASR in the world, one that surpasses our competitors across 75 languages and counting, as well as across dialects, genders, and subject matters. We did this because, in a world where so much is said and yet so much goes unheard, we believe every perspective is not only valuable, but critical. And while that’s our technology, that is also our culture.
We didn’t transform the speech technology space by following the status quo. We did it by bringing perspectives to the table that were different from ours, and ensuring they had the freedom and responsibility to create, innovate, and transform our product and how we serve our customers. As a part of the Rev team, your work will propel these efforts and continue to help us best serve our customers.
It’s an exciting time to join Rev!
How this role will Serve, Own and Grow at Rev:
Do you want to work at a high-growth company where your impact is seen and rewarded? Are you looking for the autonomy to do your best work?
We are looking for an experienced AI scientist/engineer to join our team at Rev. You must be comfortable building Automatic Speech Recognition (ASR) or Speaker Diarization systems, or be up to date with the latest developments in Large Language Models (LLMs), machine learning, and neural networks as applied to big data. You enjoy working with the latest deep learning technologies and copious audiovisual and textual data, and implementing the latest research findings.
Responsibilities:
As a Rev Speech AI Scientist/Engineer, you will:
Work with a team of engineers and researchers to improve and innovate on the existing ASR, Diarization, (L)LM and NLP infrastructure
Finetune and evaluate existing ASR, (Audio) LLM and NLP models
Exploit and explore our rich data pool, aiming to understand its breadth and characteristics in order to improve learning
Expand and prototype novel Speech and LLM solutions: improve word accuracy, distinguish and leverage speaker characteristics, and dynamically fine-tune and model speech in different acoustic environments
Innovate new approaches and product features
Automate and integrate workflow from diverse systems
Interact with other teams at Rev working towards a shared goal
Qualifications:
University degree in Computer Science, Software Engineering, or related fields
1+ years of experience supporting and working on production ML systems (training models and tuning existing systems)
Fluency in Python, shell scripting, and Linux usage
Familiarity with ASR, Generative Audio or LLM techniques such as neural net architectures (Transformer / LSTM / CTC / Transducer), acoustic and language models, and decoding
Experience with Deep Learning frameworks (such as TensorFlow or PyTorch) and training large models
Good oral and written communication skills
Comfortable working with remote teams as a proactive team member
Nice to have knowledge of:
(Audio) Large Language Models (LLMs), especially training, finetuning, and inference
Use of LLMs for summarization and insights
Efficient training techniques
Monitoring production model performance
Different optimizers for model training
Fine-tuning and knowledge distillation
Speech/language data preparation, curation, correction and maintenance
Speech and NLP techniques for low-resource languages
Foreign languages or linguistics
C++, or other languages such as Rust
Keywords:
Kaldi/K2/Icefall/Lhotse, Wenet, ESPnet, C++, TensorFlow, Python, PyTorch, WFST, end-to-end, neural networks, acoustic modeling, language modeling, LLM, BERT, ELMo, ChatGPT, GPT, RWKV, speaker diarization, speaker identification, Bash, Linux, Jenkins, Airflow, and Docker.
#LI-Remote
We're thrilled to invite you to consider the position of Applied Research Scientist for Speech and Large Language Models (LLM) at Rev! Our mission? To unlock the full power of human speech and ensure that every voice is heard. Imagine working in a company that has developed the world's most accurate Speech-to-Text systems, surpassing competition across 75 languages, dialects, and various subject matters. Here at Rev, innovation is at our core, driven by diverse perspectives and fresh ideas. As you join our vibrant team, you’ll play a crucial role in enhancing our Automatic Speech Recognition (ASR) and Speaker Diarization systems using cutting-edge technologies. You’ll have the autonomy to finetune and evaluate existing models, investigate our rich data pool, and prototype novel speech solutions that adapt and thrive in various acoustic environments. Your expertise in machine learning, deep learning frameworks like TensorFlow or PyTorch, and a strong background in programming will help propel our technology forward. Plus, your collaborative spirit will shine as you interact with other teams to achieve shared goals. At Rev, we value creativity and initiative, making this an exciting time for you to contribute and grow with us!
Rev is an American speech-to-text company based in San Francisco and Austin that provides closed captioning, subtitles, and transcription services.