
Senior AI Architect (Custom Language Model Development)

Location: Jerusalem / hybrid

Our client is an early-stage deep-tech startup at the forefront of digital health, developing a groundbreaking Large Language Model for decoding 100% of the human genome, with the potential to analyze whole-genome sequencing data in real time.


We're seeking an experienced AI architect to spearhead the development of custom large-scale language models, focusing on advanced transformer architectures and efficient tokenization strategies.


Core Responsibilities:

  • Architect and implement custom transformer models from scratch, optimizing for scale and efficiency

  • Develop and refine tokenization pipelines, with a focus on BPE and its variants

  • Innovate on attention mechanisms and positional encoding techniques

  • Optimize input processing strategies, balancing padding, truncation, and dynamic approaches

  • Design and implement custom loss functions and training regimes for large-scale language modeling (see the sketch immediately below)
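
To give a flavor of that last responsibility, here is a minimal sketch of a custom language-modeling loss in PyTorch. It is illustrative only: pad_id, the reserved padding id of 0, and the optional class weighting are hypothetical placeholders, not the client's actual training setup.

```python
import torch
import torch.nn.functional as F

def lm_loss(logits: torch.Tensor, targets: torch.Tensor,
            pad_id: int = 0,
            class_weights: torch.Tensor | None = None) -> torch.Tensor:
    """Token-level cross-entropy that ignores padding positions."""
    # logits: (batch, seq_len, vocab_size); targets: (batch, seq_len)
    # Assumes token id 0 is reserved for padding.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        weight=class_weights,   # optional up-weighting of rare token classes
        ignore_index=pad_id,    # padding contributes no loss or gradient
    )

logits = torch.randn(2, 16, 1000)
targets = torch.randint(1, 1000, (2, 16))
loss = lm_loss(logits, targets)
```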


Technical Focus Areas:

  1. Transformer Architecture: Innovate on multi-head attention, feed-forward networks, and layer normalization techniques. Experience with sparse attention and efficient transformer variants is highly desirable. (See the first sketch after this list.)

  2. Tokenization and BPE: Develop and optimize tokenization strategies, with a particular emphasis on Byte Pair Encoding and its extensions. Familiarity with subword tokenization algorithms and their impact on model performance is crucial. (See the second sketch below.)

  3. Input Processing: Implement efficient strategies for handling variable-length inputs, including advanced padding and truncation techniques. Experience with dynamic batching and length-adaptive processing is a plus. (See the third sketch below.)

  4. Scale and Efficiency: Optimize models for large-scale training and inference, with a focus on memory efficiency, computational performance, and distributed training strategies. (See the fourth sketch below.)
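
To make these focus areas concrete, the four sketches below illustrate each one in turn. They are minimal, hypothetical PyTorch examples written for this posting, not the client's code. First, a bare-bones multi-head self-attention block (focus area 1), assuming embed_dim divides evenly across num_heads:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)  # fused Q/K/V projection
        self.proj = nn.Linear(embed_dim, embed_dim)     # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape                               # (batch, seq_len, embed_dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split the embedding into heads: (batch, heads, seq_len, head_dim)
        q, k, v = (z.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
                   for z in (q, k, v))
        # fused scaled dot-product attention with a causal mask (PyTorch >= 2.0)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(y.transpose(1, 2).reshape(b, t, d))

attn = MultiHeadSelfAttention(embed_dim=256, num_heads=8)
out = attn(torch.randn(2, 128, 256))                    # -> (2, 128, 256)
```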
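
Second, the core merge loop of Byte Pair Encoding (focus area 2), following the classic formulation; the tiny nucleotide-style corpus is a made-up placeholder:

```python
from collections import Counter

def pair_counts(vocab: dict[tuple[str, ...], int]) -> Counter:
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in vocab.items():
        for pair in zip(symbols, symbols[1:]):
            counts[pair] += freq
    return counts

def apply_merge(pair: tuple[str, str],
                vocab: dict[tuple[str, ...], int]) -> dict[tuple[str, ...], int]:
    """Replace each occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# made-up character-level "words" with corpus frequencies
vocab = {tuple("ACGT"): 5, tuple("ACGG"): 3, tuple("TTGA"): 2}
for _ in range(5):                      # learn five merges
    counts = pair_counts(vocab)
    if not counts:
        break
    best, _ = counts.most_common(1)[0]
    vocab = apply_merge(best, vocab)
```

On genomic text, merges like these would learn frequent multi-base subwords instead of tokenizing base by base, which is one reason tokenization choices matter for this role.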
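
Third, a simple collate function for variable-length inputs (focus area 3): truncate to a cap, pad to the longest sequence in the batch, and return an attention mask. pad_id and max_len are assumed placeholders, and token id 0 is assumed to be reserved for padding:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def collate(batch: list[list[int]], pad_id: int = 0, max_len: int = 512):
    """Truncate each sequence to max_len, then pad to the batch maximum."""
    tensors = [torch.tensor(seq[:max_len], dtype=torch.long) for seq in batch]
    padded = pad_sequence(tensors, batch_first=True, padding_value=pad_id)
    mask = padded.ne(pad_id)            # True on real tokens, False on padding
    return padded, mask

# sorting by length before forming batches keeps padding waste low
sequences = [[5, 3, 9], [1, 2], [7, 7, 7, 7, 2]]
sequences.sort(key=len)
batch, mask = collate(sequences)        # batch and mask: (3, 5)
```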
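
Finally, one common memory lever at scale (focus area 4): gradient checkpointing, which recomputes activations during the backward pass instead of storing them. The block stack here is a stand-in, not a real transformer:

```python
import torch
from torch.utils.checkpoint import checkpoint

# a stand-in stack of transformer-like blocks
blocks = torch.nn.ModuleList(
    torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.GELU())
    for _ in range(12)
)

def forward(x: torch.Tensor) -> torch.Tensor:
    for block in blocks:
        # activations inside `block` are recomputed on backward, saving memory
        x = checkpoint(block, x, use_reentrant=False)
    return x

x = torch.randn(4, 256, requires_grad=True)
forward(x).mean().backward()            # trades extra compute for lower memory
```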

The ideal candidate will have a proven track record of developing novel language modeling architectures and a deep understanding of the theoretical foundations underlying modern NLP techniques. This role requires a balance between cutting-edge research and practical implementation, with a focus on pushing the boundaries of what's possible in custom language model development.



Key Requirements:

  • 3-5 years of hands-on experience designing and implementing transformer-based architectures

  • Deep understanding of tokenization techniques, including Byte Pair Encoding (BPE)

  • Expertise in optimizing model input processing, including padding and truncation strategies

  • Proficiency in Python and deep learning frameworks (PyTorch, TensorFlow), along with the scientific Python stack (NumPy, Pandas)

  • Strong background in self-supervised learning and self-attention mechanisms


