AI Engineer - Hybrid RAG Solution (LLM & RAG)

  • Remote
    • Lima, Lima, Peru
  • Human Capital

Job description

Job Summary:

We are looking for an experienced AI Engineer specializing in Retrieval-Augmented Generation (RAG) to build and optimize hybrid AI solutions leveraging Large Language Models (LLMs). This role involves working with cutting-edge language models and retrieval systems to deliver highly accurate, context-aware, and responsive AI applications. You’ll collaborate with cross-functional teams to develop scalable solutions that enhance information retrieval, comprehension, and generation capabilities in real-world applications.

Job requirements

Key Responsibilities:

  • Design, develop, and deploy hybrid RAG architectures integrating LLMs with retrieval-based systems for improved relevance and contextual responses.
  • Fine-tune and optimize large language models, enhancing their performance and adaptability to domain-specific requirements.
  • Implement and manage RAG pipelines that effectively combine retrieval mechanisms with generative capabilities, ensuring high accuracy and efficiency.
  • Develop custom plugins, adapters, or APIs to integrate retrieval systems (e.g., Elasticsearch, FAISS) with generative models, facilitating seamless information retrieval.
  • Monitor and troubleshoot issues within RAG pipelines, fine-tuning retrieval parameters and model hyperparameters to optimize performance.
  • Work closely with data engineers to manage and preprocess large datasets for training, ensuring high-quality and diverse data coverage.
  • Evaluate and benchmark the performance of RAG solutions, using metrics such as response accuracy, latency, and user satisfaction.
  • Stay up-to-date with advancements in NLP, LLMs, and RAG methodologies, continually improving existing architectures and recommending new techniques.

Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, or a related field, or equivalent practical experience.
  • 3+ years of experience in AI/NLP, with a focus on LLMs, transformer-based architectures, and retrieval systems.
  • Proven experience building and deploying RAG solutions or other hybrid AI architectures.
  • Strong understanding of information retrieval methods, including dense retrieval, sparse retrieval, and embeddings-based techniques.
  • Proficiency in Python, TensorFlow or PyTorch, and experience with libraries and tools related to LLMs, such as Hugging Face Transformers.
  • Familiarity with retrieval frameworks like Elasticsearch, FAISS, or OpenSearch.
  • Knowledge of prompt engineering, fine-tuning, and deployment of language models for production environments.
  • Strong analytical skills, with experience in optimizing LLM and retrieval model performance.
  • English required.

Preferred Skills:

  • Experience with cloud services and infrastructure (AWS, GCP, Azure) and MLOps tools for model deployment and monitoring.
  • Contributions to open-source RAG projects or experience working with OpenAI, LangChain, or similar frameworks.
  • Knowledge of vector databases, memory-augmented networks, and distributed systems.


Location: Remote / Brazil only

English: Advanced (mandatory)

Experience: 4+ years
