Job description
Job Summary:
We are looking for an experienced AI Engineer specializing in Retrieval-Augmented Generation (RAG) to build and optimize hybrid AI solutions leveraging Large Language Models (LLMs). This role involves working with cutting-edge language models and retrieval systems to deliver highly accurate, context-aware, and responsive AI applications. You’ll collaborate with cross-functional teams to develop scalable solutions that enhance information retrieval, comprehension, and generation capabilities in real-world applications.
Job requirements
Key Responsibilities:
- Design, develop, and deploy hybrid RAG architectures integrating LLMs with retrieval-based systems for improved relevance and contextual responses.
- Fine-tune and optimize large language models, enhancing their performance and adaptability to domain-specific requirements.
- Implement and manage RAG pipelines that effectively combine retrieval mechanisms with generative capabilities, ensuring high accuracy and efficiency.
- Develop custom plugins, adapters, or APIs to integrate retrieval systems (e.g., Elasticsearch, FAISS) with generative models, facilitating seamless information retrieval.
- Monitor and troubleshoot issues within RAG pipelines, fine-tuning retrieval parameters and model hyperparameters to optimize performance.
- Work closely with data engineers to manage and preprocess large datasets for training, ensuring high-quality and diverse data coverage.
- Evaluate and benchmark the performance of RAG solutions, using metrics such as response accuracy, latency, and user satisfaction.
- Stay up-to-date with advancements in NLP, LLMs, and RAG methodologies, continually improving existing architectures and recommending new techniques.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, or a related field, or equivalent practical experience.
- 3+ years of experience in AI/NLP, with a focus on LLMs, transformer-based architectures, and retrieval systems.
- Proven experience building and deploying RAG solutions or other hybrid AI architectures.
- Strong understanding of information retrieval methods, including dense retrieval, sparse retrieval, and embeddings-based techniques.
- Proficiency in Python, TensorFlow or PyTorch, and experience with libraries and tools related to LLMs, such as Hugging Face Transformers.
- Familiarity with retrieval frameworks like Elasticsearch, FAISS, or OpenSearch.
- Knowledge of prompt engineering, fine-tuning, and deployment of language models for production environments.
- Strong analytical skills, with experience in optimizing LLM and retrieval model performance.
- Advanced English proficiency required.
Preferred Skills:
- Experience with cloud services and infrastructure (AWS, GCP, Azure) and MLOps tools for model deployment and monitoring.
- Contributions to open-source RAG projects or experience working with OpenAI, LangChain, or similar frameworks.
- Knowledge of vector databases, memory-augmented networks, and distributed systems.
Location: Remote (Brazil only)
English: Advanced (mandatory)
Experience: 4+ years