Join the best no-code platform team ever!

AI Engineer – Retrieval-Augmented Generation (RAG)
Location: United States | Remote
Experience: 3+ years

Position Overview:

The AI Engineer (RAG) is responsible for architecting, developing, and deploying advanced Retrieval-Augmented Generation (RAG) pipelines that power AI-driven document understanding, workflow automation, and enterprise search. This role blends expertise in information retrieval with hands-on production deployment of large language model solutions.

Responsibilities:

  • Design and build scalable RAG pipelines using vector search, hybrid retrieval, reranking, and contextual compression techniques for structured and unstructured data.
  • Develop and optimize data ingestion, document chunking, and embedding generation workflows; implement and manage vector databases (Milvus, FAISS, Pinecone, pgvector).
  • Integrate RAG capabilities with LLM systems, APIs, and enterprise applications using frameworks and services such as FastAPI, Flask, and Azure Cognitive Search.
  • Evaluate and monitor performance using classical IR metrics (recall, precision), LLM-specific metrics (factuality, latency), and real-world application feedback; iterate for continuous improvement.
  • Ensure production readiness of AI solutions: robust, observable, secure, and aligned with data privacy and business compliance requirements.
  • Collaborate with cross-functional teams, including product managers, backend engineers, and domain experts, to translate technical capabilities into business value.
  • Contribute to best practices, documentation, and code reviews to strengthen team expertise and support knowledge transfer.
  • Conduct workshops and enablement sessions to upskill internal teams on GenAI and RAG technologies.

Qualifications:

  • Bachelor’s or Master’s in Computer Science, Data Science, or related field.
  • Practical experience with RAG architectures, vector databases, LLM orchestration, and Python.
  • Hands-on expertise in AI search systems and embedding models (OpenAI, Cohere, Sentence Transformers).
  • Strong understanding of prompt engineering, data workflows, and evaluation strategies.
  • Excellent teamwork and communication skills.

Knowledge of:

  • Large Language Models (LLMs): Selection, integration, fine-tuning, and deployment of pre-trained LLMs (e.g., GPT series, LLaMA, Falcon), including model optimization via quantization and distillation.
  • Prompt Engineering: Designing, testing, and refining prompts to guide model outputs, especially in multi-turn and context-sensitive scenarios.
  • Retrieval Techniques: Building semantic search systems using embeddings and vector databases (Pinecone, Weaviate, Milvus, Chroma); hybrid search strategies that combine keyword and semantic search.
  • Document and Data Workflows: Data preprocessing, chunking, embedding generation, and managing structured/unstructured knowledge sources for RAG.
  • Statistical Evaluation: Applying statistical thinking to model and system evaluation—understanding metrics, real-world performance, and analytical error analysis.
  • API & Integration: Building and integrating APIs (RESTful endpoints via FastAPI/Flask), connecting RAG and LLM services to products and business workflows.
  • Productionization & Scaling: Containerization (Docker), cloud deployment, orchestration (Kubernetes), and CI/CD practices for reliable AI service delivery.
  • Continuous Knowledge Update: Ensuring dynamic information retrieval, re-indexing new content, and keeping models grounded in current authoritative data.
  • Security, Ethics, Compliance: Understanding of data privacy laws (GDPR, CCPA), responsible AI guidelines, and enterprise governance practices.

Join our team!

Please fill out the application form and we will email you back within 3 days.
