Fleet Management Limited
AI Engineer Intern
- Developed and deployed a GenAI document-rewriting application using AWS Bedrock for LLM inference, Amazon EC2 for scalable hosting, and Amazon S3 for secure document storage.
- Built a RAG pipeline using Gemini models for text embeddings and inference via Google Vertex AI. Integrated the Pinecone vector store for efficient data storage and retrieval, added hybrid search with BM25 and re-ranking strategies, and optimized deployment on Google Kubernetes Engine (GKE).
- Developed advanced prompt engineering techniques to generate synthetic data and implemented an LLM-as-a-judge framework to evaluate response relevancy, contextual precision and recall, and faithfulness, using these metrics to select the best chunking strategies.
- Implemented end-to-end observability on GKE using OpenTelemetry for distributed tracing, Grafana and Prometheus for real-time monitoring, and a CronJob for periodic online evaluation. Developed an interactive Tableau dashboard for daily reporting, providing stakeholders with real-time access to chat histories and key performance metrics.