Automating AI quality evaluation with HCLTech and Google Cloud

5 min read

A global ride-hailing and mobility leader partnered with HCLTech and Google Cloud to transform the quality of its AI-based code generation. Leveraging Google Gemini models, prompt engineering and automation-led frameworks, HCLTech helped the organization pilot an innovative solution for assessing large language model (LLM) outputs, enhancing scalability, improving consistency and reducing manual effort in AI code quality assurance.

The Challenge

  • Limited scalability of the existing AI quality evaluation framework
  • Time-intensive manual validation of LLM-generated responses
  • Inconsistent application of writing and code quality parameters across multiple use cases

The Objective

  • Pilot an automated solution for evaluating LLM outputs at scale
  • Enhance consistency and reliability of AI code quality assessments
  • Reduce operational overhead through automation and intelligent evaluation
  • Lay the foundation for enterprise-wide AI assurance capabilities

The Solution

  • Developed a PoC solution leveraging Google Gemini models to automate evaluation of writing and code quality attributes
  • Implemented prompt engineering and templated outputs integrated with Google Sheets for traceability and insight generation
  • Designed an evaluation framework focusing on writing quality (coherence, truncation) and code quality (language clarity, reasoning)
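The evaluation approach described above — prompting a judge model to score outputs against defined quality dimensions and capturing the results in a templated, traceable format — can be sketched as follows. This is a minimal illustration, not the actual HCLTech implementation: the dimension names, scoring scale, prompt wording and `build_prompt`/`parse_scores` helpers are all assumptions, and the Gemini and Google Sheets integrations are deliberately stubbed out.

```python
import json

# Illustrative quality dimensions mirroring the case study's framework:
# writing quality (coherence, truncation) and code quality
# (language clarity, reasoning). Names and 1-5 scale are assumptions.
DIMENSIONS = ["coherence", "truncation", "language_clarity", "reasoning"]

PROMPT_TEMPLATE = """You are a quality evaluator for LLM-generated code responses.
Rate the response below on each dimension from 1 (poor) to 5 (excellent)
and return ONLY a JSON object, e.g. {{"coherence": 4, ...}}.

Dimensions: {dimensions}

Response to evaluate:
{response}
"""

def build_prompt(response_text: str) -> str:
    """Fill the evaluation prompt template for one LLM output."""
    return PROMPT_TEMPLATE.format(
        dimensions=", ".join(DIMENSIONS), response=response_text
    )

def parse_scores(judge_reply: str) -> dict:
    """Parse the judge model's templated JSON reply into a score dict,
    validating that every expected dimension is present and in range."""
    scores = json.loads(judge_reply)
    for dim in DIMENSIONS:
        value = scores.get(dim)
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"invalid score for {dim!r}: {value!r}")
    return scores

if __name__ == "__main__":
    # In the real pilot, the prompt would be sent to a Gemini model and
    # each parsed score row appended to a Google Sheet for traceability;
    # here the judge's reply is faked to keep the sketch self-contained.
    prompt = build_prompt("def add(a, b):\n    return a + b")
    fake_reply = (
        '{"coherence": 5, "truncation": 5,'
        ' "language_clarity": 4, "reasoning": 4}'
    )
    print(parse_scores(fake_reply))
```

Forcing the judge into a fixed JSON template is what makes the workflow automatable: every evaluation produces the same machine-readable row, so results can be aggregated across use cases instead of being re-interpreted by a human reviewer each time.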

The Impact

  • 20% reduction in human validation time
  • 20% increase in overall evaluation throughput
  • 15% improvement in consistency of applied quality parameters
  • 15% operational cost savings through AI-based automation