
Job description
We are looking for an experienced Lead Prompt Engineer to guide and manage a team through the full technical migration process, transitioning templates to LLM autoraters. In this role, you will leverage advanced prompt engineering techniques and the client’s internal tools to optimize model performance, ensuring the successful integration and ongoing enhancement of AI systems. As the team lead, you will drive the strategy, mentor junior engineers, and play a key role in shaping the future of our AI-driven solutions.
Responsibilities:
Utilize Automatic Prompt Generation (APG) tools to create baseline prompts for complex parent-child template clusters.
Run and supervise the Automated Prompt Optimization (APO) tool, review its outputs, and flag when it reaches deadlocks or plateaus.
Manually draft, test, and refine prompts to navigate complex template architectures, overcome anti-patterns, and handle edge cases where tooling is lacking or broken.
Monitor shadowbot runs to ensure sufficient disagreements (between human and LLM ratings) are generated, registered, and tracked.
Run prompt versions against established gold data to continuously measure autorater quality against the human crowd baseline, calculating accuracy metrics such as F1 scores, precision, and recall.
Draft technical launch readiness justifications (Launch Certification Documentation) for final.
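As an illustration of the gold-data measurement described above, here is a minimal Python sketch of how precision, recall, and F1 could be computed for autorater labels against gold labels. The function name, the label values, and the sample data are hypothetical and are not taken from the client's tooling; it assumes a single "positive" label of interest.

```python
def precision_recall_f1(gold, predicted, positive="relevant"):
    """Compare predicted labels with gold labels for one positive class.

    Hypothetical helper for illustration only; label names are assumptions.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    # True positives: autorater and gold both say positive.
    tp = sum(1 for g, p in pairs if p == positive and g == positive)
    # False positives: autorater says positive, gold disagrees.
    fp = sum(1 for g, p in pairs if p == positive and g != positive)
    # False negatives: gold says positive, autorater missed it.
    fn = sum(1 for g, p in pairs if p != positive and g == positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up sample data: five items rated by the gold set and by an autorater.
gold = ["relevant", "relevant", "not_relevant", "relevant", "not_relevant"]
autorater = ["relevant", "not_relevant", "not_relevant", "relevant", "relevant"]

precision, recall, f1 = precision_recall_f1(gold, autorater)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Running the same comparison for each prompt version against the same gold set would make the scores comparable across versions.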
Requirements:
Language Skills: Native fluency in English.
Location: Must be based in the United States.
Education: Master's or Doctorate degree in Computer Science, Data Science, Computational Linguistics, Human-Computer Interaction (HCI), Cognitive Science, or a related analytical field.
Prompt Engineering & AI Expertise: At least 7 years' experience as a Prompt Engineer. Proven experience tuning Large Language Models (LLMs) for strict, structured outputs and complex classification tasks, plus familiarity with chain-of-thought and few-shot learning.
Data Analysis: Strong proficiency in identifying error patterns, analyzing model performance, and using SQL or other data analytics tools.
Technical Agility: Ability to quickly learn and master proprietary tools with minimal supervision.
Communication: Excellent verbal and written communication skills.
Optional / Preferred Skills:
Familiarity with enterprise-grade LLM interfaces like the Goose API.
Experience in AI model evaluation, data science, computational linguistics, or software engineering.
Hands-on experience with Automated Prompt Optimization (APO) systems or tuning workflows.
Linguistic expertise, including an understanding of semantics and logic.
Federal Law Compliance
In compliance with federal law, all persons hired will be required to:
- Verify identity and eligibility to work in the United States; and
- Complete a required employment eligibility verification form.
Please note that to verify work authorization as required by federal law (I-9 process), all new employees must complete a live video verification with their selected IDs and provide photos of those IDs within their first 3 days of employment.
To apply, submit your application through the Welo Global career page.

Originally posted via 4 Day Week.