Adversarial Prompt Expert
You’ll be part of a red teaming project focused on probing large language models for failure modes and harmful outputs. Your work will involve crafting prompts and scenarios to test model guardrails, exploring creative ways to bypass restrictions, and systematically documenting outcomes. You’ll think like an adversary to uncover weaknesses, while collaborating with engineers and safety researchers to share findings and improve system defenses.
Apply to this job