Research Scientist- Safeguards

London, UK

About the AI Security Institute

The AI Security Institute is the world's largest government team dedicated to understanding AI capabilities and risks.

Our mission is to equip governments with an empirical understanding of the safety of advanced AI systems. We conduct research to understand the capabilities and impacts of advanced AI, and we develop and test risk mitigations. We focus on risks with security implications, including the potential for AI to assist with the development of chemical and biological weapons, its use in carrying out cyber-attacks, its ability to enable crimes such as fraud, and the possibility of loss of control.

The risks from AI are not sci-fi; they are urgent. By combining the agility of a tech start-up with the expertise and mission-driven focus of government, we’re building a unique and innovative organisation to prevent AI’s harms from impeding its potential.

Team Description 

Interventions that secure a system from abuse by bad actors will grow in importance as AI systems become more advanced and integrated into society. The AI Security Institute’s Safeguard Analysis Team researches these interventions: we evaluate the protections on current frontier AI systems and research what measures could better secure them in the future. We then share our findings with the frontier AI companies, key UK officials, and other governments – informing their deployment, research, and policy decision-making.  
 
We have published on several topics, including agent misuse, defending fine-tuning APIs, third-party attacks on agents, safeguards safety cases, and attacks on layered defences. Example impacts include advancing the benchmarking of agent misuse, identifying safeguard vulnerabilities previously unknown to frontier AI companies, and producing insights into the feasibility and effectiveness of attacks and defences for data poisoning and fine-tuning APIs.

On our team, you can substantially advance both research on how to attack and defend frontier AI and governments’ understanding of misuse risks, which we see as critical to advanced AI going well.

Role Description 

We’re looking for researchers with expertise in developing and analysing attacks on, and protections for, systems based on large language models, or with broader experience in frontier LLM research and development. An ideal candidate will have a strong record of performing and publishing novel, impactful research in these or other areas of LLM research.

We're primarily looking for research scientists, but we can support work that spans or alternates between research and engineering. The broader team's work includes research, such as assessing the threats to frontier systems, performing novel adversarial ML research on frontier LLMs, and developing novel attacks, as well as engineering, such as building infrastructure for running evaluations.

The team is currently led by Xander Davies and advised by Geoffrey Irving and Yarin Gal. You’ll work with incredible technical staff across AISI, including alumni from Anthropic, OpenAI, DeepMind, and top universities. You may also collaborate with external teams like Anthropic, OpenAI, and Gray Swan.  

We are open to hires at junior, senior, staff and principal research scientist levels.  

Representative projects you might work on 

  • Designing, building, running and evaluating methods to automatically attack and evaluate safeguards, such as LLM-automated attacking and direct optimisation approaches. 
  • Building a benchmark for asynchronous monitoring for signs of misuse and jailbreak development across multiple model interactions. 
  • Investigating novel attacks and defences for data poisoning LLMs with backdoors or other attacker goals. 
  • Performing adversarial testing of frontier AI system safeguards and producing reports that are impactful and action-guiding for safeguard developers. 

What we’re looking for 

In accordance with the Civil Service Commission rules, the following list contains all selection criteria for the interview process.

Required Experience

The experiences listed below should be interpreted as examples of the expertise we're looking for, as opposed to a list of everything we expect to find in one applicant:

You may be a good fit if you have:  

  • Hands-on research experience with large language models (LLMs) - such as training, fine-tuning, evaluation, or safety research. 
  • A demonstrated track record of peer-reviewed publications in top-tier ML conferences or journals. 
  • Ability and experience writing clean, documented research code for machine learning experiments, including experience with ML frameworks like PyTorch or evaluation frameworks like Inspect. 
  • A sense of mission, urgency, and responsibility for success. 
  • An ability to bring your own research ideas and work in a self-directed way, while also collaborating effectively and prioritizing team efforts over extensive solo work. 

Strong candidates may also have:  

  • Experience working on adversarial robustness, other areas of AI security, or red teaming against any kind of system. 
  • Extensive experience writing production quality code. 
  • Desire to and experience with improving our team through mentoring and feedback. 
  • Experience designing, shipping, and maintaining complex technical products. 

 

Salary & Benefits

We are hiring individuals at all levels of seniority and experience within this research unit, and this advert allows you to apply for any of the roles within this range. Your dedicated talent partner will work with you as you move through our assessment process to explain our internal benchmarking process. The full range of salaries is available below; each salary comprises a base salary and a technical allowance, plus additional benefits as detailed on this page.

  • Level 3 - Total Package £65,000 - £75,000 inclusive of a base salary £35,720 plus additional technical talent allowance of between £29,280 - £39,280
  • Level 4 - Total Package £85,000 - £95,000 inclusive of a base salary £42,495 plus additional technical talent allowance of between £42,505 - £52,505
  • Level 5 - Total Package £105,000 - £115,000 inclusive of a base salary £55,805 plus additional technical talent allowance of between £49,195 - £59,195
  • Level 6 - Total Package £125,000 - £135,000 inclusive of a base salary £68,770 plus additional technical talent allowance of between £56,230 - £66,230
  • Level 7 - Total Package £145,000 inclusive of a base salary £68,770 plus additional technical talent allowance of £76,230

There are a range of pension options available which can be found through the Civil Service website. 

 

This role sits outside of the DDaT pay framework because its scope requires in-depth technical expertise in frontier AI safety, robustness, and advanced AI architectures.

 

 


Additional Information

Internal Fraud Database 

The Internal Fraud function of the Fraud, Error, Debt and Grants Function at the Cabinet Office processes details of civil servants who have been dismissed for committing internal fraud, or who would have been dismissed had they not resigned. The Cabinet Office receives these details from participating government organisations; the civil servants concerned are then banned from further employment in the Civil Service for 5 years. The Cabinet Office processes this data and discloses a limited dataset back to DLUHC as a participating government organisation. DLUHC then carries out pre-employment checks to detect instances where known fraudsters attempt to reapply for roles in the Civil Service. In this way, the policy is enforced and the repetition of internal fraud is prevented. For more information, please see the Internal Fraud Register.

Security

Successful candidates must undergo a criminal record check and get baseline personnel security standard (BPSS) clearance before they can be appointed. Additionally, there is a strong preference for eligibility for counter-terrorist check (CTC) clearance. Some roles may require higher levels of clearance, and we will state this by exception in the job advertisement. See our vetting charter here.

 

Nationality requirements

We may be able to offer roles to applicants of any nationality or background. As such, we encourage you to apply even if you do not meet the standard nationality requirements (opens in a new window).

Working for the Civil Service

The Civil Service Code (opens in a new window) sets out the standards of behaviour expected of civil servants. We recruit by merit on the basis of fair and open competition, as outlined in the Civil Service Commission's recruitment principles (opens in a new window). The Civil Service embraces diversity and promotes equal opportunities. As such, we run a Disability Confident Scheme (DCS) for candidates with disabilities who meet the minimum selection criteria. The Civil Service also offers a Redeployment Interview Scheme to civil servants who are at risk of redundancy, and who meet the minimum requirements for the advertised vacancy.

Diversity and Inclusion

The Civil Service is committed to attracting, retaining and investing in talent wherever it is found. To learn more, please see the Civil Service People Plan (opens in a new window) and the Civil Service Diversity and Inclusion Strategy (opens in a new window).

Apply for this job

UK Diversity Questions

It's important to us that everyone at AISI feels included in the team, whoever they are and whatever their background. These questions help us understand the diversity of our applicants. If you would prefer not to answer a question, each includes an 'I don't wish to answer' option. Your answers will not affect your hiring outcome in any way.

If you would like to discuss any of these questions further or want clarity on them, we'd be happy to talk; please reach out to active.campaigns@dsit.gov.uk
