Quality Assurance Engineer (Server Production)

Taiwan

Why work at Nebius
Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in-house AI/ML teams. Our employees work at the cutting edge of AI cloud infrastructure alongside some of the most experienced and innovative leaders and engineers in the field.

Where we work
Headquartered in Amsterdam and listed on Nasdaq, Nebius has a global footprint with R&D hubs across Europe, North America, and Israel. The team of over 800 employees includes more than 400 highly skilled engineers with deep expertise across hardware and software engineering, as well as an in-house AI R&D team.

About the Role

We are looking for a technically strong, hands-on QA Engineer to join our hardware team on-site at ODM factories in Taiwan. This is not a checklist job: we're looking for someone who enjoys digging deep into technical issues, investigating root causes, and taking ownership of complex hardware problems.

You'll be the key person ensuring the quality of our servers and racks before they ship, but more importantly, you'll play a critical role in debugging failures, analyzing test data, and working closely with R&D, logistics, and factory teams to continuously improve the process and the product.

This is a deeply technical role that blends hardware validation, manufacturing QA, and problem-solving. It is a strong fit for someone who understands how servers are built and tested, and wants to make sure every unit that leaves the factory is production-grade.

Key Responsibilities

Technical Investigation & Debugging

  • Investigate complex problems (e.g., high GPU failure rate, power-related test failures), gather logs, run diagnostics, and escalate with context to R&D when needed.
  • Drive root cause analysis across factory teams and internal engineering groups.
  • Document findings and help define preventive actions for recurring problems.
  • Act as the first line of technical escalation for hardware issues discovered during factory QA or internal testing.

Engineering Support

  • Participate in new platform bring-up sessions together with the visiting R&D teams during on-site trips to ODM labs.
  • Provide technical support, coordination, and hands-on assistance during the bring-up process.
  • Help ensure early-stage hardware behaves as expected, and escalate integration or platform issues to the relevant teams.

On-Site Product QA

  • Perform visual inspections of completed products (servers, racks) before packaging.
  • Define and maintain QA checklists and inspection procedures tailored to different product lines.
  • Verify inventory records at the factory against internal system data (part numbers, serials, configurations).
  • Oversee the product packaging process for compliance with defined standards.
  • Supervise pickup operations: ensure outbound trucks meet shipment conditions and schedules.

Failure Rate Monitoring & Analytics

  • Collect failure data from vendor-side burn-in and our own test systems.
  • Analyze failure trends and estimate spare part needs for future datacenter deployments.
  • Use dashboards and structured reporting to communicate insights to QA, engineering, and supply chain teams.

Feedback Loop & Quality Improvement

  • Gather and process feedback from datacenters on each delivered batch of equipment:
      * Report on packaging issues, impact sensor triggers, shipping anomalies.
      * Assess rack-level build quality: cabling, bracket alignment, labeling.
      * Log systemic hardware issues (design flaws, infant mortality, recurring failures).
  • Forward the feedback to the relevant teams: logistics, ODM partners, hardware R&D, and QA.

Test Infrastructure & Validation

  • Assist with deployment and maintenance of test infrastructure on-site.
  • Ensure Nebius post-manufacturing hardware validation tests run smoothly (uptime, monitoring, coordination with support team).
  • Coordinate real-time issue escalation and basic triage with factory and internal teams.

Local Insight & Communication

  • Communicate relevant local risks and context (e.g., typhoons, holidays, factory-specific constraints) to our global logistics and hardware teams.
  • Maintain productive relationships with factory staff, logistics providers, and internal stakeholders.

Working Conditions & Tools

  • During production peaks, issues may arise that require fast, hands-on debugging and resolution on-site. Flexibility is expected: you may need to stay late to investigate failures in freshly built batches or arrive early to verify and unblock outbound truck shipments. Rapid response and clear communication with engineering and factory teams are critical during these high-pressure periods.
  • Occasional international travel to Nebius headquarters in Amsterdam or to datacenters in Europe and the US may be expected.
  • Daily work involves:
      * Managing workflows and escalations in Jira
      * Writing and maintaining technical documentation in Confluence
      * Using Grafana dashboards to monitor test environments and system health
      * Working with several internal inventory and test control systems

Qualifications

  • Strong technical background in hardware or systems engineering, able to independently investigate and troubleshoot complex issues with server systems.
  • 5+ years of experience in hardware QA, manufacturing supervision, or server validation.
  • Solid understanding of server and rack hardware: components, layout, cabling, power/cooling, diagnostics.
  • Ability to read and interpret technical documentation (e.g., datasheets, system specs, debug manuals).
  • Solid knowledge of electrical engineering fundamentals (e.g., power specs, grounding, signal integrity).
  • Familiarity with failure analysis, test automation tools, and basic data scripting.
  • Familiarity with server test frameworks, factory QA processes, and hardware diagnostic tooling.
  • Working proficiency in English (spoken and written), ideally at B2–C1 level: comfortable participating in Zoom calls, writing updates, and communicating via Slack/email.
  • Highly organized and detail-oriented with a process-driven mindset.
  • Self-motivated, able to operate independently in a fast-paced manufacturing environment.
  • Experience working directly with ODMs or large-scale electronics manufacturing in Taiwan or Asia.
  • Understanding of logistics and hardware supply chain processes.

What we offer

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within Nebius.
  • Hybrid working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.

We’re growing and expanding our products every day. If you’re up to the challenge and are as excited about AI and ML as we are, join us!
