Your mission
Join us at Rhesis AI – Open-source test generation and management for Gen AI applications that deliver value, not surprises!
At Rhesis AI, we empower organizations to develop and deploy Gen AI applications that meet high standards for reliability, robustness, and compliance. As the creators of an open-source solution for test generation and management, we enable AI teams to build context-specific tests and collaborate directly with domain experts.
We're currently part of the K.I.E.Z. Accelerator at the Merantix AI Campus in Berlin, where we’re building the testing infrastructure Gen AI needs to earn trust at scale.
If you’re passionate about advancing trustworthy AI through practical tools and collaborative infrastructure, we invite you to join our mission.
Your profile
What you will do:
- Support the development of web-based interfaces and internal tools for the Rhesis AI platform—bridging frontend, backend, and AI components.
- Help integrate large language models and AI pipelines into user-facing applications, supporting tasks like prompt management, evaluation dashboards, and result visualization.
- Work alongside experienced engineers to prototype features, contribute to backend APIs, and explore LLM behaviors in real-world testing scenarios.
- Assist in implementing automated evaluation logic and workflows to validate the performance and quality of Gen AI applications.
- Contribute to test case generation, dataset curation, or scripting tasks as needed for AI experiments or platform development.
- Learn and apply modern development practices (e.g., CI/CD, containerization, version control) in a collaborative, product-focused environment.
You are a great fit for this role if you:
- Are currently pursuing a Bachelor’s or Master’s degree in Computer Science, Data Science, AI, or a related field.
- Have hands-on experience with Python and/or JavaScript/TypeScript, and enjoy working across both backend and frontend components.
- Are curious about generative AI, large language models, and how they can be integrated into real-world applications.
- Have basic experience with at least one ML framework (e.g., PyTorch, TensorFlow, or HuggingFace Transformers).
- Are comfortable using Git and working in a structured development environment.
- Enjoy learning quickly, tackling new technologies, and contributing to early-stage product development.
Why us?
We’re excited to offer a 20 hrs/week contract, starting 15 June or 1 July, along with a range of benefits to support our team members, including:
- Work at the forefront of Gen AI: Collaborate with some of the most innovative companies building LLM applications. Contribute to the trustworthiness of AI by shaping open-source tools that define how Gen AI is tested and validated.
- Flexible work arrangements: We understand the importance of work-life balance and offer flexible working options to accommodate personal needs and preferences. We have offices in Berlin (AI Campus) and Potsdam (Griebnitzsee).
- Compensation: We offer salaries and benefits tailored to your experience and qualifications.
- A supportive and collaborative work environment: We foster a culture of teamwork, collaboration, and mutual respect, where every team member is valued and supported in their professional and personal growth.
At Rhesis AI, we value diversity and inclusion, believing that diverse perspectives enrich our team and drive innovation. We encourage applications from individuals of all backgrounds, regardless of gender, nationality, religion, or other personal characteristics. Even if you don’t meet every requirement listed, we encourage you to apply—your unique skills and experiences could be exactly what we need to succeed.
About us
At Rhesis AI, we’re driven by the goal of making AI evaluation and testing seamless, thorough, and accessible. We’re not just another tech company – we’re building solutions to ensure that Gen AI applications are reliable, resilient, and ready to meet the demands of real-world use. Our focus is on providing a comprehensive, automated testing platform that validates AI applications across diverse scenarios and industries, helping businesses confidently deploy Gen AI.