Applied ML Engineer
Join Pruna AI as an Applied ML Engineer to turn cutting-edge models into fast, efficient, and production-ready AI — making state-of-the-art accessible, affordable, and sustainable.
We usually respond within a day
👋 About us
At Pruna, we’re on a mission to make AI more efficient to build a better future.
While foundation model labs focus on scaling up, we aim to level the playing field by building AI models that are as accessible as possible.
After years of research on efficient ML, we decided that the best way to spread our impact was to take it into our own hands. Each of us cares deeply about empowering people to maximize their impact while minimizing their carbon footprint.
🔍 Role Description
As an Applied ML Engineer at Pruna AI, you will bridge the gap between cutting-edge research and real-world application. Your mission is to identify the most promising AI models released by the community and industry, make them more efficient by applying a combination of internal and external efficiency methods, and deploy them for end users.
You’ll be at the forefront of operationalizing our research, ensuring that users can benefit from state-of-the-art models without the heavy costs of deployment. This is a hands-on role combining deep ML expertise with practical engineering skills.
What you’ll do:
- Model Optimization
- Analyse newly released open-source models and assess the impact of optimising and deploying them.
- Apply a combination of internal and external efficiency methods to make these models more efficient.
- Benchmark performance vs. baseline models and ensure minimal accuracy/performance trade-offs.
- Generate clear reports that translate technical results into actionable insights for communication and go-to-market.
- Continuously improve deployed models as research and hardware evolve.
- Deployment & Delivery
- Package optimized models for deployment on the cloud.
- Ensure smooth integration into Pruna’s SaaS platform and customer environments.
- Collaborate with the Software team to scale testing, deployment, and monitoring.
- Collaborate closely with the Research team to identify and apply promising algorithms, and give feedback on how current algorithms can be improved.
- Customer & Partner Engagement
- You will work closely with customers and users of our Optimised Models, whether to identify the most promising models or to understand the exact specifications required.
- You will stay in constant contact with users so that you can quickly iterate on and improve our models to best fit industry and production use cases.
🌟 Your Skills
We would love to see:
Educational background or Experience
- B.Sc./M.Sc./Ph.D. in Computer Science, Machine Learning, or related fields—or equivalent industry experience.
- Demonstrated experience working with modern AI models (e.g., transformers, diffusion, multimodal architectures).
Machine Learning Expertise
- Strong foundations in deep learning and applied ML.
- Expertise in PyTorch and Python.
- Familiarity with model deployment workflows (Cog, LitServe, vLLM, etc.).
Engineering & Deployment
- Experience taking ML models from research to production in real-world environments.
- Understanding of performance benchmarking, profiling, and hardware-aware optimization.
- Comfort with neocloud platforms (Replicate/Runpod/Modal) or legacy clouds (AWS/Azure/GCP), and with containerization (Cog, Docker, etc.).
Evaluation Skills
- Strong understanding of benchmarking tools and frameworks for both quality and efficiency.
- Experience translating evaluation metrics into actionable engineering trade-offs.
Personal Attributes
- Strong sense of ownership and accountability.
- Ability to thrive in ambiguous, fast-moving environments.
- Clear communication skills to bridge research and customer needs.
- Passion for making AI both impactful and sustainable.
Bonus Points
- Experience with compression methods (quantization, pruning, distillation, compilation).
- Knowledge of lower-level optimization frameworks (Triton, CUDA, C++).
- Prior experience in forward-deployed engineering or customer-facing ML roles.
⚖️ Expected Salary
💸 Salary: We pay top market rates based on seniority and location, using aggregated third-party data.
🌞 Benefits: Meal vouchers, health & wellness solutions, mobility, travel policy to visit fellow Pruners, and a remote stipend for your home workspace.
🛤️ Recruitment Process
Our recruitment process consists of 4 interviews to check expectations, technical skills, and team/culture fit:
- Intro Call – Get to know each other. [~1 hour]
- Foundations – Problem-solving & ML/engineering fundamentals. [~1 hour]
- Challenge – Apply your skills on a representative task. [~2 to 3 hours preparation + 1 hour call]
- Meet the Team – Chat with Pruners and learn about day-to-day life. [~1 hour]
Accessibility note: We adapt the process to your needs to ensure equal opportunity for all applicants.
💜 Our Values
- 🧠 Decide Wisely – Rational, customer-focused decisions.
- 🤝 Trust by Default – Transparency and collaboration.
- 🌍 Foster Inclusion – Supportive, diverse workplace.
- 🌱 Grow Together – Feedback and recognition.
- 🚀 Learn Relentlessly – Adapt and innovate in a fast-moving landscape.
- Department: Engineering
- Locations: Munich
- Remote status: Hybrid
