- Reducto
- San Francisco, CA
- Full-Time
- 2 days ago
- $150K – $300K
Infrastructure Engineer: our view in 3 lines...
- The Role: An experienced infrastructure engineer to design and scale production infrastructure supporting AI and ML workloads for enterprise customers.
- The Person: The person will design, build, and maintain scalable, highly available infrastructure, implement monitoring and incident response, and automate deployments for AI/ML workloads.
- Requirements: The role requires 5+ years of experience and comfort with Python, cloud platforms, container orchestration such as Kubernetes, networking, and storage technologies.
Job Description
Reducto helps AI teams ingest real-world enterprise data with state-of-the-art accuracy.
The vast majority of enterprise data — from financial statements to health records — is locked in unstructured file formats like PDFs and spreadsheets. We train vision models to read those documents the way a human would, and make it possible to build products, train models, and automate processes at scale.
We’ve grown incredibly quickly, increasing revenue 8x year over year, and now work with hundreds of companies, from leading AI teams (Harvey, Vanta, Scale) to enterprises (FAANG, a top-3 trading firm).
We've raised over $100M from world-class investors like a16z, Benchmark, and First Round Capital, and are hiring a founding Infrastructure Engineer.
The Opportunity
As an Infrastructure Engineer at Reducto, you will influence every aspect of our infrastructure from the ground up. You will architect and scale resilient systems for AI and ML workloads, automate cloud infrastructure, and implement monitoring and incident response practices that set the standard for reliability. This role requires technical leadership, hands-on systems engineering, and strong collaboration with our founders and product teams as we build a company around reliability, rapid iteration, and high-impact product delivery.
The core work will include:
- Designing, building, and maintaining highly available, scalable infrastructure to support intensive AI/ML workloads and real-time model deployments.
- Implementing robust monitoring, alerting, and observability systems to ensure system health, performance, and uptime across cloud and on-prem environments.
- Debugging, optimizing, and automating infrastructure for fast iteration and rapid deployment cycles, focusing on both reliability and developer velocity.
- Proactively identifying, investigating, and resolving incidents to minimize downtime and maintain world-class service levels for enterprise customers.
- Collaborating closely with engineers, ML specialists, and founders to shape product, infrastructure, and security strategies.
We would love to meet you if you:
- Are your own worst critic: you have an extremely high bar for quality and always aim for robust solutions rather than quick fixes.
- Have 5+ years of hands-on experience building or supporting production-grade infrastructure and reliability processes for high-throughput systems.
- Are comfortable with Python or similar languages, and exceptional at working across cloud platforms, container orchestration (e.g., Kubernetes), networking, and storage technologies.
- Build your own tools on the fly to diagnose, experiment with, and address reliability problems, whether that means an internal dashboard or an automated remediation workflow.
- Bring a quantitative, hands-on approach to system operations, automation, and continuous improvement.
Bonus points if you:
- Have prior experience founding a company or building products/infrastructure in early-stage, high-growth environments.
- Are excited about automating incident management processes with LLMs/AI.
- Are driven, ambitious, and deeply care about both technical excellence and collaborative problem-solving.
- Keep up with the latest trends in cloud, observability, and SRE best practices.
- Are passionate about open-source and have contributed tools or automation to reliability communities.
- Have built or optimized monitoring, incident response, or high-performance computing systems for demanding AI/ML, fintech, or enterprise clients.
This is an in-person role at our office in SF. We’re an early-stage company, which means that the role requires working hard and moving quickly. Please only apply if that excites you.
Benefits at Reducto
- Unlimited PTO: We believe great work requires recharging.
- Lunch: Receive a free lunch to eat with your teammates daily at the office.
- Reimbursed Transportation: Provide us with your receipts and we’ll take care of the costs.
- Insurance: Generous health insurance covering medical, dental, and vision.
- Health and Wellness Budget: We provide up to $150/mo reimbursement for health and wellness spending, such as gym memberships, fitness classes, or similar.
- Parental Leave: Work with us to build a leave schedule that works for you and your family.
Reducto is an Equal Opportunity Employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex, race, color, age, national origin, religion, physical and mental disability, genetic information, marital status, sexual orientation, gender identity/assignment, citizenship, pregnancy or maternity, protected veteran status, or any other status prohibited by applicable national, federal, state or local law.
