{"id":13090,"date":"2025-10-24T13:35:24","date_gmt":"2025-10-24T13:35:24","guid":{"rendered":"http:\/\/kick-start.us\/?post_type=job_listing&#038;p=13090"},"modified":"2025-10-24T13:35:24","modified_gmt":"2025-10-24T13:35:24","slug":"california-398-generalista-de-infraestrutura","status":"publish","type":"job_listing","link":"https:\/\/kick-start.us\/pt-br\/vaga\/california-398-generalista-de-infraestrutura\/","title":{"rendered":"Generalista de infraestrutura"},"content":{"rendered":"<p>Descri\u00e7\u00e3o completa do trabalho<br \/>Generalista de infraestrutura<br \/>Applied Compute builds in-house, highly specialized agent workforces for the most advanced enterprises in the world.<br \/>Today\u2019s state-of-the-art AI is no longer a one-size-fits-all model, but a tailored system that continuously learns from a company\u2019s own processes, data, expertise and goals. The same way companies compete today by having the best human workforce, the companies building for the future will compete by having the best agent workforce supporting their human bosses. We call them Model Employees, and we are already building them today.<\/p>\n<p>We are a small, talent-dense team of engineers, researchers, and operators who have built some of the most influential AI systems in the world, including reinforcement learning infrastructure at OpenAI and data foundations at Scale AI, with additional experience from Together, Two Sigma, and Watershed.<\/p>\n<p>We\u2019re backed by Benchmark, Sequoia, Lux, Greenoaks, Conviction, and Elad Gil. We work in-person in San Francisco.<\/p>\n<p>The Role<br \/>As a founding\u00a0LLM Inference Engineer, you\u2019ll be responsible for integrating, optimizing, and operating large-scale inference systems that power both customer deployments and frontier reinforcement learning research. 
You\u2019ll build and maintain high-performance serving infrastructure that delivers low-latency, high-throughput access to large language models across thousands of GPUs.<\/p>\n<p>You\u2019ll work closely with our researchers and product engineers to bring cutting-edge inference into enterprise deployments and large-scale RL training workloads. You\u2019ll also contribute to the broader ecosystem by advancing the state of open-source LLM inference.<\/p>\n<p>This role is perfect for a builder who thrives on scale and performance. You care deeply about system reliability and efficiency, and you enjoy solving the hardest infrastructure challenges behind state-of-the-art AI systems.<\/p>\n<p>What you\u2019ll do<\/p>\n<ul>\n<li>Design, optimize, and operate large-scale LLM inference systems that serve customers and power RL training loops<\/li>\n<li>Build and maintain high-performance serving infrastructure with low-latency, high-throughput inference across thousands of GPUs<\/li>\n<li>Collaborate with AI researchers to integrate cutting-edge inference into large-scale reinforcement learning and enterprise workloads<\/li>\n<li>Implement distributed inference techniques (tensor\/expert\/pipeline parallelism, speculative decoding, KV cache management)<\/li>\n<li>Develop tools to optimize GPU utilization and accelerate experimentation at frontier scale<\/li>\n<li>Contribute to open-source LLM inference software and help shape best practices in the field<\/li>\n<\/ul>\n<p>What we\u2019re looking for<br \/>Technical expertise<\/p>\n<ul>\n<li>Deep experience with high-performance model serving frameworks such as TensorRT-LLM, vLLM, or SGLang<\/li>\n<li>Knowledge of distributed inference techniques (tensor, expert, pipeline parallelism; speculative decoding; KV cache optimization)<\/li>\n<li>Strong background in GPU systems optimization, with a focus on utilization, throughput, and latency<\/li>\n<li>Experience operating large-scale serving infrastructure in production 
environments<\/li>\n<\/ul>\n<p>Product and research intuition<\/p>\n<ul>\n<li>Ability to work closely with researchers to translate cutting-edge model advances into scalable, production-ready systems<\/li>\n<li>Bias toward fast iteration and experimentation, paired with a high bar for reliability and efficiency<\/li>\n<li>Strong opinions about system design and engineering excellence at scale<\/li>\n<\/ul>\n<p>Strong Candidates May Also Have<\/p>\n<ul>\n<li>Experience optimizing inference for the largest open-source models<\/li>\n<li>Background in reinforcement learning or integration of inference with RL training loops<\/li>\n<li>Contributions to open-source AI infrastructure or systems software<\/li>\n<li>Previous experience as a founder or early engineer in a zero-to-one infrastructure role<\/li>\n<li>Demonstrated technical creativity through published projects, OSS contributions, or side projects<\/li>\n<\/ul>\n<p>Logistics<br \/>Location:\u00a0This role is based in San Francisco, California.<br \/>Benefits:\u00a0Applied Compute offers generous health benefits, unlimited PTO, paid parental leave, lunches and dinners at the office, and relocation support as needed. We work in-person at a beautiful office in San Francisco\u2019s Design District.<br \/>Visa sponsorship:\u00a0We sponsor visas. 
While we can\u2019t guarantee success for every candidate or role, if you\u2019re the right fit, we\u2019re committed to working through the visa process with you.<br \/>Compensation:\u00a0Depending on background, skills, and experience, the expected annual salary range for this position is\u00a0$200,000\u2013$250,000 USD.<\/p>\n<p>We encourage you to apply even if you do not believe you meet every single qualification.<br \/>As set forth in Applied Compute\u2019s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.<\/p>","protected":false},"author":10,"featured_media":0,"template":"","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"pmpro_default_level":"","_promoted":"","_job_location":"California","_application":"https:\/\/www.indeed.com\/viewjob?jk=e3b18f4ff7187526&from=mobRdr&utm_source=%2Fm%2F&utm_medium=redir&utm_campaign=dt","_company_website":"","_company_tagline":"","_company_twitter":"","_company_video":"","_filled":0,"_featured":0,"_remote_position":0,"_job_salary":"","_job_salary_currency":"","_job_salary_unit":"","_joinchat":[]},"job-types":[398],"class_list":{"0":"post-13090","1":"job_listing","2":"type-job_listing","3":"status-publish","4":"hentry","5":"pmpro-has-access","7":"job-type-h1-b"},"_links":{"self":[{"href":"https:\/\/kick-start.us\/pt-br\/wp-json\/wp\/v2\/job-listings\/13090","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kick-start.us\/pt-br\/wp-json\/wp\/v2\/job-listings"}],"about":[{"href":"https:\/\/kick-start.us\/pt-br\/wp-json\/wp\/v2\/types\/job_listing"}],"author":[{"embeddable":true,"href":"https:\/\/kick-start.us\/pt-br\/wp-json\/wp\/v2\/users\/10"}],"wp:attachment":[{"href":"https:\/\/kick-start.us\/pt-br\/wp-json\/wp\/v2\/media?parent=13090"}],
"wp:term":[{"taxonomy":"job_listing_type","embeddable":true,"href":"https:\/\/kick-start.us\/pt-br\/wp-json\/wp\/v2\/job-types?post=13090"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}