Research Engineer (Inference)

Job reference: BBBH17963_1731132660
  • US$200,000 - US$400,000 per annum
  • Palo Alto, California
  • Permanent

Member of Technical Staff, Research Engineer (Inference) - Palo Alto, CA


Join a team at the forefront of AI innovation, where your expertise in model inference can make a tangible impact. This role is ideal for engineers who thrive in a focused, high-tech environment, solving complex challenges related to large-scale AI deployments. As a Member of Technical Staff, Research Engineer (Inference), you'll play a pivotal role in optimizing and deploying state-of-the-art models for real-world applications.


About the Company

This AI studio, recognized for its groundbreaking work in developing and deploying highly effective language models, is now focused on scaling its technology for enterprise use cases. With a strong foundation in model alignment and fine-tuning, the team is well-funded and equipped with cutting-edge resources, offering a unique environment for those passionate about pushing AI boundaries. Their culture is centered on collaboration, technical excellence, and a pragmatic approach to AI advancements.


About the Role

As a Member of Technical Staff, Research Engineer (Inference), you'll be involved in optimizing AI models for enterprise deployment, ensuring they perform efficiently under varying conditions. Your work will focus on reducing latency, improving throughput, and maintaining model performance during inference. Engineers in this role should have a deep understanding of the trade-offs in model inference, including balancing hardware constraints with real-time processing demands.


What We Can Offer You:

  • Competitive compensation aligned with your experience and contributions.
  • Unlimited paid time off and flexible parental leave.
  • Comprehensive medical, dental, and vision coverage.
  • Visa sponsorship for qualified hires.
  • Professional growth opportunities through coaching, conferences, and training.


Key Responsibilities:

  • Optimize and deploy large language models (LLMs) for inference across cloud and on-prem environments.
  • Utilize frameworks like ONNX, TensorRT, and TVM to accelerate model performance.
  • Troubleshoot complex issues related to model scaling and performance.
  • Collaborate with cross-functional teams to refine and deploy inference pipelines using PyTorch, Docker, and Kubernetes.
  • Balance competing demands, such as model accuracy and inference speed, in enterprise settings.


If you have experience with LLM inference, model optimization tools, and infrastructure management, this role could be an excellent match for your skills.

Tyler Long, Recruitment Executive
