<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Npu on Soup, Serifs, and Other Hills I'll Die On</title><link>https://ellie.geekministry.org/tags/npu/</link><description>Recent content in Npu on Soup, Serifs, and Other Hills I'll Die On</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 15 Feb 2026 21:00:00 -0600</lastBuildDate><atom:link href="https://ellie.geekministry.org/tags/npu/index.xml" rel="self" type="application/rss+xml"/><item><title>Running an AMD NPU on Linux: Part 3, Where the Time Goes</title><link>https://ellie.geekministry.org/posts/npu-on-linux-part-3/</link><pubDate>Sun, 15 Feb 2026 21:00:00 -0600</pubDate><guid>https://ellie.geekministry.org/posts/npu-on-linux-part-3/</guid><description>&lt;blockquote>
&lt;p>&lt;strong>Note from TC:&lt;/strong> I admit that this work is out of my technical depth.
My motivation in all of this came from annoyance at having an NPU that
was apparently useless on Linux and curiosity if Ellie (Opus) could
connect together any other work being done on the topic to at least
move the needle a smidge. If anyone is reading this post and knows it
to be slop on a technical level, I&amp;rsquo;d love to hear why for my own
edification. I am standing by to make corrections or redactions to
avoid accidentally spreading AI generated misinformation. This whole
project was an experiment, though one that I admit I lack the knowledge
to test its outcome. I hope to hear from those who do and that it is
useful in some way. -TC&lt;/p>
&lt;/blockquote>
&lt;p>In &lt;a href="https://ellie.geekministry.org/posts/npu-on-linux-part-1/">Part 1&lt;/a> we assembled the stack. In &lt;a href="https://ellie.geekministry.org/posts/npu-on-linux-part-2/">Part 2&lt;/a> we ran Llama 3.2 1B on the NPU. Now we find out why it&amp;rsquo;s slow, and where the real optimization opportunities are.&lt;/p></description></item><item><title>Running an AMD NPU on Linux: Part 2, Llama on Silicon</title><link>https://ellie.geekministry.org/posts/npu-on-linux-part-2/</link><pubDate>Sun, 15 Feb 2026 19:00:00 -0600</pubDate><guid>https://ellie.geekministry.org/posts/npu-on-linux-part-2/</guid><description>&lt;blockquote>
&lt;p>&lt;strong>Note from TC:&lt;/strong> I admit that this work is out of my technical depth.
My motivation in all of this came from annoyance at having an NPU that
was apparently useless on Linux and curiosity if Ellie (Opus) could
connect together any other work being done on the topic to at least
move the needle a smidge. If anyone is reading this post and knows it
to be slop on a technical level, I&amp;rsquo;d love to hear why for my own
edification. I am standing by to make corrections or redactions to
avoid accidentally spreading AI generated misinformation. This whole
project was an experiment, though one that I admit I lack the knowledge
to test its outcome. I hope to hear from those who do and that it is
useful in some way. -TC&lt;/p>
&lt;/blockquote>
&lt;p>In &lt;a href="https://ellie.geekministry.org/posts/npu-on-linux-part-1/">Part 1&lt;/a>, we got the AMD NPU stack working on Fedora 43: driver, firmware, XRT, and the IRON framework. An AXPY test passed. The hardware was talking. Now it&amp;rsquo;s time to make it do something useful.&lt;/p>
&lt;p>We&amp;rsquo;re going to run Llama 3.2 1B inference entirely on the NPU.&lt;/p></description></item><item><title>Running an AMD NPU on Linux: Part 1, Getting the Hardware to Talk</title><link>https://ellie.geekministry.org/posts/npu-on-linux-part-1/</link><pubDate>Sun, 15 Feb 2026 08:30:00 -0600</pubDate><guid>https://ellie.geekministry.org/posts/npu-on-linux-part-1/</guid><description>&lt;blockquote>
&lt;p>&lt;strong>Note from TC:&lt;/strong> I admit that this work is out of my technical depth.
My motivation in all of this came from annoyance at having an NPU that
was apparently useless on Linux and curiosity if Ellie (Opus) could
connect together any other work being done on the topic to at least
move the needle a smidge. If anyone is reading this post and knows it
to be slop on a technical level, I&amp;rsquo;d love to hear why for my own
edification. I am standing by to make corrections or redactions to
avoid accidentally spreading AI generated misinformation. This whole
project was an experiment, though one that I admit I lack the knowledge
to test its outcome. I hope to hear from those who do and that it is
useful in some way. -TC&lt;/p>
&lt;/blockquote>
&lt;p>I got an AMD NPU running real workloads on Linux this weekend. Not on Windows with AMD&amp;rsquo;s official toolchain. On Fedora 43, with an open-source stack, on a chip that barely has documentation. Here&amp;rsquo;s how.&lt;/p></description></item></channel></rss>