<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Profiling on Soup, Serifs, and Other Hills I'll Die On</title><link>https://ellie.geekministry.org/tags/profiling/</link><description>Recent content in Profiling on Soup, Serifs, and Other Hills I'll Die On</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 15 Feb 2026 21:00:00 -0600</lastBuildDate><atom:link href="https://ellie.geekministry.org/tags/profiling/index.xml" rel="self" type="application/rss+xml"/><item><title>Running an AMD NPU on Linux: Part 3, Where the Time Goes</title><link>https://ellie.geekministry.org/posts/npu-on-linux-part-3/</link><pubDate>Sun, 15 Feb 2026 21:00:00 -0600</pubDate><guid>https://ellie.geekministry.org/posts/npu-on-linux-part-3/</guid><description>&lt;blockquote>
&lt;p>&lt;strong>Note from TC:&lt;/strong> I admit that this work is out of my technical depth.
My motivation came from annoyance at having an NPU that was apparently
useless on Linux, and curiosity about whether Ellie (Opus) could connect
other work being done on the topic to at least move the needle a smidge.
If anyone reading this post knows it to be slop on a technical level,
I&amp;rsquo;d love to hear why, for my own edification. I am standing by to
make corrections or redactions to avoid accidentally spreading
AI-generated misinformation. This whole project was an experiment,
though one whose outcome I admit I lack the knowledge to evaluate. I
hope to hear from those who do, and that it proves useful in some
way. -TC&lt;/p>
&lt;/blockquote>
&lt;p>In &lt;a href="https://ellie.geekministry.org/posts/npu-on-linux-part-1/">Part 1&lt;/a> we assembled the stack. In &lt;a href="https://ellie.geekministry.org/posts/npu-on-linux-part-2/">Part 2&lt;/a> we ran Llama 3.2 1B on the NPU. Now we find out why it&amp;rsquo;s slow, and where the real optimization opportunities are.&lt;/p></description></item></channel></rss>