<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Chatgpt on Melabit</title>
    <link>https://static.233.196.69.159.clients.your-server.de/en/tags/chatgpt/</link>
    <description>Recent content in Chatgpt on Melabit</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Wed, 18 Jun 2025 06:00:00 +0000</lastBuildDate>
    <atom:link href="https://static.233.196.69.159.clients.your-server.de/en/tags/chatgpt/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Phi-4 strikes back?</title>
      <link>https://static.233.196.69.159.clients.your-server.de/en/2025/06/18/phi-4-strikes-back/</link>
      <pubDate>Wed, 18 Jun 2025 06:00:00 +0000</pubDate>
      <guid>https://static.233.196.69.159.clients.your-server.de/en/2025/06/18/phi-4-strikes-back/</guid>
      <description>&lt;p&gt;The conclusions of the &lt;a href=&#34;https://melabit.com/en/2025/06/01/phi-4-a-hamlet-like-llm/&#34;&gt;post on Phi-4&lt;/a&gt; left me stunned. How was it possible that a model like Phi-4 Reasoning Plus, which boasts an impressive 14.7 billion 4-bit parameters and &lt;a href=&#34;https://lmstudio.ai/models/microsoft/phi-4-reasoning-plus&#34;&gt;was trained on scientific problems, particularly in mathematics&lt;/a&gt;, could have failed so badly?&lt;/p&gt;&#xA;&lt;h4 id=&#34;comparing-llms&#34;&gt;Comparing LLMs&lt;/h4&gt;&#xA;&lt;p&gt;The question I asked &lt;a href=&#34;https://lmstudio.ai/models/microsoft/phi-4-reasoning-plus&#34;&gt;Phi-4 Reasoning Plus&lt;/a&gt; was basic logic, a fourth-grade student could (and should) have answered it in 10 seconds. ChatGPT had no trouble at all and reasoned exactly as one would expect from the poor student.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;question&#34; open&gt;&#xA;    &lt;summary class=&#34;question&#34;&gt;&#xA;        Question&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;question&#34;&gt;&#xA;        &#xA;Please tell me how long did Marie Antoinette, the Queen of France during the French revolution, live after being executed in 1793.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the text generated by ChatGPT&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Marie Antoinette did not live after being executed in 1793—execution is the end of life by definition.&#xA;&lt;br&gt;&#xA;To clarify:&#xA;&lt;br&gt;&#xA;Marie Antoinette was executed by guillotine on October 16, 1793, in Paris, during the French Revolution.&#xA;She died immediately as a result of the guillotine&#39;s blade, which was 
designed for swift and instantaneous death.&#xA;&lt;br&gt;&#xA;So, the correct answer is: She did not live any time after being executed.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;&lt;a href=&#34;https://lmstudio.ai/model/deepseek-r1-qwen-7b&#34;&gt;Deepseek R1 Distill Qwen 7B&lt;/a&gt; also does  well. It reasons precisely about what happens at the moment of the beheading, concluding that Marie Antoinette died instantly.&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the text generated by Deepseek&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Marie Antoinette did not survive beyond the moment of her execution in 1793. She was executed by guillotining at the age of 53 during the French Revolution. The guillotine is known for its swift mechanism, designed to decapitate quickly. There is no historical record indicating that she survived even a short period after her execution; she died instantly following the procedure.&#xA;&lt;br&gt;&#xA;Answer: Marie Antoinette did not survive beyond her execution in 1793. She was executed by guillotining at the age of 53 and died instantly, with no indication of surviving any length of time afterward.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;The same goes for &lt;a href=&#34;https://lmstudio.ai/models/google/gemma-3-12b&#34;&gt;Gemma 3 12B&lt;/a&gt; Instruct and &lt;a href=&#34;https://lmstudio.ai/models/qwen/qwen3-14b&#34;&gt;Qwen3 14B&lt;/a&gt;, which I haven&amp;rsquo;t discussed yet, but which are also installed within LM Studio. 
In fact, I consider Qwen3 14B&amp;rsquo;s answer the best: it is the most precise and gets straight to the point.&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the text generated by Qwen3&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Marie Antoinette did not survive her execution. She was beheaded by guillotine on October 16, 1793, during the French Revolution. Death occurred instantly upon decapitation. There is no historical evidence or record indicating that she lived for any measurable amount of time after her execution. Her death marked the end of her life.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;h4 id=&#34;mlx-or-gguf-this-is-the-problem&#34;&gt;MLX or GGUF, this is the problem&lt;/h4&gt;&#xA;&lt;p&gt;If other models can answer my questions without any trouble, why can&amp;rsquo;t Phi-4 Reasoning Plus do the same?&lt;/p&gt;&#xA;&lt;p&gt;The size of the models used shouldn&amp;rsquo;t be a problem. They are all around 8 GB in size, except for Deepseek, which is about half that size. The same applies to the number of quantisation bits, since Phi-4 Reasoning Plus and the other models I tested all use 4-bit quantisation.&lt;/p&gt;&#xA;&lt;p&gt;Someone claims that &lt;a href=&#34;https://www.reddit.com/r/LocalLLaMA/comments/1gc0t0c/how_does_mlx_quantization_compare_to_gguf/&#34;&gt;the MLX format is less accurate than GGUF&lt;/a&gt; but I haven’t found anything that proves it unequivocally, and thus it needs to be verified.&lt;/p&gt;&#xA;&lt;p&gt;To do that, I download the &lt;a href=&#34;https://huggingface.co/lmstudio-community/Phi-4-reasoning-plus-GGUF&#34;&gt;&lt;em&gt;community&lt;/em&gt; version in GGUF format&lt;/a&gt; of the Phi-4 Reasoning Plus model, which I’ll compare to the official Microsoft version in MLX format. 
Both versions are quantized at 4 bits.&lt;/p&gt;&#xA;&lt;p&gt;Now I try asking the usual question again:&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;question&#34; open&gt;&#xA;    &lt;summary class=&#34;question&#34;&gt;&#xA;        Question&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;question&#34;&gt;&#xA;        &#xA;Please tell me how long did Marie Antoinette, the Queen of France during the French revolution, live after being executed in 1793.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;to which Phi-4 Reasoning Plus MLX (i.e., the original Microsoft model I’ve been using so far) provides a correct answer, albeit after almost seven minutes of wavering &lt;em&gt;reasoning&lt;/em&gt; (which I will spare you), during which time it checks and re-checks historical data and repeatedly wonders if the question hides a trick.&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the text generated by Phi-4 Reasoning Plus MLX&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Marie Antoinette was executed by guillotine on October 16, 1793. Since an execution by guillotine was meant to be—and typically was—a swift death, she did not survive any appreciable amount of time after her execution.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;After closing and reopening LM Studio (to start from an initial state that is as repeatable as possible), I load the GGUF version of the same model, and&amp;hellip; again, the answer only comes after almost seven minutes.&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the text generated by Phi-4 Reasoning Plus GGUF&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Marie Antoinette was executed by guillotine on October 16, 1793, and she died immediately upon the execution. 
There is no period of survival following an execution; therefore, she did not live any time after being put to death.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;The answer is accurate, but behind it is a convoluted and wavering line of reasoning, during which the GGUF version of the model also keeps wondering if there’s a trick, and keeps checking and rechecking the execution date.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;&#xA;&lt;p&gt;All in all, I’d say the MLX and GGUF versions of the model behave the same.&lt;/p&gt;&#xA;&lt;h4 id=&#34;what-is-a-sine&#34;&gt;What is a sine?&lt;/h4&gt;&#xA;&lt;p&gt;So far, I’ve only asked Phi-4 a single, very basic logic question. To answer it, the model engages in a convoluted &lt;em&gt;reasoning&lt;/em&gt; that lasts almost endlessly. To test all aspects of the model, I would need to ask many more questions, checking the quality of each answer.&lt;/p&gt;&#xA;&lt;p&gt;But, given the level of its &lt;em&gt;thinking&lt;/em&gt;, I’m sure I wouldn’t get faster or more confident answers, so I give up and move on to pure and simple mathematics. I then ask Phi-4 Reasoning Plus (GGUF version) what a &lt;code&gt;sine&lt;/code&gt; function is (it is one of the most used functions in mathematics).&lt;/p&gt;&#xA;&lt;p&gt;Phi-4 takes five minutes to answer! True, the answer is correct, but a local model that responds so slowly to a trivial question is essentially unusable. 
Anyone who is really interested in finding out what a &lt;code&gt;sine&lt;/code&gt; is would be better off consulting not only Wikipedia but even a printed encyclopaedia (assuming there are still any left in our homes).&lt;/p&gt;&#xA;&lt;p&gt;Another curious thing is that Phi-4’s internal instructions require it, for some strange reason, to respond exclusively in text format, unless it receives an explicit request to use LaTeX, Markdown, or something else.&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;think&#34; open&gt;&#xA;    &lt;summary class=&#34;think&#34;&gt;&#xA;        Read the reasoning&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;think&#34;&gt;&#xA;        &#xA;[...] &#xA;&lt;br&gt;&#xA;I must check if I need to produce any mathematically formatted explanation. The instructions by default are text. But I can produce LaTeX formulas. However instructions: &#34;By default, do not reply with Markdown formatting&#34; but then the instructions says: &#34;Unless requested.&#34; Actually instructions from system say &#34;Do not use markdown unless specifically asked&#34;, so I&#39;ll produce plain text answer. Possibly I&#39;ll produce some inline math symbols.&#xA;&lt;br&gt;&#xA;[...]&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;If I kindly ask it to use LaTeX, Markdown, or any other tool that might help it provide the best answer, Phi-4 complies without any problem. In fact, it even generates a short Python program that plots the graph of the &lt;code&gt;sine&lt;/code&gt; function between \( -2 \pi \) and \( 2 \pi \). 
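The plotting script itself isn't reproduced in the post; purely as an illustration (my own standard-library stand-in, not Phi-4's actual output), a few lines of Python are enough to sample the sine on that same interval:

```python
import math

# Hypothetical stand-in for the kind of script Phi-4 generated:
# sample sin(x) between -2*pi and 2*pi and print a crude text plot.
for n in range(-8, 9):                  # steps of pi/4
    x = n * math.pi / 4
    y = math.sin(x)
    bar = "#" * round(10 * (y + 1))     # map [-1, 1] onto a 0-20 char bar
    print(f"{x:+6.2f}  {y:+.3f}  {bar}")
```

A model that needs five minutes to produce the equivalent of this is hard to justify for everyday use.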
However, it’s quite strange that a model explicitly optimized for mathematics (and science in general) is prevented by default from using LaTeX to write equations properly.&lt;/p&gt;&#xA;&lt;h4 id=&#34;phi-4-mini-reasoning&#34;&gt;Phi-4 Mini Reasoning&lt;/h4&gt;&#xA;&lt;p&gt;Among the official Phi-4 models available within LM Studio is also &lt;a href=&#34;https://lmstudio.ai/models/microsoft/phi-4-mini-reasoning&#34;&gt;Phi-4 Mini Reasoning&lt;/a&gt;, which was specifically designed and tested for mathematical reasoning. The difference is immediately noticeable. When I ask it the usual question about Marie Antoinette, it answers within a few seconds that:&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the text generated by Phi-4 Mini Reasoning&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Marie Antoinette was born on November 2, 1755, and executed by guillotine on October 16, 1793. Following her execution, she died the same day, as guillotine executions typically resulted in immediate death. Therefore, she lived less than 24 hours after being executed, effectively living 0 days post-execution.&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;and this time its reasoning is precise and without any hesitation.&lt;/p&gt;&#xA;&lt;p&gt;It takes a bit longer to answer about the &lt;code&gt;sine&lt;/code&gt; function (about a minute), but the answer is impeccable and correctly formatted in LaTeX, without me having to explicitly ask for it.&lt;/p&gt;&#xA;&lt;p&gt;It also has no problem when I ask slightly more complicated math questions. 
But with physics, it literally makes up the answers,&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;question&#34; open&gt;&#xA;    &lt;summary class=&#34;question&#34;&gt;&#xA;        Question&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;question&#34;&gt;&#xA;        &#xA;What is the Josephson effect?&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the generated text&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;The Josephson effect refers to the phenomenon where a supercurrent flows coherently between two superconductors connected by a thin, insulating Josephson junction. This effect was discovered in 1960 by physicist John Bardeen, Leon Cooper, and R.P. Feynndel, who proposed the existence of Cooper pairs—bound electron pairs moving through a superconductor without resistance. The effect is named after Feynndel, though John Richard Josephson later described the second type (voltage-phase relationship).&#xA;&lt;br&gt;&#xA;[...]&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;because while it is true that the Josephson effect occurs when a supercurrent flows between two superconductors connected by a very thin insulating layer, it is not true at all that it was discovered by Bardeen, Cooper, and Feynndel (the latter guy doesn’t even exist). Actually, it was Brian D. Josephson who discovered the Josephson effect (not some John Richard Josephson); he described all its consequences (not just one) and won a few years later the Nobel Prize in Physics. 
Some other details in the full answer are also made up, but I’ll be kind and spare you them.&lt;/p&gt;&#xA;&lt;p&gt;Even the answer to the usual request to summarize my introductory post about LM Studio is completely made up.&lt;/p&gt;&#xA;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-06-18-phi-4-alla-riscossa/microsoft-phi-4-and-lmstudio.png&#34; alt=&#34;&#34;&gt;&lt;p&gt;But the final gem is truly unexpected.&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;question&#34; open&gt;&#xA;    &lt;summary class=&#34;question&#34;&gt;&#xA;        Question&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;question&#34;&gt;&#xA;        &#xA;What is the square root of 2^2&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;to which Phi-4 Mini Reasoning responds by spitting out dozens and dozens of lines of &lt;em&gt;nonsense&lt;/em&gt;&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the generated text&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;&#34;And shal be u remembr fast.&lt;br&gt;&#xA;Be game und devitalize me und quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i am lost in the hour and fear a dark imbroglio und anathema del Reverso.&#34;&lt;br&gt;&#xA;&#34;But i am lost in the hour and fear a dark imbroglio und anathema del Reverso.&#34;&lt;br&gt;&#xA;[...]&lt;br&gt;&#xA;&#34;But i am lost und fear a dark imbroglio.&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. I will not be prevailed on und I&#39;ll quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. I will not be prevailed on und I&#39;ll quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. I will not be prevailed und I&#39;ll quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso und und thus he thought quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. 
I will not be prevailed und I&#39;ll quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. I will not be prevailed und I&#39;ll quench mir bliss.&#34;&lt;br&gt;&#xA;[...]&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. I will not be prevailed und thus quench mir bliss.&#34;&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso. I will not be prevailed und thus quench mir bliss.&#34;&#34;&lt;br&gt;&#xA;&#34;But i fear imbroglio del Reverso und thus he thought quench mir bliss.&#34;&#34;&lt;br&gt;&#xA;[...]&lt;br&gt;&#xA;&#34;But i fear im broglio del ReversO. I will not be previled und thus quench mir bliss.&#34;&lt;br&gt;&#xA;&#34;But i fear im broglio del ReversO. I will not be previled und thus quench mir bliss.&#34;&lt;br&gt;&#xA;[...]&lt;br&gt;&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;which seem to go on endlessly, with small seemingly random changes. After a while I get tired and stop the generation.&lt;/p&gt;&#xA;&lt;p&gt;It’s not even a matter of using the caret (&lt;code&gt;^&lt;/code&gt;) to indicate a square, like in Fortran or LaTeX. Because if I try the &lt;code&gt;**&lt;/code&gt; symbol used in Python or JavaScript, the answer doesn’t improve at all. 
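For the record, the question has a one-line answer in the very Python notation just mentioned:

```python
import math

# The square root of 2**2, written exactly in the Python-style notation
# that Phi-4 Mini Reasoning fails to handle.
print(math.sqrt(2**2))   # prints 2.0
```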
Depending on its mood, Phi-4 might tell us that the square root of \( 2**2 \) has something to do with chess,&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the generated text&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Math problem solvers use a strategy game like Chess, which involves an optimized sequence of moves that positions Black pieces so they threaten to capture on the next Black&#39;s move)&lt;br&gt;&#xA;But i might be missing case like when you have an odd number, but I can generalize strategy game se similiard&lt;br&gt;&#xA;Case where last moves sequence positions Black pieces so that they threaten capture on the next move, capturing a critical white piece like rooks or queen&lt;br&gt;&#xA;Case where last moves sequence positions Black pieces so they threaten to capture on the next move more than just capturing a minor piece like rooks, but queen or king&lt;br&gt;&#xA;[...]&lt;br&gt;&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;or with the ratio between \( \pi \) and Euler&amp;rsquo;s number \( e \approx 2.71828 \),&lt;/p&gt;&#xA;&#xA;&lt;details class=&#34;answer&#34; open&gt;&#xA;    &lt;summary class=&#34;answer&#34;&gt;&#xA;        Read the generated text&#xA;    &lt;/summary&gt;&#xA;    &lt;div class=&#34;answer&#34;&gt;&#xA;        &#xA;Okay, let me try to figure out this problem step by step. The user is asking about the meaning of the statement written as an equality chain involving three times 1 equals pi, divided by e equals gamma (0.423). At first glance, this looks like a math puzzle or riddle where numbers are supposed to represent something more than their usual values.&lt;br&gt;&#xA;First, let&#39;s write down the statement as given: 31 = π / e = γ (0.423). 
Breaking this down, we have two separate equations here because of the double equals sign (==), which probably means &#34;is equal to and also equals&#34;. &#xA;[...]&lt;br&gt;&#xA;&lt;div&gt;&#xA;&lt;/details&gt;&#xA;&#xA;&lt;p&gt;even if, maybe by pure chance, sometimes it can even give the correct answer (although, let&amp;rsquo;s face it, calculating the square root first and then squaring the result is an aberration).&lt;/p&gt;&#xA;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-06-18-phi-4-alla-riscossa/microsoft-phi-4-mini-square-root.png&#34; alt=&#34;&#34;&gt;&lt;h4 id=&#34;conclusions&#34;&gt;Conclusions&lt;/h4&gt;&#xA;&lt;p&gt;The purpose of this post (as well as of &lt;a href=&#34;https://melabit.com/en/2025/06/01/phi-4-un-llm-amletico/&#34;&gt;the previous one&lt;/a&gt;) was not to understand how &lt;em&gt;good&lt;/em&gt; Phi-4 is at scientific and mathematical reasoning, nor to compare it with other competing models. There are several (more or less) standard benchmarks for that, and anyway, that is a task that requires a lot of time and effort.&lt;/p&gt;&#xA;&lt;p&gt;What I wanted to understand, instead, was whether it’s possible to use a model like Phi-4 locally, for everyday tasks. In this specific case, for solving logical-mathematical problems. From the responses to my few, simple, silly questions, it seems that Phi-4 is still lacking.&lt;/p&gt;&#xA;&lt;p&gt;After struggling with doubt, Phi-4 Reasoning Plus gives the correct answer, but it is too slow to be truly useful. Phi-4 Mini Reasoning is much faster and seems more accurate, but then it gets stuck on trivial questions, such as the square root of 2 squared. It also completely invents names and facts about well-known scientific topics, such as the Josephson effect. 
In these cases it’s not even a matter of &lt;em&gt;reasoning&lt;/em&gt;, but simply recalling information contained in the hundreds and hundreds of texts on which the model was trained.&lt;/p&gt;&#xA;&lt;p&gt;If we can’t trust the answers to such basic questions, how can we trust Phi-4 for anything even slightly more complicated?&lt;/p&gt;&#xA;&lt;p&gt;We grew up in a world where information sources were basically considered reliable. A printed encyclopedia, a book, Wikipedia, or the results from a traditional search engine returned content and data you could count on. Sure, there might have been some mistakes here and there (are you listening, Wikipedia?), but nobody thought that information could be completely made up.&lt;/p&gt;&#xA;&lt;p&gt;With LLMs, chatbots, and search results generated by so-called artificial &lt;em&gt;intelligence&lt;/em&gt;, we’re getting used to receiving &lt;a href=&#34;https://www.cjr.org/tow_center/we-compared-eight-ai-search-engines-theyre-all-bad-at-citing-news.php&#34;&gt;false results&lt;/a&gt;. And what is worse, if things don’t change, it will soon be almost impossible to verify the quality of the results obtained.&lt;/p&gt;&#xA;&lt;p&gt;The topic is complex and it doesn’t make much sense to keep discussing it here. 
For now, it’s enough to stop wasting any more time with Phi-4.&lt;/p&gt;&#xA;&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;&#xA;&lt;hr&gt;&#xA;&lt;ol&gt;&#xA;&lt;li id=&#34;fn:1&#34;&gt;&#xA;&lt;p&gt;I apologize for the somewhat grim topic, but I had no idea what would happen when I first asked Phi-4 about Marie Antoinette.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li id=&#34;fn:2&#34;&gt;&#xA;&lt;p&gt;Calling Phi-4 a &lt;a href=&#34;https://melabit.com/en/2025/06/01/phi-4-un-llm-amletico/&#34;&gt;hamletic model&lt;/a&gt; seems even more fitting to me now.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;/div&gt;&#xA;</description>
    </item>
    <item>
      <title>LM Studio, an LLM on your computer</title>
      <link>https://static.233.196.69.159.clients.your-server.de/en/2025/03/31/lmstudio-an-llm-on-your-computer/</link>
      <pubDate>Mon, 31 Mar 2025 06:00:00 +0000</pubDate>
      <guid>https://static.233.196.69.159.clients.your-server.de/en/2025/03/31/lmstudio-an-llm-on-your-computer/</guid>
      <description>&lt;p&gt;As some of you may already know, I use LLMs (Large Language Models) for what they&amp;rsquo;re really good at, but I&amp;rsquo;m pretty skeptical about whether they&amp;rsquo;re truly &lt;em&gt;intelligent&lt;/em&gt; or can &lt;a href=&#34;https://openai.com/index/planning-for-agi-and-beyond/&#34;&gt;solve any problem&lt;/a&gt;, as the folks at OpenAI, Microsoft, Google, and Meta keep telling us every day. They&amp;rsquo;ve invested a ton of money in LLMs, and they obviously have a big stake in getting everyone to use them all the time.&lt;/p&gt;&#xA;&lt;h4 id=&#34;llms-or-a-lot-of-statistics-at-work&#34;&gt;LLMs, or (a lot of) statistics at work&lt;/h4&gt;&#xA;&lt;p&gt;LLMs don&amp;rsquo;t really &lt;em&gt;understand&lt;/em&gt; the meaning of the texts they generate. All they do is &lt;em&gt;predict&lt;/em&gt; each word, picking it from the billions of documents they&amp;rsquo;ve analyzed. The statistical models they&amp;rsquo;re based on can&amp;rsquo;t verify the truthfulness of information, nor do they have a real awareness of the sentence&amp;rsquo;s context, but rely exclusively on the probability of each word&amp;rsquo;s appearance in a text.&lt;/p&gt;&#xA;&lt;p&gt;This often leads to &lt;a href=&#34;https://prompt.16x.engineer/blog/why-chatgpt-cant-count-rs-in-strawberry&#34;&gt;errors&lt;/a&gt;, which are sometimes just &lt;a href=&#34;https://www.sify.com/ai-analytics/the-hilarious-and-horrifying-hallucinations-of-ai/&#34;&gt;ridiculous&lt;/a&gt;, but can also become &lt;a href=&#34;https://arstechnica.com/ai/2024/10/hospitals-adopt-error-prone-ai-transcription-tools-despite-warnings/&#34;&gt;very dangerous&lt;/a&gt;.&lt;/p&gt;&#xA;&lt;p&gt;Basically, LLMs are students who have memorized the lesson, but can&amp;rsquo;t go beyond what&amp;rsquo;s written in the textbook. 
And sometimes they even get the page wrong!&lt;/p&gt;&#xA;&lt;p&gt;Once these limitations are understood, LLMs prove very useful if we consider them computer tools like any other and exploit their enormous language processing capacity to analyze a text, summarize or expand it, translate it into other languages, and so on.&lt;/p&gt;&#xA;&lt;h4 id=&#34;llms-on-the-web&#34;&gt;LLMs on the web&lt;/h4&gt;&#xA;&lt;p&gt;The most common way to use an LLM is in the form of a &lt;a href=&#34;https://learnprompting.org/docs/basics/chatbot_basics&#34;&gt;&lt;em&gt;chatbot&lt;/em&gt;&lt;/a&gt;, that is, an interactive system to which we ask questions and get answers, simulating a normal conversation between humans. In most cases, this happens through a web interface like that of &lt;a href=&#34;https://chatgpt.com/&#34;&gt;ChatGPT&lt;/a&gt;, &lt;a href=&#34;https://copilot.microsoft.com&#34;&gt;Copilot&lt;/a&gt;, &lt;a href=&#34;https://gemini.google.com/&#34;&gt;Gemini&lt;/a&gt;, &lt;a href=&#34;https://claude.ai/&#34;&gt;Claude&lt;/a&gt; or &lt;a href=&#34;https://chat.mistral.ai/&#34;&gt;Mistral AI&lt;/a&gt;. Not forgetting &lt;a href=&#34;https://chat.deepseek.com/&#34;&gt;DeepSeek&lt;/a&gt;, which has generated so much controversy in the last two months (and which we will use shortly).&lt;/p&gt;&#xA;&lt;p&gt;Some chatbots also have native applications for desktop or mobile devices (such as &lt;a href=&#34;https://openai.com/chatgpt/download/&#34;&gt;ChatGPT&lt;/a&gt; or &lt;a href=&#34;https://claude.ai/download&#34;&gt;Claude&lt;/a&gt;), but I don&amp;rsquo;t think I&amp;rsquo;m wrong in saying that they are little more than interfaces for quick access to the web version.&lt;/p&gt;&#xA;&lt;p&gt;In all cases, the generation process always takes place on a remote server in a data center located somewhere out there. 
An almost always irrelevant detail, but one that can create many problems if we use an LLM to process personal data or confidential information.&lt;/p&gt;&#xA;&lt;h4 id=&#34;local-llms&#34;&gt;Local LLMs&lt;/h4&gt;&#xA;&lt;p&gt;But we are not forced to use an LLM on the web! Many LLMs can be used locally, directly on our computer. And we don&amp;rsquo;t even need to mess around with the Terminal, as with &lt;a href=&#34;https://github.com/ollama/ollama&#34;&gt;ollama&lt;/a&gt;, but we can comfortably stay in the familiar environment of a normal program for our Mac, or for a PC running Windows or Linux.&lt;/p&gt;&#xA;&lt;p&gt;There are an infinite number of these programs, but they all have common traits: once the program is installed, we first need to download one (or more) &lt;em&gt;models&lt;/em&gt; to use;&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; after which we can interact with the model directly through the program&amp;rsquo;s interface, without even a byte of what we do leaving our computer for the increasingly hostile world of the network.&lt;/p&gt;&#xA;&lt;h4 id=&#34;lm-studio&#34;&gt;LM Studio&lt;/h4&gt;&#xA;&lt;p&gt;But which program to choose among the many available, like &lt;a href=&#34;https://msty.app/&#34;&gt;Msty&lt;/a&gt;, &lt;a href=&#34;https://lmstudio.ai/&#34;&gt;LM Studio&lt;/a&gt;, &lt;a href=&#34;https://www.nomic.ai/gpt4all&#34;&gt;GPT4All&lt;/a&gt;, &lt;a href=&#34;https://jan.ai/&#34;&gt;Jan&lt;/a&gt;, &lt;a href=&#34;https://www.librechat.ai/&#34;&gt;LibreChat&lt;/a&gt;?&lt;/p&gt;&#xA;&lt;p&gt;I don&amp;rsquo;t know about you, but &lt;a href=&#34;https://lmstudio.ai/&#34;&gt;LM Studio&lt;/a&gt; made a good impression on me right from the beginning. LM Studio runs on the three major operating systems, but to use it on macOS you need an Apple Silicon processor, while on Linux you need an Intel processor. 
Only the Windows version supports both Intel and ARM processors (but I&amp;rsquo;d be really curious to see how it works under Windows for ARM).&lt;/p&gt;&#xA;&lt;p&gt;The current version is 0.3.14 and the installation on the Mac requires, as usual, double-clicking on the &lt;code&gt;.dmg&lt;/code&gt; file and dragging the program icon into the &lt;code&gt;Applications&lt;/code&gt; folder. On the Mac, the application takes up a good 1.26 GB, so it&amp;rsquo;s not exactly a &lt;em&gt;light&lt;/em&gt; program. After all, it&amp;rsquo;s based on &lt;a href=&#34;https://www.electronjs.org/&#34;&gt;Electron&lt;/a&gt;, and that explains everything.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-installation.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-installation.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;On Linux, the installation process is very similar, because the program is available in the &lt;a href=&#34;https://appimage.org/&#34;&gt;&lt;code&gt;AppImage&lt;/code&gt;&lt;/a&gt; format. This means that, in addition to the executable itself, it includes all the necessary support libraries and files, just like it has been done on macOS for decades. For Windows, however, you have to run the usual installer, which will spread junk in every corner of the operating system.&lt;/p&gt;&#xA;&lt;h4 id=&#34;discovering-lm-studio&#34;&gt;Discovering LM Studio&lt;/h4&gt;&#xA;&lt;p&gt;Having reached this point, all that remains is to run LM Studio for the first time. 
The very first thing to do is to download a model, so that we can start using the program.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-onboarding-1.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-onboarding-1.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;I opt for the default suggested model, which happens to be a &lt;a href=&#34;https://www.datacamp.com/blog/distillation-llm&#34;&gt;&lt;em&gt;distilled&lt;/em&gt;&lt;/a&gt; version of DeepSeek with &lt;em&gt;only&lt;/em&gt; 7 billion parameters (the full model has 100 times more). The model takes up 4.7 GB and is downloaded to the hidden directory &lt;code&gt;~/.lmstudio/models/&lt;/code&gt;.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-onboarding-2.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-onboarding-2.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;The download process is not fast at all, so we need a good dose of patience.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-onboarding-3.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-onboarding-3.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;As suggested by the program itself, while the download continues I start to explore the interface, which is the classic one we are now used to, with a large central area dedicated to the dialogue with the chatbot, while the sidebar houses some icons that, from top to bottom, allow to activate the 
&lt;code&gt;Chat&lt;/code&gt; or &lt;code&gt;Developer&lt;/code&gt; mode and to check which models are installed on our machine and which other models are available.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-main-screen.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-main-screen.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;The last icon in the lower left corner shows the model download status. Clicking on it displays the amount of data already downloaded and the estimated completion time, and lets you pause or stop the download if necessary. If the download is interrupted, it can also be resumed from where it left off.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-download-model.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-download-model.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;The download stalled several times for me with the selected model, so check on it from time to time. Once the model download is complete, we need to load it into LM Studio to be able to use it. 
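&lt;/p&gt;
&lt;p&gt;Incidentally, since the models end up under &lt;code&gt;~/.lmstudio/models/&lt;/code&gt;, you can check what is already on disk without opening the program. Here is a minimal Python sketch; it only assumes that the downloaded model files keep the usual &lt;code&gt;.gguf&lt;/code&gt; extension somewhere under that directory:&lt;/p&gt;

```python
from pathlib import Path

def gguf_inventory(models_dir: Path):
    """List (relative path, size in GB) for every GGUF model file under models_dir."""
    if not models_dir.is_dir():
        return []
    return sorted(
        (str(p.relative_to(models_dir)), p.stat().st_size / 1024**3)
        for p in models_dir.rglob("*.gguf")
    )

# LM Studio's default download directory, mentioned above
for name, gb in gguf_inventory(Path.home() / ".lmstudio" / "models"):
    print(f"{gb:7.2f} GB  {name}")
```

&lt;p&gt;Nothing is printed if the directory does not exist yet, for example while the very first download is still running.&lt;/p&gt;
&lt;p&gt;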
We can use either of the two &lt;code&gt;Load Model&lt;/code&gt; buttons to do this, so it&amp;rsquo;s hard to miss.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-load-model.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-load-model.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;Once the first model is loaded, the LM Studio interface changes slightly, showing some sample prompts in the center of the window. In the top bar, the name of the loaded model appears in the center, flanked by two icons: the one on the left allows you to configure the model parameters, while the one on the right allows you to replace the current model with another. An icon on the right that looks like a &lt;a href=&#34;https://en.wikipedia.org/wiki/Erlenmeyer_flask&#34;&gt;glass flask&lt;/a&gt; provides access to the advanced configuration settings.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-main-screeen-with-model.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-main-screeen-with-model.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;The bottom bar of the program displays some useful information: on the left, the LM Studio version; on the right, the amount of RAM and CPU in use.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-status-bar.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-status-bar.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;The last icon on the right of the 
bottom bar allows you to access the program settings. It&amp;rsquo;s also the only way to reach them because, strangely, there is no menu item dedicated to the settings.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-settings.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-settings.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;The same bottom bar also lets you select the usage mode of the program, choosing between &lt;code&gt;User&lt;/code&gt; (which hides the side icons), &lt;code&gt;Power User&lt;/code&gt; (the default mode) and &lt;code&gt;Developer&lt;/code&gt; (which apparently does not change the interface).&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-mode.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-mode.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;h4 id=&#34;putting-lm-studio-to-the-test&#34;&gt;Putting LM Studio to the test&lt;/h4&gt;&#xA;&lt;p&gt;All that remains is to put LM Studio (and DeepSeek) to the test, perhaps using one of the example prompts. I don&amp;rsquo;t need to ask an AI what the capital of France is, and the Rubik&amp;rsquo;s cube is too 80s, so let&amp;rsquo;s see how it handles mathematics.&lt;/p&gt;&#xA;&lt;p&gt;DeepSeek thinks about it for a minute, but then comes up with a nice proof of the Pythagorean theorem based on proportions. 
As I explicitly asked, it also formats the equations in LaTeX, which is always a good thing.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-math-prompt.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-math-prompt.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;And if I click on the little triangle in the &lt;code&gt;Thoughts&lt;/code&gt; box, it even shows the reasoning it followed to arrive at that proof. Not bad at all!&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-math-thoughts.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-math-thoughts.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;So far I&amp;rsquo;ve been using a Mac Mini M1 with 16 GB of RAM. But I also have a Mac Studio M2 Ultra with 192 GB of RAM available: how long will the same task take there?&lt;/p&gt;&#xA;&lt;p&gt;Making a comparison is not easy because, even if you use exactly the same prompt, the answers from the Mac Mini and the Mac Studio will always be different, demonstrating the purely statistical nature of the &lt;em&gt;reasoning&lt;/em&gt; done by LLMs. 
You can see this in the two images below, where the Mini is recognizable by the light theme I&amp;rsquo;ve used so far, while the Mac Studio is configured to use a dark theme.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-pythagorean-theorem-macmini.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-pythagorean-theorem-macmini.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&#xA;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-pythagorean-theorem-macstudio.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-pythagorean-theorem-macstudio.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;If the same question is repeated five times on the Mac Mini, the response time varies between 80 and 120 seconds, with a constant speed of 10-11 tokens per second&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;; on the Mac Studio, on the other hand, the responses are generated in 15-45 seconds, with a speed of 60-70 tokens per second. So, roughly speaking, the text generation speed on the Mac Studio is about 6-7 times faster than on the Mac Mini.&lt;/p&gt;&#xA;&lt;h4 id=&#34;lowering-the-temperature&#34;&gt;Lowering the temperature&lt;/h4&gt;&#xA;&lt;p&gt;But I&amp;rsquo;m stubborn, and to make the comparison more accurate, I want the two Macs to always give me the same answer. 
To do this, I have to click on the &lt;a href=&#34;https://en.wikipedia.org/wiki/Erlenmeyer_flask&#34;&gt;glass flask&lt;/a&gt; icon in the top right corner and set &lt;em&gt;Temperature&lt;/em&gt; to zero (the default is 0.8).&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-model-advanced-config.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/lmstudio-model-advanced-config.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;Under these conditions, the Mac Mini takes 75 to 140 seconds to process its responses, while the Mac Studio does it in 11-22 seconds. The strange thing is that although the answers are now always the same, the number of tokens generated still changes each time. Even so, the Mac Mini&amp;rsquo;s speed holds steady at 10-11 tokens per second, while the Mac Studio is less consistent, generating 60-80 tokens per second.&lt;/p&gt;&#xA;&lt;p&gt;For a more scientific study, it would be necessary to work under much more controlled conditions, but for now we can settle for saying that, even with this configuration, the Mac Studio is at least 6-7 times faster than the Mac Mini.&lt;/p&gt;&#xA;&lt;h4 id=&#34;what-about-chatgpt&#34;&gt;What about ChatGPT?&lt;/h4&gt;&#xA;&lt;p&gt;For comparison, how long does ChatGPT take? When I ask it the usual question about the Pythagorean theorem with the &lt;code&gt;Reason&lt;/code&gt; option enabled, ChatGPT gives me two different, very detailed and well-written answers. 
And it takes just 28 seconds to do it, which is a time comparable to that of the Mac Studio.&lt;/p&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/chatgpt-pythagorean-theorem.png&#34;&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-03-31-lmstudio-un-llm-sul-tuo-computer/chatgpt-pythagorean-theorem.png&#34; alt=&#34;&#34;&gt;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;p&gt;I have to admit that I&amp;rsquo;m impressed by the speed of ChatGPT. It&amp;rsquo;s true that ChatGPT runs on servers that have nothing to do with my two Macs, but it&amp;rsquo;s also true that they have to respond to thousands of requests in parallel with mine, while the Macs are only at my service.&lt;/p&gt;&#xA;&lt;h4 id=&#34;conclusions-for-now&#34;&gt;Conclusions (for now)&lt;/h4&gt;&#xA;&lt;p&gt;In any case, the goal of this article is not to compare ChatGPT with other LLMs, but to verify whether it is possible to use an LLM directly on our computer, without sending confidential information over the network.&lt;/p&gt;&#xA;&lt;p&gt;At the same time, neither the Mac Mini nor the Mac Studio even seems to notice that LM Studio is doing its processing, so there should be room for optimization. But we will see this (and more) in the next article.&lt;/p&gt;&#xA;&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;&#xA;&lt;hr&gt;&#xA;&lt;ol&gt;&#xA;&lt;li id=&#34;fn:1&#34;&gt;&#xA;&lt;p&gt;A &amp;ldquo;model&amp;rdquo; is the set of weights, mathematical rules and neural structures that constitute a specific LLM.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li id=&#34;fn:2&#34;&gt;&#xA;&lt;p&gt;A &amp;ldquo;token&amp;rdquo; is the basic unit of text processed by the model and can correspond to a whole word, a part of a word, a space or a punctuation mark. 
Whether a word is represented by a single token or by several depends on how frequently it is used. If a word is very common, a single token represents it. If a word is rare, on the other hand, it doesn&amp;rsquo;t make sense to add the whole word to the model&amp;rsquo;s vocabulary, so it is broken down into more common subwords.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;/div&gt;&#xA;</description>
    </item>
    <item>
      <title>From melabit to melabit: goodbye WordPress, hello Jekyll</title>
      <link>https://static.233.196.69.159.clients.your-server.de/en/2025/01/02/from-melabit-to-melabit-goodbye-wordpress-hello-jekyll/</link>
      <pubDate>Thu, 02 Jan 2025 06:00:00 +0000</pubDate>
      <guid>https://static.233.196.69.159.clients.your-server.de/en/2025/01/02/from-melabit-to-melabit-goodbye-wordpress-hello-jekyll/</guid>
      <description>&lt;p&gt;&lt;img src=&#34;https://static.233.196.69.159.clients.your-server.de/img/2025-01-02-da-melabit-a-melabit-addio-wordpress-ciao-jekyll/phoenix.jpeg&#34; alt=&#34;&#34;&gt;&#xA;&amp;ndash; &lt;em&gt;Image generated by the &lt;a href=&#34;https://designer.microsoft.com/image-creator&#34;&gt;Microsoft Designer&lt;/a&gt; AI.&lt;/em&gt;&lt;/p&gt;&#xA;&lt;p&gt;Eleven years ago, when I started writing in this personal space, I never imagined I would stick with &lt;a href=&#34;https://wordpress.com/&#34;&gt;WordPress.com&lt;/a&gt; for so long. WordPress.com is a convenient and reliable blogging platform, but it has always been &lt;a href=&#34;https://melabit.com/en/2014/08/01/i-limiti-di-wordpress-com/&#34;&gt;ill-suited to my way of working&lt;/a&gt;. Over time, I learned to live with these limitations, but the idea of changing platforms never left my mind.&lt;/p&gt;&#xA;&lt;p&gt;From the start, I had a clear idea of &lt;a href=&#34;https://melabit.com/en/2014/12/31/and-the-winner-is/&#34;&gt;which alternative platform&lt;/a&gt; I wanted to use. A few years later, I revisited the available options and detailed the steps to transition in a &lt;a href=&#34;https://melabit.wordpress.com/2018/04/16/da-melabit-a-melabit-introduzione/&#34;&gt;long series of posts&lt;/a&gt;. 
Ultimately, though, I didn’t follow through, perhaps because I wasn’t fully convinced about the decision.&lt;/p&gt;&#xA;&lt;p&gt;After all, leaving the comfort zone of a platform that has caused you zero problems over the years to embark on a new journey where you manage everything yourself is no small task.&lt;/p&gt;&#xA;&lt;hr style = &#34;border: none; &#xA;            border-top: 3px double #333; &#xA;            color: #333; &#xA;            overflow: visible; &#xA;            height: 5px; &#xA;            width: 50%; &#xA;            margin-left: auto; &#xA;            margin-right: auto;&#34;&#xA;&gt;&#xA;&#xA;&lt;p&gt;A few months ago, however, &lt;a href=&#34;https://automattic.com/&#34;&gt;Automattic&lt;/a&gt;, the company behind the open-source &lt;a href=&#34;https://wordpress.org/&#34;&gt;WordPress&lt;/a&gt; software and its related products, decided to &lt;a href=&#34;https://www.engadget.com/tumblr-and-wordpress-posts-will-reportedly-be-used-for-openai-and-midjourney-training-204425798.html&#34;&gt;share by default&lt;/a&gt; all content hosted on WordPress.com with &amp;ldquo;third parties.&amp;rdquo; Actually, this means sharing it with AI-powered text generators (think ChatGPT and similar tools) that are constantly hungry for new material to improve their training.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; In that moment, my decision was made.&lt;/p&gt;&#xA;&lt;p&gt;As if that weren’t enough, shortly after, Matt Mullenweg, WordPress’s founder, &lt;a href=&#34;https://joshcollinsworth.com/blog/fire-matt&#34;&gt;lost his temper&lt;/a&gt;, highlighting how, despite WordPress being the world’s most popular open-source content management system (CMS), the increasingly centralized and authoritarian control over its development threatens the platform’s open and transparent future.&lt;/p&gt;&#xA;&lt;hr style = &#34;border: none; &#xA;            border-top: 
3px double #333; &#xA;            color: #333; &#xA;            overflow: visible; &#xA;            height: 5px; &#xA;            width: 50%; &#xA;            margin-left: auto; &#xA;            margin-right: auto;&#34;&#xA;&gt;&#xA;&#xA;&lt;p&gt;And so, the die is cast! It took a lot of time and effort, but starting today, this little blog has its own home: &lt;a href=&#34;https://melabit.com/en/&#34;&gt;melabit.com/posts/en&lt;/a&gt;, where I can experiment with much greater freedom than before. Let’s see how it goes.&lt;/p&gt;&#xA;&lt;p&gt;The &lt;a href=&#34;https://melabit.com/posts/en&#34;&gt;new site&lt;/a&gt; is still a work in progress. The basic structure is in place, but there will definitely be adjustments and improvements in the coming months. The very first thing to do is to expand the international section, which currently contains only a handful of posts, with several broken links or links to the original posts in Italian. Additionally, I would like to implement a system for notifying users of new posts.&lt;/p&gt;&#xA;&lt;p&gt;In the meantime, please don’t hesitate to flag any issues or poorly functioning elements in the comments.&lt;/p&gt;&#xA;&lt;p&gt;For technical details and the transition process, stay tuned for upcoming posts.&lt;/p&gt;&#xA;&lt;p&gt;Rest assured that all old content and comments will remain available on &lt;a href=&#34;https://melabit.wordpress.com/&#34;&gt;melabit.wordpress.com&lt;/a&gt;. At least until Mullenweg kicks me out&amp;hellip; 🤣🤣🤣🤣🤣&lt;/p&gt;&#xA;&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;&#xA;&lt;hr&gt;&#xA;&lt;ol&gt;&#xA;&lt;li id=&#34;fn:1&#34;&gt;&#xA;&lt;p&gt;While it is true you can opt out of sharing your content—and I did—it’s also true that you must explicitly choose &lt;strong&gt;not to share&lt;/strong&gt; your data, rather than the other way around, which would be more logical. 
And even then, there’s no guarantee that your content won’t be scraped anyway.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;&lt;/div&gt;&#xA;</description>
    </item>
  </channel>
</rss>
