Thread View: gmane.emacs.devel
42 messages
42 total messages
Started by Andrew Hyatt
Mon, 07 Aug 2023 19:54
[NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Mon, 07 Aug 2023 19:54
Date: Mon, 07 Aug 2023 19:54
49 lines
2557 bytes
2557 bytes
--0000000000008a8a4d06025df93f Content-Type: text/plain; charset="UTF-8" Hi everyone, I've created a new package called llm, for the purpose of abstracting the interface to various large language model providers. There are many LLM packages already, but it would be wasteful for all of them to try to be compatible with a range of LLM providers API (local LLMs such as Llama 2, API providers such as Open AI and Google Cloud's Vertex). This package attempts to solve this problem by defining generic functions which can then be implemented by different LLM providers. I have started with just two: Open AI and Vertex. Llama 2 would be a next choice, but I don't yet have it working on my system. In addition, I'm starting with just two core functionality: chat and embeddings. Extending to async is probably something that I will do next. You can see the code at https://github.com/ahyatt/llm. I prefer that this is NonGNU, because I suspect people would like to contribute interfaces to different LLM, and not all of them will have FSF papers. Your thoughts would be appreciated, thank you! --0000000000008a8a4d06025df93f Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr">Hi everyone,<div><br></div><div>I've created a new package called llm, for the purpose of abstracting the interface to various large language model providers. There are many LLM packages already, but it would be wasteful for all of them to try to be compatible with a range of LLM providers API (local LLMs such as Llama 2, API providers such as Open AI and Google Cloud's Vertex). This package attempts to solve this problem by defining generic functions which can then be implemented by different LLM providers. I have started with just two: Open AI and Vertex. Llama 2 would be a next choice, but I don't yet have it working on my system. In addition, I'm starting with just two core functionality: chat and embeddings. Extending to async is probably something that I will do next.</div><div><br></div><div>You can see the code at <a href="https://github.com/ahyatt/llm">https://github.com/ahyatt/llm</a>.<br></div><div><br></div><div>I prefer that this is NonGNU, because I suspect people would like to contribute interfaces to different LLM, and not all of them will have FSF papers.</div><div><br></div><div>Your thoughts would be appreciated, thank you!</div><div><br></div></div> --0000000000008a8a4d06025df93f--
Re: [NonGNU ELPA] New package: llm
Author: Philip Kaluderci
Date: Tue, 08 Aug 2023 05:42
Date: Tue, 08 Aug 2023 05:42
37 lines
1807 bytes
1807 bytes
Andrew Hyatt <ahyatt@gmail.com> writes: > Hi everyone, > > I've created a new package called llm, for the purpose of abstracting the > interface to various large language model providers. There are many LLM > packages already, but it would be wasteful for all of them to try to be > compatible with a range of LLM providers API (local LLMs such as Llama 2, > API providers such as Open AI and Google Cloud's Vertex). This package > attempts to solve this problem by defining generic functions which can then > be implemented by different LLM providers. I have started with just two: > Open AI and Vertex. Llama 2 would be a next choice, but I don't yet have > it working on my system. In addition, I'm starting with just two core > functionality: chat and embeddings. Extending to async is probably > something that I will do next. Llama was the model that could be executed locally, and the other two are "real" services, right? > You can see the code at https://github.com/ahyatt/llm. > > I prefer that this is NonGNU, because I suspect people would like to > contribute interfaces to different LLM, and not all of them will have FSF > papers. I cannot estimate how important or not LLM will be in the future, but it might be worth having something like this in the core, at some point. Considering the size of a module at around 150-200 lines it seems, and the relative infrequency of new models (at least to my understanding), I don't know if the "advantage" of accepting contributions from people who haven't signed the CA has that much weight, opposed to the general that all users may enjoy from having the technology integrated into Emacs itself, in a way that other packages (and perhaps even the core-help system) could profit from it. > Your thoughts would be appreciated, thank you!
Re: [NonGNU ELPA] New package: llm
Author: Spencer Baugh
Date: Tue, 08 Aug 2023 11:08
Date: Tue, 08 Aug 2023 11:08
12 lines
413 bytes
413 bytes
Philip Kaludercic <philipk@posteo.net> writes: > in a way that other packages (and perhaps even the core-help > system) could profit from it. Now I'm imagining all kinds of integration with C-h to automatically help users with Emacs tasks. Such as an analog to apropos-command which queries an LLM for help. And maybe integration with M-x report-emacs-bug. And maybe an M-x doctor which *actually works* :)
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Tue, 08 Aug 2023 11:09
Date: Tue, 08 Aug 2023 11:09
136 lines
5577 bytes
5577 bytes
--00000000000050230d06026ac0e3 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Aug 8, 2023 at 1:42 AM Philip Kaludercic <philipk@posteo.net> wrote: > Andrew Hyatt <ahyatt@gmail.com> writes: > > > Hi everyone, > > > > I've created a new package called llm, for the purpose of abstracting the > > interface to various large language model providers. There are many LLM > > packages already, but it would be wasteful for all of them to try to be > > compatible with a range of LLM providers API (local LLMs such as Llama 2, > > API providers such as Open AI and Google Cloud's Vertex). This package > > attempts to solve this problem by defining generic functions which can > then > > be implemented by different LLM providers. I have started with just two: > > Open AI and Vertex. Llama 2 would be a next choice, but I don't yet have > > it working on my system. In addition, I'm starting with just two core > > functionality: chat and embeddings. Extending to async is probably > > something that I will do next. > > Llama was the model that could be executed locally, and the other two > are "real" services, right? > That's correct. > > > You can see the code at https://github.com/ahyatt/llm. > > > > I prefer that this is NonGNU, because I suspect people would like to > > contribute interfaces to different LLM, and not all of them will have FSF > > papers. > > I cannot estimate how important or not LLM will be in the future, but it > might be worth having something like this in the core, at some point. > Considering the size of a module at around 150-200 lines it seems, and > the relative infrequency of new models (at least to my understanding), I > don't know if the "advantage" of accepting contributions from people who > haven't signed the CA has that much weight, opposed to the general that > all users may enjoy from having the technology integrated into Emacs > itself, in a way that other packages (and perhaps even the core-help > system) could profit from it. > That seems reasonable. I don't have a strong opinion here, so if others want to see this in GNU ELPA instead, I'm happy to do that. > > > Your thoughts would be appreciated, thank you! > --00000000000050230d06026ac0e3 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Tue, Aug 8, 2023 at 1:42 AM Philip Kaludercic <<a href="mailto:philipk@posteo.net">philipk@posteo.net</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Andrew Hyatt <<a href="mailto:ahyatt@gmail.com" target="_blank">ahyatt@gmail.com</a>> writes:<br> <br> > Hi everyone,<br> ><br> > I've created a new package called llm, for the purpose of abstracting the<br> > interface to various large language model providers. There are many LLM<br> > packages already, but it would be wasteful for all of them to try to be<br> > compatible with a range of LLM providers API (local LLMs such as Llama 2,<br> > API providers such as Open AI and Google Cloud's Vertex). This package<br> > attempts to solve this problem by defining generic functions which can then<br> > be implemented by different LLM providers. I have started with just two:<br> > Open AI and Vertex. Llama 2 would be a next choice, but I don't yet have<br> > it working on my system. In addition, I'm starting with just two core<br> > functionality: chat and embeddings. Extending to async is probably<br> > something that I will do next.<br> <br> Llama was the model that could be executed locally, and the other two<br> are "real" services, right?<br></blockquote><div><br></div><div>That's correct.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> > You can see the code at <a href="https://github.com/ahyatt/llm" rel="noreferrer" target="_blank">https://github.com/ahyatt/llm</a>.<br> ><br> > I prefer that this is NonGNU, because I suspect people would like to<br> > contribute interfaces to different LLM, and not all of them will have FSF<br> > papers.<br> <br> I cannot estimate how important or not LLM will be in the future, but it<br> might be worth having something like this in the core, at some point.<br> Considering the size of a module at around 150-200 lines it seems, and<br> the relative infrequency of new models (at least to my understanding), I<br> don't know if the "advantage" of accepting contributions from people who<br> haven't signed the CA has that much weight, opposed to the general that<br> all users may enjoy from having the technology integrated into Emacs<br> itself, in a way that other packages (and perhaps even the core-help<br> system) could profit from it.<br></blockquote><div><br></div><div>That seems reasonable. I don't have a strong opinion here, so if others want to see this in GNU ELPA instead, I'm happy to do that.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> > Your thoughts would be appreciated, thank you!<br> </blockquote></div></div> --00000000000050230d06026ac0e3--
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Tue, 08 Aug 2023 23:47
Date: Tue, 08 Aug 2023 23:47
35 lines
1495 bytes
1495 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] All the large language models are unjust -- either the models are nonfree software, released under a license that denies freedom 0 (https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html), or they are not released at all, only made available for use as SaaSS (https://www.gnu.org/philosophy/who-does-that-server-really-serve.html). If a language model is much better known that GNU Emacs, it is ok to have code in Emacs to make it more convenient to use Emacs along with the language model. If the language model is not so well known, then Emacs should not mention it _in any way_. This is in the GNU coding standards. If Emacs is to have commands specifically to support them, we should make those commands inform the user -- every user of each of those commands -- of how they mistreat the user. It is enough to display a message explaining the situation in a way that it will really be seen. Displaying this for the first invocation on each day would be sufficient. Doing it more often would be annoying. Would someone please implemt this? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Tue, 08 Aug 2023 23:47
Date: Tue, 08 Aug 2023 23:47
39 lines
1367 bytes
1367 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I've created a new package called llm, for the purpose of abstracting the > interface to various large language model providers. Note that packages in core Emacs or in GNU ELPA should not depend on anything in NonGNU ELPA. If llm is meant for other packages to use, it should be in GNU ELPA, not NonGNU ELPA. Why did you plan to put it in NonGNU ELPA? > I prefer that this is NonGNU, because I suspect people would like to > contribute interfaces to different LLM, and not all of them will have FSF > papers. I don't follow the logic here. It looks like the llm package is intended to be generic, so it would be used by other packages to implementr support for specific models. If llm package is on GNU ELPA, it can be used from packages no matter how those packages are distributed. But if the llm package is in NonGNU ELPA, it can only be used from packages in NonGNU ELPA. Have I misunderstood the intended design? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Wed, 09 Aug 2023 00:06
Date: Wed, 09 Aug 2023 00:06
153 lines
5935 bytes
5935 bytes
--000000000000d142110602759a6d Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Aug 8, 2023 at 11:47 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > I've created a new package called llm, for the purpose of abstracting > the > > interface to various large language model providers. > > Note that packages in core Emacs or in GNU ELPA > should not depend on anything in NonGNU ELPA. > If llm is meant for other packages to use, > it should be in GNU ELPA, not NonGNU ELPA. > > Why did you plan to put it in NonGNU ELPA? The logic was the same logic you quote below (I'll explain better what my point was below), but I agree that it would limit the use, so GNU ELPA makes more sense. Another factor was that I am using request.el, which is not in GNU ELPA, so I'd have to rewrite it, which complicates the code. > > > I prefer that this is NonGNU, because I suspect people would like to > > contribute interfaces to different LLM, and not all of them will have > FSF > > papers. > > I don't follow the logic here. It looks like the llm package is > intended to be generic, so it would be used by other packages to > implementr support for specific models. If llm package is on GNU ELPA, > it can be used from packages no matter how those packages are distributed. > It wasn't about use, it's more about accepting significant code contributions, which is less restricted with NonGNU ELPA, since I wouldn't have to ask for FSF papers. > > But if the llm package is in NonGNU ELPA, it can only be used from packages > in NonGNU ELPA. > > Have I misunderstood the intended design? > You understood correctly. This is a package designed to be used as a library from other packages. > > > > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > --000000000000d142110602759a6d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Tue, Aug 8, 2023 at 11:47 PM Richard Stallman <<a href="mailto:rms@gnu.org">rms@gnu.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[[[ To any NSA and FBI agents reading my email: please consider ]]]<br> [[[ whether defending the US Constitution against all enemies, ]]]<br> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]<br> <br> > I've created a new package called llm, for the purpose of abstracting the<br> > interface to various large language model providers.<br> <br> Note that packages in core Emacs or in GNU ELPA<br> should not depend on anything in NonGNU ELPA.<br> If llm is meant for other packages to use,<br> it should be in GNU ELPA, not NonGNU ELPA.<br> <br> Why did you plan to put it in NonGNU ELPA?</blockquote><div><br></div><div>The logic was the same logic you quote below (I'll explain better what my point was below), but I agree that it would limit the use, so GNU ELPA makes more sense. Another factor was that I am using request.el, which is not in GNU ELPA, so I'd have to rewrite it, which complicates the code.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> > I prefer that this is NonGNU, because I suspect people would like to<br> > contribute interfaces to different LLM, and not all of them will have FSF<br> > papers.<br> <br> I don't follow the logic here. It looks like the llm package is<br> intended to be generic, so it would be used by other packages to<br> implementr support for specific models. If llm package is on GNU ELPA,<br> it can be used from packages no matter how those packages are distributed.<br></blockquote><div><br></div><div>It wasn't about use, it's more about accepting significant code contributions, which is less restricted with NonGNU ELPA, since I wouldn't have to ask for FSF papers. </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> But if the llm package is in NonGNU ELPA, it can only be used from packages<br> in NonGNU ELPA.<br> <br> Have I misunderstood the intended design?<br></blockquote><div><br></div><div>You understood correctly. This is a package designed to be used as a library from other packages. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> <br> <br> <br> -- <br> Dr Richard Stallman (<a href="https://stallman.org" rel="noreferrer" target="_blank">https://stallman.org</a>)<br> Chief GNUisance of the GNU Project (<a href="https://gnu.org" rel="noreferrer" target="_blank">https://gnu.org</a>)<br> Founder, Free Software Foundation (<a href="https://fsf.org" rel="noreferrer" target="_blank">https://fsf.org</a>)<br> Internet Hall-of-Famer (<a href="https://internethalloffame.org" rel="noreferrer" target="_blank">https://internethalloffame.org</a>)<br> <br> <br> </blockquote></div></div> --000000000000d142110602759a6d--
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Wed, 09 Aug 2023 00:37
Date: Wed, 09 Aug 2023 00:37
160 lines
7972 bytes
7972 bytes
--000000000000624ca506027609ce Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Aug 8, 2023 at 11:47 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > All the large language models are unjust -- either the models > are nonfree software, released under a license that denies freedom 0 > ( > https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html > ), > or they are not released at all, only made available for use as SaaSS > (https://www.gnu.org/philosophy/who-does-that-server-really-serve.html). > > If a language model is much better known that GNU Emacs, > it is ok to have code in Emacs to make it more convenient > to use Emacs along with the language model. If the language model > is not so well known, then Emacs should not mention it _in any way_. > This is in the GNU coding standards. > > If Emacs is to have commands specifically to support them, we should > make those commands inform the user -- every user of each of those > commands -- of how they mistreat the user. > > It is enough to display a message explaining the situation > in a way that it will really be seen. > > Displaying this for the first invocation on each day > would be sufficient. Doing it more often would be annoying. > What you are saying is consistent with the GNU coding standard. However, I think any message about this would be annoying, personally, and would be a deterrent for clients to use this library. How about this, which I think would satisfy your concerns: We contribute ONLY llm.el, which mentions no implementations of LLMs, no companies, and no specific language models, to GNU ELPA. With only the interface, I believe there is nothing to warn the user about, and the clients have something in GNU ELPA to code against. If some day there is an LLM that qualifies for inclusion because it is sufficiently free, it can be added to GNU ELPA as well. All implementations can then separately be made available on some other package library not associated with GNU. In this scenario, I wouldn't have warnings on those implementations, just as the many llm-based packages on various alternative ELPAs do not have warnings today. If it still seems wrong to you to have an interface in GNU ELPA whose most popular interaces today involves non-free software, then perhaps it might be best to leave this package out of GNU or Non-GNU ELPA for now. I think that would be a shame, since the flexibility it provides is likely the only good hedge against over-reliance on SaaS LLMs, especially in the case that acceptable llms are developed. Such llms are likely available today (research ones come to mind), however I don't have good visibility into that world or the likelihood they would be useful to emacs users. > > Would someone please implemt this? > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > --000000000000624ca506027609ce Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Tue, Aug 8, 2023 at 11:47 PM Richard Stallman <<a href="mailto:rms@gnu.org">rms@gnu.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[[[ To any NSA and FBI agents reading my email: please consider ]]]<br> [[[ whether defending the US Constitution against all enemies, ]]]<br> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]<br> <br> All the large language models are unjust -- either the models<br> are nonfree software, released under a license that denies freedom 0<br> (<a href="https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html" rel="noreferrer" target="_blank">https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html</a>),<br> or they are not released at all, only made available for use as SaaSS<br> (<a href="https://www.gnu.org/philosophy/who-does-that-server-really-serve.html" rel="noreferrer" target="_blank">https://www.gnu.org/philosophy/who-does-that-server-really-serve.html</a>).<br> <br> If a language model is much better known that GNU Emacs,<br> it is ok to have code in Emacs to make it more convenient<br> to use Emacs along with the language model. If the language model<br> is not so well known, then Emacs should not mention it _in any way_.<br> This is in the GNU coding standards.<br> <br> If Emacs is to have commands specifically to support them, we should<br> make those commands inform the user -- every user of each of those<br> commands -- of how they mistreat the user.<br> <br> It is enough to display a message explaining the situation<br> in a way that it will really be seen.<br> <br> Displaying this for the first invocation on each day<br> would be sufficient. Doing it more often would be annoying.<br></blockquote><div><br></div><div>What you are saying is consistent with the GNU coding standard. However, I think any message about this would be annoying, personally, and would be a deterrent for clients to use this library. </div><div><br></div><div>How about this, which I think would satisfy your concerns:</div><div><br></div><div>We contribute ONLY llm.el, which mentions no implementations of LLMs, no companies, and no specific language models, to GNU ELPA. With only the interface, I believe there is nothing to warn the user about, and the clients have something in GNU ELPA to code against. If some day there is an LLM that qualifies for inclusion because it is sufficiently free, it can be added to GNU ELPA as well. </div><div><br></div><div>All implementations can then separately be made available on some other package library not associated with GNU. In this scenario, I wouldn't have warnings on those implementations, just as the many llm-based packages on various alternative ELPAs do not have warnings today. </div><br class="gmail-Apple-interchange-newline"><div>If it still seems wrong to you to have an interface in GNU ELPA whose most popular interaces today involves non-free software, then perhaps it might be best to leave this package out of GNU or Non-GNU ELPA for now. I think that would be a shame, since the flexibility it provides is likely the only good hedge against over-reliance on SaaS LLMs, especially in the case that acceptable llms are developed. Such llms are likely available today (research ones come to mind), however I don't have good visibility into that world or the likelihood they would be useful to emacs users.</div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> Would someone please implemt this?<br> -- <br> Dr Richard Stallman (<a href="https://stallman.org" rel="noreferrer" target="_blank">https://stallman.org</a>)<br> Chief GNUisance of the GNU Project (<a href="https://gnu.org" rel="noreferrer" target="_blank">https://gnu.org</a>)<br> Founder, Free Software Foundation (<a href="https://fsf.org" rel="noreferrer" target="_blank">https://fsf.org</a>)<br> Internet Hall-of-Famer (<a href="https://internethalloffame.org" rel="noreferrer" target="_blank">https://internethalloffame.org</a>)<br> <br> <br> </blockquote></div></div> --000000000000624ca506027609ce--
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Fri, 11 Aug 2023 22:44
Date: Fri, 11 Aug 2023 22:44
21 lines
796 bytes
796 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > It wasn't about use, it's more about accepting significant code > contributions, which is less restricted with NonGNU ELPA, since I wouldn't > have to ask for FSF papers. This problem does arise, but it isn't a big problem in practice. We get lots of significan code contributions for the Emacs core and GNU ELPA. It is only rarly that there is an obstacle. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Sat, 12 Aug 2023 21:43
Date: Sat, 12 Aug 2023 21:43
50 lines
2070 bytes
2070 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > What you are saying is consistent with the GNU coding standard. However, I > think any message about this would be annoying, I am sure it would be a little annoying. But assuming the user can type SPC and move on from that message, the annoyance will be quite little. personally, and would be a > deterrent for clients to use this library. If the library is quite useful I doubt anyone would be deterred. If anyone minded it the message enough to stop using the package, perse could edit this out of the code. This issue is an example of those where two different values are pertinent. There is convenience, which counts but is superficial. And there is the purpose of the GNU system, which for 40 years has led the fight against injustice in software. That value is deep and, in the long term, the most important value of all. When they conflict in a specific practical matter, there is always pressure to prioritize convenience. But that is not wise. The right approach is to look for a compromise which serves both goals. I am sure we can find one here. I suggested showing the message once a day, because that is what first occurred to me. But there are lots of ways to vary the details of the compromise. Here's an idea. For each language model, it could diisplay the message the first, second, fifth, tenth, and after that every tenth time the user starts that mode. With this method, the frequency of little annoyance will diminish quickly, but the point will not be forgotten. As long as we do not overvalue minor inconvenience, there will be good solutions. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Sat, 12 Aug 2023 21:43
Date: Sat, 12 Aug 2023 21:43
64 lines
2866 bytes
2866 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > What you are saying is consistent with the GNU coding standard. However, I > think any message about this would be annoying, I am sure it would be a little annoying. But assuming the user can type SPC and move on from that message, the annoyance will be quite little. personally, and would be a > deterrent for clients to use this library. If the library is quite useful I doubt anyone would be deterred. If anyone minded it the message enough to stop using the package, perse could edit this out of the code. This issue is an example of those where two different values are pertinent. There is convenience, which counts but is superficial. And there is the purpose of the GNU system, which for 40 years has led the fight against injustice in software. That value is deep and, in the long term, the most important value of all. When they conflict in a specific practical matter, there is always pressure to prioritize convenience. But that is not wise. The right approach is to look for a ocmpromise which serves both goals. I am sure we can find one here. I suggested showing the message once a day, because that is what first occurred to me. But there are lots of ways to vary the details. Here's an idea. For each language model, it could diisplay the message the first, second, fifth, tenth, and after that every tenth time the user starts that mode. With this, the frequency of little annoyance will diminish soon, but the point will not be forgotten. You made suggestions for how to exclude more code from Emacs itself, and support for obscure language models we probably should exclude. But there is no need to exclude the support for the well-known ones, as I've explained. And we can do better than that! We can educate the users about what is wrong with those systems -- something that the media hysteria fails to mention at all. That is important -- let's use Emacs for it! > All implementations can then separately be made available on some other > package library not associated with GNU. In this scenario, I wouldn't have > warnings on those implementations, just as the many llm-based packages on > various alternative ELPAs do not have warnings today. They ought to show warnings -- the issue is exactly the same. We should not slide quietly into acceptance and normalization of a new systematic injustice. Opposing it is our job. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Emanuel Berg
Date: Sun, 13 Aug 2023 04:11
Date: Sun, 13 Aug 2023 04:11
26 lines
890 bytes
890 bytes
Richard Stallman wrote: > You made suggestions for how to exclude more code from Emacs > itself, and support for obscure language models we probably > should exclude. But there is no need to exclude the support > for the well-known ones, as I've explained. We should include as much as possible, but it doesn't really matter if we include it in vanilla Emacs or in a package in ELPA as long as it is included. Rather, the message would be, in vanilla Emacs where whenever something wasn't included, "you have opened a file for programming in X which is currently partially unsupported in vanilla Emacs, but note there are 7 packages in ELPA including a major mode to do exactly that ...". And when enough people get annoyed by that message one would consider it to be about time to move it from ELPA into vanilla Emacs ... -- underground experts united https://dataswamp.org/~incal
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Tue, 15 Aug 2023 01:14
Date: Tue, 15 Aug 2023 01:14
218 lines
10156 bytes
10156 bytes
--0000000000001b31ec0602ef4329 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > What you are saying is consistent with the GNU coding standard. > However, I > > think any message about this would be annoying, > > I am sure it would be a little annoying. But assuming the user can > type SPC and move on from that message, the annoyance will be quite > little. > > personally, and would > be a > > deterrent for clients to use this library. > > If the library is quite useful I doubt anyone would be deterred. > If anyone minded it the message enough to stop using the package, perse > could > edit this out of the code. > > This issue is an example of those where two different values are > pertinent. There is convenience, which counts but is superficial. > And there is the purpose of the GNU system, which for 40 years has led > the fight against injustice in software. That value is deep and, in the > long term, the most important value of all. > > When they conflict in a specific practical matter, there is always > pressure to prioritize convenience. But that is not wise. > The right approach is to look for a ocmpromise which serves both > goals. I am sure we can find one here. > > I suggested showing the message once a day, because that is what first > occurred to me. But there are lots of ways to vary the details. > Here's an idea. For each language model, it could diisplay the > message the first, second, fifth, tenth, and after that every tenth > time the user starts that mode. With this, the frequency of little > annoyance will diminish soon, but the point will not be forgotten. > Is there anything else in emacs that does something similar? I'd like to look at how other modules do the same thing and how they communicate things to the user. I believe we can output something, but at least some of the LLM calls are asynchronous, and, as a library, even when not async, we have no idea about the UI context we're in. Suddenly throwing up a window in a function that has no side-effects seems unfriendly to clients of the library. Perhaps we could just use the "warn" function, which is more in line with what might be expected from a library. And the user can suppress the warning if needed. > You made suggestions for how to exclude more code from Emacs itself, > and support for obscure language models we probably should exclude. > But there is no need to exclude the support for the well-known ones, > as I've explained. > > And we can do better than that! We can educate the users about what > is wrong with those systems -- something that the media hysteria fails > to mention at all. That is important -- let's use Emacs for it! > > All implementations can then separately be made available on some other > > package library not associated with GNU. In this scenario, I wouldn't > have > > warnings on those implementations, just as the many llm-based packages > on > > various alternative ELPAs do not have warnings today. > > They ought to show warnings -- the issue is exactly the same. > > We should not slide quietly into acceptance and normalization of a new > systematic injustice. Opposing it is our job. > I don't doubt that or disagree, I'd just rather us oppose it in documentation or code comments, not during runtime. The other packages aren't under GNU control, and the authors may have different philosophies. It would be unfortunate if that worked out to the advantage of users, who have for whatever reason chosen to use a LLM provider being well aware that it is not a free system. I'm curious what others think. > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > --0000000000001b31ec0602ef4329 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <<a href="mailto:rms@gnu.org">rms@gnu.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[[[ To any NSA and FBI agents reading my email: please consider ]]]<br> [[[ whether defending the US Constitution against all enemies, ]]]<br> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]<br> <br> > What you are saying is consistent with the GNU coding standard. However, I<br> > think any message about this would be annoying,<br> <br> I am sure it would be a little annoying. But assuming the user can<br> type SPC and move on from that message, the annoyance will be quite<br> little.<br> <br> personally, and would be a<br> > deterrent for clients to use this library.<br> <br> If the library is quite useful I doubt anyone would be deterred.<br> If anyone minded it the message enough to stop using the package, perse could<br> edit this out of the code.<br> <br> This issue is an example of those where two different values are<br> pertinent. There is convenience, which counts but is superficial.<br> And there is the purpose of the GNU system, which for 40 years has led<br> the fight against injustice in software. That value is deep and, in the<br> long term, the most important value of all.<br> <br> When they conflict in a specific practical matter, there is always<br> pressure to prioritize convenience. But that is not wise.<br> The right approach is to look for a ocmpromise which serves both<br> goals. I am sure we can find one here.<br> <br> I suggested showing the message once a day, because that is what first<br> occurred to me. But there are lots of ways to vary the details.<br> Here's an idea. For each language model, it could diisplay the<br> message the first, second, fifth, tenth, and after that every tenth<br> time the user starts that mode. With this, the frequency of little<br> annoyance will diminish soon, but the point will not be forgotten.<br></blockquote><div><br></div><div>Is there anything else in emacs that does something similar? I'd like to look at how other modules do the same thing and how they communicate things to the user.</div><div><br></div><div>I believe we can output something, but at least some of the LLM calls are asynchronous, and, as a library, even when not async, we have no idea about the UI context we're in. Suddenly throwing up a window in a function that has no side-effects seems unfriendly to clients of the library. Perhaps we could just use the "warn" function, which is more in line with what might be expected from a library. And the user can suppress the warning if needed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> You made suggestions for how to exclude more code from Emacs itself,<br> and support for obscure language models we probably should exclude.<br> But there is no need to exclude the support for the well-known ones,<br> as I've explained.<br> <br> And we can do better than that! We can educate the users about what<br> is wrong with those systems -- something that the media hysteria fails<br> to mention at all. That is important -- let's use Emacs for it!<br></blockquote><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> > All implementations can then separately be made available on some other<br> > package library not associated with GNU. In this scenario, I wouldn't have<br> > warnings on those implementations, just as the many llm-based packages on<br> > various alternative ELPAs do not have warnings today.<br> <br> They ought to show warnings -- the issue is exactly the same.<br> <br> We should not slide quietly into acceptance and normalization of a new<br> systematic injustice. Opposing it is our job.<br></blockquote><div><br></div><div>I don't doubt that or disagree, I'd just rather us oppose it in documentation or code comments, not during runtime. The other packages aren't under GNU control, and the authors may have different philosophies. It would be unfortunate if that worked out to the advantage of users, who have for whatever reason chosen to use a LLM provider being well aware that it is not a free system. I'm curious what others think.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> -- <br> Dr Richard Stallman (<a href="https://stallman.org" rel="noreferrer" target="_blank">https://stallman.org</a>)<br> Chief GNUisance of the GNU Project (<a href="https://gnu.org" rel="noreferrer" target="_blank">https://gnu.org</a>)<br> Founder, Free Software Foundation (<a href="https://fsf.org" rel="noreferrer" target="_blank">https://fsf.org</a>)<br> Internet Hall-of-Famer (<a href="https://internethalloffame.org" rel="noreferrer" target="_blank">https://internethalloffame.org</a>)<br> <br> <br> </blockquote></div></div> --0000000000001b31ec0602ef4329--
Re: [NonGNU ELPA] New package: llm
Author: Jim Porter
Date: Tue, 15 Aug 2023 10:12
Date: Tue, 15 Aug 2023 10:12
20 lines
940 bytes
940 bytes
On 8/14/2023 10:14 PM, Andrew Hyatt wrote: > I don't doubt that or disagree, I'd just rather us oppose it in > documentation or code comments, not during runtime. I'd be hesitant to add support for these LLMs even *with* a warning message at runtime. That's not to say there should never be a GNU project with support for any LLM, but that I think we should tread carefully. Among other things, I'm curious about what the FSF would say about the *models* the LLMs use. Are they "just data", or should we treat them more like object code? What does an LLM that fully adheres to FSF principles actually look like? I'm not personally aware of any official FSF stance on LLMs, so that would be the next step as I see it, before publishing any code. Again, that doesn't mean Emacs should never have an LLM package, just that some detailed guidance from the FSF would make it a lot clearer (to me, at least) how to progress. - Jim
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Tue, 15 Aug 2023 22:30
Date: Tue, 15 Aug 2023 22:30
27 lines
1160 bytes
1160 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I suggested showing the message once a day, because that is what first > > occurred to me. But there are lots of ways to vary the details. > > Here's an idea. For each language model, it could diisplay the > > message the first, second, fifth, tenth, and after that every tenth > > time the user starts that mode. With this, the frequency of little > > annoyance will diminish soon, but the point will not be forgotten. > > > Is there anything else in emacs that does something similar? I'd like to > look at how other modules do the same thing and how they communicate things > to the user. There are various features in Emacs that display some sort of notice temporarily and make it easy to move past. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Tomas Hlavaty
Date: Wed, 16 Aug 2023 07:11
Date: Wed, 16 Aug 2023 07:11
6 lines
229 bytes
229 bytes
On Tue 15 Aug 2023 at 22:30, Richard Stallman <rms@gnu.org> wrote: > There are various features in Emacs that display some sort of notice > temporarily and make it easy to move past. Is there a way to review the notices later?
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Wed, 16 Aug 2023 22:02
Date: Wed, 16 Aug 2023 22:02
37 lines
1717 bytes
1717 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Among other things, I'm curious about what the FSF would say > about the *models* the LLMs use. Are they "just data", or should we > treat them more like object code? What does an LLM that fully adheres to > FSF principles actually look like? I've been thinking about this, and my tentative conclusion is that that precise question is not crucial, because what is certain is that they are part of the control over the system's behavior. So they ought to be released under a free license. In the examples I've heard of, that is never the case. Either they are secret -- users can only use them on a server, which is SaaSS, see https://gnu.org/philosophy/who-does-that-server-really-serve.html -- or they are released under nonfree licenses that restrict freedom 0; see https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html. As I recall, we don't have a rule against features to interface servers whose code is not released, and we certainly don't have a rule against code in Emacs to interact with nonfree software _provided said software is well known_ -- that is why it is ok to have code to interact with Windows and Android. ISTR we have features in Emacs for talking to servers whose code is not release. But does anyone recall better than I do? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Wed, 16 Aug 2023 22:48
Date: Wed, 16 Aug 2023 22:48
142 lines
6700 bytes
6700 bytes
--00000000000061df940603157265 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Aug 16, 2023 at 10:02 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > Among other things, I'm curious about what the FSF would say > > about the *models* the LLMs use. Are they "just data", or should we > > treat them more like object code? What does an LLM that fully adheres > to > > FSF principles actually look like? > > I've been thinking about this, and my tentative conclusion is that > that precise question is not crucial, because what is certain is that > they are part of the control over the system's behavior. So they > ought to be released under a free license. > > In the examples I've heard of, that is never the case. Either they > are secret -- users can only use them on a server, which is SaaSS, see > https://gnu.org/philosophy/who-does-that-server-really-serve.html -- > or they are released under nonfree licenses that restrict freedom 0; > see > https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html > . > > As I recall, we don't have a rule against features to interface > servers whose code is not released, and we certainly don't have a rule > against code in Emacs to interact with nonfree software _provided said > software is well known_ -- that is why it is ok to have code to > interact with Windows and Android. > > ISTR we have features in Emacs for talking to servers whose code > is not release. But does anyone recall better than I do? > There is the "excorporate" package on GNU ELPA that talks to exchange corporate servers (although perhaps there are free variants that also speak this protocol?). There's the "metar" package on GNU ELPA that receives the weather from the metar system. A brief search didn't find any code for that, but it might exist. The other interesting find was "sql-oracle", as well as other nonfree similar sql servers in the main emacs lisp. It is a server, although the interface used is local and mediated by a program. But it is an interface to a nonfree utility software. There is no warning given, but a message in `sql--help-docstring' asks the user to consider free alternatives. > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > --00000000000061df940603157265 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Wed, Aug 16, 2023 at 10:02 PM Richard Stallman <<a href="mailto:rms@gnu.org">rms@gnu.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[[[ To any NSA and FBI agents reading my email: please consider ]]]<br> [[[ whether defending the US Constitution against all enemies, ]]]<br> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]<br> <br> > Among other things, I'm curious about what the FSF would say <br> > about the *models* the LLMs use. Are they "just data", or should we <br> > treat them more like object code? What does an LLM that fully adheres to <br> > FSF principles actually look like?<br> <br> I've been thinking about this, and my tentative conclusion is that<br> that precise question is not crucial, because what is certain is that<br> they are part of the control over the system's behavior. So they<br> ought to be released under a free license.<br> <br> In the examples I've heard of, that is never the case. Either they<br> are secret -- users can only use them on a server, which is SaaSS, see<br> <a href="https://gnu.org/philosophy/who-does-that-server-really-serve.html" rel="noreferrer" target="_blank">https://gnu.org/philosophy/who-does-that-server-really-serve.html</a> --<br> or they are released under nonfree licenses that restrict freedom 0;<br> see <a href="https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html" rel="noreferrer" target="_blank">https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html</a>.<br> <br> As I recall, we don't have a rule against features to interface<br> servers whose code is not released, and we certainly don't have a rule<br> against code in Emacs to interact with nonfree software _provided said<br> software is well known_ -- that is why it is ok to have code to<br> interact with Windows and Android.<br> <br> ISTR we have features in Emacs for talking to servers whose code<br> is not release. But does anyone recall better than I do?<br></blockquote><div><br></div><div>There is the "excorporate" package on GNU ELPA that talks to exchange corporate servers (although perhaps there are free variants that also speak this protocol?). There's the "metar" package on GNU ELPA that receives the weather from the metar system. A brief search didn't find any code for that, but it might exist.</div><div><br></div><div>The other interesting find was "sql-oracle", as well as other nonfree similar sql servers in the main emacs lisp. It is a server, although the interface used is local and mediated by a program. But it is an interface to a nonfree utility software. There is no warning given, but a message in `sql--help-docstring' asks the user to consider free alternatives.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> -- <br> Dr Richard Stallman (<a href="https://stallman.org" rel="noreferrer" target="_blank">https://stallman.org</a>)<br> Chief GNUisance of the GNU Project (<a href="https://gnu.org" rel="noreferrer" target="_blank">https://gnu.org</a>)<br> Founder, Free Software Foundation (<a href="https://fsf.org" rel="noreferrer" target="_blank">https://fsf.org</a>)<br> Internet Hall-of-Famer (<a href="https://internethalloffame.org" rel="noreferrer" target="_blank">https://internethalloffame.org</a>)<br> <br> <br> </blockquote></div></div> --00000000000061df940603157265--
Re: [NonGNU ELPA] New package: llm
Author: Daniel Fleischer
Date: Thu, 17 Aug 2023 20:08
Date: Thu, 17 Aug 2023 20:08
19 lines
831 bytes
831 bytes
Richard Stallman <rms@gnu.org> writes: > In the examples I've heard of, that is never the case. Either they > are secret -- users can only use them on a server, which is SaaSS, see > https://gnu.org/philosophy/who-does-that-server-really-serve.html -- > or they are released under nonfree licenses that restrict freedom 0; > see https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html. That is not accurate; LLMs can definitely run locally on your machine. Models can be downloaded and ran using Python. Here is an LLM released under Apache 2 license [0]. There are "black-box" models, served in the cloud, but the revolution we're is precisely because many models are released freely and can be ran (and trained) locally, even on a laptop. [0] https://huggingface.co/mosaicml/mpt-7b -- Daniel Fleischer
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Thu, 17 Aug 2023 22:10
Date: Thu, 17 Aug 2023 22:10
24 lines
842 bytes
842 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > There are various features in Emacs that display some sort of notice > > temporarily and make it easy to move past. > Is there a way to review the notices later? These notices are displayed in various ways. For osme of the notices, there are ways to review them. But I don't think there is any one way that covers all. It might be a good thing to create one -- and that would not be fundamentally hard. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Fri, 18 Aug 2023 21:49
Date: Fri, 18 Aug 2023 21:49
39 lines
1499 bytes
1499 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > In the examples I've heard of, that is never the case. Either they > > are secret -- users can only use them on a server, which is SaaSS, see > > https://gnu.org/philosophy/who-does-that-server-really-serve.html -- > > or they are released under nonfree licenses that restrict freedom 0; > > see https://www.gnu.org/philosophy/programs-must-not-limit-freedom-to-run.html. > That is not accurate; LLMs can definitely run locally on your machine. We are slightly miscommunicating. Yes there are models that could run locally on your machine, but all the ones I know of were released under a nonfree license. > Here is an LLM released > under Apache 2 license [0]. I haven't seen this before. Maybe it is an exception. Could you confirm that this is a language model itself, not the program that runs the language model? > There are "black-box" models, served in the > cloud, Could we please not use the term "cloud"? There is no cloud, only various companies' computers. See https://gnu.org/philosophy/words-to-avoid.html#CloudComputing. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Fri, 18 Aug 2023 21:51
Date: Fri, 18 Aug 2023 21:51
40 lines
1674 bytes
1674 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > There's the "metar" package on GNU ELPA that receives the > weather from the metar system. Wikipedia says that METAR is a format, not a system. Where precsely does that package get the data? Servers run by who? Servers that publish data of public interest are normally NOT SaaSS. (Indeed, most servers are NOT SaaSS.) SaaSS means using a server to do computing that naturally is yours. You ask for some computing to be done, send the input, and get the output back. Services that give you METAR data do computing that you might find useful, but I think that computing isn't specifically yours, so it isn't SaaSS. A brief search didn't find any code for > that, but it might exist. I can't make sense of that. Didn't find any code for what? > The other interesting find was "sql-oracle", as well as other nonfree > similar sql servers in the main emacs lisp. It is a server, although the > interface used is local and mediated by a program. But it is an interface > to a nonfree utility software. There is no warning given, but a message in > `sql--help-docstring' asks the user to consider free alternatives. This sounds like SaaSS to me. Maybe we should add such a warning here. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Ihor Radchenko
Date: Sat, 19 Aug 2023 09:08
Date: Sat, 19 Aug 2023 09:08
20 lines
846 bytes
846 bytes
Richard Stallman <rms@gnu.org> writes: > > The other interesting find was "sql-oracle", as well as other nonfree > > similar sql servers in the main emacs lisp. It is a server, although the > > interface used is local and mediated by a program. But it is an interface > > to a nonfree utility software. There is no warning given, but a message in > > `sql--help-docstring' asks the user to consider free alternatives. > > This sounds like SaaSS to me. Maybe we should add such a warning here. AFAIU, this has been discussed recently in https://list.orgmode.org/orgmode/E1pJoMI-0001Rf-Rq@fencepost.gnu.org/ -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
Re: [NonGNU ELPA] New package: llm
Author: Daniel Fleischer
Date: Sat, 19 Aug 2023 11:15
Date: Sat, 19 Aug 2023 11:15
42 lines
1463 bytes
1463 bytes
Local LLMs usually run using the python `transformers' library; in order to interact with them using a REST API, some glue code is needed, for example: https://github.com/go-skynet/LocalAI The API is based on OpenAI.com which is what others are following and thus are relevant for the API access the llm package is going to offer. Richard Stallman <rms@gnu.org> writes: > We are slightly miscommunicating. Yes there are models that could run > locally on your machine, but all the ones I know of were released > under a nonfree license. > Could you confirm that this is a language model itself, not the > program that runs the language model? The most popular software framework for running LLMs is called `transformers' (named after the models' architecture): https://github.com/huggingface/transformers (Apache 2) Huggingface also offers free hosting for models and data sets. There are several families of free models: - XGEN https://huggingface.co/Salesforce/xgen-7b-8k-base - MPT https://huggingface.co/mosaicml/mpt-7b - Falcon https://huggingface.co/tiiuae/falcon-7b These are git project, e.g. see https://huggingface.co/tiiuae/falcon-7b/tree/main. These models are released under Apache 2. The models contains the weights (compressed numerical matrices) and possibly some Python code files needed and they explicitly depend on the `transformers' library and the `pytorch' neural networks library (BSD-3). -- Daniel Fleischer
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Sun, 20 Aug 2023 21:12
Date: Sun, 20 Aug 2023 21:12
24 lines
808 bytes
808 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > This sounds like SaaSS to me. Maybe we should add such a warning here. > AFAIU, this has been discussed recently in > https://list.orgmode.org/orgmode/E1pJoMI-0001Rf-Rq@fencepost.gnu.org/ That seems to be a copy of the message I sent. It looks like maybe there was a discussion after that. Could you please tell me what conclusions or ideas that discussion reached? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Sun, 20 Aug 2023 21:12
Date: Sun, 20 Aug 2023 21:12
21 lines
756 bytes
756 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The most popular software framework for running LLMs is called > `transformers' (named after the models' architecture): Ok. Is that a problem in any way? If the `transformers' library is libre, I think it is not a problem. Do you know why they use the name "huggingface"? It seems very strange to me as an anglophone. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Jim Porter
Date: Sun, 20 Aug 2023 21:48
Date: Sun, 20 Aug 2023 21:48
39 lines
2006 bytes
2006 bytes
On 8/17/2023 10:08 AM, Daniel Fleischer wrote: > That is not accurate; LLMs can definitely run locally on your machine. > Models can be downloaded and ran using Python. Here is an LLM released > under Apache 2 license [0]. There are "black-box" models, served in the > cloud, but the revolution we're is precisely because many models are > released freely and can be ran (and trained) locally, even on a laptop. > > [0] https://huggingface.co/mosaicml/mpt-7b The link says that this model has been pretrained, which is certainly useful for the average person who doesn't want (or doesn't have the resources) to perform the training themselves, but from the documentation, it's not clear how I *would* perform the training myself if I were so inclined. (I've only toyed with LLMs, so I'm not an expert at more "advanced" cases like this.) I do see that the documentation mentions the training datasets used, but it also says that "great efforts have been taken to clean the pretraining data". Am I able to access the cleaned datasets? I looked over their blog post[1], but I didn't see anything describing this in detail. While I certainly appreciate the effort people are making to produce LLMs that are more open than OpenAI (a low bar), I'm not sure if providing several gigabytes of model weights in binary format is really providing the *source*. It's true that you can still edit these models in a sense by fine-tuning them, but you could say the same thing about a project that only provided the generated output from GNU Bison, instead of the original input to Bison. (Just to be clear, I don't mean any of the above to be leading questions. I really don't know the answers, and using analogies to previous cases like Bison can only get us so far. I truly hope there *is* a freedom-respecting way to interface with LLMs, but I also think it's worth taking some extra care at the beginning so we can choose the right path forward.) [1] https://www.mosaicml.com/blog/mpt-7b
Re: [NonGNU ELPA] New package: llm
Author: Jim Porter
Date: Sun, 20 Aug 2023 23:03
Date: Sun, 20 Aug 2023 23:03
62 lines
3551 bytes
3551 bytes
On 8/20/2023 10:12 PM, Andrew Hyatt wrote: > The training of these is fairly straightforward, at least if you are > familiar with the area. ... the LLM we are talking about here use this technique to train and execute, changing some parameters and adding things like more attention heads, but keeping the fundamental architecture the same. I think the parameters would be a key part of this (or potentially all of the code they used for the training, if it does something unique), as well as the *actual* training datasets. That's why I'm especially concerned about the line in their docs saying "great efforts have been taken to clean the pretraining data". I couldn't find out whether they provided the cleaned data or only the "raw" data. From my understanding, properly cleaning the data is labor-intensive, and you wouldn't be able to reproduce another team's efforts in that area unless they gave you a diff or something equivalent. > I'm not an expert, but I believe that due to the use of stochastic > processes in training, even if you had the exact code, parameters and > data used in training, you would never be able to reproduce the model > they make available. It should be equivalent in quality, perhaps, but > not the same. This is a problem for reproducibility (it would be nice if you could *verify* that a model was built the way its makers said it was), but I don't think it's a critical problem for freedom. > To me, I believe it should be about freedom. Not absolute freedom, but > relative freedom: do you, the user, have the same amount of freedom as > anyone else, including the creator? For the LLMs like huggingface and > many other research LLMs, the answer is yes. So long as the creators provide all the necessary parameters to retrain the model from "scratch", I think I'd agree. If some of these aren't provided (cleaned datasets, training parameters, any direct human intervention if applicable, etc), then I think the answer is no. For example, the creator could decide that one data source is bad for some reason, and retrain their model without it. Would I be able to do that work independently with just what the creator has given me? I see that there was a presentation at LibrePlanet 2023 (or maybe shortly after) by Leandro von Werra of HuggingFace on the ethics of code-generating LLMs[1]. It says that it hasn't been published online yet, though. This might not be the final answer on all the concerns about incorporating LLMs into Emacs, but hopefully it would help. In practice though, I think if Emacs were to support communicating with LLMs, it would be good if - at minimum - we could direct users to an essay explaining the potential ethical/freedom issues with them. On that note, maybe we could also take a bit of inspiration from Emacs dynamic modules. They require a GPL compatibility symbol[2] in order to load, and perhaps a hypothetical 'llm-foobar' package that interfaces with the 'foobar' LLM could announce whether it respects users' freedom via some variable/symbol. Freedom-respecting LLMs wouldn't need a warning message then. We could even forbid packages that talk to particularly "bad" LLMs. (I suppose we can't stop users from writing their own packages and just lying about whether they're ok, but we could prevent their inclusion in ELPA.) [1] https://www.fsf.org/bulletin/2023/spring/trademarks-volunteering-and-code-generating-llm [2] https://www.gnu.org/software/emacs/manual/html_node/elisp/Module-Initialization.html
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Mon, 21 Aug 2023 01:12
Date: Mon, 21 Aug 2023 01:12
176 lines
8869 bytes
8869 bytes
--000000000000f8019e060367ed8b Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, Aug 21, 2023 at 12:48 AM Jim Porter <jporterbugs@gmail.com> wrote: > On 8/17/2023 10:08 AM, Daniel Fleischer wrote: > > That is not accurate; LLMs can definitely run locally on your machine. > > Models can be downloaded and ran using Python. Here is an LLM released > > under Apache 2 license [0]. There are "black-box" models, served in the > > cloud, but the revolution we're is precisely because many models are > > released freely and can be ran (and trained) locally, even on a laptop. > > > > [0] https://huggingface.co/mosaicml/mpt-7b > > The link says that this model has been pretrained, which is certainly > useful for the average person who doesn't want (or doesn't have the > resources) to perform the training themselves, but from the > documentation, it's not clear how I *would* perform the training myself > if I were so inclined. (I've only toyed with LLMs, so I'm not an expert > at more "advanced" cases like this.) > The training of these is fairly straightforward, at least if you are familiar with the area. The code for implementing transformers in the original "Attention is All You Need" paper is at https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py under an Apache License, and the LLM we are talking about here use this technique to train and execute, changing some parameters and adding things like more attention heads, but keeping the fundamental architecture the same. I'm not an expert, but I believe that due to the use of stochastic processes in training, even if you had the exact code, parameters and data used in training, you would never be able to reproduce the model they make available. It should be equivalent in quality, perhaps, but not the same. > > I do see that the documentation mentions the training datasets used, but > it also says that "great efforts have been taken to clean the > pretraining data". Am I able to access the cleaned datasets? I looked > over their blog post[1], but I didn't see anything describing this in > detail. > > While I certainly appreciate the effort people are making to produce > LLMs that are more open than OpenAI (a low bar), I'm not sure if > providing several gigabytes of model weights in binary format is really > providing the *source*. It's true that you can still edit these models > in a sense by fine-tuning them, but you could say the same thing about a > project that only provided the generated output from GNU Bison, instead > of the original input to Bison. > To me, I believe it should be about freedom. Not absolute freedom, but relative freedom: do you, the user, have the same amount of freedom as anyone else, including the creator? For the LLMs like huggingface and many other research LLMs, the answer is yes. You do have the freedom to fine-tune the model, as does the creator. You cannot change the base model in any meaningful way, but neither can the creator, because no one knows how to do that yet. You cannot understand the model, but neither can the creator, because while some progress has been made in understanding simple things about simple LLMs like GPT-2, the modern LLMs are too complex for anyone to make sense out of. > > (Just to be clear, I don't mean any of the above to be leading > questions. I really don't know the answers, and using analogies to > previous cases like Bison can only get us so far. I truly hope there > *is* a freedom-respecting way to interface with LLMs, but I also think > it's worth taking some extra care at the beginning so we can choose the > right path forward.) > > [1] https://www.mosaicml.com/blog/mpt-7b > --000000000000f8019e060367ed8b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Mon, Aug 21, 2023 at 12:48 AM Jim Porter <<a href="mailto:jporterbugs@gmail.com">jporterbugs@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 8/17/2023 10:08 AM, Daniel Fleischer wrote:<br> > That is not accurate; LLMs can definitely run locally on your machine.<br> > Models can be downloaded and ran using Python. Here is an LLM released<br> > under Apache 2 license [0]. There are "black-box" models, served in the<br> > cloud, but the revolution we're is precisely because many models are<br> > released freely and can be ran (and trained) locally, even on a laptop.<br> > <br> > [0] <a href="https://huggingface.co/mosaicml/mpt-7b" rel="noreferrer" target="_blank">https://huggingface.co/mosaicml/mpt-7b</a><br> <br> The link says that this model has been pretrained, which is certainly <br> useful for the average person who doesn't want (or doesn't have the <br> resources) to perform the training themselves, but from the <br> documentation, it's not clear how I *would* perform the training myself <br> if I were so inclined. (I've only toyed with LLMs, so I'm not an expert <br> at more "advanced" cases like this.)<br></blockquote><div><br></div><div>The training of these is fairly straightforward, at least if you are familiar with the area. The code for implementing transformers in the original "Attention is All You Need" paper is at <a href="https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py">https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py</a> under an Apache License, and the LLM we are talking about here use this technique to train and execute, changing some parameters and adding things like more attention heads, but keeping the fundamental architecture the same. </div><div><br></div><div>I'm not an expert, but I believe that due to the use of stochastic processes in training, even if you had the exact code, parameters and data used in training, you would never be able to reproduce the model they make available. It should be equivalent in quality, perhaps, but not the same.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> I do see that the documentation mentions the training datasets used, but <br> it also says that "great efforts have been taken to clean the <br> pretraining data". Am I able to access the cleaned datasets? I looked <br> over their blog post[1], but I didn't see anything describing this in <br> detail.<br> <br> While I certainly appreciate the effort people are making to produce <br> LLMs that are more open than OpenAI (a low bar), I'm not sure if <br> providing several gigabytes of model weights in binary format is really <br> providing the *source*. It's true that you can still edit these models <br> in a sense by fine-tuning them, but you could say the same thing about a <br> project that only provided the generated output from GNU Bison, instead <br> of the original input to Bison.<br></blockquote><div><br></div><div>To me, I believe it should be about freedom. Not absolute freedom, but relative freedom: do you, the user, have the same amount of freedom as anyone else, including the creator? For the LLMs like huggingface and many other research LLMs, the answer is yes. You do have the freedom to fine-tune the model, as does the creator. You cannot change the base model in any meaningful way, but neither can the creator, because no one knows how to do that yet. You cannot understand the model, but neither can the creator, because while some progress has been made in understanding simple things about simple LLMs like GPT-2, the modern LLMs are too complex for anyone to make sense out of.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> (Just to be clear, I don't mean any of the above to be leading <br> questions. I really don't know the answers, and using analogies to <br> previous cases like Bison can only get us so far. I truly hope there <br> *is* a freedom-respecting way to interface with LLMs, but I also think <br> it's worth taking some extra care at the beginning so we can choose the <br> right path forward.)<br> <br> [1] <a href="https://www.mosaicml.com/blog/mpt-7b" rel="noreferrer" target="_blank">https://www.mosaicml.com/blog/mpt-7b</a><br> </blockquote></div></div> --000000000000f8019e060367ed8b--
Re: [NonGNU ELPA] New package: llm
Author: Ihor Radchenko
Date: Mon, 21 Aug 2023 08:26
Date: Mon, 21 Aug 2023 08:26
75 lines
2998 bytes
2998 bytes
Richard Stallman <rms@gnu.org> writes: > Could you please tell me what conclusions or ideas that discussion > reached? Among other things, we have discussed Oracle SQL support in sql.el: https://list.orgmode.org/orgmode/E1pKtph-00082q-4Z@fencepost.gnu.org/ Richard Stallman <rms@gnu.org> writes: > ... > > The 'support' is essentially specialised comint based interfaces tweaked > > to work with the various SQL database engine command line clients such > > as psql for Postgres and sqlplus for Oracle. This involves codes to use > > the comint buffer to send commands/regions to the SQL client and read > > back the results and run interactive 'repl' like sessions with the > > client. > > Thanks. > > Based on our general policies, it is ok to do this. It is ok for > Postgres because that is free software. It is ok for Oracle because > that is widely known. Another relevant bit is related to the fact the Oracle SQL, through its free CLI, may actually connect to SaaS server. https://list.orgmode.org/orgmode/E1pKtpq-00086w-9s@fencepost.gnu.org/ Richard Stallman <rms@gnu.org> writes: > ... > > I am not sure about SaaSS - even postgresql (free software) may be used > > as a service provider by running it on server the user does not control. > > For sure, it CAN be used that way. If a Lisp package is designed to > work with a subprocess, a user can certainly rig it to talk with a > remote server. It is the nature of free software that people can > customize it, even so as to do something foolish with it. When a user > does this, it's per responsibility, not ours. > > We should not distribute specific support or recommendations to use > the Lisp package in that particular way. I also suggested the following, although did not yet find time open discussion on emacs-devel: https://list.orgmode.org/orgmode/87k015e80p.fsf@localhost/ Ihor Radchenko <yantar92@posteo.net> writes: > Richard Stallman <rms@gnu.org> writes: > >> > Would it then make sense to note the reasons why we support one or >> > another non-free software in a separate file like etc/NON-FREE-SUPPORT? >> >> I think it is a good idea to document the reasoning for these >> decision. But I think it does not necessarily have to be centralized >> in one file for all of Emacs. Another alternative, also natural, >> would be to describe these decisions with the code that implements the >> support. > > Will file header be a good place? > > Note that there is little point adding the reasons behind supporting > non-free software if they cannot be easily found. Ideally, it should be > a standard place documented as code convention. Then, people can > consistently check the reasons (or lack of) behind each individual > non-free software support decision. -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
Re: [NonGNU ELPA] New package: llm
Author: Daniel Fleischer
Date: Mon, 21 Aug 2023 09:36
Date: Mon, 21 Aug 2023 09:36
66 lines
3323 bytes
3323 bytes
Jim Porter <jporterbugs@gmail.com> writes: > The link says that this model has been pretrained, which is certainly > useful for the average person who doesn't want (or doesn't have the > resources) to perform the training themselves, but from the > documentation, it's not clear how I *would* perform the training > myself if I were so inclined. (I've only toyed with LLMs, so I'm not > an expert at more "advanced" cases like this.) When I say people can train models themselves I mean "fine tuning" which is the process of taking an existing model and make it learn to do a specific task by showing it a small number of examples, as low as 1000 examples. There are advanced techniques that can train a model by modifying a small percentage of its weights; this type of training can be done in a few hours on a laptop. See https://huggingface.co/docs/peft/index for a tool to do that. > I do see that the documentation mentions the training datasets used, > but it also says that "great efforts have been taken to clean the > pretraining data". Am I able to access the cleaned datasets? I looked > over their blog post[1], but I didn't see anything describing this in > detail. > > While I certainly appreciate the effort people are making to produce > LLMs that are more open than OpenAI (a low bar), I'm not sure if > providing several gigabytes of model weights in binary format is > really providing the *source*. It's true that you can still edit these > models in a sense by fine-tuning them, but you could say the same > thing about a project that only provided the generated output from GNU > Bison, instead of the original input to Bison. To a large degree, the model is the weights. Today's models mainly share a single architecture, called a transformer decoder. Once you specify the architecture and a few hyper-parameters in a config file, the model is entirely determined by the weights. https://huggingface.co/mosaicml/mpt-7b/blob/main/config.json Put differently, today's models differ mainly by their weights, not architectural differences. As for reproducibility, the truth is one can not reproduce the models, theoretically and practically. The models can contain 7, 14, 30, 60 billion parameters which are floating point numbers; is it impossible to reproduce it exactly as there are many sources for randomness in the training process. Practically, pretraining is expensive; it requires hundreds of GPUs and training costs are 100,000$ for small models and up to millions for larger models. Some models do release the training data, see e.g. https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T A side note: we are in a stage where our theoretical understanding is lacking while practical applications are flourishing. Things move very very fast, and there is a strong drive to productize this technology, making people and companies invest a lot of resources into this. However the open source aspect is amazing; the fact that the architecture, code and insights are shared between everyone and even some companies share the models they pretrained under open licensing (taking upon themselves the high cost of training) is a huge win to everyone, including the open source and scientific communities because now the innovation can come from anywhere. -- Daniel Fleischer
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Mon, 21 Aug 2023 21:06
Date: Mon, 21 Aug 2023 21:06
42 lines
1755 bytes
1755 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > While I certainly appreciate the effort people are making to produce > LLMs that are more open than OpenAI (a low bar), I'm not sure if > providing several gigabytes of model weights in binary format is really > providing the *source*. It's true that you can still edit these models > in a sense by fine-tuning them, but you could say the same thing about a > project that only provided the generated output from GNU Bison, instead > of the original input to Bison. I don't think that is valid. Bison processing is very different from training a neural net. Incremental retraining of a trained neural net is the same kind of processing as the original training -- except that you use other data and it produces a neural net that is trained differently. My conclusiuon is that the trained neural net is effectively a kind of source code. So we don't need to demand the "original training data" as part of a package's source code. That data does not have to be free, published, or available. > In practice though, I think if Emacs were to support communicating with > LLMs, it would be good if - at minimum - we could direct users to an > essay explaining the potential ethical/freedom issues with them. I agree, in principle. But it needs to be an article that the GNU Project can endorse. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Sat, 26 Aug 2023 21:07
Date: Sat, 26 Aug 2023 21:07
221 lines
9942 bytes
9942 bytes
--0000000000008f849f0603dd34e7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable I've now made the changes requested to the llm package on github ( https://github.com/ahyatt/llm). Because what was requested was a warning to the user, I used `lwarn', and have added an option to turn the warnings off (and the user can turn the warnings off through the warning mechanism as well, via `warning-suppress-log-types'). To save you the trouble of looking at the code to see what exactly it says, here's the function I'm using to warn: (defun llm--warn-on-nonfree (name tos) "Issue a warning if `llm-warn-on-nonfree' is non-nil. NAME is the human readable name of the LLM (e.g 'Open AI'). TOS is the URL of the terms of service for the LLM. All non-free LLMs should call this function on each llm function invocation." (when llm-warn-on-nonfree (lwarn '(llm nonfree) :warning "%s API is not free software, and your freedom to use it is restricted. See %s for the details on the restrictions on use." name tos))) If this is sufficient, please consider accepting this package into GNU ELPA (see above where we decided this is a better fit than the Non-GNU ELPA). On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > What you are saying is consistent with the GNU coding standard. > However, I > > think any message about this would be annoying, > > I am sure it would be a little annoying. But assuming the user can > type SPC and move on from that message, the annoyance will be quite > little. > > personally, and would > be a > > deterrent for clients to use this library. > > If the library is quite useful I doubt anyone would be deterred. > If anyone minded it the message enough to stop using the package, perse > could > edit this out of the code. > > This issue is an example of those where two different values are > pertinent. There is convenience, which counts but is superficial. > And there is the purpose of the GNU system, which for 40 years has led > the fight against injustice in software. That value is deep and, in the > long term, the most important value of all. > > When they conflict in a specific practical matter, there is always > pressure to prioritize convenience. But that is not wise. > The right approach is to look for a ocmpromise which serves both > goals. I am sure we can find one here. > > I suggested showing the message once a day, because that is what first > occurred to me. But there are lots of ways to vary the details. > Here's an idea. For each language model, it could diisplay the > message the first, second, fifth, tenth, and after that every tenth > time the user starts that mode. With this, the frequency of little > annoyance will diminish soon, but the point will not be forgotten. > > > You made suggestions for how to exclude more code from Emacs itself, > and support for obscure language models we probably should exclude. > But there is no need to exclude the support for the well-known ones, > as I've explained. > > And we can do better than that! We can educate the users about what > is wrong with those systems -- something that the media hysteria fails > to mention at all. That is important -- let's use Emacs for it! > > > All implementations can then separately be made available on some other > > package library not associated with GNU. In this scenario, I wouldn't > have > > warnings on those implementations, just as the many llm-based packages > on > > various alternative ELPAs do not have warnings today. > > They ought to show warnings -- the issue is exactly the same. > > We should not slide quietly into acceptance and normalization of a new > systematic injustice. Opposing it is our job. > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > --0000000000008f849f0603dd34e7 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr">I've now made the changes requested to the llm package on github (<a href="https://github.com/ahyatt/llm">https://github.com/ahyatt/llm</a>).<div><br></div><div>Because what was requested was a warning to the user, I used `lwarn', and have added an option to turn the warnings off (and the user can turn the warnings off through the warning mechanism as well, via `warning-suppress-log-types').</div><div><br></div><div>To save you the trouble of looking at the code to see what exactly it says, here's the function I'm using to warn:</div><div><br></div><div>(defun llm--warn-on-nonfree (name tos)<br> "Issue a warning if `llm-warn-on-nonfree' is non-nil.<br>NAME is the human readable name of the LLM (e.g 'Open AI').<br><br>TOS is the URL of the terms of service for the LLM.<br><br>All non-free LLMs should call this function on each llm function<br>invocation."<br> (when llm-warn-on-nonfree<br> (lwarn '(llm nonfree) :warning "%s API is not free software, and your freedom to use it is restricted.<br>See %s for the details on the restrictions on use." name tos)))<br></div><div><br></div><div>If this is sufficient, please consider accepting this package into GNU ELPA (see above where we decided this is a better fit than the Non-GNU ELPA).</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <<a href="mailto:rms@gnu.org">rms@gnu.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[[[ To any NSA and FBI agents reading my email: please consider ]]]<br> [[[ whether defending the US Constitution against all enemies, ]]]<br> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]<br> <br> > What you are saying is consistent with the GNU coding standard. However, I<br> > think any message about this would be annoying,<br> <br> I am sure it would be a little annoying. But assuming the user can<br> type SPC and move on from that message, the annoyance will be quite<br> little.<br> <br> personally, and would be a<br> > deterrent for clients to use this library.<br> <br> If the library is quite useful I doubt anyone would be deterred.<br> If anyone minded it the message enough to stop using the package, perse could<br> edit this out of the code.<br> <br> This issue is an example of those where two different values are<br> pertinent. There is convenience, which counts but is superficial.<br> And there is the purpose of the GNU system, which for 40 years has led<br> the fight against injustice in software. That value is deep and, in the<br> long term, the most important value of all.<br> <br> When they conflict in a specific practical matter, there is always<br> pressure to prioritize convenience. But that is not wise.<br> The right approach is to look for a ocmpromise which serves both<br> goals. I am sure we can find one here.<br> <br> I suggested showing the message once a day, because that is what first<br> occurred to me. But there are lots of ways to vary the details.<br> Here's an idea. For each language model, it could diisplay the<br> message the first, second, fifth, tenth, and after that every tenth<br> time the user starts that mode. With this, the frequency of little<br> annoyance will diminish soon, but the point will not be forgotten.<br> <br> <br> You made suggestions for how to exclude more code from Emacs itself,<br> and support for obscure language models we probably should exclude.<br> But there is no need to exclude the support for the well-known ones,<br> as I've explained.<br> <br> And we can do better than that! We can educate the users about what<br> is wrong with those systems -- something that the media hysteria fails<br> to mention at all. That is important -- let's use Emacs for it!<br> <br> > All implementations can then separately be made available on some other<br> > package library not associated with GNU. In this scenario, I wouldn't have<br> > warnings on those implementations, just as the many llm-based packages on<br> > various alternative ELPAs do not have warnings today.<br> <br> They ought to show warnings -- the issue is exactly the same.<br> <br> We should not slide quietly into acceptance and normalization of a new<br> systematic injustice. Opposing it is our job.<br> <br> -- <br> Dr Richard Stallman (<a href="https://stallman.org" rel="noreferrer" target="_blank">https://stallman.org</a>)<br> Chief GNUisance of the GNU Project (<a href="https://gnu.org" rel="noreferrer" target="_blank">https://gnu.org</a>)<br> Founder, Free Software Foundation (<a href="https://fsf.org" rel="noreferrer" target="_blank">https://fsf.org</a>)<br> Internet Hall-of-Famer (<a href="https://internethalloffame.org" rel="noreferrer" target="_blank">https://internethalloffame.org</a>)<br> <br> <br> </blockquote></div> --0000000000008f849f0603dd34e7--
Re: [NonGNU ELPA] New package: llm
Author: Jim Porter
Date: Sun, 27 Aug 2023 11:36
Date: Sun, 27 Aug 2023 11:36
54 lines
2362 bytes
2362 bytes
On 8/26/2023 6:07 PM, Andrew Hyatt wrote: > To save you the trouble of looking at the code to see what exactly it > says, here's the function I'm using to warn: > > (defun llm--warn-on-nonfree (name tos) > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > NAME is the human readable name of the LLM (e.g 'Open AI'). > > TOS is the URL of the terms of service for the LLM. > > All non-free LLMs should call this function on each llm function > invocation." > (when llm-warn-on-nonfree > (lwarn '(llm nonfree) :warning "%s API is not free software, and > your freedom to use it is restricted. > See %s for the details on the restrictions on use." name tos))) To make this easier on third parties writing their own implementations for other LLMs, maybe you could make this (mostly) automatic? I see that you're using 'cl-defgeneric' in the code, so what about something like this? (cl-defgeneric llm-free-p (provider) "Return non-nil if PROVIDER is a freedom-respecting model." nil) (cl-defmethod llm-free-p ((provider my-free-llm)) t) Then, if all user-facing functions have some implementation that always calls this (maybe using the ":before" key for the generic functions?), third parties won't forget to set up the warning code; instead, they'll need to explicitly mark their LLM provider as free. I also see that there's a defcustom ('llm-warn-on-nonfree') that lets people opt out of this. I think it's a good idea to give users that control, but should this follow a similar pattern to 'inhibit-startup-echo-area-message'? Its docstring says: > The startup message is in the echo area as it provides information > about GNU Emacs and the GNU system in general, which we want all > users to see. As this is the least intrusive startup message, > this variable gets specialized treatment to prevent the message > from being disabled site-wide by systems administrators, while > still allowing individual users to do so. > > Setting this variable takes effect only if you do it with the > customization buffer or if your init file contains a line of this > form: > (setq inhibit-startup-echo-area-message "YOUR-USER-NAME") If we want it to be easy for users to opt out of the message, but hard for admins (or other packages) to automate opting out, something like the above might make sense.
Re: [NonGNU ELPA] New package: llm
Author: Philip Kaluderci
Date: Sun, 27 Aug 2023 13:11
Date: Sun, 27 Aug 2023 13:11
110 lines
4401 bytes
4401 bytes
Andrew Hyatt <ahyatt@gmail.com> writes: > I've now made the changes requested to the llm package on github ( > https://github.com/ahyatt/llm). > > Because what was requested was a warning to the user, I used `lwarn', and > have added an option to turn the warnings off (and the user can turn the > warnings off through the warning mechanism as well, via > `warning-suppress-log-types'). > > To save you the trouble of looking at the code to see what exactly it says, > here's the function I'm using to warn: > > (defun llm--warn-on-nonfree (name tos) > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > NAME is the human readable name of the LLM (e.g 'Open AI'). > > TOS is the URL of the terms of service for the LLM. > > All non-free LLMs should call this function on each llm function > invocation." > (when llm-warn-on-nonfree > (lwarn '(llm nonfree) :warning "%s API is not free software, and your > freedom to use it is restricted. > See %s for the details on the restrictions on use." name tos))) > > If this is sufficient, please consider accepting this package into GNU ELPA > (see above where we decided this is a better fit than the Non-GNU ELPA). I would be fine with this, and would go ahead if there are no objections. > > On Sat, Aug 12, 2023 at 9:43 PM Richard Stallman <rms@gnu.org> wrote: > >> [[[ To any NSA and FBI agents reading my email: please consider ]]] >> [[[ whether defending the US Constitution against all enemies, ]]] >> [[[ foreign or domestic, requires you to follow Snowden's example. ]]] >> >> > What you are saying is consistent with the GNU coding standard. >> However, I >> > think any message about this would be annoying, >> >> I am sure it would be a little annoying. But assuming the user can >> type SPC and move on from that message, the annoyance will be quite >> little. >> >> personally, and would >> be a >> > deterrent for clients to use this library. >> >> If the library is quite useful I doubt anyone would be deterred. >> If anyone minded it the message enough to stop using the package, perse >> could >> edit this out of the code. >> >> This issue is an example of those where two different values are >> pertinent. There is convenience, which counts but is superficial. >> And there is the purpose of the GNU system, which for 40 years has led >> the fight against injustice in software. That value is deep and, in the >> long term, the most important value of all. >> >> When they conflict in a specific practical matter, there is always >> pressure to prioritize convenience. But that is not wise. >> The right approach is to look for a ocmpromise which serves both >> goals. I am sure we can find one here. >> >> I suggested showing the message once a day, because that is what first >> occurred to me. But there are lots of ways to vary the details. >> Here's an idea. For each language model, it could diisplay the >> message the first, second, fifth, tenth, and after that every tenth >> time the user starts that mode. With this, the frequency of little >> annoyance will diminish soon, but the point will not be forgotten. >> >> >> You made suggestions for how to exclude more code from Emacs itself, >> and support for obscure language models we probably should exclude. >> But there is no need to exclude the support for the well-known ones, >> as I've explained. >> >> And we can do better than that! We can educate the users about what >> is wrong with those systems -- something that the media hysteria fails >> to mention at all. That is important -- let's use Emacs for it! >> >> > All implementations can then separately be made available on some other >> > package library not associated with GNU. In this scenario, I wouldn't >> have >> > warnings on those implementations, just as the many llm-based packages >> on >> > various alternative ELPAs do not have warnings today. >> >> They ought to show warnings -- the issue is exactly the same. >> >> We should not slide quietly into acceptance and normalization of a new >> systematic injustice. Opposing it is our job. >> >> -- >> Dr Richard Stallman (https://stallman.org) >> Chief GNUisance of the GNU Project (https://gnu.org) >> Founder, Free Software Foundation (https://fsf.org) >> Internet Hall-of-Famer (https://internethalloffame.org) >> >> >>
Re: [NonGNU ELPA] New package: llm
Author: Jim Porter
Date: Sun, 27 Aug 2023 19:59
Date: Sun, 27 Aug 2023 19:59
42 lines
1968 bytes
1968 bytes
On 8/27/2023 7:32 PM, Andrew Hyatt wrote: > After following Jim Porter's suggestion above, here is the new function, > and you can see the advice we're giving in the docstring: > > (cl-defgeneric llm-nonfree-message-info (provider) > "If PROVIDER is non-free, return info for a warning. > This should be a cons of the name of the LLM, and the URL of the > terms of service. > > If the LLM is free and has no restrictions on use, this should > return nil. Since this function already returns nil, there is no > need to override it." > (ignore provider) > nil) For what it's worth, I was thinking about having the default be the opposite: warn users by default, since we don't really know if an LLM provider is free unless the Elisp code indicates it. (Otherwise, it could simply mean the author of that provider forgot to override 'llm-nonfree-message-info'.) In other words, assume the worst by default. :) That said, if everyone else thinks this isn't an issue, I won't stamp my feet about it. As for the docstring, I see that many models use ordinary software licenses, such as the Apache license. That could make it easier for us to define the criteria for a libre provider: is the model used by the provider available under a license the FSF considers a free software license?[1] (For LLM providers that you use by making a web request, we could also expect that all the code for their web API is libre too. However, that code is comparatively uninteresting, and so long as you could get the model to use on a self-hosted system[2], I don't see a need to warn the user.) (Also, if you prefer to avoid having to say '(ignore provider)', you can also prefix 'provider' with an underscore. That'll make the byte compiler happy.) [1] https://www.gnu.org/licenses/license-list.en.html [2] At least, in theory. A user might not have enough computing power to use the model in practice, but I don't think that matters for this case.
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Sun, 27 Aug 2023 20:19
Date: Sun, 27 Aug 2023 20:19
161 lines
7121 bytes
7121 bytes
--00000000000036371b0603f0a5d7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sun, Aug 27, 2023 at 2:36 PM Jim Porter <jporterbugs@gmail.com> wrote: > On 8/26/2023 6:07 PM, Andrew Hyatt wrote: > > To save you the trouble of looking at the code to see what exactly it > > says, here's the function I'm using to warn: > > > > (defun llm--warn-on-nonfree (name tos) > > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > > NAME is the human readable name of the LLM (e.g 'Open AI'). > > > > TOS is the URL of the terms of service for the LLM. > > > > All non-free LLMs should call this function on each llm function > > invocation." > > (when llm-warn-on-nonfree > > (lwarn '(llm nonfree) :warning "%s API is not free software, and > > your freedom to use it is restricted. > > See %s for the details on the restrictions on use." name tos))) > > To make this easier on third parties writing their own implementations > for other LLMs, maybe you could make this (mostly) automatic? I see that > you're using 'cl-defgeneric' in the code, so what about something like > this? > > (cl-defgeneric llm-free-p (provider) > "Return non-nil if PROVIDER is a freedom-respecting model." > nil) > > (cl-defmethod llm-free-p ((provider my-free-llm)) > t) > > Then, if all user-facing functions have some implementation that always > calls this (maybe using the ":before" key for the generic functions?), > third parties won't forget to set up the warning code; instead, they'll > need to explicitly mark their LLM provider as free. > Good idea. I implemented something close to what you suggest, but I had to make a few changes to get it to be workable. Thank you for the suggestion! > I also see that there's a defcustom ('llm-warn-on-nonfree') that lets > people opt out of this. I think it's a good idea to give users that > control, but should this follow a similar pattern to > 'inhibit-startup-echo-area-message'? Its docstring says: > > > The startup message is in the echo area as it provides information > > about GNU Emacs and the GNU system in general, which we want all > > users to see. As this is the least intrusive startup message, > > this variable gets specialized treatment to prevent the message > > from being disabled site-wide by systems administrators, while > > still allowing individual users to do so. > > > > Setting this variable takes effect only if you do it with the > > customization buffer or if your init file contains a line of this > > form: > > (setq inhibit-startup-echo-area-message "YOUR-USER-NAME") > > If we want it to be easy for users to opt out of the message, but hard > for admins (or other packages) to automate opting out, something like > the above might make sense. > Very interesting, thanks. I took a look at the implementation, and I'd prefer not to do anything like that (which involves looking through the user's init file, and seems like it would miss at least some cases). For now, I'll keep it simple. --00000000000036371b0603f0a5d7 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Sun, Aug 27, 2023 at 2:36 PM Jim Porter <<a href="mailto:jporterbugs@gmail.com">jporterbugs@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 8/26/2023 6:07 PM, Andrew Hyatt wrote:<br> > To save you the trouble of looking at the code to see what exactly it <br> > says, here's the function I'm using to warn:<br> > <br> > (defun llm--warn-on-nonfree (name tos)<br> > "Issue a warning if `llm-warn-on-nonfree' is non-nil.<br> > NAME is the human readable name of the LLM (e.g 'Open AI').<br> > <br> > TOS is the URL of the terms of service for the LLM.<br> > <br> > All non-free LLMs should call this function on each llm function<br> > invocation."<br> > (when llm-warn-on-nonfree<br> > (lwarn '(llm nonfree) :warning "%s API is not free software, and <br> > your freedom to use it is restricted.<br> > See %s for the details on the restrictions on use." name tos)))<br> <br> To make this easier on third parties writing their own implementations <br> for other LLMs, maybe you could make this (mostly) automatic? I see that <br> you're using 'cl-defgeneric' in the code, so what about something like this?<br> <br> (cl-defgeneric llm-free-p (provider)<br> "Return non-nil if PROVIDER is a freedom-respecting model."<br> nil)<br> <br> (cl-defmethod llm-free-p ((provider my-free-llm))<br> t)<br> <br> Then, if all user-facing functions have some implementation that always <br> calls this (maybe using the ":before" key for the generic functions?), <br> third parties won't forget to set up the warning code; instead, they'll <br> need to explicitly mark their LLM provider as free.<br></blockquote><div><br></div><div>Good idea. I implemented something close to what you suggest, but I had to make a few changes to get it to be workable.</div><div>Thank you for the suggestion!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> I also see that there's a defcustom ('llm-warn-on-nonfree') that lets <br> people opt out of this. I think it's a good idea to give users that <br> control, but should this follow a similar pattern to <br> 'inhibit-startup-echo-area-message'? Its docstring says:<br> <br> > The startup message is in the echo area as it provides information<br> > about GNU Emacs and the GNU system in general, which we want all<br> > users to see. As this is the least intrusive startup message,<br> > this variable gets specialized treatment to prevent the message<br> > from being disabled site-wide by systems administrators, while<br> > still allowing individual users to do so.<br> > <br> > Setting this variable takes effect only if you do it with the<br> > customization buffer or if your init file contains a line of this<br> > form:<br> > (setq inhibit-startup-echo-area-message "YOUR-USER-NAME")<br> <br> If we want it to be easy for users to opt out of the message, but hard <br> for admins (or other packages) to automate opting out, something like <br> the above might make sense.<br></blockquote><div><br></div><div>Very interesting, thanks. I took a look at the implementation, and I'd prefer not to do anything like that (which involves looking through the user's init file, and seems like it would miss at least some cases). For now, I'll keep it simple. </div></div></div> --00000000000036371b0603f0a5d7--
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Sun, 27 Aug 2023 21:31
Date: Sun, 27 Aug 2023 21:31
40 lines
1551 bytes
1551 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > (defun llm--warn-on-nonfree (name tos) > > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > > NAME is the human readable name of the LLM (e.g 'Open AI'). > > > > TOS is the URL of the terms of service for the LLM. > > > > All non-free LLMs should call this function on each llm function > > invocation." > > (when llm-warn-on-nonfree > > (lwarn '(llm nonfree) :warning "%s API is not free software, and your > > freedom to use it is restricted. > > See %s for the details on the restrictions on use." name tos))) I presume that the developers judge whether any given LLM calls for a warning, and add a call to this function if it does. Right? The basic approach looks right, bit it raises two questions about details: 1. What exactly is the criterion for deciding whether a given LLM should call this function? In other words, what are the conditions on which we should warn the user? Let's discuss that to make sure we get it right. 2. Is it better to include the TSO URL in the warning, or better NOT to include it and thus avoid helping bad guys publicize their demands? -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Sun, 27 Aug 2023 22:32
Date: Sun, 27 Aug 2023 22:32
162 lines
6997 bytes
6997 bytes
--000000000000115e8a0603f28209 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sun, Aug 27, 2023 at 9:32 PM Richard Stallman <rms@gnu.org> wrote: > [[[ To any NSA and FBI agents reading my email: please consider ]]] > [[[ whether defending the US Constitution against all enemies, ]]] > [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > > > (defun llm--warn-on-nonfree (name tos) > > > "Issue a warning if `llm-warn-on-nonfree' is non-nil. > > > NAME is the human readable name of the LLM (e.g 'Open AI'). > > > > > > TOS is the URL of the terms of service for the LLM. > > > > > > All non-free LLMs should call this function on each llm function > > > invocation." > > > (when llm-warn-on-nonfree > > > (lwarn '(llm nonfree) :warning "%s API is not free software, and > your > > > freedom to use it is restricted. > > > See %s for the details on the restrictions on use." name tos))) > > I presume that the developers judge whether any given LLM calls for a > warning, and add a call to this function if it does. Right? > > The basic approach looks right, bit it raises two questions about > details: > > 1. What exactly is the criterion for deciding whether a given LLM > should call this function? In other words, what are the conditions on > which we should warn the user? Let's discuss that to make sure we > get it right. > After following Jim Porter's suggestion above, here is the new function, and you can see the advice we're giving in the docstring: (cl-defgeneric llm-nonfree-message-info (provider) "If PROVIDER is non-free, return info for a warning. This should be a cons of the name of the LLM, and the URL of the terms of service. If the LLM is free and has no restrictions on use, this should return nil. Since this function already returns nil, there is no need to override it." (ignore provider) nil) So, "free and no restrictions on use". I'm happy to link to any resources to help users understand better if you think it is needed. > > 2. Is it better to include the TSO URL in the warning, or better NOT > to include it and thus avoid helping bad guys publicize their demands? I think it's best to include it. To claim there are restrictions on use, but not reference those same restrictions strikes me as incomplete, from the point of view of the user who will be looking at the warning. > > > -- > Dr Richard Stallman (https://stallman.org) > Chief GNUisance of the GNU Project (https://gnu.org) > Founder, Free Software Foundation (https://fsf.org) > Internet Hall-of-Famer (https://internethalloffame.org) > > > --000000000000115e8a0603f28209 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Sun, Aug 27, 2023 at 9:32 PM Richard Stallman <<a href="mailto:rms@gnu.org">rms@gnu.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[[[ To any NSA and FBI agents reading my email: please consider ]]]<br> [[[ whether defending the US Constitution against all enemies, ]]]<br> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]<br> <br> > > (defun llm--warn-on-nonfree (name tos)<br> > > "Issue a warning if `llm-warn-on-nonfree' is non-nil.<br> > > NAME is the human readable name of the LLM (e.g 'Open AI').<br> > ><br> > > TOS is the URL of the terms of service for the LLM.<br> > ><br> > > All non-free LLMs should call this function on each llm function<br> > > invocation."<br> > > (when llm-warn-on-nonfree<br> > > (lwarn '(llm nonfree) :warning "%s API is not free software, and your<br> > > freedom to use it is restricted.<br> > > See %s for the details on the restrictions on use." name tos)))<br> <br> I presume that the developers judge whether any given LLM calls for a<br> warning, and add a call to this function if it does. Right?<br> <br> The basic approach looks right, bit it raises two questions about<br> details:<br> <br> 1. What exactly is the criterion for deciding whether a given LLM<br> should call this function? In other words, what are the conditions on<br> which we should warn the user? Let's discuss that to make sure we<br> get it right.<br></blockquote><div><br></div><div>After following Jim Porter's suggestion above, here is the new function, and you can see the advice we're giving in the docstring:</div><div><br></div><div>(cl-defgeneric llm-nonfree-message-info (provider)<br> "If PROVIDER is non-free, return info for a warning.<br>This should be a cons of the name of the LLM, and the URL of the<br>terms of service.<br><br>If the LLM is free and has no restrictions on use, this should<br>return nil. Since this function already returns nil, there is no<br>need to override it."<br> (ignore provider)<br> nil)<br></div><div><br></div><div>So, "free and no restrictions on use". I'm happy to link to any resources to help users understand better if you think it is needed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> 2. Is it better to include the TSO URL in the warning, or better NOT<br> to include it and thus avoid helping bad guys publicize their demands?</blockquote><div><br></div><div>I think it's best to include it. To claim there are restrictions on use, but not reference those same restrictions strikes me as incomplete, from the point of view of the user who will be looking at the warning.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> -- <br> Dr Richard Stallman (<a href="https://stallman.org" rel="noreferrer" target="_blank">https://stallman.org</a>)<br> Chief GNUisance of the GNU Project (<a href="https://gnu.org" rel="noreferrer" target="_blank">https://gnu.org</a>)<br> Founder, Free Software Foundation (<a href="https://fsf.org" rel="noreferrer" target="_blank">https://fsf.org</a>)<br> Internet Hall-of-Famer (<a href="https://internethalloffame.org" rel="noreferrer" target="_blank">https://internethalloffame.org</a>)<br> <br> <br> </blockquote></div></div> --000000000000115e8a0603f28209--
Re: [NonGNU ELPA] New package: llm
Author: Andrew Hyatt
Date: Mon, 28 Aug 2023 00:54
Date: Mon, 28 Aug 2023 00:54
166 lines
7766 bytes
7766 bytes
--0000000000002796380603f47f5e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sun, Aug 27, 2023 at 10:59 PM Jim Porter <jporterbugs@gmail.com> wrote: > On 8/27/2023 7:32 PM, Andrew Hyatt wrote: > > After following Jim Porter's suggestion above, here is the new function, > > and you can see the advice we're giving in the docstring: > > > > (cl-defgeneric llm-nonfree-message-info (provider) > > "If PROVIDER is non-free, return info for a warning. > > This should be a cons of the name of the LLM, and the URL of the > > terms of service. > > > > If the LLM is free and has no restrictions on use, this should > > return nil. Since this function already returns nil, there is no > > need to override it." > > (ignore provider) > > nil) > > For what it's worth, I was thinking about having the default be the > opposite: warn users by default, since we don't really know if an LLM > provider is free unless the Elisp code indicates it. (Otherwise, it > could simply mean the author of that provider forgot to override > 'llm-nonfree-message-info'.) In other words, assume the worst by > default. :) That said, if everyone else thinks this isn't an issue, I > won't stamp my feet about it. > I agree that it'd be nice to have that property. That's the way I had it initially, but since you need info if it's non-free (the name / TOS), but not if it is free, the design where free was the default was the simplest. The alternative was one method indicating it was free/nonfree and the other, if non-free, to provide the additional information. > > As for the docstring, I see that many models use ordinary software > licenses, such as the Apache license. That could make it easier for us > to define the criteria for a libre provider: is the model used by the > provider available under a license the FSF considers a free software > license?[1] (For LLM providers that you use by making a web request, we > could also expect that all the code for their web API is libre too. > However, that code is comparatively uninteresting, and so long as you > could get the model to use on a self-hosted system[2], I don't see a > need to warn the user.) > I agree that it'd be nice to define this in a more clear way, but we also can just wait until someone proposes a free LLM to include to judge it. We can always bring it back to the emacs-devel list if there is uncertainty. The hosting code is not that relevant here. For these companies, there would be restrictions on the use of the model even if there were no other unfree software in the middle (kind of like how Llama 2 is). Notably, no company is going to want the user to train competing models with their model. This is the most common restriction on freedoms of the user. > > (Also, if you prefer to avoid having to say '(ignore provider)', you can > also prefix 'provider' with an underscore. That'll make the byte > compiler happy.) > TIL, that's a great tip, thanks! > > [1] https://www.gnu.org/licenses/license-list.en.html > > [2] At least, in theory. A user might not have enough computing power to > use the model in practice, but I don't think that matters for this case. > --0000000000002796380603f47f5e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir="ltr"><div dir="ltr">On Sun, Aug 27, 2023 at 10:59 PM Jim Porter <<a href="mailto:jporterbugs@gmail.com">jporterbugs@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 8/27/2023 7:32 PM, Andrew Hyatt wrote:<br> > After following Jim Porter's suggestion above, here is the new function, <br> > and you can see the advice we're giving in the docstring:<br> > <br> > (cl-defgeneric llm-nonfree-message-info (provider)<br> > "If PROVIDER is non-free, return info for a warning.<br> > This should be a cons of the name of the LLM, and the URL of the<br> > terms of service.<br> > <br> > If the LLM is free and has no restrictions on use, this should<br> > return nil. Since this function already returns nil, there is no<br> > need to override it."<br> > (ignore provider)<br> > nil)<br> <br> For what it's worth, I was thinking about having the default be the <br> opposite: warn users by default, since we don't really know if an LLM <br> provider is free unless the Elisp code indicates it. (Otherwise, it <br> could simply mean the author of that provider forgot to override <br> 'llm-nonfree-message-info'.) In other words, assume the worst by <br> default. :) That said, if everyone else thinks this isn't an issue, I <br> won't stamp my feet about it.<br></blockquote><div><br></div><div>I agree that it'd be nice to have that property. That's the way I had it initially, but since you need info if it's non-free (the name / TOS), but not if it is free, the design where free was the default was the simplest. The alternative was one method indicating it was free/nonfree and the other, if non-free, to provide the additional information.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> As for the docstring, I see that many models use ordinary software <br> licenses, such as the Apache license. That could make it easier for us <br> to define the criteria for a libre provider: is the model used by the <br> provider available under a license the FSF considers a free software <br> license?[1] (For LLM providers that you use by making a web request, we <br> could also expect that all the code for their web API is libre too. <br> However, that code is comparatively uninteresting, and so long as you <br> could get the model to use on a self-hosted system[2], I don't see a <br> need to warn the user.)<br></blockquote><div><br></div><div>I agree that it'd be nice to define this in a more clear way, but we also can just wait until someone proposes a free LLM to include to judge it. We can always bring it back to the emacs-devel list if there is uncertainty.</div><div><br></div><div>The hosting code is not that relevant here. For these companies, there would be restrictions on the use of the model even if there were no other unfree software in the middle (kind of like how Llama 2 is). Notably, no company is going to want the user to train competing models with their model. This is the most common restriction on freedoms of the user.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> (Also, if you prefer to avoid having to say '(ignore provider)', you can <br> also prefix 'provider' with an underscore. That'll make the byte <br> compiler happy.)<br></blockquote><div><br></div><div>TIL, that's a great tip, thanks!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <br> [1] <a href="https://www.gnu.org/licenses/license-list.en.html" rel="noreferrer" target="_blank">https://www.gnu.org/licenses/license-list.en.html</a><br> <br> [2] At least, in theory. A user might not have enough computing power to <br> use the model in practice, but I don't think that matters for this case.<br> </blockquote></div></div> --0000000000002796380603f47f5e--
Re: [NonGNU ELPA] New package: llm
Author: Richard Stallman
Date: Wed, 30 Aug 2023 22:10
Date: Wed, 30 Aug 2023 22:10
81 lines
3552 bytes
3552 bytes
[[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > As for the docstring, I see that many models use ordinary software > licenses, such as the Apache license. That could make it easier for us > to define the criteria for a libre provider: is the model used by the > provider available under a license the FSF considers a free software > license In general, an LLM system consists of two parts: the engine, which is a program written in a programming language, and the trained neural network. For the system to be free, both parts must be free. A number of engines are free software, but it is unusual for a trained neural network to be free. I think that "model" refers to the trained neural network. That's how models are implemented. To figure out whether a program is free by scanning it is hard to do reliably. That is why for LibreJS we designed a more precise method for indicating licenses on parts of a file. I recommend against trying to do this. It should not be a lot of work for a human to check this and get a reliable result. That applies to LLM systems that you download and run on your own machine. As for LLMs that run on servers, they are a different issue entirely. They are all SaaSS (Service as a Software Substitute), and SaaSS is always unjust. See https://gnu.org/philosophy/who-does-that-server-really-serve.html for explanation. So if you contact it over the internet, it should get a warning with a reference to that page. Maybe there is no need need to pass info about the terms of service. Only a service can impose terms of service, and the mere fact that it is a service, rather than a program to download and run, inherently means the user does not control its operation. That by itself is reason for a notice that it is bad. Any restrictions imposed by terms of service could add to the bad. Perhaps it would be good to mention that that second injustice exists. Maybe it would be good to say, This language model treats users unjustly because it does the user's computing on a computer where the user has no control over its operation. It is "Service as a Software Substitute", as we call it. See https://gnu.org/philosophy/who-does-that-server-really-serve.html. In addition, it imposes "terms of service", restrictions over what users can do with the system. That is a second injustice. If society needs to restrict some of the uses of language model systems, it should do so by democratically passing laws to penalize those actions -- regardless of how they are done -- and not by allowing companies to impose restrictions arbitrarily on users. The laws would be more effective at achieving the goal, as weil as avoidng giving anyone unjust power over others. I think that it is better to present the URL of the web site's front page rather than the terms of service themselves. If we point the user at the terms of service, we are directly helping the company impose them. If the user visits the front page, perse can easily find the terms of service. But we will not have directly promoted attention to them. This is a compromise between two flaws. -- Dr Richard Stallman (https://stallman.org) Chief GNUisance of the GNU Project (https://gnu.org) Founder, Free Software Foundation (https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org)
Re: [NonGNU ELPA] New package: llm
Author: Ihor Radchenko
Date: Thu, 31 Aug 2023 09:06
Date: Thu, 31 Aug 2023 09:06
37 lines
1509 bytes
1509 bytes
Richard Stallman <rms@gnu.org> writes: > As for LLMs that run on servers, they are a different issue entirely. > They are all SaaSS (Service as a Software Substitute), and SaaSS is > always unjust. > > See https://gnu.org/philosophy/who-does-that-server-really-serve.html > for explanation. I do not fully agree here. A number of more powerful LLMs have very limiting hardware requirements. For example, some LLMs require 64+Gbs of RAM to run: https://github.com/ggerganov/llama.cpp#memorydisk-requirements. Not every PC is able to handle it, even if both the engine and the neural network weights are free. In such scenario, the base assumption you make in https://gnu.org/philosophy/who-does-that-server-really-serve.html may no longer hold for most users: "Suppose that any software tasks you might need for the job are implemented in free software, and you have copies, and you have whatever data you might need, as well as computers of whatever speed, functionality and capacity might be required." Thus, for many users (owning less powerful computers) LLMs as a service are going to be SaaS, not SaaSS. (Given that the SaaS LLM has free licence and users who choose to buy the necessary hardware retain their freedom to run the same LLM on their hardware.) -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
Thread Navigation
This is a paginated view of messages in the thread with full content displayed inline.
Messages are displayed in chronological order, with the original post highlighted in green.
Use pagination controls to navigate through all messages in large threads.
Back to All Threads