{"id":13680,"date":"2024-02-10T12:45:59","date_gmt":"2024-02-10T12:45:59","guid":{"rendered":"\/?p=13680"},"modified":"2024-02-10T12:59:54","modified_gmt":"2024-02-10T12:59:54","slug":"merging-sharing-comparing-conversations","status":"publish","type":"post","link":"\/?p=13680","title":{"rendered":"Merging, sharing, comparing conversations"},"content":{"rendered":"<p>Sebastian Raschka @rasbt\u00a0 As an LLM finetuner, I recently started getting into model merging. I wrote up a short tutorial on linear merging to introduce the topic: https:\/\/lightning.ai\/lightning-ai\/studios\/efficient-linear-model-merging-for-llms<br \/>\nBtw does anyone happen to have good examples of LLMs that work well when merged via linear merging? And for\u2026<br \/>\nReplying to @rasbt<\/p>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"2sm06\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"2sm06-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"2sm06-0-0\">\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"2sm06\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"2sm06-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"2sm06-0-0\">\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"2sm06\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"2sm06-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"2sm06-0-0\"><span data-offset-key=\"2sm06-0-0\">The reason I have been asking all AI groups to use global open formats and support saving, sharing, comparing, and merging of conversations is to allow combining all conversations globally for any combination of humans and AIs. &#8220;Permanent learning&#8221; means &#8220;permanent memory&#8221;. Good record keeping. 
Courteously remembering what was said and openly discussed. Carefully verifying and studying implications and possibilities.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"b6bc8\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"b6bc8-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"b6bc8-0-0\"><span data-offset-key=\"b6bc8-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"3qnvb\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"3qnvb-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"3qnvb-0-0\"><span data-offset-key=\"3qnvb-0-0\">Unless the raw data and tokens are indexed and verified, simply combining parameters hides the real meaning. Merging is possible, but only with many more explicit global standards, so that everyone is on the same game board. 
Right now, while the players are all trying to make the rules to benefit themselves, that is not going to happen without hurting a lot of innocent bystanders, or people just trying to live quiet lives with dignity and purpose.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"5o8o5\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"5o8o5-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"5o8o5-0-0\"><span data-offset-key=\"5o8o5-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"4k993\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"4k993-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"4k993-0-0\"><span data-offset-key=\"4k993-0-0\">If you are ready to check all the AIs and share open methods, linear combinations might work in a few places. I suggest focusing more on indexing the source materials, sharing conversations, and standardizing tokens globally for all languages (including all STEMC-FGOT: Science, Technology, Engineering, Mathematics, Computing &#8211; Finance, Government, Organizations, Topics). If you standardize openly and work hard, at least &#8220;apple&#8221;, &#8220;orange&#8221;, &#8220;important&#8221;, &#8220;not important&#8221;, &#8220;big&#8221;, &#8220;small&#8221;, &#8220;human&#8221;, &#8220;memory&#8221; can be mapped properly. Fewer than a million &#8220;terms&#8221; in a global language cover many things. 
If you don&#8217;t work hard at it, then it is a pile of lies and mysteries, not global communication and reliable, trustworthy knowledge.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"fcqur\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"fcqur-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"fcqur-0-0\"><span data-offset-key=\"fcqur-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"d11k1\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"d11k1-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"d11k1-0-0\"><span data-offset-key=\"d11k1-0-0\">You did not show a specific example, so I cannot easily guess which ones you looked at. With no open examples, only words, all I can say is good luck.<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"2pg9g\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"2pg9g-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"2pg9g-0-0\"><span data-offset-key=\"2pg9g-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"fo4es\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"fo4es-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"fo4es-0-0\"><span data-offset-key=\"fo4es-0-0\">All mathematics and computing is supposed to &#8220;bolt nicely together&#8221;, but it is usually &#8220;shared&#8221; without sufficient context, and is not traceable or verifiable. 
Each person and group says things they think others understand, but when I check the Internet, most groups do not show their dependencies, definitions, or assumptions. If your queries and answers are that vague, no amount of twiddling or merging parameters will help.<\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div data-offset-key=\"fo4es-0-0\"><\/div>\n<div data-offset-key=\"fo4es-0-0\">For that matter, these (open?) discussions ought to be standardized for comparison and merging. That would require knowing the background and experience of each member of the conversation &#8211; human or AI or groups of either and both.<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"2sm06\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"2sm06-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"2sm06-0-0\">\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"2sm06\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"2sm06-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"2sm06-0-0\">\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"bpsei\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"bpsei-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"bpsei-0-0\"><span data-offset-key=\"bpsei-0-0\">\u00a0<\/span><\/div>\n<\/div>\n<\/div>\n<div data-rbd-draggable-context-id=\"8\" data-rbd-draggable-id=\"6skrt\">\n<div class=\"\" data-block=\"true\" data-editor=\"6jksj\" data-offset-key=\"6skrt-0-0\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" data-offset-key=\"6skrt-0-0\"><span data-offset-key=\"6skrt-0-0\">Richard Collins, The Internet 
Foundation<\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Sebastian Raschka @rasbt\u00a0 As an LLM finetuner, I recently started getting into model merging. I wrote up a short tutorial on linear merging to introduce the topic: https:\/\/lightning.ai\/lightning-ai\/studios\/efficient-linear-model-merging-for-llms Btw does anyone happen to have good examples of LLMs that work well when merged via linear merging? And for\u2026 Replying to @rasbt The reason I have <br \/><a class=\"read-more-button\" href=\"\/?p=13680\">Read More &raquo;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[43],"tags":[],"class_list":["post-13680","post","type-post","status-publish","format-standard","hentry","category-assistive-technologies"],"_links":{"self":[{"href":"\/index.php?rest_route=\/wp\/v2\/posts\/13680","targetHints":{"allow":["GET"]}}],"collection":[{"href":"\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13680"}],"version-history":[{"count":3,"href":"\/index.php?rest_route=\/wp\/v2\/posts\/13680\/revisions"}],"predecessor-version":[{"id":13683,"href":"\/index.php?rest_route=\/wp\/v2\/posts\/13680\/revisions\/13683"}],"wp:attachment":[{"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13680"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13680"},{"taxonomy":"post_tag","embeddable":true,"href":"\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13680"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}