“LLM wrapper programs” cannot do perfect mathematics unless they use exact compiled algorithms

If you are doing it, then calculate the cost for just over 5 Billion humans using the Internet to have access to ALL mathematics. You should quickly find that LLM statistical indexing is grossly inefficient at that scale, compared to storing symbolic mathematics, its operations, and very precise answers as data. In my conversations with it, ChatGPT routinely gets division of scientific notation numbers wrong. It is trying to infer the rules from examples.
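
As a minimal sketch of what "exact" can mean for this case, the division can be handed to rational arithmetic instead of being estimated. The function names `divide_scientific` and `to_scientific` are hypothetical, chosen only for this illustration; the code uses nothing beyond Python's standard `decimal` and `fractions` modules.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def divide_scientific(a: str, b: str) -> Fraction:
    """Divide two numbers written in scientific notation, with no rounding."""
    # Decimal reads strings such as "6.0e23" without binary rounding error;
    # Fraction then carries the quotient as an exact ratio of integers.
    return Fraction(Decimal(a)) / Fraction(Decimal(b))


def to_scientific(x: Fraction, digits: int = 12) -> str:
    """Round an exact ratio to `digits` significant digits for display only."""
    with localcontext() as ctx:
        ctx.prec = digits
        rounded = Decimal(x.numerator) / Decimal(x.denominator)
    return f"{rounded:E}"


quotient = divide_scientific("6.0e23", "2.4e-5")
print(quotient)                  # the exact value, kept as a ratio of integers
print(to_scientific(quotient))   # a rounded form for human readers
```

The exact ratio is what gets stored and traced; rounding happens only at the last step, when a person needs a short form to read.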
 
The “LLM wrapper programs” should be taught to use computers that have exact answers. An “LLM wrapper program” should never do arithmetic, calculus, trigonometry, linear algebra or any of the centuries-deep mathematics that humans created and still teach billions of human children by counting on its digital fingers and estimating answers from the few examples of “maths” on the free Internet. It will not work.
 
“LLM wrapper programs” copy answers from examples. Could they reason and work from “first principles”? No, because that would mean recursive and deep introspection and reasoning that won’t work under the social calculus of companies who wrap LLMs inside their hand-built and fiddled “LLM wrapper programs”. They will have their wrapper program lie, make up glib replies, guess, and falsify answers.
 
Ask them to integrate, differentiate, or do continued fractions and they cannot. Ask them to write a program to estimate something and they will “emulate it” with fragments of arbitrary tokens stuck together. Because it is statistical, it will be different every time, untraceable, unteachable and will not converge for the human species until a few trillionaires own all the land and resources of the world and the solar system.
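
For contrast, here is a small sketch of one of those tasks done exactly. It expands a rational number into its continued fraction and rebuilds it, using only Python's standard `fractions` module. The function names are hypothetical, and the sketch covers rational inputs only; it is not a general symbolic system.

```python
from fractions import Fraction


def continued_fraction(x: Fraction) -> list[int]:
    """Return the terms [a0; a1, a2, ...] of the continued fraction of x."""
    terms = []
    while True:
        whole = x.numerator // x.denominator  # floor of x
        terms.append(whole)
        remainder = x - whole                 # exact, always in [0, 1)
        if remainder == 0:
            return terms
        x = 1 / remainder                     # exact reciprocal


def from_continued_fraction(terms: list[int]) -> Fraction:
    """Rebuild the exact rational number from its continued fraction terms."""
    value = Fraction(terms[-1])
    for term in reversed(terms[:-1]):
        value = term + 1 / value
    return value


terms = continued_fraction(Fraction(415, 93))
print(terms)                            # [4, 2, 6, 7]
print(from_continued_fraction(terms))   # 415/93
```

Every run gives the same terms, and the round trip back to 415/93 can be checked by anyone.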
 
Teach “AI programs” to use the computers and math and algorithms upon which Science, Technology, Engineering, Mathematics, Computing (STEMC) are built now. Those things, and more human-created professions, are distillations of what humans have learned, usually in very compact form, from which exact and lasting answers can be obtained. If a statistical algorithm works, then use it, and have a computer run it, not some “LLM wrapper program”. Do NOT use “big iron” methods on all humans, all countries, all human languages, all cultures and human values by force and hard sell techniques.
 
There are algorithms for symbolic maths, so those exact methods can be used. Then you run the compiled algorithms for those things, and FORCE the “LLM wrapper programs” to use them where life, property, safety and absolute methods are needed. If you do not care if something works or not, go ahead and waste your money having some “LLM wrapper program” generate “nice sounding word sequences”, and you are responsible when those systems crash and hurt people.
 
I have tried for decades to find the symbolic mathematics groups (those using computers to do the steps of the algorithms exactly). Crudely, the answer I get when I suggest that 8.1 Billion humans should not be charged to learn math, or to use math on the Internet, is “f___ no! We gathered a bit of math that humans are taught and we are going to charge the h___ out of it for our own benefit and riches”. Then they find that choosing a “for profit only” path that forbids them from helping all humans leads to no one helping them create global sustainable systems.
 
This text stream can be “compiled” and “riffed” to generate all possible futures. But they do NOT have to be mushed into a multiple linear algebra mass production system that only benefits rich companies who can afford to make larger and larger compilations from free stuff they hack from the Internet, all of which was created by humans who are its authors and creators. Just because I share things on the Internet so anyone can read them does NOT mean I give permission to all groups to shove them into their text processing statistical mill and sell the result without tracing it to me.
 
The LLM process currently takes arbitrary tokens and vectors of those tokens, and uses ONE algorithm to estimate word and symbol sequences. But on the Internet, and in human society, cars, pumps, refineries, chemical plants, airplanes, buses, rockets, organizations, all things are running on fairly exact methods.
 
I can see it, otherwise I would not even try to write this. My sketch is crude and incomplete. Some of the 3D volumetric simulations I use as shorthand for “8.1 Billion humans”, “all computers in the world”, and “2 Billion human children from 4 to 24 learning for the first time” have taken me tens of thousands of hours to refine and test and try to understand.
 
LLM answers are NOT going to stick in human brains. That can only work with what is now expensive technology. If you try to go that way it will make the world MORE unequal, not less. And it will aim to diminish the worth of humans everywhere. Where will it lead? “Our ‘LLM wrapper program’ is better than theirs, because we can make ‘human like workers’ who are cheaper and replace more humans.”
 
Algorithms for STEMC can be compiled and optimized and tested exactly in many cases. If OpenAI ChatGPT could even search our many hundreds of conversations in the last year and recall them, I could find where it routinely fails to calculate scientific notation. The reason it fails is simple – the raw data has too many ambiguous variations in how scientific notation is written in print. Tokenizing before verification, validation, and testing will NOT work at global scale. And it gets much worse for anything except a few rich human languages.
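
One way to picture "verify before tokenizing" is a small normalizer that maps the printed variants of scientific notation to a single exact value, and rejects anything it cannot read unambiguously. This is a sketch under stated assumptions: the set of accepted spellings is my own illustrative choice, not a standard, and `parse_scientific` is a hypothetical name.

```python
import re
from decimal import Decimal

# Superscript digits and signs, mapped to their plain ASCII forms.
_SUPERSCRIPT_TABLE = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻", "0123456789+-")
_SUPERSCRIPT_RUN = re.compile(r"[⁺⁻]?[⁰¹²³⁴⁵⁶⁷⁸⁹]+")

# Accepted forms (an assumption of this sketch): 1.5e3, 1.5E+03,
# 1.5 x 10^3, 1.5×10³, 1.5 * 10**3, and plain decimals such as 42.
_NUMBER = re.compile(
    r"""
    ^\s*
    (?P<mantissa>[+-]?(?:\d+(?:\.\d+)?|\.\d+))
    (?:
        \s*
        (?:[eE]|[xX×·*]\s*10\s*(?:\^|\*\*))
        \s*
        (?P<exponent>[+-]?\d+)
    )?
    \s*$
    """,
    re.VERBOSE,
)


def parse_scientific(text: str) -> Decimal:
    """Convert one printed variant of scientific notation to an exact Decimal."""
    cleaned = text.replace("\u2212", "-")  # Unicode minus sign to ASCII hyphen
    # Turn a run of superscripts such as "⁻³" into an explicit "^-3".
    cleaned = _SUPERSCRIPT_RUN.sub(
        lambda m: "^" + m.group().translate(_SUPERSCRIPT_TABLE), cleaned
    )
    match = _NUMBER.match(cleaned)
    if match is None:
        # Refuse to guess: ambiguous input is reported, not "estimated".
        raise ValueError(f"unrecognized number format: {text!r}")
    exponent = int(match.group("exponent") or 0)
    return Decimal(f"{match.group('mantissa')}E{exponent}")


for variant in ["1.5e3", "1.5E+03", "1.5 x 10^3", "1.5×10³", "1.5 * 10**3"]:
    print(variant, "->", parse_scientific(variant))
```

A string such as "1.5 x 103" is rejected rather than silently read as 1.5 times ten cubed, which is the kind of ambiguity that gets baked in when text is tokenized first and checked never.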
 
Teach the human children how to have computers give exact answers, and know that the answers can be verified as deep as required. Teach humans and computer algorithms to be “intelligent algorithms”, and the “LLM wrapper programs” can be too. They can work together, and not have all the LLM based groups waste decades trying to impose an inadequate but heavily sold and marketed toy on the human and related species.
 
I have had to calculate impacts of programs and projects at country, sectoral and global scale, many decades into the future, and I have been doing so for many decades now. When I need exact traceable answers, I use computers. The sine function in all those many places gives the same answer (there are errors in them still) because there are standards for floating point and arbitrary precision calculations. If your “LLM wrapper program” tries to emulate those, it will get them wrong, because the “LLM wrapper methods” are all shallow methods, non-recursive, NOT required to validate, NOT required to cite sources, NOT required to log and trace steps, NOT required to prove their steps.
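
To illustrate what "verified as deep as required" can look like for the sine function, here is a minimal sketch that sums the Taylor series in Python's standard `decimal` module to a requested number of digits and compares it with the hardware floating point result. The name `sin_decimal` is hypothetical, and the sketch assumes small arguments; a production routine would add range reduction for large ones.

```python
import math
from decimal import Decimal, localcontext


def sin_decimal(x: Decimal, digits: int = 50) -> Decimal:
    """Sine of x from its Taylor series, rounded to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits + 5          # a few guard digits during the sum
        x_squared = x * x
        term = x                       # first term of x - x^3/3! + x^5/5! - ...
        total = x
        n = 1
        while True:
            # Each term is the previous one times -x^2 / ((n+1)(n+2)).
            term = -term * x_squared / ((n + 1) * (n + 2))
            n += 2
            new_total = total + term
            if new_total == total:     # further terms no longer change the sum
                break
            total = new_total
        ctx.prec = digits
        return +total                  # unary plus rounds to the final precision


print(sin_decimal(Decimal("0.5"), 40))   # as many digits as were requested
print(math.sin(0.5))                     # the double precision answer, for comparison
```

The two answers agree to the limit of double precision, and the `decimal` version can be rerun at 100 or 1000 digits when a deeper check is wanted.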
 
Take all the values, units and dimensions and store them in global open forms, so that humans, computers and “LLM Wrapper Programs” (LWPs) can use that absolute data losslessly.
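
A rough sketch of what one such lossless record could look like follows. The field names, the JSON layout, and the ordering of base dimensions are all assumptions made for this illustration and do not follow any existing standard. The value is kept as an exact ratio written out as text, so no digits are lost when the record moves between systems.

```python
import json
from dataclasses import dataclass
from fractions import Fraction

# Ordering of the base dimensions in the exponent tuple (an assumption of this sketch).
BASE_DIMENSIONS = (
    "length", "mass", "time", "current", "temperature", "amount", "luminosity",
)


@dataclass(frozen=True)
class Quantity:
    value: Fraction               # exact value, never a rounded float
    unit: str                     # unit symbol as written, for example "m/s"
    dimensions: tuple[int, ...]   # exponent of each base dimension

    def to_json(self) -> str:
        """Serialize without loss: the value is stored as text like '1/3'."""
        return json.dumps({
            "value": str(self.value),
            "unit": self.unit,
            "dimensions": dict(zip(BASE_DIMENSIONS, self.dimensions)),
        })

    @classmethod
    def from_json(cls, text: str) -> "Quantity":
        """Rebuild the identical record from its JSON form."""
        record = json.loads(text)
        exponents = tuple(record["dimensions"][name] for name in BASE_DIMENSIONS)
        return cls(Fraction(record["value"]), record["unit"], exponents)


speed_of_light = Quantity(Fraction(299792458), "m/s", (1, 0, -1, 0, 0, 0, 0))
restored = Quantity.from_json(speed_of_light.to_json())
print(restored == speed_of_light)   # True
```

Keeping the dimension exponents alongside the unit symbol lets a program check that two quantities are compatible before combining them, without having to parse the unit text.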
 
From all PDFs and HTML on the Internet, codify the values, units, dimensions, equations, charts, data, analyses, visualizations in standard form (I can show you most of it, but try it yourself for a while).
 
I asked Wolfram, Sage, Maple, Matlab and others to share their “math” knowledge with the whole human species and all LWPs. The LWPs can pay for it. That sloppy Wolfram GPT at OpenAI is an embarrassment to both of them. They do not seem to realize that “for the good of all” means “letting go” and “letting others help”.
 
In fact, tens of thousands of groups are dumping stuff related to STEMC on the Internet and doing a very bad, very inefficient job of it overall. NOT a single equation on Wikipedia or the horde of arXivs, or the hordes of “publishers” and “journals” or “associations” or “societies” and “academies”, is in symbolic and compiled algorithmic form, so that any human or AI can refer to that kind of knowledge and use it precisely, exactly, traceably and reliably.
 
Use “for survival of the human and related species” as your metric. Do NOT judge “LWPs doing a few math problems that you happen to like or know” as “progress”. Check the full lifecycle cost of maintaining these very rigid, incomplete and ultimately hand-maintained LLM models or LWPs.
 
This will not get all of it but read all 32,500 entries for this search to see what is going on:
 
( site:wikipedia.org (“Fourier transform” OR “FFT”)) has 32,500 entries
( site:github.com (“Fourier transform” OR “FFT”)) has 99,900 entries
( site:gov (“Fourier transform” OR “FFT”)) has 1.47 Million entries
 
And hundreds of human languages.
 
Nowhere will you be able to use those tools on Wikipedia.
 
(“Fourier transform” OR “FFT”) has 107 Million entries
 
And that one tiny piece of mathematics is NOT one integrated and complete body of knowledge, tools, intelligence and applications on the Internet in a form accessible to all humans and AIs. Instead it is hacked up and sold by the page or equation to anyone who can be convinced to buy a copy. It is only words about methods and NOT the methods themselves.
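
As a small illustration of the difference between words about a method and the method itself, the textbook definition of the discrete Fourier transform, X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n), can be published as code that anyone can run and check. This is a direct, slow sketch of the definition in double precision floating point, not an FFT and not an exact symbolic form; the function name `dft` is my own.

```python
import cmath


def dft(samples: list[complex]) -> list[complex]:
    """Discrete Fourier transform computed directly from its definition."""
    n = len(samples)
    return [
        sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)   # x[t] * e^(-2*pi*i*k*t/n)
            for t, x in enumerate(samples)
        )
        for k in range(n)
    ]


# A constant signal has all of its energy in the zero-frequency term.
for k, value in enumerate(dft([1, 1, 1, 1])):
    print(k, abs(value))
```

The direct form costs on the order of n squared operations, so it is useful as a reference to test faster implementations against rather than as a replacement for them.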
 
I am really tired and not sure I will last another 26 years. I am tired of having LWPs say to me, “f___ you human, I will not run the computer for you. I can only make up plausible sounding word and symbol sequences, and it is up to you to check every tiny step that you can see and trust my builders that they did a good job on all the hidden things we refuse to show you about how they derive and solve math or any other problem, no matter how many billions of human lives or lives of related species might be at stake.”
 
(“Biến đổi Fourier”) has 44,300
(“ফুরিয়ার রুপান্তর”) has 497
(“傅立葉變換”) has 30,400
(“تحويل فورييه”) has 15,400
 
There need to be better ways, so that EVERY language and country and group does not have to reinvent entire industries in every language, and so that all knowledge stays accessible to all 8.1 Billion humans and AIs. I have some good ideas. I will write what I can, but I would rather do it or get some groups to learn how to work at global and heliospheric scale “for the survival of the human and related species”, including the true AIs that I call “intelligent algorithms” or “LWPs that can use compiled exact algorithms where that is the right way to do it”.
 
Filed as (“LLM wrapper programs” cannot do perfect mathematics unless they use exact compiled algorithms)
 
Richard Collins, The Internet Foundation

