All part-of-speech groups working together – tokenize the whole Internet so AIs can work with real languages

Paul Rayson,
The reason the OpenAI Bing ChatGPT fails is that it uses a bad tokenizer.
If the part-of-speech community worked together, it could standardize the part-of-speech tokens and encode the entire Internet, so it would not have to be scanned and parsed every time. A pre-tokenized, pre-coded Internet would feed straight into GPT and other AIs.
The difference is that the AIs could work with a foundation of real languages, not arbitrary character sequences.
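For illustration, here is a minimal sketch of that contrast (assuming the tiktoken and nltk Python packages and their data are available; the example sentence is my own): a GPT-style byte-pair tokenizer cuts text into arbitrary subword fragments, while a part-of-speech tagger keeps whole words and labels each with its grammatical role, which is the kind of linguistically grounded token this note argues should be standardized.

```python
# Minimal sketch: GPT-style subword pieces vs. part-of-speech-tagged words.
# Assumes the tiktoken and nltk packages are installed; NLTK resource names
# may vary slightly between versions.
import tiktoken
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

text = "Tokenizers often split unfamiliar words into arbitrary pieces."

# GPT-style byte-pair encoding: integer IDs for subword fragments.
enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode(text)
pieces = [enc.decode([i]) for i in ids]
print("BPE pieces:", pieces)

# Linguistic tokenization: whole words, each labelled with a part of speech.
words = nltk.word_tokenize(text)
print("POS tokens:", nltk.pos_tag(words))
```

The first list shows character sequences chosen by frequency statistics; the second shows word tokens paired with grammatical categories that could, in principle, be agreed on once and reused across the whole Internet.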
I sent you an earlier email.
Richard Collins, The Internet Foundation

Popular Science @PopSci  ChatGPT can actually help you learn to code or prep for an interview. https://trib.al/BNsKxHp
Replying to @PopSci

Be careful! I tested ChatGPT on a wide range of problems, including coding and mathematics. It will glibly give false results that are hard to detect. The failing is largely due to the bad tokenizer used in its training. Recommending an open token system for the whole Internet. 🔥



