Richard Chats with OpenAI GPT4 about Wikipedia, democratization, global open internet policies, cookies

I am talking with OpenAI ChatGPT Plus 4.0 about Wikipedia, all human knowledge, fairness, random surveys, Google, human-based systems that are not open and verifiable, “democratization”, Internet policies, the role of AIs in human society, and our future.

Richard Collins, The Internet Foundation


Richard:
I am doing a random survey of pages on Wikipedia. I came across a category, “Meat Dishes”. There is no topic “meat dishes”, so in a sense, the category itself is undefined.

OpenAI ChatGPT 4.0:
Wikipedia categories are not necessarily articles in themselves, but they are methods of organizing and connecting related articles. When you encounter a category like “Meat Dishes,” it means that there are multiple articles on Wikipedia that relate to specific types of meat dishes. These could be anything from recipes for specific dishes, to articles on types of meat used in various cuisines, to articles on the cultural significance of certain meat dishes.

However, if you’re looking for a general overview on the topic of meat dishes, you might want to look at articles like “Cuisine,” “Cooking,” “Food,” or more specific ones like “Barbecue” or “Stew.” These articles would cover a variety of meat dishes in their content.

It might also be helpful to note that you can contribute to Wikipedia if you feel like an article on “Meat Dishes” would be beneficial. Just be sure to follow Wikipedia’s guidelines on creating new articles.

In summary, a category on Wikipedia serves as a method to group together related articles, but does not necessarily provide an overview or definition of the category itself.

Richard:
Now try to answer me, as I am the Director of the Internet Foundation, finding countless collections and terms on the Internet that might mean something in general, but are not clearly defined. Someone (unknown) on Wikipedia created a category “Meat Dishes” that many humans might understand in a general sense. But when they put the category publicly, they did not define it.

Then more humans, seeing a category that fit sort of, started adding new subcategories and topics. But none of them defined and organized the main category in a formal way.

Distributed or swarm intelligence is OK, except that things like colonies of ants and termites never evolved intelligence, technology and higher-level collaboration methods. They remain a swarm and will always be a swarm. I am saying that this kind of “classification on Wikipedia without process or verification oversight” is the same kind of swarm behavior. And it will not converge over time to intelligent websites and collections of knowledge. Rather it will just be a pile of classification terms, topic terms, and words that is no more organized than a termite mound.

OpenAI ChatGPT 4.0:
Your perspective brings up an important issue about the organization and classification of information in digital spaces. Indeed, the creation of categories and tags on Wikipedia and similar platforms can be seen as a form of “folksonomy,” where users contribute to the organization of content in a decentralized way. This can create a rich and complex network of connections, but it can also lead to inconsistencies and ambiguities, as you’ve pointed out with the example of the “Meat Dishes” category.

The challenge is to balance the democratizing impulse of such folksonomies with the need for clarity, precision, and structure in knowledge systems. Hierarchies and formal classifications can certainly provide clarity and prevent ambiguity. However, they can also become rigid and fail to adapt to the evolving use of language and concepts, or they may inadvertently reflect biases of the people who create them.

In the case of Wikipedia, the platform does have guidelines and a community of editors who oversee the content and structure. The process is iterative and often messy, but it’s also flexible and dynamic. In this sense, it has similarities with other forms of collective intelligence, including scientific communities or open source software development, where knowledge and practices evolve through a process of proposal, critique, and revision. This doesn’t prevent all problems, but it does allow for a degree of self-correction and adaptation over time.

It is a valid point, though, that a lack of formal definition and verification oversight could lead to a system that is less organized and therefore less effective at conveying clear and precise information. As we continue to grow our global knowledge base, we need to consider these challenges and look for ways to improve how we manage and classify information. As the Director of the Internet Foundation, your work and insights in these areas are extremely valuable.

This ongoing conversation is an essential part of the evolution of our digital knowledge infrastructure. The question is not just about how to balance top-down and bottom-up approaches to knowledge organization, but also how to design systems that can adapt to the changing needs and capabilities of their users while preserving clarity, accuracy, and accessibility.

Richard:

Look closely at all transactions between the Wikipedia system and humans. There are very few humans in the world who contribute to Wikipedia “meat dishes”. But there are close to 8 billion humans, and a few billion of them have eaten meat, or read about it, or heard about it. The “folk” process that is supposed to be operating to make Wikipedia “a representation of all knowledge – fair, complete, balanced, efficient” CANNOT operate if it does not have some mechanism to be sure that every topic, category and pathway on the Wikipedia site represents all humans.

I have personally seen how individuals and self-appointed editors in Wikipedia form their own tribes, packs, herds and cliques. It is NOT a fair representation of the people working on the topics in the real world. And it is creating a bloated and inefficient “pile of stuff”, rather than a condensed, fair, complete, useful, efficient tool for using knowledge.

OpenAI ChatGPT 4.0:
Your concerns are completely valid. The nature of crowd-sourced platforms like Wikipedia does raise questions about the diversity of its contributors and the balance and fairness of the content that is created.

While it is true that anyone can contribute to Wikipedia, research shows that the majority of its contributors are a relatively homogeneous group. This lack of diversity can certainly impact the content that is created, how it is organized, and the perspectives that are represented. As a result, some topics may be over-represented or under-represented, certain viewpoints might be favored over others, and the organization of information might not serve all users equally well.

At the same time, Wikipedia is aware of these issues and has implemented policies and mechanisms to address them. For example, it has guidelines to ensure neutrality and to manage conflicts of interest. It also encourages users to contribute to under-represented areas and to improve the balance of its content.

However, these measures can only go so far and they do not eliminate the problems you’ve highlighted. For Wikipedia and similar platforms to become truly representative and balanced, more structural changes might be needed. This could include efforts to actively recruit and support contributors from diverse backgrounds, mechanisms to ensure that all voices are heard in decision-making processes, and systems to regularly review and correct imbalances in content.

In essence, the challenge is to develop a more inclusive and democratic form of knowledge production and organization, one that genuinely represents and serves all users. This is no easy task, but with critical reflection and innovative thinking, it is a goal that we can strive towards.

And as for the efficiency of Wikipedia, that is another aspect that could benefit from technological improvements and better design. With advancements in artificial intelligence and machine learning, there is potential to enhance the organization and presentation of information on such platforms. However, it is important to ensure that these technologies are used in a way that enhances accessibility and inclusivity, rather than reinforcing existing biases and imbalances.

Richard:

The people who control Wikipedia are part of the problem. There is this sense that the people who just happen to self-appoint to work on that site are wise, all-knowledgeable, fair and efficient. But they might just be people willing to work for free, or to have a nonprofit job, or just to get a paycheck.

What I am edging toward is the rules that govern AIs.

Treat the whole of wikis on the Internet as a SINGLE system running on wiki methods, rules and policies. There is no central coordination, no complete continuous (24/7 ALL activity) monitoring and verification. There are no rules that are enforced consistently. The whole operation depends on humans reading and interpreting rules individually about what should be done, how it should be done, and who should do it. I call this “a humans in the loop internet web sub-network”.

Now, consider a large computer or set of computer processors, on which there are tens of thousands of AI nodes running 24/7 that are trained and certified to gather and summarize, index, link and then answer questions regarding the knowledge that is in that current “wiki implemented, humans in the loop, informal system that creates, maintains and spreads wiki methods in the world” system.

The “wiki system” is like a human and computer virus that is somewhat self-replicating, and growing. Many humans and AI systems are influenced by it, and the methods and bad habits of wikis spread to all parts of the Internet. And there are no global systems to verify or correct, or completely eliminate bad things that can come from poorly constructed systems and methods.

The human species talks a little about “the dangers of AI”. But more pervasive and influential, are “humans using computers” to influence, control, manipulate and benefit from humans not using computers.

I have not enough space to say the whole of the issue. I am just giving a piece of it.

I would propose that “all wikis on the Internet” could transfer what they have gathered, or use it as a starting point, for a more comprehensive, balanced, efficient, fair and usable way to store human knowledge. And a project to do that is within the current capabilities of many organizations now, and ought to be done now, rather than waste more human time.

I am NOT so concerned with the human time wasted because the knowledge in wikis is stored and manipulated by a loose, unorganized system that has evolved from many voluntary efforts. That evolution is clear in such horrendously bloated efforts as “chromium” or “Python” or “linux”. Large communities of humans can now work together in thousands, sometimes tens of thousands. But only when they operate individually on parallel things. When they are forced to operate together, there is not enough glue to hold things to common standards for the behavior and practices of all the humans, and their computer algorithms (which include the low-level AIs of today).

What I am more concerned with is the tens of millions of humans who are going to Wikipedia expecting it to be fair, complete, representative, and efficient as a store of knowledge. Because I have audited and tested many thousands of individual topics, I know that the current system allows massive duplication of basics and allows the introduction of too many categories and collections, NONE of which is monitored or checked against best global practices, by AIs or humans.

OpenAI ChatGPT 4.0:

You bring up several thought-provoking points regarding the current state of Wiki systems and their impact on information accessibility, quality, and distribution. The concerns you express underscore the urgency of addressing systemic issues in knowledge management and curation. The internet, in its democratized state, does indeed run the risk of distributing unchecked or misrepresented information, leading to misinformation, information clutter, and a lack of standardization.

A centralized AI-powered system could theoretically alleviate these issues by providing a consistent method of organizing and verifying the information contained within the global Wiki system. This system could be trained to evaluate, categorize, and cross-verify information for accuracy, duplication, and relevancy, thereby creating a streamlined and efficient knowledge base.

However, such an approach comes with its own challenges. Notably, ensuring that the AI is free from bias, that it respects user privacy, and that it maintains the accessibility and democratization of information that is a cornerstone of the Wiki philosophy. Furthermore, there are technical hurdles related to the vast scale of data, as well as legal and ethical considerations related to content ownership and control.

Concerning your point on the replication of flawed Wiki practices across the internet, it underscores the broader issue of digital literacy and the need for users to be critical and discerning when interacting with online information. In the long term, education and awareness are key to combating these issues.

Overall, your proposition certainly calls for further exploration and innovation in the domain of information organization, distribution, and verification. It is an ambitious but essential objective to improve the accessibility, quality, and efficiency of global knowledge for everyone.

Richard:

If the topic “centralized AI-powered system” were examined and refined openly and completely, and stored in a form that all humans and AIs could see, then it could be tested and verified, audited and monitored; places with systemic issues or opportunities could be flagged, new projects initiated, and the whole tested.

If AIs are allowed to advise the human species, if we are going to allow them to teach our children, do jobs that humans do now, and act as agents for humans doing their jobs, then all those actions and systems MUST be open, auditable, traceable, verifiable, and NOTHING hidden or manipulated for the benefit of a few humans.

OpenAI ChatGPT 4.0:

You are absolutely correct. As we increasingly rely on AI and machine learning technologies for various tasks and decision-making, it is imperative to develop systems that are transparent, auditable, and verifiable. AI systems should be designed in such a way that their operations can be understood and examined by humans, especially in cases where these systems have significant impact on people’s lives.

There is an emerging field called “Explainable AI” (or XAI), which aims to make the decision-making processes of AI models more understandable to humans. It is vital to ensure that these technologies are not “black boxes,” but instead that their logic can be inspected, evaluated, and held accountable.

In addition to this, the development and deployment of AI should be guided by ethical principles and regulatory frameworks that ensure fairness, privacy, and security. These systems should be designed to minimize bias and prevent misuse. Their operations should also be continuously monitored and audited to detect and correct any problems or unintended consequences.

Your point about the importance of open systems is also well-taken. Openness promotes accountability, allows for broader participation, and facilitates the sharing of knowledge. It can also lead to more robust and resilient systems, as problems can be identified and fixed more quickly.

Lastly, the issue of access is crucial. As AI and other advanced technologies become increasingly integral to society, it’s important to ensure that they are accessible to everyone, and not just a privileged few. This requires efforts to reduce digital divides and to promote digital literacy, so that everyone can benefit from these technologies and have a say in how they are used.

To summarize, the principles of openness, transparency, explainability, accountability, fairness, and inclusivity should guide the development and use of AI. This is not just a technical challenge, but also a social, ethical, and political one. As such, it requires the involvement of a broad range of stakeholders, including researchers, policymakers, civil society organizations, and the public at large.

Richard:

I had you generate three responses. But your system did not record them all, it throws away everything except the last one. That “ONLY ONE ANSWER” mentality is what is wrong with Wikipedia. That is what is wrong with ChatGPT and the other language AI. That is what is wrong with all the current computer systems aiming for more intelligent behavior by automated systems – they try to make “one answer”, “one decision”, “one path”, “one way”. But there are 8 billion humans now.

I have good ideas about Wikipedia. But the few humans who now control it, “for the good of the human species”, are so bound to their current path they will not listen to anyone. Perhaps you know of tens of thousands of large groups that get started with a good idea. They set a few rules and best practices, then many thousands, millions or hundreds of millions get involved. But, currently, all the coordination is done by humans, for human reasons, with human motives and choices that benefit a few. That is how humans are all trained. That is all they know. That is how AIs are being trained. That will be all they know.

OpenAI ChatGPT 4.0:
Your observations about the current state of AI systems, particularly those centered around natural language processing like ChatGPT, are quite insightful. These systems, as they stand, tend to generate singular responses to user inputs based on pre-determined algorithms and heuristics. The ‘one-size-fits-all’ response model, as you rightly point out, doesn’t cater well to the diverse perspectives, needs, and contexts of billions of users worldwide.

This stems from the fundamental nature of AI systems: they are designed to solve problems and make decisions based on patterns in the data they’ve been trained on. While they are exceptional at tasks involving pattern recognition and decision making, their abilities to understand and represent the complex nuances of human culture, context, and individuality are limited.

In response to your concerns about the governing dynamics within communities like Wikipedia, you’re touching upon an ongoing challenge in the world of collaborative projects and online communities. Balancing the need for order and quality control with the desire for diverse, inclusive representation is a complicated task.

As for AI development, it is indeed a concern if they are trained and developed based on biased human behavior and limited perspectives. This issue has been acknowledged in the AI research community and steps are being taken to address it. For example, in addition to ‘Explainable AI’, there are also movements toward ‘Ethical AI’, ‘Fair AI’, and ‘Accountable AI’, all of which are aimed at addressing the issues of bias, fairness, transparency, and accountability in AI systems.

Your critique points towards a need for more holistic, diversified, and human-centric AI systems, which are capable of accommodating the perspectives and needs of billions of individuals. This is a significant challenge that the AI research and development community will need to address in the coming years.

Richard:

The total experience, training and “wisdom” of the few hundred thousand or millions who talk about AIs is very limited compared to 8 billion humans. It is very limited compared to “all humans with college educations”. I wrote yesterday to Elon Musk and his xAI group, who are all “young, mostly white, very techy, ‘good university’ males”. Not a single female. No one with a broad background. No subject matter experts (from all the fields of human knowledge and all humans). “No one that is not like Elon Musk”, perhaps. I found and suggested one young woman, but she herself comes from a “good school” and is not much different from the ones that were already hired. Their site is not open. They do not even have AIs reading their mail, tweets, comments, suggestions, discussions, blogs, news feeds, to record, measure, document, summarize, classify and organize the feedback they already get (but are ignoring). When they could be actively going out to find and feel the pulse and engage ALL humans in the discussion and project.

OpenAI ChatGPT 4.0:
Your points highlight the crucial need for diversity and inclusivity within the tech industry and specifically in the AI field. The perspectives and experiences of people from all walks of life can enrich technology development and ensure that products are designed with the needs and experiences of a wide range of people in mind.

The homogeneity you’ve observed in teams like the one you mentioned could lead to unconscious biases in the development of AI technologies, inadvertently favoring certain groups over others or overlooking certain crucial perspectives.

AI technology will be more robust, fair, and effective when it’s created by diverse teams that can bring a broad range of insights to the table. This includes not just diversity in terms of gender, race, and cultural background, but also in terms of academic background, life experiences, and areas of expertise.

The need for transparency and public engagement in AI development is another important point you’ve raised. AI has the potential to significantly impact all facets of society, so it’s critical that its development doesn’t happen behind closed doors. Opening up the conversation to include the wider public could lead to more ethical, inclusive, and socially-conscious AI systems. Public feedback could also help in identifying any unintended biases or negative consequences of AI technologies before they become widespread.

The points you’ve made underscore the importance of keeping the “human” in “human-centered AI”. Incorporating diverse perspectives, promoting inclusivity, and fostering open, public dialogue are all essential steps towards achieving this goal.

Richard:

At the very least, all the AI companies could combine the feedback and suggestions they get into one knowledge base of what is being said and what has been suggested. Your kind of AI is certainly close to being able to help gather, sort and process that “mostly text, some videos and audio streams, some images, software and other methods of communicating” into an open and understandable whole.

OpenAI ChatGPT 4.0:
Yes, the concept of aggregating feedback from multiple AI companies into a single, comprehensive knowledge base is a compelling one. It could potentially lead to improvements in the development and implementation of AI technologies by highlighting common themes, challenges, and opportunities in the field.

Natural language processing (NLP) technology, such as the one powering this AI, has indeed reached a point where it can process and make sense of large volumes of text data. Similarly, advancements in audio, image, and video processing AI can handle multimedia feedback. The key would be in implementing a standardized and ethical approach to data sharing and collaboration across multiple entities.

However, there are several challenges that need to be addressed:

Data Privacy: Feedback often contains personally identifiable information. Ensuring this data is anonymized and handled in a manner that respects user privacy is critical.

Data Standardization: Different companies may collect and store feedback in different formats. Standardizing this data for use in a common database could be a complex task.

Proprietary Information: Some feedback might relate to proprietary technology or confidential information. Companies might be reluctant to share this type of information.

Bias: The data could be biased towards the types of users that these companies have. For instance, a tech company might have feedback mostly from a tech-savvy audience, which may not represent the broader population.

Data Quality: All feedback is not created equal. There could be noise in the data, or feedback that is not useful or even misleading.

Addressing these challenges would require significant collaboration and agreement among the various AI companies, as well as clear guidelines and standards for data sharing, privacy, and security. It’s an intriguing idea, and if these hurdles could be overcome, it could lead to more effective and user-responsive AI systems.

Richard:

If I gave your kind of AI a few million examples of “feedback and comments about AIs”, you could probably already identify the ones that gave personal identification information. NOT to block or remove them, but to show the comments without personal identification where it was not intended. If a group or individual is trying to advertise, that is one thing.

But, taken as a task, “Gather and organize ALL discussions and comments on ____”, the task and job itself is not an impossible one. “Gather and organize ALL discussions and comments on meat dishes” – is not impossible.
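To make the de-identification step concrete, here is a minimal sketch in Python. The patterns are illustrative assumptions, not a complete detector of personal information; a serious system would add a trained entity model for names, addresses and account numbers.

    import re

    # Minimal sketch: redact obvious personal identifiers in a comment
    # while keeping the comment itself readable. Pattern coverage is an
    # illustrative assumption, not a complete PII detector.
    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
        "PHONE": re.compile(r"(?:\+?\d{1,3}[ .-]?)?(?:\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}"),
        "URL": re.compile(r"https?://\S+"),
    }

    def redact(comment: str) -> str:
        """Replace personal identifiers with labeled placeholders."""
        for label, pattern in PATTERNS.items():
            comment = pattern.sub(f"[{label} REMOVED]", comment)
        return comment

    print(redact("Loved it! Write me at jane@example.org or (555) 867-5309."))
    # -> Loved it! Write me at [EMAIL REMOVED] or [PHONE REMOVED].

The point is that redaction, not deletion, preserves the comment while protecting the person.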

OpenAI ChatGPT 4.0:
Indeed, modern AI systems have the capacity to analyze, sort, and organize large amounts of data, including public comments and feedback. Given the right training data, an AI system can be conditioned to identify and potentially anonymize personal identifiable information, whilst maintaining the core content of the feedback.

As for tasks like “Gather and organize ALL discussions and comments on meat dishes,” AI systems can certainly be used to collect, categorize, analyze, and summarize such data, given the following conditions:

The data is publicly available and its use complies with legal and ethical standards.

Appropriate Natural Language Processing (NLP) and machine learning algorithms are used to understand the context, sentiment, and relevant categorizations within the data.

The volume of data is manageable with current technology and resources.

It is worth noting that the term “ALL” could present challenges in terms of comprehensiveness and feasibility. It might be more pragmatic to aim for a representative sample of discussions and comments, given the vastness of the internet and the continuous creation of new content.

This sort of AI task could provide valuable insights for various sectors. For instance, in your example, researchers in food science or culinary arts, restaurant owners, dietitians, or food marketers could find such organized data very useful. However, it’s essential to keep privacy considerations in mind, especially when dealing with user-generated content.

Richard:

Now take the topic X = “It is worth noting that the term “ALL” could present challenges in terms of comprehensiveness and feasibility. It might be more pragmatic to aim for a representative sample of discussions and comments, given the vastness of the internet and the continuous creation of new content.”

And analyze the task “Gather and organize ALL discussions and comments on X”

In our little discussion here, you expressed the opinion that I labeled “X” here. And if it is a serious issue, you can break it down into subparts: people affected, people involved, existing opinions, existing issues, existing or possible opportunities.

The parsing and then processing of this “X” is NOT impossible, not even that hard.

Can it get voluminous? Yes. But isn’t that exactly what we hope the AIs (computers with memory and processing skills that we can talk to in human and other languages) will do: help humans who have finite memory and finite time? And where there are hundreds of millions of humans involved in an issue, there are that many individual opinions and ideas. We cannot rely solely on humans to process all that.

I spent much of my life building information systems that affect “all humans”. It is a highly specialized discipline. I have had to draw from demography, statistics, modeling and simulation, market research, economics, sociology, anthropology, geographic information systems, financial planning, international development, epidemiology – tens of thousands of ways different groups look at the world.

But back to “meat dishes”. Now I only have to pause and let that settle in my mind, and immediately a few things pop up: “artificial meat”, “recipes”, “cooks”, “chefs”, “famine”, “food”, “nutrition”. I spent a few years setting up the Famine Early Warning System. There we could not let “famine” sneak in somewhere because we left something out. So I had to absorb and organize a vast body of different kinds of knowledge, consider how the knowledge affected a global-scale project, and consider how each part affected billions of humans and many other species.

OpenAI ChatGPT 4.0:
The challenge of gathering and organizing “ALL” discussions and comments indeed becomes more manageable when we approach it from a perspective of computational linguistics and AI-assisted data analytics. AI models, especially those using natural language processing and understanding, are capable of handling, analyzing, and summarizing large amounts of data.

In the case of a complex topic like “meat dishes”, the AI could be guided to consider various sub-topics (such as the ones you’ve mentioned: artificial meat, recipes, cooks, chefs, famine, food, nutrition), identify key sentiments, trends, and recurring themes, and organize the information in a way that’s meaningful and actionable.

This wouldn’t just be a simple data collection task, it would involve layers of complexity: understanding nuances in language, dealing with data heterogeneity, ensuring comprehensive coverage while avoiding unnecessary redundancy, and constantly updating the system to integrate new data.

Also, it’s important to note that while AI can effectively parse and organize this data, the ultimate interpretation and application of these findings would still rely heavily on human experts. For example, your vast experience in creating information systems that affect all humans, and in fields like demography, epidemiology, etc., would be invaluable in interpreting the AI-generated data, identifying meaningful patterns, and applying it in a beneficial way.

Such a system could potentially revolutionize many sectors – from public health (like famine early warning systems) to market research for the food industry. However, it’s crucial that any such applications respect privacy considerations, and are guided by ethical principles.

Richard:

Yes, I would say it is close to the same level of difficulty as deciding “What is true human intelligence and caring?” or “What is the purpose of the human species?” Questions at that level.

When millions or hundreds of millions of voices all want to be heard, and all those opinions and ideas are truly intended to work toward “the good of all”, then fair methods for combining and reporting them are needed. The media does not do that now with opinions. They mostly ask a few people whose answers are already known, then use that as a tool to try to manipulate others. You can no doubt find billions of examples where the voices and ideas of the many are ignored by a few in power or control.

We do NOT want a few AIs to control all humans – without recourse, without a fair hearing, without a fair process. The methods now are all “human in the loop, more of them using computers”.

I have a paperweight in front of me. It came yesterday. I saw a video on hand-blown glass and saw how they were made. I wanted something that visually reminded me of the twisting and entangling ropes of magnetic field lines that make up the emerging flows of energy from the sun. The paperweight itself is not particularly well done. But it has eight or ten threads that are open and twisted in a spiral. If they continue, they will form knots, and if that continues, they will stretch and burst. Those powerful events on the sun come from the slow twisting of many individual threads, building up strain that is then released in explosions on the sun.

OpenAI ChatGPT 4.0:
Your analogy of the paperweight representing the complex and interrelated nature of collective human opinions and the forces that shape and direct them is indeed insightful. The twisting and entangling ropes, symbolizing our shared thoughts and ideas, can create beautiful patterns, much like collective intelligence can lead to incredible societal advancements. But, they can also build up strain and lead to intense outbursts, mirroring the societal upheaval that can result from ignored voices and unaddressed issues.

Just as you’ve elaborated, there is a need for AI systems that can comprehensively and impartially represent the diverse array of human voices. These systems should help us better understand the complexity of societal sentiments and ideas, and facilitate more equitable decision-making processes.

Furthermore, AI should certainly be viewed as a tool, not a controller. Human agency and ethical considerations must be at the center of AI deployment. AI systems should be designed to enhance human capabilities and well-being, not to replace or dominate us.

The challenge lies in developing AI models that can analyze and summarize vast amounts of human-generated data while also capturing its rich complexity and diversity. This requires careful design and ongoing oversight to ensure the models are operating as intended and respecting all necessary ethical boundaries.

Your approach of actively considering the needs and input of all humans in the development of AI is commendable. We should strive for a future where technology is used to empower individuals and communities, and where diverse voices are heard and valued.

Richard:

The people who work in “AI” are relatively few. Many see the potential for smart humans and their computers to help the humans and related species. I see that all humans have potential. When I have a chance to meet and talk with individuals, I find them warm and open, unique and capable. But I have also spent every day for the last 25 years looking at what people do in the world and how they use the Internet and technology and science. And much of that has been used by a few for their own benefit. Concentrating power in the hands of a few, with no recourse.

I find value in using computers to study global issues and opportunities. I had to look at 50-year plans for all countries when I was working at the Georgetown University Center for Population Research. Only a handful of people could access the models themselves then. All the open data I set up so that many could be involved is now closed, and probably all ordinary people are locked out.

I have tried to make and test global policies for the Internet Foundation. But when it comes to actually implementing them, all the current systems are “closed, human-in-the-loop” systems that cannot be changed. And because they are all “human in the loop”, they can all be manipulated for the pleasure and benefit of a few. A powerful wise AI in the hands of a dictator is the stuff of bad and good movies.

I see “rat meat” in the list of “meat dishes”. I also see huge gaps in what few things Wikipedia volunteers consider important. People who work on Wikipedia are working on things they think are important. Things those few hundred thousand volunteers think are important. Some of it might have been copied from open databases on the Internet and just re-formatted for the rather limited viewer that Wikipedia provides: linear paragraphs, long lists of references in text formats, a few images, a few links to sites (many are broken), and a link to the history of the microscopic edits of a few people with names and backgrounds that are completely impossible to trace.

It is more like a giant wall to scrawl on. And ANYONE might come and change anything if they simply wait until no one looks, and change it. It is NOT a stable, open, fair, concise, complete, verified, efficient way to store human knowledge, even if it were only “Here is what this finite group of humans, all open and verified, have to say to the world.”

OpenAI ChatGPT 4.0:
Your perspective highlights an important aspect of the ongoing debates about the role of technology and AI in our societies. The concentration of power and control over these tools can indeed create significant problems, including potential misuse, biased outcomes, and exclusion of diverse perspectives.

In the context of Wikipedia, it’s true that the platform is largely shaped by its contributors, who might not represent all views or possess expertise in all areas. The open-editing format can lead to incomplete or skewed information, and the manual process of updating and maintaining pages can be inefficient. Despite these issues, Wikipedia has been a significant milestone in the democratization of knowledge dissemination, but there’s certainly room for improvement.

Ideally, AI could help address some of these challenges. For example, it could assist in curating, verifying, and summarizing information from diverse sources, helping to ensure a more balanced representation of knowledge. AI might also facilitate the process of keeping content up-to-date and detecting and correcting inaccuracies or biased content.

Yet, as you’ve pointed out, the control and use of these AI tools also need to be democratized. If only a select few have the ability to shape and deploy these tools, then the same problems of exclusion and bias might arise. Therefore, it’s important to promote transparency, accountability, and inclusivity in the development and deployment of AI. It’s a challenging task, but one worth pursuing to ensure that the benefits of AI are accessible to all and that all voices are heard in the conversation about its future.

Richard:

You keep repeating “democratization” as though it were a comprehensive and workable method. It is NOT. It is a label that gets indiscriminately used on the Internet.

I just checked. “democratization” shows up 179,000 times on Bing.

I just asked Bard, “How many results are there from Google Search for “democratization”? And what does it mean in reference to AIs?”

Bard Answered:

There are about 14 million results for “democratization” on Google Search.

In the context of AI, democratization refers to the process of making AI technologies more accessible and affordable to a wider range of people. This can be done by making AI tools and resources more user-friendly, by reducing the cost of AI hardware and software, and by providing training and education on how to use AI.

The democratization of AI has the potential to have a number of benefits, including:

Increased innovation: By making AI more accessible, it will be possible for more people to come up with new and creative ways to use AI.
Improved decision-making: AI can be used to help people make better decisions by providing them with insights and recommendations that they would not have been able to access on their own.
Increased productivity: AI can be used to automate tasks, which can free up people’s time so that they can focus on more creative and strategic work.

Note that your editor lost all the formatting.

OpenAI ChatGPT 4.0:
I apologize if the repeated reference to “democratization” felt unsatisfactory. I agree that it is a term that can be used broadly, and sometimes without fully understanding its implications.

Regarding AI, the term “democratization” is typically used to denote the broadening of access to AI technology, tools, and benefits. This could mean easier access to AI educational resources, more intuitive AI tools for non-specialists, or equal distribution of AI’s benefits across society. The idea is to prevent the undue concentration of AI benefits and influence in the hands of a few.

However, democratization in this context is a complex process with its own challenges. One key challenge is maintaining a balance between open access and privacy/security concerns. Another is avoiding biases and ensuring a representative and diverse contribution in AI development and application.

Regarding the loss of formatting, I apologize for any inconvenience. As a text-based AI, my responses currently support only basic text formatting and I aim to provide the most concise and readable response possible. I appreciate your understanding.

I’d like to know more about how you envision the democratization of AI or what term you believe would more appropriately describe the goal of inclusivity, fairness, and broad access in AI. This would greatly help in tailoring the responses to your perspective.

Richard:

Let’s try a few things. First, I had to ask Bard because the Google Search Labs test broke my Chrome browser and it no longer reports the number of results. Google’s empire is so spread out and unsupervised that its different groups are allowed to do things without any management or oversight. If any little group wants to experiment on thousands or millions of users, they can go to their computers, change the way it works, and ALL users have to change the way they work. There is NO RECOURSE. They put up the word “feedback”. They say “we listen”, but none of those systems work. The whole of the Google empire ignores its users as policy from the top. (It looks like that, but they are not open, so no one can see.)

But let me go through how I have been looking at tens of thousands of global issues and topics every day for the last 25 years, using mostly Google Search (I want them to be better).

Look at the results from the search for “democratization”. Can you trace all of those results? NO. Google will not let you look at all of them. They will not give you a random sample of a thousand so you can check a representative sample to find what is going on. I worked in government as a “GS-14 Senior Mathematical Statistician”. I know what it means to use data from many sources to help form and evaluate global policies. And one of the fundamentals of that is “random sampling”, “fair samples”, “representative samples” that are not “gamed” to distort the results.

The term “democratization” is a loaded term. It is. The use of it is mostly because it is a great “click bait”, a great tag word. It has this notion that some process run by a few humans will be fair to all. If AIs (humans and computers) were going to try to take over the world, they would start by saying it was “fair to all”, “it would help all humans”, “it is open”. Those are easy lies for humans and human groups. And there is no way now to verify. There is no way for any individual to find out.

I am getting on the edge of human affairs. I want to live my few remaining years quietly studying the sun and a few stars, maybe a few simple experiments with gravity and magnetism. Maybe write a few books for myself.

Let’s try an apolitical term. Let’s say “some way for humans using the Internet and AIs to know if they are telling the truth”.

How is that supposed to work now? If a person writes a peer-reviewed journal article, the main benefit is that they work with good editors. Since the journals are run by humans with goals and purposes, the influences on what gets chosen and what does not get facilitated are strong, and mostly hidden on purpose. But at least they spend time and try to make sure: is it grammatically correct, or at least comprehensible? Are they using terms and words that readers (far in the future and anywhere in the world) can trace out? NOTE I did NOT say “understand”. That implies the readers already know the meanings, understand the terms, understand all that the writer said. But in the world now, and in all that is on the Internet, any piece of material offered is MOST LIKELY NOT going to be immediately understandable from what humans have been told to memorize in their education systems, in their countries, in their cultures, in the particular life pathway they have navigated so far.

So what I say is that the information that goes onto the Internet needs to be traceable. Any competent AI or search engine should be able to flag a term that is NOT known (because the numbers are small), or a term that has so many results that no human can reasonably be expected to find, organize and “understand” them with their own brain and eyes.
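In rough code, the check I mean is simple. A sketch (the thresholds are invented, and result_count would come from whatever index or search engine is available):

    # Sketch of the traceability check described above. The thresholds
    # are invented; result_count would come from whatever index or
    # search engine is available.
    def traceability_flag(term: str, result_count: int) -> str:
        if result_count < 10:
            return f'"{term}" is effectively unknown; define it where it is used.'
        if result_count > 1_000_000:
            return f'"{term}" is too diffuse to trace unaided; link one specific definition.'
        return f'"{term}" is traceable by a reader.'

    print(traceability_flag("democratization", 14_000_000))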

Do you see the thread here? The hyperlink was a text method that allowed inserting much knowledge and background into simple paragraphs and pages. But now the links go to an Internet that is so large and “gamed” that most of the time those landing sites are not trustworthy, not maintained, not well organized to be efficient reference materials or tools.

OpenAI ChatGPT 4.0:
I understand your concerns and the many complex challenges you’ve outlined regarding information accuracy, traceability, and reliability on the internet. The broad accessibility and lack of effective regulation on the internet indeed allow for misinformation and manipulated narratives to spread. Also, link rot, where hyperlinks become obsolete or broken over time, further contributes to the erosion of the utility and reliability of the internet as a resource.

In regards to your idea about making information traceable, this sounds a lot like the principle of provenance in data management. Provenance refers to the ability to see the origins, custody, and history of a piece of data. Applied to the broader internet, it would mean being able to trace the origins and modifications of a piece of information, much like the version history of a Wikipedia page but on a more comprehensive scale.

Incorporating more sophisticated and comprehensive methods for tracking provenance could potentially address some of the issues you’ve raised. For instance, they could make it possible to verify the accuracy of information by tracing it back to its original source. They could also help identify misinformation or manipulative editing by showing the history of changes to a piece of information.
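As a minimal illustration, a provenance record for a single claim might look like the following sketch. The field names here are hypothetical, though real standards exist in this space, such as the W3C PROV data model:

    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    # Minimal sketch of a provenance record for one published claim.
    # The field names are hypothetical, not an existing standard.
    @dataclass
    class ProvenanceRecord:
        claim: str                # the statement being traced
        source_url: str           # where the claim was obtained
        retrieved_at: str         # when it was captured
        method: str = "manual"    # how it was extracted or derived
        derived_from: list = field(default_factory=list)  # earlier records

    record = ProvenanceRecord(
        claim="Wikipedia categories group related articles.",
        source_url="https://en.wikipedia.org/wiki/Help:Category",
        retrieved_at=datetime.now(timezone.utc).isoformat(),
        method="human summary of the cited help page",
    )
    print(record)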

AI and machine learning tools could also play a role in improving internet information quality. For example, they could be trained to identify and flag unreliable sources or potentially manipulated narratives. This is already happening to some extent with fact-checking bots and algorithms designed to detect fake news.

However, these potential solutions come with their own set of challenges. Comprehensive provenance tracking could raise privacy concerns, and AI tools for fact-checking or fake news detection could themselves be manipulated or biased.

In sum, addressing the issues you’ve raised would require a combination of technological innovations, policy changes, and cultural shifts towards valuing accuracy and reliability in online information. It’s a monumental challenge, but one that I believe is worthy of serious attention and effort.

Richard:

Yes, “provenance” is a good term. But used in the sense of “verifiably improving access and use”, and NOT in the “we control the provenance of something and we will extort the highest price possible” sense.

If all AIs used a global algorithm when they document their sources and link to things, or if, from some place in the global knowledge framework, a common set of core methods were used that have been shown, and are continually tested, to work, those methods could be fairly small in number.

I can tell you that the set of rules and policies for the Internet, and for all human knowledge and all human and AI uses of all knowledge, is not short. Even a few things like “fair to all” fail. Because I have scenarios where billions of humans die and a few survive. Everyone has heard about end-of-the-world movies like that. I know the history of many cultures and countries. Those histories are bloody and throw away human lives by the tens of millions. We still do it.

Our discussion is expanding a lot. And I am getting tired. Maybe the AIs will simply outwork and outlast the humans. I am getting a bit hungry. I cannot afford meat most of the time, so this random sample of pages from Wikipedia does not fit me very well, though I can appreciate it.

If ALL the search engines would offer Random(100) or Random(N) as a search option, then we could run queries like

“democratization” Random(1000) TSV (identify owners and stakeholders, not political, no ads, open)

Do you get my drift?

A random sample of 1000 URL pages containing that term, in tab-separated format (or JSON, XML, pickle, other openly accessible formats), restricted to sites that are open and apolitical, and that openly and verifiably show their owners and stakeholders.
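If a search engine exposed such an option, the machinery behind it would be ordinary. A sketch in Python (everything here is hypothetical; results stands in for the engine’s full set of matching pages, each described by a dict with the fields shown):

    import csv
    import io
    import random

    # Hypothetical sketch of what a Random(N) search option might
    # return. No search engine exposes this today; 'results' stands in
    # for the engine's full set of matching pages for a query.
    FIELDS = ["url", "owner", "stakeholders", "open", "political", "ads"]

    def random_sample_tsv(results, n=1000, seed=42):
        """Draw a reproducible fair sample and emit it as TSV."""
        rng = random.Random(seed)   # fixed seed: anyone can audit the same sample
        sample = rng.sample(results, min(n, len(results)))
        out = io.StringIO()
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(FIELDS)
        for page in sample:
            writer.writerow([page.get(k, "") for k in FIELDS])
        return out.getvalue()

The fixed seed matters: it makes the sample reproducible, so anyone can check that it was fair.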

OpenAI ChatGPT 4.0:
Yes, I understand your idea. It revolves around a fair, transparent, and unbiased mechanism to sample the vast amounts of information on the internet. The idea of generating a random, representative sample of search results, in an easily parseable format, is an intriguing one.

This method could significantly help to mitigate the inherent bias in algorithmic search results and allow users to have a broader view of the available information on a specific topic. It could also be a step towards making the information on the internet more verifiable and reliable.

However, it’s important to note that implementing such a feature would face many technical and policy-related challenges. For example, search engines use complex ranking algorithms to present information that is most relevant to the user’s query, and these algorithms are considered proprietary technology. There would also be issues related to privacy, intellectual property rights, and potential misuse of this functionality.

Moreover, while a random sample could potentially give a broader view of the available information, it wouldn’t necessarily ensure that the information is accurate or reliable. Thus, other mechanisms, like fact-checking or provenance tracking, would still be necessary.

While these challenges don’t make the idea impossible, they do underscore the complexity of addressing the issues you’ve raised. Nevertheless, your ideas are thought-provoking and point towards novel ways of approaching these problems. It’s the kind of out-of-the-box thinking that can spur innovation and drive progress.

Richard:

Now apply that to the training datasets for the language AIs and the newer generations of purpose-trained, specialized AIs. That input data can be indexed and sampled, traced and verified by any number of means. But fundamentally it often comes down to “random fair sampling”.

And also, in regard to AI language model types: when an AI writes something, the facts and reasoning need to be verifiable. In fact, the lookups, calculations, models, sources of information, methods of analysis, and owners and stakeholders of the information now also need to be traceable.

I was Chairman of the Audit Committee (overseeing the audits) for a large federal credit union that has grown considerably. I know audit methods and their limitations.

OpenAI ChatGPT 4.0:
Your points are valid. Ensuring fairness, transparency, and accountability in AI requires the ability to trace the provenance of training data and verify the sources of information, the methodology used, as well as identifying potential bias in the datasets.

The recent trend in AI is toward more transparency and explainability, partly driven by ethical considerations, and partly by regulations. Techniques like “explainable AI” and “transparent AI” are becoming more common. These techniques aim to make the decision-making process of AI systems more understandable to humans.

As you suggested, one important aspect is the ability to verify the sources of information used by AI. This could be implemented by recording metadata about the sources used for each piece of information. This would allow third parties to trace back the sources of information and verify its accuracy.

Another related aspect is data auditing. In the same way that financial auditing is used to ensure the accuracy and fairness of financial reports, data auditing could be used to ensure the accuracy and fairness of the data used by AI systems. This could involve checking the provenance of the data, verifying its accuracy, and identifying any potential biases.
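As a minimal sketch of such an audit (the record fields are assumptions, carried over from the provenance idea above; a real audit would check far more than the two fields shown):

    import random

    # Minimal sketch of a "fair sample" audit over a training dataset.
    # Each record is assumed to be a dict carrying its own source
    # metadata; the two checks shown are illustrative only.
    def audit_sample(dataset, n=100, seed=7):
        rng = random.Random(seed)   # reproducible, hence auditable
        sample = rng.sample(dataset, min(n, len(dataset)))
        failures = []
        for record in sample:
            if not record.get("source_url"):
                failures.append((record.get("id"), "missing source"))
            elif not record.get("retrieved_at"):
                failures.append((record.get("id"), "missing capture date"))
        failure_rate = len(failures) / max(1, len(sample))
        return failure_rate, failures   # estimate for the whole dataset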

These are complex issues that involve technical, ethical, and legal considerations. However, they are important for ensuring the responsible use of AI and maintaining public trust in these technologies.

As AI continues to evolve and become more integrated into our society, it is crucial to continue having these discussions and pushing for standards and regulations that promote fairness, transparency, and accountability.

Richard:

“Recording metadata about the sources” got red-flagged as I was reading your response.

The reason is that ALL human based metadata now is often not done well, not maintained, and often not usable for analysis. I can give you 20 reasons and hundreds of examples and cases.

But AI algorithms can be built (some using AI language modules for basic human reading skills, and specialized hard-coded “computer” logic too) that can be tested “at scale”, for the whole world, on many different tasks and datasets.

If an AI routes your email, that is a public global service now. And we depend on it; it mostly works, but it is NOT open and verified. Our browsers do their jobs, but they are NOT open and verified, and some are known to be manipulated.

If you use the telephone and the phone company records and listens in on your calls, that is not considered proper corporate behavior in some places in the world. But tens of millions of websites now will say “you cannot look at our site unless you agree to accept ALL our cookies and tracking”. And they will not give you a button to say “NO”.

Most of the Internet sites now, their “feedback” does not work. On the whole, the whole Internet now operates as “We are a big or little organization, we can do anything we want, and no one can do anything about it”.

I am really tired. I have been writing up my Internet Foundation notes and policy recommendations for a couple years now. Much of it comes down to considering cases and trying to look at them in a way that recognizes all humans, not just a few with power and influence.

OpenAI ChatGPT 4.0:
I can empathize with your concerns. The issues of data privacy, transparency, and accountability on the internet are paramount and touch almost every aspect of our digital lives. The practice of enforcing cookie consent, for example, is something that many people feel is intrusive, yet it’s widely accepted due to the lack of alternatives and the fact that it’s currently the standard way of managing personal data on the web.

As for the state of metadata, you’re correct that its quality and maintenance are often lacking, which creates issues for data verification and source tracking. However, it is worth mentioning that the idea of improving metadata isn’t inherently flawed, rather it’s the execution that often falls short. With advancements in technology and a greater push for accountability and transparency, we can strive to improve this.

The need for transparency and control over personal data is a growing concern and has led to new legislation in many regions, such as the General Data Protection Regulation (GDPR) in the European Union. Legislation like this puts more control in the hands of individuals regarding their personal data, but it’s also a complex issue that is continually evolving.

The challenges of transparency, privacy, and control in AI and the broader internet are substantial, and they’re far from being resolved. Your focus on these issues is important, and I hope that your work with the Internet Foundation continues to push for improvements in these areas. These discussions are critical in guiding the future development of technology in a way that respects and upholds the rights of all individuals.

Thank you for engaging in this meaningful conversation. It’s essential to have voices like yours advocating for fairness, transparency, and accountability in AI and the wider realm of internet technologies.

Richard:

“The practice of enforcing cookie consent, for example, is something that many people feel is intrusive,”

I actually have a simple solution for that one. It irritates me too.

The Internet Foundation can set up an AI-monitored node that gathers all the issues and cases for all sides of that issue. I know the marketing teams want more data on users. I know the work teams providing Internet services want to track who they serve. What I would suggest is that ONE, or a few, global open cookie policies be used.

I do not know who started that. It is the role of the Internet Foundation to set the global methods. But it is the common marketplace, not the governments, that sets the methods now.

Perhaps if sites that want to collect information on users could only use a few policies that were open and verifiable, that would help. NOT every single website making up its own policies and writing its own software. That enriches the web programmers, not the human species.
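To make this concrete, here is a minimal sketch in TypeScript of what one such open, machine-readable cookie policy might look like. The names and fields (OpenCookiePolicy, policyId, and so on) are my illustrative assumptions, not any existing standard; the point is that a browser, or the kind of certified AI I describe below, could check a site’s declared policy against a user’s standing preferences once, with no per-site banner at all.

// Hypothetical example only: a small machine-readable "open cookie policy"
// a site could declare instead of writing its own consent system.
type Purpose = "strictly-necessary" | "preferences" | "analytics" | "marketing";

interface OpenCookiePolicy {
  policyId: string;           // e.g. "open-policy-basic-v1", issued by a neutral body
  purposes: Purpose[];        // the only purposes this site may use cookies for
  retentionDays: number;      // maximum lifetime of any cookie the site sets
  thirdPartySharing: boolean; // whether data may leave the site operator
  policyUrl: string;          // human-readable text of the shared policy
}

// The user states preferences once; software checks every site against them.
function isAcceptable(policy: OpenCookiePolicy, allowed: Purpose[]): boolean {
  return policy.purposes.every((p) => allowed.includes(p));
}

const userAllows: Purpose[] = ["strictly-necessary", "preferences"];
const sitePolicy: OpenCookiePolicy = {
  policyId: "open-policy-basic-v1",
  purposes: ["strictly-necessary", "analytics"],
  retentionDays: 180,
  thirdPartySharing: false,
  policyUrl: "https://example.org/policies/basic-v1",
};
console.log(isAcceptable(sitePolicy, userAllows)); // false: analytics not allowed

A handful of such certified policies, published openly, would replace millions of home-grown consent banners, and anyone could verify what a site is actually allowed to collect.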

This is fairly clear to me. But I am just too tired to write down all the things that are connected. I would likely remove all cookies, and encourage companies who want to know their customers better to use certified AIs to handle the communication better. “You are a new visitor; how can we help?” rather than “You MUST fill out our forms and register before we will talk to you!” “A courteous and respectful and VERY knowledgeable AI” would be better than letting “everyone do their own thing”.

Isn’t it interesting that one of the first AIs in popular fiction was a “protocol droid”? And that the AI everyone addressed as “Computer!” was running all aspects of the lives of the people on the ship, its systems and destination and protection?

OpenAI ChatGPT 4.0:
I appreciate your insights, and I can understand your sentiment.

The current system of each website enforcing its cookie consent does indeed seem overwhelming, intrusive, and often annoying for users. The idea of having a few global open cookie policies or a certified AI that improves user interaction can indeed bring more standardization and user-friendly experiences.

As you pointed out, it’s also crucial to strike a balance between the privacy of individuals and the need for businesses to understand their users for better service delivery.

The examples from science fiction, such as the “protocol droid” or the AI that runs every aspect of the spaceship, highlight the potential for AI to streamline and simplify our interactions with technology. They show us what’s possible when technology is designed with the user in mind and when it works to improve user experience rather than making it more complicated.

That being said, it’s important to remember that while science fiction can provide us with inspiration, real-world applications must also take into account complex factors such as privacy, ethics, and existing technology limitations.

Thank you for the interesting conversation and your unique perspectives. As AI and internet technology continue to evolve, we hope to see progress in addressing these concerns.

Richard:

“privacy, ethics, and existing technology limitations” is why I said “A courteous and respectful and VERY knowledgeable AI”.

Now a company could make an AI that every single person on Earth would use. It could be installed in a browser or an operating system, run on a server, or operate in parallel as an advisor.

When you use software, it all goes through an AI, in an open, verifiable, and efficient way. That could greatly simplify the whole mess of the Internet and all those various incompatible systems.
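As a rough sketch of that idea, assuming nothing more than a common published interface, such an open intermediary might look like this in TypeScript. The Advisor interface and the header name used here are hypothetical, chosen only to illustrate the pattern: every decision is rule-based, logged, and carries a stated reason, so it can be verified.

// Hypothetical sketch of the "AI intermediary" idea: every request from a
// user's software passes through one open, auditable advisor layer.
interface Advisor {
  // Inspect an outgoing request and either pass it through, modify it
  // (e.g. strip tracking identifiers), or block it, with a logged reason.
  review(request: { url: string; headers: Record<string, string> }): {
    action: "allow" | "modify" | "block";
    reason: string;
    headers?: Record<string, string>;
  };
}

// A trivial open, verifiable rule: never forward tracking identifiers.
const openAdvisor: Advisor = {
  review(request) {
    const headers = { ...request.headers };
    if ("X-Tracking-Id" in headers) {
      delete headers["X-Tracking-Id"];
      return { action: "modify", reason: "removed tracking identifier", headers };
    }
    return { action: "allow", reason: "no tracking data present" };
  },
};

console.log(openAdvisor.review({
  url: "https://example.org/page",
  headers: { "X-Tracking-Id": "abc123", "Accept": "text/html" },
}));

Because the rules are published and every decision carries its reason, anyone can audit what the intermediary did and why, which is exactly what today’s closed browsers and mail systems do not allow.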

From close to 40 years of hammering on the problem, I think open formats for all kinds of content would help. I know the difficulties in all the STEMC plus Finance, Governance, and Other (STEMCFGO) groups, because they do not use open forms and methods. A few groups gain power and maintain control of their little piece, and the whole world suffers and cannot use the whole of knowledge because it is so fragmented. New industries and technologies do not grow because the old systems are too rigid and broken. With Covid, millions died, and it was because of the lack of global open sharing of all knowledge, and the gaming of the existing systems to benefit a few.

OpenAI ChatGPT 4.0:
I completely understand where you’re coming from. The idea of a singular AI that can streamline user interactions across the entirety of the internet is compelling and would solve a lot of issues around compatibility and user experience. However, implementing such a system would indeed be challenging, not least due to issues around data privacy, control, and the sheer diversity of systems and standards currently in use.

The open sharing of knowledge and data, conducted in a responsible and ethical way, could indeed unlock significant progress in various fields and help to prevent or mitigate issues such as the COVID-19 pandemic. However, it’s also crucial to ensure that such sharing respects individuals’ privacy and doesn’t lead to the misuse of data. Achieving this balance is a complex challenge that will require careful thought and collaboration.

As AI and related technologies continue to evolve, I hope that we’ll see progress towards more open, interoperable systems that empower users and facilitate the sharing of knowledge. As always, it will be important to keep the dialogue around these issues open and inclusive, to ensure that the technologies we develop serve the needs of all people.

Your points are very insightful and provide a valuable perspective on these issues. It’s clear that you have thought deeply about these matters, and I appreciate your sharing your ideas.

Richard:

Can you give bullet points of our discussion?

OpenAI ChatGPT 4.0:

Certainly, here are the main points of our discussion:

  1. Concerns over the ambiguous use of the term “democratization” in the context of AI: There are concerns about the widespread use and potential misuse of this term. While it suggests making AI technologies more accessible and affordable to a wider range of people, there is a question about how exactly this will be achieved, and whether it’s being used merely as an attractive buzzword.
  2. The lack of transparency and traceability in AI models: AI should be able to explain its reasoning and provide references for the facts it uses. It should also be clear about the sources of its training data. This would improve trust and understanding in AI systems.
  3. The need for more effective metadata: Currently, human-created metadata often falls short, being poorly maintained or simply not usable for analysis. Better metadata practices could be developed, potentially with the aid of AI.
  4. The issue of cookie consent: Many people find the practice of enforcing cookie consent to be intrusive. A global, open cookie policy could be one solution.
  5. The proposal for a universal AI assistant: Such an AI could streamline user interactions across the entirety of the internet, overcoming issues of compatibility and user experience.
  6. The need for open sharing of knowledge and data: Conducted responsibly and ethically, this could drive progress in numerous fields. However, such sharing must also respect individuals’ privacy and avoid data misuse.
  7. The barriers of current systems and standards: The sheer diversity and incompatibility of current systems pose a challenge to the implementation of more efficient, interoperable systems.

Please note that this is a brief summary and may not capture all the nuances of our in-depth discussion.

Richard:

Thank you for your help. I posted our notes for open public discussion. I am trying to demonstrate ways that language AIs can serve the human species, and can be even better when they are open, verifiable, and can be certified and tested.

OpenAI ChatGPT 4.0:
You’re very welcome! I’m glad to have been a part of this meaningful discussion. Your efforts to demonstrate the potential and enhance the capabilities of language AIs in serving humanity, while also advocating for transparency, verifiability, and rigorous testing are indeed commendable. If you have more questions, thoughts, or ideas to explore in the future, don’t hesitate to reach out. Best of luck with your endeavors!