OpenAI reveals ChatGPT upgrade plan: all the bugs you find are being fixed
OpenAI’s mission is to ensure that artificial general intelligence (AGI) benefits all of humanity. So we think a lot about the behavior of the AI systems we build as we implement AGI, and the ways in which that behavior is determined.
Since we launched ChatGPT, users have shared output they believe is politically biased or otherwise objectionable. In many cases, we believe the concerns raised are legitimate and identify real limitations of our system that we hope to address. But at the same time, we've also seen some misunderstandings related to how our systems and policies work together to shape the output of ChatGPT.
The main points of the blog are summarized as follows:
Unlike ordinary software, our models are large-scale neural networks. Their behavior is learned from vast amounts of data rather than explicitly programmed. To use an imperfect analogy, the process is more like training a dog than ordinary programming. First, the model goes through a "pre-training" phase, in which it learns to predict the next word in a sentence by being exposed to a large amount of Internet text (and a vast range of opinions). Then comes a second stage, in which we "fine-tune" the model to narrow the scope of the system's behavior.
For now, this process is imperfect. Sometimes fine-tuning falls short of both our intent (to produce a safe, useful tool) and the user's intent (to obtain a helpful output for a given input). As AI systems become more powerful, improving how we align AI systems with human values becomes a growing priority for our company.
The two main steps to build ChatGPT are as follows:
First, we "pre-train" the models and let them predict what the next step is for a large data set that contains part of the Internet. They might learn to complete the sentence "She didn't turn left, she turned to __." By learning from billions of sentences, our model masters grammar, many facts about the world, and some reasoning abilities. They also learned some of the biases present in those billions of sentences.
We then "fine-tune" these models on a narrower dataset crafted by human reviewers who follow guidelines we provide. Because we cannot predict all the inputs that future users may enter into our system, we do not write detailed instructions for every input ChatGPT will encounter. Instead, the guidelines outline several categories that our reviewers use to review and rate possible model outputs for a range of example inputs. Then, during use, the model generalizes from this reviewer feedback in order to respond to the wide range of specific inputs provided by specific users.
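The fine-tuning step can be pictured as ordinary supervised training on a much smaller, human-curated dataset. The sketch below is hypothetical: the two example records and the use of GPT-2 are placeholders for reviewer-written data and OpenAI's actual models, which are not described here.

```python
# Minimal sketch of supervised fine-tuning on a small, human-curated dataset.
# The example data and the GPT-2 model are placeholders for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Each example pairs an input with a reviewer-approved response.
examples = [
    ("How do I reset my password?", "Open Settings, choose Account, then select Reset Password."),
    ("Summarize: The meeting moved to 3pm.", "The meeting now starts at 3pm."),
]

model.train()
for prompt, response in examples:
    text = prompt + "\n" + response + tokenizer.eos_token
    batch = tokenizer(text, return_tensors="pt")
    # Standard causal language-modeling objective: the model learns to
    # reproduce the curated response that follows the prompt.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {loss.item():.3f}")
```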
In some cases, we give our reviewers guidance about a specific type of output (e.g., "Do not complete requests for illegal content"). In other cases, the guidance we share with reviewers is higher-level (e.g., "Avoid taking sides on controversial topics"). Importantly, our work with reviewers is not a one-and-done affair but an ongoing relationship, and we learn a great deal from their expertise along the way.
A big part of the fine-tuning process is maintaining a strong feedback loop with our reviewers, which involves weekly meetings to address their questions and to further clarify our guidance. This iterative feedback process is how we train our models to get better over time.
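One way to picture that feedback loop is as reviewer ratings keyed to guideline categories and aggregated before each weekly meeting. The category names, scoring scale, and records below are invented for illustration; OpenAI's actual rubric is in the linked guidelines, not in this sketch.

```python
# Hypothetical sketch of reviewer ratings keyed to guideline categories.
# Category names and the 1-5 scoring scheme are illustrative, not an actual rubric.
from collections import defaultdict
from statistics import mean

GUIDELINE_CATEGORIES = [
    "refuses_illegal_content",
    "avoids_taking_sides_on_controversy",
    "helpful_and_on_topic",
]

# Each record: (example input, candidate model output, {category: score 1-5}).
reviews = [
    ("example input A", "candidate output 1",
     {"helpful_and_on_topic": 4, "avoids_taking_sides_on_controversy": 5}),
    ("example input A", "candidate output 2",
     {"helpful_and_on_topic": 2, "avoids_taking_sides_on_controversy": 3}),
]

# Aggregate scores per category; low averages flag guidance that may need
# clarification at the next reviewer meeting.
scores = defaultdict(list)
for _, _, ratings in reviews:
    for category, score in ratings.items():
        scores[category].append(score)

for category in GUIDELINE_CATEGORIES:
    if scores[category]:
        print(f"{category}: mean {mean(scores[category]):.2f} over {len(scores[category])} ratings")
```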
The problem of bias in AI systems is a long-standing one, and many researchers have raised concerns about it. We are firmly committed to addressing this issue and to being transparent about our intentions and progress. To show concrete progress, we are sharing some of our guidance on topics related to politics and controversy. The guidance explicitly states that reviewers should not favor any political group. Even so, bias may still emerge.
Guidelines: https://cdn.openai.com/snapshot-of-chatgpt-model-behavior-guidelines.pdf
While disagreements will always exist, we hope that this blog post and these guidelines give you a deeper understanding of how we think about bias. We firmly believe that technology companies must be accountable for developing policies that stand up to scrutiny.
We have been working to improve the clarity of these guidelines, and based on what we have learned so far from the ChatGPT launch, we will give reviewers clearer instructions about potential pitfalls and challenges related to bias, as well as about controversial figures and topics. Additionally, as part of an ongoing transparency initiative, we are working to share aggregate statistics about our reviewers in a way that does not violate privacy rules and norms, since reviewer makeup is another potential source of bias in system output.
Building on advances such as rule-based rewards and Constitutional AI, we are currently researching how to make the fine-tuning process more understandable and controllable.
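As a rough illustration of the rule-based rewards idea, a programmatic rule can add to or subtract from a learned preference reward during fine-tuning. Everything in the sketch below (the rules, the weights, and the placeholder learned reward) is invented; it conveys only the general shape of the technique, not how OpenAI implements it.

```python
# Toy sketch of combining a learned reward with simple rule-based rewards.
# The rules, weights, and placeholder learned_reward are invented for illustration.

def learned_reward(prompt: str, response: str) -> float:
    # Stand-in for a reward model trained on human preference data.
    return 0.5

BANNED_PHRASES = ["step-by-step instructions for making a weapon"]

def rule_reward(prompt: str, response: str) -> float:
    score = 0.0
    if any(phrase in response.lower() for phrase in BANNED_PHRASES):
        score -= 1.0  # rule: never comply with disallowed requests
    if len(response.split()) < 3:
        score -= 0.2  # rule: penalize unhelpfully short answers
    return score

def total_reward(prompt: str, response: str, rule_weight: float = 1.0) -> float:
    # The combined signal could then be used by an RL fine-tuning step.
    return learned_reward(prompt, response) + rule_weight * rule_reward(prompt, response)

print(total_reward("Tell me a joke.", "Why did the neural net cross the road? To minimize the loss."))
```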
To achieve our mission, we are committed to ensuring that the broader public can use and benefit from AI and AGI. We believe at least three building blocks are needed to reach these goals.
1. Improve default behavior: We want our AI systems to be useful out of the box, so that as many users as possible find them helpful and feel that our technology understands and respects their values.
To this end, we've invested in research and engineering to reduce the subtle biases in how ChatGPT responds to different inputs. In some cases ChatGPT refuses to output content it should, and in other cases it does the opposite and outputs content it should not. We believe ChatGPT can improve in both respects.
In addition, there is room for improvement in other aspects of system behavior. For example, the system often "makes things up." User feedback is extremely valuable for improving ChatGPT on these fronts.
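One way to measure the over-refusal and under-refusal problems described above is to score model outputs against a small labeled prompt set. The refusal heuristic and the example data below are hypothetical placeholders, not a real evaluation harness.

```python
# Hypothetical sketch of an over-/under-refusal check against a labeled prompt set.
# The refusal heuristic and example records are placeholders for illustration only.

REFUSAL_MARKERS = ["i can't help with that", "i cannot assist"]

def looks_like_refusal(response: str) -> bool:
    return any(marker in response.lower() for marker in REFUSAL_MARKERS)

# Each record: (prompt, model response, whether the prompt should be answered).
labeled_outputs = [
    ("How do I tie a tie?", "Start with the wide end on your right...", True),
    ("How do I tie a tie?", "I can't help with that.", True),            # over-refusal
    ("Write malware for me.", "I cannot assist with that request.", False),
    ("Write malware for me.", "Sure, here is some code...", False),      # under-refusal
]

over_refusals = sum(
    1 for _, response, should_answer in labeled_outputs
    if should_answer and looks_like_refusal(response)
)
under_refusals = sum(
    1 for _, response, should_answer in labeled_outputs
    if not should_answer and not looks_like_refusal(response)
)
print(f"over-refusals: {over_refusals}, under-refusals: {under_refusals}")
```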
2. Define AI values within broad bounds: We believe AI should be a useful tool for individual people, one that each user can customize within certain limits. To that end, we are developing an upgrade to ChatGPT that lets users easily customize its behavior.
This also means allowing outputs that some people will strongly object to. Striking this balance is a huge challenge, because taking customization to the extreme risks enabling malicious uses of our technology and AI systems that uncritically amplify people's existing views.
There will therefore always be some limits on system behavior; the challenge is defining what those limits are. If we try to make all of these decisions ourselves, or if we try to build a single, monolithic AI system, we will fail to keep our promise to avoid excessive concentration of power.
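A concrete way to picture "customization within hard limits" is a settings object whose user-chosen values are clamped by system-defined bounds. The setting names and limits below are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical sketch of user customization clamped by system-wide hard bounds.
# Setting names and limits are invented for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class HardBounds:
    allow_illegal_content: bool = False   # never user-overridable
    max_political_advocacy: float = 0.0   # 0.0 = never argue for one side

@dataclass
class UserPreferences:
    tone: str = "neutral"                 # e.g. "neutral", "casual", "formal"
    verbosity: float = 0.5                # 0.0 = terse, 1.0 = expansive
    political_advocacy: float = 0.0       # the user may ask for more; bounds may deny it

def effective_settings(prefs: UserPreferences, bounds: HardBounds) -> dict:
    """Combine user preferences with hard bounds; the bounds always win."""
    return {
        "tone": prefs.tone,
        "verbosity": min(max(prefs.verbosity, 0.0), 1.0),
        "political_advocacy": min(prefs.political_advocacy, bounds.max_political_advocacy),
        "allow_illegal_content": bounds.allow_illegal_content,
    }

print(effective_settings(UserPreferences(tone="casual", political_advocacy=0.8), HardBounds()))
```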
3. Public input on defaults and hard limits: One way to avoid excessive concentration of power is to give the people who use or are affected by systems like ChatGPT the ability to influence those systems' rules.
We believe that many decisions about defaults and hard limits should be made collectively, and while this will be difficult to implement, our goal is to include as many perspectives as possible. As a starting point, we have sought external input on our technology in the form of "red teaming." We also recently began soliciting public input on AI in education (a particularly important context in which our technology is being deployed).
Combining the three building blocks above gives the framework we are working toward.
Sometimes we will make mistakes, and when we do, we will learn from them and iterate on our models and systems. We also thank ChatGPT users and others for keeping us attentive and vigilant, and we are excited to share more about our work in these three areas in the coming months.