But doesn't reward for "C" mean that "C" is in the training data?
I am not sure if that is an accurate model, but if you think of it as a vector space: sure, you can generate a lot of vectors from some set of basis vectors, but you can never generate a new basis vector from the others, since they are linearly independent, so there are a bunch of new vectors you can never generate.
For an example of a reward model that doesn't include "C" explicitly, consider a reward model defined as the count of one bits in the letters of the input. It assigns a reward to "C", yet "C" never shows up explicitly: the reward has universal reach, and "C" falls within that reach as a result.
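To make that concrete, here's a minimal sketch in Python of that kind of reward model (the function name `bit_count_reward` and the letters-only filtering are my own choices, purely for illustration):

```python
def bit_count_reward(text: str) -> int:
    """Reward = total number of 1 bits across the letters of the input."""
    return sum(bin(ord(ch)).count("1") for ch in text if ch.isalpha())

print(bit_count_reward("C"))    # 'C' is 0b1000011 -> reward 3
print(bit_count_reward("ABC"))  # 2 + 2 + 3 = 7
```

Nothing about "C" is hard-coded anywhere; the rule simply happens to cover it.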
> But doesn't reward for "C" mean that "C" is in the training data?
You're running into an issue here because of an overloaded term. "Training data" has three different meanings in this conversation, depending on which context you are in.
1. The first is the pre-training context in which we're provided a dataset. My words were appropriate in that context.
2. The second is the reinforcement learning setup context in which we don't provide any dataset, but instead provide a reward model. My words were appropriate in that context.
3. The final context is the reinforcement learning algorithm's own operation: one of the things it does is generate datasets and then learn from them. Here, it's true that there exists a dataset in which "C" appears (see the sketch after this list).
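As a rough picture of case three, here's a toy sketch in which random sampling stands in for the policy's outputs and the bit-count rule stands in for the reward model (both are my own stand-ins, not any real RL setup). The point is only that the (sample, reward) dataset is produced by the algorithm itself rather than provided by a human:

```python
import random

# Toy picture of case three: the algorithm samples candidates itself and
# scores them with the reward model. The resulting (sample, reward) pairs
# form a dataset, but one generated during the algorithm's operation,
# not one a human wrote "C" into.

def bit_count_reward(text: str) -> int:
    return sum(bin(ord(ch)).count("1") for ch in text if ch.isalpha())

def generate_dataset(num_samples: int = 1000):
    alphabet = [chr(c) for c in range(ord("A"), ord("Z") + 1)]
    samples = [random.choice(alphabet) for _ in range(num_samples)]
    return [(s, bit_count_reward(s)) for s in samples]

dataset = generate_dataset()
print(any(sample == "C" for sample, _ in dataset))  # almost certainly True
```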
Recall that the important aspect of this discussion has to do with data provenance. We led off with someone claiming that an analog of "C" wasn't provided in the training data by a human explicitly. That means I only need to establish that "C" doesn't show up in either of the inputs to the learning algorithm. Those inputs are case one and case two. Case three doesn't count, because by the time we enter case three the provenance is no longer human.
Therefore, the answer to "but doesn't the reward model for C mean that C is in the training data?" is: no, it doesn't. Although "C" appears in case three, it doesn't appear in case one or case two, and those were the two cases relevant to the question. That it appears in case three is just the mechanism that refutes the claim that it could never appear.
> I am not sure if that is an accurate model, but if you think of it as a vector space: sure, you can generate a lot of vectors from some set of basis vectors, but you can never generate a new basis vector from the others, since they are linearly independent, so there are a bunch of new vectors you can never generate.
Your model of vectors sounds right to me, but your intuitions about it are a little bit off in places.
In machine learning, we introduce non-linearities into the model (for example, through activation functions like ReLU or sigmoid). This breaks the strict linear structure, enabling the model to represent a much wider range of functions. There's a mathematical result (the Universal Approximation Theorem) showing that this non-linearity lets neural networks approximate virtually any continuous function to arbitrary accuracy, regardless of its complexity.
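As a small illustration of that point, consider XOR: no purely linear model can represent it, but a single ReLU hidden layer can. The weights in the sketch below are hand-picked rather than trained, so treat it as an illustration of the non-linearity argument, not of any particular training setup:

```python
import numpy as np

# XOR lies outside the span of any linear combination of its inputs, so no
# purely linear model can represent it. One hidden layer with a ReLU
# non-linearity expresses it exactly. (Weights are hand-picked, not learned.)

def relu(x):
    return np.maximum(0.0, x)

# Hidden layer: h1 = relu(x1 + x2 - 0.5), h2 = relu(x1 + x2 - 1.5)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])

# Output: 2*h1 - 6*h2 gives 0, 1, 1, 0 on the four XOR inputs.
W2 = np.array([2.0, -6.0])

def xor_net(x):
    return float(W2 @ relu(W1 @ x + b1))

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), xor_net(np.array([a, b], dtype=float)))
```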
We're not really talking about datasets once we move into this territory; it's closer to a discussion of inductive biases. Inductive bias refers to the assumptions a model makes about the underlying structure of the data, which guide it toward certain types of solutions. If something doesn't map onto the structure the inductive bias assumes, the model can be incapable of learning that function successfully.
The last generation of popular architectures leaned heavily on convolutional networks. These baked in an inductive bias about where related data sits relative to other data (locality), which made learning some functions difficult or impossible when that assumption was violated. The current generation of models tends to be built on transformers. Transformers use an attention mechanism that determines what data to focus on, so they are better at avoiding the problems a bad inductive bias can create: they can figure out for themselves what they are supposed to be paying attention to.
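To sketch the contrast (toy NumPy code with made-up shapes and no trained weights, so purely illustrative): a 1D convolution hard-codes which positions get mixed together, while attention computes its mixing weights from the content itself.

```python
import numpy as np

# A 1D convolution mixes only a fixed local window of positions: the "what is
# related to what" assumption is baked into the architecture.
def conv1d(x, kernel):
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

# Scaled dot-product attention computes its mixing weights from the content,
# so any position can end up attending to any other position.
def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return weights @ V

x = np.arange(8, dtype=float)
print(conv1d(x, np.array([0.25, 0.5, 0.25])))  # each output sees 3 neighbours

rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(8, 4))            # 8 positions, 4-dim features
print(attention(Q, K, V).shape)                # (8, 4): global, content-based mixing
```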