I have qualms about generative models. They aren’t even necessarily rational qualms; it’s mostly just a “feel” thing. I don’t like that the output is this totally unstructured stream of text, and you have to either coerce or fine-tune the model into doing anything else.
But maybe it doesn’t have to be that way? I guess in one sense they’re just predicting tokens, and tokens are whatever we want them to be.
The model conditions on context and outputs a sequence of information, and maybe that information doesn’t have to be the regular text tokens we normally deal with. But also, maybe what I’m describing is just a classification head? I dunno.
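To make that last thought slightly more concrete, here’s a minimal sketch (PyTorch, with made-up sizes and a toy stand-in for the backbone, none of which come from any particular model): the same hidden states can feed either the usual next-token head or a small classification head, so “what the model outputs” is really just a choice of final projection.

```python
# Hypothetical sketch: swap the output head on top of a transformer backbone.
# All names and sizes here are invented for illustration.
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Toy stand-in for a pretrained generative model's transformer stack."""
    def __init__(self, vocab_size=1000, hidden_size=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        layer = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, token_ids):
        return self.encoder(self.embed(token_ids))  # (batch, seq, hidden)

class LMHead(nn.Module):
    """The usual generative head: hidden states -> logits over the vocabulary."""
    def __init__(self, hidden_size=64, vocab_size=1000):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden):
        return self.proj(hidden)  # (batch, seq, vocab); sample/argmax to get tokens

class ClassificationHead(nn.Module):
    """The alternative: pool the hidden states and emit logits over fixed labels."""
    def __init__(self, hidden_size=64, num_labels=3):
        super().__init__()
        self.proj = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden):
        return self.proj(hidden[:, -1, :])  # last position -> (batch, num_labels)

if __name__ == "__main__":
    backbone = TinyBackbone()
    tokens = torch.randint(0, 1000, (2, 16))   # a fake batch of token ids
    hidden = backbone(tokens)
    print(LMHead()(hidden).shape)               # torch.Size([2, 16, 1000])
    print(ClassificationHead()(hidden).shape)   # torch.Size([2, 3])
```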