I have a self-supervised problem: I have a sequence of web page visits, each denoted by a hash, which I can look up to find a dense vector representing that page.
I would like to train my model on various self-supervised tasks, such as predicting a target page vector given some context, or predicting the next vector in the sequence.
My question is: what is the best way to implement this? The naive approach would be to prepare the data in a separate script.
I feel there must be a better approach, possibly using `tf.data.map` within the pipeline?
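To make the question concrete, here is a rough sketch of what I have in mind: keep the raw sequence as hash ids, then use `tf.data` windowing plus a `map` to build (context, target) pairs on the fly. The embedding table, window size, and toy data here are all placeholders I made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical lookup table: hash id -> dense page vector (vocab 10, dim 4)
embedding_table = np.random.rand(10, 4).astype("float32")

# A toy visit sequence as integer hash ids
visits = tf.data.Dataset.from_tensor_slices(
    np.array([1, 3, 5, 2, 7, 4, 6], dtype=np.int64)
)

window_size = 3  # 2 context pages + 1 target page

# Slide a window over the sequence, then batch each window into one tensor
windows = visits.window(window_size, shift=1, drop_remainder=True)
windows = windows.flat_map(lambda w: w.batch(window_size))

def to_example(ids):
    # Resolve hash ids to dense vectors inside the pipeline
    vecs = tf.gather(embedding_table, ids)  # shape (window_size, dim)
    return vecs[:-1], vecs[-1]              # context vectors, next-step target

ds = windows.map(to_example, num_parallel_calls=tf.data.AUTOTUNE)

for context, target in ds.take(1):
    print(context.shape, target.shape)  # (2, 4) (4,)
```

Is this the idiomatic way to do it, or should the lookup happen in a Keras `Embedding` layer instead of inside the pipeline?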
I would love to hear some best practices for self-supervision and for building such datasets with TF and Keras.