The documents (extracted from the Wikipedia dump) are listed in one file, separated by -DOCSTART- lines, and the gold-standard labels are in a txt file (in a "documentid TAB label" format).
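As a rough illustration of reading these two files, the sketch below splits the document file at each -DOCSTART- line and collects the labels per document id. The function names `split_documents` and `read_labels` are hypothetical, and two details are assumptions rather than facts from the data description: that a separator line starts with the literal string `-DOCSTART-` (anything else on that line is kept as a header), and that a document with several labels appears on several lines of the label file.

```python
from collections import defaultdict


def split_documents(lines):
    """Split an iterable of text lines into documents at each -DOCSTART- line.

    Assumption: a separator line starts with "-DOCSTART-"; the rest of that
    line (if any) is kept verbatim as the document header.
    """
    docs = []
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("-DOCSTART-"):
            current = {"header": line, "text": []}
            docs.append(current)
        elif current is not None:
            # Lines before the first separator are ignored.
            current["text"].append(line)
    return docs


def read_labels(lines):
    """Parse "documentid TAB label" lines into {document id: set of labels}.

    Assumption: a multi-labeled document occupies one line per label.
    """
    labels = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        doc_id, label = line.split("\t", 1)
        labels[doc_id].add(label)
    return dict(labels)
```

Both functions accept any iterable of lines, so an open file handle can be passed directly, for example `split_documents(open(path, encoding="utf-8"))`, where `path` is whatever location the document file is stored at.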
We use the same Vector Space Model throughout an iteration; only its binary labeling varies, according to the target labels of the original document multi-labeling task.
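One way to picture this setup is the following minimal sketch, which keeps a single feature matrix fixed and derives one binary label vector per target label. It is an illustration under assumptions, not the actual pipeline: the names `binary_targets` and `one_vs_rest_tasks` are hypothetical, and the feature matrix is represented as a plain list of rows standing in for whatever Vector Space Model representation is used.

```python
def binary_targets(doc_labels, target):
    """Return 1 for documents that carry `target` and 0 for all others."""
    return [1 if target in labels else 0 for labels in doc_labels]


def one_vs_rest_tasks(features, doc_labels):
    """Yield one binary task per label, reusing the same feature matrix.

    `features` is the shared document representation (hypothetical: one row
    per document); `doc_labels` is a parallel list of label sets.
    """
    all_labels = sorted(set().union(*doc_labels))
    for target in all_labels:
        # Only the label vector changes between tasks; `features` is shared.
        yield target, features, binary_targets(doc_labels, target)


features = [[1, 0], [0, 1], [1, 1]]        # toy stand-in for the VSM rows
doc_labels = [{"a"}, {"b"}, {"a", "b"}]    # toy multi-label gold standard
tasks = list(one_vs_rest_tasks(features, doc_labels))
```

Each yielded triple holds the target label, the unchanged feature matrix, and the binary labeling for that target, so any binary learner can be run on the tasks one after another without rebuilding the document vectors.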