java.lang.Object
java.util.AbstractCollection<E>
java.util.AbstractSet<E>
java.util.HashSet<Document>
szte.io.WikiDocSet
public class WikiDocSet
The reader and container class for the Wikipedia soccer corpus.
Constructor Summary | |
---|---|
WikiDocSet()
|
Method Summary | |
---|---|
void |
readDocumentSet(java.lang.String file)
The documents are listed in one file, separated by -DOCSTART- lines (extracted from the Wikipedia dump), and the gold-standard labels are in a txt file (in a "documentid TAB label" format). |
Methods inherited from class java.util.HashSet |
---|
add, clear, clone, contains, isEmpty, iterator, remove, size |
Methods inherited from class java.util.AbstractSet |
---|
equals, hashCode, removeAll |
Methods inherited from class java.util.AbstractCollection |
---|
addAll, containsAll, retainAll, toArray, toArray, toString |
Methods inherited from class java.lang.Object |
---|
getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface java.util.Collection |
---|
add, addAll, clear, contains, containsAll, equals, hashCode, isEmpty, iterator, remove, removeAll, retainAll, size, toArray, toArray |
Methods inherited from interface java.util.Set |
---|
addAll, containsAll, equals, hashCode, removeAll, retainAll, toArray, toArray |
Constructor Detail |
---|
public WikiDocSet()
Method Detail |
---|
public void readDocumentSet(java.lang.String file)
Specified by:
readDocumentSet in interface DocumentSet
Parameters:
file - consists of the path to the document (*.txt files will be read from this directory) and the label XML file path separated by a |
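The method summary describes a corpus file whose documents are separated by -DOCSTART- lines and a label file in a "documentid TAB label" format. The following is a minimal sketch of how such input could be parsed. It is not the actual WikiDocSet implementation, which may differ. The class DocStartSketch and its methods splitDocuments and parseLabels are hypothetical names introduced for illustration, and the sketch works on in-memory strings rather than on file paths.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of szte.io. */
public class DocStartSketch {

    /** Assumed marker: a line that starts with -DOCSTART- begins a new document. */
    private static final String MARKER = "-DOCSTART-";

    /** Splits the corpus text into documents at every marker line. */
    public static List<String> splitDocuments(String corpus) {
        List<String> docs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : corpus.split("\\R")) {
            if (line.startsWith(MARKER)) {
                flush(current, docs);          // close the previous document, if any
                current = new StringBuilder();
            } else {
                current.append(line).append('\n');
            }
        }
        flush(current, docs);                  // close the last document
        return docs;
    }

    /** Adds the buffered text as a document unless it is blank. */
    private static void flush(StringBuilder buffer, List<String> docs) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            docs.add(text);
        }
    }

    /** Parses "documentid TAB label" lines into an id-to-label map. */
    public static Map<String, String> parseLabels(String labelText) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (String line : labelText.split("\\R")) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2 && !parts[0].isEmpty()) {
                labels.put(parts[0], parts[1].trim());
            }
            // Lines without a TAB (for example blank lines) are skipped.
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up sample data, used only to exercise the sketch.
        String corpus = "-DOCSTART-\nfirst doc\n-DOCSTART-\nsecond\ndoc\n";
        String labelText = "d1\tsoccer\nd2\tother\n";
        System.out.println(splitDocuments(corpus).size() + " documents");
        System.out.println(parseLabels(labelText));
    }
}
```

The sketch accepts a corpus with or without a marker before the first document, since the description only says that documents are separated by the marker line.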