szte.io
Class ReutersDocSet

java.lang.Object
  extended by java.util.AbstractCollection<E>
      extended by java.util.AbstractSet<E>
          extended by java.util.HashSet<Document>
              extended by szte.io.ReutersDocSet
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.lang.Iterable<Document>, java.util.Collection<Document>, java.util.Set<Document>, DocumentSet

public class ReutersDocSet
extends java.util.HashSet<Document>
implements DocumentSet

The reader and container class for the Reuters corpus. Only the 10 most frequent labels are used.
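This page does not say how the 10 most frequent labels are selected. As a minimal sketch of one way to do it, the helper below counts label occurrences and keeps the k most frequent. The class name TopLabels, the method mostFrequent, and the sample labels are assumptions made for illustration; they are not part of szte.io.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of szte.io: picks the k most frequent labels.
class TopLabels {
    static List<String> mostFrequent(List<String> labels, int k) {
        // Count how often each label occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Sort by descending count; break ties alphabetically so the result is deterministic.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Keep at most k label names.
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Illustrative label list, one entry per label assignment.
        List<String> labels = Arrays.asList("earn", "acq", "earn", "grain", "acq", "earn");
        System.out.println(mostFrequent(labels, 2)); // prints [earn, acq]
    }
}
```

With k set to 10, documents carrying none of the returned labels could then be skipped while reading, if that is the intended filtering.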

See Also:
Serialized Form

Constructor Summary
ReutersDocSet()
           
 
Method Summary
static void main(java.lang.String[] args)
           
 void readDocumentSet(java.lang.String file)
          reads (and internally stores) a corpus
 
Methods inherited from class java.util.HashSet
add, clear, clone, contains, isEmpty, iterator, remove, size
 
Methods inherited from class java.util.AbstractSet
equals, hashCode, removeAll
 
Methods inherited from class java.util.AbstractCollection
addAll, containsAll, retainAll, toArray, toArray, toString
 
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.Collection
add, addAll, clear, contains, containsAll, equals, hashCode, isEmpty, iterator, remove, removeAll, retainAll, size, toArray, toArray
 
Methods inherited from interface java.util.Set
addAll, containsAll, equals, hashCode, removeAll, retainAll, toArray, toArray
 

Constructor Detail

ReutersDocSet

public ReutersDocSet()
Method Detail

readDocumentSet

public void readDocumentSet(java.lang.String file)
Description copied from interface: DocumentSet
reads (and internally stores) a corpus

Specified by:
readDocumentSet in interface DocumentSet
Parameters:
file - the path to reuters.xml, together with TRAIN or TEST to indicate which part of the Lewis split to read
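Because the class extends HashSet<Document>, the object that reads the corpus is also the collection that holds it. The sketch below mirrors that shape with stand-in types so that it compiles on its own. Document and DocumentSet here are simplified placeholders, and LineDocSet with its one-document-per-line file format is purely hypothetical. ReutersDocSet parses reuters.xml, and how the TRAIN or TEST indicator is passed is not shown on this page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;

// Stand-in for the real Document type, which is not shown on this page.
class Document {
    final String text;
    Document(String text) { this.text = text; }
}

// Stand-in for DocumentSet, reduced to the one method documented here.
interface DocumentSet {
    void readDocumentSet(String file);
}

// Hypothetical reader: one document per non-empty line of a plain-text file.
// Only the class shape (HashSet subclass implementing DocumentSet) is mirrored.
class LineDocSet extends HashSet<Document> implements DocumentSet {
    @Override
    public void readDocumentSet(String file) {
        try {
            for (String line : Files.readAllLines(Paths.get(file))) {
                if (!line.trim().isEmpty()) {
                    add(new Document(line)); // the set itself is the storage
                }
            }
        } catch (IOException e) {
            // The interface method declares no checked exception, so wrap it.
            throw new UncheckedIOException(e);
        }
    }
}
```

After readDocumentSet returns, the inherited HashSet methods listed above (size, iterator, contains, and so on) operate directly on the documents that were read.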

main

public static void main(java.lang.String[] args)