szte.datamining.mallet
Class MalletDataHandler

java.lang.Object
  extended by szte.datamining.DataHandler
      extended by szte.datamining.mallet.MalletDataHandler
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable

public class MalletDataHandler
extends DataHandler
implements java.io.Serializable

See Also:
Serialized Form

Field Summary
 cc.mallet.types.InstanceList data
           
 java.util.Map<java.lang.String,java.lang.Integer> instanceIds
           
 
Constructor Summary
MalletDataHandler()
           
 
Method Summary
 void addDataHandler(DataHandler dh)
           
 ClassificationResult classifyDataset(Model model)
           
 void createNewDataset(java.util.Map<java.lang.String,java.lang.Object> parameters)
          creates a new empty dataset using the underlying native datatype
 DataHandler createSubset(java.util.Set<java.lang.String> instancesSelected, java.util.Set<java.lang.String> featuresSelected)
          creates a subset of the dataset where only the given instances and/or features are present
 java.lang.Boolean getBinaryValue(java.lang.String instanceId, java.lang.String featureName)
           
 int getFeatureCount()
           
 java.util.Set<java.lang.String> getFeatureNames()
           
 java.util.List<java.lang.String> getFeatureValues(java.lang.String featureName)
           
 int getInstanceCount()
           
 java.util.Set<java.lang.String> getInstanceIds()
           
<T extends java.lang.Comparable<?>>
T
getLabel(java.lang.String instanceId)
           
 java.lang.String getNominalValue(java.lang.String instanceId, java.lang.String featureName)
           
 java.lang.Double getNumericValue(java.lang.String instanceId, java.lang.String featureName)
           
<T extends java.lang.Comparable<?>>
T
getValue(java.lang.String instanceId, java.lang.String featureName)
           
 void initClassifier(java.util.Map<java.lang.String,java.lang.Object> parameters)
           
 void loadDataset(java.lang.String source)
          loads a native dataset from the given source
static void main(java.lang.String[] args)
           
 void removeFeature(java.lang.String featureName)
           
 void removeInstance(java.lang.String instanceId)
           
 void saveDataset(java.lang.String target)
          saves the current dataset to the given target
 void saveDatasetMallet(java.lang.String target)
           
 void saveDatasetSVM(java.lang.String target)
           
 void saveDatasetWeka(java.lang.String target)
           
 void setBinaryValue(java.lang.String instanceId, java.lang.String featureName, java.lang.Boolean value)
          Sets the value of a binary feature
 void setBinaryValue(java.lang.String instanceId, java.lang.String featureName, java.lang.Boolean value, boolean ternal)
           
 void setDefaultFeatureValue(java.lang.String featureName, java.lang.String value)
           
<T extends java.lang.Comparable<?>>
void
setLabel(java.lang.String instanceId, T label)
          sets the class label of the given instance
 void setNominalValue(java.lang.String instanceId, java.lang.String featureName, java.lang.String value)
          Sets the value of a nominal feature; if this is a new nominal value, it is added to the dataset
 void setNumericValue(java.lang.String instanceId, java.lang.String featureName, double value)
          Sets the value of a numeric feature
<T extends java.lang.Comparable<?>>
void
setValue(java.lang.String instanceId, java.lang.String featureName, T value)
          Sets the value of a feature. The type of the feature is given by the prefix of the feature name: b_ (binary), n_ (numeric), m_ (nominal), t_ (ternal)
 Model trainClassifier()
           
 
Methods inherited from class szte.datamining.DataHandler
createEmptyDataHandler
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

data

public transient cc.mallet.types.InstanceList data

instanceIds

public transient java.util.Map<java.lang.String,java.lang.Integer> instanceIds
Constructor Detail

MalletDataHandler

public MalletDataHandler()
Method Detail

createNewDataset

public void createNewDataset(java.util.Map<java.lang.String,java.lang.Object> parameters)
Description copied from class: DataHandler
creates a new empty dataset using the underlying native datatype

Specified by:
createNewDataset in class DataHandler

createSubset

public DataHandler createSubset(java.util.Set<java.lang.String> instancesSelected,
                                java.util.Set<java.lang.String> featuresSelected)
                         throws DataMiningException
Description copied from class: DataHandler
creates a subset of the dataset where only the given instances and/or features are present

Specified by:
createSubset in class DataHandler
Returns:
a DataHandler containing only the selected instances and/or features
Throws:
DataMiningException
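
The selection semantics described for createSubset can be illustrated with a minimal sketch over plain maps. This is not the MalletDataHandler implementation: the class SubsetSketch, its subset method, and the nested-map data layout are hypothetical, and treating a null set as "no restriction" is an assumption made here to model the "and/or" wording.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in; not part of szte.datamining. */
public class SubsetSketch {

    /**
     * Keeps only the selected instances and features.
     * Assumption: a null set means "no restriction" on that axis.
     */
    static Map<String, Map<String, Object>> subset(
            Map<String, Map<String, Object>> data,
            Set<String> instancesSelected,
            Set<String> featuresSelected) {
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> instance : data.entrySet()) {
            if (instancesSelected != null && !instancesSelected.contains(instance.getKey())) {
                continue; // instance was not selected
            }
            Map<String, Object> features = new LinkedHashMap<>();
            for (Map.Entry<String, Object> feature : instance.getValue().entrySet()) {
                if (featuresSelected == null || featuresSelected.contains(feature.getKey())) {
                    features.put(feature.getKey(), feature.getValue());
                }
            }
            result.put(instance.getKey(), features);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> data = new LinkedHashMap<>();
        Map<String, Object> doc1 = new LinkedHashMap<>();
        doc1.put("b_hasDigit", Boolean.TRUE);
        doc1.put("n_length", 12.0);
        data.put("doc1", doc1);
        Map<String, Object> doc2 = new LinkedHashMap<>();
        doc2.put("b_hasDigit", Boolean.FALSE);
        data.put("doc2", doc2);

        // Keep one instance and one feature.
        System.out.println(subset(data,
                Collections.singleton("doc1"),
                Collections.singleton("n_length")));
    }
}
```

The sketch returns a new map and leaves the input untouched, which mirrors createSubset returning a separate DataHandler; whether the real method copies or shares the underlying data is not stated in this documentation.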

addDataHandler

public void addDataHandler(DataHandler dh)
                    throws DataMiningException
Specified by:
addDataHandler in class DataHandler
Throws:
DataMiningException

getBinaryValue

public java.lang.Boolean getBinaryValue(java.lang.String instanceId,
                                        java.lang.String featureName)
                                 throws DataMiningException
Specified by:
getBinaryValue in class DataHandler
Throws:
DataMiningException

getFeatureCount

public int getFeatureCount()
Specified by:
getFeatureCount in class DataHandler

getFeatureNames

public java.util.Set<java.lang.String> getFeatureNames()
Specified by:
getFeatureNames in class DataHandler

getFeatureValues

public java.util.List<java.lang.String> getFeatureValues(java.lang.String featureName)
Specified by:
getFeatureValues in class DataHandler

getInstanceCount

public int getInstanceCount()
Specified by:
getInstanceCount in class DataHandler

getInstanceIds

public java.util.Set<java.lang.String> getInstanceIds()
Specified by:
getInstanceIds in class DataHandler

getLabel

public <T extends java.lang.Comparable<?>> T getLabel(java.lang.String instanceId)
Specified by:
getLabel in class DataHandler
Returns:
the class label of the given instance

getNominalValue

public java.lang.String getNominalValue(java.lang.String instanceId,
                                        java.lang.String featureName)
                                 throws DataMiningException
Specified by:
getNominalValue in class DataHandler
Throws:
DataMiningException

getNumericValue

public java.lang.Double getNumericValue(java.lang.String instanceId,
                                        java.lang.String featureName)
                                 throws DataMiningException
Specified by:
getNumericValue in class DataHandler
Throws:
DataMiningException

getValue

public <T extends java.lang.Comparable<?>> T getValue(java.lang.String instanceId,
                                                      java.lang.String featureName)
                                           throws DataMiningException
Specified by:
getValue in class DataHandler
Throws:
DataMiningException

initClassifier

public void initClassifier(java.util.Map<java.lang.String,java.lang.Object> parameters)
                    throws DataMiningException
Specified by:
initClassifier in class DataHandler
Throws:
DataMiningException

trainClassifier

public Model trainClassifier()
                      throws DataMiningException
Specified by:
trainClassifier in class DataHandler
Throws:
DataMiningException

classifyDataset

public ClassificationResult classifyDataset(Model model)
                                     throws DataMiningException
Specified by:
classifyDataset in class DataHandler
Throws:
DataMiningException

removeFeature

public void removeFeature(java.lang.String featureName)
                   throws DataMiningException
Specified by:
removeFeature in class DataHandler
Throws:
DataMiningException

removeInstance

public void removeInstance(java.lang.String instanceId)
                    throws DataMiningException
Specified by:
removeInstance in class DataHandler
Throws:
DataMiningException

loadDataset

public void loadDataset(java.lang.String source)
                 throws DataMiningException
Description copied from class: DataHandler
loads a native dataset from the given source

Specified by:
loadDataset in class DataHandler
Parameters:
source - a String denoting the source of the native dataset; it contains an implementation-dependent resource string for the native dataset
Throws:
DataMiningException

saveDataset

public void saveDataset(java.lang.String target)
Description copied from class: DataHandler
saves the current dataset to the given target

Specified by:
saveDataset in class DataHandler
Parameters:
target - a String denoting the target of the native dataset; it contains an implementation-dependent resource string for the native dataset

saveDatasetMallet

public void saveDatasetMallet(java.lang.String target)

saveDatasetSVM

public void saveDatasetSVM(java.lang.String target)

saveDatasetWeka

public void saveDatasetWeka(java.lang.String target)

setBinaryValue

public void setBinaryValue(java.lang.String instanceId,
                           java.lang.String featureName,
                           java.lang.Boolean value)
Description copied from class: DataHandler
Sets the value of a binary feature

Specified by:
setBinaryValue in class DataHandler
Parameters:
instanceId - instance identifier
featureName - name of the feature

setBinaryValue

public void setBinaryValue(java.lang.String instanceId,
                           java.lang.String featureName,
                           java.lang.Boolean value,
                           boolean ternal)
Specified by:
setBinaryValue in class DataHandler

setDefaultFeatureValue

public void setDefaultFeatureValue(java.lang.String featureName,
                                   java.lang.String value)
                            throws DataMiningException
Specified by:
setDefaultFeatureValue in class DataHandler
Throws:
DataMiningException

setLabel

public <T extends java.lang.Comparable<?>> void setLabel(java.lang.String instanceId,
                                                         T label)
Description copied from class: DataHandler
sets the class label of the given instance

Specified by:
setLabel in class DataHandler

setNominalValue

public void setNominalValue(java.lang.String instanceId,
                            java.lang.String featureName,
                            java.lang.String value)
Description copied from class: DataHandler
Sets the value of a nominal feature; if this is a new nominal value, it is added to the dataset

Specified by:
setNominalValue in class DataHandler
Parameters:
instanceId - instance identifier
featureName - name of the feature
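
The "new nominal value is added to the dataset" behaviour can be pictured with a small self-contained sketch. The class NominalValuesSketch and its internal maps are hypothetical and only mimic the method names documented on this page; how MalletDataHandler actually stores nominal values is not described here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in; not part of szte.datamining. */
public class NominalValuesSketch {
    // feature name -> distinct values seen so far, in insertion order
    private final Map<String, List<String>> featureValues = new HashMap<>();
    // instance id -> (feature name -> assigned value)
    private final Map<String, Map<String, String>> assigned = new HashMap<>();

    /** Assigns the value and registers it for the feature if it is new. */
    void setNominalValue(String instanceId, String featureName, String value) {
        List<String> known = featureValues.computeIfAbsent(featureName, k -> new ArrayList<>());
        if (!known.contains(value)) {
            known.add(value); // first occurrence of this nominal value
        }
        assigned.computeIfAbsent(instanceId, k -> new HashMap<>()).put(featureName, value);
    }

    /** Returns the distinct values registered for the feature (empty if none). */
    List<String> getFeatureValues(String featureName) {
        return featureValues.getOrDefault(featureName, new ArrayList<>());
    }

    /** Returns the value assigned to the instance, or null if none was set. */
    String getNominalValue(String instanceId, String featureName) {
        Map<String, String> features = assigned.get(instanceId);
        return features == null ? null : features.get(featureName);
    }

    public static void main(String[] args) {
        NominalValuesSketch sketch = new NominalValuesSketch();
        sketch.setNominalValue("doc1", "m_pos", "NOUN");
        sketch.setNominalValue("doc2", "m_pos", "VERB");
        sketch.setNominalValue("doc3", "m_pos", "NOUN"); // already known, not added again
        System.out.println(sketch.getFeatureValues("m_pos"));
    }
}
```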

setNumericValue

public void setNumericValue(java.lang.String instanceId,
                            java.lang.String featureName,
                            double value)
Description copied from class: DataHandler
Sets the value of a numeric feature

Specified by:
setNumericValue in class DataHandler
Parameters:
instanceId - instance identifier
featureName - name of the feature

setValue

public <T extends java.lang.Comparable<?>> void setValue(java.lang.String instanceId,
                                                         java.lang.String featureName,
                                                         T value)
              throws DataMiningException
Description copied from class: DataHandler
Sets the value of a feature. The type of the feature is given by the prefix of the feature name:
b_ - binary feature
n_ - numeric feature
m_ - nominal feature
t_ - ternal feature

Specified by:
setValue in class DataHandler
Parameters:
instanceId - instance identifier
featureName - name of the feature
Throws:
DataMiningException
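
The prefix convention used by setValue can be sketched as a small lookup. The class FeaturePrefixSketch, the FeatureType enum, and the typeOf helper are hypothetical names introduced for illustration. Rejecting an unrecognized prefix with IllegalArgumentException is an assumption; the real method declares DataMiningException, and its exact failure conditions are not documented here.

```java
/** Hypothetical stand-in; not part of szte.datamining. */
public class FeaturePrefixSketch {

    /** Hypothetical enum naming the four documented feature kinds. */
    enum FeatureType { BINARY, NUMERIC, NOMINAL, TERNAL }

    /** Derives the feature type from the documented name prefixes. */
    static FeatureType typeOf(String featureName) {
        if (featureName == null || featureName.length() < 2 || featureName.charAt(1) != '_') {
            throw new IllegalArgumentException("No type prefix in feature name: " + featureName);
        }
        switch (featureName.charAt(0)) {
            case 'b': return FeatureType.BINARY;
            case 'n': return FeatureType.NUMERIC;
            case 'm': return FeatureType.NOMINAL;
            case 't': return FeatureType.TERNAL;
            default:
                // Assumption: unknown prefixes are treated as an error.
                throw new IllegalArgumentException("Unknown type prefix in feature name: " + featureName);
        }
    }

    public static void main(String[] args) {
        System.out.println(typeOf("b_isCapitalized"));
        System.out.println(typeOf("n_tokenLength"));
        System.out.println(typeOf("m_partOfSpeech"));
    }
}
```

A caller of setValue would presumably pass a Boolean for b_ features, a numeric value for n_ features, and a String for m_ features, matching the typed setters setBinaryValue, setNumericValue, and setNominalValue.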

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception