
# Directory Structure

  

## DataPreprocessing

### preprocessDeli

- Processes annotated CSV files.

- Generates ProbingIDs for each probing question.

- Matches causal utterances with messageIDs.

- Creates splits for training, development, and testing datasets.
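
The split step could look roughly like the sketch below. It is not taken from the repository: the function name `split_ids`, the 80/10/10 ratios, and the choice to split by unique ID (for example one ID per dialogue, so that utterances from one conversation stay in a single split) are all assumptions made for illustration.

```python
import random


def split_ids(ids, dev_frac=0.1, test_frac=0.1, seed=0):
    """Partition unique IDs into train/dev/test lists (hypothetical helper)."""
    # Sort before shuffling so the result depends only on the seed,
    # not on the order in which the IDs were read from disk.
    unique = sorted(set(ids))
    rng = random.Random(seed)
    rng.shuffle(unique)

    n_test = int(len(unique) * test_frac)
    n_dev = int(len(unique) * dev_frac)

    test = unique[:n_test]
    dev = unique[n_test:n_test + n_dev]
    train = unique[n_test + n_dev:]
    return train, dev, test


if __name__ == "__main__":
    train, dev, test = split_ids([f"dialogue_{i}" for i in range(20)])
    print(len(train), len(dev), len(test))
```

Fixing the seed keeps the splits reproducible across runs, which matters when thresholds are tuned on one split and reported on another.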

  

### preprocessWTD:

- Processes annotated WTD files.

- Generates ProbingIDs for each probing question.

- Matches causal utterances with messageIDs.

- Creates splits for training, development, and testing datasets.

  

## GPT

### Deli_GPT

- Creates prompts for GPT to annotate causal utterances for each probing utterance.
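
As a rough illustration of prompt construction, the sketch below assembles the dialogue context that precedes a probing utterance and asks for the messageIDs of its causal utterances. The function name `build_causal_prompt`, the `(message_id, text)` input format, and the instruction wording are hypothetical; the actual prompts live in the module itself.

```python
def build_causal_prompt(messages, probing_id):
    """Build an annotation prompt for one probing utterance.

    messages: list of (message_id, text) tuples in dialogue order.
    probing_id: message_id of the probing utterance.
    (Both the input format and the prompt wording are assumptions.)
    """
    ids = [mid for mid, _ in messages]
    if probing_id not in ids:
        raise ValueError(f"unknown probing_id: {probing_id}")

    cut = ids.index(probing_id)
    # Only utterances before the probing question can have caused it.
    context = "\n".join(f"[{mid}] {text}" for mid, text in messages[:cut])
    probing_text = messages[cut][1]

    return (
        "Below is a dialogue excerpt followed by a probing utterance.\n"
        "List the IDs of the earlier utterances that caused it.\n\n"
        f"Dialogue:\n{context}\n\n"
        f"Probing utterance [{probing_id}]: {probing_text}\n"
        "Causal utterance IDs:"
    )


if __name__ == "__main__":
    demo = [("m1", "I think the answer is A."), ("m2", "Why A?")]
    print(build_causal_prompt(demo, "m2"))
```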

  

### WTD_GPT

- Creates prompts to annotate ProbingIDs.

- Creates prompts to annotate causal utterances for each probing utterance.

  

## Baselines

### naive_baselines

- Calculates thresholds for cosine, lexical, and entity baselines.

- Evaluates baselines on the test set.
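
One common way to pick such a threshold is to sweep candidate values on held-out data and keep the one with the best F1. The sketch below shows that idea for a single similarity score per utterance pair. It assumes binary gold labels and F1 as the selection criterion; the repository may use a different criterion, and the names `f1_at` and `best_threshold` are hypothetical.

```python
def f1_at(scores, labels, threshold):
    """F1 when pairs with score >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels):
    """Return the observed score that maximizes F1 (assumed criterion)."""
    # Only observed scores can change the predictions, so they are
    # sufficient as candidates; ties resolve to the lowest candidate.
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f1_at(scores, labels, t))


if __name__ == "__main__":
    dev_scores = [0.1, 0.4, 0.6, 0.9]
    dev_labels = [0, 0, 1, 1]
    print(best_threshold(dev_scores, dev_labels))
```

The same routine would apply to the lexical and entity baselines, since each reduces to one score per pair.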

  

### helper_probing

- Contains helper functions that:

  - Create golden clusters.

  - Calculate naive baseline scores.

  - Handle tokenization.

  

### modeling_probing 

- Defines the architecture for the trainable baselines.

  

### modeling
- Defines the architecture for the joint model.

  

### training_deliberation
- Trains the joint model and the other trainable baselines.

  

### generate gold map
- Maps GPT annotations to clusters.

  

### predict_deliberation_chains 
- Runs inference with a trained model.

  

### training_utils 
- Contains training-specific helper functions.

  

### deliberation_chain_analysis
- Provides cluster-level and chain-level analysis.

  

### helping_testing
- Contains additional utility functions.

  
  

### performance_metrics_deli_final_all_pairs
- Plots joint-model performance across varying turn distances between utterances for the Deli dataset.

### performance_metrics_wtd_final_all_pairs
- Plots joint-model performance across varying turn distances between utterances for the WTD dataset.
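
Before plotting, pair-level predictions have to be grouped by how many turns separate the two utterances. The sketch below shows one way to do that grouping. The input format, the function name `accuracy_by_turn_distance`, and the use of accuracy as the metric are assumptions for illustration; the scripts may report other metrics.

```python
from collections import defaultdict


def accuracy_by_turn_distance(pairs):
    """Group pair-level predictions by turn distance.

    pairs: iterable of (turn_i, turn_j, gold_label, predicted_label).
    Returns {distance: accuracy}, ordered by increasing distance.
    (Input format and metric are assumptions.)
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for turn_i, turn_j, gold, pred in pairs:
        distance = abs(turn_j - turn_i)
        totals[distance] += 1
        hits[distance] += int(gold == pred)
    return {d: hits[d] / totals[d] for d in sorted(totals)}


if __name__ == "__main__":
    demo = [(0, 1, 1, 1), (2, 3, 1, 0), (0, 3, 0, 0)]
    print(accuracy_by_turn_distance(demo))
```

The resulting dictionary can be passed to any plotting library, with distance on the x-axis and the metric on the y-axis.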