# influence_risk_analysis

### Opinion Modeling

The scripts to train and evaluate the opinion models (LLAMA2 models) are in the `scripts_llama` directory. Refer to `scripts_llama/README.md` for usage.

**C2V Training**

python ./data/cooccurrence_matrix/get_user_counts.py --in_file G:\reddit\reddit\comments\RC_2019-11 --out_dir ./user_counts
python ./data/cooccurrence_matrix/create_cooccurrence_matrix.py --in_dir ./user_counts --out_dir ./coocc_mat_dir
python similarity/community2vec.py
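
For orientation, the idea behind a subreddit co-occurrence matrix can be sketched as below. This is a minimal illustration, not the code in `create_cooccurrence_matrix.py`: the function name `cooccurrence_counts`, the input shape (a mapping from user to the subreddits they commented in), and the toy data are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(user_to_subs):
    """Count, for each unordered subreddit pair, how many users appear in both.

    `user_to_subs` is a hypothetical input shape: {user: [subreddit, ...]}.
    """
    counts = Counter()
    for subs in user_to_subs.values():
        # sorted(set(...)) removes duplicates and gives every pair one canonical key
        for pair in combinations(sorted(set(subs)), 2):
            counts[pair] += 1
    return counts


# Toy data (hypothetical users and subreddits)
toy = {
    "user_a": ["python", "learnprogramming"],
    "user_b": ["python", "learnprogramming", "cooking"],
    "user_c": ["cooking"],
}
pairs = cooccurrence_counts(toy)
```

Pairs shared by many users end up with high counts, which is the signal a community2vec-style embedding is trained on.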

**C2V similarity calculation**

python similarity/calc_c2v_w2v_cos_sim.py --w2v_in_path c2v_out/best_model/word2vec.pickle --out_path ./c2v_w2v_sim_16cat.json --subs_json ./16c_input/16cat.json
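
As a reminder of what this step measures, cosine similarity between two embedding vectors can be sketched as follows. This is a generic illustration under stated assumptions, not the implementation in `calc_c2v_w2v_cos_sim.py`; the function name and the zero-vector convention are choices made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption for this sketch: treat a zero vector as having no similarity
        return 0.0
    return dot / (norm_u * norm_v)
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0, regardless of their lengths.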

**LLAMA2 0-shot**

python similarity/llm_inference_sim.py --model_id "meta-llama/Llama-2-7b-chat-hf" --input_db /mnt/e/influence_risk_analysis/data/mix_opp_3000.db --out_path f1_sims_l2_7B_0shot.json --exp_name L2_7B_0S --model_family LLAMA2 --num_examples 0

**LLAMA2 5-shot**

python similarity/llm_inference_sim.py --model_id "meta-llama/Llama-2-7b-chat-hf" --input_db /mnt/e/influence_risk_analysis/data/mix_opp_3000.db --out_path f1_sims_l2_7B_5shot.json --exp_name L2_7B_5S --model_family LLAMA2 --num_examples 5


#### Emb-PSR

python similarity/emb_psr_single_step_calc.py --data_dir=/mnt/e/reddit/emb_psr/combined/db/ --subs=./16c_input/16cat.json --out_dir=./16c_output_2/ --title_col='title' --model_name all-mpnet-base-v2

Hyperparameter tuning to pick the best standard deviation (std) is done in the notebook `Pick Best Standard Deviation.ipynb`.

python similarity/calc_hits_at_n.py --cat2sub_json ./16c_input/cat2sub_16c.json --similarity_json .\emb_psr_16cat_sims_test_None.json

#### C2V W2V Hits@n (comparison against Emb-PSR)

python similarity/calc_hits_at_n.py --cat2sub_json ./16c_input/cat2sub_16c.json --similarity_json c2v_w2v_sim_16cat.json
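
One common way to define Hits@n for this kind of evaluation is sketched below: a query subreddit counts as a hit if any of its `n` most similar subreddits belongs to the same category. This is an assumption for illustration only; `calc_hits_at_n.py` may define the metric differently, and the JSON shapes used here (`{subreddit: {other: score}}` and `{category: [subreddit, ...]}`) are hypothetical.

```python
def hits_at_n(similarities, cat2sub, n):
    """Fraction of query subreddits with a same-category subreddit in their top n."""
    # Invert the category mapping so each subreddit can be looked up directly
    sub2cat = {sub: cat for cat, subs in cat2sub.items() for sub in subs}
    hits = 0
    total = 0
    for query, scores in similarities.items():
        if query not in sub2cat:
            continue  # skip queries without a known category
        # Rank candidates by descending similarity, excluding the query itself
        ranked = sorted(
            (s for s in scores if s != query),
            key=lambda s: scores[s],
            reverse=True,
        )
        total += 1
        if any(sub2cat.get(s) == sub2cat[query] for s in ranked[:n]):
            hits += 1
    return hits / total if total else 0.0


# Toy data (hypothetical categories, subreddits, and scores)
cat2sub = {"tech": ["python", "rust"], "food": ["cooking", "baking"]}
sims = {
    "python": {"rust": 0.9, "cooking": 0.2, "baking": 0.1},
    "rust": {"python": 0.9, "cooking": 0.3, "baking": 0.1},
    "cooking": {"python": 0.5, "baking": 0.4, "rust": 0.1},
    "baking": {"cooking": 0.8, "python": 0.1, "rust": 0.0},
}
score_at_1 = hits_at_n(sims, cat2sub, 1)
score_at_2 = hits_at_n(sims, cat2sub, 2)
```

In the toy data, `cooking` ranks `python` first, so it misses at n=1 but hits at n=2 once `baking` enters the top results.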