Harsh Kumar
2026
Chronocept: Instilling a Sense of Time in Machines
Krish Goel | Sanskar Pandey | Mahadevan Ks | Harsh Kumar | Vishesh Khadaria
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Human cognition is deeply intertwined with a sense of time, known as Chronoception. This sense allows us to judge how long facts remain valid and when knowledge becomes outdated. Despite progress in vision, language, and motor control, AI still struggles to reason about temporal validity. We introduce Chronocept, the first benchmark to model temporal validity as a continuous probability distribution over time. Using skew-normal curves fitted along semantically decomposed temporal axes, Chronocept captures nuanced patterns of emergence, decay, and peak relevance. It includes two datasets: Benchmark I (atomic facts) and Benchmark II (multi-sentence passages). Annotations show strong inter-annotator agreement (84% and 89%). Our baselines predict curve parameters - location, scale, and skewness - enabling interpretable, generalizable learning and outperforming classification-based approaches. Chronocept fills a foundational gap in AI’s temporal reasoning, supporting applications in knowledge grounding, fact-checking, retrieval-augmented generation (RAG), and proactive agents. Code and data are publicly available.
2024
Emojis Trash or Treasure: Utilizing Emoji to Aid Hate Speech Detection
Tanik Saikh | Soham Barman | Harsh Kumar | Saswat Sahu | Souvick Palit
Proceedings of the 21st International Conference on Natural Language Processing (ICON)
In this study, we delve into the fascinating realm of emojis and their impact on identifying hate speech in both Bengali and English. Through extensive exploration of various techniques, particularly the integration of Multilingual BERT (MBert) and Emoji2Vec embeddings, we strive to shed light on the immense potential of emojis in this detection process. By meticulously comparing these advanced models with conventional approaches, we uncover the intricate contextual cues that emojis bring to the table. Ultimately, our discoveries underscore the invaluable role of emojis in hate speech detection, thereby providing valuable insights for the creation of resilient and context-aware systems to combat online toxicity. Our findings showcase the potential of emojis as valuable assets rather than mere embellishments in the realm of hate speech detection. By leveraging the combined strength of MBert and Emoji2Vec, our models exhibit enhanced capabilities in deciphering the emotional subtleties often intertwined with hate speech expressions.