Legal documents are notorious for their complexity and domain-specific language, making them challenging for legal practitioners and non-experts alike to comprehend. To address this issue, the LegalEval 2023 track proposed several shared tasks, including Rhetorical Roles Prediction (Task A). We participated in Task A as the NITS_Legal team and conducted exploratory experiments to improve our understanding of the task. Our results suggest that sequence context is crucial for rhetorical roles prediction. Given the length of legal documents, we propose a BiLSTM-based sentence sequence labeling approach that uses a local context-incorporated dataset created from the original dataset. To better represent the sentences during training, we extract legal domain-specific sentence embeddings from a Legal BERT model. Our experimental findings emphasize the importance of considering local context, rather than treating each sentence independently, to achieve better performance in this task. Our approach has the potential to improve the accessibility and usability of legal documents.
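The abstract above does not specify how the local context-incorporated dataset is built. One plausible reading, offered only as a minimal sketch, is a sliding window that pairs each sentence with its neighbours before the sequence is passed to a sequence labeler. The function name `build_local_context_windows`, the `radius` parameter, and the `<pad>` token are hypothetical and are not taken from the original work.

```python
from typing import List, Sequence, Tuple

PAD = "<pad>"  # hypothetical padding token for positions outside the document


def build_local_context_windows(
    sentences: Sequence[str],
    labels: Sequence[str],
    radius: int = 1,
) -> List[Tuple[List[str], str]]:
    """Pair every sentence with its `radius` neighbours on each side.

    Assumption: the window is symmetric and padded at document boundaries.
    Each example keeps the label of the centre sentence only.
    """
    if len(sentences) != len(labels):
        raise ValueError("sentences and labels must have the same length")
    if radius < 0:
        raise ValueError("radius must be non-negative")

    n = len(sentences)
    examples = []
    for i in range(n):
        # Collect the centre sentence and its neighbours, padding out-of-range slots.
        window = [
            sentences[j] if 0 <= j < n else PAD
            for j in range(i - radius, i + radius + 1)
        ]
        examples.append((window, labels[i]))
    return examples
```

Each window could then be embedded sentence by sentence and fed to a BiLSTM; a larger `radius` trades more context for longer input sequences.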
Extractive summarization of lengthy legal documents requires an appropriate sentence scoring mechanism. This mechanism should capture both the local semantics of a sentence and its global, document-level context. The search for a sentence embedding that enables an effective scoring mechanism has been the focus of several research works in this domain. In this work, we propose an improved sentence embedding approach that combines a Legal BERT-based local embedding of the sentence with an anonymous random walk-based embedding of the entire document. These combined features help capture both the local and the global information relevant to a sentence. The experimental results suggest that the proposed sentence embedding approach can be very beneficial for representing sentences in legal documents, improving the sentence scoring mechanism required for extractive summarization of these documents.
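The abstract above says that the local and global embeddings are combined but does not say how. The sketch below assumes the simplest option, concatenation after L2 normalization, purely for illustration. The function names and the choice to normalize each part are assumptions, and the input vectors stand in for embeddings that would come from a Legal BERT model and an anonymous random walk procedure.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def combine_embeddings(
    sentence_vectors: Sequence[Sequence[float]],
    document_vector: Sequence[float],
) -> List[List[float]]:
    """Append the same document-level vector to every sentence vector.

    Assumption: concatenation is used as the combination step, and both
    parts are normalized so that neither dominates by magnitude alone.
    """
    doc_part = l2_normalize(document_vector)
    return [l2_normalize(vec) + doc_part for vec in sentence_vectors]
```

Under this scheme every sentence in a document shares the same global part, so differences in any downstream score within one document come from the local part, while the global part lets a scorer behave differently across documents.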
Artist recognition and music language recognition for music recordings are crucial tasks in the music information retrieval domain. These tasks have many industrial applications and have become increasingly important with the advent of music streaming platforms. This work proposes a multitask learning-based deep learning model that leverages the shared latent representation between these two related tasks. Experimentally, we observe that applying multitask learning to a simple convolutional neural network-based model with only a few blocks improves performance. We conduct experiments on a regional music dataset that was curated for this task and released for others to use. Results show an improvement of up to 8.7 percent in AUC-PR, with similar improvements observed in AUC-ROC.
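The abstract above does not state the training objective. A common way to train two related classification heads on a shared representation is to minimize a weighted sum of their per-task losses, and the sketch below illustrates only that idea. The function names, the use of cross-entropy, and the task weights are assumptions rather than details of the proposed model.

```python
import math
from typing import Sequence


def cross_entropy(probabilities: Sequence[float], target_index: int) -> float:
    """Negative log-likelihood of the target class for one example."""
    # Clamp to avoid log(0) when a class is assigned zero probability.
    return -math.log(max(probabilities[target_index], 1e-12))


def multitask_loss(
    artist_probs: Sequence[float],
    artist_target: int,
    language_probs: Sequence[float],
    language_target: int,
    artist_weight: float = 1.0,
    language_weight: float = 1.0,
) -> float:
    """Weighted sum of the artist and language losses for one recording.

    Assumption: both heads output class probabilities computed from the
    same shared features, so gradients from both terms update those features.
    """
    artist_loss = cross_entropy(artist_probs, artist_target)
    language_loss = cross_entropy(language_probs, language_target)
    return artist_weight * artist_loss + language_weight * language_loss
```

Setting one weight to zero recovers single-task training, which gives a simple baseline for measuring what the shared representation contributes.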
Among the many applications of Music Information Retrieval (MIR), melody extraction is one of the most essential, and it has become one of the leading research challenges in the field. Given the tremendous amount of music available at our fingertips, we need new means of defining, indexing, finding, and interacting with musical information. This article looks at approaches to automatically predicting the pitch sequence of a melody directly from the audio signal of a polyphonic music recording, commonly known as melody extraction, which opens the door to a broad variety of applications. Identifying the pitch of a melody is fairly easy for humans, but doing so automatically is very difficult and time-consuming. In this article, the performance of the state-of-the-art melody extraction approach Melodia is compared with that of a technique based on time-domain adaptive filtering, in terms of the evaluation metrics introduced in MIREX 2005. Motivated by this, the paper discusses datasets and state-of-the-art approaches for extracting the main melody from music signals. Additionally, it presents a summary of the evaluation metrics with which these methodologies have been examined on various datasets.
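The frame-level melody extraction metrics commonly associated with MIREX 2005 include voicing recall, voicing false alarm, raw pitch accuracy, and overall accuracy. The sketch below is a simplified illustration of these four, not the official evaluation code. It assumes that both pitch sequences are already aligned frame by frame, that a value of 0 Hz marks an unvoiced frame, and that a pitch counts as correct when it lies within 50 cents of the reference. The function name `melody_metrics` is hypothetical.

```python
import math
from typing import Dict, Sequence


def melody_metrics(
    ref_hz: Sequence[float],
    est_hz: Sequence[float],
    tolerance_cents: float = 50.0,
) -> Dict[str, float]:
    """Compute simplified frame-level melody extraction metrics.

    Assumption: 0 Hz (or any non-positive value) denotes an unvoiced frame.
    """
    if len(ref_hz) != len(est_hz):
        raise ValueError("reference and estimate must have the same length")

    ref_voiced = ref_unvoiced = 0
    voiced_hits = false_alarms = correct_pitch = both_unvoiced = 0

    for ref, est in zip(ref_hz, est_hz):
        if ref > 0:
            ref_voiced += 1
            if est > 0:
                voiced_hits += 1
                # Pitch distance in cents between estimate and reference.
                if abs(1200.0 * math.log2(est / ref)) <= tolerance_cents:
                    correct_pitch += 1
        else:
            ref_unvoiced += 1
            if est > 0:
                false_alarms += 1
            else:
                both_unvoiced += 1

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "voicing_recall": ratio(voiced_hits, ref_voiced),
        "voicing_false_alarm": ratio(false_alarms, ref_unvoiced),
        "raw_pitch_accuracy": ratio(correct_pitch, ref_voiced),
        "overall_accuracy": ratio(correct_pitch + both_unvoiced, len(ref_hz)),
    }
```

In this simplified convention an estimate that marks a voiced reference frame as unvoiced carries no pitch, so the frame counts against raw pitch accuracy. Evaluation toolkits that let an algorithm report a pitch for frames it considers unvoiced can score such frames differently.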