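Abstract Meaning Representation (AMR) abstracts the semantics of a given text into a single-rooted directed acyclic graph. AMR parsing derives the corresponding AMR graph from the input text. Research on Chinese AMR started later than research on English AMR, so there is relatively little work on Chinese AMR parsing. This paper studies Chinese AMR parsing on the publicly available Chinese AMR corpus CAMR1.0 using a sequence-to-sequence approach. Specifically, we first implement a sequence-to-sequence AMR parsing system for Chinese based on the Transformer model; we then explore and compare the use of different pre-trained models in Chinese AMR parsing. On this corpus, our best Chinese AMR parser achieves a Smatch F1 score of 70.29. This is the first work to report experimental results on this dataset.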
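Electronic medical records are an important source of medical information and contain a large amount of medical domain knowledge. Starting from the text of diabetes electronic medical records, and after surveying existing electronic medical record corpora in China and abroad, this paper draws on the I2B2 entity and relation classification to establish a classification scheme for entities and entity relations in diabetes electronic medical records, together with annotation guidelines. Using an entity and relation annotation platform, we carried out pre-annotation of entities and relations followed by multiple rounds of manual proofreading, producing the Diabetes Electronic Medical Record entity and Related Corpus (DEMRC). DEMRC contains 8,899 entities, 456 entity modifiers, and 16,564 relations. A consistency evaluation and analysis of DEMRC shows that the annotations reach a high level of agreement. For entity recognition and entity relation extraction, we run preliminary experiments with a transfer-learning-based Bi-LSTM-CRF model and a RoBERTa model, respectively, and evaluate each type of entity and relation in the corpus. This lays a foundation for later research on entity recognition and relation extraction in diabetes electronic medical records and for the construction of a diabetes knowledge graph.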
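This paper addresses the annotation of entities and inter-entity relations in Chinese electronic medical record text for stroke, and proposes an annotation scheme and guidelines for entities and entity relations suited to such text. Guided by this scheme and these guidelines, we carried out multiple rounds of manual annotation and correction, annotating entities and entity relations in more than 1.58 million characters of stroke electronic medical record text. The result is the Stroke Electronic Medical Record entity and entity related Corpus (SEMRC). The corpus contains 10,594 named entities and 14,457 entity relations. Annotation agreement reaches 85.16% for entity names and 94.16% for entity relations.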
Neural machine translation (NMT) usually employs beam search to expand the search space and obtain more translation candidates. However, increasing the beam size often produces many short translations, causing a dramatic drop in translation quality. In this paper, we handle the length bias problem from the perspective of causal inference. Specifically, we regard the model-generated translation score S as a degraded version of the true translation quality, affected by noise, where one of the confounders is the translation length. We apply a Half-Sibling Regression method to remove the length effect on S, obtaining a debiased translation score without length information. The proposed method is model-agnostic and unsupervised, so it can be adapted to any NMT model and test dataset. We conduct experiments on three translation tasks with datasets of different scales. Experimental results and further analyses show that our approach achieves performance comparable to the empirical baseline methods.
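The idea of removing the length effect from S can be sketched as a regression residual. This is a minimal illustration only: the abstract does not say which regressor is used, so the ordinary least-squares line below, the function name `debias_scores`, and the toy scores and lengths are all assumptions.

```python
from statistics import mean


def debias_scores(scores, lengths):
    """Return each score minus the part a linear fit on length explains.

    Assumption: a simple least-squares line stands in for the regressor;
    the abstract does not specify the regression model.
    """
    mean_len, mean_score = mean(lengths), mean(scores)
    var_len = sum((n - mean_len) ** 2 for n in lengths)
    if var_len == 0:
        # All candidates have the same length, so length explains only the mean.
        return [s - mean_score for s in scores]
    cov = sum((n - mean_len) * (s - mean_score)
              for n, s in zip(lengths, scores))
    slope = cov / var_len
    intercept = mean_score - slope * mean_len
    return [s - (slope * n + intercept) for s, n in zip(scores, lengths)]


# Hypothetical beam candidates: log-probability scores and token lengths.
scores = [-2.0, -4.1, -5.9, -8.5]
lengths = [2, 4, 6, 8]

residuals = debias_scores(scores, lengths)
raw_best = max(range(len(scores)), key=scores.__getitem__)
debiased_best = max(range(len(scores)), key=residuals.__getitem__)
```

In this toy data the raw score prefers the shortest candidate, while the residual prefers the candidate that scores best relative to what its length predicts.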
Reinforcement learning has proved effective for low-resource machine translation tasks, and different sampling methods in reinforcement learning affect model performance. The reward for a generated translation is determined by the scalability and iteration of the sampling strategy, so it is difficult for the model to achieve a bias-variance trade-off. Therefore, given the model's poor ability to analyze sequence structure in low-resource tasks, this paper proposes a parameter optimization method for neural machine translation models based on an asynchronous dynamic programming training strategy. Given the prioritized experience under the current strategy, each selectively sampled experience not only improves the value of the experience state but also avoids the high computational cost inherent in traditional valuation methods (such as dynamic programming). We evaluate our method on the Mongolian-Chinese and Uyghur-Chinese tasks of CCMT2019. The results show that our method improves the quality of low-resource neural machine translation models compared with general reinforcement learning methods, which demonstrates its effectiveness.
Metaphor detection plays an important role in tasks such as machine translation and human-machine dialogue. As more users express their opinions on products or other topics on social media through metaphorical expressions, this task is especially topical. Most research in this field focuses on English, and there are few studies on minority languages that lack language resources and tools. Moreover, metaphorical expressions have different meanings in different language environments. We therefore establish a deep neural network (DNN) framework for Uyghur metaphor detection. The proposed method attends to multi-level semantic information in the text, drawing on word embeddings, part of speech, and location, which makes the feature representation more complete. We also use the emotional information of words to learn emotional consistency features between metaphorical words and their context. A qualitative analysis further confirms the need for broader emotional information in metaphor detection. Our results indicate that the performance of Uyghur metaphor detection can be effectively improved with the help of multi-attention and emotional information.
Exposure bias and poor translation diversity are two common problems in neural machine translation (NMT), caused by the general use of the teacher forcing strategy when training NMT models. Moreover, NMT models usually require large-scale, high-quality parallel corpora. However, Korean is a low-resource language and there is no large-scale parallel corpus between Chinese and Korean, which is a challenge for researchers. We therefore propose a method that incorporates translation quality estimation into the translation process and adopts reinforcement learning. The evaluation mechanism guides the training of the model so that its predictions do not converge completely to the ground-truth words. When the model predicts a sequence different from the ground truth, the evaluation mechanism can give an appropriate evaluation and reward to the model. In addition, we alleviate the lack of Korean corpus resources by adding training data: in our experiments, we introduce a monolingual corpus of a certain scale to construct pseudo-parallel data. We also preprocess the Korean corpus at different granularities to overcome data sparsity. Experimental results show that our method outperforms the baselines on Chinese-Korean and Korean-Chinese translation tasks, which confirms its effectiveness.
Emotion classification of COVID-19 Chinese microblogs helps analyze the public opinion triggered by COVID-19. Existing methods consider only the features of the microblog itself, without incorporating the semantics of the emotion categories into the model. Emotion classification of microblogs is a process of reading the content of a microblog and combining it with the semantics of the emotion categories to decide whether it expresses a certain emotion. Inspired by this, we propose an emotion classification model based on emotion category descriptions for COVID-19 Chinese microblogs. First, we expand all emotion categories into formalized category descriptions. Second, following the idea of question answering, we construct a question for each microblog in the form ‘What is the emotion expressed in the text X?’ and regard all category descriptions as candidate answers. Finally, we construct a question-and-answer pair and use it as the input of the BERT model to complete emotion classification. By integrating rich contextual and category semantics, the model can better understand the emotion of microblogs. Experiments on the COVID-19 Chinese microblog dataset show that our approach outperforms many existing emotion classification methods, including the BERT baseline.
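The question-and-answer construction described above can be sketched as plain text pairing. This is an illustration under assumptions: the category descriptions, the function name `build_qa_pairs`, and the returned tuple layout are hypothetical, and the abstract does not give the exact input format passed to BERT. Only the question template comes from the abstract.

```python
# Question template quoted in the abstract; X is replaced by the microblog text.
QUESTION_TEMPLATE = "What is the emotion expressed in the text {text}?"

# Hypothetical category descriptions, invented for illustration only.
CATEGORY_DESCRIPTIONS = {
    "happy": "The text expresses joy, relief, or satisfaction.",
    "angry": "The text expresses anger or resentment.",
    "fear": "The text expresses worry, anxiety, or fear.",
}


def build_qa_pairs(text, descriptions):
    """Pair one question about `text` with every category description.

    Each (question, description) pair could then be encoded as a
    two-segment input for a sentence-pair classifier.
    """
    question = QUESTION_TEMPLATE.format(text=text)
    return [(label, question, description)
            for label, description in descriptions.items()]


pairs = build_qa_pairs("Finally back at work today!", CATEGORY_DESCRIPTIONS)
```

Each microblog yields one candidate pair per category, so the classifier sees the category semantics alongside the microblog content rather than the microblog alone.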
Emotion cause analysis (ECA) aims to identify the potential causes behind certain emotions in text. Many ECA models have been designed to extract the emotion cause at the clause level. However, in many scenarios, extracting only the cause clause is ambiguous. To ease this problem, we introduce multi-level emotion cause analysis, which focuses on identifying the emotion cause clause (ECC) and emotion cause keywords (ECK) simultaneously. ECK is the more challenging task, since it requires capturing not only the role of each word in the clause but also the relation between each word and the emotion expression. We observe that the ECK task can incorporate contextual information from the ECC task, while the ECC task can be improved by learning the correlation between emotion cause keywords and emotion from the ECK task. To achieve joint learning, we propose a multi-head attention based multi-task learning method, which uses a series of mechanisms, including shared and private feature extractors, multi-head attention, emotion attention, and label embedding, to capture features and correlations between the two tasks. Experimental results show that the proposed method consistently outperforms state-of-the-art methods on a benchmark emotion cause dataset.
Manifold ranking has been successfully applied to query-oriented multi-document summarization. It makes use of not only the relationships among the sentences but also the relationships between the given query and the sentences. However, the information in the original query is often insufficient. We therefore present a query expansion method, combined with manifold ranking, to resolve this problem. Our method not only uses the query terms themselves and the knowledge base WordNet to expand the query with synonyms, but also uses the information in the document set itself to expand the query in various ways (mean expansion, variance expansion, and TextRank expansion). Compared with previous query expansion methods, our method combines multiple expansion methods to better represent the query information, and at the same time makes a useful attempt at manifold ranking. In addition, we use the degree of word overlap and the proximity between words to calculate the similarity between sentences. We performed experiments on the DUC 2006 and DUC 2007 datasets, and the evaluation results show that the proposed query expansion method significantly improves system performance and makes our system comparable to state-of-the-art systems.
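For readers unfamiliar with manifold ranking, the standard iteration can be sketched as follows: scores spread over a sentence graph from the query node until they stabilize. This is a generic sketch, not the paper's system. The affinity values, the damping factor `alpha`, and the function name `manifold_rank` are assumptions, and the query expansion step is not modeled.

```python
import math


def manifold_rank(affinity, query_index=0, alpha=0.6, iterations=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y on a small graph.

    `affinity` is a symmetric similarity matrix over the query and the
    sentences; S is its symmetrically normalized form with a zero diagonal.
    """
    n = len(affinity)
    w = [[0.0 if i == j else affinity[i][j] for j in range(n)]
         for i in range(n)]
    degree = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(degree[i] * degree[j])
          if degree[i] > 0 and degree[j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n))
             + (1 - alpha) * y[i] for i in range(n)]
    return f


# Hypothetical similarities: node 0 is the query, nodes 1-3 are sentences.
affinity = [
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.5, 0.1],
    [0.1, 0.5, 0.0, 0.2],
    [0.0, 0.1, 0.2, 0.0],
]
scores = manifold_rank(affinity)
ranking = sorted(range(1, len(scores)), key=scores.__getitem__, reverse=True)
```

Sentence 3 has no direct link to the query but still receives a score through its neighbors, which is the property that makes the query's content, and hence query expansion, matter for the final ranking.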
Extractive text summarization seeks to extract indicative sentences from a source document and assemble them to form a summary. Selecting salient but non-redundant sentences has always been the main challenge. Unlike previous two-stage strategies, this paper presents a unified end-to-end model that learns to rerank sentences by modeling salience and redundancy simultaneously. Through this ranking mechanism, our method improves the quality of the overall candidate summary by giving higher scores to sentences that bring more novel information. We first design a summary-level measure to evaluate the cumulative gain of each candidate summary. We then propose an adaptive training objective to rerank the sentences, aiming to obtain a summary with a high summary-level score. The experimental results and evaluation show that our method outperforms strong baselines on three datasets and further boosts the quality of candidate summaries, which strongly indicates the effectiveness of the proposed framework.
Abstractive dialogue summarization is the task of capturing the highlights of a dialogue and rewriting them into a concise version. In this paper, we present a novel multi-speaker dialogue summarizer to demonstrate how large-scale commonsense knowledge can facilitate dialogue understanding and summary generation. In detail, we treat utterances and commonsense knowledge as two different types of data and design a Dialogue Heterogeneous Graph Network (D-HGN) to model both kinds of information. We also add speakers as heterogeneous nodes to facilitate information flow. Experimental results on the SAMSum dataset show that our model outperforms various methods. We also conduct zero-shot experiments on the Argumentative Dialogue Summary Corpus; the results show that our model generalizes better to the new domain.
Question generation (QG) aims to generate natural and grammatical questions that can be answered by a specific answer for a given context. Previous sequence-to-sequence models suffer from the problem that asking high-quality questions requires commonsense knowledge as background, which in most cases cannot be learned directly from training data, resulting in unsatisfactory questions that lack knowledge. In this paper, we propose a multi-task learning framework to introduce commonsense knowledge into the question generation process. We first retrieve relevant commonsense knowledge triples from mature databases and select triples that carry the conversion information from source context to question. Based on these informative knowledge triples, we design two auxiliary tasks to incorporate commonsense knowledge into the main QG model: Concept Relation Classification and Tail Concept Generation. Experimental results on SQuAD show that our proposed methods noticeably improve QG performance on both automatic and human evaluation metrics, demonstrating that incorporating external commonsense knowledge through multi-task learning can help the model generate human-like, high-quality questions.
In this paper, we focus on machine reading comprehension in social media. In this domain, one normally posts a message on the assumption that the readers have specific background knowledge. Those messages are therefore usually short and lacking in background information, which distinguishes them from text in other domains. As a result, it is difficult for a machine to understand the messages comprehensively. Fortunately, a key property of social media is clustering: a group of people tends to express opinions or report news around one topic. Building on this observation, we propose a novel method that uses the topic knowledge implied by the clustered messages to aid the comprehension of those short messages. Experiments on the TweetQA dataset demonstrate the effectiveness of our method.
GuessWhat?! is a task-oriented visual dialogue task with two players, a guesser and an oracle. The guesser aims to locate the object chosen by the oracle by asking several yes/no questions, which the oracle answers. Asking proper questions is crucial to achieving the final goal of the task. Previous methods generally use a word-level generator, which struggles to grasp the dialogue-level questioning strategy, so they often generate repeated or useless questions. This paper proposes a sentence-level category-based strategy-driven question generator (CSQG) that explicitly provides a category-based questioning strategy for the generator. First, we encode the image and the dialogue history to decide the category of the next question to be generated. Then the question is generated with the help of the category-based dialogue strategy as well as the encoding of both the image and the dialogue history. Evaluation on the large-scale visual dialogue dataset GuessWhat?! shows that our method helps the guesser achieve a 51.71% success rate, which is the state of the art among supervised training methods.
Few-shot relation classification has attracted great attention recently and is regarded as an effective way to tackle the long-tail problem in relation classification. Most previous work on few-shot relation classification is based on learning-to-match paradigms, which focus on learning an effective universal matcher between the query and one target class prototype based on inner-class support sets. However, the learning-to-match paradigm focuses on capturing the similarity between query and class prototype, while failing to consider discriminative information between different candidate classes. Such information is critical, especially when target classes are highly confusable and there is domain shift between the training and testing phases. In this paper, we propose Global Transformed Prototypical Networks (GTPN), which learn to build a few-shot model that directly discriminates between the query and all target classes using both inner-class local information and inter-class global information. This learning-to-discriminate paradigm makes the model concentrate more on the discriminative knowledge between all candidate classes and therefore leads to better classification performance. We conducted experiments on the standard FewRel benchmarks. Experimental results show that GTPN achieves very competitive performance on few-shot relation classification and reached the best performance on the official leaderboard of FewRel 2.0.
Irrelevant information in documents poses a great challenge for machine reading comprehension (MRC). To deal with this challenge, current MRC models are generally split into two separate parts, evidence extraction and answer prediction, where the former extracts the key evidence corresponding to the question and the latter predicts the answer based on those sentences. However, such pipeline paradigms tend to accumulate errors, i.e., extracting incorrect evidence results in predicting the wrong answer. To address this problem, we propose a Multi-Strategy Knowledge Distillation based Teacher-Student framework (MSKDTS) for machine reading comprehension. In our approach, we first take the evidence and the document, respectively, as the input reference information to build a teacher model and a student model. The multi-strategy knowledge distillation method then transfers knowledge from the teacher model to the student model at both the feature level and the prediction level. Therefore, in the testing phase, the enhanced student model can predict answers similar to those of the teacher model without knowing which sentence in the document is the corresponding evidence. Experimental results on the ReCO dataset demonstrate the effectiveness of our approach, and further ablation studies confirm the effectiveness of both knowledge distillation strategies.
The predominant approach to visual question answering (VQA) relies on encoding the image and question with a "black box" neural encoder and decoding a single token into answers such as "yes" or "no". Despite this approach's strong quantitative results, it struggles to provide human-readable justification for the prediction process. To address this shortcoming, we propose LRRA (Look, Read, Reasoning, Answer), a transparent neural-symbolic framework for visual question answering that solves complicated real-world problems step by step, like humans, and provides a human-readable justification at each step. Specifically, LRRA first learns to convert an image into a scene graph and to parse a question into multiple reasoning instructions. It then executes the reasoning instructions one at a time by traversing the scene graph using a recurrent neural-symbolic execution module. Finally, it generates answers to the given questions and makes corresponding marks on the image. Furthermore, we believe that the relations between objects in the question are of great significance for obtaining the correct answer, so we create a perturbed GQA test set by removing linguistic cues (attributes and relations) from the questions to analyze which part of the question contributes more to the answer. Our experiments on the GQA dataset show that LRRA outperforms the existing representative model (57.12% vs. 56.39%). Our experiments on the perturbed GQA test set show that the relations between objects are more important for answering complicated questions than the attributes of objects.
Keywords: Visual Question Answering, Relations Between Objects, Neural-Symbolic Reasoning.
Zipf’s law is a succinct yet powerful mathematical law in linguistics. However, the meaningfulness and units of the law have remained controversial. The current study uses online video comments called “danmu comments” to investigate these two questions. The results are consistent with previous studies arguing that Zipf’s law is subject to topical coherence. Specifically, danmu comments sampled from a single video follow Zipf’s law better than danmu comments sampled from a collection of videos. The results also suggest the existence of multiple units of Zipf’s law. When different units, including words, n-grams, and danmu comments, are compared, both words and danmu comments obey Zipf’s law, and words may be a better fit. The issues with combined n-grams in the literature are also discussed.
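One common way to check how well a set of units follows Zipf’s law is to fit a line to log frequency against log rank, where a slope near -1 matches the classic form of the law. The sketch below illustrates that check only. The study's actual fitting procedure is not described in the abstract, and the function name `zipf_slope` and the toy counts are assumptions.

```python
import math
from collections import Counter


def zipf_slope(units):
    """Least-squares slope of log(frequency) against log(rank).

    `units` may be words, n-grams, or whole comments, depending on
    which unit of Zipf's law is being examined.
    """
    counts = sorted(Counter(units).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct units to fit a line")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Toy data whose frequencies are exactly 1200 / rank (illustrative only).
toy_counts = {"a": 1200, "b": 600, "c": 400, "d": 300, "e": 240}
units = [unit for unit, count in toy_counts.items() for _ in range(count)]
slope = zipf_slope(units)
```

Running the same function on samples from a single video and on a pooled collection, or on different unit types, gives slopes that can be compared directly.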
For text-level discourse analysis, there are various discourse schemes but relatively little labeled data, because discourse research is still immature and annotating the inner logic of a text is labor-intensive. In this paper, we attempt to unify multiple Chinese discourse corpora annotated under different schemes within a discourse dependency framework, by designing semi-automatic methods to convert them into dependency structures. We also implement several benchmark dependency parsers and study how they can leverage the unified data to improve performance.
Machine reading comprehension (MRC) is a typical natural language processing (NLP) task and has developed rapidly in the last few years. Various reading comprehension datasets have been built to support MRC studies. However, large-scale, high-quality datasets are rare because building such a dataset is highly complex and labor-intensive. In addition, most reading comprehension datasets are in English, and Chinese datasets are insufficient. In this paper, we propose an automatic method for MRC dataset generation and build CMedRC, currently the largest Chinese medical reading comprehension dataset. Our dataset contains 17k questions generated by our automatic method and some seed questions. We obtain the corresponding answers from a medical knowledge graph and manually check all of them. Finally, we test BiLSTM and BERT-based pre-trained language models (PLMs) on our dataset and provide a baseline for future studies. Results show that the automatic MRC dataset generation method is useful for future model improvements.
Morphological analysis is a fundamental task in natural language processing, and its results can be applied to different downstream tasks such as named entity recognition, syntactic analysis, and machine translation. However, morphological analysis faces many problems, such as low accuracy caused by a lack of resources. In this paper, to alleviate the lack of resources in Uyghur morphological analysis research, we construct a Uyghur morphological analysis corpus based on an analysis of grammatical features and the format of general morphological analysis corpora. We define morphological tags along 14 dimensions and 53 features, and manually annotate and correct the dataset. The resulting corpus provides information such as the word, lemma, part of speech, morphological analysis tags, morphological segmentation, and lemmatization. This paper also analyzes some basic features of the corpus, and we use the models and datasets provided by the SIGMORPHON Shared Task organizers to design comparative experiments that verify the corpus’s usability. The experimental results are 85.56% and 88.29%, respectively. The corpus provides a reference for morphological analysis and promotes research on Uyghur natural language processing.
Entity Linking (EL) refers to the task of linking entity mentions in text to the correct entities in a Knowledge Base (KB). Entity embeddings play a vital and challenging role in this task because of the subtle differences between entities. However, existing pre-trained entity embeddings learn only the underlying semantic information in texts and ignore fine-grained entity type information, so the type of the linked entity can be incompatible with the mention context. To solve this problem, we propose to encode fine-grained type information into entity embeddings. We first pre-train word vectors that inject type information by embedding words and fine-grained entity types into the same vector space. We then retrain entity embeddings with the word vectors containing fine-grained type information. When our entity embeddings are applied to two existing EL models, our method achieves improvements of 0.82% and 0.42%, respectively, in average F1 score on the test sets. Moreover, our method is model-agnostic, which means it can help other EL models.
Open Relation Extraction (OpenRE), which aims to extract relational facts from open-domain corpora, is a sub-task of Relation Extraction and a crucial upstream process for many other NLP tasks. However, previous clustering-based OpenRE strategies either confine themselves to unsupervised paradigms or cannot directly build a unified relational semantic space, which harms downstream clustering. In this paper, we propose a novel supervised learning framework named MORE-RLL (Metric learning-based Open Relation Extraction with Ranked List Loss), which constructs a semantic metric space using Ranked List Loss to discover new relational facts. Experiments on real-world datasets show that MORE-RLL achieves excellent performance compared with previous state-of-the-art methods, demonstrating its capability in unified semantic representation learning and novel relational fact detection.
Distant supervision can generate large-scale relation classification data quickly and economically. However, it introduces a great number of noisy sentences that do not express their labeled relations. Drawing on the capabilities of the pre-trained language model BERT, in this paper we propose a BERT-based semantic denoising approach for distantly supervised relation classification. In detail, we define an entity pair as a source entity and a target entity. For specific sentences whose target entity is in the BERT vocabulary (a one-token word), we present the differences in dependency between the two entities for noisy and non-noisy sentences. For general sentences whose target entity is a multi-token word, we further present the differences in the last hidden states of the [MASK]-entity (MASK-lhs for short) in BERT for noisy and non-noisy sentences. We regard the dependency and MASK-lhs in BERT as two semantic features of sentences. With BERT, we first capture the dependency feature to discriminate specific sentences, then capture the MASK-lhs feature to denoise distant supervision datasets. We propose NS-Hunter, a novel denoising model that leverages BERT's cloze ability to capture the two semantic features and integrates the above functions. In experiments on NYT data, our NS-Hunter model achieves the best results in distant supervision denoising and sentence-level relation classification.
Keywords: Distant supervision, relation classification, semantic denoising
This paper tackles a new task, event entity recognition (EER). Unlike the named entity recognition (NER) task, it identifies only the named entities that are related to a specific event type. Currently, there is no model that deals directly with the EER task. Previous named entity recognition methods that combine both relation extraction and argument role classification (named NER+TD+ARC) can be adapted for the task by using the relation extraction component for event trigger detection (TD). However, these technical alternatives rely heavily on the efficiency of event trigger detection, which requires tedious and expensive human labeling of event triggers, especially for languages where triggers contain multiple tokens and have numerous synonymous expressions (such as Chinese). In this paper, we propose a novel trigger-aware multi-task learning framework (TAM), which jointly performs trigger detection and event entity recognition, to tackle the Chinese EER task. We conduct extensive experiments on a real-world Chinese EER dataset. TAM outperforms the existing technical alternatives in terms of F1 measure. In addition, TAM can accurately identify synonymous expressions that are not included in the trigger dictionary. Moreover, TAM achieves robust performance when only a few labeled triggers are available.
Deep neural networks have achieved state-of-the-art performance on named entity recognition (NER) with sufficient training data, but they perform poorly in low-resource scenarios due to data scarcity. To solve this problem, we propose a novel data augmentation method based on a pre-trained language model (PLM) and a curriculum learning strategy. Concretely, we use the PLM to generate diverse training instances by predicting different masked words, and design a task-specific curriculum learning strategy to alleviate the influence of noise. We evaluate the effectiveness of our approach on three datasets: CoNLL-2003, OntoNotes5.0, and MaScip, of which the first two are simulated low-resource scenarios and the last is a real low-resource dataset in the materials science domain. Experimental results show that our method consistently outperforms the baseline model. Specifically, our method achieves an absolute F1 improvement of 3.46% on 1% of CoNLL-2003, 2.58% on 1% of OntoNotes5.0, and 0.99% on the full MaScip dataset.
Existing entity alignment models mainly use the topological structure of the original knowledge graph and have achieved promising performance. However, they are still challenged by heterogeneous topological neighborhood structures, which can cause the models to produce different representations of counterpart entities. In this paper, we propose a global entity alignment model with gated latent space neighborhood aggregation (LatsEA) to address this challenge. The latent space neighborhood is formed by calculating the similarity between entity embeddings; it can introduce long-range neighbors to expand the topological neighborhood and reconcile the heterogeneous neighborhood structures. The model uses a vanilla GCN to aggregate the topological neighborhood and the latent space neighborhood separately. It then uses an average gating mechanism to combine the topological neighborhood information and the latent space neighborhood information of the central entity. To further account for the interdependence between entity alignment decisions, we propose a global entity alignment strategy: we formulate entity alignment as a maximum bipartite matching problem, which is solved effectively by the Hungarian algorithm. Our experiments and ablation studies on three real-world entity alignment datasets demonstrate the effectiveness of the proposed model. Latent space neighborhood information and global entity alignment decisions both contribute to the improvement in entity alignment performance.
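The difference between independent and global alignment decisions can be illustrated on a tiny similarity matrix. This sketch does not implement the Hungarian algorithm named in the abstract: it finds the same optimum by exhaustively enumerating one-to-one assignments, which is only feasible for very small inputs. The similarity values and function names are hypothetical.

```python
from itertools import permutations


def greedy_alignment(similarity):
    """Align each source entity to its most similar target independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in similarity]


def global_alignment(similarity):
    """Best one-to-one assignment by total similarity (square matrix).

    Brute-force stand-in for the Hungarian algorithm: it tries every
    permutation, so the cost grows factorially with the number of entities.
    """
    n = len(similarity)
    return max(permutations(range(n)),
               key=lambda targets: sum(similarity[i][targets[i]]
                                       for i in range(n)))


# Hypothetical similarities between 3 source and 3 target entities.
similarity = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
    [0.20, 0.30, 0.60],
]
greedy = greedy_alignment(similarity)
best = global_alignment(similarity)
```

The greedy choice maps two source entities to the same target, while the global assignment resolves the conflict by giving the first source entity its second-best match.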
Charge prediction aims to predict the final charge for a case from its fact description and plays an important role in legal assistance systems. With deep learning based methods, prediction of high-frequency charges has achieved promising results, but prediction of few-shot charges is still challenging. In this work, we propose a framework with multi-grained features and mutual information for few-shot charge prediction. Specifically, we extract coarse- and fine-grained features to enhance the model’s representation capability, so that few-shot charges can be better distinguished. Furthermore, we propose a loss function based on mutual information. This loss function leverages the prior distribution of the charges to tune their weights, so that few-shot charges contribute more to model optimization. Experimental results on several datasets demonstrate the effectiveness and robustness of our method. In addition, our method works well on tiny datasets and trains more efficiently, which makes it more applicable in real scenarios.
To enrich research on the sketch modality, this paper proposes a new task termed Sketchy Scene Captioning. The task aims to generate sentence-level and paragraph-level descriptions for a sketchy scene. The sentence-level description provides the salient semantics of a sketchy scene, while the paragraph-level description gives more details about it. Sketchy Scene Captioning can be viewed as an extension of sketch classification, which can only provide one class label for a sketch. Generating multi-level descriptions for a sketchy scene is challenging because of the visual sparsity and ambiguity of the sketch modality. To achieve our goal, we first contribute a sketchy scene captioning dataset to lay the foundation for this new task. A popular sequence learning scheme, e.g., a Long Short-Term Memory neural network with a visual attention mechanism, is then adopted to recognize the objects in a sketchy scene and infer the relations among them. In the experiments, promising results are achieved on the proposed dataset. We believe that this work will motivate further research on understanding the sketch modality and on the numerous sketch-based applications in daily life. The collected dataset is released at https://github.com/SketchysceneCaption/Dataset.
Building an interpretable AI diagnosis system for breast cancer is an important embodiment of AI-assisted medicine. Traditional breast cancer diagnosis methods based on machine learning are easy to explain, but their accuracy is very low. Deep neural networks greatly improve the accuracy of diagnosis, but such black-box models do not provide transparency or interpretation. In this work, we propose a semantic-embedding, self-explanatory Breast Diagnostic Capsules Network (BDCN). This model is the first to combine the capsule network with semantic embedding for the AI diagnosis of breast tumors, using capsules to simulate semantics. We pre-trained the extraction word vectors by embedding the semantic tree into BERT, and used the capsule network to improve the semantic representation of multi-head attention when constructing the extracted features; the capsule network was thereby extended from the computer vision classification task to the text classification task. Simultaneously, both the back-propagation principle and the dynamic routing algorithm are used to realize the local interpretability of the diagnostic model. The experimental results show that this breast diagnosis model improves performance and has good interpretability, which makes it more suitable for clinical situations.
Introduction
Breast cancer is a major threat to women’s health because of its rising incidence. Early detection and diagnosis are key to reducing the mortality rate of breast cancer and improving patients’ quality of life. Mammography (molybdenum target X-ray) reports contain rich semantic information, which can directly reflect the results of breast cancer screening (CACA-CBCS 2019), and AI-assisted diagnosis of breast cancer is an important means. Therefore, various diagnostic models have been developed. Mengwan (2020) used a support vector machine (SVM) and Naive Bayes to classify morphological features with an accuracy of 91.11%. Wei (2009) proposed an SVM-based breast cancer classification method, with a classifier accuracy of 79.25% in experiments. These traditional AI diagnoses of breast tumors have limited data volume and low accuracy. Deep Neural Networks (DNNs) have since been applied to the diagnosis of breast tumors. Wang (2019) put forward a computer-aided breast diagnosis method based on feature fusion with CNN deep features, with an accuracy of 92.3%. Zhao (2018) investigated capsule networks with dynamic routing for text classification, demonstrating the feasibility of capsule networks for text categorization. Existing models have poor predictive performance and lack interpretability, and therefore cannot meet clinical needs.
In recent years, with the development of deep learning and the increasing demand for medical information acquisition in medical information technology applications such as clinical decision support, Clinical Event Detection has been widely studied as one of their subtasks. However, directly applying advances in deep learning to Clinical Event Detection tasks often produces undesirable results. This paper proposes a multi-granularity information fusion encoder-decoder framework that introduces external knowledge. First, the word embedding generated by the pre-trained biomedical language representation model (BioBERT) and the character embedding generated by a Convolutional Neural Network are concatenated. Then, Part-of-Speech attention encoding is performed on the character-level embedding, and semantic Graph Convolutional Network encoding is performed on the concatenated character-word embedding. Finally, the information from these three parts is fused as the input to a Conditional Random Field to generate the sequence label of each word. Experimental results on the 2012 i2b2 dataset show that the proposed model is superior to other existing models. In addition, the proposed model alleviates the problem that the “occurrence” event type seems more difficult to detect than other event types.
With the increasing popularity of learning Chinese as a second language (L2), the development of an automatic essay scoring (AES) method specifically for Chinese L2 essays has become an important task. To build a robust model that can easily adapt to prompt changes, we propose 90 linguistic features that consider both language complexity and correctness, and introduce an Ordinal Logistic Regression model that explicitly combines these linguistic features with low-level textual representations. Our model obtains a high quadratic weighted kappa (QWK) of 0.714, a low RMSE of 1.516, and a considerable Pearson correlation of 0.734. With a simple linear model, we further analyze the contribution of the linguistic features to score prediction, revealing the model’s interpretability and its potential to give writing feedback to users. This work provides insights and establishes a solid baseline for Chinese L2 AES studies.
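For readers unfamiliar with the three evaluation metrics reported above, the following is a minimal sketch of how they are commonly computed. It uses the standard textbook definitions; the function names, the toy scores, and the 1 to 5 score range are illustrative assumptions and are not taken from the paper or its data.

```python
import math


def quadratic_weighted_kappa(gold, pred, min_score, max_score):
    """Quadratic weighted kappa between two lists of integer scores.

    Assumes max_score > min_score and that the scores are not all identical.
    """
    n = max_score - min_score + 1
    observed = [[0] * n for _ in range(n)]
    for g, p in zip(gold, pred):
        observed[g - min_score][p - min_score] += 1
    total = len(gold)
    gold_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = gold_hist[i] * pred_hist[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator


def rmse(gold, pred):
    """Root mean squared error between gold and predicted scores."""
    return math.sqrt(sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Toy essay scores on an assumed 1-5 scale; one prediction is off by one.
gold_scores = [1, 2, 3, 4, 5]
pred_scores = [1, 2, 3, 4, 4]
qwk_value = quadratic_weighted_kappa(gold_scores, pred_scores, 1, 5)
rmse_value = rmse(gold_scores, pred_scores)
pearson_value = pearson(gold_scores, pred_scores)
```

QWK penalizes a disagreement by the squared distance between the two scores, which is why it is often preferred over plain accuracy for ordinal essay scores.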
In this paper, we present a three-stage ‘pre-training’ + ‘post-training’ + ‘fine-tuning’ paradigm, which supplements the standard ‘pre-training’ + ‘fine-tuning’ language model approach. Furthermore, based on the three-stage paradigm, we present a language model named PPBERT. Compared with the original BERT architecture, which is based on the standard two-stage paradigm, we do not fine-tune the pre-trained model directly, but rather post-train it on a domain- or task-related dataset first, which helps to better incorporate task-awareness and domain-awareness knowledge into the pre-trained model and also reduces bias from the training dataset. Extensive experimental results indicate that the proposed model improves on the performance of the baselines on 24 NLP tasks, which include eight GLUE benchmarks, eight SuperGLUE benchmarks, and six extractive question answering benchmarks. More remarkably, our proposed model is flexible and pluggable: the post-training approach can be plugged into other PLMs that are based on BERT. Extensive ablations further validate its effectiveness and its state-of-the-art (SOTA) performance. The open-source code, pre-trained models, and post-trained models are publicly available.