Rule mining is an effective approach for reasoning over knowledge graph (KG). Existing works mainly concentrate on mining rules. However, there might be several rules that could be applied for reasoning for one relation, and how to select appropriate rules for completion of different triples has not been discussed. In this paper, we propose to take the context information into consideration, which helps select suitable rules for the inference tasks. Based on this idea, we propose a transformer-based rule mining approach, Ruleformer. It consists of two blocks: 1) an encoder extracting the context information from subgraph of head entities with modified attention mechanism, and 2) a decoder which aggregates the subgraph information from the encoder output and generates the probability of relations for each step of reasoning. The basic idea behind Ruleformer is regarding rule mining process as a sequence to sequence task. To make the subgraph a sequence input to the encoder and retain the graph structure, we devise a relational attention mechanism in Transformer. The experiment results show the necessity of considering these information in rule mining task and the effectiveness of our model.