Authors: Qi Chen, Shaonan Ma, and Kang Chen (Tsinghua University, China); Teng Ma (Alibaba Inc); Xin Liu and Dexun Chen (National Supercomputing Center in Wuxi); Yongwei Wu (Tsinghua University, China); and Zuoning Chen (Chinese Academy of Engineering; Tsinghua University, China)
Abstract: Distributed locks are used to guarantee client-cache coherence in parallel file systems. However, they perform poorly for parallel writes under high-contention workloads. We analyze the distributed lock manager and find that lock conflict resolution, which involves frequent lock revocations and slow data flushing from client caches to data servers, is the root cause of the poor performance. We design a distributed lock manager named SeqDLM that exploits the sequencer mechanism. SeqDLM mitigates the lock conflict resolution overhead using early grant and early revocation while keeping the same semantics as traditional distributed locks. To evaluate SeqDLM, we implemented a parallel file system called ccPFS with both SeqDLM and traditional distributed locks. Evaluations on 96 nodes show that SeqDLM outperforms traditional distributed locks by up to 10.3x for high-contention parallel writes on a shared file with multiple stripes.