Hi Brian,

>I don’t know much about the Swift bug. A BUG() or crash in the kernel is generally always a kernel bug, regardless of what userspace is doing. It
>certainly could be that whatever userspace is doing to trigger the kernel bug is a bug in the userspace application, but either way it shouldn’t cause the
>kernel to crash. By the same token, if Swift is updated to fix the aforementioned bug and the kernel crash no longer reproduces, that doesn’t
>necessarily mean the kernel bug is fixed (just potentially hidden).

Understood.

[Previous Message]

The valid inode has an inode number of 13668207561.
- The fsname for this inode is "sdb."
- The inode does appear to have a non-NULL if_data:

    if_u1 = {
      if_extents = 0xffff88084feaf5c0,
      if_ext_irec = 0xffff88084feaf5c0,
      if_data = 0xffff88084feaf5c0 "\004"
    },

        find <mntpath> -inum 13668207561

Q1: Were you able to track down the directory inode mentioned in the previous message?

Ans: Yes, it is the directory shown below. /srv/node/d224 is the mount point of /dev/sdb, and this is the original location of the path. The directory now contains the file 1436266052.71893.ts, which is 0 bytes in size.


[root@r2obj01 ~]# find /srv/node/d224 -inum 13668207561
/srv/node/d224/objects/45382/b32/b146865bf8034bfc42570b747c341b32

[root@r2obj01 ~]# ls -lrt /srv/node/d224/objects/45382/b32/b146865bf8034bfc42570b747c341b32
-rw------- 1 swift swift 0 Jul 7 22:37 1436266052.71893.ts

Q2: Is it some kind of internal directory used by the application (e.g., perhaps related to the quarantine mechanism mentioned in the bug)?

Ans: Yes, it’s a directory that is accessed by the application.


 37 ffff8810718343c0 ffff88105b9d32c0 ffff8808745aa5e8 REG  [eventpoll]
 38 ffff8808713da780 ffff880010c9a900 ffff88096368a188 REG /srv/node/d224/quarantined/objects/b146865bf8034bfc42570b747c341b32/1436266042.57775.ts
 39 ffff880871cb03c0 ffff880495a8b380 ffff8808a5e6c988 REG  /srv/node/d224/tmp/tmpSpnrHg

 40 ffff8808715b4540 ffff8804819c58c0 ffff8802381f8d88 DIR  /srv/node/d224/quarantined/objects/b146865bf8034bfc42570b747c341b32

According to the open files listed above, the swift-object-server was making a Python call to rename the file /srv/node/d224/objects/45382/b32/b146865bf8034bfc42570b747c341b32/1436266042.57775.ts to /srv/node/d224/quarantined/objects/b146865bf8034bfc42570b747c341b32/1436266042.57775.ts:

os.rename(old, new)

It crashed at this point. In Q1, we found that the inode number points to the directory /srv/node/d224/objects/45382/b32/b146865bf8034bfc42570b747c341b32 .

We found that the application issued multiple (more than 10) DELETE requests against the target file at almost the same moment. Each DELETE removes the original file in the directory and creates a new empty .ts file in that directory. I suspect that multiple os.rename calls on the same file in that directory cause the kernel panic.

And the file /srv/node/d224/quarantined/objects/b146865bf8034bfc42570b747c341b32/1436266042.57775.ts was not created.
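To make the suspected pattern concrete, here is a minimal sketch of several threads calling os.rename on the same source file at the same moment. It is only an illustration of the access pattern, not the actual Swift code path and not a confirmed reproducer: the function name race_rename, the worker count, and the use of a temporary directory (rather than the XFS mount) are all assumptions on my side. On a healthy filesystem exactly one rename should succeed and the others should fail with ENOENT.

```python
import os
import tempfile
import threading


def race_rename(workers=10):
    """Let `workers` threads rename the same file at once.

    Returns (succeeded, missing): how many renames succeeded and how many
    failed because the source file was already gone.
    Hypothetical helper for illustration only; not taken from Swift.
    """
    results = []
    with tempfile.TemporaryDirectory() as base:
        # Stand-ins for the objects/ and quarantined/ directories (assumed layout).
        src_dir = os.path.join(base, "objects")
        dst_dir = os.path.join(base, "quarantined")
        os.makedirs(src_dir)
        os.makedirs(dst_dir)

        old = os.path.join(src_dir, "1436266042.57775.ts")
        new = os.path.join(dst_dir, "1436266042.57775.ts")
        open(old, "w").close()  # empty file, like the 0-size .ts file

        # The barrier releases all threads together to maximize overlap.
        barrier = threading.Barrier(workers)

        def worker():
            barrier.wait()
            try:
                os.rename(old, new)
                results.append("ok")
            except FileNotFoundError:
                # Another thread already moved the file.
                results.append("missing")

        threads = [threading.Thread(target=worker) for _ in range(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return results.count("ok"), results.count("missing")


if __name__ == "__main__":
    print(race_rename())
```

Pointing the two directories at a scratch XFS filesystem instead of a temporary directory would be the next step, but I have not verified that this alone triggers the crash.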

Regards // Hugo