Hi all,   Please note I took the call chains in my secondmost previous message for the race condition from mtpctl.c in kernel 4.18.20.   Thank you, Mark On Tue, 20 Aug 2019, Mark Balançian wrote: > Hello Mister Prakash, Calaby, and Subramani, > > I also please request your reply to my previous message before the end of > this Thursday the latest, as I am partaking in an evaluation period from the > organization I am working for with a deadline very close to that time. > > Thank you, > > Mark > > On 2019-08-20 7:46 a.m., Mark Balantzyan wrote: >> Hi all, >> >> The race condition in the mptctl driver I'm wishing to have confirmed is >> evidenced by the pair of call chains: >> >> compat_mpctl_ioctl -> compat_mpt_command -> mptctl_do_mpt_command which >> calls mpt_get_msg_frame(mptctl_id, ioc) >> >> and >> >> __mptctl_ioctl -> mpt_fw_download -> mptctl_do_fw_download which calls >> mpt_put_msg_frame(mptctl_id, iocp, mf) and calls >> mpt_get_msg_frame(mptctl_id, iocp) >> >> Since ioctl can be called at any time, accessing of mptctl_id occurs >> concurrently between threads causing a race. >> >> I realize in past messages I've tried to patch by surrounding all instances >> of mptctl_id with mutexes but I'm focusing this time on one clear instance >> of the race condition involving the variable mptctl_id, since Julian asks >> what the exact race condition is with respect to the case. >> >> Please let me know the confirmation or not confirmation of this race >> possibility. >> >> Thank you, >> Mark >> >> On Sun, 18 Aug 2019, Julian Calaby wrote: >> >>> Hi Mark, >>> >>> On Thu, Aug 15, 2019 at 8:02 PM Mark Balantzyan >>> wrote: >>>> >>>> Certain functions in the driver, such as mptctl_do_fw_download() and >>>> mptctl_do_mpt_command(), rely on the instance of mptctl_id, which does >>>> the >>>> id-ing. There is race condition possible when these functions operate in >>>> concurrency. Via, mutexes, the functions are mutually signalled to >>>> cooperate. >>>> >>>> Changelog v2 >>>> >>>> Lacked a version number but added properly terminated else condition at >>>> (former) line 2300. >>>> >>>> Changelog v3 >>>> >>>> Fixes "return -EAGAIN" lines which were erroneously tabbed as if to be >>>> guarded >>>> by "if" conditions lying above them. >>>> >>>> Signed-off-by: Mark Balantzyan >>>> >>>> --- >>> >>> Changelog should be down here after the "---" >>> >>>>  drivers/message/fusion/mptctl.c | 43 +++++++++++++++++++++++++-------- >>>>  1 file changed, 33 insertions(+), 10 deletions(-) >>>> >>>> diff --git a/drivers/message/fusion/mptctl.c >>>> b/drivers/message/fusion/mptctl.c >>>> index 4470630d..3270843c 100644 >>>> --- a/drivers/message/fusion/mptctl.c >>>> +++ b/drivers/message/fusion/mptctl.c >>>> @@ -816,12 +816,15 @@ mptctl_do_fw_download(int ioc, char __user *ufwbuf, >>>> size_t fwlen) >>>> >>>>                 /*  Valid device. Get a message frame and construct the >>>> FW download message. >>>>                 */ >>>> +               mutex_lock(&mpctl_mutex); >>>>                 if ((mf = mpt_get_msg_frame(mptctl_id, iocp)) == NULL) >>>> -                       return -EAGAIN; >>>> +                       mutex_unlock(&mpctl_mutex); >>>> +               return -EAGAIN; >>> >>> Are you sure this is right? >>> >>> 1. We're now exiting early with -EAGAIN regardless of the result of >>> mpt_get_msg_frame() >>> 2. If the result of mpt_get_msg_frame() is not NULL, we don't unlock the >>> mutex >>> >>> Do you mean something like: >>> >>> - - - - - - >>> >>> mutex_lock(&mpctl_mutex); >>> mf = mpt_get_msg_frame(mptctl_id, iocp); >>> mutex_unlock(&mpctl_mutex); >>> >>> if (mf == NULL) { >>> >>> - - - - - - >>> >>>> @@ -1889,8 +1894,10 @@ mptctl_do_mpt_command (struct mpt_ioctl_command >>>> karg, void __user *mfPtr) >>>> >>>>         /* Get a free request frame and save the message context. >>>>          */ >>>> +       mutex_lock(&mpctl_mutex); >>>>          if ((mf = mpt_get_msg_frame(mptctl_id, ioc)) == NULL) >>>> -                return -EAGAIN; >>>> +               mutex_unlock(&mpctl_mutex); >>>> +        return -EAGAIN; >>> >>> Same comments here. >>> >>>> @@ -2563,7 +2576,9 @@ mptctl_hp_hostinfo(unsigned long arg, unsigned int >>>> data_size) >>>>         /* >>>>          * Gather ISTWI(Industry Standard Two Wire Interface) Data >>>>          */ >>>> +       mutex_lock(&mpctl_mutex); >>>>         if ((mf = mpt_get_msg_frame(mptctl_id, ioc)) == NULL) { >>>> +       mutex_unlock(&mpctl_mutex); >>> >>> This line needs to be indented to match the line below, also we don't >>> unlock the mutex if mpt_get_msg_frame() is not NULL. >>> >>>> @@ -3010,9 +3027,11 @@ static int __init mptctl_init(void) >>>>          *  Install our handler >>>>          */ >>>>         ++where; >>>> +       mutex_lock(&mpctl_mutex); >>>>         mptctl_id = mpt_register(mptctl_reply, MPTCTL_DRIVER, >>>>             "mptctl_reply"); >>>>         if (!mptctl_id || mptctl_id >= MPT_MAX_PROTOCOL_DRIVERS) { >>>> +               mutex_unlock(&mpctl_mutex); >>> >>> Why not use a local variable and only update the global variable if it's >>> valid? >>> >>>>                 printk(KERN_ERR MYNAM ": ERROR: Failed to register with >>>> Fusion MPT base driver\n"); >>>>                 misc_deregister(&mptctl_miscdev); >>>>                 err = -EBUSY; >>>> @@ -3022,13 +3041,14 @@ static int __init mptctl_init(void) >>>>         mptctl_taskmgmt_id = mpt_register(mptctl_taskmgmt_reply, >>>> MPTCTL_DRIVER, >>>>             "mptctl_taskmgmt_reply"); >>>>         if (!mptctl_taskmgmt_id || mptctl_taskmgmt_id >= >>>> MPT_MAX_PROTOCOL_DRIVERS) { >>>> +               mutex_unlock(&mpctl_mutex); >>> >>> Same comment here. >>> >>>> @@ -3044,13 +3064,14 @@ out_fail: >>>>  /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/ >>>>  static void mptctl_exit(void) >>>>  { >>>> +       mutex_lock(&mpctl_mutex); >>>>         misc_deregister(&mptctl_miscdev); >>>>         printk(KERN_INFO MYNAM ": Deregistered /dev/%s @ >>>> (major,minor=%d,%d)\n", >>>>                          mptctl_miscdev.name, MISC_MAJOR, >>>> mptctl_miscdev.minor); >>>> >>>>         /* De-register event handler from base module */ >>>>         mpt_event_deregister(mptctl_id); >>>> - >>>> + >>> >>> Please don't add trailing whitespace. >>> >>> Did you test this on real hardware? I'd expect it to deadlock and >>> crash almost immediately. >>> >>> Also, it might be worthwhile creating accessor functions for the >>> mptctl variables or using atomics, that way the locking doesn't need >>> to be right every time they're used. >>> >>> Finally, what's the exact race condition here? Are the functions you >>> reference changing the values of the mptctl variables while other code >>> is using them? These functions appear to be setup functions, so why >>> are those variables being used before the device is fully set up? >>> Single usages of those variables shouldn't be subject to race >>> conditions, so you shouldn't need mutexes around those. >>> >>> Thanks, >>> >>> -- >>> Julian Calaby >>> >>> Email: julian.calaby@gmail.com >>> Profile: http://www.google.com/profiles/julian.calaby/ >>> >