From mboxrd@z Thu Jan 1 00:00:00 1970 From: Lukasz Majewski Date: Tue, 01 Sep 2015 13:19:09 +0200 Subject: [U-Boot] [PATCH 1/2] mmc: dw_mmc: Increase timeout to 20 seconds In-Reply-To: <201508292119.11094.marex@denx.de> References: <1440769821-24005-1-git-send-email-l.majewski@samsung.com> <201508291552.10117.marex@denx.de> <20150829183848.53f7d79b@jawa> <201508292119.11094.marex@denx.de> Message-ID: <20150901131909.59df773d@amdc2363> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Hi Marek, > On Saturday, August 29, 2015 at 06:38:48 PM, Lukasz Majewski wrote: > > Hi Marek, > > Hi Lukasz, > > > > On Saturday, August 29, 2015 at 01:55:36 PM, Lukasz Majewski > > > wrote: > > > > On Fri, 28 Aug 2015 23:55:17 +0200 > > > > > > Hi! > > > > > > > Marek Vasut wrote: > > > > > On Friday, August 28, 2015 at 03:50:20 PM, Lukasz Majewski > > > > > wrote: > > > > > > The commit: d9dbb97be0e4a550457aec5f11afefb446169c90 > > > > > > "mmc: dw_mmc: Zap endless timeout" removed endless loop > > > > > > waiting for end of dw mmc transfer. > > > > > > > > > > > > For some workloads - dfu test @ Odroid XU3 (sending 8MiB > > > > > > file) - and SD cards (e.g. MicroSD Kingston 4GiB, Adata > > > > > > 4GiB) the default timeout is to short. > > > > > > > > > > > > The new value - 20 seconds - takes into account the > > > > > > situation when SD card triggers internal clean up. Such > > > > > > process may take more than 10 seconds on some cards. > > > > > > > > > > What happens if you pull the SD card out of the slot during > > > > > such a process? > > > > > > > > Then we would wait 20 seconds :-) to proceed. > > > > > > Oops, I think I was not clear here. I was wondering what happens > > > to the card if you yank it out of the slot whole it's performing > > > it's internal cleanup or whatever. Is it possible that the card > > > suffers data corruption, effectively trashing user data ? > > > > I think that only the card manufacturer may reveal what can happen > > with the card (what policy have been implemented in their FW). > > > > I guess that you may lost data in such case. > > Uuuurgh, that's seriously shitty. > > > > Is this behavior > > > specific to Samsung SD cards ? > > > > I've experienced the problem with Kingston (brand new one) and Adata > > MicroSD HC (4GiB) cards. > > I had bad previous experience with Kingston too. > > > > > To be clear - the mentioned patch introduced regression. > > > > > > That's a bug, not a regression, but anyway, > > > that's not the point. I do > > > agree with you that we do have a problem and I'm inclined to Ack > > > this patch, I'd like to understand what the real implications of > > > such a behavior of these cards are. > > > > > > > It works for > > > > small files on a commonly available SD cards (like 4 GiB > > > > Kingston/Adata). > > > > > > > > When I ran DFU tests I've discovered that there is a problem > > > > with storing 8MiB file (dat_8M.img). > > > > > > > > Even worse - when one wants to store Image.itb file (which > > > > might be 4-6 MiB) it sometimes works and sometimes not. > > > > Nightmare for debugging. > > > > > > Ew, that's one crappy card you have there. I'm reading the SD card > > > "Physical Layer Simplified Specification Version > > > 4.10" (part1_410.pdf) section 4.6.2.2 and it states that for SDHC > > > cards, the write operation should take at most 250mS, for SDXC > > > it's 500mS. Could it be that your card is violating the spec ? The "timeout" error is for situation when you issue write command (either single or multiple block) and you don't receive any response from the card. In our case we use multiblock transfer (CMD25) with either set number of block to transfer (CMD23) or explicit end of transmission (CMD12). Let's consider the second case. We setup data and issue CMD25. Then we check the CMD25 status (if we don't receive reply in 250 ms we would get timeout error). Afterwards data from buffer is transmitted to the card and flashed. This operation may take long time. During this process you can issue CMD13 (SEND_STATUS) to receive information about card state ([*] 4.10. - page 85). Two notable fields of [*] to check are READY_FOR_DATA and CURRENT_STATE. Those two state what is the SD Card controller situation. Then, you end the transfer with CMD12, which also provides some status information from the SDCard (like "prg"|"rcv", etc). If you want to issue next command you must check READY_FOR_DATA and CURRENT_STATE. In the case of internal SD card controller operation you would not get READY_FOR_DATA until it ends. > > > > I'll look into the spec and then comment :-). > > > > For my boards the time was 1.2 seconds for storing 8 MiB file. > > OK, but that's weird. The transfer should always be up to 512B long > and not any longer, unless it's a multiblock transfer. It is a multi block transfer. > > > > > Please correct me if I'm wrong - but is seems like we are now > > > > using 1 second timeout for detection if SD card has been > > > > removed? > > > > > > > > Shouldn't we use polling on the card detect IO pin instead? We > > > > could add such polling in several places in the MMC subsystem > > > > (like we do it with watchdog). > > > > > > > > Marek, Pantelis, what do you think about this? > > > > > > If you implement board_mmc_getcd(), you can check if the card is > > > present this way instead of waiting for command to time out. The > > > infrastructure for that is already in place. Right ? > > > > So you suggest adding board_mmc_getcd() in several places in the mmc > > subsystem driver to detect removal of the SD card? > > Hmmmm, I'm not sure about this one. Panto ? > > > > It'd be cool if the MMC subsystem could pull the wp-gpios and > > > cd-gpios from DT though :) > > > > +1 > > > > > > > Also, where did you find out there is such "cleanup" mechanism > > > > > please ? > > > > > > > > Internally we did some tests with several SD cards. We were > > > > stunned when it turned out that for some workloads it took up > > > > to 15 seconds to end write operation for small data. > > > > > > > > The culprit is the SD Card embedded controller responsible for > > > > FTL - flash translation layer. > > > > It allows NAND memory on the card to be visible as the block > > > > device. More importantly it also takes care of wear leveling > > > > and bad block management. > > > > > > > > Hence, we don't know when it would start housekeeping > > > > operations. We can only poll/wait until this controller > > > > finishes it work. The code as it was (with the indefinite loop) > > > > was taking this situation into account. > > > > > > > > The 1 second timeout is apparently too short and makes using SD > > > > card non-deterministic and error prone in u-boot. > > > > > > > > Even worse, many devices use SD card as the only storage device. > > > > > > Yes, horrible. > > > > Good that we have agreed. > > Heh :) > > Best regards, > Marek Vasut -- Best regards, Lukasz Majewski Samsung R&D Institute Poland (SRPOL) | Linux Platform Group