Hi Miquel, On 06/09/2023 01:35 AM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Thu, 8 Jun 2023 12:10:06 -0700: > >> On 06/07/2023 11:18 PM, Miquel Raynal wrote: >>> Hi William, >>> >>> william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: >>> >>>> Hi Miquel, >>>> >>>> On 06/07/2023 01:22 AM, Miquel Raynal wrote: >>>>> Hi William, >>>>> >>>>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: >>>>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than >>>>>> STB, iProc and other SoCs. It has different endianness for NAND cache >>>>>> data and ONFI parameter data. >>>>>> >>>>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >>>>>> and performance improvement using the optimized memcpy function on NAND >>>>>> cache memory. >>>>>> >>>>>> Signed-off-by: William Zhang >>>>>> --- >>>>>> >>>>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >>>>>> 3 files changed, 68 insertions(+), 14 deletions(-) >>>>>> >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>> index 7e48b6a0bfa2..899103a62c98 100644 >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>> @@ -26,6 +26,18 @@ enum { >>>>>> BCMBCA_CTLRDY = BIT(4), >>>>>> }; >>>>>> >> +#if defined(CONFIG_ARM64) >>>>>> +#define ALIGN_REQ 8 >>>>>> +#else >>>>>> +#define ALIGN_REQ 4 >>>>>> +#endif >>>>>> + >>>>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >>>>>> +{ >>>>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >>>>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >>>>>> +} >>>>>> + >>>>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >>>>>> { >>>>>> struct bcmbca_nand_soc *priv = >>>>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >>>>>> brcmnand_writel(val, mmio); >>>>>> } >>>>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >>>>>> + void __iomem *flash_cache, u32 *buffer, >>>>>> + int fc_words, bool is_param) >>>>>> +{ >>>>>> + int i; >>>>>> + >>>>>> + if (!is_param) { >>>>>> + /* >>>>>> + * memcpy can do unaligned aligned access depending on source >>>>>> + * and dest address, which is incompatible with nand cache. Fallback >>>>>> + * to the memcpy for io version >>>>>> + */ >>>>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >>>>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >>>>>> + else >>>>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >>>>>> + } else { >>>>>> + /* Flash cache has same endian as the host for parameter pages */ >>>>>> + for (i = 0; i < fc_words; i++, buffer++) >>>>>> + *buffer = __raw_readl(flash_cache + i * 4); >>>>>> + } >>>>>> +} >>>>>> + >>>>>> static int bcmbca_nand_probe(struct platform_device *pdev) >>>>>> { >>>>>> struct device *dev = &pdev->dev; >>>>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >>>>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >>>>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >>>>>> + soc->read_data_bus = bcmbca_read_data_bus; >>>>>> >> return brcmnand_probe(pdev, soc); >>>>>> } >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>> index d920e88c7f5b..656be4d73016 100644 >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >>>>>> return brcmnand_readl(ctrl->edu_base + offs); >>>>>> } >>>>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >>>>>> + void __iomem *flash_cache, u32 *buffer, >>>>>> + int fc_words, bool is_param) >>>>>> +{ >>>>>> + struct brcmnand_soc *soc = ctrl->soc; >>>>>> + int i; >>>>>> + >>>>>> + if (soc->read_data_bus) { >>>>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >>>>>> + } else { >>>>>> + if (!is_param) { >>>>>> + for (i = 0; i < fc_words; i++, buffer++) >>>>>> + *buffer = brcmnand_read_fc(ctrl, i); >>>>>> + } else { >>>>>> + for (i = 0; i < fc_words; i++) >>>>>> + /* >>>>>> + * Flash cache is big endian for parameter pages, at >>>>>> + * least on STB SoCs >>>>>> + */ >>>>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>>>> + } >>>>>> + } >>>>> >>>>> Perhaps we could have a single function that is statically assigned at >>>>> probe time instead of a first helper with two conditions which calls in >>>>> one case another hook... This can be simplified I guess. >>>>> >> Well this will need to be done at the SoC specific implementation level (bcm_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. >>> >>> You told me in case we would use exec_op we could avoid the param >>> cache. If that's true then the whole support can be simplified. >>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. >> >> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. > > I am sorry but this series is totally backwards, you're trying to guess > what comes next with the 'is_param' thing, it's exactly what we are > fighting against since 2017. There are plenty of ->exec_op() > conversions out there, I don't believe this one will be harder. You > need to convert the driver to this new API and get rid of this whole > endianness non-sense to simplify a lot the driver. > I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. It's not that I am against exec_op but I want to take one step a time and I'd like to get these fixes and support for bcmbca soc first and then work on the exec_op API to minimize the change and reduce the risk. >> >>>> Not sure how much this can be simplified... Or we have default >>>> implementation in brcmnand.c but then there is one condition check >>>> too. Page read is done at 512 bytes burst. One or two conditions >>>> check outside of the per 512 bytes read loop does not sounds too bad >>>> if performance is concern. >>> >>> It is unreadable. That is my main concern. >>> >>>> >>>>>> +} >>>>>> + >>>>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >>>>>> { >>>>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >>>>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { >>>>>> /* Copy flash cache word-wise */ >>>>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; >>>>>> - int i; >>>>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >>>>>> >> - /* >>>>>> - * Must cache the FLASH_CACHE now, since changes in >>>>>> - * SECTOR_SIZE_1K may invalidate it >>>>>> - */ >>>>>> - for (i = 0; i < FC_WORDS; i++) >>>>>> - /* >>>>>> - * Flash cache is big endian for parameter pages, at >>>>>> - * least on STB SoCs >>>>>> - */ >>>>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >>>>>> + FC_WORDS, true); >>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >>>>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>>>> { >>>>>> struct brcmnand_host *host = nand_get_controller_data(chip); >>>>>> struct brcmnand_controller *ctrl = host->ctrl; >>>>>> - int i, j, ret = 0; >>>>>> + int i, ret = 0; >>>>>> >> brcmnand_clear_ecc_addr(ctrl); >>>>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>>>> if (likely(buf)) { >>>>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >>>>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) >>>>>> - *buf = brcmnand_read_fc(ctrl, j); >>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >>>>>> + FC_WORDS, false); >>>>>> + buf += FC_WORDS; >>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >>>>>> } >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>> index f1f93d85f50d..88819bc395f8 100644 >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { >>>>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >>>>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >>>>>> bool is_param); >>>>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >>>>>> + u32 *buffer, int fc_words, bool is_param); >>>>>> const struct brcmnand_io_ops *ops; >>>>>> }; >>>>>> > > >>>>> Thanks, >>>>> Miquèl >>>>> > > >>> Thanks, >>> Miquèl >>> > > > Thanks, > Miquèl >