Now that I had a working serial cable for my Nexus 5, as described in the "Making an UART cable for the Nexus 5" post, I was ready to face some action and help in upstreaming.
Looking through Brian Masney's TODO page there were a couple options, but the one I ended going for was the rear flashlight. Hardware doesn't get much simpler than an LED and it would be easy to test if it was working.
To my surprise though, the downstream driver had over 3500 lines! But it wasn't only for the flash LED, it supported multiple LED types: WLED, Flash/Torch, RGB, MPP and KPDBL. To make it easier to port and since the flash would be the only one I would be able to test and be sure that worked, I decided to create a new file for the driver and only copy over what was needed for the flash LED.
I started by copying the probe function, compiling the driver, and seeing which errors of missing definitions were thrown. If those definitions had "flash" or "torch" in the name, I would also copy them, otherwise I just removed those references. I repeated this until eventually there were no more missing definition errors and my driver had everything needed for the flash LED.
At this point I still had a lot of compiling errors though, since the downstream driver was for kernel 3.4 and I was compiling for 5.7.6. So I just went through each error, referencing the definitions on both the downstream and upstream kernels, and made the necessary changes.
With the driver successfully compiling, I added a CONFIG for it and enabled it
as a module in the defconfig used by Nexus 5 (qcom_defconfig
). I also went
through the devicetree files on the downstream tree to find which nodes I
needed to describe the flash LED hardware to the driver, and the properties
needed in them, so I could also add them upstream.
Note: The design of devicetrees can make the properties of a node be
scattered across different files. Later I learned that I could compile the
downstream kernel and read the source for the complete tree from the blob with
dtc -I dtb -O dts -o downstream_dt.dts
kernel_lge_hammerhead/arch/arm/boot/msm8974-hammerhead-rev-11.dtb
.
With a driver that compiles, a valid devicetree, and configs enabling the driver as a module, I was ready to finally compile my kernel with the driver and flash it to the phone. So I commited my changes and went to battle.
And of course it failed. In fact, it failed so badly that the driver wasn't even binding to the device. Not having a solid understanding of devicetrees and how the binding between the devices and drivers worked, I started researching about it.
A great resource I found was Solving Device Tree Issues (more on eLinux).
It was in fact using the dt_node_info
script that presentation shows that I
found out that the device was being loaded but the driver wasn't. Also, the
debugging techniques shown, like enabling the dynamic debug prints on probe
functions helped me discover that my driver's probe function wasn't even being
called.
After a lot of reading, both online documentation and other drivers' code, I noticed that my driver was registering in the SPMI bus, which made sense to me since it needs to communicate over that bus, but given that the flash LED node in the devicetree was being registered in the platform bus, my driver also needed to, otherwise they would never bind. So that was something I needed to change.
But making my driver register on the platform bus, made me no longer have the
spmi_device
pointer that I needed to use the SPMI functions to read and
write on the registers. Again, looking around on other drivers, like
qcom-spmi-iadc
, I discovered there was this cool regmap thing I could use
instead to read and write on the registers over SPMI but without being specific
to SPMI. I did the reasonable thing and just tried it out!
With these changes commited, the driver was now binding to the device, but the probe function was failing with the following messages:
[ 14.547394] spmi spmi-0: pmic_arb_wait_for_done: transaction failed (0x3)
[ 14.547405] qcom,leds-qpnp fc4cf000.spmi:pm8941@1:qcom,leds@d300: Unable to read from addr=5, rc(-5)
[ 14.547503] qcom,leds-qpnp fc4cf000.spmi:pm8941@1:qcom,leds@d300: Regulator get failed(-517)
[ 14.547512] qcom,leds-qpnp fc4cf000.spmi:pm8941@1:qcom,leds@d300: Unable to read flash config data
Given that the regulator was the one failing, and that I had just copied over the regulator nodes from the downstream devicetree, the problem was clearly there. I needed to find out what regulators were needed for the LED to work, and add them upstream if they weren't already there.
At this point I emailed Brian Masney asking for some light, and the advice he gave me was to compile and test the downstream kernel. Having something that works to use as a reference, even if it is very outdated, is invaluable.
Using an older toolchain to compile, as instructed in the build script, and after a minor problem, I got the downstream kernel compiling, flashed it, and verified that the LED and the downstream driver indeed worked. I should really have done this from the beginning... Imagine if the flash LED itself was actually faulty!
Then I started exploring the sysfs on this system to see how it worked. I
found the regulator that was being used by the LED, whose status
went to
enabled
whenever I turned the LED on with echo 1 >
/sys/class/leds/led\:flash_torch/brightness
.
With the downstream regulator node and devicetree, and the upstream driver for
the regulator (qcom_spmi-regulator
) and its dtbinding in hand, I started
doing some detective work.
After some investigation I finally discovered that, given that the regulator
address was 0xa000
, the regulator named 8941_boost
downstream is known
as s4
in the upstream kernel, or also by its nickname pm8941_5v
.
I still had to discover the other regulator's real identity, but having gained
some confidence from that fine detective work, and with some tips from the
devicetree, I bet all my money that pm8941_chg_boost
's real identity was
5vs1
, also called pm8941_5vs1
.
Commit, flash, test, aaand... nope. It still didn't work, although I had clearly made some progress. Now the probe function was successfully being executed, although the read and write operations on the SPMI registers were still failing:
[ 13.346704] leds_qpnp:qpnp_leds_probe: Probe called!
[ 13.346746] spmi spmi-0: pmic_arb_wait_for_done: transaction failed (0x3)
[ 13.346760] qcom,leds-qpnp fc4cf000.spmi:pm8941@1:qcom,leds@d300: Unable to read from addr=5, rc(-5)
[ 13.347250] leds_qpnp:qpnp_dump_regs: ===== led:flash_0 LED register dump start =====
[ 13.347285] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x40 = 0x0
[ 13.347319] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x41 = 0x0
[ 13.347353] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x42 = 0x0
[ 13.347385] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x43 = 0x0
[ 13.347419] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x44 = 0x0
[ 13.347445] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x48 = 0x0
[ 13.347470] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x49 = 0x0
[ 13.347496] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x4b = 0x0
[ 13.347530] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x4c = 0x0
[ 13.347563] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x4f = 0x0
[ 13.347589] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x46 = 0x0
[ 13.347614] leds_qpnp:qpnp_dump_regs: led:flash_0: 0x47 = 0x0
[ 13.347622] leds_qpnp:qpnp_dump_regs: ===== led:flash_0 LED register dump end =====
Since I was confident that the devicetree was now right, I went back to the
driver code. I sprinkled some pr_debug()
over the probe function and
noticed that the reg
value, which is read from the devicetree and used as
the base address for all read and write operations, was 0
, even though it
should be 0xd300
.
Oh. Really?? Well, this wouldn't be a complete adventure without me introducing a dumb bug myself, right? 😝
After fixing the bug, I recompiled, reflashed, rebooted, retested aaand... YESS!!!
Isn't it sooo beautiful? 😍
But let's not get too amazed by that gorgeous light. Now that I finally had a driver that actually worked, I rebased everything on top of mainline (at this point the kernel had already moved from 5.7 to 5.9) to make sure it still worked and sent an RFC patch!
And that's the tale of "How I ported a driver upstream"! Now, that's the end of this post, but not of the adventure. There are lots of things that I need to do to get that driver patch series actually included in mainline (some of which I will still find out from the feedback on my patches).
Since my goal here was always to just to get it working, changing as minimum as possible, and only after that clean it up and make it decent, this is what I'll need to start doing now 🙂.
Just one last thing. This post might have made it seem that the problem resolutions were streamlined, but it wasn't like that at all. There were multiple points were I had no idea what to do, sat on it for multiple weeks, and even thought about giving up and working on another project.
But I'm glad I always kept trying and asking for help, because as much as it was frustrating at times, it was also so much fun and I learned so much. I can't even express my happiness the moment that light finally turned on after 4 months (on and off) of work.
And that's it! I hope to be back in a few months with a new post telling my adventure on getting an RFC patch for a new driver into an actual upstream driver 🙂. See you then!