this post was submitted on 03 Aug 2023
Embedded

This sub is dedicated to discussion and questions about embedded systems: "a controller programmed and controlled by a real-time operating system (RTOS) with a dedicated function within a larger mechanical or electrical system, often with real-time computing constraints."

I'm trying to run an LED matrix display (with a MAX7219 controller) from a Raspberry Pi Pico using Rust. There is a max7219 crate that I used, but I am unsure about how to prepare the pins I want to use. Can I use any of the pins? Do I have to set them to push-pull output?

[–] orclev 3 points 1 year ago* (last edited 1 year ago) (1 children)

OK, so SPI is a bus protocol, although when you're talking that low level that's almost overselling it. With SPI you have a clock signal on one pin, a device select signal on another pin, and then two unidirectional signal pins, MOSI (master out, slave in) and MISO/SOMI (master in, slave out, this pin is often unused in SPI if the device being talked to is write only). The big advantage to this setup is that you can read and write from devices at the same time since the read and write pins are separate. The other tradeoff with SPI is that adding additional devices requires one extra pin per device (for the device select signal, usually identified as CS). The way you route signals to a particular device is by driving the CS pin low for the device you want to communicate with. So you need 4 pins for 1 device, 5 pins for 2 devices, 6 pins for 3, etc.

It's entirely possible to implement SPI communication in software: you just manually flip the CLK and MOSI pins to send your messages (and read from the MISO pin). That's usually referred to as bit banging. Most microcontrollers though also include special hardware for managing SPI for you, where you can set a desired clock speed and write/read from buffers, and the controller will handle flipping the CLK and MOSI pins for you and filling the buffers.
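
To make the bit-banging idea concrete, here is a rough host-side sketch of an SPI mode 0 write of one 16-bit word, MSB first. The SpiLines trait and RecordingLines struct are made up for this illustration (a real HAL would hand you actual pin types); the fake lines just record what a device would sample on each rising clock edge.

```rust
/// Hypothetical stand-in for the two output lines used when writing over SPI.
pub trait SpiLines {
    fn set_mosi(&mut self, high: bool);
    fn set_clk(&mut self, high: bool);
}

/// Bit-bangs one 16-bit word, most significant bit first, SPI mode 0
/// (clock idles low, data is sampled on the rising edge).
pub fn bitbang_write_u16<L: SpiLines>(lines: &mut L, word: u16) {
    for bit in (0..16).rev() {
        lines.set_mosi((word >> bit) & 1 == 1); // present the bit while CLK is low
        lines.set_clk(true); // rising edge: the device samples MOSI here
        lines.set_clk(false); // falling edge: back to idle
    }
}

/// Fake lines that record the MOSI level at every rising clock edge.
#[derive(Default)]
pub struct RecordingLines {
    mosi: bool,
    clk: bool,
    pub sampled: Vec<bool>,
}

impl SpiLines for RecordingLines {
    fn set_mosi(&mut self, high: bool) {
        self.mosi = high;
    }
    fn set_clk(&mut self, high: bool) {
        if high && !self.clk {
            self.sampled.push(self.mosi);
        }
        self.clk = high;
    }
}

/// Reassembles the sampled bits into a word, first sampled bit as MSB.
pub fn sampled_word(lines: &RecordingLines) -> u16 {
    lines.sampled.iter().fold(0, |acc, &b| (acc << 1) | b as u16)
}
```

On real hardware you would also drive CS low before the transfer and high afterwards, and probably need small delays to respect the device's timing.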

The other major bus protocol you'll typically run into for embedded devices is I2C (or IIC). I2C only uses 2 pins, a clock pin (SCL) and a bidirectional data pin (SDA). I2C operates by assigning each device on the bus a unique address, and as part of each message sent on the bus the target device's address is included. The downside to this design is that reading and writing is often significantly slower than what's possible with SPI, but the upside is that you can have a large number of devices (over a hundred with standard 7-bit addresses) all driven by the same 2 pins. Just like with SPI many microcontrollers will include hardware specifically for managing I2C communications, but you can also just bit bang your messages across the bus as well.
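
As a small illustration of the "address is part of each message" point: with standard 7-bit addressing, the first byte on the bus is the device address shifted left by one, with the lowest bit saying whether it's a read or a write. The function name below is just something I made up for the sketch.

```rust
/// Builds the first byte of an I2C transfer for a 7-bit device address.
/// Returns None if the address does not fit in 7 bits.
pub fn i2c_address_byte(addr7: u8, read: bool) -> Option<u8> {
    if addr7 > 0x7F {
        return None;
    }
    // Upper 7 bits: address. Lowest bit: 1 = read, 0 = write.
    Some((addr7 << 1) | read as u8)
}
```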

No matter how you decide to configure things, the MAX7219 is using SPI; the only difference is whether you're "manually" implementing SPI, where code running on the main CPU sends the messages, or whether the actual sending is handled by dedicated SPI hardware and the main CPU just writes to buffers. The library itself handles the SPI communication when you pass it the pins, so you don't need to implement SPI yourself, but it is going to end up being noticeably slower than if you initialize the SPI module on the rp-pico and let its dedicated SPI hardware manage it instead.

Looking at the SPI example there's this chunk of code:

    let clocks = hal::clocks::init_clocks_and_plls(
        rp_pico::XOSC_CRYSTAL_FREQ,
        pac.XOSC,
        pac.CLOCKS,
        pac.PLL_SYS,
        pac.PLL_USB,
        &mut pac.RESETS,
        &mut watchdog,
    )
    .ok()
    .unwrap();

That's initializing the system clocks in the rp-pico. You should be doing this no matter what and this doesn't really have anything to do with SPI except in that SPI uses the system clocks.

Then there's this chunk here:

    let _spi_sclk = pins.gpio2.into_mode::<hal::gpio::FunctionSpi>();
    let _spi_mosi = pins.gpio3.into_mode::<hal::gpio::FunctionSpi>();
    let _spi_miso = pins.gpio4.into_mode::<hal::gpio::FunctionSpi>();
    let spi_cs = pins.gpio5.into_push_pull_output();

That block is configuring three pins to be usable with the hardware SPI driver and configuring one final pin as a plain push-pull output to act as the CS pin.

Edit: N.B. Lemmy kept stripping the turbofish operators out of this chunk; the type argument on each into_mode call should be the HAL's SPI pin function (FunctionSpi). You can follow this link to see what I'm talking about. Additionally you can reference this chart to see which pins on the rp-pico support which hardware functions. GPIO 0 through 7 and 16 through 23 support the SPI0 hardware driver, and GPIO 8 through 15 and 24 through 29 support SPI1, although not all of those are broken out on the Pico board.
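
If you don't have the chart handy, the pattern in the RP2040's function table is regular enough to sketch in a few lines. This is my reading of the datasheet's GPIO function-select table, so double-check it against the chart; the names SpiRole and spi_function are made up for the example. Note that it only matters for the hardware-driven SCK/TX/RX pins: the example above drives CS as an ordinary output, so any free GPIO works for that.

```rust
/// Role a GPIO takes when switched to its SPI function.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SpiRole {
    Rx,  // MISO when the Pico is the controller
    Csn, // hardware chip select
    Sck, // clock
    Tx,  // MOSI when the Pico is the controller
}

/// Returns (SPI peripheral number, role) for a GPIO, or None if the
/// GPIO number is outside 0..=29.
pub fn spi_function(gpio: u8) -> Option<(u8, SpiRole)> {
    if gpio > 29 {
        return None;
    }
    // Blocks of 8 pins alternate between SPI0 and SPI1.
    let peripheral = (gpio / 8) % 2;
    // Within each group of 4 pins the roles repeat: RX, CSn, SCK, TX.
    let role = match gpio % 4 {
        0 => SpiRole::Rx,
        1 => SpiRole::Csn,
        2 => SpiRole::Sck,
        _ => SpiRole::Tx,
    };
    Some((peripheral, role))
}
```

So gpio2/gpio3/gpio4 in the example are SPI0's clock, TX and RX respectively, which is why those particular pins were chosen.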

Lastly there's this chunk:

    let spi = spi::Spi::<_, _, 8>::new(pac.SPI0);
    let spi = spi.init(
        &mut pac.RESETS,
        clocks.peripheral_clock.freq(),
        400.kHz(), // card initialization happens at low baud rate
        &embedded_hal::spi::MODE_0,
    );

That's using the first hardware SPI unit (SPI0) and initializing it. It's setting it to run at 400 kHz and in mode 0 (SPI loosely defines four ways communication can be implemented, usually referred to as modes 0 through 3). The clock speed and mode that need to be configured will vary based on the device. For the MAX7219 it looks like it wants to run at a max of 1 MHz and mode 0.
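
For reference, the SPI mode number is just two bits packed together: clock polarity (CPOL, whether the clock idles high) and clock phase (CPHA, whether data is sampled on the second clock edge instead of the first). A tiny sketch, with a function name I made up:

```rust
/// Decodes an SPI mode number into (cpol, cpha).
/// cpol: true if the clock idles high.
/// cpha: true if data is sampled on the second clock edge.
pub fn spi_mode(mode: u8) -> Option<(bool, bool)> {
    match mode {
        0 => Some((false, false)),
        1 => Some((false, true)),
        2 => Some((true, false)),
        3 => Some((true, true)),
        _ => None, // only four modes exist
    }
}
```

Mode 0 (clock idles low, sample on the rising edge) is what the example passes in via MODE_0.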

Once you have the MAX7219 struct returned from the from_spi_cs function you can basically just ignore SPI entirely and just use the functions on the MAX7219 driver to send messages to the display.

[–] [email protected] 3 points 1 year ago (1 children)

Hey! Thank you very much! I've actually managed to get the SPI to work with the max7219. I initialized it with <_, _, 16> as the generic arguments, but it only compiled when I changed the 16 to 8 (it was 16 in another example). It took me a long time to figure out, because I couldn't find documentation about what this parameter means. It's simply called "DS", which I interpreted as "Data Size", but I could be wrong.

I guess the main difficulty in the HAL-Code is reading the heavily generic types used.

i don't want to get on your nerves, so don't answer if it's annoying, I will find some answers eventually. But I think I will continue to ask questions here, maybe some other people will see it and frequent this community more often.

I've got some weird issues now where it isn't behaving like I would expect, but that may have to do with power or something else. Maybe some transmission errors? Could a lower SPI frequency reduce errors in transmission? (My example uses 1 MHz, yours 400 kHz.)

[–] orclev 2 points 1 year ago* (last edited 1 year ago) (1 children)

Edit: OK, so Lemmy keeps stripping all the angle brackets out of my comments which makes posting any code that uses generics really hard/impossible. To work around that I'm just going to link to a gist of this post the way it's supposed to look.

I'm going to guess you're not super familiar with Rust yet, in which case good job making it this far with embedded Rust, that's kind of the deep end of the pool. The embedded-hal crate that's at the core of all these crates is a really amazing piece of engineering, it walks a fine line between defining a set of primitives that can be used across all embedded devices while also not being so generic as to be useless or so specific as to exclude certain embedded devices from being supported. A big part of how it accomplishes that is by very carefully using traits and generics. Traits are easiest to work with but they have the downside of potentially introducing dynamic dispatch which has runtime overhead, so static dispatch is preferred. A big part of how you avoid dynamic dispatch is using generics.

For a concrete example, we can look at the Pin struct declared by the rp2040-hal crate. The Pin struct is generic and includes two parameters, an Id that's an instance of PinId which is itself simply a marker trait that can be applied to each GPIO address, and a Mode that's an instance of the PinMode trait which is a marker trait for the various modes each Pin can be toggled into. Using these you could for instance have an instance of the Pin struct declared like so: Pin<Gpio0, Output<PushPull>>, which would indicate the GPIO0 pin that has been configured into PushPull mode. The PushPull struct is an instance of the marker trait OutputConfig. Going back to the Pin struct for a moment, we can see that it provides a generic implementation for OutputPin which is defined for any Pin whose mode is an Output of some OutputConfig. Using that OutputPin trait then allows writers of drivers, such as the one for the MAX7219, to write a generic implementation that will work for literally any Pin that's an instance of OutputPin.

Now an important point in all of that, is that generics are made concrete at compile time. While you see a declaration like this:

    pub struct PinConnector<DATA, CS, SCK>
    where
        DATA: OutputPin,
        CS: OutputPin,
        SCK: OutputPin,

at compile time that actually ends up looking more like PinConnector<Pin<Gpio3, Output<PushPull>>, Pin<Gpio5, Output<PushPull>>, Pin<Gpio2, Output<PushPull>>>, which is declaring that you're using the GPIO pin 3 for MOSI, pin 5 for CS, and pin 2 as clock, as well as statically asserting at compile time that they've all been properly configured into output mode. You would for instance get a compile error if you attempted to pass a pin instance whose mode is Input<PullUp>, because PullUp is an instance of InputConfig and therefore that Pin instance is an instance of InputPin, not an instance of OutputPin as declared by the bounds on the PinConnector generics.
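
Here's a stripped-down toy model of that pattern that runs on a desktop, in case it helps to see it without the HAL around it. The type names mirror the ones discussed above, but everything in it is a simplified stand-in I wrote for illustration, not the actual rp2040-hal or max7219 definitions.

```rust
use std::marker::PhantomData;

// Marker types for pin identity and pin mode (toy versions).
pub struct Gpio2;
pub struct Gpio3;
pub struct Gpio5;
pub struct PushPull;
pub struct PullUp;
pub struct Output<C>(PhantomData<C>);
pub struct Input<C>(PhantomData<C>);

/// A pin whose identity and mode exist only in its type.
pub struct Pin<I, M> {
    pub high: bool,
    _marker: PhantomData<(I, M)>,
}

impl<I, M> Pin<I, M> {
    pub fn new() -> Self {
        Pin { high: false, _marker: PhantomData }
    }
}

pub trait OutputPin {
    fn set_high(&mut self);
    fn set_low(&mut self);
}

// Blanket implementation: any pin in any Output mode is an OutputPin.
// There is deliberately no such impl for Input modes.
impl<I, C> OutputPin for Pin<I, Output<C>> {
    fn set_high(&mut self) {
        self.high = true;
    }
    fn set_low(&mut self) {
        self.high = false;
    }
}

pub struct PinConnector<DATA: OutputPin, CS: OutputPin, SCK: OutputPin> {
    pub data: DATA,
    pub cs: CS,
    pub sck: SCK,
}

type Out<I> = Pin<I, Output<PushPull>>;

/// Builds a connector from three output pins and idles CS high.
pub fn connect() -> PinConnector<Out<Gpio3>, Out<Gpio5>, Out<Gpio2>> {
    let mut c = PinConnector { data: Pin::new(), cs: Pin::new(), sck: Pin::new() };
    c.cs.set_high();
    c
}

// Swapping any field for Pin::<Gpio3, Input<PullUp>>::new() is rejected by
// the compiler, because Pin<_, Input<_>> does not implement OutputPin.
```

The check costs nothing at runtime: the marker types are zero-sized and the compiler generates one concrete PinConnector per combination of pin types.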

Now, that does make reading the docs for all this a little tricky, and requires some getting used to, but it's incredibly powerful once you do understand it. One skill you're going to want to get in the habit of to make the most out of the embedded-hal ecosystem is reading blanket and auto trait implementations, they're really the core of what makes the entire thing function.

To make all of this even more complicated, embedded Rust docs are only half the picture; the other half is the docs for the specific hardware devices in question. For instance here is the datasheet for the Max7219. Looking at that I can already see I made a mistake in one of my previous comments. I said the max supported speed was 1 MHz, but the datasheet actually indicates it's 10 MHz, and indeed when I double check the driver docs I linked previously they do in fact say 10 MHz, not 1 MHz. Based on the datasheet for the Max7219, I would expect that the DS parameter on the Spi device should actually be 16 as that's the size of each serialized packet sent over the SPI bus that it's expecting, however I see that the Max7219 driver crate specifies that the Spi instance should be a Write<u8>, which is only defined for Spi with a DS value of 8 or lower. I'm guessing maybe there's some quirk of the Max7219 command set that the driver is working around? Not really sure what's going on there honestly.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

Hey! Thank you very much! This is an incredibly well made, probably labor-intensive and (nice!) comment! (And yeah, a few code pieces seem to disappear, but I think I understand the original meaning.)

That cleared a lot up, to be honest. I have been using Rust for a while now, but I think all the more advanced features that I didn't really have to deep-dive into before are now used all at once in the embedded context. It's all very dense to read when only looking into the source code (or the docs). But your explanations helped tremendously (I will read them again tomorrow though).

It's really fascinating what rust makes possible here. I haven't really programmed too much in c++ in the embedded context, but i guess i would have to basically rewrite a lot of software if i want to use it on a different device, right?

Regarding the 8 or 16 values of the DS parameter, I am not quite sure myself. I've found two examples where a Max7219 chip is used together with a Raspberry Pi Pico with Rust. One implemented the max7219 struct itself, didn't use the max7219 crate, and used the value 16 for DS. This example works on my setup.

The other example is using the max7219 crate and it needs DS=8, otherwise it doesn't compile. It kind of works, but there seem to be some errors when I use it: if I use write_raw to set all the pixels on the display, certain values seem to change the display's state. At a certain point it changes its intensity and suddenly changes into all-pixels-on mode. This shouldn't happen if I only use write_raw.

But with your explanations i might understand a little more of the stuff that i used in the code. Thank you very much!

[–] orclev 1 points 1 year ago* (last edited 1 year ago) (2 children)

Honestly I'm suspecting that the driver crate is just broken, and that it is supposed to be using a value of 16 for the DS parameter. The trait constraint the from_spi function should have applied should be Write with a u16 generic, not u8 which would then allow you to use 16 as the DS parameter when initializing the Spi instance. If I had a Max7219 chip at hand I would try modifying the driver crate to verify if that's the case, but I don't unfortunately. Maybe open an issue on the driver repo describing the behavior you're seeing (and maybe link him back to this thread) to see what he thinks?
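
To show what I mean by the constraint, here's a rough desktop-only model. SpiWrite is a made-up local stand-in for the embedded-hal blocking Write trait (which is generic over the word type in the same way), and Bus16 and Driver are toy types, not the real crate's.

```rust
/// Local stand-in for a blocking SPI write trait, generic over the word type.
pub trait SpiWrite<W> {
    fn write(&mut self, words: &[W]) -> Result<(), ()>;
}

/// Toy bus that only knows how to send 16-bit words (think DS = 16).
pub struct Bus16 {
    pub sent: Vec<u16>,
}

impl SpiWrite<u16> for Bus16 {
    fn write(&mut self, words: &[u16]) -> Result<(), ()> {
        self.sent.extend_from_slice(words);
        Ok(())
    }
}

/// Toy driver bounded on SpiWrite<u16>: each register write is exactly one word.
pub struct Driver<SPI: SpiWrite<u16>> {
    pub spi: SPI,
}

impl<SPI: SpiWrite<u16>> Driver<SPI> {
    pub fn write_raw(&mut self, header: u8, data: u8) -> Result<(), ()> {
        // Header in the high byte, data in the low byte.
        self.spi.write(&[u16::from_be_bytes([header, data])])
    }
}
```

If the bound on Driver were SpiWrite<u8> instead, constructing it with Bus16 would fail to compile, which is the same kind of error you hit when you tried DS = 16 with the crate as published.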

As for the case with C++ code it is often more device specific, but it can also cheat a certain amount. Rust is all about safety, it doesn't let you make a bunch of mistakes that are possible in C++. The upshot of that is that when you get a piece of Rust code to compile, it's more often than not correct. That's somewhat on the skill of the person writing the libraries though, you can certainly write code that can be used wrong, but a good author can often define their APIs in such a way that it's impossible to use it incorrectly. As in the example above, the Spi instance is being constrained to a DS of 8 due to the way the Max7219 crate is defined, it's impossible to accidentally use a DS of 16 with it, it just happens that it seems like that constraint is wrong in this case.

C++ in contrast lets you take shortcuts. For instance you can define a bunch of constants and use ifdefs to conditionally set them at compile time. For example, this random driver I found using a Google search defines the Max7219 class as taking a PinName class/struct/enum (not sure which honestly) which I'm sure is defined elsewhere as the raw pin identifier constant exposed by the underlying hardware. That driver does not enforce that the pin has been configured into the proper PushPull mode prior to it being passed to the driver; it's on you as the user of the library to make sure everything has been properly set up beforehand. It's "easier" in that everything is basic, but it's also error prone as it doesn't double check your work, you'll just get a crash or misbehavior at runtime.

C/C++ is very low level, barely higher than assembly. If you're armed with the datasheets for everything you can probably make it work, but you need to be very sure you're getting all the details right. Rust on the other hand tries to force you to use things correctly. Ideally you should have just been able to grab the Max7219 crate, and just use it and everything would work. The fact that it doesn't suggests there's a possible bug in the crate, rather than that you're just using it wrong, as it really should be impossible to use it wrong.

[–] [email protected] 2 points 1 year ago

Hey thank you! it might actually be that the driver has an error. For me somebody pointing that out is actually very helpful, as I always suspect that I'm doing something wrong. But playing around with the other example that kind of implemented the max7219 interface from scratch (using the u16 for the data send) was pretty fun!

I guess I will try changing the original max7219 crate from u8 to u16 tomorrow and see what happens. I also posted an issue about that on the GitHub.

[–] orclev 2 points 1 year ago* (last edited 1 year ago) (1 children)

I decided to crack open the source of the Max7219 crate to get a better idea of what's going on.

Reading the chip's datasheet it looks like it's expecting 16 bit packets sent most significant bit first on the wire. The high byte consists of 4 don't-care bits followed by a 4 bit register address (or command). The low byte is interpreted depending on the address or command in the high byte as well as what the currently set decoding mode is.
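
As a quick sketch of that layout (function names are mine, and 0x0A being the intensity register is from my reading of the datasheet's register map):

```rust
/// Packs a MAX7219 register address and data byte into one 16-bit packet.
/// Bits 15..12 are don't-care (left as 0), bits 11..8 are the register
/// address, bits 7..0 are the data.
pub fn packet(register: u8, data: u8) -> u16 {
    (((register & 0x0F) as u16) << 8) | data as u16
}

/// Splits a packet back into (register address, data), ignoring the
/// don't-care bits.
pub fn unpack(packet: u16) -> (u8, u8) {
    (((packet >> 8) & 0x0F) as u8, (packet & 0xFF) as u8)
}
```

For example packet(0x0A, 0x0F) would be the word that sets intensity to its maximum.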

Looking at the code for the crate, I see in the Spi struct it declares a buffer like so buffer: [u8; MAX_DISPLAYS * 2],. I believe a more correct version of that declaration would be buffer: [u16; MAX_DISPLAYS],. Then looking at the actual implementation of the write_raw method I see this:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        let offset = addr * 2;
        let max_bytes = self.devices * 2;
        self.buffer = [0; MAX_DISPLAYS * 2];

        self.buffer[offset] = header;
        self.buffer[offset + 1] = data;

        self.spi
            .write(&self.buffer[0..max_bytes])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

where once again a bunch of double counting of u8s is being done. I think a more accurate version of that would be:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        self.buffer = [0; MAX_DISPLAYS];

        self.buffer[addr] = u16::from_ne_bytes([header, data]);

        self.spi
            .write(&self.buffer[0..self.devices])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

This skips messing around with packing the u8 bytes into pairs via address calculations and instead uses the from_ne_bytes function to directly pack the address/header byte and the data byte into a single u16 suitable for serialization across the SPI bus. I'm not 100% sure that from_ne_bytes is correct in this case, as I'm not entirely clear how that would interact with the native endianness of the CPU and the SPI controller; from_be_bytes is the variant that explicitly puts the header in the high byte regardless of the CPU. Some experimentation would be necessary there I think to make sure it was actually portable.
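
To see what the three from_*_bytes variants actually do with a [header, data] pair, here's a small sketch you can run on a desktop (the wrapper names are made up). Which value is right on the wire still depends on how the SPI peripheral serializes a 16-bit frame, so treat this as a way to reason about it rather than a verdict.

```rust
/// First byte becomes the HIGH byte of the word, on every CPU.
pub fn pack_be(header: u8, data: u8) -> u16 {
    u16::from_be_bytes([header, data])
}

/// First byte becomes the LOW byte of the word, on every CPU.
pub fn pack_le(header: u8, data: u8) -> u16 {
    u16::from_le_bytes([header, data])
}

/// Behaves like pack_le on a little-endian CPU and like pack_be on a
/// big-endian one, so the result is not portable.
pub fn pack_ne(header: u8, data: u8) -> u16 {
    u16::from_ne_bytes([header, data])
}
```

If the peripheral shifts each 16-bit word out most significant bit first, the word that puts the header on the wire first is the one with the header in its high byte.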

[–] [email protected] 2 points 1 year ago (1 children)

Hi, Thank you. It took me a while, but I experimented around a little bit. I have not yet tried to fix the max7219-library though. I think it is from_be_bytes (the other one didn't work).

But one thing that I am not understanding (I think this is a "can't tell the forest from the trees"-situation) is how exactly multiple 8x8-matrices are connected i.e. how the data-stream looks exactly.

In your example (from the max7219-library) it seems like if I use 4 devices I send 4 times a u16 out and the 4 connected Max7219's figure out themselves which one is meant?

[–] orclev 1 points 1 year ago (1 children)

So it took me a little while to figure out between reading the datasheet for the Max7219 and looking at the source code. Basically it's taking advantage of a feature of the Max7219 that allows daisy chaining multiple chips off the same SPI connection. In order to take advantage of this feature you would take N Max7219 chips and wire all their CS and CLK pins together with your controller, and then run the connection from the controller to the first chip's DIN port, and then the DOUT port from the first chip to the DIN port of the next chip. Keep chaining DOUT to DIN to daisy chain all the chips together.

In the datasheet for the Max7219 there's this section:

For the MAX7219, serial data at DIN, sent in 16-bit packets, is shifted into the internal 16-bit shift register with each rising edge of CLK regardless of the state of LOAD. For the MAX7221, CS must be low to clock data in or out. The data is then latched into either the digit or control registers on the rising edge of LOAD/CS. LOAD/CS must go high concurrently with or after the 16th rising clock edge, but before the next rising clock edge or data will be lost. Data at DIN is propagated through the shift register and appears at DOUT 16.5 clock cycles later

Essentially what that all boils down to, is that each Max7219 maintains a 16 bit internal shift register, so as each bit is received on DIN it's pushed onto the register, and the highest bit of the register gets pushed out to DOUT. When you daisy chain multiple chips together it's effectively like concatenating all their shift registers together. So if you have 4 chips, that's 64 bits of register. If you write 64 bits out to MOSI the first 16 bits will end up on the farthest out chip, the next 16 in the next closest, etc. Switching the CS pin from low to high is the trigger for the Max7219 to actually lock in and read the contents of those shift registers. The way the driver crate's code is structured, that's the purpose of the buffer field in the various Connector structs. So if you have say 4 chips, you need 4 x u16 storage, and each write cycle you write all 4 u16 values out, one to each daisy chained device. Technically the driver is less efficient than it could be, in that it takes advantage of the fact that writing 0 to a chip is a no-op, so in practice while it does write to every device each time, when you call write_raw it actually 0s the buffer for all but the selected chip.

If you think about a sequence of chips, let's say once again 4 of them labeled A to D. They would be connected like so:
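
The "zero everything except the selected chip" trick looks roughly like this as a standalone function. This is a sketch in the spirit of the u16 version of write_raw, with a made-up name, and it assumes the header goes in the high byte of each word.

```rust
/// Builds one write cycle for a daisy chain of `devices` chips: a no-op
/// word (0x0000) for every chip except the one at index `addr`.
/// Returns None if `addr` is out of range.
pub fn chain_frame(devices: usize, addr: usize, header: u8, data: u8) -> Option<Vec<u16>> {
    if addr >= devices {
        return None;
    }
    let mut frame = vec![0u16; devices];
    frame[addr] = u16::from_be_bytes([header, data]);
    Some(frame)
}
```

Keep in mind that the first word sent ends up in the chip farthest from the controller, so index 0 here is the far end of the chain, not the chip wired directly to MOSI.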

RP-Pico-MOSI----DIN-A-DOUT----DIN-B-DOUT----DIN-C-DOUT----DIN-D-DOUT
       -CS----------CS------------CS------------CS------------CS
       -CLK---------CLK-----------CLK-----------CLK-----------CLK

Then you write to all four chips like so:

  • Set CS low
  • Write u16 for D
  • Write u16 for C
  • Write u16 for B
  • Write u16 for A
  • Set CS high
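
If it helps, the steps above can be simulated on a desktop by modeling each chip as a 16-bit shift register. This is a simplified model I put together for illustration (it ignores the half-clock delay on DOUT mentioned in the datasheet quote, and the Chain type is made up).

```rust
/// Toy model of daisy-chained chips: regs[0] is the chip wired to MOSI
/// (chip A), the last entry is the farthest chip.
pub struct Chain {
    pub regs: Vec<u16>,
}

impl Chain {
    pub fn new(chips: usize) -> Self {
        Chain { regs: vec![0; chips] }
    }

    /// One clock pulse: every register shifts left by one bit, and the bit
    /// falling out of each chip (DOUT) enters the next chip (DIN).
    pub fn clock_bit(&mut self, bit: bool) {
        let mut carry = bit;
        for reg in self.regs.iter_mut() {
            let out = *reg & 0x8000 != 0;
            *reg = (*reg << 1) | carry as u16;
            carry = out;
        }
    }

    /// Shifts whole words in, most significant bit first. The contents of
    /// `regs` afterwards are what each chip latches when CS goes high.
    pub fn write_words(&mut self, words: &[u16]) {
        for &w in words {
            for bit in (0..16).rev() {
                self.clock_bit((w >> bit) & 1 == 1);
            }
        }
    }
}
```

Writing the words for D, C, B, A in that order leaves A's word in the nearest register and D's in the farthest, which is exactly the ordering in the list above.
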
[–] [email protected] 1 points 1 year ago (1 children)

Thank you.

I've actually read the section you quoted a few times and my brain just couldn't parse it. But I finally understand how the max7219 does this. I had thought about it completely wrong. It's just shifting all the bits through from chip to chip, so obvious now.

I think I will go a step back and not use SPI for a while and just do the bit-banging thingy first to get more familiar with it.

I've read somewhere that hardware SPI is faster, and I guess it's cheaper for the CPU to use, as the CPU doesn't have to set the pin outputs high or low with each cycle. Instead (I guess) the CPU can simply call an SPI-out function one time and the SPI hardware does its thing for a while, while the CPU can do other things.

But right now I don't do much yet on the rest of the CPU, so i can afford to do it manually.

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Could it be that the max7219 crate is incomplete here? The write function you corrected seems like it was copied 1-to-1 from the C++ lib (LedControl).

[–] orclev 1 points 1 year ago

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Nope. The type parameter on the Write trait indicates the size of the "packet" (word) that's written on the SPI bus; it's the equivalent of the DS generic on the Spi struct. The way SPI works is, when you toggle CS low, the device is notified that it needs to start listening on MOSI, at which point you're free to start sending it packets. There's no requirement that you only send a single packet, you can send as many as you want, however many devices will have special rules about processing with respect to the state of the CS pin. E.g. just like with the Max7219, it's common for devices to buffer commands and not actually process them until CS is set high.

The only reason why the Write and the Spi generic are important is because it defines the minimum number of bits that will be written to the bus (or more concretely it's the stride size the SPI controller uses when reading and writing from its buffers). That's why using u8/8 as the parameter mostly works except for occasionally demonstrating strange behavior. Using u16 guarantees that it always writes a number of bits that's a multiple of 16, while using u8 can allow for essentially a half packet to be written.
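
One way to picture the half packet problem: a byte slice can have an odd length, while a slice of u16 words can't represent half a word at all. A small sketch with a made-up function name, assuming the header byte comes first in each pair:

```rust
/// Groups a byte stream into 16-bit packets, header byte first.
/// Returns None if the stream ends with half a packet.
pub fn whole_packets(bytes: &[u8]) -> Option<Vec<u16>> {
    if bytes.len() % 2 != 0 {
        return None; // a dangling byte would leave the chip mid-packet
    }
    Some(
        bytes
            .chunks_exact(2)
            .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
            .collect(),
    )
}
```

With a u16 word type that None case simply can't be expressed, which is the guarantee I'm describing.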

As for bit banging vs. SPI controller, it's essentially the same idea as DMA if you're familiar with that concept. Using bit banging the CPU is spending time toggling the various pins off and on, which although fast, is still relatively slow by communication standards and puts an upper limit on the speed data is transmitted on the SPI bus that's directly tied to the frequency of the CPU and the number of cycles it takes to toggle a pin (at minimum two toggles of CLK per bit, plus possibly one of MOSI). Using the SPI controller on the other hand, the CPU writes bytes into memory and then passes essentially a couple of pointers to the SPI controller then flips some bits in a register. The CPU does need to pause occasionally to refill the buffers, but that's a relatively fast operation and is mostly decoupled from the actual bus speed of SPI.
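
As a back-of-the-envelope version of that upper limit: the best case bit rate is the CPU clock divided by however many cycles your loop spends per bit. The function name and the numbers in the note below are purely illustrative, not measurements.

```rust
/// Upper bound on a bit-banged bit rate in Hz, given the CPU clock and the
/// number of CPU cycles spent per transmitted bit.
pub fn max_bitbang_hz(cpu_hz: u32, cycles_per_bit: u32) -> Option<u32> {
    if cycles_per_bit == 0 {
        return None;
    }
    Some(cpu_hz / cycles_per_bit)
}
```

For instance, at a 125 MHz core clock and an assumed 10 cycles per bit, the ceiling would be 12.5 Mbit/s, and real code going through a HAL will usually spend quite a few more cycles per bit than that.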

Manually implementing SPI with bit banging is probably a good learning exercise, but understanding how to properly use the SPI controller is also good to know. For an extra challenge you can usually also set up the SPI buffer to be managed using DMA for the most optimal way to handle things. I would suggest that configuring a u16 buffer sized based on the number of devices and then using DMA to write its contents out using the SPI buffer would be a very educational exercise.