Introduction to controlling long LED strands
This post wraps up one remaining issue from Part I (http://hypnocube.com/2013/12/design-and-implementation-of-serial-led-gadgets/), namely, figuring out how to control long strands of WS2812 LEDs, which fail above certain lengths due to timing skew.
The last post detailed our findings on how the internals of the WS2812 chip likely work, and resulted in detailed knowledge of how the signal timings must be shaped to get the LEDs to transition correctly. From this we implemented the ability to tweak timings in our code for the various components of the signal, and we experimented with long strands to see how far we could reliably get a signal to propagate.
Our testing rig was composed of eight strips of 1250 LEDs each, for a total of 10,000 LEDs, the maximum our controller will run. We built these into circular arc panels, which allow us to stack them in various configurations. Here is a YouTube video showing the resulting panels playing animations.
Due to the circular sides you can only see 6 of the 8 panels at once. Our testing involved using all eight panels in various configurations.
Setup and testing method
From last time, we created a controller, the Hypnocube LED Serial Driver (HypnoLSD, http://hypnocube.com/product/led_serial_driver/), that executes 60 instructions in 1250ns (a PIC32 clocked at 48MHz), which is the recommended timing length for one bit of data sent to the WS2812 modules. Figure 1 shows the timing diagram from last time.
Figure 1 - Timing Labels
From experiments last time, we decided that making the final low signal longer does not cause errors, allowing us to change the length from the usual 1/3 of the window to many multiples of this. Changing the required high and middle parts did not help us control errors. When we tried to run strands longer than about 5,000 LEDs, there was significant data corruption, and we could not get images on long designs. The issue was bit transitions stacking up too tightly because the WS2812 signal shaping causes skew, so that eventually a transition is either lost or pushed into a time window where it is ignored. So we wanted to introduce longer timing windows to see what happened.
Due to the density of our code, we initially tried changing only the delay at the end of each byte sent, but this resulted in no real error reduction. Next we implemented per-bit, user-selectable timing, and this gave us the ability to handle long strands.
Initial testing showed some variation depending on what we tried to draw, so we picked simple colors with various bit-transition patterns to help us understand what is happening. For example, trying to go from an all-white image to all red behaved differently than going from all black to all red. Likely the timing skew introduced by the modules is somewhat power dependent. So our initial experiments went from an all-black image to a solid color.
In the following, the “bit-delay” is one-half of the number of instructions by which we lengthen the low part of the timing diagram (the ‘w’ in Figure 1). Due to the existing code density needed to handle the throughput, we could only squeeze in a few instructions per bit, resulting in the timing loop illustrated in Figure 2, where the “bne” instruction loops back to itself while the counter is decremented in the following branch-delay slot (this is an artifact of MIPS assembly; it looks odd at first to the uninitiated, but is correct).
Figure 2 - PIC assembly timing loop
As a result, since each instruction adds 1250/60 ns to the time, each increase in bit-delay adds 1250/30 ns (41 and 2/3 ns) to the initial 416 + 2/3 ns value of ‘w’. Each experiment is repeated 10 times for various bit-delays, and the number of panels out of 8 that show correct results is recorded. Thus a score of 8 means all 10,000 LEDs were correctly set. For each experiment, the LEDs are set to black (color 0, 0, 0 – all off), and then an image is sent down them.
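The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers in this post, not code from the controller; the function and constant names are made up for the example, and it assumes exactly two extra instructions per unit of bit-delay, as described above.

```python
from fractions import Fraction

BASE_BIT_NS = Fraction(1250)             # recommended bit period
INSTR_PER_BIT = 60                       # instructions executed per bit period
INSTR_NS = BASE_BIT_NS / INSTR_PER_BIT   # 20 5/6 ns per instruction
BASE_LOW_NS = BASE_BIT_NS / 3            # default 'w': one third of the window


def low_time_ns(bit_delay: int) -> Fraction:
    """Length of the final low part 'w' for a given bit-delay."""
    return BASE_LOW_NS + 2 * bit_delay * INSTR_NS


def bit_period_ns(bit_delay: int) -> Fraction:
    """Total time for one bit; only the low part is stretched."""
    return BASE_BIT_NS + 2 * bit_delay * INSTR_NS


if __name__ == "__main__":
    for d in (0, 4, 6):
        print(d, float(low_time_ns(d)), float(bit_period_ns(d)))
```

For example, a bit-delay of 6 stretches the low part from 416 2/3 ns to 666 2/3 ns and the whole bit from 1250ns to 1500ns.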
Note longer bit-delays result in longer transmission times, lowering effective frame rate. Since the initial window is 60 instructions for a bit, and each increase in bit-delay adds two instructions, each increase in bit-delay increases time per bit by one part in 30.
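To get a feel for the frame-rate cost, here is a rough sketch that estimates the time to clock one frame down a single strand. It assumes 24 bits per LED (8 bits each for G, R, and B) and ignores the latch pause at the end of a frame and any limit on how fast image data reaches the controller, so treat the results as upper bounds. The function names are hypothetical.

```python
BITS_PER_LED = 24  # 8 bits each for G, R, B


def bit_period_ns(bit_delay: int) -> float:
    """Bit period: 1250 ns, plus one part in 30 per unit of bit-delay."""
    return 1250.0 * (1 + bit_delay / 30)


def frame_time_ms(leds_in_strand: int, bit_delay: int) -> float:
    """Time to send one full frame down one strand, in milliseconds."""
    total_bits = leds_in_strand * BITS_PER_LED
    return total_bits * bit_period_ns(bit_delay) / 1e6


def max_fps(leds_in_strand: int, bit_delay: int) -> float:
    """Upper bound on frames per second from output timing alone."""
    return 1000.0 / frame_time_ms(leds_in_strand, bit_delay)


if __name__ == "__main__":
    for delay in (0, 6):
        print(delay, frame_time_ms(10_000, delay), max_fps(10_000, delay))
```

Under these assumptions a single 10,000 LED strand takes about 300 ms per frame at bit-delay 0 and about 360 ms at bit-delay 6.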
Experiments
Experiment 1
We set up all eight panels in serial as one long 10,000 LED strand, used a 3Mbps serial connection to our gadget (which may be immaterial), and sent the GRB byte colors 255, 0, 170 (to get many bit transitions) down the strand.
Here are the results:
Bit-delay | # panels perfect out of 8: run 1 | run 2 | run 3 | run 4 | run 5 | run 6 | run 7 | run 8 | run 9 | run 10 |
0 | 5 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
4 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
6 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
So in this experiment, a bit delay of 6 allowed us to transition all 10,000 LEDs from black to the selected color, while the default timing allowed only about half that many to be changed.
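To see why 255, 0, 170 was picked for its bit transitions, the following sketch expands a GRB color into the 24 bits sent for one LED (most significant bit first, as the WS2812 expects) and counts how often adjacent bits differ. Counting adjacent-bit changes is our reading of "bit transitions" here, and the helper names are made up for the example.

```python
def grb_bits(g: int, r: int, b: int) -> str:
    """24-bit stream for one LED: G, R, B bytes, each MSB first."""
    return "".join(f"{byte:08b}" for byte in (g, r, b))


def count_value_changes(bits: str) -> int:
    """Number of places where a bit differs from the bit before it."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev != cur)


if __name__ == "__main__":
    stream = grb_bits(255, 0, 170)
    print(stream)                       # 111111110000000010101010
    print(count_value_changes(stream))  # 9
```

This only looks inside one LED's worth of data; on a real strand the same 24 bits repeat, so the boundary between one LED's last bit and the next LED's first bit can add one more change.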
Experiment 2
Next we sent the GRB color 128, 0, 1, for another ten experiments on the 10,000 LED strand, resulting in
Bit-delay | # panels perfect out of 8: run 1 | run 2 | run 3 | run 4 | run 5 | run 6 | run 7 | run 8 | run 9 | run 10 |
0 | 3 | 4 | 4 | 3 | 3 | 3 | 3 | 4 | 3 | 4 |
1 | 5 | 4 | 4 | 5 | 4 | 4 | 5 | 4 | 5 | 4 |
2 | 5 | 5 | 5 | 6 | 5 | 5 | 5 | 6 | 5 | 5 |
3 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
4 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
5 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
6 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
This time a bit-delay of 4 was sufficient, even though the same setting failed in experiment #1 (7 of 8 panels). A bit-delay of 6 still worked. Note this transition drew a lot less current than the last one.
Experiment 3
Repeating with the GRB color 1, 2, 3 on the 10,000 LED strand resulted in
Bit-delay | # panels perfect out of 8: run 1 | run 2 | run 3 | run 4 | run 5 | run 6 | run 7 | run 8 | run 9 | run 10 |
3 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 5 | 5 | 5 |
4 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
Again, a bit-delay of 4 was enough, but the bit-delay 3 showed slightly better behavior than the previous tests. Note again the lower color numbers draw significantly less current.
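As a rough way to compare how much current the three test colors draw, the sketch below computes a relative load for each. It assumes LED current scales linearly with the sum of the three channel values and ignores the idle current of the chips themselves; we did not measure this, so take it as a back-of-the-envelope comparison only. The names are hypothetical.

```python
def relative_load(g: int, r: int, b: int) -> float:
    """Fraction of full-white load, assuming current ~ sum of channels."""
    return (g + r + b) / (3 * 255)


# GRB colors used in experiments 1 to 3.
TEST_COLORS = {
    "Experiment 1": (255, 0, 170),
    "Experiment 2": (128, 0, 1),
    "Experiment 3": (1, 2, 3),
}

if __name__ == "__main__":
    for name, color in TEST_COLORS.items():
        print(f"{name}: {relative_load(*color):.3f} of full white")
```

Under this assumption the experiment 1 color sits at a bit over half of full white, while the colors in experiments 2 and 3 are far lower, which matches the ordering of the current draw we noted.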
Experiment 4
Out of curiosity, we wanted to see how large we could make the bit delay, and obtained, for color 128, 0, 1, on the 10,000 LED strand,
Bit-delay | # panels perfect out of 8: run 1 | run 2 | run 3 | run 4 | run 5 | run 6 | run 7 | run 8 | run 9 | run 10 | Notes |
16 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | |
32 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | |
64 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | |
96 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | |
98 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | |
99 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | |
100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
104 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
112 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
128 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (1) |
136 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (2) |
144 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (3) |
160 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (4) |
192 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (4) |
256 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | (4) |
Notes:
(1) Many pixels change, very noisy.
(2) About 600 or so pixels change
(3) About 1 or 2 pixels flicker, perhaps hitting latches?
(4) no pixels change
We found a bit-delay of 100 or more completely dropped the signal, a cutoff that was also observed in other (unrecorded) testing. This results in a bit period of roughly 5000ns (by the arithmetic above, 1250ns plus 100 × 41 2/3 ns is about 5417ns), which, oddly enough, is not near the approximate 9400ns latching requirement we deduced last time. However, it is roughly four times the recommended bit length.
We initially thought that no larger bit-delay would reverse this behavior, but upon trying some larger values we got noise on the panels. The noisy LEDs seem to be dropping an entire byte at a time, leading us to speculate there are 3 independent counters in the chip, with some weirdness triggered between them. We also tested holding the last signal high for a long time, instead of the low required to latch colors, and this led to more interesting noise. But we did not pursue it much due to the apparent complexity of the outcomes.
Experiment 5
We then performed testing over the animations shown in the video linked above. There are about 30 or so animations we wrote in our custom software, with no special regard to overall brightness or pixel coverage. Since the previous experiments suggested a bit-delay around 6-10 resulted in stable images, we tested it. The results were a bit odd.
Testing with the panels arranged as a 10,000 LED strand resulted in most demos working, with the one we call “Fire” by far the most noisy. Under visual inspection it did not appear any more or less complex than the ones we call “Plasma”, but mathematically it would have somewhat more “randomness” between adjacent pixels. We found
bit delay | Notes |
8 | fire flakey on 8th panel |
16 | fire flakey on 8th panel, better than bit-delay 8 |
32 | fire flakey on 8th panel, better than bit-delay 16 |
64 | fire flakey on 8th panel, almost fine, boundary seems to be noisy |
99 | fire fine |
We suspect there are soldering issues between panels that are adding to the noise, which is why it took such a high bit-delay to get the “fire” animation to work, when all the other ones ran fine.
Experiment 6
Rewiring the modules into two 5,000-LED strands led to the following:
At bit-delay 2, all demos except “Fire,” “Plasma,” and “Ray tracer” worked fine. At bit-delay 4, “Ray tracer” was mostly fixed. At bit-delay 20, all were better, but still a little flakey. We suspected some bad wiring was making noise. At bit-delay 4, with fire drawing upside down, it worked fine, leading us to suspect wiring or a flakey LED.
We did find flakey LEDs in the middle of a strand several times, where the signal suddenly went bad. Replacing them fixed several issues.
Experiment 7
Running the GRB color 128, 0, 1 similar to experiment #2 except on the two by 5,000 LED arrangement led to the following:
Bit-delay | # panels perfect out of 8: run 1 | run 2 | run 3 | run 4 | run 5 | run 6 | run 7 | run 8 | run 9 | run 10 |
0 | 6 | 7 | 6 | 7 | 7 | 7 | 7 | 7 | 6 | 6 |
1 | 8 | 7 | 8 | 7 | 8 | 8 | 7 | 8 | 8 | 8 |
2 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
Experiment #2 required a bit-delay of 4 to get 10,000 LEDs working correctly; this experiment required a bit-delay of 2 to get the 2 x 5000 configuration working, with a bit-delay of 1 almost sufficient.
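The comparison between experiments #2 and #7 can be reproduced from the tables with a small helper that finds the lowest tested bit-delay at which every run lit all 8 panels correctly. The data below is copied from the two tables; the function name is made up for the example.

```python
# Panels correct (out of 8) per run, keyed by bit-delay.
EXPERIMENT_2 = {  # one 10,000 LED strand, GRB 128, 0, 1
    0: [3, 4, 4, 3, 3, 3, 3, 4, 3, 4],
    1: [5, 4, 4, 5, 4, 4, 5, 4, 5, 4],
    2: [5, 5, 5, 6, 5, 5, 5, 6, 5, 5],
    3: [5] * 10,
    4: [8] * 10,
    5: [8] * 10,
    6: [8] * 10,
}
EXPERIMENT_7 = {  # two 5,000 LED strands, GRB 128, 0, 1
    0: [6, 7, 6, 7, 7, 7, 7, 7, 6, 6],
    1: [8, 7, 8, 7, 8, 8, 7, 8, 8, 8],
    2: [8] * 10,
}


def smallest_reliable_delay(results, panels=8):
    """Lowest tested bit-delay where every run had all panels correct."""
    passing = [d for d, runs in results.items()
               if all(score == panels for score in runs)]
    return min(passing) if passing else None


if __name__ == "__main__":
    print(smallest_reliable_delay(EXPERIMENT_2))  # 4
    print(smallest_reliable_delay(EXPERIMENT_7))  # 2
```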
Experiment 8
Testing the panels as four by 2,500 strands, with a bit-delay of 0 (the default, spec recommended value), resulted in no errors on our demo animations.
Frames per second maxed out around 8.5 fps, which was limited by the 3 Mbps serial input. We tested at 12 Mbps and it worked fine.
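A quick estimate shows why the serial input is the bottleneck here. The sketch assumes 3 bytes per LED, 10 bits on the wire per byte (8 data bits plus start and stop bits), and no command or protocol overhead; those framing details are assumptions for the example, not a description of our serial protocol.

```python
def serial_fps_ceiling(total_leds: int, baud: int,
                       bytes_per_led: int = 3,
                       wire_bits_per_byte: int = 10) -> float:
    """Upper bound on frames per second imposed by the serial link."""
    bits_per_frame = total_leds * bytes_per_led * wire_bits_per_byte
    return baud / bits_per_frame


if __name__ == "__main__":
    print(serial_fps_ceiling(10_000, 3_000_000))   # 10.0
    print(serial_fps_ceiling(10_000, 12_000_000))  # 40.0
```

With these assumptions, 3 Mbps caps a 10,000 LED image at 10 frames per second, which is in line with the roughly 8.5 fps we saw once overhead is included.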
Experiment 9
Running the panels as eight separate 1,250-LED strands worked flawlessly with bit-delay 0 at both 3Mbps and 12Mbps inputs.
Conclusion
From our experiments, it seems the default WS2812 protocol can easily handle 2500 length strands, and with careful wiring probably even 5,000 or so. We were not terribly careful in our designs, but we would be surprised if 10,000 LEDs could be controlled reliably at the specification recommended timings. If you find out otherwise, please let us know.
We noted there seemed to be increased noise at solder junctions (as expected), or other non-homogeneous parts of a design. For example, if you have a long strand, then some wiring to get to your next strand, there is the possibility of this in-between wiring introducing noise.
As a result of these experiments, we added the ability for end-users to select the bit-delay in our modules in order to see what works for their designs. We found that adding around 200-400ns to the low portion of the per-bit signal adds significant reliability for long strands.
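To turn a desired amount of extra low time into a bit-delay setting, one can divide by the 41 2/3 ns step and round up. This is a minimal sketch using the timing numbers from this post; the function name is hypothetical, and it treats the 200-400ns figure as time added on top of the default low period.

```python
import math


def bit_delay_for_extra_low_ns(extra_ns: int) -> int:
    """Smallest bit-delay adding at least extra_ns to the low part.

    Each unit of bit-delay adds 1250/30 ns, so the division is written
    as extra_ns * 30 / 1250 to avoid rounding the step size first.
    """
    return math.ceil(extra_ns * 30 / 1250)


if __name__ == "__main__":
    print(bit_delay_for_extra_low_ns(200))  # 5
    print(bit_delay_for_extra_low_ns(400))  # 10
```

So the 200-400ns range corresponds to bit-delay settings of about 5 to 10.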
Since we did this work we have built several LED gadgets and displays based on our modules, and have sent samples to several others. So far they seem to work just fine. Let us know if you find otherwise.
Happy hacking!