<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.8.7">Jekyll</generator><link href="https://www.zzzconsulting.se/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.zzzconsulting.se/" rel="alternate" type="text/html" /><updated>2022-01-30T17:33:10+01:00</updated><id>https://www.zzzconsulting.se/feed.xml</id><title type="html">ZZZ-Consulting</title><subtitle>Me writing blog posts about various topics I find interesting. Turns out these are mostly FPGA and retro-computing related.</subtitle><entry><title type="html">Making some PCBs - part 2 (HyperRAM)</title><link href="https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-2.html" rel="alternate" type="text/html" title="Making some PCBs - part 2 (HyperRAM)" /><published>2022-01-05T00:00:00+01:00</published><updated>2022-01-05T00:00:00+01:00</updated><id>https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-2</id><content type="html" xml:base="https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-2.html">&lt;h1 id=&quot;stage-1---hyperram&quot;&gt;Stage 1 - HyperRAM&lt;/h1&gt;
&lt;p&gt;2-layer board.&lt;/p&gt;

&lt;p&gt;First out is a HyperRAM expansion board for the ULX3S. It makes sense to see
early if I will be able to pull off BGA assembly with the equipment I have.
Also I happen to have 10 of these
&lt;a href=&quot;https://www.mouser.se/ProductDetail/cypress-semiconductor/s27kl0641dabhi020/&quot;&gt;S27KL0641DABHI020&lt;/a&gt;
lying around.&lt;/p&gt;

&lt;p&gt;5x5 BGA means that the two outermost rings can be routed in the same layer
(only one ball remaining in the middle but that can also escape in the same
layer as one corner is missing a ball).&lt;/p&gt;

&lt;p&gt;I finally had a go at this and put together a quick and dirty PCB in KiCad and
had &lt;a href=&quot;https://aisler.net/&quot;&gt;https://aisler.net/&lt;/a&gt; manufacture it for me. By quick and dirty I mean that
the layout is ugly and the traces are not length matched so I would not expect
to get any performance out of it. The board should be functionally correct
though. I did not bother to put the KiCad files on GitHub.&lt;/p&gt;

&lt;p&gt;Two different soldering methods were evaluated and I tried capturing the chip
profile with a cheap cell-phone camera through a binocular microscope.&lt;/p&gt;

&lt;p&gt;On the left only tacky flux was applied and then using hot air from above
(station set to 300C at 50l/min) the chip was soldered in place. After a while
the chip noticeably sank into place and I tried to nudge it several times with
a pair of tweezers after which it immediately bounced back. This was a quick
method but the end result is that the chip sits flush with the board and visual
inspection is not really possible.&lt;/p&gt;

&lt;p&gt;On the right no flux was used but instead solder paste was applied through a
stencil. The board was placed on a hot-plate that slowly rose to its set
temperature of 220C, after which hot air was applied from above (with the station set to
300C at 40l/min). It was not as noticeable when the chip sank into place and I
did not dare to nudge it (as it was still on the hot-plate). The end result
looks much nicer though with the solder balls clearly visible and one can
clearly see that there is no bridging.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/hyperram-bga-pcb.jpg&quot; alt=&quot;HyperRAM BGA&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Eventually I will try to verify the boards with one of the controllers from the
list below and report the results.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/blackmesalabs/hyperram&quot;&gt;https://github.com/blackmesalabs/hyperram&lt;/a&gt; - slow but portable controller&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/gregdavill/litex-hyperram&quot;&gt;https://github.com/gregdavill/litex-hyperram&lt;/a&gt; - full performance controller for ECP5 in Migen&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://1bitsquared.com/products/pmod-hyperram&quot;&gt;https://1bitsquared.com/products/pmod-hyperram&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=dThZDl-QG5s&quot;&gt;https://www.youtube.com/watch?v=dThZDl-QG5s&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did a second trial run using the stencil after having acquired the excellent
PCB fixture jig from &lt;a href=&quot;https://www.oshstencils.com/&quot;&gt;https://www.oshstencils.com/&lt;/a&gt;. To me having the
board properly held in place by the jig helped a lot when applying the paste. Of
course one can get a similar effect using a few scrap PCBs but I don’t really
have many of those lying around that are the right size and besides these
brackets just cost a few dollars.&lt;/p&gt;

&lt;p&gt;As far as I can tell the stencil alignment and paste coverage turned out well
and the end result looks good. So with this I feel reasonably confident that I
will also be able to handle the larger FPGA BGA with its 0.8mm pitch (the
HyperRAM BGA has a 1.0mm pitch). Again I still have not gotten around to
actually verifying that these modules work so maybe I should not say too much.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/hyperram-bga-pcb-2.jpg&quot; alt=&quot;HyperRAM BGA second attempt&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;verification&quot;&gt;Verification&lt;/h2&gt;

&lt;p&gt;Now a few months later I felt sufficiently energized to have a go at testing
the HyperRAM boards.  I set up &lt;a href=&quot;https://github.com/markus-zzz/hyperram-test&quot;&gt;a test
bench&lt;/a&gt; using &lt;a href=&quot;https://github.com/blackmesalabs/hyperram&quot;&gt;the portable
HyperRAM controller from
blackmesalabs&lt;/a&gt; and the
&lt;a href=&quot;https://github.com/YosysHQ/picorv32&quot;&gt;PicoRV32&lt;/a&gt; RISC-V CPU core.&lt;/p&gt;

&lt;p&gt;The DRAM is mapped into the CPU’s address space presenting its 8MB as 2M 32-bit
dwords.&lt;/p&gt;

&lt;p&gt;The test consists of having the CPU write the Fibonacci sequence, as 32-bit
integers, to memory and then verifying that it reads back correctly. Test
result is presented by lighting the appropriate LEDs.&lt;/p&gt;
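
&lt;p&gt;As a rough illustration, the write-then-verify idea can be modelled in a few
lines of Python. This is only a sketch of the principle and not the actual
firmware (which runs on the PicoRV32). The names &lt;code class=&quot;highlighter-rouge&quot;&gt;fib_words&lt;/code&gt;,
&lt;code class=&quot;highlighter-rouge&quot;&gt;run_test&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;StuckBitMemory&lt;/code&gt; are made up for this example
and a plain Python list stands in for the memory-mapped HyperRAM.&lt;/p&gt;

```python
# Software model of the write-then-verify test. The real test runs as
# firmware on the PicoRV32 against the memory-mapped HyperRAM.
MASK32 = 0xFFFFFFFF  # values wrap at 32 bits, like a uint32_t in C


def fib_words(count):
    """Yield `count` Fibonacci numbers truncated to 32 bits."""
    a, b = 0, 1
    for _ in range(count):
        yield a
        a, b = b, (a + b) % (MASK32 + 1)


def run_test(mem, count):
    """Fill mem[0:count] with the sequence, then read back and compare."""
    for i, value in enumerate(fib_words(count)):
        mem[i] = value
    for i, value in enumerate(fib_words(count)):
        if mem[i] != value:
            return False  # firmware would light the "fail" LED here
    return True  # firmware would light the "pass" LED here


class StuckBitMemory(list):
    """Hypothetical faulty memory: data bit 0 is stuck at zero."""

    def __setitem__(self, index, value):
        super().__setitem__(index, value - value % 2)
```

&lt;p&gt;Calling &lt;code class=&quot;highlighter-rouge&quot;&gt;run_test([0] * 1024, 1024)&lt;/code&gt; passes, while the same call on a
&lt;code class=&quot;highlighter-rouge&quot;&gt;StuckBitMemory&lt;/code&gt; fails on the first odd value. That is the kind of fault
(an open or bridged data line) a test like this is meant to catch.&lt;/p&gt;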

&lt;p&gt;All three boards have been fully assembled and the test passes on all three.&lt;/p&gt;

&lt;p&gt;It should be noted that this is a basic test running the memory at a low
frequency (6.25MHz) so it does not stress test the board in any way. It does
however provide some confidence that the part survived the assembly process,
that the solder joints are reasonably sound and that there is no bridging.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/hyperram-test-pcb.jpg&quot; alt=&quot;Testing the HyperRAM board&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;future&quot;&gt;Future&lt;/h2&gt;

&lt;p&gt;It could be interesting to improve the controller, setting a shorter latency,
or simply switch to using a different one.&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><summary type="html">Stage 1 - HyperRAM 2-layer board.</summary></entry><entry><title type="html">[DRAFT] Making some PCBs - part 3 (MAX9850)</title><link href="https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-3.html" rel="alternate" type="text/html" title="[DRAFT] Making some PCBs - part 3 (MAX9850)" /><published>2022-01-05T00:00:00+01:00</published><updated>2022-01-05T00:00:00+01:00</updated><id>https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-3</id><content type="html" xml:base="https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-3.html">&lt;h1 id=&quot;stage-2---max9850&quot;&gt;Stage 2 - MAX9850&lt;/h1&gt;
&lt;p&gt;2-layer board.&lt;/p&gt;

&lt;p&gt;I also got started designing an audio extension board featuring the
&lt;a href=&quot;https://www.maximintegrated.com/en/products/analog/audio/MAX9850.html&quot;&gt;MAX9850&lt;/a&gt;
stereo audio DAC with a built-in headphone amplifier.&lt;/p&gt;

&lt;p&gt;Now the MAX9850 part comes in TQFN-28 which is an absolutely tiny package with
0.5mm pitch (the KiCad footprint is “TQFN-28-1EP_5x5mm_P0.5mm_EP3.25x3.25mm”).
See comparison below to get an idea how tiny this really is.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/max9850-size-comp.jpg&quot; alt=&quot;&quot; /&gt;
&lt;em&gt;MAX9850 next to a standard 3.5mm audio plug.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;pcb-design&quot;&gt;PCB design&lt;/h2&gt;

&lt;p&gt;Using the best of my currently rather amateurish PCB design skills I ended up
with the following layout with &lt;a href=&quot;https://github.com/markus-zzz/mydacboard&quot;&gt;KiCad project files on
GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/dac-layout.png&quot; alt=&quot;PCB layout&quot; /&gt;&lt;/p&gt;

&lt;p&gt;While it initially looked pretty good to me I realize that it has a number of
layout flaws that I will need to remedy before I attempt the combined ECP5
board. To this end I have signed up for the &lt;a href=&quot;https://www.phils-lab.net/courses&quot;&gt;Mixed-Signal Hardware Design with
KiCad&lt;/a&gt; course from the well-known YouTube
creator &lt;a href=&quot;https://www.phils-lab.net/&quot;&gt;Phil’s Lab&lt;/a&gt;. Having browsed through the 5
hour video material that focuses heavily on PCB layout I can only say that this
appears to be a goldmine. So after finishing this board I will probably take a
detour to study the course material and follow along with its board project.&lt;/p&gt;

&lt;h2 id=&quot;assembly&quot;&gt;Assembly&lt;/h2&gt;

&lt;p&gt;Again soldering the top side using stencil and paste. The board is placed on a
hot-plate that slowly rises to 220C after which hot air is applied from above
for about 20-30s with station set to 300C at 30l/min.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/max9850-full.jpg&quot; alt=&quot;&quot; /&gt;
&lt;em&gt;Stencil alignment, stencil with paste, paste alignment, component placement
and post solder result.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;bring-up&quot;&gt;Bring up&lt;/h2&gt;

&lt;p&gt;After skimming through the MAX9850 datasheet it does seem that setting up a
working audio flow will require writing a fair number of I2C configuration
registers. Ideally you would want some simple kind of test that can be used
immediately after assembly to verify that the part has not been totally
destroyed in the process. Needless to say you want this test to be dead simple
so that you are not led to believe that the part is broken after assembly just
because you made a mistake in the test code.&lt;/p&gt;

&lt;p&gt;The MAX9850 does have a (single) GPIO pin that is controllable by an I2C
register. The pin has also been routed back to the FPGA so the dead simple
first test we are looking for could be to write the appropriate I2C control
register to toggle this GPIO and then have the FPGA forward the signal to a
LED.&lt;/p&gt;

&lt;p&gt;Using &lt;a href=&quot;https://github.com/markus-zzz/max9850-test/tree/fafe550eb5bd27762d1c5857b90f52380dd440eb&quot;&gt;the simple test
bench&lt;/a&gt;
I have been able to verify that both of the boards assembled so far accept I2C
access and can toggle their GPIO pins.&lt;/p&gt;

&lt;p&gt;Next up is to configure all required control registers and start streaming
digital audio. Using &lt;a href=&quot;https://github.com/markus-zzz/max9850-test/tree/d2ac9c3ed896a54b3c6f4d6b7b6cd1a0d96cc9c4&quot;&gt;the slightly more advanced test
bench&lt;/a&gt;
the device is configured to use a sample rate of 48.8kHz and sawtooth waveforms
at two different frequencies are generated in the FPGA (one for each channel).
Pressing a button swaps the sawtooth frequencies. Both assembled boards pass
this test.&lt;/p&gt;
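
&lt;p&gt;One way to picture the sawtooth generation is as a phase accumulator that adds
a fixed step every sample and is allowed to wrap. The Python model below is only
a sketch under assumptions and not taken from the test bench: the 16-bit sample
width, the 440Hz and 880Hz frequencies and the function names are all made up
for illustration. Only the 48.8kHz sample rate comes from the text above.&lt;/p&gt;

```python
SAMPLE_RATE = 48_800   # Hz, from the text
WIDTH = 16             # assumed sample width in bits


def sawtooth(freq_hz, n_samples):
    """Phase-accumulator sawtooth: add a fixed step and let it wrap."""
    step = round(freq_hz * 2**WIDTH / SAMPLE_RATE)
    phase = 0
    out = []
    for _ in range(n_samples):
        out.append(phase - 2**(WIDTH - 1))  # centre around zero (signed)
        phase = (phase + step) % 2**WIDTH
    return out


def channels(swap, n_samples, f_a=440, f_b=880):
    """Left and right sample lists; `swap` models the button."""
    left, right = (f_b, f_a) if swap else (f_a, f_b)
    return sawtooth(left, n_samples), sawtooth(right, n_samples)
```

&lt;p&gt;In hardware the same structure is just one adder and one register per channel,
with the wrap-around coming for free from the fixed register width.&lt;/p&gt;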

&lt;p&gt;And finally just for fun, here is a quick integration with the MyC64 project
playing the classic Bubble Bobble tune. Using branches
&lt;a href=&quot;https://github.com/markus-zzz/myc64/tree/max9850&quot;&gt;myc64/max9850&lt;/a&gt; and
&lt;a href=&quot;https://github.com/markus-zzz/usbdev/tree/max9850&quot;&gt;usbdev/max9850&lt;/a&gt;.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./myc64-keyb ../../../c64-psid/MUSICIANS/C/Clarke_Peter/Bubble_Bobble.prg
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/SnuTHjIvyxg&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write;
encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;This does not sound quite right though. The frequency (of the MyC64 design) is
expected to be off but I suspect there is more to it than that. I need to look
closer at some point but for now this is good enough.&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><summary type="html">Stage 2 - MAX9850 2-layer board.</summary></entry><entry><title type="html">[DRAFT] Making some PCBs - part 4 (ECP5)</title><link href="https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-4.html" rel="alternate" type="text/html" title="[DRAFT] Making some PCBs - part 4 (ECP5)" /><published>2022-01-05T00:00:00+01:00</published><updated>2022-01-05T00:00:00+01:00</updated><id>https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-4</id><content type="html" xml:base="https://www.zzzconsulting.se/2022/01/05/making-pcbs-part-4.html">&lt;h1 id=&quot;stage-3---ecp5&quot;&gt;Stage 3 - ECP5&lt;/h1&gt;
&lt;p&gt;4-layer board (signal, power, gnd, signal).&lt;/p&gt;

&lt;p&gt;Study these designs:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/mattvenn/basic-ecp5-pcb&quot;&gt;https://github.com/mattvenn/basic-ecp5-pcb&lt;/a&gt; - most basic ECP5 design, highly interesting.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/emard/ulx3s&quot;&gt;https://github.com/emard/ulx3s&lt;/a&gt; - ULX3S design&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The
&lt;a href=&quot;https://www.mouser.se/ProductDetail/Lattice/LFE5U-12F-6BG256C?qs=w%2Fv1CP2dgqpgiPWLwc1Bzg==&quot;&gt;LFE5U-12F-6BG256C&lt;/a&gt;
does indeed seem like the most suitable model and package.&lt;/p&gt;

&lt;p&gt;Although routing a 256 pin BGA does seem intimidating one has to keep in mind
that a large number of these balls will be &lt;em&gt;no-connect&lt;/em&gt;. Since this is a custom
board for a given application we only need to route what we use. This situation
is substantially different compared to doing a generic development board where
you really want to expose as many pins and interfaces as possible.&lt;/p&gt;

&lt;p&gt;Another simplifying matter is that this is an FPGA so it has very few fixed
function pins, and interfaces (except the fixed function ones like JTAG and
flash SPI) can generally be brought out in a way that will cause no trace
crossing on the PCB.&lt;/p&gt;

&lt;p&gt;So besides power and ground how many signals do we need to expose from the FPGA?&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Interface&lt;/th&gt;
      &lt;th&gt;Pins&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;HyperRAM&lt;/td&gt;
      &lt;td&gt;12&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;HDMI&lt;/td&gt;
      &lt;td&gt;10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;JTAG&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Flash&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;MAX9850&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;USB&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Okay so it is quite a few (41 in total) but still much less than the 197 I/Os that are
available in the given package.&lt;/p&gt;

&lt;h2 id=&quot;configuration&quot;&gt;Configuration&lt;/h2&gt;

&lt;p&gt;When first looking into how the device is to be configured while on the PCB the
options and tools involved seem quite overwhelming. However after doing
some googling and reading up a bit I think the following addresses most of my
worries (mostly by confirming that this process does indeed depend on well
established and standardized techniques and not so much proprietary magic).&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Use JTAG (TCK, TMS, TDI and TDO) as the main means of configuration, driving
it with either an external JTAG cable, an integrated FTDI chip or an onboard MCU. Timing is
not critical, bit-banging is fine, the TCK may stop etc.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;For non-volatile bitstream storage use SPI connected flash. ECP5 acts as SPI
master. The FPGA’s JTAG interface allows direct programming of the SPI connected
flash memory.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The Symbiflow tools support generating both .bit files and .svf files. The
former contains the configuration bitstream only while the latter contains JTAG
commands to setup programming and the entire contents of the bitstream. This
allows for programming with a JTAG tool that does not know anything about the
actual device (i.e. all instructions are in the .svf file).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There are several tools for programming the ECP5: ujprog, fleajtag, OpenOCD and
even one in Python &lt;a href=&quot;https://github.com/emard/esp32ecp5&quot;&gt;https://github.com/emard/esp32ecp5&lt;/a&gt;. The last one does
bit-banging from an MCU while the first three rely on a USB connected FTDI chip
for JTAG. OpenOCD only supports programming with .svf files while ujprog and
fleajtag can do .bit directly. That is they contain code to write the
appropriate JTAG registers to initiate the programming. Actually IEEE 1532
defines a standardized JTAG based interface for programming that the ECP5
follows. So ujprog has code to interface the FTDI chip as well as code to issue
the relevant IEEE 1532 commands over the JTAG interface.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Probably don’t need to care much about the PROGN, INIT and DONE signals.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The interface between the ECP5 and FTDI need only be the 4 wire JTAG signals.
Often the FTDI also provides usb-serial port services but this is not strictly
needed. Actually the FTDI chips are really expensive (same cost as the ECP5
12F) so while it may be useful for the development version of the board it will
not go on any production variant.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In fact since FTDI appears to be a generally unlikable company it would be a
good thing to try to not use their products at all. So instead of putting an
overpriced FTDI chip on each board let us use an external JTAG cable for
programming and only put a JTAG pin-header on the PCB (with pull-down on TCK
but nothing else). Once the board has been “factory” programmed via JTAG
subsequent updates could be handled via the USB interface.  During FPGA power
up the configuration bitstream is read from the SPI flash but after that the
SPI interface pins are controllable by the FPGA fabric (according to
documentation). So in theory once our FPGA core is up and running we can
receive a new configuration bitstream via our USB interface and then write that
to the SPI flash so that it will be used during the following FPGA power up
sequence.&lt;/p&gt;

&lt;h2 id=&quot;pcb-manufacturing&quot;&gt;PCB manufacturing&lt;/h2&gt;

&lt;p&gt;The chosen ECP5 BGA package is 0.8mm pitch so in order to escape route the
inner rings (mostly power and gnd) rather small vias need to be placed (via
diameter needs to be less than 0.55mm). Unfortunately it turns out that making
such small vias exceeds the capabilities of Aisler so a different PCB service
needs to be used. These vias are just within the capabilities of Oshpark so
we have to turn to them instead.&lt;/p&gt;

&lt;p&gt;When it comes to PCB services the order of preference for me would be Aisler
(Europe), Oshpark (US), JLCPCB and PCBWAY (China).&lt;/p&gt;

&lt;h2 id=&quot;pcb-layout&quot;&gt;PCB layout&lt;/h2&gt;

&lt;p&gt;Thinking that we can mount the switch mode voltage regulators and support
circuitry on the back side of the board. Also if possible decoupling capacitors
for the BGA can go on the back side as well.&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><summary type="html">Stage 3 - ECP5 4-layer board (signal, power, gnd, signal).</summary></entry><entry><title type="html">[DRAFT] Making some PCBs - part 1</title><link href="https://www.zzzconsulting.se/2021/06/08/making-pcbs.html" rel="alternate" type="text/html" title="[DRAFT] Making some PCBs - part 1" /><published>2021-06-08T00:00:00+02:00</published><updated>2021-06-08T00:00:00+02:00</updated><id>https://www.zzzconsulting.se/2021/06/08/making-pcbs</id><content type="html" xml:base="https://www.zzzconsulting.se/2021/06/08/making-pcbs.html">&lt;p&gt;&lt;strong&gt;NOTE: This post is not finished yet but instead of keeping it as a hidden
draft I have decided to publish it anyway - mostly to serve as a notebook
to myself while I work on the project.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;For a while now I have had a desire to make a custom PCB for the MyC64 project,
a board with suitable display interface and perhaps some tiny keyboard.
Something that could be used in a handheld. A handheld FPGA based C64 emulator
now that would be pretty cool.&lt;/p&gt;

&lt;p&gt;Back in 2019 I did play a bit with &lt;a href=&quot;https://www.kicad.org/&quot;&gt;KiCad&lt;/a&gt; making a
very simple (through hole) PCB for &lt;a href=&quot;https://github.com/markus-zzz/myvgapmod&quot;&gt;PMOD like VGA
interface&lt;/a&gt; but in terms of complexity that is
light years away from what I need to do now.&lt;/p&gt;

&lt;p&gt;A board for the MyC64 will need to include the ECP5 FPGA with its, at least,
256 ball BGA packaging as well as HyperRAM 24 ball BGA and multiple other fine
pitch surface mount components.&lt;/p&gt;

&lt;p&gt;While I certainly need to learn a lot about board design and KiCad that bit
does not scare me half as much as doing the actual assembly for those parts.&lt;/p&gt;

&lt;p&gt;As a side note I spent some time during summer vacation to build a workbench for
my lab corner where I can put all my equipment. This is how it turned out.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/pcb/workbench.jpg&quot; alt=&quot;Workbench&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;the-plan&quot;&gt;The Plan&lt;/h2&gt;

&lt;p&gt;So to get started I thought I could begin with making a few rather simple PMOD
boards for some of the components that I intend to use on the final board. This
time actually following &lt;a href=&quot;https://www.digilentinc.com/Pmods/Digilent-Pmod_%20Interface_Specification.pdf&quot;&gt;the Pmod
specification&lt;/a&gt;.
This way I get a chance to improve my limited KiCad and assembly skills along
the way as well as trying out interfacing the components with the FPGA on the
trusted ULX3S board. That seems like a sound plan.&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><summary type="html">NOTE: This post is not finished yet but instead of keeping it as a hidden draft I have decided to publish it anyway - mostly to serve as a notebook to myself while I work on the project.</summary></entry><entry><title type="html">[DRAFT] A most basic GPU - part 1</title><link href="https://www.zzzconsulting.se/2021/03/26/basic-gpu-part-1.html" rel="alternate" type="text/html" title="[DRAFT] A most basic GPU - part 1" /><published>2021-03-26T00:00:00+01:00</published><updated>2021-03-26T00:00:00+01:00</updated><id>https://www.zzzconsulting.se/2021/03/26/basic-gpu-part-1</id><content type="html" xml:base="https://www.zzzconsulting.se/2021/03/26/basic-gpu-part-1.html">&lt;p&gt;&lt;strong&gt;NOTE: This post is not finished yet but instead of keeping it as a hidden
draft I have decided to publish it anyway. Right now I am not working on this
project but if/when I return I will try to update the post with the progress.&lt;/strong&gt;&lt;/p&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Putting the MyC64 project on hold for a while.&lt;/p&gt;

&lt;p&gt;Recently I have seen updates on the new PS1 FPGA core by
&lt;a href=&quot;https://twitter.com/laxer3a&quot;&gt;@Laxer3A&lt;/a&gt;, which I of course can’t help but find
interesting. I have never owned a PS1 and honestly never cared much about it
until now. It turns out that graphics wise it had some rather interesting
design choices. More precisely the graphics pipeline is based around fixed
point integer representation so there are no floating point numbers involved at
all. Furthermore there is no z-buffer. In fact Modern Vintage Gamer has put
together a short video describing these limitations that is well worth a watch
&lt;a href=&quot;https://www.youtube.com/watch?v=x8TO-nrUtSI&quot;&gt;Why PlayStation 1 Graphics Warped and Wobbled so much |
MVG&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So apparently it was possible to reach commercial success with a system like
this some 25 years ago. Nowadays probably not so much but it is still
interesting to see that one can get away with these limitations and still get
reasonable results.&lt;/p&gt;

&lt;p&gt;For me this suggests a very basic GPU may actually be within reach as a fun
FPGA project.&lt;/p&gt;

&lt;p&gt;The rasterization algorithm is very well described in &lt;a href=&quot;https://www.scratchapixel.com/lessons/3d-basic-rendering/rasterization-practical-implementation&quot;&gt;Rasterization: a
Practical
Implementation&lt;/a&gt;
over at &lt;a href=&quot;https://www.scratchapixel.com/&quot;&gt;Scratchapixel&lt;/a&gt;. What follows below
will be some thoughts that I had while reading that and thinking about a simple
design that could be put inside an FPGA.&lt;/p&gt;

&lt;h2 id=&quot;project-scope&quot;&gt;Project scope&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Fixed pipeline (duh, not likely to implement programmable shaders any time soon)&lt;/li&gt;
  &lt;li&gt;Fixed point integers only (no floating point)&lt;/li&gt;
  &lt;li&gt;Flat shading only (triangles filled with solid color)&lt;/li&gt;
  &lt;li&gt;No z-buffer (geometry has to be sorted before being sent down the pipeline)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rasterization-stage&quot;&gt;Rasterization stage&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Rasterization is the process of finding pixel coverage for a given triangle&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This stage is about finding out, for a given triangle, what screen pixels are
covered by that triangle. Since well behaved triangles contain significantly
more pixels than vertices (three) this step is likely to be the main
bottleneck of the system.&lt;/p&gt;

&lt;p&gt;Determining if a pixel is inside a triangle is done by checking if it falls on
the &lt;em&gt;right&lt;/em&gt; side of all three edges.&lt;/p&gt;

&lt;p&gt;Intuitively it seems that the cross product could be used to determine if a
point is on the left-side or right-side of an edge. Now screen space is 2d and
the cross product is a 3d only concept but one can of course limit the vectors
to be contained in the plane &lt;script type=&quot;math/tex&quot;&gt;z=0&lt;/script&gt;. In that case the well known cross
product&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;a \times b = (a_2b_3-a_3b_2, a_3b_1-a_1b_3, a_1b_2-a_2b_1)&lt;/script&gt;

&lt;p&gt;simply becomes&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;a_1b_2-a_2b_1&lt;/script&gt;

&lt;p&gt;which in rasterization is known as an &lt;em&gt;edge function&lt;/em&gt;. It is worth noting that
the &lt;em&gt;edge function&lt;/em&gt; is in fact the signed magnitude (the only non-zero component) of the cross product of the two vectors
in the &lt;script type=&quot;math/tex&quot;&gt;z=0&lt;/script&gt; plane, so it equals the signed area of the parallelogram they span and its sign tells which side of the edge the point is on.&lt;/p&gt;

&lt;p&gt;So the &lt;em&gt;edge function&lt;/em&gt; for edge &lt;script type=&quot;math/tex&quot;&gt;u \to v&lt;/script&gt; and arbitrary point &lt;script type=&quot;math/tex&quot;&gt;p&lt;/script&gt; is&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;E_{u \to v}(p) =  (v_x - u_x)(p_y - u_y) - (v_y - u_y)(p_x - u_x)&lt;/script&gt;

&lt;p&gt;now consider what happens if an offset &lt;script type=&quot;math/tex&quot;&gt;o&lt;/script&gt; is applied&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;% &lt;![CDATA[
\begin{eqnarray}
E_{u \to v}(p+o) &amp;=&amp; (v_x - u_x)(p_y + o_y - u_y) - (v_y - u_y)(p_x + o_x - u_x) \\
                 &amp;=&amp; E_{u \to v}(p) + (v_x - u_x)o_y - (v_y - u_y)o_x
\end{eqnarray} %]]&gt;&lt;/script&gt;

&lt;p&gt;in other words once we have computed &lt;script type=&quot;math/tex&quot;&gt;E_{u \to v}(p)&lt;/script&gt; we can explore
neighbouring pixels by simply adding and subtracting &lt;script type=&quot;math/tex&quot;&gt;(v_x - u_x)&lt;/script&gt; and &lt;script type=&quot;math/tex&quot;&gt;(v_y - u_y)&lt;/script&gt; both of which we have already computed for &lt;script type=&quot;math/tex&quot;&gt;E_{u \to v}(p)&lt;/script&gt;.&lt;/p&gt;

&lt;p&gt;This is a significant improvement as it allows us to replace two
multiplications and five additions with a single addition.&lt;/p&gt;
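
&lt;p&gt;The incremental evaluation can be sketched in Python as follows. This is a
software model for illustration only, and the function names &lt;code class=&quot;highlighter-rouge&quot;&gt;edge&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;edge_row&lt;/code&gt; are made up here. Points are assumed to be integer
&lt;code class=&quot;highlighter-rouge&quot;&gt;(x, y)&lt;/code&gt; tuples. Setting the offset to one pixel in the x direction in the
equation above shows that the edge function changes by minus
&lt;script type=&quot;math/tex&quot;&gt;(v_y - u_y)&lt;/script&gt; per step.&lt;/p&gt;

```python
def edge(u, v, p):
    """Edge function E_{u->v}(p) for integer 2D points given as (x, y)."""
    return (v[0] - u[0]) * (p[1] - u[1]) - (v[1] - u[1]) * (p[0] - u[0])


def edge_row(u, v, x0, y, width):
    """Evaluate E along a pixel row using one subtraction per pixel."""
    step_x = v[1] - u[1]     # moving +1 in x changes E by -(v_y - u_y)
    e = edge(u, v, (x0, y))  # full evaluation only for the first pixel
    row = []
    for _ in range(width):
        row.append(e)
        e -= step_x
    return row
```

&lt;p&gt;For any edge the list returned by &lt;code class=&quot;highlighter-rouge&quot;&gt;edge_row&lt;/code&gt; should match calling
&lt;code class=&quot;highlighter-rouge&quot;&gt;edge&lt;/code&gt; on every pixel of the row, which makes for an easy sanity check
before committing anything to hardware.&lt;/p&gt;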

&lt;p&gt;Naively every pixel of the screen needs to be checked against each triangle for
coverage but obviously one can often do much better by simply checking the bounding
box of the triangle. Either way the amount of pixels is significant so this
last result is really important as it will allow us to check multiple pixels
each cycle at a reasonable cost in hardware.&lt;/p&gt;

&lt;p&gt;The idea is to consider quads of 2x2 pixels. Begin by computing the &lt;script type=&quot;math/tex&quot;&gt;E_{u \to
v}(p)&lt;/script&gt; for the pixel in the upper left corner of the bounding box. After that
add/subtract as described above to get the remaining three pixels of the quad.
This should preferably be done in sequence so as not to waste too many resources. Once
the first quad is done the neighbouring quad in &lt;script type=&quot;math/tex&quot;&gt;x&lt;/script&gt; direction is obtained
by simply subtracting the precomputed &lt;script type=&quot;math/tex&quot;&gt;2(v_y - u_y)&lt;/script&gt; (keeping in mind that
we get the shift by one for free in hardware by simply hard-wiring in a zero at
the lsb). So we begin with sequential computations for the first quad but after
that we reach a steady state where we evaluate an entire quad each cycle.&lt;/p&gt;

&lt;p&gt;Still though every triangle has three edges so evaluating a quad every cycle
means that 12 adders need to be instantiated. If that results in too high
resource utilization we may need to cut down a bit.&lt;/p&gt;
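
&lt;p&gt;A software model of the quad traversal might look like the sketch below. It
is not an RTL design, just a way to check the arithmetic, and it rests on a few
assumptions that are not in the text above: the bounding box has even width and
height, and the vertices are ordered so that all three edge functions are
non-negative inside the triangle. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;rasterize_quads&lt;/code&gt; is made up
for this example.&lt;/p&gt;

```python
def edge(u, v, p):
    """Edge function E_{u->v}(p) for integer 2D points given as (x, y)."""
    return (v[0] - u[0]) * (p[1] - u[1]) - (v[1] - u[1]) * (p[0] - u[0])


def rasterize_quads(tri, x0, y0, x1, y1):
    """Return covered pixels in the box [x0, x1) x [y0, y1), quad by quad.

    Assumes an even box width and height, and a vertex order for which all
    three edge functions are non-negative inside the triangle.
    """
    edges = [(tri[i], tri[(i + 1) % 3]) for i in range(3)]
    dx = [-(v[1] - u[1]) for u, v in edges]  # change in E when x grows by 1
    dy = [v[0] - u[0] for u, v in edges]     # change in E when y grows by 1
    row = [edge(u, v, (x0, y0)) for u, v in edges]  # the only full evaluation
    covered = []
    for y in range(y0, y1, 2):
        e = list(row)
        for x in range(x0, x1, 2):
            for ox, oy in ((0, 0), (1, 0), (0, 1), (1, 1)):
                # ox and oy are 0 or 1, so the products below are really
                # "add or skip". Three edges times four pixels per quad is
                # where the adder count in the text comes from.
                if min(e[k] + ox * dx[k] + oy * dy[k] for k in range(3)) >= 0:
                    covered.append((x + ox, y + oy))
            e = [e[k] + 2 * dx[k] for k in range(3)]  # next quad in x
        row = [row[k] + 2 * dy[k] for k in range(3)]  # next quad row in y
    return covered
```

&lt;p&gt;Comparing the result against a brute force loop that calls
&lt;code class=&quot;highlighter-rouge&quot;&gt;edge&lt;/code&gt; for every pixel is a cheap way to validate the stepping constants
before moving on to hardware.&lt;/p&gt;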

&lt;h2 id=&quot;first-milestone&quot;&gt;First milestone&lt;/h2&gt;

&lt;p&gt;Rasterize fixed triangle over DVI/HDMI. Use block RAMs for frame buffer.
Suitable resolution 320x240. Try quad rasterizer and check resource
utilization in FPGA. If it works then great otherwise scale down.&lt;/p&gt;

&lt;h2 id=&quot;geometry-processing&quot;&gt;Geometry processing&lt;/h2&gt;

&lt;p&gt;Eventually we could try floating point format for the geometry processing.
Since geometry processing is not expected to be a bottleneck compared to pixel
rasterization one can probably afford a completely sequential implementation
and hence floating point is within reach. Also the dynamic range of floating
point makes more sense when it comes to vertices compared to pixel coordinates
in raster space (which are inherently integer).&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.cs.ucf.edu/~dcm/Teaching/CDA5106-Fall2015/Appendices/appendix_j.pdf&quot;&gt;Computer aritmetic appendix from Computer Architecture: A Quantitative Approach&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><category term="rtl" /><category term="verilog" /><category term="gpu" /><summary type="html">NOTE: This post is not finished yet but instead of keeping it as a hidden draft I have decided to publish it anyway. Right now I am not working on this project but if/when I return I will try to update the post with the progress.</summary></entry><entry><title type="html">Commodore 64 experiments part-3 (FPGA)</title><link href="https://www.zzzconsulting.se/2020/09/26/c64-part-3.html" rel="alternate" type="text/html" title="Commodore 64 experiments part-3 (FPGA)" /><published>2020-09-26T00:00:00+02:00</published><updated>2020-09-26T00:00:00+02:00</updated><id>https://www.zzzconsulting.se/2020/09/26/c64-part-3</id><content type="html" xml:base="https://www.zzzconsulting.se/2020/09/26/c64-part-3.html">&lt;p&gt;I figured the natural next step of our MyC64 journey would be to get the
emulator running on an FPGA, or more specifically the ULX3S board. Doing so
requires a few support systems such as keyboard input and visual output to be
in place so let us use this post to look at how that can be accomplished.&lt;/p&gt;

&lt;h2 id=&quot;keyboard-input-and-control&quot;&gt;Keyboard input and control&lt;/h2&gt;

&lt;p&gt;As we already know, the real C64 keyboard is an 8x8 matrix connected directly to
the CIA1 and scanned by the KERNAL code. Any modern keyboard we might find is
based on scan-codes so some conversion layer (much like what we already do in
&lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-sim&lt;/code&gt;) will be necessary.&lt;/p&gt;
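
&lt;p&gt;As a rough illustration of such a conversion layer, the Python sketch below
maps a few host key names to matrix positions and computes what a scan of the
matrix would read back. The table is only a small subset of the commonly
documented C64 matrix layout, and the names &lt;code class=&quot;highlighter-rouge&quot;&gt;MATRIX&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;read_port_b&lt;/code&gt; are made up for this example. It is not taken from
&lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-sim&lt;/code&gt; or from the FPGA implementation.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Subset of the C64 keyboard matrix: key name -> (port A bit, port B bit).
# Assumption: positions follow the commonly documented layout; verify
# against a reference before relying on them.
MATRIX = {
    'RETURN': (0, 1),
    'A': (1, 2),
    'LSHIFT': (1, 7),
    'SPACE': (7, 4),
}


def read_port_b(port_a, pressed):
    '''Value read from CIA1 port B for a given value written to port A.

    Both ports are active low: the KERNAL drives a 0 on the port A lines
    it wants to scan, and a pressed key pulls the matching port B line low.
    '''
    value = 0xFF
    for key in pressed:
        a_bit, b_bit = MATRIX[key]
        a_line_is_low = (port_a // 2 ** a_bit) % 2 == 0
        b_line_is_high = (value // 2 ** b_bit) % 2 == 1
        if a_line_is_low and b_line_is_high:
            value -= 2 ** b_bit
    return value


# Scan the line holding A and LSHIFT (port A bit 1 driven low).
print(hex(read_port_b(0xFD, {'A', 'LSHIFT'})))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;A forwarding program only has to keep the set of currently pressed keys up to
date from host key press and release events; the scan itself is then a pure
function of that set.&lt;/p&gt;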

&lt;p&gt;In addition to keyboard input &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt; injection also needs to be handled.&lt;/p&gt;

&lt;p&gt;A straightforward way to deal with these requirements is to treat the emulator
as a USB device and develop some kind of software on the host workstation that
performs the keyboard forwarding and program injection. Additional control
signaling such as reset/reboot could also be handled this way.&lt;/p&gt;

&lt;p&gt;Following this approach would put my &lt;a href=&quot;https://github.com/markus-zzz/usbdev/tree/dev&quot;&gt;USB
device&lt;/a&gt; implementation, detailed
in previous posts, to good use and also serve as motivation for thoroughly
finalizing it and working out the remaining bugs.&lt;/p&gt;

&lt;p&gt;So let’s do that! We simply grab the entire USB device subsystem
&lt;a href=&quot;https://github.com/markus-zzz/usbdev/tree/dev&quot;&gt;from GitHub&lt;/a&gt;, including the controlling
RISC-V CPU with ROM and RAM.&lt;/p&gt;

&lt;p&gt;On the workstation software side we need to develop a program that communicates
with the USB device while maintaining a workable GUI. For the USB communication
we use &lt;a href=&quot;https://libusb.info/&quot;&gt;libusb&lt;/a&gt; and for the GUI we rely on
&lt;a href=&quot;https://www.gtk.org/&quot;&gt;GTK&lt;/a&gt;. Combining these two essentially boils down to
using the asynchronous API of libusb and plugging it into the single event
loop handled by GTK.&lt;/p&gt;

&lt;p&gt;To be more precise, libusb is queried for its (Unix) event file descriptors and
the GTK event loop is set up to watch these (presumably using the select system
call internally). Details are found in
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/sw/myc64-keyb.cpp&quot;&gt;sw/myc64-keyb.cpp&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;video-output&quot;&gt;Video output&lt;/h2&gt;

&lt;p&gt;Displaying C64 video output in a reasonable way is rather tricky as we need to
comply with both the internal timing of the VIC-II (to avoid compatibility
issues with existing software) as well as the timing of the chosen video format for
a modern display.&lt;/p&gt;

&lt;p&gt;Apparently this is not impossible as the &lt;a href=&quot;https://ultimate64.com/Ultimate-64&quot;&gt;Ultimate
64&lt;/a&gt; manages to tweak the video timing of
576p50 to match its VIC-II output (described
&lt;a href=&quot;https://1541u-documentation.readthedocs.io/en/latest/hardware/hdmi.html&quot;&gt;here&lt;/a&gt;).
Details on the 576p50 video mode can be found
&lt;a href=&quot;https://ez.analog.com/video/f/q-a/109341/timing-details-of-720x576-progressive-video&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For this project we choose to go a different route and store intermediate
VIC-II video frames in external SDRAM and display them in 640x480@60Hz over
DVI/HDMI.&lt;/p&gt;

&lt;h2 id=&quot;audio-output&quot;&gt;Audio output&lt;/h2&gt;

&lt;p&gt;Not much to say about this. The ULX3S features a 4-bit resistor ladder as a DAC
and we simply connect the SID’s mono output to it. Actually, depending on the
content, 4-bit resolution does not sound as bad as one would think. Just listen
to &lt;a href=&quot;https://www.youtube.com/watch?v=nODn5HJGtAY&quot;&gt;this comparison video&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;the-myc64-soc&quot;&gt;The MyC64 SoC&lt;/h2&gt;

&lt;p&gt;Combining these systems in a SoC turns out as follows&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/c64/myc64-soc.svg&quot; alt=&quot;MyC64 SoC&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As can be seen this involves interaction between several clock domains.
Especially the visual path, which is rather intense in terms of pixel
shoveling, uses asynchronous FIFOs when crossing in and out of the SDRAM’s
125MHz clock domain.&lt;/p&gt;

&lt;p&gt;The ECP5-12F only has two PLLs so we are limited in the number of clock
frequencies that we can synthesize. Two blocks have strict requirements on the
clock frequency. The USB block needs a clock that is a multiple of 1.5MHz and
the DVI/HDMI controller needs a 125MHz clock (and a derived 25MHz pixel clock).
The MyC64 block needs a clock in the neighborhood of 8MHz but it does not have
to be exact (especially now that it is decoupled from the display).&lt;/p&gt;

&lt;p&gt;So given these constraints we end up with the three clock domains 15MHz, 25MHz
and 125MHz seen in the diagram. The MyC64 block uses an internal clock enable
to “divide” the 15MHz clock by two.&lt;/p&gt;
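
&lt;p&gt;The arithmetic behind this clock plan is small enough to write down. The
following Python sketch just restates the numbers from the text; the variable
names are made up for the example.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Clock plan from the text, all values in MHz.
usb_base = 1.5        # the USB block needs a multiple of this
sys_clk = 15.0        # shared by the USB block and the MyC64 block
fast_clk = 125.0      # SDRAM and DVI/HDMI controller clock
pixel_clk = fast_clk / 5

# 15 MHz is an exact multiple of 1.5 MHz.
usb_multiple = sys_clk / usb_base

# A clock enable asserted every second cycle gives the MyC64 block an
# effective rate of 7.5 MHz, which is in the neighborhood of 8 MHz.
myc64_rate = sys_clk / 2
deviation_percent = 100 * (8.0 - myc64_rate) / 8.0

print(usb_multiple, pixel_clk, myc64_rate, deviation_percent)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Running the emulated machine a few percent slow is acceptable here only
because the VIC-II output is decoupled from the display timing by the frame
buffer in SDRAM.&lt;/p&gt;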

&lt;p&gt;For the SDRAM the 125MHz clock appears to be a bit too tight, both on the PCB
as well as inside the FPGA (builds often fail timing constraints). So with
additional clocking resources it would be preferable if this clock could be
lowered to say 100MHz.&lt;/p&gt;

&lt;p&gt;Since the async FIFOs don’t provide &lt;em&gt;almost full&lt;/em&gt; and &lt;em&gt;almost empty&lt;/em&gt; signaling
it seemed difficult to operate the SDRAM in burst mode (the burst is
pipelined, so we need to stop it before the FIFO is full to avoid overrun).
Adding &lt;em&gt;almost full&lt;/em&gt; and &lt;em&gt;almost empty&lt;/em&gt; signaling to the FIFO was deemed out of
scope since asynchronous FIFOs are already notoriously complicated in that
area.&lt;/p&gt;

&lt;p&gt;One option could have been to add a normal synchronous FIFO in series to get
the &lt;em&gt;almost full&lt;/em&gt; and &lt;em&gt;almost empty&lt;/em&gt; signaling from that FIFO. Another option
could have been to adjust the SDRAM address pointers when an overrun or
underrun occurs (but since timing is already tight in the SDRAM clock domain
this kind of additional logic could make things worse).&lt;/p&gt;

&lt;p&gt;While those suggestions seem possible in theory we instead chose the simpler
approach of operating the SDRAM in single access mode. Since the C64 only
generates 16 colors we can store a 4-bit color index per pixel in SDRAM instead
of the full RGB value. Doing so means that we get four pixels per 16-bit memory
access, which appears to fall well within the time budget when operating the
SDRAM at 125MHz.&lt;/p&gt;
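
&lt;p&gt;To make the four-pixels-per-word idea concrete, here is a minimal Python
sketch of packing and unpacking. The function names are invented for this
example and the nibble order is an assumption; the actual design may place
the pixels differently within the word.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def pack_pixels(indices):
    '''Pack four 4-bit color indices into one 16-bit word.

    Assumption: the first pixel goes into the most significant nibble.
    '''
    assert len(indices) == 4
    word = 0
    for index in indices:
        assert index in range(16)
        word = word * 16 + index
    return word


def unpack_pixels(word):
    '''Inverse of pack_pixels: split a 16-bit word into four indices.'''
    return [(word // 16 ** shift) % 16 for shift in (3, 2, 1, 0)]


word = pack_pixels([14, 6, 0, 1])   # light blue, blue, black, white
print(hex(word), unpack_pixels(word))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;With this packing the SDRAM sees one word access for every four pixels, and
the translation from color index to RGB can be done with a small lookup table
on the display side.&lt;/p&gt;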

&lt;p&gt;The components used are gathered from various places&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;SDRAM controller - Slightly modified version of the one from &lt;a href=&quot;https://www.fpga4fun.com/SDRAM.html&quot;&gt;fpga4fun.com&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;DVI/HDMI controller - From &lt;a href=&quot;https://github.com/daveshah1/prjtrellis-dvi&quot;&gt;Project Trellis DVI&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Asynchronous FIFO - From &lt;a href=&quot;https://github.com/ZipCPU/website/blob/master/examples/afifo.v&quot;&gt;here&lt;/a&gt;. This is a variant of Cliff Cummings’ design formally verified by Dan Gisselquist. In fact it is the subject of a &lt;a href=&quot;https://zipcpu.com/blog/2018/07/06/afifo.html&quot;&gt;Crossing clock domains with an Asynchronous FIFO&lt;/a&gt; article by Dan.&lt;/li&gt;
  &lt;li&gt;MyC64-SubSys - That is what we are working on here.&lt;/li&gt;
  &lt;li&gt;UsbDev-SubSys - That is what I was working on before this project.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;verilator-simulation-for-the-soc&quot;&gt;Verilator simulation for the SoC&lt;/h3&gt;

&lt;p&gt;The simulator needed to be extended a bit to handle all the different components
with their associated clock domains. Regrettably I dragged my feet on this and
as a result wasted valuable time trying to diagnose what turned out to be
rather trivial problems while running on the FPGA.&lt;/p&gt;

&lt;p&gt;Mostly it is about handling of multiple clock domains in Verilator. It turns out
that Dan Gisselquist has an excellent article titled &lt;a href=&quot;https://zipcpu.com/blog/2018/09/06/tbclock.html&quot;&gt;Handling multiple clocks
with Verilator&lt;/a&gt; so I went
ahead and did something similar.&lt;/p&gt;
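
&lt;p&gt;The core of any such multi-clock testbench is deciding which clock toggles
next. The Python sketch below models just that scheduling step for the three
clocks of this SoC. It is a simplified illustration with made-up names, not
code from the article or from &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-soc-sim&lt;/code&gt;, and it does not use any
Verilator API.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;from fractions import Fraction
import heapq


def clock_edges(freqs_mhz, count):
    '''Return the first count clock toggles as (time_ns, name, level).

    Each clock starts low and toggles every half period. Exact fractions
    are used so that clocks with non-integer periods do not drift.
    '''
    queue = []
    for name, mhz in freqs_mhz.items():
        half_period_ns = Fraction(1000, 2 * mhz)
        heapq.heappush(queue, (half_period_ns, name, 1, half_period_ns))
    edges = []
    while len(edges) != count:
        time_ns, name, level, half = heapq.heappop(queue)
        edges.append((time_ns, name, level))
        heapq.heappush(queue, (time_ns + half, name, 1 - level, half))
    return edges


clocks = {'clk15': 15, 'clk25': 25, 'clk125': 125}
for time_ns, name, level in clock_edges(clocks, 8):
    print(float(time_ns), name, level)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In a real testbench all toggles that share a timestamp would be applied to
the model inputs before evaluating the design once for that point in time.&lt;/p&gt;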

&lt;p&gt;To avoid breaking the video chain we also need to simulate the external SDRAM
so I added a very basic SDRAM model to
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64-soc/myc64-soc-top.v&quot;&gt;rtl/myc64-soc/myc64-soc-top.v&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;With all this additional stuff in place simulation with &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-soc-sim&lt;/code&gt; is
much slower than with our previous &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-sim&lt;/code&gt;. So much slower that interactive
input and display does not make sense any more. Instead the SoC simulator
stores each video frame directly to the file system in PNG format.&lt;/p&gt;

&lt;p&gt;We store video frames directly after the VIC-II output as well as the
intermediate VGA frames that are being fed into the DVI controller. That is we
store output both before and after the error-prone SDRAM part.&lt;/p&gt;

&lt;p&gt;The SoC simulator &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-soc-sim&lt;/code&gt; is implemented in
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/sim/myc64-soc-sim.cpp&quot;&gt;sim/myc64-soc-sim.cpp&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;synthesis&quot;&gt;Synthesis&lt;/h3&gt;

&lt;p&gt;Synthesis for my ULX3S’s ECP5-12F FPGA yields the following utilization which I
find rather low given the amount of stuff we have put in. Of course we have
used up all the PLLs and almost all of the block RAM but in terms of logic
slices it is not that high.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Info: Device utilisation:
Info:          TRELLIS_SLICE:  6136/12144    50%
Info:             TRELLIS_IO:   128/  197    64%
Info:                   DCCA:     4/   56     7%
Info:                 DP16KD:    49/   56    87%
Info:             MULT18X18D:     3/   28    10%
Info:                 ALU54B:     0/   14     0%
Info:                EHXPLLL:     2/    2   100%
Info:                EXTREFB:     0/    1     0%
Info:                   DCUA:     0/    1     0%
Info:              PCSCLKDIV:     0/    2     0%
Info:                IOLOGIC:     0/  128     0%
Info:               SIOLOGIC:     8/   69    11%
Info:                    GSR:     0/    1     0%
Info:                  JTAGG:     0/    1     0%
Info:                   OSCG:     0/    1     0%
Info:                  SEDGA:     0/    1     0%
Info:                    DTR:     0/    1     0%
Info:                USRMCLK:     0/    1     0%
Info:                CLKDIVF:     0/    4     0%
Info:              ECLKSYNCB:     0/   10     0%
Info:                DLLDELD:     0/    8     0%
Info:                 DDRDLL:     0/    4     0%
Info:                DQSBUFM:     0/    8     0%
Info:        TRELLIS_ECLKBUF:     0/    8     0%
Info:           ECLKBRIDGECS:     0/    2     0%
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As mentioned earlier, the 125MHz clock domain appears to be operating at its
limits as we often get timing constraint violations after small (even
unrelated) modifications to the design.&lt;/p&gt;

&lt;h2 id=&quot;demonstration&quot;&gt;Demonstration&lt;/h2&gt;

&lt;p&gt;After loading the design on the ULX3S board and plugging in the secondary USB
connector we should expect to see the following in &lt;code class=&quot;highlighter-rouge&quot;&gt;dmesg&lt;/code&gt;&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;usb 1-5: new low-speed USB device number 6 using xhci_hcd
usb 1-5: New USB device found, idVendor=abc0, idProduct=0064, bcdDevice= 1.00
usb 1-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-5: Product: MyC64 - FPGA based Commodore 64 emulator as a USB device
usb 1-5: Manufacturer: Markus Lavin (https://www.zzzconsulting.se/)
usb 1-5: SerialNumber: 0123456789abcdef
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;At this point the device should be generating DVI/HDMI output and it is
advisable to connect a screen to verify. If all looks good we can go ahead and
connect with &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-keyb&lt;/code&gt; and start typing in some BASIC commands. Any key
press (and release) that goes into the application window will be forwarded to
the USB connected MyC64 emulator.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-keyb&lt;/code&gt; program takes an optional argument that specifies a &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt;
file to inject into emulator memory.&lt;/p&gt;

&lt;p&gt;An interesting program to try out is the machine language monitor SuperMon by
&lt;a href=&quot;https://en.wikipedia.org/wiki/Jim_Butterfield&quot;&gt;Jim Butterfield&lt;/a&gt;. SuperMon is
the monitor used in Jim’s classic book &lt;a href=&quot;https://archive.org/details/Machine_Language_for_the_Commodore_Revised_and_Expanded_Edition&quot;&gt;Machine Language for the
Commodore&lt;/a&gt;.
To get hold of SuperMon itself in modern times simply go to &lt;a href=&quot;https://github.com/jblang/supermon64&quot;&gt;Supermon+64
V1.2&lt;/a&gt;. There one can also find a
&lt;a href=&quot;https://www.youtube.com/watch?v=MEjnMt_3wkU&quot;&gt;video&lt;/a&gt; demonstrating its
capabilities.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/c64/myc64-ulx3s.jpg&quot; alt=&quot;MyC64 on ULX3S&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;known-issues--bugs&quot;&gt;Known issues / Bugs&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;USB device loses sync (and host aborts after three packets with no response) so longer transfers often fail. Works with retry though.&lt;/li&gt;
  &lt;li&gt;“Static noise” moves past the screen occasionally (seems to be dependent on when reset is released). Believed to be limitations of SDRAM (traces on PCB) operating at 125MHz.&lt;/li&gt;
  &lt;li&gt;Sometimes the &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt; appears to get corrupted when loaded over USB.&lt;/li&gt;
&lt;/ul&gt;</content><author><name></name></author><category term="hardware" /><category term="c64" /><summary type="html">I figured the natural next step of our MyC64 journey would be to get the emulator running on a FPGA, or more specifically the ULX3S board. Doing so requires a few support systems such as keyboard input and visual output to be in place so let us use this post to look at how that can be accomplished.</summary></entry><entry><title type="html">Commodore 64 experiments part-2 (SID)</title><link href="https://www.zzzconsulting.se/2020/08/22/c64-part-2.html" rel="alternate" type="text/html" title="Commodore 64 experiments part-2 (SID)" /><published>2020-08-22T00:00:00+02:00</published><updated>2020-08-22T00:00:00+02:00</updated><id>https://www.zzzconsulting.se/2020/08/22/c64-part-2</id><content type="html" xml:base="https://www.zzzconsulting.se/2020/08/22/c64-part-2.html">&lt;p&gt;&lt;strong&gt;NOTE: I wrote the majority of this post (and the implementation) about a month
ago but never got around to publishing it until now.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After the last post’s progress of getting the system to boot and being able to
run small test programs I felt motivated to carry on. While perhaps it would
have been wiser to focus on the core system I instead turned my attention to the
SID chip.&lt;/p&gt;

&lt;p&gt;Again, since I did not really do much with my C64 back in the day, and have had
no reason to study it afterwards, my pre-existing knowledge about the SID was
pretty limited. I basically only knew “it was the chip that made the sounds”.&lt;/p&gt;

&lt;p&gt;To get an idea what the SID sounds like the most convenient solution is probably
to head over to the online player at &lt;a href=&quot;https://deepsid.chordian.net/&quot;&gt;DeepSID&lt;/a&gt;
where there is a huge collection of C64 music from games and demos that will
play right in your browser.&lt;/p&gt;

&lt;p&gt;So the SID player plays SID files, but what exactly is a &lt;code class=&quot;highlighter-rouge&quot;&gt;.sid&lt;/code&gt; file? Well,
unlike modern audio formats it does not contain PCM samples (e.g. WAV or
compressed formats like MP3) nor does it contain the tones or notes like a MIDI
file does. Instead it actually contains 6502 machine code that writes values
into SID registers to produce the desired sound. As such a SID player is
essentially a scaled down C64 emulator. For a complete description of the SID
file format see &lt;a href=&quot;https://www.hvsc.c64.org/download/C64Music/DOCUMENTS/SID_file_format.txt&quot;&gt;this
link&lt;/a&gt;
but for our purposes it suffices to say that it contains two entry points: one
entry point for an initialization routine that prepares to play a given
song (a SID file can contain several) and a second entry point for the play
routine that is to be called repeatedly (usually once per video frame, i.e. at
50Hz or 60Hz).&lt;/p&gt;
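
&lt;p&gt;To show where these entry points live, here is a small Python sketch that
pulls a few fields out of the header, using the offsets given in the format
description linked above. The function name and the hand-made header bytes
are invented for this example and do not come from a real tune.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def parse_sid_header(data):
    '''Extract a few fields from a PSID/RSID header (all big-endian).'''
    def word(offset):
        return int.from_bytes(data[offset:offset + 2], 'big')

    magic = data[0:4].decode('ascii')
    if magic not in ('PSID', 'RSID'):
        raise ValueError('not a SID file')
    return {
        'magic': magic,
        'version': word(0x04),
        'data_offset': word(0x06),
        'load_address': word(0x08),
        'init_address': word(0x0A),
        'play_address': word(0x0C),
        'songs': word(0x0E),
        'start_song': word(0x10),
        'name': data[0x16:0x36].split(b'\x00')[0].decode('latin-1'),
    }


# Hand-made header for illustration only, not taken from a real tune.
header = (b'PSID' + bytes([0, 2, 0, 0x7C, 0, 0, 0x10, 0x00, 0x10, 0x03,
                           0, 3, 0, 1, 0, 0, 0, 0])
          + b'Example'.ljust(32, b'\x00'))
info = parse_sid_header(header)
print(hex(info['init_address']), hex(info['play_address']), info['songs'])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;A player would call the routine at the init address once with the song
number and then the routine at the play address at the replay rate.&lt;/p&gt;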

&lt;p&gt;Now while this format plays directly in a SID player it requires some
scaffolding and relocation to play on actual C64 hardware. This is where the
&lt;a href=&quot;http://psid64.sourceforge.net/&quot;&gt;PSID64&lt;/a&gt; utility comes in: it provides this
scaffolding and relocation and basically converts a &lt;code class=&quot;highlighter-rouge&quot;&gt;.sid&lt;/code&gt; file to a &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt;
that can be loaded and run on a C64.&lt;/p&gt;

&lt;p&gt;A huge collection of SID files can be found at the &lt;a href=&quot;https://www.hvsc.c64.org/&quot;&gt;High Voltage SID Collection
(HVSC)&lt;/a&gt; and conveniently their downloads page also
provides &lt;a href=&quot;https://boswme.home.xs4all.nl/HVSC/HVSC73_PSID64_packed.7z&quot;&gt;an archive of the HVSC collection converted to PSID64
format&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Getting the PSID64 &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt; to run on MyC64 required some work but it mostly
boiled down to finally implementing bank switching.&lt;/p&gt;

&lt;p&gt;Let’s try and fire up the good old Bubble Bobble tune&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./myc64-sim --cmd-load-prg=130:./MUSICIANS/C/Clarke_Peter/Bubble_Bobble.prg --cmd-inject-keys=135:&quot;LIST&amp;lt;RETURN&amp;gt;RUN&amp;lt;RETURN&amp;gt;&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/download/c64/psid64-bubble-bobble.apng&quot; alt=&quot;PSID64 playing Bubble Bobble&quot; /&gt;&lt;/p&gt;

&lt;p&gt;So now that we have something to test the SID with we can move our focus to try
and emulate the actual chip.&lt;/p&gt;

&lt;p&gt;For those like me who don’t know anything about music synthesis the following
videos (&amp;lt; 10min in total) proved helpful to introduce the basic concepts&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://youtu.be/qV10Gb-Dvao&quot;&gt;What’s Synthesis and Sound Design? Part 1: Oscillators &amp;amp; Waveforms (Music Theory)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://youtu.be/In23B9qZhI8&quot;&gt;What’s Synthesis and Sound Design? Part 2: Subtractive Synthesis &amp;amp; Filters (Music Theory)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://youtu.be/n-k0NQ5lcSA&quot;&gt;What’s Synthesis and Sound Design? Part 3: Envelopes &amp;amp; ADSR (Music Theory)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After that I suggest the following sources for details about the SID&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.6502.org/documents/datasheets/mos/mos_6582_sid.pdf&quot;&gt;6582 SOUND Interface Device (SID) - datasheet&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://sid.kubarth.com/articles/interview_bob_yannes.html&quot;&gt;Interview with Bob Yannes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A very basic implementation based on that was started in
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64/sid.v&quot;&gt;rtl/myc64/sid.v&lt;/a&gt;.
As usual it is far from complete but it is able to produce some basic tunes.
Modifying &lt;code class=&quot;highlighter-rouge&quot;&gt;myc64-sim&lt;/code&gt; to store the generated audio signal into a &lt;code class=&quot;highlighter-rouge&quot;&gt;.wav&lt;/code&gt; file
allowed us to capture &lt;a href=&quot;/download/c64/Bubble_Bobble-MyC64-sim.wav&quot;&gt;Bubble Bobble - MyC64
SID&lt;/a&gt;. This admittedly sounds quite a
bit off (especially in frequency) compared to &lt;a href=&quot;/download/c64/Bubble_Bobble-VICE-8580-ReSID.wav&quot;&gt;Bubble Bobble - VICE 8580
ReSID&lt;/a&gt; generated by the well
known &lt;a href=&quot;https://vice-emu.sourceforge.io/&quot;&gt;VICE&lt;/a&gt; emulator. Still it is good
enough as a first approximation and allows us to move on and focus on other
areas. That concludes this post. As always if you have questions or feedback -
leave a comment below!&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><category term="c64" /><summary type="html">NOTE: I wrote the majority of this post (and the implementation) about a month ago but never got around to publish it until now.</summary></entry><entry><title type="html">Commodore 64 experiments part-1</title><link href="https://www.zzzconsulting.se/2020/07/12/c64-part-1.html" rel="alternate" type="text/html" title="Commodore 64 experiments part-1" /><published>2020-07-12T00:00:00+02:00</published><updated>2020-07-12T00:00:00+02:00</updated><id>https://www.zzzconsulting.se/2020/07/12/c64-part-1</id><content type="html" xml:base="https://www.zzzconsulting.se/2020/07/12/c64-part-1.html">&lt;p&gt;When I was a kid back in the 80s my parents bought our family’s first computer,
a Commodore 64 with a datasette station. I still have fond memories of that
marvelous little machine. At the time though I was only ten years old so my
mental capacity (and likely also patience) was quite limited. I never managed
to do much with it besides playing games and typing in the occasional BASIC
listing from the Sunday newspaper. But that almost never worked anyway.&lt;/p&gt;

&lt;p&gt;Now in recent times I often find myself watching retro computing episodes from
the big YouTubers in the field such as &lt;a href=&quot;https://www.youtube.com/user/adric22&quot;&gt;The 8-Bit
Guy&lt;/a&gt; and &lt;a href=&quot;https://www.youtube.com/channel/UC6gARF3ICgaLfs3o2znuqXA&quot;&gt;Retro
Recipes&lt;/a&gt;. Needless to
say they often feature the C64 and it is refreshing to be reminded of a simpler
time. A time when things were not complicated to absurdity, sometimes
seemingly, just for the sake of it.&lt;/p&gt;

&lt;p&gt;While I know that there are numerous C64 emulators available (mostly in
software but also a few FPGA based) I always thought that building one would be
a neat project to try.&lt;/p&gt;

&lt;p&gt;Some weeks ago I ran across the &lt;a href=&quot;http://c64onfpga.blogspot.com/&quot;&gt;C64 on an
FPGA&lt;/a&gt; blog series by Johan Steenkamp, a blog
series dating from 2017 to present detailing his journey in implementing an
FPGA based C64 emulator. While I don’t agree with every design decision taken
there, the work put into implementing it and the effort spent documenting it are
really impressive.&lt;/p&gt;

&lt;p&gt;So inspired by this I thought I would try doing something similar or at least
see how much work would be required to get a system that boots into BASIC.&lt;/p&gt;

&lt;h2 id=&quot;preparations&quot;&gt;Preparations&lt;/h2&gt;

&lt;p&gt;Before we get started it is useful to gather some preliminary materials.&lt;/p&gt;

&lt;h3 id=&quot;documentation&quot;&gt;Documentation&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.c64-wiki.com/wiki/Commodore_64_Programmer%27s_Reference_Guide&quot;&gt;Commodore 64 Programmer’s Reference Guide&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://sta.c64.org/cbm64mem.html&quot;&gt;Commodore 64 memory map&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.zimmers.net/cbmpics/cbm/c64/vic-ii.txt&quot;&gt;The MOS 6567/6569 video controller (VIC-II) and its application in the Commodore 64 by Christian Bauer&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.6502.org/documents/datasheets/mos/mos_6526_cia_recreated.pdf&quot;&gt;6526 Complex Interface Adapter (CIA) - datasheet&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.6502.org/documents/datasheets/mos/mos_6582_sid.pdf&quot;&gt;6582 SOUND Interface Device (SID) - datasheet&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;cpu&quot;&gt;CPU&lt;/h3&gt;

&lt;p&gt;A Verilog model of a 6502/6510 CPU by Arlet Ottens can be downloaded from
&lt;a href=&quot;http://ladybug.xs4all.nl/arlet/fpga/6502/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;roms&quot;&gt;ROMs&lt;/h3&gt;

&lt;p&gt;KERNAL, BASIC and character ROMs can be picked up from
&lt;a href=&quot;http://www.zimmers.net/anonftp/pub/cbm/firmware/computers/c64/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;considerations&quot;&gt;Considerations&lt;/h2&gt;

&lt;p&gt;By today’s standards the C64 is a simple system. However when trying to
reimplement it with modern technology (and methodology) inside a single FPGA
there are a few things that need to be kept in mind.&lt;/p&gt;

&lt;p&gt;The design is based around asynchronous RAM so for reading you expect the data
to become available in the same cycle as the address is presented. This is
different from the synchronous RAM blocks found inside FPGAs where an
additional clock cycle must pass before the data is available.&lt;/p&gt;

&lt;p&gt;There are no stall cycles or handshake signals in the bus protocol so every
transaction is expected to complete immediately. Also, if a chip is not
connected to the bus, writes to it are simply ignored and reads return
a default value. When trying to bring up the system this is both good and bad.
Good because you feel you make progress fast, but bad since you will probably
skip implementing functionality that later turns out to be important, which
leads to hard-to-find bugs.&lt;/p&gt;

&lt;p&gt;The CPU and VIC-II share the same bus by allowing the VIC-II to access it during
the first phase of the clock (φ2 low) and the CPU during the second phase (φ2
high). This does not map really well to the single edge methodology used in most
modern designs.&lt;/p&gt;

&lt;p&gt;Also as the design is made up of discrete ICs they are connected with a classic
tri-state bus. Tri-state buses inside the fabric are not supported by modern
FPGAs so multiplexers need to be used instead. In a way though, it would be
nice to be able to closely recreate the PCB interconnect inside the FPGA and
have each IC correspond to a Verilog module with the exact same interface
signals.&lt;/p&gt;

&lt;p&gt;The VIC-II has an 8MHz pixel clock but the remaining design is based around a
1MHz system clock where both phases (or edges) are used. This also complicates
matters and it is always wise to avoid multiple clock domains unless absolutely
necessary.&lt;/p&gt;

&lt;h2 id=&quot;design&quot;&gt;Design&lt;/h2&gt;

&lt;p&gt;So given the considerations in the previous section I have chosen to use a
single clock domain of 8MHz with enable signals corresponding to the positive
and negative phase of a 1MHz clock.&lt;/p&gt;
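
&lt;p&gt;One way to picture these enable signals is as two pulses derived from a
counter running at 8MHz. The Python sketch below is only a behavioral model
under that assumption; the names &lt;code class=&quot;highlighter-rouge&quot;&gt;phase_a&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;phase_b&lt;/code&gt; and the choice of
which cycles carry the pulses are made up, and the RTL may do this differently.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def phase_enables(cycle):
    '''Clock enables for one 8MHz cycle, counted from reset.

    Assumption: eight 8MHz cycles form one 1MHz period, with one enable
    pulse at the end of each half. The actual RTL may place the pulses
    on different cycles.
    '''
    slot = cycle % 8
    return {'phase_a': slot == 3, 'phase_b': slot == 7}


# Print one full 1MHz period.
for cycle in range(8):
    enables = phase_enables(cycle)
    print(cycle, int(enables['phase_a']), int(enables['phase_b']))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Each enable is active for one 8MHz cycle out of eight, so logic gated by it
advances at 1MHz while everything still lives in a single clock domain.&lt;/p&gt;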

&lt;p&gt;The single ported synchronous RAM modules are wrapped so that they act as
asynchronous RAM that can be independently accessed with the two 1MHz phase
enable signals. See
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64/spram.v&quot;&gt;rtl/spram.v&lt;/a&gt; and
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64/sprom.v&quot;&gt;rtl/sprom.v&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;An implementation of a very basic VIC-II is started in
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64/vic-ii.v&quot;&gt;rtl/vic-ii.v&lt;/a&gt;
and likewise for the CIA in
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64/cia.v&quot;&gt;rtl/cia.v&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Everything is wired up and connected to a fake keyboard matrix in
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/rtl/myc64/myc64-top.v&quot;&gt;rtl/myc64-top.v&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;results-so-far&quot;&gt;Results (so far)&lt;/h2&gt;

&lt;p&gt;First of all the code for the MyC64 project can be found &lt;a href=&quot;https://github.com/markus-zzz/myc64&quot;&gt;here on
GitHub&lt;/a&gt; where the README contains a short
instruction sequence to get started with the simulator for those interested.&lt;/p&gt;

&lt;p&gt;Simulation is Verilator based and the driver program
&lt;a href=&quot;https://github.com/markus-zzz/myc64/blob/master/sim/myc64-sim.cpp&quot;&gt;myc64-sim&lt;/a&gt; is
similar to the one used for &lt;a href=&quot;/2019/08/12/retro-gaming-hw-part-1.html&quot;&gt;my previous RetroCon project&lt;/a&gt; but with the addition of some nifty
features to execute delayed commands at a certain frame (e.g. no point
injecting a key press sequence until the system has finished booting).&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;markus@workstation:~/work/repos/myc64/sim$ ./myc64-sim -h
Usage: ./myc64-sim [OPTIONS]

  --scale=N             -- set pixel scaling
  --frame-rate=N        -- try to produce a new frame every N ms
  --save-frame-from=N   -- dump frames to .png starting from frame #N
  --save-frame-to=N=N   -- dump frames to .png ending with frame #N
  --save-frame-prefix=S -- prefix dump frame files with S
  --exit-after-frame=N  -- exit after frame #N
  --trace               -- create dump.vcd
  --cmd-load-prg=&amp;lt;FRAME&amp;gt;:&amp;lt;PRG&amp;gt;           -- wait until &amp;lt;FRAME&amp;gt; then load &amp;lt;PRG&amp;gt;
  --cmd-inject-keys=&amp;lt;FRAME&amp;gt;:&amp;lt;KEYS&amp;gt;       -- wait until &amp;lt;FRAME&amp;gt; then inject &amp;lt;KEYS&amp;gt;
  --cmd-dump-ram=&amp;lt;FRAME&amp;gt;:&amp;lt;ADDR&amp;gt;:&amp;lt;LENGTH&amp;gt; -- wait until &amp;lt;FRAME&amp;gt; then dump &amp;lt;LENGTH&amp;gt; bytes of RAM starting at &amp;lt;ADDR&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;As an example consider the following command line that starts simulation and
then waits until video frame #135 before injecting the key sequence for the
classic one-line BASIC maze generator.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./myc64-sim --cmd-inject-keys=135:&quot;10&amp;lt;SPACE&amp;gt;PRINT&amp;lt;SPACE&amp;gt;CHR&amp;lt;LSHIFT&amp;gt;4&amp;lt;LSHIFT&amp;gt;8205.5+RND&amp;lt;LSHIFT&amp;gt;81&amp;lt;LSHIFT&amp;gt;9&amp;lt;LSHIFT&amp;gt;9;:GOTO&amp;lt;SPACE&amp;gt;10&amp;lt;RETURN&amp;gt;RUN&amp;lt;RETURN&amp;gt;&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/download/c64/basic-maze.apng&quot; alt=&quot;C64 boot and one line BASIC maze&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It is also possible to inject an arbitrary &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt; file into memory and run
that. E.g. here we take a trivial program that cycles the background color and
assemble it into a &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt; file.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Start:
loop:  inc $d021
       jmp loop
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cl65 -o test_001.prg -t c64 -C c64-asm.cfg -u __EXEHDR__ testasm/test_001.s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;It is interesting to note that the &lt;code class=&quot;highlighter-rouge&quot;&gt;cl65&lt;/code&gt; tool will also insert a small BASIC
header containing the necessary &lt;code class=&quot;highlighter-rouge&quot;&gt;SYS &amp;lt;addr&amp;gt;&lt;/code&gt; command so that the assembled
program can be started with a simple BASIC &lt;code class=&quot;highlighter-rouge&quot;&gt;RUN&lt;/code&gt; command.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./myc64-sim --cmd-load-prg=130:test_001.prg --cmd-inject-keys=135:&quot;LIST&amp;lt;RETURN&amp;gt;RUN&amp;lt;RETURN&amp;gt;&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/download/c64/cycle-bgcolor.apng&quot; alt=&quot;C64 cycle bgcolor&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;next-steps&quot;&gt;Next steps&lt;/h2&gt;

&lt;p&gt;I am reasonably happy with the result so far given the limited effort put in,
but of course it does not have to end here. This could evolve in any number of
different directions. Here are some ideas that come to mind:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Continue making the implementation more complete and exact, e.g. take a game such as &lt;a href=&quot;https://www.c64-wiki.com/wiki/Bubble_Bobble&quot;&gt;Bubble Bobble&lt;/a&gt; and try to make it play.&lt;/li&gt;
  &lt;li&gt;Get it running on the ULX3S. This would require putting the video frame in SDRAM so that HDMI/DVI can read it at the correct refresh rate for 640x480@60Hz. Here we could also find real use for &lt;a href=&quot;/2020/05/16/usb-dev-part-2.html&quot;&gt;the usbdev project&lt;/a&gt; to allow key and &lt;code class=&quot;highlighter-rouge&quot;&gt;.prg&lt;/code&gt; injection from a host PC.&lt;/li&gt;
  &lt;li&gt;Start looking at a SID implementation.&lt;/li&gt;
  &lt;li&gt;Rewrite in &lt;a href=&quot;https://github.com/m-labs/nmigen&quot;&gt;nMigen&lt;/a&gt;. For quite some time now I have been wanting to try out nMigen on a project and this might be a good opportunity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is it for this post. If you have questions or feedback - leave a comment below!&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><category term="c64" /><summary type="html">When I was a kid back in the 80s my parents bought our family’s first computer, a Commodore 64 with a datasette station. I still have fond memories of that marvelous little machine. At the time though I was only ten years old so my mental capacity (and likely also patience) was quite limited. I never managed to do much with it besides playing games and typing in the occasional BASIC listing from the Sunday newspaper. But that almost never worked anyway.</summary></entry><entry><title type="html">Building a USB device part-2</title><link href="https://www.zzzconsulting.se/2020/05/16/usb-dev-part-2.html" rel="alternate" type="text/html" title="Building a USB device part-2" /><published>2020-05-16T00:00:00+02:00</published><updated>2020-05-16T00:00:00+02:00</updated><id>https://www.zzzconsulting.se/2020/05/16/usb-dev-part-2</id><content type="html" xml:base="https://www.zzzconsulting.se/2020/05/16/usb-dev-part-2.html">&lt;p&gt;In this post we pick up where we left off last time and start looking at the
design and implementation of the USB device that I have been working on. First
things first though, the code for the entire project can be found in &lt;a href=&quot;https://github.com/markus-zzz/usbdev/tree/dev&quot;&gt;this
github repository&lt;/a&gt;. The reader
is advised to have that readily available.&lt;/p&gt;

&lt;p&gt;The first major goal of this series will be to have the device play along
during the USB enumeration process so that the host can set the address and read
the relevant descriptors. This can be verified by making sure that the device shows
up properly when issuing &lt;code class=&quot;highlighter-rouge&quot;&gt;lsusb&lt;/code&gt; on my Linux workstation.&lt;/p&gt;

&lt;h2 id=&quot;design-of-usbdev-and-soc&quot;&gt;Design of usbdev (and SoC)&lt;/h2&gt;

&lt;h3 id=&quot;clock-recovery&quot;&gt;Clock recovery&lt;/h3&gt;

&lt;p&gt;The signaling in USB consists of the differential pair D+ and D-. For a &lt;em&gt;full
speed (FS)&lt;/em&gt; device the bit rate is 12 Mbit/s, so if we were able to sample at
exactly the right spot a 12 MHz clock should suffice. In reality though, finding
that spot is the tricky bit. To aid in clock recovery USB employs both NRZI encoding and
bit-stuffing to ensure that the differential pair will contain a level
transition at least every seven bit times.&lt;/p&gt;

&lt;p&gt;With this in mind it would seem a reasonable approach to run the design at
48 MHz and oversample the differential pair by a factor of four. More
precisely, by obtaining four equally spaced samples for each bit time we should
be able to adjust the actual sampling position (in steps of 1/4 bit time) to be
as far away from any level transition as possible (i.e. in the middle of the eye
diagram).&lt;/p&gt;

&lt;p&gt;So running at a 48 MHz clock we have a 2-bit counter (&lt;code class=&quot;highlighter-rouge&quot;&gt;reg [1:0] cntr&lt;/code&gt;)
incrementing each cycle (except when adjusting). When the counter equals
zero we perform the real sample. For every 48 MHz cycle we also sample and shift
the value into a four-bit shift register (&lt;code class=&quot;highlighter-rouge&quot;&gt;reg [3:0] past&lt;/code&gt;). Since we want any
possible level transition to occur in the middle of this shift register, we
either advance or delay the counter by one increment depending on whether a
transition occurred early or late in the shift register.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  // A bit transition should ideally occur between past[2] and past[1]. If it
  // occurs elsewhere we are either sampling too early or too late.
  assign advance = past[3] ^ past[2];
  assign delay   = past[1] ^ past[0];
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;If &lt;code class=&quot;highlighter-rouge&quot;&gt;advance&lt;/code&gt; is active the counter increments by two and if &lt;code class=&quot;highlighter-rouge&quot;&gt;delay&lt;/code&gt; is active
there is no increment (for the given 48 MHz cycle).&lt;/p&gt;
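&lt;p&gt;To get a feel for the two signals, here is a minimal Python sketch that mirrors
the two assign statements for a given 4-bit window and derives the counter step
described above. The function names are made up for illustration, and giving
advance priority when both signals are set is an assumption on my part, not
something taken from the RTL.&lt;/p&gt;

```python
def classify(past):
    """Mirror the two assign statements for a 4-bit window (past[3] is the MSB)."""
    bit = [(past // 2 ** i) % 2 for i in range(4)]  # bit[i] corresponds to past[i]
    advance = bit[3] ^ bit[2]
    delay = bit[1] ^ bit[0]
    return advance, delay


def counter_step(past):
    """Counter increment for this cycle: 2 on advance, 0 on delay, 1 otherwise.

    Assumption: advance wins if both signals happen to be set.
    """
    advance, delay = classify(past)
    if advance:
        return 2
    if delay:
        return 0
    return 1


# Print the window, the (advance, delay) pair and the resulting counter step.
for window in (0b0011, 0b0111, 0b0001, 0b0000):
    print(format(window, "04b"), classify(window), counter_step(window))
```

&lt;p&gt;A window of 0011 has its transition between past[2] and past[1], so neither
signal fires and the counter steps normally.&lt;/p&gt;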

&lt;p&gt;After seeing some transitions this should be able to adjust the sampling point
to where the signal lines are stable.&lt;/p&gt;

&lt;p&gt;This is not the only option for clock recovery, but it is the simplest
one to implement. However, if the bit rate were significantly higher relative to
the frequency our design is clocked at (e.g. USB &lt;em&gt;high-speed&lt;/em&gt; at 480 Mbit/s)
other methods would have to be used. In such cases I would suspect that
sampling at the exact bit rate and then phase adjusting the clock (with the help of
a PLL) until the synchronization pattern is reliably detected would be the
method of choice. Of course simply phase adjusting until the synchronization
pattern is detected would not be enough; you probably want an FSM to find the
midpoint between the highest and lowest phase adjustment that makes the pattern
detectable.&lt;/p&gt;

&lt;p&gt;A variant of the previous approach would be to use adjustable delay lines on
the inputs instead of phase adjusting the clock.&lt;/p&gt;

&lt;h3 id=&quot;the-hwsw-interface&quot;&gt;The HW/SW interface&lt;/h3&gt;

&lt;p&gt;To control the USB block some kind of hardware/software interface needs to be
created. I have chosen the simplest possible design that came to mind.&lt;/p&gt;

&lt;p&gt;Each endpoint has an 8-byte buffer in RAM. Starting at RAM address zero come
the 16 OUT endpoints, immediately followed by the 16 IN endpoints. In total 256
bytes of RAM are used for endpoint buffers.&lt;/p&gt;
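&lt;p&gt;The buffer layout can be sketched as a small address calculation. This is only
an illustration: the function and constant names are hypothetical, and adding the
offset to the CPU-side RAM base (0x10000000 in the SoC memory map) assumes that
RAM address zero maps directly to that base.&lt;/p&gt;

```python
EP_BUF_SIZE = 8        # bytes per endpoint buffer (from the text)
NUM_EP = 16            # endpoints per direction (from the text)
RAM_BASE = 0x10000000  # CPU-side RAM base (assumed to map to RAM address zero)


def ep_buf_offset(direction, ep):
    """Byte offset of an endpoint buffer within RAM, OUT buffers first."""
    if direction not in ("OUT", "IN"):
        raise ValueError("direction must be 'OUT' or 'IN'")
    if ep not in range(NUM_EP):
        raise ValueError("endpoint must be in the range 0..15")
    index = ep if direction == "OUT" else NUM_EP + ep
    return index * EP_BUF_SIZE


def ep_buf_cpu_addr(direction, ep):
    """Address of the same buffer as seen from the CPU (hypothetical helper)."""
    return RAM_BASE + ep_buf_offset(direction, ep)


print(hex(ep_buf_cpu_addr("OUT", 0)), hex(ep_buf_cpu_addr("IN", 1)))
```

&lt;p&gt;The last IN buffer starts at offset 248 and ends at 255, which matches the 256
bytes stated above.&lt;/p&gt;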

&lt;p&gt;In addition the following registers are exposed to the CPU.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Register&lt;/th&gt;
      &lt;th&gt;Access&lt;/th&gt;
      &lt;th&gt;Address&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_ADDR&lt;/td&gt;
      &lt;td&gt;R/W&lt;/td&gt;
      &lt;td&gt;0x20000000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_ENDP_OWNER&lt;/td&gt;
      &lt;td&gt;R/W&lt;/td&gt;
      &lt;td&gt;0x20000004&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_CTRL&lt;/td&gt;
      &lt;td&gt;W&lt;/td&gt;
      &lt;td&gt;0x20000008&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_IN_SIZE_0_7&lt;/td&gt;
      &lt;td&gt;R/W&lt;/td&gt;
      &lt;td&gt;0x2000000C&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_IN_SIZE_8_15&lt;/td&gt;
      &lt;td&gt;R/W&lt;/td&gt;
      &lt;td&gt;0x20000010&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_DATA_TOGGLE&lt;/td&gt;
      &lt;td&gt;R/W&lt;/td&gt;
      &lt;td&gt;0x20000014&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_OUT_SIZE_0_7&lt;/td&gt;
      &lt;td&gt;R&lt;/td&gt;
      &lt;td&gt;0x20000018&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;R_USB_OUT_SIZE_8_15&lt;/td&gt;
      &lt;td&gt;R&lt;/td&gt;
      &lt;td&gt;0x2000001C&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;I have not bothered documenting them in detail yet but essentially it is as follows.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;R_USB_ADDR&lt;/em&gt; - is the 7-bit device address.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_ENDP_OWNER&lt;/em&gt; - The 16 low bits correspond to the 16 OUT endpoints and the 16 high bits correspond to the 16 IN endpoints. A set bit indicates that the corresponding endpoint buffer is handed over to the USB block. This means that the corresponding endpoint will accept one IN/OUT packet (with data) and ACK, after which the bit will be cleared and the buffer is handed over to the CPU. When the CPU owns a buffer the USB block will respond with NAK to all IN/OUT+DATA packets.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_CTRL&lt;/em&gt; - Control pull-ups for attach.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_IN_SIZE_0_7&lt;/em&gt; - 4 bits per endpoint indicate the number of bytes in the corresponding buffer.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_IN_SIZE_8_15&lt;/em&gt; - 4 bits per endpoint indicate the number of bytes in the corresponding buffer.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_DATA_TOGGLE&lt;/em&gt; - The 16 low bits select the data toggle (DATA0/DATA1) to be expected for the 16 OUT endpoints and the 16 high bits select the data toggle to be sent for the 16 IN endpoints.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_OUT_SIZE_0_7&lt;/em&gt; - 4 bits per endpoint indicate the number of bytes in the corresponding buffer.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;R_USB_OUT_SIZE_8_15&lt;/em&gt; - 4 bits per endpoint indicate the number of bytes in the corresponding buffer.&lt;/li&gt;
&lt;/ul&gt;
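&lt;p&gt;The register map and bit fields above can be modelled roughly as follows. The
register addresses come from the table, but the exact bit positions are
assumptions made for illustration: I assume that bit n corresponds to endpoint n
within each 16-bit half, and that the lowest nibble of a size register belongs to
the lowest numbered endpoint. The helper names are hypothetical.&lt;/p&gt;

```python
USB_REG_BASE = 0x20000000
REGS = ["R_USB_ADDR", "R_USB_ENDP_OWNER", "R_USB_CTRL", "R_USB_IN_SIZE_0_7",
        "R_USB_IN_SIZE_8_15", "R_USB_DATA_TOGGLE", "R_USB_OUT_SIZE_0_7",
        "R_USB_OUT_SIZE_8_15"]
# Registers are 32 bits wide and laid out back to back, as in the table.
REG_ADDR = {name: USB_REG_BASE + 4 * i for i, name in enumerate(REGS)}


def owner_bit(direction, ep):
    """Bit index in R_USB_ENDP_OWNER or R_USB_DATA_TOGGLE.

    OUT endpoints use the low 16 bits and IN endpoints the high 16 bits.
    Assumption: bit n within each half corresponds to endpoint n.
    """
    return ep if direction == "OUT" else 16 + ep


def size_field(reg_value, ep):
    """Extract the 4-bit byte count for endpoint ep from a packed size register.

    Assumption: endpoint (ep % 8) occupies the nibble at position (ep % 8).
    """
    return (reg_value // 16 ** (ep % 8)) % 16


print(hex(REG_ADDR["R_USB_DATA_TOGGLE"]), owner_bit("IN", 1), size_field(0x860, 1))
```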

&lt;p&gt;Of course a primitive interface like this requires a fair amount of CPU
intervention and if one wants to offload the CPU and achieve higher performance
a lot of this handling could be automated by the USB block itself.&lt;/p&gt;

&lt;h3 id=&quot;soc&quot;&gt;SoC&lt;/h3&gt;

&lt;p&gt;The resulting SoC consists of the USB device block, a PicoRV32 RISC-V CPU, RAM
and ROM. From the CPU’s side the memory map is as follows.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Base&lt;/th&gt;
      &lt;th&gt;Size&lt;/th&gt;
      &lt;th&gt;Memory&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;0x00000000&lt;/td&gt;
      &lt;td&gt;16KB&lt;/td&gt;
      &lt;td&gt;ROM&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0x10000000&lt;/td&gt;
      &lt;td&gt;4KB&lt;/td&gt;
      &lt;td&gt;RAM&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;0x20000000&lt;/td&gt;
      &lt;td&gt;32B&lt;/td&gt;
      &lt;td&gt;USB control &amp;amp; status registers&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The USB device block’s endpoint buffers reside in RAM and the block has
priority over the CPU when accessing it. RAM is constructed of four 8-bit banks.
The CPU is connected by a 32-bit bus while the USB block has an 8-bit bus. As
mentioned, USB has priority when accessing RAM, but in practice this should
result in minimal stall cycles for the CPU as its clock rate is significantly higher
than the rate at which USB will read/write bytes.&lt;/p&gt;

&lt;h2 id=&quot;simulation&quot;&gt;Simulation&lt;/h2&gt;

&lt;p&gt;The simulation environment is based around Verilator and a set of C++ classes
to build, manipulate and dissect USB packets.&lt;/p&gt;

&lt;h3 id=&quot;usb-pack-gen&quot;&gt;usb-pack-gen&lt;/h3&gt;

&lt;p&gt;The USB packet generation and manipulation code is found in
&lt;a href=&quot;https://github.com/markus-zzz/usbdev/blob/dev/sim/usb-pack-gen.h&quot;&gt;sim/usb-pack-gen.h&lt;/a&gt;
and
&lt;a href=&quot;https://github.com/markus-zzz/usbdev/blob/dev/sim/usb-pack-gen.cpp&quot;&gt;sim/usb-pack-gen.cpp&lt;/a&gt;. It allows both encoding and decoding of USB packets. In essence it consists of three layers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;UsbPacket - Base class for the various USB packet types (SETUP, IN, OUT, DATA0, DATA1, ACK, NAK) that allows easy high-level manipulation. Each packet type is a class derived from UsbPacket.&lt;/li&gt;
  &lt;li&gt;UsbBitVector - A sequence of USB bits.&lt;/li&gt;
  &lt;li&gt;UsbSymbolVector - A sequence of USB symbols (J,K,SE0,SE1).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are methods for translating in both directions performing the necessary
steps such as CRC calculation, bit-stuffing and NRZI encoding.&lt;/p&gt;
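&lt;p&gt;For readers who have not seen these steps before, here is a minimal Python
sketch of bit-stuffing and NRZI encoding in both directions. It is not the code
from usb-pack-gen (that is C++), and the function names are made up for
illustration. It follows the USB rules that a zero is inserted after six
consecutive ones, and that a zero toggles the line state while a one keeps it.&lt;/p&gt;

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out, ones = [], 0
    for b in bits:
        out.append(b)
        ones = ones + 1 if b == 1 else 0
        if ones == 6:
            out.append(0)
            ones = 0
    return out


def bit_unstuff(bits):
    """Drop the bit that follows six consecutive 1s (inverse of bit_stuff)."""
    out, ones, skip = [], 0, False
    for b in bits:
        if skip:
            skip, ones = False, 0  # b is the stuffed bit, discard it
            continue
        out.append(b)
        ones = ones + 1 if b == 1 else 0
        if ones == 6:
            skip = True
    return out


def nrzi_encode(bits, idle="J"):
    """A 0 toggles the line state and a 1 keeps it; the line idles in J."""
    state, out = idle, []
    for b in bits:
        if b == 0:
            state = "K" if state == "J" else "J"
        out.append(state)
    return out


def nrzi_decode(symbols, idle="J"):
    """Recover the bits: an unchanged state is a 1, a changed state is a 0."""
    prev, out = idle, []
    for s in symbols:
        out.append(1 if s == prev else 0)
        prev = s
    return out


sync = [0, 0, 0, 0, 0, 0, 0, 1]
print("".join(nrzi_encode(sync)))
```

&lt;p&gt;Encoding the eight sync bits from the idle state gives the familiar KJKJKJKK
pattern that starts every full-speed packet.&lt;/p&gt;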

&lt;h3 id=&quot;test-suite&quot;&gt;Test-suite&lt;/h3&gt;

&lt;p&gt;The test-suite is invoked by the &lt;code class=&quot;highlighter-rouge&quot;&gt;sim/runner.pl&lt;/code&gt; script that will execute all
tests found in &lt;code class=&quot;highlighter-rouge&quot;&gt;sim/tests&lt;/code&gt; (or the ones given as arguments). Each
&lt;code class=&quot;highlighter-rouge&quot;&gt;sim/tests/test_XXX.c&lt;/code&gt; consists of both firmware code to be compiled for the
RISC-V CPU as well as usb-pack-gen code compiled into the Verilator based
simulation environment.&lt;/p&gt;

&lt;p&gt;To get a feel for what a test looks like I suggest studying
&lt;a href=&quot;https://github.com/markus-zzz/usbdev/blob/dev/sim/tests/test_003.c&quot;&gt;sim/tests/test_003.c&lt;/a&gt;.
It is probably a good idea to start with running the test and looking at the
decoded USB traffic.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./runner.pl tests/test_003.c
sigrok-cli -i test_003.sim.sr -P usb_signalling:dp=0:dm=1,usb_packet | awk '/usb_packet-1: [^:]+$/{ print $0 }'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This should give us something like the following (but without the host/device
annotations I added manually).&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;host   : OUT ADDR 0 EP 0
host   : DATA0 [ 23 64 54 AF CA FE ]
device : ACK
host   : IN ADDR 0 EP 1
device : NAK
host   : IN ADDR 0 EP 1
device : DATA0 [ 24 65 55 B0 CB FF ]
host   : ACK
host   : IN ADDR 0 EP 1
device : NAK
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;So the host sends six bytes of data to endpoint zero and the device
acknowledges it. The host tries to read from endpoint one but the device is
busy (the dummy loop reading the address register) so it responds with NAK
(not acknowledge). The host tries to read again and this time the device responds
with the byte array with each element incremented by one. The host acknowledges
the received data. The host tries to read yet again but now the endpoint buffer
has been handed back to the CPU so the USB block responds with NAK.&lt;/p&gt;
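&lt;p&gt;The exchange above follows directly from the buffer ownership rule of the HW/SW
interface. A toy Python model makes this explicit. It is a deliberate
simplification and not taken from the test case: the class and method names are
hypothetical, ownership is kept in a set instead of register bits, and the host
handshake after an IN packet is not modelled.&lt;/p&gt;

```python
class UsbBlockModel:
    """Toy model: the USB block only serves endpoints whose buffer it owns."""

    def __init__(self):
        self.usb_owned = set()  # stands in for the set bits of R_USB_ENDP_OWNER
        self.buffers = {}

    def host_out(self, ep, data):
        key = ("OUT", ep)
        if key not in self.usb_owned:
            return "NAK"
        self.buffers[key] = list(data)
        self.usb_owned.discard(key)  # buffer is handed back to the CPU
        return "ACK"

    def host_in(self, ep):
        key = ("IN", ep)
        if key not in self.usb_owned:
            return "NAK"
        self.usb_owned.discard(key)  # buffer is handed back to the CPU
        return self.buffers[key]


usb = UsbBlockModel()
usb.usb_owned.add(("OUT", 0))  # firmware arms OUT endpoint 0
trace = [usb.host_out(0, [0x23, 0x64, 0x54, 0xAF, 0xCA, 0xFE])]
trace.append(usb.host_in(1))   # firmware has not armed IN endpoint 1 yet

# Firmware: increment each received byte and hand IN endpoint 1 to the USB block.
usb.buffers[("IN", 1)] = [(b + 1) % 256 for b in usb.buffers[("OUT", 0)]]
usb.usb_owned.add(("IN", 1))
trace.append(usb.host_in(1))   # data is returned
trace.append(usb.host_in(1))   # buffer is back with the CPU again
print(trace)
```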

&lt;p&gt;Now with this in mind it should be rather clear what is going on in the test
case.&lt;/p&gt;

&lt;h3 id=&quot;test-suite-artifacts&quot;&gt;Test-suite artifacts&lt;/h3&gt;

&lt;p&gt;When run, the test-suite outputs several useful artifacts. These are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;test_XXX.comp.log - Any messages (errors and warnings) from the firmware compile.&lt;/li&gt;
  &lt;li&gt;test_XXX.sim.log - Log and debug printouts from the simulation.&lt;/li&gt;
  &lt;li&gt;test_XXX.sim.vcd - RTL simulation waveform in VCD format.&lt;/li&gt;
  &lt;li&gt;test_XXX.sim.sr - Captured USB signaling in sigrok’s format.&lt;/li&gt;
  &lt;li&gt;test_XXX.elf - Firmware code for the RISC-V CPU.&lt;/li&gt;
  &lt;li&gt;test_XXX.so - Shared object with test-case code to be loaded into the simulator.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;a-real-world-example&quot;&gt;A real world example&lt;/h2&gt;

&lt;p&gt;I thought we would finish this post by using some work-in-progress driver code
(&lt;a href=&quot;https://github.com/markus-zzz/usbdev/blob/dev/sw/usb-dev-driver.c&quot;&gt;sw/usb-dev-driver.c&lt;/a&gt;)
to perform the first steps of the USB enumeration process with the ULX3S
connected to my Linux workstation. The logic analyzer captured the following
&lt;a href=&quot;https://www.zzzconsulting.se/download/sigrok-usb/ulx3s-usbdev-1.sr&quot;&gt;trace&lt;/a&gt; (5M samples at 5 MHz
but only 68KB file size with sigrok’s native format). The reader is urged to
decode it by at least using&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sigrok-cli -i ulx3s-usbdev-1.sr -P usb_signalling:dp=1:dm=0,usb_packet | awk '/usb_packet-1: [^:]+$/{ print $0 }'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;but preferably also in PulseView to better see reset signaling etc.&lt;/p&gt;

&lt;p&gt;In summary the following happens in the trace. The host resets the bus and
reads the device descriptor. The host resets the bus again and sets the device’s
address to 23. The host reads the device descriptor once more, then reads
the configuration descriptor (first nine bytes only, to figure out its total size),
and then reads the configuration descriptor in full (which would include
interface and endpoint descriptors but in our case there are none so the size
is still 9 bytes). Finally the host tries to set the configuration of the
device but this is not yet implemented in the firmware and the device responds
with NAK indefinitely.&lt;/p&gt;

&lt;p&gt;On my Linux workstation the &lt;code class=&quot;highlighter-rouge&quot;&gt;dmesg&lt;/code&gt; log contained the following lines:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[ 2980.864737] usb 1-5: new low-speed USB device number 16 using xhci_hcd
[ 2981.014361] usb 1-5: config 0 has no interfaces?
[ 2981.014369] usb 1-5: New USB device found, idVendor=1234, idProduct=5678
[ 2981.014372] usb 1-5: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 2981.014647] usb 1-5: config 0 descriptor??
[ 2986.192817] usb 1-5: can't set config #0, error -110
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and the output of &lt;code class=&quot;highlighter-rouge&quot;&gt;lsusb -v&lt;/code&gt; contained&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Bus 001 Device 016: ID 1234:5678 Brain Actuated Technologies
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass       255 Vendor Specific Subclass
  bDeviceProtocol       255 Vendor Specific Protocol
  bMaxPacketSize0         8
  idVendor           0x1234 Brain Actuated Technologies
  idProduct          0x5678
  bcdDevice            1.00
  iManufacturer           0
  iProduct                0
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength            9
    bNumInterfaces          0
    bConfigurationValue     0
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;As can be seen, my made-up &lt;em&gt;idVendor&lt;/em&gt; actually corresponds to a registered vendor,
which can be confirmed in &lt;a href=&quot;http://www.linux-usb.org/usb.ids&quot;&gt;Linux usb.ids&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Slightly strange, though, is that &lt;code class=&quot;highlighter-rouge&quot;&gt;lsusb&lt;/code&gt; claims that the device address is 16
while the trace clearly contains a &lt;em&gt;SET_ADDRESS&lt;/em&gt; and subsequent use of
address 23.&lt;/p&gt;
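&lt;p&gt;The two-step read of the configuration descriptor seen in the trace can be
sketched in a few lines of Python. Note that the nine bytes below are
reconstructed by me from the lsusb output (for instance 100mA corresponds to a
bMaxPower value of 50 since the unit is 2mA), they are not copied from the trace,
and the helper function is just a stand-in for a real control transfer.&lt;/p&gt;

```python
# Configuration descriptor reconstructed from the lsusb -v output (assumption).
CONFIG_DESC = bytes([0x09, 0x02, 0x09, 0x00, 0x00, 0x00, 0x00, 0x80, 0x32])


def get_config_descriptor(length):
    """Stand-in for a GET_DESCRIPTOR(CONFIGURATION) read of at most length bytes."""
    return CONFIG_DESC[:length]


# Step 1: read only the 9-byte header to learn wTotalLength (bytes 2 and 3).
header = get_config_descriptor(9)
total_length = int.from_bytes(header[2:4], "little")

# Step 2: read the complete set of descriptors using that length.
full = get_config_descriptor(total_length)
print(total_length, full.hex())
```

&lt;p&gt;With no interface or endpoint descriptors the second read returns the same nine
bytes as the first.&lt;/p&gt;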

&lt;h2 id=&quot;next-steps&quot;&gt;Next steps&lt;/h2&gt;

&lt;p&gt;The logical next step would be to continue working on the driver code until
the enumeration process completes successfully, with support for all required
requests. Likely some RTL bugs will show up in the process but we will try to
deal with them when they do.&lt;/p&gt;

&lt;p&gt;That is it for today. As always if you have questions or feedback - leave a
comment below!&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><summary type="html">In this post we pick up where we left off last time and start looking at the design and implementation of the USB device that I have been working on. First things first though, the code for the entire project can be found in this github repository. The reader is advised to have that readily available.</summary></entry><entry><title type="html">Sigrok and thoughts on building a USB device</title><link href="https://www.zzzconsulting.se/2020/04/25/sigrok-usb-dev.html" rel="alternate" type="text/html" title="Sigrok and thoughts on building a USB device" /><published>2020-04-25T00:00:00+02:00</published><updated>2020-04-25T00:00:00+02:00</updated><id>https://www.zzzconsulting.se/2020/04/25/sigrok-usb-dev</id><content type="html" xml:base="https://www.zzzconsulting.se/2020/04/25/sigrok-usb-dev.html">&lt;p&gt;Last summer I bought a &lt;a href=&quot;https://www.dreamsourcelab.com/product/dslogic-plus/&quot;&gt;DSLogic
Plus&lt;/a&gt; USB-based Logic
Analyzer. About six months ago I tried it for the first time, and today I hope to
finish this post describing the experience. The thing comes with its own
analyzer software running on the PC called
&lt;a href=&quot;https://www.dreamsourcelab.com/download/&quot;&gt;DSView&lt;/a&gt;, but I never bothered trying
that and instead went with the better known &lt;a href=&quot;https://sigrok.org/&quot;&gt;sigrok&lt;/a&gt; (of
which DSView is a derivative).&lt;/p&gt;

&lt;h2 id=&quot;setting-up-sigrok-for-the-dslogic-plus&quot;&gt;Setting up sigrok for the DSLogic Plus&lt;/h2&gt;

&lt;p&gt;Building from source seemed rather complicated with many dependencies to be
met, but luckily they also provide self-contained
&lt;a href=&quot;https://appimage.org/&quot;&gt;AppImage&lt;/a&gt; binaries for download
&lt;a href=&quot;https://sigrok.org/wiki/Downloads&quot;&gt;here&lt;/a&gt;. So I picked up a &lt;em&gt;PulseView&lt;/em&gt; and a
&lt;em&gt;sigrok-cli&lt;/em&gt; binary.&lt;/p&gt;

&lt;p&gt;Although the binaries are self-contained, two additional steps were required to
get things going.&lt;/p&gt;

&lt;p&gt;First, setting up the udev rules&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git://sigrok.org/libsigrok
sudo cp libsigrok/contrib/*.rules /etc/udev/rules.d/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and second, for legal reasons, the self-contained sigrok binaries do not
contain the necessary firmware for the DSLogic Plus so we need to grab a script
that fetches it separately&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git://sigrok.org/sigrok-util
sudo sigrok-util/firmware/dreamsourcelab-dslogic/sigrok-fwextract-dreamsourcelab-dslogic
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;test-capture&quot;&gt;Test capture&lt;/h2&gt;

&lt;p&gt;As a test I attach the analyzer to the USB &lt;strong&gt;D+&lt;/strong&gt; and &lt;strong&gt;D-&lt;/strong&gt; lines of an STM32-based thumb device and
capture the signaling that occurs when the device is connected to the bus.&lt;/p&gt;

&lt;p&gt;For background on USB 2.0 the spec summary &lt;a href=&quot;https://www.beyondlogic.org/usbnutshell/usb1.shtml&quot;&gt;USB in a
NutShell&lt;/a&gt; is a highly
recommended read. The similar summary &lt;a href=&quot;http://www.usbmadesimple.co.uk/&quot;&gt;USB Made
Simple&lt;/a&gt; also provides very valuable
information. For all the gory details the full specification is available
&lt;a href=&quot;https://www.usb.org/document-library/usb-20-specification&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Sigrok provides a very neat protocol decoder for USB (as well as for many other
protocols).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/sigrok-usb/sigrok-usb-decode.png&quot; alt=&quot;PulseView USB decode&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It is interesting to observe what happens when a new device is connected:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Newly connected device has address 0.
host   : SETUP ADDR 0 EP 0
host   : DATA0 [ 80 06 00 01 00 00 40 00 ]
device : ACK
host   : IN ADDR 0 EP 0
device : NAK
host   : IN ADDR 0 EP 0
device : DATA1 [ 12 01 00 02 02 02 00 40 83 04 40 57 00 02 01 02 03 01 ]
host   : ACK

host   : OUT ADDR 0 EP 0
host   : DATA1 [ ]
device : NAK
host   : OUT ADDR 0 EP 0
host   : DATA1 [ ]
device : ACK

host: &amp;lt;RESET for 50ms&amp;gt;

# SET_ADDRESS (=0x05) to 10 (=0x0A)
host   : SETUP ADDR 0 EP 0
host   : DATA0 [ 00 05 0A 00 00 00 00 00 ]
device : ACK
host   : IN ADDR 0 EP 0
device : NAK
host   : IN ADDR 0 EP 0
device : DATA1 [ ]
host   : ACK

# From this point on the device has address 10.

host   : SETUP ADDR 10 EP 0
host   : DATA0 [ 80 06 00 01 00 00 12 00 ]
device : ACK
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;What appears to be going on here is that the host begins by reading the &lt;em&gt;device
descriptor&lt;/em&gt; containing, among other things, the highest USB version this device
supports, &lt;em&gt;idVendor&lt;/em&gt; and &lt;em&gt;idProduct&lt;/em&gt;. If these are satisfactory the host
goes ahead and drives a long reset followed by assigning the device a new
address.&lt;/p&gt;
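&lt;p&gt;The eight data bytes that follow each SETUP token in the trace have a fixed
layout defined by the USB specification, so they are easy to pick apart by hand
or with a few lines of Python. The sketch below is only an illustration with a
made-up function name, decoding two of the SETUP payloads from the trace.&lt;/p&gt;

```python
def parse_setup(data):
    """Split an 8-byte SETUP payload into its standard little-endian fields."""
    if len(data) != 8:
        raise ValueError("a SETUP payload is always 8 bytes")
    return {
        "bmRequestType": data[0],
        "bRequest": data[1],
        "wValue": int.from_bytes(data[2:4], "little"),
        "wIndex": int.from_bytes(data[4:6], "little"),
        "wLength": int.from_bytes(data[6:8], "little"),
    }


REQUEST_NAMES = {0x05: "SET_ADDRESS", 0x06: "GET_DESCRIPTOR"}

# Payloads copied from the trace above.
get_desc = parse_setup(bytes.fromhex("80 06 00 01 00 00 40 00"))
set_addr = parse_setup(bytes.fromhex("00 05 0A 00 00 00 00 00"))

for setup in (get_desc, set_addr):
    print(REQUEST_NAMES[setup["bRequest"]], hex(setup["wValue"]), setup["wLength"])
```

&lt;p&gt;For GET_DESCRIPTOR the high byte of wValue is the descriptor type (1 for the
device descriptor) and wLength is the number of bytes requested, here 64. For
SET_ADDRESS the new address, 10, is carried in wValue.&lt;/p&gt;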

&lt;p&gt;A simple &lt;code class=&quot;highlighter-rouge&quot;&gt;lsusb&lt;/code&gt; confirms that &lt;em&gt;idVendor&lt;/em&gt; and &lt;em&gt;idProduct&lt;/em&gt; appear in the
response message retrieved before the reset.&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Bus 001 Device 010: ID 0483:5740 STMicroelectronics STM32F407
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
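&lt;p&gt;The same check can be done by decoding the 18 bytes of the DATA1 packet from the
trace. The field layout is that of the standard device descriptor; the helper
name in this short Python sketch is made up for illustration.&lt;/p&gt;

```python
# Device descriptor bytes copied from the DATA1 packet in the trace above.
raw = bytes.fromhex("12 01 00 02 02 02 00 40 83 04 40 57 00 02 01 02 03 01")


def le16(data, offset):
    """Read a 16-bit little-endian value starting at offset."""
    return int.from_bytes(data[offset:offset + 2], "little")


desc = {
    "bLength": raw[0],
    "bDescriptorType": raw[1],
    "bcdUSB": le16(raw, 2),
    "bDeviceClass": raw[4],
    "bDeviceSubClass": raw[5],
    "bDeviceProtocol": raw[6],
    "bMaxPacketSize0": raw[7],
    "idVendor": le16(raw, 8),
    "idProduct": le16(raw, 10),
    "bcdDevice": le16(raw, 12),
    "iManufacturer": raw[14],
    "iProduct": raw[15],
    "iSerialNumber": raw[16],
    "bNumConfigurations": raw[17],
}

print("ID {:04x}:{:04x}".format(desc["idVendor"], desc["idProduct"]))
```

&lt;p&gt;This prints the same 0483:5740 pair that lsusb reports.&lt;/p&gt;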

&lt;h2 id=&quot;thoughts-on-building-a-usb-device&quot;&gt;Thoughts on building a USB device&lt;/h2&gt;

&lt;p&gt;If you are so inclined you might now wonder what kind of digital circuitry it
would take to build a USB device. Common sense says this would be a rather
significant undertaking so it seems wise to begin with some preparatory
considerations.&lt;/p&gt;

&lt;h3 id=&quot;ulx3s-setup&quot;&gt;ULX3S setup&lt;/h3&gt;

&lt;p&gt;As usual the platform for my experiment will be the excellent
&lt;a href=&quot;https://radiona.org/ulx3s/&quot;&gt;ULX3S&lt;/a&gt;. The board has two USB micro female
connectors where the second (designated US2) is wired directly to the ECP5
FPGA.&lt;/p&gt;

&lt;h4 id=&quot;schematics&quot;&gt;Schematics&lt;/h4&gt;

&lt;p&gt;Full schematics are found
&lt;a href=&quot;https://github.com/emard/ulx3s/blob/master/doc/schematics.pdf&quot;&gt;here&lt;/a&gt; but for
the sake of this discussion I have extracted the relevant parts.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/sigrok-usb/ulx3s-schematics-usb-1.png&quot; alt=&quot;ULX3S US2&quot; /&gt;&lt;/p&gt;

&lt;p&gt;What is interesting to note here is that the &lt;em&gt;USB_FPGA_D+&lt;/em&gt; and &lt;em&gt;USB_FPGA_D-&lt;/em&gt;
pair is connected to the FPGA twice. One pair is connected to a differential
IO cell and the second pair is connected to single-ended IO cells. The reason is
that USB requires us to both drive and sample differentially as well as single-ended.
It is still a bit odd though, as the ECP5 docs seem to suggest that all of that could
be done within one IO cell (pair).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/sigrok-usb/ulx3s-schematics-usb-2.png&quot; alt=&quot;ULX3S US2&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The second interesting part is that the board has FPGA controllable pull-ups.
This allows us to attach to and detach from the USB without physically touching any
cables, as well as to choose whether we want to identify as a &lt;em&gt;full-speed&lt;/em&gt; or
&lt;em&gt;low-speed&lt;/em&gt; device.&lt;/p&gt;

&lt;h4 id=&quot;clocking&quot;&gt;Clocking&lt;/h4&gt;

&lt;p&gt;USB &lt;em&gt;full-speed&lt;/em&gt; is 12 Mbit/s and USB &lt;em&gt;low-speed&lt;/em&gt; is 1.5 Mbit/s. The ULX3S board
has a 25 MHz crystal oscillator.&lt;/p&gt;

&lt;p&gt;For &lt;em&gt;full-speed&lt;/em&gt;, if we were to sample at exactly the right spot a 12 MHz clock
should suffice, but in reality this is not possible (without phase adjusting
the clock). So instead one typically settles for oversampling by a factor of
four (so a 48 MHz clock would be needed).&lt;/p&gt;

&lt;p&gt;Now a single PLL does not directly turn a 25 MHz clock into a 48 MHz clock (nor any other
reasonable multiple of 12 MHz). It can however produce a 15 MHz clock that
can be used for &lt;em&gt;low-speed&lt;/em&gt; (oversampling by a factor of 10).&lt;/p&gt;

&lt;p&gt;Eventually though, for &lt;em&gt;full-speed&lt;/em&gt; we can use two PLLs in cascade, configured as
follows, to reach exactly 48 MHz.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;markus@workstation:~$ ecppll -i 25 -o 60
Pll parameters:
Refclk divisor: 5
Feedback divisor: 12
clkout0 divisor: 10
clkout0 frequency: 60 MHz
VCO frequency: 600
markus@workstation:~$ ecppll -i 60 -o 48
Pll parameters:
Refclk divisor: 5
Feedback divisor: 4
clkout0 divisor: 12
clkout0 frequency: 48 MHz
VCO frequency: 576
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
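&lt;p&gt;The divider values printed by ecppll are easy to sanity check. The small Python
sketch below assumes the usual relationship where the output frequency is the
input divided by the reference divisor and multiplied by the feedback divisor,
and where the VCO runs at the output frequency times the output divisor. That
relationship is my reading of the numbers above and the function name is made up,
so treat it as an illustration and not as a description of the ECP5 PLL in
general.&lt;/p&gt;

```python
from fractions import Fraction


def pll(f_in_mhz, refclk_div, feedback_div, clkout_div):
    """Return (output, VCO) frequency in MHz for one PLL stage.

    Assumes f_out = f_in / refclk_div * feedback_div and
    f_vco = f_out * clkout_div, consistent with the ecppll printout.
    """
    f_out = Fraction(f_in_mhz) / refclk_div * feedback_div
    return f_out, f_out * clkout_div


stage1 = pll(25, 5, 12, 10)         # first PLL: 25 MHz in
stage2 = pll(stage1[0], 5, 4, 12)   # second PLL fed by the first

print("stage 1:", stage1[0], "MHz, VCO", stage1[1], "MHz")
print("stage 2:", stage2[0], "MHz, VCO", stage2[1], "MHz")
print("samples per full-speed bit:", stage2[0] / 12)
```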

&lt;h4 id=&quot;connecting-the-logic-analyzer&quot;&gt;Connecting the Logic Analyzer&lt;/h4&gt;

&lt;p&gt;A sturdy attachment point for the analyzer is desirable as it is rather
annoying having test hook clips constantly falling off the board as soon as it
is moved or touched even slightly.&lt;/p&gt;

&lt;p&gt;Luckily, since it is an FPGA we can route the &lt;em&gt;USB_FPGA_D+&lt;/em&gt; and &lt;em&gt;USB_FPGA_D-&lt;/em&gt;
pair “out on the other side” to make it available on the pin header. This is
mechanically stable and has the additional advantage of being isolated from the
actual signals so there is no risk of interference from the analyzer probes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/download/sigrok-usb/ulx3s-pin-header.jpg&quot; alt=&quot;ULX3S US2&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;sigrok-support&quot;&gt;Sigrok support&lt;/h3&gt;

&lt;p&gt;Having powerful tools such as the Sigrok suite and the Logic Analyzer will
prove invaluable for the task ahead. In fact, as we shall soon see, they will be
useful in more ways than just the obvious one.&lt;/p&gt;

&lt;p&gt;Capturing signaling/traffic between the USB host and the FPGA would be the
obvious application, and while this will eventually be its main use, we need to
get quite a lot of things working to reach that point.&lt;/p&gt;

&lt;p&gt;In the meantime we can capture authentic host signaling and feed it into RTL
simulation. This is easily accomplished with&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sigrok-cli --input-file capture.sr -O csv | awk 'BEGIN{FS=&quot;,&quot;}{print $2$3}' - &amp;gt; capture.vh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and initializing the Verilog array with &lt;em&gt;$readmemb&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;However, this introduces a sampling rate mismatch. The closest the
analyzer can come to 48 MHz is 50 MHz, which will be a problem for the USB
device’s clock recovery.&lt;/p&gt;
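
&lt;p&gt;To get a feeling for the size of the mismatch, here is a small Python
sketch of the arithmetic. It assumes the 50 MHz capture is replayed sample by
sample into a simulation clocked at 48 MHz, and that the host sends at the
nominal 12 Mbit/s. All names are made up for illustration.&lt;/p&gt;

```python
from fractions import Fraction

CAPTURE_HZ = 50_000_000   # analyzer sample rate
DEVICE_HZ = 48_000_000    # clock the simulated device runs at
BIT_RATE = 12_000_000     # nominal full-speed bit rate

# Samples per bit in the capture versus what a 4x oversampling device expects.
samples_per_bit_captured = Fraction(CAPTURE_HZ, BIT_RATE)
samples_per_bit_expected = Fraction(DEVICE_HZ, BIT_RATE)
drift_per_bit = samples_per_bit_captured - samples_per_bit_expected

# Bit rate the device perceives when each sample is taken as one 48 MHz tick.
apparent_bit_rate = DEVICE_HZ / samples_per_bit_captured
rate_error = 1 - apparent_bit_rate / BIT_RATE

print(drift_per_bit)       # 1/6
print(apparent_bit_rate)   # 11520000
print(rate_error)          # 1/25
```

&lt;p&gt;So every bit slips by a sixth of a sample and the host appears to run 4%
slow, which is far outside the ±0.25% data rate tolerance that the USB 2.0
specification gives for full-speed.&lt;/p&gt;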

&lt;p&gt;To avoid the sampling rate mismatch, another option is to have sigrok run the
low-level &lt;em&gt;usb_signalling&lt;/em&gt; decoder to reliably extract the &lt;em&gt;J&lt;/em&gt;, &lt;em&gt;K&lt;/em&gt;, &lt;em&gt;SE0&lt;/em&gt; and
&lt;em&gt;SE1&lt;/em&gt; symbols for us. To do this we use the following Python snippet to translate its output&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;#!/usr/bin/env python
&lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;sys&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;re&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;pattern_j&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;^usb_signalling-1: J$&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pattern_k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;^usb_signalling-1: K$&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pattern_se0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;^usb_signalling-1: SE0$&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;pattern_se1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;^usb_signalling-1: SE1$&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;line&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sys&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stdin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern_j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'01'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern_k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'10'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern_se0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'00'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pattern_se1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'11'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;and then run the entire chain&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sigrok-cli -i capture.sr -P usb_signalling:dp=1:dm=0 | ./usb_signalling2vh.py &amp;gt; capture.vh
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once our RTL simulation generates USB signaling we can feed that into sigrok
for decoding and verification. It is just a matter of having the simulation produce a
&lt;a href=&quot;https://en.wikipedia.org/wiki/Comma-separated_values&quot;&gt;CSV&lt;/a&gt; file and then running&lt;/p&gt;
&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sigrok-cli -i capture.csv -I csv:samplerate=48000000 -o capture.sr
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;where the file &lt;code class=&quot;highlighter-rouge&quot;&gt;capture.sr&lt;/code&gt; can be opened and graphically decoded in
&lt;em&gt;PulseView&lt;/em&gt;. Pretty awesome!&lt;/p&gt;
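
&lt;p&gt;As a rough illustration of such a CSV file, the Python sketch below expands
two-character symbols (the same encoding as in &lt;code class=&quot;highlighter-rouge&quot;&gt;capture.vh&lt;/code&gt;) into one row
per sample. It is only a stand-in for what the testbench itself would write.
The helper name, the four samples per bit and the column order are all
assumptions, and the column order has to match the channel assignment used
when decoding.&lt;/p&gt;

```python
import csv
import io

SAMPLES_PER_BIT = 4  # assumed: 48 MHz sample rate, 12 Mbit/s full-speed


def symbols_to_csv(symbols, out):
    """Write each two-character symbol as SAMPLES_PER_BIT two-column rows."""
    writer = csv.writer(out, lineterminator="\n")
    for symbol in symbols:
        for _ in range(SAMPLES_PER_BIT):
            # One column per line state bit, e.g. "01" becomes the row 0,1.
            writer.writerow(list(symbol))


# Hypothetical input: three symbols in the same encoding as capture.vh.
buf = io.StringIO()
symbols_to_csv(["01", "10", "00"], buf)
rows = buf.getvalue().splitlines()
print(len(rows))   # 12
print(rows[0])     # 0,1
```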

&lt;h2 id=&quot;wrap-up&quot;&gt;Wrap up&lt;/h2&gt;

&lt;p&gt;That concludes this post. Next time we will look closer at the design of our
USB device, its simulation environment and traffic generator. Questions or
feedback - leave a comment below!&lt;/p&gt;</content><author><name></name></author><category term="hardware" /><summary type="html">Last summer I bought a DSLogic Plus USB-based Logic Analyzer, about 6 months ago I tried it for the first time and today I hope to finish this post describing the experience. The thing comes with its own analyzer software running on the PC called DSView, but I never bothered trying that and instead went with the better known sigrok (of which DSView is a derivative).</summary></entry></feed>