Arch directory restructuring

From Miosix Wiki
Revision as of 15:44, 13 November 2023 by Daniele.cattaneo (talk | contribs) (Start of proposal)
Jump to navigation Jump to search

The arch directory is a mess with lots of copy-pasted code. Ideally it should be consistently and logically organized: each directory should have a documented purpose in order to leave as little doubt as possible about what goes where. Additionally, each directory's purpose makes sense in the overall structure, without overlaps between the scope of each directory. At the moment however we observe this is not the case. We believe that in order to make the future growth of Miosix simpler and faster, the right moment to rectify the situation is now.

Some definitions

Common sense suggests the following taxonomy of parts for MCU-equipped devices targeted by Miosix:

  • Board: The device itself being targeted. It's called a board because nobody has observed any MCU wired with point-to-point wiring yet. Includes the MCU itself plus any peripheral external to the MCU but driven by it.
  • Chip: The MCU itself. May come in RAM/OTP/ROM/Flash size variations, which are mostly ignored.
  • Chip family: A set of MCU types from the same manufacturer with the same microprocessor core but potentially different set of peripherals. While the available set of peripherals may vary, the behaviour and I/O ports of each kind of peripheral are the same across the entire family. Thanks to the brilliant ideas of the various marketing departments of each vendor, the classification here can get very fluid.
  • Core: The microprocessor core used by a given chip family.

Let's take as an example the classic stm32f4discovery board that is the most common development device for Miosix:

  • The board is the stm32f4discovery itself. It includes the chip, and miscellaneous peripherals such as an accelerometer and a Cirrus Logic audio chip. The STLink programming interface and the power supply are not considered relevant parts of the board for Miosix, as the MCU is not in control of what they do.
  • The chip is an STM32F407VGT6. The VGT6 suffix of the chip name specifies package (100 pin LQFP), Flash size (1MB) and temperature range options (see the datasheet at the "Ordering Information" page).
  • The chip family is STM32F4. This can be inferred by the fact that all available chips with this prefix in the part name share the same reference manual.
  • The core is an ARM Cortex M4, as stated in the datasheet. Roughly, ST buys the Verilog for the core from ARM and then takes care of the manufacturing. As far as we know, ARM's licence shouldn't allow ST to customize the core in any way, so all Cortex M4 cores should behave the same, even from other vendors.

What is an architecture?

One definition we did not give is the definition of what an architecture is. The reason is that there is no accepted definition for this term, especially for forum pundits and bloggers, but also in academia. The RISC school of thought (Patterson/Hennessy and their spawn) historically associated an architecture with a combination of what are more specifically called the instruction set architecture (ISA) and the microarchitecture of a core. The ISA is the overall structure and design of the instructions, as opposed to the instruction set proper which is the set of instructions accepted by a specific core. The microarchitecture is the overall internal design of the core itself.

Their thesis was that the microarchitecture design will follow naturally from the definition of the ISA, hence simpler ISAs (RISC) to implement always lead to more efficient and generally better cores. Therefore one can consider the architecture to be primarily determined by the ISA. However the industry has proven them wrong, and the most glaring example are Intel x86 chips, which have seen an increasing variety of microarchitectures implementing the same ISA (variously extended of course) with wildly varying levels of (non-)efficiency (in all possible ramifications of "efficiency"). Closer to Miosix's targets, the ARM ISA has been extended so much over the years that its core RISC design has now evolved to basically a traditional CISC ISA, but ARM chips still come in every possible size and flavour.

When we talk about x86 "architectures" or ARM "architectures", but do we mean desktop-grade chips or microcontrollers? Even though they have the same "architecture" the cores in these two classes of chips have nothing to do with each other, they don't even share the same instruction decoder probably. Hence our conclusion that the word "architecture" has now lost all useful meaning for our purposes, and is only useful for void parlor chitchat.

And then we come to Miosix which defines "architecture" (arch) as... a board. Which is not what anyone means when they talk about architecture of anything. For Miosix, an "architecture" is more like what in common parlance would be meant as a "platform".

Current structure

Here's a schematic diagram of how the Miosix arch directory (plus related directories) is configured. In the following / refers to the miosix directory at the top level of the repository.

arch                        Architecture-specific files
  common                      Pool of useful files. Very few files here are common to all archs.
    CMSIS                       Collection of ARM & vendor headers following the CMSIS standard
    core                        Board independent drivers required for the OS to function (roughly)
      atomic_ops_*.h              Core dependent
      cache_*.cpp/h               Core dependent. This is not required at all for Miosix to work but is nevertheless here
      *_os_timer.cpp              Chip family dependent
      endianness_*.h              Core dependent. Should have a generic (slow) implementation but it's missing.
      interrupts.h                Umbrella header for interrupts_*.h
      interrupts_*.h              Core dependent. Misleading name, this is for fault handling.
      memory_protection.h         Umbrella header for mpu_*.h. Required only for process mode. Name is not consistent with other umbrella headers, should have been "mpu.h"
      mpu_*.cpp/h                 Core dependent.
    drivers                     Miscellaneous drivers, which are "common" only very roughly
      dcc.cpp/h                   Core dependent (but common to all ARM cores apparently)
      *_adc.cpp/h                 Chip family dependent
      serial.h                    Umbrella header for *serial*.h
      *serial*.cpp/h              Chip family dependent
      sd_*.cpp/h                  Chip family dependent
      servo_*.cpp/h               Board dependent really, should go into utils
      *_hardware_rng.cpp/h        Chip family dependent
      *_rtc.cpp/h                 Chip family dependent
      *_sgm.cpp/h                 Chip family dependent
      *_wd.cpp/h                  Chip family dependent
      *_i2c.cpp/h                 Chip family dependent
  <board group name>          Board group specific files.
    common                      Common drivers and configuration for all the boards in the group
      arch_settings.h             Core dependent. Misleading name, defines requirements for ctxsave
      interfaces-impl             The actual drivers, most of them required by Miosix to function.
        arch_registers_impl.h     Chip family dependent. Includes the appropriate register definitions
        delays.cpp                Chip family dependent as it also depends on ROM/Flash speed. Actually application dependent (!) as cache/prefetch mechanisms can be configured but whatever.
        gpio_impl.cpp/h           Chip family dependent.
        portability_impl.h        Core dependent.
        portability.cpp           Core dependent.
      drivers                 Miscellaneous drivers not required by Miosix to function, with exceptions (atsam4l clock)
        *.cpp/h
    <board name>              Board/chip specific files.
      core                      Board/chip dependent code required for the OS to function.
        stage_1_boot.c/cpp/s      Chip dependent. Interrupt vector table, reset handler, clock tree setup, bss/data initialization
      interfaces_impl         Important board/chip specific drivers, most of them required by Miosix to function.
        bsp_impl.h              Board dependent. Useful board-specific definitions.
        bsp.cpp                 Board dependent. Useful board-specific functions.
        hwmapping.h             Board dependent. GPIO pin definitions.
        deep_sleep.cpp          Chip family dependent. Optional implementation of deep sleep primitives.
        delays.cpp              See above. Sometimes it's here, with no true reason why.
        arch_registers_impl.h     See above. Sometimes it's here, depends on chip.
        ...                     Some boards have extra miscellaneous drivers here
      *.h                     Some boards have chip registers definitions here (LPC2138)
      *.cfg                   Some boards have OpenOCD config files here
      *.ld                    Chip family specific. Linker files for different RAM/Flash configurations
interfaces                  Headers for hardware interfaces TO miosix (internals) and SOME interfaces OF miosix to the application (APIs). Only commonality is that they are all hardware dependent (while stuff in utils/kernel is ''generally'' hardware independent)
  arch_registers.h            API. Umbrella for "arch_registers_impl.h", selected through include search path manipulation.
  atomic_ops.h                API. Umbrella for "atomic_ops_*.h", selected through ifdefs.
  bsp.h                       API + internal prototypes (IRQbspInit, bspInit2). Umbrella for "bsp_impl.h", selected through include search path manipulation.
  deep_sleep.h                Internal prototypes only. Not an umbrella header.
  delays.h                    API. Not an umbrella header.
  endianness.h                API. Umbrella for "endianness_*.h", selected through ifdefs.
  gpio.h                      API. Umbrella for "gpio_impl.h", selected through include search path manipulation.
  os_timer.h                  Internal declarations. Not an umbrella header.
  portability.h               Internal declarations. Primitives for context switching and syscalls. Umbrella for "portability_impl.h", selected through include search path manipulation.
util                        Additional APIs made available for applications as utilities, built on top of kernel primitives or other drivers
  lcd44780.cpp/h              Driver for Hitachi-standard HD44780 character displays.
  software_i2c/spi.h          Bit-banged i2c/spi implementations. Arguably drivers.
  utils.cpp/h                 Memory/CPU usage profiling. Misleading name. Not a driver
  ...                         Other files are not related to hardware

<board group name> is defined as the name of the chip core, followed by an underscore, followed by the name of a chip family. <board name> is defined by the name of the chip, followed by an underscore, followed by a "common name" of the board. For commercial prototyping boards or specific devices, "common name" is derived from the name given by the manufacturer, or a name which is in common use with some exceptions (i.e. stm32f103 bluepill boards are named "breakout").

Some general patterns can be inferred:

  • Apart from minor exceptions, directories named core contain drivers required for the OS to function.
  • Directories named interfaces_impl generally contain implementations of things defined in /interfaces. Header files in interfaces_impl are always given the ``_impl`` suffix. C++ files are usually not given the "_impl" suffix but sometimes they have it (gpio_impl.cpp).
  • Directories named common contain code shared by multiple boards. No attempt is made to separate different levels of commonality apart from chip-family specific stuff which is grouped by board group.

Issues

Experience has shown this organization has the following problems:

  • What goes where is not well defined due to the vagueness of directory names
    • This leads to files being in the wrong place even though it's not obvious
      • servo_* is the most glaring example
      • The location of delays.cpp and arch_registers_impl.h is also inconsistent
  • The two level board group/board classification does not reflect reality, which leads to missing code sharing opportunities:
    • GPIO implementations are duplicated multiple times as they are common to multiple chip families and not to a single one.
    • Bootstrap code is duplicated multiple times as they are common to boards with the same chip and not common to the board.
    • Portability code (i.e. context save/restore primitives mostly) is duplicated multiple times as they are common to a given core and not to a chip family
  • Classifications sometimes privilege syntax rather than semantics, in an inconsistent fashion
    • Directories named interfaces_impl contain drivers just like directories named drivers, except that they included from /interfaces.
    • What is common and what is not loosely determined by the amount of code sharing but most stuff not under a common directory is duplicated at least twice

Of course, the root issue is the fluidity of the distinction between chip and chip family, and how each vendor manages and/or defines each family. For instance, STM32 chips all have similar USART (serial) implementations, but at least two versions can be distinguished through different register definitions and feature sets. Therefore the same overall driver can be shared across almost all STM32 chips, however the driver is littered with ifdefs to single out the specificity of each implementation. There is really no strict distinction to what can be found where, as the vendors can and do mix and match peripherals even amongst themselves. For example, some STM32 chips have Synopsis-designed USB interfaces which can also be found on chips from vendors that are not ST.

These considerations highlight that the solution to handling the issue of organizing platform-specific code in a consistent way to minimize code duplication is not to add hierarchy levels, but to remove them. This runs contrary from the naive consideration that "Board includes Chip, which belongs to Chip Family, which contains a Core".

A proposal

Pointing out what's wrong is easy, fixing it up is hard. Of course "what we should do" is a matter of opinion, and a valid answer to that could even be "nothing". If I were to do something, I would do the following:

With classification per device type

platform
  arch_registers.h
  atomic_ops.h
  bsp.h
  deep_sleep.h
  delays.h
  endianness.h
  gpio.h
  os_timer.h
  portability.h
  memory_protection.h
  interrupts.h
  mpu.h
  serial.h
  arch_registers
    CMSIS
    <chip/chip family>
      arch_registers_impl.h
      ....h
  core
    boot
      <chip>
        stage_1_boot.c/cpp/s
    atomic_ops
      atomic_ops_*.h
    os_timer
      *_os_timer.cpp
    endianness
      endianness_*.h
    faults
      interrupts_*.h
    mpu
      *_mpu.cpp/h
    portability
      <core>
        arch_settings.h
        portability_impl.h
        portability.cpp
    gpio
      <chip/chip family>
        gpio_impl.cpp/h
    delays
      <chip/chip family>
        delays.cpp
  drivers
    arm_dcc.cpp/h
    deep_sleep
      deep_sleep.cpp
    cache
      cache_*.cpp/h
    serial
      *_serial.cpp/h
    adc
      *_adc.cpp/h
    sd
      *_sd.cpp/h
    servo
      *_servo.cpp/h
    hardware_rng
      *_hardware_rng.cpp/h
    rtc
      *_rtc.cpp/h
    sgm
      *_sgm.cpp/h
    wd
      *_wd.cpp/h
    i2c
      *_i2c.cpp/h
      software_i2c.h
    spi
      software_spi.h
    display
      lcd44780.cpp/h
    ...
  board
    <board>
      bsp_impl.h
      bsp.cpp
      hwmapping.h
      *.cfg
  linker_scripts
    <chip>
      *.ld

Without classification per device type

platform
  arch_registers.h
  atomic_ops.h
  bsp.h
  deep_sleep.h
  delays.h
  endianness.h
  gpio.h
  os_timer.h
  portability.h
  memory_protection.h
  interrupts.h
  mpu.h
  serial.h
  arch_registers
    CMSIS
    <chip/chip family>
      arch_registers_impl.h
      ....h
  core
    <chip/chip family>
      stage_1_boot.c/cpp/s
      *_os_timer.cpp
      *_mpu.cpp/h
      gpio_impl.cpp/h
      delays.cpp
    <core>
      atomic_ops_*.h
      endianness_*.h
      interrupts_*.h
      arch_settings.h
      portability_impl.h
      portability.cpp
  drivers
    dcc.cpp/h
    *_servo.cpp/h
    software_i2c.h
    software_spi.h
    lcd44780.cpp/h
    <core>
      cache_*.cpp/h
    <chip/chip family>
      deep_sleep.cpp
      *_serial.cpp/h
      *_adc.cpp/h
      *_sd.cpp/h
      *_hardware_rng.cpp/h
      *_rtc.cpp/h
      *_sgm.cpp/h
      *_wd.cpp/h
      *_i2c.cpp/h
    ...
  board
    <board>
      bsp_impl.h
      bsp.cpp
      hwmapping.h
      *.cfg
  linker_scripts
    <chip>
      *.ld