Arch directory restructuring

From Miosix Wiki
Jump to navigation Jump to search

The arch directory is a mess with lots of copy-pasted code. Ideally it should be consistently and logically organized: each directory should have a documented purpose in order to leave as little doubt as possible about what goes where. Additionally, each directory's purpose makes sense in the overall structure, without overlaps between the scope of each directory. At the moment however we observe this is not the case. We believe that in order to make the future growth of Miosix simpler and faster, the right moment to rectify the situation is now.

Update 31/12/2023 I discovered a bug in the testing branch of Miosix which was caused by the copy-pasting of gpio_impl.h: version 1.4 (sic, see comments) didn't have the alternateFunction method of Gpio, even though serial_stm32.cpp assumes it always exists. (btw don't we have git? manual versioning in comments is not a robust practice!!!) I bet other similar bugs exist.

Some definitions

Common sense suggests the following taxonomy of parts for MCU-equipped devices targeted by Miosix:

  • Board: The device itself being targeted. It's called a board because nobody has observed any MCU wired with point-to-point wiring yet. Includes the MCU itself plus any peripheral external to the MCU but driven by it.
  • Chip: The MCU itself. May come in RAM/OTP/ROM/Flash size variations, which are mostly ignored.
  • Chip family: A set of MCU types from the same manufacturer with the same microprocessor core but potentially different set of peripherals. While the available set of peripherals may vary, the behaviour and I/O ports of each kind of peripheral are the same across the entire family. Thanks to the brilliant ideas of the various marketing departments of each vendor, the classification here can get very fluid.
  • Core: The microprocessor core used by a given chip family.

Let's take as an example the classic stm32f4discovery board that is the most common development device for Miosix:

  • The board is the stm32f4discovery itself. It includes the chip, and miscellaneous peripherals such as an accelerometer and a Cirrus Logic audio chip. The STLink programming interface and the power supply are not considered relevant parts of the board for Miosix, as the MCU is not in control of what they do.
  • The chip is an STM32F407VGT6. The VGT6 suffix of the chip name specifies package (100 pin LQFP), Flash size (1MB) and temperature range options (see the datasheet at the "Ordering Information" page).
  • The chip family is STM32F4. This can be inferred by the fact that all available chips with this prefix in the part name share the same reference manual.
  • The core is an ARM Cortex M4, as stated in the datasheet. Roughly, ST buys the Verilog for the core from ARM and then takes care of the manufacturing. As far as we know, ARM's licence shouldn't allow ST to customize the core in any way, so all Cortex M4 cores should behave the same, even from other vendors.

What is an architecture?

One definition we did not give is the definition of what an architecture is. The reason is that there is no accepted definition for this term, especially for forum pundits and bloggers, but also in academia. The RISC school of thought (Patterson/Hennessy and their spawn) historically associated an architecture with a combination of what are more specifically called the instruction set architecture (ISA) and the microarchitecture of a core. The ISA is the overall structure and design of the instructions, as opposed to the instruction set proper which is the set of instructions accepted by a specific core. The microarchitecture is the overall internal design of the core itself.

Their thesis was that the microarchitecture design will follow naturally from the definition of the ISA, hence simpler ISAs (RISC) to implement always lead to more efficient and generally better cores. Therefore one can consider the architecture to be primarily determined by the ISA. However the industry has proven them wrong, and the most glaring example are Intel x86 chips, which have seen an increasing variety of microarchitectures implementing the same ISA (variously extended of course) with wildly varying levels of (non-)efficiency (in all possible ramifications of "efficiency"). Closer to Miosix's targets, the ARM ISA has been extended so much over the years that its core RISC design has now evolved to basically a traditional CISC ISA, but ARM chips still come in every possible size and flavour.

When we talk about x86 "architectures" or ARM "architectures", but do we mean desktop-grade chips or microcontrollers? Even though they have the same "architecture" the cores in these two classes of chips have nothing to do with each other, they don't even share the same instruction decoder probably. Hence our conclusion that the word "architecture" has now lost all useful meaning for our purposes, and is only useful for void parlor chitchat.

And then we come to Miosix which defines "architecture" (arch) as... a board. Which is not what anyone means when they talk about architecture of anything. For Miosix, an "architecture" is more like what in common parlance would be meant as a "platform".

Current structure

Here's a schematic diagram of how the Miosix arch directory (plus related directories) is configured. In the following / refers to the miosix directory at the top level of the repository.

arch                        Architecture-specific files
  common                      Pool of useful files. Very few files here are common to all archs.
    CMSIS                       Collection of ARM & vendor headers following the CMSIS standard
    core                        Board independent drivers required for the OS to function (roughly)
      atomic_ops_*.h              Core dependent
      cache_*.cpp/h               Core dependent. This is not required at all for Miosix to work but is nevertheless here
      *_os_timer.cpp              Chip family dependent
      endianness_*.h              Core dependent. Should have a generic (slow) implementation but it's missing.
      interrupts.h                Umbrella header for interrupts_*.h
      interrupts_*.h              Core dependent. Misleading name, this is for fault handling.
      memory_protection.h         Umbrella header for mpu_*.h. Required only for process mode. Name is not consistent with other umbrella headers, should have been "mpu.h"
      mpu_*.cpp/h                 Core dependent.
    drivers                     Miscellaneous drivers, which are "common" only very roughly
      dcc.cpp/h                   Core dependent (but common to all ARM cores apparently)
      *_adc.cpp/h                 Chip family dependent
      serial.h                    Umbrella header for *serial*.h
      *serial*.cpp/h              Chip family dependent
      sd_*.cpp/h                  Chip family dependent
      servo_*.cpp/h               Board dependent really, should go into utils
      *_hardware_rng.cpp/h        Chip family dependent
      *_rtc.cpp/h                 Chip family dependent
      *_sgm.cpp/h                 Chip family dependent
      *_wd.cpp/h                  Chip family dependent
      *_i2c.cpp/h                 Chip family dependent
  <board group name>          Board group specific files.
    common                      Common drivers and configuration for all the boards in the group
      arch_settings.h             Core dependent. Misleading name, defines requirements for ctxsave
      interfaces-impl             The actual drivers, most of them required by Miosix to function.
        arch_registers_impl.h     Chip family dependent. Includes the appropriate register definitions
        delays.cpp                Chip family dependent as it also depends on ROM/Flash speed. Actually application dependent (!) as cache/prefetch mechanisms can be configured but whatever.
        gpio_impl.cpp/h           Chip family dependent.
        portability_impl.h        Core dependent.
        portability.cpp           Core dependent.
      drivers                 Miscellaneous drivers not required by Miosix to function, with exceptions (atsam4l clock)
        *.cpp/h
    <board name>              Board/chip specific files.
      core                      Board/chip dependent code required for the OS to function.
        stage_1_boot.c/cpp/s      Chip dependent. Interrupt vector table, reset handler, clock tree setup, bss/data initialization
      interfaces_impl         Important board/chip specific drivers, most of them required by Miosix to function.
        bsp_impl.h              Board dependent. Useful board-specific definitions.
        bsp.cpp                 Board dependent. Useful board-specific functions.
        hwmapping.h             Board dependent. GPIO pin definitions.
        deep_sleep.cpp          Chip family dependent. Optional implementation of deep sleep primitives.
        delays.cpp              See above. Sometimes it's here, with no true reason why.
        arch_registers_impl.h     See above. Sometimes it's here, depends on chip.
        ...                     Some boards have extra miscellaneous drivers here
      *.h                     Some boards have chip registers definitions here (LPC2138)
      *.cfg                   Some boards have OpenOCD config files here
      *.ld                    Chip family specific. Linker files for different RAM/Flash configurations
interfaces                  Headers for hardware interfaces TO miosix (internals) and SOME interfaces OF miosix to the application (APIs). Only commonality is that they are all hardware dependent (while stuff in utils/kernel is ''generally'' hardware independent)
  arch_registers.h            API. Umbrella for "arch_registers_impl.h", selected through include search path manipulation.
  atomic_ops.h                API. Umbrella for "atomic_ops_*.h", selected through ifdefs.
  bsp.h                       API + internal prototypes (IRQbspInit, bspInit2). Umbrella for "bsp_impl.h", selected through include search path manipulation.
  deep_sleep.h                Internal prototypes only. Not an umbrella header.
  delays.h                    API. Not an umbrella header.
  endianness.h                API. Umbrella for "endianness_*.h", selected through ifdefs.
  gpio.h                      API. Umbrella for "gpio_impl.h", selected through include search path manipulation.
  os_timer.h                  Internal declarations. Not an umbrella header.
  portability.h               Internal declarations. Primitives for context switching and syscalls. Umbrella for "portability_impl.h", selected through include search path manipulation.
util                        Additional APIs made available for applications as utilities, built on top of kernel primitives or other drivers
  lcd44780.cpp/h              Driver for Hitachi-standard HD44780 character displays.
  software_i2c/spi.h          Bit-banged i2c/spi implementations. Arguably drivers.
  utils.cpp/h                 Memory/CPU usage profiling. Misleading name. Not a driver
  ...                         Other files are not related to hardware
config                      Configuration files, user-modifiable
  Makefile.inc                List of board + selection of board to build for + selection of compile time options through defines passed as argument to the compiler. The first section of the file arguably is not user-modifiable configuration.
  miosix-settings.h           Global settings selected through defines in header files
  arch                        Board-dependent settings
    <board group>
      <board>
        board_settings.h      Board-dependent settings selected through defines in header files

<board group name> is defined as the name of the chip core, followed by an underscore, followed by the name of a chip family. <board name> is defined by the name of the chip, followed by an underscore, followed by a "common name" of the board. For commercial prototyping boards or specific devices, "common name" is derived from the name given by the manufacturer, or a name which is in common use with some exceptions (i.e. stm32f103 bluepill boards are named "breakout").

Some general patterns can be inferred:

  • Apart from minor exceptions, directories named core contain drivers required for the OS to function.
  • Directories named interfaces_impl generally contain implementations of things defined in /interfaces. Header files in interfaces_impl are always given the ``_impl`` suffix. C++ files are usually not given the "_impl" suffix but sometimes they have it (gpio_impl.cpp).
  • Directories named common contain code shared by multiple boards. No attempt is made to separate different levels of commonality apart from chip-family specific stuff which is grouped by board group.

Include paths

The current directory structure allows the Makefile to select the implementation of some file by solely changing the include search paths of the compiler. This mostly applies to files in <board group name>/<board name> and in <board group name>/config. The Makefile defines several variables that are used to refer to specific paths in the directory tree:

Variable Value Pattern Scope Is config Defined by Description
LINKER_SCRIPT_PATH arch/<board group name>/<board name>/ Board Yes Makefile.inc Directory where the linker scripts are found
LINKER_SCRIPT $(LINKER_SCRIPT_PATH)/<script>.ld Board Yes Makefile.inc Path of the linker script
ARCH_INC arch/<board group name>/common Board group No Makefile.inc Path of the common directory of the board group
BOARD_INC arch/<board group name>/<board name> Board No Makefile.inc Path of the board directory
BOOT_FILE $(BOARD_INC)/core/stage_1_boot.o Board No Makefile.inc Path of the object file for the board bootstrap code
ARCH_SRC $(BOARD_INC)/*/*.cpp $(ARCH_INC)/*/*.cpp arch/common/drivers/*.cpp Board + Board group No Sources for the drivers enabled for this board
KPATH miosix Global Yes Appl. Makefile Root directory of the kernel files
CONFPATH $(KPATH) Global Yes Appl. Makefile Root directory for configuration files

For reference, the final include paths, defined by the application makefile, are:

$(CONFPATH)
$(CONFPATH)/config/$(BOARD_INC)
$(KPATH)
$(KPATH)/arch/common
$(KPATH)/$(ARCH_INC)
$(KPATH)/$(BOARD_INC)

Notice that, despite the name of the defines, there is no distinction between include paths and source paths, as includes are alongside the source. We do not have a problem with this approach, however it results in all includes being publicly available for applications, which is something that is not entirely desirable.

Drivers selected by include path

Sometimes Miosix's headers include files that depend on the board selected for building. This of course can give problems when reorganizing files around. The affected files are:

  • interfaces/arch_registers.h
  • interfaces/bsp.h
  • interfaces/gpio.h
  • interfaces/portability.h

Overall selecting different files through the include paths is desirable, so we want to avoid adding additional architecture-specific ifdefs as much as possible.

Issues

Experience has shown this organization has the following problems:

  • What goes where is not well defined due to the vagueness of directory names
    • This leads to files being in the wrong place even though it's not obvious
      • servo_* is the most glaring example
      • The location of delays.cpp and arch_registers_impl.h is also inconsistent
  • The two level board group/board classification does not reflect reality, which leads to missing code sharing opportunities:
    • GPIO implementations are duplicated multiple times as they are common to multiple chip families and not to a single one.
    • Bootstrap code is duplicated multiple times as they are common to boards with the same chip and not common to the board.
    • Portability code (i.e. context save/restore primitives mostly) is duplicated multiple times as they are common to a given core and not to a chip family
    • Linker scripts are also duplicated multiple times
  • Classifications sometimes privilege syntax rather than semantics, in an inconsistent fashion
    • Directories named interfaces_impl contain drivers just like directories named drivers, except that they included from /interfaces.
    • What is common and what is not loosely determined by the amount of code sharing but most stuff not under a common directory is duplicated at least twice

Of course, the root issue is the fluidity of the distinction between chip and chip family, and how each vendor manages and/or defines each family. For instance, STM32 chips all have similar USART (serial) implementations, but at least two versions can be distinguished through different register definitions and feature sets. Therefore the same overall driver can be shared across almost all STM32 chips, however the driver is littered with ifdefs to single out the specificity of each implementation. There is really no strict distinction to what can be found where, as the vendors can and do mix and match peripherals even amongst themselves. For example, some STM32 chips have Synopsis-designed USB interfaces which can also be found on chips from vendors that are not ST.

These considerations highlight that the solution to handling the issue of organizing platform-specific code in a consistent way to minimize code duplication is not to add hierarchy levels, but to remove them. This runs contrary from the naive consideration that "Board includes Chip, which belongs to Chip Family, which contains a Core".

A proposal

Pointing out what's wrong is easy, fixing it up is hard. Of course "what we should do" is a matter of opinion, and a valid answer to that could even be "nothing". Overall, what we ultimately want is for all drivers to be decoupled from boards and board groups, in order to be able to classify each driver depending on its actual logical group it belongs to. We can take this opportunity for a more radical restructuring of Miosix's source tree, or we can opt for more conservative solutions.

With classification per device type

This proposal is the most radical as it splits the drivers in a completely different way than what is currently done: by device type. For instance all USART drivers go in a "serial" directory, all GPIO drivers go in a GPIO directory, and so on. This approach scales well when there are a lot of drivers around, but completely prevents selection of drivers by using the include paths.

The arch directory has been named "platform" because of our general dislike of the "architecture" word, but that's really just a bonus. Most other operating systems use "arch" as a directory for drivers and target-specific stuff, so keeping the name "arch" is also an option. We also moved the interfaces headers under the "platform" ("arch") directory, since they basically are just umbrella headers for various drivers. This loses the distinction of some of them being more essential than others, additional directories can be added for this purpose.

platform
  arch_registers.h
  atomic_ops.h
  bsp.h
  deep_sleep.h
  delays.h
  endianness.h
  gpio.h
  os_timer.h
  portability.h
  memory_protection.h
  interrupts.h
  mpu.h
  serial.h
  arch_registers
    CMSIS
    <chip/chip family>
      arch_registers_impl.h
      ....h
  core
    boot
      <chip>
        stage_1_boot.c/cpp/s
    atomic_ops
      atomic_ops_*.h
    os_timer
      *_os_timer.cpp
    endianness
      endianness_*.h
    faults
      interrupts_*.h
    mpu
      *_mpu.cpp/h
    portability
      <core>
        arch_settings.h
        portability_impl.h
        portability.cpp
    gpio
      <chip/chip family>
        gpio_impl.cpp/h
    delays
      <chip/chip family>
        delays.cpp
  drivers
    arm_dcc.cpp/h
    deep_sleep
      deep_sleep.cpp
    cache
      cache_*.cpp/h
    serial
      *_serial.cpp/h
    adc
      *_adc.cpp/h
    sd
      *_sd.cpp/h
    servo
      *_servo.cpp/h
    hardware_rng
      *_hardware_rng.cpp/h
    rtc
      *_rtc.cpp/h
    sgm
      *_sgm.cpp/h
    wd
      *_wd.cpp/h
    i2c
      *_i2c.cpp/h
      software_i2c.h
    spi
      software_spi.h
    display
      lcd44780.cpp/h
    ...
  board
    <board>
      bsp_impl.h
      bsp.cpp
      hwmapping.h
      *.cfg
  linker_scripts
    <chip>
      *.ld
config
  Makefile.inc
  miosix_settings.h
  <board>
    board_settings.h

Without classification per device type

This is just the same proposal as before but which classifies drivers only by "manufacturer" or chip. This allows to use the include trick for selecting headers.

platform
  arch_registers.h
  atomic_ops.h
  bsp.h
  deep_sleep.h
  delays.h
  endianness.h
  gpio.h
  os_timer.h
  portability.h
  memory_protection.h
  interrupts.h
  mpu.h
  serial.h
  arch_registers
    CMSIS
    <chip/chip family>
      arch_registers_impl.h
      ....h
  core
    <chip/chip family>
      stage_1_boot.c/cpp/s
      *_os_timer.cpp
      *_mpu.cpp/h
      gpio_impl.cpp/h
      delays.cpp
    <core>
      atomic_ops_*.h
      endianness_*.h
      interrupts_*.h
      arch_settings.h
      portability_impl.h
      portability.cpp
  drivers
    dcc.cpp/h
    *_servo.cpp/h
    software_i2c.h
    software_spi.h
    lcd44780.cpp/h
    <core>
      cache_*.cpp/h
    <chip/chip family>
      deep_sleep.cpp
      *_serial.cpp/h
      *_adc.cpp/h
      *_sd.cpp/h
      *_hardware_rng.cpp/h
      *_rtc.cpp/h
      *_sgm.cpp/h
      *_wd.cpp/h
      *_i2c.cpp/h
    ...
  board
    <board>
      bsp_impl.h
      bsp.cpp
      hwmapping.h
      *.cfg
  linker_scripts
    <chip>
      *.ld
config
  Makefile.inc
  miosix_settings.h
  <board>
    board_settings.h

With minimal changes to the existing structure

arch
  common
    CMSIS
      core
        boot
          <chip>_stage_1_boot.c/cpp/s
        <core>_portability
            arch_settings.h
            portability_impl.h
            portability.cpp
        <chip family>
          delays.cpp
          gpio_impl.cpp/h
          deep_sleep.cpp
        (all other existing files)
      drivers
        All miscellaneous drivers in board directory are moved here
        (all other existing files)
    linker_scripts
      <chip>
        *.ld
  <board group name>
    common
      (should now be empty)
    <board name>
      interfaces_impl
        bsp_impl.h
        bsp.cpp
        hwmapping.h
      *.cfg
interfaces                    Unmodified
config                        Unmodified