# Calamares modules

<!-- SPDX-FileCopyrightText: 2014 Teo Mrnjavac <teo@kde.org>
     SPDX-FileCopyrightText: 2017 Adriaan de Groot <groot@kde.org>
     SPDX-License-Identifier: GPL-3.0-or-later
-->

Calamares modules are plugins that provide features like installer pages,
batch jobs, etc. An installer page (visible to the user) is called a "view",
while other modules are "jobs".

Each Calamares module lives in its own directory.

All modules are installed in `$DESTDIR/lib/calamares/modules`.

There are two **types** of Calamares module:
* viewmodule, for user-visible modules. These use C++ and either Widgets or QML
* jobmodule, for not-user-visible modules. These may be done in C++,
  Python, or as external processes (external processes not recommended).

A viewmodule exposes a UI to the user.

There are three **interfaces** for Calamares modules:
* qtplugin (viewmodules, jobmodules),
* python (jobmodules only),
* process (jobmodules only, not recommended).

## Module directory

Each Calamares module lives in its own directory. The contents
of the directory depend on the interface and type of the module.

### Module descriptor

A Calamares module must have a *module descriptor file*, named
`module.desc`. For C++ (qtplugin) modules that use CMake as a build
system and call the `calamares_add_plugin()` function (this is the
recommended way to create such modules), the module descriptor
file is optional, since it can be generated by the build system.
For other module interfaces, the module descriptor file is required.

The module descriptor file, if required, is placed in the module's directory.
The module descriptor file is a YAML 1.2 document which defines the
module's name, type, interface and possibly other properties. The name
of the module as defined in `module.desc` must be the same as the name
of the module's directory.

Module descriptors **must** have the following keys:
- *name* (an identifier; must be the same as the directory name)
- *type* ("job" or "view")
- *interface* (see below for the different interfaces; generally we
  refer to the kinds of modules by their interface)

Module descriptors for C++ modules **may** have the following key:
- *load* (the name of the shared library to load; if empty, uses a
  standard library name derived from the module name)

Module descriptors for Python modules **must** have the following key:
- *script* (the name of the Python script to load, nearly always `main.py`)

Module descriptors for process modules **must** have the following key:
- *command* (the command to run)

Module descriptors for process modules **may** have the following keys:
- *timeout* (how long, in seconds, to wait for the command to run)
- *chroot* (if true, run the command in the target system rather than the host)

Note that process modules are not recommended.

Module descriptors **may** have the following keys:
- *emergency* (a boolean value, set to true to mark the module
  as an emergency module; see the section *Emergency Modules*, below)
- *noconfig* (a boolean value, set to true to state that the module
  has no configuration file; defaults to false)
- *requiredModules* (a list of modules which are required for this module
  to operate properly)
- *weight* (a relative module weight, used to scale progress reporting)
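
The key requirements above can be summarized as a small check. This is a minimal sketch, not Calamares' own validation code: the helper `check_descriptor()` and the example module name `examplejob` are hypothetical, and the descriptor is assumed to have been parsed from YAML into a plain dictionary already.

```python
# Required keys for every descriptor, and extra required keys per interface,
# as listed in this section.
COMMON_REQUIRED = ("name", "type", "interface")
INTERFACE_REQUIRED = {
    "qtplugin": (),
    "python": ("script",),
    "process": ("command",),
}


def check_descriptor(descriptor, directory_name):
    """Return a list of problems found in a parsed module descriptor.

    Hypothetical helper for illustration only.
    """
    problems = [f"missing key: {key}" for key in COMMON_REQUIRED if key not in descriptor]
    interface = descriptor.get("interface")
    if interface is not None and interface not in INTERFACE_REQUIRED:
        problems.append(f"unknown interface: {interface}")
    for key in INTERFACE_REQUIRED.get(interface, ()):
        if key not in descriptor:
            problems.append(f"missing key for {interface} modules: {key}")
    if "name" in descriptor and descriptor["name"] != directory_name:
        problems.append("name does not match the directory name")
    return problems


# A descriptor as it might look after parsing a Python module's module.desc.
example = {"name": "examplejob", "type": "job", "interface": "python", "script": "main.py"}
print(check_descriptor(example, "examplejob"))
```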


### Required Modules

A module may list zero or more required modules by name (zero if it
has no requirements). As modules are loaded from the global sequence
in `settings.conf`, Calamares checks that every module required by a
given module has already been loaded before it. This ensures that if
a module needs another one to fill in globalstorage keys, that happens
before it needs those keys.
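
The ordering check can be pictured with the following sketch. It only illustrates the idea and is not the Calamares implementation: the function `unmet_requirements()` is hypothetical, and the requirement data in the example is made up.

```python
def unmet_requirements(sequence, required_modules):
    """Return (module, missing) pairs for requirements that are not loaded earlier.

    sequence is the ordered list of module names; required_modules maps a
    module name to the list of module names it requires.
    """
    loaded = set()
    problems = []
    for module in sequence:
        for needed in required_modules.get(module, []):
            if needed not in loaded:
                problems.append((module, needed))
        loaded.add(module)
    return problems


# Hypothetical requirement: "mount" needs "partition" to have run first.
requirements = {"mount": ["partition"]}
print(unmet_requirements(["partition", "mount", "unpackfs"], requirements))
print(unmet_requirements(["mount", "partition", "unpackfs"], requirements))
```

The first call reports no problems; the second reports that *mount* appears before the module it requires.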

### Emergency Modules

If, during an *exec* step in the sequence, a module fails, installation as
a whole fails and the install is aborted. If there are emergency modules
in the **same** exec block, those will be executed before the installation
is aborted. Non-emergency modules are not executed.

If an emergency-module fails while processing emergency-modules for
another failed module, that failure is ignored and emergency-module
processing continues.

Use the EMERGENCY keyword in the CMake description of a C++ module
to generate a suitable `module.desc`. For Python modules, manually add
`emergency: true` to `module.desc`.

A module that is marked as an emergency module in its module.desc
must **also** set the *emergency* key to *true* in its configuration file
(see below). If it does not, the module is not considered to be an emergency
module after all. This is so that you can have modules that have several
instances, only some of which are actually needed for emergencies.

In summary:
- in `module.desc`, write `emergency: true` to make it **possible** to
  run the module in emergency mode,
- in `<modulename>.conf`, write `emergency: true` to make that specific
  module run in emergency mode.
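
Put differently, both settings must agree before an instance counts as an emergency module. A minimal sketch of that rule, with a hypothetical helper name and both files assumed to be parsed into dictionaries:

```python
def runs_as_emergency(descriptor, instance_config):
    """True only if module.desc allows emergency mode AND the instance's
    configuration file asks for it (hypothetical helper)."""
    allowed = bool(descriptor.get("emergency", False))
    requested = bool(instance_config.get("emergency", False))
    return allowed and requested


print(runs_as_emergency({"emergency": True}, {"emergency": True}))
print(runs_as_emergency({"emergency": True}, {}))
```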

### Module-specific configuration

A Calamares module **may** read a module configuration file,
named `<modulename>.conf`. If such a file is present in the
module's directory, it can be shipped as a *default* configuration file.
This only happens if the CMake-time option `INSTALL_CONFIG` is on.

The name of the configuration file for a given module can be
influenced by the `settings.conf` of the overall Calamares configuration.
By default, though, the module's own name is used.

Modules that have *noconfig* set to true will not attempt to
read a configuration file, and will not warn that one is missing.
Conversely, if *noconfig* is set to false (or is missing, since
the default value is false) and there is no configuration file,
a warning is printed during Calamares start-up.

The sample configuration files may work and may be suitable for
your distribution, but no guarantee is given about their stability
beyond syntactic correctness.

The module configuration file, if it exists, is a YAML 1.2 document
which contains a YAML map of anything.

All sample module configuration files are installed in
`$DESTDIR/share/calamares/modules` but can be overridden by
files with the same name placed manually (or by the packager)
in `/etc/calamares/modules`.
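
The override behaviour amounts to "first match wins" over an ordered list of directories. The sketch below shows only that idea; `find_module_config()` is a hypothetical helper, the `/usr` prefix is an assumed install location, and the real lookup in Calamares may consider more locations than these two.

```python
from pathlib import Path


def find_module_config(module_name, search_dirs):
    """Return the first existing <module_name>.conf in search_dirs, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / f"{module_name}.conf"
        if candidate.is_file():
            return candidate
    return None


# Directories listed first take precedence over later ones.
search_dirs = ["/etc/calamares/modules", "/usr/share/calamares/modules"]
print(find_module_config("users", search_dirs))
```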

### Module Weights

During the *exec* phase of an installation, where jobs are run and
things happen to the target system, there is a running progress bar.
It goes from 0% to 100% while all of the jobs for that exec phase
are run. Generally, one module creates one job, but this varies a little
(e.g. the partition module can spawn a whole bunch of jobs to
deal with each disk, and the users module has separate jobs for
the regular user and the root user).

By default, modules all "weigh" the same, and each job is equal.
A typical installation has about 30 modules in the exec phase,
so there may be 40 jobs or so: each job represents 2.5% of the
overall progress of the installation.

The consequence is that the *unpackfs* module, which needs to write
a few hundred MB to disk, gets 2.5% of the progress, and the *machineid*
module, which is essentially instantaneous, also gets 2.5% of the progress.
This makes progress reporting seem weird and uneven, and suggests to users
that Calamares may be "hanging" during the unpackfs stage.

A module may be assigned a different "weight" in the `module.desc`
file (or via the CMake macros for adding plugins). This gives the
module more space in the overall progress: for instance, the *unpackfs*
module now has a weight of 12, so (assuming there are 38 modules
in the exec phase with a weight of 1, and *unpackfs* with a weight of 12)
regular modules get 2% (1 in 50 total weight) of the overall progress
bar, and the *unpackfs* module gets 24% (12 in 50). While this doesn't
speed anything up, it does make the progress in the unpackfs module more
visible.
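
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the proportional calculation described above; the function name and the placeholder module names are made up, and Calamares' internal progress computation may differ in detail.

```python
from fractions import Fraction


def progress_shares(weights):
    """Map each module name to its share of the progress bar."""
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}


# 38 modules with the default weight of 1, plus unpackfs with a weight of 12.
weights = {f"module{i}": 1 for i in range(38)}
weights["unpackfs"] = 12

shares = progress_shares(weights)
print(f"regular module: {float(shares['module0']):.0%}")
print(f"unpackfs: {float(shares['unpackfs']):.0%}")
```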

It is also possible to set a weight on a specific module **instance**,
which can be done in `settings.conf`. This overrides any weight
set in the module descriptor. Doing so is the recommended approach,
since that is where the specific installation-process is configured;
it is possible to take the whole installation-process into account
for determining the relative weights there.


## Global Storage keys

Some modules place values in Global Storage so that they can be referenced
later by other modules or even other parts of the same module. The following
table is a partial list of the values available, where they originate,
and which modules consume them.
Keys whose name is followed by a `+` are **structured** data, and have
entries (which start with `+`) below the parent key describing subkeys.
Some structured keys refer to other documentation sources.

Key               |Source          |Consumers      |Description
------------------|----------------|---------------|---
bootloader +      |partition       |               |Bootloader location
\+ installPath    |                |               |Device (e.g. `/dev/sda`) where the bootloader is installed
branding +        |                |               |See `src/branding/README.md`
btrfsSubvolumes   |mount           |fstab          |List of maps containing the mountpoint and btrfs subvolume
btrfsRootSubvolume|mount           |bootloader, luksopenswaphook|String containing the subvolume mounted at root
efiSystemPartition|partition       |bootloader, fstab|String containing the path to the ESP relative to the installed system
extraMounts       |mount           |unpackfs|List of maps holding metadata for the temporary mountpoints used by the installer
fullname          |users           |               |The full username (e.g. "Jane Q. Public")
hostname          |users           |               |A string containing the hostname of the new system
netinstallAdd     |packagechooser  |netinstall     |Data to add to netinstall tree. Same format as netinstall.yaml
netinstallSelect  |packagechooser  |netinstall     |List of group names to select in the netinstall tree
packageOperations +|packagechooser, netinstall|packages|Operations to perform
\+ (list data)    |                |               |See `packages.conf`
partitions +      |partition, rawfs|(many)         |List of maps of metadata about each partition
\+ device         |                |               |path to the partition device
\+ fs             |                |               |the name of the file system
\+ mountPoint     |                |               |where the device should be mounted
\+ uuid           |                |               |the UUID of the partition device
rootMountPoint    |mount           |(many)         |A string with the absolute path to the root mountpoint
username          |users           |networkcfg, plasmalnf, preservefiles|A string containing the username of the new user
zfsDatasets       |zfs             |bootloader, grubcfg, mount|List of maps of zfs datasets including the name and mount information
zfsInfo           |partition       |mount, zfs     |List of encrypted zfs partitions and the encryption info
zfsPoolInfo       |zfs             |mount, umount  |List of maps of zfs pool info including the name and mountpoint
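
As an illustration of structured data, the sketch below looks up one entry in a *partitions*-style list using the subkeys from the table. The sample values are invented and the helper is hypothetical. Inside a Python module the list would come from global storage, presumably via `libcalamares.globalstorage.value("partitions")`; that call is an assumption here, since this document only shows `contains()` and `insert()`.

```python
def find_partition_for(partitions, mount_point):
    """Return the first partition map mounted at mount_point, or None."""
    for partition in partitions:
        if partition.get("mountPoint") == mount_point:
            return partition
    return None


# Invented sample data shaped like the "partitions" key described above.
sample_partitions = [
    {"device": "/dev/sda1", "fs": "fat32", "mountPoint": "/boot/efi", "uuid": "AAAA-BBBB"},
    {"device": "/dev/sda2", "fs": "ext4", "mountPoint": "/", "uuid": "0000-1111"},
]

root = find_partition_for(sample_partitions, "/")
print(root["device"], root["fs"])
```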


## C++ modules

> Type: viewmodule, jobmodule
> Interface: qtplugin

Currently the recommended way to write a module which exposes one or more
installer pages (viewmodule) is through a C++ and Qt plugin. Viewmodules must
implement `Calamares::ViewStep`. They can also implement `Calamares::Job`
to provide jobs.

To add a Qt plugin module, put it in a subdirectory and make sure it has
a `CMakeLists.txt` with a `calamares_add_plugin` call. It will be picked
up automatically by our CMake magic. The `module.desc` file is not recommended:
nearly all cases can be described in CMake.

Modules can be tested with the `loadmodule` testing executable in
the build directory. See the section on [testing modules](#testing-modules)
for more details.


### C++ Jobmodule

**TODO:** this needs documentation

### C++ Widgets Viewmodule

**TODO:** this needs documentation

### C++ QML Viewmodule

A QML Viewmodule (or view step) puts much of the UI work in one or more
QML files; the files may be loaded from the branding directory or compiled
into the module. Which QML is used depends on the deployment and the
configuration files for Calamares.

#### Explicit properties

The QML can access data from the C++ framework through properties
exposed to QML. There are two libraries that need to be imported
explicitly:

```
import io.calamares.core 1.0
import io.calamares.ui 1.0
```

The *ui* library contains the *Branding* object, which corresponds to
the branding information set through `branding.desc`. The Branding
class (in `src/libcalamaresui/Branding.h`) offers a QObject-property
based API, where the most important functions are `string()` and the
convenience functions `versionedName()` and similar.

The *core* library contains both *ViewManager*, which handles overall
progress through the application, and *Global*, which holds global
storage information. Both objects have an extensive API. The *ViewManager*
can behave as a model for list views and the like.

These explicit properties from libraries are shared across all the
QML modules (for global storage that goes without saying: it is
the mechanism to share information with other modules).

#### Implicit properties

Each module also has an implicit context property available to it.
No import is needed. The context property *config* (note lower case)
holds the Config object for the module.

The Config object is the bridge between C++ and QML.

A Config object must inherit QObject and should expose, as `Q_PROPERTY`,
all of the relevant configuration information for the module instance.
The general description how to do that is available
in the [Qt documentation](https://doc.qt.io/qt-5/qtqml-cppintegration-topic.html).


## Python modules

Modules may use one of the Python interfaces, which may or may not be
present in a given Calamares installation. These modules must have
a `module.desc` file. The Python script must implement the
Python jobmodule interface.

To add a Python or process jobmodule, put it in a subdirectory and make sure
it has a `module.desc`. It will be picked up automatically by our CMake magic.
For all kinds of Python jobs, the key *script* must be set to the name of
the main Python file for the job. This is almost universally `main.py`.

`CMakeLists.txt` is *not* used for Python jobmodules.

Calamares offers a Python API for module developers. The core Calamares
functionality is exposed as `libcalamares.job` for job data,
`libcalamares.globalstorage` for shared data and `libcalamares.utils` for
generic utility functions. Documentation is inline.

All code in Python job modules must obey PEP8. The only exception is
`libcalamares.globalstorage` keys, which should always be
camelCaseWithLowerCaseInitial to match the C++ identifier convention.

Modules can be tested with the `loadmodule` testing executable in
the build directory. See the section on [testing modules](#testing-modules)
for more details.


### Python Jobmodule

> Type: jobmodule
> Interface: python

A Python jobmodule is a Python program which imports libcalamares and has a
function `run()` as entry point. The function `run()` must return `None` if
everything went well, or a tuple `(str,str)` with an error message and
description if something went wrong.
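
A minimal sketch of such a `main.py` follows. The configuration key `greeting` is hypothetical, translation of the messages is left out for brevity, and the `except ImportError` branch is only a stand-in so that the sketch can be run outside Calamares; inside the Calamares runtime the real *libcalamares* is used.

```python
import types

try:
    import libcalamares  # provided by the Calamares runtime
except ImportError:
    # Stand-in for experimenting outside Calamares; NOT the real API surface.
    libcalamares = types.SimpleNamespace(
        job=types.SimpleNamespace(configuration={}),
        utils=types.SimpleNamespace(debug=print),
    )


def run():
    """Entry point: return None on success, or (message, description) on failure."""
    greeting = libcalamares.job.configuration.get("greeting")  # hypothetical key
    if not greeting:
        return (
            "No configuration found",
            "The key 'greeting' is missing from the module configuration.",
        )
    libcalamares.utils.debug(f"Greeting is {greeting}")
    return None
```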

### Python API

The interface from a Python module to Calamares internals is
found in the *libcalamares* module. This is not a standard Python
module, and is only available inside the Calamares "runtime" for
Python modules (it is implemented in C++ and injected into the Python
environment by Calamares).

A module should start by importing the Calamares internals:

```
import libcalamares
```

There are three important (sub)modules in *libcalamares*:
- *globalstorage* behaves like a dictionary, and interfaces
  with the global storage in Calamares; use it to transfer
  information between modules (e.g. the *partition* module
  shares the partition layout it creates). Note that some information
  in global storage is expected to be structured, and it may be
  dicts-within-dicts.

  An example of using globalstorage:
  ```
  if not libcalamares.globalstorage.contains("lala"):
      libcalamares.globalstorage.insert("lala", 72)
  ```
- *job* is the interface to the job's behavior, with one important
  data member: *configuration* which is a dictionary derived from the
  configuration file for the module (if there is one, empty otherwise).
  Less important data is *pretty_name* (a string) and *working_path*
  which are normally not needed. The *pretty_name* value is
  obtained by the Calamares internals by calling the `pretty_name()`
  function inside the Python module.

  There is one function: `setprogress(p)` which can be passed a float
  *p* between 0 and 1 to indicate 0% to 100% completion of the module's
  work.
- *utils* is where non-job-specific functions are placed:
  - `debug(s)` and `warning(s)` are logger functions, which send output
    to the usual Calamares logging functions. Use these over `print()`
    which may not be visible at all.
  - `mount(device, path, type, options)` mounts a filesystem from
    *device* onto *path*, as if running the mount command from the shell.
    Use this in preference to running mount by hand. In Calamares 3.3
    this function also handles privilege escalation.
  - `gettext_path()` and `gettext_languages()` are support functions
    for translations, which would normally be called only once when
    setting up gettext (see below).
  - `obscure(s)` is a lousy string obfuscation mechanism. Do not use it.
  - A half-dozen functions for running a command and dealing with its
    output. These are recommended over using `os.system()` or the *subprocess*
    module because they handle the chroot behavior for running in the
    target system transparently. In Calamares 3.3 these functions also
    handle privilege escalation. See below, *Running Commands in Python* for details.

A module **must** contain a `run()` function to do the actual work
of the module. The module **may** define the following functions
to provide information to Calamares:
- `pretty_name()` returns a string that is a human-readable name or
  short description of the module. Since it is human-readable,
  return a translated string.
- `pretty_status_message()` returns a (longer) string that is a human-readable
  description of the state of the module, or what it is doing. This is
  primarily of importance for long-running modules. The function is called
  by the Calamares framework when the module reports progress through the
  `job.setprogress()` function. Since the status is human-readable,
  return a translated string.

### Python Translations

Translations in Python modules -- at least the ones in the Calamares core
repository -- are handled through gettext. You should import the standard
Python *gettext* module. Conventionally, `_` is used to mark translations.
That function needs to be configured specifically for use in Calamares
so that it can find the translations. A boilerplate solution is this:

```
import gettext
_ = gettext.translation("calamares-python",
                        localedir=libcalamares.utils.gettext_path(),
                        languages=libcalamares.utils.gettext_languages(),
                        fallback=True).gettext
```

Error messages should be logged in English, and given to the user
in translated form. In particular, when returning an error message
and description from the `run()` function, return translated forms,
like the following:

```
return (
    _("No configuration found"),
    _("<a longer description of the problem>"))
```

### Running Commands in Python

The use of the `os.system()` function and *subprocess* modules is
discouraged. Using these makes the caller responsible for handling
any chroot or other target-versus-host-system manipulation, and in
Calamares 3.3 may require additional privilege escalation handling.

The primary functions for running a command from Python are:
- `target_env_process_output(command, callback, stdin, timeout)`
- `host_env_process_output(command, callback, stdin, timeout)`

They run the given *command* (which must be a list of strings, like
`sys.argv` or what would be passed to a *subprocess* module call)
either in the target system (within the chroot) or in the host system.
Except for *command*, the arguments are optional.

A very simple example is running `ls` from a Python module (with `libcalamares.utils.` qualification omitted):
```
target_env_process_output(["ls"])
```

The functions return 0. If the exit code of *command* is not 0, an exception
is raised instead of returning 0. The exception is `subprocess.CalledProcessError`
(as if the *subprocess* module had been used), and the `returncode` member
of the exception object can be used to determine the exit code.
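
One way to turn such an exception into the error tuple expected from `run()` is sketched below. The helper `run_command_or_error()` is hypothetical and takes the runner as a parameter, so that inside Calamares it could be given `libcalamares.utils.target_env_process_output`; here a stand-in runner simulates a failing command. Translation of the messages is omitted.

```python
import subprocess


def run_command_or_error(runner, command):
    """Call runner(command); on a non-zero exit code, return a
    (message, description) tuple, otherwise return None."""
    try:
        runner(command)
    except subprocess.CalledProcessError as error:
        return (
            "Command failed",
            f"{' '.join(command)} exited with code {error.returncode}",
        )
    return None


def failing_runner(command):
    # Stand-in that fails the way the text describes for a non-zero exit code.
    raise subprocess.CalledProcessError(2, command)


print(run_command_or_error(failing_runner, ["ls", "/nonexistent"]))
```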

Parameter *stdin* may be a string which is fed to the command as standard input.
The *timeout* is in seconds, with 0 (or a negative number) treated as no-timeout.

Parameter *callback* is special:
- If it is `None`, no special handling of the command's output is done.
  The output will be logged, though (if there is any).
- If it is a list, then the output of the command will be appended to the list,
  one line at a time. Lines will still contain the trailing newline character
  (if there is one; output may end without a newline).
  Use this approach to process the command output after it has completed.
- Anything else is assumed to be a callable function that takes one parameter.
  The function is called once for each line of output produced by the command.
  The line of output still contains the trailing newline character (if there is one).
  Use this approach to process the command output while it is running.

Here are three examples of running `ls` with different callbacks:
```
# No processing at all, output is logged
target_env_process_output(["ls"])
target_env_process_output(["ls"], None)

# Appends to the list
ls_output = []
target_env_process_output(["ls"], ls_output)

# Calls the function for each line, which then calls debug()
def handle_output(s):
    debug(f"ls said {s}")
target_env_process_output(["ls"], handle_output)
```


There are additional functions for running commands in the target,
which can select what they return and whether exceptions are raised
or only an exit code is returned. These functions have an overload
that takes a single string (the name of an executable) as well. They should
all be considered superseded by the callback-enabled functions, above.

- `target_env_call(command, stdin, timeout)` returns the exit code, does not raise.
- `check_target_env_call(command, stdin, timeout)` raises on a non-zero exit code.
- `check_target_env_output(command, stdin, timeout)` returns a single string with the output of *command*, raises on a non-zero exit code.

All of the API functions for running commands set the environment
LC_ALL and LANG to "C" for the called command.

## Process modules

Use of this kind of module is **not** recommended. Use *shellprocess*
instead, which is more configurable.

> Type: jobmodule
> Interface: process

A process jobmodule runs a (single) command. The interface is *process*,
while the module type must be *job* or *jobmodule*.

The module-descriptor key *command* should have a string as value, which is
passed to the shell -- remember to quote it properly in YAML. It is generally
recommended to use a *shellprocess* job module instead (less configuration,
easier to have multiple instances). There is no configuration outside
of the module-descriptor. The *command* undergoes Calamares variable
expansion (e.g. replacing `${ROOT}` by the target of the installation).
See *shellprocess* documentation for details.

Optional keys are *timeout* and *chroot*.

`CMakeLists.txt` is *not* used for process jobmodules.


## Testing Modules

For testing purposes there is an executable `loadmodule` which is
built, but not installed. It can be found in the build directory.
The `loadmodule` executable behaves like single-module Calamares:
it loads global configuration, job configuration, and then runs
a single module which may be a C++ module or a Python module,
a Job or a ViewModule.

The same application can also be used to test translations,
branding, and slideshows, without starting up a whole Calamares
each time. It is possible to run multiple `loadmodule` executables
at the same time (unlike Calamares itself, which tries to enforce
that only one instance runs).

The following arguments can be used with `loadmodule`
(there are more; run `loadmodule --help` for a complete list):
 - `--global` takes a filename and reads the file to provide data in
   global storage. The file must be YAML-formatted.
 - `--job` takes a filename and reads that to provide the job
   configuration (e.g. the `.conf` file for the module).
 - `--ui` runs a view module with a UI. Without this option,
   view modules are run as jobs, and most of them are not
   prepared for that, and will crash.