diff --git a/report/iot_sec_and_priv_report_korthals.tex b/report/iot_sec_and_priv_report_korthals.tex
index 72c1e0ecdc0bcfad01394dac78d93576b2bb7ae3..86a59e99532021fd6d00432e98bcc063b1c5d573 100644
--- a/report/iot_sec_and_priv_report_korthals.tex
+++ b/report/iot_sec_and_priv_report_korthals.tex
@@ -57,9 +57,15 @@ Due to the widespread usage of such microcontroller-powered devices in many diff
Most general-purpose system software assumes a plethora of different ressources from the hardware it is running on.
A fast processor with cryptographic extensions (e.g. AES \cite[p.~1]{aes_extension}), relatively large amounts of RAM to use for the operating system (OS) and the applications it serves and a Memory Management Unit (MMU) which achieves memory protection and thereby process isolation by implementing virtual address spaces and memory segmenation.
In contrast low-power microcontrollers (MCUs) are much more limited concerning it's hardware and power ressources and do not offer an MMU and often not even more than 64~kB of RAM.
-\cite[p.~235, par.~1]{tockos}
+\cite[p.~234,235, sect.~1, par.~2]{tockos}
This circumstances make the development of secure system software for low-power IoT devices much more challenging than for general-purpose computers.
+Due to the missing virtual address spaces and memory segmentation, low-power IoT systems are often constrained to operate with the applications and the OS in the same memory region, which makes sharing pointers easy and efficient but also carries substantial security risks.
+The missing memory isolation requires all code in an unrestricted multiprogramming environment to be trusted to not be malicious or misbehave.
+At the same time, IoT devices are often required to run fault-free for long periods of time, which often requires static memory allocation for most tasks to avoid memory-exhaustion errors.
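+The trade-off of static allocation can be sketched as follows; this is a minimal editorial illustration with hypothetical names (a fixed request pool for some service), not code from any of the cited systems:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical service: the number of concurrent requests is fixed at
 * compile time, so memory usage is bounded and known, but the limit
 * cannot grow at runtime. */
#define MAX_REQUESTS 4

struct request { int client_id; bool in_use; };

static struct request request_pool[MAX_REQUESTS]; /* static, no malloc */

/* Returns a free slot, or NULL once all MAX_REQUESTS slots are taken. */
static struct request *request_alloc(int client_id) {
    for (size_t i = 0; i < MAX_REQUESTS; i++) {
        if (!request_pool[i].in_use) {
            request_pool[i].in_use = true;
            request_pool[i].client_id = client_id;
            return &request_pool[i];
        }
    }
    return NULL; /* pool exhausted: the request must be rejected, not grown */
}

static void request_free(struct request *r) { r->in_use = false; }
```

+Exhaustion of the pool is an explicit, recoverable condition here, in contrast to a failing heap allocation in a long-running device.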
+In such a resource-constrained system, static memory allocation makes it even more important to find the right balance for an application's memory usage, for example when deciding how many concurrent requests a service can process, since this limit cannot be changed dynamically at runtime.
+\cite[p.~234,235, sect.~1, par.~4-7]{tockos}
+
The goal of this report is to answer how low-power IoT system software can ensure security properties (confidentiality, integrity, availability) in consideration of the above-mentioned constraints in memory space, processing power, power usage and hardware components.
This goal is aimed to be achieved by first non-exhaustively describing a couple of general attack vectors on IoT system software which motivates the following overview about different approaches to IoT system software security which are not necessarily mutually exclusive.
Subsequently the Tock OS is examined more closely as a concrete implementation of some of the preceding approaches.
@@ -80,6 +86,23 @@ These physical manipulations include hard-resetting the system, probing the memo
However this report will restrict it's examination exclusively on attacks without the possibility of any physical interactions with the systems components.
%Short example, e.g. Return address corruption on the stack
+\subsection{Threat model}
+\label{sec:threat_model}
+
+If not stated otherwise, we will make use of a general model of an attacker which is defined by its capabilities.
+For this report we will assume an attacker that can arbitrarily write to memory at certain points in time during the execution of the software that is attacked.
+Further we assume that the attacker cannot execute data or write to code segments in memory, as enforced for example by a correspondingly configured Memory Protection Unit (MPU). %TODO introduce abbreviation again after abstract? But abstract is not really part of the text, is it?
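+Such a memory-write capability typically stems from a memory-corruption bug. The following hedged sketch (hypothetical function names, not taken from any cited work) shows the classic pattern of an unbounded copy into a fixed-size stack buffer, where overlong input overwrites adjacent stack memory up to and including the saved return address, together with a bounded variant:

```c
#include <string.h>

/* Hypothetical vulnerable routine: strcpy performs no bounds check, so
 * input longer than the 8-byte buffer overwrites adjacent stack memory,
 * eventually the saved return address -- exactly the "arbitrary write
 * at certain points in time" capability assumed by the threat model. */
void greet_unsafe(const char *input) {
    char name[8];
    strcpy(name, input);   /* overflow if strlen(input) >= 8 */
    (void)name;
}

/* Bounded variant: the copy is limited to the buffer size, so the
 * saved return address cannot be reached. */
void copy_bounded(char *dst, size_t dstlen, const char *input) {
    strncpy(dst, input, dstlen - 1);
    dst[dstlen - 1] = '\0';   /* strncpy may not null-terminate */
}
```

+Note that corrupting the return address alone does not yet let the attacker execute injected data as code if, as assumed above, an MPU enforces non-executable data segments.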
+As previously mentioned, the attacker is explicitly not able to perform any kind of physical attack on the system the considered system software is executed on.
+Most approaches disregard such attacks because they require different security measures whose discussion would go beyond the scope of this report, even though physical attacks are an important subject especially for IoT devices.
+
+Assuming such an attacker is realistic since the exploitation of memory corruption vulnerabilities can in general achieve some form of write access to memory.
+In practice this access can be severely constrained if certain security policies are enforced, but a defender might not always be able to rely upon such guarantees, as will be shown in later sections.
+\cite[p.~3, sect.~3]{bending}
+
+The foremost goal of an attacker could be arbitrary code execution, which means the attacker can invoke arbitrary system calls with arbitrary parameters to exercise all permissions the exploited software might have.
+Alternatively an attacker could try to achieve confined code execution, which allows the attacker to execute arbitrary code inside the memory space of the exploited software but excludes the ability to invoke arbitrary system calls.
+Lastly, the goal of an attacker could also be to leak information in the form of arbitrary values from memory.
+\cite[p.~3, sect.~3]{bending}
+
+
%TODO Threat Model? or threat model of every individual aproach?
\section{Approaches}
@@ -88,8 +111,7 @@ However this report will restrict it's examination exclusively on attacks withou
The following sub-chapters will deal with the question which approaches are currently researched or even in use to improve the security of low-power system oftware.
These approaches for enhancing system software security are not mutually exclusive and also non-exhaustive.
-The first examined approach is the introduction of hardware architecture enhancements to support existing/conventional software systems which in this report will cover the usage of additional memory protection hardware like the Memory Protection Unit (MPU) %TODO introduce abbreviation again after abstract? But abstract is not really part of the text, is it?
-and the Protected Module Architectures (PMA) which includes Trust Execution Environments (TEE).
+The first examined approach is the introduction of hardware architecture enhancements to support existing/conventional software systems, which in this report will cover the usage of additional memory protection hardware like the MPU and the Protected Module Architectures (PMA), which include Trusted Execution Environments (TEEs).
For usage in novel system software a (mostly) memory- and type-safe programming language can also be used (in contrast to the widespread usage of the C-language).
Some operating systems like seL4 use formal methods to verify the code of the system software against a specification.
The concept and properties of Control-Flow Integrity (CFI) is briefly introduced.
@@ -151,7 +173,7 @@ Based on this structure certain access control rules can be defined:
For example that the private section is only readable while the PC is inside the module's code in the public section or that the public section is not writeable from outside the module.
\cite[p.~347, sect.~3.1, par.~2,3]{spm}
This is called Program Counter Based Access Control (PCBAC).
-\cite[p.~4, sect.~2.2, par.~2]{b2}
+\cite[p.~4, sect.~2.2, par.~2]{pcbac}
Before the described structure of an SPM can provide any security guarantees it needs to be created and initialized.
The OS loads the sections SPublic and SEntry into memory first.
\cite[p.~1,2, sect.~1, par.~4]{rust}
-\colorbox{yellow}{[TODO...]}
+\subsubsection*{Attack vectors}
+An attacker would probably first search for vulnerabilities in those unsafe parts of the software that the type-safe parts need in order to operate, since here it might still be possible to perform traditional attacks like buffer overflows.
-"While type safety is a good property to have, it is not very strong. The kernel may still misbehave or attempt, for instance, a null pointer access." \cite[p.~218, par.~10]{sel4}
+%"While type safety is a good property to have, it is not very strong. The kernel may still misbehave or attempt, for instance, a null pointer access." \cite[p.~218, par.~10]{sel4}
\subsection{Formal verification}
\label{sec:verify}
@@ -331,7 +354,22 @@ CFI mechanisms restrict the set of target addresses of every control flow transf
Likewise software written in memory- and type-safe programming lanugages could indirectly make use of some of the security guarantees that a CFI mechanism provides since all of them depend on some usage of unsafe code like a VM (see sect.~\ref{sec:vm}) or libraries written in C which again can be secured using CFI.
\cite[p.~16:4, sect.~2.1, par.~1]{cfi_performance}
+CFI mechanisms consist of two phases: the analysis phase and the enforcement phase.
+In the analysis phase a control-flow graph (CFG) is constructed, which usually happens before the program's runtime.
+The program's code can be considered as split into blocks separated by control-flow transfers (as seen in fig.~\ref{fig:cfi}).
+Thus the CFG is a directed graph consisting of vertices representing such blocks and edges representing control-flow transfers.
+The CFG therefore describes the expected control flow of the program it is generated for.
+In the enforcement phase the CFI mechanism enforces the security policy as defined by the CFG.
+Any control-flow transfer is validated by checking if the transfer is contained in the set of valid CFG edges of the block the program is currently executing.
+If that is not the case then an attacker must have manipulated the program and the CFI mechanism can act accordingly.
+\cite[p.~16:2, par.~2]{cfi_performance}
+
+The more precisely the CFG is constructed, the stronger the security guarantees are that the CFI mechanism can provide. \cite[p.~16:4-16:5, sect.~2.1, par.~2]{cfi_performance}
+A CFI mechanism usually requires that any code memory segment of the program is enforced to be read-only, since otherwise an attacker could rewrite the instructions that check the control-flow transfers and thereby nullify any security guarantee of the CFI mechanism.
+Therefore the CFI mechanism does not have to validate direct control-flow transfers, whose target address is static and therefore known at compile time, so that an attacker cannot modify the address because the code cannot be modified.
+The control-flow transfers that in contrast have to be validated are the indirect control-flow transfers, whose target address is determined at runtime, which is the case for transfers that depend on input from a human user or for function returns that are read from the stack.
+%
\begin{lstlisting}[language=C]
bool lt(int x, int y) {
@@ -351,30 +389,14 @@ sort2(int a[], int b[], int len) {
\caption{A scheme of a CFG (bottom) of an example program (top). \cite[p.~4:10, fig.~1]{cfi_principles} The boxes in the scheme represent the blocks of code and the arrows represent the control-flow transfers. Direct control-flow transfers are represented by dotted arrows, calls (indirect transfer) by continuous arrows and returns (indirect transfer) by dashed arrows.}
\label{fig:cfi}
\end{figure}
-
-CFI mechanisms consist of two phases: the analysis phase and the enforcement phase.
-In the analysis phase a control-flow graph (CFG) is constructed which usually happens before the programs runtime. -The programs code can be considered as split into blocks separated by control-flow transfers (as seen in fig.~\ref{fig:cfi}). -Thus the CFG is a directed graph consisting of vertices representing such blocks and edges representing control-flow transfers. -The CFG is therefore a describes the expected control-flow of the program it is generated for. -In the enforcement phase the CFI mechanism enforces the security policy as defined by the CFG. -Any control-flow transfer is validated by checking if the control-flow transfer is contained in the set of valid CFG-edges of the current block the program is executing. -If that is not the case then an attacker must have manipulated the program and the CFI can act accourdingly. -\cite[S.~16:2, par.~2]{cfi_performance} - -The more precise the CFG is constructed the stronger the security guarantees are that the CFI mechanism can provide. \cite[p.~16:4-16:5, sect.~2.1, par.~2]{cfi_performance} - -A CFI mechanism usually requires that it is enforced that any code memory segment of the program is read-only since otherwise an attacker could rewrite the instructions that check the control-flow transfers and could therefore nullify any security guarantee of the CFI mechanism. -Therefore the CFI mechanism does not have to validate direct control-flow transfers from which the target address is static and therefore known at compile time so that an attacker could not modify the address since the code can not be modified. -The type of control-flow transfers that in constrast have to be validated are the indirect control-flow transfers from which the target address is to be determined at runtime which is the case for transfers that depends on input from a human user or for function returns that are read from the stack. 
+%
The target addresses of such indirect control-flow transfers can reside in registers as well as in the general memory depending on the program's code and the chosen compiler.
To enforce the security policy the CFI mechanism has to compare the target address of every indirect control-flow transfer to the set of valid target addresses in the CFG and determine if the program's control-flow still follows the expected edges in the CFG.
\cite[p.~16:4-16:5, sect.~2.1, par.~3,4]{cfi_performance}
However the CFG individually can not enforce that the return address of a function call actually returns to the address where the function has been called.
The CFG can not define the exact order that the edges must be traversed therefore returning to a different target address would not be a violation of the security policy as long as the target address corresponds to a valid edge in the CFG.
-This makes it easier for an attacker to use the more numerous possible edges to for example relatively easy % TODO see "Control-Flow Bending"
-perform a return-to-libc attack. %TODO explain return-to-libc shortly
+This makes it easier for an attacker to use the more numerous possible edges to, for example, relatively easily perform a return-to-libc attack, which is the technique of reusing functions that already exist in the code segment of the vulnerable process, including especially powerful libc functions like \lstil{system()} \cite[p.~2, sect.~2.2]{bending}.
To enforce that any \lstil{return} instruction actually returns to the caller a CFI mechanism can use a so called protected shadow call stack.
\cite[p.~4:23, sect.~5.4, par.~1]{cfi_principles}
@@ -386,8 +408,6 @@ If both return addresses differ an attacker might have manipulated the stack and
To make sure that the shadow call stack can not be manipulated by an attacker to bypass the restrictions it is necessary for the shadow call stack to reside in a protected memory area.
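+The shadow call stack check can be sketched as follows; this is a minimal editorial illustration with hypothetical names, assuming (as stated above) that the shadow stack itself resides in memory the attacker cannot write, e.g. behind an MPU:

```c
#include <stdint.h>

/* Sketch of a protected shadow call stack (names are hypothetical).
 * On a real system the `shadow` array would reside in memory that only
 * the CFI runtime may write, e.g. an MPU-protected region. */
#define SHADOW_DEPTH 32
static uintptr_t shadow[SHADOW_DEPTH];
static int shadow_top = 0;

/* Instrumentation at every call site: record the return address on the
 * shadow stack before transferring control to the callee. */
void cfi_on_call(uintptr_t return_addr) {
    shadow[shadow_top++] = return_addr;
}

/* Instrumentation at every return: the address about to be used from
 * the ordinary stack must match the shadow copy; a mismatch means the
 * saved return address was corrupted.  Returns 1 if the return is
 * valid and 0 on a CFI violation. */
int cfi_on_return(uintptr_t return_addr_from_stack) {
    uintptr_t expected = shadow[--shadow_top];
    return return_addr_from_stack == expected;
}
```

+In contrast to a plain CFG check, this comparison enforces that a return goes back to the matching call site, not merely to some valid call site.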
The memory protection can be enforced using CFI and some form of hardware memory protection. -%TODO Talk about the fact that even "perfect" CFI ist not fully secure (see "Control-Flow Bending") - Implementing CFI on low power MCUs is challenging due to the additional code size and the performance overhead that the runtime checks require. Also the necessary hardware memory protection can often just be provided by a relatively simple MPU which does not support memory segmentation like an MMU does (see sect.~\ref{sec:mpu}). \cite[p.~6, sect.~3.3]{cfi_care} @@ -482,7 +502,7 @@ The other objectives might also be mentioned but only in so far as they contribu In an OS that consists of multiple components and may even support multiple applications, memory faults may happen in one component or application which may in turn corrupt the state of other components or applications. Let an actor be either a component or application on the examined system, for conciseness purposes. Due to the possible memory corruption between the different actors, an isolation between them is desirable, so that an actor can fail and be properly processed by the OS kernel without impairing other actors. -Fault isolation on MCUs is typically not as easy as it is on general-purpose computers mostly due to hardware constraints, as described in section~\ref{intro}. +This kind of fault isolation on MCUs is typically not as easy as it is on general-purpose computers mostly due to hardware constraints, as described in section~\ref{intro}. The following part of this section covers how Tock OS aims to achieve the fault isolation and other security properties "by leveraging recent advances in microcontrollers and programming languages". 
\cite[p.~237, sect.~2.2, par.~5]{tockos}
@@ -500,6 +520,20 @@ Processes in contrast can be loaded at runtime and are analogous to the prevalen
Capsules are scheduled cooperatively and therefore can not interrupt each other while processes are scheduled preemptively and therefore can be interrupted causing a context switch if required.
\cite[p.~237, sect.~3]{tockos}
+\subsubsection{Threat Model}
+\label{sec:tock_threats}
+
+Tock OS separates the threats according to the four stakeholders of the system: "board integrators, kernel component developers, application developers, and end-users" \cite[p.~237, sect.~3.1, par.~1]{tockos}.
+
+Board integrators are the most trusted stakeholders as they have complete control over the firmware of the MCU since they combine the Tock kernel and the MCU-specific parts of the code.
+The kernel component developers create the capsules which define much of the kernel's functionality, including for example peripheral drivers.
+It is assumed that the board integrators audit the source code of the capsules before compiling the kernel.
+However the kernel component developers are not trusted to protect the confidentiality and integrity of other capsules: a buggy or malicious capsule may starve the CPU, but it cannot violate capsule isolation, for example by performing unauthorized accesses to peripherals.
+Application developers build processes that provide functionality to end-users and use the functionality that the kernel and its capsules provide to them.
+Applications, running as processes, are considered malicious since the board integrators might not be able to audit their code; therefore they must not be able to violate the confidentiality, integrity and availability of other parts of the system and, in contrast to capsules, must also not be able to starve the system.
+End-users can install, replace or update applications and may interact with I/O periphery, but they are also considered untrusted from the perspective of the OS.
+\cite[p.~237, sect.~3.1, par.~$\geq$2]{tockos}
+
\subsubsection{Capsules}
\label{sec:capsule}
@@ -580,38 +614,40 @@ In contrast to the processes, the MPU is disabled, as long as the kernel is acti
%\subsection{Evalutation} %TODO of Tock OS against what? by which measure? Maybe in compairison to the other approaches? How they could be used to enhance the security properties even more?
\section{Evalutation}
-The following part is an evaluation of Tock OS by the author, considering the other mentioned approaches: \colorbox{yellow}{[Comment:} The following part is just an evaluation idea, this might be thrown away if it is deemed superfluous/wasteful/subpar but i thought i might add some bit more of a personal contribution, even though it is not explicitly part of the main question of the paper]
+Even though Tock OS already makes use of a type-safe language (see sect.~\ref{sec:safelang}) in the form of Rust and of hardware architecture enhancements (see sect.~\ref{sec:hardware_arch_enhance}) like the MPU (see sect.~\ref{sec:mpu}), it might nevertheless benefit from the other approaches mentioned in section~\ref{sec:approaches}.
-To answer the question if Tock OS could benefit from the other mentioned approaches in section~\ref{sec:approaches}, it is necessary to first evaluate which of the approaches Tock already uses.
-Concerning hardware architecture enhancements (see sect.~\ref{sec:hardware_arch_enhance}) Tock already uses the MPU, although the usage of one of the PMA (see sect.~\ref{sec:pma}) approaches might be beneficial to the security properties of especially the trusted capsules in the kernel (see sect.~\ref{sec:capsule}) since the MPU might not be as powerful of an isolation mechanism as for examble a TEE.
-Since Tock already uses Rust for the kernel, the usage of a type-safe language (see sect.~\ref{sec:safelang}) is given.
-Formal verification (see sect.~\ref{sec:verify}) on the other hand is currently not applied in any form.
+In addition to the MPU, one of the PMA (see sect.~\ref{sec:pma}) approaches might also be beneficial to the security properties of especially the trusted capsules in the kernel (see sect.~\ref{sec:capsule}), since the MPU might not be as powerful an isolation mechanism as, for example, a TEE.
+
+Formal verification (see sect.~\ref{sec:verify}) is currently not applied in any form in Tock OS.
This might not be especially beneficial for the implementation of the processes or the trusted modules, since these are secured using hardware and the features of the Rust language, assuming that the implementation of the hardware and the Rust compiler is correct.
-But since the kernel's core and the trusted capsules have to use parts of unsafe code to function, especially when directly manipulating the hardware, formal verification of these trusted und partly unsafe parts of the OS could be used to identify critical bugs and design flaws.
-%TODO What about virtualization and CFI?
+But since the kernel's core and the trusted capsules have to use parts of unsafe code to function, especially when directly manipulating the hardware, formal verification of these trusted and partly unsafe parts of the OS could be used to identify critical bugs and design flaws, as stated by the project itself \cite[p.~248, 249, sect.~7, par.~6]{tockos}.
+
+Virtualization and CFI could additionally be applied to individual processes or capsules, although the resulting benefit is not clear.
+Especially since processes can be written in any language, it could be beneficial to apply a CFI mechanism to protect the internal integrity of a process, even though an attacker should not be able to corrupt other processes or capsules, assuming that the existing security guarantees hold.
\section{Conclusion}
\label{sec:conclusion}
Especially due to the constrained ressources securing low-power IoT system software will continue to be a challenging problem.
-Due to the requirement of the devices to use less power, have smaller form-factors and be cheaper the mentioned hardware ressource limitations will presumably pervail for the foreseeable future \cite[p.~249, sect.~8, par.~1]{tockos}.
+Since the requirement for the devices to use less power, have smaller form factors and be cheaper will probably not change either, the mentioned hardware resource limitations will presumably prevail for the foreseeable future \cite[p.~249, sect.~8, par.~1]{tockos}.
But as these device become more and more integrated in every aspect of our live it will become even more important to ensure certain security guarantees to provide these systems with the necessary confidentiality, integrity and availability.
+The presented approaches might be able to provide some of the security properties needed to reach the goal of a widespread usage of sufficiently secure low-power IoT system software that is able to protect itself and its users.
%\section{Prospects} %TODO Search for other upcoming hardware enhancements etc. otherwise remove the section!
%\colorbox{yellow}{[TODO?]}
+%TODO remove unused references
\begin{thebibliography}{00}
%\subsection*{\upshape Given Sources:}
\bibitem{spm} Strackx, Raoul \& Piessens, Frank \& Preneel, Bart. (2010). Efficient Isolation of Trusted Subsystems in Embedded Systems. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering. 50. 344-361. 10.1007/978-3-642-16161-2\_20.
-\bibitem{b2} Mühlberg, J.T., Noorman, J., Piessens, F. (2015). Lightweight and Flexible Trust Assessment Modules for the Internet of Things. In: Pernul, G., Y A Ryan, P., Weippl, E. (eds) Computer Security -- ESORICS 2015. ESORICS 2015. Lecture Notes in Computer Science(), vol 9326. Springer, Cham.
https://doi.org/10.1007/978-3-319-24174-6\_26 -\bibitem{b3} E. Baccelli et al., "RIOT: An Open Source Operating System for Low-End Embedded Devices in the IoT," in IEEE Internet of Things Journal, vol. 5, no. 6, pp. 4428-4440, Dec. 2018, doi: 10.1109/JIOT.2018.2815038. +\bibitem{pcbac} Mühlberg, J.T., Noorman, J., Piessens, F. (2015). Lightweight and Flexible Trust Assessment Modules for the Internet of Things. In: Pernul, G., Y A Ryan, P., Weippl, E. (eds) Computer Security -- ESORICS 2015. ESORICS 2015. Lecture Notes in Computer Science(), vol 9326. Springer, Cham. https://doi.org/10.1007/978-3-319-24174-6\_26 +%\bibitem{b3} E. Baccelli et al., "RIOT: An Open Source Operating System for Low-End Embedded Devices in the IoT," in IEEE Internet of Things Journal, vol. 5, no. 6, pp. 4428-4440, Dec. 2018, doi: 10.1109/JIOT.2018.2815038. \bibitem{tockos} Levy, Amit \& Campbell, Bradford \& Ghena, Branden \& Giffin, Daniel \& Pannuto, Pat \& Dutta, Prabal \& Levis, Philip. (2017). Multiprogramming a 64kB Computer Safely and Efficiently. 234-251. 10.1145/3132747.3132786. %\subsection*{\upshape Other sources:} -\bibitem{b5} Erlingsson, Úlfar. (2007). Low-Level Software Security: Attacks and Defenses. 92-134. 10.1007/978-3-540-74810-6\_4. -\bibitem{b6} Clercq, Ruan \& Piessens, Frank \& Schellekens, Dries \& Verbauwhede, Ingrid. (2014). Secure interrupts on low-end microcontrollers. 147-152. 10.1109/ASAP.2014.6868649. -\bibitem{b7} A. Dunkels, B. Gronvall and T. Voigt, "Contiki - a lightweight and flexible operating system for tiny networked sensors," 29th Annual IEEE International Conference on Local Computer Networks, 2004, pp. 455-462, doi: 10.1109/LCN.2004.38. -\bibitem{b8} Banegas, Gustavo \& Zandberg, Koen \& Herrmann, Adrian \& Baccelli, Emmanuel \& Smith, Benjamin. (2021). Quantum-Resistant Security for Software Updates on Low-power Networked Embedded Devices. +%\bibitem{b5} Erlingsson, Úlfar. (2007). Low-Level Software Security: Attacks and Defenses. 92-134. 
10.1007/978-3-540-74810-6\_4. +%\bibitem{b6} Clercq, Ruan \& Piessens, Frank \& Schellekens, Dries \& Verbauwhede, Ingrid. (2014). Secure interrupts on low-end microcontrollers. 147-152. 10.1109/ASAP.2014.6868649. +%\bibitem{b7} A. Dunkels, B. Gronvall and T. Voigt, "Contiki - a lightweight and flexible operating system for tiny networked sensors," 29th Annual IEEE International Conference on Local Computer Networks, 2004, pp. 455-462, doi: 10.1109/LCN.2004.38. +%\bibitem{b8} Banegas, Gustavo \& Zandberg, Koen \& Herrmann, Adrian \& Baccelli, Emmanuel \& Smith, Benjamin. (2021). Quantum-Resistant Security for Software Updates on Low-power Networked Embedded Devices. % Safe languages \bibitem{rust} Amit Levy, Bradford Campbell, Branden Ghena, Pat Pannuto, Prabal Dutta, and Philip Levis. 2017. The Case for Writing a Kernel in Rust. In Proceedings of the 8th Asia-Pacific Workshop on Systems (APSys '17). Association for Computing Machinery, New York, NY, USA, Article 1, 1–7. https://doi.org/10.1145/3124680.3124717 % Formal verification