OS

KB

  • Scheduler design goals:
    • Simplicity
    • Fairness
    • Responsiveness
    • Low waiting time
    • Flexible priorities
    • Predictability
  • Write-back cache: writes go to the cache only; a dirty entry is written back to the backing store only when it is about to be replaced (evicted).
  • Architecture
    • Monolithic OS: the entire OS runs in kernel space, alone in supervisor mode
    • Modular OS: parts of the kernel are implemented as modules that can be loaded at runtime
    • Microkernel OS: the kernel is broken down into separate processes, most of which run in user space
  • Memory layout of a process
    • Stack <--- free space ---> Heap, Data, Text (the stack grows down toward the heap; the heap grows up toward the stack)
  • Process Control Block (PCB)
    • PID
    • process state: Start, Ready (entered from Start, or when a running process is interrupted/preempted), Running, Waiting (for I/O etc.), Terminated
    • Privileges
    • Pointer to parent process
    • Program counter
    • CPU registers
    • CPU scheduling info
    • Memory info (page table, memory limits, segment table)
    • Accounting info (CPU slices, time limits...)
    • IO status info
  • What is a file descriptor?
    • Unix-like systems
    • A small integer naming a unit of I/O "resource" (file, pipe, socket, device...)
    • An abstraction defined by an API standard (POSIX), not by hardware (see the sketch below)
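    • A minimal sketch of the idea, assuming a POSIX system (the path /tmp/fd-demo.txt is just an illustrative choice):

      /* fd_demo.c -- a file descriptor is just a small integer handle. */
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>

      int main(void) {
          /* O_CREAT requires a mode argument; 0644 = rw-r--r-- */
          int fd = open("/tmp/fd-demo.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
          if (fd < 0) { perror("open"); return 1; }

          printf("got descriptor %d (0/1/2 are stdin/stdout/stderr)\n", fd);
          write(fd, "hello\n", 6);   /* the same call works on pipes and sockets */
          close(fd);                 /* releases the kernel-side resource */
          return 0;
      }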
  • Why are system calls expensive? (mode switch into the kernel, saving/restoring state, and cache/TLB pollution)
  • Thread vs process
    • Threads share code, data, heap, and open files, but each thread has its own registers and stack (see the sketch below)
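    • A small POSIX-threads sketch of that sharing (a hypothetical example; compile with -lpthread):

      /* thread_demo.c -- threads share globals and the heap, but not stacks. */
      #include <pthread.h>
      #include <stdio.h>

      int shared_counter = 0;                 /* data segment: visible to all threads */
      pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

      void *worker(void *arg) {
          (void)arg;
          int local = 0;                      /* lives on this thread's private stack */
          for (int i = 0; i < 100000; i++) {
              pthread_mutex_lock(&lock);
              shared_counter++;               /* shared: needs the mutex */
              pthread_mutex_unlock(&lock);
              local++;                        /* private: no lock needed */
          }
          printf("worker: &local = %p (distinct in each thread)\n", (void *)&local);
          return NULL;
      }

      int main(void) {
          pthread_t t1, t2;
          pthread_create(&t1, NULL, worker, NULL);
          pthread_create(&t2, NULL, worker, NULL);
          pthread_join(t1, NULL);
          pthread_join(t2, NULL);
          printf("shared_counter = %d\n", shared_counter);  /* 200000 */
          return 0;
      }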
  • Kernel level threads
  • I/O device
    • Block devices -- hard disks, USB memory sticks, etc.; data is addressed and transferred in fixed-size blocks
    • Character devices -- serial ports, keyboards, etc.; data is handled as a byte stream
  • DMA (direct memory access): the device transfers data to/from main memory directly, without the CPU copying every byte
  • Process scheduling
    • PCBs are maintained in process scheduling queues, one queue per process state.
    • PCBs of processes in the same state are placed in the same queue.
    • Job queue (all processes), ready queue (processes ready to run), device queues (processes waiting on a particular device).
    • Context switch -- saves and restores the CPU context of a process to/from its PCB -- essential for a multitasking OS.
      • Some hardware systems employ two or more sets of processor registers to avoid the cost
  • Hyper-threading: essentially a trick to reclaim performance lost to stalls; one physical core runs two hardware threads, so when one thread stalls or blocks, the other can use the idle execution units.
  • What does fork do?
    • https://www.geeksforgeeks.org/fork-system-call/
    • fork() creates a new process by duplicating the calling process
    • What is duplicated?
      • Same PC (program counter)
      • Same registers
      • Same open files
    • Returns an integer value (see the sketch below)
      • negative: error, no child was created
      • zero: we are inside the child process
      • positive: we are inside the parent process; the value is the child's PID
    • Declared in unistd.h
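    • A minimal sketch of the return-value convention above (assumes a POSIX system):

      /* fork_demo.c -- fork() duplicates the calling process. */
      #include <stdio.h>
      #include <sys/wait.h>
      #include <unistd.h>

      int main(void) {
          pid_t pid = fork();          /* both processes continue from here */

          if (pid < 0) {               /* negative: fork failed */
              perror("fork");
              return 1;
          } else if (pid == 0) {       /* zero: we are in the child */
              printf("child:  pid=%d, parent=%d\n", (int)getpid(), (int)getppid());
          } else {                     /* positive: in the parent; pid is the child's PID */
              printf("parent: pid=%d, child=%d\n", (int)getpid(), (int)pid);
              wait(NULL);              /* reap the child to avoid a zombie */
          }
          return 0;
      }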
  • virtual memory
    • versus physical memory
    • Each process has its own virtual address space, limited only by the pointer width
    • Contains both code and data (stack, heap)
    • malloc and free are the main APIs; calloc and realloc are also available (see the sketch below)
      • calloc will initialize the allocated memory to 0, while malloc doesn't
      • calloc takes two parameters: num of blocks and size per block.
      • realloc: can be used to change the allocated size given an existing pointer.
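      • A short sketch contrasting the three calls (hypothetical example, standard C):

        /* heap_demo.c -- malloc vs calloc vs realloc. */
        #include <stdio.h>
        #include <stdlib.h>

        int main(void) {
            int *a = malloc(4 * sizeof *a);      /* contents are uninitialized */
            int *b = calloc(4, sizeof *b);       /* (num blocks, block size): zero-initialized */
            if (!a || !b) return 1;

            for (int i = 0; i < 4; i++) a[i] = i;

            /* grow a to 8 ints; realloc may move the block, so keep the new pointer */
            int *tmp = realloc(a, 8 * sizeof *a);
            if (!tmp) { free(a); free(b); return 1; }
            a = tmp;

            printf("b[0]=%d (calloc zeroes), a[3]=%d (preserved by realloc)\n", b[0], a[3]);
            free(a);
            free(b);
            return 0;
        }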
    • OS (kernel) is responsible for managing the mapping between virtual memory and physical memory
      • Only a subset of the virtual address space is resident in physical memory at any time; other in-use pages may be kept on disk (swap/backing store).
      • Paging and page-replacement algorithms.
    • Use less main memory, isolate processes, simplify programming
    • Implementation: HW+SW
      • Demand paging/segmentation
        • If the CPU tries to access a page that is not present in main memory, a memory access fault interrupt (page fault) is generated
        • The OS then locates the required page in the backing store and brings it into a free frame, updating the page table.
          • What is "logical address space"? The set of (virtual) addresses a process can generate, as opposed to the physical address space (see Frames vs Pages below).
    • Swapping: swapping a process out means removing all of its pages from memory (or marking them so they can be removed). This happens when a process is suspended from execution.
    • Thrashing: a process (or the whole system) spends most of its time swapping pages in and out rather than doing useful work
    • TLB: Translation Look-aside Buffer
      • Each process will have a page table of PTEs (page table entries)
        • PTE: frame number, valid/invalid bit, dirty bit, protection bit etc.
      • The problem is where to place the page table
        • In main memory: works, but adds an extra memory access to every address translation
        • TLB: a small, fast cache that stores the most recently used PTEs
          • On a TLB miss, the page number is used to index the process's full page table
      • MMU (Memory Management Unit): special hardware that manages many aspects of paging and includes the TLB
    • Frames vs Pages
      • The Physical Address Space is conceptually divided into a number of fixed-size blocks, called frames.
      • The Logical Address Space is also split into fixed-size blocks, called pages.
      • Page Size = Frame Size
    • Multi-level/hierarchical paging
      • The virtual address is split into a Level 1 index + Level 2 index + Level 3 index (+ more levels on some architectures) + the final page offset (see the sketch below).
      • First, chop up the page table into page-sized units; then, if an entire page of page-table entries (PTEs) is invalid, don’t allocate that page of the page table at all.
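      • A toy sketch of the split, assuming an x86-64-style layout (four 9-bit indices plus a 12-bit offset for 4 KiB pages; real layouts vary by architecture):

        /* vaddr_split.c -- splitting a virtual address for multi-level paging. */
        #include <stdint.h>
        #include <stdio.h>

        #define OFFSET_BITS 12   /* 4 KiB pages */
        #define INDEX_BITS   9   /* 512 PTEs per page-table page */

        int main(void) {
            uint64_t vaddr = 0x00007f1234567abcULL;   /* arbitrary example address */

            uint64_t offset = vaddr & ((1ULL << OFFSET_BITS) - 1);
            uint64_t l1 = (vaddr >> OFFSET_BITS) & ((1ULL << INDEX_BITS) - 1);
            uint64_t l2 = (vaddr >> (OFFSET_BITS + INDEX_BITS)) & ((1ULL << INDEX_BITS) - 1);
            uint64_t l3 = (vaddr >> (OFFSET_BITS + 2 * INDEX_BITS)) & ((1ULL << INDEX_BITS) - 1);
            uint64_t l4 = (vaddr >> (OFFSET_BITS + 3 * INDEX_BITS)) & ((1ULL << INDEX_BITS) - 1);

            /* the hardware walks: L4 table -> L3 table -> L2 table -> L1 table -> frame + offset */
            printf("L4=%llu L3=%llu L2=%llu L1=%llu offset=0x%llx\n",
                   (unsigned long long)l4, (unsigned long long)l3,
                   (unsigned long long)l2, (unsigned long long)l1,
                   (unsigned long long)offset);
            return 0;
        }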
  • Difference between virtual memory and caches?
  • IPC mechanisms
    • https://opensource.com/article/19/4/interprocess-communication-linux-storage
    • Shared files
      • File-locking APIs: flock, fcntl (see the sketch below)
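      • A tiny flock sketch, assuming a POSIX system (the lock-file path is just an illustrative choice); run two copies to see the second one block:

        /* flock_demo.c -- advisory file locking between cooperating processes. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/file.h>
        #include <unistd.h>

        int main(void) {
            int fd = open("/tmp/demo.lock", O_CREAT | O_RDWR, 0644);
            if (fd < 0) { perror("open"); return 1; }

            flock(fd, LOCK_EX);              /* blocks until the exclusive lock is granted */
            printf("pid %d holds the lock\n", (int)getpid());
            sleep(2);                        /* critical section: touch the shared file here */
            flock(fd, LOCK_UN);              /* other processes may now acquire it */

            close(fd);
            return 0;
        }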
    • Shared memory
      • API:
        • System V: legacy
        • POSIX: mmap -- combines speed with persistence
      • Protection: semaphores
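      • A writer-side sketch of the POSIX route above (shm_open + mmap); the segment name /kb_demo_shm is hypothetical, and a reader would shm_open the same name and mmap it:

        /* shm_writer.c -- POSIX shared memory sketch (may need -lrt on older glibc). */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <unistd.h>

        #define SHM_NAME "/kb_demo_shm"
        #define SHM_SIZE 4096

        int main(void) {
            int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
            if (fd < 0) { perror("shm_open"); return 1; }
            ftruncate(fd, SHM_SIZE);                     /* set the segment size */

            char *p = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }

            strcpy(p, "hello from the writer");          /* visible to any process mapping SHM_NAME */
            /* in a real setup a POSIX semaphore (sem_open) would guard this write */

            munmap(p, SHM_SIZE);
            close(fd);
            /* shm_unlink(SHM_NAME) once the segment is no longer needed */
            return 0;
        }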
    • Message passing
    • Pipes: file descriptor based
      • Pipes provide a unidirectional interprocess communication channel.
      • A pipe is created using pipe(2), which returns two file descriptors, one referring to the read end of the pipe, the other referring to the write end.
      • https://toroid.org/unix-pipe-implementation
        • Allocate buffers, API for reading/writing from/to the buffer, and locking mechanism for access control (atomicity)
        • Reading or writing pipe data is atomic if the size of data written is not greater than PIPE_BUF.
      • The atomicity of a pipe operation matters when more than one process is involved (e.g. with FIFOs). If the number of bytes written exceeds the atomic limit for a single operation and multiple processes are writing to the pipe, the data will be interleaved ("chunked"): one process may insert data into the pipe between the writes of another.
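      • A minimal pipe(2) sketch (assumes a POSIX system): the parent writes, the child reads:

        /* pipe_demo.c -- unidirectional pipe between parent and child. */
        #include <stdio.h>
        #include <string.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void) {
            int fds[2];                         /* fds[0]: read end, fds[1]: write end */
            if (pipe(fds) < 0) { perror("pipe"); return 1; }

            pid_t pid = fork();
            if (pid == 0) {                     /* child: reader */
                close(fds[1]);                  /* close the unused write end */
                char buf[64];
                ssize_t n = read(fds[0], buf, sizeof buf - 1);
                buf[n > 0 ? n : 0] = '\0';
                printf("child read: %s\n", buf);
                close(fds[0]);
                return 0;
            }
            /* parent: writer */
            close(fds[0]);                      /* close the unused read end */
            const char *msg = "hello through the pipe";
            write(fds[1], msg, strlen(msg));    /* <= PIPE_BUF bytes, so the write is atomic */
            close(fds[1]);                      /* signals EOF to the reader */
            wait(NULL);
            return 0;
        }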
    • Sockets:
    • RPC
    • Signals
  • Virtualization
    • Hypervisor: controls the actual hardware.
    • Hypervisor + Machine = Host
      • Many VMs: guests
    • Full vs Para virtualization
    • Xen and KVM are two hypervisors available on Linux.
      • A Xen kernel needs to be installed and booted into.
      • KVM is a module of the Linux kernel.
      • Type-1 hypervisor: bare-metal hypervisor that runs directly on the hardware (e.g. Hyper-V and VMware ESXi)
      • Type-2 hypervisor: runs on top of a host OS (e.g. VMware Workstation, VirtualBox)
  • RAID: Redundant Array of Independent Disks https://en.wikipedia.org/wiki/RAID. RAID levels are basically recipes for different trade-offs between reliability, performance, availability, and capacity. https://www.prepressure.com/library/technology/raid has some great illustrations.
  • Differences between select, poll, and epoll
    • epoll can be read as "event poll"; despite the name, it is not actually busy polling! (see the sketch below)
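    • A minimal epoll sketch (Linux-specific), watching stdin for readability; the 5-second timeout is arbitrary:

      /* epoll_demo.c -- register interest once, then block in epoll_wait. */
      #include <stdio.h>
      #include <sys/epoll.h>
      #include <unistd.h>

      int main(void) {
          int epfd = epoll_create1(0);
          if (epfd < 0) { perror("epoll_create1"); return 1; }

          struct epoll_event ev = { .events = EPOLLIN, .data.fd = STDIN_FILENO };
          epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev);   /* register interest once */

          struct epoll_event ready[8];
          int n = epoll_wait(epfd, ready, 8, 5000);            /* block up to 5 s for events */
          for (int i = 0; i < n; i++) {
              char buf[128];
              ssize_t len = read(ready[i].data.fd, buf, sizeof buf);
              printf("fd %d readable, read %zd bytes\n", ready[i].data.fd, len);
          }
          close(epfd);
          return 0;
      }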

High-level

https://medium.com/cracking-the-data-science-interview/how-operating-systems-work-10-concepts-you-should-know-as-a-developer-8d63bb38331f

  • Three key elements of an OS
    • Abstractions: process, thread, file, socket, memory
    • Mechanisms: create, schedule, open, write, allocate
    • Policies: LRU, EDF

GRADUATE COURSE: Intro. to Computer Systems Research

  • Concurrency
    • Mutexes and monitors: Mesa semantics ("Processes and Monitors in Mesa")
  • High-performance Web server
    • C10K problem
    • Event-driven, async I/O
    • Low-level: TCP flow control (receiver buffer/window), congestion control (react to packet loss), slow start (estimate available bandwidth)
    • Persistent connections
  • Concurrency Control & Recovery in DB
    • Motivation for concurrency: Performance (multi-core, I/O overlapping)
    • Programming model: Transaction: ACID
      • Serializability test
      • 2 Phase Locking
    • Data structures: Volatile/Non-volatile, buffer pool, paging, WAL
      • STEAL/NO-STEAL: may dirty pages be flushed to disk before the transaction commits?
      • FORCE/NO-FORCE: must all of a transaction's dirty pages be flushed to disk when it commits?
    • Logging: UNDO/REDO, Physical/Logical
    • Recovery: Use log and checkpoint
  • Distributed Transactions
    • Why distribution? Partition of workload, Reliability by replication
    • Coordinator + Participants
    • DT Log on stable storage
    • 2PC: tolerates message loss and participant failures via the DT log, but can block if the coordinator fails
    • 3PC: Non-blocking
  • Distributed Computation:
    • Clocks, Event Ordering, and Global Predicate Computation
    • A stable-RTT-based scheme to synchronize physical clocks
    • Lamport clock: consistent with causality but cannot fully capture it (C(a) < C(b) does not imply a happened before b); hard to replay
    • Vector clock: better than the above -- fully captures causality (see the sketch below)
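    • A toy vector-clock sketch for a fixed set of three processes (hypothetical example): merge on message receipt, and compare clocks to decide "happened-before" vs "concurrent":

      /* vclock_demo.c -- toy 3-process vector clocks. */
      #include <stdbool.h>
      #include <stdio.h>

      #define N 3                                 /* number of processes (fixed here) */

      /* a happened-before b iff a <= b component-wise and a != b */
      bool happens_before(const int a[N], const int b[N]) {
          bool strictly_less = false;
          for (int i = 0; i < N; i++) {
              if (a[i] > b[i]) return false;
              if (a[i] < b[i]) strictly_less = true;
          }
          return strictly_less;
      }

      /* on message receive: component-wise max with the message's clock, then tick own entry */
      void merge(int local[N], const int msg[N], int self) {
          for (int i = 0; i < N; i++)
              if (msg[i] > local[i]) local[i] = msg[i];
          local[self]++;
      }

      int main(void) {
          int p0[N] = {1, 0, 0};                  /* one local event at process 0 */
          int p1[N] = {0, 0, 0};

          merge(p1, p0, 1);                       /* p1 receives a message carrying p0's clock */
          printf("p0 send vs p1 receive: %s\n",
                 happens_before(p0, p1) ? "causally ordered" : "concurrent");

          int p2[N] = {0, 0, 1};                  /* independent event at process 2 */
          printf("p0 event vs p2 event:  %s\n",
                 happens_before(p0, p2) || happens_before(p2, p0) ? "ordered" : "concurrent");
          return 0;
      }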
  • Consensus protocol: Paxos
  • Distributed File System
    • Spanner
      • Baseline: simplest approach to implement distributed transactions in a sharded setting is to layer 2PC over Paxos