The virtual-to-physical translation (mapping) is done in hardware and is fast; the operating system is responsible for the contents of the physical pages and for maintaining the consistency of the mapping. The operating system can provide more virtual memory than there is physical memory. To do this, the operating system does not keep all the virtual memory pages in RAM. A page that is not in RAM is said to be nonresident, or paged out: its contents have been written to disk. The process of writing a virtual memory page's contents to disk is called paging out. When a page of memory is needed by a program, the operating system locates a free, unused page of physical memory. If none is available, one is made free by paging out the virtual page that currently occupies it. Once a page of physical memory is available, the virtual page's contents are paged in from disk, and the virtual-to-physical address translation is updated so that the virtual address translates to the right physical address.
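The page-in and page-out steps above can be sketched as a toy simulation. This is only an illustration, not how any real kernel is written: the class name `TinyPager`, the 4096-byte page size, and the choice to page out whichever page has been resident longest are all assumptions of mine (the notes above do not say which page the OS should pick).

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class TinyPager:
    """Toy model of demand paging; every name here is hypothetical."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))  # unused physical pages
        self.resident = OrderedDict()  # virtual page -> frame, oldest first
        self.disk = set()              # virtual pages currently paged out
        self.page_ins = 0
        self.page_outs = 0

    def translate(self, vaddr):
        """Return the physical address for vaddr, paging in if needed."""
        vpage, offset = divmod(vaddr, PAGE_SIZE)
        if vpage not in self.resident:
            self._page_in(vpage)
        return self.resident[vpage] * PAGE_SIZE + offset

    def _page_in(self, vpage):
        if not self.free_frames:
            # No free frame: page out the longest-resident page (assumed policy).
            victim, frame = self.resident.popitem(last=False)
            self.disk.add(victim)
            self.page_outs += 1
            self.free_frames.append(frame)
        frame = self.free_frames.pop(0)
        self.disk.discard(vpage)      # contents come back from disk
        self.resident[vpage] = frame  # update the translation
        self.page_ins += 1


# Two physical frames, three virtual pages in use: some paging is unavoidable.
pager = TinyPager(num_frames=2)
for vaddr in (0, 4096, 8192, 100):
    print(vaddr, "->", pager.translate(vaddr))
print("page-ins:", pager.page_ins, "page-outs:", pager.page_outs)
```

Note that virtual address 100 lands in a different physical frame the second time its page is brought in; the program never notices, because only the translation changed.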
The operating system knows when a page is needed by a program by means of a page fault. This is an exception that is generated by the hardware whenever a program tries to access a virtual address that has no valid translation or that it is not allowed to access. (This could be caused by an error in the program, in which case it is a segmentation fault and the operating system generates a core image or other diagnostic information.) See below for an explanation of exceptions and interrupts.
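The two possible outcomes of a page fault can be shown with a small sketch. The function `access`, the exception class `SegmentationFault`, and the two sets of page numbers are hypothetical names invented for this illustration; a real fault handler works on hardware page tables, not Python sets.

```python
class SegmentationFault(Exception):
    """Stands in for the fatal outcome: the access was a program error."""


def access(vpage, mapped_pages, resident_pages):
    """Simulate one memory access and report what the OS had to do.

    mapped_pages:   virtual pages the process is allowed to use.
    resident_pages: the subset currently held in physical memory.
    """
    if vpage in resident_pages:
        return "hit"                   # hardware translates; no fault
    # Not resident: the hardware raises a page fault and the OS takes over.
    if vpage not in mapped_pages:
        raise SegmentationFault(f"illegal access to page {vpage}")
    resident_pages.add(vpage)          # stands in for paging in from disk
    return "paged in"                  # the program resumes where it left off


mapped = {0, 1, 2}
resident = {0}
print(access(0, mapped, resident))
print(access(1, mapped, resident))
try:
    access(9, mapped, resident)
except SegmentationFault as err:
    print("fatal:", err)
```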
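I also talked about shared text pages as an optimization to lower physical memory requirements: processes running the same program can map the same physical pages of code. This is another reason why text segments should be read-only, since a write by one process would otherwise change the code seen by all the others.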
A thread is basically an independent register set. A process is at least one thread, an address space and associated memory, and various other state like the current directory and access rights (I/O descriptors, user ID, etc). Threads run independently, so if there are two threads, the two register sets change independently under program control -- their two PCs cause (usually) different instructions to be executed.
The idea of a thread is an abstraction. There are two ways that threads are implemented: kernel threads and user threads. A kernel thread is one that is provided for you by the operating system, and the kernel automatically context switches among them. Like processes that block, if one kernel thread becomes blocked, the OS will switch the CPU to another runnable kernel thread. Because the kernel threads are managed by the OS, if the program is running on a multiprocessor, the kernel threads can actually take advantage of that and multiple runnable kernel threads can run on the various physical processors available.
User threads are also known as coroutine threads or cooperative threads. The kernel does not know about them -- application code (usually in a library) does the context switching. Context switching between user threads is faster, since the thread management in user space is simpler: no user-to-kernel context switch needs to be done. Coroutine threads are typically non-preemptive: a thread must voluntarily give up the CPU to another thread by explicitly calling a yield function. Furthermore, if a program using coroutine threads is run on a multiprocessor, very little speedup will occur: the OS cannot make the coroutine threads run simultaneously on two or more processors.
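Cooperative threads can be sketched with Python generators, where each generator keeps its own private state (playing the role of a register set) and the `yield` statement is the explicit yield call. The names `worker` and `run` are made up for this sketch, and a real user-thread library would save and restore actual CPU registers rather than rely on generators.

```python
from collections import deque


def worker(name, steps, log):
    """A user thread: does one step of work, then voluntarily yields."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # explicit yield point; nothing preempts the thread before this


def run(threads):
    """Round-robin scheduler that lives entirely in user space."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # resume the thread until its next yield
            ready.append(thread)  # still runnable: back of the queue
        except StopIteration:
            pass                  # thread finished; drop it


log = []
run([worker("A", 2, log), worker("B", 3, log)])
print(log)
```

The kernel sees only one thread of control here, which is why a program built this way gains nothing from a second processor, and why a worker that never reaches its `yield` would starve all the others.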
Some exceptions are fatal to the program (typically division by zero is) and generate a core image (or at least a signal to be delivered). Other exceptions are benign: a page fault causes the operating system to page in the missing page, after which the processor continues from where it left off, and the program remains unaware that an exception occurred.
Timer interrupts occur typically every 1/60th or 1/100th of a second. Operating systems use this interrupt to keep their time-of-day clock accurate, and to preemptively context switch among the runnable processes or runnable threads within a process. Interrupts from hardware devices are transparent to the running process.
The OS does not context switch on every timer interrupt. Processes/threads are given a time slice, called a scheduling quantum, in which to run before being preemptively context switched. The size of the scheduling quantum determines the apparent simultaneity of the programs running and the interactive response time of the system. If the scheduling quantum is small, the interactive response improves. If the quantum size is too small, however, the efficiency of the system goes down: the CPU spends too much of its time context switching and not enough doing real work.
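The tradeoff can be made concrete with a little arithmetic. The sketch below assumes simple round-robin scheduling in which every process uses its whole quantum; the 0.1 ms context-switch cost and the count of 10 runnable processes are made-up figures chosen only for illustration, and the function names are mine.

```python
def cpu_efficiency(quantum_ms, switch_ms):
    """Fraction of CPU time spent on real work rather than context switching."""
    return quantum_ms / (quantum_ms + switch_ms)


def worst_case_wait_ms(quantum_ms, switch_ms, runnable):
    """Longest a runnable process waits for its next turn under round-robin."""
    return (runnable - 1) * (quantum_ms + switch_ms)


SWITCH_MS = 0.1  # assumed cost of one context switch
RUNNABLE = 10    # assumed number of runnable processes

for quantum in (100.0, 10.0, 1.0, 0.1):
    eff = cpu_efficiency(quantum, SWITCH_MS)
    wait = worst_case_wait_ms(quantum, SWITCH_MS, RUNNABLE)
    print(f"quantum {quantum:6.1f} ms  efficiency {eff:6.1%}  worst wait {wait:7.1f} ms")
```

With these figures, shrinking the quantum from 100 ms to 0.1 ms cuts the worst-case wait from about 900 ms to about 2 ms, but efficiency falls from about 99.9% to 50%: half of the CPU's time goes to context switching.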