Multiprocessors and Process Synchronization and Consistency (Advanced Computer Architecture)

How multiprocessor environments are synchronized and how memory is kept consistent


Shared Memory Multiprocessors

  • Introduction
  • UMA systems
  • NUMA systems
  • COMA systems
  • Cache coherency
  • Process synchronization
  • Models of memory consistency

Shared memory multiprocessors

  • A system with multiple CPUs “sharing” the same main memory is called a multiprocessor.
  • In a multiprocessor system, all processes on the various CPUs share a single logical address space, which is mapped onto a physical memory that can be distributed among the processors.
  • Each process can read and write a data item simply by using load and store operations, and processes communicate through shared memory.
  • It is the hardware that lets all CPUs access and use the same main memory.
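
A minimal sketch of this programming model (POSIX threads stand in for processes sharing the address space; the counter and the names are purely illustrative):

    #include <pthread.h>
    #include <stdio.h>

    /* A variable in the shared address space: both threads reach it with
     * ordinary load and store instructions, no message passing involved. */
    static int shared_counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);   /* synchronization, discussed later */
            shared_counter++;            /* a plain load + store on shared data */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("shared_counter = %d\n", shared_counter);   /* prints 200000 */
        return 0;
    }

Compile with -lpthread; on a machine without shared memory the same exchange would require explicit send/receive operations.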

Shared memory multiprocessors

  • Since all CPUs share the address space, only a single instance of the operating system is required.
  • When a process terminates or goes into a wait state for whatever reason, the O.S. can look in the process table (more precisely, in the ready-process queue) for another process to dispatch to the idle CPU.
  • By contrast, in systems with no shared memory, each CPU must have its own copy of the operating system, and processes can only communicate through message passing.
  • The basic issue in shared memory multiprocessor systems is memory itself: the larger the number of processors involved, the more difficult it is to use memory efficiently.

Shared memory multiprocessors

  • All modern OSs (Windows, Solaris, Linux, MacOS) support symmetric multiprocessing (SMP), with a scheduler running on every processor (a simplified description, of course).
  • “Ready to run” processes can be inserted into a single queue that can be accessed by every scheduler; alternatively, there can be a “ready to run” queue for each processor.
  • When a scheduler is activated on a processor, it chooses one of the “ready to run” processes and dispatches it on its processor (with a single queue, things are somewhat more difficult; can you guess why?).
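
The difficulty with a single queue is contention: every scheduler, on every CPU, must acquire the same lock before it can extract a process. A minimal sketch of such a shared ready queue (the data structures and names are invented for illustration):

    #include <pthread.h>

    struct task {
        struct task *next;
        /* ... saved registers, priority, etc. ... */
    };

    /* The single "ready to run" queue shared by all processors. */
    struct ready_queue {
        pthread_spinlock_t lock;   /* every scheduler contends for this lock */
        struct task *head;         /* linked list of runnable processes */
    };

    /* Called by the scheduler running on an idle CPU. */
    struct task *pick_next_task(struct ready_queue *rq)
    {
        pthread_spin_lock(&rq->lock);    /* serialises all CPUs on one lock */
        struct task *t = rq->head;
        if (t)
            rq->head = t->next;
        pthread_spin_unlock(&rq->lock);
        return t;                        /* NULL: nothing to run, stay idle */
    }

With many CPUs the lock itself becomes the bottleneck, which is why per-processor queues (next slide) are often preferred.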

Shared memory multiprocessors

  • Modern OSs designed for SMP often have a separate queue for each processor (to avoid the problems associated with a single queue).
  • There is an explicit mechanism for load balancing, by which a process on the wait list of an overloaded processor is moved to the queue of another, less loaded processor.
  • As an example, SMP Linux activates its load-balancing scheme every 200 ms, and whenever a processor queue empties.
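
A hedged sketch of the balancing idea (this is not the actual Linux implementation; the per-CPU queue lengths and the migration step are simplified for illustration):

    #define NCPU 8

    /* Length of each per-CPU "ready to run" queue, updated by the schedulers. */
    static int queue_len[NCPU];

    /* Invoked periodically (e.g. every 200 ms) and whenever a queue empties:
     * move one process from the most loaded CPU to the least loaded one. */
    static void balance_load(void)
    {
        int busiest = 0, idlest = 0;
        for (int cpu = 1; cpu < NCPU; cpu++) {
            if (queue_len[cpu] > queue_len[busiest]) busiest = cpu;
            if (queue_len[cpu] < queue_len[idlest])  idlest  = cpu;
        }
        if (queue_len[busiest] - queue_len[idlest] > 1) {
            /* A real kernel would unlink a task from one queue (taking both
             * queue locks) and enqueue it on the other; only the counters
             * are adjusted here. */
            queue_len[busiest]--;
            queue_len[idlest]++;
        }
    }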

Shared memory multiprocessors

  • Migrating a process to a different processor can be costly when each core has a private cache (can you guess why?).
  • This is why some OSs, such as Linux, offer a system call to specify that a process is tied to a given processor, independently of the processor load (see the sketch below).
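
On Linux that system call is sched_setaffinity(2); a minimal usage sketch that ties the calling process to CPU 0:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(0, &mask);                    /* allow execution on CPU 0 only */

        /* pid 0 means "the calling process"; from now on the scheduler will
         * not migrate it to another CPU, whatever the load. */
        if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        return 0;
    }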

  • There are three classes of multiprocessors, according to the way each CPU sees main memory:

1. Uniform Memory Access (UMA): every CPU sees the same access time to every location of the shared memory, wherever the data physically resides.

Shared memory multiprocessors

2. Non Uniform Memory Access (NUMA): these systems have a shared logical address space, but physical memory is distributed among the CPUs, so that the access time to a data item depends on its position, in local or in remote memory (hence the NUMA denomination).

  • These systems are also called Distributed Shared Memory (DSM) architectures (Hennessy-Patterson, Fig. 6.2).
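
To make the local/remote distinction concrete, the sketch below uses libnuma on Linux to place a buffer on a chosen node (assuming the library is installed; link with -lnuma):

    #include <numa.h>
    #include <stdio.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA is not supported on this system\n");
            return 1;
        }

        /* Allocate 1 MiB backed by physical memory on node 0: a CPU on node 0
         * reaches it locally, CPUs on other nodes go through the interconnect
         * and see a higher access time. */
        size_t size = 1 << 20;
        void *buf = numa_alloc_onnode(size, 0);
        if (buf == NULL)
            return 1;

        numa_free(buf, size);
        return 0;
    }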

Shared memory multiprocessors

3. Cache Only Memory Access (COMA) : data have no specific “permanent” location (no specific memory address) where they stay and whence they can be read (copied into local caches) and/or modified (first in the cache and then updated at their “permanent” location).

  • Data can migrate and/or be replicated among the various memory banks of the central main memory.

UMA multiprocessors

  • Larger multiprocessor systems (>32 CPUs) cannot use a single bus to interconnect CPUs and memory modules, because bus contention becomes unmanageable.
  • The CPU-memory connection is realized instead through an interconnection network (in jargon, a “fabric”).

UMA multiprocessors

  • Caches local to each CPU alleviate the problem; furthermore, each processor can be equipped with a private memory to store data of computations that need not be shared with other processors. Traffic to/from shared memory can thus be reduced considerably (Tanenbaum, Fig. 8.24).

UMA multicores - manycores

(Diagram: a multicore chip (2 to 22 cores), each core with private L1 and L2 caches, a shared L3 cache, and shared RAM; a manycore chip (about 70 cores), each core with private L1 and L2 caches, a shared LLC, and shared RAM.)

Caches and memory in multiprocessors

  • Memory (and the memory hierarchy) in multiprocessors poses two different problems:
  • Coherency: whenever the address space is shared, the same memory location can have multiple instances (cached copies) at different processors.
  • Consistency: whenever processors can see different access times, write operations performed by different processors require some model guaranteeing a sound, consistent behaviour (the when issue, namely the ordering of writes).
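
The ordering (“when”) issue can be made concrete with the classic flag-and-data pattern. The sketch below uses C11 atomics; the release/acquire pair is exactly the write-ordering guarantee that a consistency model, or explicit synchronization on top of a weaker model, has to provide:

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    static int payload;              /* ordinary shared data             */
    static atomic_int ready = 0;     /* flag meaning "payload is valid"  */

    static void *producer(void *arg)
    {
        (void)arg;
        payload = 42;                                              /* write 1 */
        atomic_store_explicit(&ready, 1, memory_order_release);    /* write 2 */
        return NULL;
    }

    static void *consumer(void *arg)
    {
        (void)arg;
        /* Without the release/acquire ordering, a processor (or the compiler)
         * could make write 2 visible before write 1, and the consumer would
         * read a stale payload. */
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                                  /* spin until the flag is seen */
        printf("payload = %d\n", payload);     /* guaranteed to print 42 */
        return NULL;
    }

    int main(void)
    {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }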

Crossbar switch UMA systems

  • A switch is located at each crosspoint between a vertical and a horizontal line, allowing the two to be connected when required.
  • In the figure, three switches are closed, thus connecting the CPU-memory pairs (001-000), (101-101) and (110-010) (Tanenbaum, Fig. 8.27).

Crossbar switch UMA systems

  • It is possible to configure the switches so that each CPU can connect to each memory bank (and this is what makes the system UMA).
  • The number of switches for this scheme scales with the square of the number of CPUs and memories: n CPUs and n memories require n² switches.
  • This pattern fits medium-scale systems well (various multiprocessor systems from Sun Microsystems use this scheme); certainly, a 256-processor system cannot use it (256² = 65,536 switches would be required!).
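
A quick check of this quadratic growth (the loop below just evaluates n² for a few system sizes):

    #include <stdio.h>

    int main(void)
    {
        /* Crosspoint switches needed for an n x n crossbar. */
        long sizes[] = { 8, 16, 64, 256 };
        for (int i = 0; i < 4; i++) {
            long n = sizes[i];
            printf("%4ld CPUs x %4ld memories -> %7ld switches\n", n, n, n * n);
        }
        return 0;
    }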