Duncan's taxonomy

Duncan's taxonomy is a classification of computer architectures, proposed by Ralph Duncan in 1990. Duncan suggested modifications to Flynn's taxonomy to include pipelined vector processors.

Taxonomy

The taxonomy was developed during 1988-1990 and was first published in 1990. Its original categories are indicated below.

Synchronous architectures

This category includes all the parallel architectures that coordinate concurrent execution in lockstep fashion and do so via mechanisms such as global clocks, central control units or vector-unit controllers. Further subdivision of this category is made primarily on the basis of the synchronization mechanism.

Pipelined vector processors

Pipelined vector processors are characterized by pipelined functional units that accept a sequential stream of array or vector elements, such that different stages in a filled pipeline are processing different elements of the vector at a given time. Parallelism is provided both by the pipelining in individual functional units, as described above, and by operating multiple units of this kind in parallel and by chaining the output of one unit into another unit as input.

Vector architectures that stream vector elements into functional units from special vector registers are termed register-to-register architectures, while those that feed functional units from special memory buffers are designated as memory-to-memory architectures. Early examples of register-to-register architectures from the 1960s and early 1970s include the Cray-1 and Fujitsu VP-200, while the Control Data Corporation STAR-100, CDC 205 and the Texas Instruments Advanced Scientific Computer are early examples of memory-to-memory vector architectures. The late 1980s and early 1990s saw the introduction of vector architectures, such as the Cray Y-MP/4 and Nippon Electric Corporation SX-3, that supported 4-10 vector processors with a shared memory (see NEC SX architecture). RISC-V RVV may mark the beginning of the modern revival of vector processing.

SIMD

This scheme uses the SIMD (single instruction stream, multiple data stream) category from Flynn's taxonomy as a root class for the processor array and associative memory subclasses. SIMD architectures are characterized by having a control unit broadcast a common instruction to all processing elements, which execute that instruction in lockstep on diverse operands from local data. Common features include the ability for individual processors to disable an instruction and the ability to propagate instruction results to immediate neighbors over an interconnection network.

Processor array

Associative memory

Systolic array

Systolic arrays, proposed during the 1980s, are multiprocessors in which data and partial results are rhythmically pumped from processor to processor through a regular, local interconnection network. Systolic architectures use a global clock and explicit timing delays to synchronize data flow from processor to processor. Each processor in a systolic system executes an invariant sequence of instructions before data and results are pulsed to neighboring processors.

MIMD architectures

Based on Flynn's multiple-instruction-multiple-data streams terminology, this category spans a wide spectrum of architectures in which processors execute multiple instruction sequences on (potentially) dissimilar data streams without strict synchronization. Although both instruction and data streams can be different for each processor, they need not be. Thus, MIMD architectures can run identical programs that are in various stages of execution at any given time, run unique instruction and data streams on each processor, or execute a combination of these scenarios. This category is subdivided further primarily on the basis of memory organization.

Distributed memory

Shared memory

MIMD-paradigm architectures

The MIMD-based paradigms category subsumes systems in which a specific programming or execution paradigm is at least as fundamental to the architectural design as structural considerations are. Thus, the design of dataflow architectures and reduction machines is as much the product of supporting their distinctive execution paradigm as it is a product of connecting processors and memories in MIMD fashion. The category's subdivisions are defined by these paradigms.

MIMD/SIMD hybrid

Dataflow machine

Reduction machine

Wavefront array

References

Duncan, Ralph, "A Survey of Parallel Computer Architectures", IEEE Computer, February 1990, pp. 5-16.
Flynn, M.J., "Very High Speed Computing Systems", Proc. IEEE, Vol. 54, 1966, pp. 1901-1909.
Hwang, K., ed., Tutorial Supercomputers: Design and Applications, Computer Society Press, Los Alamitos, California, 1984, esp. chapters 1 and 2.
Russell, R.M., "The CRAY-1 Computer System", Comm. ACM, Jan. 1978, pp. 63-72.
Watson, W.J., "The ASC: a Highly Modular Flexible Super Computer Architecture", Proc. AFIPS Fall Joint Computer Conference, 1972, pp. 221-228.
Jurczyk, Michael and Schwederski, Thomas, "SIMD-Processing: Concepts and Systems", pp. 649-679 in Parallel and Distributed Computing Handbook, A. Zomaya, ed., McGraw-Hill, 1996.
Kung, H.T., "Why Systolic Arrays?", Computer, Vol. 15, No. 1, Jan. 1982, pp. 37-46.
Xavier, C. and Iyengar, S.S., Introduction to Parallel Algorithms.
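The SIMD execution style described above — a control unit broadcasting one instruction to all processing elements, with the ability to disable individual elements and to pass results to immediate neighbors — can be sketched as a minimal Python simulation. The function names and the left-to-right shift network here are illustrative assumptions, not anything specified by the taxonomy:

```python
def simd_step(local_data, op, mask):
    """One broadcast step of a SIMD machine: every enabled processing
    element applies the same instruction `op` to its own local operand.
    Masked-off elements keep their previous value, modeling the ability
    of individual processors to disable an instruction."""
    return [op(v) if enabled else v for v, enabled in zip(local_data, mask)]


def simd_shift_from_left(local_data, fill=0):
    """Neighbor communication: each processing element receives its left
    neighbor's result over the interconnection network; the leftmost
    element receives a fill value from outside the array."""
    return [fill] + local_data[:-1]


# All elements see the same instruction (multiply by 10), but the
# second element is disabled for this step:
data = simd_step([1, 2, 3, 4], lambda v: v * 10, [True, False, True, True])
# data is now [10, 2, 30, 40]; propagate results one neighbor rightward:
shifted = simd_shift_from_left(data)  # [0, 10, 2, 30]
```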
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
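The systolic behavior described above — data and partial results rhythmically pumped between neighboring processors on a global clock — can likewise be illustrated with a small simulation of a one-dimensional systolic convolution (FIR) array. The specific cell design used here (one partial-sum register plus two input delay registers per cell, weights held stationary) is a common textbook arrangement assumed for illustration, not a detail fixed by the taxonomy:

```python
def systolic_fir(w, x):
    """Simulate a linear systolic array computing y[j] = sum_k w[k] * x[j-k].

    Each cell holds one fixed weight. On every global clock tick a
    partial sum advances one cell to the right, while input samples
    advance through two delay registers per cell, so each sample meets
    each partial sum exactly once -- the rhythmic pumping of data and
    partial results between neighbors.
    """
    K, n = len(w), len(x)
    y_reg = [0] * K          # one partial-sum register per cell
    x_reg = [0] * (2 * K)    # two input delay registers per cell
    out = []
    for t in range(n + K - 1):           # global clock ticks
        x_in = x[t] if t < n else 0      # feed inputs, then zeros to drain
        new_y = [0] * K
        for k in range(K):
            xk = x_in if k == 0 else x_reg[2 * k - 1]
            y_in = 0 if k == 0 else y_reg[k - 1]
            new_y[k] = y_in + w[k] * xk  # each cell: one multiply-accumulate
        if t >= K - 1:                   # a finished sum leaves the last cell
            out.append(new_y[K - 1])
        x_reg = [x_in] + x_reg[:-1]      # pulse samples toward the next cell
        y_reg = new_y                    # pulse partial sums toward the next cell
    return out
```

For example, systolic_fir([1, 2], [3, 4, 5]) returns [3, 10, 13], i.e. y[j] = w[0]*x[j] + w[1]*x[j-1] with out-of-range samples treated as zero.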