## University System of Taiwan (台灣聯合大學系統) Academic Year 107 Master's Program Entrance Examination

Category: <u>Electrical Engineering</u> Subject: Computer Systems (Computer Organization) (300A) Page 1 of 4

### ※ Please write your answers on the answer sheet

1. [11%] Assume that a benchmark program running on processor A has an instruction count of $1.2 \times 10^{12}$ and a CPU response time of 900 s. The clock cycle is 0.25 ns.

- (a) [2%] Find the CPI.
- (c) [6%] Assume that 2-issue processor B is used and the clock frequency is 2 GHz. The same benchmark program now has $7.2 \times 10^{11}$ issue packets but $1.2 \times 10^{12}$ instructions. What is the CPU response time now? What is the CPI?

2. [13%] Assume that a user-defined floating-point format is 16 bits wide. The leftmost bit is the sign bit, and the exponent is 6 bits wide with a bias of 30. The normal exponent field has a value ranging from 1 to 62. The mantissa is 9 bits long with a hidden 1. Please answer the following questions.

- (a) [3%] What is the maximum value that can be represented? Please approximate it by the decimal value in scientific notation.
- (b) [3%] What is the smallest positive value that can be represented? Please approximate it by the decimal value in scientific notation.
- (c) [3%] Please express the floating-point representation for 254.5.
- (d) [4%] Please calculate the floating-point addition 2.875 + 254.5 by hand, assuming both values are stored in the 16-bit user-defined floating-point format. Assume 1 guard bit, 1 round bit, and 1 sticky bit, and round to the nearest even. Please write your answer both in the 16-bit user-defined floating-point format and in decimal.

3. [11%] Consider a program with the following properties.

| Data Reads per 100 Instructions | Data Writes per 100 Instructions | Instruction Cache Miss Rate at Level-1 Cache | Instruction Cache Miss Rate to Main Memory | Data Cache Miss Rate at Level-1 Cache | Data Cache Miss Rate to Main Memory |
|--------------|--------------|-------------|-------------|--------------|--------------|
| 35 | 15 | 1% | 0 | 4% | 1% |

- (a) [3%] Given that the clock frequency of the processor with a two-level cache is 2.5 GHz, the main memory access time is 80 ns, and the secondary cache has a 6 ns access time, what is the miss penalty to the main memory in clock cycles?
- (b) [4%] If the ideal CPI is 2 without any memory stalls, what is the total CPI including memory stalls when we run the above program?
- (c) [4%] We want to reduce the total CPI including memory stalls to below 3 by increasing the size of the level-2 cache, which is used to improve the data cache miss rate to main memory. What is the target data cache miss rate to main memory in this case?

4. [10%] Assume that A and B are two word arrays and that the base addresses of arrays A and B are in registers \$s6 and \$s7, respectively. Assume further that the variables f and g are assigned to registers \$s0 and \$s1. For the following C statement:

`B[g] = A[f] + A[1+f];`

- (a) [5%] Please write the corresponding MIPS assembly code. To get the full score, you should use as few MIPS instructions as possible.
- (b) [5%] Please identify the instruction type of each MIPS instruction in the code that you wrote for (a). For the I-type instructions, show the value of the immediate field, and for the R-type instructions, show the value of the destination register (RD) field.
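
As background for the word-array statement in the MIPS problem above, the following is a minimal Python sketch of how an array index maps to a byte address, assuming 4-byte words and byte addressing as in 32-bit MIPS. The function name `element_address`, the base address, and the index value are hypothetical and chosen only for illustration; they do not come from the exam.

```python
WORD_BYTES = 4  # assumption: 32-bit MIPS words, byte-addressed memory


def element_address(base: int, index: int) -> int:
    """Byte address of element `index` in a word array that starts at `base`."""
    # Shifting left by 2 multiplies the index by 4 (the word size in bytes).
    return base + (index << 2)


# Hypothetical base address and index, purely for illustration.
base_a = 0x10000000
f = 3

addr_f = element_address(base_a, f)
addr_f_plus_1 = element_address(base_a, f + 1)

# Adjacent word elements are exactly WORD_BYTES apart.
print(hex(addr_f), hex(addr_f_plus_1), addr_f_plus_1 - addr_f)
```

Under these assumptions, consecutive elements such as `A[f]` and `A[1+f]` sit one word apart, so once the address of one is known the other can be reached with a constant offset.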
5. [10%] Although keeping all MIPS instructions 32 bits long simplifies the hardware, there are times when it would be convenient to have a 32-bit constant or 32-bit address.

- (a) [5%] Without using lw and sw instructions, what is the MIPS assembly code to load the following 32-bit constant into register \$s0?
- (b) [5%] Given a branch on register \$s0 being equal to register \$s1, please replace it with MIPS instructions that offer a branching distance to L1 much greater than $\pm 2^{15}$ words from the current instruction.

6. [15%] This problem considers building a MIPS datapath. As is well known, the datapaths for R-type and memory instructions are quite similar. The key differences are the following:

- (i) The R-type instructions use the ALU, with the inputs coming from the two registers. The memory instructions can also use the ALU to do the address calculation, although the second input is the sign-extended 16-bit offset field from the instruction.
- (ii) The value stored into a destination register comes from the ALU (for an R-type instruction) or the memory (for a load).

With the following elements, please build a datapath for the operational portion of the memory-reference and R-type instructions that uses a single register file and a single ALU to handle both types of instructions. You should add any necessary multiplexors and control lines to get the full score.

7. [12%]

- (a) [4%] Consider the code below. List all RAW, WAR, and WAW hazards found in the code, and the register causing each hazard.
You can specify hazards in the form (I3→I5 for F8).

- I1: LD F1, 0(Rx)
- I2: LD F2, 8(Rx)
- I3: MULT.D F3, F1, F2
- I4: ADD.D F4, F1, F3
- I5: ADD.D F5, F1, F2
- I6: LD F5, 0(Rx)

| Instruction | Effects |
|-------------------|------------|
| LD F1, 0(Rx) | F1←Mem(Rx) |
| ADD.D F1, F2, F3 | F1←F2 + F3 |
| MULT.D F1, F2, F3 | F1←F2 * F3 |

- (b) [6%] Using a standard MIPS five-stage integer pipeline (IF = Instruction Fetch, ID = Instruction Decode, EX = Execute, MEM = Memory access, WB = Register write back), show the timing of this instruction sequence. Assume all memory accesses take 1 clock cycle (40 ns), and a register may be read and written in the same clock cycle.

Assume normal forwarding and bypassing hardware.

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|-----------------------|----|----|----|-----|----|---|---|---|---|----|----|----|
| I1: LD F1, 0(Rx) | IF | ID | EX | MEM | WB | | | | | | | |
| I2: LD F2, 8(Rx) | | | | | | | | | | | | |
| I3: MULT.D F3, F1, F2 | | | | | | | | | | | | |
| I4: ADD.D F4, F1, F3 | | | | | | | | | | | | |
| I5: ADD.D F5, F1, F2 | | | | | | | | | | | | |
| I6: LD F5, 0(Rx) | | | | | | | | | | | | |

Assume there is no forwarding or bypassing hardware.

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|-----------------------|----|----|----|-----|----|---|---|---|---|----|----|----|
| I1: LD F1, 0(Rx) | IF | ID | EX | MEM | WB | | | | | | | |
| I2: LD F2, 8(Rx) | | | | | | | | | | | | |
| I3: MULT.D F3, F1, F2 | | | | | | | | | | | | |
| I4: ADD.D F4, F1, F3 | | | | | | | | | | | | |
| I5: ADD.D F5, F1, F2 | | | | | | | | | | | | |
| I6: LD F5, 0(Rx) | | | | | | | | | | | | |

- (c) [2%] For part (b), list all forwarding hardware used by each instruction (e.g., I1 used MEM→EX).

8. [8%] RAID (redundant arrays of inexpensive disks) can be used to improve both the performance and the reliability of hard disks.
Consider the following data:

| Bit<sub>1</sub> | Bit<sub>2</sub> | Bit<sub>3</sub> | Bit<sub>4</sub> | Bit<sub>5</sub> | Bit<sub>6</sub> | Bit<sub>7</sub> | Bit<sub>8</sub> |
|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|
| 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 |

- (a) [2%] Show how it can be stored in a RAID level 0 system (striping) with 2 disks.

| Disk | Data | | | | |
|------|------|--|--|--|--|
| 1 | | | | | |
| 2 | | | | | |

- (b) [4%] Show how it can be stored in a RAID level 3 system (striping + parity disk) with 3 disks, where parity information is stored in disk 3.

| Disk | Data | | | | | |
|------|------|--|--|--|--|--|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |

- (c) [2%] Consider the following RAID level 3 system, where disk 2 has failed. Rebuild disk 2 based on the parity information in disk 5.

| Disk | Data | | |
|------|---|---|---|
| 1 | 0 | 1 | 1 |
| 2 | | | |
| 3 | 1 | 0 | 0 |
| 4 | 1 | 1 | 1 |
| 5 | 1 | 1 | 0 |

9. [10%]

- (a) [3%] Explain what it means for a multicore processor system to be cache coherent.
- (b) [3%] Explain the acronym SMP and describe how memory access time varies among the processors in such a system.
- (c) [4%] Discuss two advantages and two disadvantages of shared memory.
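
As background for the RAID level 3 parts of the RAID problem above, the sketch below illustrates the usual parity relationship, assuming even parity computed as the bitwise XOR of the data disks at each bit position. The helper names `parity_strip` and `rebuild_missing` and the bit values are hypothetical and are deliberately not the exam's data.

```python
from functools import reduce
from operator import xor


def parity_strip(data_strips):
    """XOR the bits at each position across all given strips (even parity)."""
    return [reduce(xor, column) for column in zip(*data_strips)]


def rebuild_missing(surviving_strips, parity):
    """Recover one lost data strip by XOR-ing the parity with every surviving strip."""
    return parity_strip(surviving_strips + [parity])


# Hypothetical 3-bit strips on three data disks (not the exam's values).
disks = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
parity = parity_strip(disks)

# Pretend the second data disk failed and rebuild it from the rest.
lost = disks[1]
recovered = rebuild_missing([disks[0], disks[2]], parity)
print(parity, recovered, recovered == lost)
```

The rebuild works because XOR is its own inverse: XOR-ing the parity with all surviving strips cancels their contribution and leaves only the missing strip.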