
HW6-1 (43 points) Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14–18 with x86-64 for different types of integer and floating-point data. By doing the same sort of transformations we did to transform the abstract program combine1 into the more efficient combine4, we get the following code:

    void inner4(vec_ptr u, vec_ptr v, data_t *dest)
    {
        long i;
        long length = vec_length(u);
        data_t *udata = get_vec_start(u);
        data_t *vdata = get_vec_start(v);
        data_t sum = (data_t) 0;

        for (i = 0; i < length; i++) {
            sum = sum + udata[i] * vdata[i];
        }
        *dest = sum;
    }

Our measurements show that this function has a CPE of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:

    # Inner loop of inner4. data_t = double. OP = *.
    # udata in %rbp, vdata in %rax, sum in %xmm0, i in %rcx, limit in %rbx.
    .L15:                                    # loop:
      vmovsd 0(%rbp,%rcx,8), %xmm1           # Get udata[i]
      vmulsd (%rax,%rcx,8), %xmm1, %xmm1     # Multiply by vdata[i]
      vaddsd %xmm1, %xmm0, %xmm0             # Add to sum
      addq   $1, %rcx                        # Increment i
      cmpq   %rbx, %rcx                      # Compare i:limit
      jl     .L15                            # If <, goto loop

Assume that the functional units have the latencies and issue times given in Figure 5.12 (and in the course notes).

A. Diagram how this instruction sequence would be decoded into operations, and show how the data dependencies between them would create a critical path of operations, in the style of Figures 5.13 (figure: opt/dpb-sequential) and 5.14 (figure: opt/dpb-flow and figure: opt/dpb-flow-abstract). (25 points.)

B. For data type double, what lower bound on the CPE is determined by the critical path? Give a numerical value and an explanation. (6 points.)

C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data? Give a numerical value and an explanation. (6 points.)

D. Explain how the floating-point version can have a CPE of 3.00 even though the multiplication operation requires 5 cycles. (6 points.)

HW6-2 (27 points) Write a version of the inner product procedure described in the previous problem that uses six-way loop unrolling (6 × 1; no parallelism). (11 points.)
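As a starting point for HW6-2, the following is a minimal sketch of 6 × 1 unrolling, not an official solution. The name inner6x1, the vec_rec struct, and the two small accessor functions are hypothetical stand-ins for the course's vector library, and data_t is assumed to be double.

```c
#include <assert.h>

typedef double data_t;   /* assumption: the problem allows any numeric type */

/* Hypothetical minimal vector type standing in for the course's vec_ptr. */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Inner product with 6 x 1 unrolling: six elements per loop iteration,
   all accumulated into a single variable (no parallel accumulators). */
void inner6x1(vec_ptr u, vec_ptr v, data_t *dest)
{
    long i;
    long length = vec_length(u);
    long limit = length - 5;          /* last index where i+5 is still valid */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    assert(vec_length(v) == length);  /* both vectors must have equal length */

    /* Main loop: process six elements at a time. */
    for (i = 0; i < limit; i += 6) {
        sum = sum + udata[i]     * vdata[i];
        sum = sum + udata[i + 1] * vdata[i + 1];
        sum = sum + udata[i + 2] * vdata[i + 2];
        sum = sum + udata[i + 3] * vdata[i + 3];
        sum = sum + udata[i + 4] * vdata[i + 4];
        sum = sum + udata[i + 5] * vdata[i + 5];
    }

    /* Finish any remaining elements one at a time. */
    for (; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

Because every addition still feeds the single variable sum, the additions remain a sequential chain. Unrolling in this form reduces loop overhead (the index increment, compare, and branch) but leaves the dependency through sum in place.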

