Hi Chris,

Thank you for responding. I am currently using SSE2 intrinsics to optimize my image processing algorithm.

Let us say we have four different non-contiguous addresses inside a single XMM register:

XMM0 = | ADD1 | ADD2 | ADD3 | ADD4 |

If I want to get the data at those addresses, what is the best way to do it? The only way I can think of is to write the register back to memory/cache and then read the values with pointer indexing, like this:

XMM1 = | *ADD1 | *ADD2 | *ADD3 | *ADD4 |

This would be a huge performance hit, since the memory write-back and reads cost a lot of cycles for every four indexed values. Is there any way around it to get hold of the values at those addresses?

Vector processing machines normally support gather and scatter, which read from and write to non-contiguous addresses. However, this kind of support seems to be absent in SSE2.

Please respond if you have any thoughts about it. Anything is helpful. Thank you for the help.

Best regards,
Anand
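For reference, the workaround described above can be sketched without an explicit store to memory. SSE2 has no gather instruction, so one common emulation moves each 32-bit lane into a general-purpose register, does four scalar loads, and repacks the results. This is only a minimal sketch under stated assumptions: the helper name `gather4_epi32` is hypothetical, and the register is assumed to hold four 32-bit element indices into a single table rather than raw pointers, which keeps the code valid on 64-bit builds as well. Whether it beats the store-and-reload approach would need to be measured on the target CPU.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper (not part of any library): emulate a 4-wide gather
   of 32-bit values. ASSUMPTION: idx holds four 32-bit element indices
   into table, not raw addresses. */
static __m128i gather4_epi32(const int32_t *table, __m128i idx)
{
    assert(table != NULL);

    /* Shift each 32-bit lane down to lane 0 (byte shifts of 4, 8, 12)
       and copy it into a general-purpose register. */
    int i0 = _mm_cvtsi128_si32(idx);
    int i1 = _mm_cvtsi128_si32(_mm_srli_si128(idx, 4));
    int i2 = _mm_cvtsi128_si32(_mm_srli_si128(idx, 8));
    int i3 = _mm_cvtsi128_si32(_mm_srli_si128(idx, 12));

    /* Four scalar loads, then repack into one register.
       _mm_set_epi32 takes its arguments from the highest lane to the lowest. */
    return _mm_set_epi32(table[i3], table[i2], table[i1], table[i0]);
}
```

Calling `gather4_epi32(table, idx)` returns a register whose lane k holds `table[index k]`. The loads themselves are still scalar; the sketch only avoids spilling the index register to memory first.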
Anand RK
sse2 intrinsics question?
Hi there,

Has anybody here dealt with coding integer image processing algorithms using SSE2 intrinsics on a P-4? If so, please respond. I am stuck on a problem and wanted to know if there is a way around it.

ARK